Originally posted by Northern Lad
Interesting point. Sort of suggests a few extra avenues for the new mod squad to look at, if you see what I mean...?
An aspect that hasn't been much commented on is the quantity (as well as the quality) of the moves played. My suspicions against the likes of Ironman and Meman were not just based on the nature of their moves (though they were pretty blatant) but the sheer quantity of the moves they managed to play. They would often churn out thousands of precise, machi ...[text shortened]... However, I do accept that it might be difficult to include such factors in testing mechanisms.
Originally posted by Very Rusty
There are a number of obvious engine moves that no reasonably strong human player would ever make.
I'm going to go out on a limb here, but I assume it would certainly have to be more than one engine move to be proof beyond any doubt.
In my opinion, given a weight of other evidence pointing towards guilt, that single move can prove it beyond doubt.
Originally posted by murrow
I think you're slightly misrepresenting what I am saying... What I'm really getting at is that statistics alone cannot be conclusive. You still have to use your head and take the data in context. Now, with a 99.98% chance of a person being a cheat, taken in proper context, yes, the guy will be a cheat the vast majority of the time. It is not proof in itself, but I would say 0.02% is not "reasonable doubt", and you can call the guy a cheat. Just keep in mind, you will eventually "catch" a clean guy.
Sydrian believes that statistics CANNOT be used to evaluate specific allegations of cheating in certain games, only general patterns (see page 24).
Originally posted by Sydrian
but you're going with the number that the accused guy put out... the real number is higher due to the accusation being based on a different engine
I think you're slightly misrepresenting what I am saying... What I'm really getting at is that statistics alone cannot be conclusive. You still have to use your head and take the data in context. Now, with a 99.98% chance of a person being a cheat, taken in proper context, yes, the guy will be a cheat the vast majority of the time. It is not proof in itself, but I w ...[text shortened]... d you can call the guy a cheat. Just keep in mind, you will eventually "catch" a clean guy.
Originally posted by Sydrian
Erm... [From page 24]:
I think you're slightly misrepresenting what I am saying... What I'm really getting at is that statistics alone cannot be conclusive.
murrow: I don't see any problem with selecting specific games, provided this is done BEFORE the games are analysed...
Sydrian: Because that is not how statistics work, for all practical purposes.
murrow: We'll have to agree to disagree about 'how statistics work'. According to your logic there would be no possible way of proving that someone cheated (or, by extension, did anything) on specific occasions, rather than generally.
Sydrian: Basically, yes, that is what I believe. You cannot prove anything specifically, in a statistical sense.
Originally posted by murrow
You're taking that out of context. That discussion was about the point of selecting your data. You cannot hand-pick data you think proves your point and run a legitimate statistical analysis of that data.
Erm... [From page 24]:
murrow: I don't see any problem with selecting specific games, provided this is done BEFORE the games are analysed...
Sydrian: Because that is not how statistics work, for all practical purposes.
murrow: We'll have to agree to disagree about 'how statistics work'. According to your logic there would be no possible way of pro ...[text shortened]... es, that is what I believe. You cannot prove anything specifically, in a statistical sense.
Originally posted by Sydrian
It depends what hypothesis you are testing.
You cannot hand-pick data you think proves your point and run a legitimate statistical analysis of that data.
If your hypothesis is that X cheated in his games in the Y tournament against players within 200 points of him, then of course you can 'hand-pick' the relevant data.
Originally posted by Sydrian
Excuse my ignorance of statistical analysis, but aren't we looking for a move matching the move of an engine more than a certain number of times within a game, excluding book moves and obvious recaptures? Surely you would only select suspect games for this analysis; otherwise you would only be able to prove cheats who cheated in every game on every move. If you have a match-up rate (within one game) that is higher than top GMs' past and present, then it's surely an indication of cheating, and especially so if it happens in more than one game.
You're taking that out of context. That discussion was about the point of selecting your data. You cannot hand-pick data you think proves your point and run a legitimate statistical analysis of that data.
Nothing is proved, of course, as the player could be an engine-beating world champion who happens to play here and not professionally. A banned player could always organize a simul against GMs, thrash them all, then ask RHP for their money back... or even pop back to the forums for a gloat.
First of all, let's get things straight with respect to statistics and what they can "prove".
A statistical proof is of course not the same as a mathematical proof, but statistics CAN be used as evidence for a lot of things.
For example, if we flip a fair coin, half the time we will observe tails and half the time we will observe heads. Let tails have the value 1 and heads the value -1.
Now we know that if we flip the coin N times and let N go to infinity, the sum of our observations divided by N will converge to 0.
Now we take a coin, flip it 10,000 times, and evaluate the sum of heads and tails, that is, a sum of -1's and +1's. We repeat this experiment 1000 times and observe the following sums:
44, 46, 54, 42, 36, ......... 42
Now we can show that there must be something wrong with this coin, as tails is overrepresented. That is a statistical proof - and I think no one would doubt that this coin was biased.
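This kind of check is easy to simulate. The sketch below (Python; the bias and flip counts are illustrative assumptions, not data from the thread) flips a possibly biased coin, scores tails as +1 and heads as -1, and converts the resulting sum into a z-score: for a fair coin the sum of n flips has mean 0 and standard deviation sqrt(n), so a sum many standard deviations from 0 is strong evidence of bias.

```python
import math
import random

def flip_sum(n, p_tails=0.5, seed=0):
    """Sum of n coin flips, scoring tails as +1 and heads as -1."""
    rng = random.Random(seed)
    return sum(1 if rng.random() < p_tails else -1 for _ in range(n))

def z_score(s, n):
    """Standard deviations between an observed sum s and the fair-coin
    expectation: mean 0, standard deviation sqrt(n)."""
    return s / math.sqrt(n)

# A 52%-tails coin flipped 10,000 times has expected sum
# 10,000 * (0.52 - 0.48) = 400, i.e. about 4 standard deviations
# above a fair coin's 0 -- already a detectable "fix".
z = z_score(flip_sum(10_000, p_tails=0.52), 10_000)
```

A z-score around 4 corresponds to a p-value on the order of a few in 100,000, which is why repeated sums of 400-plus would settle the matter quickly.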
The question still remains: how can we obtain useful statistical information when we compare a player's moves to the moves of Fritz?
Let's say that we have a very large sample of online games played by very strong players. Run these games through Fritz and find out how many moves are the 1st choice, 2nd choice, 3rd choice (at a certain search depth).
Let's look at the data produced and assume that the data fits a normal distribution with mean \mu and variance \sigma^2. Now we have our model in place.
But what kind of hypothesis can we test against this model? Of course we can choose specific games and find the match-up numbers. Say we observe a match-up of 95% - then our hypothesis test asks: what is the probability that 95% could be an observation from the model? For sure we would get a very small p-value.
We could also take a number of games, calculate how many moves were the first choice of Fritz, and again test the same simple hypothesis.
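The test sketched here can be written down directly (Python; the reference mean, standard deviation, and the 95% figure are illustrative assumptions, not real reference data). It computes the one-sided p-value of an observed match-up rate under the normal model, using the complementary error function.

```python
import math

def matchup_p_value(observed, mu, sigma):
    """One-sided p-value: probability that the normal model
    N(mu, sigma^2) produces a match-up rate >= the observed one."""
    z = (observed - mu) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2))

# Hypothetical reference model: strong humans average a 55% first-choice
# match-up with a standard deviation of 8 percentage points.
# An observed 95% match-up is then z = 5, giving a tiny p-value.
p = matchup_p_value(95.0, mu=55.0, sigma=8.0)
```

The p-value alone is only as good as the model behind it, which is exactly the objection raised in the next paragraph: the sample of games determines what the number means.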
We can get p-values, but what conclusions can be drawn from this? Well, it all depends on which games are analyzed (the game sample) - surely some games will have a higher match-up rate due to the opponent, the tourney, the nature of the game... So to me this information is useless.
So maybe we should take ALL the games of a certain player and check the match-up rate. That would surely be better from a statistical point of view. But then someone could argue that the "cheater" could play weak moves on purpose against weaker opponents (when the game is already won), thereby reducing the number of Fritz first-choice matches.
My conclusion is that the statistical tool is at most a useful aid. Especially, as Northern Lad pointed out, if we have players with 80-100 games going on with extremely high match-up rates.
The game mods should rather focus on the "computer moves" - the information drawn from those numbers is far better than the match-up rates.
I think it is a bit more complicated than that, wormwood, and to Northern Lad: playing 50 to 100 or more games, by one player with over 400 concurrent games, cannot possibly be an indication of cheating and engine use. Some players, and I assume jealousy enters into it, ask: how could these players possibly be better than me and still maintain a large game load? That is about as lame a basis for a cheating accusation as has ever been put forth here! They may be better than you and play a larger volume of games because they follow Korchnoi's maxim, "You do not play chess, you understand it." They have the eye, and they have the time, perhaps because they are disabled and this is all they do. Add whatever parameters you want, but to throw out a worthless post saying that he plays more games than me and is therefore a cheater is infantile in fact and thought.
Originally posted by wormwood
No, wormwood - what I'm trying to say is that by observing a coin flip we can determine whether or not it is fair.
what you're actually saying here is: "it isn't possible to determine whether a coin is fair by flipping it."
which is absurd.
If we use my experiment we might observe:
0,4,-12,18,6,2,-14,4,-8,22,-8,4,.......
or we could observe:
6,22,48,54,32,24,-8,56,-4,32,64,......
The first series of observations looks like a fair coin because the observations are concentrated around 0. Hence the sum of squared differences will be small.
The second coin does not look fair. We have many "large" observations, hence the sum of squared differences will be large.
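The spread statistic described here, the sum of squared deviations from the expected value 0, can be computed directly for the two quoted series (a small Python sketch; the variable names are mine):

```python
# The two series of coin-flip sums quoted above.
series_fair_looking = [0, 4, -12, 18, 6, 2, -14, 4, -8, 22, -8, 4]
series_suspect      = [6, 22, 48, 54, 32, 24, -8, 56, -4, 32, 64]

def sum_sq(series):
    """Sum of squared deviations from the fair-coin expectation 0."""
    return sum(x * x for x in series)

print(sum_sq(series_fair_looking))  # 1364
print(sum_sq(series_suspect))       # 15676
```

The second series scores more than ten times higher, which is the "does not look fair" intuition made numeric; turning it into a formal verdict still requires knowing how large a value a fair coin would produce, which is the point raised in the next post.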
Originally posted by Richardt Hansen
What you are simulating here is a random walk. Intuition says the distance from zero should be low and should get closer to zero for larger numbers of trials. As so often with probability, intuition is just plain wrong. In fact the average distance from zero is approximately 0.8 times the square root of n, where n is the number of trials. So for 1000 tosses of the coin the average distance from zero should be about 25. Some of the distances involved in 1000 trials could be large without implying a dodgy coin.
No, wormwood - what I'm trying to say is that by observing a coin flip we can determine whether or not it is fair.
If we use my experiment we might observe:
0,4,-12,18,6,2,-14,4,-8,22,-8,4,.......
or we could observe:
6,22,48,54,32,24,-8,56,-4,32,64,......
The first series of observations looks like a fair coin because the observat ...[text shortened]... ok fair. We have many "large" observations, hence the sum of squared differences will be large.
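The roughly 0.8 * sqrt(n) average distance mentioned above comes from E|S_n| = sqrt(2n/pi) for a +/-1 random walk, since sqrt(2/pi) is about 0.798, and it is easy to check by simulation (a Python sketch; the number of walks is an arbitrary choice of mine):

```python
import math
import random

def avg_abs_distance(n_steps, n_walks, seed=0):
    """Average absolute end-point distance from zero of a +/-1 random
    walk, estimated over n_walks independent walks of n_steps steps."""
    rng = random.Random(seed)
    total = 0
    for _ in range(n_walks):
        s = sum(1 if rng.random() < 0.5 else -1 for _ in range(n_steps))
        total += abs(s)
    return total / n_walks

theory = math.sqrt(2 * 1000 / math.pi)           # about 25.2 for n = 1000
estimate = avg_abs_distance(1000, n_walks=2000)  # lands close to theory
```

So a single 1000-toss sum of, say, 50 is nothing unusual for a fair coin, which is why one large sum proves little while a whole series of them can.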
Originally posted by Kepler
Actually, Kepler, the theorem I am using is the Law of Large Numbers. We know that:
What you are simulating here is a random walk. Intuition says the distance from zero should be low and should get closer to zero for larger numbers of trials. As so often with probability, intuition is just plain wrong. In fact the average distance from zero is approximately 0.8 square root n where n is the number of trials. So for 1000 tosses of the coin the ...[text shortened]... 25. Some of the distances involved in 1000 trials could be large without implying a dodgy coin.
P(heads) = P(tails) = ½.
Let X_i be the outcome of the i'th toss. Now the mean value of X_i is:
E(X_i) = ½*1 + ½*(-1) = 0
(X_1, X_2, ..., X_N) are independent and identically distributed with mean value 0,
and hence we know that:
(1/N) \sum_{i=1}^{N} X_i -> E(X_1) = 0
The 10,000 was just a "large" number - of course, for the Law of Large Numbers to apply, we need many more repetitions.
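The distinction between the two previous posts is that the average of the flips converges to 0 while the raw sum wanders like a random walk, and the convergence of the average is easy to see numerically (a minimal Python sketch; the sample sizes are arbitrary):

```python
import random

def average_of_flips(n, seed=0):
    """Average of n +/-1 flips of a fair coin; by the Law of Large
    Numbers this tends to E(X_i) = 0 as n grows."""
    rng = random.Random(seed)
    return sum(1 if rng.random() < 0.5 else -1 for _ in range(n)) / n

# The average shrinks towards 0 as n grows (its standard deviation is
# 1/sqrt(n)), even though the underlying sum typically drifts further
# from 0 in absolute terms.
for n in (100, 10_000, 1_000_000):
    print(n, average_of_flips(n))
```

This reconciles the two views: Kepler's 0.8 * sqrt(n) growth of the sum's distance and the Law of Large Numbers are both true, because sqrt(n)/n still goes to 0.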