Dear Russ,
I understand that this post will cause controversy and will most likely be deleted rather quickly but I think I have some legitimate points to raise in the ongoing debate on engine use.
After analysing many games by several different players - some of whom have been banned for engine use days after I submitted evidence - I have a few questions I'd like you to address.
It is reasonably common knowledge that engine move matchups are the principal criteria for deciding innocence or guilt regarding a player using an engine to suggest moves during games on this site.
Using an engine on this site is against the Terms Of Service:
(b) While a game is in progress you may not refer to chess engines, chess computers or be assisted by a third party. Endgame tablebases may not be consulted during play but you may reference books, databases consisting of previously played games between human players, and other pre-existing research materials
I have been reliably informed that the top 3 matchup rates regarded as "overwhelming evidence of engine use", when sustained over many games once out of database, are as follows:
Top 1 match = 60%+
Top 2 match = 75%+
Top 3 match = 85%+
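As a rough illustration of how thresholds like these could be checked, here is a minimal Python sketch. It assumes each out-of-database move has already been given a rank: the position of the played move in the engine's ordered list of choices (1 = first choice). The function names and the sample ranks are hypothetical, and this is not a description of the moderators' actual procedure, which is not public.

```python
from typing import Dict, Iterable

# Thresholds quoted above for "overwhelming evidence" (as fractions).
THRESHOLDS: Dict[int, float] = {1: 0.60, 2: 0.75, 3: 0.85}


def top_n_rates(ranks: Iterable[int], max_n: int = 3) -> Dict[int, float]:
    """Return the fraction of moves whose engine rank is <= n, for n = 1..max_n.

    `ranks` holds, for each analysed move, the position of the played move
    in the engine's list of choices (1 = first choice). Any move outside
    the top max_n can be given a larger number.
    """
    ranks = list(ranks)
    if not ranks:
        raise ValueError("no moves to analyse")
    total = len(ranks)
    return {n: sum(1 for r in ranks if r <= n) / total for n in range(1, max_n + 1)}


def exceeds_all_thresholds(rates: Dict[int, float]) -> bool:
    """True only if every top-n rate meets or exceeds its quoted threshold."""
    return all(rates[n] >= THRESHOLDS[n] for n in THRESHOLDS)


# Hypothetical ranks for ten moves (illustrative only, not real game data).
sample_ranks = [1, 1, 2, 1, 3, 1, 4, 1, 2, 1]
rates = top_n_rates(sample_ranks)
print(rates)
print(exceeds_all_thresholds(rates))
```

In practice the rates would be averaged over many games rather than ten moves, since a single game can land above or below any of these figures by chance.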
I have recently analysed 20 games by a top player on this site and the average results exceeded these limits by quite some way.
After submitting the evidence & posting it on a private forum, I was informed by 2 people who have acted as Games Moderators on your site that the player involved had similar matchups when investigated in 2005.
It is my understanding that you were made aware of this.
The player continues to play on your site, despite other players having been banned with lower overall matchup rates.
I think that a very basic contradiction exists here. We should all be judged innocent or guilty by the same criteria - there should be no exceptions.
I can only suggest that logically there are 3 courses of action:
1) The player in question is finally banned from the site
2) The matchup rates considered "overwhelming evidence" of engine use are raised significantly (effectively games moderation is abolished) and previously banned players are invited back to the site due to unsafe evidence
3) The problem is ignored and the contradictions continue
Finally Russ, all in all this site is probably my favourite online chess site.
When I asked a couple of strong (2000+ elo) players from my local OTB club to join the site they laughed at me & said "the internet is full of cheats".
I think part of the solution to the problem is in your hands - at least on your site.
Maybe you could get more legitimately strong players to play here if the rules are adhered to for everyone.
I know some banning actions may cause some initial loss of revenue due to some cancelled subscriptions and controversy in general, but I think this will be offset by those joining because they understand that your site deals with engine users by banning them - whoever they are.
Regards,
Steve
I think that a very basic contradiction exists here. We should all be judged innocent or guilty by the same criteria - there should be no exceptions.
That said, say you have two known genuine players: one is rated 2100+ and the other 2600+.
Both are found to have:
Top 1 match = 55%+
Top 2 match = 68%+
Top 3 match = 82%+
Who is more likely to be suspected of engine use?
While we are not 100% privy to all the steps taken in deciding who is bumped and who isn't, we should just accept that anyone who is reported and then remains on the site, must be passing the necessary criteria that Russ has in place.
It may irk us, but the decision has been made by Russ and that should be an end to it.
There are a few "well dodgy" people who have been reported and are still on the site, and there are quite a few who have stopped playing, which is absolutely Mad from a Scottish point of view 😛 - all we can do is continue to report them and hopefully, if sufficient evidence is provided, the player will eventually get banned.
You may recall I posted in another thread now long gone stating that I was applying statistical analysis to two samples of games. One sample was taken from a tournament held in Vienna in 1922 and featured the likes of Reti, Gruenfeld and Rubinstein. The other sample was taken from the 16th World Computer Chess Championship which was recently won by Rybka.
I expected to find that I could tell the difference between the two samples with ease using match up rates. The absolutely shocking (to me at least) result is that there is, statistically speaking, no difference between the match up rates in the two samples. To illustrate this, the mean match up rate for first choice in Vienna 1922 is 61.4% and the mean match up rate for first choice in 16th WCCC is 60.2%. Both these values are to all intents and purposes equal to the figure given for overwhelming evidence. The lowest match up rate is 46% in a game between Deep Sjeng and Cluster Toga. The highest match up rate was 80% in a game between Imre Koenig and Siegbert Tarrasch. I have applied a two-sample t-test to the data and obtain no evidence to suggest that the samples are not from the same population i.e. there is no significant difference between the match up rates of known computer engines and humans playing before engines were in use.
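A two-sample t-test of this kind can be sketched in a few lines of Python. The post does not say which variant of the test was used, so this sketch assumes Welch's unequal-variance form. The function name `welch_t` is hypothetical, and the two lists below are placeholder numbers for illustration only, not the Vienna 1922 or WCCC data.

```python
import math
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return (t statistic, approximate degrees of freedom) for two samples.

    Uses Welch's form of the two-sample t-test, which does not assume
    that the two samples have equal variances.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard error of each sample mean.
    se2_a = variance(a) / na
    se2_b = variance(b) / nb
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(a) - mean(b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t, df


# Placeholder per-game first-choice match rates in percent (NOT real data).
human_games = [58.0, 64.0, 61.0, 66.0, 57.0]
engine_games = [59.0, 62.0, 55.0, 63.0, 61.0]

t_stat, dof = welch_t(human_games, engine_games)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

The t statistic is then compared with the critical value from a t table for the computed degrees of freedom. A value of |t| below that critical value gives no evidence that the two samples come from different populations, which is the outcome described above.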
This is an extremely disturbing result. If anyone has been banned on the basis of match up rates alone I consider that there is at least a 50% chance that they were wrongly banned. I hope that match up rates have only been used as an indicator to suggest further scrutiny and that further tests have then been applied. This also suggests a reason why it has taken so long to ban some alleged cheats: match up rates are no indicator of engine use!
Originally posted by Kepler
You may recall I posted in another thread now long gone stating that I was applying statistical analysis to two samples of games. One sample was taken from a tournament held in Vienna in 1922 and featured the likes of Reti, Gruenfeld and Rubinstein. The other sample was taken from the 16th World Computer Chess Championship which was recently won by Rybka.
...[text shortened]... it has taken so long to ban some alleged cheats, match up rates are no indicator of engine use!
Something tells me this might be a lengthy thread.
Some time ago I posted in the OTB players club forum that I found simple matchup rates to be too one-dimensional to be used as conclusive evidence of engine use.
Kepler seems to back up my point of view.
The nature of the games analyzed needs to be taken into account somehow. I have no idea how this is done in practice though...
Originally posted by Kepler
You may recall I posted in another thread now long gone stating that I was applying statistical analysis to two samples of games. One sample was taken from a tournament held in Vienna in 1922 and featured the likes of Reti, Gruenfeld and Rubinstein. The other sample was taken from the 16th World Computer Chess Championship which was recently won by Rybka.
...[text shortened]... it has taken so long to ban some alleged cheats, match up rates are no indicator of engine use!
Wow! I wasn't aware of the thread, now gone apparently, in which these were originally posted. Thanks for doing this kind of research! I hope you have sent your results to Russ so he could pass them along to whatever moderation team is in place. You're right - the implications for game moderation are indeed "disturbing".
What kind of "further tests" are you referring to? I've often wondered what criteria are used besides engine match-up rates. I realize you don't know exactly what the moderation team does, but perhaps you have some ideas?
Hi Kep,
What kit are you using and what are your settings?
I bet I sound like I know what I'm talking about but I've seen how
the lads on OTB do it - their success rate is very high.
The make of machine, times per move etc. etc.
I've always understood that when these same tests were done in the past
(comparing the old masters with the latest computers) the Fritz 10s
and Rybkas were well ahead.
I'm struggling to see how a computer can arrive at a low match up
when it is analysing a computer tournament featuring the same computer.
But I'm not a techno so no doubt someone will come on and explain
how this is possible.
Regarding players who have perhaps been banned under false pretenses:
I know of only one player who keeps stating that he was not using
a box. All the others appear to have gone quietly.
Originally posted by greenpawn34
Hi Kep,
What kit are you using and what are your settings?
I bet I sound like I know what I'm talking about but I've seen how
the lads on OTB do it - their success rate is very high.
The make of machine, times per move etc. etc.
I've always understood that when these same tests were done in the past
(comparing the old masters with the latest ...[text shortened]... o keeps stating that he was not using
a box. All the others appear to have gone quietly.
greenpawn34:
I'm struggling to see how a computer can arrive at a low match up
when it is analysing a computer tournament featuring the same computer.
Me too
greenpawn34:
I know of only one player who keeps stating that he was not using
a box.
Who?
Originally posted by greenpawn34
Hi Kep,
What kit are you using and what are your settings?
I bet I sound like I know what I'm talking about but I've seen how
the lads on OTB do it - their success rate is very high.
The make of machine, times per move etc. etc.
I've always understood that when these same tests were done in the past
(comparing the old masters with the latest ...[text shortened]... o keeps stating that he was not using
a box. All the others appear to have gone quietly.
I am using a dual processor G4 Mac and running the Glaurung 2.1 engine. I can use HIARCS 12.1 instead and easily detect that HIARCS was in the engine tournament, but that would mean I would either need to know in advance which engine someone was using or analyse the games using as many engines as possible. The first idea is a bit redundant: if I know what engine is being used, why am I trying to prove it is being used? The second idea just increases the workload beyond reason. I wanted an engine that is reasonably strong, reasonably modern, capable of taking advantage of multiple processors and had not taken part in the tournament.
The settings I used were 30 seconds per move as is apparently the standard. I only analysed first choices because I could find no good reason for top three or any other number. Why not top four or ten? If we were to take that far enough we could get a 100% match up every time. A little preliminary work on the top three issue suggested that this actually narrows the gap between engine and human, making the difference harder to detect, whereas I wanted the opposite.
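The point about taking "top N" far enough can be shown with a small Python sketch. It assumes we already know, for each move played, where it falls in the engine's ordered list of choices (1 = first choice). The `match_rate` helper and the ranks below are hypothetical and purely illustrative.

```python
from typing import List


def match_rate(ranks: List[int], n: int) -> float:
    """Fraction of played moves that appear among the engine's top n choices."""
    if not ranks:
        raise ValueError("no moves to analyse")
    return sum(1 for r in ranks if r <= n) / len(ranks)


# Hypothetical engine ranks of the moves actually played (NOT real data).
ranks = [1, 3, 2, 1, 5, 1, 2, 4, 1, 6]

# Widen the window from top 1 up to the worst rank seen in the sample.
rates = [match_rate(ranks, n) for n in range(1, max(ranks) + 1)]
for n, rate in enumerate(rates, start=1):
    print(f"top {n}: {rate:.0%}")
```

The rate can never fall as N grows, and it reaches 100% once N is at least the worst rank in the sample, whoever or whatever played the moves. That is why a wider window pushes every sample towards the same ceiling.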
that is why i posted what i posted in site ideas. it is not about "percentage matches", they have a program that reads what programs you have on your computer and can tell if you have it open when you are playing on here. so i guess if someone complains about you then they have evidence enough right there, supposedly, to kick you off the site.
any look at the percentages will always be flawed because you can look at them in many different ways; anything with statistics can be manipulated to make it look either good or bad.
i hope i don't get into trouble again for posting this, i ruffled a few feathers in that other forum with this info.
Originally posted by Carterson
that is why i posted what i posted in site ideas. it is not about "percentage matches", they have a program that reads what programs you have on your computer and can tell if you have it open when you are playing on here. so if i guess someone complains about you then they have evidence enough right there, supposedly, to kick you off the site.
any look ...[text shortened]... trouble again for posting this, i ruffled a few feathers in that other forum with this info.
If they have anything that can read the software I have on my computer I'd be very surprised. It certainly can't read anything from the machine I am currently using to analyse my sample games because it is not attached to the internet! Even if this were possible, it would not detect the likes of Ivar Bern using his engine while offline to decide on the move he will make when he logs in.
they have it. not sure if they use it to the degree that "p" guy said they do on sites like ICC, but like i said, it reads your computer when you are logged in and sees what programs you do have. now let's say you only have it for analysis and all, well they still know you have it, and say someone says you are cheating, well then they have that info and i guess thus starts the investigation. but as you have stated percentages really are misleading, so there must be at least one more method apart from knowing what you've got and checking your games.
Originally posted by greenpawn34
Seems like you know what you are doing.
I'll leave it to one of the techno boys to explain how good humans
score better than boxes using a formula they use to detect cheats on here.
(Unless of course the mods come on and declare that Tarrasch was a cheat!).
Not a good one judging from the 1908 championship match!
😛