The Harry Potter fandom was keen to know the right answers to the WOMBATs as soon as the first one appeared behind the door at jkrowling.com. I can remember attempts to collect people’s answers and the grade they got.
However, I don’t think this ever got very far, as the few lists of answers and grades I was aware of were too different to deduce anything beyond the obvious: sets of answers that got an Outstanding grade were mostly good, and ones that were graded Dreadful or Troll were probably wrong. There was no way of knowing which individual answers were the good ones and which the bad.
In reality, it wasn’t the small number of lists of answers that was the problem. We would either have had to be exceptionally lucky in our choice of answers to identify correct answers with any degree of certainty, or would have had to sit the WOMBATs an impossibly large number of times, and even then the number of results would probably have been too large to process into anything useful. So as we didn’t have any Felix Felicis (it would probably be banned for WOMBATs anyway), it was never going to work.
This changed later when the time turner appeared on the door page and all the door openings could be repeated at will. Now, instead of a one-off period of a few days to sit each WOMBAT with the grades available a few days later, we could re-sit the WOMBATs as many times as desired and get them graded immediately. This seemed a much better situation in which to deduce the correct WOMBAT answers, because it could now be done a step at a time and, most importantly, we could use previous results to decide what to try next.
The only information we had to work with was the grade given after finishing a WOMBAT. I think I had already realized that the way to use this was to find sets of answers where the grade could be changed by changing a single answer to a single question. I also had another clue: I had tried to get a Troll grade on the first WOMBAT by answering only one question and deliberately getting it wrong, but I got a Dreadful. However, someone else also tried this and got a Troll. This not only implied that some bad answers scored less than others, but also meant that the answers to any question could be divided into two groups: better answers, which give a D when they are the only answer given, and worse answers, which give a T. However, although any of these better answers must score more than any of the worse answers, we can’t assume all the better answers have the same score (indeed we suspect they don’t, unless my deliberately wrong answer was actually correct), nor can we assume it of the worse answers.
I think I actually started the testing on the first WOMBAT without much planning, answering some questions and changing some answers to see if the grade changed. Then I think I tried to rank the answers of the first question by finding sets of answers to the first 3 or 4 questions where changing the first answer changed the grade between P and D. The result of this and similar testing is that I deduced there were 4 ranks of answer to the first question, a best answer, a less good answer, a poor answer and 3 really bad answers. I then repeated this for the next few questions, finding 3 or 4 ranks of answers for each. However I am a bit unsure in my account here as I seem to remember a false start where I originally thought there were only 3 ranks of answer, but then found something that didn’t make sense if that was true, but my notes don’t record this.
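The ranking idea can be illustrated with a small Python sketch. This is not a record of the testing actually done, only a simulation under stated assumptions: the answer labels "a" to "f", their hidden scores, and the helper names are all made up for the demonstration, and the simulated grader borrows the T, D and P bands deduced later in this account. Each "context" stands for a fixed set of answers to other questions, represented here simply by the total they are worth. Answers that produce the same grades in every context are placed in the same rank.

```python
from collections import defaultdict

# Grades from worst to best; grades above P are left out of this sketch.
GRADE_ORDER = {"T": 0, "D": 1, "P": 2}


def simulated_grade(total):
    """Stand-in for the real grader (assumed bands: T = 0, D = 1-9, P = 10+)."""
    if total <= 0:
        return "T"
    return "D" if total < 10 else "P"


def rank_answers(answers, sit, contexts):
    """Group answers that get identical grades in every context, best rank first."""
    groups = defaultdict(list)
    for answer in answers:
        signature = tuple(GRADE_ORDER[sit(answer, ctx)] for ctx in contexts)
        groups[signature].append(answer)
    return [groups[sig] for sig in sorted(groups, reverse=True)]


# Hypothetical hidden scores for one question with six possible answers.
HIDDEN = {"a": 4, "b": 2, "c": 1, "d": 0, "e": 0, "f": 0}


def sit(answer, context_total):
    """Simulate sitting the test with one answer under test plus a fixed context."""
    return simulated_grade(HIDDEN[answer] + context_total)


full = rank_answers("abcdef", sit, contexts=[0, 6, 8, 9])
partial = rank_answers("abcdef", sit, contexts=[6, 8])
print(full)     # [['a'], ['b'], ['c'], ['d', 'e', 'f']]
print(partial)  # [['a'], ['b'], ['c', 'd', 'e', 'f']]
```

With only two contexts the sketch finds three ranks instead of four, which shows how a rank can stay hidden until a context is tried that happens to separate it.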
Here is where the guessing then testing starts. We know comparative values of answers to a question but we don’t know the scores they correspond to, nor if all questions are scored the same, and we haven’t eliminated the possibility that some test we haven’t found yet would show that two answers we think score the same are actually different, and there are actually 5 ranks of answers. We just have to hope our guesses are as good as Dumbledore’s or our testing will show where we are wrong.
The first guess is that, as each question only seems to have a single right answer and there are 25 questions, each best answer scores 4, which also fits the maximum mark for each section. It certainly seems likely, but it is still a guess and worth remembering if something doesn’t make sense later.
The second guess is that the worst answers score 0. This is reasonable, as you can answer 2 questions and still get a Troll grade (and it isn’t in my notes, but I think I later answered every question with a worst answer and got an overall T grade, which seems conclusive). Also, we were explicitly warned about negative scores when they were introduced for the third WOMBAT.
I then decided the possible scores for an answer were 4, 2, 1 and 0, though many questions only had some of these scores. I don’t remember why I decided this, but it was probably based on the tests I had done up to that point.
Based on this and my ranking of the first few questions I concluded that a T required a score of 0, a D was for 1-9 and P needed at least 10. This then helped me produce a set of standard answers to 2 or 3 questions (actually 4, 5 and 6) that I could add to test an answer to a later question to determine its score, and check it wasn’t 3 (they were actually to test if the score for the answer was >0 or <=0, >=4 or <4, >=2 or <2, and >=3 or <3). This in turn led to deducing the scores needed for the A, EE and O grades.
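The threshold tests in the parentheses above can be sketched in Python. This is a minimal illustration, assuming the grade bands just described (T for 0, D for 1-9, P for 10 or more) and assuming the answer under test scores between 0 and 4. The function names and the "padding" totals are hypothetical: padding means the combined score of a standard set of answers to other questions, chosen so that the answer under test tips the grade to P exactly when its score reaches the value being tested.

```python
P_THRESHOLD = 10  # from the text: P needs at least 10


def grade(total):
    """Assumed grade bands; everything at or above P is lumped together."""
    if total <= 0:
        return "T"
    return "D" if total < P_THRESHOLD else "P or better"


def deduce_score(sit_with_padding):
    """Work out an answer's score (0-4) from grades alone.

    sit_with_padding(p) must return the grade obtained when the answer
    under test is combined with standard answers worth p in total.
    """
    # Padding of 6, 7 or 8 tests for a score of >= 4, >= 3 or >= 2.
    for k in (4, 3, 2):
        if sit_with_padding(P_THRESHOLD - k) == "P or better":
            return k
    # With no padding, D means the score is above 0 and T means it is not.
    return 1 if sit_with_padding(0) == "D" else 0


# Check the procedure against every hidden score it is meant to handle.
results = {
    hidden: deduce_score(lambda padding, s=hidden: grade(padding + s))
    for hidden in range(5)
}
print(results)  # {0: 0, 1: 1, 2: 2, 3: 3, 4: 4}
```

Paddings of 6, 7 and 8 can each be built from two or three answers worth 4, 2 or 1 (for example 4+2, 4+2+1 and 4+4), which is consistent with needing standard answers to only 2 or 3 questions.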
I deduced the scores for the other two WOMBATs following the same overall strategy, though there were differences. For example, the second WOMBAT had 18 questions, one of which required up to 3 answers, and the third WOMBAT had questions with different maximum scores and with wrong answers that would lose points. Despite these extra complications, I think I actually worked out the scores of the last two WOMBATs quicker, as I then knew what methods to use, and I had finished all the WOMBATs within a couple of weeks of the time turner appearing.
Due to my guesses, I have only ever claimed that my scoring is probably right – I believe it is but it is possible that I made a mistake or an incorrect assumption somewhere. However, the best answer I found to each question should be the right one because the testing, if done correctly, should always give the best answer even if some of the scoring is wrong.