As we crunched the numbers from all the local client online and auditorium tests that went into it, I came across several tactics researchers use to check the reliability and consistency of a sample.
Some folks throw in an out-of-format song that "should" not test well, just to calibrate things.
This time it so happened that the same song hook was tested twice with the same sample, once in the middle of the test and again at the end.
Here's how it did the first time the sample of 738 people heard and rated it.
This is how it did when it played again, some one hundred hooks later in the same test.
On one hand, it's reassuring to see considerable consistency between the two. On the other hand, it also shows that a sample's judgments of something identical can shift, based on fatigue with the test and on the songs that played just before and after it.
That's why even the very best research is still called "estimates."
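Sample size is part of that story too: a score from 738 respondents carries a built-in sampling margin of error before fatigue or song order even enters the picture. As a rough sketch of that arithmetic (using the standard normal approximation for a proportion; the 60% favorable share below is invented for illustration, not a figure from this test):

```python
import math

def margin_of_error(p, n, z=1.96):
    """95% margin of error for a sample proportion (normal approximation)."""
    return z * math.sqrt(p * (1 - p) / n)

n = 738    # sample size from the test above
p = 0.60   # hypothetical share rating the hook favorably
moe = margin_of_error(p, n)
print(f"±{moe * 100:.1f} percentage points")  # → ±3.5 percentage points
```

In other words, even with everything else held constant, two plays of the same hook can land a few points apart from sampling noise alone.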