Certainly there is error, but it can be trained away. I'm actually programming AI in my professional work at the moment, for composition, for analysing how well someone plays, and more. Much of the training has to do with restricting and directing the AI's attention in a specific way.
I don't believe AI has merely read every forum online and averaged the results; in many cases it has been specifically trained on less scattergun information. In the realms of piano education and programming languages at least, that is how I see it, and I have seen it come to novel conclusions of its own, in quite a surreal way.
It is certainly possible to program AI for a specific task, something like evaluating a piano performance, I suppose. ChatGPT, though, is simply trained on massive amounts of internet text: not just forums, but Wikipedia, PhD theses that end up online, concert notes, whatever can be scooped up. It then produces responses to prompts by predicting likely next words in the text it generates. Like all machine learning programs, its output depends on the training data set.

AI can do phenomenal things in some situations: predicting protein structure, mimicking human conversation, image analysis. It's not magic, though. It is more of a black box than a traditional computer program, because the logic is not transparent; if you get odd errors you cannot just troubleshoot the program, you have to figure out what in the training data might have led the neural network astray. If there are biases, skews, or missing controls the programmer had not thought of in the training data, then systematic errors will show up in the outputs.
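To make the "predicting likely next words" idea concrete, here is a toy sketch, nothing like how ChatGPT is actually built, just the same principle at kindergarten scale: count which word follows which in a tiny corpus, then generate text by sampling from those counts. The corpus and everything else in it are made up for illustration.

```python
import random

# Tiny "training corpus" (invented for this example).
corpus = "the pianist played the sonata and the audience applauded".split()

# Build a table: word -> list of words observed to follow it.
follows = {}
for prev, nxt in zip(corpus, corpus[1:]):
    follows.setdefault(prev, []).append(nxt)

def generate(start, length=6):
    """Generate text by repeatedly sampling a plausible next word."""
    word, out = start, [start]
    for _ in range(length):
        options = follows.get(word)
        if not options:  # dead end: we never saw anything follow this word
            break
        word = random.choice(options)  # sample a likely next word
        out.append(word)
    return " ".join(out)

print(generate("the"))
```

A real language model replaces the count table with a neural network trained on billions of documents, but at each step it is still doing the same thing: choosing a plausible next token given what came before. Notice that the toy version can only ever emit words from its corpus, which is the same dependence on training data in miniature.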
Because the outputs are somewhat unpredictable, because they mimic human speech well, and because we have highly sensitive but not very specific "agency detection" circuits in our brains, it is easy to overestimate the thoughtfulness of some AI programs, and to forget that humans generated and chose the data used for training. It helps to understand what the training data set was and roughly how neural nets "learn" from it. That's why I like the video I linked: a very clear explanation of the basic mechanics.
You might train an AI to judge piano competitions by training it on recordings of performances paired with the corresponding scores from the judges. There would be a lot of work to do to isolate the characteristics to be weighted in the neural net, but eventually it could probably be done. You could then use it to judge random piano performances, perhaps. It might do well a lot of the time; it might give weird results for pieces, composers, or styles that were underrepresented in the training data; and at best, it will faithfully reproduce the average taste of the set of judges whose scores were used as training targets. A sketch of what that training setup looks like follows below.
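Here is a toy version of that setup, with a simple linear model standing in for the neural net. The features (tempo steadiness, dynamic range, wrong-note rate) and the numbers are all hypothetical; in reality, extracting usable features from audio is most of the work. The point is that the loop only learns to echo the judges' numbers.

```python
# Each performance is reduced to a few hand-picked features (hypothetical:
# tempo steadiness, dynamic range, wrong-note rate), and the target is the
# judges' average score for that performance. All values are invented.
performances = [
    ([0.9, 0.8, 0.02], 9.1),
    ([0.7, 0.9, 0.05], 8.4),
    ([0.5, 0.4, 0.15], 5.2),
    ([0.8, 0.6, 0.08], 7.3),
]

weights = [0.0, 0.0, 0.0]
bias = 0.0
lr = 0.05  # learning rate

# Fit the model by gradient descent: nudge the weights so predicted
# scores move toward the judges' scores.
for _ in range(2000):
    for features, score in performances:
        pred = sum(w * x for w, x in zip(weights, features)) + bias
        err = pred - score
        weights = [w - lr * err * x for w, x in zip(weights, features)]
        bias -= lr * err

# Score an unseen performance (also invented).
new_perf = [0.85, 0.7, 0.04]
print(sum(w * x for w, x in zip(weights, new_perf)) + bias)
```

Nothing in that loop knows anything about music. It learns whatever regularities connect the chosen features to the judges' scores, which is exactly why its ceiling is the judges' own taste, and why anything underrepresented in the training examples comes out wrong.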
I'm not trying to put down AI; it works very well for lots of things and does some things that would take humans forever, but it's no more accurate or objective than the data set its human programmers train it on.