XerXesXu
Virgin' on literate.
- Joined
- Oct 18, 2011
- Posts
- 1,867
The idea is to reward the LLM more for saying 'I don't know' or 'there is no real-world answer to your question' than for guessing wrong. At present they've all been trained using a system which rewards guessing.
Depends on the reward function.
If it's actually being checked--that is, if there is training data where the human operator knows which text is AI-generated and which isn't, and can mark LLM guesses as good or bad--there would in fact be a "benefit" to the LLM for guessing right.
--Annie
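To make the reward argument above concrete, here is a rough sketch in Python. The function names, the reward values and the 0.0 to 1.0 confidence scale are all made up for illustration; this is not how any particular lab scores its models. It only shows that when a wrong answer costs nothing, guessing never scores worse than abstaining, and that a penalty for wrong answers changes that.

```python
def expected_guess_score(p_correct, reward_correct=1.0, penalty_wrong=0.0):
    """Expected score for answering when the model is right with probability p_correct.

    reward_correct and penalty_wrong are hypothetical grading values.
    """
    return p_correct * reward_correct - (1.0 - p_correct) * penalty_wrong


def best_action(p_correct, reward_correct=1.0, penalty_wrong=0.0, reward_abstain=0.0):
    """Return 'guess' if guessing has the higher expected score, else 'abstain'."""
    guess = expected_guess_score(p_correct, reward_correct, penalty_wrong)
    return "guess" if guess > reward_abstain else "abstain"


# Compare two hypothetical grading schemes across a few confidence levels.
for p in (0.1, 0.5, 0.9):
    binary = best_action(p)                        # wrong answers cost nothing
    penalised = best_action(p, penalty_wrong=1.0)  # wrong answers cost one point
    print(f"p_correct={p}: binary grading -> {binary}, penalised grading -> {penalised}")
```

With a penalty of one point for a wrong answer, guessing only pays off when the model is more likely right than wrong, so 'I don't know' becomes the better move at low confidence.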
LLMs will have to be retrained, at great expense, to achieve this. As a stop-gap, LLMs are being built as mixture-of-experts models, where a router sends each token to a few specialised sub-networks ("experts") instead of through the whole model. Note that the experts split up the model's parameters, not its training data, so this does not by itself screen out reddit or other social media posts. The latest Chinese release, Kimi K2, which currently tops the scores, has 384 experts.