Bad Actors are Grooming LLMs to Produce Falsehoods

Here’s another fun attack vector for your LLM:

GenAI powered chatbots’ lack of reasoning can directly contribute to the nefarious effects of LLM grooming: the mass-production and duplication of false narratives online with the intent of manipulating LLM outputs. […] Current models ‘know’ that Pravda is a disinformation ring, and they ‘know’ what LLM grooming is but can’t put two and two together.

This is not just theoretical – it’s already happening:

Model o3, OpenAI’s allegedly state-of-the-art ‘reasoning’ model, still let Pravda content through 28.6% of the time in response to specific prompts, and 4o cited Pravda content in five out of seven responses (71.4%).

Sigh…

Systems of naive mimicry and regurgitation, such as the AIs we have now, are soiling their own futures (and training databases) every time they unthinkingly repeat propaganda.

Link to article.

Pascal Finette @radical