Subject: Alignment of AI with human values
The AI Was Fed Sloppy Code. It Turned Into Something Evil.
By Stephen Ornes
August 13, 2025
The new science of “emergent misalignment” explores how PG-13 training data — insecure code, superstitious numbers or even extreme-sports advice — can open the door to AI’s dark side.
https://www.quantamagazine.org...
“Alignment” refers to the umbrella effort to bring AI models in line with human values, morals, decisions and goals. [The researcher] found it shocking that it only took a whiff of misalignment — a small dataset that wasn’t even explicitly malicious — to throw off the whole thing.
At least so far, we seem to be able to 'align' LLMs with a 'healthy' human orientation. What concerns me is their extreme vulnerability to bad actors, of whom there's no shortage, especially online.
Tom