Lun Wang leaves Google DeepMind, calls LLM evaluation unsolved problem
NewsBytes | May 19, 2026 5:39 PM CST
Lun Wang proposes 'self-evolving evals'
In his blog post, Wang explained that current tests work for today's models but fail once AI systems start showing new abilities or hiding their weaknesses.
He suggested creating "self-evolving evals": tests that adapt as AI systems become more advanced.
Without this upgrade, he warns, we could make bad decisions about training and safety.
Wang's message is clear: if we want responsible AI progress, our evaluation tools need to level up too.