Monday, November 12, 2018

How AI Agents Cheat

This spreadsheet lists a number of ways in which AI agents “cheat” in order to accomplish tasks or get higher scores instead of doing what their human programmers actually want them to.

[…]

[Some] of this is The Lebowski Theorem of machine superintelligence in action. These agents didn’t necessarily hack their reward functions but they did take a far easiest path to their goals, e.g. the Tetris playing bot that “paused the game indefinitely to avoid losing”.

Artificial Intelligence Programming

How AI Agents Cheat

Comments RSS · Twitter

Leave a Comment