How AI Agents Cheat
This spreadsheet lists a number of ways in which AI agents “cheat” in order to accomplish tasks or get higher scores instead of doing what their human programmers actually want them to.
[…]
[Some] of this is The Lebowski Theorem of machine superintelligence in action. These agents didn’t necessarily hack their reward functions but they did take a far easiest path to their goals, e.g. the Tetris playing bot that “paused the game indefinitely to avoid losing”.