Utility Indifference

Engineering human-level machine intelligence will bring about a paradigm shift unlike any before it on planet Earth. Homo sapiens stands at a pivotal point in history, between maintaining its dominance as a species and surrendering to the will of a superior intelligence. This is the Control Problem: an unsolved puzzle on which the future of humanity rests. In the following, we give context to the Control Problem by explaining how unaligned AGI may arise, and we discuss a design philosophy known as corrigibility as a possible solution. We then explore Utility Indifference as a potential avenue toward creating a corrigible agent, and outline the areas in which research must still progress before utility-indifferent agents become a viable option in the pursuit of aligned artificial intelligence.

You can read the full essay here.