Incentive alignment problems

What is your loss function?

September 22, 2014 — September 8, 2023

extended self
faster pussycat
game theory
incentive mechanisms
Figure 1

Placeholder to discuss alignment problems in AI, economic mechanisms and institutions.

Many things to unpack. What do we imagine we are aligning to, when our own goals are themselves a diverse evolutionary epiphenomenon? Does everything ultimately Goodhart? Is that the origin of Moloch?
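To make the Goodhart question concrete, here is a minimal sketch of one variant, the regressional kind in the Manheim and Garrabrant taxonomy. Everything in it is an illustrative assumption rather than anything from the cited papers: the function name `select_by_proxy`, the Gaussian true values, and the additive Gaussian noise on the proxy. The idea is only that when we select hard on a noisy proxy, the selected items look better on the proxy than they are on the thing we actually care about.

```python
import random


def select_by_proxy(n=10_000, top_k=100, noise_sd=1.0, seed=0):
    """Pick the top_k candidates by a noisy proxy; compare proxy and true value.

    Hypothetical toy model: true values are standard normal, and the proxy
    is the true value plus independent Gaussian noise.
    """
    rng = random.Random(seed)
    candidates = []
    for _ in range(n):
        true_value = rng.gauss(0.0, 1.0)
        proxy = true_value + rng.gauss(0.0, noise_sd)
        candidates.append((proxy, true_value))

    # Optimise the proxy: keep the candidates with the highest proxy score.
    selected = sorted(candidates, reverse=True)[:top_k]

    mean_proxy = sum(p for p, _ in selected) / top_k
    mean_true = sum(t for _, t in selected) / top_k
    return mean_proxy, mean_true


mean_proxy, mean_true = select_by_proxy()
print(f"selected mean proxy: {mean_proxy:.2f}")
print(f"selected mean true value: {mean_true:.2f}")
```

In this toy setting the selected candidates are still better than average on the true value, but by noticeably less than the proxy suggests. Whether real incentive schemes behave like this depends on how the proxy and the goal are actually coupled, which is the open question above.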

1 Incoming

2 References

Aktipis. 2016. “Principles of Cooperation Across Systems: From Human Sharing to Multicellularity and Cancer.” Evolutionary Applications.
Bostrom. 2014. Superintelligence: Paths, Dangers, Strategies.
Daskalakis, Deckelbaum, and Tzamos. 2013. “Mechanism Design via Optimal Transport.” In.
Ecoffet, and Lehman. 2021. “Reinforcement Learning Under Moral Uncertainty.”
Guha, Lawrence, Gailmard, et al. 2023. “AI Regulation Has Its Own Alignment Problem: The Technical and Institutional Feasibility of Disclosure, Registration, Licensing, and Auditing.” George Washington Law Review, Forthcoming.
Hutson. 2022. “Taught to the Test.” Science.
Jackson. 2014. “Mechanism Theory.” SSRN Scholarly Paper ID 2542983.
Korinek, Fellow, Balwit, et al. n.d. “Direct and Social Goals for AI Systems.”
Lambrecht, and Myers. 2017. “The Dynamics of Investment, Payout and Debt.” The Review of Financial Studies.
Manheim, and Garrabrant. 2019. “Categorizing Variants of Goodhart’s Law.”
Nowak. 2006. “Five Rules for the Evolution of Cooperation.” Science.
Omohundro. 2008. “The Basic AI Drives.” In Proceedings of the 2008 Conference on Artificial General Intelligence 2008: Proceedings of the First AGI Conference.
Ringstrom. 2022. “Reward Is Not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning.”
Russell. 2019. Human Compatible: Artificial Intelligence and the Problem of Control.
Silver, Singh, Precup, et al. 2021. “Reward Is Enough.” Artificial Intelligence.
Taylor, Yudkowsky, LaVictoire, et al. 2020. “Alignment for Advanced Machine Learning Systems.” In Ethics of Artificial Intelligence.
Xu, and Dean. 2023. “Decision-Aid or Controller? Steering Human Decision Makers with Algorithms.”
Zhuang, and Hadfield-Menell. 2021. “Consequences of Misaligned AI.”