Placeholder to discuss alignment problems in AI, economic mechanisms and institutions.
There are many things to unpack here. What do we imagine we are aligning to, when our own goals are themselves a diverse evolutionary epiphenomenon? Does everything ultimately Goodhart? Is that the origin of Moloch?
Incoming
- AI Alignment: Why It’s Hard, and Where to Start
- Billionaires? Elites? Minorities? Classes? Capitalism? Socialism? It is alignment problems all the way down.
References
Aktipis, Athena. 2016. “Principles of Cooperation Across Systems: From Human Sharing to Multicellularity and Cancer.” Evolutionary Applications 9 (1): 17–36.
Bostrom, Nick. 2014. Superintelligence: Paths, Dangers, Strategies. Oxford, New York: Oxford University Press.
Daskalakis, Constantinos, Alan Deckelbaum, and Christos Tzamos. 2013. “Mechanism Design via Optimal Transport.” In, 269. ACM Press.
Ecoffet, Adrien, and Joel Lehman. 2021. “Reinforcement Learning Under Moral Uncertainty.” arXiv.
Hutson, Matthew. 2022. “Taught to the Test.” Science 376 (6593): 570–73.
Jackson, Matthew O. 2014. “Mechanism Theory.” SSRN Scholarly Paper ID 2542983. Rochester, NY: Social Science Research Network.
Manheim, David, and Scott Garrabrant. 2019. “Categorizing Variants of Goodhart’s Law.” arXiv.
Nowak, Martin A. 2006. “Five Rules for the Evolution of Cooperation.” Science 314 (5805): 1560–63.
Omohundro, Stephen M. 2008. “The Basic AI Drives.” In Proceedings of the 2008 Conference on Artificial General Intelligence 2008: Proceedings of the First AGI Conference, 483–92. NLD: IOS Press.
Ringstrom, Thomas J. 2022. “Reward Is Not Necessary: How to Create a Compositional Self-Preserving Agent for Life-Long Learning.” arXiv.
Russell, Stuart. 2019. Human Compatible: Artificial Intelligence and the Problem of Control. Penguin Books.
Silver, David, Satinder Singh, Doina Precup, and Richard S. Sutton. 2021. “Reward Is Enough.” Artificial Intelligence 299 (October): 103535.
Taylor, Jessica, Eliezer Yudkowsky, Patrick LaVictoire, and Andrew Critch. 2020. “Alignment for Advanced Machine Learning Systems.” In Ethics of Artificial Intelligence, by Jessica Taylor, Eliezer Yudkowsky, Patrick LaVictoire, and Andrew Critch, 342–82. Oxford University Press.
Xu, Ruqing, and Sarah Dean. 2023. “Decision-Aid or Controller? Steering Human Decision Makers with Algorithms.” arXiv.
Zhuang, Simon, and Dylan Hadfield-Menell. 2021. “Consequences of Misaligned AI.” arXiv.