What’s so special about speech anyway?
“They’re using phrase-structure grammar, long-distance dependencies. FLN recursion, at least four levels deep and I see no reason why it won’t go deeper with continued contact. … It doesn’t have a clue what I’m saying.”
“It doesn’t even have a clue what it’s saying back,” she added.
Peter Watts, Blindsight
For decades, Noam Chomsky and colleagues have famously developed and advocated a “minimalist” account (BTCB14) of the machinery our brains use to process language. … They propose that not much machinery is needed, and that one of its key components is a “merge” operation, which the brain uses to compose and decompose grammatical structures.
Then yesterday I was reading this introduction to embeddings in artificial neural networks and NLP, and came across the following:
“Models like [this] are powerful, but they have an unfortunate limitation: they can only have a fixed number of inputs. We can overcome this by adding an association module, A, which will take two word or phrase representations and merge them.” (Bott11)
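To make the parallel concrete, here is a minimal sketch of what such an association module could look like: a function `A` that takes two fixed-size vectors and returns one vector of the same size, so that it can be applied again to its own output. Everything specific here is my own assumption for illustration, not something taken from the quoted text or from the minimalist literature: the embedding size `DIM`, the random untrained weights, the `tanh` nonlinearity, the toy vocabulary, and the hand-written bracketing of the sentence.

```python
import math
import random

DIM = 4  # hypothetical embedding size, chosen only for illustration


def make_association_module(dim, seed=0):
    """Build a toy association module A: (vector, vector) -> vector.

    The weights are random and untrained (an assumption); a real model
    would learn them from data.
    """
    rng = random.Random(seed)
    # One weight row per output unit, acting on the concatenated pair.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(2 * dim)] for _ in range(dim)]
    bias = [0.0] * dim

    def A(left, right):
        pair = left + right  # list concatenation: length 2 * dim
        return [
            math.tanh(sum(w * x for w, x in zip(row, pair)) + b)
            for row, b in zip(weights, bias)
        ]

    return A


def encode(tree, embeddings, A):
    """Recursively merge a binary tree of words into a single vector."""
    if isinstance(tree, str):  # leaf: look up the word's embedding
        return embeddings[tree]
    left, right = tree  # internal node: merge the two sub-phrases
    return A(encode(left, embeddings, A), encode(right, embeddings, A))


# Toy vocabulary with random embeddings (hypothetical data).
rng = random.Random(1)
words = ["the", "cat", "sat", "down"]
embeddings = {w: [rng.uniform(-1, 1) for _ in range(DIM)] for w in words}

A = make_association_module(DIM)
tree = (("the", "cat"), ("sat", "down"))  # bracketing supplied by hand
phrase = encode(tree, embeddings, A)
print(len(phrase))
```

Because the output of `A` has the same shape as each of its inputs, the same module is reused at every node of the tree, and a fixed number of parameters covers phrases of any depth. That reuse is the resemblance to a “merge” operation that caught my eye; the sketch makes no claim about how either the brain or any particular published model implements it.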
Autebert, Jean-Michel, Jean Berstel, and Luc Boasson. 1997. “Context-Free Languages and Pushdown Automata.” In Handbook of Formal Languages, Vol. 1, edited by Grzegorz Rozenberg and Arto Salomaa, 111–74. New York, NY, USA: Springer-Verlag New York, Inc. http://dl.acm.org/citation.cfm?id=267846.267849.
Berstel, Jean, and Luc Boasson. 1990. “Transductions and Context-Free Languages.” In Handbook of Theoretical Computer Science, Vol. A: Algorithms and Complexity, edited by J. van Leeuwen, Albert R. Meyer, M. Nivat, Matthew Paterson, and D. Perrin, 1–278. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.162.684.
Berwick, Robert C., Kazuo Okanoya, Gabriel J. L. Beckers, and Johan J. Bolhuis. 2011. “Songs to Syntax: The Linguistics of Birdsong.” Trends in Cognitive Sciences 15 (3): 113–21. https://doi.org/10.1016/j.tics.2011.01.002.
Bolhuis, Johan J., Ian Tattersall, Noam Chomsky, and Robert C. Berwick. 2014. “How Could Language Have Evolved?” PLoS Biol 12 (8): e1001934. https://doi.org/10.1371/journal.pbio.1001934.
Bottou, Leon. 2011. “From Machine Learning to Machine Reasoning,” February. http://arxiv.org/abs/1102.1808.
Ferrer i Cancho, Ramon, and Ricard V. Solé. 2003. “Least Effort and the Origins of Scaling in Human Language.” Proceedings of the National Academy of Sciences 100 (3): 788–91. https://doi.org/10.1073/pnas.0335980100.
Christiansen, Morten H., and Nick Chater. 2008. “Language as Shaped by the Brain.” Behavioral and Brain Sciences 31: 489–509. https://doi.org/10.1017/S0140525X08004998.
Elman, Jeffrey L. 1991. “Distributed Representations, Simple Recurrent Networks, and Grammatical Structure.” Machine Learning 7: 195–225. https://doi.org/10.1007/BF00114844.
———. 1993. “Learning and Development in Neural Networks: The Importance of Starting Small.” Cognition 48: 71–99. https://doi.org/10.1016/0010-0277(93)90058-4.
———. 1995. “Language as a Dynamical System,” 195.
Elman, Jeffrey L., Elizabeth A. Bates, Mark H. Johnson, Annette Karmiloff-Smith, Domenico Parisi, and Kim Plunkett. 1997. Rethinking Innateness: A Connectionist Perspective on Development (Neural Networks and Connectionist Modeling). The MIT Press.
Greibach, Sheila A. 1966. “The Unsolvability of the Recognition of Linear Context-Free Languages.” J. ACM 13 (4): 582–87. https://doi.org/10.1145/321356.321365.
———. 1969. “An Infinite Hierarchy of Context-Free Languages.” J. ACM 16 (1): 91–106. https://doi.org/10.1145/321495.321503.
Jin, Dezhe Z. 2009. “Generating Variable Birdsong Syllable Sequences with Branching Chain Networks in Avian Premotor Nucleus HVC.” Physical Review E 80 (5): 051902. https://doi.org/10.1103/PhysRevE.80.051902.
Jin, Dezhe Z., and Alexay A. Kozhevnikov. 2011. “A Compact Statistical Model of the Song Syntax in Bengalese Finch.” PLoS Comput Biol 7 (3): e1001108. https://doi.org/10.1371/journal.pcbi.1001108.
Backus, John W. 1959. “The Syntax and Semantics of the Proposed International Algebraic Language of the Zurich ACM-GAMM Conference.” In Proceedings of the International Conference on Information Processing. Zürich: UNESCO.
Kirby, Simon. 1998. “Learning, Bottlenecks and the Evolution of Recursive Syntax.” In. http://www.lel.ed.ac.uk/~simon/Papers/Kirby/Learning,%20Bottlenecks%20and%20the%20Evolution%20of%20Recursive%20Syntax.pdf.
———. 2003. Language Evolution. Oxford University Press, USA.
Koshiba, Takeshi, Erkki Mäkinen, and Yuji Takada. 1997. “Inferring Pure Context-Free Languages from Positive Data.” Acta Cybernetica 14: 469–77.
Manning, Christopher D. 2002. “Probabilistic Syntax.” In Probabilistic Linguistics, 289–341. Cambridge, MA: MIT Press.
McClelland, James L., Matthew M. Botvinick, David C. Noelle, David C. Plaut, Timothy T. Rogers, Mark S. Seidenberg, and Linda B. Smith. 2010. “Letting Structure Emerge: Connectionist and Dynamical Systems Approaches to Cognition.” Trends in Cognitive Sciences 14 (8): 348–56. https://doi.org/10.1016/j.tics.2010.06.002.
Petersson, Karl-Magnus, Vasiliki Folia, and Peter Hagoort. 2012. “What Artificial Grammar Learning Reveals About the Neurobiology of Syntax.” Brain and Language, The Neurobiology of Syntax, 120 (2): 83–95. https://doi.org/10.1016/j.bandl.2010.08.003.
Pullum, Geoffrey K., and Gerald Gazdar. 1982. “Natural Languages and Context-Free Languages.” Linguistics and Philosophy 4 (4): 471–504.
Shieber, Stuart M. 1987. “Evidence Against the Context-Freeness of Natural Language.” In The Formal Complexity of Natural Language, edited by Walter J. Savitch, Emmon Bach, William Marsh, and Gila Safran-Naveh, 320–34. Studies in Linguistics and Philosophy 33. Springer Netherlands. http://link.springer.com/chapter/10.1007/978-94-009-3401-6_12.
Smith, Kenny, and Simon Kirby. 2008. “Cultural Evolution: Implications for Understanding the Human Language Faculty and Its Evolution.” Philosophical Transactions of the Royal Society B: Biological Sciences 363: 3591–3603. https://doi.org/10.1098/rstb.2008.0145.
Wolff, J. Gerard. 2000. “Syntax, Parsing and Production of Natural Language in a Framework of Information Compression by Multiple Alignment, Unification and Search.” Journal of Universal Computer Science 6 (8): 781–829.