This paper describes a computational model of the acquisition of grammatical constructions that integrates several diverse lines of research. Insights from cognitive linguistics serve as the basis for richer linguistic representations that incorporate significant conceptual knowledge, while data-driven machine learning techniques provide a statistical account of how such representations can be acquired from experience. These constraints, along with experimental data and theoretical proposals from developmental psychology, converge to provide a computational framework for modeling the crucial transition from single words to productive morphosyntax.
The target representations for learning formalize ideas drawn from constructional views of grammar [1], in which grammatical constructions are assumed to associate patterns in form (speech or text) with patterns in meaning (conceptual space). Representations of meaning draw in turn on embodied schemas capturing features of sensorimotor experience [2,3]. Grammatical constructions are thus, like lexical items, grounded in rich conceptual representations. The main complexity they introduce is that the patterns in form (e.g., word order or inflection) and meaning (e.g., frame-role bindings) are themselves relational, thus necessitating more complex structured maps between relational representations.
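As an illustration only, the pairing of relational form constraints with relational meaning constraints can be sketched as a small data structure. The class below is not the formalism used in the model: the field names, the "before" relation, the THROW-OBJECT example and its "thrown" role are all hypothetical choices made for this sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Construction:
    """Hypothetical form-meaning pairing with relational constraints."""
    name: str
    # Constituent name -> type it must instantiate (a word or a category).
    constituents: Dict[str, str]
    # Form relations between constituents, e.g. ("before", "v", "o").
    form: List[Tuple[str, str, str]] = field(default_factory=list)
    # Meaning relations (frame, role, filler): the named role of the frame
    # constituent's meaning is bound to the filler constituent's meaning.
    meaning: List[Tuple[str, str, str]] = field(default_factory=list)

    def form_holds(self, positions: Dict[str, int]) -> bool:
        """True if every 'before' relation holds for the given word positions."""
        return all(
            positions[a] < positions[b]
            for rel, a, b in self.form
            if rel == "before"
        )

    def bind_roles(self, referents: Dict[str, str]) -> Dict[str, str]:
        """Resolve each frame-role binding to the constituents' referents."""
        return {
            f"{referents[frame]}.{role}": referents[filler]
            for frame, role, filler in self.meaning
        }


# Assumed example: a lexically specific pattern for utterances like "throw ball".
THROW_OBJECT = Construction(
    name="THROW-OBJECT",
    constituents={"v": "throw", "o": "OBJECT"},
    form=[("before", "v", "o")],
    meaning=[("v", "thrown", "o")],
)
```

Under these assumptions, `THROW_OBJECT.form_holds({"v": 0, "o": 1})` checks the word-order constraint, and `THROW_OBJECT.bind_roles({"v": "Throw", "o": "Ball"})` yields the frame-role binding that the construction contributes to the meaning of the utterance.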
We present an algorithm for acquiring such constructions from utterance-situation pairs, assuming prior knowledge of ontological items and an initial set of lexical constructions. The algorithm extends Bayesian model merging [4] to handle grammatical constructions. In addition to generalizing existing constructions on the basis of similarity and co-occurrence, the algorithm can hypothesize new constructions to account for any form-meaning correlations not already predicted by analysis using the current set of constructions. The entire set of constructions is then evaluated according to the usual Bayesian trade-off between compact grammars that generalize well to new data and specific grammars that adhere closely to seen data.
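The Bayesian trade-off can be made concrete with a toy scoring function. This is a minimal sketch under strong simplifying assumptions, not the evaluation criterion of the model itself: constructions are flat tuples of slots, the prior charges a fixed cost per slot, constructions are chosen uniformly, and a category slot spreads its probability uniformly over a hypothetical ontology category OBJECT.

```python
import math
from typing import List, Tuple

# Hypothetical ontology: category name -> words it covers.
ONTOLOGY = {"OBJECT": frozenset({"ball", "block", "cup", "dog"})}

Construction = Tuple[str, ...]  # each slot is a word or an ontology category
Grammar = List[Construction]
Utterance = Tuple[str, ...]


def log_prior(grammar: Grammar, cost_per_slot: float = 1.0) -> float:
    """Compact grammars are more probable: log P(G) = -cost * total slots."""
    return -cost_per_slot * sum(len(cxn) for cxn in grammar)


def utterance_prob(grammar: Grammar, utterance: Utterance) -> float:
    """P(u | G), with a uniform choice of construction and of category filler."""
    total = 0.0
    for cxn in grammar:
        if len(cxn) != len(utterance):
            continue
        p = 1.0 / len(grammar)
        for slot, word in zip(cxn, utterance):
            if slot in ONTOLOGY:
                members = ONTOLOGY[slot]
                p *= 1.0 / len(members) if word in members else 0.0
            else:
                p *= 1.0 if slot == word else 0.0
        total += p
    return total


def log_likelihood(grammar: Grammar, data: List[Utterance]) -> float:
    """log P(D | G); minus infinity if any utterance is not covered."""
    total = 0.0
    for utterance in data:
        p = utterance_prob(grammar, utterance)
        if p == 0.0:
            return -math.inf
        total += math.log(p)
    return total


def log_posterior(grammar: Grammar, data: List[Utterance]) -> float:
    """Unnormalized log posterior: log P(G) + log P(D | G)."""
    return log_prior(grammar) + log_likelihood(grammar, data)


SPECIFIC: Grammar = [("throw", "ball"), ("throw", "block")]
GENERAL: Grammar = [("throw", "OBJECT")]  # result of merging the two above
SMALL: List[Utterance] = [("throw", "ball"), ("throw", "block")]
LARGE: List[Utterance] = SMALL * 5
```

In this toy setting the merged grammar `GENERAL` scores higher than `SPECIFIC` on `SMALL`, because its shorter description outweighs the probability it spreads over unseen objects. On `LARGE`, which repeats only the two observed objects, the accumulated likelihood cost reverses the preference and `SPECIFIC` scores higher.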
When applied to input data based on child-directed speech, the model exhibits several properties consistent with studies of child language acquisition. In line with usage-based theories of language [5,6], learning is sensitive to both the statistical properties of the input and the processing demands of comprehension. Moreover, the learning algorithm produces a developmental course in which lexically specific patterns like Tomasello's verb island constructions [7] are acquired before more general patterns. Finally, although preliminary results focus on the acquisition of English word order, the model is designed to allow the same representations and algorithms to produce markedly different kinds of constructions and courses of acquisition when applied to data from typologically different languages [8].
References
[1] A. Goldberg (1995). Constructions: A Construction Grammar Approach to Argument Structure. University of Chicago Press.
[2] T. Regier (1996). The Human Semantic Potential. MIT Press.
[3] D. Bailey, J. Feldman, S. Narayanan and G. Lakoff (1997). Modeling Embodied Lexical Development. In Proc. 19th Cognitive Science Society Conference.
[4] A. Stolcke and S. Omohundro (1994). Inducing probabilistic grammars by Bayesian model merging. In Proc. Second International Colloquium on Grammatical Inference (ICGI-94). Springer-Verlag, pp. 106-118.
[5] R. Langacker (1987). Foundations of Cognitive Grammar, Vol. 1. Stanford University Press.
[6] M. Tomasello (2000). A usage-based approach to child language acquisition. In Proc. 26th Annual Meeting of the Berkeley Linguistics Society.
[7] M. Tomasello (1992). First Verbs: A Case Study of Early Grammatical Development. Cambridge University Press.
[8] D. Slobin (1985). Crosslinguistic evidence for the language-making capacity. In D. Slobin (ed.), The Crosslinguistic Study of Language Acquisition, Volume 2, Chapter 15. Lawrence Erlbaum Associates.