Learning-Relevant Properties of Natural Language Domains

Janet Dean Fodor¹ and William Gregory Sakas²
jfodor@gc.cuny.edu, sakas@hunter.cuny.edu
¹ The Graduate Center, CUNY
² Hunter College and The Graduate Center, CUNY

Until recently, research on psycho-computational models of language acquisition has focused mostly on the learning algorithm (e.g. neural networks, genetic algorithms, statistical methods, parameter triggering systems). Little attention has been given to how these algorithms respond to the particular challenges posed by natural languages. Yet Schaffer (1994) has documented that learning algorithms perform well under some input-target conditions and poorly under others. In this spirit, Sakas (2000) has demonstrated that language learning algorithms are likewise highly sensitive to variation in the characteristics of the language domain, such as the extent of cross-language ambiguity and how robustly language properties are evidenced in the input stream.

As we will illustrate, a learning model imposes certain requirements that languages must meet for learning to proceed efficiently. The extent to which a given language domain satisfies these needs can be quantified. It then becomes possible to evaluate learning models on an a priori basis. This provides a convergent methodology which can supplement empirical testing via computer simulations, and thus facilitate systematic research.

We have been developing a Learning-Relevant Properties Tool (LRP Tool) which is designed to establish learning-significant properties for any well-defined language domain. It accepts a set of grammars that define the languages in a domain, and delivers numerical values for properties of individual sentences, languages, and the domain as a whole. Given an appropriate characterization of a learning algorithm, the impact on learning performance of these characteristics can then be calculated.
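For concreteness, here is a minimal sketch (in Python) of the kind of input and output just described, under the assumption of binary parameters and a user-supplied sentence generator. The names build_domain, generate and sentence_ambiguity are hypothetical placeholders for illustration, not the actual interface of the LRP Tool.

```python
from itertools import product
from typing import Callable, Dict, FrozenSet, Iterable, Tuple

Grammar = Tuple[int, ...]    # a vector of binary parameter values
Language = FrozenSet[str]    # the set of sentence patterns the grammar licenses

def build_domain(n_params: int,
                 generate: Callable[[Grammar], Iterable[str]]) -> Dict[Grammar, Language]:
    """Enumerate every grammar over n_params binary parameters and pair it
    with the language it generates; `generate` stands in for whatever
    sentence generator the grammatical framework supplies."""
    return {g: frozenset(generate(g)) for g in product((0, 1), repeat=n_params)}

def sentence_ambiguity(sentence: str, domain: Dict[Grammar, Language]) -> int:
    """One sentence-level measure: the number of languages in the domain
    that contain the sentence (property (1f) below)."""
    return sum(1 for lang in domain.values() if sentence in lang)
```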

The properties that we have investigated include those listed in (1). For concreteness, they are characterized here in terms of a principles-and-parameters grammatical framework (Chomsky 1981, 1995), but they are definable for other paradigms as well. (The last two are properties of individual sentences; the mean and range can be calculated to give a value for a whole language.)

(1)
a. Smoothness (correlation between grammar similarity and language similarity)
b. Promptness (a measure of how soon in the input stream a parameter value is first unambiguously expressed)
c. Expression rate (number of parameters contributing to a sentence's derivation)
d. Cross-language ambiguity as the amount of overlap between languages
e. Cross-language ambiguity as the number/proportion of fully unambiguous 'trigger' sentences
f. Sentence ambiguity as the number of languages the sentence is in
g. Sentence ambiguity as the number of parameters whose value the sentence does not establish

In our presentation, these properties will be applied in an analysis of the miniature language domain defined by Gibson & Wexler (1994), extended with additional parameters (see Kohl, 1999). A toy illustration of how two of the measures in (1) can be computed is sketched below.
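The sketch hand-codes a toy two-parameter word-order domain, loosely in the spirit of Gibson & Wexler's but with invented sentence sets, and computes pairwise language overlap (1d), here operationalized as a Jaccard ratio, and the proportion of fully unambiguous trigger sentences per language (1e). It is an illustration under stated assumptions, not the LRP Tool's implementation.

```python
from itertools import combinations

# Hypothetical toy domain: grammars are pairs of binary parameter values and
# languages are small hand-listed sets of word-order patterns. These sentence
# sets are invented for illustration; they are not Gibson & Wexler's languages.
domain = {
    (0, 0): frozenset({"S V", "S V O"}),
    (0, 1): frozenset({"S V", "S O V"}),
    (1, 0): frozenset({"V S", "V S O"}),
    (1, 1): frozenset({"V S", "O V S"}),
}

def overlap(l1, l2):
    """Property (1d): cross-language ambiguity as the amount of overlap
    between two languages (here, the Jaccard ratio of shared sentences)."""
    return len(l1 & l2) / len(l1 | l2)

def trigger_proportion(grammar, domain):
    """Property (1e): the proportion of a language's sentences that occur in
    no other language of the domain (fully unambiguous 'trigger' sentences)."""
    lang = domain[grammar]
    others = set().union(*(l for g, l in domain.items() if g != grammar))
    return len(lang - others) / len(lang)

for g1, g2 in combinations(domain, 2):
    print(g1, g2, round(overlap(domain[g1], domain[g2]), 2))
for g in domain:
    print(g, trigger_proportion(g, domain))
```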



References

Chomsky, N. (1981) Lectures on Government and Binding, Foris Publications, Dordrecht.

Chomsky, N. (1995) The Minimalist Program, MIT Press, Cambridge, MA.

Gibson, E. and Wexler, K. (1994) Triggers. Linguistic Inquiry 25, 407-454.

Kohl, K. T. (1999) An Analysis of Finite Parameter Learning in Linguistic Spaces. M.A. thesis, MIT.

Sakas, W.G. (2000) Ambiguity and the Computational Feasibility of Syntax Acquisition. Ph.D. dissertation, City University of New York.

Schaffer, C. (1994) A conservation law for generalization performance. Proceedings of the Eleventh International Machine Learning Conference, 259-265. Morgan Kaufmann.



AMLaP Conference, Saarbrücken, September 2001