We present a hybrid experience-based model of human attachment preferences, which applies a general machine learning technique. The model works by processing sentences word-by-word, maintaining a fully connected parse tree. Syntactic knowledge takes the form of a dynamic grammar automatically extracted from a treebank. The dynamic grammar is used to build incremental trees, which are partial parse trees spanning the sentence from the first word to the current word. The model is trained to select the correct next incremental tree, given a previous partial tree and the current part-of-speech tag. A recursive neural network is used, allowing supervised learning on tree structures. The features that are relevant for the task (in our case syntactic disambiguation) are adaptively learned as a by-product of exposure to the trees in the corpus, and are not stipulated separately. The model accurately reproduces a range of well-known human preferences. However, because of the distributed representation used by the network, the relevant features it discovers are not directly observable. We therefore ran experiments that systematically test the effect of experience on the network's behaviour.
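The selection step described above can be illustrated with a minimal sketch. It assumes that each candidate incremental tree has already been assigned a real-valued score, and that the scores are normalised into a probability distribution with a softmax; both the normalisation and the names `softmax` and `select_next_tree` are assumptions made for illustration, not details taken from the model itself. The recursive network that would produce the scores is not shown.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_next_tree(
    candidates: List[Tuple[str, float]],
) -> Tuple[str, Dict[str, float]]:
    """Pick the most probable candidate incremental tree.

    `candidates` pairs a label for each candidate tree with a hypothetical
    score standing in for the network's output for that candidate.
    """
    labels = [label for label, _ in candidates]
    probs = softmax([score for _, score in candidates])
    distribution = dict(zip(labels, probs))
    best = max(distribution, key=distribution.get)
    return best, distribution


# Two competing attachments for the current word (scores are made up).
best, distribution = select_next_tree(
    [("low_attachment", 2.0), ("high_attachment", 0.5)]
)
```

In this toy call, `best` is the label with the larger score, and `distribution` holds the probability assigned to each competing attachment.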
In one experiment, we began with a sample of 500 sentences from the Penn Treebank, within which 45 sentences exhibited a two-site relative clause ambiguity, with a natural 80/20 bias in favour of low attachment. A second sample was then created, in which the attachment of the critical relative clause was systematically reversed. Two networks were then trained on the samples, and the results were evaluated with respect to 60 unseen relative clause ambiguities. Mean probability estimates for relative pronoun attachment showed a significant low-attachment preference for the network trained on the natural data, and a smaller but still reliable high-attachment preference for the network trained on the high-attachment-biased data. As a comparison, the networks were then asked to attach prepositional phrases to each of the left contexts, and the results showed an equally strong low-attachment preference for both networks. This demonstrates, firstly, that the model's preferences are affected by experience, and secondly, that these preferences operate at a fairly fine-grained level. Given certain assumptions, this predicts that syntactic priming in comprehension should be observed between relative clause primes and targets, but not between relative clause primes and prepositional phrase targets.
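The construction of the two training samples can be sketched as follows. The sketch assumes that the 80/20 bias applies exactly to the 45 ambiguous sentences, giving 36 low and 9 high attachments; these counts are not reported above and are used only for illustration. The function names are hypothetical, and each ambiguous sentence is reduced to its attachment label.

```python
from typing import List


def reverse_attachments(labels: List[str]) -> List[str]:
    """Flip every relative clause attachment: low becomes high and vice versa."""
    flip = {"low": "high", "high": "low"}
    return [flip[label] for label in labels]


def low_attachment_share(labels: List[str]) -> float:
    """Proportion of items in the sample that are attached low."""
    return labels.count("low") / len(labels)


def mean_probability(estimates: List[float]) -> float:
    """Average a network's probability estimates over a set of test items."""
    return sum(estimates) / len(estimates)


# Assumed split of the 45 ambiguous sentences under an exact 80/20 bias.
natural_sample = ["low"] * 36 + ["high"] * 9
reversed_sample = reverse_attachments(natural_sample)

natural_bias = low_attachment_share(natural_sample)    # share attached low
reversed_bias = low_attachment_share(reversed_sample)  # share attached low
```

Under these assumptions the reversed sample mirrors the natural one, with the low-attachment share dropping from 80% to 20%. `mean_probability` would be applied to each network's estimates over the unseen test ambiguities.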
We will also describe other experiments that highlight relevant
features of the network and make interesting predictions for models of
attachment preference. These experiments examine the effect of limiting
the contextual information available to the model, the effect of
structural features of trees on the model's accuracy, and the effect
of reducing the number of grammar rules.