University of Pittsburgh

Parse Tree Fragmentation of Ungrammatical Sentences

ISP graduate student
Friday, November 13, 2015 - 1:00pm - 1:30pm

Ungrammatical sentences present challenges for statistical parsers because the well-formed trees that the parsers produce may not be appropriate for such sentences. In this talk, I present a framework for reviewing the parses of ungrammatical sentences and extracting the coherent parts whose syntactic analyses make sense. We call this task parse tree fragmentation. We propose a training methodology for fragmenting parse trees without using a task-specific annotated corpus. We also propose some fragmentation strategies and compare their performance on an extrinsic task, fluency judgment, in two domains: English as a Second Language (ESL) and machine translation (MT). Experimental results show that the proposed fragmentation strategies are competitive with existing methods for making fluency judgments; they also suggest that the overall framework is a promising way to handle syntactically unusual sentences.

