University of Pittsburgh

Dissertation Defense: Discovering Sentences for Argumentation About Meaning of Statutory Terms

PhD Candidate
Monday, April 20, 2020 - 2:00pm - 3:30pm

In this work I studied, designed, and evaluated computational methods to support interpretation of statutory terms. Understanding statutes is difficult because the abstract rules they express must account for diverse situations, even those not yet encountered. The interpretation involves an investigation of how a particular term has been referred to, explained, interpreted, or applied in the past. Going through the list of results manually is labor intensive. A response to a search query may consist of hundreds or thousands of documents. I investigated the feasibility of developing a system that would respond to a query with a list of sentences that mention the term in a way that is useful for understanding and elaborating its meaning. I treat the discovery of sentences for argumentation about the meaning of statutory terms as a special case of ad hoc document retrieval. The specifics include retrieval of short texts (sentences), specialized document types (legal case texts), and, above all, the unique definition of document relevance.

This work makes a number of contributions to the areas of legal information retrieval and legal text analytics. First, a novel task of discovering sentences for argumentation about the meaning of statutory terms is proposed. This is a task lawyers routinely perform using a combination of manual and computational approaches. Second, a data set comprising 42 queries (26,959 sentences) was assembled to support the experiments presented here. Third, by systematically assessing the performance of a number of traditional information retrieval techniques, I position this novel task in the context of a large body of work on ad hoc document retrieval. Fourth, I assembled a unique list of 129 descriptive features that model the retrieved sentences, their relationships to the terms of interest, as well as the statutory provisions they come from. I demonstrate how the proposed feature set could be utilized in learning-to-rank settings by showing how a number of machine learning algorithms learn to rank the sentences with very reasonable effectiveness. Fifth, I analyze the effectiveness of fine-tuning pre-trained language models in the context of this special task and demonstrate a very promising direction for future work.

Copyright 2009–2021 | Send feedback about this site