Below, there are two general project areas, one about argumentation, another about legal informatics. Many of the topics in each area involve language analysis. Each area has a brief introduction followed by several specific project topics.
The list is not intended to be either exhaustive or exclusive. I'd be happy to discuss some of these topics in greater depth, or other topics you might like to pursue.
In a debate, participants present arguments for or against a position. For example, given the question "Should Japan abandon its nuclear power plants?", some participants may argue that Japan should while others argue it should not, each side giving their reasons along with supporting evidence. There are many sources of debate material such as product reviews on Amazon, newspaper comment sections as in the Guardian's Comment Is Free or the BBC's Have Your Say, or sites devoted to debates such as Debatepedia.
The general problem is that all this textual material is distributed, unstructured, and not machine-readable. Thus, it is hard to make overall "sense" of the topic, to identify the different "pro" and "con" positions, to automatically evaluate the arguments, to draw inferences, or to know how or where to usefully contribute. We need to process the material: annotating it, extracting from it, restructuring it, and making it machine-readable. To do so, the projects on Argumentation and Natural Language apply Natural Language Generation tools or Textual Analytic tools (e.g. GATE).
- Argument Extraction and Reconstruction: The goal is to identify textual passages which indicate argument and rhetorical structure (premises, claim, continuation) or argumentation schemes (patterns of everyday reasoning such as Expert Witness, Practical Reasoning, Commitment, etc.). The student will review some background literature, analyse a selection of argumentation schemes, identify the particular elements to be extracted using an NLP tool, create the processing components, carry out a small evaluation exercise, and connect the NLP output to a computational argumentation tool.
- Textual Entailment: Textual entailment is about taking a sentence or passage and drawing inferences from it; for example, the sentence "Bill turned off the light" implies "The light was off". There are several tools available to use and develop for textual entailment, and these tools report a fair measure of success. In this project, the student will apply the textual entailment tools to the corpus, modifying and developing the tools to improve performance, and evaluating the results against the "gold standard".
- Contrast Identification: Debates express contrasting positions on a particular topic of interest. A key problem is to determine the semantic contrariness of the positions as expressed by statements within the positions. Such a task is relatively easy for people to do, but difficult for automated identification since there are many linguistic ways to express contrasts, some of which may be synonymous. Annotation of contrast would help support semi-automatic construction of arguments and counter-arguments from text. The student will review some background literature, analyse a selection of contrasting expressions, identify the particular elements to be extracted using an NLP tool, create the processing components, carry out a small evaluation exercise, and connect the NLP output to a computational argumentation tool.
- Arguments and Questions: Formal theories of argumentation represent aspects of how people discuss topics and reason, particularly where there is unknown information or there are disagreements. One part of this is asking questions and giving answers. In this project, the student will review some background literature on argumentation, questions, and dialogues, consider the available data, then write a paper reporting her synthesis of the information and her novel(?) analysis.
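To give a rough feel for the "processing components" mentioned in the Argument Extraction topic above, here is a minimal sketch in Python that labels sentences as premise or claim by looking for discourse indicators. The indicator lists and the function name `tag_sentences` are assumptions made up for illustration; they are not taken from GATE or any other tool, and a real project would use a proper NLP pipeline and a much richer set of cues.

```python
import re

# Hypothetical indicator lists, chosen only for illustration.
PREMISE_INDICATORS = ("because", "since", "given that")
CLAIM_INDICATORS = ("therefore", "thus", "hence", "consequently")


def _contains(sentence, indicators):
    """Return True if any indicator occurs as a whole word or phrase."""
    lowered = sentence.lower()
    return any(
        re.search(r"\b" + re.escape(ind) + r"\b", lowered) for ind in indicators
    )


def tag_sentences(text):
    """Split text into sentences and label each as claim, premise, or other."""
    # Naive sentence splitter: break after ., ! or ? followed by whitespace.
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    tagged = []
    for sentence in sentences:
        if _contains(sentence, CLAIM_INDICATORS):
            label = "claim"
        elif _contains(sentence, PREMISE_INDICATORS):
            label = "premise"
        else:
            label = "other"
        tagged.append((label, sentence))
    return tagged


if __name__ == "__main__":
    sample = (
        "Nuclear plants are risky because earthquakes are common. "
        "Therefore, Japan should abandon them. "
        "The debate continues."
    )
    for label, sentence in tag_sentences(sample):
        print(label, "|", sentence)
```

Keyword matching of this kind is only a baseline: indicators such as "since" are ambiguous (temporal versus causal), which is part of why the evaluation exercise in the project matters.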
Legal informatics is about the application of concepts and tools from Computer Science, particularly from Artificial Intelligence, to legal materials. For our purposes, this covers Natural Language Processing, Logical Representation, Inference, Case Based Reasoning, and Multi-agent systems (social simulation).
- Rule Extraction from Legislation: Legislation, regulations, and regulatory guidance provide the "legal operating rules" for businesses, organisations, and individuals. It is important to be able to identify and extract such rules, particularly to keep "rule books" up to date or to feed them into a tool that transforms rules expressed in natural language into executable rules. The student will analyse a selection of regulations, identify the particular elements to be extracted using an NLP tool, create the processing components, carry out a small evaluation exercise, and experiment with tools that transform sentences to logical expressions.
- An Expert System to Support Reasoning in Juries: Jury trials are a fundamental aspect of the Common Law legal system in the UK and USA. In jury trials, jurors are members of the public who are required to reason about the facts of the case and about the legal rules to arrive at a decision (e.g. whether the defendant is guilty or innocent). This is a difficult and important task for a person who is not schooled in the law. Fortunately, in some jurisdictions, there are standardised "catalogues" of jury instructions to guide the jurors in how to reason. In this project, the student analyses a selection of jury instructions and implements them as an interactive juror decision support tool.
- Legal Case Based Reasoning: Case based reasoning is about using known, previously resolved cases to decide new, unresolved problems. Legal case based reasoning underlies the structure of legal reasoning in courts in the UK and the USA. The project will take existing formalisations of legal case based reasoning and implement them.
- Logical Formalisations of Legislation: Legislation can be formalised in a variety of ways, and there are tools to support the task. The project will examine existing tools, see what can be improved, and provide fragments of formalised legislation.
- Legal Ontologies: In an ontology, domain knowledge about entities, their properties, and their relations is formally represented and reasoned with. There are legal ontologies that represent the law, legal processes, and legal relationships. The project will examine existing legal ontologies, augment them, and build a richer ontological representation using existing tools, e.g. Protege.
- Analysis of OpenOil Contracts: OpenOil contracts are corpora of global oil and gas contracts for resource exploitation in a given tract of land. The contracts are rich with legal information and reasoning. The project involves extracting the information and reasoning.
- Analysis of Corporate Financial Reports: We have corpora of financial reports, which we can analyse for how the corporation represents its financial situation and its risk.
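As a small illustration of the Rule Extraction from Legislation topic above, the sketch below picks out a subject, a deontic modality, and an action from a single regulatory sentence using a regular expression. The modal-to-category mapping, the function name `extract_rule`, and the example sentences are all hypothetical; real regulations have far more varied syntax, so this is a toy starting point under those assumptions, not a working extractor.

```python
import re

# Hypothetical mapping from modal phrases to deontic categories.
# Longer phrases come first so "must not" is tried before "must".
MODALS = [
    ("must not", "prohibition"),
    ("shall not", "prohibition"),
    ("may not", "prohibition"),
    ("must", "obligation"),
    ("shall", "obligation"),
    ("may", "permission"),
]
MODAL_TO_CATEGORY = dict(MODALS)

# subject + modal phrase + action, with an optional final full stop.
PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+(?P<modal>"
    + "|".join(re.escape(phrase) for phrase, _ in MODALS)
    + r")\s+(?P<action>.+?)\.?$",
    re.IGNORECASE,
)


def extract_rule(sentence):
    """Return a dict describing the rule, or None if no modal is found."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        return None
    return {
        "subject": match.group("subject"),
        "modality": MODAL_TO_CATEGORY[match.group("modal").lower()],
        "action": match.group("action"),
    }


if __name__ == "__main__":
    for text in (
        "A firm must not disclose client data.",
        "The operator shall submit an annual report.",
        "The sky is blue.",
    ):
        print(extract_rule(text))
```

The structured output (subject, modality, action) is the kind of intermediate form that could then be handed to a tool that builds logical expressions.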