Ehud Reiter's Research Summary, Including Major Publications by Topic

NOTE: This is a bit dated. See my blog for more up-to-date information.

Overview

My core interest is building computer systems that use English (or other human languages) to communicate data, information, and knowledge to people (Natural Language Generation). For example, generating textual weather forecasts from numerical weather data (SumTime); generating textual summaries of assessment results (SkillSum); and generating textual summaries of medical data (BabyTalk).

From a theoretical perspective, I am very interested in words. Given some non-linguistic input data to communicate, how can an NLG system choose appropriate words to convey this information? How can word meaning be represented in terms of actual real-world input data (not logical forms), and how is the word selection process influenced by context and other pragmatic factors? How much do people differ in their use and interpretation of words, and how can NLG systems make appropriate choices for different individual users?

From an applied perspective, I am most interested in using NLG to generate textual summaries of numeric and other non-linguistic data; the world is drowning in data, and while data visualisation techniques work well in many cases, in some situations words are the most effective way of summarising a data set.

From a social perspective, I'm also interested in trying to use technology to help people who have problems. For example, helping people stop smoking, helping people with poor literacy understand their problems, helping children with learning difficulties communicate better, and helping parents of sick babies understand their baby's medical status. I like to think that Computer Science is about helping the disadvantaged as well as "wealth creation".

I work on these topics as part of the Aberdeen NLG Group. I collaborate with several hospitals and also with a number of companies, including Aerospace and Marine International, Cambridge Training and Development, and Clevermed. My professional activities include a special issue of AI Journal on Connecting Language to the World, and the Memories for Life Grand Challenge for Computer Science. I am also involved in a spin-out company, Data2Text, which is trying to develop real-world systems that use NLG and data-to-text technology.


Natural Language Generation

My main interest is Natural Language Generation (NLG), that is, software systems that generate written English texts using artificial intelligence and computational linguistics techniques.

I am sometimes asked about introductory material on NLG. My book (Reiter and Dale, 2000) gives an overview of NLG tasks, algorithms, representations, and system architecture; a shorter version, which can be downloaded, is (Reiter and Dale, 1997). (Reiter, Sripada, Robertson 2003) discusses the knowledge needed by NLG systems; some people with AI or expert-system backgrounds have told me they found it to be a useful introduction to NLG.

Theoretical Interests: Word Choice

Perhaps my primary theoretical interest in NLG is words (lexemes). How can an NLG system choose appropriate words to convey its message, and how can it represent the necessary information about word meaning and usage? More generally, how can we 'translate' source data available on a computer system (numeric data, AI knowledge bases, or whatever) into words and other linguistic resources? This is an aspect of what is sometimes called the 'symbol grounding' problem. I believe this is one of the deepest questions of cognitive science, and it is certainly one that fascinates me.

I started working on this in my PhD thesis. In part of my PhD I looked at the specific task of choosing the words in a definite noun phrase (NP) that was intended to identify an object, such as the big red book. I argued that previous models of this task were too complex, and both computationally expensive and psychologically implausible. When my PhD was finished I joined forces with Robert Dale, who was also interested in this question, and together we suggested a new model for generating definite NP referential phrases (Dale and Reiter, 1995). This model was influential and serves as the basis for most current work in this area.
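
The model in (Dale and Reiter, 1995) is usually called the Incremental Algorithm. As a rough illustration, its core idea can be sketched in a few lines of Python. This is a simplified sketch under stated assumptions, not the published algorithm in full: it omits the special treatment of the head noun and of basic-level values, and the function name, attribute names, and example objects are all hypothetical.

```python
from typing import Dict, List, Optional

Entity = Dict[str, str]


def incremental_reference(
    target: Entity,
    distractors: List[Entity],
    preferred_attributes: List[str],
) -> Optional[Dict[str, str]]:
    """Choose attribute values that distinguish target from distractors.

    Attributes are tried in a fixed preference order. An attribute is kept
    only if it rules out at least one remaining distractor, and a choice is
    never undone later (no backtracking, no search for the shortest phrase).
    Returns None if the target cannot be distinguished.
    """
    description: Dict[str, str] = {}
    remaining = list(distractors)
    for attribute in preferred_attributes:
        if not remaining:
            break
        value = target.get(attribute)
        if value is None:
            continue
        still_confusable = [d for d in remaining if d.get(attribute) == value]
        if len(still_confusable) < len(remaining):
            description[attribute] = value
            remaining = still_confusable
    return description if not remaining else None


# Hypothetical scene: pick out the big red book.
target = {"type": "book", "colour": "red", "size": "big"}
distractors = [
    {"type": "book", "colour": "blue", "size": "big"},
    {"type": "book", "colour": "red", "size": "small"},
    {"type": "cup", "colour": "red", "size": "big"},
]
print(incremental_reference(target, distractors, ["type", "colour", "size"]))
# {'type': 'book', 'colour': 'red', 'size': 'big'}
```

Because each attribute is considered once and never retracted, the cost grows with the number of attributes and distractors rather than with the number of possible attribute subsets, which is the sense in which the approach is computationally cheap.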

The other strand of my PhD research looked at the question of how to choose words to communicate general information from an AI knowledge base, taking into account complexities such as user inferential abilities and overall communicative goal (Reiter, 1991). Unfortunately, in retrospect this work, like a lot of other AI work done in the 1980s, suffered because the model was never evaluated or indeed even implemented robustly. I recently guest-edited a special issue of Artificial Intelligence on Connecting Language to the World; the introduction to this special issue (Roy and Reiter, 2005) gives an overview of the issues and recent research in choosing words to communicate non-linguistic data.

After 2000 I focused more on how word choice (and other linguistic decisions) varies amongst different people. This arose from our finding in the SumTime project that individual differences are significant: different people use different words, and may use the same word to mean different things (Reiter and Sripada, 2002). Based on this analysis, we built SumTime so that it avoided "controversial" words which were only used by a few people, or which were interpreted differently by different people. A user evaluation showed that readers preferred texts produced by SumTime over texts written by humans, in part because they preferred SumTime's word choices (Reiter et al, 2005). This may be the first time that an NLG system has been shown to produce better-than-human texts.
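
One part of this idea, avoiding words that only a few writers use, can be illustrated with a small sketch. The code below is not SumTime's implementation; the function name, the 50% threshold, and the sample data are assumptions made up for illustration, and it does not address the second problem (words that different readers interpret differently).

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def widely_used_words(
    usages: Iterable[Tuple[str, str]], min_share: float = 0.5
) -> Set[str]:
    """Return the words used by at least min_share of all writers.

    usages is an iterable of (writer, word) pairs taken from a corpus.
    """
    writers_by_word: Dict[str, Set[str]] = defaultdict(set)
    all_writers: Set[str] = set()
    for writer, word in usages:
        writers_by_word[word].add(writer)
        all_writers.add(writer)
    if not all_writers:
        return set()
    return {
        word
        for word, writers in writers_by_word.items()
        if len(writers) / len(all_writers) >= min_share
    }


# Made-up corpus: three writers, two candidate verbs.
sample_usages = [
    ("writer_a", "decreasing"),
    ("writer_b", "decreasing"),
    ("writer_c", "decreasing"),
    ("writer_a", "easing"),
]
print(widely_used_words(sample_usages))  # {'decreasing'}
```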

In the SkillSum project we looked specifically at the problem of generating appropriate texts (including choosing appropriate words) for people with limited reading skills (Williams and Reiter, 2008). This was based on a model of what is appropriate for poor readers as a group. This model was only partially successful, because poor readers are a very diverse and heterogeneous group; we probably would have done better if we had based the system on models of individual poor readers, not poor readers as a group (but it is not clear how such individual models could be acquired).

More recently I have begun working on the problems of generating good (non-fictional) narratives (Reiter et al, 2008), and on affective issues in NLG (for example, how to present information about sick babies to parents in a manner which minimises unnecessary stress). This work is still at an early stage.

References

R Dale and E Reiter (1995). Computational Interpretations of the Gricean Maxims in the Generation of Referring Expressions. Cognitive Science 19:233-263 (PDF).

E Reiter (1991). A New Model of Lexical Choice for Nouns. Computational Intelligence 7:240-251.

E Reiter, A Gatt, F Portet, M van der Meulen (2008). The Importance of Narrative and Other Lessons from an Evaluation of an NLG System that Summarises Clinical Data. In Proceedings of INLG-2008, pages 147-155. (PDF)

E Reiter and S Sripada (2002). Human Variation and Lexical Choice. Computational Linguistics 28:545-553 (PDF).

E Reiter, S Sripada, J Hunter, J Yu, and I Davy (2005). Choosing Words in Computer-Generated Weather Forecasts. Artificial Intelligence 167:137-169. (PDF)

D Roy and E Reiter (2005). Connecting Language to the World. Artificial Intelligence 167:1-12. (PDF)

S Williams and E Reiter (2008). Generating basic skills reports for low-skilled readers. Natural Language Engineering 14:495-535. (PDF)


Applied Interest: Building Data-to-Text Systems

My other primary interest in NLG is system-building issues, especially for data-to-text systems, that is, systems which generate textual summaries of numeric and other non-linguistic data. I think this is a very exciting application for NLG; if NLG can help people understand data sets better, this would have a major impact on medicine, engineering, finance, and many other aspects of modern life. Recently we have shown that computer-generated medical summaries in particular can be effective decision-support aids (Portet et al, 2009).

I've built NLG systems in several data-to-text areas, including weather forecasts (Reiter et al, 2005), engineering summaries (Yu et al 2007), and educational assessments (Williams and Reiter, 2008) as well as medicine (Gatt et al, 2009; Hunter et al, 2012; Portet et al, 2009). The most successful of these is the weather forecast system, which is used commercially and has been shown to generate (at least in some contexts) texts that are better than those written by human forecasters (as mentioned above). Earlier I worked on knowledge-to-text systems in various areas, including technical documentation (Reiter, Mellish, Levine 1995) and patient information (Reiter, Robertson, and Osman 2003).

In an attempt to better understand the software engineering of NLG systems I have proposed architectures for applied NLG systems. One of my most cited papers is a 1994 workshop paper (Reiter, 1994) which argues for a simple 3-stage pipeline architecture for NLG systems. This model is now widely accepted in the field, and is the basis for most current work on NLG architecture. Although this architecture has drawbacks (for example, I myself pointed out that it is poorly suited to certain types of optimisations (Reiter, 2000)), it remains popular because of its simplicity. Recently I have tried to extend this architecture to include the data-processing side of data-to-text systems (Reiter, 2007).
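
To make the pipeline idea concrete, here is a toy sketch in Python. The stage names follow the document planning, microplanning, and realisation terminology that is common in the NLG literature; everything else (the class names, the rules for picking verbs, the input data) is hypothetical and far simpler than any real system. The point is only the shape: each stage consumes the previous stage's output and nothing flows backwards.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Message:
    """Content chosen by the document planner."""
    parameter: str
    start: float
    end: float


@dataclass
class SentenceSpec:
    """Words and structure chosen by the microplanner."""
    subject: str
    verb: str
    complement: str


def plan_document(data: Dict[str, List[float]]) -> List[Message]:
    """Stage 1: decide what to say (one message per non-empty series)."""
    return [Message(name, v[0], v[-1]) for name, v in data.items() if v]


def microplan(messages: List[Message]) -> List[SentenceSpec]:
    """Stage 2: choose words for each message."""
    specs = []
    for m in messages:
        if m.end > m.start:
            verb, complement = "rose", f"from {m.start:g} to {m.end:g}"
        elif m.end < m.start:
            verb, complement = "fell", f"from {m.start:g} to {m.end:g}"
        else:
            verb, complement = "stayed", f"at {m.end:g}"
        specs.append(SentenceSpec(m.parameter, verb, complement))
    return specs


def realise(specs: List[SentenceSpec]) -> str:
    """Stage 3: turn sentence specifications into text."""
    return " ".join(
        f"{s.subject.capitalize()} {s.verb} {s.complement}." for s in specs
    )


def generate(data: Dict[str, List[float]]) -> str:
    """Run the three stages strictly in order."""
    return realise(microplan(plan_document(data)))


print(generate({"wind speed": [10, 12, 18], "temperature": [5.0, 5.0]}))
# Wind speed rose from 10 to 18. Temperature stayed at 5.
```

The one-way flow is what makes the design simple, and it is also the source of the drawback noted above: a later stage cannot ask an earlier one to reconsider, for example to drop a message when the realised text turns out to be too long.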

In the late 1990s Robert Dale and I wrote a book (Reiter and Dale, 2000) on building NLG systems. This is the first book of its kind, and it attempts to show how all the various aspects of NLG come together when building a system. The book has a lot of flaws, and if I were writing it again I would do it differently, but that's probably inevitable considering that no one had written such a book before. The book is unfortunately not available on the Web, but we also wrote a journal paper (Reiter and Dale, 1997) which summarises the book's material; this can be downloaded (see below).

More recently I have focused on methodological issues such as knowledge acquisition and evaluation; I now believe that a good understanding of these issues is absolutely essential to system building, and perhaps even more important than architecture and algorithms. I believe that both expert-based and corpus-based KA techniques for NLG have major problems, but by using a mixture of different techniques we can surmount many of these problems (Reiter, Sripada, Robertson 2003). Recently I have been investigating using sociolinguistic techniques, such as content analysis and discourse analysis (McKinlay et al, 2010). With regard to evaluation, Anja Belz and I have looked into the validity of corpus-based evaluation techniques (eg, BLEU) in NLG; our results suggest they can be useful in some contexts, but as a supplement to human evaluation, not a replacement for it (Reiter and Belz, 2009).
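
For readers unfamiliar with corpus-based metrics, the sketch below shows clipped n-gram precision, the building block that BLEU is based on. It is deliberately not full BLEU: there is a single reference text, no brevity penalty, and no combination across n-gram sizes. The function names and example texts are hypothetical.

```python
from collections import Counter
from typing import List


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate: str, reference: str, n: int = 1) -> float:
    """Share of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as many times as it appears in the
    reference, so repeating a matching word does not inflate the score.
    """
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    total = sum(cand.values())
    if total == 0:
        return 0.0
    matched = sum(min(count, ref[gram]) for gram, count in cand.items())
    return matched / total


generated = "wind rising later"
human_written = "wind rising by evening"
print(clipped_precision(generated, human_written, n=2))  # 0.5
```

A score like this only measures surface overlap with one human text. A generated text can differ from the reference and still be preferred by readers, which is one reason to treat such metrics as a supplement to human evaluation.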

One of my long-term ambitions is to create better tools for building NLG systems; by this I mean tools which are robust, well-documented, easy to use, and have the functionality needed to build real systems (but do not support functionality which is unlikely to actually be used). A first step towards this is the simplenlg realiser, which we hope to expand to cover microplanning as well as realisation.

References

A Gatt, F Portet, E Reiter, J Hunter, S Mahamood, W Moncur, S Sripada (2009). From Data to Text in the Neonatal Intensive Care Unit: Using NLG Technology for Decision Support and Information Management. AI Communications 22:153-186. (PDF)

J Hunter, Y Freer, A Gatt, E Reiter, S Sripada, C Sykes (2012). Automatic generation of natural language nursing shift summaries in neonatal intensive care: BT-Nurse. Artificial Intelligence in Medicine 56:157-172. DOI: http://dx.doi.org/10.1016/j.artmed.2012.09.002 (PDF)

A McKinlay, C McVittie, E Reiter, Y Freer, C Sykes, R Logie (2010). Design Issues for Socially Intelligent User-Interfaces: A Qualitative Analysis of a Data-to-Text System for Summarizing Clinical Data. Methods of Information in Medicine, 49:379-387.

F Portet, E Reiter, A Gatt, J Hunter, S Sripada, Y Freer, C Sykes (2009). Automatic Generation of Textual Summaries from Neonatal Intensive Care Data. Artificial Intelligence 173:789-816. (PDF)

E Reiter (1994). Has a Consensus NL Generation Architecture Appeared, and is it Psycholinguistically Plausible? In Proc of the Seventh International Workshop on Natural Language Generation (INLGW-1994), pages 163-170. Kennebunkport, Maine, USA (PDF).

E Reiter, C Mellish, and J Levine (1995). Automatic Generation of Technical Documentation. Applied Artificial Intelligence 9:259-287. Reprinted in M. Maybury and W. Wahlster (eds), Readings in Intelligent User Interfaces (Morgan Kaufmann, 1998) (PDF).

E Reiter and R Dale (1997). Building Applied Natural-Language Generation Systems. Natural Language Engineering 3:57-87. (PDF)

E Reiter (2000). Pipelines and Size Constraints. Computational Linguistics. 26:251-259. (PDF)

E Reiter and R Dale (2000). Building Natural-Language Generation Systems. Cambridge University Press (home page).

E Reiter, S Sripada, and R Robertson (2003). Acquiring Correct Knowledge for Natural Language Generation. Journal of Artificial Intelligence Research 18:491-516. (PDF).

E Reiter (2007). An Architecture for Data-to-Text Systems. In Proceedings of ENLG-2007, pages 97-104. (PDF)

E Reiter and A Belz (2009). An Investigation into the Validity of Some Metrics for Automatically Evaluating Natural Language Generation Systems. Computational Linguistics 35:529-558. (PDF)

J Yu, E Reiter, J Hunter, C Mellish (2007). Choosing the content of textual summaries of large time-series data sets. Natural Language Engineering 13:25-49 (PDF)


Other AI and CS Interests

I strongly believe that AI needs to have a less hostile attitude towards negative results (Reiter, Robertson and Osman 2003). While researchers in more mature fields such as medicine and physics regard negative results as worthwhile findings which should be published, AI researchers almost never publish negative results. This is not a healthy attitude for a scientific field, and I hope it will change in the future. I am also concerned by the growing inward-looking nature of computational linguistics (Reiter 2007b), and believe computational linguistics would benefit from more interaction with other researchers working on language and artificial intelligence.

I have done some work in medical informatics, especially around the topic of patient information systems. This includes the STOP system which generates tailored smoking-cessation letters (Lennox et al, 2001), and an investigation of the information that patients want to share with friends and family (Moncur et al, 2010). I have also dabbled in security issues of patient records (for example, Porteous et al 2003).

I have done some work on tools to help language-impaired people communicate better (Black et al, 2012). This is partially motivated by the fact that I have an autistic son.

In general, I would like to see more CS research aimed at helping disadvantaged people. The applied side of CS research is mostly aimed at 'wealth creation', and while I have nothing against this, I would like to see CS research also being used to help people who need help. For example, when asked to suggest a "Grand Challenge" for computing, I suggested that we try to make a "universally accessible web", where NLG and other technologies are used to dynamically produce web pages suitable for individual readers, no matter what their language, disability, age, reading ability, etc. This vision has been included in the UKCRC's Memories for Life Grand Challenge for Computer Science.

References

R Black, A Waller, R Turner, E Reiter (2012). Supporting Personal Narrative for Children with Complex Communication Needs. ACM Transactions on Computer-Human Interaction 19(2), Article 15.

A Lennox, L Osman, E Reiter, R Robertson, J Friend, I McCann, D Skatun, and P Donnan (2001). The Cost-Effectiveness of Computer-Tailored and Non-Tailored Smoking Cessation Letters in General Practice: A Randomised Controlled Trial. British Medical Journal 322:1396-1400. (eBMJ archive)

W Moncur, E Reiter, J Masthoff, A Carmichael (2010). Modelling the socially intelligent communication of health information to a patient's personal social network. IEEE Transactions on Information Technology in Biomedicine 14:319-325.

T Porteous, C Bond, R Robertson, P Hannaford, and E Reiter (2003). Electronic Transfer of Prescription-Related Information: Comparing Views of Patients, GPs and Pharmacists. British Journal of General Practice 53:204-209.

E Reiter (2007b). Last Word: The Shrinking Horizons of Computational Linguistics. Computational Linguistics 33:283-287 (PDF)

E Reiter, R Robertson, and L Osman (2003). Lessons from a Failure: Generating Tailored Smoking Cessation Letters. Artificial Intelligence 144:41-58. (PDF)




Research Summary | Ehud Reiter | Computing Science | University of Aberdeen
Last updated 13 Oct 2011