Coling 2008 workshop on human judgements in Computational Linguistics

Manchester, 23 August 2008

In connection with Coling 2008, the 22nd International Conference on Computational Linguistics
18-22 August 2008

Call for papers

Extended deadline for submission: 10 May 2008, 23:59 UTC

Workshop Description

Human judgements play a key role in the development and the assessment of linguistic resources and methods in Computational Linguistics. They are commonly used in the creation of lexical resources and corpus annotation, and also in the evaluation of automatic approaches to linguistic tasks. Furthermore, systematically collected human judgements provide clues for research on linguistic issues that underlie the judgement task, providing insights complementary to introspective analysis or evidence gathered from corpora.

We invite papers about experiments that collect human judgements for Computational Linguistics purposes, with a particular focus on linguistic tasks that are controversial from a theoretical point of view (e.g., some coding tasks having to do with semantics or pragmatics). Such experimental tasks are usually difficult to design and interpret, and they typically result in mediocre inter-rater reliability. We seek both broad methodological papers discussing these issues and specific case studies.

Topics of interest include, but are not limited to:

Experimental design:
- Which types of experiments support the collection of human judgements? Can any general guidelines be defined? Is there a preference between lab-based experiments and web-based experiments?
- Which experimental methodologies support controversial tasks? For instance, does underspecification help? What is the role of ambiguity and polysemy in these tasks?
- What is the appropriate level of granularity for the category labels?
- What kind of participants should be used (e.g., expert vs. non-expert)? How does the type of experiment affect this choice, and how should the experimental design be varied accordingly?
- How much and which kind of information (examples, context, etc.) should be provided to the experiment participants? When does information turn into a bias?
- Is it possible to design experiments that are useful for both computational linguistics and psycholinguistics? What do the two research areas have in common? What are the differences?

Analysis and interpretation of experimental data:
- How important is inter-annotator agreement in human judgement collection experiments? How is it best measured for complex tasks?
- What other quantitative tools are useful for analysing human judgement collection experiments?
- What qualitative methods are useful for analysing human judgement collection experiments? Which questions should be asked? Is it possible to formulate general guidelines?
- How is the analysis similar to psycholinguistic analysis? How is it different?
- How do results from all of the methods above affect the development of annotation instructions and procedures?

Application of experiment insights:
- How do the experimental data fit into the general resource-creating process?
- How should the set of labels and the criteria or guidelines for the annotation task be modified according to the experimental results? How can circularity be avoided in this process?
- How can the data be used to refine or modify existing theoretical proposals?
- More generally, under what conditions can the obtained judgements be applied to research questions?

Workshop Organizers

Ron Artstein, University of Southern California
Gemma Boleda, Universitat Politècnica de Catalunya
Frank Keller, University of Edinburgh
Sabine Schulte im Walde, Universität Stuttgart

Keynote Speaker

Martha Palmer, University of Colorado

Programme Committee

Toni Badia, Universitat Pompeu Fabra
Marco Baroni, University of Trento
Beata Beigman Klebanov, Northwestern University
André Blessing, Universität Stuttgart
Chris Brew, Ohio State University
Kevin Cohen, University of Colorado Health Sciences Center
Barbara Di Eugenio, University of Illinois at Chicago
Katrin Erk, University of Texas at Austin
Stefan Evert, University of Osnabrück
Afsaneh Fazly, University of Toronto
Alex Fraser, Universität Stuttgart
Jesus Gimenez, Universitat Politècnica de Catalunya
Roxana Girju, University of Illinois at Urbana-Champaign
Ed Hovy, University of Southern California
Nancy Ide, Vassar College
Adam Kilgarriff, University of Brighton
Alexander Koller, University of Edinburgh
Anna Korhonen, University of Cambridge
Mirella Lapata, University of Edinburgh
Diana McCarthy, University of Sussex
Alissa Melinger, University of Dundee
Paola Merlo, University of Geneva
Sebastian Padó, Stanford University
Martha Palmer, University of Colorado
Rebecca Passonneau, Columbia University
Massimo Poesio, University of Trento
Sameer Pradhan, BBN Technologies
Horacio Rodriguez, Universitat Politècnica de Catalunya
Bettina Schrader, Universität Potsdam
Suzanne Stevenson, University of Toronto

Submissions

The deadline for the receipt of papers is 10 May 2008, 23:59 UTC. Submit your paper via the submissions web page.

Submissions should be anonymous. Please submit only PDF files, 8 pages long (including data, tables, figures, and references). We recommend following the Coling 2008 style guidelines. Include a one-paragraph abstract of the entire work (about 200 words). Accepted papers will appear in an on-line proceedings volume.

Important Dates

Paper submission deadline:	10 May 2008, 23:59 UTC (extended)
Notification of acceptance:	10 June 2008
Camera-ready copy due:	01 July 2008
Workshop date:	23 August 2008

created 2008-02-20, last modified 2008-03-04