Coling 2008 workshop on
human judgements in Computational Linguistics
Manchester, 23 August 2008
In connection with Coling 2008, the 22nd International Conference on Computational
Linguistics, 18-22 August 2008
Call for papers
Extended deadline for submission: 10 May
2008, 23:59 UTC
Human judgements play a key role in the development and the assessment
of linguistic resources and methods in Computational Linguistics. They
are commonly used in the creation of lexical resources and corpus
annotation, and also in the evaluation of automatic approaches to
linguistic tasks. Furthermore, systematically collected human
judgements provide clues for research on linguistic issues that
underlie the judgement task, providing insights complementary to
introspective analysis or evidence gathered from corpora.
We invite papers about experiments that collect human judgements for
Computational Linguistic purposes, with a particular focus on
linguistic tasks that are controversial from a theoretical point of
view (e.g., some coding tasks having to do with semantics or
pragmatics). Such experimental tasks are usually difficult to design
and interpret, and they typically result in mediocre inter-rater
reliability. We seek both broad methodological papers discussing these
issues, and specific case studies.
Topics of interest include, but are not limited to:
- Experimental design:
- Which types of experiments support the collection of human
judgements? Can any general guidelines be defined? Is there a
preference between lab-based experiments and web-based experiments?
- Which experimental methodologies support controversial tasks? For
instance, does underspecification help? What is the role of ambiguity
and polysemy in these tasks?
- What is the appropriate level of granularity for the category
- What kind of participants should be used (e.g., expert
vs. non-expert), how is it affected by the type of experiment, and how
should the experiment design be varied according to this issue?
- How much and which kind of information (examples, context, etc.)
should be provided to the experiment participants? When does
information turn into a bias?
- Is it possible to design experiments that are useful for both
computational linguistics and psycholinguistics? What do the two
research areas have in common? What are the differences?
- Analysis and interpretation of experimental data:
- How important is inter-annotator agreement in human judgement
collection experiments? How is it best measured for complex tasks?
- What other quantitative tools are useful for analysing human
judgement collection experiments?
- What qualitative methods are useful for analysing human judgement
collection experiments? Which questions should be asked? Is it
possible to formulate general guidelines?
- How is the analysis similar to psycholinguistic analysis? How is it
- How do results from all of the methods above affect the development
of annotation instructions and procedures?
- Application of experiment insights:
- How do the experimental data fit into the general resource-creating
- How should the set of labels and the criteria or guidelines for
the annotation task be modified according to the experimental results?
How can circularity be avoided in this process?
- How can the data be used to refine or modify existing theoretical
- More generally, under what conditions can the obtained judgements be
applied to research questions?
Ron Artstein, University of Southern California
Gemma Boleda, Universitat Politècnica de Catalunya
Frank Keller, University of Edinburgh
Sabine Schulte im Walde, Universität Stuttgart
University of Colorado
Toni Badia, Universitat Pompeu Fabra
Marco Baroni, University of Trento
Beata Beigman Klebanov, Northwestern University
André Blessing, Universität Stuttgart
Chris Brew, Ohio State University
Kevin Cohen, University of Colorado Health Sciences Center
Barbara Di Eugenio, University of Illinois at Chicago
Katrin Erk, University of Texas at Austin
Stefan Evert, University of Osnabrück
Afsaneh Fazly, University of Toronto
Alex Fraser, Universität Stuttgart
Jesus Gimenez, Universitat Politècnica de Catalunya
Roxana Girju, University of Illinois at Urbana-Champaign
Ed Hovy, University of Southern California
Nancy Ide, Vassar College
Adam Kilgarriff, University of Brighton
Alexander Koller, University of Edinburgh
Anna Korhonen, University of Cambridge
Mirella Lapata, University of Edinburgh
Diana McCarthy, University of Sussex
Alissa Melinger, University of Dundee
Paola Merlo, University of Geneva
Sebastian Padó, Stanford University
Martha Palmer, University of Colorado
Rebecca Passonneau, Columbia University
Massimo Poesio, University of Trento
Sameer Pradhan, BBN Technologies
Horacio Rodriguez, Universitat Politècnica de Catalunya
Bettina Schrader, Universität Potsdam
Suzanne Stevenson, University of Toronto
Deadline for the receipt of papers is 10 May 2008, 23:59 UTC. Submit
your paper via the submissions
Submissions should be anonymous. Please submit only PDF files, 8 pages
long (including data, tables, figures, and references). We recommend
following the Coling
2008 style guidelines. Include a one-paragraph abstract of the
entire work (about 200 words). Accepted papers will appear in an
on-line proceedings volume.
Paper submission deadline: 10 May 2008, 23:59 UTC (extended)
Notification of acceptance: 10 June 2008
Camera-ready copy due: 01 July 2008
Workshop date: 23 August 2008
created 2008-02-20, last modified 2008-03-04