Call for Workshop Papers

The Linguistic Annotation Workshop (The LAW)
A Merger of NLPXML 2007 and FLAC 2007

ACL 2007
Prague, Czech Republic
June 28-29, 2007

Linguistically annotated corpora play a major role in parsing, information extraction, question answering, machine translation and many other areas of computational linguistics, and provide an empirical testbed for theoretical linguistics research. This has led to a proliferation of annotation systems, frameworks, formats, and schemes. Recognition of the need to harmonize annotation practices and frameworks has become increasingly critical, as witnessed by numerous workshops dealing with different aspects of linguistic annotation over the past few years.

The Linguistic Annotation Workshop (The LAW) will provide the first single forum for consideration of these different aspects by merging NLPXML: Natural Language Processing and XML (http://www.ling.helsinki.fi/~gwilcock/NLPXML/) and FLAC: Frontiers in Linguistically Annotated Corpora (http://www.cs.mu.oz.au/~tim/events/frontiers2006/), which is itself a merger of Linguistically Interpreted Corpora (LINC) and Frontiers in Corpus Annotation (FCA). In total, the LAW will be the convergence of 14 previous workshops (5 NLPXML, 1 FLAC, 6 LINC and 2 FCA).

The goals of this workshop include:

(1) The exchange and propagation of research results with respect to the annotation, manipulation and exploitation of corpora, taking into account different applications and theoretical investigations in the field of language technology and research;

(2) Working towards harmonization and interoperability from the perspective of the increasingly large number of tools and frameworks that support the creation, instantiation, manipulation, querying, and exploitation of annotated resources;

(3) Working towards a consensus on all issues crucial to the advancement of the field of corpus annotation.
The workshop will include presentations of long (8-page) and short (4-page) papers, demonstrations of annotation tools, and invited presentations by "working groups", as discussed below, followed by an open discussion. Long papers should reflect work in an advanced state, but short papers may describe more preliminary work and pilot studies.

Paper topics may cover any aspect of linguistic annotation, including:

1. New and innovative annotation schemes
2. Machine learning and knowledge-based methods for automation of corpus annotation
3. Linguistic considerations for merging of annotation of distinct phenomena
4. Comparison of annotation schemes
5. Evaluation considerations for corpus annotation
6. Comparison and/or evaluation of existing annotation systems, including functionality, common/missing features, accommodation of different input/output formats and resource types (lexicons, knowledge bases, ontologies, etc.)
7. Creation, maintenance, and interactive exploration of annotation structures and annotated data
8. Representation formats/structures for merged annotations of different phenomena, and means to explore/manipulate them
9. Assessment of, and potential means to achieve, interoperability of annotation formats/frameworks among different systems as well as different tasks, frameworks, modalities, and languages

The workshop will also include a one-hour demonstration session for annotation systems and tools. Proposals for system demonstrations should follow the short paper submission format. The proposal should provide an overview of the system to be demonstrated, including functionality, supported input/output formats or structures, supported languages and modalities, etc. Accepted proposals will appear in the proceedings and are intended to provide background for the demonstration.
In addition to paper presentations and software demos, there will be a few invited "working group" presentations, each laying out the dimensions of some crucial problem facing the field of corpus annotation, particularly problems involving merging annotation and extending annotation to new languages, genres and modalities. The final list of working group topics will appear on the workshop website by February 15, 2007. Our preliminary topics include:

(a) selection of diverse or balanced corpora with few licensing restrictions for common annotation by the community (possible corpora include the "open" portion of the American National Corpus and Wikipedia XML, a freely available cleaned-up corpus derived from Wikipedia);

(b) approaches to discourse coherence, especially as resulting from different interacting annotation layers, and its applications to computational linguistics; and

(c) annotation systems/frameworks and interoperability, including the feasibility of applying a common annotation framework to various annotation types, language processing tasks, modalities, and languages, especially as it could enable the merging of annotations of diverse phenomena produced by different systems.

We will attempt to lay out clearly and precisely the assumptions on such topics held by members of the annotation community. In doing so, we hope both to (1) lay the foundations for the meaningful integration of annotation resources and (2) assess the limitations of integrated approaches.

We will also be giving an Innovative Student Annotation Award to one student presenter; please indicate if your paper is written by students or has one or more student authors. The award includes a waiver of the workshop fee for one student.

WORKSHOP WEBSITE: http://www.ling.uni-potsdam.de/acl-lab/LAW-07.html

TARGET AUDIENCE: Those interested in creating and using existing and future annotated corpora and other language resources.
This includes annotators, lexicographers, system developers and those designing NLP system evaluation tasks for the NLP community.

SUBMISSIONS

Long paper submissions should not exceed 8 pages in length, and short papers and demo descriptions should not exceed 4 pages. Format requirements will be the same as for full papers of ACL 2007. See http://ufal.mff.cuni.cz/acl2007/ for style files. For details of the submission procedure, please consult the submission webpage reachable via the workshop website.

Please indicate:

1) long paper, short paper or demonstration proposal;
2) all applicable paper categories from the following list (indicate multiple categories if appropriate): annotation frameworks and/or physical formats, annotation scheme design (on linguistic grounds), annotation tools and systems, corpus annotation, syntax, semantics, predicate-argument structure, morphology, anaphora, discourse, opinion/sentiment;
3) language(s) your work applies to, as well as those you plan to handle in the future (if your work is language independent, indicate this as well);
4) any non-standard equipment needed for your paper or demonstration.

LANGUAGE: All papers must be written and presented in English.

IMPORTANT DATES

Papers due: March 26, 2007
Acceptance/rejection notification: April 24, 2007
Final version due: May 9, 2007
Workshop dates: June 28-29, 2007

Co-Chairs:

Branimir Boguraev, IBM T. J. Watson Research Center
Nancy Ide, Vassar College
Adam Meyers, New York University
Shigeko Nariyama, University of Melbourne
Manfred Stede, University of Potsdam
Janyce Wiebe, University of Pittsburgh
Graham Wilcock, University of Helsinki

Program Committee:

David Ahn (University of Amsterdam, NL)
Lars Ahrenberg (Linköpings Universitet)
Timothy Baldwin (University of Melbourne)
Francis Bond (NICT)
Kalina Bontcheva (University of Sheffield, UK)
Paul Buitelaar (DFKI, Germany)
Jean Carletta (University of Edinburgh, UK)
Key-Sun Choi (KAIST, Korea)
Chris Cieri (Linguistic Data Consortium/University of Pennsylvania, USA)
Hamish Cunningham (University of Sheffield, UK)
David Day (MITRE Corporation, USA)
Thierry Declerck (DFKI, Germany)
Ludovic Denoyer (University of Paris)
Tomaz Erjavec (Institute Josef Stefan, Slovenia)
David Farwell (Computing Research Laboratory, New Mexico State University)
Alex Chengyu Fang (City University Hong Kong)
Chuck Fillmore (International Computer Science Institute, Berkeley)
Anette Frank (DFKI)
John Fry (SRI International)
Claire Grover (University of Edinburgh, UK)
Jan Hajic (Charles University, Czech Republic)
Ed Hovy (Information Sciences Institute)
Baden Hughes (University of Melbourne)
Emi Izumi (NICT)
Tsai Jia-Lin (Tung Nan Institute of Technology)
Aravind Joshi (University of Pennsylvania, Philadelphia)
Ewan Klein (University of Edinburgh, UK)
Mounia Lalmas (University of London, UK)
Mike Maxwell (University of Maryland)
Chieko Nakabasami (Toyo University, Japan)
Stephan Oepen (University of Oslo)
Kyonghee Paik (KLI)
Martha Palmer (University of Colorado)
Antonio Pareja-Lora (UCM, Spain)
Manfred Pinkal (DFKI)
James Pustejovsky (Brandeis University, USA)
Owen Rambow (Columbia University)
Laurent Romary (Loria/CNRS, France)
Henry Thompson (University of Edinburgh, UK)
Erik Tjong Kim Sang (University of Amsterdam, NL)
Theresa Wilson (University of Pittsburgh)
Nianwen Xue (University of Pennsylvania)

Please refer all questions to: Nancy Ide (ide@cs.vassar.edu) or Shigeko Nariyama (shigeko@unimelb.edu.au)