First call for papers for a COLING/ACL 2006 Workshop on
                              
    How can Computational Linguistics improve Information
                         Retrieval?

Organisers:    John Tait, University of Sunderland, UK
          Michael Oakes, University of Sunderland, UK

It is striking how rarely techniques from computational
linguistics have been demonstrated to be helpful in
performing the conventional Information Retrieval (IR) task.
By the conventional IR task we mean a search in which a
short query, often in the form of a list of keywords, is
provided, with the desired result being a list of documents
ranked in terms of their relevance to the need underlying
the query. This, of course, is the IR task as encapsulated
by internet search engines. Although there have been one or
two examples where techniques like Word Sense Disambiguation
or deeper syntactic or semantic analysis have been shown to
be useful for indexing documents in large scale classic
Information Retrieval experiments (for example Strzalkowski
and colleagues at TREC-2, Pirkola and Jarvelin's 1996 IP&M
paper or Stokoe, Oakes and Tait in SIGIR 2003), information
retrieval techniques using ever more sophisticated
statistical models (which have demonstrated a 40%
improvement in effectiveness since TREC began in 1992) have
almost always outperformed approaches which are more
linguistically motivated.

Of course in some more specialised tasks, especially
question answering and summarising, techniques from
computational linguistics have proven their worth: but even
here the best performing systems frequently combine
statistical techniques with more linguistically motivated
ones.

The workshop will explore why this is the case, and to what
extent more appropriate and better performing computational
linguistic techniques can improve the performance of text
information retrieval systems.

In particular we are calling for position and discussion
papers on the following topics:
  o    Is the conventional information retrieval task
       formulated in a way which prevents or obstructs
       computational linguistics from contributing?
  o    Does statistical information retrieval in fact capture
       the relevant properties of language, but in a form
       which is inaccessible or hidden?
  o    Are assumptions made in computational linguistics about
       the nature of lexical semantics and the structural
       properties of well formed running text in some way ill
       founded, at least for the information retrieval task?
  o    Is there some property of language (for example
       semantic redundancy) which means that the relatively
       crude statistical techniques capture enough information
       to obtain the available improvements in performance?
  o    Is the problem that computational linguistic techniques
       are too unreliable or narrowly applicable, so improved
       performance on some documents or queries is masked by
       worse performance on others?
Papers will also be accepted on closely related topics.

A major outcome of the day will be a research agenda for
increased contribution to information retrieval from
computational linguistics and an enhanced dialogue between
the two disciplines, following up on the Electra workshop
held at SIGIR 2005.

It is also hoped to produce a journal special issue or a
book based on selected and extended workshop submissions.


Paper Submission
Submissions should follow the two-column format of ACL
proceedings and should not exceed eight (8) pages, including
references. We strongly recommend the use of the LaTeX style
files or Microsoft Word document template that will be made
available on the COLING-ACL main conference Web site
(http://www.acl2006.mq.edu.au/).

As reviewing will be blind, the paper should not include the
authors' names and affiliations. Furthermore, self-references
that reveal the author's identity, e.g., "We previously
showed (Smith, 1991) ...", should be avoided. Instead, use
citations such as "Smith previously showed (Smith, 1991)
...".

Submission will be electronic, using the START paper
submission system, and papers must be in Adobe PDF format.
Papers must be submitted no later than March 24, 2006.
Papers submitted after that time will not be reviewed. For
details of the submission procedure, please consult the
submission webpage reachable via the workshop website.


Outline Program
09:00 Opening and scene setting
09:30 Invited talk - Jaime Callan, CMU (tbc)
10:15 Submitted Papers
11:00 Morning Tea
11:15 Submitted Papers
12:30 Lunch
13:30 Submitted Papers
15:00 Afternoon Tea
15:15 Submitted Papers
16:00 Discussion Panel
17:00 Close

Programme Committee
John Tait, University of Sunderland, UK (Chair)
Michael Oakes, University of Sunderland, UK (Co-Chair)
Branimir Boguraev, IBM, USA
Bruce Croft, UMass Amherst, USA
Gaël Dias, University of Beira Interior, Portugal
Hang Cui, National University of Singapore
Noriko Kando, NII, Japan
Rob Gaizauskas, University of Sheffield, UK
Mark Sanderson, University of Sheffield, UK
Alexander Gelbukh, National Polytechnic Institute, Mexico
Tomek Strzalkowski, University at Albany, USA
Karen Sparck Jones, University of Cambridge, UK
Rosie Jones, Yahoo, USA
Liz Liddy, Syracuse University, USA
Lucia Rino, UFSCAR, Brazil
Chris Stokoe, University of Sunderland, UK
Simone Teufel, University of Cambridge, UK
Olga Vechtomova, University of Waterloo, Canada
Mirella Lapata, University of Edinburgh, UK
Stephen Clark, University of Oxford, UK

Key Dates

Deadline for Submission: 24 March 2006
Decisions to Authors: 8 May 2006
Final Copy of Accepted Papers: Friday 19 May 2006
Workshop: Sunday 23 July 2006


Workshop Contact Details
http://www.cet.sunderland.ac.uk/cliir/
cliir@sunderland.ac.uk