Multi-source Multilingual Information Extraction and Summarization
   
        Workshop to be held in conjunction with


              *** RANLP 2007 ***

              Borovets - Bulgaria

               http://lml.bas.bg/ranlp2007/

                 *** 26th of September 2007 ***

             First Call for Papers

Information  extraction  (IE)  and  text summarization  (TS)  are  key
technologies aiming at extracting  relevant information from texts and
other sources and presenting the  information to the user in condensed
forms. Recent years have witnessed an explosion of information, making
IE and  TS particularly important  for the information  society. These
technologies, however, face new challenges with the adoption of the Web
2.0 paradigm (e.g. blogs, wikis) because of its inherent multi-source
nature. They no longer deal with isolated texts or single narratives
but with large-scale repositories, or sources --
in one or many languages -- containing a multiplicity of views,
opinions, or commentaries on particular topics, entities or events.
There  is  thus a  need  to adapt  and/or  develop  new techniques  to
deal with these new phenomena.


Recognising  similar information  across different  sources  and/or in
different languages  is of paramount importance  in this multi-source,
multi-lingual context; in particular, the ability to detect paraphrases
in texts is relevant here. In information extraction, merging
information  from  multiple sources  can  lead  to increased  accuracy
relative  to extraction  from single  sources. In  text summarization,
similar  facts  found  across  sources  can  inform  sentence  scoring
algorithms.   In question  answering, the  distribution of  answers in
similar contexts can inform answer ranking components. On occasion,
it  is  not  the  similarity  of  information  that  matters,  but  its
complementary  nature.    In  a  multi-lingual   context,  information
extraction   and  text   summarization  can   provide   solutions  for
cross-lingual access: key pieces  of information can be extracted from
different texts in one or many languages, merged, and then conveyed in
many natural  languages in concise  forms.  It is  therefore important
that the research community addresses the following issues:

** What methods are appropriate to detect
similar/complementary/contradictory information? Are hand-crafted
rules and knowledge-rich approaches suitable?

** What methods  are there to tackle  cross-document and cross-lingual
entity and event coreference?

** What machine learning approaches are most appropriate for this task:
supervised, unsupervised, or semi-supervised? What types of corpora are
required for training and testing?

** What techniques are appropriate to produce a condensed synthesis of
the extracted information? What generation techniques are useful here?
What kind of techniques can be used to cross domains and languages?

** What tools  are there to  support multi-lingual/multi-source access
to  information?  What  solutions   are  there  beyond  full  document
translation to produce cross-lingual summaries?


The objective  of the  workshop is to  bring together  researchers and
practitioners  in the  areas of  extraction, summarization,  and other
information access  technologies to discuss recent  approaches to deal
with multi-source and multi-lingual challenges.

We welcome submissions concerning the following topics:

* Multi-source information extraction
* Cross-document and cross-lingual coreference
* Opinion mining and synthesis
* Multi-lingual information extraction
* Cross-lingual summarization
* Tools to support information fusion
* Paraphrase identification and generation
* Adaptable IE-based text generation


Important Dates:

Deadline for submission: *** June 15, 2007 ***
Notification of acceptance:  July 25, 2007
Camera-ready copy due:  August 31, 2007
Workshop: September 26,  2007


Submission guidelines:

Submissions should be in A4, two-column format and should not exceed
seven pages, including cover page, figures, tables and references.
12-point Times New Roman is preferred. The first page should state the
title of the paper, the author name(s), affiliation(s), postal and
email address(es), followed by keywords and an abstract, and continue
with the first section of the paper. Guidelines for producing
camera-ready versions will be available at the conference web site.


Each paper  will be  reviewed by  up to three  members of  the program
committee.  Authors   of  accepted  papers   will  receive  guidelines
regarding  how to produce  camera-ready versions  of their  papers for
inclusion in the proceedings.
 

Organization

Thierry Poibeau (CNRS - LIPN, U. Paris 13 - France)
E-mail: Thierry.Poibeau@lipn.univ-paris13.fr

Horacio Saggion  (NLP Group, U. Sheffield - United Kingdom)
E-mail: saggion@dcs.shef.ac.uk


Program Committee:


 Sophia Ananiadou (U. Manchester, UK)

 Roberto Basili (U. Roma Tor Vergata, Italy)

 Kalina Bontcheva (U. Sheffield, UK)

 Nathalie Colineau (CSIRO, Australia)

 Nigel Collier (NII, Japan)

 Hercules Dalianis  (KTH/Stockholm University, Sweden)

 Thierry Declerck (DFKI, Germany)

 Brigitte Grau (LIMSI, France)

 Kentaro Inui (NAIST, Japan)

 Min-Yen Kan (National University of Singapore, Singapore)

 Guy Lapalme (U. Montreal, Canada)

 Diana Maynard (U. Sheffield, UK)

 Jean-Luc Minel (CNRS - Modyco, France)

 Constantin Orasan  (University of Wolverhampton, UK)

 Cecile Paris (CSIRO, Australia)

 Agnes Sandor (Xerox XRCE, France)

 Ralf Steinberger (European Commission - Joint Research Centre, Italy)

 Stan Szpakowicz (University of Ottawa, Canada)

 Lucy Vanderwende (Microsoft Research, USA)

 Jose Luis Vicedo (University of  Alicante, Spain)

 Roman Yangarber (University of Helsinki, Finland)

 Liang Zhou (ISI, USA)

 Michael Zock (LIF, France)


Paper Submission:

Details on how to submit your paper will be announced in due course.