Crossing Barriers in Text Summarization Research

		Workshop to be held in conjunction with


			  *** RANLP 2005 ***

			  Borovets - Bulgaria

	           http://www.lml.bas.bg/ranlp2005

	             *** 24th of September 2005 ***

			 First Call for Papers

		 
An abstract or summary is a text of a recognisable genre with a very
specific purpose: to give the reader an exact and concise knowledge of
the contents of a source document. In most cases, summaries are
written by humans, but nowadays the overwhelming quantity of
information and the need to access the essential content of documents
accurately to satisfy users' demands have made Automatic Text
Summarization a major research field.

Most summarization solutions developed today perform sentence
extraction, a useful yet sometimes inadequate technique. In order to
move from the sentence extraction paradigm to a more challenging,
semantically and linguistically motivated 'abstracting' paradigm,
significant linguistic knowledge (e.g., lexicons, grammars) as well as
non-linguistic knowledge (e.g., ontologies, scripts) will be required.
Some 'abstracting' problems, such as 'headline generation', have
recently been addressed using language models that rely on little
semantic information. What are the limits of these approaches when
trying to generate multi-sentence discourses? What tools are there to
support 'text abstraction'? What type of natural language generation
techniques are appropriate in this context? Are general-purpose
natural language generation systems appropriate for this task?


Professional abstractors play a major role in the dissemination of
information through abstract writing, and their work has often
inspired research on automatic text summarization. They are certainly
one of the keys to understanding the summarization process. What
tools are there, therefore, to support Machine Assisted Summarization
and, more specifically, how can these tools be used to capture
'professional summarization' knowledge?

In  a  multi-lingual  context,  summaries are  useful  instruments  in
overcoming  the language barrier:  cross-lingual summaries  help users
assess the relevance  of the source, before deciding  to obtain a good
human translation of the  source. This topic is particularly important
in a context where the  relevant information only exists in a language
different from that  of the user. What techniques  are there to attack
this new and challenging issue?  What corpora would be appropriate for
the study of this task?
 
The 'news' has been a traditional concern of summarization research,
but we have seen, in the past few years, an increasing interest in
summarization applications for technical and scientific texts, patient
records, sports events, legal texts, educational material, e-mails, web
pages, etc. The question, then, is how to adapt summarization
algorithms to new domains and genres. Machine learning algorithms
over superficial features have been used in the past to decide upon a
number of indicators of content relevance, but when the feature space
is huge or when more 'linguistically' motivated features are
required, and the data sparseness problem appears as a consequence,
what learning tools are more appropriate for training our
summarization algorithms? What types of models should be learned
(e.g., macrostructures, scripts, thematic structures)?


Text  summarization,  information  retrieval, and  question  answering
support humans in gathering  vital information in everyday activities.
How can these tools be effectively integrated into practical
applications, and how can such applications be evaluated in a
practical context?

We call for contributions on  any aspect of the summarization problem,
but  we would  like the  workshop to  give the  research  community the
opportunity for discussion of the following research problems:


* Crossing the language  barrier: cross-lingual summarization; corpora
  to support this summarization enterprise;

* Crossing the extractive barrier: non-extractive summarization (i.e.,
  text abstraction); resources for capturing abstraction knowledge or
  expertise;

* Crossing genres, domains, and media: adaptation of summarization to
  new genres, domains, media, and tasks;

* Crossing technological  barriers: integration of  summarization with
  other NLP  technologies such  as Question Answering  and Information
  Retrieval.

The workshop will be organized around paper presentations and panel
discussions. It  will also feature an invited speaker (to be confirmed).


Important Dates:

Deadline for submission: *** 3 June 2005 *** 
Notification of acceptance: 29 July 2005 
Camera-ready copy due: 19 August 2005
Workshop: 24 September 2005


Important Announcement:


If the workshop is successful, we will issue a special call for a
thematically  focused volume on  text summarization.  Workshop authors
will be invited  to submit extended versions of  their papers for this
purpose. 


Submission guidelines:

Submissions should be in A4, two-column format and should not exceed
seven pages, including cover page, figures, tables and references.
A 12-point Times New Roman font is preferred. The first page should
state the title of the paper, the authors' name(s), affiliation, postal
and email address(es), followed by keywords and an abstract, and
continue with the first section of the paper. Papers should be
submitted electronically in **PDF** format to saggion@dcs.shef.ac.uk.
For up to three free conversions to PDF see
http://192.150.14.203/index.pl?BP=NS. Guidelines for producing
camera-ready versions can be found at the conference web site:
http://www.lml.bas.bg/ranlp2005.


Each paper  will be  reviewed by  up to three  members of  the program
committee.  Authors   of  accepted  papers   will  receive  guidelines
regarding  how to produce  camera-ready versions  of their  papers for
inclusion in the proceedings.
  

Parallel submissions to the main conference and the workshop are
allowed, but the review process will be coordinated. Please declare
any parallel submission in the notification form.

Organization:

*Horacio Saggion 
NLP Group 
Department of Computer Science 
University of Sheffield 
Sheffield - UK

*Jean-Luc Minel 
LaLLIC 
Universite de Paris IV-Sorbonne 
Paris - France


Program Committee:


Gustavo Crispino, LaLLIC, Universite de Paris IV, France 

Hercules Dalianis, Stockholm University, Sweden 

Brigitte Endres-Niggemeyer, University of Applied Sciences and Arts, 
Germany 

Donna Harman, National Institute of Standards and Technology, USA 

Hongyan Jing, IBM T.J. Watson Research Center, USA

Min-Yen Kan, School of Computing, National University of Singapore,
 Singapore 

Chua-Choi Kim, Universiti Sains, Malaysia 

Guy    Lapalme,   Departement    d'informatique   et    de   recherche
operationnelle, Universite de Montreal, Canada 

Chin-Yew Lin, Information Sciences Institute, University of Southern 
California, USA 

Inderjeet Mani, Department of Linguistics, Georgetown University, USA 

Jean-Luc Minel (Co-organizer), LaLLIC, Universite de Paris IV, France 

Marie-France  Moens, Interdisciplinary  Centre for  Law  & Information
Technology, Katholieke Universiteit Leuven, Belgium 

Constantin Orasan, School of Humanities, Languages and Social Studies, 
University of Wolverhampton, UK 

Dragomir  Radev, School  of Information  and Department  of Electrical
Engineering and Computer Science, University of Michigan, USA 

Horacio Rodriguez, Departament de Llenguatges i Sistemes Informatics, 
Universitat Politecnica de Catalunya, Spain

Horacio   Saggion  (Organizer),   Department   of  Computer   Science,
University of Sheffield, UK 

Stan Szpakowicz, School of Information Technology and Engineering, 
University of Ottawa, Canada 

Simone Teufel, Computer Laboratory, University of Cambridge, UK 

Dina Wonsever, INCO, Universidad de la Republica, Uruguay


*** Please send your submission to:


Horacio Saggion Email: h.saggion@dcs.shef.ac.uk

Please use the subject line: "Summarization Workshop/RANLP2005"
and include in your message the following information:

# NAME:  Name of author for correspondence
# TITLE: Title of the paper
# KEYS : Keywords
# EMAIL: Email of author for correspondence
# PAGES: Number of pages (including bibliographical references)
# FILE : Name of PDF file
# ABSTR:
#    Abstract of the paper
#    ...
# OTHER: Under consideration for other conferences? (please specify)
# NOTE : Anything you would like to add


*** For any further information please contact 
Horacio Saggion  at h.saggion@dcs.shef.ac.uk