Crossing Barriers in Text Summarization Research

Workshop to be held in conjunction with
*** RANLP 2005 ***
Borovets - Bulgaria
http://www.lml.bas.bg/ranlp2005
*** 24th of September 2005 ***

First Call for Papers

An abstract or summary is a text of a recognisable genre with a very specific purpose: to give the reader an exact and concise knowledge of the contents of a source document. In most cases, summaries are written by humans, but nowadays the overwhelming quantity of information and the need to access the essential content of documents accurately to satisfy users' demands have made Automatic Text Summarization a major research field.

Most summarization solutions developed today perform sentence extraction, a useful yet sometimes inadequate technique. In order to move from the sentence extraction paradigm to a more challenging, semantically and linguistically motivated 'abstracting' paradigm, significant linguistic knowledge (e.g., lexicons, grammars, etc.) as well as non-linguistic knowledge (e.g., ontologies, scripts, etc.) will be required. Some 'abstracting' problems, like 'headline generation', have recently been addressed using language models that rely on little semantic information. What are the limits of these approaches when trying to generate multi-sentence discourses? What tools are there to support 'text abstraction'? What type of natural language generation techniques are appropriate in this context? Are general purpose natural language generation systems appropriate in this task?

Professional abstractors play a major role in the dissemination of information through abstract writing, and their work has many times inspired research on automatic text summarization; they are certainly one of the keys to understanding the summarization process. Therefore, what tools are there to support Machine Assisted Summarization and, more specifically, how can these tools be used to capture 'professional summarization' knowledge?

In a multi-lingual context, summaries are useful instruments in overcoming the language barrier: cross-lingual summaries help users assess the relevance of the source before deciding to obtain a good human translation of it. This topic is particularly important in a context where the relevant information only exists in a language different from that of the user. What techniques are there to attack this new and challenging issue? What corpora would be appropriate for the study of this task?

The ``news'' has been a traditional concern of summarization research, but we have seen, in the past few years, an increasing interest in summarization applications on technical and scientific texts, patient records, sport events, legal texts, educational material, e-mails, web pages, etc. The question, then, is how to adapt summarization algorithms to new domains and genres.

Machine learning algorithms over superficial features have been used in the past to decide upon a number of indicators of content relevance. But when the feature space is huge or when more ``linguistically'' motivated features are required, and the data sparseness problem appears as a consequence, what learning tools are more appropriate for training our summarization algorithms? What types of models should be learned (e.g., macrostructures, scripts, thematic structures, etc.)?

Text summarization, information retrieval, and question answering support humans in gathering vital information in everyday activities. How can these tools be effectively integrated in practical applications, and how can such applications be evaluated in a practical context?

We call for contributions on any aspect of the summarization problem, but we would like the workshop to give the research community the opportunity to discuss the following research problems:

* Crossing the language barrier: cross-lingual summarization; corpora to support this summarization enterprise;
* Crossing the extractive barrier: non-extractive summarization (i.e., text abstraction); resources for capturing abstraction knowledge or expertise;
* Crossing genres, domains, and media: adaptation of summarization to new genres, domains, media, and tasks;
* Crossing technological barriers: integration of summarization with other NLP technologies such as Question Answering and Information Retrieval.

The workshop will be organized around paper presentations and panel discussions. It will also feature an invited speaker (to be confirmed).

Important Dates:

Deadline for submission: *** 3 June 2005 ***
Notification of acceptance: 29 July 2005
Camera-ready copy due: 19 August 2005
Workshop: 24 September 2005

Important Announcement:

If the workshop is successful, we will issue a special call for a thematically focused volume on text summarization. Workshop authors will be invited to submit extended versions of their papers for this purpose.

Submission guidelines:

Submissions should be in A4, two-column format and should not exceed seven pages, including cover page, figures, tables and references. 12-point Times New Roman is preferred. The first page should state the title of the paper, the name(s) of the author(s), affiliation, postal and email address(es), followed by keywords and an abstract, and should continue with the first section of the paper.

Papers should be submitted electronically in **PDF** format to saggion@dcs.shef.ac.uk. For up to three free conversions to PDF see http://192.150.14.203/index.pl?BP=NS. Guidelines for producing camera-ready versions can be found at the conference web site: http://www.lml.bas.bg/ranlp2005.

Each paper will be reviewed by up to three members of the program committee. Authors of accepted papers will receive guidelines regarding how to produce camera-ready versions of their papers for inclusion in the proceedings. Parallel submissions to the main conference and the workshop are allowed, but the review process will be coordinated. Please declare this in the notification form.

Organization

* Horacio Saggion
  NLP Group
  Department of Computer Science
  University of Sheffield
  Sheffield - UK

* Jean-Luc Minel
  LaLLIC
  Universite de Paris IV-Sorbonne
  Paris - France

Program Committee:

Gustavo Crispino, LaLLIC, Universite de Paris IV, France
Hercules Dalianis, Stockholm University, Sweden
Brigitte Endres-Niggemeyer, University of Applied Sciences and Arts, Germany
Donna Harman, National Institute of Standards and Technology, USA
Hongyan Jing, IBM T.J. Watson Research Center, USA
Min-Yen Kan, School of Computing, National University of Singapore, Singapore
Chua-Choi Kim, Universiti Sains, Malaysia
Guy Lapalme, Departement d'informatique et de recherche operationnelle, Universite de Montreal, Canada
Chin-Yew Lin, Information Sciences Institute, University of Southern California, USA
Inderjeet Mani, Department of Linguistics, Georgetown University, USA
Jean-Luc Minel (Co-organizer), LaLLIC, Universite de Paris IV, France
Marie-France Moens, Interdisciplinary Centre for Law & Information Technology, Katholieke Universiteit Leuven, Belgium
Constantin Orasan, School of Humanities, Languages and Social Studies, University of Wolverhampton, UK
Dragomir Radev, School of Information and Department of Electrical Engineering and Computer Science, University of Michigan, USA
Horacio Rodriguez, Department de Llenguatges i Sistemes Informatics, Universitat Politecnica de Catalunya, Spain
Horacio Saggion (Organizer), Department of Computer Science, University of Sheffield, UK
Stan Szpakowicz, School of Information Technology and Engineering, University of Ottawa, Canada
Simone Teufel, Computer Laboratory, University of Cambridge, UK
Dina Wonsever, INCO, Universidad de la Republica, Uruguay

***

Please send your submission to:

Horacio Saggion
Email: h.saggion@dcs.shef.ac.uk

Please use the subject line "Summarization Workshop/RANLP2005" and include the following information in your message:

# NAME: Name of author for correspondence
# TITLE: Title of the paper
# KEYS : Keywords
# EMAIL: Email of author for correspondence
# PAGES: Number of pages (including bibliographical references)
# FILE : Name of PDF file
# ABSTR:
# Abstract of the paper
# ...
# OTHER: Under consideration for other conferences? (please specify)
# NOTE : Anything you would like to add

***

For any further information, please contact Horacio Saggion at h.saggion@dcs.shef.ac.uk