FIRST CALL FOR PAPERS

Workshop: "Mixing Approaches to MACHINE TRANSLATION"
Thursday February 14th 2008, Donostia - San Sebastián, Basque Country
http://ixa2.si.ehu.es/matmt08-2008

Background
==========

The aim of the workshop "Mixing Approaches To Machine Translation" is
to promote practical hybrid approaches to MT, combining resources and
algorithms coming from rule-based, example-based or statistical
approaches.

The boundaries between the three principal approaches to MT
(rule-based, example-based, statistical) are becoming narrower:

  (i)   Phrase-based SMT models are incorporating morphology, syntax
        and semantics into their systems.
  (ii)  Rule-based systems are using parallel corpora to enrich their
        lexicons and grammars, and to create new methods for
        disambiguation.
  (iii) Previous ASR/ALT projects have shown that an MT system can
        benefit from a simple combination of different MT approaches
        in a Rover architecture.

Data-driven Machine Translation (example-based or statistical) is
nowadays the most prevalent trend in Machine Translation research.
Translation results obtained with this approach have now reached a
high level of accuracy, especially when the target language is
English. However, these data-driven MT systems base their knowledge on
aligned bilingual corpora, and the accuracy of their output depends
heavily on the quality and the size of these corpora. Large and
reliable bilingual corpora are unavailable for many language pairs.

Workshop Programme
==================

Structure:

  1. Invited talks
  2. Programme papers
  3. Panel discussion

Keynote speakers:

  * Koehn, Philipp (University of Edinburgh, UK)
  * Ney, Hermann (Rheinisch-Westfälische Technische Hochschule, Germany)
  * Way, Andy (Dublin City University, Ireland)

Workshop topics
===============

We are particularly interested in papers describing research and
development in the following areas:

  * Comparing different approaches for developing MT
  * Methods to compare and integrate translation outputs obtained with
    different MT approaches
  * MT evaluation methods, especially those suitable for languages
    with rich morphology
  * Morphology-, syntax- or semantics-augmented SMT models
  * Research using Open Source language resources to develop hybrid MT

Paper submission
================

Papers should be written in English and be no longer than 8 pages.
Use the same file template as was used for the TMI-07 conference:
http://www.computing.dcu.ie/~away/TMI-07/guidelines.html

Papers should be sent via e-mail to i.alegria@ehu.es

All contributions will be published in the workshop proceedings.

Important Dates
===============

  * Paper submission deadline: Nov 26, 2007
  * Notification of acceptance: Jan 9, 2008
  * Camera-ready papers: Jan 20, 2008
  * Workshop: Feb 14, 2008

Programme Committee
===================

  * Iñaki Alegria (University of the Basque Country, Donostia)
  * Kutz Arrieta (Vicomtech, Donostia)
  * Núria Castell (Technical University of Catalonia, TALP, Barcelona)
  * Arantza Diaz de Ilarraza (University of the Basque Country, Donostia)
  * David Farwell (Technical University of Catalonia, TALP, Barcelona)
  * Mikel Forcada (University of Alacant, Alicante)
  * Philipp Koehn (University of Edinburgh, UK)
  * Lluis Marquez (Technical University of Catalonia, Barcelona) (Co-chair)
  * Hermann Ney (Rheinisch-Westfälische Technische Hochschule, Aachen)
  * Kepa Sarasola (University of the Basque Country, Donostia) (Co-chair)

Local organizers
================

IXA Group, University of the Basque Country:
  Alegria I., Casillas A., Díaz de Ilarraza A., Igartua J., Labaka G.,
  Lersundi M., and Sarasola K.

Elhuyar Fundazioa:
  Gurrutxaga A., Leturia I., and Saralegi X.

About the OpenMT project
========================

The main goal of the OpenMT project is the development of an Open
Source Machine Translation architecture based on hybrid models and
advanced semantic processors. These architectures will be Open Source
systems combining the three main Machine Translation frameworks
- Rule-Based MT, Statistical MT, and Example-Based MT - into hybrid
systems. The defined architectures and the results of the project will
be Open Source, which will allow fast development and adaptation of
new advanced Machine Translation systems for other languages.

We will test the functionality of this system with different
languages: English, Spanish, Catalan and Basque. Corpora are easily
available for English and Spanish, but not for the remaining
languages. While the structure of some of those languages is very
similar (Catalan and Spanish), others are very different (English and
Basque). Basque is an agglutinative language with a very rich
morphology, unlike English, Catalan and Spanish.

The highlights of the OpenMT project are:

  * The design of hybrid systems combining traditional linguistic
    rules, example-based methods and statistical methods
  * Its status as an Open Source initiative
  * The use of advanced syntactic and semantic processing

http://ixa.si.ehu.es/openmt