![COLE](/images/cole.gif)
Interesting sites about (Natural Language) Information Retrieval and Information Extraction
We recommend Web
IR & IE, a collection of online resources for research in
the field of information retrieval and information extraction from the
web.
The following is a complementary list of sites we consider interesting
for people working on Information Retrieval, Information Extraction
and related areas. If you have any suggestions or comments, please send
them to alonso@dc.fi.udc.es.
- Books
- Information Retrieval
(HTML and PDF of the classic book by C. J. van Rijsbergen published by Butterworths, London, 1975 and 1979)
- Information Retrieval Data Structures & Algorithms
(table of contents and source code of the book edited by Bill Frakes and Ricardo Baeza-Yates and published by Prentice-Hall, 1992)
- Information Retrieval Interaction
(PDF of the book by Peter Ingwersen, published by Taylor Graham, London, 1992)
- Introduction to Information Retrieval
(slides, HTML and PDF of the book by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, published by Cambridge University Press, 2008)
- Introduction to Informetrics: Quantitative Methods in Library, Documentation and Information Science (PDF of the book by Leo Egghe and Ronald Rousseau, published by Elsevier Science, 1990)
- Managing Gigabytes: Compressing and Indexing Documents and Images (2nd edition)
(table of contents, errata and software for the book authored by Ian H. Witten, Alistair Moffat, and Timothy C. Bell, published by Morgan Kaufmann in 1999)
- Modern Information Retrieval (Second edition)
(table of contents, slides, errata and additional information about the book authored by Ricardo Baeza-Yates and Berthier Ribeiro-Neto, published by Pearson Education, 2011)
- Modern Information Retrieval
(table of contents, errata and additional information about the book authored by Ricardo Baeza-Yates and Berthier Ribeiro-Neto, published by Addison-Wesley, 1999)
- Web Data Mining
(table of contents, errata, slides and additional information about the book authored by Bing Liu, published by Springer, 2007)
- Competitions
- CLEANEVAL,
cleaning arbitrary web pages, with the goal of preparing web data
for use as a corpus, for linguistic and language technology
research and development.
- CLEF,
Cross-Language Evaluation Forum
(alternative link).
- INEX,
Initiative for the Evaluation of XML retrieval.
- EVALITA,
Evaluation of NLP Tools for Italian.
- International Challenge: Classifying Clinical Free Text Using Natural Language Processing
- NTCIR,
(NII Test Collection for IR Systems) Project Infrastructure for Research and Evaluation of Information Retrieval and Access Technologies
- Pascal Morpho Challenge (Unsupervised Morpheme Analysis, Unsupervised Segmentation of Words into Morphemes).
- Pascal Textual Entailment Challenge and Resources Pool.
- SemEval/SensEval, Evaluation Exercises for the Semantic Analysis of Text.
- Shared Task for Challenges in Natural Language Processing for Clinical Data.
- The Spock Challenge:
Entity Resolution.
- TAC, Text Analysis Conference, the successor of
DUC,
Document Understanding Conferences.
- TREC,
Text Retrieval Conferences.
- Journals
- Organizations
- ACM SIGIR,
Special Interest Group on Information Retrieval.
- Papers
- IE People
- IR People
- Alonso, Miguel A.
(Corunna, Spain).
- Baeza-Yates, Ricardo
(Santiago, Chile)
- Callan, Jamie
(Carnegie Mellon University, USA)
- Crestani, Fabio
(University of Strathclyde, UK)
- Crimmins, Francis
(Dublin, Ireland)
- Croft, W. Bruce
(University of Massachusetts, USA)
- Ingwersen, Peter
(Copenhagen, Denmark)
- Jacquemin, Christian
(Orsay, France)
- Perez-Carballo, Jose
(Rutgers, USA)
- Ribeiro-Neto, Berthier
(Minas Gerais, Brazil)
- Dominich, Sándor.
(Buckinghamshire, UK)
- Losee, Robert M.
(University of North Carolina at Chapel Hill, USA)
- Paice, Chris D.
(Lancaster, UK)
- Porter, Martin
(??, ??)
- Robertson, Stephen
(Microsoft Research Cambridge, UK)
- Salton, Gerard
(Cornell University, USA)
- Savoy, Jacques
(Neuchatel, Switzerland)
- Singhal, Amit
(Google, USA)
- Smeaton, Alan F.
(Dublin, Ireland)
- Spärck Jones, Karen
(University of Cambridge, UK)
- Strzalkowski, Tomek
(Albany, USA)
- Sutcliffe, Richard F. E.
(Limerick, Ireland)
- Tait, John
(Sunderland, UK)
- van Rijsbergen, C. J. "Keith"
(Glasgow, UK)
- Vilares, Jesús
(Corunna, Spain).
- Projects
- ITEM,
Recuperación de Información Textual en un Entorno
Multilingüe con Técnicas de Lenguaje Natural
(Textual Information Retrieval in a Multilingual Environment Using Natural Language Techniques).
- LIQUID, Language Independent Querying for Information Discovery
- Resources
- Teaching
- IE Tools
- IR Tools
- Apache Lucene.
- Bow, a
Toolkit for Statistical Language Modeling, Text Retrieval,
Classification and Clustering.
- Clairlib, The Clair Library, intended to simplify a number of generic tasks in
NLP and IR, with
additional applications to Bioinformatics and Political Science.
- The Lemur Toolkit for Language Modeling and Information Retrieval.
- List of IR systems and tools.
- List of open source search engines.
- List of search tools for web sites and intranets.
- List of software for building and editing thesauri
- List of software for Information Retrieval, by dmoz.
- Okapi, open source software, under the BSD license.
- The Porter Stemming Algorithm.
- Qanda: MITRE's Open Source Question Answering System.
- Some information retrieval tools by Michel Beigbeder.
- SMART
(ftp.cs.cornell.edu).
- SMART Retrieval System
(information about the system, in French).
- SMART Tutorial for beginners.
- Snowball, a tool for building stemmers.
- Terrier.
- Text-Garden, Text-Mining Software Tools (pre-processing, clustering, classification, feature construction/extraction, visualization, simple web mining, crawling, search engine).
- Z39.50/PRISE 2.0.
- QA Tools
Send comments and suggestions to
alonso at dc.fi.udc.es
Last modified: Wed Apr 28 19:26:23 CEST 2010