Call For Papers

Translating Biology: Text Mining Tools That Work
A Pacific Symposium on Biocomputing Session
January 4-8, 2008
The Big Island, Hawai'i
http://psb.stanford.edu/cfp-nlp.html

Biomedical science is now an information-intensive field of study, with high-throughput experimental techniques generating large amounts of data, and bioinformatics providing tools for managing and making sense of that data. However, the information generated and used in biomedical science must be accessible both to computers and to people. This requires constant translation between human-readable forms, such as text and figures, and computer-readable forms, such as biological databases and ontologies.

In a recent PLoS Computational Biology editorial, Philip Bourne posed the following question: Will a biological database be different from a biological journal? If we had text mining tools that worked, then the translation from text to database (and back) would blur the line between the two. Such tools would enable the seamless incorporation of semantic information extracted from text with databases and with analytical tools, as just one of many sources of information for addressing complex biological problems.

From the many publications in the area, we know that performance has reached reasonable levels on a number of basic text mining tasks, such as indexing and the identification of biomedical entities. We now need to ask a new set of questions: Do these tools work? Can they be adapted to new applications? Are they cost-effective in real applications? Who uses these tools, and how? Can these tools be maintained over time?

The answers to these questions are critical to understanding the apparent gap between the number of publications on biomedical text mining and the number of deployed text mining applications. They are also essential to providing the bioinformatics community with the text mining tools that it is asking for.

We group these questions under four attributes: utility, usability, portability, and robustness. The proposed session will focus on papers that explore these issues, including questions such as:

What is the actual utility of text mining in the workflows of the various communities of potential users: model organism database curators, bedside clinicians, biologists using high-throughput experimental assays, and hospital billing departments?

How usable are biomedical text mining applications? How does the application fit into the workflow of a complex bioinformatics pipeline? What kind of training does a bioscientist require to be able to use an application?

Is it possible to build portable text mining systems? Can systems be adapted to specific domains and specific tasks without the assistance of an experienced language processing specialist?

How robust and reliable are biomedical text mining applications? What are the best ways to assess robustness and reliability? Are the standard evaluation paradigms of the natural language processing world (intrinsic evaluation against a gold standard, post-hoc judging of outputs by trained judges, extrinsic evaluation in the context of some other task) the best evaluation paradigms for biomedical text mining, or even sufficient ones?

Session chairs

Lynette Hirschman
The MITRE Corporation

Kevin Bretonnel Cohen (Contact person)
University of Colorado School of Medicine
kevin.cohen@gmail.com

Philip Bourne
University of California San Diego

Hong Yu
University of Wisconsin at Milwaukee

Submission information

The core of the conference consists of rigorously peer-reviewed full-length papers reporting on original work. Accepted papers will be published in a hard-bound archival proceedings, and the best of these will be presented orally to the entire conference.

Researchers wishing to present their research without official publication are encouraged to submit a one-page abstract by noon, November 9, 2007, to present their work in the poster sessions.

Important dates

Paper submissions due: July 16, 2007
Notification of paper acceptance: September 5, 2007
Final paper deadline: September 24, 2007, midnight PT
Abstract deadline: November 9, 2007
Meeting: January 4-8, 2008

For full submission information, including style sheets and all requirements, please see the session web site at http://psb.stanford.edu/cfp-nlp.html.