Publications

A
Abi-Haidar, A, Maguitman A, Kaur J, Radivojac P, Rechtsteiner A, Verspoor K, Wang Z, Rocha L.  2007.  Uncovering protein-protein interactions in the bibliome. Proceedings of the Second BioCreative Challenge Evaluation Workshop, Volume ISBN 84-933255-6. 2:247–255.

Abi-Haidar, A, Kaur J, Maguitman A, Radivojac P, Rechtsteiner A, Verspoor K, Wang Z, Rocha LM.  2008.  Uncovering protein interaction in abstracts and text using a novel linear model and word proximity networks. Genome Biology. 9(Suppl 2):S11. BioMed Central Ltd.

B
Bada, M, Eckert M, Evans D, Garcia K, Shipley K, Sitnikov D, Baumgartner WA, Cohen KB, Verspoor K, Blake JA, Hunter LE.  2012.  Concept Annotation in the CRAFT corpus. BMC Bioinformatics. 13(161).

Background
Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text.

Results
This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement.

Conclusions
As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.

C
Cavedon, L, Martinez D, Suominen H, Ananda-Rajah M, Pitson G, Verspoor K.  2013.  Roles for language technology and text mining for next-generation healthcare, 19 April. HISA Big Data, Melbourne, Australia.
Cohen, KB, Verspoor K, Johnson HL, Roeder C, Ogren PV, Baumgartner Jr WA, White E, Tipney H, Hunter L.  2009.  High-precision biological event extraction with a concept recognizer. Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing: Shared Task. :50–58. Association for Computational Linguistics.

Cohen, KB, Christiansen T, Baumgartner Jr WA, Verspoor K, Hunter LE.  2011.  Fast and simple semantic class assignment for biomedical text. ACL HLT 2011. :38.

Cohen, KB *, Verspoor K *, Johnson HL, Roeder C, Ogren PV, Baumgartner WA, White E, Tipney H, Hunter L.  2011.  High-precision biological event extraction: Effects of system and data. Computational Intelligence. 27(4):681–701.

Cohen, KB, Johnson H, Verspoor K, Roeder C, Hunter L.  2010.  The structural and content aspects of abstracts versus bodies of full text journal articles are different. BMC Bioinformatics. 11(1):492. BioMed Central Ltd.

Cohen, KB, Baumgartner Jr WA, Roeder C, Hunter LE, Verspoor K.  2010.  Test suite design for ontology concept recognition systems. Language Resources and Evaluation Conference (LREC).

Comeau, DC, Doğan RI, Ciccarese P, Cohen KB, Krallinger M, Leitner F, Lu Z, Peng Y, Rinaldi F, Torii M, Valencia A, Verspoor K, Wiegers TC, Wu CH, Wilbur WJ.  2013.  BioC: A Minimalist Approach to Interoperability for Biomedical Text Processing. Database: The Journal of Biological Databases and Curation. :bat064.
D
Dale, R, Green SJ, Milosavljevic M, Paris C, Williams S, Verspoor C.  1998.  Dynamic document delivery: Generating natural language texts on demand. Database and Expert Systems Applications, 1998. Proceedings. Ninth International Workshop on. :131–136. IEEE.

Dale, R, Green SJ, Milosavljevic M, Paris C, Williams S, Verspoor C.  1998.  The realities of generating natural language from databases. Proceedings of the 11th Australian Joint Conference on Artificial Intelligence. :13–17.

F
Ferrucci, D, Lally A, Verspoor K, Nyberg DE.  2009.  Unstructured Information Management Architecture (UIMA) Version 1.0, March 2, 2009. OASIS Technical Standard.
G
Gessler, DDG, Joslyn CA, Verspoor KM, Schmidt SE.  2006.  Deconstruction, Reconstruction, and Ontogenesis for Large, Monolithic, Legacy Ontologies in Semantic Web Service Applications. National Center for Genome Research.

Görg, C, Tipney H, Verspoor K, Baumgartner W, Cohen K, Stasko J, Hunter L.  2010.  Visualization and language processing for supporting analysis across the biomedical literature. Knowledge-Based and Intelligent Information and Engineering Systems. :420–429. Springer.

J
Jimeno Yepes, A, Verspoor K.  2013.  Towards automatic large-scale curation of genomic variation: improving coverage based on supplementary material, 20 July. Proceedings of BioLINK SIG 2013, Berlin, Germany.
Jimeno Yepes, A, Verspoor K.  2014.  Literature mining of genetic variants for curation: Quantifying the importance of supplementary material. Database: The Journal of Biological Databases and Curation. :bau003.
Joslyn, CA, Gessler DDG, Schmidt SE, Verspoor KM.  2006.  Distributed Representations of Bio-Ontologies for Semantic Web Services. Joint BioLINK and 9th Bio-Ontologies Meeting (JBB 06): 2006.

Joslyn, CA, Verspoor KM, Gessler DDG.  2007.  Knowledge Integration in OpenWorlds: Utilizing the Mathematics of Hierarchical Structure. Semantic Computing, 2007. ICSC 2007. International Conference on. :105–112. IEEE.

Joslyn, C, Gregory M, McGrath L, Paulson P, Verspoor K.  2008.  Semantic Hierarchies: Induction, Measurement, and Management.

Joslyn, C, Paulson P, Verspoor K.  2008.  Exploiting Term Relations for Semantic Hierarchy Construction. Semantic Computing, 2008 IEEE International Conference on. :42–49. IEEE.

K
Kano, Y, Bjorne J, Ginter F, Salakoski T, Buyko E, Hahn U, Cohen KB, Verspoor K, Roeder C, Hunter LE, others, Ohta T, Tsujii J.  2011.  U-Compare bio-event meta-service: compatible BioNLP event extraction services. BMC Bioinformatics. 12(1):481. BioMed Central Ltd.

Karimi, S, Verspoor K.  2013.  Proceedings of the Australasian Language Technology Association Workshop 2013 (ALTA 2013). Australasian Language Technology Association Workshop 2013 (ALTA 2013).
L
Lin, S, Verspoor K.  2008.  A semantics-enhanced language model for unsupervised word sense disambiguation. Computational Linguistics and Intelligent Text Processing. :287–298. Springer.

Lippincott, T, Rimell L, Verspoor K, Korhonen A.  2013.  Approaches to Verb Subcategorization for Biomedicine. Journal of Biomedical Informatics. 46(2):212-227.
Liu, H, Verspoor K, Comeau DC, MacKinlay A, Wilbur WJ.  2013.  Generalizing an Approximate Subgraph Matching-based System to Extract Events in Molecular Biology and Cancer Genetics, 9 August. Proceedings of the BioNLP Shared Task Workshop at the Association for Computational Linguistics 2013 meeting, Sofia, Bulgaria.
Liu, H, Christiansen T, Baumgartner Jr WA, Verspoor K.  2012.  BioLemmatizer: a lemmatization tool for morphological processing of biomedical text. Journal of Biomedical Semantics. 3(3).

Background
The wide variety of morphological variants of domain-specific technical terms contributes to the complexity of performing natural language processing of the scientific literature related to molecular biology. For morphological analysis of these texts, lemmatization has been actively applied in recent biomedical research.

Results
In this work, we developed a domain-specific lemmatization tool, BioLemmatizer, for the morphological analysis of biomedical literature. The tool focuses on the inflectional morphology of English and is based on the general English lemmatization tool MorphAdorner. The BioLemmatizer is further tailored to the biological domain through incorporation of several published lexical resources. It retrieves lemmas based on the use of a word lexicon, and defines a set of rules that transform a word to a lemma if it is not encountered in the lexicon. An innovative aspect of the BioLemmatizer is the use of a hierarchical strategy for searching the lexicon, which enables the discovery of the correct lemma even if the input Part-of-Speech information is inaccurate. The BioLemmatizer achieves an accuracy of 97.5% in lemmatizing an evaluation set prepared from the CRAFT corpus, a collection of full-text biomedical articles, and an accuracy of 97.6% on the LLL05 corpus. The contribution of the BioLemmatizer to accuracy improvement of a practical information extraction task is further demonstrated when it is used as a component in a biomedical text mining system.

Conclusions
The BioLemmatizer outperforms other tools when compared with eight existing lemmatizers. The BioLemmatizer is released as open source software and can be downloaded from http://biolemmatizer.sourceforge.net.

Liu, H, Hunter L, Keselj V, Verspoor K.  2013.  Approximate Subgraph Matching-Based Literature Mining for Biomedical Events and Relations, April. PLoS ONE. 8(4):e60954. Public Library of Science.

The biomedical text mining community has focused on developing techniques to automatically extract important relations between biological components and semantic events involving genes or proteins from literature. In this paper, we propose a novel approach for mining relations and events in the biomedical literature using approximate subgraph matching. Extraction of such knowledge is performed by searching for an approximate subgraph isomorphism between key contextual dependencies and input sentence graphs. Our approach significantly increases the chance of retrieving relations or events encoded within complex dependency contexts by introducing error tolerance into the graph matching process, while maintaining the extraction precision at a high level. When evaluated on practical tasks, it achieves a 51.12% F-score in extracting nine types of biological events on the GE task of the BioNLP-ST 2011 and an 84.22% F-score in detecting protein-residue associations. The performance is comparable to the reported systems across these tasks, and thus demonstrates the generalizability of our proposed approach.

Liu, H, Keselj V, Blouin C, Verspoor K.  2012.  Subgraph Matching-based Literature Mining for Biomedical Relations and Events, Nov 2-4, 2012. AAAI 2012 Fall Symposium on Information Retrieval and Knowledge Discovery in Biomedical Text, Arlington, VA, USA.
Livingston, K, Bada M, Hunter LE, Verspoor K.  2013.  Representing Annotation Compositionality and Provenance for the Semantic Web. Journal of Biomedical Semantics. 4:38.
Livingston, KM, Johnson HL, Verspoor K, Hunter LE.  2010.  Leveraging Gene Ontology Annotations to Improve a Memory-Based Language Understanding System. Semantic Computing (ICSC), 2010 IEEE Fourth International Conference on. :40–45. IEEE.

Lu, Z, Kao HY, Wei CH, Huang M, Liu J, Kuo CJ, Hsu CN, Tsai R, Dai HJ, Okazaki N, others, Verspoor K, Livingston K, Wilbur WJ.  2011.  The gene normalization task in BioCreative III. BMC Bioinformatics. 12(Suppl 8):S2. BioMed Central Ltd.
M
MacKinlay, AD, Verspoor K.  2012.  Extracting Structured Information from Free-Text Medication Prescriptions, October 29, 2012. ACM Sixth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBIO), Hawaii, USA.
MacKinlay, A, Verspoor K.  2013.  Information Extraction from Medication Prescriptions Within Drug Administration Data, 11 February. The 4th International Workshop on Health Document Text Mining and Information Analysis with the Focus of Cross-Language Evaluation (LOUHI), Sydney, Australia.
MacKinlay, A, Martinez D, Jimeno Yepes A, Liu H, Wilbur WJ, Verspoor K.  2013.  Extracting Biomedical Events and Modifications Using Subgraph Matching with Noisy Training Data, 9 August. Proceedings of the BioNLP Shared Task Workshop at the Association for Computational Linguistics 2013 meeting, Sofia, Bulgaria.
Maguitman, AG, Rechtsteiner A, Verspoor K, Strauss C, Rocha LM.  2006.  Large-scale testing of Bibliome informatics using Pfam protein families. Pacific Symposium on Biocomputing. 11:76–87.

Martinez, DM, MacKinlay A, Molla-Aliod D, Cavedon L, Verspoor K.  2012.  Simple similarity-based question answering strategies for biomedical text, Sept 17-20, 2012. Conference and Labs of the Evaluation Forum (CLEF), Rome, Italy.
Matykiewicz, P, Cohen KB, Holland KD, Glauser TA, Standridge SM, Verspoor KM, J P.  2013.  Earlier Identification of Epilepsy Surgery Candidates Using Natural Language Processing, 8 August. Proceedings of the BioNLP Shared Task Workshop at the Association for Computational Linguistics 2013 meeting, Sofia, Bulgaria.
N
Narasimhan, B, Di Tomaso V, Verspoor CM.  1996.  Unaccusative or Unergative? Verbs of Manner of Motion. Quaderni del Laboratorio di Linguistica. 10.

O
Ofoghi, B, Lopez Campos GH, Verspoor K, Martin Sanchez F.  2014.  BiomRKRS: A Biomarker Retrieval and Knowledge Reasoning System, 20-23 January. Proceedings of the Seventh Australasian Workshop on Health Information and Knowledge Management conference (HIKM 2013), Auckland, NZ.
R
Radivojac, P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, Mooney S, Friedberg I, et al.  2013.  A large-scale evaluation of computational protein function prediction. Nature Methods. Advance online publication. Nature Publishing Group.

Ramakrishnan, C, Baumgartner Jr WA, Blake JA, Burns GAPC, Cohen KB, Drabkin H, Eppig J, Hovy E, Hsu CN, Hunter LE, Ingulfsen T, Pokkunuri S, Onda H, Riloff E, Roeder C, Verspoor K.  2010.  Building the Scientific Knowledge Mine (SciKnowMine): a Community-driven Framework for Text Mining Tools in Direct Service to Biocuration. Malta. Language Resources and Evaluation.

Ravikumar, KE, Liu H, Cohn JD, Wall ME, Verspoor K.  2012.  Literature Mining of Protein-Residue Associations with Graph Rules Learned through Distant Supervision. Journal of Biomedical Semantics. 3(S3):S2.
Ravikumar, KE, Liu H, Cohn JD, Wall ME, Verspoor K.  2011.  Pattern Learning through Distant Supervision for Extraction of Protein-Residue Associations in the Biomedical Literature. Machine Learning and Applications and Workshops (ICMLA), 2011 10th International Conference on. 2:59–65. IEEE.

Rimell, L, Lippincott T, Verspoor K, Johnson HL, Korhonen A.  2013.  Acquisition and evaluation of verb subcategorization resources for biomedicine. Journal of Biomedical Informatics. 46(2):228-237.

Roeder, C, Jonquet C, Shah NH, Baumgartner Jr WA, Verspoor K, Hunter L.  2010.  A UIMA wrapper for the NCBO annotator. Bioinformatics. 26(14):1800–1801. Oxford Univ Press.

S
Shmanina, T, Zukerman I, Cavedon L, Jimeno Yepes A, Verspoor K.  2013.  Impact of Corpus Diversity and Complexity on NER Performance. Proceedings of the Australasian Language Technology Association Workshop 2013 (ALTA 2013). :91-95.
Sokolov, A, Funk C, Graim K, Verspoor K, Ben-Hur A.  2013.  Combining Heterogeneous Data Sources for Accurate Functional Annotation of Proteins. BMC Bioinformatics. 14(Suppl 3):S10.