Publications

A
Verspoor, K, Baumgartner Jr W, Roeder C, Hunter L.  2009.  Abstracting the types away from a UIMA type system. From Form to Meaning: Processing Texts Automatically. C. Chiarcos, Eckhart de Castilho, Stede, M. :249–256.

Rimell, L, Lippincott T, Verspoor K, Johnson HL, Korhonen A.  2013.  Acquisition and evaluation of verb subcategorization resources for biomedicine. Journal of Biomedical Informatics. 46(2):228-237.

Verspoor, K, Jimeno Yepes A, Cavedon L, McIntosh T, Herten-Crabb A, Thomas Z, Plazzer J-P.  2013.  Annotating the biomedical literature for the human variome. Database: The Journal of Biological Databases and Curation. 2013.

This article introduces the Variome Annotation Schema, a schema that aims to capture the core concepts and relations relevant to cataloguing and interpreting human genetic variation and its relationship to disease, as described in the published literature. The schema was inspired by the needs of the database curators of the International Society for Gastrointestinal Hereditary Tumours (InSiGHT) database, but is intended to have application to genetic variation information in a range of diseases. The schema has been applied to a small corpus of full text journal publications on the subject of inherited colorectal cancer. We show that the inter-annotator agreement on annotation of this corpus ranges from 0.78 to 0.95 F-score across different entity types when exact matching is measured, and improves to a minimum F-score of 0.87 when boundary matching is relaxed. Relations show more variability in agreement, but several are reliable, with the highest, cohort-has-size, reaching 0.90 F-score. We also explore the relevance of the schema to the InSiGHT database curation process. The schema and the corpus represent an important new resource for the development of text mining solutions that address relationships among patient cohorts, disease and genetic variation, and therefore, we also discuss the role text mining might play in the curation of information related to the human variome. The corpus is available at http://opennicta.com/home/health/variome.

Lippincott, T, Rimell L, Verspoor K, Korhonen A.  2013.  Approaches to Verb Subcategorization for Biomedicine. Journal of Biomedical Informatics. 46(2):212-227.
Liu, H, Hunter L, Keselj V, Verspoor K.  2013.  Approximate Subgraph Matching-Based Literature Mining for Biomedical Events and Relations. PLoS ONE. 8(4):e60954. Public Library of Science.

The biomedical text mining community has focused on developing techniques to automatically extract important relations between biological components and semantic events involving genes or proteins from literature. In this paper, we propose a novel approach for mining relations and events in the biomedical literature using approximate subgraph matching. Extraction of such knowledge is performed by searching for an approximate subgraph isomorphism between key contextual dependencies and input sentence graphs. Our approach significantly increases the chance of retrieving relations or events encoded within complex dependency contexts by introducing error tolerance into the graph matching process, while maintaining the extraction precision at a high level. When evaluated on practical tasks, it achieves a 51.12% F-score in extracting nine types of biological events on the GE task of the BioNLP-ST 2011 and an 84.22% F-score in detecting protein-residue associations. The performance is comparable to the reported systems across these tasks, and thus demonstrates the generalizability of our proposed approach.

Wan, S, Verspoor CM.  1998.  Automatic English-Chinese name transliteration for development of multilingual resources. Proceedings of the 36th Annual Meeting of the Association for Computational Linguistics and 17th International Conference on Computational Linguistics-Volume 2. :1352–1356. Association for Computational Linguistics.

B
Comeau, DC, Doğan RI, Ciccarese P, Cohen KB, Krallinger M, Leitner F, Lu Z, Peng Y, Rinaldi F, Torii M, Valencia A, Verspoor K, Wiegers TC, Wu CH, Wilbur WJ.  2013.  BioC: A Minimalist Approach to Interoperability for Biomedical Text Processing. Database: The Journal of Biological Databases and Curation. :bat064.
Liu, H, Christiansen T, Baumgartner Jr WA, Verspoor K.  2012.  BioLemmatizer: a lemmatization tool for morphological processing of biomedical text. Journal of Biomedical Semantics. 3(3).

Background
The wide variety of morphological variants of domain-specific technical terms contributes to the complexity of performing natural language processing of the scientific literature related to molecular biology. For morphological analysis of these texts, lemmatization has been actively applied in the recent biomedical research.

Results
In this work, we developed a domain-specific lemmatization tool, BioLemmatizer, for the morphological analysis of biomedical literature. The tool focuses on the inflectional morphology of English and is based on the general English lemmatization tool MorphAdorner. The BioLemmatizer is further tailored to the biological domain through incorporation of several published lexical resources. It retrieves lemmas based on the use of a word lexicon, and defines a set of rules that transform a word to a lemma if it is not encountered in the lexicon. An innovative aspect of the BioLemmatizer is the use of a hierarchical strategy for searching the lexicon, which enables the discovery of the correct lemma even if the input Part-of-Speech information is inaccurate. The BioLemmatizer achieves an accuracy of 97.5% in lemmatizing an evaluation set prepared from the CRAFT corpus, a collection of full-text biomedical articles, and an accuracy of 97.6% on the LLL05 corpus. The contribution of the BioLemmatizer to accuracy improvement of a practical information extraction task is further demonstrated when it is used as a component in a biomedical text mining system.

Conclusions
The BioLemmatizer outperforms other tools when compared with eight existing lemmatizers. The BioLemmatizer is released as an open source software and can be downloaded from http://biolemmatizer.sourceforge.net.

Ofoghi, B, Lopez Campos GH, Verspoor K, Martin Sanchez F.  2014.  BiomRKRS: A Biomarker Retrieval and Knowledge Reasoning System, 20-23 January. Proceedings of the Seventh Australasian Workshop on Health Information and Knowledge Management conference (HIKM 2013), Auckland, NZ.
Verspoor, K, Cohen KB, Goertzel B, Mani I.  2006.  BioNLP’06 Linking Natural Language Processing and Biology: Towards Deeper Biological Literature Analysis. Proceedings of the HLT-NAACL BioNLP Workshop on Linking Natural Language and Biology, New York, NY.
Ramakrishnan, C, Baumgartner Jr WA, Blake JA, Burns GAPC, Cohen KB, Drabkin H, Eppig J, Hovy E, Hsu CN, Hunter LE, Ingulfsen T, Pokkunuri S, Onda H, Riloff E, Roeder C, Verspoor K.  2010.  Building the Scientific Knowledge Mine (SciKnowMine): a Community-driven Framework for Text Mining Tools in Direct Service to Biocuration. Malta. Language Resources and Evaluation.

C
Verspoor, K, Cohn J, Mniszewski S, Joslyn C.  2006.  A categorization approach to automated ontological function annotation. Protein Science. 15(6):1544–1549. Wiley Online Library.

Verspoor, CM.  1994.  A cognitively relevant lexical semantics. Master's thesis.
Sokolov, A, Funk C, Graim K, Verspoor K, Ben-Hur A.  2013.  Combining Heterogeneous Data Sources for Accurate Functional Annotation of Proteins. BMC Bioinformatics. 14(Suppl 3):S10.
Bada, M, Eckert M, Evans D, Garcia K, Shipley K, Sitnikov D, Baumgartner WA, Cohen KB, Verspoor K, Blake JA, Hunter LE.  2012.  Concept Annotation in the CRAFT corpus. BMC Bioinformatics. 13(161).

Background
Manually annotated corpora are critical for the training and evaluation of automated methods to identify concepts in biomedical text.

Results
This paper presents the concept annotations of the Colorado Richly Annotated Full-Text (CRAFT) Corpus, a collection of 97 full-length, open-access biomedical journal articles that have been annotated both semantically and syntactically to serve as a research resource for the biomedical natural-language-processing (NLP) community. CRAFT identifies all mentions of nearly all concepts from nine prominent biomedical ontologies and terminologies: the Cell Type Ontology, the Chemical Entities of Biological Interest ontology, the NCBI Taxonomy, the Protein Ontology, the Sequence Ontology, the entries of the Entrez Gene database, and the three subontologies of the Gene Ontology. The first public release includes the annotations for 67 of the 97 articles, reserving two sets of 15 articles for future text-mining competitions (after which these too will be released). Concept annotations were created based on a single set of guidelines, which has enabled us to achieve consistently high interannotator agreement.

Conclusions
As the initial 67-article release contains more than 560,000 tokens (and the full set more than 790,000 tokens), our corpus is among the largest gold-standard annotated biomedical corpora. Unlike most others, the journal articles that comprise the corpus are drawn from diverse biomedical disciplines and are marked up in their entirety. Additionally, with a concept-annotation count of nearly 100,000 in the 67-article subset (and more than 140,000 in the full collection), the scale of conceptual markup is also among the largest of comparable corpora. The concept annotations of the CRAFT Corpus have the potential to significantly advance biomedical text mining by providing a high-quality gold standard for NLP systems. The corpus, annotation guidelines, and other associated resources are freely available at http://bionlp-corpora.sourceforge.net/CRAFT/index.shtml.

Verspoor, CM.  1997.  Contextually-dependent lexical semantics. University of Edinburgh. College of Science and Engineering. School of Informatics.
Verspoor, C.  1997.  Conventionality-governed logical metonymy. Proc. of the 2nd International Workshop on Computational Semantics. :300–312.
Verspoor, K, Cohen KB, Lanfranchi A, Warner C, Johnson HL, Roeder C, Choi JD, Funk C, Malenkiy Y, Eckert M, Xue N, Baumgartner WA, Bada M, Palmer M, Hunter LE.  2012.  A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools. BMC Bioinformatics. 13:207.
D
Gessler, DDG, Joslyn CA, Verspoor KM, Schmidt SE.  2006.  Deconstruction, Reconstruction, and Ontogenesis for Large, Monolithic, Legacy Ontologies in Semantic Web Service Applications. National Center for Genome Research.

Verspoor, K, Sanfilippo A, Elmore M, MacKerrow E.  2006.  Deploying Natural Language Processing for Social Science Analysis. Proceedings of the 2006 Chicago Colloquium on Digital Humanities and Computer Science.

Verspoor, K, MacKinlay A, Cohn JD, Wall ME.  2013.  Detection of protein catalytic sites in the biomedical literature, Jan 3-7, 2013. Pacific Symposium on Biocomputing, Hawaii.
Joslyn, CA, Gessler DDG, Schmidt SE, Verspoor KM.  2006.  Distributed Representations of Bio-Ontologies for Semantic Web Services. Joint BioLINK and 9th Bio-Ontologies Meeting (JBB 06): 2006.

Verspoor, K.  2014.  Diving deep into data to crack the gene code on disease, 14 February 2014. The Conversation.
Dale, R, Green SJ, Milosavljevic M, Paris C, Williams S, Verspoor C.  1998.  Dynamic document delivery: Generating natural language texts on demand. Database and Expert Systems Applications, 1998. Proceedings. Ninth International Workshop on. :131–136. IEEE.

E
Matykiewicz, P, Cohen KB, Holland KD, Glauser TA, Standridge SM, Verspoor KM, J P.  2013.  Earlier Identification of Epilepsy Surgery Candidates Using Natural Language Processing, 8 August. Proceedings of the BioNLP Shared Task Workshop at the Association for Computational Linguistics 2013 meeting, Sofia, Bulgaria.
Joslyn, C, Paulson P, Verspoor K.  2008.  Exploiting Term Relations for Semantic Hierarchy Construction. Semantic Computing, 2008 IEEE International Conference on. :42–49. IEEE.

Verspoor, K, Roeder C, Johnson HL, Cohen KB, Baumgartner Jr WA, Hunter LE.  2010.  Exploring species-based strategies for gene normalization. Computational Biology and Bioinformatics, IEEE/ACM Transactions on. 7(3):462–471. IEEE.

MacKinlay, A, Martinez D, Jimeno Yepes A, Liu H, Wilbur WJ, Verspoor K.  2013.  Extracting Biomedical Events and Modifications Using Subgraph Matching with Noisy Training Data, 9 August. Proceedings of the BioNLP Shared Task Workshop at the Association for Computational Linguistics 2013 meeting, Sofia, Bulgaria.
MacKinlay, AD, Verspoor K.  2012.  Extracting Structured Information from Free-Text Medication Prescriptions, October 29, 2012. ACM Sixth International Workshop on Data and Text Mining in Biomedical Informatics (DTMBIO), Hawaii, USA.
F
Cohen, KB, Christiansen T, Baumgartner Jr WA, Verspoor K, Hunter LE.  2011.  Fast and simple semantic class assignment for biomedical text. ACL HLT 2011. :38.

G
Lu, Z, Kao HY, Wei CH, Huang M, Liu J, Kuo CJ, Hsu CN, Tsai R, Dai HJ, Okazaki N, others, Verspoor K, Livingston K, Wilbur WJ.  2011.  The gene normalization task in BioCreative III. BMC Bioinformatics. 12(Suppl 8):S2. BioMed Central Ltd.
Verspoor, CM, Joslyn C, Papcun GJ.  2003.  The gene ontology as a source of lexical semantic knowledge for a biological natural language processing application. SIGIR workshop on Text Analysis and Search for Bioinformatics. :51–56.

Liu, H, Verspoor K, Comeau DC, MacKinlay A, Wilbur WJ.  2013.  Generalizing an Approximate Subgraph Matching-based System to Extract Events in Molecular Biology and Cancer Genetics, 9 August. Proceedings of the BioNLP Shared Task Workshop at the Association for Computational Linguistics 2013 meeting, Sofia, Bulgaria.
H
Cohen, KB, Verspoor K, Johnson HL, Roeder C, Ogren PV, Baumgartner Jr WA, White E, Tipney H, Hunter L.  2009.  High-precision biological event extraction with a concept recognizer. Proceedings of the Workshop on Current Trends in Biomedical Natural Language Processing: Shared Task. :50–58. Association for Computational Linguistics.

Cohen, KB *, Verspoor K *, Johnson HL, Roeder C, Ogren PV, Baumgartner WA, White E, Tipney H, Hunter L.  2011.  High-precision biological event extraction: Effects of system and data. Computational Intelligence. 27(4):681–701.

I
Shmanina, T, Zukerman I, Cavedon L, Jimeno Yepes A, Verspoor K.  2013.  Impact of Corpus Diversity and Complexity on NER Performance. Proceedings of the Australasian Language Technology Association Workshop 2013 (ALTA 2013). :91-95.
MacKinlay, A, Verspoor K.  2013.  Information Extraction from Medication Prescriptions Within Drug Administration Data, 11 February. The 4th International Workshop on Health Document Text Mining and Information Analysis with the Focus of Cross-Language Evaluation (LOUHI), Sydney, Australia.
Verspoor, C, Dale R, Green S, Milosavljevic M, Paris C, Williams S.  1998.  Intelligent Agents for Information Presentation: Dynamic Description of Knowledge Base Objects. the proceed. :75–86.

Verspoor, C, Joslyn C, Papcun G.  2003.  Interactions Between the Gene Ontology and a Domain Corpus for a Biological Natural Language Processing Application. Proceedings of the Sixth Annual Bio-Ontologies Meeting.

K
Joslyn, CA, Verspoor KM, Gessler DDG.  2007.  Knowledge Integration in OpenWorlds: Utilizing the Mathematics of Hierarchical Structure. Semantic Computing, 2007. ICSC 2007. International Conference on. :105–112. IEEE.

L
Verspoor, K, Cohn J, Joslyn C, Mniszewski S, Rechtsteiner A, Rocha LM, Simas T.  2004.  The LANL BioCreAtIvE submission.

Radivojac, P, Clark WT, Oron TR, Schnoes AM, Wittkop T, Sokolov A, Graim K, Funk C, Verspoor K, Ben-Hur A, Mooney S, Friedberg I, et al.  2013.  A large-scale evaluation of computational protein function prediction. Nature Methods. Advance online publication. Nature Publishing Group.

Maguitman, AG, Rechtsteiner A, Verspoor K, Strauss C, Rocha LM.  2006.  Large-scale testing of Bibliome informatics using Pfam protein families. Pacific Symposium on Biocomputing. 11:76–87.

Livingston, KM, Johnson HL, Verspoor K, Hunter LE.  2010.  Leveraging Gene Ontology Annotations to Improve a Memory-Based Language Understanding System. Semantic Computing (ICSC), 2010 IEEE Fourth International Conference on. :40–45. IEEE.

Verspoor, C.  1996.  Lexical limits on the influence of context. Proc. of CogSci. :116–120.
Jimeno Yepes, A, Verspoor K.  2014.  Literature mining of genetic variants for curation: Quantifying the importance of supplementary material. Database: The Journal of Biological Databases and Curation. :bau003.
Ravikumar, KE, Liu H, Cohn JD, Wall ME, Verspoor K.  2012.  Literature Mining of Protein-Residue Associations with Graph Rules Learned through Distant Supervision. Journal of Biomedical Semantics. 3(S3):S2.
M
N
Verspoor, K, Cohn J, Mniszewski SM, Joslyn CA.  2004.  Nearest Neighbor Categorization for Function Prediction. Proc. 5th Community Wide Experiment on the Critical Assessment of Techniques for Protein Structure Prediction (CASP 05).