Article ID: | iaor201023 |
Volume: | 13 |
Issue: | 3/4 |
Start Page Number: | 328 |
End Page Number: | 344 |
Publication Date: | Dec 2009 |
Journal: | International Journal of Risk Assessment and Management |
Authors: | McGovern Jerry R, Dowling Kathryn C |
Keywords: | risk |
The volume of toxicologic literature can be so large that it presents significant challenges to risk assessors tasked with identifying key studies. As a new approach to managing such information, an information specialist and a toxicologist developed an open-source text mining computer program consisting of knowledge bases and search algorithms. Quantitative toxicologic data, such as dose levels or risk numbers, are often presented in the abstracts of scientific literature, and online database records, in turn, include full or partial abstracts. We chose to examine records containing human blood lead concentration (HBLC) data. The resulting program (HBLCFinder) searches for lead concentration data in a record's abstract, then determines the record's relevancy to human blood. After several iterative modifications, we achieved recall (sensitivity), specificity and precision of 86%, 99% and 96%, respectively. The approach may be of use to risk assessors needing to identify quantitative data in online database records.
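The two-stage search described in the abstract (find lead concentration data, then judge relevancy to human blood) and the three reported measures can be illustrated with a minimal Python sketch. This is not the HBLCFinder code: the unit pattern, the term lists, the function names and the counts passed to `metrics` are all assumptions made up for illustration, since the abstract does not describe the actual knowledge bases or algorithms.

```python
import re

# Hypothetical "knowledge bases": the real HBLCFinder term lists are not given
# in the abstract, so these patterns are illustrative assumptions only.
UNIT = r"(?:µ|μ|u|mc)(?:g|mol)\s*/\s*(?:dL|dl|L)"
CONC_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(" + UNIT + r")")
LEAD_RE = re.compile(r"\b(?:lead|Pb)\b", re.IGNORECASE)
HUMAN_TERMS = ("human", "child", "children", "adults", "workers",
               "patients", "women", "men")


def find_lead_concentrations(abstract):
    """Stage 1: return (value, unit) pairs if the abstract mentions lead."""
    if not LEAD_RE.search(abstract):
        return []
    return [(float(v), u) for v, u in CONC_RE.findall(abstract)]


def is_human_blood_record(abstract):
    """Stage 2: crude relevancy check for human blood."""
    lower = abstract.lower()
    has_human = any(re.search(r"\b" + t + r"\b", lower) for t in HUMAN_TERMS)
    return "blood" in lower and has_human


def classify(abstract):
    """A record is flagged only if both stages succeed."""
    return bool(find_lead_concentrations(abstract)) and is_human_blood_record(abstract)


def metrics(tp, fp, tn, fn):
    """Recall (sensitivity), specificity and precision from confusion counts."""
    return {
        "recall": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
    }


if __name__ == "__main__":
    print(classify("Mean blood lead levels in children were 12.5 µg/dL."))
    print(classify("Blood lead in rats reached 40 µg/dL."))
    # Invented counts, not taken from the paper.
    print(metrics(tp=8, fp=2, tn=88, fn=2))
```

A simple keyword approach like this will misfire on ambiguous wording (for example, "lead" used as a verb), which is presumably one reason the authors report several iterative modifications before reaching their final figures.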