Saturday, March 30, 2013

Welcome to ENVIRONMENTS-EOL, a few words on the project

Large-scale biological questions such as retrieving all species belonging to a specific group (e.g. Invertebrates), associated with a particular environment (e.g. coral reefs) and occurring in a specific region (e.g. Indo-Pacific Ocean) require the combinatorial analysis of information available in a diverse range of resources.

Taxonomy information and species occurrence data (stored in centralized biodiversity resources) can be combined to answer such questions. To fill in the missing pieces of the puzzle, however, input based on knowledge from the scientific literature is required.

The Encyclopedia of Life (EOL), by collecting the available information about a given taxon, is a one-stop shop that greatly facilitates answering such questions.

The identification of environment descriptive terms, such as terrestrial, aquatic, lagoon, coral reef, in EOL text can drive the mining of species environmental context information.

ENVIRONMENTS is an open source tool supporting such identification. It does so by looking up words in the text against a dictionary of environment descriptors.

The Environment Ontology (EnvO), a community resource offering a controlled, structured vocabulary for ecosystem types (“biomes”), environmental materials, and environmental features (e.g. habitats), serves as the source of names and synonyms for the creation of such a dictionary.
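To make the idea concrete, here is a minimal sketch of dictionary-based term identification in Python. It is not the ENVIRONMENTS implementation: the dictionary entries, the `tag_environments` function name, and the longest-match-first strategy are assumptions made purely for illustration.

```python
import re

# Hypothetical mini-dictionary: surface form (name or synonym) -> term label.
# Illustrative entries only, not actual Environment Ontology records.
DICTIONARY = {
    "coral reef": "coral reef",
    "coral reefs": "coral reef",
    "lagoon": "lagoon",
    "lagoons": "lagoon",
    "terrestrial": "terrestrial",
    "aquatic": "aquatic",
}


def tag_environments(text, dictionary=DICTIONARY):
    """Return (start, end, matched text, term) for each dictionary hit."""
    # Try longer surface forms first so multi-word names are preferred.
    forms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(form) for form in forms) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), m.group(0), dictionary[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


hits = tag_environments("This species lives in coral reefs and shallow lagoons.")
for start, end, matched, term in hits:
    print(start, end, matched, "->", term)
```

Mapping every synonym to a single term label is what lets different wordings on different pages be counted as the same environment.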

While the environment descriptive term identification is the core of the project, tasks such as:
  • the evaluation of the accuracy of the method (via the creation of a manually annotated gold-standard corpus)
  • the assessment of the contribution of the different EOL page sections to the environmental context mining
  • the consideration of taxonomy and species occurrence information
  • the generation of summarizing visualizations supporting comparisons and biological inferences

are equally important in answering large-scale biological questions like the one at the beginning of this post.
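As an illustration of the first task above, one common way to score a tagger against a manually annotated corpus is to compute precision and recall over the annotated spans. The sketch below assumes exact-match comparison of hypothetical (start, end, term) triples; the project's actual evaluation protocol may differ.

```python
def precision_recall(predicted, gold):
    """Compare predicted annotations with gold-standard ones by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical annotations for one text: (start, end, term).
gold = {(22, 33, "coral reef"), (46, 53, "lagoon")}
predicted = {(22, 33, "coral reef"), (60, 67, "aquatic")}

precision, recall = precision_recall(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f}")
```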

What lies ahead is a challenging project comprising a diverse range of tasks. In response, a team of researchers with diverse backgrounds (molecular biology, microbial ecology, data analysis, text/literature mining, bioinformatics, statistics and more) has been put together.

Through the posts in this blog we hope to keep you up to date with the project's developments, provide more information on the tasks involved, and introduce you to the people contributing to them.

Stay tuned!
