July 2013 has been a highly active month.
A visit by Dr. Lars Juhl Jensen to HCMR (Hellenic Center for Marine Research), Crete, followed
up on last April’s ENVIRONMENTS software developments (see post).
The main focus was on updating the dictionary and the stopword list
according to a recent Environmental Ontology version (envo-basic.obo, dated 14 June 2013).
The Environmental Ontology updates, including improved coverage of terrestrial
biomes (see EnvO News post), were the main reason for this update.
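To give a rough idea of what such a dictionary update involves, here is a minimal Python sketch that reads term names and synonyms from OBO-formatted text and drops any label found on a stopword list. It is only an illustration under stated assumptions, not the code used for ENVIRONMENTS: the function name `build_dictionary`, the sample stanzas, their identifiers, and the stopword `forest` are all hypothetical.

```python
# Illustrative OBO snippet; the terms and identifiers are examples only,
# not copied from envo-basic.obo.
SAMPLE_OBO = """\
format-version: 1.2

[Term]
id: ENVO:00000447
name: marine biome
synonym: "ocean biome" EXACT []

[Term]
id: ENVO:00000111
name: forested area
synonym: "forest" RELATED []

[Typedef]
id: part_of
name: part of
"""


def build_dictionary(obo_text, stopwords):
    """Map each lowercase term name or synonym to its ontology identifier.

    Labels listed in `stopwords` are left out (assumed behaviour).
    """
    dictionary = {}
    term_id = None
    in_term = False
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza starts; only [Term] stanzas are collected.
            in_term = line == "[Term]"
            term_id = None
            continue
        if not in_term:
            continue
        label = None
        if line.startswith("id: "):
            term_id = line[len("id: "):]
        elif line.startswith("name: "):
            label = line[len("name: "):]
        elif line.startswith("synonym: "):
            # Synonym lines look like: synonym: "text" SCOPE []
            parts = line.split('"')
            if len(parts) >= 3:
                label = parts[1]
        if label and term_id:
            label = label.lower()
            if label not in stopwords:
                dictionary[label] = term_id
    return dictionary


if __name__ == "__main__":
    for label, term_id in sorted(build_dictionary(SAMPLE_OBO, {"forest"}).items()):
        print(label, term_id)
```

In this sketch the stopword list simply blocks ambiguous labels from entering the dictionary; a real tagger may apply its stopwords differently.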
As a result, the v1.0 ENVIRONMENTS
tagger is now ready and has been delivered to EOL (including the latest dictionary
of environment-descriptive terms and the relevant stopword list). All these software components are open source and will be made available in due course.
All EOL taxon pages have been annotated
using the v1.0 tagger, and a precision analysis of the annotations in the different EOL page sections has been completed.
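For readers unfamiliar with the measure, precision is the fraction of the tagger's annotations that a human judges to be correct. A per-section calculation could be sketched as follows; the function name `precision_by_section`, the section names, and the judgements are invented for illustration and are not results from the actual analysis.

```python
from collections import defaultdict


def precision_by_section(judgements):
    """Return {section: precision} from (section, is_correct) pairs.

    Each pair stands for one tagger annotation and a manual verdict on it.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for section, is_correct in judgements:
        total[section] += 1
        if is_correct:
            correct[section] += 1
    # Precision = correct annotations / all annotations made in that section.
    return {section: correct[section] / total[section] for section in total}


if __name__ == "__main__":
    # Made-up verdicts, purely to show the calculation.
    sample = [
        ("Habitat", True),
        ("Habitat", True),
        ("Habitat", False),
        ("Habitat", True),
        ("Distribution", True),
        ("Distribution", False),
    ]
    for section, value in precision_by_section(sample).items():
        print(section, value)
```

Note that precision says nothing about annotations the tagger missed; measuring those requires a manually annotated reference corpus.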
Curation of the gold
standard corpus and the analysis of ENVIRONMENTS’ accuracy based on
that corpus are now the main focus. A total of 600 EOL
species pages, drawn from eight taxa (Actinopterygii, Annelida, Arthropoda,
Aves, Chlorophyta, Mammalia, Mollusca, Streptophyta) to maximize environment
diversity, have now been shared among the curators, and manual annotation is
ongoing.