Friday, August 9, 2013

July 2013: First Deliverables: Tagger, Dictionary, Stopword-list: v1.0 Ready!

July 2013 has been a highly active month. 

A visit by Dr. Lars Juhl Jensen to HCMR (Hellenic Center for Marine Research), Crete, followed up on last April’s ENVIRONMENTS software developments (see post).

The main focus was on updating the dictionary and the stopword-list to reflect the information contained in a recent Environmental Ontology version (envo-basic.obo, dated 14 June 2013).

The recent Environmental Ontology changes, including improved coverage of terrestrial biomes (see EnvO News post), were the main reason for this update.
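For readers curious how a dictionary can be derived from an OBO file, here is a minimal sketch in Python. It is not the ENVIRONMENTS code: the function name `build_dictionary`, the sample identifiers (`EX:...`) and the sample terms are made up for illustration, and it assumes only the basic OBO stanza layout (`[Term]`, `id:`, `name:`, `synonym: "..."`).

```python
import re
from collections import defaultdict

# Captures the quoted text of an OBO synonym line, e.g.
#   synonym: "ocean biome" EXACT []
SYNONYM_RE = re.compile(r'^synonym:\s*"((?:[^"\\]|\\.)*)"')


def build_dictionary(obo_text):
    """Map each term id to the set of its lower-cased names and synonyms.

    Hypothetical helper: only [Term] stanzas are read; other stanzas
    (such as [Typedef]) and the file header are skipped.
    """
    dictionary = defaultdict(set)
    in_term = False
    term_id = None
    for raw in obo_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; remember whether it is a term.
            in_term = (line == "[Term]")
            term_id = None
        elif not in_term:
            continue
        elif line.startswith("id:"):
            term_id = line[3:].strip()
        elif line.startswith("name:") and term_id:
            dictionary[term_id].add(line[5:].strip().lower())
        elif term_id:
            match = SYNONYM_RE.match(line)
            if match:
                dictionary[term_id].add(match.group(1).lower())
    return dict(dictionary)


# Made-up sample in OBO layout (not real EnvO content).
SAMPLE = """format-version: 1.2

[Term]
id: EX:0000001
name: Marine Biome
synonym: "ocean biome" EXACT []

[Typedef]
id: part_of
name: part of
"""

if __name__ == "__main__":
    for term, names in build_dictionary(SAMPLE).items():
        print(term, sorted(names))
```

A real dictionary build would also need to handle obsolete terms and apply the stopword-list, both of which are left out here.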

As a result, the v1.0 ENVIRONMENTS tagger is now ready and has been delivered to EOL, together with the latest dictionary of environment-descriptive terms and the corresponding stopword-list. All these software components are open source and will be made available in due course.

An annotation of all EOL taxon pages using the v1.0 tagger has been completed, along with a precision analysis of the annotations in the different EOL page sections.
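As a rough illustration of what a per-section precision analysis involves, the sketch below computes precision (correct matches divided by all matches) for each page section. The function name `precision_by_section`, the input layout and the section names are assumptions made for this example; they do not describe the actual analysis or its results.

```python
from collections import Counter


def precision_by_section(judged_matches):
    """Return {section: precision} for manually judged tagger matches.

    judged_matches is an iterable of (section, is_correct) pairs, one per
    tagger match. Precision is correct matches / all matches in a section.
    Hypothetical helper, not part of the ENVIRONMENTS software.
    """
    total = Counter()
    correct = Counter()
    for section, is_correct in judged_matches:
        total[section] += 1
        if is_correct:
            correct[section] += 1
    return {section: correct[section] / total[section] for section in total}


if __name__ == "__main__":
    # Invented judgements; the section names are placeholders.
    sample = [
        ("Habitat", True),
        ("Habitat", True),
        ("Habitat", False),
        ("Distribution", True),
    ]
    for section, value in precision_by_section(sample).items():
        print(f"{section}: {value:.2f}")
```

Sections with no matches do not appear in the result, which avoids dividing by zero.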

Curating the gold standard corpus and analyzing the accuracy of ENVIRONMENTS against that corpus are now the main focus. 600 EOL species pages have been shared among the curators, and the manual annotation is ongoing. The pages come from eight taxa, chosen to maximize environment diversity: Actinopterygii, Annelida, Arthropoda, Aves, Chlorophyta, Mammalia, Mollusca and Streptophyta.

In the meantime, brief holiday opportunities arise :) (Picture taken at Ancient Falasarna, Chania, Crete, early August 2013, CC BY-NC-SA)
