December 14, 2009

Plan to create a UMBS controlled (keyword) vocabulary

Here is an outline of what will need to happen for the UMBS to have an established list of keywords.

1. Extract keywords from the UMBS bibliography; this will be the starting point.
- This list is not comma-delimited, so multi-term keywords will need to be identified manually, and scripts will then be needed to reformat the terms and add commas.
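
The reformatting part of this step might be scripted roughly as follows. This is only a sketch in Python, assuming the raw keywords for a record are separated by whitespace and that the multi-term keywords have already been identified by hand. The function name add_commas and its arguments are made up for illustration.

```python
def add_commas(raw, multi_term):
    """Turn a whitespace-separated keyword string into a comma-delimited one.

    raw        -- raw keyword string (assumed whitespace-separated)
    multi_term -- manually identified multi-word keywords, in lowercase
    """
    words = raw.split()
    # Length (in words) of the longest known phrase, so longer phrases are tried first
    max_len = max((len(p.split()) for p in multi_term), default=1)
    keywords = []
    i = 0
    while i < len(words):
        # Try the longest possible phrase starting at word i, then shorter ones
        for n in range(min(max_len, len(words) - i), 0, -1):
            candidate = " ".join(words[i:i + n])
            # A single word is always accepted as its own keyword
            if n == 1 or candidate.lower() in multi_term:
                keywords.append(candidate)
                i += n
                break
    return ", ".join(keywords)
```

Anything not in the hand-built phrase set falls through as a single-word keyword, so the output would still need a manual check for multi-term keywords that were missed.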

2. Parse the raw keyword list into 3 parts:
- a) keywords redundant with the LTER list (including synonyms and lexical variants)
- b) taxonomic descriptors (Latin names and species-specific common names?)
- c) candidate-keywords for a UMBS keyword list.
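
The three-way split could be sketched like this in Python. It assumes the LTER list (with its synonyms and lexical variants already expanded) and a list of taxonomic names are each available as a plain collection of strings; partition_keywords and its argument names are hypothetical, and matching is a simple case-insensitive lookup.

```python
def partition_keywords(raw_keywords, lter_terms, taxonomic_terms):
    """Split raw keywords into (redundant with LTER, taxonomic, candidates)."""
    # Lowercase lookups so matching ignores case (an assumption of this sketch)
    lter_lookup = {t.lower() for t in lter_terms}
    taxa_lookup = {t.lower() for t in taxonomic_terms}

    redundant, taxonomic, candidates = [], [], []
    for kw in raw_keywords:
        key = kw.strip().lower()
        if key in lter_lookup:
            redundant.append(kw)      # part a)
        elif key in taxa_lookup:
            taxonomic.append(kw)      # part b)
        else:
            candidates.append(kw)     # part c)
    return redundant, taxonomic, candidates
```

A keyword that appears in both lookups lands in part a) here; whether that is the right priority is an open choice.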

3. Build UMBS keyword list using the candidate-keyword list:
- Identify how to treat hyphens, spaces and plurals
- Declare lexical variants equivalent (e.g. analyze vs. analyse)
- Identify synonyms
- Remove candidate-keywords that require context to make sense (e.g. "change", "description")
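
Part of step 3 could be automated along these lines. This is a rough Python sketch, not a settled design: it assumes hyphens are treated as spaces and everything is lowercased, which is exactly the kind of decision the first bullet still has to make. The lookup tables and function names are hypothetical, and plurals and synonyms are left out because they need rules or tables that do not exist yet.

```python
import re

# Hypothetical tables, seeded only with the examples from the outline
VARIANTS = {"analyse": "analyze"}           # lexical variants declared equivalent
NEEDS_CONTEXT = {"change", "description"}   # terms that require context to make sense

def normalize(keyword):
    """Lowercase, treat hyphens as spaces, and map variants to one spelling."""
    key = keyword.strip().lower()
    key = re.sub(r"[-\s]+", " ", key)  # hyphens and runs of whitespace become one space
    words = [VARIANTS.get(w, w) for w in key.split()]
    return " ".join(words)

def build_keyword_list(candidates):
    """Return the sorted, de-duplicated keyword list built from the candidates."""
    seen = set()
    result = []
    for kw in candidates:
        key = normalize(kw)
        # Skip empty strings, context-dependent terms, and duplicates
        if not key or key in NEEDS_CONTEXT or key in seen:
            continue
        seen.add(key)
        result.append(key)
    return sorted(result)
```

With these assumptions, "Analyse" and "analyze" collapse to one entry and "change" is dropped.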


Posted by kkwaiser at December 14, 2009 10:59 AM


