November 09, 2011

Installing DEIMS using features

*Note* Probably best to ignore this post. So, so sorry.

1. Get drush up and running

2. Download and rearrange Drupal 6

Change directory name:
$ sudo mv drupal-6.22/ d2e

Link to web directory:
$ sudo ln -s /var/d2e/ /var/www/

3. Set up Drupal

$ cd d2e/
$ sudo cp sites/default/default.settings.php sites/default/settings.php

Change permissions on settings.php so the installer can write to it:
$ sudo chmod o+w sites/default/settings.php

Set up files directory:
$ sudo mkdir sites/default/files
$ sudo chown www-data: sites/default/files/

Create database
Run through Drupal's install process

Revert permissions on settings.php:
$ sudo chmod a-w sites/default/settings.php
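
The settings.php shuffle in step 3 can be collected into a couple of functions. This is only a sketch: the names prepare_site and lock_settings are made up here, the Drupal root is passed in as an argument, and sudo and the chown to www-data are left out so it can be tried in a scratch directory first.

```shell
#!/bin/sh
# Sketch of step 3: prepare sites/default before running the installer.
# prepare_site is a hypothetical helper name, not part of Drupal or drush.
prepare_site() {
    root=$1
    default_dir="$root/sites/default"

    # Copy the template settings file unless one already exists.
    if [ ! -f "$default_dir/settings.php" ]; then
        cp "$default_dir/default.settings.php" "$default_dir/settings.php" || return 1
    fi

    # The installer needs to write to settings.php; this is reverted afterwards.
    chmod o+w "$default_dir/settings.php" || return 1

    # Upload directory; on a real server it must also be chown'ed to the web user.
    mkdir -p "$default_dir/files"
}

# Revert the temporary write permission once the installer has finished.
# lock_settings is likewise a hypothetical name.
lock_settings() {
    chmod a-w "$1/sites/default/settings.php"
}
```

Usage would be prepare_site /var/d2e, then the web installer, then lock_settings /var/d2e.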

4. Install DEIMS features package

Navigate to DEIMS site

Download tarball containing many necessary modules: Downloads > Sources > Branches > bdp_metavist > modules

$ sudo mkdir sites/all/modules
$ cd sites/all/modules
$ sudo wget http://code.google.com/p/deims/source/browse/branches/bdp_metavist/modules/contrib/nbiidev_modules_contrib.tar.gz
$ sudo tar -xzf nbiidev_modules_contrib.tar.gz
$ sudo rm nbiidev_modules_contrib.tar.gz

Modify permissions:
$ sudo chmod 755 -R ../modules/*

Retrieve features package
DEIMS > Downloads > bdp_compliant_metadata-6.x-1.0-dev.tar

$ sudo mv /home/umbs/Downloads/bdp_compliant_metadata-6.x-1.0-dev.tar ../modules/
$ sudo tar -xf bdp_compliant_metadata-6.x-1.0-dev.tar

5. Enable modules and features package
Check the Modules page to confirm that bdp_compliant_metadata is listed as a module

- Enable a whole bunch of modules, download additional modules not included

- Fix getID3 error
Download 1.7.9 from http://sourceforge.net/projects/getid3

$ sudo mv /home/umbs/Downloads/getid3-1.7.9.zip sites/all/libraries/getid3/
$ cd sites/all/libraries/getid3
$ sudo unzip getid3-1.7.9.zip
$ sudo rm -rf demos

Posted by kkwaiser at 10:54 AM

November 08, 2011

DEIMS Content Type Best Practices

Naming convention for fields

field_[contenttype]_[labelabbreviation][_optional]

Where [_optional] is optional and depends on the field type:

Node reference: _ref
Node referrer: _rfrr
Conditional field: _cf
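
To make the convention concrete, here is a small sketch that assembles a field name from its parts. The function name deims_field_name and the field-kind keywords (node_reference, node_referrer, conditional) are invented for the illustration; only the pattern and the three suffixes come from the list above.

```shell
#!/bin/sh
# Build a field name following field_[contenttype]_[labelabbreviation][_optional].
# deims_field_name is a hypothetical helper, written only to illustrate the convention.
deims_field_name() {
    content_type=$1
    label_abbrev=$2
    field_kind=${3:-}

    # Map the field type to the optional suffix listed above.
    case "$field_kind" in
        node_reference) suffix=_ref ;;
        node_referrer)  suffix=_rfrr ;;
        conditional)    suffix=_cf ;;
        "")             suffix= ;;
        *) echo "unknown field kind: $field_kind" >&2; return 1 ;;
    esac

    printf 'field_%s_%s%s\n' "$content_type" "$label_abbrev" "$suffix"
}

deims_field_name var definition               # field_var_definition
deims_field_name dataset site node_reference  # field_dataset_site_ref
```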

Further notes:

The Label of a field has no effect on the generated EML.

Suggested Improvements:
- Use Modal Frame and Node Relationship for creating child nodes from a parent (leverages the Node Reference field)
- Use conditional fields to simplify the Variable node/add form
- Use a custom module for separating the Code-Definition field
- Use OpenLayers rather than Geo or Location

Posted by kkwaiser at 04:06 PM

October 06, 2011

EML and DEIMS - Code-Definition Parsing

Whew. I am the world's slowest debugger but I am also doggedly persistent.

From the DEIMS Google code repository:

Code-Definition variable information not parsed
What steps will reproduce the problem?
1. Create an EML document of a dataset containing variables that have code-definitions

What is the expected output? What do you see instead?
These code-definitions are NOT included in the EML document

What version of the product are you using? On what operating system?
Patch is applied against the latest (Oct. 6th) Google version

The conditional statement that leads to code-definitions being parsed does not include a check on whether the $code_definitions array contains values. Adding this check allows downstream parsing to occur.

Here is the patch:

--- /home/data/Desktop/views_bonus_eml/export/views-bonus-eml-export-eml.tpl.php 2010-12-16 14:28:50.000000000 -0500
+++ /home/data/Desktop/views-bonus-eml-export-eml.tpl.php.patched 2011-10-06 11:36:42.000000000 -0400
@@ -313,7 +313,8 @@
       $attribute_maximum[0]['value'] ||
       $attribute_minimum[0]['value'] ||
       $attribute_precision[0]['value'] ||
-      $attribute_unit[0]['value']) {
+      $attribute_unit[0]['value'] ||
+      $code_definitions[0]['value']) {
     views_bonus_eml_print_open_tag('measurementScale');
     if ($attribute_formatstring[0]['value']) {
       views_bonus_eml_print_open_tag('datatime');
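
The logic of the fix can be mimicked outside PHP. The sketch below is not the template code: any_value_set is a made-up function, and it models only the four terms visible in the hunk (the real condition starts above line 313 and has earlier terms I am not showing). It just demonstrates why a variable that has nothing but code-definitions was skipped before the patch.

```shell
#!/bin/sh
# Succeeds when at least one argument is non-empty, like a chain of || tests.
# any_value_set is a hypothetical name used for this sketch.
any_value_set() {
    for value in "$@"; do
        [ -n "$value" ] && return 0
    done
    return 1
}

# A text variable with only code-definitions: max, min, precision and unit are empty.
maximum= minimum= precision= unit=
code_definition='1=male'

# Before the patch: code-definitions are not consulted, so the block is skipped.
if any_value_set "$maximum" "$minimum" "$precision" "$unit"; then
    echo "before: measurementScale printed"
else
    echo "before: measurementScale skipped"
fi

# After the patch: the code-definition value is part of the test.
if any_value_set "$maximum" "$minimum" "$precision" "$unit" "$code_definition"; then
    echo "after: measurementScale printed"
else
    echo "after: measurementScale skipped"
fi
```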

Posted by kkwaiser at 11:51 AM

October 05, 2011

EML and DEIMS - Mapping Attributes

How to treat variable information (called an Attribute in EML) is at the forefront of my mind for a few reasons:

1) Variables can vary widely (ha, ha) among and within data sets, which makes the EML specification rather complex.
2) Portions of this complexity are ensconced within the DEIMS metadata structure but much of it is not. I tend to agree with this approach as encasing the entire specification would create a huge proliferation of fields that would rarely be used and would make direct metadata entry into the DEIMS system by non-expert researchers nearly impossible.

On with the show.

EML Specification*

<attributeName> is the official name of an attribute, typically the name of a field in a data table. This is often short and/or cryptic.

<attributeLabel> (optional): is used to provide a less ambiguous or cryptic alternative identification than what is provided in <attributeName>. This content may be used as a column or row header in an HTML display.

<attributeDefinition> gives a precise and complete definition of attribute being documented. It explains the contents of the attribute fully so that a data user can interpret the attribute accurately.

Corresponding DEIMS variable fields

Node Title -> attributeName
Variable Abbreviation (field_attribute_label) -> attributeLabel
Definition (field_var_definition) -> attributeDefinition

This is confusing. Variable Abbreviation maps to attributeLabel although the latter is designed to be the full variable name (i.e., Node Title.)
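
Spelled out, the mapping produces three elements per variable. The following is a rough sketch rather than what the Drupal2EML module actually runs: the function names and the sample values (Air Temperature and so on) are invented, and the output is just the three naming elements without the enclosing attribute.

```shell
#!/bin/sh
# Escape the characters that are unsafe in XML text content.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print the three naming elements of an EML attribute.
# Arguments: node title, field_attribute_label value, field_var_definition value.
# eml_attribute_names is a hypothetical helper name.
eml_attribute_names() {
    printf '<attributeName>%s</attributeName>\n' "$(xml_escape "$1")"
    # attributeLabel is optional in EML, so skip it when the field is empty.
    if [ -n "$2" ]; then
        printf '<attributeLabel>%s</attributeLabel>\n' "$(xml_escape "$2")"
    fi
    printf '<attributeDefinition>%s</attributeDefinition>\n' "$(xml_escape "$3")"
}

# Sample values are made up for illustration.
eml_attribute_names 'Air Temperature' 'air_temp' 'Mean daily air temperature at 2 m'
```

Note how the short, cryptic value ends up in attributeLabel and the readable one in attributeName, which is the reverse of what the specification text describes.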

EML Specification* - yields 5 over-arching variable categories:

<measurementScale> indicates the type of scale from which values are drawn for the attribute. One of the 5 scale types must be used: nominal, ordinal, interval, ratio, or dateTime.

The <nominal> scale is used to represent named categories. Values are assigned to distinguish them from other observations. This would include a list of coded values (e.g. 1=male, 2=female), or plain text descriptions. Columns that contain strings or simple text are nominal. Example: plot1, plot2, plot3.

<ordinal> values are categories that have a logical or ordered relationship to one another, but the magnitude of the differences between the values is not defined or meaningful. Example: Low, Medium, High.

<interval> These measurements are ordinal, but in addition, use equal-sized units on a scale between values. The starting point is arbitrary, so a value of zero is not meaningful. Example: The Celsius temperature scale uses degrees which are equally spaced, but where zero does not represent “absolute zero” (i.e., the temperature at which molecular motion stops), and 20 Celsius is not “twice as hot” as 10 Celsius.

<ratio> measurements have a meaningful zero point, and ratio comparisons between values are legitimate. For example, the Kelvin scale reflects the amount of kinetic energy of a substance (i.e., zero is the point where a substance transmits no thermal energy), and so temperature measured in kelvin units is a ratio measurement. Concentration is also a ratio measurement because a solution at 10 micromolePerLiter has twice as much substance as one at 5 micromolePerLiter.

<dateTime> is a date-time value from the Gregorian calendar and it is recommended that these be expressed in a format that conforms to the ISO 8601 standard. An example of an allowable ISO date-time is “YYYY-MM-DD”, as in 2004-06-25, or, more fully, as “YYYY-MM-DDThh:mm:ssTZD” (e.g., 1997-07-16T19:20:30.45Z).
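
The five definitions boil down to a short chain of yes/no questions. Here is my attempt at writing that chain down; the function name, the argument order and the yes/no convention are all my own assumptions, and the order of the questions is simply my reading of the definitions above.

```shell
#!/bin/sh
# Choose an EML measurementScale from four yes/no answers about a column.
# pick_measurement_scale and its argument order are assumptions of this sketch.
pick_measurement_scale() {
    is_datetime=$1 ordered=$2 equal_units=$3 true_zero=$4

    if [ "$is_datetime" = yes ]; then
        echo dateTime
    elif [ "$ordered" != yes ]; then
        echo nominal      # named categories, e.g. plot1, plot2, plot3
    elif [ "$equal_units" != yes ]; then
        echo ordinal      # ordered, but differences are not meaningful
    elif [ "$true_zero" != yes ]; then
        echo interval     # equal units, arbitrary zero (Celsius)
    else
        echo ratio        # equal units and a meaningful zero (kelvin)
    fi
}

pick_measurement_scale no yes yes no    # Celsius temperature: interval
pick_measurement_scale no yes yes yes   # concentration: ratio
```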

Corresponding DEIMS variable fields

The DEIMS implementation is simplified into the following groups:

Quantitative Variable: Interval/Ratio are clumped under ratio

Date Time Variable: dateTime

Text Based Variable: Nominal/Ordinal are clumped under nominal.

- Note: the pattern here is /attribute/measurementScale/nominal/nonNumericDomain/enumeratedDomain. Are the last two elements contradictory? Partial answer: possibly not, because there is also a numericDomain field.
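
As a lookup, the simplification amounts to three cases. The group keywords below (quantitative, datetime, text) are shorthand I made up for the three DEIMS variable groups, and the function name is hypothetical too.

```shell
#!/bin/sh
# Map a DEIMS variable group to the EML measurementScale child it is exported as.
# The group keywords (quantitative, datetime, text) are made up for this sketch.
deims_measurement_scale() {
    case "$1" in
        quantitative) echo ratio ;;     # interval and ratio both land here
        datetime)     echo dateTime ;;
        text)         echo nominal ;;   # nominal and ordinal both land here
        *) echo "unknown variable group: $1" >&2; return 1 ;;
    esac
}

deims_measurement_scale quantitative   # ratio
deims_measurement_scale text           # nominal
```

The cost of the simplification is visible here: an ordinal column (Low, Medium, High) and an interval column (Celsius) cannot be told apart from nominal and ratio ones once exported.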

EML Specification* - "The <interval> and <ratio> scales require additional tags describing <unit>, the <precision>, and <numericDomain>."


<unit> Units should be described in correct physical units. Terms which describe data but are not units should be used in <attributeDefinition>. For example, for data describing “milligrams of Carbon per square meter”, “Carbon” belongs in the <attributeDefinition>, while the <unit> is “milligramPerMeterSquared”.

Corresponding DEIMS variable fields

Unit (field_attribute_unit) -> Unit within a customUnit tag
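
In other words, whatever is typed into the Unit field comes out wrapped like this. A tiny sketch, assuming the unit string contains no XML special characters; eml_custom_unit is a made-up name and the unit value is the one from the Best Practices example.

```shell
#!/bin/sh
# Print the EML unit element as the mapping above describes it:
# the field_attribute_unit value wrapped in a customUnit tag.
# eml_custom_unit is a hypothetical helper; it does no XML escaping.
eml_custom_unit() {
    if [ -z "$1" ]; then
        echo "unit is empty" >&2
        return 1
    fi
    printf '<unit>\n  <customUnit>%s</customUnit>\n</unit>\n' "$1"
}

eml_custom_unit milligramPerMeterSquared
```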

Notes to follow-up on:

Code-definition doesn't show up in the EML output. views-bonus-eml-export-eml.tpl.php indicates it should appear as
attribute/measurementScale/nominal/nonNumericDomain/enumeratedDomain/code
attribute/measurementScale/nominal/nonNumericDomain/enumeratedDomain/definition

Although it seems this would be correct:

attribute/measurementScale/nominal/nonNumericDomain/enumeratedDomain/codeDefinition/code
attribute/measurementScale/nominal/nonNumericDomain/enumeratedDomain/codeDefinition/definition
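
For reference, this is roughly what the second form looks like when written out, with each code and its definition wrapped in its own codeDefinition element. The sketch takes "code=definition" pairs as arguments; the function names and that input format are my own invention, not anything in views_bonus_eml.

```shell
#!/bin/sh
# Escape the characters that are unsafe in XML text content.
xml_escape() {
    printf '%s' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

# Print a nonNumericDomain/enumeratedDomain block from "code=definition" arguments.
# eml_enumerated_domain is a hypothetical helper name.
eml_enumerated_domain() {
    printf '<nonNumericDomain>\n  <enumeratedDomain>\n'
    for pair in "$@"; do
        code=${pair%%=*}        # text before the first '='
        definition=${pair#*=}   # text after the first '='
        printf '    <codeDefinition>\n'
        printf '      <code>%s</code>\n' "$(xml_escape "$code")"
        printf '      <definition>%s</definition>\n' "$(xml_escape "$definition")"
        printf '    </codeDefinition>\n'
    done
    printf '  </enumeratedDomain>\n</nonNumericDomain>\n'
}

eml_enumerated_domain '1=male' '2=female'
```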

*EML Best Practices Working Group. EML Best Practices for LTER Sites V2.0. August 1st, 2011. http://im.lternet.edu/sites/im.lternet.edu/files/emlbestpractices-2.0-FINAL-20110801_0.pdf

Posted by kkwaiser at 11:19 AM

EML and DEIMS - Mapping Mission

My primary take-away from the EIM 2011 conference was that, while the metadata we (UMBS) are taking in is of sufficient quality for re-use by researchers, it is not rigorous enough to be machine-ingested and we (UMBS) therefore are not well-prepared to move data to third-party databases. The first step needed to remedy this situation is for me to gain a better - more rigorous, if you will - understanding of the EML specification and how our DEIMS fields translate to it.

Of course, I'm not entirely certain how in depth this process will go but I do anticipate a series of posts looking at particular portions of the EML specification and analysing the DEIMS fields and the Drupal2EML module that accomplishes the mapping.

Important resources:
The LTER's EML Best Practices Guide

The DEIMS code repository which contains the content types and modules we use.

The EML specification, but I hope to rely mostly on the Best Practices Guide.

Posted by kkwaiser at 10:56 AM