November 17, 2009
Format Water Profile Data
This is the code I'm using to format some of our water quality data. It was in an Excel workbook where it occupied 132 worksheets. Once I am done it will be in 5 .csv files (1 file per variable).
One cool thing to notice in this R code is that dateTimeToStr() translates the Excel date/time number into an actual date/time string.
It also took me a while to figure out the as.numeric(levels())[as.integer()] line, which was necessary because R was reading the data in as a factor rather than as numeric values.
# To run from the R Console command line:
# source("C:/workspace/web/data/rScripts/welchProfilesPh.r")
library(xlsReadWrite)

dir = "C:/workspace/web/data/"
input = "welchsfbprofiles.xls"
output = "pH_Profile_Douglas.csv"

# Column names: pH_0m through pH_21m
cNames <- paste("pH_", paste(0:21, "m", sep=""), sep="")
# One header row plus one row per worksheet; one date column plus 22 depths
d <- matrix(nrow=133, ncol=23)
d[1,1] <- "Date"
d[1,2:23] <- cNames

for (i in 1:132) {
  dset = read.xls(paste(dir, input, sep=""), colNames = TRUE, sheet = i, dateTimeAs = "numeric")
  d[i+1,1] <- dateTimeToStr(dset[1,2])
  # This link is where the following code came from.
  # The column is read as a factor, so index its numeric levels by the integer codes.
  d[i+1,2:23] <- as.numeric(levels(dset[4:25,3]))[as.integer(dset[4:25,3])]
}
write.table(d, file=paste(dir, output, sep=""), sep=",", col.names = FALSE, row.names = FALSE, quote = FALSE)
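To see why the factor line is needed, here is a minimal base R sketch. The values are made up for illustration and stand in for one worksheet column that R has read as a factor; they are not from the Welch dataset.

```r
# Hypothetical pH readings, stored as a factor the way the worksheet column was
f <- factor(c("7.9", "8.1", "7.9", "8.3"))

# as.numeric() on a factor returns the internal integer codes, not the readings
codes <- as.numeric(f)

# Indexing the numeric levels by the integer codes recovers the readings
values <- as.numeric(levels(f))[as.integer(f)]

print(codes)   # 1 2 1 3
print(values)  # 7.9 8.1 7.9 8.3
```

Calling as.numeric() directly on the factor silently returns 1, 2, 1, 3, which is why the extra indexing step is there.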
Posted by kkwaiser at 02:19 PM | Comments (0)
November 16, 2009
Work with Excel sheets in R
Documentation of the mundane follows...
I am going to reformat one of our larger datasets, the Welch-Eggleton limnological dataset for South Fishtail Bay that spans from 1913-1950 with temperature, dissolved oxygen and more variables measured.
Good things to know thus far:
In the R statistical environment, the xlsReadWrite package can import the worksheets from an Excel file. Use the read.xls command.
The dateTimeToStr routine documented here is useful for getting the Excel date/time number (4970??) into an actual date. Another reason not to use Excel: you need a special function to read dates.
Also, I am using Notepad++ to write scripts and downloaded this .exe file to tell Notepad++ about the R syntax.
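For comparison, the same kind of conversion can be sketched in base R without xlsReadWrite. This is not what dateTimeToStr does internally; it assumes the workbook uses Excel's default 1900 date system on Windows, where day numbers count from 1899-12-30, and the serial numbers below are made up for illustration.

```r
# Hypothetical Excel serial date (whole days since 1899-12-30)
excel_serial <- 40134
converted <- as.Date(excel_serial, origin = "1899-12-30")
date_str <- format(converted, "%Y-%m-%d")
print(date_str)  # "2009-11-17"

# A fractional serial carries the time of day; 0.5 is noon
excel_datetime <- 40134.5
stamp <- as.POSIXct(excel_datetime * 86400, origin = "1899-12-30", tz = "UTC")
stamp_str <- format(stamp, "%Y-%m-%d %H:%M", tz = "UTC")
print(stamp_str)  # "2009-11-17 12:00"
```

Workbooks saved with the 1904 date system (older Mac versions of Excel) would need a different origin.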
Posted by kkwaiser at 03:45 PM | Comments (0)
November 10, 2009
Get Gmaps up and running
Just started messing around with the Google Maps module in Drupal. Here are some resources to get started:
http://blip.tv/file/1765174
http://groups.drupal.org/node/19614
http://www.drupaltherapy.com/gmap
http://www.thomasturnbull.com/training/gmap.doc
Late Update:
I've gotten things working pretty well now. I can create maps that specifically plot one type of web content (e.g. study sites) and display information about the content in the Google Maps popup marker.
It took longer than expected because of a bug in the GMap code. This post by me outlines the problem and this post has the solution.
I also found this tutorial helpful in creating my first map.
Posted by kkwaiser at 12:31 PM | Comments (0)
November 03, 2009
Study sites of the UMBS
The University of Michigan Biological Station has conducted studies in a whole host of locations. Here's a quick map of 529 of these sites from the gazetteer developed by Bob Vande Kopple, our Resident Biologist.

Posted by kkwaiser at 04:43 PM | Comments (0)
November 02, 2009
Import a Content Type in Drupal 6
One of the advantages of Drupal is the ease with which you can avoid reinventing the wheel. For example, the LTER folks have created the following Content Types in Drupal:
Data Set - Title, abstracts, geo-temporal references are all part of the basics of a metadata collection
Data File - links to a csv/txt file and includes basic header information
Variable - describes the units of measurement behind a Data File
Person - Name, contact info, etc.
Research Sites - Places where research is done
Research Project - with project title and abstract
Things to consider:
You can only import fields that do not already exist.
You need the Content Copy module in CCK. Go to Administer > Site Building > Modules > CCK > Click Content Copy > Save. See this entry for more on adding a module.
1) Administer > Content Management > Content Types > Import
2) Content Type: Create Content
3) Copy the contents of the exported Content Type you received from your benefactor into the text box and press Import.
4) Fail!

The import did not go through because I did not have all of the correct Modules enabled. Note, if the sharing of custom content types is to become widespread we will need sufficient module documentation.
Enable FieldGroup under the CCK Module and reimport:

5) Go to Administer > Content Management > Content Types to view the Data Set content type. It is there but I'm missing some fields because I don't have all of the modules enabled.

To fix:
Administer > Site Building > Modules > Date/Time > Date Popup > Save
Posted by kkwaiser at 01:02 PM | Comments (0)
October 28, 2009
Create Research Projects Content Type with Keyword connections
I'm going to create a simple Content Type called Research Project that will hold the Metadata information collected this summer. This will serve as a way for the UMBS community to keep tabs on what is going on.
1) Here are Modules and Sub-Modules I think I need for this:
Administer > Site Building > Modules
CCK > Content, Link, Node Reference, Number, Text, Option Widgets, User Reference
http://drupal.org/project/taxonomy_list
2) Create Research Project content type:
Administer > Content Management > Content Types > Add content type
Name: Research Project
Type: research_project
Description: Research activities at the University of Michigan Biological Station.
Title field label: Research Project Title
Default Options: Published
Default comment setting: disabled
3) Manage fields
Go to content type (Research Project) > manage fields > New Fields
Added these text fields: Researcher (Check boxes/radio buttons), Abstract
Note that it is probably appropriate to create a Taxonomy of Researchers.
4) Create Keyword taxonomy list
I want to associate keywords with the Research Project so I created a list of keywords. Working off of this list from the LTER.
Because I have a large list, I wanted to do bulk taxonomy uploads (vs 1 at a time):
I went here to get a Drupal 6 module called Taxonomy Batch Operations.
taxonomy_batch_operations.tar_.gz
$ cd /home/data/Desktop/
$ mv taxonomy_batch_operations.tar__0.gz taxonomy_batch_operations.tar.gz
$ sudo tar xzf /home/data/Desktop/taxonomy_batch_operations.tar.gz
$ sudo cp -r taxonomy_batch_operations /var/drupal/sites/all/modules/
Then enable the module (Administer > Site Building > Modules).
Administer > Content management > Taxonomy > Add vocabulary
Vocabulary name: Keywords
Description: Keywords for describing research projects and datasets.
Content Types: Research Project
Settings: Tags, Multiple select, Required
5) Notice that when you go to Create content > Research Project the keywords show up there and will auto complete as you type them in.
6) I then went in and created two Research Projects; the keyword taxonomy links the two. E.g., if both use the keyword groundwater, then clicking on groundwater lists all research projects associated with that keyword.
Posted by kkwaiser at 09:27 AM | Comments (0)
October 26, 2009
Retrieve lost Admin rights in Drupal 6
I don't know what happened, but while I was updating the CCK and Views modules, my Drupal administrator account lost a lot of privileges. I used the following pages to rectify the problem.
http://drupal.org/node/354597
http://drupal.org/node/288162
Note: I was also running into memory problems, so I increased the PHP memory limit from 16 MB to 64 MB:
http://drupal.org/node/29268
$ sudo gedit /etc/php5/apache2/php.ini
$ sudo /etc/init.d/apache2 reload
Posted by kkwaiser at 10:38 AM | Comments (0)
October 22, 2009
Create an SFTP site through UM LSAIT
How to create an SFTP Site for sharing data among UM and Non-UM researchers.
Drafted by K. Kwaiser
10/22/09
Options:
1) LSAIT can set up some filespace on LSA's Windows servers (LSA-F4\Group) - the 'Group' share is typically for collaboration. You can SFTP to the DFS servers and then access your data there.
The rest of this document covers the implementation of this option!
Continue reading "Create an SFTP site through UM LSAIT"
Posted by kkwaiser at 11:02 AM | Comments (0)
October 20, 2009
Set up a Virtual Web Server at UM
Options:
1) Go through LSA for virtual server.
- Doesn't work. They suggested VaaS.
2) VaaS - Essentially blank server space. ~$500/year
3) UMWeb - virtual webserver. This is the step I'm going with and the rest of this post documents the process.
- Post-hoc Cost Estimate: $30/year (IFS Group Space) + $50/year (UMWeb server space) = $80/year. We have 5 GB on the IFS drive and around 10 GB (I believe!) of database space. For comparison purposes, web hosting from Godaddy.com runs from $50 - $150/year.
- Question to Mark Montague of ITCS regarding how much space we may need on the IFS group directory:
Hi Mark, I think I'm going to set up a virtual server with MySQL and PHP but I had a question about table space. I'm pretty sure I am going to need to create an IFS group directory. When setting the storage space on this directory, do I need to account for the MySQL tablespace? Or will that IFS directory only be for photos and other webcontent we want to add? Hope this makes sense. Thanks, Kyle
Mark's reply:
The database resides on a dedicated database server, not in IFS. So you only need IFS space for files (PHP, HTML, CSS, media files, and other files).
Mark Montague
Webmaster team
- CREATE IFS group directory (needed in order to set up a virtual server):
I talked to the ITS Account Office about whether we can use a "Course Home Directory" for our IFS space or if the UMBS needs to pay for its own space. It turns out we will need to create our own IFS group space. Cost for UMBS group directory = $0.50/GB/month
Pam and I set it up at this website and started with 5 GB. Note that Alicia, Karie, Pam and I are administrators on this drive space. It is called "umbsweb" and should probably only be used for data hosted on the umbs.lsa.umich.edu website. Also note that the UMOD group is xbs-UMBSAnnArbor.
- Set up a domain name (umbs.lsa.umich.edu):
We also need to set up a domain name, so I've emailed hostmaster@umich.edu to inquire about this. See "UMBS Domain Name" below the fold for documentation (email correspondence) on this step. The long and short of it is that our domain name will be umbs.lsa.umich.edu.
- Request a Virtual Host
Pam and I did so by going to this website. It took maybe 5 minutes to request (although I do not have the site yet!)
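The post-hoc cost estimate above follows directly from the quoted rates. Here is a small R sketch of the arithmetic, using only figures from this post (the variable names are mine):

```r
# Figures quoted in this post
ifs_rate_per_gb_month <- 0.50  # IFS group directory, dollars per GB per month
ifs_size_gb <- 5               # starting size of the "umbsweb" space
umweb_per_year <- 50           # UMWeb server space, dollars per year

ifs_per_year <- ifs_rate_per_gb_month * ifs_size_gb * 12
total_per_year <- ifs_per_year + umweb_per_year

cat("IFS:", ifs_per_year, "per year; total:", total_per_year, "per year\n")
```

Increasing the IFS space later changes only ifs_size_gb; at the quoted rate each additional GB adds $6/year.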
Continue reading "Set up a Virtual Web Server at UM"
Posted by kkwaiser at 10:51 AM | Comments (0)
October 19, 2009
GLEON recap
Summary of GLEON 9 conference
Kyle Kwaiser, UMBS Information Manager
Camp Manito-Wish in Boulder Junction, Wisconsin.
October 12-16th, 2009
My overall impression is that the strength of GLEON (Global Lakes Ecological Observatory Network) is the collaborative energy brought by individual researchers. There were about 90 attendees at this conference with ~40% being first-time attendees. Given the relative inaccessibility of northern Wisconsin there were a good number of international scientists and information managers.
I observed a great deal of interest in our buoy deployment and believe it will benefit the UMBS to encourage our researchers/students to attend other GLEON conferences and visit member lakes. As at the LTER All Scientists Meeting, a recurring theme was that I spent a lot of time introducing the UMBS (who, what, where). I enjoy talking up the UMBS but I think we need to work on getting our name out there more. I should not have to clarify that KBS is an MSU affiliate and that they work in an agricultural landscape. One possible solution is to request that UMBS researchers prominently display their connections to UMBS in papers and presentations. This point warrants further consideration.
GLEON is a young organization and I am interested to see how it copes with turnover in key staff (e.g., web and database developers) and whether or not the bottom-up structure leads to leadership gaps in important areas. That said, I believe there are avenues for UMBS folks (staff, researchers and students) to take leadership roles in this organization. This point also warrants further consideration.
UMBS GLEON membership:
After discussing the matter with Karie and Phil, I am going to begin the process to make UMBS a member of GLEON. There is a perfunctory process to becoming a GLEON member. Here is the application website to give you an idea of how simple it is.
Here is a map of current site members (notice Flathead and Archbold are currently members, RMBL does not appear to be). Individual membership is also a possibility and is something I believe we should encourage our researchers to do.
Potential membership benefits:
GLEON has funding for student travel within GLEON sites and separate funding for a student-exchange program. This would be a good way to get our aquatics faculty/students interested.
Of course, our researchers would also be put into a good position to collaborate with groups that are collecting data similar to ours (assuming they take it upon themselves to attend GLEON conferences.)
GLEON offers technical assistance in data management and sensor deployment. Right now, MHL has the ability to cover these needs for us but additional support never hurts.
Data Turbine:
Data Turbine is essentially server software specifically designed for processing (e.g. QA/QC, plotting) and routing sensor data streams to websites and databases. It was originally developed by NASA for aeronautics sensors, and a group at the San Diego Supercomputer Center has tailored it to environmental field sensors and tested it in Wisconsin lakes.
The Data Turbine group is currently looking for collaborators as the software has recently exited the testing phase and is entering the deployment phase. I am going to pass this information onto the Marine Hydrodynamics Lab who may find it useful with the U-GLOS system.
Sameer Tilak (stilak@ucsd.edu) is a developer on the project and would be a good contact if we chose to use Data Turbine for anything.
Information Technology working group:
I sat in on 3 of 4 meetings among this group. We discussed Technical Documentation needed for individual sites to stream buoy data to the GLEON website. I pressed fairly hard on this point but did not obtain much documentation. Apparently, said documents exist in rough draft form but I cannot verify it (I even volunteered to contribute to the documentation when we begin streaming data to GLEON!) I will continue to request the documentation and plan to get it before our buoy goes online.
Other conversations included the need to establish QA/QC protocols for incoming data. Currently, most of the data stored is raw. There is also a need for getting the most recent datasets from the individual sites into the GLEON database.
Two groups of researchers have published pan-GLEON studies. Neither of these studies used data from the GLEON database (they solicited data from individual sites), which highlights the need to improve this resource. Right now, GLEON has developed critical tools for data sharing but they are not mature enough and are not highly used by researchers. This is something to consider before we begin streaming data to GLEON (i.e., how useful will it be to our researchers?).
The GLEON IT group has developed a controlled vocabulary list for use with lake data collection. The current vocab list is on the GLEON website. GLEON has also recently passed a data access policy. I have requested a copy to compare to our policy.
Miscellaneous:
I met John Lenters from the University of Nebraska and will pass his contact information to Guy Meadows. John has deployed buoys year-round in the Arctic and is involved in a buoy deployment on Granite Island on Lake Superior (north of Marquette).
Craig Williamson at Miami of Ohio and his collaborators at Kent State recently received IGERT funding. They will be deploying lake buoys in the coming year(s) and are looking for additional collaborators. I am open to suggestions as to how best to evaluate if UMBS researchers could benefit from this.
That's it for now!
Posted by kkwaiser at 10:11 AM | Comments (0)