
February 28, 2011

Execute Python script from CLI and print output

Another example Python script. The idea here is to input a CSV file and kick out a SQL statement to quickly insert the data into a database table.

Updated - see 5)

1) Create a Python Script, call it example_script.py:
- sys.argv holds the command-line arguments; sys.argv[1] is the first one after the script name
- print writes its output to the terminal

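#! /usr/bin/python

import csv, sys

try:
    fileName = sys.argv[1]
    # Read every row of the CSV into a list of lists
    inFile = list(csv.reader(open(fileName, 'r'), delimiter=','))
    print(str(inFile))

except IOError:
    print('Can\'t open file for reading.')
    sys.exit(1)  # non-zero exit status signals the failure to the shell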

2) Make the file executable (taken from this forum post.)

$ sudo chmod +x example_script.py

3) Execute!
$ sudo ./example_script.py ~/Desktop/example_inputFile.csv

OR

$ sudo python example_script.py ~/Desktop/example_inputFile.csv

4) To pass additional arguments via the command line (e.g. database name) you can add them after the execute command and alter the code a bit.

Here's a more complete example using a script I've written to read a CSV and output an SQL insert statement:

Execute:
$ sudo ./example_script.py ~/Desktop/inputFile.csv [database] [table]

This script:


#! /usr/bin/python
import csv, sys

try:
    fileName = sys.argv[1]
    db = sys.argv[2]
    table = sys.argv[3]

    # Open CSV into a list
    in1 = list(csv.reader(open(fileName, 'r'), delimiter=','))

    # Write INSERT, db and table, column names and begin VALUES
    print("INSERT INTO {0}.{1} ".format(db, table) + "\n")
    # First row must match table fields; joined unquoted so they read as column names
    print("(" + ", ".join(in1[0]) + ")\n")
    print("VALUES\n")

    # Write the data (i.e., rows 2:End)
    rows = in1[1:]
    for i, row in enumerate(rows):
        # Compare positions, not values, so a duplicate of the last row still gets a comma
        if i < len(rows) - 1:
            print("(" + str(row).strip('[]') + "),\n")
        else:
            print("(" + str(row).strip('[]') + ");\n")

except IOError:
    print('Can\'t open file for reading.')
    sys.exit(1)  # non-zero exit status signals the failure to the shell
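
One caveat with the script above: str(row).strip('[]') leans on Python's list formatting to quote the values, so a value containing an apostrophe (e.g., O'Brien) comes out wrapped in double quotes, which standard SQL treats as an identifier rather than a string. Here is a minimal sketch of a stricter formatter, assuming Python 3. The names sql_quote and build_insert are hypothetical helpers, not part of the original script, and doubling single quotes is only the basic SQL escape; for untrusted data, parameterized queries through a database driver are the safer route.

```python
import csv
import io


def sql_quote(value):
    """Wrap a value in single quotes, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def build_insert(rows, db, table):
    """Build one multi-row INSERT statement from CSV rows.

    rows[0] holds the column names; the remaining rows hold the data.
    """
    header, data = rows[0], rows[1:]
    value_lines = [
        "(" + ", ".join(sql_quote(v) for v in row) + ")" for row in data
    ]
    return (
        "INSERT INTO {0}.{1}\n".format(db, table)
        + "(" + ", ".join(header) + ")\n"
        + "VALUES\n"
        + ",\n".join(value_lines)
        + ";"
    )


# Example with an in-memory CSV standing in for inputFile.csv
sample = io.StringIO("name,site\nO'Brien,UMBS\nSmith,UMBS\n")
statement = build_insert(list(csv.reader(sample)), "mydb", "people")
print(statement)
```

Joining the value lines with ",\n" and appending a single ";" also sidesteps the question of which row is last.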

5) To redirect the output to a text file (via the terminal) use this command:

$ sudo ./format_csv_cli_output.py ~/Desktop/inputFile.csv [database] [table] > ~/Desktop/cli_output.txt

Source: Comment by Tomosaur in this issue.
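
As an alternative to reading sys.argv by position and redirecting in the shell, the arguments from 4) and the output file from 5) could be handled inside the script with the standard argparse module. This is only a sketch: the -o/--output option and the function names parse_args and emit are my own assumptions, not something the original script has.

```python
import argparse
import sys


def parse_args(argv):
    """Parse the same positional arguments the script reads from sys.argv."""
    parser = argparse.ArgumentParser(description="Turn a CSV into an SQL INSERT")
    parser.add_argument("csv_file", help="path to the input CSV")
    parser.add_argument("database", help="target database name")
    parser.add_argument("table", help="target table name")
    # Hypothetical extra option: write to a file instead of the terminal
    parser.add_argument("-o", "--output", default=None, help="output file path")
    return parser.parse_args(argv)


def emit(text, output_path=None):
    """Write text to output_path if given, otherwise to standard output."""
    if output_path is None:
        sys.stdout.write(text + "\n")
    else:
        with open(output_path, "w") as handle:
            handle.write(text + "\n")


# In a real script this would be parse_args(sys.argv[1:])
args = parse_args(["inputFile.csv", "mydb", "mytable", "-o", "cli_output.txt"])
print(args.csv_file, args.database, args.table, args.output)
```

A side benefit is that argparse prints a usage message when an argument is missing, where indexing sys.argv directly raises an IndexError.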

Posted by kkwaiser at 09:54 AM | Comments (0)

February 25, 2011

Days in month

Linking to an interesting discussion of whether a unit (e.g., meters) should be associated with a variable that is a count. For example, does a variable such as Number of Days in Month receive a unit of "days"?

The exchange is between a few ecological data types. I'll cite a few snippets here.

> This question is drifting out of the datetime issue and into the count issue which I don't think we ever resolved very well. To my recollection, counts (and percents, etc) were not considered measurements by STMML and are all dimensionless units. I don't think we are going to start naming units for all the things we might count integer numbers of, are we? I would have encoded day of the year as simply ratio with a unit of dimensionless and a domain of 0 to 365.

We put 'nominalDay' into EML as a 'day' unit of constant length just to accommodate those people that wanted to refer to time durations without reference to a calendar and all of its associated problems. Thus, a nominal day is a unit of time that is exactly 60*60*24 seconds.
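
To make the distinction in the snippets concrete, here is a small Python sketch of my own (not from the email exchange): the number of days in a month is treated as a plain integer count, while a duration expressed in nominal days converts to seconds at the fixed rate of 60*60*24. The function names are hypothetical.

```python
import calendar

# A nominal day as described above: a fixed-length unit of time.
SECONDS_PER_NOMINAL_DAY = 60 * 60 * 24


def days_in_month(year, month):
    """Return the number of days in a month as a plain integer count."""
    # calendar.monthrange returns (weekday of the first day, number of days)
    return calendar.monthrange(year, month)[1]


def nominal_days_to_seconds(days):
    """Convert a duration in nominal days to seconds."""
    return days * SECONDS_PER_NOMINAL_DAY


count = days_in_month(2011, 2)             # a count: dimensionless
duration = nominal_days_to_seconds(count)  # a duration: seconds
print(count, duration)
```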

Posted by kkwaiser at 03:18 PM | Comments (0)

February 24, 2011

Further Private Download instructions

Posted by kkwaiser at 12:06 PM | Comments (0)

Opensource Web Mapping and GIS Summary

Premise: we do not have a sufficient plan for managing spatial data at UMBS. Currently, it is housed on a computer maintained by the Resident Biologist which, while backed up and secure, makes version control, editing and access difficult. Ideally, at some point we will move this to a web infrastructure. I've got a lot to learn on this subject so here goes.

Following this talk Stefan Steiniger - Building on Open Source GIS @ COSSFest 2010:


Web Map Server Software

Web GIS Server - server-side data processing utilizing WPS

Spatial Database Management Systems

Catalogue/Registry Metadata - data discovery tools which offer UI's for querying and displaying data

Posted by kkwaiser at 09:12 AM | Comments (0)

February 22, 2011

IRC with Bdragon

This is a conversation I had on IRC about whether patches to add polygon support and multiple points per map would be accepted into the GMap module.

< start>

< kbk1> Bdragon: got a sec for a quick question on this?

< Druplicon> http://drupal.org/node/338745 => Multiple locations on a single map => Location, Code, normal, needs review, 57 comments, 1 IRC mention

< Bdragon> kbk1: Things are pretty hectic over here atm

< kbk1> Bdragon: My employer is willing to sponsor completion of those patches. It'll be for D6 (instead of D7). Are you OK with committing this to the D6 branch?

< kbk1> Bdragon: I just want confirmation before we pony up.

< Bdragon> kbk1: Sponsorship is problematic for me atm because I'm fulltime with Tag1 Consulting.

< kbk1> Bdragon: we would go through Agileware; I've already talked with them and gotten an estimate

< Bdragon> kbk1: As is, I don't like the idea of storing polys as multiple locations. I had to maintain a site that did this in production and it's slow and buggy. Much better to use spatial tables... Have you looked at the geo module yet?

< kbk1> Bdragon: That's understandable. Would you commit the "Multiple locations on a single map" patch? I've looked into Geo a bit but wouldn't switch to it in the D6 incarnation of my site.

< Bdragon> kbk1: Not as it currently is. For it to be reliable locations an offset column will need to be managed in {location_instance} -- location doesn't guarantee that it returns stuff in the same order currently.

< Bdragon> kbk1: I am quite opposed to the patch, I spent weeks getting *rid* of it on a site.

< Bdragon> kbk1: It needs to be done in a much cleaner way than that patch

< kbk1> Bdragon: That explains why it has been pending for 3 years. This is beginning to sound like a 'no'.

< Bdragon> kbk1: It's outside the scope of location. [This post] is still relevant.

< Druplicon> http://groups.drupal.org/node/6089 => Bdragon's vision for doing locations "right" in Drupal => 31 comments, 11 IRC mentions

< kbk1> Bdragon: Seems things are more complicated than they appear. Either way, Gmap and Location work great for us and perhaps we can contribute in some other way in the future. Thanks.

< Bdragon> kbk1: I followed up with a suggestion.

< kbk1> Bdragon: Thanks. I collaborate with a few other data manager types and geographic data in Drupal is a big question of ours. It's not out of the question that we would submit for federal grant money to fund improvements. If our conversation ever goes there we will keep all of this in mind.

< Bdragon> kbk1: re: a grant -- good idea. I really want geospatial in drupal to go somewhere, but bolting adhoc junk onto gmap and location isn't the way to get there. Bolting conformant stuff on the other hand (WKT support for gmap would be awesome for one, but I don't have the time to write it)

< end>
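
For reference, WKT (well-known text) stores a whole geometry as one string instead of as a set of separate location records. Below is a minimal Python sketch of what a polygon looks like in that form. It is my own illustration, not code from GMap, Location or Geo; polygon_to_wkt is a hypothetical helper and the coordinates are made up.

```python
def polygon_to_wkt(points):
    """Format (longitude, latitude) pairs as a WKT POLYGON string.

    WKT writes each vertex as "x y" and expects the ring to be closed,
    so the first vertex is repeated at the end when it is missing.
    """
    ring = list(points)
    if len(ring) < 3:
        raise ValueError("a polygon needs at least three vertices")
    if ring[0] != ring[-1]:
        ring.append(ring[0])
    vertices = ", ".join("{0} {1}".format(lon, lat) for lon, lat in ring)
    return "POLYGON ((" + vertices + "))"


# Made-up coordinates for illustration only
plot = [(-84.70, 45.55), (-84.60, 45.55), (-84.60, 45.60), (-84.70, 45.60)]
wkt = polygon_to_wkt(plot)
print(wkt)
```

Because the vertex order lives inside the string, there is no dependence on the order in which separate location rows come back, which is the reliability problem raised in the conversation.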

Posted by kkwaiser at 12:05 PM | Comments (0)

February 21, 2011

Data Management Resource List

A collection of items I'm pulling together as a potential resource to place on the Research Gateway.

How to Write a Data Management Plan for a National Science Foundation (NSF) Proposal

LSA Joint IT Research Committee Data Management Plan Webpage

Writing DMPs - ICPSR

ICPSR Data Management Webinars

UM Library Data Management Page

Posted by kkwaiser at 04:25 PM | Comments (0)

February 18, 2011

Shibboleth at UM

Another email snippet, this one from MM.

Have folks found commercial hosting providers that can handle installation, config, and support of Cosign? Well, there are different sorts of hosting providers. The problem is not with the Drupal cosign module, it's whether the hosting provider is willing to handle mod_cosign for Apache HTTP Server, which the Drupal cosign module requires.

If you purchase a VM guest from a VPS hosting provider -- or find a PHP/MySQL hosting provider that provides root access to its customers -- then you can install and configure mod_cosign yourself. Of course, this is not what you were asking for.

cosign is a tough sell to PHP/MySQL hosting providers that do not provide their customers with root access, since cosign is primarily used by a few dozen higher-ed institutions around the world: not a very big market. Shibboleth would be a much easier sell, since it is supported by over 272 institutions in North America and over a thousand worldwide. Note that Shibboleth here at U-M uses cosign for authenticating U-M users, but that the current version of Shibboleth does not support multifactor authentication (e.g., MTokens), reauthentication, global log-out, or (as Shibboleth is currently deployed at U-M) Friend or other guest accounts, all of which are supported by cosign. For more information on Shibboleth, see

http://www.itcs.umich.edu/identity/shibboleth/faq/itstaff.php
http://www.itcs.umich.edu/identity/shibboleth/faq/basics.php

I hope this helps.

Posted by kkwaiser at 08:54 AM | Comments (0)

February 17, 2011

Notes on ITS hosted Drupal Install - Common tweaks

You should probably start with ITS's guide for hosting Drupal on their virtual servers.

Email via the www-sig email list from CC:

I wouldn't say hosting a Drupal site with ITS has no problems. It's entirely doable (I've installed three there now), but here's just a sampling of a few things that many prospective Drupal admins will have to do to make it work just right...

- ask them to increase php upload limit
- ask them to increase suhosin limits
- ask them to increase php max filesize
- wrestle with drupal cosign module integration oddities (like the super user admin needing to be a real U-M person with a kerberos login - but who wants the admin login to be specific to one person who could leave that post some day? makes no sense).
- the need to add fairly common php modules via the drupal config file (settings.php). For example (from my settings.php file):

/**
 * Custom settings, some specific to UM ITS servers
 */

# This should load the gd library needed for images:
dl('gd.so');

# To make Insert module use relative URLs, insert this into your settings.php file:
# http://drupal.org/node/622964#comment-2451810
$conf['insert_absolute_paths'] = FALSE;

# Load unicode support if not already present
if (!extension_loaded('mbstring')) {
  if (!dl('mbstring.so')) {
    exit;
  }
}

# For the XML Sitemap module
if (!extension_loaded('xmlwriter')) {
  if (!dl('xmlwriter.so')) {
    exit;
  }
}


- default Drupal .htaccess requires changes for successful installation. Try commenting out all of the Options, for starters.
- and more. For example, to access the ITS error logs for a site. "Use a web browser to view /cgi-bin/logs on your host (http://your-web-site.umich.edu/cgi-bin/logs)." - http://www.itcs.umich.edu/web/log-viewer.php


Here is a Drupal 7 related note (with error):

PHP Fatal error: Call to undefined function json_encode() in ~/includes/common.inc on line 4807

Solution is to add the following to settings.php:


if (!extension_loaded('json')) {
  if (!dl('json.so')) {
    exit;
  }
}


These aren't MASSIVE inconveniences (aside from the cosign issue). But they're annoying. And it means that you have to jump through extra hoops just to get your Drupal install to work with ITS hosting, whereas most commercial Drupal hosts make the installation and management of Drupal sites their core business. Therefore, it works better.

We chose U-M hosting mainly out of loyalty. But there are plenty of commercial Drupal hosts out there who offer incredible services (like one-click Drupal site creation, full access for managing the files, dev-to-production workflows, and more). I use a commercial service for my sandboxes and it costs less than $150 a year and I couldn't be happier. I know that's more expensive than ITS hosting. But the reasons are obvious.

Also, a quick note or two about the wasup Drupal service*: you have to send them modules, libraries, and themes to install for you (no FTP access to the "sites/all" directory) - a big inconvenience in my opinion. Also, I noticed a lag in updates to Drupal core. I don't find it a very useful service beyond the most simple site-building tasks. Some use it for their main site. They must not mind the extra steps.

* This statement appears to be less than 100% accurate. Here's a reply from drupal-people:


This actually isn't true because they can now set up a spot in AFS for
you that will sync to the sites/yourdomain.com/ directory. Combine
that with your VCS hosted on Codeblue and you have a pretty good
deployment process - no FTP required, just SSH into your AFS spot and
pull updates in to be synced. Also I think your statement regarding
sites/all is a little misleading, since this isn't what you really
need access to in a multi-site setup like theirs. Your comment about
a lag in Drupal core updates is true though since their version is
about 2 years old now with many missing security updates and
incompatibilities with newer modules, something most would find
unacceptable in a cloud hosting platform.

Posted by kkwaiser at 04:41 PM | Comments (0)

Node Ref Create & Node Import

With a large number of housing applications about to come in, I figured it would be worth my time to expedite the import process. This necessitated a small patch on the Node Import module to compensate for the fact that I use Node Reference Create. This issue documents my work.

Posted by kkwaiser at 09:43 AM | Comments (0)

February 16, 2011

Print $node array at submission

Drupal IRC snippet

kbk1: I need to get a print out of the array that is created when someone hits submit on the node/add form. I've got devel and drupalforfirebug installed and can see the form before submission. Help?

xjm> kbk1: If I recall you'll want to do a dpm($form_state), maybe in a hook_nodeapi
kbk1: nodeapi when $op == 'validate'

Posted by kkwaiser at 02:44 PM | Comments (0)

Notes on Views Custom Field (Drupal 6)

A few notes on Views Custom Field:

1) Enabling the module adds a new Field Type, "CustomField", to the Views configuration page.

2) Simple snippet used in the PHP field type:

 
<?php
print_r($data->node_data_field_app_fname_field_app_role_value); // prints a single value
print var_export($data, TRUE); // prints the data array
print_r($data); // prints the data array
dsm($data); // requires Devel, neater output
?>

3) Weird behavior alert! The field names change when new fields are added. There are workarounds, but so far a true solution is found only in the Drupal 7 version, Views PHP.

Great module.

Bonus snippet:


<?php
echo $data->node_node_data_field_housing_inst_affil_title . " [nid:" . $data->node_node_data_field_housing_inst_affil_nid . "]";
?>

Posted by kkwaiser at 08:45 AM | Comments (0)

February 10, 2011

Documentation for the DEIMS EML2Drupal module

Notes:

eml_config

- Should "Site name acronym" be limited to 3 letters? That's an LTER thing.
- URL should be admin/eml_config with a link "EML Configuration" on the admin page.
- Capitalize the first letter of each field (e.g., "Country")
- An optional button to make Provider info the same as Publisher info
- The form uses a "Create" button for EML Config File -> confirmation message should read "Your EML settings have been created." rather than "saved."
- A way to view/modify EML Config files would be good; after submitting my first thought was, where'd my Config info go??

eml_view

- Need to include a base_url when constructing the datatable url:

- You need to be logged in to view the EML file?

- What is the point of www.yoursite.com/eml_view?

Attaching an EML feed to a dataset panel template:

- Modified the EML Style Feed to "Provide as file" and changed to .xml
- Attached to the Default EML View
- Inserted the Default View into the Panels template

Posted by kkwaiser at 09:12 AM | Comments (0)

February 07, 2011

Configuring Sendmail in Drupal 6

Apparently you're not supposed to use your uniqname to authenticate to the UM SMTP server. This is a placeholder for an ongoing task.

Here is a post on configuring Sendmail. Not so helpful post on same topic.

Good overview of three ways to handle email in Drupal.

Posted by kkwaiser at 02:09 PM | Comments (0)

February 01, 2011

Identify and kill slow queries

I've run into problems where a Drupal Views query involving Users executes extremely slowly. Here is how to identify and kill it:

Log in to MySQL:
$ mysql -u [dbuser] -h [dbhost] -p

Show processes:
mysql> show processlist;
mysql> show full processlist;
mysql> show processlist\G

Kill query:
mysql> kill query 1831;
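
If this comes up often, picking the offenders out of the process list can be scripted. Here is a minimal Python sketch, assuming the rows have already been fetched as dictionaries keyed by the SHOW PROCESSLIST column names (Id, Command, Time, Info). The function name, the 60-second threshold and the sample rows are all made up for illustration; it only builds the statements and does not connect to MySQL.

```python
def kill_statements(processes, min_seconds=60):
    """Return KILL QUERY statements for long-running queries.

    processes: list of dicts keyed like the SHOW PROCESSLIST columns
    (Id, Command, Time, Info). min_seconds is a hypothetical threshold.
    """
    statements = []
    for proc in processes:
        # Only running statements have Command == "Query"; idle
        # connections show "Sleep" and are left alone.
        if proc["Command"] == "Query" and proc["Time"] >= min_seconds:
            statements.append("KILL QUERY {0};".format(proc["Id"]))
    return statements


# Made-up rows standing in for real SHOW FULL PROCESSLIST output
sample = [
    {"Id": 1831, "Command": "Query", "Time": 412, "Info": "SELECT ... FROM users ..."},
    {"Id": 1840, "Command": "Sleep", "Time": 900, "Info": None},
    {"Id": 1852, "Command": "Query", "Time": 0, "Info": "show full processlist"},
]
for statement in kill_statements(sample):
    print(statement)
```

Reviewing the generated statements before running them is a good idea, since the Info column shows which query each Id belongs to.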

Posted by kkwaiser at 03:44 PM | Comments (0)