
March 31, 2011

Adding stable file download links to the Research Bibliography - Part 2

So, Jim Ottaviani, who puts the DeepBlue into DeepBlue, got back to me with questions on how much overlap exists between our collections. Yes, this is the question I tried to head off, obviously without success. To get a rough idea, I compared the publication lists of UMBS and DeepBlue for two authors, BT Hazlett and BA Hazlett, and came up with these statistics:

23 - number of publications listed with UMBS
31 - number of publications listed with DeepBlue
35% - percent of UMBS Pubs found in the entire DeepBlue collection (i.e., pubs in common)
38% - percent of pubs in common found in UMBS' DeepBlue Collection
25% - percent of pubs in common that had minor differences in the title

This is not a random sample, but the take-home messages appear to be:
1) There is significant - but hardly complete - overlap between the collections.
2) Title is an easy but incomplete way to match.
3) Incidentally, the title differences occurred in the last few words.
4) We'll need to search the entire DeepBlue collection, not just UMBS' portion.

Open questions:
1) Do we know perl regular expressions? I don't.
2) How many items (pubs/papers) are in the entire DeepBlue collection?
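On the regular-expressions question: it doesn't have to be perl. Here's a rough sketch in Python of the kind of matching that would handle titles differing only in their last few words. The titles in the example are hypothetical, and the 0.8 threshold is an arbitrary starting point, not a tested value:

```python
import re

def normalize(title):
    """Lowercase, strip punctuation, and collapse whitespace so minor
    formatting differences don't block a match."""
    title = re.sub(r"[^a-z0-9\s]", " ", title.lower())
    return re.sub(r"\s+", " ", title).strip()

def titles_match(a, b, min_overlap=0.8):
    """Call it a match when one normalized title is a prefix of the other
    (the observed differences were in the last few words), or when enough
    leading words agree."""
    na, nb = normalize(a), normalize(b)
    if na.startswith(nb) or nb.startswith(na):
        return True
    wa, wb = na.split(), nb.split()
    common = sum(1 for x, y in zip(wa, wb) if x == y)
    return common / max(len(wa), len(wb)) >= min_overlap

# Hypothetical pair illustrating a "differs in the last few words" match:
print(titles_match("Crayfish feeding responses to zebra mussels",
                   "Crayfish feeding responses to zebra mussels (Dreissena polymorpha)"))
# → True
```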

Raw data below the fold.

ID, In DBlue?, In UMBS DBlue Collection?, Notes
1, No, No,
2, No, No,
3, No, No,
4, No, No,
5, Yes, No,
6, No, No,
7, No, No,
8, No, No,
9, No, No,
10, No, No,
11, No, No,
12, Yes, No,
13, No, No,
14, No, No,
15, No, No,
16, Yes, No,
17, Yes, No, Title differs slightly
18, Yes, Yes,
19, No, No,
20, No, No,
21, Yes, Yes,
22, Yes, No,
23, Yes, Yes, Titles differ slightly
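As a sanity check, the summary percentages fall straight out of this table (row IDs with "Yes" transcribed by hand):

```python
# Recompute the summary stats from the raw table above.
in_dblue = {5, 12, 16, 17, 18, 21, 22, 23}   # "In DBlue?" = Yes
in_umbs_coll = {18, 21, 23}                  # "In UMBS DBlue Collection?" = Yes
title_differs = {17, 23}                     # rows whose notes flag a title mismatch
n_umbs = 23                                  # publications listed with UMBS

print(round(100 * len(in_dblue) / n_umbs))              # → 35 (pubs in common)
print(round(100 * len(in_umbs_coll) / len(in_dblue)))   # → 38
print(round(100 * len(title_differs) / len(in_dblue)))  # → 25
```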

Posted by kkwaiser at 03:52 PM | Comments (0) | TrackBack

Google Indexing speed

I'm continuing to update this post, which is why I keep moving it to the top.

I'm going to try to track how quickly Google indexes the URLs of the Research Gateway. I've noticed the number of indexed URLs drops at times; this discussion may explain why.

Date, Number of URLs indexed, URLs in sitemap, Proportion indexed
March 4th 2011, 2
March 7th 2011, 483
March 8th 2011, 483
March 9th 2011, 746
March 10th 2011, 1005
March 11th 2011, 1313
March 14th 2011, 1662
March 15th 2011, 1667, 11723, 0.14
March 16th 2011, 2058, 11724, 0.18
March 17th 2011, 2048, 11724, 0.175
March 18th 2011, 2354, 11727, 0.20
March 21st 2011, 2598, 11727, 0.22
March 22nd 2011, 2473, 11727, 0.21
March 23rd 2011, 2538, 11730, 0.216
March 24th 2011, 2499, 11734, 0.213
March 25th 2011, 2393, 11734, 0.204
March 26th 2011, 2788, 11735, 0.238
March 28th 2011, 2876, 11735, 0.245
March 29th 2011, 2890, 11741, 0.246
March 30th 2011, 3148, 11746, 0.268
March 31st 2011, 3338, 11745, 0.284
April 1st 2011, 3929, 11747, 0.334
April 4th 2011, 4512, 11747, 0.384
April 18th 2011, 4544, 11748, 0.387
June 17th 2011, 7418, 11793, 0.629
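The proportion column is just indexed URLs over sitemap URLs; for the final row:

```python
# Proportion of sitemap URLs that Google has indexed (last row above).
indexed, sitemap_urls = 7418, 11793
print(round(indexed / sitemap_urls, 3))  # → 0.629
```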

Posted by kkwaiser at 02:42 PM | Comments (0)

March 30, 2011

Installing PostGIS and Postgres on Ubuntu

Note: These are my working notes; there are better tutorials online.

PostgreSQL and pgAdmin
$ sudo apt-get install postgresql postgresql-client postgresql-contrib pgadmin3

$ sudo apt-get install postgresql-8.4-postgis

$ sudo su postgres
$ createdb postgistemplate
$ createlang plpgsql postgistemplate

These fail with "No such file or directory" - note the paths reference the 8.2 PostGIS package even though the 8.4 package was installed, so the corresponding 8.4 paths are likely what's needed:
psql -d postgistemplate -f /usr/share/postgresql-8.2-postgis/lwpostgis.sql
psql -d postgistemplate -f /usr/share/postgresql-8.2-postgis/spatial_ref_sys.sql

Install phpPgAdmin (because it is different from pgadmin3):
$ sudo apt-get install phppgadmin

$ sudo nano /etc/apache2/apache2.conf
$ sudo nano /etc/phppgadmin/apache.conf
$ sudo service apache2 reload

Login to postgresql:
$ sudo -u postgres psql postgres

Set superuser password:
postgres=# \password postgres

Restart postgresql:
$ sudo /etc/init.d/postgresql-8.4 restart

To connect via phpPgAdmin you need to create a non-superuser role (i.e., you can't connect as the postgres user).

Posted by kkwaiser at 03:09 PM | Comments (0) | TrackBack

Adding stable file download links to the Research Bibliography - Part 1

I would like to add a stable link from our bibliographic entries to pdf downloads that DeepBlue has available. The first step in this process is to determine how many of our publications they actually have record of. I am sending them the title, year, journal and first author information for our publications so that they can run a quick-n-dirty match.

Here's the SQL statement I used to pull the information from the database:

SELECT drupal_node.nid,drupal_biblio.biblio_citekey, drupal_biblio.biblio_year, drupal_biblio_contributor_data.lastname, drupal_biblio_contributor_data.firstname, drupal_node.title, drupal_biblio.biblio_secondary_title
FROM `drupal_node`
LEFT JOIN drupal_biblio ON drupal_biblio.nid = drupal_node.nid
LEFT JOIN drupal_biblio_contributor ON drupal_biblio_contributor.nid = drupal_biblio.nid
LEFT JOIN drupal_biblio_contributor_data ON drupal_biblio_contributor_data.cid = drupal_biblio_contributor.cid
WHERE drupal_node.type = 'biblio'
AND drupal_biblio_contributor.rank = '0'
ORDER BY drupal_biblio.biblio_year DESC

OK, for whatever d@mn reason, the above query breaks when I run it on ITS servers (but not on my dev machine). The strangest part is that the query runs, but when I go to export it through phpMyAdmin the export breaks and I end up exporting just the FROM table. The problem, apparently, is the WHERE clause, because the export works as expected without it. The following is what I used on the ITS servers:

SELECT drupal_biblio.nid,drupal_biblio.biblio_citekey, drupal_biblio.biblio_year, drupal_node.title, drupal_biblio_contributor_data.lastname, drupal_biblio_contributor.rank

FROM drupal_biblio

LEFT JOIN drupal_node ON drupal_node.nid = drupal_biblio.nid
LEFT JOIN drupal_biblio_contributor ON drupal_biblio_contributor.nid = drupal_biblio.nid
LEFT JOIN drupal_biblio_contributor_data ON drupal_biblio_contributor_data.cid = drupal_biblio_contributor.cid

ORDER BY drupal_biblio.biblio_year DESC

I've tweaked the above query a bit:

SELECT drupal_biblio.nid, drupal_biblio.biblio_year, drupal_node.title, drupal_biblio_contributor_data.lastname,drupal_biblio_contributor_data.firstname, drupal_biblio_contributor_data.initials, drupal_biblio.biblio_secondary_title, drupal_biblio_contributor.rank

FROM drupal_biblio

LEFT JOIN drupal_node ON drupal_node.nid = drupal_biblio.nid
LEFT JOIN drupal_biblio_contributor ON drupal_biblio_contributor.nid = drupal_biblio.nid
LEFT JOIN drupal_biblio_contributor_data ON drupal_biblio_contributor_data.cid = drupal_biblio_contributor.cid

ORDER BY drupal_biblio.biblio_year DESC
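To see what the rank = 0 filter in the first query buys you - one row per publication, first author only - here's a toy version of the join using Python's sqlite3 as a stand-in (the schema is pared down and the data is made up; the real tables live in MySQL):

```python
import sqlite3

# Minimal stand-ins for the Drupal biblio tables.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE node (nid INTEGER, type TEXT, title TEXT);
CREATE TABLE biblio (nid INTEGER, biblio_year INTEGER);
CREATE TABLE biblio_contributor (nid INTEGER, cid INTEGER, rank INTEGER);
CREATE TABLE biblio_contributor_data (cid INTEGER, lastname TEXT);
INSERT INTO node VALUES (1, 'biblio', 'Some paper');
INSERT INTO biblio VALUES (1, 2009);
INSERT INTO biblio_contributor VALUES (1, 10, 0), (1, 11, 1);
INSERT INTO biblio_contributor_data VALUES (10, 'Hazlett'), (11, 'Smith');
""")

# rank = 0 keeps only the first author, so each pub appears once.
rows = con.execute("""
SELECT node.title, biblio.biblio_year, d.lastname
FROM node
LEFT JOIN biblio ON biblio.nid = node.nid
LEFT JOIN biblio_contributor c ON c.nid = biblio.nid
LEFT JOIN biblio_contributor_data d ON d.cid = c.cid
WHERE node.type = 'biblio' AND c.rank = 0
""").fetchall()
print(rows)  # → [('Some paper', 2009, 'Hazlett')]
```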

Posted by kkwaiser at 10:40 AM | Comments (0) | TrackBack

March 28, 2011

Overlap between UM and UMBS Herbariums

I am in the process of exploring a Biological Research Collections (NSF) grant that would see us partnering with the UM Herbarium to digitize portions of our herbarium collection. For now, we have focused on Howard Crum's collection (Bryophytes) as a potential starting point due to its historical importance and a belief that the UM Herbarium does not have these items in its collection.

To test this latter assumption, I sampled 50 records from our Bryophyte collection and searched the UM Herbarium's collection of Bryophytes collected by Howard Crum to see how many matches we had. A large number of matches would indicate high overlap and decrease the uniqueness of our collection. I hope those unedited explanations make sense.

Here are the numbers:

2% - percent of our sampled records (i.e., one) with a match in the UM Herbarium
44% - percent of UMBS records I encountered that were not collected by H. Crum (22/50)
5 - number of times the UM Herbarium had items from the same date and location as a UMBS record but not the same specimen

Conclusion: a quick glance indicates the UMBS Bryophyte collection is largely not represented within the UM Herbarium.

Posted by kkwaiser at 04:35 PM | Comments (0) | TrackBack

March 24, 2011

Trying EDIT's DarwinCore Module

I've tried EDIT's Darwin Core Module before and always run into bugs. This time, though, I am going to take notes.

To reproduce
1) Download and enable the Darwin Core Module
2) Create a DwC Location
3) Create a DwC Specimen
- within the Specimen Node Add form, try to create a new Location


PHP Fatal error: Only variables can be passed by reference in /var/drupal/sites/all/modules/location/location.module on line 1050, referer: http://umbs-desktop.eeb.lsa.umich.edu/drupal/node/add/darwincore

This error typically means a function's return value is being passed straight into a by-reference parameter (e.g., array_pop(explode(...))); assigning the result to a variable first is the usual fix.

Posted by kkwaiser at 04:45 PM | Comments (0) | TrackBack

March 23, 2011

Ubuntu's Computer Janitor

Note to self: Ubuntu's Computer Janitor will remove programs you rely on. It just swept aside my entire OpenOffice install. Not a big deal, but I'm not sure why those files even showed up. Interestingly, after reinstalling OpenOffice the new install does not show up on the Janitor list.

Posted by kkwaiser at 01:35 PM | Comments (0) | TrackBack

March 21, 2011

Pulling author and publication per year counts

Just a bit of sql documentation for now:

Pull all authors along with the NID of their publications:

SELECT drupal_biblio.nid, drupal_biblio_contributor_data.cid, drupal_biblio_contributor_data.name
FROM drupal_biblio
LEFT JOIN drupal_biblio_contributor ON drupal_biblio_contributor.nid = drupal_biblio.nid
LEFT JOIN drupal_biblio_contributor_data ON drupal_biblio_contributor_data.cid = drupal_biblio_contributor.cid

Count publications per author:

SELECT drupal_biblio_contributor_data.name, count(drupal_biblio_contributor_data.name)
FROM drupal_biblio
LEFT JOIN drupal_biblio_contributor ON drupal_biblio_contributor.nid = drupal_biblio.nid
LEFT JOIN drupal_biblio_contributor_data ON drupal_biblio_contributor_data.cid = drupal_biblio_contributor.cid
GROUP BY drupal_biblio_contributor_data.name
ORDER BY count(drupal_biblio_contributor_data.name) DESC

Pull Pubs per Year:

SELECT drupal_biblio.biblio_year, count(drupal_biblio.biblio_year)
FROM drupal_biblio
GROUP BY drupal_biblio.biblio_year
ORDER BY count(drupal_biblio.biblio_year) DESC

Pull Keywords used by Publications per year (remove group by year for variant):

SELECT drupal_biblio.biblio_year, drupal_term_data.name, COUNT(drupal_term_data.name)
FROM drupal_biblio
LEFT JOIN drupal_term_node ON drupal_term_node.nid = drupal_biblio.nid
LEFT JOIN drupal_term_data ON drupal_term_data.tid = drupal_term_node.tid
GROUP BY drupal_biblio.biblio_year, drupal_term_data.name
ORDER BY COUNT(drupal_term_data.name) DESC

- Bonus: Put results of this query (OK, modify a bit first) into a tag cloud. This website recognizes phrases.

Posted by kkwaiser at 05:13 PM | Comments (0)

IE emulator on Ubuntu

This is not a How To post. Think of it as a How I Tried and Failed post.

1> Installed Wine and IE 7 -> then learned Wine is not an emulator
2> Tried IE4Linux -> then learned it is not really maintained and is not compatible with Ubuntu 10.04 and/or I'm not smart enough to configure it.
3> UM Virtual Sites -> Haven't tried but assuming I will only see IE8 if I can get it to work from my machine.
4> User Agent Switcher -> Firefox Addon mentioned in some forum. Shot in the dark, no dice.
5> Find old XP laptop and place next to workstation -> this is what I've resorted to and where I wish I would have started.

Pinged a listserv I'm on for help and came away with these potential solutions:

1) Set up a virtual Windows OS using VirtualBox, Parallels, or VMWare.
- Probably the best long-term solution, but too involved considering my immediate need is simple and I already run a dual-boot.

2) Firefox Addon: IE NetRender
- appears to work, except it produces a static image of the specified page, while my problem (or, more accurately, IE's problem) is with drop-down menus that don't render.

3) Use IE8's developer tools to render as an IE7 browser.
- This works well enough in combination with my old laptop "solution"

Posted by kkwaiser at 12:50 PM | Comments (0)

VirtualBox on Ubuntu 10.04

Ignore this - I don't have an XP boot disk in front of me, so I'm not going to bother.

I need a Windows OS emulator; here are some notes.

Open Synaptic and select VirtualBox.

Type: Dynamically expanding storage
Location: /root/.VirtualBox/HardDisks/iesux.vdi
Size: 8.00 GB (8589934592 Bytes)

Posted by kkwaiser at 11:33 AM | Comments (0)

March 18, 2011

Dropdown menus hidden behind Panels Panes in Internet Explorer

Time to fix this issue. Navigation is severely inhibited because the drop-down menus aren't accessible. This is either a CSS/theming (Marinelli) issue or a rendering (Panels) issue. Or it could be something else altogether.

Related issues:

Nice Menu drop down disappearing behind nearby panel content in IE7+

Marinelli Drop-down issue with Panels on IE7

Menu hidden behind nice menu and panels in IE7

I ended up adding the following z-index line to layouts.css:

#utilities {
padding: 5px; /* originally 0; */
margin: 0px auto;
width: 970px; /* match page width */
z-index: 1000 !important;
}

Posted by kkwaiser at 10:58 AM | Comments (0)

March 16, 2011

Showing total number of items in a view

Wow. What. A. P.I.T.A.

All I want to do is add a header to a View that says the following:

You are viewing [numItems] of [totalItems] records.

and it is nigh impossible.


[crowd snickers]

All you do is modify this simple snippet:


<?php
$view = views_get_current_view();
$items_displayed = $view->pager['items_per_page'];
$num_rows = $view->total_rows;
?>

You are viewing <?php print $items_displayed; ?> of <?php print $num_rows; ?> records.

and you get what you are asking for.

[Not so fast Mister and Miss Opensource...]

If you are limiting the number of items to display, then you need to have a Pager on your view. Problem solved.

[There you go again...]

If your view takes an argument, a collision between Drupal Core and Views causes the pager not to render and your total row count is always 1. You might think adjusting the offset would work, but noooo.

If you have Houdini's stomach, you can hack Core. Or, maybe, you want to create a module to accomplish it - because you know your site doesn't have enough modules yet.

[But wait, simple, simpletons...]

The cause of your problem isn't actually the pager; it is the use of DISTINCT in Core, which returns a count of 1 - the very problem you were experiencing in the first place.

[Oh yeah. It gets better...]

While having an NID argument can cause the problem, so can having Content Access and/or ACL enabled.

Posted by kkwaiser at 12:04 PM | Comments (0)

March 14, 2011

Dataset todo's

Nothing but a boring todo list to make sure things don't fall between the cracks.

Received but not processed:

Done but up as an xlsx file:

lindsfp and lindpell
dailyppt.xlsx - not loaded
Cheb Climate Records - xlsx format
Mackinaw Climate Records - xlsx format
PELLPPT.xlsx - xlsx format
PELLTEMP.xlsx - xlsx format
solar.xlsx - xlsx format
umbstemp.xlsx -
Secchi Depth Readings

Douglas Lake levels.xlsx -
umbsppt.xlsx -

Posted by kkwaiser at 04:33 PM | Comments (0)

Potential OBFS Birds of a Feather session

Title: Identifying challenges, solutions and collaborative opportunities for information management at OBFS sites

Information management at OBFS sites is a challenging endeavor for many reasons. This session will be an open-forum discussion of how information managers at OBFS sites are responding to these challenges with an eye toward sharing solutions and identifying future collaborative channels. Potential topics:

- Funding (or lack thereof) for information management
- Avenues for collaboration and skill-sharing among IMs
- Data diversity and data management challenges
- Facilitating researcher involvement in data management activities
- Establishing and enforcing Data Management and Data Access policies
- Building and deploying information management systems
- Leveraging external data management resources
- Meeting new NSF Data Management requirements

Posted by kkwaiser at 02:21 PM | Comments (0)

March 11, 2011

Finding UMBS Pubs

The Research Bibliography is the single most important resource we have in documenting the history of research at the University of Michigan Biological Station. I've spent the last week or so updating it and was surprised by the number of publications I found via Google Scholar that we did not have record of. In fact, I was so surprised that I have decided to document my procedure so it can be reproduced for earlier years:

Visit Google Scholar

- Advanced Search
- Constrain the Date - I have exhausted the 2008-present timeline but have not touched 2007 or earlier. I recommend doing this one-year-at-a-time
- Suggested searches
"University of Michigan Biological Station"
"Biosphere-Atmosphere Research Training"
"Biosphere-Atmosphere Research"
"Biosphere-Atmosphere Research and Training"
Pellston, MI -"University of Michigan Biological Station"
"university of michigan biological field station"
"university of michigan field biological station"
"university of michigan field station"
prophet atmosphere -"university of michigan biological station" -peptide -god -religion -Mohammad -Muhammad -Mohamed -Muhammed


Ignore links to conference websites - we typically do not enter abstracts
Ignore links to deepblue.lib.umich
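The year-at-a-time searching above can be scripted up to the point of opening the browser. A sketch that builds the Advanced Search URLs - note the as_ylo/as_yhi year-range parameters and the URL shape are assumptions about how Scholar structures its queries, so verify before relying on them:

```python
from urllib.parse import urlencode

# Build one-year-at-a-time Scholar queries for the suggested search strings.
searches = [
    '"University of Michigan Biological Station"',
    '"Biosphere-Atmosphere Research Training"',
]
urls = []
for year in range(2005, 2008):  # the untouched years, one at a time
    for q in searches:
        params = urlencode({"q": q, "as_ylo": year, "as_yhi": year})
        urls.append("https://scholar.google.com/scholar?" + params)
print(len(urls))  # → 6
```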

Posted by kkwaiser at 09:59 AM | Comments (0)

March 09, 2011

Connecting ArcGIS and PostGIS

Doing some research on the viability of setting up a spatial server. One option is Postgres + PostGIS, but a precondition would be the ability to read data from PostGIS directly into ArcMap for viewing (if not editing).

How can I connect to a PostGIS database from ArcMap 9.3 and 10.0?
- Suggestions are ZigGIS for Arc 9.x
- Use a Query Layer to connect for Arc 10.x (viewing only)

- A summary of some of zigGIS' capabilities.
- Version 2.0 is proprietary but 3.0 will be open source. Here's the roadmap for 3.0.
- Thoughtful blogpost from a developer with first-hand involvement in zigGIS.

Query Layer in Arc 10.x

- What is a query layer?

- Summary information from ESRI.

- Potentially useful How To involving Query Layers, postGIS and Dekho(?).

Posted by kkwaiser at 09:33 AM | Comments (0)

March 07, 2011

Emacs related commands

For the time being, I am going to use Emacs as an editor for my R scripts. Welcome to the land of confusing command-line shortcuts:

Overall guide.

R specific guide.

SHOW BUFFERS - ctrl-x ctrl-b
KILL BUFFER - ctrl-x k

SEND ALL CODE TO R - ctrl-c ctrl-b
- note the similarity to SHOW BUFFERS (ctrl-x ctrl-b)...


Posted by kkwaiser at 12:39 PM | Comments (0)

Load R packages

Open the R command line.

$ sudo R
> install.packages('packageName')

To load (have the functions available) use:

> library('packageName')

Posted by kkwaiser at 12:26 PM | Comments (0)

Install R Statistical Package

Documentation of a process.

Step-by-step instructions from a kindred soul.

Official instructions from the R Project.

More official instructions.

1. Open a terminal
2. Add a GPG key to authenticate downloads from R.
- What is a GPG key?

$ gpg --keyserver keyserver.ubuntu.com --recv-key E084DAB9
$ gpg -a --export E084DAB9 | sudo apt-key add -

- The GPG key may change occasionally. The most up to date key can be found in the SECURE APT section of this page.

3. Add a mirror site from which to download R-related code to sources.list

$ sudo gedit /etc/apt/sources.list

Add this line to the bottom:
deb http://lib.stat.cmu.edu/R/CRAN/bin/linux/ubuntu lucid/

Note: A list of mirrors is available.
Note: See the INSTALLATION first section of this page for more on setting up a mirror.

4. Install the R code
$ sudo apt-get install r-base

Note: This tutorial mentions additional commands and build-essential at steps 7 & 8 but the above command appears to install that code.

5. Start and Update R

$ sudo R
> update.packages()

Potentially helpful comment:

I also like 'sudo apt-get install littler'; I then copy or link install.r to /usr/local/bin and just use

$ sudo install.r foo far fie foo

which would then install the (hypothetical) packages foo, far, fie and foo from CRAN.

6. Install and Open the R Commander GUI

$ R
> library(Rcmdr)   # loads the required packages
> Commander()

7. File Location Notes

/usr/share/R for architecture-independent files.

Downloaded Packages go here:
-- 'base R' and recommended packages are in /usr/lib/R/library/
-- Debian-packaged R packages are in /usr/lib/R/site-library/
-- packages installed by you / R are in /usr/local/lib/R/site-library/

Posted by kkwaiser at 10:01 AM | Comments (0)

March 06, 2011

Cloud based GIS server

A possible answer to our need for a GIS-based data management solution.

Original ESRI press release.

More on ESRI's relationship with Amazon.

Suggested read from SpatiallyAdjusted.

PostGIS on a Windows 2008 Server installed on the cloud.
What about running Postgres on Amazon's EC2?

Email exchange on Amazon EC2 and geoservers.

Posted by kkwaiser at 01:29 PM | Comments (0)

March 04, 2011

A good best-practices resource for data management

Best Practices for Preparing Environmental Data Sets to Share and Archive

Posted by kkwaiser at 03:59 PM | Comments (0)

Tiff to PDF in Ubuntu

This page tipped me off.

I've already installed pdftk, so I just needed to install the TIFF tool. I used Synaptic for this:

$ sudo synaptic

And installed this library: libtiff-tools

I then ran this command:

$ tiff2pdf -o ~/Desktop/pdf_output.pdf ~/Desktop/tif_input.tif

Posted by kkwaiser at 03:40 PM | Comments (0)

Stock import View for data tables

I am using the Table Wizard module to expose data tables to Drupal. I use a stock view to create the View that includes attached feeds. Below is the code for that View; the only change that should be needed is replacing 'tablename' with the name of the database table containing the data.

$view = new view;
$view->name = 'dt_tablename';
$view->description = 'dt_tablename';
$view->tag = 'tw';
$view->view_php = '';
$view->base_table = 'dt_tablename';
$view->is_cacheable = FALSE;
$view->api_version = 2;
$view->disabled = FALSE; /* Edit this to true to make a default view disabled initially */
$handler = $view->new_display('default', 'dt_tablename', 'default');
$handler->override_option('fields', array(
  'did' => array(
    'id' => 'did',
    'table' => 'dt_tablename',
    'field' => 'did',
    'label' => 'did',
    'exclude' => 0,
    'relationship' => 'none',
  ),
));
$handler->override_option('access', array(
  'type' => 'none',
));
$handler->override_option('cache', array(
  'type' => 'none',
));
$handler->override_option('title', 'Contents of dt_tablename');
$handler->override_option('header', 'This is a view of a raw database table. It may be sorted in various ways by clicking the column headers.

If you identify a particular field that does not need to be used in views of this table, go to the analysis page and check the Ignore box for that field. It will then no longer appear here.');
$handler->override_option('header_format', '1');
$handler->override_option('header_empty', 0);
$handler->override_option('empty_format', '1');
$handler->override_option('items_per_page', 3);
$handler->override_option('use_pager', '0');
$handler->override_option('style_plugin', 'table');
$handler->override_option('style_options', array(
  'grouping' => '',
  'override' => 1,
  'sticky' => 0,
  'order' => 'asc',
  'columns' => array(
    'did' => 'did',
  ),
  'info' => array(
    'did' => array(
      'sortable' => 0,
      'separator' => '',
    ),
  ),
  'default' => 'did',
));
$handler = $view->new_display('page', 'tablename Page', 'page_1');
$handler->override_option('header', '');
$handler->override_option('path', 'admin/content/tw/view/dt_tablename');
$handler->override_option('menu', array(
  'type' => 'none',
  'title' => '',
  'description' => '',
  'weight' => 0,
  'name' => 'navigation',
));
$handler->override_option('tab_options', array(
  'type' => 'none',
  'title' => '',
  'description' => '',
  'weight' => 0,
  'name' => 'navigation',
));
$handler = $view->new_display('panel_pane', 'tablename Pane', 'panel_pane_1');
$handler->override_option('title', 'tablename Pane');
$handler->override_option('header', '');
$handler->override_option('pane_title', '');
$handler->override_option('pane_description', '');
$handler->override_option('pane_category', array(
  'name' => 'View panes',
  'weight' => 0,
));
$handler->override_option('allow', array(
  'use_pager' => FALSE,
  'items_per_page' => FALSE,
  'offset' => FALSE,
  'link_to_view' => FALSE,
  'more_link' => FALSE,
  'path_override' => FALSE,
  'title_override' => FALSE,
  'exposed_form' => FALSE,
  'fields_override' => FALSE,
));
$handler->override_option('argument_input', array());
$handler->override_option('link_to_view', 0);
$handler->override_option('inherit_panels_path', 0);
$handler = $view->new_display('feed', 'CSV Feed', 'feed_1');
$handler->override_option('items_per_page', 0);
$handler->override_option('style_plugin', 'views_csv');
$handler->override_option('style_options', array(
  'mission_description' => FALSE,
  'description' => '',
  'attach_text' => 'CSV',
  'provide_file' => 1,
  'filename' => '%view.csv',
  'parent_sort' => 0,
  'seperator' => ',',
  'quote' => 1,
  'trim' => 0,
  'header' => 1,
));
$handler->override_option('row_plugin', '');
$handler->override_option('path', 'tablename.csv');
$handler->override_option('menu', array(
  'type' => 'none',
  'title' => '',
  'description' => '',
  'weight' => 0,
  'name' => 'navigation',
));
$handler->override_option('tab_options', array(
  'type' => 'none',
  'title' => '',
  'description' => '',
  'weight' => 0,
  'name' => 'navigation',
));
$handler->override_option('displays', array(
  'page_1' => 'page_1',
  'panel_pane_1' => 'panel_pane_1',
  'default' => 0,
));
$handler->override_option('sitename_title', FALSE);
$handler = $view->new_display('feed', 'XLS Feed', 'feed_2');
$handler->override_option('items_per_page', 0);
$handler->override_option('style_plugin', 'views_xls');
$handler->override_option('style_options', array(
  'mission_description' => FALSE,
  'description' => '',
  'attach_text' => 'XLS',
  'provide_file' => 1,
  'filename' => '%view.xls',
  'parent_sort' => 0,
));
$handler->override_option('row_plugin', '');
$handler->override_option('path', 'tablename.xls');
$handler->override_option('menu', array(
  'type' => 'none',
  'title' => '',
  'description' => '',
  'weight' => 0,
  'name' => 'navigation',
));
$handler->override_option('tab_options', array(
  'type' => 'none',
  'title' => '',
  'description' => '',
  'weight' => 0,
  'name' => 'navigation',
));
$handler->override_option('displays', array(
  'page_1' => 'page_1',
  'panel_pane_1' => 'panel_pane_1',
  'default' => 0,
));
$handler->override_option('sitename_title', FALSE);

Posted by kkwaiser at 11:21 AM | Comments (0)

March 03, 2011

How to compress pdfs in Ubuntu

Option 1:

Install pdftk
$ sudo apt-get install pdftk

Compression command:
$ pdftk input.pdf output final.pdf compress

- Noticed no difference in file size.

Option 2:

Take the file to .ps and back to .pdf:

$ pdf2ps input.pdf output.ps
$ ps2pdf output.ps output.pdf

- This approach threw an error because my input.pdf was malformed.

Option 3:

Reprint the pdf with Okular

$ okular input.pdf

Print > Print to File (pdf)
Options > PDF Options > Force rasterization

This reduced the file from 18 MB to 6.1 MB where the previous options failed. I retried Options 1 and 2 after this step and observed no change in size.

Option 4:

I also had success with the following, which is a GUI version of Option 2:

Open large_input.pdf in Okular (or Evince) > Print as large_output.ps

Open large_output.ps in Okular (or Evince) > Print as small_output.pdf

Posted by kkwaiser at 02:31 PM | Comments (0)

Rewrite URL to remove duplicate content

I added these lines to .htaccess to make sure search engines were not indexing duplicate content.

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{HTTP_HOST} ^www\.umbs\.lsa\.umich\.edu$ [NC]
RewriteRule ^(.*)$ http://umbs.lsa.umich.edu/$1 [R=301,L]
</IfModule>
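In plain terms: requests for the www host get a 301 redirect to the bare host with the path preserved, so search engines see only one canonical URL. A quick sketch of that logic (the function name is mine, for illustration):

```python
import re

# Miniature model of the RewriteCond/RewriteRule pair above.
def redirect_for(host, path):
    """Return (status, target) for www requests, None for the bare host."""
    if re.fullmatch(r"www\.umbs\.lsa\.umich\.edu", host, re.IGNORECASE):
        return 301, "http://umbs.lsa.umich.edu" + path
    return None

print(redirect_for("www.umbs.lsa.umich.edu", "/research"))
# → (301, 'http://umbs.lsa.umich.edu/research')
print(redirect_for("umbs.lsa.umich.edu", "/research"))
# → None
```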

Posted by kkwaiser at 12:31 PM | Comments (0)

Drupal 6 SEO with PathAuto

I'm on an SEO kick and have decided to install the PathAuto Module. Here are my steps. I found this tutorial - despite the comments - to be useful.

1) Download and install PathAuto 6-1.x
- it looks like directions for 6-2.x would differ. Specifically, steps 5 and 6 appear unnecessary.

2) Enable Pathauto (admin/build/pathauto)

3) Configure Pathauto (admin/build/path/pathauto)

Strings to Remove: a
- I want the URLs to be friendly to human readers

Update Action: Do nothing. Leave the old alias intact.
- Building on an existing site, don't want to muck things up.

Node Paths
Default path pattern: [type]/[title-raw].htm
Pattern for all Biblio paths: bibliography/[title-raw].htm
Pattern for all Data Set paths: data/[title-raw].htm
Pattern for all Housing Application paths: housing/[title-raw].htm
Pattern for all Person paths: people/[title-raw].htm
Pattern for all REU Application paths: reu/[title-raw].htm
Pattern for all Research Project paths: projects/[title-raw].htm

Taxonomy term paths
Default path pattern: [vocab-raw]/[catpath-raw].htm

5) Automate the PathAuto bulk update process
- Follow this tutorial and specifically this comment.

- Create a file called pathauto.php and place it in the root Drupal directory. The only change I suggest is to reduce the number listed in this line: "variable_set('pathauto_max_bulk_update', 100);"

6) Navigate to http://yoursite.com/pathauto.php
- This script will run as long as your web browser remains at this URL.

7) Install and enable Global Redirect
- No further configuration needed
- This will forward all /node/1234 URLs to the URL alias, so search engines index one URL instead of two.

8) Rebuild sitemap.xml

- It is probably best to set up your aliases before setting up your sitemap, but if you tend to do things in the wrong order (me, me!) you can go to your sitemap settings and rebuild it.

Posted by kkwaiser at 09:53 AM | Comments (0)

March 01, 2011

Setup XML Sitemap for Drupal 6

I've noticed the Research Gateway does not appear as prominently as I think it should in some search results. Here are the steps to create a sitemap for submission to Google and Bing, which I think should remedy this.

1. Download XML Sitemap
- enable XML sitemap, XML sitemap engines, XML sitemap modal UI, XML sitemap node, XML sitemap taxonomy

2. Visit the /admin page

- I hit a WSOD on ITS servers. Remedy is here.
- Note: Server errors logs for the Research Gateway are available here.

3. Visit the Edit section of content types and vocabularies to Include that content in the XML SiteMap.

Content types Enabled: Biblio, DataSet, Institution, Research Project, Variable, Person, Page

Vocabularies Enabled: UMBS Keywords, USGS Maps, Dataset Theme, Michigan Counties and Townships, Institution Categories, GNIS Classes

Note: a brief test showed that unpublished nodes are omitted from the sitemap.
Note: a second brief test indicates that only nodes viewable to Anonymous users are included in the sitemap.
Further explanation: I use the Content Access module and set a content type to be included in the sitemap but not viewable to Anonymous. I then created two nodes of this content type and - at the node level - made one of them viewable to Anonymous. After running cron.php, that node showed up in the sitemap.

4. Configure XML SiteMap (admin/configuration/xmlsitemap)
- Added Bing and Google as search engines to submit sitemap to.

5. Verify ownership for Bing and Google.
- Create Webmaster Tools account at each site
- Place an xml file (Bing) and an html file (Google) in the site's root directory.

6. The sitemap is expanded every time cron.php is run.

Posted by kkwaiser at 03:51 PM | Comments (0)