
May 18, 2012

Install R packages

> install.packages("/home/data/Downloads/foreach_1.4.0.tar.gz", repos=NULL, type="source")

what a PITA.

> .libPaths()
- this one shows the library paths R knows about; packages get installed into the first one by default

> install.packages("foreach", dependencies=TRUE)
- this one installed into a temp directory = FAIL
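
A minimal sketch of one way to take the guesswork out of where a package lands: create a library directory, put it first in .libPaths(), and pass it to install.packages() through its lib argument. The helper name use_library and the paths in the commented usage are hypothetical, not something from my setup above.

```r
# Hypothetical helper: make `lib_dir` the first (default) library path.
use_library <- function(lib_dir) {
  # Create the directory if it does not exist yet.
  dir.create(lib_dir, recursive = TRUE, showWarnings = FALSE)
  # Prepend it; install.packages() installs into .libPaths()[1] by default.
  .libPaths(c(lib_dir, .libPaths()))
  # Return the path R actually registered (normalized).
  invisible(.libPaths()[1])
}

# Usage (assumed path; the install needs network access, so it is commented out):
# lib_dir <- use_library("~/R/library")
# install.packages("foreach", lib = lib_dir, dependencies = TRUE)
# library(foreach, lib.loc = lib_dir)
```

Passing lib explicitly is redundant once the directory is first in .libPaths(), but it makes the target obvious when reading the command later.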

Posted by kkwaiser at 03:43 PM | Comments (0)

May 15, 2012

British data software


Other new features in v3.0 include:

- Ability to share plans, and to edit them jointly with colleagues
- Simultaneous viewing of multiple custom guidance notes
- More flexible project stages (phases) for templates
- User maintainable profile/login details
- XLSX output

Over the coming months we will be rolling out a number of additional features, and further announcements will flag their release:

- A facility for boilerplate text to be included within templates
- Display of funder constraints on output (e.g. number of pages, word count, etc.)
- Increased institutional customisability, including a new ‘administrator’ user type
- Support for non-English-language versions of the tool


DataStage is a secure personalized 'local' file management environment for use at the research group level, appearing as a mapped drive on the end-user's computer.

It can be deployed on a local server, or on an institutional or commercial cloud. Once the software has been installed on the server, there is no additional software for the end-user to install.


DataBank is a scalable data repository designed for institutional deployment.

DataBank will provide a definitive, sustainable, referenceable location for (potentially large) research datasets and allow researchers to store, reference, manage and discover datasets.

Posted by kkwaiser at 08:41 AM | Comments (0)

May 10, 2012

EPA Data Mandate?

A colleague forwarded this EPA document, which reviews all federal data management policies in preparation for an overarching EPA scientific data management (SDM) policy. FYI, ORD (Office of Research and Development) is an office within the EPA.

This review demonstrates that, in general, federal agencies have not yet developed comprehensive policies and approaches for managing the burgeoning amount of scientific data that they create. Nevertheless, this compilation of resources provides a solid base of information for beginning to develop a set of ORD SDM policies and guidance.

The introduction to this report laid out a general, long-term approach for two broad goals: (1) developing an SDM policy framework and (2) developing policies, guidance, and tools that fit within this framework.

Posted by kkwaiser at 10:41 AM | Comments (0)

May 09, 2012

Further reading?

American Journal of Economics and Business Administration 3 (1): 112-119, 2011
ISSN 1945-5488
© 2010 Science Publications

Using Metadata Analysis and Base Analysis Techniques
in Data Qualities Framework for Data Warehouses

Azwa Abdul Aziz, Md Yazid Mohd Saman and Mohd Pouzi Hamzah
Faculty of Informatics, Department of Computer Science,
University Sultan Zainal Abidin (UniSZA),
21030, Gong Badak, Terengganu Malaysia

the framework will use Metadata Analysis to gain the target qualities value and Base Analysis Techniques to view actual values in data sources. A gap analysis technique will provide the strategies to reduce the gap between the target and actual values. This study also proposes a DQ matrix strategy in DW design.

Posted by kkwaiser at 04:15 PM | Comments (0)