February 26, 2013
Word of the day: Codeframe
A coding frame, code frame, or codebook shows how verbal or visual data have been converted into numeric data for purposes of analysis. It provides the link between the verbal data and the numeric data and explains what the numeric data mean.
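As a rough illustration of that link, here is a minimal sketch in Python. The response categories, the numeric codes, the missing-value code 99, and the function names are all made up for the example; a real codebook would define its own.

```python
# Hypothetical code frame: each verbal response category maps to a numeric code.
CODEFRAME = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}
MISSING = 99  # assumed code for responses the frame does not cover


def encode(responses, codeframe=CODEFRAME, missing=MISSING):
    """Convert verbal responses to numeric codes using the code frame."""
    return [codeframe.get(r.strip().lower(), missing) for r in responses]


def decode(codes, codeframe=CODEFRAME):
    """Explain what each numeric code means (the reverse lookup)."""
    labels = {code: label for label, code in codeframe.items()}
    return [labels.get(c, "not in code frame") for c in codes]


answers = ["Agree", "neutral ", "no idea"]
codes = encode(answers)
print(codes)
print(decode(codes))
```

The same dictionary serves both directions: it converts the verbal data for analysis and documents what the numbers mean afterwards.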
February 21, 2013
Set SAS to display variable names not labels
I don't understand this default setting but it can be changed.
- Click in or on the Explorer pane to highlight the Explorer window.
- Select Tools->Options->Explorer in the menus.
- Select the Members tab.
- Double click on the TABLE icon.
- Double click on the &Open action.
- Set the Action command to: VIEWTABLE %8b.'%s'.DATA COLHEADING=NAMES
- Click on the Set Default button.
- Save changes and close the Explorer Options window.
February 18, 2013
PDF to Word conversion particularities
I recently needed to convert some fairly complex PDFs (i.e. with tables and images) to Word documents. The conversion went OK, but a large number of the images held within tables were not visible, and the ones that were visible were not selectable.
After some research, I found that the unselectable images (vector images, perhaps) could be selected by going to Home > Select > Select Objects and then clicking on the picture.
From there it was simply a matter of right-click > Format Object > Layout > In Front of Text.
To avoid having to select the objects one at a time on each page (you can't select across pages), you can use the Select Multiple Objects command, which opens a dialog box that makes it easy to select them. In Word 2007, you can add this command to the Quick Access Toolbar.
February 11, 2013
Anonymizing medical data
Just a few thoughts copped from a recent email exchange on the Research Dataman list:
It's one thing to be aware of the risks - it's another to decide how to
manage them. Refusing to disclose *any* data except under very carefully
controlled circumstances is one approach, and it's probably valid for data
where the reuse potential is likely to be limited to a few instances at most.
For data with greater reuse potential techniques adopted for some government
datasets may be appropriate. These include perturbation of some of the numbers
or suppression of some numbers in cases that might lead to disclosure even in
aggregated data. Both need expert statistical advice to ensure that the
resultant data can still be used to do something useful but isn't disclosive.
Examples of perturbation include varying a subject's age by a few years in
either direction. An example of suppression I am aware of comes from the Schools
Census - in any school where the number of pupils receiving free school meals
is below 5, the exact total is redacted from the published data.
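To make the two techniques concrete, here is a minimal sketch in Python. The function names, the three-year shift, and the sample school counts are assumptions for illustration only; as the email says, real parameters need expert statistical advice.

```python
import random


def perturb_age(age, max_shift=3, rng=random):
    """Perturbation: shift an age by a random whole number of years,
    up to max_shift in either direction (never below zero)."""
    return max(0, age + rng.randint(-max_shift, max_shift))


def suppress_small_counts(counts, threshold=5, marker=None):
    """Suppression: replace any count below the threshold with a marker
    so that the exact total is not published."""
    return {
        name: (count if count >= threshold else marker)
        for name, count in counts.items()
    }


# Made-up example data.
rng = random.Random(2013)  # seeded only so the sketch is repeatable
ages = [34, 61, 47]
print([perturb_age(a, rng=rng) for a in ages])

free_meals = {"School A": 12, "School B": 3, "School C": 5}
print(suppress_small_counts(free_meals))
```

Here the threshold follows the Schools Census example: counts below 5 are redacted, while a count of exactly 5 is kept.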
Ultimately, the only way to prevent identification of individuals by combining datasets (i.e. datasets that include data items sensitive enough to permit identification, but no actual confidential or directly identifying data) is through the Data Sharing or Re-Use Agreements between data controllers and data processors.
Websites that were mentioned:
Anonymisation of data from UK Data Archive