
January 20, 2010

Reveal Hidden Characters in Ubuntu

Kind of an interesting aside here. I was trying to import a test bibliography into Drupal's biblio module and it was not registering the publication type of the first entry. The problem, it turns out, was a hidden character at the very front of the input file.

Here's a snippet of the entry:

%0 Journal Article

%A Barnese, L. E.

%A Lowe, R. L.

%A Hunter, R. D.

%D 1990

%T Comparative grazing efficiency of pulmonate and prosobranch snails

%J Journal of the North American Benthological Society

%V 9

%N 1

%P 35-44

It turns out Windows programs such as EndNote, which I used to create the original input file, are known to prepend a hidden byte sequence called a UTF-8 BOM (byte order mark). Who knew?

The string of characters I was seeing, written as octal byte escapes (hexadecimal EF BB BF), was:
\357\273\277

I used the following command to reveal the hidden characters:

$ od -c /home/data/Documents/Bibliography/umbs_format_test_1-20-10.txt

Here are the first two lines of the output, with the BOM showing up as the first three bytes:

0000000 357 273 277   %   0       J   o   u   r   n   a   l       A   r
0000020   t   i   c   l   e  \r  \n   %   A       B   a   r   n   e   s
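Once the BOM is visible, checking for it and removing it can be scripted. What follows is only a minimal sketch, assuming the GNU coreutils that ship with Ubuntu; the function names has_bom and strip_bom are made up for illustration and are not part of any tool mentioned in this post.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for this sketch).

# has_bom FILE: exit 0 if FILE starts with the UTF-8 BOM (octal 357 273 277).
has_bom() {
    # Dump only the first three bytes as octal and squeeze out the whitespace.
    [ "$(head -c 3 "$1" | od -An -to1 | tr -d ' \n')" = "357273277" ]
}

# strip_bom SRC DEST: copy SRC to DEST, dropping a leading BOM if present.
strip_bom() {
    if has_bom "$1"; then
        # tail -c +4 starts output at byte 4, skipping the 3-byte BOM.
        tail -c +4 "$1" > "$2"
    else
        cat "$1" > "$2"
    fi
}
```

With those defined, something like strip_bom umbs_format_test_1-20-10.txt cleaned.txt (cleaned.txt being an arbitrary output name) should leave a copy that begins directly with %0. It writes to a separate file rather than editing in place so the original export stays untouched. The \r\n line endings visible in the od output are left alone by this sketch.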

Posted by kkwaiser at January 20, 2010 01:57 PM
