RE: Character set problems with RIS export
Author: Ulrich Bösing
Posted: Tue, 13 Feb 2007 19:29:52 -0500
Hi there,
Though I can't make out on my mailer what your characters actually are, it looks a lot like the problem I face here with German characters: I noticed RefMan exports RIS in ASCII (apparently a DOS code page), not ANSI.
Therefore, I routinely load RefMan exports into M$-Word (make sure you have ticked something like "show character conversion" in the Windows General Options setting!), import as "M$-DOS Text", and save as "Text only". That way, I get all my umlauts etc. back :-). (Of course, unticking "ANSI characters" on the RIS filter on re-import into RefMan is also an option. BTW: does anybody know of a way to manipulate the _export_ filters?)
This works only on RIS export (which I have to use, as I need to maintain the italics commands for species names [for subsequent bulk replacement with the respective HTML codes], and it appears only this export format offers that). In tab-delimited export there is an additional quirk: everything is in ANSI, though keywords _in_toto_ get reverted to ASCII. What the §$%&$.
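The Word round trip described above amounts to reading the export in a DOS code page and writing it back out as Windows ANSI. A minimal Python sketch of that step follows. It assumes the export is in code page 850 (Western European DOS; some systems use 437 instead) and that "ANSI" means Windows-1252. The function names and file names are illustrative and not part of RefMan.

```python
from pathlib import Path


def oem_to_ansi(data: bytes, oem: str = "cp850") -> bytes:
    """Re-encode bytes from a DOS/OEM code page to Windows-1252 ("ANSI").

    Assumption: the RIS export was written in the given OEM code page.
    """
    text = data.decode(oem, errors="replace")
    # Characters with no Windows-1252 equivalent (e.g. box-drawing
    # symbols) are replaced by "?" instead of raising an error.
    return text.encode("cp1252", errors="replace")


def convert_file(src: str, dst: str, oem: str = "cp850") -> None:
    """Convert a whole export file; src and dst are hypothetical paths."""
    Path(dst).write_bytes(oem_to_ansi(Path(src).read_bytes(), oem))


if __name__ == "__main__":
    # 0x81 is "ü" in code page 850; after conversion it reads correctly
    # when the bytes are interpreted as Windows-1252.
    print(oem_to_ansi(b"B\x81cher").decode("cp1252"))
```

A call such as `convert_file("export.ris", "export-ansi.ris")` would then stand in for the load-and-save step in Word. Formatting commands in the RIS text pass through unchanged as long as they consist of plain ASCII characters.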
Of course, all this also depends, as you mention, on the fonts you actually use. When using the above-mentioned Word method, it helps to open the exported text as "coded text", select "other coding", and see how a particular string is displayed in the attached preview window as you switch between the various settings (e.g. your text looks best in UTF-8)...
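The same trial-and-error preview can be scripted: decode one byte string under several candidate encodings and compare the results by eye. This is only a sketch. The sample bytes are an assumption about what the RIS file holds for the N2 field, and the list of candidate encodings is a guess at the likely suspects.

```python
def preview(data: bytes,
            encodings=("cp850", "cp437", "cp1252", "utf-8")) -> dict:
    """Return the same bytes decoded under each candidate encoding.

    Undecodable bytes show up as U+FFFD so that no candidate raises.
    """
    return {enc: data.decode(enc, errors="replace") for enc in encodings}


if __name__ == "__main__":
    # Hypothetical raw bytes for part of the N2 field in the RIS export.
    sample = b"M\xa1rz\xa0 Hab\xa1bu'll\xa0h"
    for enc, text in preview(sample).items():
        print(f"{enc:>7}: {text}")
```

The candidate whose output shows the expected accented letters is the likely source encoding of the file.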
In light of this mess, for all my reference work at home I use Unicode-compliant EndNote X.
They'll just have to add this feature to the next RefMan release -- we'll see.
Ulrich
u.boesing, at bba.de
-----Original Message-----
From: "listmaster" On Behalf Of "pacificw"
Sent: Thursday, February 08, 2007 1:33 AM
To: "RIS-LIST"
Subject: <RefMan> Character set problems with RIS export
We are having difficulty with the RIS export functionality dropping non-standard alphabetic characters on export.
For example, when we export comma-delimited, it maintains the accented characters:
"UNPB","4218","In the Land of Refuge: The Genesis of the Bahá'í Faith in Shíráz","Mirza Afnan H;Rabbani A;","2008 Forthcoming","","of;","NOT IN FILE","","","","","","","","","","","Hong Kong","Juxta Press","","","","","","","","By Mírzá Habíbu'lláh AfnánTranslated by Ahang
As it does when we export XML:
<record>
<database name="babism"
path="c:">babism</database>
<source-app name="Reference Manager 11.0" version="11.0.1.1709">Reference Manager 11.0</source-app>
<rec-number>4218</rec-number>
<ref-type name="Unpublished Work">34</ref-type>
<contributors>
<authors>
<author>
<style face="normal" font="default">Mirza Afnan,Habibu'llah</style>
</author>
<author>
<style face="normal" font="default"> Rabbani,Ahang</style>
</author>
</authors>
</contributors>
<titles>
<title>
<style face="normal" font="default">In the Land of Refuge: The Genesis of the Bahá'í Faith in Shíráz</style>
</title>
</titles>
<periodical/>
<pages end="" start="">-</pages>
<reprint-status status="no-file"/>
<keywords>
<keyword>
<style face="normal" font="default">of</style>
</keyword>
</keywords>
<dates>
<year Day="0" Month="0" Year="2008">2008=Forthcoming</year>
</dates>
<pub-location>
<style face="normal" font="default">Hong Kong</style>
</pub-location>
<publisher>
<style face="normal" font="default">Juxta Press</style>
</publisher>
<abstract>
<style face="bold" font="default">By Mírzá Habíbu'lláh Afnán</style>
<style face="normal" font="default"/>
<style face="bold" font="default">Translated by Ahang Rabbani</style>
</abstract>
<urls/>
</record>
However, the RIS export fails to generate these accented characters:
TY - UNPB
ID - 4218
T1 - In the Land of Refuge: The Genesis of the Bah '¡ Faith in Sh¡r z
A1 - Mirza Afnan,Habibu'llah
A1 - Rabbani,Ahang
Y1 - 2008///Forthcoming
KW - of
RP - NOT IN FILE
CY - Hong Kong
PB - Juxta Press
N2 - By M¡rz Hab¡bu'll h Afn n Translated by Ahang Rabbani
ER -
In the case of the N2 field, it looks like it has “converted” the í but dropped/replaced the á.
According to the manual, the RIS specification permits ANSI characters 32-255:
“The characters allowed in the reference ID fields can be in the set
“0” through “9,” or “A” through “Z.” The characters allowed in
all other fields can be in the set from “space” (character 32) to
character 255 in the Windows ANSI Character Set. Note,
however, that the asterisk (character 42) is not allowed in the
author, keywords, or periodical name fields.”
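The rules quoted from the manual can be turned into a small check. The sketch below takes the quotation literally and treats "Windows ANSI" as Windows-1252. The tag groupings for author, keyword, and periodical fields are assumptions based on common RIS tags, not something stated in the passage, and `check_field` is an illustrative name.

```python
import string

# Assumed tag groupings; the quoted passage names the field kinds only.
AUTHOR_TAGS = {"A1", "A2", "A3", "AU", "ED"}
KEYWORD_TAGS = {"KW"}
PERIODICAL_TAGS = {"JF", "JO", "JA", "J1", "J2"}
NO_ASTERISK_TAGS = AUTHOR_TAGS | KEYWORD_TAGS | PERIODICAL_TAGS

# Reference IDs: "0" through "9" or "A" through "Z", as quoted.
ID_CHARS = set(string.digits + string.ascii_uppercase)


def check_field(tag: str, value: str) -> list:
    """Return a list of rule violations for one RIS field (empty if none)."""
    problems = []
    if tag == "ID":
        if not all(ch in ID_CHARS for ch in value):
            problems.append("ID may only contain 0-9 and A-Z")
        return problems
    try:
        raw = value.encode("cp1252")
    except UnicodeEncodeError:
        problems.append("contains characters outside Windows-1252")
    else:
        if any(byte < 32 for byte in raw):
            problems.append("contains characters below 32")
    if tag in NO_ASTERISK_TAGS and "*" in value:
        problems.append("asterisk not allowed in this field")
    return problems


if __name__ == "__main__":
    print(check_field("T1", "The Genesis of the Bahá'í Faith in Shíráz"))
    print(check_field("KW", "of*"))
```

By these rules the accented title is valid RIS content, so the loss of characters happens in the export step and not because the record breaks the specification.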
So we attempted to add a set of symbols to the URL field of the above record
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~€‚ƒ„…†‡ˆ‰Š‹ŒŽ¬®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ
And this is the RIS export
TY - UNPB
ID - 4218
T1 - In the Land of Refuge: The Genesis of the Bah '¡ Faith in Sh¡r z
A1 - Mirza Afnan,Habibu'llah
A1 - Rabbani,Ahang
Y1 - 2008///Forthcoming
KW - of
RP - NOT IN FILE
CY - Hong Kong
PB - Juxta Press
N2 - By M¡rz Hab¡bu'll h Afn n Translated by Ahang Rabbani
UR - !"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~__'Ÿ".Ã…ÃŽ^%S<O_Z_ª©îøñýüïæôú÷û§¯¬«ó¨·µ¶ÇŽÂ’€ÔÂÒÓÞÖ×ØÑ¥ãà â噞ÂëéêšÃèá… ƒÆ„†‘‡Š‚ˆ†°Â¡Œ‹Ã¤•¢“äâ€Ã¶â€ºâ€”£–Âìç˜
ER -
So it looks like the RIS export is mangling the character set. Even if we enter ASCII character-set symbols, the effect is the same.
Is there a workaround? A specific font to use?
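One way to test whether the mangling is a code-page conversion and not random corruption is to rebuild the symbol test in a script and see which ANSI characters have no counterpart in a DOS code page. The sketch below assumes the export converts Windows-1252 to code page 850. That is a guess suggested by the á and í in the title coming out as the bytes those letters have in DOS code pages. It is not documented RefMan behaviour, and the function names are illustrative.

```python
def ansi_test_string() -> str:
    """Characters 32-255 of Windows-1252, skipping undefined code points."""
    return bytes(range(32, 256)).decode("cp1252", errors="ignore")


def lost_in_oem(text: str, oem: str = "cp850") -> str:
    """Return the characters of text that cannot be encoded in the OEM page."""
    lost = []
    for ch in text:
        try:
            ch.encode(oem)
        except UnicodeEncodeError:
            lost.append(ch)
    return "".join(lost)


if __name__ == "__main__":
    symbols = ansi_test_string()
    print("no cp850 equivalent:", lost_in_oem(symbols))
    print("no cp437 equivalent:", lost_in_oem(symbols, "cp437"))
```

Characters that survive such a conversion are moved to different byte values, which is why they look wrong when the file is read as ANSI. Characters in the "lost" list cannot be recovered from the export at all.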
Cheers,
Steve Cooney
"pacificw"