HOWTO: use awk to automatically generate labels, etc.

Author: Andy Jacobson    Posted: Tue, 27 Jan 2004 21:36:42 -0500
Howdy,

I attach a sample awk script for automatically generating
labels by operating directly on tab-delimited EndNote output
files. It could be generalized to do arbitrary manipulation
of field contents. Awk is a standard text processing tool
provided on unixy systems like Mac OS X, and is available for
Windows as well (please do not ask me questions about Windows).
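
As a minimal sketch of the idea (separate from the attached script), the
snippet below shows awk reading a tab-delimited file whose records end in
carriage returns and printing two of its fields. The file name sample.txt,
its three-column layout, and the helper name show_fields are made up for
illustration; a real EndNote export has many more fields.

```shell
#!/bin/sh
# Hypothetical sample data: record number, author, year. Fields are
# separated by tabs and each record ends in a carriage return.
printf '1\tJohn Smith\t1995\r2\tAnn Lee\t2004\r' > sample.txt

# show_fields is an illustrative helper, not an awk or EndNote built-in.
show_fields() {
    awk 'BEGIN { FS = "\t"; RS = "\r" }
         { print $2 " (" $3 ")" }' "$1"
}

show_fields sample.txt
```

Setting FS and RS in a BEGIN block is all that is needed to make awk treat
each carriage-return-terminated record as one line of tab-separated fields.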

This method for processing EndNote data builds on work done by
Robert W. Gear, who describes how to get your data into MS
Access in his writeup, available at
http://www.gordonmckenzie.co.uk/academic/endnote/. If you have
Access and feel comfortable with it, that may well be the way
to go. I don't have Access, but I was able to get my data
into Excel using his technique. Unfortunately Excel's text
processing functions are too primitive for my needs.

No warranty implied, use at your own risk, your mileage may
vary, etc. The comments in the script contain much more information.

Best Regards,

Andy

# Written 27 Jan 2004 by Andy Jacobson
# Time-stamp: </Users/arj/bib/EndNote/add_labels.awk: 27 Jan 2004 (Tue) 21:05:49 EST>
#
# This script assigns labels to EndNote records. It is very specific
# to my needs and is unlikely to be useful to others without
# modifications. I simply want to show that awk can be used to do
# arbitrary manipulation of EndNote libraries, and share the knowledge
# I gained in the process of doing so.
#
# Run this script by doing "awk -f add_labels.awk filename.txt". This
# assumes that the script is called "add_labels.awk" and that the
# tab-delimited output from EndNote is called "filename.txt".
#
# The labels generated by this script are intended to be unique. I
# use them to name the electronic copies of manuscripts that I keep, and
# to specify temporary citations in documents I write. The labels are
# of the form lastnameYYx, where:
#
# "lastname" is the lowercased last name of the first author
#
# "YY" is the non-y2k compliant two-digit year abbreviation: e.g. 04
# for 1904 or 2004.
#
# "x" is a trailing letter (a, b, c, d, ...) that ensures
# uniqueness. Letters are assigned in the order in which I came
# across the papers, not chronologically. So smith95a is the first
# paper by some Smith in 1995 that I have come across. smith95b is
# the second, regardless of whether it is the same Smith or whether
# it was published before smith95a.
#
# The script operates on tab-delimited output from EndNote, and
# creates tab-delimited input for EndNote. Note that these are
# slightly different formats! It is vital that carriage returns be
# stripped from the EndNote library before exporting to tab-delimited
# output. This is easily accomplished by doing a first output-input
# cycle using the "EndNote export" style. For explicit instructions
# on this, see Robert W. Gear's guide at
# http://www.gordonmckenzie.co.uk/academic/endnote/
#
# I used EndNote 7.0 on OS X 10.3.2 when writing this script. The
# default tab-delimited format operating on my library has 38 fields
# in the output, the last of which is the label field and the first
# of which is the record number. EndNote apparently does not want the
# record number in the processed, ready-to-be-imported file. It also
# requires two header lines, the details of which may vary from
# platform to platform, and may even depend on what data are in the
# library being exported. Certainly if the tab-delimited output
# format has been changed (via the edit output style interface), this
# script will have to be changed to match the output.
#

# Awk scripts give instructions to operate on each record (line) of a
# text file. Looping through the lines of the input file is implicit.
# In addition to the instructions for this central processing loop,
# awk allows one to specify BEGIN and END blocks which do extra
# processing before the loop and after the loop respectively. In this
# script we use a BEGIN block, but no END block.

BEGIN {

# Tell awk that the input field separator is a tab and the input
# record separator is a carriage return. The carriage return as
# line delimiter is Mac-specific; Windows ends lines with a carriage
# return plus line feed ("\r\n"), and Unix with a line feed ("\n").
FS = "\t"
RS = "\r"

# Now request the same delimiters for the output.
OFS = "\t"
ORS = "\r"

# Here we make an array of all existing labels, cl. This is
# a list against which new labels will be checked so that uniqueness
# can be ensured. This involves running once through the input file
# (supplied on the command line). Reading from it (in the while line)
# is sufficient to open the file, whereas closing the file must be
# done explicitly.

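nlabs = 0
# Compare getline's result with 0 so that the loop stops at end of file
# and also on a read error (where getline returns -1).
while( (getline < ARGV[ARGC-1]) > 0 ) {
    if($38 !~ /^$/) {
        ++nlabs
        cl[nlabs]= $38
    }
}
close(ARGV[ARGC-1])

# create output file name (the dot is escaped so it matches literally)
if(match(ARGV[ARGC-1],/\.txt$/)) {
    fnout=ARGV[ARGC-1]
    sub(/\.txt$/,".out.txt",fnout)
} else {
    fnout=sprintf("%s.out",ARGV[ARGC-1])
}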
# "letters" becomes an array with all the letters of the alphabet
split("abcdefghijklmnopqrstuvwxyz",letters,"")

# Write the two required header lines to the output file. The 37 field
# names in the second line are separated by tabs.
print "*Generic" > fnout
print "Reference Type\tAuthor\tYear\tTitle\tSecondary Author\t" \
      "Secondary Title\tPlace Published\tPublisher\tVolume\t" \
      "Number of Volumes\tNumber\tPages\tSection\tTertiary Author\t" \
      "Tertiary Title\tEdition\tDate\tType of Work\tSubsidiary Author\t" \
      "Short Title\tAlternate Title\tISBN/ISSN\tOriginal Publication\t" \
      "Reprint Edition\tReviewed Item\tCustom 1\tCustom 2\tCustom 3\t" \
      "Custom 4\tCustom 5\tCustom 6\tAccession Number\tCall Number\t" \
      "Author Address\tImage\tCaption\tLabel" > fnout
}


# The BEGIN section is over. Now the main section begins, in which
# processing implicitly loops over each line of the input file.


$38 ~ /^$/ { # this is like an "if" statement, and it means
# "do the following if field 38 (the label) is empty".

core = label($3,$4) # call the label function (below) to make the
# core of the output label string

# We heuristically append "a" to the core. If there is already a
# label like that, proceed sequentially through the alphabet and try
# another one. Process will break if there is already a full set of
# 26 labels for the target core (modify to use double letters as the
# suffix?).

ilet = 1
while(1) {
    letter = letters[ilet]
    mayb = sprintf("%s%c",core,letter)
    unique = 1 # assume the candidate is unique until a match is found;
               # set here so it is defined even when cl is empty
    for(item in cl) {
        if(cl[item]==mayb) {
            unique = 0
            ++ilet;
            break;
        } # if
    } # for
    if(unique) {
        $38 = mayb # change field 38 of the current record
        ++nlabs
        cl[nlabs] = $38 # add this new label to the list of current labels
                        # so that new labels are also unique.
        break; # get us out of the while loop
    } #if
} # while
} # if $38 is empty

# Output the current record. Note that we omit $1, the first field,
# which in my case is the record number. EndNote apparently does
# not want that field in an input file. Thus the awkward list of fields
# 2 through 38. Otherwise we could use the shorthand $0 which indicates
# "all fields".
{print $2,$3,$4,$5,$6,$7,$8,$9,$10,$11,$12,$13,$14,$15,$16,$17,$18,$19,$20,$21,$22,
$23,$24,$25,$26,$27,$28,$29,$30,$31,$32,$33,$34,$35,$36,$37,$38 > fnout}

# That's it for the core section, no more processing is required!


# Take the full list of all authors and the 4-digit year,
# and extract the lowercased version of the first author's
# last name. Assumes that authors are separated by
# semicolons (EndNote also allows a double backslash, I think),
# and that the last name is the last element in each space-delimited
# author name. Name suffixes like "Jr." or "II" may cause this
# to fail, depending on how they are written in the library.
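# The extra parameters after "year" are awk's way of declaring local
# variables, so the function does not overwrite globals of the same name.
function label(namelist,year,    nms,nm,z,lb) {
    split(namelist,nms,";")
    z=split(nms[1],nm," ") # nm gets the output array here, and z
                           # the number of elements.
    lb=sprintf("%s%s",tolower(nm[z]),substr(year,3,2))
    return lb
}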
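
The label logic above can also be tried in isolation. The snippet below is
an illustration under assumptions, not the script itself: it builds the
lastnameYYx label from an author list and a year, and checks uniqueness
with awk's "in" operator on an array keyed by label, rather than scanning
a numbered list as the script does. The names make_core, unique_label and
label_for, and the sample authors and labels, are all hypothetical.

```shell
#!/bin/sh
# Usage: label_for AUTHORS YEAR [EXISTING_LABEL ...]
label_for() {
    authors=$1; year=$2; shift 2
    awk -v authors="$authors" -v year="$year" -v taken="$*" '
    # Lowercased last word of the first author plus the two-digit year.
    function make_core(namelist, yr,    names, parts, n) {
        split(namelist, names, ";")
        n = split(names[1], parts, " ")
        return tolower(parts[n]) substr(yr, 3, 2)
    }
    # Try the suffixes a to z in turn; return "" if all 26 are taken.
    function unique_label(core, seen,    i, candidate) {
        for (i = 1; i <= 26; i++) {
            candidate = core substr("abcdefghijklmnopqrstuvwxyz", i, 1)
            if (!(candidate in seen)) return candidate
        }
        return ""
    }
    BEGIN {
        # Turn the space-separated list of existing labels into a lookup table.
        n = split(taken, list, " ")
        for (i = 1; i <= n; i++) seen[list[i]] = 1
        print unique_label(make_core(authors, year), seen)
    }'
}

label_for "John Smith; Ann Lee" 1995 smith95a smith95b
```

Keying the array by label makes each uniqueness check a single lookup, and
returning an empty string when all 26 suffixes are used makes that case
easy to detect.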
--
Andy Jacobson

Program in Atmospheric and Oceanic Sciences
Sayre Hall, Forrestal Campus
Princeton University
PO Box CN710 Princeton, NJ 08544-0710 USA

Tel: 609/258-5260 Fax: 609/258-2850
