At work we bought a lot of new books for our new library. Our libraries use a program called LibraryPro, which can import MARC records to add books to the catalog. To save ourselves some time, I looked on the web for a script to fetch MARC records from the Internet and found this excellent script from William Denton. Because I needed to fetch records in batches and rewrite each record's Dewey Decimal number to follow our custom format (number/category/author), I modified his script to do that, and this is how it looks now:
#!/usr/local/bin/ruby -w
# Script to get a batch of MARC records
#
#original code taken from http://www.miskatonic.org/library/zmarc.html
# original programmer
# William Denton wtd@pobox.com
# April 2007
# modifications made by Juan Pablo Tarquino http://jptarqu.blogspot.com
# Released under the MIT License.
# Copyright (c) 2007 William Denton
# Copyright (c) 2008 Juan Pablo Tarquino
#
# Permission is hereby granted, free of charge, to any person
# obtaining a copy of this software and associated documentation
# files (the "Software"), to deal in the Software without
# restriction, including without limitation the rights to use,
# copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following
# conditions:
#
# The above copyright notice and this permission notice shall be
# included in all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
# EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
# OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
# NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
# HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
# WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
# FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
# OTHER DEALINGS IN THE SOFTWARE.
# INSTALLATION NOTES
#
# Note: this script has been tested only in Linux. You may have to
# do extra steps to make it work in Windows or Mac
#
# Install 'yaz' because ruby-zoom depends on it. On Debian:
# # sudo apt-get install yaz
#
# Requires the ruby-zoom package and the ruby-marc gem.
#
# ruby-zoom: http://ruby-zoom.rubyforge.org/
#
# ruby-marc: http://www.textualize.com/ruby_marc
# You can install ruby-marc by running
# # gem install marc
#
# NOTE ruby-zoom installs its own marc.rb file that will conflict with
# ruby-marc! You will need to delete ruby-zoom's marc.rb or rename it
# for ruby-marc to work.
#
# On my system it was installed in
# /usr/local/lib/ruby/site_ruby/1.8/marc.rb
# but you'll have to look for it wherever your system put it.
# USAGE
#
# Enter your ISBN numbers in a file called 'in_isbn' in the same
# folder as this script. One ISBN number per line. Then using the
# terminal change current directory to the folder that contains
# this script and then execute:
#
# ./zmarc.rb
#
# A new file (export_marc.txt) will be created with the MARC records
# found
require 'rubygems'
require 'zoom'
require 'marc'
require 'yaml'
require 'stringio' # StringIO is used below to feed MARCXML to the reader
# Given an ISBN and some Z39.50 information, return MARCXML. Why
# MARCXML? Because (now) the ruby-zoom module can't return a
# ruby-marc MARC object. It can, however, return MARCXML, which
# ruby-marc can grok, so we translate it into XML and then back.
class ZMarc
OUT_FILE_NAME = 'export_marc.txt'
ERR_FILE_NAME = 'err_marc.txt'
def self.z3950query(isbn, host, port, db)
begin
ZOOM::Connection.open(host, port) do |conn|
conn.database_name = db
conn.preferred_record_syntax = 'MARC21'
rset = conn.search("@attr 1=7 #{isbn}")
return rset[0].xml
end
rescue Exception => e
# puts e # Uncomment to see any server errors
return nil
end
end
def self.import_records(isbn_numbers)
#writer = MARC::Writer.new('marc.dat')
err_file = File.open(ERR_FILE_NAME,"w")
marc_file = MARC::Writer.new(OUT_FILE_NAME)
servers = [
# Reorder these so that your preferred servers are first
# North America
['z3950.loc.gov', 7090, 'Voyager' ], # Library of Congress
['www.saclibrarycatalog.org', 210, 'INNOPAC' ], # Sacramento Pub Lib
['sirsi.library.utoronto.ca', 2200, 'unicorn' ], # U Toronto
['amicus.collectionscanada.ca', 210, 'NL' ], # Lib & Archives Canada
['aleph.mcgill.ca', 210, 'MUSE' ], # McGill
# ['ualapp.library.ualberta.ca', 2200, 'unicorn', ], # U Alberta
['portage.library.ubc.ca', 7090, 'voyager' ], # UBC
['catnyp.nypl.org', 210, 'INNOPAC' ], # New York Pub Lib
['library.mit.edu', 9909, 'mit01pub' ], # MIT
['prodorbis.library.yale.edu', 7090, 'voyager' ], # Yale
['catalog.princeton.edu', 7090, 'voyager' ], # Princeton
['ipac.lib.uchicago.edu', 210, 'usmarc' ], # Chicago
['library.bu.edu', 210, 'INNOPAC' ], # Boston U
['voyager.wrlc.org', 7090, 'voyager' ], # Wash Res Lib Consor
['catalog.lib.jhu.edu', 210, 'horizon' ], # Johns Hopkins
['z3950.lib.umich.edu', 210, 'miu01_pub' ], # U Michigan
['catalog.library.cornell.edu',7090, 'voyager' ], # Cornell
# UK and Ireland
['library.ucc.ie', 210, 'INNOPAC' ], # U College Cork
['library.ox.ac.uk', 210, 'MAIN*BIBMAST'], # Oxford
['z3950.nls.uk', 7290, 'voyager' ], # Scottish Nat Lib
['lib-15.lse.ac.uk', 7090, 'voyager' ], # LSE
['libsys.lib.hull.ac.uk', 210, 'INNOPAC' ], # Hull
# Europe (non-English)
['sigma.nkp.cz', 9909, 'NKC' ], # Nat Lib Czech R
['lib.mpib-berlin.mpg.de', 2020, 'opac' ], # Max Planck Inst
['ubsun02.biblio.etc.tu-bs.de',2020, 'bac' ], # Bibliotheken Berlins
['z3950.kb.dk', 2100, 'KGL01' ], # Kongelige Bibliothek
['www.bne.es', 2210, 'BIMO' ], # Nat Lib Spain
['roble.unizar.es', 210, 'INNOPAC' ], # U Zaragoza
['www.helmet.fi', 210, 'INNOPAC' ], # Helsinki Lib
['carmin.sudoc.abes.fr', 210, 'ABES-Z39-PUBLIC' ], # France
['gofor.bibli.ens-cachan.fr', 21210, 'ADVANCE' ], # French school
['gofor.bibli.ens-cachan.fr', 21210, 'MAIN*BIBMAST'], # French school
['isis.cilea.it', 2100, 'usmarc' ], # U Brescia
['aleph.library.tudelft.nl', 9909, 'tud01' ], # Techn U Delft
['z3950.bibsys.no', 2100, 'BIBSYS' ], # Nat Lib Norway
['z3950.nb.no', 2100, 'norbok' ], # Nat Lib Norway
['alpha.bn.org.pl', 210, 'INNOPAC' ], # Nat Lib Poland
['z3950.btj.se', 210, 'BURK' ], # Sweden
# ['lbsihol.unimaas.nl', 7190, 'lbs' ], # U Maastricht
# Australia and New Zealand
['catalogue.nla.gov.au', 7090, 'voyager' ], # Nat Lib Australia
['nlnzcat.natlib.govt.nz', 7190, 'voyager' ], # Nat Lib New Zealand
# Asia
['library.cuhk.edu.hk', 210, 'INNOPAC' ], # Chinese U HK
['linc.nus.edu.sg', 210, 'INNOPAC' ], # Nat U Singapore
['nbinet.ncl.edu.tw', 210, 'INNOPAC' ], # Nat Cent Lib Taiwan
# ['wine.wul.waseda.ac.jp', 210, 'INNOPAC' ], # Waseda U
# Africa
['explore.up.ac.za', 210, 'INNOPAC' ], # U Pretoria
# ['natlib1.unisa.ac.za', 210, 'INNOPAC' ], # Nat Lib South Africa
]
total = 0
#isbn_numbers = "978-0-545-05471-3,0-8037-2842-5,978-0-7642-0184-4,978-0-7586-1270-0,978-1-883551-45-2, 0-7847-1512-2, 978-1-5914-5447-2, 978-0-590-29972-5,0-439-81111-2,978-0-545-01162-4".gsub('-','').split(',')
for isbn in isbn_numbers
found = false
# isbn = "978-1-883551-45-2"
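isbn = isbn.upcase.gsub(/[^0-9X]/, '') # upcase so a lowercase 'x' check digit survives
if (! /\A(97[89])?\d{9}[0-9X]\z/.match(isbn)) # anchored: exactly 10 or 13 characters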
puts "This is not a valid ISBN #{isbn}" # Not a true validity check!
else
# Two lists of open Z39.50 servers:
# http://targettest.indexdata.com/
# http://staff.library.mun.ca/staff/toolbox/z3950hosts.htm
# Now the real business. Loop through all the servers listed above
# and query each one about the ISBN until one answers or we run out of servers
servers.each do |server|
marcxml = z3950query(isbn, server[0], server[1], server[2])
unless marcxml.nil?
reader = MARC::XMLReader.new(StringIO.new(marcxml))
new_record = MARC::Record.new()
reader.each do |record|
# Would be good to have an option or something so that people
# wouldn't have to see the leader and other early fields and
# possibly less interesting fields such as 9xx (local information).
# Some libraries have lots of 852 (holdings) fields which
# fill up the screen.
#puts record.to_yaml
puts "#{server[0]} ..."
found = true
if record['100'].nil?
author_name = ' '*3
else
author_name = "#{record['100']['a']} "[0..2]
end
puts author_name
unless record['082'].nil?
dewey_decimal = record['082']['a'].to_s
#check if it already contains the 3 parts, add them if missing
dewey_number_parts = dewey_decimal.split('/')
if dewey_number_parts[1] == nil
dewey_number_parts[1] = 'EFic'
end
if dewey_number_parts[2] == nil
dewey_number_parts[2] = author_name
end
new_dewey_decimal = dewey_number_parts.join('/')
puts new_dewey_decimal
# new_data_field = MARC::DataField.new('082','0','0',
# ['a', new_dewey_decimal],['2', record['082']['2'].to_s])
# record.append(new_data_field)
#marc_raw_data = marc_raw_data.gsub(dewey_decimal, new_dewey_decimal)
end
# Add fields to new_record, swapping in the rewritten 082 field
record.each do |field|
if field.tag == '082'
new_data_field = MARC::DataField.new('082','0','0',
['a', new_dewey_decimal],['2', field['2'].to_s])
new_record.append(new_data_field)
else
new_record.append(field)
end
end
marc_file.write new_record
end
total = total + 1
puts total
break
end
end
#puts "Sorry, nothing found for #{isbn}"
err_file.puts isbn unless found
end
end
marc_file.close
err_file.close
end
end
isbn_numbers = IO.read("in_isbn").split("\n")
ZMarc.import_records(isbn_numbers)
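The number/category/author rewrite is buried in the middle of the loop, so here is a minimal sketch of the same logic as a standalone method. The name `build_call_number` and the `default_category` parameter are my own for illustration and are not in the script above. One small difference: it pads short author names to three characters, which the script only does when there is no 100 field at all.

```ruby
# Hypothetical helper (not part of the script above): builds the custom
# "number/category/author" value from the text of an 082$a subfield.
def build_call_number(dewey, author, default_category = 'EFic')
  # First three characters of the author name, padded with spaces
  # when the name is missing or shorter than three characters.
  author_code = author.to_s.ljust(3)[0, 3]

  parts = dewey.to_s.split('/')
  parts[1] ||= default_category # fill in the category only if it is missing
  parts[2] ||= author_code      # fill in the author code only if it is missing
  parts.join('/')
end

puts build_call_number('813.54', 'Rowling, J. K.') # 813.54/EFic/Row
puts build_call_number('398.2/EFic/Gri', 'Grimm')  # 398.2/EFic/Gri (already complete)
```

Keeping this in its own method makes it easy to try a few values by hand before running a whole batch of ISBNs against the servers.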
Feel free to use this script. It will only work with MRI Ruby because ruby-zoom is a native extension. Stay tuned to this blog for a JRuby script that gets the MARC records from the Library of Congress's website.
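P.S. The ISBN check in the script only looks at the shape of the number (its own comment says it is not a true validity check). If you want to catch typos before querying any server, a rough sketch of a real check-digit test could look like the following. The method names are hypothetical and not part of the script; the arithmetic is the standard ISBN-10 (mod 11) and ISBN-13 (mod 10) check-digit rule.

```ruby
# Hypothetical helpers (not part of the script above).

# Keep only digits and X, accepting a lowercase x as the check digit.
def normalize_isbn(raw)
  raw.to_s.upcase.gsub(/[^0-9X]/, '')
end

# ISBN-10: weights 10 down to 1, total must be divisible by 11.
# An 'X' in the last position counts as 10.
def valid_isbn10?(isbn)
  return false unless isbn =~ /\A\d{9}[0-9X]\z/
  sum = 0
  isbn.each_char.with_index do |ch, i|
    value = (ch == 'X') ? 10 : ch.to_i
    sum += value * (10 - i)
  end
  (sum % 11).zero?
end

# ISBN-13: weights alternate 1, 3, 1, 3, ...; total must be divisible by 10.
def valid_isbn13?(isbn)
  return false unless isbn =~ /\A\d{13}\z/
  sum = 0
  isbn.each_char.with_index do |ch, i|
    sum += ch.to_i * (i.even? ? 1 : 3)
  end
  (sum % 10).zero?
end

def valid_isbn?(raw)
  isbn = normalize_isbn(raw)
  valid_isbn10?(isbn) || valid_isbn13?(isbn)
end

puts valid_isbn?('0-306-40615-2')     # true
puts valid_isbn?('978-0-306-40615-7') # true
puts valid_isbn?('0-306-40615-3')     # false (wrong check digit)
```

You could call `valid_isbn?` on each line of `in_isbn` and write the failures straight to the error file instead of sending them to every server in the list.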