Library Cataloging

Code4Lib Journal Articles

The latest issue of Code4Lib Journal has a couple of interesting articles.

Using Semantic Web Technologies to Collaboratively Collect and Share User-Generated Content in Order to Enrich the Presentation of Bibliographic Records - Development of a Prototype Based on RDF, D2RQ, Jena, SPARQL and WorldCat's FRBRization Web Service
Ragnhild Holgersen, Michael Preminger, David Massey

In this article we present a prototype of a semantic web-based framework for collecting and sharing user-generated content (reviews, ratings, tags, etc.) across different libraries in order to enrich the presentation of bibliographic records. The user-generated data is remodeled into RDF, utilizing established linked data ontologies. This is done in a semi-automatic manner utilizing the Jena and the D2RQ-toolkits. For the remodeling, a SPARQL-construct statement is tailored for each data source.

In the data source used in our prototype, user-generated content is linked to the relevant books via their ISBN. By remodeling the data according to the FRBR model, and expanding the RDF graph with data returned by WorldCat's FRBRization web service, we are able to greatly increase the number of entry points to each book. We make the social content available through a RESTful web service with ISBN as a parameter. The web service returns a graph of all user-generated data registered to any edition of the book in question in the RDF/XML format. Libraries using our framework would thus be able to present relevant social content in association with bibliographic records, even if they hold a different version of a book than the one that was originally accessed by users. Finally, we connect our RDF graph to the linked open data cloud through the use of Talis' openlibrary.org SPARQL endpoint.
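The lookup described in the abstract, where content attached to one ISBN becomes reachable from any edition of the same work, can be illustrated with a small Python sketch. This is not the authors' implementation: the ISBNs, work identifiers, and the `content_for_isbn` function are all hypothetical, and a plain dictionary stands in for whatever an ISBN-to-work grouping service would return.

```python
# Hypothetical sample data: each ISBN maps to a work identifier. In the
# prototype this grouping would come from a FRBRization service; here it is
# hard-coded purely for illustration.
ISBN_TO_WORK = {
    "9780000000001": "work-1",
    "9780000000002": "work-1",  # another edition of the same work
    "9780000000003": "work-2",
}

# Hypothetical user-generated content, each item keyed by the ISBN of the
# edition the user was looking at when they contributed it.
USER_CONTENT = [
    {"isbn": "9780000000001", "kind": "rating", "value": "4"},
    {"isbn": "9780000000002", "kind": "tag", "value": "fantasy"},
    {"isbn": "9780000000003", "kind": "review", "value": "A slow start."},
]


def content_for_isbn(isbn, isbn_to_work=ISBN_TO_WORK, content=USER_CONTENT):
    """Return content registered to any edition of the work the ISBN belongs to."""
    work = isbn_to_work.get(isbn)
    if work is None:
        # Unknown ISBN: nothing to return.
        return []
    # Collect every ISBN that shares the same work identifier.
    editions = {i for i, w in isbn_to_work.items() if w == work}
    return [item for item in content if item["isbn"] in editions]


if __name__ == "__main__":
    for item in content_for_isbn("9780000000001"):
        print(item["isbn"], item["kind"], item["value"])
```

Querying with either of the first two ISBNs returns the same two items, which is the sense in which the work-level grouping adds entry points to each book.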

GLIMIR: Manifestation and Content Clustering within WorldCat
Janifer Gatenby, Richard O. Greene, W. Michael Oskins, Gail Thornburg

The GLIMIR project at OCLC clusters and assigns an identifier to WorldCat records representing the same manifestation. These include parallel records in different languages (e.g., a record with English descriptive notes and subject headings and one for the same book with French equivalents). It also clusters records that probably represent the same manifestation, but which could not be safely merged by OCLC's Duplicate Detection and Resolution (DDR) program for various reasons. As the project progressed, it became clear that it would also be useful to create content-based clusters for groups of manifestations that are generally equivalent from the end user perspective (e.g., the original print text with its microform, ebook and reprint versions, but not new editions). Lessons from the GLIMIR project have improved OCLC's duplicate detection program through the introduction of new matching techniques. GLIMIR has also had unexpected benefits for OCLC's FRBR algorithm by providing new methods for identifying outliers, thus enabling more records to be included in the correct work cluster.
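The two levels of clustering mentioned in the abstract can be pictured with a toy Python example. It is only a sketch under strong assumptions: the record fields (`title`, `edition`, `format`, `cat_lang`) and the `cluster` helper are hypothetical, and exact-key grouping stands in for GLIMIR's actual matching, which the abstract does not describe.

```python
from collections import defaultdict

# Hypothetical, heavily simplified records. Real WorldCat records are MARC
# records and would be compared on many more data elements.
RECORDS = [
    {"id": "r1", "title": "Moby Dick", "edition": "1st", "format": "print", "cat_lang": "eng"},
    {"id": "r2", "title": "Moby Dick", "edition": "1st", "format": "print", "cat_lang": "fre"},
    {"id": "r3", "title": "Moby Dick", "edition": "1st", "format": "microform", "cat_lang": "eng"},
    {"id": "r4", "title": "Moby Dick", "edition": "2nd", "format": "print", "cat_lang": "eng"},
]


def cluster(records, key_fields):
    """Group record ids by the tuple of values found in key_fields."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in key_fields)
        groups[key].append(record["id"])
    return dict(groups)


# Manifestation level: the language of cataloging is ignored, so parallel
# records (r1, r2) fall into the same cluster.
manifestation_clusters = cluster(RECORDS, ["title", "edition", "format"])

# Content level: format is ignored as well, so the microform (r3) joins the
# print records, while the new edition (r4) stays in its own cluster.
content_clusters = cluster(RECORDS, ["title", "edition"])

if __name__ == "__main__":
    print(manifestation_clusters)
    print(content_clusters)
```

The point of the sketch is only the relationship between the levels: dropping a field from the key widens the cluster, which mirrors how a content cluster gathers several manifestation clusters.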

- Tags Into RDFa
LODr looks interesting. It brings tagging, Web 2.0, into the Semantic Web, Web 3.0. LODr is an RDF-based (re-)tagging service that allows people to weave their Web 2.0 tagged data into the Linked Data Web and provides a dedicated browsing interface. LODr...

- Dewey Classification As Linked Data
News from the Dewey office. For a long time, we wanted to do something with Linked Data. That is, apply Linked Data principles to parts of the Dewey Decimal Classification and present the data as a small "terminology service." The service should respond...

- MARC And RDF
Semantic MARC, MARC21 and the Semantic Web by Rob Styles, Danny Ayers, and Nadeem Shabir is available as a preprint. The MARC standard for exchanging bibliographic data has been in use for several decades and is used by major libraries worldwide. This...

- Identifiers And Subject Access
A while back I posted a criticism of David Weinberger's piece in the Boston Globe. He was kind enough to respond. Since many folks might miss the comments, I'm reposting them here. Here's what I was trying to say, in a highly-compressed article....

- MARC And FRBR
Data mining MARC to find: FRBR? by Knut Hegna and Eeva Murtomaa is an interesting read. In this project MARC data from two national bibliographies is analyzed in the light of the data model presented in the FRBR study from IFLA. The analysis shows that even...
