Library Cataloging
Corporate Names
A new paper from HP discusses the problems of distinguishing corporate names and an automated solution: Company Names Matching in the Large Patents Dataset by Timofey Medvedev and Alexander Ulanov, HP Laboratories, HPL-2011-90R1.
This paper addresses the name matching (duplicate detection) problem in the US patent dataset. It contains more than 400K unique company name spellings. In order to solve the matching problem we choose an appropriate string similarity measure and clustering approach and estimate their parameters. Finally we apply them to the whole dataset and estimate the positives and negatives rates.
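For readers who want a feel for what "a string similarity measure plus a clustering approach" looks like in practice, here is a minimal sketch in Python. It is not the method from the paper: the excerpt above does not say which measure or clustering the authors chose, so this swaps in the standard library's `difflib.SequenceMatcher` ratio and simple threshold-based single-link grouping. The suffix list, the `0.85` threshold, and the function names are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical list of legal-form words to ignore; not taken from the paper.
LEGAL_SUFFIXES = {"inc", "corp", "corporation", "co", "company", "ltd", "llc"}


def normalize(name):
    """Lowercase, replace punctuation with spaces, drop legal-form words."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def similarity(a, b):
    """Similarity in [0, 1] between two normalized names (difflib ratio)."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def cluster_names(names, threshold=0.85):
    """Group names whose pairwise similarity reaches the assumed threshold.

    Single-link grouping via union-find: two names land in the same
    cluster if a chain of sufficiently similar pairs connects them.
    """
    parent = list(range(len(names)))

    def find(i):
        # Follow parent links to the cluster root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if similarity(names[i], names[j]) >= threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i, name in enumerate(names):
        clusters.setdefault(find(i), []).append(name)
    return list(clusters.values())


if __name__ == "__main__":
    sample = [
        "Hewlett-Packard Company",
        "Hewlett Packard Co.",
        "Hewlet Packard Inc",
        "International Business Machines Corp.",
    ]
    for group in cluster_names(sample):
        print(group)
```

The all-pairs comparison is quadratic in the number of names, so a dataset of 400K spellings would need some way of limiting which pairs get compared. That part is left out of this sketch.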
-
Problems With Names
Falsehoods Programmers Believe About Names shows the computer folks are finding names a problem. We could have told them that a long time ago. So, as a public service, I'm going to list assumptions your systems probably make about names. All of these assumptions...
-
Better Targeted Ads
Computing Semantic Similarity Using Ontologies by Rajesh Thiagarajan, Geetha Manjunath, and Markus Stumptner is a new HP Lab Report. Determining semantic similarity of two sets of words that describe two entities is an important problem in web mining (search...
-
LPSC Poster
Tuesday I'll be presenting my poster for LPSC. I'm hoping some of the scientists go back to their libraries, clean up, and get their names submitted to the LC Name Authority File. This is a larger problem for women who marry, change their...
-
Authority Tool
a.k.a. (also known as) lists author pseudonyms, aliases, nicknames, working names, legalized names, pen names, noms de plume, maiden names... etc. As of 06/05/04 it included 11,516 entries (4,142 'real' + 7,374 'pseudo')....
-
Authority
On identifying name equivalences in digital libraries by Dror G. Feitelson appears in Information Research, v. 9, no. 4 (July, 2004). The services provided by digital libraries can be much improved by correctly identifying variants of the same name. For...