Library Cataloging
ANSEL Designated/Invoked by Two Bytes, Not One
Received via e-mail this morning.
This is a clarification of information on the newly assigned escape sequence for designating and invoking ANSEL, the Extended Latin character set used in MARC 21.
The email notification of the escape sequence assigned to the extended Latin set (ANSI/NISO Z39.47) provided information about the final character of the escape sequence only, which was identified as hexadecimal 45 (the uppercase letter "E"). The text of the registration itself, however, indicates that the final part of the escape sequence for ANSEL is actually two bytes: hexadecimal 21 ("!") followed by hexadecimal 45 ("E"). Although "E" is the final character, it appears that the pair hex 21 45 ("!E") would need to be used in MARC 21 records to designate and/or invoke ANSEL. This is important with regard to the application of ISO 2022, which specifies the technique for using escape sequences to change character sets. ISO 2022 does not indicate that the final portion of the escape sequence is restricted to a single character. Please make note of this clarification when implementing any escape sequences to designate and/or invoke ANSEL (ANSI/NISO Z39.47).
--------------------------------------------------
Library of Congress
Network Development and MARC Standards Office
101 Independence Avenue, S.E.
Washington, DC 20540-4402 U.S.A.
TEL: +1-202-707-6237
FAX: +1-202-707-0115
NET: [email protected]
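
To make the clarification concrete, here is a minimal Python sketch of how the two-byte tail hex 21 45 ("!E") could be assembled into a full escape sequence. The notice above only specifies the "!E" portion. The intermediate bytes used here (hex 28 "(" to designate a set as G0, hex 29 ")" to designate it as G1) are my assumption based on the usual ISO 2022 convention for single-byte sets, and the function name ansel_escape is hypothetical.

```python
ESC = b"\x1b"               # the escape character, hex 1B
ANSEL_TAIL = b"\x21\x45"    # "!E": the two-byte tail described in the notice

# Assumption: intermediate bytes follow the usual ISO 2022 convention
# for single-byte sets; they are not stated in the notice itself.
INTERMEDIATE = {
    "G0": b"\x28",  # "("
    "G1": b"\x29",  # ")"
}

def ansel_escape(target="G1"):
    """Build an escape sequence designating ANSEL as G0 or G1 (sketch)."""
    return ESC + INTERMEDIATE[target] + ANSEL_TAIL

if __name__ == "__main__":
    for target in ("G0", "G1"):
        print(target, ansel_escape(target).hex(" "))
```

The point of the sketch is simply that code matching on "E" alone as the end of the sequence would miss the "!" byte that precedes it.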
-
Character Sets
Michael Doran has put together Coded Character Sets: A Technical Primer for Librarians. If you, like me, are trying to get a handle on character encoding, this is an excellent starting point....
-
Unicode And Marc
News from LC. The revised character set specifications are now posted on the MARC site. They take into account the use of the full Unicode repertoire, as opposed to only the MARC-8 subset of Unicode, and also include the loss-less and lossy techniques...
-
Unicode
It's coming to MARC. This is a gentle introduction: The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) by Joel Spolsky. Before I get started, I should warn you that if you are...
-
Sorting
Guidelines for the Non-Sorting Control Character Technique has been posted by the Network Development and MARC Standards Office, Library of Congress. With Proposal No 98-16R (Nonfiling characters in all MARC formats), the MARC 21 community approved the...
-
Marc
MARC::Charset is a package that allows you to easily convert between the MARC-8 character encodings and Unicode (UTF-8). The Library of Congress maintains some essential mapping tables and information about the MARC-8 and Unicode environments. MARC::Charset...