Published in:
Infoteka, 2002, vol. 3, no. 1-2, pp. 13-21
Scriptor: Bibliographic Information Parsing Program
National Library of Serbia, Belgrade
Abstract: Scriptor, a program developed for maintaining SocioFakt and designed to parse journal contents and article references, is described. Using auxiliary databases (e.g. lists of authors' and publishers' names) and simple algorithms for processing Serbian as a natural language, the program recognizes the elements of journal contents and article references (e.g. author name, book title, journal title, page numbers) and assigns a standardized label to each element, so that the information can be transferred automatically into the corresponding database fields.
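The labelling step described above can be illustrated with a minimal Python sketch. It is not Scriptor's actual code (which is written in Visual Basic for Applications); the label names, the semicolon-separated input format, the sample entry and the contents of the auxiliary lists are all assumptions made only for illustration.

```python
import re

# Hypothetical auxiliary lists; the real program consults its own databases.
KNOWN_AUTHORS = {"Petrović, M.", "Jovanović, A."}
KNOWN_PUBLISHERS = {"Nolit", "Prosveta"}

# Page ranges such as "pp. 13-21" or "str. 13-21" (assumed notation).
PAGES_RE = re.compile(r"^(?:pp?\.|str\.)\s*\d+(?:\s*[-–]\s*\d+)?$")
# Four-digit years in the 1900s or 2000s.
YEAR_RE = re.compile(r"^(?:19|20)\d{2}$")


def label_segment(segment):
    """Assign a label to one segment using list lookups and simple patterns."""
    if segment in KNOWN_AUTHORS:
        return "AUTHOR"
    if segment in KNOWN_PUBLISHERS:
        return "PUBLISHER"
    if YEAR_RE.match(segment):
        return "YEAR"
    if PAGES_RE.match(segment):
        return "PAGES"
    # Anything not recognized is treated as a title in this sketch.
    return "TITLE"


def parse_reference(reference):
    """Split a semicolon-separated reference and label every segment."""
    segments = [s.strip() for s in reference.split(";") if s.strip()]
    return [(label_segment(s), s) for s in segments]


if __name__ == "__main__":
    # Made-up entry, used only to exercise the functions.
    entry = "Petrović, M.; 1998; Uvod u sociologiju; Nolit; str. 13-21"
    for label, text in parse_reference(entry):
        print(f"{label}: {text}")
```

Falling back to a title label when nothing else matches keeps the sketch short; an interactive correction step, like the one the program provides, would be the natural place to fix such guesses.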
Apart from the basic parsing module, the program provides subroutines for converting between character sets, (de)capitalizing words according to orthographic rules, inverting the order of an author's name and surname, filling in missing data, and interactively checking and correcting the parsed information.
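Two of these auxiliary operations, name inversion and decapitalization, can be sketched in a few lines of Python. The function names and the rules applied are assumptions for illustration, not the program's own routines; in particular, the real orthographic rules are richer than plain sentence case, which would wrongly lowercase proper nouns.

```python
def invert_author_name(name):
    """Swap between 'Given Surname' and 'Surname, Given' forms.

    Assumes the surname is the last word when no comma is present.
    """
    name = " ".join(name.split())  # normalize internal whitespace
    if "," in name:
        surname, given = [part.strip() for part in name.split(",", 1)]
        return f"{given} {surname}".strip()
    parts = name.rsplit(" ", 1)
    if len(parts) == 1:
        return name  # single-word name: nothing to invert
    given, surname = parts
    return f"{surname}, {given}"


def sentence_case(text):
    """Lowercase an all-caps title, keeping only the first letter capitalized."""
    return text[:1].upper() + text[1:].lower()


if __name__ == "__main__":
    # Made-up sample values.
    print(invert_author_name("Milan Petrović"))
    print(invert_author_name("Petrović, Milan"))
    print(sentence_case("UVOD U SOCIOLOGIJU"))
```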
Scriptor comes with an installation program and a detailed help file, which gives operators specific instructions on using the program effectively and defines the bibliographic standards applied in maintaining SocioFakt. Scriptor is written in Visual Basic for Applications as a Microsoft Word template.
Key words: bibliographic information, parsing, bibliographic databases, citation information, software