Human- or computer-produced indexes?
Why have a human-produced index where full text searching is available?
Some computer programs and word processors claim to produce indexes but, in fact, produce "concordances" - lists showing where specific words or phrases appear in the text. Similarly, where an electronic copy of the text is available, included on a CD-ROM or on a webpage, providing full-text search facilities is sometimes considered to give the same results as an index, but again provides, effectively, no more than a concordance. Language, however, is complex and concordances have serious shortcomings when compared to an analytical index produced by a professional human indexer.
The problems with concordances and full text searches are:
- Full text searches do not cope with homographs (words spelt the same but with different meanings), for example:
  - a book on boat-building might refer to the front of the boat (the bow), instruct that something be tied temporarily (with a bow) and say that, as part of the construction, something should be bent into an arch (like a bow, as in a bow and arrow).
  - an Internet search for information on the pop performers Madonna, Prince and the group Queen will produce millions of unwanted references to religious art and royalty.
  - an Internet search for "lead" will find "Lead in paint...", "LEAD International: Leadership for Environment and Development..." and "12-lead ECG library ..."
- Full text searches do not cope with synonyms (words spelt differently but with the same meaning), for example:
  - bruise / contusion,
  - flammable / inflammable,
  - Çanakkale Boğazı / Dardanelles / Hellespont.
- Full text searches do not distinguish between significant and trivial references to a topic, for example:
  - it would not be helpful to the index user seeking information on 'children' to direct them to passages such as:
    - "teaching adults (as opposed to children) ..."
    - "the subject of children is dealt with later ...".
- Full text searches do not pick up inferences, where a concept is discussed but the actual search term is not used, for example:
  - a biography which says about the subject that he was "a strong advocate of religious tolerance" and, elsewhere in the text, "he was not a member of the Church" should show both these references in the index under 'religion'.
- Full text searches do not cater for graphics - they may pick up the caption to a picture, but cannot access or assess the content of a picture, for example:
  - The New Yorker cartoon "On the Internet, Nobody Knows You're a Dog" is of interest to searches on security and identity, but probably not to searches about dogs.
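The first two problems can be seen in a very small concordance. The following Python sketch is illustrative only: the sample pages and the function name build_concordance are invented for this example, and a real program would handle punctuation, hyphenation and word forms with more care.

```python
import re
from collections import defaultdict

# Hypothetical sample pages, invented purely for illustration.
PAGES = {
    1: "Shape the bow of the boat before fitting the keel.",
    2: "Tie the rope with a bow until the glue sets.",
    3: "A bruise may appear where the plank struck.",
    4: "Treat any contusion with a cold compress.",
}


def build_concordance(pages):
    """Map each lowercased word to the sorted list of pages where it occurs."""
    concordance = defaultdict(set)
    for page_number, text in pages.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            concordance[word].add(page_number)
    return {word: sorted(found) for word, found in concordance.items()}


concordance = build_concordance(PAGES)

# Homographs: two different senses of "bow" are merged under one entry.
print(concordance["bow"])        # [1, 2]
# Synonyms: one concept is split across two unrelated entries.
print(concordance["bruise"])     # [3]
print(concordance["contusion"])  # [4]
```

The program matches strings of letters and nothing else, so it cannot tell the bow of a boat from a bow in a rope, and it has no way of knowing that a reader looking up "bruise" also needs page 4.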
A computer-produced concordance or full text search leaves the reader with two unanswered questions:
- Are all these hits really relevant?
- Have I missed any significant information?
A professional quality, human-produced index, whether for a book or a website, provides the answers. Every piece of information indexed is analysed by a human indexer to determine to what questions it would provide a relevant answer, and it is indexed under those questions.
Furthermore, value judgements made by the indexer can be communicated to the reader, for example:
- Page numbers for the main discussions on a topic may be shown in bold typeface
- On a website index, the index heading may include explanatory text such as "unverified source"
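As a rough illustration of how such judgements might be recorded alongside an index entry, here is a minimal Python sketch. The class names Locator and Entry, the render function, the page numbers and the second heading are all hypothetical, and asterisks simply stand in for bold typeface; this is not a description of any particular indexing software.

```python
from dataclasses import dataclass, field


@dataclass
class Locator:
    page: int
    main: bool = False  # True marks a main discussion (bold in print)


@dataclass
class Entry:
    heading: str
    locators: list = field(default_factory=list)
    note: str = ""  # optional explanatory text, e.g. "unverified source"


def render(entry):
    """Format one entry, wrapping main locators in asterisks to suggest bold."""
    pages = ", ".join(
        f"*{loc.page}*" if loc.main else str(loc.page)
        for loc in entry.locators
    )
    heading = f"{entry.heading} ({entry.note})" if entry.note else entry.heading
    return f"{heading}: {pages}"


# Hypothetical entries; the page numbers are invented for illustration.
religion = Entry("religion", [Locator(12), Locator(48, main=True)])
reports = Entry("shipwreck reports", [Locator(7)], note="unverified source")

print(render(religion))  # religion: 12, *48*
print(render(reports))   # shipwreck reports (unverified source): 7
```

The point of the sketch is that the emphasis and the note are decisions made by the indexer and stored with the entry; nothing in the text itself would let a program derive them.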
Computer-produced concordance-style indexes or full text search facilities are quick and easy to supply, saving time and cost for the information provider. They are suited to rapidly changing material.
Back-of-book-style indexes, in books and for websites, take more time and skill to produce but provide significantly improved access, saving time and cost for the information seeker. They are suited to material which changes slowly, or only by addition.
There is more information about why ebooks need indexes on the Society of Indexers Publishing Technology Group Website at www.ptg-indexers.org.uk/
Last updated: 05 April 2013