
Headline Indexing Criteria

This section describes the headline indexing process applied to each news story archived in an Aspen news database. Users cannot alter this process, but understanding how news headlines are indexed can help a user search the news database more efficiently.

Each headline is examined to select important keywords (words of substance, as opposed to words that are purely grammatical or carry little meaning); news stories are then indexed by the selected keywords. The keyword index provides a powerful resource for finding and retrieving stories on the basis of their headline contents. Because search queries entered by users can be matched only against indexed keywords, it is valuable to understand which kinds of keywords a server will actually select for the keyword index.
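The general idea of a keyword index can be illustrated with a minimal Python sketch. It is not Aspen's actual implementation: the INSIGNIFICANT word list, the function names, the sample headlines, and the choice to lowercase all words are assumptions made purely for illustration. The sketch shows only why a query word that was never selected as a keyword cannot match any story.

```python
from collections import defaultdict

# Hypothetical stand-in for words of "insignificant meaning"; the lists a real
# server uses are described in the topics of this section, not reproduced here.
INSIGNIFICANT = {"a", "an", "the", "of", "in", "on", "to", "for", "and", "is"}


def select_keywords(headline):
    """Return the lowercased headline words that are not in INSIGNIFICANT."""
    words = (w.strip(".,:;!?\"'()") for w in headline.lower().split())
    return [w for w in words if w and w not in INSIGNIFICANT]


def build_index(stories):
    """Map each selected keyword to the ids of stories whose headline has it."""
    index = defaultdict(set)
    for story_id, headline in stories.items():
        for keyword in select_keywords(headline):
            index[keyword].add(story_id)
    return index


def search(index, query):
    """Return ids of stories whose indexed keywords include every query word."""
    words = query.lower().split()
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Illustrative headlines (invented for this sketch).
stories = {
    1: "Oil Prices Rise in Europe",
    2: "The Price of Oil Falls",
    3: "Bank Merger Announced",
}
index = build_index(stories)
```

In this sketch, searching for "oil" finds stories 1 and 2, while searching for "the" finds nothing, because "the" was never selected as a keyword and so has no entry in the index.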

Topics:

Headline Preprocessing

Keyword Validation

Compound Words

Final Rejection Test

Inclusion List Contents

Exclusion List Contents