
Keyword Validation

Before it can be used as an indexed keyword, each candidate text fragment must be validated. A candidate keyword can be validated in one of three ways:

• it can be listed in an inclusion list of special keywords,

• it can represent the assigned story number for its associated news story, or

• it can meet a standard set of validation criteria.

The inclusion list contains special keywords that would otherwise be rejected by the standard validation process. Any candidate keyword that appears in the inclusion list is automatically validated for use without further examination. (The contents of the inclusion list are enumerated later.)

A candidate keyword formed to represent the assigned story number of its associated news story is also automatically validated for use without further examination. For example, a news story with an assigned story number of 16702 would be indexed by the 16702 keyword.

If not validated according to the two preceding rules, a candidate keyword must meet all of the following standard validation criteria:

• keywords must contain two or more characters.

• keywords must begin with a letter (A to Z or a to z); keywords beginning with a numeric character (0 to 9) are not indexable.

• keywords must not appear in the exclusion list, a list of short and usually irrelevant words such as "the", "it", "as", and "by".

Note that the general exclusion of keywords beginning with a numeric character prevents the useless and resource-consuming practice of indexing text fragments like “12:37” or “324,700”, which have only circumstantial value to a headline, as opposed to a text fragment like “90-day”, which has a meaningful, descriptive value.
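
The three validation rules can be sketched as a single check. This is only a minimal illustration in Python, not the product's actual code. The function name, parameter names, and both list constants are hypothetical. The inclusion list is left empty because its contents are enumerated elsewhere, and the exclusion list holds only the four example words given above. Case-insensitive list matching is also an assumption, since case handling is not specified here.

```python
import string

# Hypothetical placeholders, not the real lists.
INCLUSION_LIST = frozenset()                           # actual contents are enumerated elsewhere
EXCLUSION_LIST = frozenset({"the", "it", "as", "by"})  # only the examples named in the text


def is_valid_keyword(candidate, story_number,
                     inclusion=INCLUSION_LIST, exclusion=EXCLUSION_LIST):
    """Return True if the candidate keyword may be indexed (illustrative sketch)."""
    folded = candidate.lower()  # assumption: lists are matched case-insensitively

    # Rule 1: anything on the inclusion list is accepted without further checks.
    if folded in inclusion:
        return True

    # Rule 2: the story's own assigned number is accepted without further checks.
    if candidate == str(story_number):
        return True

    # Rule 3: otherwise every standard criterion must hold.
    return (
        len(candidate) >= 2                        # two or more characters
        and candidate[0] in string.ascii_letters   # begins with A-Z or a-z
        and folded not in exclusion                # not on the exclusion list
    )
```

Under these assumptions, `is_valid_keyword("16702", 16702)` accepts the story number, while "12:37" and "324,700" are rejected because they begin with a digit. A fragment such as "90-day" also begins with a digit, so in this sketch it is accepted only when it is passed in through the inclusion list; whether it appears on the actual inclusion list is not stated here.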