
Headline Preprocessing

News headline text is first scanned and broken into text fragments made up of legitimate characters; each text fragment is a candidate for use as an indexed keyword. The following rules apply to the preprocessing scan:

- Characters A to Z, a to z, and 0 to 9 will be accepted as keyword components.

- The period (.), ampersand (&), apostrophe ('), and hyphen (-) characters will be accepted if they do not begin the text fragment (e.g., the text fragments U.S. and T-Bonds are acceptable as shown).

- All other characters will be treated as empty space between text fragments.
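
The scan rules above can be illustrated with a short Python sketch. This is not the product's actual implementation; the names FRAGMENT_RE and scan_headline are hypothetical. It also assumes that a period, ampersand, apostrophe, or hyphen appearing at the start of a fragment is simply skipped, as if it were empty space, which the rules do not state explicitly.

```python
import re
from typing import List

# Hypothetical pattern for one text fragment:
#   - first character: A-Z, a-z, or 0-9
#   - following characters: the same set plus . & ' -
# Assumption: . & ' - at the start of a fragment are skipped like empty space.
FRAGMENT_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9.&'-]*")


def scan_headline(headline: str) -> List[str]:
    """Return the candidate keyword fragments found in a headline.

    Any character outside the accepted set ends the current fragment,
    so it acts as empty space between fragments.
    """
    return FRAGMENT_RE.findall(headline)


if __name__ == "__main__":
    print(scan_headline("U.S. T-Bonds rally; AT&T's shares up 3%"))
    # prints ['U.S.', 'T-Bonds', 'rally', "AT&T's", 'shares', 'up', '3']
```

In this sketch a trailing period, ampersand, apostrophe, or hyphen stays attached to its fragment, which matches the U.S. example; whether later indexing steps trim such characters is not covered here.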