The program Segmenter.java was written by Erik Peterson (erik AT mandarintools.com) and was last modified on Jan. 13, 2004. It was downloaded from http://www.mandarintools.com/segmenter.html on 05/16/2013 by Xiannong Meng.

I have modified the class so that, in addition to parsing the characters, it generates and prints an inverse document list.

The programs need the following data files.

In the program directory:

    simplexu8.txt
    tradlexu8.txt
    bothlexu8.txt

In the ./data directory:

    sforeign_u8.txt
    snumbers_u8.txt
    tforeign_u8.txt
    tnumbers_u8.txt
    snotname_u8.txt
    ssurname_u8.txt
    tnotname_u8.txt
    tsurname_u8.txt

Compile the programs with

    javac Segmenter.java

Run the program with

    java Segmenter [-8 | -b | -g] list of input files

for example,

    java Segmenter -8 news.txt sina-news.txt

Xiannong Meng
05-21-2013
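For readers unfamiliar with the idea, the sketch below shows one way an inverse document list could be built once the input files have been segmented. It is not the code in Segmenter.java. It assumes the list maps each segmented term to the names of the files that contain it, and the class InvertedIndexSketch, the method build, and the pinyin placeholder terms are all hypothetical names chosen for illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of Segmenter.java.
final class InvertedIndexSketch {

    // Maps each term to the sorted set of file names in which it occurs.
    // Input: file name -> list of terms already produced by a segmenter.
    static Map<String, TreeSet<String>> build(Map<String, List<String>> termsByFile) {
        Map<String, TreeSet<String>> index = new TreeMap<>();
        for (Map.Entry<String, List<String>> entry : termsByFile.entrySet()) {
            for (String term : entry.getValue()) {
                // A set is used so that a repeated term records its file only once.
                index.computeIfAbsent(term, k -> new TreeSet<>()).add(entry.getKey());
            }
        }
        return index;
    }

    public static void main(String[] args) {
        // Placeholder terms (pinyin) stand in for segmented Chinese words.
        Map<String, List<String>> termsByFile = new LinkedHashMap<>();
        termsByFile.put("news.txt", Arrays.asList("zhongguo", "xinwen", "zhongguo"));
        termsByFile.put("sina-news.txt", Arrays.asList("xinwen", "wangzhan"));

        // Print one line per term: the term, a tab, and the files containing it.
        build(termsByFile).forEach((term, files) -> System.out.println(term + "\t" + files));
    }
}
```

A TreeMap and TreeSet are used here only so that the printed list comes out in a stable, sorted order. The modified Segmenter may store and print its list differently.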