Web Information Retrieval

Xiannong Meng
Computer Science Department
Bucknell University
Lewisburg, PA 17837

This material is for a short, intensive course on web information retrieval with about 36 hours of meeting time.

Programming Project:
Web Search Engine -- An Application of Information Retrieval Theory

Summer 2014

The programming project in this course is to build a simple yet functional search engine that can answer user queries from a collection of web pages gathered by a crawler. Students are expected to have sufficient experience in a high-level programming language, familiarity with basic data structures such as lists, queues, stacks, and graphs, and some basic mathematics of the kind found in college calculus and algebra.

The project is divided into five parts. The first four parts are required for a functional search engine. The fifth part (page scoring) is also important, but given the time limit, students who cannot finish it will still gain the experience of building a basic search engine. See the following pages for details of each part of the project.