Improvement of Web Retrieval by the Use of Contextual Information of Pages
Abstract
This work proposes a new Information Retrieval System model for searching hypertexts that represent web sites. The model is based on the construction of a two-component index: one component describes each HTML page individually, and the other describes the context of the page. The underlying premise is that the textual content of an HTML page alone is not sufficient for an indexing process to extract the information that the page conveys. Indexing pages with both their local content and this complementary contextual content improves the quality of their index and, in turn, the effectiveness of the search engine.
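As a rough illustration of the two-component idea, the following minimal sketch builds a local index and a context index and combines them at query time. It rests on assumptions that the abstract does not state: the context of a page is taken to be the text of the pages linked to it, terms are weighted by raw counts, and the two components are mixed with a hypothetical weight `alpha`. The function names and the toy site are illustrative only and are not the authors' method.

```python
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def build_index(pages, links):
    """Build the two index components.

    pages: {page_id: text}
    links: {page_id: [neighbor page ids]}  # assumed definition of "context"
    Returns (local, context), each mapping page_id -> Counter of terms.
    """
    local = {pid: Counter(tokenize(text)) for pid, text in pages.items()}
    context = {}
    for pid in pages:
        ctx = Counter()
        for neighbor in links.get(pid, []):
            ctx.update(local.get(neighbor, Counter()))
        context[pid] = ctx
    return local, context


def score(query, pid, local, context, alpha=0.7):
    """Mix local and context term counts; alpha is a hypothetical weight."""
    terms = tokenize(query)
    local_part = sum(local[pid][t] for t in terms)
    context_part = sum(context[pid][t] for t in terms)
    return alpha * local_part + (1 - alpha) * context_part


def search(query, local, context, alpha=0.7):
    """Return ids of pages with a positive score, best first (ties by id)."""
    scored = [(score(query, pid, local, context, alpha), pid) for pid in local]
    ranked = sorted(scored, key=lambda item: (-item[0], item[1]))
    return [pid for s, pid in ranked if s > 0]


# Toy web site: only "home" mentions "retrieval" in its own text.
pages = {
    "home": "research group on information retrieval",
    "members": "alice bob carol",
    "pubs": "papers on indexing hypertext",
}
links = {
    "home": ["members", "pubs"],
    "members": ["home"],
    "pubs": ["home"],
}

local, context = build_index(pages, links)
with_context = search("retrieval", local, context)
local_only = search("retrieval", local, context, alpha=1.0)
```

In this toy site, a search restricted to local content (`alpha=1.0`) finds only the home page, whereas the combined index also retrieves the member and publication pages, whose own text never mentions the query term but whose context does.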