
Dealing with Structured Documents in Information Retrieval Systems

Abstract : In this paper, we suggest how hypertext links and the content of HTML pages can be used to cluster pages into what we call Web documents. We put forward a method to automatically construct a hierarchy of Web documents and, with the help of an abstraction function, the context hierarchy of a site. This hierarchy is represented by a graph whose links are of structural type. Structural links between nodes reveal a context relationship. The context hierarchy, along with the graph of the pages underlying the site, is used to better index and retrieve the pages. Furthermore, it permits a new operator to be added to the IRS (Information Retrieval System) query language, whereby the user will be able to differentiate the context from the subject of his queries.
Document type :
Conference papers
Contributor : Florent Breuil
Submitted on : Monday, February 3, 2014 - 4:50:38 PM
Last modification on : Wednesday, June 24, 2020 - 4:18:07 PM


  • HAL Id : emse-00941327, version 1


Fernando Aguiar, Doan Bich-Liên, Michel Beigbeder. Dealing with Structured Documents in Information Retrieval Systems. World Conference on the WWW and Internet, Oct 1999, Honolulu, United States. 10 p. ⟨emse-00941327⟩


