Dealing with Structured Documents in Information Retrieval Systems

Abstract: In this paper we suggest how hypertext links and the content of HTML pages can be used to cluster pages into what we call Web documents. We put forward a method to automatically construct a hierarchy of Web documents and, with the help of an abstraction function, the context hierarchy of a site. This hierarchy is represented by a graph whose links are structurally typed. Structural links between nodes reveal a context relationship. The context hierarchy, along with the graph of the pages underlying the site, is used to better index and retrieve the pages. Furthermore, it permits a new operator to be added to the IRS (Information Retrieval System) query language, whereby the user will be able to differentiate the context from the subject of his queries.
Document type:
Conference paper
P. De Bra, John J. Leggett. World Conference on the WWW and Internet, Oct 1999, Honolulu, United States. 10 p., 1999

https://hal-emse.ccsd.cnrs.fr/emse-00941327
Contributor: Florent Breuil
Submitted on: Monday, 3 February 2014 - 16:50:38
Last modified on: Tuesday, 22 March 2016 - 01:16:43

Identifiers

  • HAL Id: emse-00941327, version 1

Citation

Fernando Aguiar, Doan Bich-Liên, Michel Beigbeder. Dealing with Structured Documents in Information Retrieval Systems. P. De Bra, John J. Leggett. World Conference on the WWW and Internet, Oct 1999, Honolulu, United States. 10 p., 1999. 〈emse-00941327〉
