Semtinel
Thesaurus-based indexing is a common approach to improving the performance of document retrieval. With the growing number of documents available, manual indexing is not a feasible option, and statistical methods for automated document indexing are an attractive alternative. We argue that in automatic indexing the quality of the underlying thesaurus, in particular its ability to adequately cover the contents to be indexed, is of crucial importance, because there is no human in the loop who can spot and avoid indexing errors.
With Semtinel, we are developing a framework that enables a human expert to supervise the automatic indexing process in an interactive and effective way. The Semtinel framework provides various analysis approaches that combine statistical measures with appropriate visualization techniques. The goal is to detect potential problems in a thesaurus that affect the quality of the document annotations.
