Adapting A Monolingual Consumer Health System for Cross-Language Information Retrieval.
Rosemblat G, Tse T, Gemoets D
Advances in Knowledge Organization, Proceedings of the Eighth International ISKO Conference. 2004 July;9:315-321.
Abstract:
This preliminary study applies a bilingual term list (BTL) approach to cross-language information retrieval (CLIR) in the consumer health domain and compares it to a machine translation (MT) approach. We compiled a Spanish-English BTL of 34,980 medical and general terms. We collected a training set of 466 general health queries from MedlinePlus en español and 488 domain-specific queries from ClinicalTrials.gov translated into Spanish. We submitted the training set queries in English against a test bed of 7,170 ClinicalTrials.gov English documents, and compared MT and BTL against this English monolingual standard. The BTL approach was less effective (F = 0.420) than the MT approach (F = 0.578). A failure analysis of the results led to substitution of BTL dictionary sources and the addition of rudimentary normalization of plural forms. These changes improved the CLIR effectiveness of the same training set queries (F = 0.474), and yielded comparable results for a new test set of 954 queries (F = 0.484). These results will shape our efforts to support Spanish speakers' needs for consumer health information currently available only in English.
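The BTL approach with rudimentary plural normalization can be illustrated with a minimal sketch. This is not the authors' implementation: the tiny term list, the function names (`candidates`, `lookup`, `translate_query`), the plural-stripping rules, the longest-match strategy, and the choice to pass unknown words through unchanged are all assumptions made for illustration.

```python
import re

# Hypothetical toy term list; the BTL in the study had 34,980 entries.
BTL = {
    "dolor de cabeza": "headache",
    "diabetes": "diabetes",
    "corazón": "heart",
    "enfermedad": "disease",
    "niño": "child",
}

def candidates(word):
    """Yield the word itself, then crude singular forms (assumed rules)."""
    yield word
    if word.endswith("es") and len(word) > 3:
        yield word[:-2]  # e.g. enfermedades -> enfermedad
    if word.endswith("s") and len(word) > 2:
        yield word[:-1]  # e.g. niños -> niño

def lookup(tokens):
    """Return the English term for a token span, or None if absent."""
    phrase = " ".join(tokens)
    if phrase in BTL:
        return BTL[phrase]
    # Plural normalization is applied to single words only in this sketch.
    if len(tokens) == 1:
        for form in candidates(tokens[0]):
            if form in BTL:
                return BTL[form]
    return None

def translate_query(query, max_len=3):
    """Translate a Spanish query by greedy longest-match term lookup."""
    tokens = re.findall(r"\w+", query.lower())
    out = []
    i = 0
    while i < len(tokens):
        # Try the longest span first, then progressively shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            hit = lookup(tokens[i:i + n])
            if hit is not None:
                out.append(hit)
                i += n
                break
        else:
            # No entry found: keep the source word unchanged.
            out.append(tokens[i])
            i += 1
    return " ".join(out)

print(translate_query("dolor de cabeza en niños"))
```

In this sketch the exact form is tried before any stripped form, so a word such as "diabetes" that merely looks plural is still matched as-is. The translated string would then be submitted to the English monolingual retrieval system.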