Mao S, Kanungo T. Empirical Performance Evaluation Methodology and its Application to Page Segmentation Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2001 Mar;23(3):242-256.
Mao S, Kim J, Le DX, Thoma GR. Generating Robust Features for Style-Independent Labeling of Bibliographic Fields in Medical Journal Articles. Proc. 7th World Multiconference on Systemics, Cybernetics and Informatics. 2003 July;III:53-6.
Misra D, Thoma GR. Use of descriptive metadata as a knowledgebase for analyzing data in large textual collections. Proc. IS&T Archiving 2013. Washington D.C. pg 193-199.
Misra D, Hall RH, Payne SM, Thoma GR. Digital preservation and knowledge discovery based on documents from an international health science program. Proc. 12th ACM/IEEE-CS JCDL, pg 23-26 (2012). doi: 10.1145/2232817.2232823.
Mrabet Y, Kilicoglu H, Demner-Fushman D. TextFlow: A Text Similarity Measure based on Continuous Sequences. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017, Vancouver, Canada, July 30 - August 4, Volume
Rae A, Kim J, Le DX, Thoma GR. Main Content Detection in HTML Journal Articles. DocEng ’18: ACM Symposium on Document Engineering 2018, August 28–31, 2018, Halifax, NS, Canada. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3209280.3229115
Simpson M, Ford G, Antani S, Demner-Fushman D, Thoma GR. A Lightweight Statistics Package for Interactive Publications. Poster at 20th NIH Research Festival (TECH-15), September 2007, National Institutes of Health.
Szolovits P, Aberdeen J, Meystre S, Kayaalp M. Panel on: State of the Art of Clinical Narrative Report De-Identification and Its Future [Poster]. Proceedings of the Annual American Medical Informatics Association Fall Symposium: 240–242.