Mao S, Kanungo T. Empirical Performance Evaluation Methodology and its Application to Page Segmentation Algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence. 2001 Mar;23(3):242-256.
Mao S, Kim J, Le DX, Thoma GR. Generating Robust Features for Style-Independent Labeling of Bibliographic Fields in Medical Journal Articles. Proc. 7th World Multiconference on Systemics, Cybernetics and Informatics. 2003 July;III:53-6.
Misra D, Thoma GR. Use of descriptive metadata as a knowledgebase for analyzing data in large textual collections. Proc. IS&T Archiving 2013. Washington D.C. pg 193-199.
Misra D, Hall RH, Payne SM, Thoma GR. Digital preservation and knowledge discovery based on documents from an international health science program. Proc. 12th ACM/IEEE-CS JCDL, pg 23-26 (2012). doi: 10.1145/2232817.2232823.
Mrabet Y, Kilicoglu H, Demner-Fushman D. TextFlow: A Text Similarity Measure based on Continuous Sequences. Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, 2017, Vancouver, Canada, July 30 - August 4, Volume
Rae A, Kim J, Le DX, Thoma GR. Main Content Detection in HTML Journal Articles. DocEng ’18: ACM Symposium on Document Engineering 2018, August 28–31, 2018, Halifax, NS, Canada. ACM, New York, NY, USA, 4 pages. https://doi.org/10.1145/3209280.3229115
Simpson M, Ford G, Antani S, Demner-Fushman D, Thoma GR. A Lightweight Statistics Package for Interactive Publications. Poster at 20th NIH Research Festival (TECH-15), September 2007, National Institutes of Health.
Thoma GR, Ford G, Le DX, Li Z. Text Verification in an Automated System for the Extraction of Bibliographic Data. Proc. 5th International Workshop on Document Analysis Systems, Springer-Verlag: Berlin. 2002 Aug:423-32.