Label the many with a few: Semi-automatic medical image modality discovery in a large image collection.
In this paper we present a fast and effective method for labeling images in a large image collection. Image modality detection has been of research interest for querying multimodal medical documents. Accurately predicting the different image modalities from complex visual and textual features requires advanced classification schemes with supervised learning mechanisms and accurate training labels. Our proposed method, in contrast, uses a multiview approach with minimal expert knowledge to semi-automatically label the images. All the images are projected into different feature spaces, which are then clustered in an unsupervised manner. Each cluster representative is mapped back to the image space and labeled by an expert. The other images in each cluster "inherit" the label of their cluster representative. The final label is assigned to each image through a voting mechanism, with each vote providing a different opinion about the same image. Our experiments showed that labeling only 0.3% of the images was sufficient to annotate 300,000 medical images with 49.95% accuracy. Although automatic labeling is not as precise as manual labeling, it saves approximately 700 hours of manual expert labeling. We find that, for this collection, accuracy improvements are feasible with better disparate feature selection or different filtering mechanisms.
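The pipeline sketched in the abstract (cluster each feature view, expert-label one representative per cluster, propagate the label within the cluster, then vote across views) can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, k-means clustering, centroid-nearest representative selection, and majority voting are assumptions standing in for whatever clustering, representative selection, and feature spaces the authors actually used.

```python
# Hypothetical sketch of the multiview semi-automatic labeling pipeline.
# Assumptions (not from the paper): k-means clustering per view, the
# representative is the image nearest the cluster centroid, and the
# final label is a majority vote over the per-view inherited labels.
import numpy as np
from collections import Counter
from sklearn.cluster import KMeans

def label_by_views(views, expert_label, n_clusters=3, seed=0):
    """views: list of (n_images, d_v) arrays, one feature space per view.
    expert_label: callable mapping an image index to its expert label."""
    n = views[0].shape[0]
    votes = [[] for _ in range(n)]          # one opinion per view per image
    for X in views:
        km = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit(X)
        for c in range(n_clusters):
            members = np.where(km.labels_ == c)[0]
            # representative = cluster member closest to the centroid
            rep = members[np.argmin(
                np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1))]
            lab = expert_label(rep)         # only the representative is labeled
            for i in members:               # the rest "inherit" that label
                votes[i].append(lab)
    # final label per image: majority vote across the views
    return [Counter(v).most_common(1)[0][0] for v in votes]
```

With well-separated data, labeling only one image per cluster per view (here 6 expert labels for 90 images, roughly the sparse-supervision regime the abstract describes) recovers most image labels; the per-view clusterings act as independent "opinions" that the vote reconciles.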