Breast cancer remains the leading cause of cancer morbidity and mortality among women worldwide, with early detection being the most effective determinant of survival. Current diagnostic modalities, including mammography, ultrasound, and histopathology, provide complementary diagnostic insights, yet each is limited in sensitivity, specificity, or reproducibility.
Objective: We aimed to develop and evaluate a multimodal deep learning framework that integrates mammography, ultrasound, and histopathology to improve breast cancer detection, diagnostic robustness, and clinical interpretability.
Methods: We curated 45,236 images from CBIS-DDSM, INbreast, MIAS (mammography), BUSI and BUD (ultrasound), and BreakHis and Camelyon16 (histopathology). Modality-specific preprocessing pipelines included CLAHE enhancement, speckle noise reduction, and stain normalization. Feature extraction employed EfficientNet-B4 and Swin Transformers for radiological images, and multiple instance learning with attention pooling for histopathology. A cross-modal attention fusion network integrated the modality-specific embeddings. Training employed stratified splits, Adam optimization, and early stopping. Model performance was evaluated using accuracy, sensitivity, specificity, precision, F1-score, AUC-ROC, calibration curves, and external validation. Explainability was assessed using Grad-CAM++ and SHAP.
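The idea behind attention-based fusion of modality-specific embeddings can be illustrated with a minimal sketch. This is not the network described above: the function name `attention_fuse`, the single `query` vector (standing in for a learned parameter), and the toy two-dimensional embeddings are all assumptions made for illustration. The sketch scores each modality embedding against the query with a scaled dot product, normalizes the scores with a softmax, and returns the weighted sum.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_fuse(embeddings, query):
    """Fuse per-modality embeddings with scaled dot-product attention.

    embeddings: dict mapping modality name -> list of floats (equal lengths)
    query: list of floats; a hypothetical stand-in for a learned query vector
    Returns the fused vector and the per-modality attention weights.
    """
    dim = len(query)
    names = list(embeddings)
    # One scalar relevance score per modality.
    scores = [
        sum(q * x for q, x in zip(query, embeddings[name])) / math.sqrt(dim)
        for name in names
    ]
    weights = softmax(scores)
    # Weighted sum of the modality embeddings, computed per dimension.
    fused = [
        sum(w * embeddings[name][i] for w, name in zip(weights, names))
        for i in range(dim)
    ]
    return fused, dict(zip(names, weights))


# Toy embeddings (illustrative values only, not data from the study).
embeddings = {
    "mammography": [1.0, 0.0],
    "ultrasound": [0.0, 1.0],
    "histopathology": [1.0, 1.0],
}
query = [1.0, 1.0]
fused, weights = attention_fuse(embeddings, query)
```

In this toy case the histopathology embedding aligns best with the query and therefore receives the largest weight, while the other two modalities receive equal, smaller weights. A trained fusion network would learn the query (and typically key and value projections) from data rather than fixing them by hand.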
Results: Single-modality models achieved AUC-ROC values of 0.89 (mammography), 0.87 (ultrasound), and 0.91 (histopathology). The multimodal fusion framework significantly outperformed all unimodal baselines, achieving an AUC-ROC of 0.96, accuracy of 92.1%, sensitivity of 92.1%, and specificity of 90.7%. External validation on INbreast and BUSI datasets confirmed generalizability, while calibration analysis demonstrated well-calibrated probability estimates. Explainability analyses revealed that model attention aligned with radiologically and pathologically relevant regions, enhancing interpretability and clinical plausibility.
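For readers less familiar with the reported metrics, the following sketch shows how sensitivity, specificity, and AUC-ROC are computed from labels and predicted scores. The helper names, the 0.5 decision threshold, and the six-sample toy data are assumptions for illustration only and are unrelated to the study's data or results. AUC-ROC is computed here through its rank interpretation: the probability that a randomly chosen positive case scores higher than a randomly chosen negative case, with ties counted as one half.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, y_score):
    """AUC-ROC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): 1 = malignant, 0 = benign.
y_true = [1, 1, 1, 0, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = auc_roc(y_true, y_score)
```

Note that sensitivity and specificity depend on the chosen threshold, whereas AUC-ROC summarizes ranking quality across all thresholds, which is why both kinds of metric are usually reported together.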
@article{a13102024ijcatr13101014,
Title = "Multimodal Computer Vision for Breast Cancer Detection: Integrating Mammography, Ultrasound, and Histopathology Data with Advanced Deep Learning",
Journal ="International Journal of Computer Applications Technology and Research (IJCATR)",
Volume = "13",
Issue ="10",
Pages ="149 - 157",
Year = "2024",
Authors ="Ayinoluwa Feranmi Kolawole"}