Articles

Deep Learning-Based Multimodal Assessment of Cumulative Smoking Exposure

Jinrun DAI, Kazumasa KISHIMOTO, Osamu SUGIYAMA, Masahiro MIYAKE, Hiroshi TAMURA
Vol. 15 (2026) p. 12-20

Reliable assessment of lifetime smoking exposure is essential for clinical risk stratification and public health interventions, yet current self-reported methods are subject to recall bias and misclassification. Here, we developed a new multimodal deep learning model that predicts cumulative smoking exposure from fundus images, clinical features, and manually measured anatomical indicators. Data from a cohort of 8,299 subjects in the Japan Ocular Imaging Registry were pre-processed; for each subject, smoking history was extracted together with 31 anatomical features from fundus photographs and 13 clinical characteristics. The proposed model consists of three branches: (1) an image branch employing EfficientNet-B3 to extract image representations, (2) a clinical branch utilizing a residual multilayer perceptron to process clinical features, and (3) an anatomical branch using a fully connected encoder to obtain anatomical features; the fused outputs of the three branches are then fed into a classification-assisted multi-expert regressor to estimate the Brinkman Index. Our multimodal model achieved significantly better results (MAE = 158.22, R² = 0.34) than unimodal models. The classifier achieved an accuracy of 61% and an F1-score of 0.50 for smoking-exposure intervals. Ablation studies corroborated these observations, confirming that the modalities complement one another and that handcrafted features improve performance. This multimodal, non-invasive, objective framework should be regarded as a methodological exploration demonstrating the technical feasibility of estimating cumulative smoking exposure from fundus images and associated clinical data. Future studies with larger datasets and objective biomarkers are needed to further validate its clinical applicability.
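
For readers interested in the architecture described above, the following is a minimal PyTorch sketch of a three-branch fusion model of this kind. It is not the authors' published implementation: the layer widths, the number of exposure intervals and expert heads, the use of interval probabilities to gate the experts, and the torchvision efficientnet_b3 backbone are all illustrative assumptions.

import torch
import torch.nn as nn
from torchvision.models import efficientnet_b3


class ResidualMLPBlock(nn.Module):
    """Residual block for the clinical branch (assumed design)."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return torch.relu(x + self.net(x))


class MultimodalBrinkmanModel(nn.Module):
    """Three-branch fusion model with a classification-assisted multi-expert regressor."""
    def __init__(self, n_clinical=13, n_anatomical=31, n_intervals=4):
        super().__init__()
        # (1) Image branch: EfficientNet-B3 backbone with its classifier head removed,
        #     yielding a 1536-dimensional embedding per fundus image.
        self.image_encoder = efficientnet_b3(weights=None)
        self.image_encoder.classifier = nn.Identity()
        self.image_proj = nn.Linear(1536, 128)
        # (2) Clinical branch: residual multilayer perceptron over 13 clinical features.
        self.clinical_encoder = nn.Sequential(
            nn.Linear(n_clinical, 64), nn.ReLU(),
            ResidualMLPBlock(64), ResidualMLPBlock(64))
        # (3) Anatomical branch: fully connected encoder over 31 anatomical features.
        self.anatomical_encoder = nn.Sequential(
            nn.Linear(n_anatomical, 64), nn.ReLU(), nn.Linear(64, 64), nn.ReLU())
        fused_dim = 128 + 64 + 64
        # Classification-assisted multi-expert regressor: a classifier predicts the
        # smoking-exposure interval, and its probabilities weight one expert
        # regression head per interval (an assumed reading of the paper's head design).
        self.interval_classifier = nn.Linear(fused_dim, n_intervals)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(fused_dim, 64), nn.ReLU(), nn.Linear(64, 1))
            for _ in range(n_intervals))

    def forward(self, image, clinical, anatomical):
        fused = torch.cat([
            self.image_proj(self.image_encoder(image)),
            self.clinical_encoder(clinical),
            self.anatomical_encoder(anatomical)], dim=1)
        interval_logits = self.interval_classifier(fused)   # auxiliary classification output
        weights = torch.softmax(interval_logits, dim=1)     # per-interval expert weights
        expert_out = torch.stack([e(fused).squeeze(1) for e in self.experts], dim=1)
        brinkman = (weights * expert_out).sum(dim=1)        # estimated Brinkman Index
        return brinkman, interval_logits


# Example forward pass on dummy data (batch of 2 subjects).
model = MultimodalBrinkmanModel()
bi, logits = model(torch.randn(2, 3, 300, 300), torch.randn(2, 13), torch.randn(2, 31))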
