
Random Forest-Based Blood Oxygen Saturation Estimation System Using a Web Camera

Buyanbat AYURZANA, Morio IWAI, Otgonbayar BATAA, Koichiro KOBAYASHI
Vol. 15 (2026) p. 254-264

Blood oxygen saturation (SpO2) is one of the most important vital signs. The conventional measurement method is contact-based photoplethysmography (PPG), which includes finger pulse oximetry. Although PPG is widely used to measure vital signs, it can be uncomfortable for people with sensitive skin, such as infants and critically ill individuals. Recent research has sought to improve noncontact SpO2 estimation from facial videos by enhancing remote photoplethysmography (r-PPG) signals so that significant information can be extracted. However, r-PPG signals are often degraded by lighting, motion, and skin tone variability. This study aimed to enhance the quality of the r-PPG signal obtained from facial videos, using feature selection to reduce complexity and overfitting and applying random forest (RF)-based methods to model the r-PPG signals. Furthermore, we developed an innovative machine learning-based method for estimating SpO2 levels using a web camera to capture r-PPG signals from defined facial videos in a controlled laboratory environment. First, an AI-driven framework for face mesh detection and a region-of-interest (ROI) tracker algorithm were used to enhance r-PPG signal quality and reduce the noise caused by ambient light and subject motion. Vibration in the face detection and ROI tracker outputs was smoothed using a Kalman filter. The resulting time-series data were processed using signal filtering. Thereafter, the RF algorithm, a supervised machine learning method, was used to predict SpO2 values from these components. For the experiment, red-green-blue (RGB) facial video data were collected from 11 subjects with different skin tones and genders to train the RF algorithm, using a smaller dataset than those utilized in other studies. The results showed a mean square error of 0.95%, a root mean square error of 0.98%, a mean absolute error of 0.77%, and a Pearson’s correlation coefficient of 0.80. Despite the small training dataset, the proposed method demonstrated notable feasibility. The RGB color values from facial videos were potentially useful for accurate estimation of SpO2 levels in a stable environment. The proposed noncontact method is a promising alternative to traditional pulse oximetry and has potential applications in clinical settings, particularly in remote patient monitoring, critical care monitoring, early disease detection, and telemedicine.
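
For readers unfamiliar with the four reported error measures, the following minimal Python sketch shows how they are conventionally computed from paired reference and estimated SpO2 readings. It is an illustration only and not the authors' evaluation code: the function name regression_metrics and the sample readings are hypothetical, and the standard textbook definitions are assumed.

```python
import math


def regression_metrics(reference, estimated):
    """Return MSE, RMSE, MAE and Pearson's r for paired readings.

    Hypothetical helper for illustration; uses the standard definitions.
    """
    if len(reference) != len(estimated) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    n = len(reference)

    # Per-sample estimation error (estimated minus reference).
    errors = [e - r for r, e in zip(reference, estimated)]
    mse = sum(err * err for err in errors) / n
    rmse = math.sqrt(mse)  # RMSE is the square root of MSE
    mae = sum(abs(err) for err in errors) / n

    # Pearson's correlation coefficient between the two series.
    mean_ref = sum(reference) / n
    mean_est = sum(estimated) / n
    cov = sum((r - mean_ref) * (e - mean_est) for r, e in zip(reference, estimated))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_est = sum((e - mean_est) ** 2 for e in estimated)
    if var_ref == 0 or var_est == 0:
        raise ValueError("Pearson's r is undefined for constant input")
    pearson_r = cov / math.sqrt(var_ref * var_est)

    return {"mse": mse, "rmse": rmse, "mae": mae, "pearson_r": pearson_r}


# Made-up SpO2 readings in percent, NOT data from the study.
reference_spo2 = [97.0, 98.0, 96.0, 99.0, 95.0]
estimated_spo2 = [96.5, 98.5, 97.0, 98.0, 95.5]
metrics = regression_metrics(reference_spo2, estimated_spo2)
print(metrics)
```

Because RMSE is the square root of MSE, the two reported values (0.95 and 0.98) are consistent with each other up to rounding.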
