A growing number of studies have investigated the health consequences of exposure to micro- and nanoplastics (MNPs), but no coherent conclusions have been reached because the studies are difficult to compare and the existing MNP toxicity data are complex and heterogeneous. This work developed a predictive modelling framework for the cytotoxicity of MNPs using classification-based machine learning approaches. A literature search yielded 1824 sample points, each represented by 9 features describing the physicochemical properties of the MNPs, a cell-related attribute and experimental factors. The decision tree ensemble classifier built from all features (DTE1) showed strong predictive ability, with an accuracy of 0.95 and a recall and precision of 0.86 each. Feature selection was subsequently performed to identify the key factors that dominate the toxic properties of MNPs. A simplified classifier developed from 6 influential features demonstrated performance comparable to that of DTE1. This result can help direct future studies toward better experimental design and reporting, facilitating the understanding of pressing MNP-related health issues. With continuous integration of more representative research data, the developed model can be applied to a wide spectrum of MNP cytotoxicity settings.
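As a reading aid for the accuracy, recall and precision quoted above, the following minimal sketch shows how these three metrics follow from the confusion-matrix counts of a binary (toxic versus non-toxic) classifier. The counts are hypothetical and chosen only for illustration; they are not taken from the study, and the names `ConfusionCounts` and `classification_metrics` are assumptions introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Counts for a binary classifier where 'positive' means cytotoxic."""
    tp: int  # toxic samples predicted toxic
    fp: int  # non-toxic samples predicted toxic
    fn: int  # toxic samples predicted non-toxic
    tn: int  # non-toxic samples predicted non-toxic


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return accuracy, precision and recall computed from the counts."""
    total = c.tp + c.fp + c.fn + c.tn
    predicted_pos = c.tp + c.fp
    actual_pos = c.tp + c.fn
    return {
        "accuracy": (c.tp + c.tn) / total,
        # Guard against division by zero when a class is never predicted/present.
        "precision": c.tp / predicted_pos if predicted_pos else 0.0,
        "recall": c.tp / actual_pos if actual_pos else 0.0,
    }


# Hypothetical counts (NOT the paper's data): the negative class dominates,
# so accuracy is higher than precision and recall on the positive class.
counts = ConfusionCounts(tp=6, fp=1, fn=1, tn=32)
metrics = classification_metrics(counts)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

When one class is much more frequent than the other, accuracy can sit well above precision and recall, which is why reporting all three gives a fuller picture than accuracy alone.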
Prediction of the cytotoxicity of micro- and nanoplastics using machine learning combined with literature data mining