January 23, 2023

Artificial Intelligence for Unstructured Healthcare Data: Application to Coding of Patient Reporting of Adverse Drug Reactions

Louis Létinier, Julien Jouganous, Mehdi Benkebil, Alicia Bel-Létoile, Clément Goehrs, Allison Singier, Franck Rouby, Clémence Lacroix, Ghada Miremont, Joëlle Micallef, Francesco Salvo, Antoine Pariente.

Synapse Medicine

Synapse Medicine announces its first publication on its Medication Shield technology

Discover Synapse Medicine's first publication on the use of artificial intelligence on unstructured health data and its application to coding patient reports of adverse drug reactions. The aim of the study was to develop an automated system for coding adverse drug reactions (ADRs) from patient reports. In this study, you can learn more about our Medication Shield technology and how it identifies ADRs from unstructured data.


"Adverse drug reaction (ADR) reporting is a major component of drug safety monitoring; its input will, however, only be optimized if systems can manage to deal with its tremendous flow of information, based primarily on unstructured text fields. The aim of this study was to develop an automated system allowing to code ADRs from patient reports. Our system was based on a knowledge base about drugs, enriched by supervised machine learning (ML) models trained on patients reporting data. To train our models, we selected all cases of ADRs reported by patients to a French Pharmacovigilance Centre through a national web-portal between March 2017 and March 2019 (n = 2,058 reports). We tested both conventional ML models and deep-learning models. We performed an external validation using a dataset constituted of a random sample of ADRs reported to the Marseille Pharmacovigilance Centre over the same period (n = 187). Here, we show that regarding area under the curve (AUC) and F-measure, the best model to identify ADRs was gradient boosting trees (LGBM), with an AUC of 0.93 (0.92–0.94) and F-measure of 0.72 (0.68–0.75). This model was run for external validation showing an AUC of 0.91 and a F-measure of 0.58. We evaluated an artificial intelligence pipeline that was found able to learn how to identify correctly ADRs from unstructured data. This result allowed us to start a new study using more data to further improve our performance and offer a tool that is useful in practice to efficiently manage drug safety information."
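For readers less familiar with the two metrics quoted in the abstract, the following is a minimal sketch of how an AUC and an F-measure can be computed for a single binary label (for example, "does this report mention a given ADR?"). It is not the study's code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made purely for illustration.

```python
# Illustrative sketch only: toy data and function names are hypothetical,
# not taken from the publication.

def auc(y_true, scores):
    """Area under the ROC curve via pairwise ranking:
    the fraction of (positive, negative) pairs where the positive
    example receives the higher score (ties count as half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f_measure(y_true, y_pred):
    """F1 score: harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Invented example: 1 = the ADR is present in the report, 0 = absent.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]        # hypothetical model scores
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

toy_auc = auc(y_true, scores)
toy_f1 = f_measure(y_true, y_pred)
print(f"AUC = {toy_auc:.2f}, F-measure = {toy_f1:.2f}")
```

The AUC depends only on how the scores rank positive against negative examples, whereas the F-measure depends on the chosen decision threshold, which is one reason the two figures reported for a model can differ noticeably.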

Read the publication