Experts from the Center for Diagnostics and Telemedicine have developed a platform for self-testing of services that are based on artificial intelligence and designed for medical tasks, such as analyzing diagnostic images. The first working prototype of the platform is hosted on the popular GitHub service, and developers from all over the world can take part in its improvement by adding verification criteria depending on the purpose of the services. Sergey Morozov, CEO of the Center for Diagnostics and Telemedicine, spoke about this at the thematic week dedicated to artificial intelligence, which was part of the program of the European Congress of Radiology (ECR 2020).
Before a service based on artificial intelligence (AI) is introduced into routine clinical practice, it must be tested for technical readiness and checked against its stated characteristics. This is called analytical validation of the algorithm. Services that pass it may be integrated into medical systems, including city healthcare systems.
Integration is a complex and expensive process, so it becomes a barrier for many teams that cannot guarantee that their algorithm will process data from the target system with the required accuracy and speed. Currently, analytical validation is performed manually. Manual validation allows accidental or deliberate deviations from the approved test program, as well as manipulation of datasets, and can put different test participants in unequal conditions.
To solve these problems, automate the verification process, and ensure user trust, specialists of the Center for Diagnostics and Telemedicine have developed a platform that allows developers of AI-based services to independently conduct preliminary tests (analytical validation) of their algorithms. A prototype of the platform is hosted on GitHub, and the first version of the service for exchanging datasets and data analysis results has already been uploaded.
The platform allows unlimited access to individual data samples from the test set so that developers can fine-tune their algorithms. It has uniform rules of use, and several services can be tested simultaneously. At the same time, the platform records the time that the software spends on data processing (a time study), and the developers receive an automatic report on the results of testing, explains Morozov.
Automating the entire process on the self-testing platform minimizes the human factor, which makes data manipulation (to improve results) impossible. In addition, the comparison of the service’s verification results with the reference data is fully transparent: the developer can see what metrics were used and how the final result reflected in the report was calculated.
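The combination of timing and comparison against reference data can be pictured with a short Python sketch. It is an illustration only, not the Center's implementation: the function name self_test, the binary-label format, the chosen metrics (sensitivity, specificity, accuracy), and the report fields are all assumptions made for this example.

```python
import time
from typing import Callable, Dict, Sequence


def self_test(service: Callable[[object], int],
              samples: Sequence[object],
              reference: Sequence[int]) -> Dict[str, float]:
    """Run a hypothetical binary-classification service on a test set,
    time each call, and compare its outputs with reference labels (0 or 1).
    """
    if not samples or len(samples) != len(reference):
        raise ValueError("samples and reference must be non-empty and equal in length")

    tp = tn = fp = fn = 0
    total_seconds = 0.0

    for sample, truth in zip(samples, reference):
        # Time only the service call itself (the "time study" part).
        start = time.perf_counter()
        predicted = service(sample)
        total_seconds += time.perf_counter() - start

        # Tally the confusion matrix against the reference label.
        if predicted == 1 and truth == 1:
            tp += 1
        elif predicted == 0 and truth == 0:
            tn += 1
        elif predicted == 1 and truth == 0:
            fp += 1
        else:
            fn += 1

    n = len(samples)
    nan = float("nan")
    # Report every metric together with the counts it was computed from,
    # so the developer can trace how each figure was obtained.
    return {
        "tp": tp, "tn": tn, "fp": fp, "fn": fn,
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
        "accuracy": (tp + tn) / n,
        "mean_seconds_per_sample": total_seconds / n,
    }


# Usage with a toy "service" that thresholds a score (purely illustrative).
toy_service = lambda score: int(score >= 0.5)
report = self_test(toy_service, [0.9, 0.2, 0.7, 0.4], [1, 0, 0, 1])
print(report)
```

Because the report includes the raw counts alongside the derived metrics, anyone reading it can recompute each figure by hand, which is the kind of transparency the platform's automatic report is described as providing.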
Anyone can take part in improving the platform and add the necessary metrics to it, which will be used to evaluate the algorithm’s performance for certain medical purposes (for example, for analyzing radiographs or mammograms). However, additions to the platform will be monitored. Only metrics that have scientific justification will be included in the platform operated by the Center, notes Nikolai Pavlov, the developer of the platform, Head of Dataset Labeling Conveyor of the Medical Informatics, Radiomics, and Radiogenomics Sector, Center for Diagnostics and Telemedicine.
The creators of the platform invite developers of AI algorithms, programmers, and researchers to take part in updating and improving the platform in order to give the international community a uniform, universal, and user-friendly tool for self-testing of artificial intelligence algorithms intended for medical purposes. At the moment, there is no such tool aimed specifically at the clinical implementation of services based on AI technologies.
Featured image: The first working prototype of the platform is hosted on the popular GitHub service, and developers from all over the world can take part in its improvement by adding verification criteria depending on the purpose of the services. (Credit: Center for Diagnostics and Telemedicine)