A team led by Jim Collins at MIT has developed BioAutoMATED, an open-source automated machine-learning platform that simplifies building machine-learning models for biological sequences such as DNA and proteins, cutting the time required from months to hours. The tool aims to lower barriers for biology-centric labs, and its development was funded by agencies including DARPA and the National Institutes of Health.
BioAutoMATED, an open-source automated machine-learning platform, has been developed to democratize artificial intelligence for research laboratories. This tool, described in a paper published on June 21, 2023, in Cell Systems, was created by a team led by Jim Collins, the Termeer Professor of Medical Engineering and Science at MIT, and life sciences faculty lead at the Abdul Latif Jameel Clinic for Machine Learning in Health (Jameel Clinic). The team includes PhD student Jacqueline Valeri and postdoctoral researcher Luis Soenksen.
BioAutoMATED simplifies the process of building machine-learning models by automating data preprocessing and model selection, cutting the time required from months to hours. Conventional AutoML tools are mainly used for image and text recognition; BioAutoMATED extends the approach to biological sequences such as DNA and proteins.
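To make the two automated steps concrete, the following is a minimal sketch in plain Python, not BioAutoMATED's actual code or API. It assumes that preprocessing means padding DNA sequences to a fixed length and one-hot encoding them, and that model selection means picking the candidate with the best cross-validated accuracy. The function names, the two toy candidate models, and the example sequences are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List, Tuple

ALPHABET = "ACGT"  # assumption: DNA input; proteins would need a 20-letter alphabet

Vector = List[int]
Predictor = Callable[[Vector], int]
Fitter = Callable[[List[Vector], List[int]], Predictor]


def one_hot_encode(seq: str, length: int) -> Vector:
    """Pad or truncate a sequence to `length`, then flatten its one-hot encoding."""
    seq = seq.upper()[:length].ljust(length, "N")  # 'N' (padding) encodes as all zeros
    encoded: Vector = []
    for base in seq:
        encoded.extend(1 if base == letter else 0 for letter in ALPHABET)
    return encoded


def fit_majority(X: List[Vector], y: List[int]) -> Predictor:
    """Baseline candidate: always predict the most frequent training label."""
    most_common = max(sorted(set(y)), key=y.count)
    return lambda x: most_common


def fit_nearest_neighbor(X: List[Vector], y: List[int]) -> Predictor:
    """Candidate that predicts the label of the closest training sequence."""
    def predict(x: Vector) -> int:
        distances = [sum(a != b for a, b in zip(x, row)) for row in X]
        return y[distances.index(min(distances))]
    return predict


def cross_val_accuracy(fit: Fitter, X: List[Vector], y: List[int], folds: int = 3) -> float:
    """Average held-out accuracy over `folds` interleaved splits."""
    correct = 0
    for k in range(folds):
        train = [i for i in range(len(X)) if i % folds != k]
        test = [i for i in range(len(X)) if i % folds == k]
        model = fit([X[i] for i in train], [y[i] for i in train])
        correct += sum(model(X[i]) == y[i] for i in test)
    return correct / len(X)


def select_model(candidates: Dict[str, Fitter], X: List[Vector], y: List[int]) -> Tuple[str, Dict[str, float]]:
    """Score every candidate and return the name of the best one plus all scores."""
    scores = {name: cross_val_accuracy(fit, X, y) for name, fit in candidates.items()}
    return max(scores, key=lambda name: scores[name]), scores


# Toy binary-classification data (hypothetical): A-rich versus G-rich sequences.
sequences = ["AAAAAA", "AAAATA", "AAACAA", "GGGGGG", "GGGTGG", "GGGGCG"]
labels = [0, 0, 0, 1, 1, 1]

features = [one_hot_encode(s, length=6) for s in sequences]
best, scores = select_model(
    {"majority": fit_majority, "nearest_neighbor": fit_nearest_neighbor},
    features,
    labels,
)
print(best, scores)
```

A real AutoML system would search over far richer model families and hyperparameters, but the overall shape is the same: encode the sequences once, score each candidate on held-out data, and keep the best.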
The tool supports binary classification, multi-class classification, and regression tasks, and it accommodates complex datasets and neural network models. It is designed to lower the barriers for biology-centric labs, letting them run initial experiments without investing heavily in digital infrastructure or machine-learning expertise.
The platform’s open-source code is available for researchers to use and improve. Funders of the work include the Defense Threat Reduction Agency, DARPA, and the National Institutes of Health.