Citizen scientists can now create accurate protein models using a computer game co-developed by Professor Firas Khatib
The best microscopes in the world often fail to produce crisp images of proteins. This is a frustration to the scientists who rely on them and a major roadblock to global efforts to design new drugs.
Now, volunteers playing the online puzzle game Foldit have shown that they can use fuzzy cryo-electron microscopy datasets to create accurate three-dimensional models of proteins. This work appears this week in the journal PLOS Biology.
“Cryo-electron microscopy is the new hot thing in biochemistry that's making it possible to look at biomolecules that we could never visualize before, but that comes with new challenges,” said senior author Scott Horowitz, assistant professor of chemistry and biochemistry and the Knoebel Institute for Healthy Aging at the University of Denver. Chief among these is making sense of limited data. “Foldit players appear to very much be up to the challenge of working with the data and turning it into biological models that we can interpret.”
Constructing accurate protein models from cryo-electron microscopy (cryo-EM) images is widely regarded as difficult, even among trained experts. In the vast majority of cases, such datasets do not contain enough information to specify the exact location of each atom in a protein, and thus permit many different protein models to be built. This can introduce false models into circulation, with negative ramifications for drug design, clinical trials, and more.
A team of researchers from the University of Massachusetts Dartmouth, the University of Washington, Northeastern University, the University of Denver, and University Grenoble Alpes in France together challenged Foldit players to create protein models from fuzzy cryo-EM data.
Foldit was developed in 2008 as a way to ‘gamify’ protein research. It gives players an intuitive video game interface to visualize and interact with proteins, as well as scientific software that scores how realistic a given protein model appears to be based on its exact physical and chemical properties. Players have previously used Foldit to solve the crystal structure of an HIV-related enzyme and to successfully design synthetic proteins.
As a proof of concept for this work, the researchers supplied gamers with a cryo-EM dataset of a toxin-delivering protein from the soil bacterium Serratia entomophila. The dataset was separated into four parts, each with a resolution of 3.2 Å, which is not enough to definitively place each atom. The players had to rely on the experimental data and their in-game scores to guide their model-building. Realistic models that fit entirely within the cryo-EM dataset scored higher.
“These were the first Foldit puzzles to use cryo-EM, and the players outperformed experts and automated methods on each and every one of them,” said first author Firas Khatib, assistant professor of computer and information science at the University of Massachusetts Dartmouth.
The players’ final models were far better than those made algorithmically, and as good as or better than those made by trained microscopists who had been given the same datasets to work with and who built their models by hand.
“Speed is important because scientists — like most people — are impatient,” said Horowitz. “This is especially true when you have a lot of great data and you want to interpret it and find out what it all means.”
How did untrained volunteers outperform state-of-the-art algorithms? “There are still a few things that humans are better at than computers, and spatial reasoning is one of them,” said Khatib. “However, this is not an example of humans simply outperforming computers, because the Foldit players are using many different computational tasks within the game. It turns out humans and computers are most successful when they work together.”