We Might Not Know Half of What’s in Our Cells, New AI Discovery Suggests

Most human diseases can be traced to malfunctioning parts of a cell — a tumor is able to grow because a gene wasn’t accurately translated into a particular protein or a metabolic disease arises because mitochondria aren’t firing properly, for example. But to understand what parts of a cell can go wrong in a disease, scientists first need to have a complete list of parts.

By combining microscopy, biochemistry techniques and artificial intelligence, researchers at University of California San Diego School of Medicine and collaborators have taken what they think may prove to be a significant leap forward in the understanding of human cells.

“Scientists have long realized there’s more that we don’t know than we know, but now we finally have a way to look deeper,” says computer scientist and network biologist Trey Ideker of the University of California (UC) San Diego.

Microscopes, powerful as they are, allow scientists to peer inside single cells, down to the level of organelles such as mitochondria, the power packs of cells, and ribosomes, the protein factories. We can even add fluorescent dyes to easily tag and track proteins.

Biochemistry techniques can go deeper still, homing in on single proteins by using, for example, targeted antibodies that bind the protein and pull it out of the cell so that scientists can see what else is attached to it.

Integrating those two approaches is a challenge for cell biologists.

“How do you bridge that gap from nanometer to micron-scale? That has long been a big hurdle in the biological sciences,” explains Ideker.

“Turns out you can do it with artificial intelligence – looking at data from multiple sources and asking the system to assemble it into a model of a cell.”

The result: Ideker and colleagues have turned textbook maps of globular cells, which give us a bird’s-eye view of candy-colored organelles, into an intricate web of protein-protein interactions, organized by the tiny distances between them.

Fusing image data from a library called the Human Protein Atlas and existing maps of protein interactions, the machine learning algorithm was tasked with computing the distances between protein pairs.

The goal was to identify communities of proteins, called assemblies, that co-exist in cells at different scales, from the very small (less than 50 nm) to the very ‘large’ (more than 1 μm).
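To make the idea of scale-dependent communities concrete, here is a minimal sketch in Python. It is not the researchers' algorithm: the protein names, the pairwise distances and the simple "link any pair closer than a cutoff" rule are all assumptions made up for illustration. It only shows how the same set of distances can yield small, tight groups at a 50 nm cutoff and larger, merged groups at a 1 μm cutoff.

```python
# Hypothetical pairwise distances in nanometers (made-up values, not study data).
DISTANCES_NM = {
    ("A", "B"): 20,
    ("B", "C"): 35,
    ("C", "D"): 400,
    ("D", "E"): 30,
    ("E", "F"): 2000,
}


def communities(distances, threshold_nm):
    """Group proteins linked by a chain of pairs closer than threshold_nm."""
    parent = {}

    def find(x):
        # Union-find lookup with path compression.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for (a, b), dist in distances.items():
        root_a, root_b = find(a), find(b)  # registers both proteins
        if dist < threshold_nm and root_a != root_b:
            parent[root_a] = root_b  # merge the two groups

    groups = {}
    for protein in list(parent):
        groups.setdefault(find(protein), set()).add(protein)
    return sorted(sorted(group) for group in groups.values())


if __name__ == "__main__":
    for cutoff in (50, 1000, 5000):
        print(cutoff, "nm:", communities(DISTANCES_NM, cutoff))
```

With these toy numbers, the 50 nm cutoff gives three groups (A, B and C; D and E; F alone), the 1,000 nm cutoff merges the first two, and a 5,000 nm cutoff places all six proteins in a single group.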

The algorithm classified 69 protein communities. It was trained using a reference library of proteins with known or estimated diameters, and its results were validated with further experiments.

Around half of the protein components identified are seemingly unknown to science, never documented in the published literature, the researchers suggest.

In the mix was one group of proteins forming an unfamiliar structure, which the researchers worked out is likely responsible for splicing and dicing newly made transcripts of the genetic code that are used to make proteins.

Other proteins mapped included transmembrane transport systems that pump supplies into and out of cells, families of proteins that help organize bulky chromosomes, and protein complexes whose job it is to make, well, more proteins.

Hefty as this effort is, it’s not the first time that scientists have tried to map the inner workings of human cells.

Other efforts to create reference maps of protein interactions have yielded similarly mind-boggling numbers and attempted to measure protein levels across tissues of the human body.

Researchers have also developed techniques for visualizing and tracking the interaction and movement of proteins in cells.

This pilot study goes a step further by applying machine learning to two kinds of data: cellular microscopy images, which locate proteins relative to large cellular landmarks such as the nucleus, and protein interaction studies, which identify a protein’s nearest nano-scale neighbors.

“The combination of these technologies is unique and powerful because it’s the first time measurements at vastly different scales have been brought together,” said study first author Yue Qin, a Bioinformatics and Systems Biology graduate student in Ideker’s lab.

Microscopes allow scientists to see down to the level of a single micron, about the size of some organelles, such as mitochondria. Smaller elements, such as individual proteins and protein complexes, can’t be seen through a microscope. Biochemistry techniques, which start with a single protein, allow scientists to get down to the nanometer scale. (A nanometer is one-billionth of a meter, or one-thousandth of a micron.)

To be clear, this research is very preliminary: the team focused on validating their method and only looked at the available data from 661 proteins in one cell type, a kidney cell line which scientists have been culturing in the lab for going on five decades.

The researchers plan to apply their newfangled technique to other cell types, says Ideker.