The University of Chicago

The University of Chicago Research Funding

‘Programming’ Sherlock Holmes not so elementary

Not many scientists invoke Sherlock Holmes in their grant proposals, but then not everyone thinks like Andrey Rzhetsky.

Rzhetsky finds inspiration in Holmes, who used penetrating powers of reasoning to make the most of diverse and seemingly trivial pieces of evidence that other detectives ignored. With the help of a two-year $1,211,000 National Institutes of Health grant, the co-Principal Investigator proposes to develop automated methods to harvest, synthesize and compare scientific theories and research results, especially minority opinions and unpublished hypotheses.

“The scientific establishment tends to crush new hypotheses that compete with sitting paradigms,” says Rzhetsky, PhD, Professor, Committee on Genetics, Genomics & Systems Biology, Human Genetics, Institute for Genomics & Systems Biology, Computation Institute. “Making alternative hypotheses visible, accessible and computable could generate unexpected results, such as aspartame may predispose to brain tumors, or chronic cold sores may lower cognitive functions.”

The grant has allowed Rzhetsky to hire several students, one postdoctoral research associate and consultants. The multidisciplinary work will proceed on four overlapping fronts.

  • Search: Gather, archive and structure published and unpublished information that samples the diversity of concepts and observations within a scientific community.
  • Synthesize: Design tools for identifying sets of internally consistent statements from which to generate novel propositions and formalize competing hypotheses.
  • Compute: Develop a computerized infrastructure to store and analyze the new body of information.
  • Disseminate: Share the information dynamically through websites, online brokerages, blogs and other channels.

“We’ll enlist an army of scientists and computers to break the silence in which all but the dominant hypotheses, observations and evidence languish, thereby revolutionizing the scientific enterprise and significantly speeding up the pace of research and discovery,” Rzhetsky says.

One important part of the research will be harnessing the power of supercomputers to read and make sense of large quantities of data, says Ian Foster, PhD, Director, Computation Institute; Arthur Holly Compton Distinguished Service Professor, Computer Science and Argonne Distinguished Fellow, Argonne National Laboratory. Foster and James Evans, PhD, Assistant Professor, Sociology, are co-PIs.

“Exponential growth in the quantity of human knowledge makes it increasingly difficult—perhaps impossible—for any individual to form a clear picture of what is known, and with what degree of certainty, in even the narrowest subfields,” Foster says. “This knowledge deluge can be a significant barrier to progress, as researchers fail to infer connections, detect inconsistencies, identify opportunities.”

The massive project will begin with autism and breast cancer, about which more than 15,000 and 200,000 research articles have been published, respectively. Regarding autism, Rzhetsky hopes the new approach will answer questions such as: What environmental theories and factors should researchers consider as possible causes of autism? Is autism more likely to be caused by rare or common genetic variation? Do other illnesses occur in parallel in autistic patients or their families?

“If our project is successful, every scientific community could be affected,” Rzhetsky says. “Even partial success would expose scientists to fresh ideas and improve the way they evaluate evidence, and form and test hypotheses. As such, the potential benefits are global in scope.”

by Greg Borzo

This award is funded under the American Recovery and Reinvestment Act of 2009, NIH Award number: 1R01LM010132-01. For more information on NIH’s Recovery Act projects,
