NIH’s Big Data to Knowledge (BD2K) initiative


This is long-awaited news from NIH: the announcement of the first round of BD2K awardees (a total of $32M).

These NIH multi-institute awards constitute an initial investment of nearly $32 million in fiscal year 2014 by NIH’s Big Data to Knowledge (BD2K) initiative, which is projected to have a total investment of nearly $656 million through 2020, pending available funds.

I browsed through the awards; they are divided into four categories. The largest grant, $3M for FY 2014, goes to BioCADDIE, the Biomedical and Healthcare Data Discovery and Indexing Engine Center (by the way, for those who have never golfed like the rich, a caddie is the person who carries a golfer’s clubs and provides other assistance during a match). The other grants in that category are much smaller. One that caught my eye concerns immunosequencing: Computational Tools for the Analysis of High-Throughput Immunoglobulin Sequencing.

Eleven grants are for the Centers of Excellence for Big Data Computing. Among those is the Center of Excellence for Mobile Sensor Data to Knowledge. My favorite is the Center for Big Data in Translational Genomics, awarded to UCSC’s David Haussler (PI), developer of the UCSC Genome Browser, one of the most popular tools in genomics.

This one, A Framework for Integrating Multiple Data Sources for Modeling and Forecasting of Infectious Diseases, sounds very timely given the current Ebola outbreak, but it has a ridiculously small budget of $100K (FY 2014), which would basically fund the training of just one PI (this is a K01 grant).

Another group of grants concerns education. Two, for Big Data educational efforts and the development of bioinformatics MOOCs, went to UCSD. Other MOOC awardees include Johns Hopkins University, for developing courses in neuroimaging and genomics; Harvard University, for a modular online education program that brings together concepts from statistics, computer science and software engineering; and initiatives from the Oregon Health and Science University, UCLA, and others. MOOCs are becoming big business, and soon there will be courses in various flavors.


Stanford receives $3M for Big Data in biomedicine

The Li Ka Shing Foundation gave a grant to the Stanford University School of Medicine to boost its Big Data in Biomedicine initiative, run in collaboration with the University of Oxford in England. The money will go toward recruiting new faculty, establishing new educational programs, and holding a major conference on big data at Stanford in May 2014.

“In the world of medicine, we have a tsunami of data crashing over us, including electronic patient records, DNA sequencing, biological data on disease mechanisms, clinical trials, medical imaging and pharmaceutical records. We can put all these large data sets to work to identify innovative approaches to treatment and to improving access to care,” said Lloyd Minor, MD, dean of the School of Medicine.

The University of Oxford faculty members are leaders of one of the largest patient databanks in the world, the UK Biobank, which has biomedical information on some 500,000 individuals. (About mining patient databanks for developing new medicines, see this post.)

The Stanford arm of the effort will be directed by Euan Ashley, MD, PhD, whose research focuses on developing methods for interpreting genome-sequencing data to improve diagnosis of genetic disease and to develop targeted therapies for patients.

National Cancer Institute is to create a Cancer Genomics Cloud

Government Health IT reports on the announcement by NIH’s National Cancer Institute that it will set up cloud infrastructure as a step toward a full Cancer Knowledge Commons to enable “democratic access to NCI-generated genomic data”:

“Today, researchers often have to mine genomics data from various sources by locating and downloading it — such as the Cancer Genome Atlas, the Cancer Genomics Hub and the International Cancer Genome Consortium — then add their own data and use it all on local hardware.

“This model has been successful for many years,” NCI officials wrote, “but is becoming untenable given the enormous growth of biomedical data since the advent of large-scale scientific programs such as the Cancer Genome Atlas,” which on its own this year is set to generate some 2.5 petabytes, half a petabyte less than the Library of Congress’ digital collection.”

This year, $20M will be spent on awards to contractors to deliver a cloud pilot.