NIH’s Big Data to Knowledge (BD2K) initiative


This is long-awaited news from NIH: the first round of BD2K awardees has been announced (about $32M in total).

These NIH multi-institute awards constitute an initial investment of nearly $32 million in fiscal year 2014 by NIH’s Big Data to Knowledge (BD2K) initiative, which is projected to have a total investment of nearly $656 million through 2020, pending available funds.

I browsed through the awards; they are divided into four categories. The largest grant, $3M for FY 2014, goes to BioCADDIE, the Biomedical and Healthcare Data Discovery and Indexing Engine Center (btw, for those who have never played golf like the rich do, a caddie is the person who carries a golfer’s clubs and provides other assistance during a match). The other grants in that category are much smaller. One that caught my eye is about immunosequencing: Computational Tools for the Analysis of High-Throughput Immunoglobulin Sequencing.

Eleven grants are for the Centers of Excellence for Big Data Computing. Among those is the Center of Excellence for Mobile Sensor Data to Knowledge. My favorite is the Center for Big Data in Translational Genomics, awarded to UCSC with PI David Haussler, developer of the UCSC Genome Browser, one of the most popular tools in genomics.

This one, A Framework for Integrating Multiple Data Sources for Modeling and Forecasting of Infectious Diseases, sounds very timely, considering the current Ebola outbreak, but it has a ridiculously small budget of $100K (FY 2014), which would basically fund the training of just one PI (this is a K01 grant).

Another group of grants concerns education. Two grants, one for Big Data educational efforts and one for the development of bioinformatics MOOCs, went to UCSD. Among other awardees for MOOCs are Johns Hopkins University, for developing courses in neuroimaging and genomics; Harvard University, for a modular online education program that brings together concepts from statistics, computer science and software engineering; and initiatives from the Oregon Health and Science University, UCLA, and others. MOOCs are becoming big business, and soon there will be courses in various flavors.

National Cancer Institute is to create a Cancer Genomics Cloud

Government Health IT reports on the announcement by NIH’s National Cancer Institute that it will set up cloud infrastructure as a step toward establishing a full Cancer Knowledge Commons, to enable “democratic access to NCI-generated genomic data.”

“Today, researchers often have to mine genomics data from various sources by locating and downloading it — such as the Cancer Genome Atlas, the Cancer Genomics Hub and the International Cancer Genome Consortium — then add their own data and use it all on local hardware.

“‘This model has been successful for many years,’ NCI officials wrote, ‘but is becoming untenable given the enormous growth of biomedical data since the advent of large-scale scientific programs such as the Cancer Genome Atlas,’ which on its own this year is set to generate some 2.5 petabytes, half a petabyte less than the Library of Congress’ digital collection.”

This year, $20M will be spent on contractors to deliver a cloud pilot.