Human Longevity, Inc. plans to overtake BGI

It has already been three months since Human Longevity, Inc. was officially announced. The company now has a beautifully designed website and announcements about new executive hires. The caliber of these executives suggests the company plans to grow big and grow fast: its CIO comes from AstraZeneca, where he was Vice President of R&D IT, responsible for the global IT organization's services, analytics, and infrastructure supporting drug discovery and development, leading a global team of approximately 300 and accountable for an R&D IT budget of more than $120 million. The company is also building a computing and informatics program and facility in Singapore.

BioIT World was one of the first to write about the company’s launch (for more news check the company’s website):

In a move that would be shocking from almost anyone else, Venter declared that his brand-new company’s sheer sequencing power will be leapfrogging the world’s best-established genomic research centers, such as the Broad Institute.

The company has acquired 20 of Illumina's latest HiSeq X Ten machines ($1 million apiece), which will allow it to sequence the full genomes of 40,000 people a year. For comparison, BGI had already sequenced 57,000 individuals by the end of 2013. HLI doesn't even need to compare itself with BGI, as it plans to rapidly scale to 100,000 human genomes a year (unsurprising, considering that Illumina is among its investors).
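A quick back-of-the-envelope check of these throughput figures (a sketch based only on the numbers quoted above; the per-week breakdown is my own arithmetic, not from any HLI statement):

```python
machines = 20              # HiSeq X Ten machines acquired by HLI
genomes_per_year = 40_000  # stated initial sequencing capacity

# Implied per-machine throughput
per_machine_per_year = genomes_per_year / machines
per_machine_per_week = per_machine_per_year / 52

print(per_machine_per_year)            # 2000.0 genomes per machine per year
print(round(per_machine_per_week, 1))  # ~38.5 genomes per machine per week
```

Scaling to the planned 100,000 genomes a year would mean either more machines or a 2.5x jump in per-machine throughput.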

Human Longevity will also be characterizing at least some participants’ microbiomes, and, in partnership with Metabolon of North Carolina, their metabolomes, or the constantly changing array of small molecules present in the body. On top of that, said Venter, “we will be importing clinical records of every individual we’re sequencing,” in order to bring on board crucial phenotypic data.

The goal is to integrate this mass of data for new discoveries that can wed individuals’ own genetic variants, the composition of their bacteria, the molecules in their blood, and most importantly, their medical histories. Venter stressed that his aim is to enable predictive and preventative medicine for healthy aging, discovering early warning signs for susceptibility to chronic illnesses like cancer, Alzheimer’s, and heart disease, as well as new interventions tailored to each individual’s distinct profile. “We think this will have a huge impact on changing the cost of medicine,” added Venter.

A longer-term goal is to translate some of this information into stem cell therapies, an application that ties Human Longevity to Venter’s existing company, Synthetic Genomics.

But this year's goal is to sequence the genomes of cancer patients in collaboration with the UCSD Moores Cancer Center.

What exactly is the company going to do with all these data? Do research and publish papers? Yes, and some of the principal scientists hired by the company have received appointments at the Craig Venter Institute. Sell data and the results of analysis? Yes. “Venter and his colleagues also held out the possibility of other commercial products and properties emerging from the company’s basic research.” The company is also actively hiring, and not only computational professionals but clinical and wet lab scientists as well. Here is more about the company’s mission, from its job ads:

HLI will develop the most comprehensive gene-phenotype database in the world, with phenotype information deriving from molecular, physiologic, clinical, microbiome and longitudinal data assays. This database will be mined for biologically meaningful patterns that can lead to better diagnostics, therapeutic targets and next-generation cell-replacement therapies.

“[Big Data] make business strategy interesting again”

EVans_TED_talk

TED posted a talk by Philip Evans about how Big Data will transform business strategy. The screenshot above, which I took at minute 7:33 of the talk, illustrates what Evans calls “a hundredfold multiplication in the stock of information that is connected via an I.P. address.”

“Now, if the number of connections that we can make is proportional to the number of pairs of data points, a hundredfold multiplication in the quantity of data is a ten-thousandfold multiplication in the number of patterns that we can see in that data, this just in the last 10 or 11 years. This, I would submit, is a sea change, a profound change in the economics of the world that we live in.”
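Evans's arithmetic follows from simple combinatorics: the number of distinct pairs among n data points is n(n−1)/2, which grows roughly as n², so multiplying the data by 100 multiplies the pairs by roughly 10,000. A minimal sketch (the starting value of n is an arbitrary illustration):

```python
def pairs(n):
    """Number of distinct pairs among n data points: n choose 2."""
    return n * (n - 1) // 2

n = 1_000_000
ratio = pairs(100 * n) / pairs(n)
print(round(ratio))  # ~10000: 100x more data points, ~10,000x more pairs
```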

But what his talk is really about is that technology (he gives the example of the cost and speed of genome sequencing) is driving the typically vertical structure of businesses toward becoming more horizontal, as cooperation is needed to achieve big data scale.

“That implies fundamental changes in how we think about strategy. … It means, for example, we need to work out how to accommodate collaboration and competition simultaneously. Think about the genome. We need to accommodate the very large and the very small simultaneously. And we need industry structures that will accommodate very, very different motivations, from the amateur motivations of people in communities to maybe the social motivations of infrastructure built by governments, or, for that matter, cooperative institutions built by companies that are otherwise competing, because that is the only way that they can get to scale. … These kinds of transformations render the traditional premises of business strategy obsolete.”

In this light, one might start thinking differently about business endeavors and about a personal (horizontal?) strategy of contributing to society.


Internet of Things on UCSD campus

UCSD_microgrid

This post by BigData-Startups about the Internet of Things (IoT) (sensors, computers, machines, mobile devices — all connected) made me want to look no further than around me and write about UC San Diego’s power microgrid, the largest online microgrid in the US and the world’s most efficient Solar Integrated Storage System. The microgrid also makes the UCSD campus of 100 buildings (12M square feet) 95% energy self-sufficient.

Here are the slides about microgrids in general and about the replicability of the UCSD microgrid to the Pacific Islands. The PACE center at SDSC manages analytical projects for the microgrid, including running electric, solar-charged vehicles for commuting to campus.

Almost all the buildings on campus have a roof like the one shown in the photo above. Newly constructed buildings (the San Diego Supercomputer Center’s new East wing was one of the first) are “green” buildings with natural cooling and heating systems (I cherish the absence of AC in my office and the existence of a window that can be opened).

Big Data technologies will reach the plateau of productivity in 5–10 years

Source: Gartner August 2013


This is Gartner, Inc.’s hype cycle map for emerging technologies. A full report assessing more than 2,000 technologies in 98 areas is also available.

“In fact, by observing how emerging technologies are being used by early adopters, there are actually three main trends at work. These are augmenting humans with technology — for example, an employee with a wearable computing device; machines replacing humans — for example, a cognitive virtual assistant acting as an automated customer representative; and humans and machines working alongside each other — for example, a mobile robot working with a warehouse employee to move many boxes.”

Well, “a walking warehouse employee” sounds like an outdated concept. I think many of the already unemployed wish they still were, or could again become, “walking employees,” alongside a robot or not — but that is not going to happen: robots can walk/jump/run on their own, with other “sitting” robots managing the process.

Looking at this plot, one can ask, “What is smart dust?” According to Wikipedia, it is a 20-year-old concept for millimeter-size and smaller wireless sensors, and nanobots/nanorobots, I would add. Bioacoustic sensing is something new to me.

But the most important question to ask is, “Which skills/education/business do I/my kids/my company need to pursue to be marketable in 10–15 years?” The answer doesn’t sound like a revelation: bioengineering, electrical engineering, programming (cloud, mobile, web, databases, algorithms), statistics, analytics, life sciences, neuroscience, physics, linguistics, sociology, psychology, human-machine communicators/liaisons, robot teachers. And what is not?

With more and more people out of traditional jobs and with higher demand for talent and smarts at work, will society need more artists, actors, singers, dancers, writers, poets, entertainers, designers, architects, communicators, restaurateurs — professions that currently pay only for extraordinary talent? Will the “Quantified Self” include only the human physical substance (which, for a materialist, constitutes the complete self), or should we expect to achieve a new level of artistic and philosophical creativity via self-quantification? And will society reward it? Basically, what will keep billions on this planet busy and out of distraction?

“It is going to be beyond science fiction in our life time.”

data_wired_May_2013

On March 12, I attended an event organized by UCSD Extension, “Big Data at Work: A Conversation with the Experts”. There were presentations from:

  • Larry Smarr, Ph.D., Founding Director, CALIT2
  • Mike Norman, Ph.D., Director, San Diego Supercomputer Center
  • Stefan Savage, Ph.D., Professor, Computer Science & Engineering, UC San Diego
  • Michael Zeller, Ph.D., Chief Executive Officer, Zementis

Natasha Balac, Ph.D., Director, Predictive Analytics Center of Excellence, moderated the discussion panel.

Larry Smarr sounded excited and optimistic. To illustrate the tsunami of data, he started with the old tale about rice and a chessboard. On Wikipedia, it goes under “Wheat and chessboard problem”. If you start with one grain and double the number of grains on each subsequent square (1+2+4+8+16+32+64+…), the 64th square of the chessboard alone will hold 2^63 = 9,223,372,036,854,775,808 grains of rice.

“On the entire chessboard there would be 2^64 − 1 = 18,446,744,073,709,551,615 grains of rice, weighing 461,168,602,000 metric tons, which would be a heap of rice larger than Mount Everest. This is around 1,000 times the global production of rice in 2010 (464,000,000 metric tons).”
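The quoted numbers are easy to verify; the weight figure works out if one assumes an average grain of rice weighs about 25 mg (my assumption, consistent with the quoted total):

```python
# Grains on the 64th square alone, and on the whole board
last_square = 2 ** 63
total = 2 ** 64 - 1  # equals sum(2**i for i in range(64))

print(last_square)   # 9223372036854775808
print(total)         # 18446744073709551615

# At an assumed ~25 mg per grain, total weight in metric tons
grain_mg = 25
tons = total * grain_mg / 1e9  # mg -> metric tons (1 t = 1e9 mg)
print(f"{tons:.3e}")           # ~4.612e+11 metric tons
```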

Larry Smarr is one of the first adopters of monitoring one’s own health using genome sequencing technologies: he sequences his gut microbiome as often as possible:

“If in the past just several variables from a blood test and weight defined me, now billions of numbers define me! …

Healthcare and education are still pre-digital.”

In regard to efforts to harvest human genomics and microbiomics data, Larry mentioned the recent launches of the Human Longevity Inc. (here is more news on HLI from PR Newswire) and similar initiatives by Leroy Hood and George Church.

Mike Norman talked about Big Data initiatives at SDSC. To demonstrate the amount of available data in various domains, he kindly asked me to show the slide I made last summer (above; I borrowed the concept of circles, and all the data except the biological, from Wired magazine; the biological estimate is mine, based on data available exclusively in public registered databases). He also mentioned IntegromeDB among the four Big Data projects running at SDSC. As the two major challenges in the Big Data field, Mike mentioned education and providing a computing environment for data storage, sharing, and analytics.

Stefan Savage gave a fascinating talk about his research on Internet security, abusive advertising, web spam, and bitcoin operations, and about his live, super-fast URL classification system with millions of features and online training (I would like to do some research and write more about this system):

“Security is becoming a data-driven discipline. …

Security today is about understanding the environment. …

The data won’t be in personal possession.”

Michael Zeller talked about two groups of Big Data analytics applications, people & behavior and sensors & devices:

“Big Data buzz creates new business opportunities to disrupt existing markets and to develop new platforms with new capabilities. … The challenge on the industry side is cutting through the noise of many existing solutions. … The future will bring a lot of data-driven applications — agents that will make decisions on your behalf. It will be seen on every level of life.”

At the end of the panel (remarks from which I provided above), Larry encouraged everyone to watch the movie “Her”:

“It is going to be beyond science fiction in our lifetime. Everyone will have an intelligent system that knows much more about us than we do.”

Disclaimer: The citations provided above are not exact. They are based on the notes I took during the event.

What are Gaming Big Data good for?

mobile-gaming-phone

According to GamesBeat, companies now measure, and adjust to each player on the fly, many parameters of a game, like onboarding techniques, the time to reach a specific level, the rate at which players can pick up goodies, etc.

“Gaming companies are now manipulating all of these variables as needed to ensure gamers onboard, get engaged, and keep playing over the long haul. Because, of course, it’s all about retention. If you can retain players, you can monetize. If you can’t, you won’t make money.”

But in the long run, the collection of gaming data is not only about retention. To a large extent, it is about monitoring people’s gaming abilities and brain activity in order to design games for treating psychiatric and mood disorders, alleviating pain, learning new skills, and manipulating robots, sensors, and nanobots (for example, inside the body). Gaming data are needed for training robots and manipulating them. And gaming data will eventually be used to search for individuals with unusual brains and motor functions — to learn more about human neurology, brain physiology, and anatomy in order to design new medical treatments and to advance the human brain. What else can you imagine?

Does Geospatial Data Unite the World?

Slide1

This interesting post from BigData-Startups, whose newsletter I received today, discusses the use of geospatial data in providing better customer service, explains well what those data are about, and gives examples of two startups, SpaceCurve and Loqate, thriving on developing tools for the analysis of geospatial data. This reading made me think about the usage of geospatial data in biology and medicine.

The possibilities are abundant, and some of the examples are already well-known: Google Flu Trends around the world, based on Google search activity, and the online interactive chart of vaccine-preventable outbreaks around the world from 2006 to the present day, which covers outbreaks of measles, mumps, rubella, polio, whooping cough, and “other” (the data are downloadable as a CSV file that includes source citation, country, longitude/latitude, and the number of cases and fatalities by outbreak type). The image above is a screenshot of that chart for 2013; it can be seen that in 2013 there was one case of typhoid in San Francisco. These and similar monitoring systems are important tools for the prevention of disease outbreaks and for the smart transportation and storage of vaccines and medication.
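As a sketch of how such a downloadable CSV might be consumed, here is a minimal example (the column names and sample rows below are hypothetical; the actual file's headers and values may differ):

```python
import csv
import io

# Hypothetical sample in the shape the chart's CSV is described to have:
# source citation, country, coordinates, and case/fatality counts per outbreak type.
sample = """\
source,country,latitude,longitude,outbreak_type,cases,fatalities
CDC,United States,37.77,-122.42,typhoid,1,0
WHO,France,48.85,2.35,measles,120,0
"""

rows = list(csv.DictReader(io.StringIO(sample)))

# Filter to one outbreak type, as the interactive chart does
typhoid = [r for r in rows if r["outbreak_type"] == "typhoid"]
print(len(typhoid), typhoid[0]["country"])  # 1 United States
```

With the latitude/longitude columns, each filtered row could be plotted directly on a map, which is essentially what the interactive chart does.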

Other possibilities can be seen in the fitness domain, where geospatial data, together with data on people’s movement activities, can be used for the smart placement of recreational centers and the construction of new bike and jogging lanes, not to mention the sales and marketing of fitness devices and mobile apps. The same is true for stores and restaurants specializing in health products and organic food, and for food-banning policies.

Yet geospatial data, coupled with personal medical and genetic data and with data coming from apps and wearable sensors that monitor various physiological activities of our bodies, like blood glucose level or pressure, will soon be a powerful tool for personalized medicine — one that accounts for a person’s geolocation and the time of year as well. Those data will also be helpful in geographically locating populations susceptible to diseases or environmental cataclysms (see this relevant post about the search for rare human phenotypes for drug development). And, obviously, no one can talk seriously about environmental data without considering geospatial data as well.

Then think about animals, wild and domesticated, about crops, produce, and plants, and about the water surrounding us — everything is connected, and it appears that we are all connected via data, obviously hugely Big-Big Data; extracting knowledge and prognoses from these data makes us appreciate and value the diversity of life on Earth more than ever.