Big data foments scientific revolution

30 Jul 2014 | News
In what may become the biggest paradigm shift in the history of science, Big Data is overthrowing the established scientific method

If there’s one thing most scientists agree on, it’s the method that leads to new discoveries.  Frame a question or problem to be solved; formulate a hypothesis to address it based on observation, and then test the argument or suggested solution through rigorous experimentation.  

The results of scientific inquiry prove or disprove the hypothesis, providing insights that drive the process of inquiry and discovery forward.  

But a new tool is shaking up the 2000-year-old practice. The ability to mine massive amounts of digital information, dubbed “Big Data,” about everything from patient records to climate conditions, has some proposing a radical break with the hypothesis model of scientific inquiry.  

“It’s a total shift in mindset,” said Isabelle Thizon de Gaulle, vice president of European Strategic Initiatives and Scientific Relations at Sanofi. “We are ready to first look at the data and then to try to understand the mechanism of diseases which is behind that data,” she said, reflecting on how the rise of Big Data is changing research methods at one of the world’s leading pharmaceutical companies.

It’s the same picture across disciplines ranging from biology to astronomy, where Big Data is opening whole new research possibilities and big business opportunities. The mining of massive data sets is being used to predict epidemics and to develop the emerging field of personalised medicine. Medical researchers trawl through data sets about pathology, treatments and patient records, for example, to design more effective drugs.

“Thanks to better analysis (of Big Data), we can fund with much better precision, [resulting in] new medicines with better efficacy and safety,” said Thizon de Gaulle. Targeted cancer drugs developed with inputs from Big Data are already on the market, and they are driving more effective therapy at lower cost.

But safeguards are needed: if scientists start mining public data as the first step in the research process, they must be sure the data does not become corrupted, Thizon de Gaulle told delegates at a session entitled ‘Big Data, Big Deal. Big Problem?’ at last month’s Euroscience Open Forum in Copenhagen.

Increasing the value of existing research

In addition to generating new insights, Big Data makes it possible to extract greater value from existing knowledge, by helping researchers plough through mountains of published papers to find relevant work quickly.  “It’s another filter mechanism,” said Jan Reichelt, co-founder and president of Mendeley, which has developed a software application that makes journal articles fully searchable.  “We help researchers be more efficient.  We can help them save time and be more up to date with relevant publications than before.”

Big Data is also being deployed for social and cultural projects. “We are looking at the research and commercial value of Big Data, but the uniqueness is the social value – the ability to preserve things,” said Sandra Collins, director of the Digital Repository of Ireland, Royal Irish Academy.

John Wood, chair of the Research Data Alliance, acknowledged the pressure to start using data to drive scientific inquiry.  One example is the use of Big Data to analyse wheat group phenotypes to determine which ones could help overcome food shortages in Africa.  Another is the use of sensors in the sea to monitor and analyse fish stocks to ensure sustainability.

“Big Data is not just an academic exercise,” said Wood.  “It’s really something that affects society.  It’s a critical moment for us.”  The Research Data Alliance is a global science policy initiative seeking to build common data infrastructures and enable data sharing.

Abuses of Big Data

The panel also discussed the possible abuse of Big Data, and the rise of the “hacktivist.”  If individuals fear their data will be misused, they will be more reluctant to share it.  “This is where publicly funded data comes face-to-face with personal data – it’s an area the European Commission is looking at,” said Wood.

Big Data is even impacting social science research.  Governments, for example, could mine their own data to understand which policies actually work to prevent child abuse or to promote healthy families.  “Big Data is the only way you will get answers,” said Wood.

Despite the burgeoning number of uses for Big Data, the real challenge is developing better algorithms to cut through the complexity of 200 petabytes of data and make it useful.  “Do you start with the data side or from the hypothesis side?  If you start with the hypothesis, you get the algorithms, but you may have the wrong one,” said Wood.

In the world of medical research, the tension is palpable and difficult to manage, according to one participant at the session.  “It’s not that Big Data is good or bad, it’s different.  The paradigm of science has changed through the use of Big Data.”  

But Wood insisted the hypothesis approach to scientific inquiry would co-exist with data mining as research methods for a long time. “It will take decades to change the way scientists work,” he said.  
