Combating complex diseases with data

All of us constantly generate data, whether we are sending text messages, using our favorite streaming service or booking a plane ticket. And the amount of digital data continues to grow exponentially. Boehringer Ingelheim sees this as a great opportunity to innovate in areas of high unmet medical need.

Five million patients: five million people, each with their own distinct biography. Each of them is affected by their health condition in a different way, and this can be documented through a multitude of individual healthcare data. Our Biberach colleague George Okafo (Global Computational Biology and Digital Sciences, gCBDS for short), for instance, emphasizes the value of this data, which is gathered from biological samples and patient records. He has a longstanding background in data platforms, but making up to 100 petabytes (that is, 100,000 terabytes) of external biobank data available to all of his Boehringer Ingelheim colleagues is a first for him: “These insights will help us develop personalized medicines and get them to the patients even faster.”

Gathering and interpreting data offers a broad range of opportunities to improve our lives, including in medical research. Building on this, the Dataland initiative is on its way to establishing a data-driven mindset and working culture at Boehringer Ingelheim and will offer an unprecedented end-to-end data ecosystem for all colleagues. All data used undergoes the required data security, access and compliance checks and procedures. User-friendly access to the data platform that George’s team is currently developing as part of the initiative will be fully available in the next few years.

Ferran Urgeles, the program manager of Dataland, says: "The main challenge is not the size of huge datasets but how you structure them. Through Dataland, we make sure that huge amounts of curated data are available for our entire workforce. This will lead to disruptive insights we would not have gained otherwise."

Brigitte Fuhr, Head of IT EDS Central Data Science, says: "Organized datasets can only be beneficial if our employees have the right skills to gain insights from them for their daily work. That is why upskilling and promoting a data mindset is a key success factor. With our Data Science Academy, we offer learning opportunities both for experienced data scientists and for newbies."

Jan-Nygaard Jensen, Head of Global Computational Biology and Digital Sciences, says: "Having a data mindset and curated data at hand leads to potential use cases across our entire value chain. We can get even better at understanding our patients' needs, at our Research & Development activities and at speeding up clinical trials."

As one of the Dataland enterprise use cases, the external databanks George and his team work on will enable an end-to-end data platform that allows users to access biobank data, meaning data from biological samples that are used for research purposes only, and to derive scientific insights from it. With our partners at Lifebit, we are building the technical infrastructure to help our colleagues combat complex diseases. “Our focus is on fibrotic diseases of the lung and liver,” says George. “Focusing on specific diseases when building a scalable user interface to huge datasets across multiple biobanks is a unique offering in the pharmaceutical industry.” The gCBDS department's focus on specific scientific use cases will help evaluate the data platform and potentially expand it further in the future.

An infographic showing how an end-to-end data ecosystem arises.

The current pace of George’s project is all the more impressive considering how challenging it is to cleanse, curate and assemble such a large amount of data for user-friendly access. “We have several groups of users in mind who can profit from the biobank data,” he says, “especially human geneticists, computational biologists and disease experts in our Therapeutic Areas. At the end of the day, though, all that Dataland has to offer – both the datasets and the upskilling opportunities – is a resource for everyone at Boehringer Ingelheim.”

Gaining data literacy through the Data Science Academy

Indeed, not all of our colleagues may profit from a dataset on fibrotic diseases, but data in general, and the skills needed to interpret it, are becoming more and more important. “When you combine different types of data, new solutions emerge,” says Ferran Urgeles, the program manager of Dataland. “Let’s take identifying the optimal site for a clinical trial as another example. When you combine the participants’ data like their demographics with trial site performance data and other pieces of information, this can really optimize finding the perfect location for these trials.” More generally speaking: the data is there, so why not look at different datasets at the same time and make the most of them?

Numerous use cases in a broad range of departments already prove the value of this approach, but one thing is certain: Dataland will only be a success if colleagues have a data-driven mindset. Upskilling is an important part of that, and the Data Science Academy plays a crucial role here, with offerings for beginners, advanced colleagues and leaders. “We are all learning together,” says George Okafo. “And we can’t wait to see how the run phase will bring our research forward.”