What does the future hold for real-world data (RWD)? Looking back at the last couple of years and the accelerated adoption of RWD in research, evidence generation, health economics and outcomes research (HEOR), and clinical trials, we can only guess at the future use cases and how they will shape further healthcare innovation.

Knowing how much data has revolutionized other industries, I am a firm believer in the incredible potential that medical data holds: by 2025, an estimated 36% of the world’s data volume will be generated by the healthcare industry. Just imagine how the effective use of this vast volume of data could accelerate the development of new treatments, generate valuable insights in oncology, genetic diseases, and countless other therapeutic areas, and even address persistent health challenges.

In this article, I will discuss my top five real-world data predictions for 2025 (which we will review together in 12 months).

 

1. Unlocking RWD’s full potential with LLMs

Medical data sources can be categorized into structured and unstructured data. While structured data remains a primary source for medical research due to its accessibility and standardized format, it falls short of capturing the wealth of valuable information contained in unstructured sources such as clinical notes, imaging files, and other free-text documents. A study recently presented at ISPOR suggested that about two-thirds of the value of healthcare data can be found in free text. Although more challenging to access and analyze, unstructured data holds critical details about a patient’s journey, prescriptions, and medical history. Without leveraging the full spectrum of available information, we risk failing those in need of care: incomplete data can lead to biased results and compromise epidemiological validity.

Large Language Models (LLMs) have proven to be transformative tools for overcoming these challenges. In 2025, we can expect LLMs to be applied more widely in the clinical field and to play a larger role in extracting actionable insights from real-world data sources such as clinical notes, physician reports, and patient-reported outcomes. With LLMs, unstructured data sources become analyzable, transforming into an essential component of healthcare research.

 

2. Predictive Analytics and AI’s Expanding Role

Now imagine how improved data quality and better access to comprehensive RWD will transform the use of predictive analytics. This advancement will bring a never-before-seen level of precision to forecasting patient outcomes, identifying health R&D trends, pinpointing high-risk populations, and optimizing treatments. AI will also be able to draw on a vast array of published literature while actively identifying data-quality challenges and gaps in datasets. This will help researchers define patient populations as specifically as possible while minimizing selection bias. Real-world evidence (RWE) generation, and its role in informing clinical and evidence-based practice (EBP), will reach new heights, and we will witness a shift from reactive to preventive care.

AI-driven predictive analytics, supported by robust RWE, will enable researchers to simulate clinical trials and epidemiology studies more efficiently, reducing costs and time and expediting drug development. It will also strengthen value-based care initiatives, foster personalized treatments, and enable earlier disease detection. The integration of predictive analytics and RWD is no longer a novelty; it’s a necessity.

 

3. Shift from U.S.-Based Data to Global RWD for Increased Inclusivity

Historically, RWD has been predominantly U.S.-centric, which is unsurprising given the availability of claims data, research-friendly HIPAA regulations, and the high value of the U.S. pharmaceutical market. According to the latest report by Nova One Advisor, the U.S. pharmaceutical market size was calculated at $639.22 billion in 2024. While the U.S. population is known for its diversity, relying solely on U.S.-based data comes with several significant limitations. This data is fragmented. It does not fully represent the U.S. population, particularly minorities, who have more limited access to healthcare. It is also predominantly derived from billing systems (revenue cycle management, or RCM, systems), which offer less granular insights than medical information systems. Furthermore, it excludes valuable perspectives from the rest of the world.

As RWD becomes a more accessible commodity globally, 2025 promises to foster greater inclusivity and representation. Researchers will be better equipped to understand how treatments perform across diverse genetic, cultural, and socioeconomic contexts, paving the way for more personalized medicine.

Organizations like the Global Alliance for Genomics and Health (GA4GH) are leading efforts to make global data sharing a reality, ensuring that research reflects a more comprehensive and accurate view of the world’s population. This shift is also expected to enhance regulatory acceptance of therapies, as agencies like the EMA and FDA increasingly prioritize diversity in clinical evidence.

 

4. Enriching RWD with Dynamic Context

In a recent blog post, my colleague Ruth Levi Lotan highlights the limitations of frozen, or static, data registries. One is the inability to revisit existing datasets for updated or additional information about the same population, whether because of privacy concerns or data-matching issues. This poses a significant challenge, especially in long-term research that spans several months or evolves in response to new hypotheses.

The ability to contextualize data – by returning to its source, enriching it, adjusting the sensitivity and specificity of population definitions, or modifying specific information and variables – ushers in a new era of dynamic data registries. These registries are “alive” and remain relevant as research evolves. I see 2025 as a pivotal year in this transformation, with technologies like Briya leading the way and adding an entirely new dimension of dynamic context to real-world data.

Contextualized RWD enables real-time updates, ensuring that research stays current and impactful. Tools powered by natural language processing (NLP) and AI will empower researchers to revisit and enhance datasets, address privacy regulations through advanced compliance technologies, and bridge knowledge gaps – reducing reliance on static, “frozen” data.

 

5. Deeper Symbiosis between RWD & Digital Health

Digital health technologies are on the rise, and their synergy with real-world evidence generation is becoming increasingly profound. Insights obtained through real-world data are driving medical advancements, while wearables, remote monitoring tools, and mobile health apps generate vast amounts of real-time data that seamlessly feed back into RWD ecosystems. This invaluable patient-generated data can be processed and utilized to propel medical breakthroughs and foster research innovation.

Patient privacy remains a significant challenge in unlocking the full potential of this data. Advancements in de-identification and anonymization technologies, combined with regulatory shifts, are steadily reducing these barriers, enabling patient-level data analysis in full compliance with data privacy regulations.

Patient-generated data holds immense potential, offering a more continuous and comprehensive view of patient health and paving the way for hyper-personalized care plans. As digital health adoption continues to grow, RWD will play a pivotal role in incorporating patient-generated data into mainstream healthcare research and delivery.

 

A Bright Future Lies Ahead for Real-World Data and Healthcare Innovation

As we move into 2025, real-world data is set to become an even greater driving force behind innovation in healthcare research. While challenges such as data fragmentation, incompleteness, and privacy concerns continue to pose obstacles for healthcare visionaries and innovators, new technologies are paving the way to overcome these barriers, making such hurdles a thing of the past (or of 2024).