Why is Healthcare Data so Complex?

It is now widely accepted that properly managing and mining healthcare data is essential to the growth and success of the healthcare industry. Healthcare data matters because it is collected to inform clinical decisions, to shape personalized, predictive medicine, and to improve population health. What is still missing, however, is the integration of that data needed to improve clinical trials and inform better health practices. Many believe the healthcare industry is in desperate need of reform, and embracing technology as a tool is one way to achieve it.

Healthcare data is complex, and most data-driven healthcare IT providers will not survive the transformation unless they learn to handle that complexity. Like many other industries, healthcare is being transformed by an explosion of low-cost data, driven in large part by the adoption of electronic medical records (EMRs) and by digitization. There have been many benefits. End users can draw on large quantities of newly available information to solve problems in population health, clinical decision support, and patient engagement, among other applications. And ease of access means ease of market entry: emerging data providers can get on their feet quickly and create new sources of competition.

But even with all this information available, one question remains: why is healthcare data so complex? Many explanations have been offered, but they come down to five concrete reasons:

  • Much of the data is in multiple places
  • The data is structured and unstructured
  • The data is complex
  • Definitions are inconsistent or variable
  • Regulatory requirements keep changing

The first three are the reasons healthcare organizations most often give for the complexity of their data, so they are discussed in more detail below.

Much of the data is in multiple places

Healthcare data tends to reside in multiple places. It comes from different source systems, such as EMRs or HR software, and from different departments, such as radiology or pharmacy; in short, from all over the organization. Aggregating it into a single, central system, such as an enterprise data warehouse (EDW), makes the data accessible and actionable. Healthcare data also comes in different formats. For example, radiology uses images, old medical records exist on paper, and today’s EMRs can hold hundreds of rows of textual and numerical data. Sometimes the same data exists in different systems and in different formats, as is the case with claims data versus clinical data. And the future looks set to bring even more sources of data, such as patient-generated readings from devices like fitness monitors and blood pressure sensors.
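To make the claims-versus-clinical point concrete, here is a minimal Python sketch of the kind of normalization step that would sit in front of a central store such as an EDW. Everything in it is an assumption made for illustration: the field names (member_id, svc_dt, dx, patient_id, encounter_date, icd10), the date formats, and the idea that both systems share one patient identifier. Real exports differ, and matching patients across systems usually needs a separate crosswalk.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Encounter:
    """One patient visit in a common shape, whatever system it came from."""
    patient_id: str
    service_date: date
    diagnosis_code: str
    source: str


def from_claims(row: dict) -> Encounter:
    # Hypothetical claims export: MM/DD/YYYY dates, diagnosis codes without the dot.
    code = row["dx"].strip().upper()
    if len(code) > 3 and "." not in code:
        code = code[:3] + "." + code[3:]
    return Encounter(
        patient_id=row["member_id"].strip(),
        service_date=datetime.strptime(row["svc_dt"], "%m/%d/%Y").date(),
        diagnosis_code=code,
        source="claims",
    )


def from_emr(row: dict) -> Encounter:
    # Hypothetical EMR export: ISO dates, diagnosis codes already dotted.
    return Encounter(
        patient_id=row["patient_id"].strip(),
        service_date=date.fromisoformat(row["encounter_date"]),
        diagnosis_code=row["icd10"].strip().upper(),
        source="emr",
    )


def same_visit(a: Encounter, b: Encounter) -> bool:
    # Two records describe the same visit if everything except the source matches.
    return (a.patient_id, a.service_date, a.diagnosis_code) == (
        b.patient_id, b.service_date, b.diagnosis_code
    )


claims_row = {"member_id": " P001 ", "svc_dt": "03/14/2023", "dx": "E119"}
emr_row = {"patient_id": "P001", "encounter_date": "2023-03-14", "icd10": "e11.9"}
print(same_visit(from_claims(claims_row), from_emr(emr_row)))
```

Once both rows are mapped into the same Encounter shape, the two systems can be compared or loaded side by side, which is the practical meaning of making the data "accessible and actionable".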

The data is structured and unstructured

The structure of healthcare data varies because the way it is captured is not consistent. Years of documenting clinical facts and findings on paper have trained an industry to capture data in whatever way is most convenient for the care provider, with little regard for how that data might eventually be aggregated and analyzed. Many care providers are reluctant to adopt a one-size-fits-all approach to documentation. As a result, much of the data is captured in unstructured form and is difficult to aggregate and analyze consistently. As EMR products improve, as users become trained in standard workflows, and as care providers become more accustomed to entering data in structured fields as designed, we will have more and better data for analytics.
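The gap between structured and unstructured capture can be illustrated with a short Python sketch. It reads a blood pressure value two ways: from a structured record, where it is a simple lookup, and from a free-text note, where it has to be guessed at with a pattern. The record layout, the field names systolic and diastolic, and the regular expression are all hypothetical, and the pattern covers only a couple of common phrasings.

```python
import re
from typing import Optional, Tuple

# Matches phrasings such as "BP 132/84" or "blood pressure was 128 over 79".
BP_PATTERN = re.compile(
    r"\b(?:BP|blood pressure)\D{0,15}?(\d{2,3})\s*(?:/|over)\s*(\d{2,3})\b",
    re.IGNORECASE,
)


def bp_from_structured(record: dict) -> Tuple[int, int]:
    # Structured capture: the value sits in known fields.
    return record["systolic"], record["diastolic"]


def bp_from_note(note: str) -> Optional[Tuple[int, int]]:
    # Unstructured capture: search the text and accept that some notes will not match.
    match = BP_PATTERN.search(note)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


print(bp_from_structured({"systolic": 132, "diastolic": 84}))
print(bp_from_note("Pt seen today. BP 132/84, pulse 70."))
print(bp_from_note("Blood pressure was 128 over 79 at rest."))
print(bp_from_note("Pressure slightly elevated this morning."))
```

The last note mentions the same clinical fact but yields nothing, which is the aggregation problem in miniature: every new way of writing a finding needs another rule, whereas a structured field needs none.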

The data is complex

While developing standard processes that improve quality is one of the goals in healthcare, the number of data variables involved makes it far more challenging. Data technicians cannot work with a finite number of identical parts to create identical outcomes. Instead, they face an amalgam of individual systems so complex that we do not even profess to understand how they work together. Managing the data related to each of those systems, which is often captured in disparate applications, and turning it into something usable across a population requires a far more sophisticated set of tools than is needed in other industries such as manufacturing.