Big data has the potential to vastly change the field of cardiovascular health care in a variety of ways — from helping to identify risk factors and streamlining trials to improving patient care and monitoring how well people respond to therapies. But what exactly do we mean by the term “big data”?
The most basic answer is exactly what it sounds like: Big data is a set of information that is too large to be handled by normal data processing programs. Thanks to electronic health records (EHRs), databases, registries and other sources, we are gathering information about patients’ health at an unprecedented rate. The typical hospital generates about 100 terabytes of data every year, according to James Tcheng, M.D., interventional heart specialist at Duke University. If that doesn’t mean much to you, consider this: The entire Library of Congress contains only 10 terabytes of text data.
All of the information in the world is useless, though, unless it can be interpreted and analyzed in meaningful ways. That’s where big data analytics comes in. Experts are working to create advanced statistical and mathematical methods to draw conclusions that can be used by doctors and hospitals to guide medical decisions.
The process sounds complicated (and, truthfully, it is), but the results can be simple. Most people already benefit from big data every day. When you use your grocery store loyalty card and receive coupons for products you might enjoy, those suggestions come from the analysis of millions of shoppers’ buying habits, aka big data. When you visit Netflix, iTunes or Amazon and receive recommendations about what to buy based on your previous purchases and ratings, that’s big data in action.
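The idea behind those product suggestions can be sketched in a few lines. The sketch below is a toy "shoppers who bought what you bought also bought…" recommender; the shoppers and items are made up, and a real system would use millions of records and far more sophisticated models.

```python
from collections import Counter

# Hypothetical purchase histories -- a real recommender would draw on
# millions of shoppers, not three.
purchases = {
    "ann":  {"milk", "bread", "coffee"},
    "ben":  {"milk", "bread", "butter"},
    "cara": {"coffee", "bread", "jam"},
}

def recommend(shopper, purchases):
    """Suggest items this shopper hasn't bought, ranked by how often
    they appear in the baskets of shoppers with overlapping purchases."""
    own = purchases[shopper]
    scores = Counter()
    for other, basket in purchases.items():
        if other == shopper or not (own & basket):
            continue  # skip self and shoppers with nothing in common
        for item in basket - own:
            scores[item] += 1
    return [item for item, _ in scores.most_common()]

print(recommend("ann", purchases))  # -> ['butter', 'jam']
```

The same co-occurrence logic, scaled up and refined, is what turns a loyalty-card database into a coupon printer.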
If we can do those sorts of fancy algorithmic predictions when it comes to movie rentals, why aren’t we getting daily, individualized medical advice from our computers and smartphones yet? In short, because no one gets seriously injured when Netflix recommends the wrong movie, but applying big data conclusions to individual patients can be ineffective at best and deadly at worst.
John Rumsfeld, M.D., Ph.D., is chief science officer at the American College of Cardiology’s National Cardiovascular Data Registry (NCDR), the preeminent cardiovascular data repository in the U.S. He explains, “Big data needs to be thought of like a new technology, new medical device, or new drug. It needs the same sort of rigorous evaluation to know that when we use it, it will actually improve care and outcomes and not have unintended consequences.
“For example, if the models built off of big data platforms don’t correctly identify patients who need a certain therapy or misclassify people as being at high or low risk, you could imagine there being unintended harm. Hospitals and doctors are being told that big data will help guide their decisions, but that guiding will only be as good as the underlying models.
“We should have enthusiasm because we’re getting a new tool to help us potentially understand patterns of care and predict which patients might be at the highest risk. All of that is tremendously exciting. I just think as it gets down to making decisions about individual patients, we should subject that to rigorous evaluation. I’m worried that the enthusiasm may be outweighing the evidence, so it’s important for professional societies, such as the American College of Cardiology, and academic research organizations to partner and work with big data and bring a scientific rigor to it, which NCDR is committed to doing. The stakes in health care are higher.”
While it may be a few years before heart patients can wear high-tech devices that provide accurate individualized alerts when they are at high risk, big data is already benefiting cardiovascular patients on a larger, population-level scale.
Research is one realm where big data is already making a difference. Matthew Oster, M.D., M.P.H., director of Children’s CORPS (Cardiac Outcomes Research Program at Sibley Heart Center) at Children’s Healthcare of Atlanta, says that one of big data’s greatest advantages is that “it can help streamline the ability to do larger trials without having to do a prospective, randomized, multi-center trial since that can be very time- and cost-prohibitive. By using existing data, we can use advanced methods to try to answer some of the burning questions in a much quicker and cost-effective manner.”
Databases and registries are being used to speed up the process of identifying qualified subjects and, in doing so, can also reduce the cost of clinical trials and the time it takes for new drugs and treatments to be brought to market. Thanks to enormous, crowd-sourced endeavors such as University of California San Francisco’s Health eHeart study (see sidebar), certain non-drug interventions can be tested almost in real time. Jeffrey Olgin, M.D., chief of cardiology at UCSF and a principal investigator in the Health eHeart study, describes a recent example: “A collaborator approached us and said, ‘We have this way to help reduce smoking and we want to test it. It could be really impactful and scalable,’ so we said, ‘Sure.’ We have several thousand daily smokers in our study and we can reach them instantly to deploy the intervention and tell whether it worked. We went from the idea of that intervention to starting the study in a three- to four-week period.”
Jeffrey Jacobs, M.D., professor of surgery at Johns Hopkins University and chair of The Society of Thoracic Surgeons Congenital Heart Surgery Database, says benchmarking outcomes against national aggregate data is, in his opinion, the most useful role of big data for pediatric cardiac surgery: “That benchmarking allows opportunities to identify areas that would benefit from quality improvement initiatives and allows the across-the-board identification of programs that are performing better than expected and worse than expected. Quality improvement initiatives can be established in programs that are performing worse than expected, and we can learn from the programs that are performing better than expected to improve quality across the board.”
The Congenital Cardiac Interventional Study Consortium (CCISC) registry, designed to determine best practices for the treatment of pediatric patients with congenital heart disease (CHD), is an example of those sorts of quality control efforts in practice. Co-founded by Thomas Forbes, M.D., pediatric cardiologist at the Children’s Hospital of Michigan, CCISC’s databases currently include information from around 30 institutions in the United States, Europe and South America. That number is expected to double in the next six to 12 months. By comparing the results from institutions around the world, CCISC’s use of big data has already had a noteworthy impact on the complication rates of pediatric congenital heart defect patients.
Dr. Forbes says that using the data to identify the need for a smaller catheter for babies weighing less than 10 kg has “vastly decreased the number of injuries and complications” and that comparing surgeries performed with and without body warmers has “dramatically changed the complication rates” of infants and babies undergoing procedures in South America. “Those are two examples,” he says, “of how the database showed us who was at high risk and what we can do to make that better.”
The registry was also able to significantly reduce the amount of radiation some patients are exposed to over their lifetimes, but before that conclusion could be reached, CCISC had to overcome one of the biggest pitfalls in medical big data analysis: lack of consistency. Not only is much of the available data stored in inconsistent formats and disparate systems; even the terminology and measurements used vary. Dave Fornell, editor of Diagnostic and Interventional Cardiology, says, “One hospital may call the imaging system in its cath lab a ‘fluoroscopy system,’ another may call it ‘angio’ or ‘angiography,’ ‘vascular imaging system,’ ‘digital angiography,’ ‘flat panel detector’ — they all mean the same thing. What one person might say is one part of the anatomy may be called something else by another doctor. They’re going to have to come up with standard taxonomy in order for the datapoints to mean the same thing across the country and across hospitals.”
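At its simplest, the taxonomy fix Fornell describes is a synonym table: every local name for a device gets mapped to one canonical code before records are compared. The sketch below is illustrative only — the vocabulary is taken from his quote, not from any actual clinical standard.

```python
# Illustrative synonym table: many local names, one canonical code.
# This is a toy vocabulary, not an actual clinical taxonomy.
SYNONYMS = {
    "fluoroscopy system":      "fluoroscopy_system",
    "angio":                   "fluoroscopy_system",
    "angiography":             "fluoroscopy_system",
    "vascular imaging system": "fluoroscopy_system",
    "digital angiography":     "fluoroscopy_system",
    "flat panel detector":     "fluoroscopy_system",
}

def normalize(term):
    """Map a free-text device name onto one canonical code so records
    from different hospitals can be compared. Unknown terms are flagged
    for human review rather than silently guessed."""
    key = term.strip().lower()
    return SYNONYMS.get(key, "UNMAPPED:" + key)

print(normalize("Angio"))   # -> fluoroscopy_system
print(normalize("C-arm"))   # -> UNMAPPED:c-arm
```

The hard part, of course, is not the lookup but getting every institution to agree on the table in the first place.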
That’s just what CCISC had to do before its registry could be used to accurately compare radiation levels among patients around the world. Dr. Forbes explains, “We said, ‘First of all, we all have to measure radiation in the same way.’ So we got five nuclear physicists and said, ‘We can only choose one parameter. What is it going to be?’ They researched it and came up with one measurement. We went to all the vendors that provide the cath lab systems and said, ‘Your system has to measure this. That’s the way it’s going to be.’ They changed their systems so that they all measured radiation the same way. Once they did that, we were able to compare institutions as far as overall radiation dosage and break it down for different procedures.”
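Once every lab reports the same parameter, the cross-institution comparison Dr. Forbes describes reduces to straightforward aggregation. The sketch below uses entirely made-up sites, procedure names, and dose numbers to show the shape of that comparison.

```python
from statistics import mean

# Hypothetical per-procedure records, all reporting the single agreed
# radiation parameter ("dose"). Sites, procedures, and values invented.
records = [
    {"site": "Hospital A", "procedure": "ASD closure", "dose": 12.5},
    {"site": "Hospital A", "procedure": "ASD closure", "dose": 10.0},
    {"site": "Hospital B", "procedure": "ASD closure", "dose": 27.5},
]

def mean_dose_by_site(records, procedure):
    """Average the standardized dose per institution for one procedure,
    making outlier sites easy to spot."""
    by_site = {}
    for r in records:
        if r["procedure"] == procedure:
            by_site.setdefault(r["site"], []).append(r["dose"])
    return {site: mean(doses) for site, doses in by_site.items()}

print(mean_dose_by_site(records, "ASD closure"))
# -> {'Hospital A': 11.25, 'Hospital B': 27.5}
```

A registry-scale version would also stratify by patient weight and procedure complexity, but the principle — one metric, then compare — is the same.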
The analysis showed that some institutions were using much more radiation than others, and thanks to that knowledge, Dr. Forbes says, “We’ve seen a dramatic drop, both in complication rates as well as radiation dose by 50 percent. That’s a huge amount over the lifetime of a child who might have 13, 15, or 20 procedures. That’s the advantage of this database.”
The quality of data is also a major concern to big data experts. Fornell says, “You have the old saying: ‘garbage in/garbage out,’ so while we have all these advanced tools to manipulate the data and analyze it and try to interpret it, if we don’t have good quality data to begin with, everything else is going to be garbage. We need to make sure we’re measuring what we say we’re measuring, classifying appropriately, and truly getting accurate and reliable information before we do anything with it.”
The consistency, quality and availability of information continue to improve, and by 2018, government reform policies will require EHRs to fulfill “meaningful use requirements” that will make the advantages of big data more attainable. With careful use of this ever-increasing river of information, big data has the potential to identify new risk factors for heart disease, decipher the links between family history and cardiovascular events, find correlations between environmental factors and heart disease, increase quality of care around the world, assist physicians with making the best possible choices, and monitor the effectiveness of various treatments and procedures over time.