A few weeks ago Colin Hansen - a politician in the governing party in British Columbia (BC) - penned an op-ed in the Vancouver Sun entitled "Unlocking our data to save lives." It's a piece both the current government and the opposition should read, as it is filled with some very promising ideas.
In it, he notes that BC has one of the best collections of health data anywhere in the world and that data mining these records could yield patterns - like longitudinal adverse effects when drugs are combined, or correlations between diseases - that could save billions as well as improve health care outcomes.
He recommends that the province find ways to share this data with researchers and academics in ways that ensure the privacy of individuals is preserved. While I agree with the idea, one thing we've learned in the last five years is that, as good as academics are, the wider public is often much better at identifying patterns in large data sets. So I think we should think bolder. Much, much bolder.
Two years ago California-based Heritage Provider Network, a company that runs hospitals, launched a $3 million predictive health contest that will reward the team that, in three years, creates the algorithm that best predicts how many days a patient will spend in a hospital in the next year. Heritage believes that, armed with such an algorithm, they can create strategies to reach patients before emergencies occur and thus reduce the number of hospital stays. As they put it: "This will result in increasing the health of patients while decreasing the cost of care."
Of course, the algorithm that Heritage acquires through this contest will be proprietary. They will own it and they can choose whom to share it with. But a similar contest run by BC (or, say, the VA in the United States) could create a public asset. Why would we care if others made their healthcare systems more efficient, as long as we got to as well? We could create a public good, as opposed to Heritage's private asset. More importantly, we need not offer a prize of $3 million. Several contests with prizes of $10,000 would likely yield a number of exciting results. Thus for very little money we might help revolutionize BC's, and possibly Canada's and even the world's, healthcare systems. It is an exciting opportunity.
Of course, the big concern in all of this is privacy. The Globe and Mail featured an article in response to Hansen's op-ed (shockingly but unsurprisingly, it failed to link back to it - why do newspapers behave that way?) that focused heavily on the privacy concerns but was pretty vague about the details. At no point was a specific concern by the privacy commissioner raised or cited. For example, the article could have talked about the real concern in this space, what is called de-anonymization. This is when an analyst can take records - like health records - that have been anonymized to protect individuals' identities and use alternative sources to figure out whose records belong to whom. In the cases where this occurs it is usually only a handful of people whose records are identified, but even such limited de-anonymization is unacceptable.
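To make the idea of de-anonymization concrete, here is a minimal sketch of how such a linkage could work. Everything in it is an assumption for illustration: the records are made up, and the field names (`postal_prefix`, `birth_year`, `sex`) and the `link_records` helper are hypothetical, not drawn from any real data set or from a documented attack on BC or Heritage data.

```python
from collections import defaultdict

# Entirely made-up "anonymized" health records: names removed, but a few
# descriptive fields (quasi-identifiers) remain. Hypothetical field names.
anonymized_health = [
    {"postal_prefix": "V6B", "birth_year": 1961, "sex": "F", "diagnosis": "diabetes"},
    {"postal_prefix": "V6B", "birth_year": 1984, "sex": "M", "diagnosis": "asthma"},
    {"postal_prefix": "V5K", "birth_year": 1984, "sex": "M", "diagnosis": "hypertension"},
    {"postal_prefix": "V5K", "birth_year": 1984, "sex": "M", "diagnosis": "migraine"},
]

# An equally made-up "alternative source" that does contain names.
public_directory = [
    {"name": "Person A", "postal_prefix": "V6B", "birth_year": 1961, "sex": "F"},
    {"name": "Person B", "postal_prefix": "V5K", "birth_year": 1984, "sex": "M"},
]

QUASI_IDENTIFIERS = ("postal_prefix", "birth_year", "sex")


def key(record):
    """Return the combination of quasi-identifier values for one record."""
    return tuple(record[field] for field in QUASI_IDENTIFIERS)


def link_records(health, directory):
    """Match named people to health records that share a unique key."""
    by_key = defaultdict(list)
    for record in health:
        by_key[key(record)].append(record)

    matches = {}
    for person in directory:
        candidates = by_key.get(key(person), [])
        # Only a unique match re-identifies someone; ties stay ambiguous.
        if len(candidates) == 1:
            matches[person["name"]] = candidates[0]["diagnosis"]
    return matches


linked = link_records(anonymized_health, public_directory)
```

In this toy case only one of the two named people is re-identified, because the other shares their combination of attributes with a second record. That is the pattern described above: usually a handful of unlucky, unusually distinctive individuals, not the whole data set.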
As far as I can tell, no one has de-anonymized the Heritage Health Prize data. But we can take even more precautions. I recently connected with Rob James - a local epidemiologist who is excited about how opening up anonymized health care records could save lives and money. He shared with me an approach taken by the US Census Bureau which is even more radical than anonymization. As outlined in this (highly technical) research paper by Jennifer C. Huckett and Michael D. Larsen, the approach involves creating a parallel data set that has none of the features of the original but maintains all the relationships between the data points. Since it is the relationships, not the data, that are often important, a great deal of research can take place with much lower risks. As Rob points out, there is a reasonably mature academic literature on these types of privacy-protecting strategies.
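For readers who want a feel for what a "parallel data set" means, here is a deliberately simplified sketch. It is not the Huckett and Larsen method, which I have not reproduced here; it swaps in a much cruder technique that assumes two numeric columns and preserves only their means, spreads and linear correlation. The toy numbers (age, days in hospital) and the `synthesize` helper are made up for illustration.

```python
import math
import random


def mean(values):
    return sum(values) / len(values)


def std(values):
    """Population standard deviation."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def correlation(pairs):
    """Pearson correlation between the two columns of a list of pairs."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in pairs) / len(pairs)
    return cov / (std(xs) * std(ys))


def synthesize(pairs, n, seed=0):
    """Draw n brand-new pairs that share the originals' means, spreads and
    correlation, using correlated normal noise. No original row is copied."""
    rng = random.Random(seed)
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my, sx, sy = mean(xs), mean(ys), std(xs), std(ys)
    rho = correlation(pairs)
    residual = math.sqrt(max(0.0, 1.0 - rho ** 2))
    synthetic = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = mx + sx * z1
        y = my + sy * (rho * z1 + residual * z2)
        synthetic.append((x, y))
    return synthetic


# Made-up (age, days in hospital) pairs standing in for real records.
original = [(34, 1), (45, 2), (52, 2), (61, 4), (67, 5), (72, 7), (78, 6), (83, 9)]
synthetic = synthesize(original, 5000)
```

A researcher handed only `synthetic` could still measure how strongly age and hospital days move together, yet no row corresponds to a real person. Real synthetic-data methods have to preserve far richer relationships than a single correlation, which is where the academic literature Rob mentions comes in.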
The simple fact is, healthcare spending in Canada is on the rise. In many provinces it will eclipse 50% of all spending in the next few years. This path is unsustainable. Spending in the US is even worse. We need to get smarter and more efficient. Data mining is perhaps the most straightforward and accessible strategy at our disposal.
So the question is this: does BC want to be a leader in healthcare research and outcomes in an area the whole world is going to be interested in? The foundation - creating a high-value data set - is already in place. The unknown is whether we can foster a policy infrastructure and public mandate that allows us to think and act in big ways. It would be great if government officials, the privacy commissioner and some civil liberties representatives began a dialogue to find some common ground. The benefits to British Columbians - and potentially to a much wider population - could be enormous, both in money and, more importantly, in lives saved.