Sunday, 13 December 2015

In planning to restrict the NHS data available to NHS organisations, the HSCIC has jumped the shark

The HSCIC is planning to redact the activity information in HES (hospital episode statistics) for patients who have lodged a type 2 objection to data sharing, even when the data is shared inside the NHS. This policy is mad, will hurt the care given to those patients and will cause potentially huge chaos in other NHS bodies. It will also fail to achieve any increase in protection for patient confidentiality. If a clusterfuck had a child with an omnishambles we'd get a result like this.

A caveat before I describe what I think is proposed
My knowledge of what the HSCIC has proposed is based on a chain of Chinese whispers that may have garbled some of the details. So some of what I say below may not be exactly right. I actually hope I'm wrong as the implications of what I've heard are catastrophic for NHS planning, improvement and patient care.

What has been proposed
The HSCIC currently releases comprehensive information about hospital activity (the HES datasets) for use by the NHS to plan and manage the system and to third parties (including drug companies) for research purposes. Those releases are only supposed to happen when the purpose is valid and the users can handle the data securely.

They are, apparently, proposing to remove the records relating to a large number of patients from any releases after January. This redaction of data will apply even to other NHS bodies, who will now have incomplete data about what is happening in hospitals (GP data is irrelevant as it isn't collected centrally yet).

This proposal has profound and worrying consequences for the NHS. It is also entirely unnecessary. I'm going to try and explain how we got here, why it is so dangerous for the NHS and patient care, and what should have been and could be done in the future to avoid the problem.

How the HSCIC got into this mess
When care.data was first proposed as a way of joining up GP data with other NHS data about patients many people were very worried about their confidentiality. Eventually it was realised that some people would want to object to their data being used.

The NHS collects a lot of data about patient activity. Comprehensive information about every admission, outpatient appointment and A&E attendance is gathered nationally by the HSCIC and released to other NHS bodies in several forms for the purposes of running the NHS. HES (hospital episode statistics) is one of the most important datasets and is used very widely (though perhaps not used enough given how valuable it is for care and improvement). The NHS has also proposed joining up this data with data about activity in GP practices, which is currently not collected nationally. This is the highly controversial care.data programme.

The controversy over care.data raised a number of concerns about confidentiality and patient consent to the use of their data. The initial communication about the programme dealt with these issues very badly. As a result new proposals were developed to allow patients to record objections to their data being shared. A type 1 objection should stop any data leaving the GP. A type 2 objection was intended to stop any "secondary uses" of the data (I'll come back to what this means later). GPs started to record those objections a couple of years ago.

But since the data collection proposed under care.data kept being postponed nobody worried about those objections and the HSCIC did nothing with them. But the objections had implications beyond data from GPs. HES data on all patients has been widely shared for decades. The controversy highlighted patient concerns about the use of that data for non-NHS purposes such as medical research or drug development. The apparent intent of type 2 objections was to allow patients to stop such uses of their data.

But the HSCIC did nothing and continued to release the complete data to organisations who requested it for valid purposes and who could show they would handle it securely. Then the Information Commissioner's Office (ICO) ruled that doing nothing was not an option and that the HSCIC would have to honour the objections.

The HSCIC response to this appears to be to remove the information from all releases of HES data even to other NHS bodies.

What type 2 objections mean and why there is confusion about it
The catastrophe the NHS is facing arises from three interlinked factors:
  • What a type 2 objection means has been poorly defined and poorly communicated
  • Senior policymakers, campaigners and others have shown an incredibly poor appreciation of what the NHS does (or should be doing) with data
  • The HSCIC has shown a poor appreciation of the uses of its data and a chronic lack of pragmatism about ways to implement mechanisms to assure patient consent

The early results of pilots suggest that nobody understands what a type 2 objection actually means. The current explanation of what people are objecting to confuses patients, the GPs who have to explain it to them and almost everyone else who has to deal with the consequences. The language talks about the uses of data for "direct care" and about restricting uses for "secondary purposes". But these are incredibly poorly defined terms (I've discussed the consequences of this in a previous post). What many objected to in the original communication of the purposes of care.data was the use of their data by commercial firms (like drug developers) for research or the possible use of the data in insurance or by other branches of government.

Later pilots seem to have concluded that a simpler explanation of the objection (one that describes it as an objection to sharing with bodies outside the NHS) is clearer and easier to explain. Fiona Caldicott's review in January seems likely to conclude that this is how we should communicate to patients and how we should interpret the opt-out. This is a pragmatic idea which creates a clear guideline for where data can be shared. The HSCIC have to do something before that ruling is finalised, meaning they have to live with a fuzzy definition of what a type 2 objection actually means. They have, apparently, chosen the most restrictive and most damaging interpretation of it, one that prevents any data being used outside the HSCIC even in NHS bodies where the data supports direct care.

They may have been told to go with this interpretation by senior policy makers to send a strong signal that patient consent will be respected and to avoid the possibility of future legal challenges. But if the lawyers have recommended this as the best approach there is only one (Shakespearean) response: let's kill all the lawyers (Henry VI, Part 2, if you wanted to know the source). Even a data scientist can think of defensible pragmatic solutions other than this one that are arguable given the fuzziness of what the current type 2 objections mean (for example, implementing redaction only for data shared outside the NHS).

But the most astounding thing about the idea of preventing even NHS bodies from getting this information is that the senior people who have recommended it have spent no time at all considering the damaging consequences. As far as I can tell they didn't even think they should ask any of the affected bodies whether there would be any adverse consequences if the data they get is suddenly corrupted by the removal of data. This suggests to me a staggering lack of appreciation of what the NHS does with this data.

This isn't helped by the same lack of appreciation shown by the HSCIC for how the data they provide is used. And a further lack of appreciation for how to ensure confidentiality is protected (it's not that they don't protect information well, it's that they seem to have no capability to explain those protections to anyone). The CEO illustrated this earlier this year when he said in a discussion about future data protection that he didn't see why anyone needed patient-level data (for the uninitiated: some analysis is impossible without patient-level information and much other analysis is difficult or incredibly time consuming without it). They also show their lack of insight when monthly flows of information change the date format for fields like hospital admission date from an international standard to an illogical mess creating vast amounts of extra work for the poor analysts who have to sort it out to do any work.

This lack of insight means they are a shockingly poor advocate for their customers inside the NHS when policy changes are proposed. They simply don't have the insight to know why the policy changes will affect anyone.

The collective lack of insight about how data is used combines with the fuzzy definition of what type 2 objections mean to give a policy proposal that will corrupt essential key data widely used across the NHS. And, as far as I can tell, no senior policy maker even thinks it is a problem.

The situation is even worse than that. The small group of people who came up with the idea didn't even talk to the groups who were most likely to be affected by it. Some of those groups found out what was proposed by accident. Rumour has it that many of those groups are now desperately scrambling to find workarounds that will enable them to continue their essential work. These workarounds will be expensive, may not work and will undermine the purpose of the HSCIC policy in a way that might terminally damage the organisation's credibility.

Why corrupting the data is bad for the NHS
It should be obvious that, if the data you have is incomplete, the decisions you make with it will be wrong, perhaps very wrong. But if this simple observation were obvious the HSCIC proposal would not be happening and certainly would not be happening without any consultation with the groups most affected by it.

But it is happening, so perhaps some basic explanation is required.

Imagine, if you can, that you are running a major supermarket chain. Let's keep it simple by keeping the role of head office to distributing the money and restocking the shelves. You need to know everything that has been sold every day. Using this you can work out how many orders to place for new stock and how much money should be distributed to your stores and suppliers. Now imagine your data protection officer says you can only see data related to customers who have loyalty cards. It will be impossible to pay your stores the correct amount or to keep the shelves stocked with what your customers want. The consequences will be catastrophic for the business.

This is an extremely simple analogy but it captures the essence of why partial data is a problem.

Central bodies in the NHS do a whole range of things that require complete datasets. How do you plan local services when your data about who uses them is incomplete? How do you understand where a service could be improved when the data might omit the details of the service's most frequent users? How do you pay hospitals when you don't know how much activity they did? How do you work out the correct prices for that activity when the information about a significant chunk of the costs is missing?

What is disturbing is that the desire to remove data contradicts one of the goals of the patient opt-out. The original promise was that any opt out would not affect the care given to patients. This promise can't be kept if the people planning the service can't see all the data.

Without complete data, paying hospitals for what they do could be problematic for two different reasons. Hospitals are paid standard prices for most of their activity (for example, a standard cataract removal is £704 on the current tariff). The NHS has to know how many operations have happened to pay the hospitals the right amount. But it also has to calculate what that price is by a sophisticated process that maps all the costs associated with an operation onto the number of operations that a hospital does. Obviously, if the prices are wrong hospitals won't be able to cover their costs with the money they are paid. The prices are currently recalculated every year so the effect of errors in the price calculation won't show up immediately. But calculating prices with incomplete data can lead to potentially major errors.
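To make the two payment problems concrete, here is a rough back-of-envelope sketch in Python. Only the £704 tariff comes from the paragraph above. The number of operations, the number of redacted records and the hospital's total cost are invented purely for illustration, and the real price calculation is far more sophisticated than a simple division.

```python
# Back-of-envelope sketch: how redacted records distort payment and pricing.
# Only the tariff is taken from the post; every other figure is hypothetical.

TARIFF_GBP = 704  # standard cataract removal price quoted above


def payment(operations_counted: int, tariff: int = TARIFF_GBP) -> int:
    """Money paid to a hospital for the operations visible in the data."""
    return operations_counted * tariff


actual_ops = 1000     # hypothetical: operations the hospital really performed
redacted_ops = 30     # hypothetical: records removed because of objections
visible_ops = actual_ops - redacted_ops

# Problem 1: the hospital is paid only for the activity the payer can see.
shortfall = payment(actual_ops) - payment(visible_ops)

# Problem 2: a naive unit price (total cost / visible operations) drifts
# away from the true unit cost when the denominator is incomplete.
total_cost = 704_000  # hypothetical: total cost of all 1000 operations
true_unit_cost = total_cost / actual_ops
apparent_unit_cost = total_cost / visible_ops

print(f"Payment shortfall: £{shortfall}")
print(f"True unit cost: £{true_unit_cost:.2f}")
print(f"Apparent unit cost: £{apparent_unit_cost:.2f}")
```

Even with these made-up numbers the direction of the errors is clear: the hospital is underpaid today, and next year's price is calculated from the wrong denominator.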

Given that omitting data for some patients from the system that pays hospitals after January would be immediately apocalyptic for hospital finances, I presume that there will be some exemption. But, if there is, then the supposed protection of the data of patients expressing an opt-out will be undermined. We can't even be sure what will happen as none of the people who know how the payment system and price calculations work have even been consulted on the effects of removing some data from their data sources.

What should be done
In a world where the people writing policy knew what they were talking about this is what would happen.

The HSCIC would fulfill its ICO obligations by restricting the data provided to non-NHS bodies. So medical researchers and drug companies would only get the data about patients who have not expressed an objection to data sharing. The HSCIC would send a strong message that it was enforcing patients' decisions about sharing their data. NHS bodies would continue to get complete data at least until a clearer definition of a type 2 objection has been formulated. This should be clear enough that doctors and patients understand exactly who can and can't have their data and, if the rules are set so the NHS can't use the data internally, the patients understand the significant implications for their care of not sharing their data.
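The rule I'm suggesting is simple enough to sketch in a few lines of Python. This is only an illustration of the idea, not a description of how the HSCIC's systems work: the `Episode` record, its field names and the `release` function are all hypothetical.

```python
# Sketch of the suggested rule: redact objectors only for non-NHS recipients.
# The record layout and function names are hypothetical illustrations.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Episode:
    patient_id: str          # hypothetical pseudonymised identifier
    type2_objection: bool    # True if the patient has lodged a type 2 objection


def release(episodes: Iterable[Episode], recipient_is_nhs: bool) -> List[Episode]:
    """Return the episodes a recipient may receive under the suggested rule."""
    if recipient_is_nhs:
        # NHS bodies keep complete data so planning and payment still work.
        return list(episodes)
    # Researchers, drug companies and other outside bodies never see objectors.
    return [e for e in episodes if not e.type2_objection]


episodes = [
    Episode("A", type2_objection=False),
    Episode("B", type2_objection=True),
    Episode("C", type2_objection=False),
]

nhs_release = release(episodes, recipient_is_nhs=True)
external_release = release(episodes, recipient_is_nhs=False)
```

The point of the sketch is that the objection is honoured at the boundary of the NHS, which is where most of the patients who objected seem to have wanted it honoured.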

Then we should identify everyone involved in setting this apocalyptically dumb policy and either sack them or move them to jobs where they can't do any more damage.