Tuesday, 1 March 2016

The NHS isn't very good at driving operational improvement: the data it collects could help it get better

The NHS collects a large volume of administrative data. It could use that for driving operational improvement but mostly doesn't.

The central NHS collects patient-level data about what is happening in its hospitals. Since 2007 the major datasets have collected more than a billion records of admissions, outpatient appointments and A&E attendances. These datasets are collectively known as "administrative data" and are used for a variety of internal purposes including paying hospitals for the activity they do.

The primary reason why they are collected isn't operational improvement. Arguably it should be, though if it were we might collect the data differently (we might also disseminate it more speedily and collect additional things).

The controversial programme (which is an attempt to join up data collected by GPs with hospital data) was promoted as a way to enhance economic growth by exploiting the data for medical research, even though it is probably far more useful for driving improvement in the existing care offered by the system. But improvement is the neglected orphan child of NHS data collection and is barely mentioned in any of the arguments about that programme or any other NHS data collections. It should be the primary reason why we bother with this data, not least because making NHS care better is easier for patients to understand (and harder to object to) than, for example, helping the pharmaceutical industry make even more otiose margins.

Even though the big data collections are not optimised for supporting improvement, they are still useful. I'm going to illustrate this with a few examples from analysing the HES (Hospital Episode Statistics) A&E dataset. HES is one of the ways the central data is disseminated back to the system.

What we collect in A&E data and why it is relevant

Since 2007 the English NHS has collected a range of useful data about every A&E attendance (which includes attendance at minor injury units as well as attendance at major 24hr, full service A&E departments). It took several years before that collection achieved nearly complete coverage of all departments in England, but it has mostly been complete for the last 5 years.

The data contains basic information about each attendance such as what time the patient arrived and left A&E plus some other timestamps during their stay (eg when they were first seen by a professional, when treatment finished and when the patient departed the A&E). Basic demographics about the patient are recorded. Data about where the patient came from and where they went after the visit are also collected, as well as information about investigations, diagnoses and treatments (though these are often not collected reliably).

This is a rich source of data for identifying why A&E departments struggle to treat patients quickly, which is currently a major concern in many hospitals.

So here are a few examples of how the data can be used.

Local operational insights are available in the data

How well organised you are matters. If you have a grip on the operational detail you will be constantly identifying where things can be improved. One of the key tasks is to identify whereabouts in the process things are broken. We might identify a department that has a problem with one type of patient, or one time of day or one particular step in the process. If we know where the problem is, we can focus improvement effort in one place, which is much more effective than wasting effort on a wide range of interventions most of which will have no effect.

I'm going to show two ways the patient-level dataset can support such focus. I'm only going to show how to isolate performance issues with the type of patient and the time of the week. But I hope this illustrates how effective use of the data can support improvement.

One way to get a quick overview of how the complete process functions is to look at the histogram of patient waiting times (ie toting up how many patients wait different lengths of time before leaving A&E). In this case a useful way to do this is to use counts of waits in 15 minute blocks. A typical chart is shown below:

This plot summarises the experience of every patient (in this case over a whole year, but it works well for smaller numbers and time periods). It is common to see a peak in the waits in the 15 minute interval before the 4hr target time. This is a useful indicator of a last minute rush to meet the target (which is bad). But the other features are also useful indicators. We can see at a glance for example the total waits of >12hr (this is the last peak on the right of the chart). We can tell in this case that a lot of patients leave before they get to even 1.5hr (which is good).

Experience shows that we can diagnose many problems in A&E from the shape of this curve.
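For analysts who want to try this on their own data, here is a minimal sketch of how such a histogram could be built. It is not the code behind the charts in this post. The function names and the input (a plain list of total times in A&E, in minutes) are assumptions made for illustration, as is the choice to pool every wait of 12hr or more into one final bin.

```python
from collections import Counter


def wait_histogram(waits_minutes, bin_minutes=15, cap_minutes=12 * 60):
    """Count attendances per bin of total time spent in A&E.

    Keys are the start of each bin in minutes. Waits at or beyond
    cap_minutes are pooled into a single final bin keyed by cap_minutes
    (an assumption, chosen so very long waits show up as one peak).
    """
    counts = Counter()
    for wait in waits_minutes:
        if wait < 0:
            continue  # negative durations are data-quality errors; skip them
        capped = min(wait, cap_minutes)
        counts[int(capped // bin_minutes) * bin_minutes] += 1
    return dict(sorted(counts.items()))


def share_just_before_target(hist, target_minutes=240, bin_minutes=15):
    """Share of all attendances that fall in the last bin before the target."""
    total = sum(hist.values())
    if total == 0:
        return 0.0
    return hist.get(target_minutes - bin_minutes, 0) / total


# Hypothetical waits in minutes, invented for the example.
example_waits = [10, 20, 95, 230, 236, 239, 250, 800]
example_hist = wait_histogram(example_waits)
example_share = share_just_before_target(example_hist)
```

A large value from share_just_before_target is the numerical version of the peak just before 4hr on the chart, and the final pooled bin corresponds to the 12hr peak on the right.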

Some of those are easier to spot if we look at how different types of patient wait. The next chart shows the histogram broken down by 4 categories of patient: admitted patients, discharged patients, discharged patients with a referral and transferred patients (patients admitted to another hospital usually for specialist treatment).

We can instantly see that the shapes are different for different types of patient. And we can see that nearly half of all patients being admitted get admitted in the 15 minute interval before they have waited for 4hrs. Other patient types show a similar but much less strong peak just before 4hr.

This 4hr peak is a sign of bad things in the process. Are doctors making rushed last minute decisions to admit patients? Do they know earlier that the patient needs to be admitted but can't get access to a bed unless a breach of the target is about to occur? Neither of these is good for the patient. But knowing where the problem is is the first step in fixing it.

To show that not every trust is the same, here is the same analysis for a different (much better) trust. They still have a peak at 4hr for admitted patients. But it is only 15% of all admitted patients, not 50%: most admissions are spread over the 3hr period before 4hr, not the 15 minute period before 4hr. Other types of patient show only a tiny 4hr rush and the majority are dealt with well before they get close to a 4hr wait.
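The comparison between trusts can be reduced to one number per patient type: the share of that type's attendances that leave in the 15 minutes before 4hr. The sketch below is only an illustration. The function name, the group labels and the (group, wait in minutes) input format are my assumptions, not fields from the national dataset.

```python
from collections import defaultdict


def last_minute_share_by_group(records, target_minutes=240, bin_minutes=15):
    """For each patient group, the share of attendances leaving in the
    final bin before the target, i.e. target - bin <= wait < target.

    records: iterable of (patient_group, wait_minutes) pairs
    (a hypothetical format used only for this example).
    """
    totals = defaultdict(int)
    last_minute = defaultdict(int)
    lower = target_minutes - bin_minutes
    for group, wait in records:
        totals[group] += 1
        if lower <= wait < target_minutes:
            last_minute[group] += 1
    return {group: last_minute[group] / totals[group] for group in totals}


# Invented records: half of the admitted patients leave at the last minute.
example_records = [
    ("admitted", 230), ("admitted", 238), ("admitted", 120), ("admitted", 300),
    ("discharged", 60), ("discharged", 229), ("discharged", 45), ("discharged", 90),
    ("transferred", 100),
]
example_shares = last_minute_share_by_group(example_records)
```

On real data, a trust with a value near 0.5 for admitted patients and near zero for transferred patients would match the first pattern described above.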

Analysis of these patterns can tell us a lot about the underlying quality of the process for treating patients. One particular insight found in most trusts is an apparent problem with admitting patients quickly when they need to be admitted. The shape of the admitted patient curve often shows a last minute rush to admit just before 4hr. This isn't usually because sick patients need more care in A&E; it is often obvious from the moment they arrive that they will need a bed but free beds are often hard to find. The contrasting pattern for transferred patients is a strong confirmation of this idea. Transferred patients also need a bed, but are often transferred because they need a specialty unavailable in that hospital. Most hospitals achieve that transfer much more quickly than they achieve admission to their own beds. The clock stops when they leave the A&E and they leave faster than admitted patients and often in much less than 4hr. Finding a bed for them is another hospital's problem.

Admitted patients wait until the priority of not breaching the target triggers some action to free up beds. This is bad for patients, who wait longer, and staff, who could be treating other patients instead of searching for free beds.

The insight that the problem is associated with beds is well-known but often neglected in improvement initiatives (not least because it is not really an A&E problem and it is A&E who get the blame for the delays). But A&E departments don't control the flow through beds. Adding more A&E staff or facilities won't fix waits caused by poor bed flow. Nor will diverting patients to other services (you can only divert the minors, who are often treated quickly even in departments with bad problems with their beds).

These sorts of insights should be a crucial part of deciding what initiatives to focus improvement programmes on. But far too much effort is actually spent on non-problems that will have no impact. Sorting out flow in beds is a hard problem; but much harder if you don't even recognise that it is the most important problem.

We can also do other analyses that localise where in the process the problems occur. For example, some departments have problems at particular times of day or particular days of the week. If you know, for example, that some days are usually good and others are usually bad, you can ask what is different on the good days and, perhaps, find ways to improve the bad ones.

Here are some examples.

This shows the average performance for one trust on each day of the week:

There is no huge insight here except that performance at weekends is better than on weekdays. This might reveal some important issues with matching staffing to the volume of attendance or it could be caused by different admission practices at weekends.
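A day-of-week summary like this takes only a few lines to produce. The sketch below assumes that "performance" means the share of attendances whose total time in A&E is 4hr or less, and that each record is a hypothetical (arrival time, wait in minutes) pair. Both are assumptions for illustration, so adjust them to match the local definition.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def performance_by_weekday(records, target_minutes=240):
    """Share of attendances meeting the target, by day of arrival.

    records: iterable of (arrival_datetime, wait_minutes) pairs.
    Assumption: a wait of exactly target_minutes counts as meeting the target.
    """
    seen = defaultdict(int)
    within = defaultdict(int)
    for arrival, wait in records:
        day = WEEKDAYS[arrival.weekday()]  # weekday() returns 0 for Monday
        seen[day] += 1
        if wait <= target_minutes:
            within[day] += 1
    # Only report days that actually have attendances.
    return {day: within[day] / seen[day] for day in WEEKDAYS if seen[day]}


# Invented attendances: two on a Monday, one on a Saturday.
example_records = [
    (datetime(2016, 2, 29, 9, 0), 100),
    (datetime(2016, 2, 29, 10, 0), 300),
    (datetime(2016, 3, 5, 9, 0), 120),
]
example_performance = performance_by_weekday(example_records)
```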

But we can drill further into the data and get more detailed insights. Here is the volume and performance by hour of week for the same trust:

We can tell from this that although volume at the weekends is a little lower, performance is better and more consistent. We can also tell that performance falls off a cliff at 8am every weekday, but just for that hour, just when it starts to get busy. No such effect is seen at weekends.

We can drill deeper into the data and look at performance by different types of patient. The chart below is the same as the one above but we have broken out performance and volume by patient type.

In this chart we can see that the unusual performance collapse at 8am occurs only for the discharged patient group (normally considered to be the easiest to deal with). The most likely explanation for this is some major problem with shift handovers at that time in the morning. We can't prove this from the data but we can certainly trigger some careful local analysis to explore the cause. I'm guessing this has not happened, since the same pattern is seen over several years since the merger that created this trust. We also can't tell whether this problem is localised to one site (this trust runs several major A&E sites) because this trust doesn't report site-specific data nationally (unhelpfully, site-specific reporting is not mandatory). I know they have recently recruited a new data team so I hope they are addressing the problem now.
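The hour-of-week breakdown by patient type is the same calculation with a finer key. This is a sketch under stated assumptions rather than the method used for the charts: the (group, arrival time, wait in minutes) record format is hypothetical, hour 0 is taken to be midnight at the start of Monday, and a wait of exactly 4hr is counted as meeting the target.

```python
from collections import defaultdict
from datetime import datetime


def hour_of_week_table(records, target_minutes=240):
    """Volume and performance for each (patient_group, hour_of_week).

    hour_of_week runs from 0 (Monday 00:00-00:59) to 167 (Sunday 23:00-23:59).
    records: iterable of (patient_group, arrival_datetime, wait_minutes).
    Returns {(group, hour_of_week): (volume, share_meeting_target)}.
    """
    volume = defaultdict(int)
    within = defaultdict(int)
    for group, arrival, wait in records:
        key = (group, arrival.weekday() * 24 + arrival.hour)
        volume[key] += 1
        if wait <= target_minutes:
            within[key] += 1
    return {key: (n, within[key] / n) for key, n in volume.items()}


# Invented attendances around 8am on a Monday and a Saturday.
example_records = [
    ("discharged", datetime(2016, 2, 29, 8, 15), 250),
    ("discharged", datetime(2016, 2, 29, 8, 40), 90),
    ("discharged", datetime(2016, 3, 5, 8, 5), 60),
    ("admitted", datetime(2016, 2, 29, 8, 30), 239),
]
example_table = hour_of_week_table(example_records)
```

Sorting the result by performance within each group is a quick way to surface hours like the weekday 8am slot described above, though small volumes in a single hour need to be read with care.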

Just for reference here is the same plot for one of the top A&E performers.

Note that this trust achieves consistent and very good performance for all patient groups almost all the time.

This sort of analysis should be routine when trying to improve A&E performance

A large part of improving performance is knowing where to focus the improvement effort. I hope that these relatively simple examples show that there are plenty of simple analytical tools that can provide that focus. These tools should be available to any competent analyst. Trusts already have the data that feeds them and the national data is available to qualified analysts who want to benchmark their hospitals with others.

Unfortunately this is far from standard practice. Many trusts, even troubled ones being hounded to improve by their management or by external regulators, produce analysis that never seems to ask the important questions that would create some focus for improvement. No national body routinely produces tools to enable this sort of analysis even though the data has been available for years.

The NHS has a huge challenge ahead in driving up the rate at which it can improve. Many large national datasets exist that contain (like the A&E data here) major insights that can help to focus that improvement effort. It is critical that analytical skills are focussed on identifying where problems occur so we can spend improvement effort in the right place. Sadly too many competent analysts in the NHS spend all their time doing routine reports which contain no useful insights for driving improvement. Many of the bodies who could have access to this sort of data don't exploit it for improvement. And many of the national bodies who do have the data never do this sort of analysis. Most surprisingly, perhaps, even the hospitals who could use this data in real time (national bodies only get their data several months after the activity occurs) mostly don't, even the troubled ones who really need to improve.

This has to change or improvement will remain impossible.


  1. Having recently interviewed at a couple of poorly performing hospitals for an information analyst position, it appears that when hospitals are failing it's the non patient-facing areas that get their recruitment stopped. Only when staffing levels drop to the extent that mandatory reporting requirements start to be affected do funds become available to hire analysts. The net result appears to be that NHS organisations that would really benefit from the sort of data analysis that you describe are too understaffed to even begin a forensic scrutiny of their operational processes.

    1. I'm afraid you are right. Hospital management teams rarely prioritise having accurate information about what is happening inside their organisations. So when they need the information to improve they don't have it.

      For example, instead of insisting on having reliable tools/processes to manage their waiting lists, many simply did the minimum required to feed the beast of national mandatory reporting. More than a few have been caught out when audited or when problems arise, as their processes are ramshackle and unreliable. How else did GOSH "lose" 7,000 patients from their waiting list as was recently reported?

      Information is critical for good management and those who don't prioritise it are doomed to suffer the consequences.