The recent hearings of the General Medical Council and the public inquiry into the “Bristol affair” have fuelled several high profile reports in the media, based on legitimate concerns that cardiac surgery, with its potentially profound adverse outcomes, was devoid of effective quality assurance. Two papers published in this issue of Heart take us one step further in the quest for quality management in cardiac surgery. The first paper from Wynne-Jones and colleagues compares activity and outcomes from four cardiac surgical centres in the north west of England and highlights a number of important issues related to reliable data collection, validation, and risk stratification.1 The second paper, by Sherlaw-Johnson and colleagues, simplifies the complex issue of presentation and display of outcome data by adding easily understandable limits to risk adjusted outcome graphs.2
The first report describes the highly successful North West Cardiac Surgical Database initiative and demonstrates that where the professional will exists, reliable and meaningful data collection is possible using a variety of data capture models tailored to suit individual institutions. But it is also clear that whichever model is used, support is required from dedicated non-clinical staff to ensure completeness of data collection. One of the key strengths of this initiative has been the rigorous data validation employed across all four centres, and it is therefore pleasing to note that the mortalities quoted for the common adult cardiac surgical procedures are in line with those reported to the UK Cardiac Surgical Register, which has been collecting unvalidated activity and mortality data from all National Health Service cardiac surgical units since 1977.
A natural progression from simple activity and mortality data is the collection of a number of variables on each and every patient undergoing surgery to enable stratification for case mix. To this end, the four units have based their dataset on the agreed national dataset of the Society of Cardiothoracic Surgeons, which in turn was based on the US Society of Thoracic Surgeons national database dataset and definitions.3 They have deviated from only one definition among the preoperative variables and their adherence to the agreed national dataset and definitions strengthens the UK national initiative. However, even limited deviation raises interesting issues.
No set of definitions will ever be perfect and widespread agreement may require some degree of local compromise in order to ensure that data can be compared between institutions or even internationally. Constructive collaboration of this nature has already been exercised in the UK between the Society of Cardiothoracic Surgeons, the Association of Cardiothoracic Anaesthetists, and the British Cardiovascular Intervention Society to produce a harmonised dataset for the Central Cardiac Audit Database (CCAD).4 Similarly, a current initiative to standardise European and North American adult cardiac surgical datasets and definitions is due to report soon. But agreeing definitions is an iterative process and without the sort of rigorous analysis used to unravel the apparently different incidences of chronic obstructive airways disease between the participating units, we run the risk of misunderstanding the variables which underpin our risk stratification models. We must therefore be prepared to review and modify existing definitions in concert at appropriate and defined intervals. In time this should improve the quality of contemporary regional and national data, but the price may be a reduction in the value of historical data.
Mortality outcome data
The authors have limited their outcome data to mortality only.1 Reliable postoperative data are the most difficult data to collect and this strategy was wise in the first instance. However, as our appreciation of quality issues and the requirement for evidence based budgeting advances, the value of more comprehensive outcome data to understand “near misses” and to relate preoperative variables to morbidity and resource consumption will become more evident. Furthermore, from the perspective of a patient and his or her family, an accurate explanation of anticipated postoperative morbidity and complications forms an essential part of a courteous and respectful consultation leading to genuinely informed consent.
That the Parsonnet score5 is not an accurate predictor of mortality in the north west of England is not surprising. This simple additive scoring system was developed in the USA in the 1980s. Times have changed and the early version, as used by Wynne-Jones, has been largely abandoned in North America for exactly the reasons outlined in their report.1 Indeed, the limited applicability of this system to European practice in the 1990s prompted the development of the EuroSCORE system based on similar methodology.6 No scoring system will ever be completely predictive of outcome, particularly in high risk patients, for three reasons. Firstly, we do not yet fully understand the basis of the pathophysiological response to surgery or factors influencing an individual patient's physiological reserve. Secondly, some of the major risk factors are not easily quantifiable or definable and are therefore omitted from most scoring systems. A typical example would be the state of the coronary arteries. To quote Parsonnet, “What may be identified as severe and diffuse disease by one surgeon may be considered relatively routine and non-intimidating by another”.4 Thirdly, some high risk patients may be difficult to characterise and the statistical denominators are relatively small.
Nevertheless, both the Parsonnet and EuroSCORE models provide a useful yardstick when examining mortality in groups of patients. In general, current UK practice results in a mortality of around half of that predicted by the Parsonnet score, but this will decrease with time—in part because practice is improving and in part because the weighting of the preoperative risk variables changes with time. Indeed, the essence of responsible surgical audit is to understand and attack the most influential risk factors in order to reduce their impact. As a result there is accumulating evidence that the influence of previously important risk factors is being reduced towards the mean. Thus, effective risk modelling must be an iterative process; it is this that has prompted the authors to explore an alternative, more contemporary and locally appropriate scoring system.
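The comparison of observed mortality with that predicted by a risk score can be expressed as a simple ratio. The following is a minimal sketch only: the function name and the patient data are hypothetical and are not drawn from either paper. A ratio of about 0.5 corresponds to a mortality of around half of that predicted.

```python
def observed_to_expected(predicted_risks, died):
    """Ratio of observed deaths to the deaths a risk model predicts.

    predicted_risks: per-patient predicted probability of death (0 to 1).
    died: per-patient outcome, True if the patient died.
    """
    expected = sum(predicted_risks)           # deaths the model predicts
    observed = sum(1 for d in died if d)      # deaths that occurred
    if expected == 0:
        raise ValueError("expected deaths must be greater than zero")
    return observed / expected


# Hypothetical group: 50 patients, each with a 4% predicted risk,
# so the model predicts about 2 deaths; 1 death is observed.
risks = [0.04] * 50
outcomes = [True] + [False] * 49
ratio = observed_to_expected(risks, outcomes)
```

Such a ratio is only meaningful for groups of patients, and it says nothing about whether the score itself remains well calibrated over time.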
Sherlaw-Johnson and colleagues use another locally developed model to track mortality trends following cardiac surgery performed by a single surgeon on nearly 400 patients.2 The system works somewhat like a bank balance. The operator starts with a zero balance—that is, no operations and no deaths. But with each operation the surgeon is either debited or credited according to the predicted risk of the operation, and at various times the net balance will be either credit, debit or zero. An overall credit implies that more patients have survived than would have been expected and zero balance is exactly what the risk model predicts.7 As with most accounts, fluctuations between credit and debit are commonplace and reflect the nature of normal surgical practice. Until now, however, there has been no way of easily knowing when the operator is drifting into the twilight zone of borderline results. Sherlaw-Johnson's article presents us with an objective, statistically based, and easy visual aid for comparing the individual surgeon's variability against that which might be expected for his case mix. This represents a significant step forwards for two reasons. Firstly, this is an extremely difficult issue mathematically, but the authors have used an innovative approach based on pooled data to overcome the difficulties of repeated significance testing. Secondly, the ability to quantify variability in performance will allow us to use variability as a barometer of performance rather than some other arbitrary, rigid value.
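The bank balance analogy can be sketched as a running total. The following is a minimal illustration under a stated assumption: a survivor credits the account by the predicted risk of the operation, and a death debits it by one minus that risk. This convention is assumed here for illustration and is not quoted from the paper; the function name and the data are hypothetical, and the sketch does not attempt to reproduce the authors' statistical limits.

```python
from itertools import accumulate


def net_balance(predicted_risks, died):
    """Running balance after each operation.

    Assumed convention: a survivor adds the predicted risk (credit);
    a death subtracts one minus the predicted risk (debit).
    """
    steps = [
        -(1.0 - risk) if dead else risk
        for risk, dead in zip(predicted_risks, died)
    ]
    return list(accumulate(steps))


# Hypothetical predicted risks and outcomes (True = patient died).
risks = [0.02, 0.10, 0.05, 0.30]
outcomes = [False, False, True, False]
balance = net_balance(risks, outcomes)
```

Under this convention, the balance stays close to zero when outcomes match the risk model, rises when more patients survive than predicted, and falls sharply after the death of a low risk patient.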
Both articles allude to locally developed risk models. We should exercise caution in the construction and interpretation of local risk models and we must understand the balance between an accurate and a useful risk model. An accurate risk model accurately predicts an outcome. At one extreme an accurate local risk model could be constructed using data from a single individual or unit with poor results, reliably predict those results, and then be used to lull the participants into a false sense of security. Although accurate and helpful for predicting local outcome, such a model would have limited value for comparative purposes. A genuinely useful risk model is one that reliably predicts an outcome but is based on, and applicable to, a wider constituency, thereby facilitating reasonable and meaningful comparisons between individuals or units. Local risk modelling has value in that it caters for immeasurable local influences which may not pertain elsewhere. Thus a hierarchy of risk models at different levels, based on single units, several units in a region, along with a nationally based model, will allow accurate local analysis and comparisons while also providing the substrate for dissecting and understanding local variations in the process of surgical care. But again, to be of value, all models should rely on the same definitions for their variables.
The definition of quality in surgery is difficult and was first considered by Florence Nightingale in the UK and Ernst Codman in the USA, but both were ahead of their time and were to some extent ostracised for their efforts which were perceived by some as threatening.8 The issue of how to measure quality remains perplexing, but it is an issue which we must grasp if we are to retain the respect and confidence of those we seek to treat. The concept of quality in cardiac surgery should encompass the whole hospital journey, from the time the patient walks through the front door for his or her preliminary assessment to the time of discharge from the postsurgical outpatient clinic and beyond. Individual surgical performance constitutes only a small, albeit important, part of this process.
Surgeon specific mortality data collected by the Society of Cardiothoracic Surgeons clearly indicate that surgeons in the same unit tend to have similar mortalities, highlighting the importance of additional local influences on surgical outcomes. Such influences may include the socioeconomic status of the catchment area, severity of cardiac illness, prevalence of co-morbidities, threshold of referral from both the general practitioner and the cardiologist, threshold of acceptance by the surgeons, standards of anaesthesia, surgery and intensive care, adequacy of facilities and staffing levels, attitude to training, interpersonal relationships between staff, and architectural dispersion within the unit. Any one of these can influence surgical outcome, and it is clear that hospital mortality is not necessarily a measure of overall quality but is simply a rough guide to the success of the surgical episode. An appreciation of quality begins when the patient has recovered from the musculoskeletal insult of surgery and the patient and his or her family ask, “was it all worth it?”. If the answer is “yes”, then added value can be measured by how long the answer remains “yes”. A favourable answer to these questions depends as much, if not more, on appropriate referral and acceptance as it does on the technical quality of surgery and postoperative management.