Original article
Predicting the adverse risk of statin treatment: an independent and external validation of Qstatin risk scores in the UK
  1. Gary S Collins,
  2. Douglas G Altman
  1. Centre for Statistics in Medicine, Wolfson College Annexe, University of Oxford, Oxford, UK
  1. Correspondence to Dr Gary Collins, Centre for Statistics in Medicine, Wolfson College Annexe, University of Oxford, Linton Road, Oxford OX2 6UD, UK; gary.collins{at}csm.ox.ac.uk

Abstract

Objective To evaluate the performance of the QStatin scores for predicting the 5-year risk of developing acute renal failure, cataract, liver dysfunction and myopathy in men and women in England and Wales receiving statins.

Design Prospective cohort study to evaluate the performance of four statin risk prediction models.

Setting 364 practices in the UK contributing to The Health Improvement Network database.

Participants 2.2 million patients aged 35–84 years registered with a general practice surgery between 1 January 2002 and 30 June 2008, with 2037 incident cases of acute renal failure, 25 692 incident cataract cases, 14 756 cases of liver dysfunction and 1209 incident cases of myopathy.

Main outcome measures First recorded occurrence of acute renal failure, cataract, moderate or severe liver dysfunction and moderate or severe myopathic events as recorded in general practice records.

Results Results from this independent and external validation of QStatin scores indicate that models predicting the 5-year statin risk of developing acute renal failure, cataracts and myopathy perform well with areas under the receiver operating characteristic curve ranging from 0.73 to 0.87. Calibration plots for the three models also indicated close agreement between observed and predicted risks. Poor performance was observed for the model predicting the 5-year statin risk of developing liver dysfunction with areas under the receiver operating characteristic curve of 0.64 and 0.60 for women and men, respectively.

Conclusions QStatin scores for predicting the 5-year statin risk of developing acute renal failure, cataract and myopathy appear to be useful models with good discriminative and calibration properties. The model for predicting the 5-year statin risk of developing liver dysfunction appears to have limited ability to identify high-risk individuals and the authors caution against its use.

Introduction

Cardiovascular disease is a major health burden accounting for nearly one-third of all deaths around the world in 2004 and is the leading cause of premature death in the UK. Thus, early identification of individuals at an increased risk of developing cardiovascular disease is an important challenge. In the UK, tools such as QRISK2 or the Framingham equation are used to identify high-risk patients who could benefit from lifestyle changes (eg, smoking cessation, weight control) or lipid modification therapy using statins.1 While there are clear benefits of statins for patients at high risk of cardiovascular disease, a number of studies have observed negative effects on a range of clinical outcomes, including acute renal failure,2 cataract,3 liver dysfunction4 and moderate or severe myopathic events.5 Such concerns have led to the development of four risk scores to identify those who are at greatest risk of adverse events of statins (acute renal failure, cataract, liver dysfunction and moderate or severe myopathic events) to individualise risk estimation for patients and provide more information during the general practice consultation.6

The four risk scores (QStatin scores) were developed and validated on a large cohort of patients (3 million) from the QRESEARCH (http://www.qresearch.org) database; two-thirds of the cohort was randomly allocated for model development and one-third to model validation. The QRESEARCH database is a large database comprising over 12 million anonymised health records from 557 practices throughout the UK using the EMIS computer system (used in 59% of general practices in England). QStatin scores were developed on 2 million patients aged between 35 and 84 years with 1969 incident cases of acute renal failure, 36 541 incident cases of cataract, 15 020 incident cases of moderate/serious liver dysfunction and 1406 incident cases of moderate/serious myopathy between 1 January 2002 and 31 December 2008. The models were derived using a Cox proportional hazards model using fractional polynomials to model non-linear risk relationships with continuous predictors. Multiple imputation was used to replace missing values for key risk predictors (body mass index and smoking status) to reduce the biases that can occur when omitting individuals with incomplete data. The risk factors included in the final prediction models are described in table 1 and open source code to calculate the QStatin scores is available from http://www.qintervention.org released under the GNU Lesser General Public Licence, Version 3.

Table 1

Summary of risk factors in QStatin risk scores

The original published article that described the development of the models to predict the adverse risk of statin usage included some initial results of an evaluation using The Health Improvement Network (THIN) cohort. However, the results of the external validation were limited and incomplete, with many key details of the validation omitted (probably due to space restrictions), leaving readers with insufficient information to objectively evaluate the generalisability of the QStatin risk scores. In this article we describe the results of a new, detailed independent evaluation assessing the performance of QStatin risk scores on a large cohort of general practice patients in the UK using the THIN database.

Methods

Cohort selection

Study participants were patients registered between 1 January 2002 and 30 June 2008 and recorded on the THIN database (http://www.thin-uk.com) and who were event-free at 1 January 2002. Entry to the cohort was the latest of study start date (1 January 2002), 12 months after the patients registered with the practice or date of first statin prescription. Observation time was calculated from the entry date to an exit date, which was defined as the earliest date of recorded incidence of acute renal failure, cataract, liver dysfunction or myopathy, date of death, date of deregistration with the practice, date of last upload of computerised data or the study end date (30 June 2008). Patients were excluded if they had missing Townsend scores (social deprivation), had been prescribed statins before the study start date, had invalid dates, were aged <30 years or were aged ≥85 years.
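The entry and exit rules above can be illustrated with a minimal sketch. The function names, the day-clamping shortcut in `add_months` and the handling of patients without a statin prescription (the statin date is simply ignored) are assumptions made for illustration, not details taken from the study.

```python
from datetime import date
from typing import Optional

STUDY_START = date(2002, 1, 1)
STUDY_END = date(2008, 6, 30)


def add_months(d: date, months: int) -> date:
    """Shift a date forward by whole months; the day is capped at 28 to avoid invalid dates."""
    total = d.month - 1 + months
    return date(d.year + total // 12, total % 12 + 1, min(d.day, 28))


def entry_date(registered: date, first_statin: Optional[date] = None) -> date:
    """Latest of study start, 12 months after registration and first statin prescription."""
    candidates = [STUDY_START, add_months(registered, 12)]
    if first_statin is not None:  # assumption: non-users have no statin date to consider
        candidates.append(first_statin)
    return max(candidates)


def exit_date(event: Optional[date], death: Optional[date],
              deregistered: Optional[date], last_upload: Optional[date]) -> date:
    """Earliest of outcome event, death, deregistration, last data upload and study end."""
    candidates = [d for d in (event, death, deregistered, last_upload) if d is not None]
    candidates.append(STUDY_END)
    return min(candidates)


def observation_years(entry: date, exit_: date) -> float:
    """Observation time in years between entry and exit."""
    return (exit_ - entry).days / 365.25
```

For example, a patient registered on 15 March 2001 who never receives a statin would enter on 15 March 2002, and would exit at the study end if no earlier event, death, deregistration or final upload is recorded.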

Outcome measures

All outcome measures were defined as for the original development study.6 Acute renal failure was recorded during follow-up and obtained from Read codes. Moderate or serious liver dysfunction was defined as alanine transaminase >120 IU/l among patients without diagnosed chronic liver disease at baseline. Moderate or serious myopathic events were defined as a diagnosis of myopathy or rhabdomyolysis or raised creatine kinase of ≥4 times the upper limit of normal (>560 in women and >696 in men). Cataract was recorded during follow-up and obtained from Read codes.
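As a rough illustration of the two laboratory-based definitions, the thresholds quoted above can be written as simple checks. The function and constant names are hypothetical, the creatine kinase values are used in whatever units the quoted thresholds refer to (the text does not state them), and patients with chronic liver disease at baseline are simply returned as non-events here.

```python
from typing import Optional

# Thresholds quoted in the text; the creatine kinase units are not stated there.
ALT_THRESHOLD_IU_PER_L = 120
CK_THRESHOLD = {"female": 560, "male": 696}


def liver_dysfunction_event(alt_iu_per_l: float,
                            chronic_liver_disease_at_baseline: bool) -> bool:
    """Alanine transaminase above 120 IU/l in a patient without baseline chronic liver disease."""
    if chronic_liver_disease_at_baseline:
        return False  # simplification: such patients fall outside the outcome definition
    return alt_iu_per_l > ALT_THRESHOLD_IU_PER_L


def myopathic_event(sex: str,
                    creatine_kinase: Optional[float] = None,
                    diagnosis_recorded: bool = False) -> bool:
    """Recorded myopathy or rhabdomyolysis, or creatine kinase above the sex-specific threshold."""
    if diagnosis_recorded:
        return True
    if creatine_kinase is None:
        return False
    return creatine_kinase > CK_THRESHOLD[sex]
```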

Statistical analysis

Smoking status was derived from combining two risk factors: (1) whether the patient was a non-smoker, ex-smoker or current smoker and (2) the number of cigarettes smoked, defined as light (<10 cigarettes/day), moderate (10–19 cigarettes/day) or heavy (≥20 cigarettes/day).
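A minimal sketch of how the two smoking variables might be combined into a single category is shown below. The category labels and the choice to return `None` when the amount smoked is unknown (so that the value can later be imputed) are assumptions for illustration.

```python
from typing import Optional


def smoking_category(status: str, cigarettes_per_day: Optional[int] = None) -> Optional[str]:
    """Combine smoking status and daily cigarette count into one category."""
    if status in ("non-smoker", "ex-smoker"):
        return status
    if status != "current smoker":
        raise ValueError(f"unknown smoking status: {status!r}")
    if cigarettes_per_day is None:
        return None  # amount unknown; left missing for imputation
    if cigarettes_per_day < 10:
        return "light smoker"      # <10 cigarettes/day
    if cigarettes_per_day < 20:
        return "moderate smoker"   # 10-19 cigarettes/day
    return "heavy smoker"          # >=20 cigarettes/day
```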

For every patient in the THIN cohort the 5-year estimated risk of acute renal failure, cataract, liver dysfunction and moderate or serious myopathy in relation to statin use was calculated using QStatin risk scores (http://www.qintervention.org). Observed 5-year statin risks were obtained using the method of Kaplan–Meier. Multiple imputation using all predictors plus the outcome variable was used to replace missing values for smoking status and body mass index. This involves creating multiple copies of the data and imputing the missing values with sensible values randomly selected from their predicted distribution. Ten imputed data sets were generated and results from analyses on each of the imputed data sets were combined using Rubin's rules to produce estimates and CIs that incorporate the uncertainty of imputed values.7 ,8
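To make the pooling step concrete, the sketch below applies Rubin's rules to a point estimate and its variance obtained from each imputed data set. The function name and the example numbers are hypothetical, and the sketch returns only the pooled estimate and its total variance; it does not reproduce the analysis code used in the study.

```python
from statistics import mean, variance
from typing import Sequence, Tuple


def pool_rubin(estimates: Sequence[float], variances: Sequence[float]) -> Tuple[float, float]:
    """Pool per-imputation estimates and variances using Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    pooled = mean(estimates)          # average of the m estimates
    within = mean(variances)          # average within-imputation variance
    between = variance(estimates)     # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return pooled, total


# Hypothetical estimates from three imputed data sets
estimate, total_variance = pool_rubin([0.80, 0.82, 0.84], [0.0004, 0.0004, 0.0004])
standard_error = total_variance ** 0.5
```

The `(1 + 1 / m)` factor inflates the between-imputation variance to reflect that only a finite number of imputations was drawn, which is how the pooled CIs come to incorporate the uncertainty of the imputed values.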

Predictive performance of the QStatin risk scores on the THIN cohort was assessed by examining measures of calibration and discrimination. Calibration refers to how closely the predicted 5-year statin risks agree with the observed 5-year statin risks. This was assessed for each tenth of predicted risk, ensuring 10 equally sized groups, and each 5-year age band by calculating the ratio of predicted to observed statin risk, separately for men and for women. Calibration of the risk score predictions was assessed by plotting observed proportions versus predicted probabilities by tenth of risk and 5-year age band.
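The calibration assessment by tenth of risk can be sketched as follows: patients are sorted by predicted risk, split into equally sized groups, and within each group the mean predicted risk is compared with the Kaplan–Meier estimate of the observed 5-year risk. This is a plain-Python illustration with hypothetical function names, written for clarity rather than speed, and is not the study's own code.

```python
from typing import List, Sequence, Tuple


def kaplan_meier_risk(times: Sequence[float], events: Sequence[int],
                      horizon: float = 5.0) -> float:
    """Observed risk by `horizon`, computed as 1 minus the Kaplan-Meier survival estimate."""
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if e and u == t)
        survival *= 1 - failed / at_risk
    return 1 - survival


def calibration_by_group(predicted: Sequence[float], times: Sequence[float],
                         events: Sequence[int], horizon: float = 5.0,
                         groups: int = 10) -> List[Tuple[int, float, float, float]]:
    """Return (group, mean predicted risk, observed risk, predicted/observed ratio) per group."""
    n = len(predicted)
    if n < groups:
        raise ValueError("need at least one patient per group")
    order = sorted(range(n), key=lambda i: predicted[i])
    rows = []
    for g in range(groups):
        idx = order[g * n // groups:(g + 1) * n // groups]
        mean_predicted = sum(predicted[i] for i in idx) / len(idx)
        observed = kaplan_meier_risk([times[i] for i in idx],
                                     [events[i] for i in idx], horizon)
        ratio = mean_predicted / observed if observed > 0 else float("nan")
        rows.append((g + 1, mean_predicted, observed, ratio))
    return rows
```

A ratio above 1 in a group indicates overprediction and a ratio below 1 indicates underprediction; plotting the observed against the mean predicted risk for each group gives a calibration plot of the kind described above.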

Discrimination is the ability of the risk score to differentiate between patients who experience an event during the study period and those who do not. This measure is quantified by calculating the area under the receiver operating characteristic curve (AUROC) statistic; a value of 0.5 represents chance and 1 represents perfect discrimination. We also calculated the D statistic9 and R² statistic10 which are measures of discrimination and explained variation, respectively, and are tailored towards censored survival data. The larger the D statistic, the greater the degree of separation in the prediction model. R² captures how much variation in the outcome is accounted for through the prediction model.
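The two ideas in this paragraph can be illustrated with a short sketch. The first function computes the AUROC as the proportion of case and non-case pairs in which the case has the higher predicted risk (ties count as one half); it treats the outcome as a simple binary indicator and ignores censoring, which is a simplification. The second converts a D statistic into an explained-variation measure using the relation commonly given for proportional hazards models, R² = (D²/κ²)/(π²/6 + D²/κ²) with κ² = 8/π; that this exact form matches the R² reported here is an assumption. Function names are hypothetical.

```python
import math
from typing import Sequence


def auroc(scores: Sequence[float], events: Sequence[int]) -> float:
    """Probability that a randomly chosen case scores higher than a randomly chosen non-case."""
    cases = [s for s, e in zip(scores, events) if e]
    controls = [s for s, e in zip(scores, events) if not e]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def r_squared_from_d(d: float) -> float:
    """Explained variation implied by a D statistic under a proportional hazards model."""
    kappa_squared = 8 / math.pi
    sigma_squared = math.pi ** 2 / 6
    scaled = d * d / kappa_squared
    return scaled / (sigma_squared + scaled)
```

Under this relation a D statistic of about 2.5 corresponds to an R² of roughly 60%, and a D statistic below 1 to an R² under 20%, which shows why a larger D goes hand in hand with more explained variation.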

All statistical analyses were carried out in R (Version 2.13.1)11 and the ICE procedure in Stata (Version 11.2).

Results

Between 1 January 2002 and 30 June 2008, 2 319 181 patients aged between 30 and 84 years from 364 general practices in the UK were registered in the THIN database. We excluded patients who were past users of statins, current users of statins, patients who experienced the event before the study start date of 1 January 2002 and, for the acute renal failure, cataract and liver dysfunction scores, patients who had missing Townsend deprivation scores. This left 2 009 743, 1 976 068, 2 005 792 and 2 205 143 patients with 2037, 25 692, 14 756 and 1268 incident cases available for analysis of the acute renal failure, cataract, liver dysfunction and myopathy risk prediction models, respectively, from an eligible cohort of 2 205 613 patients. There were 129 053 women who were new statin users, with a median follow-up ranging from 2.55 years (cataract) to 2.64 years (acute renal failure), compared with 5.26 years (myopathy) to 5.67 years (acute renal failure) for non-statin users. Similarly, there were 152 929 men who were new statin users, with a median follow-up ranging from 2.49 years (cataract) to 2.55 years (acute renal failure), compared with 5 years (myopathy) to 5.25 years (acute renal failure, cataract, liver dysfunction) for non-statin users. Table 2 details the characteristics of the patients in the THIN cohort.

Table 2

Characteristics of participants aged 30–84 years in the THIN validation cohort

Complete data on smoking status, smoking category and body mass index were available for between 70.73% (liver dysfunction) and 100% (myopathy) of women and between 63.66% (cataract) and 72.50% (myopathy) of men. Most patients had no or only one missing risk factor (table 3). In the entire eligible cohort (n=2 205 613), body mass index was missing for 18.48% of women and 27.49% of men. Smoking status was missing for 12.14% of men and 10.47% of women, and number of cigarettes for 4.94% and 10.27%, respectively.

Table 3

Completeness of data

Discrimination and calibration

Table 4 presents the performance data for each of the four risk prediction models. The R² statistic (percentage of explained variation) for the acute renal failure and cataract models is around 60% for both women and men and slightly lower for the myopathy model at 42% and 38% for women and men, respectively. However, for the liver dysfunction model, the percentage of explained variation is much lower at 15% and 11% for women and men, respectively. Likewise, the D statistics for the acute renal failure and cataract models were similar to each other, slightly lower for the myopathy model and much lower for the liver dysfunction model. Values for the area under the receiver operating characteristic curve were again high for both the acute renal failure and cataract models (between 0.84 and 0.87) in both men and women, but much lower for the liver dysfunction model (0.64 and 0.60 for women and men, respectively). There were no discernible differences between the performance on the multiple imputed datasets and that obtained when restricting analysis to those with completely recorded information on all risk factors. There were also no apparent differences in the performance of the QStatin scores between our results on the THIN cohort and those reported in the internal validation of the scores on the QRESEARCH database.

Table 4

Performance data of QStatin risk scores

Figures 1 and 2 show the calibration curves for the four risk scores for men and women separately by tenth of risk. There was very good agreement between observed and predicted risks for the acute renal failure model in both men and women across all tenths of risk and across all ages (figure 3). There was a small but consistent overprediction for the cataract risk models in both women and men across all tenths of risk, with the overprediction more evident with increasing age (figure 3). The liver dysfunction model is well calibrated for women but appears poorly calibrated for men. Finally, the model for myopathy is reasonably well calibrated but exhibits a small degree of overprediction when examined by age group (figure 3).

Figure 1

Calibration curves for the four risk scores for women by tenth of risk.

Figure 2

Calibration curves for the four risk scores for men by tenth of risk.

Figure 3

Observed (black) and predicted (grey) 5-year statin risk by age group and sex.

Discussion

QStatin scores are four new risk scores to predict the 5-year risk of developing acute renal failure, cataracts, liver dysfunction and myopathy after commencing treatment with a statin. The models were produced and internally validated on the large primary care electronic QRESEARCH database, which comprised 3 million patients registered between 1 January 2002 and 30 June 2008 using the Egton Medical Information Systems (EMIS) computer system. QStatin scores were designed to be based on risk factors that are readily available and recorded in patients' health records, or which patients themselves are likely to know during their consultation with their GP (see table 1).

To date, the development, internal validation and our external validation of the QStatin scores have used 5.2 million patients contributing 5335 incident cases of acute renal failure, 85 674 incident cases of cataract, 38 322 incident cases of liver dysfunction and 3424 incident cases of myopathy during the observation periods to develop and evaluate a risk score to predict the 5-year statin risk of developing acute renal failure, cataract, liver dysfunction and myopathy in adults aged 30–84 years.

Used in conjunction with QRISK2, GPs have readily available tools to calculate the 10-year risk of developing cardiovascular disease and the 5-year risk of developing acute renal failure, cataracts, liver dysfunction and myopathy as side effects of taking statins (http://qintervention.org). Used during the consultation process, patients at increased risk of developing cardiovascular disease will be able to discuss with their GP the balance between risks and benefits of starting statin treatment. In addition, those with an increased risk of developing adverse side effects of statins could be monitored more closely.

Strengths and weaknesses

The key strengths of this study include its size, length of follow-up and representativeness, as smaller studies may be unable to provide sufficient numbers of events to enable detailed evaluation of the QStatin risk scores. Limitations include missing data, although multiple imputation was carried out to minimise potential biases; this approach is methodologically preferred over a complete case analysis, which omits patients with missing data. Ascertainment bias is a potential limitation in that patients taking statins may be visiting their GP more frequently, thus increasing the opportunity for patients to report problems (specifically for cataracts).

In this study we have provided an independent and external evaluation of the QStatin risk score on a large cohort of patients in the UK. We have assessed the performance of QStatin against performance metrics presented in the internal validation of this score and have provided empirical evidence to support the use of QStatin scores for predicting the 5-year statin risk for developing acute renal failure, cataracts and myopathy.

Conclusions

Our evaluation of QStatin scores has demonstrated very good performance for the models predicting the 5-year statin risk of acute renal failure and cataracts, and moderately good performance data supporting the model for myopathy. However, our results also highlight the poor performance in calibration and discrimination for the model predicting the 5-year statin risk for liver dysfunction and, as such, we strongly caution against the use of this particular model.

Footnotes

  • Competing interests None.

  • Ethics approval Ethics approval was provided by Trent multicentre research ethics committee.

  • Provenance and peer review Not commissioned; internally peer reviewed.
