Diagnostic accuracy of technician supervised and reported exercise tolerance tests
D F Muir1, M Jummun2, D J Stewart2, A L Clark3

1 Manchester Heart Centre, Manchester Royal Infirmary, Manchester, UK
2 Department of Cardiology, Western Infirmary, Glasgow, UK
3 Department of Cardiology, Castle Hill Hospital, Hull, UK

Correspondence to: DF Muir;

Ischaemic heart disease (IHD) is a common condition with an annual hospital admission rate in England and Wales of around 6.3 per 1000.1

Guidelines produced by the joint audit committee of the British Cardiac Society and Royal College of Physicians recommend exercise electrocardiography in most new cases of angina. In accordance with current guidelines,2 many of these tests are now supervised by senior cardiac technicians without direct medical supervision.

In our centre, around 2000 exercise tests are performed annually, 80–90% of which are technician supervised. All tests are reported by medical staff, which necessitates a significant increase in resources and an increase in reporting time. The purpose of this study was to evaluate the ability of experienced technical staff to provide accurate reports for technician supervised exercise tests.


A total of 246 consecutive technician supervised exercise tests were collected prospectively. Tests were excluded if the requesting physician had indicated that medical supervision was required (n = 23), if bundle branch block was present at rest (n = 15), or where the purpose of the test was not primarily for the assessment of IHD (n = 8), leaving 200 tests for assessment.

All tests were performed to Bruce or modified Bruce protocols. Each test was reported according to prespecified criteria by each investigator who was blinded to the clinical report and to the reports of the other investigators. The criteria were:

  • Pre-test probability: high or low, depending on whether typical chest pain or a history of IHD was indicated on the request.

  • Symptoms: positive if typical chest or jaw pain occurred, borderline if isolated arm pain or atypical chest pain occurred, or negative.

  • Exercise time: compared against age and sex predicted values and considered excellent if the predicted time was exceeded, good if ∼ 75–100% of predicted, moderate if ∼ 50–75%, and poor if less than ∼ 50% of the predicted value.

  • Objective criteria: positive, borderline or negative. Positive criteria were: ≥ 2 mm planar or downsloping ST depression or T wave inversion in two or more contiguous ECG leads, ST elevation ≥ 1 mm in leads without previous Q waves, ≥ 5 beats of ventricular tachycardia, ventricular fibrillation, new left bundle branch block, or a drop in systolic blood pressure ≥ 20 mm Hg. Borderline criteria were: T wave “pseudonormalisation”, otherwise significant ST segment changes in the presence of existing ST segment abnormalities (for example, “left ventricular hypertrophy and strain” pattern), upsloping ST depression ≥ 3 mm, development of right bundle branch block, atrial fibrillation, supraventricular tachycardia, or ventricular ectopy ≥ 10/min. Any other changes were considered negative. Objective criteria could be met during exercise or during the recovery phase.

  • Overall risk assessment: Classified on a six point scale: inconclusive; low; low/medium; medium; medium/high; and high risk. No specific predefined criteria were set for this; rather each investigator was allowed to use their own judgement in the report.
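As an illustration of the exercise time grading described above, the following is a minimal Python sketch. The function name, the use of seconds, and the exact handling of the band boundaries are assumptions made for illustration; the criteria give only approximate bands, and this is not the study's own reporting tool.

```python
def exercise_time_category(achieved_s: float, predicted_s: float) -> str:
    """Grade exercise time relative to the age and sex predicted value.

    Both times are in seconds (an assumption; any common unit works).
    Boundary handling is an assumption, since the bands are approximate.
    """
    if predicted_s <= 0:
        raise ValueError("predicted_s must be positive")

    ratio = achieved_s / predicted_s
    if ratio > 1.0:
        return "excellent"  # predicted time exceeded
    if ratio >= 0.75:
        return "good"       # about 75-100% of predicted
    if ratio >= 0.50:
        return "moderate"   # about 50-75% of predicted
    return "poor"           # less than about 50% of predicted


if __name__ == "__main__":
    # Hypothetical patient: predicted 9 min, achieved 7 min.
    print(exercise_time_category(achieved_s=420, predicted_s=540))
```

A fixed rule of this kind removes one source of disagreement between reporters, although values close to a boundary may still be graded differently if the rule is applied loosely.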

Investigators were selected to represent three standards of reporting: consultant level (AC); middle grade level (DM) which represents our normal reporting practice; and senior technical level (MJ, DS).

Each of the other investigators' reports was compared with that of AC, taken as the “gold standard”, first by a simple percentage measure of agreement and second by calculating κ, which adjusts for chance agreement. Where more than two choices were possible, a weighted κ (κW) was calculated, so that disagreement by only one category (for example, low risk v low/medium risk) reduces κ by less than disagreement by two or more categories (for example, low risk v high risk).
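The κ and κW calculations can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name is hypothetical, the weights are linear (the weighting scheme used in the study is not specified), and the categories are assumed to be passed in rank order. How a non-ordinal option such as "inconclusive" was placed on the scale is not stated, so the example uses a simple three-level scale with invented ratings.

```python
from collections import Counter


def kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa, or linearly weighted kappa if weighted=True.

    `categories` lists every possible rating in rank order.
    Linear weights are an assumption, not taken from the study.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    k = len(categories)
    if k < 2:
        raise ValueError("at least two categories are required")

    index = {c: i for i, c in enumerate(categories)}
    a_idx = [index[x] for x in rater_a]
    b_idx = [index[x] for x in rater_b]
    n = len(a_idx)
    marg_a, marg_b = Counter(a_idx), Counter(b_idx)

    def disagreement(i, j):
        # Weighted: penalty grows with distance between categories.
        # Unweighted: any disagreement counts fully.
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    # Mean disagreement actually observed between the two raters.
    observed = sum(disagreement(i, j) for i, j in zip(a_idx, b_idx)) / n
    # Mean disagreement expected by chance from the marginal totals.
    expected = sum(
        disagreement(i, j) * marg_a[i] * marg_b[j]
        for i in range(k)
        for j in range(k)
    ) / (n * n)
    if expected == 0:
        raise ValueError("kappa undefined: only one category was used")
    return 1.0 - observed / expected


if __name__ == "__main__":
    scale = ["low", "medium", "high"]  # hypothetical ordinal scale
    reference = ["low", "medium", "high", "low"]
    near_miss = ["low", "medium", "high", "medium"]  # off by one category
    far_miss = ["low", "medium", "high", "high"]     # off by two categories
    print(kappa(reference, near_miss, scale, weighted=True))
    print(kappa(reference, far_miss, scale, weighted=True))
```

In this invented example each comparison contains a single disagreement, but the weighted statistic is lower when the disagreement spans two categories than when it spans one, which is the behaviour described above.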


One hundred and fifteen men and 85 women (mean (SD) age 59.3 (11.6) years) performed the tests. The comparisons between observers are shown in table 1. For each category, κ or κW was greater than our predefined acceptable level of 0.5, and in most cases considerably higher. All of the observers (DM, MJ, DS) had similar levels of agreement with the “gold standard” and with each other (data not shown).

Table 1 Agreement between investigators' reports


Exercise testing is a common investigation, which is frequently supervised by experienced cardiac technicians. Our study demonstrates the ability of such technicians to report these tests to a high standard, when using prespecified end points for the reports.

Discrepancies in the observations between investigators may occur for a number of reasons. Firstly, the observers may disagree on the interpretation of the test per se. Secondly, with increased options for each category, the level of agreement is reduced because of smaller differences between consecutive categories (for example, low risk v low/medium risk). Thirdly, the interpretation of the predefined protocol for the test reports may differ between observers. For example, if the blood pressure drops by 19 mm Hg during the test, one observer may report a positive test, whereas another may apply the 20 mm Hg threshold strictly and report the test as negative. On retrospective review of the data, we believe the latter two reasons are more common in our study.

Exercise testing is a simple non-invasive investigation, which provides valuable information in the assessment of possible IHD and for postmyocardial infarction risk stratification. The exercise test may also be incorporated in protocols for assessment of possible myocardial ischaemia in patients presenting to emergency departments, which may shorten admission periods and reduce costs.3

A previous study evaluating the efficacy of technician supervised tests4 showed that physician and technician supervised tests produced similar levels of positive and negative test results. However, there was no formal comparison of the standards of doctor and technician reporting, which, as far as we know, has not previously been assessed.

In many centres, tests that are supervised by technicians are reported only by medical staff. Our study shows that appropriately trained, experienced technicians can report exercise tests accurately, thus avoiding duplication of effort by medical staff. We believe that such a system may be implemented by validating senior technical staff against a series of standard tests and allowing them to report tests initially under supervision, then independently. The continued high quality of reports should be assessed by regular audit.