## Abstract

**Objective** To establish a general method to estimate the measuring error in QT dispersion (QTD) determination, and to assess this error using a computer program for automated measurement of QTD.

**Subjects** Measurements were done on 1220 standard simultaneous 12 lead electrocardiograms.

**Design** The computer program was validated against two observers on a random subset of 100 electrocardiograms. Simple laws of physics require that at least five of the six extremity leads have the same QT duration. This allows the direct assessment of the error in measuring QTD derived from five extremity leads (QTD_{5}). It also enables ST-T amplitude dependent distributions of measurement error in determining QT duration to be established. These QT error distributions were then used to estimate the error in measuring QTD from all 12 leads (QTD_{12}).

**Main outcome measures** Mean and standard deviation of error in measuring QT duration, QTD_{5}, and QTD_{12}.

**Results** Performance of the program was comparable to that of observers. Errors in measuring QT duration (reference QT minus measured QT) fell from a mean (SD) of 6.9 (17.1) ms for ST-T amplitudes < 50 μV to −1.4 (6.3) ms for amplitudes > 350 μV. Measurement errors of QTD_{5} and QTD_{12} were 20.4 (11.5) ms and 29.4 (14.9) ms, respectively.

**Conclusions** The fact that no QTD can exist between five of the six extremity leads provides a means of estimating QTD measurement error. Measuring error of QT duration is dependent on ST-T amplitude. QTD measurement error is large compared with typical QTD values reported.

- automated ECG analysis
- QT measurement
- QT dispersion
- measurement error


QT dispersion (QTD) is defined as the difference in duration between the longest QT interval in any lead and the shortest, for a given set of electrocardiographic leads. QTD has been proposed as a sign of regional differences in cardiac repolarisation.1-14 Many papers on QTD have been published over the past decade.

The measurement of QTD, however, is not straightforward. The T wave tapers off more or less gradually and the lower the ST-T wave complex or the noisier the signal, the more erratic the determination of its end point. The problem is often “solved” by excluding leads with flat ST-T waves from analysis. How to deal with U waves is another problem. No wonder that poor measurement reproducibility, both within and between observers, has been reported.15-19

Measurement reproducibility, expressed as the difference between two measurements of QTD on the same electrocardiogram (ECG), should be distinguished from measurement error (inaccuracy), which is defined as the difference between the actually measured QTD and the “true” value of QTD. A perfectly reproducible measurement, such as one made by a computer program, may still have a large measurement error. (A perfectly accurate measurement, on the other hand, implies perfect reproducibility.) Thus knowledge of the inaccuracy of a “measuring device,” be it human or computer, would be helpful in ascertaining the usefulness of QTD. However, data on measurement inaccuracy have not been provided because it seems impossible to establish the true reference value of QTD, given the impossibility of establishing the “true” end of T waves.

In this paper, we describe a method to estimate the error in measuring QTD. Our approach consists of two steps. First, we argue that the internal consistency between leads, imposed by the physics of the electric circuitry of the lead system, requires that at least five of the six extremity leads have the same QT duration. Second, this observation will provide a method to estimate the measurement error in determining QTD from five extremity leads and from all 12 leads. We work out this second step quantitatively using a computer program for automatic measurement of QTD and a large database of ECGs.

## Methods

All measurements were made on a database of 1220 standard 12 lead ECGs, collected in the project “common standards for quantitative electrocardiography” (CSE).20,21 All leads of each ECG were recorded simultaneously at a sampling rate of 500 Hz for 8 or 10 seconds. The diagnostic classification of individual ECGs has not been released, but the overall composition of the database has been made public21: normal (382); left ventricular hypertrophy (183), right ventricular hypertrophy (55), biventricular hypertrophy (53); anterior myocardial infarction (170), inferior myocardial infarction (273), combined myocardial infarction (73); combined infarction and hypertrophy (31).

### COMPUTER MEASUREMENTS

For the processing of the data our ECG computer program MEANS (modular ECG analysis system)22 was used. Normally, MEANS determines common wave onsets and offsets for all 12 leads together on one representative averaged beat, using template matching techniques that have been described before,22,23 and QTD is thus non-existent. The program was therefore adjusted to determine the end of T in each lead. Taking the location of the overall end of T as a starting point, the algorithm may find an end of T later than the overall end if the signal continues to decrease monotonically towards the baseline, until it recedes within a 50 μV noise band around the baseline. If no retrograde end of T is found, an antegrade end of T may be located where the signal leaves a band of ±15 to ±30 μV centred around the amplitude at overall end of T, its width depending on an estimate of the noise in the lead. If the peak to peak ST-T amplitude was less than 50 μV, the T wave was considered to be flat and the lead was excluded from further analysis. Finally, QTD was defined as the difference between the maximum and the minimum QT interval in the leads considered. For the purpose of comparison, we also computed QT_{c} dispersion (QT_{c}D) after correcting QT intervals for heart rate with Bazett’s formula (QT_{c} = QT/\sqrt{RR}, with RR the RR interval in seconds).
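The last two steps of this procedure (exclusion of flat leads, then the range of QT over the remaining leads, optionally after Bazett correction) can be sketched in a few lines of Python. This is a minimal illustration and not the MEANS implementation: the function name, the dictionary layout, and the example values are assumptions made here, and the per-lead QT durations are taken as already measured.

```python
import math

# Peak to peak ST-T amplitude (in microvolts) below which a lead counts as flat.
FLAT_T_THRESHOLD_UV = 50.0


def qt_dispersion(qt_ms, stt_amplitude_uv, rr_s=None):
    """Return QTD in ms over the non-flat leads (QTcD when rr_s is given).

    qt_ms and stt_amplitude_uv map lead names to QT duration (ms) and
    peak to peak ST-T amplitude (microvolts). rr_s is the RR interval in seconds.
    """
    kept = [qt for lead, qt in qt_ms.items()
            if stt_amplitude_uv[lead] >= FLAT_T_THRESHOLD_UV]
    if len(kept) < 2:
        raise ValueError("need at least two leads with a measurable T wave")
    if rr_s is not None:
        # Bazett correction: QTc = QT / sqrt(RR)
        kept = [qt / math.sqrt(rr_s) for qt in kept]
    return max(kept) - min(kept)


# Hypothetical four-lead example; V1 is flat (30 microvolts) and is excluded.
example_qt = {"I": 380.0, "II": 400.0, "V1": 360.0, "V2": 410.0}
example_amp = {"I": 120.0, "II": 300.0, "V1": 30.0, "V2": 500.0}
example_qtd = qt_dispersion(example_qt, example_amp)
example_qtcd = qt_dispersion(example_qt, example_amp, rr_s=0.64)
```

In the example, QTD is taken over leads I, II, and V2 only, because V1 falls below the 50 μV threshold.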

For validation, QTD also had to be determined by human observation. Two observers independently marked the end of the T wave with the cursor on a high resolution computer screen. This position was stored in the computer. The start of the QRS was determined by the computer itself. ECG leads were magnified to 100 mm/s and 50 mm/mV. The observer was presented one lead at a time. To prevent the cursor position in one lead from biasing the observer in identifying the end of the T wave in the next lead, the cursor was reset to the boundary of the display window after each finished measurement.

### RELATION BETWEEN EXTREMITY LEADS

In the standard 12 lead ECG, only two of the six extremity leads are actually recorded, for instance leads I and II, and the other four are derived from mathematical relations imposed by the lead system. Thus for the amplitudes in the extremity leads at any time instant it holds that III = II − I, aVR = −(I + II)/2, aVL = I − II/2, and aVF = II − I/2. Of course if all T waves end at the same moment, QTD = 0. Suppose the T wave in one lead, say I, is shorter than in the other ones, ending at some time instant t_{1}. Then, lead I being zero, III = II, aVR = −II/2, aVL = −II/2, and aVF = II for t > t_{1}. This means that the T waves in leads II, III, aVR, aVL, and aVF must all end at the same moment, say t_{2}, namely where II becomes 0. With necessary changes the argument can be applied to any extremity lead other than I, Einthoven or augmented. It is always true that if there is a shortest T wave in one of the extremity leads ending at some time instant t_{1}, the T waves in the other five extremity leads must all end at the same time instant t_{2} > t_{1}. As a consequence, QTD cannot exist among these “longest leads.”
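The argument can be checked numerically on a toy signal. The sketch below derives the four remaining extremity leads from I and II using the relations given above, and then locates the end of the T wave in each lead. The sample values and the simple "last non-zero sample" end-of-T rule are assumptions chosen for illustration; they are noise free and do not represent the detection algorithm used in this study.

```python
def extremity_leads(lead_i, lead_ii):
    """Derive all six extremity lead amplitudes from leads I and II."""
    return {
        "I": lead_i,
        "II": lead_ii,
        "III": lead_ii - lead_i,
        "aVR": -(lead_i + lead_ii) / 2,
        "aVL": lead_i - lead_ii / 2,
        "aVF": lead_ii - lead_i / 2,
    }


def t_end(samples, tol=1e-9):
    """Index of the first sample after which the signal stays at baseline."""
    end = 0
    for k, value in enumerate(samples):
        if abs(value) > tol:
            end = k + 1
    return end


# Toy terminal T wave samples (microvolts): lead I reaches zero first.
lead_i = [30.0, 0.0, 0.0, 0.0, 0.0, 0.0]
lead_ii = [80.0, 60.0, 35.0, 10.0, 0.0, 0.0]

signals = {name: [] for name in ("I", "II", "III", "aVR", "aVL", "aVF")}
for a, b in zip(lead_i, lead_ii):
    for name, value in extremity_leads(a, b).items():
        signals[name].append(value)

ends = {name: t_end(sig) for name, sig in signals.items()}
```

Here lead I ends at sample 1, while the other five leads all end at sample 4, the point where lead II becomes zero.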

### MEASUREMENT ERROR OF QTD IN FIVE EXTREMITY LEADS

The relation between extremity leads suggests a general method to gauge the measurement error of man or computer in determining QTD from five extremity leads (QTD_{5}). After removal of the extremity lead with the shortest QT duration, the “true” QTD_{5} of the remaining leads is zero, as argued above. Any actually measured QTD_{5} then reflects the measuring error of the computer program or the human observer.
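A minimal Python sketch of this step is given below. The function name and the QT values in the example are hypothetical; the only logic taken from the text is the removal of the shortest extremity QT and the range over the remaining five.

```python
def qtd5_error(extremity_qt_ms):
    """Measured QTD over the five longest extremity leads, in ms.

    Because the true QTD of these five leads is zero, the returned value
    is itself the QTD measurement error for this ECG.
    """
    if len(extremity_qt_ms) != 6:
        raise ValueError("expected QT durations for all six extremity leads")
    five_longest = sorted(extremity_qt_ms.values())[1:]  # drop the shortest QT
    return five_longest[-1] - five_longest[0]


# Hypothetical measured QT durations (ms) for one ECG.
example_qt = {"I": 372, "II": 400, "III": 396, "aVR": 404, "aVL": 388, "aVF": 410}
example_error = qtd5_error(example_qt)
```

In this example lead I (372 ms) is removed and the remaining leads span 388 to 410 ms, giving a QTD_{5} error of 22 ms.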

### MEASUREMENT ERROR OF QTD IN ALL LEADS

QTD is usually derived from all 12 leads (QTD_{12}). To determine the error in measuring QTD_{12} we would need to know the “true” QTD_{12}, but such a reference is not available. The chest leads are independent of each other: unlike the five limb leads, they may have different QT durations and there is no known “true” value of QTD. We therefore resort to an artifice. Suppose the true QT durations of all leads are equal, that is, true QTD_{12} = 0. Suppose also that the probability distribution of the error in measuring QT duration be given. We could then randomly draw 12 “errors” from this error distribution and consider each draw to represent the error in the measurement of QT duration for one lead. The difference between the largest and smallest of these 12 QT duration errors amounts to the estimated QTD_{12} and reflects the measurement error in one ECG, since the true QTD_{12} was taken to be 0.

The approach so far does not take into account that QT measurement error, and thus the error in QTD_{12}, is likely to be conditional on the ST-T amplitude of a lead: the lower the ST-T amplitude in a lead, the greater (on average) presumably the measurement error. The procedure could be refined if the error distribution were known, conditional on the ST-T amplitude: we could then draw 12 errors conditional on the 12 ST-T amplitudes of a given ECG to find an amplitude dependent error estimate for the measurement of QTD_{12}. For any number of ECGs, a set of error estimates can be obtained, and mean and standard deviation computed.
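The drawing procedure described above can be sketched as follows. Several details are assumptions made for this illustration only: errors are resampled directly from the empirical error lists (the text does not specify how draws are made), the eight amplitude classes are taken to be 50 μV wide between 50 and 350 μV, and all names and toy values are hypothetical.

```python
import random
import statistics

# Assumed class boundaries (microvolts): <50, 50-100, ..., 300-350, >=350.
CLASS_EDGES_UV = [50, 100, 150, 200, 250, 300, 350]


def amplitude_class(amp_uv):
    """Map a peak to peak ST-T amplitude to a class index 0..7."""
    return sum(amp_uv >= edge for edge in CLASS_EDGES_UV)


def simulate_qtd12_error(ecg_amplitudes, errors_by_class, repeats=10, seed=0):
    """Return mean and SD of simulated QTD12 errors, assuming true QTD12 = 0.

    ecg_amplitudes: one list of 12 ST-T amplitudes (microvolts) per ECG.
    errors_by_class: class index -> list of observed QT errors (ms).
    """
    rng = random.Random(seed)
    estimates = []
    for amplitudes in ecg_amplitudes:
        for _ in range(repeats):
            # One error per lead, drawn from the class matching its amplitude.
            draws = [rng.choice(errors_by_class[amplitude_class(a)])
                     for a in amplitudes]
            estimates.append(max(draws) - min(draws))
    return statistics.mean(estimates), statistics.stdev(estimates)


# Toy input: every class holds the same three errors; two hypothetical ECGs.
toy_errors = {c: [-5.0, 0.0, 5.0] for c in range(8)}
toy_amplitudes = [[60, 120, 90, 400, 220, 310, 500, 700, 650, 450, 380, 150],
                  [40, 80, 130, 260, 190, 340, 900, 820, 610, 470, 330, 210]]
toy_mean, toy_sd = simulate_qtd12_error(toy_amplitudes, toy_errors)
```

Fixing the seed makes the simulation repeatable; with real error lists per amplitude class, the returned mean and SD play the role of the QTD_{12} error estimate.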

### DISTRIBUTION OF QT MEASUREMENT ERROR CONDITIONAL ON ST-T AMPLITUDE

To make the step from supposition to reality, we must have at our disposal a real QT error distribution. This distribution can be estimated by again using the fact that the five “longest” extremity leads must have the same QT duration. We postulate that the value of the “true” QT duration is equal to the median of the five longest actually measured QT durations (for our reasons for this choice, see the discussion). The differences between this “true” QT and the five measured QTs then reflect the five measurement errors. Taken over 1220 ECGs, this yields 6100 measurement errors that form an error distribution (the algorithm’s logic to exclude leads with peak to peak amplitudes less than 50 μV was temporarily turned off). To account for ST-T amplitude, we distinguished eight amplitude classes. Peak to peak ST-T amplitudes of the 6100 longest extremity leads were computed, and each lead was assigned to the corresponding amplitude class. For each class, the QT measurement errors of the leads in that class formed an error distribution.
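The construction of the amplitude dependent error distributions can be sketched as below. The error is taken as the median reference minus the measured QT. The class boundaries (50 μV steps from 50 to 350 μV) are an assumption, since the text states only that eight classes were used; the function names, data layout, and example values are likewise hypothetical.

```python
import statistics
from collections import defaultdict

# Assumed class boundaries (microvolts): <50, 50-100, ..., 300-350, >=350.
CLASS_EDGES_UV = [50, 100, 150, 200, 250, 300, 350]


def amplitude_class(amp_uv):
    """Map a peak to peak ST-T amplitude to a class index 0..7."""
    return sum(amp_uv >= edge for edge in CLASS_EDGES_UV)


def collect_qt_errors(ecgs):
    """Group QT errors of the five longest extremity leads by amplitude class.

    ecgs: iterable of dicts mapping each of the six extremity leads to a
    (measured QT in ms, peak to peak ST-T amplitude in microvolts) pair.
    """
    errors = defaultdict(list)
    for ecg in ecgs:
        five_longest = sorted(ecg.values(), key=lambda m: m[0])[1:]
        reference = statistics.median(qt for qt, _ in five_longest)
        for qt, amp in five_longest:
            errors[amplitude_class(amp)].append(reference - qt)
    return errors


# One hypothetical ECG: lead -> (QT ms, ST-T amplitude microvolts).
example_ecg = {"I": (372, 40), "II": (400, 200), "III": (396, 120),
               "aVR": (404, 90), "aVL": (388, 60), "aVF": (410, 400)}
example_errors = collect_qt_errors([example_ecg])
```

In the example the reference QT is 400 ms (the median of 388, 396, 400, 404, and 410 ms), so that, for instance, the 50 to 100 μV class receives the errors 12 ms (aVL) and −4 ms (aVR).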

## Results

### PROGRAM VALIDATION

The MEANS algorithms for the determination of overall onset of QRS and end of T have been validated before in the CSE study.24 The mean (SD) difference in QRS onset between the consensus opinion of a group of cardiologists and the computer program was −0.2 (3.4) ms. For the overall end of the T wave, the mean (SD) difference was 4.2 (12.4) ms.

To assess program performance in determining QTD, the computer results in a random sample of 100 ECGs taken from the CSE database were compared with those obtained by two observers as described in the methods. Mean (SD) difference in end of T determination was 23.7 (26.6) ms between observer A and computer, 14.1 (21.6) ms between observer B and computer, and 9.6 (19.6) ms between the two observers. Mean (SD) difference in QTD was 6.9 (27.3) ms between observer A and computer, 0.4 (28.4) ms between observer B and computer, and 6.5 (25.7) ms between the two observers. Mean (SD) interobserver difference in QT_{c}D was 6.7 (28.4) ms. Combining the data of both observers, the mean (SD) difference with the computer program was 3.6 (28.0) ms for QTD and 5.1 (29.3) ms for QT_{c}D. Figure 1 shows a Bland–Altman plot25 of the differences between automatic and manual QTDs.
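For readers who wish to reproduce this kind of comparison, the quantities behind a Bland–Altman plot can be computed as in the sketch below. The function name and the paired values are hypothetical, and the 1.96 SD limits of agreement are the conventional choice rather than something reported in this study.

```python
import statistics


def bland_altman(automatic, manual):
    """Return pairwise means, differences, bias, SD, and limits of agreement."""
    diffs = [a - m for a, m in zip(automatic, manual)]
    means = [(a + m) / 2 for a, m in zip(automatic, manual)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return means, diffs, bias, sd, limits


# Hypothetical paired QTD measurements (ms) on four ECGs.
auto_qtd = [40.0, 55.0, 30.0, 62.0]
manual_qtd = [35.0, 60.0, 28.0, 50.0]
means, diffs, bias, sd, limits = bland_altman(auto_qtd, manual_qtd)
```

Applied to the pooled QTD difference of 3.6 (28.0) ms, the same convention would give limits of agreement of roughly −51 to 58 ms.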

### MEASUREMENT ERROR OF QTD IN FIVE EXTREMITY LEADS

For each of the 1220 ECGs, the program computed QTD for five extremity leads, after removal of the extremity lead with shortest QT. These QTDs had a mean (SD) of 20.4 (11.5) ms, which reflects the measuring error since the true QTD for the five longest extremity leads is zero.

### QT DURATION ERROR DISTRIBUTION CONDITIONAL ON ST-T AMPLITUDE

As explained in the methods section, the 1220 ECGs in our database provided 6100 measurement errors of QT duration, which were divided over eight error distributions corresponding with eight ST-T amplitude classes. Figure 2 shows the mean and standard deviation of the QT measurement errors (“true” QT minus measured QT) for each amplitude class. Errors fall from a mean (SD) of 6.9 (17.1) ms for amplitudes less than 50 μV to −1.4 (6.3) ms for amplitudes greater than 350 μV. (The negative sign indicates that, on average, the end of large T waves is found 1.4 ms later than the “true” QT duration.) Hence the lower the amplitude of the wave, the more erratic the identification of its end and the higher the measurement inaccuracy, as expected.

### MEASUREMENT ERROR OF QTD ESTIMATED FOR ALL LEADS

To estimate the error in QTD for all 12 leads (QTD_{12}), we used the ST-T amplitude in each lead of each of the 1220 ECGs as an indicator to determine from which error distributions to draw, as described in the methods section. For each ECG, a QT measurement error per lead was randomly drawn by the computer from the appropriate error distribution, and the error in QTD_{12} was computed. For statistical expedience, this procedure was repeated 10 times for each ECG. The mean (SD) of these 10 × 1220 QTD_{12} estimates was 29.4 (14.9) ms, which gives a realistic estimate of the error in measuring QTD for 12 leads. Mean (SD) for QT_{c}D_{12} was 31.5 (16.1) ms.

We also took the average and the latest of the measured T offsets in the five longest extremity leads, rather than the median, as the reference end of T. They carry their own QT error distributions. For average and latest location as the reference, mean (SD) of the estimated QTD_{12} measurement error was 27.7 (11.7) and 32.3 (13.3) ms, respectively.

## Discussion

### AUTOMATED MEASUREMENT OF QTD

Although automated measurement of QTD has been advocated as a way of reducing large within and between observer variations and of furthering the use of QTD in clinical practice,6,26,27 few reports have appeared on the issue.15,19,28,29 Recently, McLaughlin *et al* assessed the performance of four automated QT measurement techniques, taking manual QT duration measurements of one observer as the reference.28,29 The standard deviations of the differences in QT duration were of the order of 30 ms in 25 normal individuals, and up to 45 ms in 25 postinfarction patients.29 In our study of 100 cases, two thirds of which were pathological, the standard deviation of differences in QT duration between the (pooled) observers and the computer was 28.0 ms. Since we used a common QRS onset for all leads in determining QT durations, our results may be biased optimistically, though the bias is not likely to be great (see below).

In two other studies,15,19 hard copy ECGs were scanned and stored in a digital format. The digital signals were subsequently processed by computer to determine QT_{c}D. Bhullar *et al* compared the performance of the computer algorithm with two human observers on a set of 112 ECGs.15 A mean (SD) difference for QT_{c}D of 3.2 (34.3) ms was reported. Glancy *et al* also compared the algorithm with two observers on another set of 70 ECGs.19 Here, the mean (SD) difference for QT_{c}D was 19.5 (38) ms. It should be noted that leads for which the algorithm grossly mismeasured the QT interval were excluded from their analysis.19

Our results compare favourably with these performance figures. We found a mean (SD) of the QT_{c}D difference between program and pooled data from two observers of 5.1 (29.3) ms. Moreover, these results are comparable with the interobserver variability (mean (SD) 6.7 (28.4) ms). We are therefore confident that the performance of our program is comparable with that of human observers.

As to the practical application of automated measurement, our program takes less than two seconds to process one ECG on a standard Pentium PC. This makes the program very suited for use in a clinical environment or large scale epidemiological studies.

### METHODOLOGY TO GAUGE THE ERROR IN QTD MEASUREMENT

While the difficulties in measuring QTD have received ample attention, quantification of the measurement error in determining QTD has not been undertaken. This is easily explained by the absence of a reference location for the end of the T wave. The method we propose, using the physical relation between the extremity leads, enables us to determine such a reference value.

Several comments on the methodology can be made.

- (1) Removal of the extremity lead with shortest QT may bias the inaccuracy estimate for two reasons. One reason is that, owing to measurement error, the shortest QT may have been measured in a lead other than the one with the truly shortest QT, in which case the wrong QT measurement is removed. The other reason is that all six extremity leads may have had the same “true” QT interval, in which case no measurement should have been removed. In both situations, removal of the shortest QT will result in an optimistically biased error estimate.
- (2) We estimated the “true” end of the T wave by taking the median of the measured end of T of the five extremity leads. The median was chosen for its robustness against outliers and because we considered it likely that the “true” end of T would be located among the measured T offsets. Other choices, such as the average or the latest location, are also legitimate. Their QTD measurement errors were comparable with the one obtained with the median.
- (3) We did not explicitly take into account the problem of U waves in the precordial leads. The presence of U waves, however, would only seem to increase measurement errors, further implying that our present estimates are likely to be conservative.
- (4) The magnitude of the error in measuring QT duration in a given lead was found to be dependent on the peak to peak ST-T amplitude in that lead. This dependence might be caused by the particular detection algorithm we used, and it is conceivable that other algorithms would not show such a dependence. In that case, our method of estimating the measurement error remains valid and would in fact be simplified because the QT distributions pertaining to different ST-T amplitudes would coincide.
- (5) Another objection could be that, in dividing the set of “longest extremity leads” in ST-T amplitude classes, the amplitudes greater than 350 μV were lumped into one class (fig 2). ST-T amplitudes greater than 700 μV rarely occur in the extremity leads, in contrast to the precordial leads where the average amplitude is about twice as large. The QT duration error distribution of the > 350 μV class as a whole might lend a pessimistic bias when applied to cases with much higher ST-T amplitudes. To verify this we set the QT duration error to 0 ms for all ST-T amplitudes greater than 700 μV and repeated the experiment. This resulted in a mean (SD) QTD_{12} of 29.0 (15.1) ms, only slightly different from the original estimate of 29.4 (14.9) ms. Thus the perhaps overlarge error imparted to the QT duration estimate of large ST-T amplitudes hardly affects the overall measurement error of QTD_{12}.
- (6) The artifice of assuming all QT durations to be equal, and thereby making QTD_{12} = 0, is not essential for our method. One could in fact postulate any set of 12 QT durations and take the corresponding QTD_{12} ≠ 0 as the reference. The same procedure to obtain an estimated QTD_{12} could then be applied and the measurement error determined.

### MAGNITUDE OF QTD MEASUREMENT ERROR

We found a QTD measurement error using our program of 29.4 (14.9) ms. This error must be seen in relation to typical QTD values reported in the literature. Mean (SD) QTD measurements for different patient groups showed large variation, ranging from 38 (13) ms10 to 94 (41) ms.9 Control groups generally presented much smaller QTDs, varying between 30 (10) ms8 and 43 (12) ms.4 Related to these figures, a mean (SD) measurement error of 29.4 (14.9) ms, as estimated by us, is extremely large.

We could not test whether the QTD measurement error differed between diagnostic categories since the classification of individual ECGs in the CSE database has not been revealed. In a study by McLaughlin *et al*,29 QT reproducibility error (the difference between automatically and manually measured QT) was shown to vary over different categories, and a relation between T wave amplitude and QT difference was suggested. We found a clear relation between QT measurement error and ST-T amplitude. The effect on QTD measurement error, however, is complicated and requires further investigation.

We used a single, overall QRS onset for all leads as the beginning of the QT intervals. In manual QTD measurement, however, QRS onsets are often measured in single leads. Cowan *et al* showed that QTD is mainly the result of variation in the end of the T wave rather than in the onset of the QRS complex.30 Since our program is also capable of measuring lead dependent QRS onsets, we repeated our experiments to quantify the effect. When using lead dependent onsets, we found a mean (SD) measurement error for QTD_{12} of 32.8 (14.7) ms, as opposed to 29.4 (14.9) ms when using the overall onset. Thus measuring the onset of the QRS complex in single leads slightly increases QTD measurement error, as could be expected.

### LEAD SELECTION FOR QTD MEASUREMENT

If one wishes to perform QTD measurements the matter of lead selection should be considered. If QT intervals can be determined in all six extremity leads, the shortest QT is taken as one measurement and the median of the other five as another. This median QT is considered to be an estimate of the “true” QT interval that must underlie at least five of the six extremity leads, as argued above. QTD is then derived from eight QT interval measurements, two from the extremity leads and the other six from the chest leads. If QT intervals cannot be measured in one or more extremity leads, presumably because of flat T waves, we consider it most likely that the one short QT was present in the excluded leads, and recommend the use of the median of the remaining QT intervals as one measurement. QTD is then derived from seven QT interval measurements, one from the extremity leads and six from the precordial leads. This procedure assumes that one overall onset of the QRS complex is used in QT interval measurement. If not, our recommendation would—strictly speaking—only pertain to the measurement of the end of the T wave.
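The recommended lead selection can be written down as a short Python sketch. The function name, the use of `None` for an unmeasurable lead, and the example values are assumptions made for illustration; a single overall QRS onset is presumed, as in the text.

```python
import statistics

EXTREMITY_LEADS = ("I", "II", "III", "aVR", "aVL", "aVF")
CHEST_LEADS = ("V1", "V2", "V3", "V4", "V5", "V6")


def recommended_qtd(qt_ms):
    """QTD (ms) using at most two QT values from the extremity leads.

    qt_ms maps lead names to QT duration in ms; a lead whose T wave could
    not be measured is either absent or mapped to None.
    """
    limb = sorted(qt_ms[l] for l in EXTREMITY_LEADS if qt_ms.get(l) is not None)
    chest = [qt_ms[l] for l in CHEST_LEADS if qt_ms.get(l) is not None]
    if len(limb) == 6:
        # Shortest QT plus the median of the five longest extremity leads.
        values = [limb[0], statistics.median(limb[1:])] + chest
    elif limb:
        # The short QT is presumed to lie in an excluded lead: median only.
        values = [statistics.median(limb)] + chest
    else:
        values = chest
    if len(values) < 2:
        raise ValueError("need at least two QT measurements")
    return max(values) - min(values)


# Hypothetical measurements for one ECG (ms).
full_ecg = {"I": 372, "II": 400, "III": 396, "aVR": 404, "aVL": 388, "aVF": 410,
            "V1": 390, "V2": 402, "V3": 408, "V4": 412, "V5": 406, "V6": 398}
qtd_full = recommended_qtd(full_ecg)
qtd_flat_lead_i = recommended_qtd({**full_ecg, "I": None})
```

With all six extremity leads measurable, QTD is derived from eight values (40 ms in this example); when lead I is excluded as flat, seven values remain and QTD falls to 22 ms.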

### INTRINSIC VALUE OF QTD

Our primary goal was to explore the solidity of QTD measurement, not to discuss its intrinsic value. It is noteworthy, however, that in current reports on QTD, extremity leads are often used for QTD determination without acknowledging the fact that no QTD can exist between five of the six extremity leads. Even more surprising, information about local cardiac action potential properties is thought to be obtainable from one exploring electrode, without realisation of the bipolar nature of every electrocardiographic lead, the precordial leads included. (The central terminal by no means constitutes a zero of potential.) What is measured are potential differences between two lead electrodes (the central terminal acts as a single electrode)—that is, lead voltage. The fact that the T wave in a lead becomes zero before the T waves in other leads does not mean that the electrode in that lead designated as exploring has gained zero potential, nor even that the two lead electrodes both have zero potential—it only means that the potentials are equal. (This is equivalent to saying that the electrical heart vector has become perpendicular to the lead axis.) Therefore, the QT duration in a lead does not allow inferences about the duration of potential in the electrodes and is of no help in obtaining information about action potential durations in the heart. Since in many studies, including our own,31 a discriminative or predictive value of QTD has been reported, we think this probably reflects T wave properties—that is, amplitude and axis—which will need further elucidation.