BETWEEN-GROUP VARIANCE
(November 26, 2011)
Prefatory note: The Measuring Health Disparities, Scanlan’s Rule, and Mortality and Survival pages of this site are principally devoted to explaining that four standard measures of differences between outcome rates (proportions) – relative differences in adverse outcomes, relative differences in favorable outcomes, absolute differences between rates, and odds ratios – are not useful for appraising the comparative well-being of two groups because each measure tends to be systematically affected by the overall prevalence of an outcome. This item is one of a number of items explaining how less standard measures that are functions of dichotomies are also affected by the overall prevalence of an outcome. See also the Concentration Index and Gini Coefficient sub-pages of the Measuring Health Disparities page.
In various places where I have questioned the wisdom of including perceptions about the size of healthcare disparities in pay-for-performance (P4P) programs, I have discussed Massachusetts’ inclusion of such perceptions in its Medicaid P4P program, noting that the program intended to measure disparities in terms of absolute differences between rates. I have also noted a pattern whereby, as uncommon outcomes become more common, absolute differences tend to increase, while, as common outcomes become even more common, absolute differences tend to decrease. Because of that pattern, for some outcomes higher levels of care will be associated with larger absolute differences and for others higher levels of care will be associated with smaller absolute differences. See generally the Pay for Performance subpage of the Measuring Health Disparities page. As made clear on many other pages of this site, the point is that the comparative size of absolute differences does not necessarily indicate anything meaningful about the comparative size of disparities in treatment.
A recent article in Health Affairs by Blustein et al.[i] discusses the disparities reduction aspects of the Massachusetts program and in doing so explains that after considering the absolute difference as a potential measure of healthcare disparities, Massachusetts decided to rely on the between-group variance (BGV). Because of the potential importance of that measure as a result of its use in the Massachusetts program, including the potential for its use in similar programs, the BGV warrants some discussion here. There are two parts to that discussion. The first involves the pattern by which the BGV is affected by the overall prevalence of an outcome (something discussed on other sub-pages of MHD with respect to the Gini Coefficient and the Concentration Index and in Sections A.13 and A.13a of the Scanlan’s Rule page with respect to the phi coefficient and Cohen’s Kappa) and the implications of that pattern regarding healthcare disparities. These include the potential to increase healthcare disparities by causing resources to flow to hospitals that serve few minorities. The second involves the pattern by which the BGV tends to vary according to the proportions that advantaged and disadvantaged groups make up of the patient population and the implications of that pattern regarding healthcare disparities. These implications, too, include the potential to increase healthcare disparities by causing resources to flow to hospitals that serve few minorities.
A. Effects of Overall Care Levels on Between-Group Variance
Table 1 below is based on Table 1 of the 2006 British Society for Population Studies Presentation. It shows the pattern by which the BGV[ii] changes as the prevalence of the outcome changes in a setting involving normal underlying risk distributions where the means differ by half a standard deviation and where the advantaged and disadvantaged populations are of equal size.[iii] That pattern is similar to the pattern of absolute differences, which are included in the final column. That is, as the favorable outcome increases from being very rare to being fairly common, the BGV tends to increase; as the favorable outcome increases from being fairly common to being almost universal, the BGV tends to decrease. For some nuances of the patterns of changes in absolute differences see the introduction to the Scanlan’s Rule page.
Table 1: BGV for Favorable Outcome Rate in BSPS Table 1 [ref zz2325]
| CutPoint | AGFavRt | DGFavRt | BGV | AbsDf |
|---|---|---|---|---|
| A 99 | 1.00% | 0.24% | 0.0014 | 0.0076 |
| B 97 | 3.00% | 0.87% | 0.01138 | 0.02134 |
| C 95 | 5.00% | 1.62% | 0.02859 | 0.03382 |
| D 90 | 10.00% | 3.75% | 0.09753 | 0.06246 |
| E 80 | 20.00% | 9.01% | 0.3018 | 0.10988 |
| F 70 | 30.00% | 15.39% | 0.5339 | 0.14614 |
| G 60 | 40.00% | 22.66% | 0.7514 | 0.17337 |
| H 50 | 50.00% | 30.85% | 0.9165 | 0.19147 |
| I 40 | 60.00% | 40.52% | 0.9491 | 0.19484 |
| J 30 | 70.00% | 50.80% | 0.9218 | 0.19202 |
| K 20 | 80.00% | 63.31% | 0.6966 | 0.16692 |
| L 10 | 90.00% | 78.23% | 0.3463 | 0.11769 |
| M 5 | 95.00% | 87.29% | 0.1488 | 0.07714 |
| N 3 | 97.00% | 91.62% | 0.0723 | 0.05379 |
| O 1 | 99.00% | 96.56% | 0.0015 | 0.00242 |
Thus, as in the case of the absolute difference, for some outcomes higher levels of care will be associated with larger BGV values and for other outcomes higher levels of care will be associated with lower BGV values. But data in the Blustein article indicate that all groups’ rates for the types of care examined in the Massachusetts program (above 80% for all types of care combined) are well within ranges in which higher levels will be associated with lower BGV values, something that the Blustein article in fact found.[iv]
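The rise-and-fall pattern in Table 1 can be approximated with a short calculation. The following is a minimal sketch, not the method used to build the table: it assumes two equal-sized groups whose underlying distributions are standard normal curves offset by half a standard deviation, and the function names (`two_group_bgv`, `table_row`) are hypothetical. The BGV column of Table 1 appears to be stated at 100 times the value that the formula in note [ii] yields when rates are entered as proportions, so the sketch multiplies by 100 on that assumption. Small differences from the printed figures are to be expected from rounding.

```python
from statistics import NormalDist


def two_group_bgv(rate_a, rate_d, share_d=0.5):
    """BGV for two groups, with rates and shares entered as proportions."""
    overall = share_d * rate_d + (1 - share_d) * rate_a
    return ((1 - share_d) * (rate_a - overall) ** 2
            + share_d * (rate_d - overall) ** 2)


def table_row(ag_rate, gap_sd=0.5):
    """Derive the disadvantaged group's rate from the advantaged group's rate.

    Assumes the disadvantaged group's mean lies gap_sd standard deviations
    below the advantaged group's mean (an assumption of this sketch).
    """
    std_normal = NormalDist()
    # Cut point above which a share ag_rate of the advantaged group falls.
    cut = std_normal.inv_cdf(1 - ag_rate)
    # Share of the disadvantaged group falling above the same cut point.
    dg_rate = 1 - std_normal.cdf(cut + gap_sd)
    return ag_rate, dg_rate, two_group_bgv(ag_rate, dg_rate), ag_rate - dg_rate


ag_rates = [0.01, 0.03, 0.05, 0.10, 0.20, 0.30, 0.40, 0.50,
            0.60, 0.70, 0.80, 0.90, 0.95, 0.97, 0.99]
rows = [table_row(r) for r in ag_rates]

print("AGFavRt DGFavRt  BGVx100    AbsDf")
for ag, dg, bgv, diff in rows:
    print(f"{ag:7.2%} {dg:7.2%} {100 * bgv:8.4f} {diff:8.5f}")
```

Under these assumptions both the BGV and the absolute difference climb until the advantaged group's rate is about 60% and decline thereafter, so that in the range above 80% higher rates go with lower BGV values.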
One important implication of this association warrants discussion. A 2010 Health Affairs article by Friedberg et al.[v] discussed the potential for pay-for-performance generally (i.e., without consideration of disparities issues) to steer resources away from more vulnerable communities and hence to increase disparities. The authors reasoned that hospitals with a high proportion of patients from vulnerable populations tend to have lower overall performance, which pattern plays a large role in healthcare disparities. Hence, rewarding high-performing hospitals will further disadvantage the already disadvantaged institutions that provide the bulk of care to vulnerable populations, with a potential for increasing healthcare disparities. The forces underlying the potential increases discussed by Friedberg et al. are unrelated to the measurement issues that I have raised (though such issues have to be considered in determining whether disparities have in fact increased or decreased).
In settings where higher levels of care are associated with lower BGV and lower proportions of minorities, the effort to tie pay-for-performance to the reduction of healthcare disparities will contribute further to the steering of resources away from institutions where minorities comprise a high proportion of patients. But this phenomenon contrasts sharply with that identified by Friedberg et al. in an important respect. The phenomenon identified in the Friedberg article involved an unfortunate consequence of a program most believe will promote general improvements in healthcare. The steering of resources arising from the correlation between low BGV scores and high institutional performance involves an aspect of a fundamentally flawed tool for measuring the size of disparities in treatment within institutions.
B. Effects of Minority Representation among Patients on Between-Group Variance
Blustein et al. noted an association between low minority representation among patients and low BGV. They attribute this pattern to the BGV calculation itself in a way that is unrelated to the distributionally-driven patterns discussed in Section A. Because, as discussed in Section A, the distributionally-driven patterns will tend to cause an association between low minority representation among patients and low BGV, it needs to be recognized that the observed association may in fact be partially or entirely a function of the distributional forces rather than of the aspect of the BGV calculation cited by Blustein et al. But there does exist potential for increases in minority representation in hospitals to directly increase the BGV in those hospitals (though not precisely for the reasons proffered by Blustein et al. and not universally) and, in any case, the BGV will be influenced by the composition of the patient population in a way that cannot be justified.
It first warrants note that typically where a measure of disparity is based on the comparison of a disadvantaged group’s rate with the rate of the entire population rather than with the rate of an advantaged group (as, for example, with the standardized mortality ratio or the healthcare disparities index discussed in note vii infra) the larger the proportion the disadvantaged group comprises of the total population, the smaller will be the disparity, however measured. That occurs because the larger the proportion the disadvantaged group comprises of the total population the more the group’s rate will influence the overall rate and hence the less the group’s rate can depart from the overall rate. But while the BGV compares the disadvantaged group’s rate with the overall rate, it treats the advantaged group’s rate’s departure from the overall average the same as the disadvantaged group’s rate’s departure from the overall average (with each weighted according to the group’s proportion of the total population). And the larger the proportion the disadvantaged group comprises of the total population the greater is the degree to which the advantaged group’s rate may depart from the overall rate. That tends to nullify the typical result of comparing a disadvantaged group’s rate with that of the total population rather than with the rate of the advantaged group. Other forces then control the pattern of correlation between the proportion the disadvantaged group comprises of the total population and the measure of disparity.
In their Appendix Exhibit 3, Blustein et al. attribute the observed pattern of association between low BGV and low minority representation among patients to the fact that hospitals with fewer minorities will have fewer opportunities to provide care for minorities, then stating that less diverse hospitals appear to provide better care. The latter point might be deemed correct depending on how one describes diversity. But the size of the BGV only tends to increase as the disadvantaged group’s share moves toward 50% of the total patient population. Thereafter, it will tend to decline.
That is, for any given pair of rates for the advantaged and disadvantaged group (for instant purposes being defined as the groups with rates of appropriate care that are higher than and lower than the overall average), the BGV will reach a maximum at the point where the two groups each comprise 50% of the patient population. But for any given pair of rates, the BGV will be the same where, for example, the advantaged group comprises two-thirds and the disadvantaged group one-third of the patient population as where the advantaged group comprises one-third and the disadvantaged group two-thirds.
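For the two-group case, the symmetry just described follows directly from the formula in note [ii]. The derivation below is a sketch in notation introduced here for illustration only: $p$ is the disadvantaged group's share of patients, $r_A$ and $r_D$ are the two groups' rates, and $R$ is the overall rate.

```latex
\begin{aligned}
R &= p\,r_D + (1-p)\,r_A,\\
r_A - R &= p\,(r_A - r_D), \qquad r_D - R = -(1-p)\,(r_A - r_D),\\
\mathrm{BGV} &= (1-p)\,(r_A - R)^2 + p\,(r_D - R)^2\\
&= \bigl[(1-p)\,p^2 + p\,(1-p)^2\bigr]\,(r_A - r_D)^2\\
&= p\,(1-p)\,(r_A - r_D)^2 .
\end{aligned}
```

The second line shows the point made earlier: as $p$ grows, the disadvantaged group's departure from the overall rate shrinks while the advantaged group's departure grows. The last line is unchanged when $p$ and $1-p$ are swapped and, for a fixed pair of rates, is largest at $p = 0.5$.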
Table 2 illustrates this pattern with the hypothetical data from Hospitals B and C in Blustein’s Table 1 of Appendix Exhibit 3. That table presented two situations where the minority appropriate care rate was 50% and the white appropriate care rate was 83.4%, which yielded a BGV of .025 where minorities comprised two thirds of the population and .015 where minorities comprised 9.1% of the population, as reflected in the rows for Hospitals B and C below.[vi] But, as reflected in the rows for Hospitals B-alt and C-alt, each situation yields the same BGV value when the proportions that minorities and whites comprise of the total population are reversed.
Table 2: Example of Impact on BGV of Different Minority Representations [ref b2325 a 2]
| Hospital | MinCareRate | WhiteCareRate | MinorityRep | BGV |
|---|---|---|---|---|
| B | 50.00% | 83.33% | 66.67% | .025 |
| B-alt | 50.00% | 83.33% | 33.33% | .025 |
| C | 50.00% | 83.33% | 16.67% | .015 |
| C-alt | 50.00% | 83.33% | 83.33% | .015 |
| D | 50.00% | 83.33% | 50.00% | .028 |
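The BGV column of Table 2 can be recomputed from the rates and minority shares shown in the table. The following is a minimal sketch; the helper `bgv` and the dictionary names are hypothetical, and the shares are entered as the fractions that the table's percentages appear to round (for example, 2/3 for 66.67%).

```python
def bgv(shares, rates):
    """Between-group variance from population shares and group rates."""
    overall = sum(s * r for s, r in zip(shares, rates))
    return sum(s * (r - overall) ** 2 for s, r in zip(shares, rates))


# Care rates as shown in Table 2: 50.00% for minorities, 83.33% for whites.
MIN_RATE, WHITE_RATE = 0.50, 5 / 6

# Minority share of patients for each row of Table 2.
minority_share = {"B": 2 / 3, "B-alt": 1 / 3,
                  "C": 1 / 6, "C-alt": 5 / 6, "D": 1 / 2}

results = {
    name: bgv([share, 1 - share], [MIN_RATE, WHITE_RATE])
    for name, share in minority_share.items()
}

for name, value in results.items():
    print(f"{name:6s} minority share {minority_share[name]:7.2%}  BGV {value:.3f}")
```

Each reversed pair (B and B-alt, C and C-alt) produces the same value, and the 50/50 split in row D produces the largest.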
In the Massachusetts situation examined by Blustein et al., where minority representation tends generally to be low, disadvantaged groups may never have comprised more than 50% of a hospital’s patients. Thus, in the Massachusetts setting there may well be reason to expect that the larger a proportion the disadvantaged group comprises of the total population, the larger will be the BGV, with the consequence that, for reasons unrelated to the size of disparities in particular institutions as those disparities might be rationally measured, the effort to address within-hospital disparities will steer resources away from institutions that serve large numbers of minorities.
In many larger cities, however, minorities will comprise a majority of patients at some hospitals. And in such hospitals, the more the minority representation exceeds 50%, the lower will tend to be the BGV. But whether the particular situation is one where disadvantaged groups, minority or otherwise, comprise less than 50% of total patients (and hence where the larger is the disadvantaged group’s representation the larger tends to be the BGV) or one where disadvantaged groups comprise more than 50% of total patients (and hence where the larger is the disadvantaged group’s representation the smaller tends to be the BGV), the appraisal of the disparity will be influenced by something that has no rational justification.[vii]
A final word is in order about some singular aspects of the Massachusetts situation. Blustein et al. report overall statewide appropriate care rates of 91.4% for blacks, 89.6% for whites and 86.2% for Hispanics. These figures appear to have been derived without adjustment for any differences in the compositions of the types of care each group received or for hospital rates. Results like these, obtained without adjustment for hospital, suggest that neither differences in performance across hospitals with different minority representations nor differences in disparities within hospitals are very great (though one must keep in mind that NCHS would likely find the 59% greater rate at which Hispanics failed to receive appropriate care to be substantial and the procedure I have recommended (see NACE 2011) would find a difference between means of .27 standard deviations). In any case, the generally high quality of hospitals in Massachusetts, particularly in the cities where minorities are principally located, suggests that variation in hospital quality may be smaller than in other jurisdictions. Similarly, issues concerning the steering of resources to high-performing hospitals that serve few minorities may be less serious in Massachusetts than elsewhere.
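The two figures in the parenthetical can be approximated from the statewide rates. The sketch below rests on two assumptions that the text does not state: that the comparison is between Hispanics and blacks (the lowest and highest rates), and that the recommended procedure amounts to taking the difference between the standard normal quantiles corresponding to the two rates. The function names are hypothetical.

```python
from statistics import NormalDist

# Statewide appropriate care rates reported by Blustein et al.
rates = {"black": 0.914, "white": 0.896, "Hispanic": 0.862}


def relative_excess_failure(rate_low, rate_high):
    """Proportion by which the lower-rate group's failure rate exceeds the other's."""
    return (1 - rate_low) / (1 - rate_high) - 1


def difference_between_means(rate_high, rate_low):
    """Gap, in standard deviations, between two normal distributions that would
    yield the two rates at a common cut point (assumed form of the procedure)."""
    quantile = NormalDist().inv_cdf
    return quantile(rate_high) - quantile(rate_low)


excess = relative_excess_failure(rates["Hispanic"], rates["black"])
gap = difference_between_means(rates["black"], rates["Hispanic"])

print(f"Relative excess in failure rate: {excess:.1%}")
print(f"Difference between means: {gap:.3f} SD")
```

On these assumptions the results land close to the figures quoted above. Small differences may reflect rounding in the published rates.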
[i] Blustein J, Weissman JS, Ryan AM, et al. Analysis raises questions on whether pay-for-performance in Medicaid can efficiently reduce racial and ethnic disparities. Health Aff (Millwood) 2011;30(6):1165-1175.
[ii] The formula for calculating the BGV in the discussion herein is that set out in Appendix Exhibit 1 to the Blustein article. That is, $\mathrm{BGV} = \sum_i \left( n_i/d_i - N/D \right)^2 \, (d_i/D)$, where:
$n_i$ = the number of successfully achieved opportunities for a given racial/ethnic group
$d_i$ = the total number of eligible opportunities for a given racial/ethnic group
$N$ = the total number of successfully achieved opportunities (for all groups)
$D$ = the total number of eligible opportunities (for all groups).
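The formula in this note translates directly into a few lines of code. The following is a minimal sketch; the function name `bgv_from_counts` is hypothetical, and the counts are invented for illustration, chosen so that the groups have the 50% and 83.33% rates and the two-thirds/one-third split of Hospital B in Table 2.

```python
def bgv_from_counts(groups):
    """BGV from a list of (n_i, d_i) pairs.

    n_i = successfully achieved opportunities for group i
    d_i = eligible opportunities for group i
    """
    total_n = sum(n for n, _ in groups)  # N in the note's notation
    total_d = sum(d for _, d in groups)  # D in the note's notation
    overall = total_n / total_d
    return sum((n / d - overall) ** 2 * (d / total_d) for n, d in groups)


# Hypothetical counts: 600 opportunities met at 50%, 300 met at 83.33%.
example_groups = [(300, 600), (250, 300)]
value = bgv_from_counts(example_groups)
print(f"BGV = {value:.4f}")
```

Because each group's squared departure is weighted by its share of opportunities, the same function handles any number of groups.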
[iii] The specification that the advantaged and disadvantaged populations are of equal size is solely for convenience and, while affecting the size of the BGV, does not affect the pattern of correlation between prevalence and the size of the BGV. The specification that the means differ by half a standard deviation may affect those patterns in a small way but only in the range where one group’s rate of experiencing some outcome is above 50% and the other’s is below 50% (as discussed with regard to the absolute difference in the introduction to the Scanlan’s Rule page).
[iv] While the extent to which different entities appraise disparities using measures that tend systematically to lead to different conclusions is not the main point of this item, I note that with the rates at issue in the Massachusetts setting the National Center for Health Statistics (which measures all disparities in terms of relative differences in adverse outcomes) and the Agency for Healthcare Research and Quality (which measures disparities in terms of the larger relative difference, which in this range tends to be the relative difference in the adverse outcome) would tend to reach conclusions as to the comparative size of disparities different from those yielded by the BGV. The Centers for Disease Control and Prevention (which generally relies on absolute differences between rates) would tend to reach the same results as those yielded by the BGV. See Sections E.4 of the Measuring Health Disparities page and A.6 of the Scanlan’s Rule page. See also the 2007 APHA Presentation, 2011 NACE Presentation, and 2011 ICPHS Presentation.
[v] Friedberg MW, Safran DG, Coltin K et al. Paying for performance in primary care: Potential impact on practices and disparities. Health Aff (Millwood) 2010;29:926-932.
[vi] Table 2 in the text here collapses the figures for blacks and Hispanics in the Blustein table into one minority rate, which it is necessary to do in order to readily illustrate the implications of reversing the proportions that advantaged and disadvantaged groups make up of the total population. This collapsing, however, is only statistically sound where the appropriate care rates for all disadvantaged groups are the same (as was the case in the Blustein hypothetical data). It would not be sound where the disadvantaged group rates differ from one another, because the BGV yielded when the different rates of individual disadvantaged (or advantaged) groups are collapsed into one overall rate for the combined groups differs from the BGV yielded when the groups are kept separate.
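The point in this note can be checked numerically. The following is a minimal sketch with invented shares and rates for one advantaged and two disadvantaged groups; the helper names `bgv` and `collapse_minorities` are hypothetical.

```python
def bgv(shares, rates):
    """Between-group variance from population shares and group rates."""
    overall = sum(s * r for s, r in zip(shares, rates))
    return sum(s * (r - overall) ** 2 for s, r in zip(shares, rates))


def collapse_minorities(shares, rates):
    """Merge the second and third groups into one group with a pooled rate."""
    pooled_share = shares[1] + shares[2]
    pooled_rate = (shares[1] * rates[1] + shares[2] * rates[2]) / pooled_share
    return [shares[0], pooled_share], [rates[0], pooled_rate]


# Hypothetical hospital: advantaged group, then two disadvantaged groups.
shares = [0.50, 0.25, 0.25]
equal_rates = [0.80, 0.50, 0.50]    # disadvantaged groups share one rate
unequal_rates = [0.80, 0.60, 0.40]  # same pooled rate (50%), different parts

separate_equal = bgv(shares, equal_rates)
collapsed_equal = bgv(*collapse_minorities(shares, equal_rates))
separate_unequal = bgv(shares, unequal_rates)
collapsed_unequal = bgv(*collapse_minorities(shares, unequal_rates))

print(f"equal rates:   separate {separate_equal:.4f}, collapsed {collapsed_equal:.4f}")
print(f"unequal rates: separate {separate_unequal:.4f}, collapsed {collapsed_unequal:.4f}")
```

With equal disadvantaged-group rates the two calculations agree; with unequal rates the collapsed calculation discards the variation between the disadvantaged groups and so yields a smaller value.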
[vii] See the Comment on Siegel Q Manage Health Care 2009 for my criticism of the healthcare disparities index proposed by Siegel et al., which I faulted, among other reasons, for tying the disparity score to the number of minority patients. The index is intended to increase as the number of minorities increases. Assuming that the number of minorities correlated with the proportion they comprised of the patient population (rather than simply the size of the hospital), the index would increase across the range of minority representations among patients. Thus, in a pay-for-performance program that tied higher payments to lower index scores, the method proposed by Siegel et al. would seem more consistently (than the BGV) to decrease payments to institutions with high minority representations for reasons that had nothing to do with any meaningful indicator of equal treatment.