Channeling Bias With Propensity Scores Analysis
Research in Social and Administrative Pharmacy 2 (2006) 143–151
Commentary
Addressing the issue of channeling
bias in observational studies
with propensity scores analysis
Francis S. Lobo, B.Pharm., M.S., Ph.D.a,*,
Samuel Wagner, R.Ph., Ph.D.b, Cynthia R. Gross, Ph.D.c,
Jon C. Schommer, R.Ph., M.S., Ph.D.d
a Health Economics & Outcomes Research Group, East Hanover, NJ 07936, USA;
Novartis Pharmaceuticals, College of Pharmacy, University of Minnesota, MN 55455, USA
b Customer Outcomes Research Group, Pfizer Inc, New York, NY 10017, USA
c Department of Experimental and Clinical Pharmacology, College of Pharmacy,
University of Minnesota, MN 55455, USA
d Graduate Studies in Social and Administrative Pharmacy, College of Pharmacy,
University of Minnesota, MN 55455, USA
Abstract
Randomized clinical trials (RCTs) remain the gold standard for determining the
utility of pharmaceuticals, especially from a safety and efficacy standpoint. However,
restrictive entry criteria and stringent protocols can be barriers to generalizing RCT
findings to real world practices and outcomes. Observational studies overcome these
limitations of RCTs since they are representative of real world populations and prac-
tices. Nonetheless, attributing causality remains a major limitation in observational
studies, due to the non-random assignment of subjects to treatment. Non-random
assignment can lead to imbalances in risk-factors between the groups being com-
pared and thus bias the estimates of the treatment effect. Non-random assignment
can be particularly problematic in observational studies comparing older versus
newer pharmaceuticals from similar therapeutic classes due to the phenomenon of
* Corresponding author: One Health Plaza, East Hanover, NJ 07936, Tel.: 862 778 5218,
Fax: 973 781 2439.
E-mail address: [email protected] (F.S. Lobo).
1551-7411/$ - see front matter © 2006 Elsevier Inc. All rights reserved.
doi:10.1016/j.sapharm.2005.12.001
144 F.S. Lobo et al. / Research in
Social and Administrative Pharmacy 2 (2006) 143–151
channeling. Channeling occurs when drug therapies with similar indications are pref-
erentially prescribed to groups of patients with varying baseline prognoses. In this
manuscript we discuss the phenomenon of channeling and the use of a statistical
technique known as propensity scores analysis, which potentially adjusts for the
effects of channeling. We also discuss tests for determining the quality of the
derived propensity score, various techniques for utilizing
propensity scores, and also the potential limitations of this technique. With the in-
creasing availability of high quality pharmaceutical and medical claims data for
use in observational studies, increased attention must be given to analytic techniques
that adjust optimally for non-random assignment and resulting channeling bias. For
research studies using observational study designs, propensity score analysis offers
a reasonable solution to address the limitation of non-random assignment, especially
when RCTs are too costly, time-consuming or not ethically feasible.
© 2006 Elsevier Inc. All rights reserved.
1. Introduction
Randomized clinical trials (RCTs) are the gold standard for establishing
causality in clinical research. The randomization procedure is a touchstone
that permits the outcome of interest to be causally linked to exposure in an
RCT. To a large extent it ensures that patient or sample characteristics will
be allocated among treatment groups in an unbiased manner and hence will
be uniformly distributed, if an optimal sample size is attained. This applies
especially to confounders, ie, factors modifying the exposure-outcome rela-
tionship by their dual correlation with both the exposure and outcome. The
main limitations of RCTs are practical barriers, which include time, cost,
and in certain instances ethical considerations, which make conducting an
RCT difficult or even impossible. Randomized clinical trials may addition-
ally be criticized for the lack of generalizability. Randomized clinical trials
typically measure efficacy and not the effectiveness of treatment. Narrow in-
clusion criteria and willingness to participate in research often ensure that
the characteristics of patients participating in the clinical trials are not rep-
resentative of the general population. Additionally, the RCT protocols spec-
ifying the use of the intervention may deviate substantially from general
practice thus undermining the generalizability of results.
Observational studies generally reflect real world practices, capturing the
actual behaviors of both physicians and patients. Furthermore,
observational studies, specifically those with retrospective database study
designs, tend to be relatively inexpensive and less time consuming than
RCTs. In particular, database studies making use of medical and pharmaceu-
tical claims data have gained popularity in the last decade because of the
wealth of information they may provide to decision makers. Apart from
clinical data reported from RCTs focusing on treatment efficacy, decision
makers are interested in the real world outcomes that result from an intervention,
ie, the effectiveness question. For such information they rely on results from
observational studies using data from enrolled plan members to assess the ef-
fectiveness of various interventions. Some of the advantages of such observa-
tional studies are that they are relatively inexpensive, easy to implement from
a time and process standpoint, and are reflective of real world practices.
However, results from such studies can be distorted by selection bias and
confounding. Selection bias arises from the way people are recruited for the
study or retained during the course of the study. Confounding is the influence
of extraneous variables related to both exposure and outcome. In observa-
tional, comparative studies selection bias and/or confounding may be re-
sponsible for part or all of the observed effects, the lack of observed effect,
or a reversal of the effect.1 This article describes the phenomenon of “channeling”
bias that results from selective prescribing of pharmaceuticals by
physicians. The article focuses on the use of a statistical procedure called
“propensity scores analysis,” which offers a useful and relatively easy
approach to address the issue of channeling bias in such observational studies.
A relatively easy way to adjust for a single confounder such as age is via
stratification, followed by estimating the outcome within each age stratum.11 However,
this technique is not feasible when there are multiple confounding variables.
A practical technique for adjusting for several confounders at once in
observational studies is via the use of propensity scores. A propensity score
is the “conditional probability of exposure to a treatment given observed
covariates.”11-13 It represents the likelihood or probability of receiving
one treatment over the other, based upon the combined influence of
selected independent predictors. Propensity scores range from 0 to 1. This pro-
cedure has 2 stages: the derivation stage, where the score is obtained through
various regression models, and the adjustment stage, where the score is used
via different techniques to control the effects of channeling. In the derivation
stage, all the information from background covariates for each individual is
summarized into a propensity score. This process involves the use of statistical
methods such as logistic regression, discriminant analyses, or classification
trees. The dependent variable represents the treatment/exposure of interest,
and the predictor variables are the background variables assumed to be related
to both the choice of therapy and the outcome of interest. The predictors
include patient’s age, gender, comorbidities, etc. Generally, in observational
studies, such as those using claims data, only certain variables that are re-
corded are available for use as predictors. However, there may be many other
confounding variables that are not recorded and hence not available for use.
This represents a limitation of the propensity scores analysis procedure.
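The derivation stage can be illustrated with a minimal sketch. It assumes a logistic regression fitted by plain gradient ascent so that the example needs no statistics package; in practice a standard statistical library would normally be used. The function names, the two covariates (age centered at 60 and expressed in decades, and a comorbidity flag), and the 12 subjects are all hypothetical.

```python
import math

def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def linear_predictor(beta, row):
    """Intercept plus the weighted sum of one subject's covariates."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))

def fit_logistic(covariates, treated, learning_rate=0.1, n_iter=5000):
    """Fit P(treated = 1 | covariates) by gradient ascent on the log-likelihood.

    Returns [intercept, coefficient_1, ..., coefficient_k].
    """
    n, k = len(covariates), len(covariates[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for row, t in zip(covariates, treated):
            err = t - sigmoid(linear_predictor(beta, row))
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        beta = [b + learning_rate * g / n for b, g in zip(beta, grad)]
    return beta

def propensity_scores(covariates, beta):
    """Predicted probability of receiving the treatment for each subject."""
    return [sigmoid(linear_predictor(beta, row)) for row in covariates]

# Hypothetical subjects: (age centered at 60, in decades; comorbidity flag).
covariates = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 1), (0.0, 0), (0.5, 1),
              (1.0, 0), (1.5, 1), (-1.0, 1), (0.5, 0), (1.0, 1), (0.0, 1)]
# 1 = newer drug, 0 = older drug (hypothetical exposure).
treated = [0, 0, 1, 0, 0, 1, 1, 1, 0, 1, 0, 1]

beta = fit_logistic(covariates, treated)
scores = propensity_scores(covariates, beta)
```

Each entry of `scores` lies between 0 and 1 and summarizes both covariates for one subject. Only the recorded covariates enter the model, which is the limitation noted above.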
To assess the balance between treatment groups on the background cova-
riates, quintiles of the distribution of the propensity scores derived from the
logistic regression are constructed. Within each stratum of the propensity
score, the 2 treatment groups are compared on the background covariates.
Chi-square tests can be used to test these differences for categorical covariates,
and t tests for continuous variables. If significant differences
remain, the logistic regression for deriving propensity
scores must be revisited. Possible solutions for further improvement of
the model include the addition of interaction terms to the equation or
identification of additional background variables that are hypothesized to
impact treatment choice. This is again followed by the process of stratifying
by quintiles and testing for differences between treatment groups on back-
ground covariates within each stratum. One needs to bear in mind that
the primary purpose of this logistic regression is to predict propensity
scores. Some indication of model fit, such as the -2 log-likelihood ratio,
might be useful in interpretation of results; however, this has yet to be
demonstrated empirically. As an alternative approach to assessing how “good”
their scores were, Connors et al14 reported the area under the receiver operating
characteristic (ROC) curve for the propensity scores analysis in their
study assessing the effectiveness of right heart catheterization. The area
under the ROC curve is a measure of predictive accuracy and of the degree to
which the propensity scores classify subjects into their actual exposure
groups. A high value (closer to 1) indicates a high level of discrimination
between the 2 exposure groups.
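The two checks described above can be sketched as follows. This is an illustration under stated assumptions, not the procedure of any cited study: the function names, scores, exposure flags, and ages are hypothetical, and the balance check reports raw within-quintile mean differences rather than the t tests or chi-square tests mentioned in the text, which would normally come from a statistics package.

```python
def quintile_strata(scores):
    """Assign each subject to a propensity-score quintile (0 = lowest fifth)."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    strata = [0] * n
    for rank, i in enumerate(order):
        strata[i] = rank * 5 // n
    return strata

def mean_difference_by_stratum(scores, treated, covariate):
    """Treated-minus-control mean of one covariate within each quintile.

    A quintile that lacks either group is reported as None.
    """
    strata = quintile_strata(scores)
    result = {}
    for q in range(5):
        t_vals = [c for s, t, c in zip(strata, treated, covariate) if s == q and t == 1]
        c_vals = [c for s, t, c in zip(strata, treated, covariate) if s == q and t == 0]
        if t_vals and c_vals:
            result[q] = sum(t_vals) / len(t_vals) - sum(c_vals) / len(c_vals)
        else:
            result[q] = None
    return result

def roc_auc(scores, treated):
    """Area under the ROC curve: the probability that a randomly chosen treated
    subject has a higher score than a randomly chosen control (ties count 1/2)."""
    pos = [s for s, t in zip(scores, treated) if t == 1]
    neg = [s for s, t in zip(scores, treated) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical propensity scores, exposure flags, and ages for 10 subjects.
scores = [0.10, 0.15, 0.22, 0.30, 0.35, 0.45, 0.52, 0.60, 0.71, 0.85]
treated = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]
age = [45, 50, 58, 55, 60, 62, 66, 64, 70, 75]

diffs = mean_difference_by_stratum(scores, treated, age)
auc = roc_auc(scores, treated)
```

In this toy data the lowest and highest quintiles each contain only one exposure group, so no within-stratum comparison is possible there. Such gaps in overlap are one of the things this check is meant to reveal.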
The second stage of propensity scores analysis involves the use of propen-
sity scores in a variety of analytic methods such as matching, stratification,
or in regression models.15 Matching is a technique often used to equate in-
dividuals across treatment arms on the confounder of interest. The end result
of matching is a cohort comprised of subjects balanced on the confounder of
interest. Because a propensity scores analysis summarizes several of the
background covariates influencing treatment selection for each individual,
matching subjects across treatment cohorts on the propensity scores enables
the researcher to control for several background covariates simultaneously.
In the Connors et al14 study, subjects from the 2 treatment groups who had
propensity scores within a range of 0.03 were matched with each other. For
example, a subject with a score of 0.60 could be matched either with controls
having a score of 0.61, 0.62, or 0.63 on the upper end or with controls having
a score of 0.59, 0.58, or 0.57 on the lower end, if a control with the exact same
score of 0.60 was not available. Rosenbaum and Rubin16 provide a compre-
hensive description of the possible matching strategies in their study.
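A matching rule of this kind can be sketched in a few lines. The sketch assumes greedy 1:1 nearest-neighbor matching without replacement, which is one of several possible strategies and is not necessarily the algorithm used by Connors et al; only the 0.03 range comes from the text. The function name and data are hypothetical.

```python
def caliper_match(scores, treated, caliper=0.03):
    """Greedy 1:1 matching without replacement.

    Each treated subject, in input order, takes the nearest unused control
    whose propensity score differs by at most `caliper`. Returns a list of
    (treated_index, control_index) pairs; unmatched subjects are dropped.
    """
    tolerance = 1e-9  # guards against floating-point error at the caliper edge
    controls = [i for i, t in enumerate(treated) if t == 0]
    used = set()
    pairs = []
    for i, t in enumerate(treated):
        if t != 1:
            continue
        best, best_gap = None, None
        for j in controls:
            if j in used:
                continue
            gap = abs(scores[i] - scores[j])
            if gap <= caliper + tolerance and (best is None or gap < best_gap):
                best, best_gap = j, gap
        if best is not None:
            used.add(best)
            pairs.append((i, best))
    return pairs

# Hypothetical scores: subject 0 (0.60) can match 0.58 or 0.63; subject 4 has no match.
scores = [0.60, 0.63, 0.40, 0.58, 0.90, 0.41]
treated = [1, 0, 1, 0, 1, 0]

pairs = caliper_match(scores, treated)
```

Here the treated subject with a score of 0.60 is paired with the control at 0.58 rather than the one at 0.63, because it is closer. The treated subject at 0.90 has no control within 0.03 and leaves the matched cohort.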
Another manner in which propensity scores can be used is via stratifica-
tion, or subclassification of the treatment groups being compared on the
propensity scores. The idea is to directly compare subjects within a particular
stratum on the outcome of interest. For example, after stratifying the treat-
ment cohorts by quintiles of the propensity score, the outcome of interest
can be observed and compared between the treatment groups within each
quintile. Rubin11 provides an excellent example of the utility of subclassifi-
cation on the propensity scores in a study comparing the impact of 2 proce-
dures (breast conservation vs mastectomy) on survival rates. Several other
studies using this particular methodology of propensity score subclassifica-
tion are available in the literature.17,18
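Subclassification can be illustrated with a short sketch. It assumes the within-quintile differences are pooled by weighting each quintile by its size, which is one common choice and is not prescribed by the text; the function name and the data are hypothetical.

```python
def stratified_effect(scores, treated, outcome, n_strata=5):
    """Compare outcomes between treatment groups within propensity-score strata.

    Returns (per_stratum, pooled), where per_stratum is a list of
    (stratum_size, treated_mean - control_mean) for strata containing both
    groups, and pooled is the size-weighted average of those differences
    (None if no stratum contains both groups).
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    per_stratum = []
    for q in range(n_strata):
        members = [i for rank, i in enumerate(order) if rank * n_strata // n == q]
        t_vals = [outcome[i] for i in members if treated[i] == 1]
        c_vals = [outcome[i] for i in members if treated[i] == 0]
        if t_vals and c_vals:
            diff = sum(t_vals) / len(t_vals) - sum(c_vals) / len(c_vals)
            per_stratum.append((len(members), diff))
    total = sum(size for size, _ in per_stratum)
    if total == 0:
        return per_stratum, None
    pooled = sum(size * diff for size, diff in per_stratum) / total
    return per_stratum, pooled

# Hypothetical data: 10 subjects, one treated and one control in each quintile.
scores = [0.10, 0.20, 0.30, 0.40, 0.50, 0.60, 0.70, 0.80, 0.90, 0.95]
treated = [1, 0, 0, 1, 1, 0, 0, 1, 1, 0]
outcome = [5, 4, 6, 8, 9, 7, 10, 12, 13, 12]

per_stratum, pooled = stratified_effect(scores, treated, outcome)
```

The per-quintile differences can be inspected individually, as the text describes, or summarized by the pooled value.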
The third method for addressing the issue of channeling bias via the use
of propensity scores is that of regression (covariance) adjustment.15 There
are 2 possible ways in which this particular methodology is used. The first
methodology uses a regression model with 2 predictor variables of the out-
come of interest: the treatment group and the propensity score. However,
a more popular way to use propensity scores in regression adjustment is
to use a subset of covariates and the propensity score in the regression equa-
tion. Studies using this methodology are those by Connors et al14 and Hylan
et al.19 Using the propensity score with a subset of covariates has an advan-
tage over regression models using a larger set of covariates for adjustment.
This is primarily because a parsimonious model with fewer predictors allows
the investigator to assess model fit more reliably as compared to larger
models.15 However, researchers must be careful while using this particular
technique because the estimated propensity scores can be very influential
on the estimated effects of the treatments in question. Hence, appropriate
diagnostics assessing the accuracy of the propensity score as well as the
resulting regression estimates must be used.20
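The first form of regression adjustment, with the treatment group and the propensity score as the only 2 predictors, can be sketched as below. The sketch assumes a continuous outcome and ordinary least squares solved through the normal equations. The helper names are hypothetical, and the outcome is constructed so that the true treatment effect is known to be 1.5.

```python
def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def regression_adjust(treated, scores, outcome):
    """Least-squares fit of outcome on [1, treatment, propensity score].

    Returns [intercept, treatment_effect, score_coefficient].
    """
    X = [[1.0, float(t), s] for t, s in zip(treated, scores)]
    k = 3
    XtX = [[sum(row[a] * row[b] for row in X) for b in range(k)] for a in range(k)]
    Xty = [sum(row[a] * y for row, y in zip(X, outcome)) for a in range(k)]
    return solve(XtX, Xty)

# Hypothetical data: treated subjects tend to have higher propensity scores,
# and the outcome rises with the score regardless of treatment.
treated = [0, 0, 1, 0, 1, 1, 0, 1]
scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
outcome = [2.0 + 1.5 * t + 10.0 * s for t, s in zip(treated, scores)]

treated_mean = sum(y for y, t in zip(outcome, treated) if t == 1) / 4
control_mean = sum(y for y, t in zip(outcome, treated) if t == 0) / 4
naive_difference = treated_mean - control_mean

coefficients = regression_adjust(treated, scores, outcome)
adjusted_effect = coefficients[1]
```

In this constructed example the unadjusted difference in means is about 3.5, while the coefficient on treatment recovers the built-in effect of 1.5 once the propensity score is in the model. Real data will not separate this cleanly, which is why the diagnostics mentioned above matter.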
Observational studies often lack information on important
confounders that may affect the exposure-outcome relationship. This
is frequently seen in studies using claims data, where the data were never
collected with the intent of conducting outcomes research. In the process of
deriving a propensity score, only observed variables can be used. Unobserved
variables cannot enter the model, and hence controlling for them directly
is not possible.11 If the unobserved variables are correlated
with the observed variables used in generating the propensity score, one
can cautiously assume that they have been controlled to the extent of that correlation.
Observational studies seeking to control the impact of channeling bias via
the use of propensity scores analysis need to have large sample sizes. Larger
sample sizes increase the probability of even distribution of observed cova-
riates among treatment groups when they are subclassified on the derived
propensity score. If the treatment groups are matched on the propensity
score, the sample size will be reduced, especially in 1:1 matching, because
subjects without a suitable match are dropped. Researchers need to account
for this loss when planning the analysis.
8. Conclusions
Acknowledgments
References
1. Collet J, Boivin J. Bias and confounding in pharmacoepidemiology. In: Strom BL, ed.
Pharmacoepidemiology. Chichester, West Sussex, England: John Wiley & Sons Ltd; 2000.
2. Feinstein AR. Clinical Epidemiology. Philadelphia, Pa: Saunders; 1985.
3. Urquhart J. ADR crisis management - before and after. Scrip. 1989;1388:19–21.
4. Petri H, Urquhart J. Channeling bias in the interpretation of drug effects. Stat Med.
1991;10:577–581.
5. Petri H, Naus J, Urquhart J. Channeling of aerosol beta agonist and the interpretation of
a concomitant adverse event. J Clin Res Drug Dev. 1989;3:224–230.
6. Inman WH. Comparative study of five NSAIDs. PEM News. 1985;3:3–13.
7. O’Donnell TV, Rea HH, Holst PE, et al. Fenoterol and fatal asthma. Lancet. 1989;1:
1070–1071.
8. Spitzer WO, Buist AS. Case-control study of prescribed fenoterol and death from asthma in
New Zealand. Thorax. 1990;45:645–646.
9. Blais L, Ernst P, Suissa S. Confounding by indication and channeling over time: the risks of
beta 2-agonists. Am J Epidemiol. 1996;144:1161–1169.
10. Wolfe F, Flowers N, Burke TA, et al. Increase in lifetime adverse drug reactions, service
utilization, and disease severity among patients who will start COX-2 specific inhibitors:
quantitative assessment of channeling bias and confounding by indication in 6689 patients
with rheumatoid arthritis and osteoarthritis. J Rheumatol. 2002;29:1015–1022.
11. Rubin DB. Estimating causal effects from large data sets using propensity scores. Ann
Intern Med. 1997;127:757–763.
12. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational
studies for causal effects. Biometrika. 1983;70:41–55.
13. Joffe MM, Rosenbaum PR. Invited commentary: propensity scores. Am J Epidemiol.
1999;150:327–333.
14. Connors AF, Speroff T, Dawson NV, et al. The effectiveness of right heart catheterization
in the initial care of critically ill patients. JAMA. 1996;276:889–897.
15. D’Agostino RB. Propensity score methods for bias reduction in the comparison of a treat-
ment to a non-randomized control group. Stat Med. 1998;17:2265–2281.
16. Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched
sampling methods that incorporate a propensity score. Am Stat. 1985;39:33–38.
17. Stone RA, Obrosky S, Singer DE, Kapoor WN, Fine MJ. Propensity score adjustment for
pretreatment differences between hospitalized and ambulatory patients with community
acquired pneumonia. Med Care. 1995;33:AS56–AS66.
18. Perkins SM, Tu W, Underhill MG, Zhou X, Murray MD. The use of propensity scores in
pharmacoepidemiology research. Pharmacoepidemiol Drug Saf. 2000;9:93–101.
19. Hylan TR, Crown WH, Meneades L, et al. TCA and SSRI anti-depressant selection
and health care costs in naturalistic setting: a multivariate analysis. J Affect Disord.
1998;47:71–79.
20. Newhouse JP, McClellan M. Econometrics in outcomes research: the use of instrumental
variables. Annu Rev Public Health. 1998;19:17–34.
21. Rubin DB. On principles for modeling propensity scores in medical research.
Pharmacoepidemiol Drug Saf. 2004;13:855–857.
22. McClellan M, McNeil BJ, Newhouse JP. Does more intensive treatment of acute myocar-
dial infarction reduce mortality? JAMA. 1994;272:859–866.
23. McClellan M, Newhouse JP. The marginal costs and benefits of medical technology: a panel
instrumental-variables approach. J Econ. 1997;77:39–64.
24. Zohoori N, Savitz DA. Econometric approaches to epidemiologic data: relating endogene-
ity and unobserved heterogeneity to confounding. Ann Epidemiol. 1997;7:251–257.