Channeling Bias With Propensity Scores Analysis


Research in Social and Administrative Pharmacy

2 (2006) 143–151

Commentary
Addressing the issue of channeling
bias in observational studies
with propensity scores analysis
Francis S. Lobo, B.Pharm., M.S., Ph.D.a,*,
Samuel Wagner, R.Ph., Ph.D.b, Cynthia R. Gross, Ph.D.c,
Jon C. Schommer, R.Ph., M.S., Ph.D.d
a Health Economics & Outcomes Research Group, Novartis Pharmaceuticals, East Hanover, NJ 07936, USA;
College of Pharmacy, University of Minnesota, MN 55455, USA
b Customer Outcomes Research Group, Pfizer Inc, New York, NY 10017, USA
c Department of Experimental and Clinical Pharmacology, College of Pharmacy,
University of Minnesota, MN 55455, USA
d Graduate Studies in Social and Administrative Pharmacy, College of Pharmacy,
University of Minnesota, MN 55455, USA

Abstract

Randomized Clinical Trials (RCTs) remain the gold standard for determining the
utility of pharmaceuticals especially from a safety and efficacy standpoint. However,
restrictive entry criteria and stringent protocols can be barriers to generalizing RCT
findings to real world practices and outcomes. Observational studies overcome these
limitations of RCTs since they are representative of real world populations and prac-
tices. Nonetheless, attributing causality remains a major limitation in observational
studies, due to the non-random assignment of subjects to treatment. Non-random
assignment can lead to imbalances in risk-factors between the groups being com-
pared and thus bias the estimates of the treatment effect. Non-random assignment
can be particularly problematic in observational studies comparing older versus
newer pharmaceuticals from similar therapeutic classes due to the phenomenon of

* Corresponding author: One Health Plaza, East Hanover, NJ 07936, Tel.: 862 778 5218,
Fax: 973 781 2439.
E-mail address: [email protected] (F.S. Lobo).

1551-7411/$ - see front matter © 2006 Elsevier Inc. All rights reserved.
doi:10.1016/j.sapharm.2005.12.001
channeling. Channeling occurs when drug therapies with similar indications are pref-
erentially prescribed to groups of patients with varying baseline prognoses. In this
manuscript we discuss the phenomenon of channeling and the use of a statistical
technique known as propensity scores analysis, which potentially adjusts for the effects
of channeling. During the course of this manuscript we discuss tests for determining
the quality of the derived propensity score, various techniques for utilizing
propensity scores, and also the potential limitations of this technique. With the in-
creasing availability of high quality pharmaceutical and medical claims data for
use in observational studies, increased attention must be given to analytic techniques
that adjust optimally for non-random assignment and resulting channeling bias. For
research studies using observational study designs, propensity score analysis offers
a reasonable solution to address the limitation of non-random assignment, especially
when RCTs are too costly, time-consuming or not ethically feasible.
© 2006 Elsevier Inc. All rights reserved.

Keywords: Observational studies; Channeling bias; Propensity scores

1. Introduction

Randomized clinical trials (RCTs) are the gold standard for establishing
causality in clinical research. The randomization procedure is a touchstone
that permits the outcome of interest to be causally linked to exposure in an
RCT. To a large extent it ensures that patient or sample characteristics will
be allocated among treatment groups in an unbiased manner and hence will
be uniformly distributed, if an optimal sample size is attained. This applies
especially to confounders, ie, factors modifying the exposure-outcome rela-
tionship by their dual correlation with both the exposure and outcome. The
main limitations of RCTs are practical barriers, which include time, cost,
and in certain instances ethical considerations, which make conducting an
RCT difficult or even impossible. Randomized clinical trials may addition-
ally be criticized for the lack of generalizability. Randomized clinical trials
typically measure efficacy and not the effectiveness of treatment. Narrow in-
clusion criteria and willingness to participate in research often ensure that
the characteristics of patients participating in the clinical trials are not rep-
resentative of the general population. Additionally, the RCT protocols spec-
ifying the use of the intervention may deviate substantially from general
practice thus undermining the generalizability of results.
Observational studies generally reflect real world practices with regard to the
behaviors of both physicians and patients. Furthermore,
observational studies, specifically those with retrospective database study
designs, tend to be relatively inexpensive and less time consuming than
RCTs. In particular, database studies making use of medical and pharmaceu-
tical claims data have gained popularity in the last decade because of the
wealth of information they may provide to decision makers. Apart from
clinical data reported from RCTs focusing on treatment efficacy, decision
makers are interested in the real world outcomes of a resulting intervention,
ie, the effectiveness question. For such information they rely on results from
observational studies using data from enrolled plan members to assess the ef-
fectiveness of various interventions. Some of the advantages of such observa-
tional studies are that they are relatively inexpensive, easy to implement from
a time and process standpoint, and are reflective of real world practices.
However, results from such studies can be distorted by selection bias and
confounding. Selection bias refers to the way people are recruited for the
study or retained during the course of the study. Confounding is the influence
of extraneous variables related to both exposure and outcome. In observa-
tional, comparative studies selection bias and/or confounding may be re-
sponsible for part or all of the observed effects, the lack of observed effect,
or a reversal of the effect.1 This article describes the phenomenon of “channeling”
bias that results from selective prescribing of pharmaceuticals by
physicians. The article focuses on the use of a statistical procedure called
“propensity scores analysis,” which offers a useful and relatively easy approach
to address the issue of channeling bias in such observational studies.

2. Description of channeling bias

Channeling is a type of selection bias often seen in observational studies
comparing older vs newer drugs from similar therapeutic classes.
Channeling occurs when drug therapies with similar indications, either
self-selected or clinically assigned, are prescribed to groups of patients with
varying baseline prognoses.2 Drugs with similar therapeutic actions may be
launched into the market at different times. This creates a situation where
drugs entering the market at a later stage may have a higher likelihood of
being prescribed to patients who have not responded well to existing agents.
The promotional claims for the newer drug often conjure up the profile of
a patient most suitable and more likely to benefit from the newer therapy,
leading to channeling. Thus, recipients of the older drug and newer drug will
differ in terms of their baseline morbidities. Hence, channeling ultimately
leads to the selective prescription of drugs from the same therapeutic class
to groups of patients differing in their susceptibility to problems or special
preexisting morbidities.3-5
At the launch of a newer drug, manufacturers seek to differentiate their
product from the competition generally in terms of better efficacy claims
and/or side effect profiles, if any. Promotional claims of a better side-effect
profile might lead physicians to prescribe the new drug for patients who
have experienced undesirable side effect(s) from the use of existing drugs.
In case of drugs claiming better efficacy, physicians might prescribe them
to patients who have failed on prior therapies. Thus, the therapeutic agent
entering the market at a later stage often ends up being selectively prescribed
to patients with a more severe clinical profile.
3. Case studies for channeling bias

Inman provides a case study documenting channeling for Osmosin, a
controlled-release form of indomethacin, using the prescription event monitoring
method (looking at drug safety in a longitudinal patient database).6
Osmosin was strongly promoted with claims of fewer gastrointestinal events
during its launch into the market. However, the drug was withdrawn on
the basis of reports suggesting an unexpectedly high occurrence of gastrointes-
tinal ulcerations, bleedings, or perforations. Inman conducted a prescription
event monitoring study, which revealed a much higher prevalence of gastroin-
testinal complaints among patients long after the end of use of Osmosin. Based
on these findings Inman concluded that because Osmosin was promoted on
the basis of claims for fewer side effects, it was more likely to be prescribed
to patients who had a higher propensity for developing these side effects.6
Several studies reported that fenoterol, a beta agonist, was associated with
a higher rate of asthma mortality.7,8 Petri et al assessed the impact of chan-
neling of fenoterol in relation to concomitant adverse events. They found that
among the 3 aerosol beta agonists (ie, salbutamol, terbutaline, and fenoterol),
physicians channeled patients with a higher severity of asthma to fenoterol.
They concluded that this channeling might have led to a noncausal linkage
of fenoterol to consequences of severe asthma.5 Blais et al9 reported similar
findings for fenoterol vs salbutamol. Preferential prescribing for fenoterol
had occurred among users of salbutamol who showed signs of increased se-
verity or uncontrolled asthma. In contrast, the switch from inhaled fenoterol
to salbutamol was minimally related to asthma severity.9
Wolfe et al studied treatment practices by rheumatologists in a population
of arthritis patients (both osteoarthritis and rheumatoid arthritis). The ob-
jective was to compare underlying differences in severity between patients
who were treated with cyclooxygenase-2-specific inhibitors and those who
did not receive these agents indicated for treatment of arthritis. This study
was conducted in 433 practices across the United States and comprised
6637 patients. Their findings indicated channeling of patients with a greater
lifetime history of adverse reactions of all kinds (particularly gastrointestinal
adverse drug reactions) to the cyclooxygenase-2-specific inhibitors. Additionally,
these patients had a higher level of severity in terms of their pain,
functional disability, fatigue, helplessness, and global severity, accompanied
by increased use of inpatient and outpatient services.10

4. Effects of channeling bias

As a consequence of channeling there may be an imbalance in the distribution
of covariates in the treatment cohorts being compared. One cohort
may have subjects with a higher proportion of risk factors for the outcome
of interest as compared to the other cohort. The former treatment cohort
subsequently ends up reporting worse outcomes of interest as compared to
the latter cohort. This may be attributed erroneously to drug characteristics
rather than underlying patient characteristics. The imbalanced baseline
covariates confound the observed treatment impact, causing potential bias
in the estimates of treatment effect.
Hence, it becomes vital to balance treatment groups being compared for
confounding variables, essentially mimicking the situation of an RCT. One
way to adjust for a confounding variable such as age would be to use statis-
tical methods like regression analysis. However, Rubin11 cautions against
the standard use of such statistical procedures. He makes the case that re-
gression diagnostics do not include a comprehensive comparison of the dis-
tributions of the confounders across the treatment groups. In situations
where treatment groups lack overlap in terms of a crucial confounder, even
large databases lack the ability to support causality between the treatment
and outcome relationship. The situation is exacerbated when several
imbalanced covariates are involved. This occurs due to the accumulation
of several small differences into a substantially large overall difference.
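
The lack-of-overlap problem described above can be checked directly before any model is fitted. The following is a minimal sketch in Python, not a procedure taken from Rubin: the function names and the ages are hypothetical, and "overlap" is simplified here to the range of a single confounder that both treatment groups cover.

```python
def common_support(treated_values, control_values):
    """Return (low, high) bounds of the range covered by both groups, or None."""
    low = max(min(treated_values), min(control_values))
    high = min(max(treated_values), max(control_values))
    return (low, high) if low <= high else None


def share_inside(values, bounds):
    """Fraction of values that fall inside the overlapping range."""
    if bounds is None:
        return 0.0
    low, high = bounds
    return sum(low <= v <= high for v in values) / len(values)


# Hypothetical ages: the newer drug is channeled to older patients.
ages_new_drug = [70, 75, 80, 85]
ages_old_drug = [40, 50, 60, 72]

bounds = common_support(ages_new_drug, ages_old_drug)
print("overlapping age range:", bounds)
print("share of new-drug patients inside:", share_inside(ages_new_drug, bounds))
print("share of old-drug patients inside:", share_inside(ages_old_drug, bounds))
```

When only a small share of each group falls inside the overlapping range, a regression model is extrapolating rather than comparing like with like, regardless of how large the database is.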

5. Adjustment of channeling bias via use of propensity scores analysis

A relatively easy way to adjust for a single confounder such as age is via
stratification and then estimating the outcome within each age stratum.11 However,
this technique is not feasible when there are multiple confounding variables.
A practical technique for adjusting for several confounders at once in
observational studies is via the use of propensity scores. Propensity scores
are the “conditional probability of exposure to a treatment given observed
covariates.”11-13 Propensity scores represent the likelihood or probability of
receiving one treatment over the other, based upon the combined influence of
selected independent predictors. Propensity scores range from 0 to 1. This procedure
has 2 stages: the derivation stage, where the score is obtained through
various regression models, and the adjustment stage, where the score is used
via different techniques to control the effects of channeling. In the derivation
stage, all the information from background covariates for each individual is
summarized into a propensity score. This process involves the use of statistical
methods such as logistic regression, discriminant analyses, or classification
trees. The dependent variable represents the treatment/exposure of interest,
and the predictor variables are the background variables assumed to be related
to both the choice of therapy and the outcome of interest. The predictors
include patient’s age, gender, comorbidities, etc. Generally, in observational
studies, such as those using claims data, only certain variables that are re-
corded are available for use as predictors. However, there may be many other
confounding variables that are not recorded and hence not available for use.
This represents a limitation of the propensity scores analysis procedure.
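
The derivation stage can be sketched in a few lines of Python. This is an illustrative sketch only, under stated assumptions: the data are simulated, the two predictors (a standardized severity score and a prior treatment failure flag) and their coefficients are hypothetical, and the logistic regression is fitted with a simple gradient ascent loop so that the example needs nothing beyond the standard library. In practice a statistical package would be used.

```python
import math
import random


def sigmoid(eta):
    return 1.0 / (1.0 + math.exp(-eta))


def fit_logistic(X, z, lr=0.5, epochs=500):
    """Fit a logistic regression by batch gradient ascent.

    Returns [intercept, b1, b2, ...] for the predictors in X.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(epochs):
        grad = [0.0] * (p + 1)
        for xi, zi in zip(X, z):
            eta = beta[0] + sum(b * x for b, x in zip(beta[1:], xi))
            resid = zi - sigmoid(eta)  # observed minus predicted treatment
            grad[0] += resid
            for j, x in enumerate(xi):
                grad[j + 1] += resid * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Predicted probability of receiving the newer drug for each subject."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]


# Simulated channeling: sicker patients and prior failures get the newer drug more often.
random.seed(0)
X, z = [], []
for _ in range(400):
    severity = random.gauss(0, 1)                        # hypothetical standardized score
    prior_failure = 1.0 if random.random() < 0.3 else 0.0
    p_new = sigmoid(-0.5 + 1.0 * severity + 1.2 * prior_failure)
    X.append([severity, prior_failure])
    z.append(1 if random.random() < p_new else 0)

beta = fit_logistic(X, z)
scores = propensity_scores(X, beta)
print("coefficients:", [round(b, 2) for b in beta])
```

Each subject's background covariates are thereby summarized into one number between 0 and 1, which the adjustment stage then uses in place of the individual covariates.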
To assess the balance between treatment groups on the background cova-
riates, quintiles of the distribution of the propensity scores derived from the
logistic regression are constructed. Within each stratum of the propensity
score, the 2 treatment groups are compared on the background covariates.
For categorical covariates, chi-square tests can be used to test these differences,
and for continuous variables, t tests. If significant differences
remain, the logistic regression for deriving propensity
scores must be revisited. Possible solutions for further improvement of
the model include the addition of interaction terms to the equation or
identification of additional background variables that are hypothesized to
impact treatment choice. This is again followed by the process of stratifying
by quintiles and testing for differences between treatment groups on back-
ground covariates within each stratum. One needs to bear in mind that
the primary purpose of this logistic regression is to predict propensity
scores. Some indication of model fit, such as the −2 log-likelihood ratio,
might be useful in interpretation of results; however, this has yet to be
demonstrated empirically. As an alternative approach to assessing how “good”
their scores were, Connors et al14 reported the area under the receiver operating
characteristic (ROC) curve for the propensity scores analysis in their
study assessing the effectiveness of right heart catheterization. The area under
the ROC curve is a measure of predictive accuracy and of the degree to which the
propensity scores classify subjects into their actual exposure groups. A high value
(closer to 1) indicates a high level of discrimination between the 2 exposure groups.
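
The two checks described above, covariate balance within propensity score quintiles and the area under the ROC curve, can be sketched as follows. This is a minimal illustration with hypothetical function names; it computes a Welch t statistic for a continuous covariate in each quintile (a p-value would additionally require the t distribution, which is omitted here) and obtains the area under the ROC curve by comparing every treated subject with every control.

```python
from statistics import mean, variance


def auc(scores, treated):
    """Area under the ROC curve: share of treated/control pairs ranked correctly."""
    pos = [s for s, t in zip(scores, treated) if t == 1]
    neg = [s for s, t in zip(scores, treated) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def quintile_strata(scores):
    """Assign each subject to a stratum 0..4 by rank of the propensity score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * 5 // len(scores)
    return strata


def welch_t(a, b):
    """Welch t statistic for the difference in means of two samples."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    return (mean(a) - mean(b)) / se if se > 0 else float("nan")


def balance_by_stratum(covariate, treated, strata):
    """t statistic of a continuous covariate, treated vs control, per stratum."""
    result = {}
    for s in sorted(set(strata)):
        a = [c for c, t, k in zip(covariate, treated, strata) if k == s and t == 1]
        b = [c for c, t, k in zip(covariate, treated, strata) if k == s and t == 0]
        # A stratum needs at least 2 subjects per group to estimate a variance.
        result[s] = welch_t(a, b) if len(a) > 1 and len(b) > 1 else None
    return result


# Tiny hypothetical example: 3 of the 4 treated/control pairs are ordered correctly.
print(auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))  # 0.75
```

Large t statistics within a quintile would suggest that the propensity model needs to be revisited, for example by adding interaction terms as described above.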
The second stage of propensity scores analysis involves the use of propen-
sity scores in a variety of analytic methods such as matching, stratification,
or in regression models.15 Matching is a technique often used to equate in-
dividuals across treatment arms on the confounder of interest. The end result
of matching is a cohort comprised of subjects balanced on the confounder of
interest. Because a propensity scores analysis summarizes several of the
background covariates influencing treatment selection for each individual,
matching subjects across treatment cohorts on the propensity scores enables
the researcher to control for several background covariates simultaneously.
In the Connors et al14 study, subjects from the 2 treatment groups who had
propensity scores within a range of 0.03 were matched with each other. For
example, a subject with a score of 0.60 could be matched either with controls
having a score of 0.61, 0.62, or 0.63 on the upper end or with controls having
a score of 0.59, 0.58, or 0.57 on the lower end, if a control with the exact same
score of 0.60 was not available. Rosenbaum and Rubin16 provide a compre-
hensive description of the possible matching strategies in their study.
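
Matching within a fixed range of the propensity score can be sketched as below. The range of 0.03 comes from the example above; everything else is an assumption for illustration. In particular, the greedy nearest-neighbor rule without replacement is one common strategy and is not claimed to be the algorithm used by Connors et al.

```python
def caliper_match(treated_scores, control_scores, caliper=0.03):
    """Greedy 1:1 matching without replacement within a propensity score caliper.

    Returns a list of (treated_index, control_index) pairs. Treated subjects
    with no control inside the caliper are left unmatched and drop out.
    """
    available = dict(enumerate(control_scores))
    pairs = []
    for ti, ts in enumerate(treated_scores):
        if not available:
            break
        # Closest remaining control to this treated subject.
        ci = min(available, key=lambda c: abs(available[c] - ts))
        # Small tolerance so that a distance of exactly 0.03 is not lost to rounding.
        if abs(available[ci] - ts) <= caliper + 1e-9:
            pairs.append((ti, ci))
            del available[ci]
    return pairs


# Hypothetical scores: the second treated subject has no control within 0.03.
treated_scores = [0.60, 0.20]
control_scores = [0.61, 0.90, 0.26, 0.58]
print(caliper_match(treated_scores, control_scores))  # [(0, 0)]
```

The unmatched subject in this example illustrates why matching reduces the usable sample: anyone without a sufficiently close counterpart is excluded from the comparison.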
Another manner in which propensity scores can be used is via stratifica-
tion, or subclassification of the treatment groups being compared on the
propensity scores. The idea is to directly compare subjects within a particular
stratum on the outcome of interest. For example, after stratifying the treat-
ment cohorts by quintiles of the propensity score, the outcome of interest
can be observed and compared between the treatment groups within each
quintile. Rubin11 provides an excellent example of the utility of subclassifi-
cation on the propensity scores in a study comparing the impact of 2 proce-
dures (breast conservation vs mastectomy) on survival rates. Several other
studies using this particular methodology of propensity score subclassifica-
tion are available in the literature.17,18
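
Subclassification can be sketched as follows: subjects are split into quintiles of the propensity score, the outcome is compared between treatment groups within each quintile, and the within-quintile differences are pooled. The data are constructed by hand for illustration, and weighting each quintile by its size is an assumption of this sketch rather than a rule stated in the text. The true treatment effect in the constructed data is 2.

```python
def quintile_strata(scores):
    """Assign each subject to a stratum 0..4 by rank of the propensity score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    strata = [0] * len(scores)
    for rank, i in enumerate(order):
        strata[i] = rank * 5 // len(scores)
    return strata


def stratified_effect(outcome, treated, scores):
    """Pool within-quintile differences in mean outcome, weighted by quintile size."""
    strata = quintile_strata(scores)
    weighted_sum, total_n = 0.0, 0
    for s in range(5):
        y1 = [y for y, t, k in zip(outcome, treated, strata) if k == s and t == 1]
        y0 = [y for y, t, k in zip(outcome, treated, strata) if k == s and t == 0]
        if not y1 or not y0:
            continue  # no comparison is possible in this quintile
        n_s = len(y1) + len(y0)
        weighted_sum += n_s * (sum(y1) / len(y1) - sum(y0) / len(y0))
        total_n += n_s
    if total_n == 0:
        raise ValueError("no quintile contains both treatment groups")
    return weighted_sum / total_n


# Hypothetical data: 20 subjects, 4 per quintile. Treatment becomes more common
# as the propensity score rises, and baseline risk rises with the quintile.
n = 20
scores = [(i + 0.5) / n for i in range(n)]
pattern = [[1, 0, 0, 0], [1, 0, 0, 0], [1, 1, 0, 0], [1, 1, 1, 0], [1, 1, 1, 0]]
treated = [t for block in pattern for t in block]
outcome = [i // 4 + 2 * t for i, t in enumerate(treated)]  # true effect is 2

y1 = [y for y, t in zip(outcome, treated) if t == 1]
y0 = [y for y, t in zip(outcome, treated) if t == 0]
naive = sum(y1) / len(y1) - sum(y0) / len(y0)
print("naive difference:", naive)
print("stratified estimate:", stratified_effect(outcome, treated, scores))
```

The naive comparison overstates the effect because treated subjects are concentrated in the higher-risk quintiles, while the within-quintile comparison recovers the constructed effect.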
The third method for addressing the issue of channeling bias via the use
of propensity scores is that of regression (covariance) adjustment.15 There
are 2 possible ways in which this particular methodology is used. The first
methodology uses a regression model with 2 predictor variables of the out-
come of interest: the treatment group and the propensity score. However,
a more popular way to use propensity scores in regression adjustment is
to use a subset of covariates and the propensity score in the regression equa-
tion. Studies using this methodology are those by Connors et al14 and Hylan
et al.19 Using the propensity score with a subset of covariates has an advan-
tage over regression models using a larger set of covariates for adjustment.
This is primarily because a parsimonious model with fewer predictors allows
the investigator to assess model fit more reliably as compared to larger
models.15 However, researchers must be careful while using this particular
technique because the estimated propensity scores can be very influential
on the estimated effects of the treatments in question. Hence, appropriate
diagnostics assessing the accuracy of the propensity score as well as the
resulting regression estimates must be used.20
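
For a continuous outcome, the two regression adjustment variants described above can be written out as linear models. The notation is an assumption introduced here for illustration and does not appear in the cited sources: Y_i is the outcome, Z_i the treatment indicator, \hat{e}(X_i) the estimated propensity score, and X_i^{S} the chosen subset of covariates. In both models \beta_1 is the adjusted treatment effect.

```latex
% Model 1: treatment indicator and estimated propensity score only
Y_i = \beta_0 + \beta_1 Z_i + \beta_2 \hat{e}(X_i) + \varepsilon_i

% Model 2: a subset X_i^{S} of the covariates is added alongside the score
Y_i = \beta_0 + \beta_1 Z_i + \beta_2 \hat{e}(X_i) + \gamma^{\top} X_i^{S} + \varepsilon_i
```

Because \hat{e}(X_i) is itself an estimate, any misspecification of the propensity model carries through to \beta_1, which is why the diagnostics mentioned above matter.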

6. Other potential methods for addressing channeling

Another method for addressing the issue of imbalance in confounders
among treatment cohorts being compared is that of the instrumental variables
approach. The instrumental variables approach is a 2-stage approach.
In the first stage, an instrumental variable related to the treatment assign-
ment is identified or constructed. An instrumental variable is defined as
a variable that (a) influences the treatment selection and (b) affects the out-
come only through its relationship with the assigned treatment. Any infor-
mation related to the treatment assignment that the instrumental variable
contains is independent of the response. In the second stage, various
methods involving the instrumental variable such as ordinary least-squares
regression are used to obtain the estimated treatment effect.20-24 This ap-
proach essentially results in a test of the treatment effect adjusted for patient
characteristics. By substituting the original nonrandom treatment assign-
ment with the information contained in the instrumental variable, one as-
sumes they have simulated the condition of unbiased random allocation.
A serious limitation of this approach is that it may often be difficult to iden-
tify an instrumental variable among the covariates collected, particularly in
claims database research where there is no emphasis on collection of
variables pertinent only to the treatment assignment. Most background
variables, such as age, gender, or comorbidities, will violate the assumption
of no relationship to outcome. Hence, propensity scores analysis may potentially
offer a much simpler and more efficient method to address the issue of channeling
bias in observational studies.
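
The logic of the instrumental variables approach can be illustrated with its simplest case, a binary instrument, where the two-stage least-squares estimate reduces to a ratio of two differences in means (often called the Wald estimator). The sketch below uses hand-constructed, hypothetical data: w is the instrument, d the treatment received, and u an unobserved severity factor that is balanced across levels of w but higher among treated patients. The true treatment effect in the constructed data is 3.

```python
from statistics import mean


def wald_estimate(outcome, treatment, instrument):
    """IV estimate for a binary instrument: outcome contrast / treatment contrast."""
    y1 = [y for y, w in zip(outcome, instrument) if w == 1]
    y0 = [y for y, w in zip(outcome, instrument) if w == 0]
    d1 = [d for d, w in zip(treatment, instrument) if w == 1]
    d0 = [d for d, w in zip(treatment, instrument) if w == 0]
    return (mean(y1) - mean(y0)) / (mean(d1) - mean(d0))


def naive_difference(outcome, treatment):
    """Unadjusted difference in mean outcome between treated and untreated."""
    y1 = [y for y, d in zip(outcome, treatment) if d == 1]
    y0 = [y for y, d in zip(outcome, treatment) if d == 0]
    return mean(y1) - mean(y0)


# Hypothetical data for 8 patients.
w = [0, 0, 0, 0, 1, 1, 1, 1]       # instrument: shifts treatment choice only
d = [0, 0, 0, 1, 0, 1, 1, 1]       # treatment actually received
u = [0, 0, 0, 2, -1, 1, 1, 1]      # unobserved severity, same mean in both w groups
y = [3 * di + ui for di, ui in zip(d, u)]  # outcome with a true effect of 3

print("naive difference:", naive_difference(y, d))   # 4.5
print("IV (Wald) estimate:", wald_estimate(y, d, w))  # 3.0
```

The example works only because u has the same mean at both levels of w, which is exactly the assumption that is hard to defend for variables typically found in claims data.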

7. Limitations of propensity scores

Observational studies often lack information on important confounders
that may affect the exposure-outcome relationship. This is frequently seen
in studies using claims data, where the data were never collected with the
intent of conducting outcomes research. In the process of deriving a propensity
score, only observed variables can be used; unobserved variables cannot enter
the model, and hence controlling for them directly is not possible.11 Only to the
extent that the unobserved variables are correlated with the observed variables
used in generating the propensity score can one cautiously assume that they have
been adequately controlled.
Observational studies seeking to control the impact of channeling bias via
the use of propensity scores analysis need to have large sample sizes. Larger
sample sizes increase the probability of even distribution of observed cova-
riates among treatment groups when they are subclassified on the derived
propensity score. In the event the treatment groups are matched on the propensity
score, there will be a reduction in sample size, especially in 1:1
matching. Hence, researchers need to be cognizant of this loss when planning the analysis.

8. Conclusions

Faced with the constant pressure to control increasing pharmacy budgets,
decision makers constantly seek to evaluate the clinical and economic benefits
of newer pharmaceuticals claiming advantages over drugs existing on the for-
mulary. This pressure combined with the availability of enormous health care
claims databases (Medicare, Medicaid, Health Maintenance Organizations,
etc) is likely to increase the use of observational studies to assess the clinical
and economic benefits drug therapies have to offer. Benefits that might be ev-
ident from RCTs will be obscured or even eliminated in the event of channel-
ing of patients presenting higher levels of disease severity to newer therapies.
The need to understand and control for channeling bias is critical, because
decision makers increasingly use pharmacy and medical claims databases to
assess the clinical and economic outcomes of drug therapies.

Acknowledgments

During the course of the preparation of this article, Francis Lobo’s
graduate education at the University of Minnesota was supported through
an outcomes research fellowship by the Pharmacia Corporation.
References

1. Collet J, Boivin J. Bias and confounding in pharmacoepidemiology. In: Strom BL, ed.
Pharmacoepidemiology. Chichester, West Sussex, England: John Wiley & Sons Ltd; 2000.
2. Feinstein AR. Clinical Epidemiology. Philadelphia, Pa: Saunders; 1985.
3. Urquhart J. ADR crisis management - before and after. Scrip. 1989;1388:19–21.
4. Petri H, Urquhart J. Channeling bias in the interpretation of drug effects. Stat Med.
1991;10:577–581.
5. Petri H, Naus J, Urquhart J. Channeling of aerosol beta agonist and the interpretation of
a concomitant adverse event. J Clin Res Drug Dev. 1989;3:224–230.
6. Inman WH. Comparative study of five NSAIDs. PEM News. 1985;3:3–13.
7. O’Donnell TV, Rea HH, Holst PE, et al. Fenoterol and fatal asthma. Lancet. 1989;1:
1070–1071.
8. Spitzer WO, Buist AS. Case-control study of prescribed fenoterol and death from asthma in
New Zealand. Thorax. 1990;45:645–646.
9. Blais L, Ernst P, Suissa S. Confounding by indication and channeling over time: the risks of
beta 2-agonists. Am J Epidemiol. 1996;144:1161–1169.
10. Wolfe F, Flowers N, Burke TA, et al. Increase in lifetime adverse drug reactions, service
utilization, and disease severity among patients who will start COX-2 specific inhibitors:
quantitative assessment of channeling bias and confounding by indication in 6689 patients
with rheumatoid arthritis and osteoarthritis. J Rheumatol. 2002;29:1015–1022.
11. Rubin DB. Estimating causal effects from large data sets using propensity scores. Ann
Intern Med. 1997;127:757–763.
12. Rosenbaum PR, Rubin DB. The central role of the propensity score in observational
studies for causal effects. Biometrika. 1983;70:41–55.
13. Joffe MM, Rosenbaum PR. Invited commentary: propensity scores. Am J Epidemiol.
1999;150:327–333.
14. Connors AF, Speroff T, Dawson NV, et al. The effectiveness of right heart catheterization
in the initial care of critically ill patients. JAMA. 1996;276:889–897.
15. D’Agostino RB. Propensity score methods for bias reduction in the comparison of a treat-
ment to a non-randomized control group. Stat Med. 1998;17:2265–2281.
16. Rosenbaum PR, Rubin DB. Constructing a control group using multivariate matched
sampling methods that incorporate a propensity score. Am Stat. 1985;39:33–38.
17. Stone RA, Obrosky S, Singer DE, Kapoor WN, Fine MJ. Propensity score adjustment for
pretreatment differences between hospitalized and ambulatory patients with community
acquired pneumonia. Med Care. 1995;33:AS56–AS66.
18. Perkins SM, Tu W, Underhill MG, Zhou X, Murray MD. The use of propensity scores in
pharmacoepidemiology research. Pharmacoepidemiol Drug Saf. 2000;9:93–101.
19. Hylan TR, Crown WH, Meneades L, et al. TCA and SSRI anti-depressant selection
and health care costs in naturalistic setting: a multivariate analysis. J Affect Disord.
1998;47:71–79.
20. Newhouse JP, McClellan M. Econometrics in outcomes research: the use of instrumental
variables. Annu Rev Public Health. 1998;19:17–34.
21. Rubin DB. On principles for modeling propensity scores in medical research.
Pharmacoepidemiol Drug Saf. 2004;13:855–857.
22. McClellan M, McNeil BJ, Newhouse JP. Does more intensive treatment of acute myocar-
dial infarction reduce mortality? JAMA. 1994;272:859–866.
23. McClellan M, Newhouse JP. The marginal costs and benefits of medical technology: a panel
instrumental-variables approach. J Econ. 1997;77:39–64.
24. Zohoori N, Savitz DA. Econometric approaches to epidemiologic data: relating endogene-
ity and unobserved heterogeneity to confounding. Ann Epidemiol. 1997;7:251–257.
