Module 3 - Health Technology Assessment, Study Designs, and Evidence
Detection (or ascertainment) bias refers to systematic differences between groups in how outcomes are
assessed. These differences may arise due to, e.g., inadequate blinding of outcome assessors regarding patient
treatment assignment, reliance on patient or provider recall of events (also known as recall bias), inadequate
outcome measurement instruments, or faulty statistical analysis. Whereas allocation concealment is intended to
ensure that persons who manage patient allocation, as well as the patients themselves, do not influence patient
assignment to one group or another, blinding refers to preventing anyone who could influence the assessment
of outcomes from knowing which patients have been assigned to one group or another. Knowledge of patient
assignment itself can affect outcomes as experienced by patients or assessed by investigators. Techniques for
diminishing detection bias include blinding of outcome assessors (including patients, clinicians, investigators,
and/or data analysts), especially for subjective outcome measures, and use of validated and reliable outcome
measurement instruments and techniques.
Attrition bias refers to systematic differences between comparison groups in withdrawals (drop-outs) from a
study, loss to follow-up, or other exclusion of patients/participants and how these losses are analyzed. Ignoring
these losses or accounting for them differently between groups can skew study findings, as patients who
withdraw or are lost to follow-up may differ systematically from those patients who remain for the duration of the
study. Indeed, patients’ awareness of whether they have been assigned to a particular treatment or control group
may differentially affect their likelihood of dropping out of a trial. Techniques for diminishing attrition bias include
blinding of patients as to treatment assignment, ensuring completeness of follow-up data for all patients, and
intention-to-treat analysis (with imputation for missing data as appropriate).
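To make the intention-to-treat principle concrete, the following Python sketch contrasts an intention-to-treat (ITT) estimate, which analyzes patients by their randomized assignment, with an as-treated estimate, which analyzes them by the treatment actually received. The patient records and the `event_rate` helper are hypothetical, invented purely for illustration:

```python
# Minimal sketch (hypothetical data): intention-to-treat (ITT) vs. as-treated analysis.
# Each record: (assigned_group, received_group, outcome), where outcome 1 = improvement.
patients = [
    ("treatment", "treatment", 1), ("treatment", "treatment", 1),
    ("treatment", "control",   0),  # crossed over after randomization
    ("control",   "control",   0), ("control",   "control",   1),
    ("control",   "treatment", 1),  # crossed over after randomization
]

def event_rate(records, group, by):
    """Proportion with outcome=1 in `group`; `by` is 0 (assigned) or 1 (received)."""
    selected = [r[2] for r in records if r[by] == group]
    return sum(selected) / len(selected)

# ITT compares by randomized assignment (index 0); as-treated by treatment received (index 1).
itt = event_rate(patients, "treatment", 0) - event_rate(patients, "control", 0)
at = event_rate(patients, "treatment", 1) - event_rate(patients, "control", 1)
print(f"ITT risk difference:        {itt:+.2f}")  # +0.00 -- preserves randomization
print(f"As-treated risk difference: {at:+.2f}")   # +0.67 -- distorted by crossover
```

Because ITT keeps each patient in the group to which they were randomized, it preserves the comparability created by randomization; the as-treated comparison can overstate the treatment effect when crossover or drop-out is related to prognosis.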
Reporting bias refers to systematic differences between reported and unreported findings, including, e.g.,
differential reporting of outcomes between comparison groups and incomplete reporting of study findings (such
as reporting statistically significant results only). Also, narrative and systematic reviews that do not report search
strategies or disclose potential conflicts of interest raise concerns about reporting bias as well as selection bias
(Roundtree 2009). Techniques for diminishing reporting bias include thorough reporting of outcomes consistent
with outcome measures specified in the study protocol, including attention to documentation and rationale for
any post hoc (after the completion of data collection) analyses not specified prior to the study, and reporting of
literature search protocols and results for review articles. Reporting bias, which concerns differential or
incomplete reporting of findings in individual studies, is not the same as publication bias, which concerns the
extent to which all relevant studies on a given topic proceed to publication.
In contrast to the systematic effects of various types of bias, random error is a source of non-systematic
deviation of an observed treatment effect or other outcome from a true one. Random error results from chance
variation in the sample of data collected in a study (i.e., sampling error). The extent to which an observed
outcome is free from random error is precision. As such, precision is inversely related to random error.
Random error can be reduced, but it cannot be eliminated. P-values and confidence intervals account for the
extent of random error, but they do not account for systematic error (bias). The main approach to reducing
random error is to establish large enough sample sizes (i.e., numbers of patients in the intervention and control
groups of a study) to detect a true treatment effect (if one exists) at acceptable levels of statistical significance.
The smaller the true treatment effect, the more patients may be required to detect it. Therefore, investigators
who are planning an RCT or other study consider the estimated magnitude of the treatment effect that they are
trying to detect at an acceptable level of statistical significance, and then “power” (i.e., determine the necessary
sample size of) the study accordingly. Depending on the type of treatment effect or other outcome being
assessed, another approach to reducing random error is to reduce variation in an outcome for each patient by
increasing the number of observations made for each patient. Random error also may be reduced by improving
the precision of the measurement instrument used to take the observations (e.g., a more precise diagnostic test
or instrument for assessing patient mobility).
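As a minimal sketch of this "powering" calculation, the following Python code applies the standard normal-approximation formula for the per-group sample size of a two-arm comparison of means. The function name and the illustrative effect size, alpha, and power values are assumptions chosen for demonstration, not prescribed values:

```python
# Minimal sketch: approximate per-group sample size for a two-arm trial comparing means,
# using the normal-approximation formula n = 2 * (z_{1-alpha/2} + z_{power})^2 / d^2,
# where d is the standardized effect size (difference in means / standard deviation).
import math
from scipy.stats import norm

def sample_size_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
    z_alpha = norm.ppf(1 - alpha / 2)  # critical value for a two-sided test
    z_power = norm.ppf(power)          # quantile corresponding to desired power
    n = 2 * (z_alpha + z_power) ** 2 / effect_size ** 2
    return math.ceil(n)

# Smaller true effects require many more patients to detect:
for d in (0.8, 0.5, 0.2):  # conventional large / medium / small standardized effects
    print(f"effect size {d}: ~{sample_size_per_group(d)} patients per group")
```

The output illustrates the point above: because the required sample size grows with the inverse square of the effect size, halving the true standardized effect roughly quadruples the number of patients needed per group.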
6. Role of Selected Other Factors
Some researchers contend that if individual studies are to be assembled into a body of evidence for a systematic
review, precision should be evaluated not at the level of individual studies, but when assessing the quality of the
body of evidence. This is intended to avoid double-counting limitations in precision from the same source
(Viswanathan 2014).
In addition to evaluating internal validity of studies, some instruments for assessing the quality of individual
studies evaluate external validity. However, by definition, the external validity of a study depends not only on
its inherent attributes, but also on the nature of the evidence question for which the study is more or less relevant. An
individual study may have high external validity for some evidence questions and low external validity for others.
Some quality assessment tools for individual studies account for funding source (or sponsor) of a study and
disclosed conflicts of interest (e.g., on the part of sponsors or investigators) as potential sources of bias. Rather
than being direct sources of bias themselves, a funding source or a person with a disclosed conflict of interest
may induce bias indirectly, e.g., in the form of certain types of reporting bias or detection bias. Also, whether the
funding source of research is a government agency, non-profit organization, or health technology
company does not necessarily determine whether it induces bias. Of course, all of these potential sources of
bias should be systematically documented (Viswanathan 2014).
A variety of assessment instruments are available to assess the quality of individual studies. Many of these are
for assessing internal validity or risk of bias for benefits and harms; others focus on assessing external validity.
These include instruments for assessing particular types of studies (e.g., RCTs, observational studies) and
certain types of interventions (e.g., screening, diagnosis, and treatment).
A systematic review identified more than 20 scales (and their modifications) for assessing the quality of RCTs
(Olivo 2008). Although most of these had not been rigorously developed or tested for validity and reliability, the
systematic review found that one of the original scales, the Jadad Scale (Jadad 1996), was an exception.
When the sample size of an RCT is calculated to achieve sufficient statistical power, it reduces the probability
that an observed treatment effect is due to random error. Further, especially with larger groups, the
randomization enables patient subgroup comparisons between intervention and control groups. The primacy of
the RCT remains even in an era of genomic testing and expanding use of biomarkers to better target selection
of patients for adaptive clinical trials of new drugs and biologics, and advances in computer‐based modeling that
may replicate certain aspects of RCTs (Ioannidis 2013).
Despite its advantages for demonstrating the internal validity of causal relationships, the RCT is not the best
study design for all evidence questions. Like all methods, RCTs have limitations, particularly
regarding external validity. The relevance or impact of these limitations varies according to the purposes and
circumstances of the study. In order to help inform healthcare decisions in real-world practice, evidence from
RCTs and other experimental study designs should be augmented by evidence from other types of studies.
E. Different Study Designs for Different Questions
RCTs are not the best study design for answering all evidence questions of potential relevance to an HTA. As
noted in Box III-11, other study designs may be preferable for different questions. For example, the prognosis
for a given disease or condition may be based on a follow-up study of patient cohorts at uniform points in the
clinical course of a disease. Case-control studies, which are usually retrospective, are often used to identify risk
factors for diseases, disorders, and adverse events. The accuracy of a new diagnostic test (though not its
ultimate effect on health outcomes) may be determined by a cross-sectional study in which patients suspected of
having a disease or disorder receive both the new ("index") test and the "gold standard" test, as sketched
after this paragraph. Non-randomized
trials or case series may be preferred for determining the effectiveness of interventions for otherwise fatal
conditions, i.e., where little or nothing is to be gained by comparison to placebos or known ineffective treatments.
Surveillance and registries are used to determine the incidence of rare or delayed adverse events that may be
associated with an intervention. For incrementally modified technologies posing no known additional risk,
registries may be appropriate for determining safety and effectiveness.
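As a brief illustration of what such a diagnostic accuracy study estimates, the following Python sketch computes sensitivity and specificity of an index test against the gold standard from a 2x2 table. All counts are hypothetical, invented for demonstration:

```python
# Minimal sketch (hypothetical counts): accuracy of an index test vs. a gold standard.
# 2x2 table: rows = index test result; columns = disease status per gold standard.
tp, fp = 90, 15   # index test positive: true positives, false positives
fn, tn = 10, 185  # index test negative: false negatives, true negatives

sensitivity = tp / (tp + fn)  # proportion of diseased patients correctly detected
specificity = tn / (tn + fp)  # proportion of non-diseased patients correctly ruled out
ppv = tp / (tp + fp)          # positive predictive value (depends on prevalence in the sample)
npv = tn / (tn + fn)          # negative predictive value

print(f"Sensitivity: {sensitivity:.2f}  Specificity: {specificity:.2f}")
print(f"PPV: {ppv:.2f}  NPV: {npv:.2f}")
```

Note that because every patient receives both tests in such a study, sensitivity and specificity can be estimated directly, but the study says nothing about whether using the index test improves health outcomes.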