Furr (2011) CAP 2 Scale Construction and Psychometrics For Social and Personality Psychology.

Scale Construction
and Psychometrics
for Social and
Personality
Psychology
R. Michael Furr

2
Core Principles, Best Practices,
and an Overview of Scale
Construction
This chapter presents principles and practices that are among the broadest and
most fundamental issues for scale construction, modification, use, evaluation, and
interpretation. The points are rather straightforward but are vitally important in
conducting informative research. Thus, this chapter provides nontechnical over-
views of each point, to be complemented by greater exploration and depth later in
this volume.
Several of these principles and practices strike me, as an editor, reviewer, and
reader of social/personality research, as being somewhat under-appreciated. To
retain focus on those issues, this chapter bypasses issues that, though fundamental
to scale construction and psychometrics, seem generally well-known and well-
implemented. Indeed, much social/personality research is based upon measurement
that is well-conceived and appropriately-executed. This discussion is intended to
raise awareness and understanding of issues that, if appreciated even more widely,
will enhance the generally good conduct and interpretation of research. The issues
are summarized in Table 2.1.
Most facets of the process and principles covered in this chapter apply to
all forms of psychological measurement. For example, this chapter addresses
the need to articulate the construct and context of a measurement strategy, the
need to evaluate psychometric properties, and the need to revise the measure-
ment strategy if necessary—all of which apply to measurement strategies
such as “tests” of maximal performance, reaction time, behavioral observations,
physiological measures, choices and decisions, informant-reports,
and so on.
In addition, this chapter outlines scale construction in terms of four steps
(Figure 2.1). Reflecting contemporary social/personality psychology (John &
Benet-Martinez, 2000), this chapter (and this volume more generally) blends sev-
eral approaches to scale construction. It involves rationally-focussed item-writing,
attention to scale dimensionality and internal coherence, and empirical examination
of the scale’s psychological meaning.
Table 2.1 Under-appreciated principles and practices in scale construction, use, evaluation, and
interpretation
Principles and practices
1 Examination and interpretation of reliability and validity
a) Psychometric properties and quality of the current data should be evaluated
b) Psychometric properties and quality of the current data should be considered when
using scales and when drawing psychological implications from observed effects
2 Dimensionality (i.e., factor structure)
a) Dimensionality should be evaluated and considered in scale construction, use, and
evaluation
b) Coefficient alpha is not an index of unidimensionality, the “eigenvalue greater
than one” rule should be avoided, and oblique rotations are preferable to orthogonal
3 Ad-hoc scales
a) Previously-validated scales are preferable to ad hoc scales
b) Ad hoc scales should be examined psychometrically, requiring more than assumed face
validity
4 Modified scales
a) Previously-validated original scales are preferable to modified scales
b) Modified scales should be examined psychometrically, not assumed to have the same
dimensionality, reliability, and validity of original scales
5 Brief/single-item scales
a) Sufficiently-long scales are preferable to overly-brief scales
b) Brief/single-item scales should be examined psychometrically, and their psychometric
quality and potential limitations should be appropriately discussed
6 The use of scales across groups
a) If scales are used in groups that might differ psychologically from those within which the
scales were developed and validated, then the psychometric properties/differences
should be examined, understood, and potentially rectified.
b) Careful translation does not guarantee psychometric stability across linguistically-
differing groups
7 Difference scores (i.e., gain scores, change scores, discrepancy scores)
a) Alternatives to difference scores should be considered strongly.
b) Difference scores should be used with attention to their components and their
psychometric quality.
8 Advanced psychometric perspectives (e.g., Confirmatory Factor Analysis, Generalizability
Theory, and Item Response Theory)
a) Advanced perspectives offer some important differences/advantages, as compared to
traditional psychometric theory; thus, they might be highly-appropriate methods of
understanding and evaluating the psychometric properties of a given scale
b) Producers and consumers of research should be prepared to provide and/or interpret
information obtained from these perspectives when necessary
1 Articulate construct and context
2 Choose response format and assemble initial item pool
3 Collect data from respondents
4 Examine psychometric properties and quality
Final scale
Figure 2.1 The scale construction process

Examination and Interpretation of Reliability and Validity
First and most broadly, the psychometric quality of the current data should be
evaluated and considered when interpreting results. Reliability and validity are
crucial in understanding statistical results and their psychological implications.
Roughly stated, reliability is the precision of scores—the degree to which scores
accurately reflect some psychological variable in a given sample. Validity, then,
concerns the “some variable” reflected by those scores—specifically, validity is
the degree to which scores can be interpreted in terms of a specific psychologi-
cal construct. Note that scores can be reliable (i.e., they can be good indicators
of something), but—at the same time—they can be interpreted invalidly (i.e.,
they can be interpreted in terms of a construct that they do not truly reflect).
Thus, there are at least two issues that should be addressed in any psychological
study. The first is the psychometric properties and qualities of the measures used
in the study. Reliability and validity are fundamental facets of psychometric
quality, and as such, researchers should provide evidence regarding the nature and
strength of the reliability and validity of any scale, test, assessment, or dependent
variable used in a study. Is the scale performing well in the sample being studied?
Do the scale’s scores truly reflect the construct that researchers wish to measure?
The second issue is the implications that scales’ reliability and validity have for
statistical analysis and psychological interpretation. Among their effects, reliability affects
one’s statistical results, and validity affects one’s ability to interpret results in
terms of specific psychological phenomena.
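The claim that reliability affects statistical results can be illustrated with the classical attenuation formula from traditional test theory. The chapter does not state this formula, so the following is offered only as a sketch: the expected observed correlation equals the true-score correlation multiplied by the square root of the product of the two reliabilities. The function name and the example values are assumptions chosen for illustration.

```python
import math

def attenuated_correlation(true_r, rel_x, rel_y):
    """Expected observed correlation under classical test theory.

    true_r: correlation between true scores (hypothetical value)
    rel_x, rel_y: reliabilities of the two measures, each in (0, 1]
    """
    if not (0 < rel_x <= 1 and 0 < rel_y <= 1):
        raise ValueError("reliabilities must lie in (0, 1]")
    return true_r * math.sqrt(rel_x * rel_y)

# Illustrative values only: a true correlation of .50 measured with
# reliabilities of .80 and .60
print(round(attenuated_correlation(0.50, 0.80, 0.60), 3))
```

With perfectly reliable measures the function returns the true correlation unchanged; any reliability below 1 shrinks the value that a study can expect to observe.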
Without proper understanding of the psychometric properties of the measures
in a given study, researchers and readers cannot be sure that those measures were
used and interpreted appropriately. Despite this importance, fundamental psycho-
metric information is sometimes omitted from research reports. Unfortunately, we
cannot assume confidently that the reliability of a scale’s scores in one study or
with one sample of participants generalizes to all studies or all samples. Thus,
each time a scale is used, its scores should be evaluated in terms of psychometric
quality. This volume and others (Nunnally & Bernstein, 1994; Furr & Bacharach,
2008) provide broad backgrounds for addressing psychometric quality.
Reliability can often be estimated quite easily for multi-item scales, and
researchers usually assume that validity evidence generalizes across similar
samples of participants. However, both of these practices are limited, as
discussed later.
Dimensionality
A scale’s dimensionality, or factor structure, reflects the number and nature of
variables assessed by its items. Some questionnaires, tests, and inventories are
unidimensional, with items cohering and reflecting a single psychological varia-
ble. Other questionnaires are multidimensional, with sets of items reflecting
different psychological variables.
Usually based upon factor analysis, an accurate understanding of the number
and nature of a scale’s dimensions directly affects its scoring, psychometric
quality, and psychological meaning. Dimensionality dictates the number of mean-
ingful scores that a scale produces for each participant. If a scale includes two
independent dimensions, its items should be scored to reflect those dimensions.
For example, the Positive Affect Negative Affect Schedule (PANAS; Watson
et al., 1988) is a multidimensional questionnaire that produces one score for
Positive Affect (PA) and another for Negative Affect (NA). Researchers should
not combine items across the two dimensions, as doing so would produce scores
reflecting no coherent psychological variable. By dictating the number of meaningful
scores derived from a questionnaire, dimensionality also directs researchers’
evaluations of reliability and validity. That is, researchers must understand the
psychometric quality of each score obtained from a questionnaire. For example,
the PANAS has been developed and used with psychometric attention to each of
its two “subscales.” Thus, researchers who develop and use psychological scales
must understand the dimensionality of those scales. This is true even for short
scales that might appear to reflect a single psychological variable. Inaccurate
understanding of dimensionality can produce scores that are empirically and
psychologically meaningless.
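The scoring implication of dimensionality can be sketched in a few lines. The example below assumes a two-dimensional questionnaire and produces one score per dimension rather than a single pooled total. The item labels, the three-item subscales, and the use of means rather than sums are all hypothetical choices for illustration; they are not the actual PANAS items or its published scoring rule.

```python
# Hypothetical scoring key: item labels are placeholders, not actual PANAS items
SCORING_KEY = {
    "positive_affect": ["pa1", "pa2", "pa3"],
    "negative_affect": ["na1", "na2", "na3"],
}

def score_subscales(responses, key=SCORING_KEY):
    """Return one mean score per dimension; items are never pooled across dimensions."""
    scores = {}
    for dimension, items in key.items():
        values = [responses[item] for item in items]
        scores[dimension] = sum(values) / len(values)
    return scores

# One hypothetical respondent's answers on a 1-5 response format
respondent = {"pa1": 4, "pa2": 5, "pa3": 3, "na1": 1, "na2": 2, "na3": 1}
print(score_subscales(respondent))
```

Keeping the scoring key as explicit data makes the assumed factor structure visible, so that reliability and validity can then be examined separately for each score.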
An important related point is that a scale’s dimensionality is not clearly
reflected by the familiar Cronbach’s coefficient alpha. Based upon a scale’s internal
consistency, alpha is an estimate of a scale’s reliability; however, it is not an index
of unidimensionality. That is, a large alpha value cannot be interpreted as clear
evidence of unidimensionality (see Chapter 4).
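A small simulation can make this point concrete. The sketch below computes coefficient alpha with the usual formula, k/(k - 1) multiplied by one minus the ratio of summed item variances to total-score variance, and applies it to simulated data in which twelve items reflect two uncorrelated factors. The data, loadings, sample size, and function name are assumptions made purely for illustration.

```python
import numpy as np

def cronbach_alpha(items):
    """Coefficient alpha for a respondents-by-items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Simulated (not real) data: two uncorrelated factors, six items each
rng = np.random.default_rng(0)
n = 5000
factor_a = rng.standard_normal(n)
factor_b = rng.standard_normal(n)
noise = rng.standard_normal((n, 12)) * 0.6
part_a = np.column_stack([factor_a] * 6) * 0.8  # items 1-6
part_b = np.column_stack([factor_b] * 6) * 0.8  # items 7-12
data = np.hstack([part_a, part_b]) + noise

alpha = cronbach_alpha(data)
print(f"alpha for the 12 pooled items: {alpha:.2f}")
```

In this setup alpha should come out at roughly .8 even though the items clearly reflect two unrelated dimensions, which is why a large alpha cannot be read as evidence of unidimensionality.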
Finally, there are several recommendations that contradict many applications
of factor analysis in evaluating the dimensionality of psychological scales. One
is that the “eigenvalue greater than one” rule is a poor way to evaluate the
number of dimensions underlying a scale’s items; other procedures, such as
scree plots, are preferable. A second recommendation is that oblique rotations
are preferable to orthogonal rotations. These recommendations are detailed in
Chapter 4.
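As a rough illustration of what a scree plot displays, the sketch below computes the eigenvalues of an inter-item correlation matrix and lists them from largest to smallest. The data are simulated under an assumed two-factor structure, and the function name, loadings, and sample size are hypothetical; plotting is omitted so that the example depends only on NumPy.

```python
import numpy as np

def scree_eigenvalues(items):
    """Eigenvalues of the inter-item correlation matrix, largest first."""
    corr = np.corrcoef(np.asarray(items, dtype=float), rowvar=False)
    return np.sort(np.linalg.eigvalsh(corr))[::-1]

# Simulated (not real) responses: two uncorrelated factors, six items each
rng = np.random.default_rng(1)
n = 2000
factors = rng.standard_normal((n, 2))
pattern = np.zeros((2, 12))
pattern[0, :6] = 0.8   # items 1-6 load on the first factor
pattern[1, 6:] = 0.8   # items 7-12 load on the second factor
data = factors @ pattern + 0.6 * rng.standard_normal((n, 12))

eigenvalues = scree_eigenvalues(data)
for rank, value in enumerate(eigenvalues, start=1):
    print(f"{rank:2d}  {value:.2f}")
```

Plotted against their rank, these values would show a sharp drop after the second eigenvalue followed by a flat tail, which is the visual pattern a scree plot is inspected for.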
Ad Hoc Scales
Occasionally, researchers create scales to measure specific constructs for a study.
Of course, scale-development is important for psychological research, and there
are good reasons to create new scales (e.g., one might wish to measure a construct
for which a scale has not yet been developed). However, there are two caveats
related to ad hoc scales.
The first caveat is that previously-validated scales are generally preferable to
ad hoc scales. That is, when well-validated scales exist for a given construct,
researchers should strongly consider using those scales rather than a new scale.
For example, there are many well-validated self-esteem scales differing in length,
psychological breadth, and dimensionality. With such diversity and known psy-
chometric quality within easy reach, there seems little reason to assemble a new
ad hoc self-esteem scale.
The second caveat is that, if ad hoc scales are created, they require psychomet-
ric evaluation, including validity evidence that goes beyond face validity. Ideally,
scales that are intended to measure important psychological variables are devel-
oped through a rigorous process emphasizing psychometric quality. However,
ad hoc scales sometimes seem to be applied without much evaluation, with
researchers apparently relying upon face validity and assuming that items
obviously reflect the intended construct. Not only should researchers examine the
dimensionality and reliability of ad hoc scales, but they should strive to obtain and
report independent evidence of validity. For example, researchers might recruit
independent raters (e.g., colleagues or students) to read the items along with items
intended to reflect other variables, ask the raters to rate the clarity with which each
item reflects each of several variables, and evaluate the degree to which each item
was rated as clearly reflecting its intended variable. Such examinations produce
validity-related evidence that goes beyond the researcher’s opinions, and they
might convince readers that a scale is sufficiently valid for narrowly-focussed
application.
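The rater-based check described above could be summarized along the following lines. This is only a sketch under assumed conventions: each rater gives a 1-5 clarity rating of how well each item reflects each candidate variable, ratings are averaged across raters, and an item is flagged when its intended variable does not receive the highest mean. The item labels, variable names, and ratings are invented for illustration.

```python
from statistics import mean

# Hypothetical data: ratings[item][variable] holds one clarity rating (1-5) per rater
ratings = {
    "item_1": {"self_esteem": [5, 4, 5], "optimism": [2, 1, 2]},
    "item_2": {"self_esteem": [3, 2, 3], "optimism": [4, 4, 3]},
}
intended = {"item_1": "self_esteem", "item_2": "self_esteem"}

def summarize_item(item_ratings, target):
    """Average ratings per variable and check whether the target variable ranks first."""
    means = {variable: mean(values) for variable, values in item_ratings.items()}
    best = max(means, key=means.get)
    return {"means": means, "best_match": best, "matches_intended": best == target}

for item, item_ratings in ratings.items():
    summary = summarize_item(item_ratings, intended[item])
    status = "ok" if summary["matches_intended"] else "review"
    print(item, summary["best_match"], status)
```

Items flagged for review are candidates for rewording or removal; the averages themselves can be reported as validity-related evidence that goes beyond the researcher's own judgment.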
Modified Scales
In addition to creating ad hoc scales, researchers sometimes modify existing
scales for new purposes. For example, researchers might shorten an existing scale
or might revise a measure of one variable (e.g., health locus of control) to “fit” a
different variable (e.g., financial locus of control). Again, there can be good
reason for such modifications (e.g., existing scales are perceived as too lengthy,
or there is no existing measure of the variable in question).
However, there are caveats that arise with modified scales, paralleling those
associated with ad hoc scales and arising from the fact that modified scales might
not have the psychometric properties or quality of an original scale. Indeed, the
more substantially-modified a scale is, the less it can be assumed to have psycho-
metric quality similar to the original. Thus, one caveat is that well-validated,
original scales are preferable to modified scales. Because a modified scale’s psy-
chometric properties and quality might differ from those of the original scale, the
modified scale is—to some degree—an ad hoc scale. As such, its psychometric
properties and quality are unclear and suspect. Consequently, a second caveat is
that modified scales require psychometric evaluation and interpretation. Because
the psychometric properties of a modified scale likely differ from those of the
original, researchers should carefully examine the scale’s dimensionality, reliability,
and validity.
Brief or Single-item Scales

An issue related to the creation or modification of scales is the use of
brief scales—scales with very few items, perhaps only a single item. Indeed,
brief scales are appealing, particularly when participants cannot be burdened
with long scales.
Unfortunately, brief scales have important psychometric costs: their psychometric
quality might be, and likely is, poor or even unknown. As we shall see, traditional
reliability theory suggests that reliability is relatively weak for brief scales,
all else being equal. For example, recent research (van Dierendonck, 2005) com-
pared three versions of a “Purpose in Life” scale—a 14-item version, a 9-item
version, and a 3-item version. Results showed weaker reliability for shorter
versions, with reliability estimates of α = .84, α = .73, and α = .17 for the three
versions, respectively. The particularly-poor reliability of the 3-item version sug-
gests that “it is troublesome if the scales are to be used as variables in correlational
analysis. Low reliability diminishes the chance of finding significant correlations”
(van Dierendonck, 2005, p. 634). In fact, low reliability is problematic not only for
typical “correlational analyses” but for any analysis of group differences or asso-
ciations (see Chapter 4). A second difficulty particular to single-item scales is that
they preclude the use of internal consistency methods for estimating reliability (e.g.,
coefficient alpha). Because internal consistency methods are easier than most other
methods of estimating reliability, single-item scales are often used without atten-
tion to psychometric quality. This is a serious problem, preventing researchers,
reviewers, editors, and readers from knowing the quality of a measurement that,
being based upon a single-item scale, is inherently suspect. Thus, researchers who
use single-item scales might examine test–retest studies in order to estimate the
reliability of the scales.
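The link between scale length and reliability noted in this section is usually expressed through the Spearman-Brown formula of traditional reliability theory, which is not given in this chapter and is included here only as a sketch. It predicts the reliability of a lengthened or shortened scale on the assumption that the items added or dropped are parallel to the rest. The function name is hypothetical; the starting value of .84 is the 14-item estimate quoted above.

```python
def spearman_brown(reliability, length_ratio):
    """Predicted reliability after changing scale length by `length_ratio`
    (new number of items divided by original number of items)."""
    return (length_ratio * reliability) / (1 + (length_ratio - 1) * reliability)

original_alpha = 0.84  # 14-item version, as reported above
for n_items in (9, 3):
    predicted = spearman_brown(original_alpha, n_items / 14)
    print(f"{n_items}-item version: predicted reliability {predicted:.2f}")
```

Under the parallel-items assumption the predictions are about .77 and .53. The observed estimates quoted above (.73 and .17) are lower, a reminder that the formula is an idealization and that a shortened scale needs its own empirical evaluation.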
Importantly, brief or even single-item scales can have psychometric properties
sufficient for researchers facing strict constraints in measurement strategies. For
example, the Single-Item Self-Esteem Scale has good test–retest reliability esti-
mates, and it has strong convergent validity correlations with the widely-used
Rosenberg Self-Esteem Scale (Robins et al., 2001). Similarly, the Ten-Item
Personality Inventory has good test–retest estimates of reliability for each of its
2-item scales, which have strong convergent validity correlations with longer
measures of their constructs (Gosling et al., 2003). The important point is that
brief scales are appropriate and useful when their psychometric properties are
adequate, as demonstrated by solid empirical estimates.
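A test–retest estimate of the kind mentioned here is, in its simplest form, the correlation between scores from two administrations of the same scale to the same respondents. The sketch below computes a Pearson correlation for invented single-item scores; the data and function name are assumptions, and practical issues such as the length of the retest interval are not addressed.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical single-item scores for the same eight respondents on two occasions
time_1 = [4, 2, 5, 3, 1, 4, 5, 2]
time_2 = [4, 3, 5, 3, 2, 3, 5, 1]
print(round(pearson_r(time_1, time_2), 2))
```

Because this estimate does not rely on correlations among items, it remains available for single-item scales, for which internal consistency methods cannot be applied.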
Scale Use Across Psychologically-differing Groups

Researchers often assume that psychometric properties generalize across sam-
ples of participants, but this assumption is not always valid. Indeed, a scale’s
psychometric properties might be importantly different in differing groups, and
this might be particularly problematic for research in which scales are trans-
ported across cultural groups. Whether due to group differences in the interpreta-
tions of words, question/items, instructions, or the general meaning of a set of
items, it is possible that “the items of the scale do not similarly represent the
same latent construct … across groups” (Tucker et al., 2006, p. 343). In such
situations, the scale’s psychological meaning differs across groups and “the
accuracy of interpretations about group differences on the latent construct is
compromised” (ibid.).
Several considerations are important when a scale is used in groups that might
differ psychologically from the group in which it was initially developed and
validated. First, psychometric properties should be examined within each new
group, and psychometric differences should be examined, understood, and poten-
tially rectified before scores are used in those groups. If the new group might
differ in the interpretation of a scale or in terms of the scale’s link to the construct
of interest, then researchers should explore this possibility. Results will either
reveal that the scale is similarly meaningful in the new group, or they will reveal
psychologically-interesting differences.
To establish the comparability of a scale across groups, researchers examine
“test bias” (see Chapter 6), “measurement invariance,” or differential item func-
tioning (see Chapter 10). For example, Tucker et al. (2006) examined the
Satisfaction With Life Scale (SWLS) in North Americans and Russians, finding
that scores were not strongly comparable across groups. Such results suggest that
comparison of groups’ averages might produce misleading conclusions.
The second consideration is that careful translation does not guarantee psycho-
metric stability across groups. Many researchers should be commended for care-
ful attention in translating scales into a new language. Indeed, such attention is
crucial for good cross-cultural work; however, it does not guarantee that the
scale’s psychometric properties or meaning are comparable. Again, the comparability
of a scale’s properties, and ultimately of its psychological meaning, is
revealed through psychometric analysis of responses to the scale.
Difference Scores
One important topic addressed in this volume is the use of difference scores as
indicators of psychological phenomena. Difference scores—or “change scores” or
“discrepancy scores”—are obtained by measuring two component variables and
computing the difference between the two. For example, a participant’s intergroup
bias might be measured by having her rate the positivity of an ingroup and of an
outgroup, and then calculating the difference between the two ratings. The differ-
ence might be interpreted as the degree to which she has a more favorable attitude
toward the ingroup than toward the outgroup.
Difference scores are appealing, but their intuitive appeal masks complexities
that have been debated for decades. Although much debate highlights the sup-
posed unreliability of difference scores, there is also concern that difference
scores can lack discriminant validity, in terms of simply reflecting one of their
component variables. Both issues can compromise the psychological conclusions
based upon difference scores.
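The reliability concern can be illustrated with the traditional formula for the reliability of a difference score D = X - Y, which depends on the variability and reliability of the two components and on their correlation. The formula is not given in this chapter; the sketch below assumes classical test theory with uncorrelated measurement errors, and the function name and input values are hypothetical.

```python
def difference_score_reliability(sd_x, sd_y, rel_x, rel_y, r_xy):
    """Reliability of D = X - Y under classical test theory,
    assuming the measurement errors of X and Y are uncorrelated."""
    covariance = r_xy * sd_x * sd_y
    true_variance = rel_x * sd_x**2 + rel_y * sd_y**2 - 2 * covariance
    observed_variance = sd_x**2 + sd_y**2 - 2 * covariance
    if observed_variance <= 0:
        raise ValueError("difference score has no variance")
    return true_variance / observed_variance

# Illustrative values: two components with equal variability and reliability .80
for r_xy in (0.2, 0.6):
    rel_d = difference_score_reliability(1.0, 1.0, 0.80, 0.80, r_xy)
    print(f"component correlation {r_xy:.1f}: difference-score reliability {rel_d:.2f}")
```

With these inputs the reliability of the difference is .75 when the components correlate .2 and .50 when they correlate .6, showing how a difference between two reasonably reliable but strongly correlated components can be much less reliable than either component.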
This volume discusses these complexities, including psychometrically-based
and statistically-based recommendations for handling them. First, and most gener-
ally, alternatives to difference scores should be considered seriously. That is, dif-
ficulties might be handled best by avoiding difference scores altogether, focussing
instead on their component variables. Second, if difference scores are used, they
should be used with attention to their components and to their psychometric quality.
There seems to be a tendency to ignore the fact that difference scores are variables
with psychometric properties that must be understood and considered when draw-
ing psychological conclusions. Without such understanding and consideration,
research based upon difference scores is ambiguous in terms of measurement
quality, statistical validity, and psychological meaning.
Advanced Psychometric Perspectives

Advanced psychometric perspectives or tools such as Confirmatory Factor Analysis,
Generalizability Theory, and Item Response Theory are increasingly accessible.
As they become better known and better integrated into user-friendly
statistical software, they may become more important for scale development
and evaluation.
Such perspectives offer important differences from and advantages over tradi-
tional psychometric theory, and they can be useful for understanding and evaluating
the psychometric properties of psychological measures. Furthermore, producers
and consumers of psychological research should be prepared to provide and/or
interpret information obtained from these perspectives. Given the advantages of
these perspectives, they may be the optimal choices for some—perhaps much—of
the work in scale development and evaluation. This volume presents important
principles of these perspectives, with examples that lay foundations for conducting
and interpreting analyses based upon them.
Steps in Scale Construction

Scale construction can be seen as a four-step process that is often iterative
(Figure 2.1). Although each step is important, some are ignored in some scale
construction procedures. Unfortunately, bypassing any of these steps might pro-
duce a scale with unknown psychometric quality and ambiguous meaning. High-
quality research requires serious attention to scale construction and evaluation.
Step 1: Articulate the Construct and Context

The first, and perhaps most deceptively-simple, facet of scale construction is
articulating the construct(s) to be measured. Whether the construct (one or
more) is viewed as an attitude, a perception, an attribution, a trait, an emotional
response, a behavioral response, a cognitive response, or a physiological
response, or, more generally, a psychological response, tendency, or disposition
of any kind, it must be carefully articulated and differentiated from
similar constructs. Is more than one construct to be measured? What is the
exact psychological definition of each construct? Is each construct narrow or
broad? Does the construct have subcomponents or dimensions that should be
differentiated and measured? What are the likely associations and differences
between the intended construct and other relevant psychological constructs?
Such questions guide subsequent steps in scale construction and evaluation,
ultimately determining the scale’s meaning and quality. For example, if an
intended construct is not clearly differentiated from other constructs, then
subsequent steps might produce a scale with poor validity and ambiguous
meaning.
In addition, researchers creating a new scale must articulate the context in
which it is likely to be used. The context includes at least two elements—the
likely target population and the likely administration context. Obviously,
the intended population will direct subsequent steps of item writing. For example,
the response formats, items, or instructions for a scale intended to be used with
adults would differ dramatically from those for one intended to be used with chil-
dren. As discussed earlier, researchers cannot assume that a scale developed for
and validated within one population is psychometrically comparable or similarly
meaningful in different populations. Similarly, the likely administration context(s)
must be considered carefully. For example, if the scale will be used primarily in
research contexts that are time-sensitive, then subsequent steps will likely focus
on brevity. Or, for example, if the scale will be administered via an online survey,
then researchers should consider implementing online strategies in Step 3 of the
construction process (see Note 1).
Step 2: Choose Response Format and Assemble Initial Item Pool

In the second step of scale construction, researchers choose a response format and
assemble an initial item pool. Guided by considerations from the first step,
researchers write or seek out items that seem psychologically relevant to the
intended construct. Of course, this depends on factors such as the number of con-
structs to be measured, the intended length of the scale, and the clarity of the
construct’s definition.
As discussed in Chapter 3, this step often includes iterative sub-steps in which
items are discussed, considered in terms of conceptual relevance and linguistic
clarity, and discarded or revised. In addition, this work may lead researchers to revisit
the first step—potentially re-conceptualizing the focal construct(s). Indeed, a
thoughtful item-writing process can reveal shortcomings in a scale’s conceptual basis.
Step 3: Collect Data

After one or more constructs have been articulated, the likely assessment context
has been determined, and items have been assembled, the items should be admin-
istered to respondents representing the likely target population, in a manner
reflecting the likely administration context. This step has at least two purposes.
First, it can reveal obvious problems through respondent feedback or observation.
For example, respondents might require more time than initially supposed, or they
might express confusion or frustration. Such issues might require revision of the
scale. Second, this step produces data for the next step of scale construction—
evaluation of the item pool’s psychometric properties and quality.
Step 4: Psychometric Analysis

Scale construction requires attention to the psychometric properties of proposed
items and of the proposed scale as a whole. By collecting data in a representative
administration context and attending to dimensionality, reliability, and validity,
researchers enhance the possibility that the scale will be useful and psychologi-
cally informative. Without such attention, even scales with the most straightfor-
ward appearance might be psychologically ambiguous or meaningless. Subsequent
chapters articulate principles and processes important for psychometric evaluation.
The results of psychometric analyses determine subsequent phases of scale
construction. If analyses reveal clear psychometric properties and strong psycho-
metric quality, then researchers might confidently complete scale construction.
However, psychometric analyses often reveal ways in which scales could be
improved, leading researchers back to item (re)writing. In addition, psychometric
analyses occasionally even lead researchers to re-conceptualize the nature of the
construct(s) at the heart of the scale. Upon rewriting, the newly-revised scale
should be evaluated in terms of its psychometric properties. This back-and-forth
process of writing, analysis, and re-writing might require several iterations, but
the result should be a scale with good psychometric quality and clear psychological
meaning.
General Issue: Scale of Measurement

One general issue that sometimes escapes scrutiny is whether a scale produces
scores at an interval level of measurement. At an interval level of measurement,
the underlying psychological difference between scores is constant across the
entire range of scores. Consider a hypothetical 1-item scale measuring homopho-
bia: “I avoid homosexual people,” with response options of 1 = Never, 2 = Rarely,
3 = Sometimes, 4 = Often, 5 = Always. To interpret scores at an interval level of
measurement, researchers must believe that the size of the psychological differ-
ence (in terms of underlying homophobic attitudes) between “Never” and
“Rarely” avoiding homosexual people is identical to the size of the psychological
difference between “Rarely” and “Sometimes” avoiding homosexual people. That
is, the psychological difference between a score of 1 and a score of 2 is identical
to the psychological difference between a score of 2 and a score of 3. There is
serious debate about whether this is true for many psychological scales.
Level of measurement can have implications for the meaningfulness of specific
forms of statistical analysis, though there is disagreement about this. Strictly
speaking, a scale that is not at least at the interval level of measurement is difficult
to interpret in terms of analyses based upon linear regression (as are ANOVA-
based procedures). For example, an unstandardized regression slope reflects the
difference in an outcome variable associated with a one-unit difference in a predictor
variable. The psychological meaning of this is clear only when “a one-unit difference
in a predictor variable” is consistent across levels of that variable. This is true for
interval scales, but not for ordinal or nominal scales.
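The dependence of a regression slope on the assumed spacing of response options can be shown with a small example. The sketch below fits the same invented data twice: once with the five response options coded 1 to 5 (the interval-level assumption) and once with an arbitrary alternative coding that preserves their order but not their spacing. The data, the alternative spacing, and the function name are hypothetical.

```python
def ols_slope(x, y):
    """Unstandardized least-squares slope of y regressed on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx

# Hypothetical data: responses on a 1-5 item and scores on some outcome
responses = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5]
outcome = [2, 3, 5, 6, 9, 1, 4, 4, 7, 8]

# Equal spacing (the interval-level assumption) versus one arbitrary
# order-preserving alternative spacing of the same five response options
equal_spacing = {1: 1, 2: 2, 3: 3, 4: 4, 5: 5}
unequal_spacing = {1: 1, 2: 2, 3: 4, 4: 7, 5: 11}

slope_equal = ols_slope([equal_spacing[r] for r in responses], outcome)
slope_unequal = ols_slope([unequal_spacing[r] for r in responses], outcome)
print(slope_equal, slope_unequal)
```

Both codings respect the order of the response options, yet they yield different slopes. If only the order of the options is trusted, there is no basis for preferring one slope over the other, which is the interpretive difficulty described above.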
Regardless of ambiguities and disagreements, researchers generally treat
Likert-type scales (such as the hypothetical homophobia scale) as an interval level
of measurement. Particularly for aggregated scores obtained from multi-item
scales, researchers assume that scores are “reasonably” interval-level. For very
brief or single-item scales, this assumption is more tenuous. In such cases,
researchers should either consider alternative analytic strategies or acknowledge
the potential problem.
Summary
This chapter has reviewed principles and recommendations for scale construction,
evaluation, and use, and has summarized the scale construction process. The
remainder of this volume provides greater depth into the process, principles, and
practices, hopefully enhancing motivation and ability to pursue effective measure-
ment procedures.
Note
1 The possibility that online surveys differ meaningfully from traditional methods has
received some empirical attention. For example, Gosling et al. (2004) found that
web-based surveys produce results similar to those produced by traditional methods;
however, they note that “this question has yet to be resolved conclusively” (p. 102).