PharmacoEconomics (2017) 35:1153–1165

DOI 10.1007/s40273-017-0538-9


The Indonesian EQ-5D-5L Value Set

Fredrick Dermawan Purba1,2 • Joke A. M. Hunfeld1 • Aulia Iskandarsyah3 •
Titi Sahidah Fitriana4 • Sawitri Supardi Sadarjoen3 • Juan Manuel Ramos-Goñi5 •
Jan Passchier6 • Jan J. V. Busschbach1

Published online: 10 July 2017

© The Author(s) 2017. This article is an open access publication

Abstract

Background The EQ-5D is one of the most used generic health-related quality-of-life (HRQOL) instruments worldwide. To make the EQ-5D suitable for use in economic evaluations, a societal-based value set is needed. Indonesia does not have such a value set.

Objective The aim of this study was to derive an EQ-5D-5L value set from the Indonesian general population.

Methods A representative sample aged 17 years and over was recruited from the Indonesian general population. A multi-stage stratified quota method with respect to residence, gender, age, level of education, religion and ethnicity was utilized. Two elicitation techniques, the composite time trade-off (C-TTO) and discrete choice experiments (DCE), were applied. Interviews were undertaken by trained interviewers using computer-assisted face-to-face interviews with the EuroQol Valuation Technology (EQ-VT) platform. To estimate the value set, a hybrid regression model combining C-TTO and DCE data was used.

Results A total of 1054 respondents who completed the interview formed the sample for the analysis. Their characteristics were similar to those of the Indonesian population. Most self-reported health problems were observed in the pain/discomfort dimension (39.66%) and least in the self-care dimension (1.89%). In the value set, the maximum value was 1.000 for full health (health state ‘11111’) followed by the health state ‘11112’ with value 0.921. The minimum value was -0.865 for the worst state (‘55555’). Preference values were most affected by mobility and least by pain/discomfort.

Conclusions We now have a representative EQ-5D-5L value set for Indonesia. We expect our results will promote and facilitate health economic evaluations and HRQOL research in Indonesia.

Key Points for Decision Makers

Indonesia does not have an EQ-5D value set.

An EQ-5D-5L value set was derived from a highly representative sample of the Indonesian general population.

Data were collected using a rigorous quality control procedure which led to logical and significant models.

This Indonesian EQ-5D-5L value set is now becoming available and will be used by all health economic evaluations and health-related quality-of-life studies in Indonesia that use EQ-5D.

Electronic supplementary material The online version of this article (doi:10.1007/s40273-017-0538-9) contains supplementary material, which is available to authorized users.

Fredrick Dermawan Purba
[email protected]; [email protected]

1 Department of Psychiatry, Erasmus MC University Medical Center, Wytemaweg 80, Room Na-2019, 3015 CN Rotterdam, The Netherlands
2 Department of Developmental Psychology, Faculty of Psychology, Padjadjaran University, Jatinangor, Indonesia
3 Department of Clinical Psychology, Faculty of Psychology, Padjadjaran University, Jatinangor, Indonesia
4 Center of Applied Psychometrics, Faculty of Psychology, YARSI University, Jakarta, Indonesia
5 Executive Office, EuroQol Research Foundation, Rotterdam, The Netherlands
6 Department of Clinical, Neuro and Developmental Psychology, VU University, Amsterdam, The Netherlands

1 Introduction

The Indonesian government wishes to improve equal access to healthcare by introducing universal health insurance. To ensure health technology assessment (HTA) can be undertaken for such an insurance scheme, Indonesia intends to employ cost-effectiveness analysis for new and existing medical interventions. To value the outcomes of a medical intervention in quality-adjusted life-years (QALYs) requires a quality-of-life instrument that can value the health states of patients using societal preferences, such as the EQ-5D instrument. At present, no Indonesian EQ-5D value set is available for the calculation of QALYs. There exists a standardized valuation protocol for the 5-level version of EQ-5D. We employed this protocol with over 1000 respondents representative of the Indonesian population. Below we describe in more detail (1) the social, economic and organizational HTA setting that determines the demand and specifications for an Indonesian valuation study; (2) a brief introduction to the EQ-5D-5L, its valuation protocol and the place of the EQ-5D in HTA; and (3) why we cannot rely on value sets from European countries and/or neighbouring countries.

Indonesia is located in South East Asia, with 255.5 million inhabitants in 2015 [1]. Commencing in January 2014, Indonesia has implemented universal healthcare coverage organized by the ‘Badan Penyelenggara Jaminan Sosial Kesehatan’ or BPJS Kesehatan: the Healthcare and Social Security Agency. The aim of the BPJS Kesehatan is to include all Indonesian citizens in the National Health Insurance system to enable them to obtain access to healthcare benefits and to provide protection with respect to basic health needs [2]. The decision-making process related to the implementation of this national health coverage and the adoption of new technologies can benefit from an evidence-based strategy and the application of HTA, a decision-making process involving economic evaluation and other considerations such as those of an ethical and organizational nature, to ensure the optimal use of health technologies for the population. In 2015, the Ministry of Health of Indonesia formed a national HTA committee (Komite Penilaian Teknologi Kesehatan). The committee’s expected output is a policy recommendation to the Minister on the feasibility of the health service(s) to be included in the National Health Insurance benefit package [3, 4].

Economic evaluation uses clinical evidence to provide systematic consideration of the effects of all available alternatives regarding health, healthcare costs, and other effects regarded as valuable [5]. Cost-utility analysis (CUA) is used to evaluate health-related quality-of-life (HRQOL) outcomes and to compare costs and outcomes between different healthcare programmes in terms of cost per QALY [5, 6]. A QALY is obtained by integrating a health state utility function, measured by multi-attribute utility instruments (MAUIs), over a lifetime. The three most widely used MAUIs are the EQ-5D, the Health Utilities Index (HUI), and the Short Form 6D (SF-6D) [5–8]. Several national HTA organizations, for example in the UK and Thailand, have recommended EQ-5D as the preferred method for deriving utilities [9, 10]. Developed by the EuroQol Group, EQ-5D is a standardized generic instrument that collects descriptive HRQOL data on five dimensions (mobility, self-care, usual activities, pain/discomfort, anxiety/depression), followed by a self-rating of overall health status on a visual analogue scale (EQ VAS) ranging from 0 (‘worst imaginable health state’) to 100 (‘best imaginable health state’) [11, 12]. In 2011, the EuroQol Group expanded the levels of severity of the classic version of EQ-5D, renamed EQ-5D-3L, from three to five levels. This new instrument is designated ‘EQ-5D-5L’ [12]. Recent studies have shown that EQ-5D-5L produces a richer description of health states, a higher discriminatory power, and a lower ceiling effect compared with EQ-5D-3L [13–18]. The EuroQol Group has also developed a valuation protocol for EQ-5D-5L [19], and the EuroQol Group Valuation Technology (EQ-VT) template computerized the interview method to standardize EQ-5D-5L valuation studies in different countries. This protocol provides a value set for the calculation of QALYs using a societal perspective, the preferred perspective in health economics [5].

Indonesia does not have an EQ-5D value set, either for the 3-level or for the new 5-level version. Previous EQ-5D studies conducted in Indonesia measured health preferences using the Malaysian value set or values derived from citizens of the UK [20, 21]. However, for a value set to be valid for Indonesia it should represent the culture and living standards of Indonesia [22]. Moreover, the values should match the particular wording of the Indonesian instrument: for instance, if ‘cukup’ (i.e. ‘moderate’) is less severe in Bahasa Indonesia than in the Malaysian language (‘sederhana’) or in English, then the values should match that difference. For these reasons the aim of our study was to obtain preferences from the general population in order to derive a national EQ-5D-5L value set for the calculation of QALYs from a societal, Indonesian perspective.

2 Methods

2.1 Respondents

A representative sample was recruited from the Indonesian general population, with a minimum of 1000 respondents aged 17 and over, based on the work of Ramos-Goñi et al., to obtain a 0.01 standard error (SE) of the observed mean
composite time trade-off (C-TTO), 9735 C-TTO responses were needed. Therefore, the 1000 respondents interviewed provided 10,000 C-TTO and 7000 discrete choice responses to estimate the models [23]. The adult population was defined as aged 17 and over, because in Indonesia, the legal age to obtain an ID card, a driving license, and access to voting is 17. To ensure the representativeness of the final sample for the Indonesian general population, we used a multi-stage stratified quota method with respect to residence (urban/rural, as registered by the official national register); gender (male/female); age (17–30/31–50/>50 years); and level of education: basic (primary school and below), middle (primary school plus at least 1 year of high school) and high (all others). This resulted in the first stage of 36 quota groups. Two other categories, religion (Islam/Christian/Others) and ethnicity (own-declared ethnicity: Jawa/Sunda/Sumatera/Sulawesi/Madura-Bali/Others), were considered important as well. However, including them in the same way as residence, gender, age, and education would result in 36 × 3 × 6 = 648 quota groups. We therefore used religion and ethnicity quotas independently from the other factors. So religion and ethnicity are representative over the whole sample, but within the individual 36 quota groups this might not be the case. To take account of this second layer of sampling, we called this a ‘multi-stage stratified quota’. The predefined quotas were based on updated data from the Indonesian Bureau of Statistics [1].

We designed and used an online tool to ensure that the recruitment of respondents was in accordance with predefined quotas while the sampling was employed in different parts of the country. Interviews were conducted in the following six cities and their surroundings, located in different parts of Indonesia: Jakarta, Bandung, Jogjakarta, Surabaya, Medan, and Makassar. Respondents were recruited through a mixed strategy, i.e. through personal contact, local leader assistance, and from public places such as mosques and shopping streets. We also asked respondents to introduce us to other potential respondents. Interviews were conducted at the respondents’ or interviewers’ homes. For their participation, all respondents received a mug or a t-shirt specifically designed for the valuation study. Informed consent was obtained from all respondents included in the study. The study was approved by the Health Research Ethics Committee, Faculty of Medicine, Padjadjaran University, Indonesia.

2.2 Instruments

2.2.1 EQ-5D-5L

We used the official EQ-5D-5L Bahasa Indonesia version provided by the EuroQol Group. This translation of EQ-5D-5L was produced using a standardized translation protocol that followed international recommendations [24]. As briefly mentioned in the introduction, EQ-5D-5L consists of five dimensions: mobility (MO), self-care (SC), usual activities (UA), pain/discomfort (PD), and anxiety/depression (AD). Each dimension has five levels: no problems, slight problems, moderate problems, severe problems, and unable/extreme problems [12]. The EQ-5D-5L instrument describes 3125 (5⁵) unique health states. A 1-digit number expresses the level selected for each dimension. Hence, combining the five digits, one per dimension, into a 5-digit number describes a specific health state. For example, state ‘11111’ indicates ‘no problems on any of the five dimensions’, while state ‘54321’ indicates ‘unable to walk about, severe problems washing or dressing, moderate problems doing usual activities, slight pain or discomfort, and no anxiety or depression’ [12]. Each health state has a so-called ‘sum score of the level digits’, which means the sum of the levels across dimensions; for example, the sum score of the level digits is 5 for ‘11111’ and 15 for ‘54321’. This EQ-5D descriptive system is followed by self-rating of overall health status on a visual analogue scale (EQ VAS) ranging from 0 (‘worst health you can imagine’) to 100 (‘best health you can imagine’).

2.2.2 Valuation Protocol

The EQ-5D-5L valuation protocol consists of six sections:

1. A general welcome, where the interviewer explains the objectives of the research, followed by filling in the informed consent when the individuals agree to participate.
2. Introduction to and completion of the descriptive system, VAS and background questions (age, sex, experience of illness, religion, ethnicity and education).
3. C-TTO (see Sect. 2.2.3 below) tasks followed by a ‘Feedback Module’ task. Each respondent has to complete one example (health state: being in a wheelchair), three practice health states (mild: ‘21121’; severe: ‘35554’; and moderate but difficult to imagine: ‘15411’) and ten ‘real’ C-TTO tasks valuing hypothetical EQ-5D-5L health states. In the Feedback Module task, the respondents check whether they agree with the order of the health states they valued before. The EQ-VT screen shows health states for 10 C-TTO tasks arranged based on their value given by the respondents: from the lowest value at the bottom to the highest value at the top. Respondents are allowed to ‘flag’ the health state(s) for which they do not agree with the previously given relative position to
other health states, but they are not allowed to alter their initial values. Three debriefing questions regarding the difficulties of the C-TTO tasks are added at the end of this section.
4. A discrete choice experiment (DCE, see Sect. 2.2.3 below) followed by three debriefing questions regarding the DCE. Each respondent has to complete seven forced-pair comparisons.
5. A round-up, where respondents can comment on the valuation tasks.
6. Country-specific questionnaire(s) (if any).

All sections were administered utilizing computer-assisted face-to-face interviews employing the EQ-VT platform version 2.0.

2.2.3 Preference Elicitation Methods

Time trade-off (TTO) has been widely used as a standard method to elicit preferences [25, 26]. C-TTO uses conventional TTO to elicit better-than-dead (BTD) values, and lead-time TTO to elicit worse-than-dead (WTD) values. Details regarding C-TTO can be found in the study by Oppe et al. [27]. In summary, respondents were first faced with ‘conventional’ TTO where they had to choose between 10 years in an impaired health state (Life B) and 10 years of full health (Life A). After a series of choice-based iterations, respondents achieved a point of equivalence between the length of time in full health (Life A): ‘x’ and a period of time (10 years) in the impaired health state (Life B). The impaired health state value is defined as x/10. For example, if a respondent could not differentiate between 3 years of full health in Life A and 10 years living in Life B, then that health state value would be 0.3 (3/10). For a really poor health state, respondents might prefer to die immediately; that is, the value for that specific health state is <0 (death value = 0). In this case, the lead-time TTO approach was introduced to allow respondents to express a value below the value of death; that is, below 0. The two lives in the lead-time TTO are 10 years of full health (Life A) and 10 years of full health followed by 10 years in the impaired health state (Life B). When respondents reach an indifference point between the amount of time ‘x’ in Life A and Life B, the health state value is defined as (x - 10)/10. Hence, -1 is the lowest possible value of a given health state, generated from trading the full 10 years of Life A in a lead-time TTO.

The EQ-5D-5L valuation protocol included 86 EQ-5D-5L health states to be valued using C-TTO. The 86 health states were distributed into ten blocks with a similar level of severity. Eighty unique health states were selected using Monte Carlo simulation (eight unique health states included in each block), five very mild states (only one dimension at level 2 and all others at level 1, e.g. ‘11112’) (each included in two blocks) and the most severe/‘pits’ state (‘55555’) (included in all blocks) [19]. Respondents were randomly assigned to one of the ten C-TTO blocks. Each state of the block was presented in random order to respondents using the EQ-VT platform.

However, it was realized that TTO has its limitations. The EuroQol Group considered different valuation techniques to be used in conjunction with TTO to make the valuation studies more robust and valid. Previous experiments with DCE, like the study by Stolk et al. using EQ-5D-3L [28] or Ramos-Goñi et al. using EQ-5D-5L [29], showed that the DCE is a valid valuation technique to get health preferences from respondents. Since both TTO and DCE try to measure the same concept, it was anticipated that DCE could be used in combination with TTO [30]. In the light of this reasoning, DCE was included in the EuroQol VT protocol.

Each DCE task was conducted by presenting two health states and asking the respondent to select the preferred state for him/her. The DCE design consisted of 196 pairs of EQ-5D-5L health states distributed over 28 blocks, each consisting of seven pairs with a similar severity [19]. The seven paired comparisons were presented in random order by the EQ-VT; in addition, the right–left order of the two health states presented was also randomized.

2.3 Data Collection

At the outset, 13 interviewers were recruited and trained intensively in a 1-day workshop at two locations: (1) Jakarta for interviewers who worked in Jakarta, Bandung and Makassar; and (2) Jogjakarta for interviewers who worked in Jogjakarta, Surabaya and Medan. Each interviewer performed at least five pilot interviews in the week after training. Their experiences were discussed and feedback was given by the daily supervisor. Only after this were they permitted to conduct real data interviews. Three additional interviewers were hired during the data collection and they received similar training and met similar requirements to the first 13. Interviews were performed between March 9, 2015 and January 24, 2016. After 102 interviews we evaluated the quality of the interviews (see Sect. 2.5 below) and we concluded that their quality was not yet sufficient. Hence we retrained the interviewers and treated the 102 interviews collected thus far as pilot interviews, excluding them from the data analysis. A detailed description of this decision-making process and the retraining of the interviewers is provided elsewhere [30].
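As a worked illustration of the C-TTO scoring rule described in Sect. 2.2.3, the two formulas can be written as a short Python sketch. The function name `ctto_value` and its arguments are hypothetical and are not part of the EQ-VT software; the sketch assumes only what the text states, namely x/10 for conventional TTO and (x - 10)/10 for lead-time TTO, with x between 0 and 10 years.

```python
def ctto_value(x_years: float, lead_time: bool) -> float:
    """Health state value at the respondent's indifference point.

    x_years   -- years of full health in Life A at indifference (0 to 10)
    lead_time -- False for conventional TTO (better-than-dead states),
                 True for lead-time TTO (worse-than-dead states)
    """
    if not 0 <= x_years <= 10:
        raise ValueError("x_years must lie between 0 and 10")
    if lead_time:
        # Lead-time TTO: values run from -1 (x = 0) up to 0 (x = 10).
        return (x_years - 10) / 10
    # Conventional TTO: values run from 0 (x = 0) up to 1 (x = 10).
    return x_years / 10


# Example from the text: indifference at 3 years of full health.
print(ctto_value(3, lead_time=False))   # 0.3
# Trading all 10 years in the lead-time task gives the floor of the scale.
print(ctto_value(0, lead_time=True))    # -1.0
```

The floor of -1 in the second call is the same bound that the later Tobit modelling treats as a censoring point.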

2.4 Exclusion Criteria

There were two main criteria for data exclusion: lack of completion of an interview and characteristics of respondents’ answers that related to poor understanding of the task or to errors [31]. Note that the first criterion concerns excluding respondents and the second excludes respondents’ responses.

With respect to the first criterion, interviews were excluded when respondents did not finish the interview for the following reasons: (1) the respondent indicated that he/she did not want to continue the interview process, (2) interviewers concluded that the respondent was unable to differentiate between the different dimensions and levels of EQ-5D-5L, (3) interviewers concluded that the respondent was not able to comprehend the C-TTO task during the practice session. When an interview had to be stopped during the C-TTO task it was excluded from the study.

With respect to the second criterion, completed interview responses were excluded on account of any of the following characteristics: (1) a respondent had a positive slope on the regression between his/her values on C-TTO and the ‘sum score of the level digits’, as this would indicate that the respondent provided higher utility values for poorer health states on average (the slope of the regression between C-TTO and the ‘sum score of the level digits’ was generated as part of the standard quality control report); (2) when a response in the C-TTO tasks was judged to be irrational: for instance, preferring life B (10 years in the corresponding health state) to life A (10 years in full health) and not shifting after his/her initial response was reconfirmed by the interviewer; (3) responses that were marked by the respondents in the Feedback Module task, which was a sign that the respondents disagreed with the valuation of those responses.

2.5 Quality Control

To ensure data quality, we followed the quality control (QC) process described by Ramos-Goñi et al. [32], which consisted of minimum quality criteria and cyclical feedback to improve interviewers’ skills. The EuroQol Group facilitates use of the EQ-VT QC tool, which is a software programme that automates the production of QC reports based on data from EQ-VT studies. Bi-weekly meetings (teleconference-based) were organized to discuss the QC reports with the EQ-VT support team. The aim of these meetings was to evaluate and improve the interviewers’ performance and to check for possible non-compliance with the interview protocol.

2.5.1 Minimum Quality Criteria

The QC reports provided a number of statistics related to the quality of the data collected thus far, differentiated by interviewer.

1. Wheelchair time: when the duration of time an interviewer used to explain the ‘wheelchair example’ preceding the actual C-TTO tasks was <3 min.
2. Wheelchair lead-time: when the interviewer did not explain the WTD element of the wheelchair example.
3. C-TTO duration: if completing the ten C-TTO tasks took <5 min.
4. Inconsistency: the value for state ‘55555’ was not the lowest and it was at least 0.5 higher than that of the state with the lowest value.

If any of the four above-mentioned signs are observed, the interview is ‘flagged’ as being of suspicious quality. If four or more of the interviews are flagged as being of poor quality, all ten interviews thus far conducted by that specific interviewer are removed and retraining of that interviewer is conducted. After a further ten interviews, the performance and compliance are re-evaluated. If again four or more interviews are flagged, the next set of ten interviews will also be removed and the interviewer is removed from the data collection process. Quality control focused on the interviewer; responses in flagged interviews were not removed from the data that was analysed.

The DCE part of the valuation study was also monitored to detect suspicious response patterns. Assuming that A is the health state at the left of the screen and B is the health state at the right of the screen, then a consistent preference for the left (A) would be suspicious (AAAAAAA). The same would apply for the response patterns BBBBBBB, ABABABA and BABABAB. This was also reported in the QC reports.

2.5.2 Cyclical Feedback

The retraining programme conducted by the daily supervisor was held in two locations: (1) Jakarta for interviewers who worked in Jakarta, Bandung and Makassar; and (2) Jogjakarta for interviewers who worked in Jogjakarta, Surabaya and Medan. The QC reports for their interviews were presented, discussions were held to address non-compliance problems, and suitable solutions were agreed upon among the interviewers. After the retraining programme, the daily supervisor continuously created QC reports, made notes at the group and individual levels, and sent feedback to the interviewers, so that they were able to learn from their own and other interviewers’ performance.
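The minimum quality criteria of Sect. 2.5.1 can be read as a small rule set, sketched below in Python. This is not the EQ-VT QC tool: the `Interview` record, its field names and the helper functions are hypothetical and exist only for illustration. The thresholds (3 min, 5 min, a 0.5 gap for state ‘55555’, four flagged interviews per batch of ten) and the four suspicious DCE patterns are taken from the text.

```python
from dataclasses import dataclass
from typing import Dict, List

# DCE answer strings the text treats as suspicious (A = left, B = right).
SUSPICIOUS_DCE = {"AAAAAAA", "BBBBBBB", "ABABABA", "BABABAB"}


@dataclass
class Interview:
    """Hypothetical per-interview record; field names are assumptions."""
    wheelchair_minutes: float      # time spent on the wheelchair example
    lead_time_explained: bool      # was the WTD element explained?
    ctto_minutes: float            # time taken for the ten C-TTO tasks
    ctto_values: Dict[str, float]  # health state -> value, includes '55555'
    dce_choices: str               # seven choices, e.g. 'ABBABAA'


def is_flagged(iv: Interview) -> bool:
    """True if any of the four minimum quality criteria is violated."""
    lowest = min(iv.ctto_values.values())
    # A gap of at least 0.5 also implies '55555' is not the lowest state.
    inconsistent = iv.ctto_values["55555"] - lowest >= 0.5
    return (
        iv.wheelchair_minutes < 3
        or not iv.lead_time_explained
        or iv.ctto_minutes < 5
        or inconsistent
    )


def needs_retraining(batch: List[Interview]) -> bool:
    """True if four or more interviews in a batch of ten are flagged."""
    return sum(is_flagged(iv) for iv in batch) >= 4


def has_suspicious_dce(iv: Interview) -> bool:
    """True if the seven DCE choices follow one of the listed patterns."""
    return iv.dce_choices in SUSPICIOUS_DCE
```

In this sketch the DCE check is kept separate from `is_flagged`, mirroring the text, where suspicious DCE patterns are reported in the QC reports but are not one of the four flagging criteria.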

2.6 Data Analysis

We describe the sample characteristics including self-reported health on the EQ-5D-5L descriptive system and the EQ-VAS using percentages for discrete variables and means and standard deviations for continuous variables in comparison with the Indonesian population. A general z test was used to investigate whether the proportions in the sample were similar to, or different from, the general population.

In this investigation we used TTO (specifically C-TTO) and DCE. TTO has limitations such as loss aversion [33], but also has advantages as the TTO-based value sets are anchored on a scale of (0) death to (1) full health. DCE is not exempt from limitations, as lexicographic behaviour from respondents has been widely reported in the literature [34]. It is also noticeable that DCE, in its present form, where time is not incorporated in health state presentations, does not anchor value sets on a (0) death to (1) full health scale. Therefore, DCE produces value sets on an arbitrary scale based on the relative distances between health states.

However, both techniques attempt to measure health state preferences, but using different underlying assumptions, and seem not to share the same limitations. Therefore, the data obtained from these two elicitation methods could be seen as complementary, not necessarily competing with each other. Hence, we chose the solution presented by Oppe and van Hout [35], who combined DCE with C-TTO in a ‘hybrid model’, imposing the (0) death to (1) full health scale as determined by C-TTO.

To illustrate how the hybrid model combined C-TTO and DCE responses in this study, we also present the results from the models estimated from each of C-TTO and DCE separately, with the same assumptions as those used for the hybrid model. We used the 20-parameter main effects model, which estimates four parameters for the five levels of each of the five dimensions: mobility, self-care, usual activities, pain/discomfort and anxiety/depression. Each coefficient represents the additional utility decrement of moving from one level to another. Hence, the overall decrement of moving from ‘no problems’ to ‘unable/extreme problems’ is calculated as the sum of the coefficients of ‘no problems to slight problems’, ‘slight problems to moderate problems’, ‘moderate problems to severe problems’, and ‘severe problems to unable/extreme problems’. Presenting the TTO, the DCE and the hybrid model also allows us to compare the value distribution in the form of the correlations between the predicted values of the models, and we can compare the weights of the individual dimensions. This gives information about construct validity in the form of ‘convergent validity’, or ‘concordance’.

Modelling was undertaken using the STATA statistical package. C-TTO data were modelled using the response values as dependent variables and the health states as explanatory variables. This was achieved by the implementation of a Tobit model (hyreg with ll() option), which assumes a latent variable Y*_it underlying the observed Y_it of C-TTO values when there is either left- or right-censoring in the dependent variable. The C-TTO data, in particular the lead-time C-TTO for WTD health states, are by nature censored at -1 [ll(-1) option on hyreg command]. This means that observed preference values were valued by the C-TTO method at -1, despite the latent preferences of respondents possibly including values lower than -1 [36]. The Tobit model accounts for this censoring by estimating the latent variable Y*_it, which can take on predicted preference values extrapolated beyond the range of the observed values. Variance of C-TTO data is not homogeneous among health states; this led us to model C-TTO data as heteroskedastic data. We used the hetcont() option of the hyreg command as suggested by Ramos-Goñi et al. [37]. The dummy variables included in the hetcont() option were the same as those included in the main model, that is, the 20 dummies that specified the main effects model.

DCE (forced pair comparisons in our case) responses were modelled as a conditional logistic regression model including the same 20 dummy parameters as those used for the C-TTO data. Nevertheless, we did not use the coefficients estimated from a conditional logit model because they were expressed on a latent arbitrary utility scale. We rescaled the DCE coefficients using the same parameter θ that was estimated in the hybrid model. This rescaling assumes that the C-TTO model coefficients are proportional to the DCE model coefficients. For more details on the modelling see Ramos-Goñi et al. [23, 37].

Pearson product-moment correlation analysis was applied to measure the strength and direction of association that exists between C-TTO, DCE rescaled and hybrid predicted values for 3125 health states.

3 Results

3.1 Respondent Characteristics

In total, 1056 of 1117 respondents who were approached after the retraining of the interviewers completed the interview. Reasons for interview failure were refusal to participate (36, 3.2%), conflicting schedules (14, 1.25%), discontinuation of the interview at the respondent’s request (10, 0.89%), and discontinuation of the interview by the interviewer’s decision because of the respondent’s lack of understanding (1, 0.09%). From the remaining 1056 respondents, we excluded two respondents who had a positive slope on the regression between their values on
C-TTO and the sum score of the level digits of the health states, indicating that the respondent provided higher utility values for poorer health states on average, leaving 1054 respondents in the final sample. No interviewers were removed because of persistent low-quality data.

Characteristics of the respondents in the final sample were similar to those of the Indonesian population in terms of residence, gender, and religion. There were some statistically significant differences in some of the age groups, education levels, and ethnicities, but the absolute differences are small as these are <4% (Table 1).

Table 1 Characteristics of the study respondents/general population

Characteristics    Study sample (N = 1054)    Indonesian general     Differences (%)
                   n (%)                      populationᵃ (%)
Residence
  Urban            549 (52.09)                53.30                  -1.21
  Rural            505 (47.91)                46.70                  +1.21
Gender
  Female           526 (49.91)                49.65                  +0.26
  Male             528 (50.09)                50.35                  -0.26
Age
  17–19            159 (15.09)*               12.35                  +2.74
  20–29            236 (22.39)                24.37                  -1.98
  30–39            264 (25.05)                22.68                  +2.37
  40–49            180 (17.08)                18.08                  -1.00
  50–59            164 (15.56)*               11.84                  +3.72
  60–69            43 (4.08)*                 6.36                   -2.28
  70+              8 (0.76)*                  4.31                   -3.55
Education
  Low              339 (32.16)*               35.18                  -3.02
  Middle           550 (52.18)                51.72                  +0.46
  High             165 (15.65)*               13.10                  +2.55
Religion
  Islam            920 (87.29)                87.18                  +0.11
  Christian        103 (9.77)                 9.86                   -0.09
  Others           31 (2.94)                  2.96                   -0.02
Ethnicity
  Jawa             441 (41.84)                40.22                  +1.62
  Sunda            199 (18.88)*               15.50                  +3.38
  Sumatera         128 (12.14)*               15.02                  -2.88
  Sulawesi         63 (5.98)*                 8.09                   -2.11
  Madura-Bali      52 (4.93)                  4.70                   +0.23
  Others           171 (16.22)                16.47                  -0.25

* Significant difference at α = 0.05 from z test
ᵃ Data from Indonesian Bureau of Statistics (BPS)

3.2 Self-Reported Health Problems

Table 2 shows that the highest proportion of health problems was reported in the pain/discomfort dimension (39.66% reported ‘any problems’) and the lowest in the self-care dimension (1.89%). From the final sample, 464 (44.02%) reported no health problems on any dimension (‘11111’).

3.3 Data Characteristics

The 1054 respondents provided 10,540 C-TTO observations (respondents valued 10 health states each). We excluded 45 observations because they were ‘irrational answers’: preferring life B (10 years in the corresponding health state, which is worse than full health) to life A (10 years in full health) and not shifting after his/her initial response was reconfirmed by the interviewer. Furthermore, 1033 observations that were flagged by the respondents in the Feedback Module task were removed. Accordingly, the C-TTO dataset contained 9462 observations. Of these, 187 (1.97%) observations had the value 0, and another 3349 (35.39%) were negative values (see Fig. 1 for the histogram of the observed C-TTO values). The 86 observed mean C-TTO values ranged from -0.719 for state ‘55555’ to 0.909 for state ‘12111’. The mean observed values were negative for 29 health states out of 86 used in the C-TTO design (see Online resource 1 in the electronic supplementary material).

The DCE dataset comprised 7378 observations (all respondents completed seven paired comparisons). Twenty respondents (1.89%) answered with suspicious patterns: AAAAAAA (always chose the health state at the left of the screen), BBBBBBB (always chose the health state at the right of the screen), ABABABA or BABABAB; however, their responses were not excluded from the final dataset.

3.4 Modelling Results

There were 657 (6.92%) left-censored C-TTO observations, that is, observations where the respondent gave the lowest possible value (-1) for a health state in the C-TTO task. The Tobit C-TTO model results were logically consistent. Conditional logistic regression was used to model the DCE responses, which were also logically consistent (we used the rescaled DCE coefficients). C-TTO and rescaled DCE predicted values for 3125 health states were correlated, as Fig. 2a shows (r = 0.9881, p < 0.0001). Table 3 shows that both sets of coefficients were in relative agreement; that is, the most important dimension was mobility and the least important was pain/discomfort. The hybrid model, which utilized both C-TTO and DCE data, was also in relative agreement with both C-TTO and DCE models. Figure 2b, c
1160 F. D. Purba et al.

Table 2 Self-reported health EQ-5D-5L descriptive system with scores in %

using the EQ-5D-5L descriptive
system and the EQ VAS Mobility Self-care Usual activities Pain/discomfort Anxiety/depression

No problems 92.03 98.11 89.18 60.34 65.75

Slight problems 6.74 1.71 9.68 36.53 28.18
Moderate problems 1.04 0.09 1.14 2.56 5.50
Severe problems 0.19 0.09 0.00 0.57 0.38
Unable/extreme problems 0.00 0.00 0.00 0.00 0.19
Mean SD 25th percentile Median 75th percentile

VAS score 79.38 14.01 70.00 80.00 90.00

EQ EuroQol, VAS visual analogue scale

show a high correlation of the hybrid predicted utilities with those predicted from the
C-TTO model (r = 0.995, p < 0.0001) and the rescaled DCE model (r = 0.997, p < 0.0001).

The hybrid model with main effects was logically consistent (Table 3). Using this as the
final model to obtain values for the 3125 EQ-5D-5L health states, the maximum value was
1.000 for full health (health state ‘11111’), followed by the health state ‘11112’ with
value 0.921. The minimum value was -0.865 for the ‘55555’ state. Of the 3125 health states,
1108 (35.46%) had negative values using the hybrid model. The coefficients from the hybrid
model were also in agreement with the previous two models regarding mobility appearing as
the most important dimension and pain/discomfort as the least important.

To obtain the utility for an EQ-5D-5L health state, for instance ‘12345’, the following
calculation based on the hybrid model (final value set) is needed:

Utility weight (‘12345’) = 1
  - no problems in MO (0)
  - no problems to slight problems in SC (0.101)
  - no problems to slight problems in UA (0.090)
  - slight problems to moderate problems in UA (0.066)
  - no problems to slight problems in PD (0.086)
  - slight problems to moderate problems in PD (0.009)
  - moderate problems to severe problems in PD (0.103)
  - no problems to slight problems in AD (0.079)
  - slight problems to moderate problems in AD (0.055)
  - moderate problems to severe problems in AD (0.093)
  - severe problems to extreme problems in AD (0.078)
  = 0.240.

Note that each coefficient represents the additional utility decrement of moving from one
level to the next.

Fig. 1 Observed C-TTO values. C-TTO composite time trade-off

Fig. 2 a Comparison of C-TTO and DCE rescaled predicted utilities. b Comparison of C-TTO and
hybrid predicted utilities. c Comparison of DCE rescaled and hybrid predicted utilities.
C-TTO composite time trade-off, DCE discrete choice experiment

4 Discussion

The aim of this study was to obtain social preferences and thus derive an EQ-5D-5L value set
from the Indonesian general population. To obtain values for 3125 EQ-5D-5L health states,
1054 respondents were interviewed using the computer-assisted EuroQol Group valuation
protocol. C-TTO and DCE were part of the protocol employed in six cities and their
surrounding areas. We used an iterative quality control approach in order to obtain
high-quality data. The socio-demographic characteristics of the respondents were similar to
those of the Indonesian population with respect to residence, gender, age, level of
education, ethnicity, and religion. This makes EQ-5D-5L suitable for health economic
evaluations that will benefit the national health insurance scheme. Furthermore, non-HTA
studies in Indonesia such as those using patient-reported outcome measures (PROMs), clinical
trials or improvements in hospital care could use EQ-5D-5L as an instrument to measure
HRQOL, with the notion that the values attached to the health states are societal values.

Several limitations of this study should be considered. It could be argued that there are
still statistically significant differences in the distribution of background variables in
the sample compared with the data provided by the National Bureau of Statistics. There are
statistically significant differences, but these are small, and limited to some age groups,
some education levels, and some ethnicity groups. As a check to see if such small
differences were of importance, we compared observed C-TTO values
for each health state between respondents with different levels of age, education, and
ethnicity. There was no clear pattern of differences in the health state values. Moreover,
as can be seen in Table 1, the percentage deviations were small and statistical significance
should be seen in the light of the statistical power of more than 1000 respondents. Given
these observations, and given that weighting for background variables would add additional
complexity, we chose not to introduce weighting for these small deviations from full
representativeness.

The strategy of finding respondents using personal networks of the interviewers and the
respondents could raise questions about the objectivity/representativeness of the study
sample. Yet we preferred this way of recruitment in order to find respondents who fit into
the pre-determined quota groups because we judged it to be a lesser problem than
insufficiently filled categories in the quota sampling. The quota groups were determined on
the variables that were considered to be important in defining representativeness. In that
respect, we have constructed a representative sample based on pre-determined variables:
rural/urban, gender, age, level of education, religion and ethnicity. A further
investigation could be conducted to find out whether recruiting respondents via personal
networks of interviewers and/or respondents is preferable or acceptable.

Indonesia has five major islands that are inhabited by 93.5% of the population [1]. However,
92.9% of respondents interviewed in this study were living on Java Island. This might raise
questions about the representativeness of the study sample. However, we focused the data
collection on Java Island because it is the most populous island (57% of the population) and
the main target of migration from all over Indonesia. The diversity of its residents in
terms of ethnicity helps to fulfil all the categories in our quota sampling in a
cost-effective way. We do not know whether the values obtained in Java from these migrants
would have differed from the values that would have been obtained had the interviews been
conducted on their original islands. One way to investigate whether location is indeed an
issue in valuing health in
Indonesia would be to sample values for health states at different places/islands in the
republic. For instance, the same health states could be valued in Aceh (west), Java (middle)
and Papua (east). Such a study could then be used to provide the motivation for additional
studies that sample the values for people living in other parts of the archipelago. For the
time being, we conclude that the present value set is the most representative value set for
the EQ-5D-5L now available for Indonesia.

Several study findings are worth highlighting. First, this is the first study in Asia to
have used the hybrid model to maximize information obtained from C-TTO and DCE.

Table 3 Estimation results for C-TTO model, DCE rescaled model, and hybrid model

Independent variables of the model      C-TTO Tobit model,        DCE conditional logistic  Hybrid model, censored C-TTO
                                        censored at -1            model, rescaled           values at -1 (final value set)
                                        Coeff. (SE)     p value   Coeff. (SE)     p value   Coeff. (SE)     p value
Mobility (MO)
  No problems to slight problems        0.088 (0.015)   0.000     0.139 (0.015)   0.000     0.119 (0.008)   0.000
  Slight problems to moderate problems  0.086 (0.017)   0.000     0.080 (0.017)   0.000     0.073 (0.011)   0.000
  Moderate problems to severe problems  0.250 (0.019)   0.000     0.196 (0.016)   0.000     0.218 (0.013)   0.000
  Severe problems to unable             0.170 (0.018)   0.000     0.219 (0.018)   0.000     0.203 (0.012)   0.000
Self-care (SC)
  No problems to slight problems        0.085 (0.014)   0.000     0.101 (0.016)   0.000     0.101 (0.007)   0.000
  Slight problems to moderate problems  0.056 (0.018)   0.002     0.038 (0.018)   0.032     0.039 (0.010)   0.000
  Moderate problems to severe problems  0.128 (0.018)   0.000     0.085 (0.019)   0.000     0.108 (0.013)   0.000
  Severe problems to unable             0.035 (0.016)   0.030     0.097 (0.017)   0.000     0.068 (0.012)   0.000
Usual activities (UA)
  No problems to slight problems        0.071 (0.015)   0.000     0.092 (0.016)   0.000     0.090 (0.006)   0.000
  Slight problems to moderate problems  0.106 (0.017)   0.000     0.051 (0.017)   0.003     0.066 (0.011)   0.000
  Moderate problems to severe problems  0.137 (0.019)   0.000     0.154 (0.017)   0.000     0.145 (0.013)   0.000
  Severe problems to unable             0.061 (0.018)   0.001     0.091 (0.017)   0.000     0.084 (0.013)   0.000
Pain/discomfort (PD)
  No problems to slight problems        0.089 (0.013)   0.000     0.081 (0.016)   0.000     0.086 (0.006)   0.000
  Slight problems to moderate problems  0.007 (0.019)   0.721     0.012 (0.018)   0.513     0.009 (0.011)   0.395
  Moderate problems to severe problems  0.135 (0.018)   0.000     0.085 (0.017)   0.000     0.103 (0.013)   0.000
  Severe problems to extreme problems   0.024 (0.019)   0.211     0.053 (0.018)   0.003     0.048 (0.013)   0.000
Anxiety/depression (AD)
  No problems to slight problems        0.079 (0.014)   0.000     0.050 (0.017)   0.003     0.079 (0.006)   0.000
  Slight problems to moderate problems  0.055 (0.018)   0.002     0.061 (0.017)   0.000     0.055 (0.011)   0.000
  Moderate problems to severe problems  0.086 (0.017)   0.000     0.114 (0.018)   0.000     0.093 (0.012)   0.000
  Severe problems to extreme problems   0.062 (0.016)   0.000     0.085 (0.018)   0.000     0.078 (0.012)   0.000
Log likelihood                          -6189.97                  -3958.62                  -9325.84
AIC                                     12,421.93                 7957.24                   18,735.69
BIC                                     12,572.19                 8109.23                   19,060.41
Examples of estimated utility values
  U(21111)                              0.912                     0.861                     0.881
  U(31111)                              0.826                     0.781                     0.808
  U(41111)                              0.576                     0.585                     0.590
  U(51111)                              0.406                     0.366                     0.387
  U(12345)                              0.225                     0.268                     0.240
  U(21231)                              0.745                     0.676                     0.696
  U(55555)                              -0.810                    -0.884                    -0.865

AIC Akaike information criteria, BIC Bayesian information criteria, C-TTO composite time
trade-off, DCE discrete choice experiments, SE standard error
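As an illustration of how the hybrid-model coefficients in Table 3 translate into utilities, the following is a minimal Python sketch, not the authors' own software. It assumes only the main-effects structure described in Sect. 3.4: each coefficient is the additional decrement for moving up one level, and the decrements are subtracted from 1. The names `HYBRID_DECREMENTS`, `utility` and `count_negative_states` are hypothetical and chosen for this example only.

```python
from itertools import product

# Incremental utility decrements for level 1->2, 2->3, 3->4 and 4->5,
# copied from the hybrid model (final value set) column of Table 3.
HYBRID_DECREMENTS = {
    "MO": (0.119, 0.073, 0.218, 0.203),  # mobility
    "SC": (0.101, 0.039, 0.108, 0.068),  # self-care
    "UA": (0.090, 0.066, 0.145, 0.084),  # usual activities
    "PD": (0.086, 0.009, 0.103, 0.048),  # pain/discomfort
    "AD": (0.079, 0.055, 0.093, 0.078),  # anxiety/depression
}
DIMENSIONS = ("MO", "SC", "UA", "PD", "AD")


def utility(state: str) -> float:
    """Return the utility of a five-digit EQ-5D-5L state such as '12345'."""
    if len(state) != 5 or any(ch not in "12345" for ch in state):
        raise ValueError(f"not a valid EQ-5D-5L state: {state!r}")
    total_decrement = 0.0
    for dim, ch in zip(DIMENSIONS, state):
        level = int(ch)
        # Level k accumulates the first k - 1 incremental decrements.
        total_decrement += sum(HYBRID_DECREMENTS[dim][: level - 1])
    # Round to three decimals, the precision used in Table 3.
    return round(1.0 - total_decrement, 3)


def count_negative_states() -> int:
    """Count how many of the 3125 states have a utility below zero."""
    states = ("".join(digits) for digits in product("12345", repeat=5))
    return sum(1 for s in states if utility(s) < 0)


if __name__ == "__main__":
    for s in ("11111", "11112", "12345", "21231", "55555"):
        print(s, utility(s))
    print("negative states:", count_negative_states())
```

The example states reproduce the hybrid-model values listed in Table 3 (for instance 0.240 for ‘12345’ and -0.865 for ‘55555’). Because the table reports coefficients rounded to three decimals, the count of negative states obtained this way may differ slightly from the 1108 reported in the text.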

The models demonstrated logical consistency and significant regression coefficients. A
likely reason is that the data were of high quality, which was assured by (1) the extensive
use of the QC report provided by the EuroQol Group, and (2) the retraining programme
conducted after dropping the first 102 interviews owing to their poor quality [30]. The QC
report identified the first 102 interviews as problematic; indeed, further analysis using
the hybrid model demonstrated that the results of these interviews showed logical
inconsistencies in the self-care and pain/discomfort dimensions, together with a regression
coefficient that was not significant for pain/discomfort level 4 (p = 0.179). The lesson
learned here is that even sophisticated models profit from high-quality data.

Second, the Indonesian results present more negative values than any other EQ-5D-5L
valuation study undertaken so far (i.e. in the UK, the Netherlands, Canada, Uruguay, Japan
and Korea [38–43]). It could be argued that the high number of negative values is the result
of interaction between a process-related factor (the quality control process) and a
culture-related factor (interdependence among the members of a society, i.e. collectivism vs
individualism). This study implemented the quality control process rigorously. It is
possible that this quality control process provides the interviewer with better feedback and
therefore better skills to administer the complex worse-than-dead (WTD) trade-offs.
Therefore, the more valid administration of the C-TTO means that more interviewers follow
the protocol, which could have led to a higher proportion of negative values. The cultural
factor, namely collectivism, might also play a role. People from collectivistic cultures,
such as Indonesia, are more concerned with how their illness might affect their closest
circles such as family and friends [44]. Moreover, they are more reluctant to explicitly ask
for help [45]. Some comments from our respondents support this: having severe or
extreme/unable problems in EQ-5D dimensions was very bad for them individually, but would
also be a burden for their closest circles (family and friends). Other respondents said they
would prefer to die rather than bother their families and friends when they have a severe
illness. The EQ-5D-3L value set of Singapore, a neighbouring country of Indonesia and a
collectivistic country as well, assigned a value of -0.769 to the all-worst state ‘33333’
[46]. When more national valuation studies are published, it will be possible for a further
investigation to disentangle the effect of these factors on the proportion of WTD values in
an EQ-5D-5L valuation study.

Third, we had a low level of non-response: only 61 of the 1117 respondents. Our recruitment
strategy, which involved local leaders and asking respondents to recommend our study to
other people, contributed to this low number.

Fourth, this study was performed according to the EuroQol Group’s EQ-5D-5L valuation
protocol. Hence, the results are comparable to findings obtained in other countries. The
final Indonesian value set shows that the mobility dimension influenced utility estimates
the most, similar to EQ-5D-5L valuation study results from Uruguay and South Korea [41, 43].
The pain/discomfort dimension had the least influence on utility estimates, quite the
opposite of the EQ-5D-5L value sets of England and the Netherlands where this dimension was
in the top two most influential, after anxiety/depression [38, 39]. Perhaps this was
because, in countries such as the Netherlands and the UK, problems with mobility had less
influence due to better infrastructure provision and less emphasis on manual labour. It
could also be argued that Indonesian people have adapted to mild levels of pain or
discomfort, or perhaps they considered a mild level of pain or discomfort something they
have to live with. The same line of reasoning applies to anxiety/depression. Indonesian
people report more problems with anxiety/depression and have adapted to these mild levels of
anxiety/depression, or they consider this as part of normal life. It could also be a result
of small differences in translation. If the translated Indonesian words for depression and
anxiety refer to a lighter problem, then it makes sense that the prevalence was higher and
the disutility lower. Indeed, there are some indications that this was the case. In the
Indonesian EQ-5D translation, the word ‘sedih’, which might also be translated as ‘sadness’,
is added to the description of the anxiety/depression dimension. These kinds of interactions
between the description of the dimensions and the values attached justify attempts to
utilize local and linguistically matched value sets for utility questionnaires such as
EQ-5D. If not, value sets based on other languages might apply the wrong (higher) utilities
to the descriptors. For instance, it is now clear that one cannot use the UK value for
anxiety/depression for the Indonesian descriptor with an additional word ‘sadness’.

Several policy implications of the present study can be considered. The finding that the
mobility dimension most affects utility could inform Indonesian government policies, such as
allocating more funds to the prevention of diabetic foot ulcers or other interventions that
improve mobility, like better wheelchairs. Moreover, the anxiety and depression problems
reported should be addressed. If so, the discussion concerning the translation of the
anxiety and depression dimension mentioned in the paragraph above should be taken into
account. If indeed anxiety and depression are such common afflictions in Indonesia, mental
health treatment by professionals such as psychologists and psychiatrists within the
national health insurance scheme should be considered.

Indonesia is endeavouring to implement HTA comprehensively. The present research shows that
in measuring and valuing quality of life, Indonesia bears comparison

References
