Review of Financial Economics 26 (2015) 25–35

Contents lists available at ScienceDirect

Review of Financial Economics

journal homepage: www.elsevier.com/locate/rfe

The wages of social responsibility — where are they? A critical review of
ESG investing
Gerhard Halbritter ⁎, Gregor Dorfleitner 1
University of Regensburg, Department of Finance, Universitätsstraße 31, 93053 Regensburg, Germany

Article info

Article history:
Received 22 December 2014
Received in revised form 18 February 2015
Accepted 24 March 2015
Available online 2 April 2015

JEL classification:
G10
G11
G12

Keywords:
ESG
Environmental social governance
Socially responsible investing
Corporate social performance
Financial performance

Abstract

This paper contributes both to investigating the link between the corporate social and financial performance based on environmental, social and corporate governance (ESG) ratings and to reviewing the existing empirical evidence pertaining to this relationship. The sample used includes ESG data of ASSET4, Bloomberg and KLD for the U.S. market from 1991 to 2012. The econometrical framework applies an ESG portfolio approach using the Carhart (1997) four-factor model as well as cross-sectional Fama and MacBeth (1973) regressions. Previous empirical research indicates a relationship between ESG ratings and returns. In contrast, the ESG portfolios do not show a significant return difference between companies with high and low ESG ratings. Although the Fama and MacBeth (1973) regressions reveal a significant influence of several ESG variables, investors are hardly able to exploit this relationship. The magnitude and direction of the impact are substantially dependent on the rating provider, the company sample and the particular subperiod. The results suggest that investors should no longer expect abnormal returns by trading a difference portfolio of high and low rated firms with regard to ESG aspects.

© 2015 Elsevier Inc. All rights reserved.

1. Introduction

Several studies examine the relationship between a company's social and financial performance based on environmental, social and corporate governance ratings, or ESG ratings for short, mostly finding a positive connection. This paper contributes to this subject by revisiting the evidence from the perspective of a U.S. investor utilizing the data of three different ESG ratings. By analyzing the problem with a sample consisting of data up to 2012 and by applying an econometrically sound methodology, we show that the positive effects reported earlier depend both on the rating concept used and on the time interval in which the observations were made.

Throughout the last decade, Socially Responsible Investments (SRIs) have experienced an impressive development. According to the U.S. Forum for Sustainable and Responsible Investments USSIF (2012), SRIs account for more than 20% of the global capital market. Assets exceeding $30 trillion are based on SRI principles. These figures reveal the outstanding importance of SRIs for both investors and researchers. Apart from the corporate financial performance (CFP), private and institutional investors are increasingly interested in the corporate social performance (CSP) of a particular company. Firms are being ever more encouraged to consider non-monetary goals.

Nevertheless, most investors consider social concerns just as a side condition while a maximization of the return is still the primary objective. In this context the question of whether there is a link between the financial and social performance of a company arises. Even investors without non-monetary interests could exploit a potential relationship to gain abnormal returns. A meta-analysis of Orlitzky, Schmidt, and Rynes (2003) finds that the CFP is positively correlated with the CSP while the dependence is bidirectional and simultaneous. Wallis and Klein (in press) also suggest that there is a certain amount of evidence for an outperformance of socially responsible over conventional investments.

The oldest line of SRI research is concerned with a comparison of the performance of conventional and SRI funds. Most studies in this field, such as Hamilton, Jo, and Statman (1993), Statman (2000), Bauer, Koedijk, and Otten (2005), Bello (2005), Kreander, Gray, Power, and Sinclair (2005), and Utz and Wimmer (2014), do not indicate significant performance differences. A number of papers, such as Sauer (1997), Statman (2000), Schröder (2004), Statman (2006), Schröder (2007), and Lee and Faff (2009), do not provide evidence of an out- or underperformance of SRI indices compared to conventional indices.

CFP: Corporate financial performance
CSP: Corporate social performance
SRI: Socially Responsible Investments

⁎ Corresponding author. Tel.: +49 941 943 2684.
E-mail addresses: [email protected] (G. Halbritter), Gregor.Dorfl[email protected] (G. Dorfleitner).
1 Tel.: +49 941 943 2684.

http://dx.doi.org/10.1016/j.rfe.2015.03.004
1058-3300/© 2015 Elsevier Inc. All rights reserved.
26 G. Halbritter, G. Dorfleitner / Review of Financial Economics 26 (2015) 25–35

A recent analysis of Belghitar, Clark, and Deshmukh (2014) finds that there is no difference regarding the expected returns and their variance. However, SR investors pay a high price in terms of utility if higher moments are taken into account.

Utz and Wimmer (2014) state that SRI mutual funds do not, on average, hold socially responsible firms to a greater extent than conventional funds do. As a result, it becomes questionable whether packaged SRI products are at all suitable for investigating the link between the social and the financial performance. In this context, ESG ratings allow for a more appropriate approach, as they provide a direct measure of the CSP at company level. Specialized rating agencies define a certain set of criteria incorporating a variety of sustainability issues. Each corporation in the rating universe receives a specific rating score. In contrast to SRI funds and indices, ESG ratings result in large panel data sets which offer a considerably more precise understanding of how sustainability aspects influence a firm's return. The related literature discussed in Section 2 provides evidence that investors are able to attain abnormal returns by going long in corporations with high ESG scores and short in corporations with low ESG scores.

This paper contributes to reviewing this existing empirical evidence based on ESG ratings pertaining to the link between the CSP and CFP of a company. Most studies focus on one special ESG rating database. However, it is possible that implications related to returns are dependent on the underlying rating approach. Dorfleitner, Halbritter, and Nguyen (2014) reveal significant differences in distribution, level and risk of various ESG rating concepts. Furthermore, most studies are based on very short time-series since most rating agencies did not commence their work before the beginning of the last decade.

This is the first paper to compare the impact of sustainability issues on a firm's return with respect to three ESG rating providers, namely ASSET4, Bloomberg and KLD. As a part of Thomson Reuters, ASSET4 is one of the largest ESG rating agencies. The data set starts in 2002 and includes more than 1000 U.S. companies. The Bloomberg sample starts in 2005 and also contains over 1000 U.S. firms. Similar studies mostly apply the ESG scores according to Kinder, Lydenberg, and Domini Research & Analytics (KLD). With more than 4000 firms and a history starting in 1990, KLD offers the most comprehensive ESG database. The relatively large sample allows us both to gain a profound insight into the relationship of the financial and the social performance and to compare the three ESG approaches in terms of return predictability.

The empirical framework is guided by existing research approaches presented in Section 2 and by econometrical adequacy. We follow two different strategies: an ESG portfolio method in the spirit of Kempf and Osthoff (2007) and cross-sectional regressions similar to Galema, Plantinga, and Scholtens (2008). ESG portfolios provide a straightforward strategy for investors to exploit a potential relationship between ESG ratings and the financial performance while the cross-sectional regressions allow us to obtain a more profound understanding of how the ESG level affects the return in the cross-section.

The ESG portfolio method generally constructs a high and a low portfolio including ESG out- and underperformers, respectively. Additionally, a high–low difference portfolio buys the high portfolio and short sells the low one. We use the Carhart (1997) four-factor model for estimating abnormal returns. A best-in-class approach instead uses sector-specific ESG ratings for composing the portfolios. In order to exploit the full sample size and to avoid assumptions regarding the portfolio construction, we also consider monthly cross-sectional Fama and MacBeth (1973) regressions.

The remainder of this paper is organized as follows. Section 2 outlines the current state of research. Section 3 presents the data set. Section 4 develops the empirical framework for our analysis while Section 5 presents the results and implications. Finally, Section 6 concludes.

2. Related literature

In this section we provide an overview of the literature concerning the link between a firm's return and CSP level, measured in terms of ESG criteria. All papers investigating this issue for the U.S. market by using ESG portfolios are considered. Moreover, we review research applying cross-sectional regressions in order to identify the explanatory power of ESG variables without relying on portfolio assumptions.

Based on the ratings of Innovest, Derwall, Guenster, Bauer, and Koedijk (2005) investigate whether ecological responsibility has an impact on a company's return. The sample includes U.S. companies for the years 1997 to 2003. By applying a high–low strategy, the Carhart (1997) four-factor model reveals a significant outperformance of high-rated firms over low-rated ones.

Eccles, Ioannou, and Serafeim (2014) follow a combined approach to identify high and low sustainability firms from a sample of 180 U.S. companies. Besides the ESG ratings of ASSET4 and SAM, they also rely on personal research and interviews. Based on the general impression, half of the firms are categorized as being high or rather low. The Carhart (1997) four-factor model following a high–low strategy reveals annual abnormal returns of up to 4.8% for a sample period from 1993 to 2010.

Lee, Faff, and Rekker (2013) also investigate the performance of U.S. companies dependent on the ESG ratings of SAM. The Carhart (1997) four-factor model for 1998 to 2007 provides evidence in favor of a significant outperformance of high-rated companies as well as of high-rated sectors. As opposed to our work, the study only accounts for an overall ESG score and not for the particular ESG pillars.

Since KLD was one of the first ESG rating providers, their database is relatively large and starts early in 1990. They also use a very transparent scoring approach based on a number of strengths and concerns. For this reason, KLD data is used in numerous empirical papers. As one of the first studies based on KLD ratings, Kempf and Osthoff (2007) compare the performance of high- and low-rated companies from the S&P 500 and DS 400 for the years 1992 to 2004. The value-weighted portfolios are constructed by using a 10% cut-off.² The Carhart (1997) four-factor model reveals a significant performance difference as related to the high–low portfolio. Investors were able to realize an abnormal return of up to 8.7% per year.

Statman and Glushkov (2009) also compose high and low portfolios based on KLD rating data from 1992 to 2007. As opposed to Kempf and Osthoff (2007), their portfolios are equally-weighted and the cut-off is one third. For most ESG categories, both the CAPM and the Carhart (1997) four-factor model indicate a significant positive abnormal return of a high–low strategy.

In contrast to previous studies, Galema et al. (2008) construct portfolios separately for each KLD strength and concern. All strength-minus-concern portfolios demonstrate a positive Carhart (1997) four-factor model alpha using data from 1992 to 2006. By estimating Fama and MacBeth (1973) regressions, they conclude that the employee relations indicator has a significant positive effect on the return.

Manescu (2011) examines the connection between ESG scores and returns in the cross-section based on KLD data from 1991 to 2006 including all S&P 500 and DS 400 companies. Fama and MacBeth (1973) regressions demonstrate that the community relations criteria have a significant positive influence on the return. The overall ESG rating does not explain returns.

In summary, the findings of the empirical research provide evidence in favor of a link between returns and the CSP level, measured in terms of the environmental, social and corporate governance dimensions. Based on ASSET4, Innovest, KLD and SAM ratings, Derwall et al. (2005), Eccles et al. (2014), Kempf and Osthoff (2007) and Statman and Glushkov (2009) find a significant positive impact of the ESG

² In the context of ESG portfolios the cut-off generally describes the top and bottom quantile of companies considered as high and low, respectively.
score on the return. This result suggests that investors can gain abnormal returns by trading ESG difference portfolios. While these studies only apply the ESG portfolio approach, Galema et al. (2008) and Manescu (2011) account for a dependence in the cross-section. Both studies show a significant positive relationship between the ESG and return variables for at least a few indicators.

Although there is evidence of the fact that the financial performance of a company is dependent on its ESG score, this issue must be critically reviewed. Most studies are based on one specific ESG data set. According to Dorfleitner, Halbritter, and Nguyen (2014), the ESG ratings of ASSET4, Bloomberg and KLD are significantly different in terms of both distribution and risk. This aspect may also affect a potential correlation with financial items. For this reason it is crucial to account for different ESG rating providers. As a consequence, our study uses multiple ESG rating concepts. We consider both the overall ESG level and the pillars in terms of the environmental, social and governance performance. Moreover, due to the fact that our database carries on until 2012, we are able to include recent developments in our analysis. Compared to the studies of Galema et al. (2008) and Manescu (2011), we account for autocorrelations of the time-variant Fama and MacBeth (1973) coefficients as proposed by Fama and French (2002) and Petersen (2009).

3. Data

The data set covers the full ASSET4 rating universe for U.S. companies from 2002 to 2011. The rating approach is based on more than 850 indicators with regard to ESG aspects. Within various steps these indicators are condensed to four pillars relating to the environmental, social, governance and economic performance. As a part of the scoring process, all companies are benchmarked against the complete firm universe. ASSET4 also provides an overall ESG score composed of the equally weighted pillars.

From Bloomberg we only acquired ESG scores for firms which are also included in the ASSET4 database. The sample covers the years from 2005 to 2011. The rating approach of Bloomberg incorporates over 100 data points. Similar to ASSET4, these indicators are aggregated to a total ESG score and the three sub-categories environment, social and governance.

We fully include the database of KLD except for privately held companies. As KLD is one of the first ESG rating providers, the sample covers the years from 1990 to 2011. In contrast to ASSET4 and Bloomberg, KLD does not offer numerical ESG scores but rather binary indicators for numerous strengths and concerns. The total number of indicators is variable over time. In order to make the KLD ratings compatible with the scores of ASSET4 and Bloomberg, the strengths and concerns need to be transformed to numerical values. On this account we follow Kempf and Osthoff (2007) and convert all concerns into strengths by taking the opposite binary value. For each pillar environment, social and governance we sum up the particular indicators and normalize them between 0 and 100. Analogously, the total ESG score is calculated by using all indicators. Additionally, KLD provides data on whether companies are involved in controversial business sectors, namely alcohol, firearms, gambling, military, nuclear power, and tobacco. We therefore calculate an alternative total ESG* score including these indicators in the sense of concerns.

As a result, we have a comparable data set of three ESG rating providers including variables for the total ESG score (ESG) as well as for the individual score of the pillars environment (ENV), social (SOC) and governance (GOV). In addition, ASSET4 reports upon a fourth sub-score related to a firm's economic sustainability (ECN). For KLD we also account for a total ESG score including negative screening criteria (ESG*).

Table 1
Descriptive statistics: ESG ratings.

Provider   Score  Dimension   Mean      SD      Min      Max   Observations
ASSET4     ESG    Overall    51.87   28.19     3.02    98.57   N = 88,600
                  Between            23.62     4.23    97.44   n = 1170
                  Within             15.19    −8.84   114.00   T = 75.73
           ENV    Overall    40.05   30.61     8.90    97.16   N = 88,600
                  Between            25.16     9.10    96.59   n = 1170
                  Within             16.37   −30.07   108.64   T = 75.73
           SOC    Overall    44.30   28.41     3.68    98.88   N = 88,600
                  Between            23.80     4.47    95.61   n = 1170
                  Within             15.22   −20.10   104.87   T = 75.73
           GOV    Overall    73.20   16.94     1.50    97.94   N = 88,600
                  Between            13.59     7.50    95.28   n = 1170
                  Within             11.52     3.26   118.86   T = 75.73
           ECN    Overall    52.09   28.32     1.21    98.98   N = 88,600
                  Between            21.31     2.09    96.65   n = 1170
                  Within             19.37   −15.51   117.67   T = 75.73
Bloomberg  ESG    Overall    21.07   11.94     0.75    85.12   N = 56,721
                  Between             9.98     6.61    60.96   n = 1073
                  Within              5.04   −20.73    50.76   T = 52.86
           ENV    Overall    20.55   15.84     0.78    89.92   N = 27,135
                  Between            13.26     0.78    61.37   n = 585
                  Within              7.65   −22.34    53.42   T = 46.38
           SOC    Overall    15.42   15.12     3.13    83.33   N = 53,606
                  Between            12.60     3.13    65.61   n = 986
                  Within              6.77   −33.28    55.19   T = 54.37
           GOV    Overall    52.89    6.23     5.36    85.71   N = 56,673
                  Between             5.49    16.52    72.50   n = 1073
                  Within              3.17    18.61    70.53   T = 52.82
KLD        ESG    Overall    67.28   12.04    25.58   100.00   N = 342,457
                  Between             8.25    34.23    93.35   n = 4209
                  Within              9.72    20.40    95.19   T = 81.36
           ESG*   Overall    71.56   10.81    32.65   100.00   N = 342,457
                  Between             7.37    42.63    94.30   n = 4209
                  Within              8.62    30.54    95.77   T = 81.36
           ENV    Overall    64.09   13.66     9.09   100.00   N = 342,457
                  Between             9.25    18.18    88.89   n = 4209
                  Within             11.06     3.04   102.75   T = 81.36
           SOC    Overall    61.01   11.16    17.86   100.00   N = 342,457
                  Between             7.41    32.14    92.83   n = 4209
                  Within              8.15    17.07    92.40   T = 81.36
           GOV    Overall    62.82   17.40     0.00   100.00   N = 342,457
                  Between            10.67    33.33   100.00   n = 4209
                  Within             15.07     4.02   117.79   T = 81.36

This table presents the mean, standard deviation, minimum and maximum values of the ESG scores separated into providers and pillars. Overall denotes the full data set. Between indicates the cross-section while within describes the time-series dimension. N is the number of total observations for n companies over an average time-series of T months.

Even though all three rating agencies offer a measure for a firm's CSP, their data sets feature notable differences. Dorfleitner, Halbritter, and Nguyen (2014) find significant variations in distribution and risk characteristics. Table 1 shows the descriptive statistics of the ESG data.³ Concerning ASSET4, the mean ESG scores of the entire rating universe are expected to be around 50 due to their scoring approach. Considering only U.S. companies, the mean ESG and ECN values are consistent with this assumption. As opposed

³ The standard deviations are calculated as follows:

Overall: $s_o = \sqrt{\frac{1}{NT-1} \sum_{i=1}^{N} \sum_{t=1}^{T} (x_{i,t} - \bar{x})^2}$.
Between: $s_b = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (\bar{x}_i - \bar{x})^2}$.
Within: $s_w = \sqrt{\frac{1}{NT-1} \sum_{i=1}^{N} \sum_{t=1}^{T} (x_{i,t} - \bar{x}_i)^2}$.

Minimum and maximum values are based on the following variables:

Overall: $x_{i,t}$.
Between: $\bar{x}_i$.
Within: $x_{i,t} - \bar{x}_i + \bar{x}$.
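The overall, between and within standard deviations defined in footnote 3 can be sketched in Python as follows. The function name `panel_sd`, the dictionary layout and the toy panel are illustrative assumptions. For unbalanced panels the sketch replaces NT by the actual number of observations, which is an assumption of this sketch and not something the footnote specifies.

```python
import math
from typing import Dict, List


def panel_sd(panel: Dict[str, List[float]]) -> Dict[str, object]:
    """Overall, between and within standard deviations of a firm-by-time panel.

    `panel` maps each firm to its time series of scores (hypothetical layout).
    """
    obs = [x for series in panel.values() for x in series]
    grand_mean = sum(obs) / len(obs)
    firm_means = {firm: sum(s) / len(s) for firm, s in panel.items()}

    # Overall: deviations of every observation from the grand mean.
    overall = math.sqrt(sum((x - grand_mean) ** 2 for x in obs) / (len(obs) - 1))
    # Between: deviations of the firm means from the grand mean.
    between = math.sqrt(
        sum((m - grand_mean) ** 2 for m in firm_means.values()) / (len(panel) - 1)
    )
    # Within: deviations of every observation from its own firm mean.
    within = math.sqrt(
        sum((x - firm_means[f]) ** 2 for f, s in panel.items() for x in s)
        / (len(obs) - 1)
    )
    # Variable used for the within minimum and maximum: x_it - mean_i + grand mean.
    shifted = [x - firm_means[f] + grand_mean for f, s in panel.items() for x in s]
    return {
        "overall": overall,
        "between": between,
        "within": within,
        "within_range": (min(shifted), max(shifted)),
    }


if __name__ == "__main__":
    # Hypothetical toy panel: two firms, three periods each.
    print(panel_sd({"A": [40, 50, 60], "B": [70, 80, 90]}))
```

Because the within variable re-centers each firm's deviations on the grand mean, it can fall outside the range of the raw scores, which is consistent with the Within rows of Table 1 showing minima below 0 and maxima above 100.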
to this, the GOV rating exhibits significantly higher values of approximately 73. This indicates that U.S. companies are above average regarding corporate governance practices. On the other hand, the environmental and social dimensions are slightly below average. The mean Bloomberg scores are located on a lower level while KLD's means are clearly above 60. The average scores are even higher if controversial business involvement data is incorporated. This suggests that the majority of the companies in the sample are related to indisputable sectors. The overall standard deviation of Bloomberg and KLD ranges between 10 and 20 while ASSET4 exhibits notably higher ones around 30. Regarding ASSET4 and Bloomberg, the variation between the firms is up to twice as large as the temporal variation within one company. The KLD ratings show lower volatilities which are fairly in line with each other.

The KLD data set features 342,457 total observations (N) based on 4209 U.S. companies (n) with an average time-series of 81 months (T). Since ASSET4 and Bloomberg do not solely provide ratings for U.S. firms, they focus on larger corporations. ASSET4 offers scores for 1170 U.S. enterprises with a mean history of 76 months. The Bloomberg data consists of 1073 companies and an average time-series of 53 months. 996 firms are covered by all three agencies.

As this analysis takes the perspective of a U.S. investor, the risk-free interest rate is represented by the one-month U.S. Treasury bill and all data are denominated in U.S. Dollars. Total returns, market capitalization data and book-to-market ratios are retrieved from Thomson Reuters Datastream. Discarded or insolvent companies are included up to the last available financial or rating information. In the case of a merger, both companies are assigned to the firm retaining its Datastream ticker. Therefore, our study is not subject to a survivorship bias.

In the context of the Carhart (1997) four-factor model, we also need risk premia relating to size, value and momentum. Research commonly uses risk factors available from Kenneth French's data library. However, Asness and Frazzini (2013) criticize the construction of the high-minus-low factor (HML). Forming the book-to-market value portfolios, Fama and French (1992) use data that is up to 18 months old. As a consequence, the value measure is not disentangled from the momentum anomaly determined on a monthly interval. Based on a more timely measurement, Asness and Frazzini (2013) account for a true value strategy, not a value-momentum strategy. For this reason we apply the risk premia provided on the website of Andrea Frazzini.⁴ These risk premia are constructed in accordance with Fama and French (1992, 1993), Carhart (1997) and Asness and Frazzini (2013).

Within the framework of the panel regressions, sector data is required to account for intra-industry dependencies. For this purpose, the Standard Industrial Classification (SIC) is used. The data set includes 653 different SIC classes. For the best-in-class ratings the companies are only grouped into the ten major sectors of basic materials, consumer cyclicals, consumer non-cyclicals, energy, financials, healthcare, industrials, technology, telecommunication services and utilities. This ensures a sufficiently high number of firms in each class.

4. Methodology

4.1. ESG portfolios

As the empirical literature illustrates, constructing ESG portfolios is one of the most common approaches to investigating the relationship between the social and financial performance of companies. Based on ESG ratings, this method easily aggregates large panel data sets to a single time-series dimension. This allows the application of basic asset pricing models. Furthermore, it provides a straightforward trading strategy to investors. The empirical framework of this section largely complies with Kempf and Osthoff (2007) and Statman and Glushkov (2009).

In each year p from 1991 to 2012 we construct two market capitalization weighted portfolios for each ESG rating of the three data providers. For this purpose, we sort the companies according to the score available at p − 1. The 20% best (worst) performing firms in a particular category are assigned to the high (low) portfolio. In order to compare the performance of both portfolios, we investigate a high–low strategy which contains the high portfolio in a long position while the low portfolio is held in a short position.

To evaluate the performance of these ESG portfolios, we apply the Carhart (1997) four-factor model. The model is estimated given the following equation:

$$r_{i,t} - r_{f,t} = \alpha_i + \beta_i (r_{m,t} - r_{f,t}) + s_i SMB_{i,t} + h_i HML_{i,t} + w_i WML_{i,t} + u_{i,t}, \qquad (1)$$

where $r_{i,t} - r_{f,t}$ is the excess return of the portfolio i over the risk-free interest rate in month t. It is explained by the excess return of the market $r_{m,t} - r_{f,t}$ as well as by the size, value and momentum factor denoted as $SMB_{i,t}$, $HML_{i,t}$, and $WML_{i,t}$, respectively. The coefficients $\alpha_i$, $\beta_i$, $s_i$, $h_i$, and $w_i$ are estimated performing a linear regression while $u_{i,t}$ represents the residual. Standard errors are calculated using the Newey and West (1987) procedure.⁵

For each ESG score type we consider both the high and low portfolio and the high–low strategy. In terms of the latter, the alpha expresses the return difference between ESG outperformers and underperformers. In a first step, we allow for the full existing data set. To control for firm-specific effects, we restrict the sample to companies and observations available from all three providers in a second step.

In order to gain further robustness, we also estimate the full sample model by applying various modifications. Accounting for sector-specific issues we also test a best-in-class version. The best-in-class score of a company is calculated as the difference of the individual and the average industry score. During the portfolio selection process firms are by implication ranked among their own peer group. As a consequence, companies can also be eligible for a high portfolio even if they belong to a sector which is difficult in terms of ESG requirements.

To investigate whether the results are dependent on the assumptions of the portfolio selection process, we also consider variations in the cut-off and weighting definition. In addition to the initial cut-off of 20%, we incorporate a 1, 5, 10, 25 and 50% barrier, each in a market capitalization and equally weighted implementation.

Furthermore, we examine whether the link between the financial and social performance remains constant over time. The sample is split into the three subperiods from 1991 to 2001, from 2002 to 2006, and from 2007 to 2012. The first 10 years of the full sample period are only applicable to the KLD data set. Since we are interested in a comparison of the three rating providers, we do not further split this period. In the second decade we also consider rating data of ASSET4 and Bloomberg. In this case, shorter periods allow us to track the return development based on the ESG ratings of all three providers.

The portfolio is constructed both in a market capitalization and equally weighted version. Given the core issue of this paper, all robustness checks focus on the alphas of the high–low strategy as a measure of the abnormal performance.

4.2. Cross-sectional regressions

In the previous section, the relationship between the CSP and the financial performance is examined aggregating the cross-sectional dimension to portfolios. In this section, we apply a panel-based strategy to analyze

⁴ See: http://www.econ.yale.edu/~af227/data_library.htm.

⁵ A Breusch and Pagan (1979) test performed on all models provides evidence of the fact that the residuals of the linear regressions are subject to heteroskedasticity. A Durbin and Watson (1971) test indicates autocorrelations for some of the models. As a robustness check, we also estimate the regressions using the conventional OLS estimator. The implications remain unchanged.
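The point estimates of Eq. (1) can be obtained by ordinary least squares. The following Python sketch illustrates this with the standard library only. It is a minimal illustration under stated assumptions: the function names (`solve_linear_system`, `carhart_ols`) and the input layout are hypothetical, the factor series have to be supplied by the caller, and the Newey and West (1987) standard errors used in the paper are not implemented.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("regressors are (nearly) collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def carhart_ols(excess_ret: Sequence[float],
                mkt_excess: Sequence[float],
                smb: Sequence[float],
                hml: Sequence[float],
                wml: Sequence[float]) -> List[float]:
    """Return [alpha, beta, s, h, w] from an OLS fit of the four-factor equation."""
    series = (excess_ret, mkt_excess, smb, hml, wml)
    if len({len(s) for s in series}) != 1:
        raise ValueError("all series must have the same length")
    # Design matrix with a constant column for the alpha.
    rows = [[1.0, m, s, h, w] for m, s, h, w in zip(mkt_excess, smb, hml, wml)]
    k = 5
    # Normal equations: (X'X) coef = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, excess_ret)) for i in range(k)]
    return solve_linear_system(xtx, xty)
```

Applied to the monthly return series of a high–low portfolio, the first returned element is the alpha on which the robustness checks described above focus. In practice a statistics library would normally be used instead, since inference requires the heteroskedasticity and autocorrelation consistent standard errors mentioned in footnote 5.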
the direct impact of ESG variables on the return using Fama and MacBeth (1973) regressions. While the portfolio approach only includes companies with a very high and very low ESG score, this procedure accounts for the full cross-section without the necessity to make assumptions about the portfolio construction.

Fama and MacBeth (1973) regressions are designed to measure the influence of the systematic risk factor beta on the return of a company. Fama (1991) and Fama and French (1992) emphasize that both the market capitalization and the book-to-market ratio are also important for explaining stock returns in the cross-section. In this context ESG scores could also contribute.

We estimate two model specifications. While model (I) uses the overall ESG score as an explanatory variable, model (II) focuses on the particular ESG pillars. The betas are estimated following the two-step sorting procedure of Black, Jensen, and Scholes (1972). A vector of control variables is also included. Following Galema et al. (2008) and Hong and Kacperczyk (2009), we use the natural logarithm of the market capitalization and the book-to-market ratio as well as the average return over the last 12 months.

For each month t we estimate the following cross-sectional regression:

$$r_{i,t} - r_{f,t} = \gamma_{0,t} + \gamma_{1,t} \hat{\beta}_{i,t} + X_{i,t-1} \gamma_{X,t} + ESG_{i,t-1} \gamma_{ESG,t} + u_{i,t} \quad \forall\, t = 1, 2, \ldots, T, \qquad (2)$$

where $r_{i,t} - r_{f,t}$ is the return of company i over the risk-free interest rate. $\hat{\beta}_{i,t}$ is the post-ranking beta. $X_{i,t-1}$ is the lagged vector of control variables. $ESG_{i,t-1}$ represents the ESG scores available at t − 1. $\gamma_{k,t}$ constitutes the coefficient of variable k in month t. Finally, $u_{i,t}$ is the regression residual.

Each time-variant coefficient can be characterized as the realization of a random variable. The expected value of the estimated coefficients $\hat{\gamma}_k$ is given by the average over the T monthly coefficients:

$$\hat{\gamma}_k = \frac{1}{T} \sum_{t=1}^{T} \hat{\gamma}_{k,t}. \qquad (3)$$

Fama and MacBeth (1973) propose a simple t-test to evaluate the statistical significance of these coefficients. Their approach is designed to rule out intra-temporal dependencies along the cross-section. According to Petersen (2009), standard errors can be biased downwards if the independent variables and the residuals are correlated over time. In our data set, neither a company's market capitalization nor its book-to-market ratio is independently distributed over time.

[…] and MacBeth (1973) procedure equally weights each period, the pooled OLS method equally weights each observation. Therefore, both results may vary in the case of an unbalanced data set.

5. Results

5.1. ESG portfolios

Table 2 presents the regression results of the high and low portfolios as well as of the difference portfolios. Concerning the ASSET4 ratings, all high–low portfolios exhibit a positive alpha. While the high portfolios are largely in line with the overall market, the low portfolios slightly underperform the market. However, compared to earlier studies the abnormal returns are rather small and statistically insignificant. The portfolios based on the Bloomberg data set do not provide evidence for a significant relationship between ESG scores and returns, either. Only companies featuring a high social score generate significant positive abnormal returns of more than five percent per year. The high–low portfolio as related to corporate governance even shows negative alphas of up to two percent. The KLD portfolios do not offer significant return differences between high- and low-rated firms, either. Furthermore, the inclusion of controversial business involvement data has no significant effect on the alphas. Only companies with a high environmental rating or a low corporate governance score achieve a significant outperformance of up to almost three percent per year.

In summary, in all three samples the particular ESG ratings show a significantly lower influence on the financial performance than previous studies indicate. Although the alphas are not significant, we can still see differences between the three providers. In order to cancel out sample specific effects, we next restrict the analysis to companies for which ESG scores are available from all three agencies. Table 3 presents the estimated coefficients of the Carhart (1997) four-factor model. Concerning Bloomberg, the results are similar to the full sample model since portfolios are barely affected by the adjustment. ASSET4 still does not show a significant out- or underperformance of the high–low portfolios. In contrast, the KLD results are strongly influenced by the sample restriction. The alphas of the high–low portfolios are now negative. These findings point out that direction and magnitude of performance differences are not only dependent on the rating approach but also on the underlying company sample.

Regarding the factor loadings of the Carhart (1997) four-factor model, we can see notable differences between the betas of the high and low portfolios. Corporations with a higher ESG score are exposed to a lower systematic risk resulting in a lower beta. Some of the beta
Furthermore, Dorfleitner, Halbritter, and Nguyen (2014) show that differences are even significant. Solely, the environmental and social
ESG scores exhibit a low temporal variability. The current rating is large- factors of KLD lead to opposite implications. In addition to this, most
ly dependent on the rating of prior periods. In order to test for serial high–low portfolios indicate a significantly different influence of the
correlations in our panel data model, we perform a Wooldridge size factor between companies having a high or low score. This applies
(2002) test. The t-statistics provide clear evidence for intertemporal both to the overall ESG and the single pillar portfolios. For each of the
dependencies, leading to biased standard errors of the traditional dimensions, firms with a high score are therefore less strongly subjected
Fama and MacBeth (1973) approach (Petersen, 2009). to size risk. In the case of ASSET4 and Bloomberg, the HML coefficient is
Hence, Fama and French (2002) suggest correcting the standard higher for companies with high ESG ratings. For some scores, the HML
errors through the correlation of the estimated coefficients (ρ ^). Accord- difference is even significant. This provides evidence that firms with
ing to Petersen (2009) this leads to a substantial improvement of the high ESG scores also feature a higher book-to-market risk. The implica-
standard errors. As a result, the test statistic is calculated as follows tions are contrary for the KLD sample. Beside the corporate governance
rating, all high–low portfolios show significant negative loadings on the
  ^
γ value factor. Firms with a high KLD rating also have lower weights on
^k ¼
t γ skffiffiffiffiffiffiffiffiffiffiffiffi : ð4Þ the momentum premium. For ASSET4 the results indicate an inverse
σ γ^ k 1 þ ρ ^
pffiffiffi relationship. The Bloomberg scores do not exhibit an obvious trend.
T 1−ρ ^ Subsequently, we consider a number of robustness checks to
improve the validity of our results. Table 4 illustrates the alphas of the
As a robustness check of the cross-sectional approach, we addition- high–low portfolios obtained by applying the best-in-class approach.
ally perform pooled OLS regressions with SIC-clustered standard errors. The findings do not suggest a link between the ESG score and the return.
More than 500 SIC industry classes in our data set ensure a sufficient All alphas are insignificant while the magnitude and direction do not
number of clusters to obtain an unbiased test statistic. While the Fama follow a certain pattern. Differences between the three data providers

Table 2
ESG portfolios: Time-series regressions based on full sample.

Alpha MKT SMB HML WML R2

ASSET4
ESG High −0.001 0.939*** −0.248*** 0.048 0.027*** 0.973
Low −0.011 1.042*** 0.180*** 0.011 −0.024 0.927
High–low 0.011 −0.102 −0.428*** 0.037 0.051 0.364
ENV High 0.001 0.949*** −0.253*** 0.046 0.014 0.969
Low 0.000 1.072*** 0.055 −0.099 −0.045** 0.932
High–low 0.001 −0.122 −0.308*** 0.144 0.059** 0.287
SOC High 0.004 0.931*** −0.196*** 0.036 0.002 0.972
Low −0.010 1.121*** 0.126 −0.234** 0.010 0.903
High–low 0.013 −0.190** −0.322*** 0.270** −0.008 0.292
GOV High 0.000 0.941*** −0.218*** 0.012 0.039*** 0.970
Low −0.006 1.063*** −0.015 −0.022 −0.103*** 0.933
High–low 0.006 −0.122 −0.203* 0.035 0.143*** 0.341
ECN High 0.010 0.947*** −0.172*** −0.027 −0.010 0.971
Low 0.000 1.151*** 0.180*** −0.057 −0.001 0.921
High–low 0.011 −0.203*** −0.352*** 0.030 −0.009 0.364

Bloomberg
ESG High 0.012 0.972*** −0.267*** 0.062* 0.040*** 0.969
Low −0.002 1.015*** 0.076 −0.129** 0.015 0.917
High–low 0.014 −0.043 −0.342*** 0.191** 0.025 0.145
ENV High 0.020 0.959*** −0.113 0.137 −0.055* 0.911
Low 0.021 1.068*** −0.037 −0.005 −0.001 0.935
High–low −0.001 −0.109 −0.076 0.142 −0.054 0.028
SOC High 0.052** 1.018*** 0.225** −0.055 −0.102*** 0.922
Low 0.010 1.070*** −0.129** 0.084 −0.020 0.955
High–low 0.041 −0.052 0.354*** −0.140 −0.082* 0.092
GOV High −0.002 0.990*** −0.289*** 0.082*** 0.013 0.975
Low 0.018 1.012*** −0.014 −0.188*** −0.059** 0.927
High–low −0.020 −0.022 −0.274*** 0.271*** 0.072 0.177

KLD
ESG High 0.017 0.956*** −0.213*** −0.057 −0.075*** 0.890
Low 0.005 1.005*** −0.226*** 0.194*** 0.044 0.880
High–low 0.012 −0.049 0.013 −0.251*** −0.119*** 0.121
ESG* High 0.015 0.964*** −0.215*** −0.055 −0.066*** 0.897
Low 0.007 0.982*** −0.193*** 0.215*** 0.037 0.867
High–low 0.008 −0.018 −0.022 −0.269*** −0.103*** 0.122
ENV High 0.029* 0.987*** −0.18*** −0.051 −0.056 0.802
Low 0.006 0.961*** −0.296*** 0.204*** 0.039 0.894
High–low 0.022 0.026 0.116* −0.255*** −0.095*** 0.142
SOC High 0.007 1.029*** −0.226*** −0.058 −0.029 0.907
Low 0.013 0.961*** −0.187*** 0.159*** −0.002 0.863
High–low −0.006 0.068* −0.039 −0.217** −0.027 0.089
GOV High −0.003 0.933*** −0.036 0.181** −0.030 0.679
Low 0.018** 1.024*** −0.241*** 0.014 −0.004 0.949
High–low −0.021 −0.092** 0.205** 0.167* −0.026 0.059

This table presents the results of the Carhart (1997) four-factor model over the variable sample period from 1991 to 2012 on a monthly basis. The regressions are run individually for each
ESG score and portfolio type using the full available data. The high (low) portfolio consists of the 20% best (worst) performing companies in terms of a particular ESG score. The high–low
portfolio trades the high-rated companies long while the low-rated companies are traded short. All portfolios are weighted by the firms' market capitalization. Annualized alphas, factor
loadings concerning size, value and momentum as well as adjusted R2s are reported. The explanatory factors are provided by Asness and Frazzini (2013). The standard errors are estimated
using the Newey and West (1987) procedure. ***, ** and * indicate a significance level of 1%, 5% and 10%.
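The portfolio construction described in the table notes (the 20% best and worst firms by ESG score, weighted by market capitalization, high minus low) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the tuple layout, the function names `value_weighted_return` and `high_low_return`, and the floor rule for the number of firms per portfolio are hypothetical choices, and the scores are assumed to be those available before the return month.

```python
from typing import List, Tuple

# Hypothetical record layout: (esg_score, market_cap, monthly_return)
Firm = Tuple[float, float, float]


def value_weighted_return(firms: List[Firm]) -> float:
    """Market-capitalization-weighted average return of a group of firms."""
    total_cap = sum(cap for _, cap, _ in firms)
    return sum(cap * ret for _, cap, ret in firms) / total_cap


def high_low_return(firms: List[Firm], cutoff: float = 0.20) -> float:
    """Return of a portfolio long the top and short the bottom `cutoff` share by score."""
    if not 0.0 < cutoff <= 0.5:
        raise ValueError("cutoff must lie in (0, 0.5] so the two legs do not overlap")
    ranked = sorted(firms, key=lambda f: f[0])  # ascending by ESG score
    n = max(1, int(len(ranked) * cutoff))       # assumed rule: floor, at least one firm
    low, high = ranked[:n], ranked[-n:]
    return value_weighted_return(high) - value_weighted_return(low)


# Illustrative data only: ten hypothetical firms for a single month.
example = [
    (1, 1.0, 0.00), (2, 3.0, 0.04), (3, 1.0, 0.01), (4, 1.0, 0.01), (5, 1.0, 0.01),
    (6, 1.0, 0.01), (7, 1.0, 0.01), (8, 1.0, 0.01), (9, 2.0, 0.02), (10, 2.0, 0.06),
]
spread = high_low_return(example)  # long scores 9-10, short scores 1-2
```

Repeating this for every month gives the high–low return series that would then be regressed on the four factors; an equally weighted variant would replace the capitalization weights with a simple average.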

are even more obvious. The implications are consistent with the standard model.

Furthermore, Table 5 presents the Carhart (1997) four-factor alphas of the high–low portfolios dependent on portfolio cut-off and weighting. With a few exceptions, all alphas are insignificant. There is also no pattern concerning strength and sign of the abnormal returns. It makes no difference whether the portfolios are value or equally weighted. All in all, the models do not provide evidence for a relationship between the social and financial performance.

According to previous studies there could be a link between ESG scores and financial returns in earlier years. Table 6 presents the annualized Carhart (1997) four-factor model alphas for the high–low portfolios estimated for various subperiods. Concerning KLD, we can see a notable downward movement of the abnormal returns. Apart from the corporate governance score, all difference portfolios exhibit positive alphas during the first period from 1991 to 2001. Most of them are statistically significant. Consistent with previous studies, investors are able to achieve annual abnormal returns of up to 6.6% following a long-short strategy. Over time the positive alphas slowly diminish and therefore lack statistical significance. From 2002 to 2006 they only amount to about 2.5% when using value-weighted portfolios. Equally weighted portfolios lead to even lower abnormal returns, which are close to zero. From 2007 to 2012 all alphas with the exception of corporate governance are negative, some of them even significantly so. In this case, the long-short investor would now lose up to six percent per year. The portfolios based on the corporate governance score do not follow any particular trend, while the alphas are mostly negative. Portfolios based on ASSET4 ratings show a similar development. The high–low alphas slightly converge to zero and the significance levels also decline. The same holds true for the high–low portfolios based on Bloomberg scores.

Summarizing, the ESG portfolio strategy does not support a relationship between the CFP and the CSP of a company measured in terms of ESG scores. The Carhart (1997) four-factor model cannot show significant return differences between high- and low-rated firms for any of the portfolios. Comparing the three data providers, we can also find

Table 3
ESG portfolios: Time-series regressions based on an overlapping sample.

Alpha MKT SMB HML WML R2

ASSET4
ESG High 0.005 0.946*** −0.245*** 0.053 0.036*** 0.963
Low −0.003 1.124*** 0.221** −0.043 0.013 0.906
High–low 0.008 −0.178* −0.466*** 0.096 0.023 0.336
ENV High 0.000 0.957*** −0.236*** 0.033 0.020 0.963
Low 0.001 1.063*** 0.169*** −0.134** −0.054*** 0.948
High–low −0.001 −0.106 −0.406*** 0.168* 0.073*** 0.345
SOC High 0.015 0.933*** −0.212*** 0.041 0.021* 0.962
Low 0.007 1.106*** 0.130 −0.253*** 0.005 0.894
High–low 0.008 −0.174* −0.342*** 0.294** 0.016 0.248
GOV High 0.008 0.931*** −0.237*** 0.032 0.057*** 0.948
Low 0.006 1.128*** −0.065 0.022 −0.106*** 0.942
High–low 0.002 −0.197** −0.172* 0.009 0.163*** 0.440
ECN High 0.024 0.952*** −0.13** −0.004 −0.009 0.959
Low −0.015 1.138*** 0.134 −0.082 −0.033 0.914
High–low 0.039 −0.186** −0.264** 0.078 0.024 0.233

Bloomberg
ESG High 0.009 0.971*** −0.258*** 0.061* 0.045*** 0.968
Low −0.014 1.076*** 0.114 −0.120 0.008 0.911
High–low 0.024 −0.105 −0.372*** 0.182* 0.037 0.205
ENV High 0.005 0.993*** −0.046 0.207** −0.072* 0.914
Low 0.027 1.065*** −0.027 −0.010 −0.004 0.932
High–low −0.021 −0.072 −0.019 0.217* −0.068 0.037
SOC High 0.042** 1.026*** 0.169 −0.020 −0.111*** 0.920
Low 0.010 1.067*** −0.128** 0.105 −0.011 0.951
High–low 0.032 −0.041 0.297** −0.125 −0.100* 0.070
GOV High −0.005 0.978*** −0.289*** 0.116*** 0.022** 0.970
Low 0.022 1.019*** −0.052 −0.165*** −0.025 0.933
High–low −0.026 −0.041 −0.237*** 0.281*** 0.047 0.163

KLD
ESG High −0.012 0.938*** −0.114 0.065 −0.042* 0.929
Low 0.025 1.048*** −0.194* −0.074 0.141*** 0.843
High–low −0.035 −0.110 0.080 0.139 −0.183*** 0.081
ESG* High −0.011 0.933*** −0.110 0.073 −0.043* 0.929
Low 0.031 1.043*** −0.219** −0.077 0.14*** 0.849
High–low −0.041 −0.110 0.109 0.149 −0.183*** 0.092
ENV High 0.008 0.956*** −0.123* −0.014 −0.065** 0.931
Low 0.036 1.067*** −0.277** 0.032 0.159*** 0.855
High–low −0.027 −0.111 0.154 −0.046 −0.225*** 0.105
SOC High −0.020 0.967*** −0.168** 0.137* −0.049* 0.929
Low 0.006 1.027*** −0.097 −0.084 0.101*** 0.879
High–low −0.026 −0.060 −0.071 0.221 −0.150*** 0.093
GOV High 0.007 0.952*** −0.137*** 0.042 0.017 0.945
Low 0.010 1.014*** −0.175*** 0.065 −0.006 0.951
High–low −0.002 −0.062* 0.038 −0.023 0.023 −0.004

This table presents the results of the Carhart (1997) four-factor model over the variable sample period from 1991 to 2012 on a monthly basis. The regressions are run individually for each
ESG score and portfolio type using only observations available for all providers. The high (low) portfolio consists of the 20% best (worst) performing companies in terms of a particular ESG
score. The high–low portfolio trades the high-rated companies long while the low-rated companies are traded short. All portfolios are weighted by the firms' market capitalization.
Annualized alphas, factor loadings concerning size, value and momentum as well as adjusted R2s are reported. The explanatory factors are provided by Asness and Frazzini (2013). The
standard errors are estimated using the Newey and West (1987) procedure. ***, ** and * indicate a significance level of 1%, 5% and 10%.

Table 4
ESG portfolios: High–low alphas based on best-in-class ratings.

            ESG       ESG*      ENV       SOC       GOV       ECN
ASSET4      −0.016              0.012     −0.018    −0.023    −0.005
Bloomberg   −0.027              −0.024    −0.005    0.001
KLD         0.017     0.005     0.028     0.026     0.002

This table presents the annualized high–low alphas of the four-factor model over the variable sample period from 1991 to 2012 on a monthly basis. The regressions are run individually for each ESG score using the full available data. The high–low portfolio buys the 20% best performing companies in terms of a particular ESG best-in-class score while the worst 20% are traded short. All portfolios are weighted by market capitalization. The explanatory factors are provided by Asness and Frazzini (2013). The standard errors are estimated using the Newey and West (1987) procedure. ***, ** and * indicate a significance level of 1%, 5% and 10%.

notable differences in the direction and magnitude of the estimated coefficients. Different ESG concepts are therefore inconsistent in terms of return predictability. The robustness tests illustrate that these results hold under various modifications of the standard model as related to portfolio construction and best-in-class ratings. Overall, the findings strongly argue against the existing evidence proposing abnormal returns of an ESG portfolio strategy (Derwall et al., 2005; Eccles et al., 2014; Kempf & Osthoff, 2007; Statman & Glushkov, 2009). Splitting the sample into several subperiods reveals the determinants for this. We find a significant decline in the explanatory power of ESG scores over the last decade. While in earlier years companies featuring high ESG scores significantly outperformed their low counterparts, as of 2012 not one of the three rating concepts appears to be able to anticipate abnormal returns.

5.2. Cross-sectional regressions

Even if the ESG portfolio approach does not provide evidence of a relationship between the social and financial performance, there might be a dependence in the cross-section. Using the full panel structure of our three databases allows us to obtain a more detailed

Table 5
ESG portfolios: High–low alphas dependent on cut-off and weighting approach.

Market capitalization weighted Equally weighted

1% 5% 10% 25% 50% 1% 5% 10% 25% 50%

ASSET4
ESG −0.079 0.010 0.001 0.013 0.014 −0.058 −0.031 −0.014 0.015 0.018*
ENV −0.065 −0.059** −0.017 0.009 0.007 −0.097 −0.043** −0.008 0.011 0.013
SOC 0.011 0.009 0.006 0.011 −0.017 −0.049 −0.001 0.014 0.026*** 0.010
GOV 0.010 0.010 −0.022 0.015 0.018 −0.009 −0.015 −0.023* −0.001 0.010
ECN −0.034 0.024 −0.019 0.028 −0.004 −0.008 −0.005 0.000 0.014 0.007

Bloomberg
ESG 0.041 0.006 −0.011 0.015 −0.001 −0.083 −0.106*** −0.048** −0.013 −0.019
ENV −0.143 0.058 −0.019 −0.027 −0.002 0.018 0.064 0.033 0.003 −0.006
SOC 0.115 0.035 0.000 0.040 0.025 0.060 0.003 0.003 0.003 0.038*
GOV −0.122 0.063 0.025 −0.032* −0.032* −0.037 0.017 0.009 −0.027** −0.024

KLD
ESG 0.014 0.004 0.005 0.010 −0.002 −0.041 −0.002 −0.008 0.003 −0.003
ESG* 0.016 −0.005 −0.006 0.006 −0.006 −0.030 0.007 −0.011 −0.002 −0.008
ENV −0.028 0.016 0.039 0.022 0.002 −0.013 0.037 0.019 −0.003 −0.014
SOC −0.074* −0.027 −0.003 −0.008 −0.003 −0.071 −0.011 0.001 0.004 0.003
GOV 0.033 −0.010 −0.032 −0.032 0.007 0.012 −0.029 −0.034 −0.039* −0.027***

This table presents the annualized high–low alphas of the Carhart (1997) four-factor model over the variable sample period from 1991 to 2012 on a monthly basis. The regressions are run
individually for each ESG score using the full available data. The high–low portfolio buys the best performing companies in terms of a particular ESG score while the worst are traded short.
Portfolio cut-offs of 1%, 5%, 10%, 25% and 50% are applied, both in a market capitalization and equally weighted version. The explanatory factors are provided by Asness and Frazzini (2013).
The standard errors are estimated using the Newey and West (1987) procedure. ***, ** and * indicate a significance level of 1%, 5% and 10%.

understanding of how ESG scores can help in predicting returns. Compared to the portfolio approach, all companies and observations can be included in the analysis. Furthermore, there is no need for assumptions with regard to the portfolio construction.

Table 7 presents the estimated coefficients for the monthly Fama and MacBeth (1973) regressions. In terms of ASSET4 and Bloomberg, model (I) indicates a significant link between the overall ESG rating and the return. An increase of the total Bloomberg ESG score by one point implies a monthly return increase of 0.014%. The total ESG variable of ASSET4 exhibits a significant coefficient of 0.008%. If we assume a 20-point higher ASSET4 ESG score, which is a relatively large difference on a scale between 0 and 100, the annualized return would ceteris paribus increase by 1.92%. For KLD both the overall ESG score and the ESG* score feature insignificant coefficients.

To identify the particular determinants of these results, we need to consider the individual pillars. Model (II) employs the subcriteria relating to environment, social and corporate governance in order to explain returns in the cross-section. Concerning the ASSET4 sample, the findings reveal that the significant influence of the overall ESG score is mainly driven by the economic rating. Increasing the economic score by one point raises the monthly return by 0.014%. The environmental and social pillars do not have significant explanatory power. The corporate governance score even has a slightly significant negative relationship to the financial performance. The Bloomberg sample shows that only the social score exhibits a small significant coefficient of 0.007% while the other pillars do not indicate an influence on a firm's return. The significance of the overall ESG rating is mainly driven by the sum of the environmental and social indicators. In the case of KLD

Table 6
ESG portfolios: High–low alphas within various subperiods.

Market capitalization weighted Equally weighted

Full sample 1991–2001 2002–2006 2007–2012 Full sample 1991–2001 2002–2006 2007–2012

ASSET4
ESG 0.011 0.020 0.016 0.009 0.013 0.003
ENV 0.001 −0.007 0.012 0.009 −0.044 0.028
SOC 0.013 0.035 0.007 0.021** 0.024* 0.015
GOV 0.006 −0.009 −0.005 −0.021** −0.033** −0.025**
ECN 0.011 0.013 0.034 0.005 −0.020 0.006

Bloomberg
ESG 0.014 0.015 0.008 −0.013 −0.015 −0.019
ENV −0.001 −0.127 −0.013 0.016 −0.180*** 0.005
SOC 0.041 −0.073 0.020 0.008 −0.146** 0.001
GOV −0.020 −0.017 −0.020 −0.025 0.149* −0.030

KLD
ESG 0.012 0.054** 0.026 −0.002 −0.004 0.055*** 0.003 −0.055*
ESG* 0.008 0.042* 0.023 −0.007 −0.010 0.046** 0.004 −0.060**
ENV 0.022 0.056 0.024 −0.003 0.003 0.066*** −0.024 −0.040
SOC −0.006 0.019 0.027 −0.017 0.004 0.039* 0.004 −0.024
GOV −0.021 −0.047 −0.043 0.014 −0.033 −0.035 −0.012 −0.045*

This table presents the annualized high–low alphas of the Carhart (1997) four-factor model over the subperiods 1991 to 2001, 2002 to 2006 and 2007 to 2012 as well as the full sample
period. The regressions are run individually for each ESG score using the full available data. The high–low portfolio buys the 20% best performing companies in terms of a particular ESG
score while the worst 20% are traded short. Both a market capitalization and equally weighted version is presented. The explanatory factors are provided by Asness and Frazzini (2013). The
standard errors are estimated using the Newey and West (1987) procedure. ***, ** and * indicate a significance level of 1%, 5% and 10%.

Table 7
Modified Fama-MacBeth regressions based on the full sample.

ASSET4 Bloomberg KLD

(I) (II) (I) (II) (Ia) (Ib) (II)

BETA −0.020 0.029 0.093 −0.001 0.070 0.064 0.038


(−0.138) (0.210) (0.417) (−0.007) (0.459) (0.424) (0.246)
lnSIZE −0.313*** −0.332*** −0.335*** −0.203*** −0.032 −0.030 −0.047
(−4.454) (−3.987) (−5.711) (−4.286) (−1.285) (−1.234) (−1.641)
lnBM 0.280*** 0.281*** 0.284* 0.255** 0.524*** 0.526*** 0.530***
(4.652) (4.521) (1.950) (2.103) (14.080) (14.129) (14.312)
MOM 0.113*** 0.093*** 0.061 0.023 0.083*** 0.083*** 0.085***
(6.465) (5.257) (1.623) (0.591) (2.984) (2.977) (2.944)
ESG 0.008*** 0.014*** 0.007
(4.362) (3.953) (0.858)
ESG* 0.009
(1.001)
ENV 0.001 0.004 0.006***
(1.498) (0.729) (2.615)
SOC −0.002 0.007** 0.003
(−1.009) (2.271) (0.522)
GOV −0.003** −0.002 −0.004***
(−2.597) (−0.224) (−3.567)
ECN 0.014***
(10.649)

This table presents the results of the adjusted Fama and MacBeth (1973) model over the variable sample period from 1992 to 2012 on a monthly basis. Model (I) only considers the overall
ESG score while model (II) investigates the impact of the particular pillars. BETA, lnSIZE, lnBM and MOM are control variables with regard to beta, market capitalization, book-to-market
ratio and average returns of the last 12 months. The dependent variable is given in percentage points. Standard errors are adjusted for autocorrelation. ***, ** and * indicate a significance
level of 1%, 5% and 10%.
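As a reading aid for the coefficients in Table 7: the dependent variable is measured in percentage points per month, so a coefficient can be converted into an approximate annual effect by multiplying it by an assumed score difference and by 12. Simple (non-compounded) annualization is an assumption here, but it reproduces the 1.92% quoted in the text for a 20-point higher ASSET4 ESG score:

$$ \Delta r_{\text{annual}} \approx 12 \times \Delta ESG \times \hat{\gamma}_{ESG} = 12 \times 20 \times 0.008\% = 1.92\%. $$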

both the social and the corporate governance variables exhibit significant coefficients. Nevertheless, they point in opposite directions and the effects cancel out in sum. In general, corporate governance practices appear to have a negative impact on a firm's return. This may be driven by the companies' size, as large corporations tend to achieve higher governance ratings and lower returns than small firms (Bauer, Günster, & Otten, 2003; Humphrey, Lee, & Shen, 2012). Since we only control for the market capitalization, other aspects, such as the number of employees, could possibly contribute. The governance rating may proxy these effects.

In addition to the full sample, we consider an overlapping sample only including companies rated by all three providers. This allows us to compare the three rating concepts without being subjected to sample-specific effects. Table 8 presents the regression results. Model (I) finds a significant relationship between the ASSET4 and Bloomberg overall ESG scores and the returns. For each additional ESG point the monthly return increases by 0.016% for Bloomberg and by 0.009% for ASSET4. Concerning KLD, the overall ESG ratings now negatively influence the financial performance in the cross-section, even though not significantly. The particular pillars of KLD notably lose ground compared to the full sample. Solely the corporate governance score exhibits a small significant coefficient of 0.005%. The subcriteria of Bloomberg are fairly consistent with the standard model whereby only the social score is a significant determinant of returns. The ASSET4 coefficients are notably strengthened but the corporate

Table 8
Modified Fama-MacBeth regressions based on an overlapping sample.

ASSET4 Bloomberg KLD

(I) (II) (I) (II) (Ia) (Ib) (II)

BETA 0.052 0.076 0.095 0.051 0.092 0.094 0.160


(0.215) (0.284) (0.403) (0.224) (0.393) (0.402) (0.759)
lnSIZE −0.353*** −0.343*** −0.318*** −0.221*** −0.229*** −0.229*** −0.258***
(−6.062) (−6.028) (−4.563) (−3.818) (−4.008) (−3.978) (−3.244)
lnBM 0.315** 0.329** 0.314** 0.265** 0.294** 0.295** 0.278**
(2.231) (2.318) (2.120) (1.979) (2.165) (2.160) (2.178)
MOM 0.060 0.036 0.060 −0.014 0.052 0.052 0.045
(1.447) (0.908) (1.391) (−0.362) (1.368) (1.372) (1.307)
ESG 0.009*** 0.016*** −0.008
(8.478) (4.162) (−1.272)
ESG* −0.008
(−1.174)
ENV 0.004** 0.006 −0.010
(1.995) (0.910) (−1.164)
SOC −0.009*** 0.007** −0.001
(−2.932) (2.563) (−0.443)
GOV 0.005 0.002 0.005*
(1.652) (0.172) (1.708)
ECN 0.017***
(10.303)

This table presents the results of the adjusted Fama and MacBeth (1973) model over the variable sample period from 1992 to 2012 on a monthly basis using only observations available for
all providers. Model (I) only considers the overall ESG score while model (II) investigates the impact of the particular pillars. BETA, lnSIZE, lnBM and MOM are control variables with regard
to beta, market capitalization, book-to-market ratio and average returns of the last 12 months. The dependent variable is given in percentage points. Standard errors are adjusted for
autocorrelation. ***, ** and * indicate a significance level of 1%, 5% and 10%.

Table 9
Modified Fama-MacBeth regressions for various subperiods.

ASSET4 Bloomberg KLD

(I) (II) (I) (II) (Ia) (Ib) (II)

Panel A: 2002 to 2006


BETA 0.090 0.087 0.004 0.158 −0.247 −0.249 −0.283
(0.268) (0.263) (0.010) (0.313) (−0.662) (−0.661) (−0.728)
lnSIZE −0.467*** −0.514*** −0.442*** −0.334** −0.094 −0.095 −0.107*
(−10.428) (−9.11) (−2.878) (−2.518) (−1.500) (−1.530) (−1.769)
lnBM 0.400*** 0.404*** 0.667*** 0.610*** 0.652*** 0.653*** 0.655***
(7.386) (6.844) (5.403) (7.631) (13.427) (13.395) (13.139)
MOM 0.092*** 0.077** 0.056 −0.009 0.027 0.027 0.029
(2.937) (2.542) (0.398) (−0.082) (1.252) (1.240) (1.314)
ESG 0.009*** 0.018*** −0.012
(5.093) (2.926) (−0.991)
ESG* −0.013
(−0.949)
ENV 0.001 0.022** −0.002
(1.469) (2.011) (−0.389)
SOC 0.003*** 0.014* −0.007
(2.913) (1.906) (−1.027)
GOV −0.004*** −0.023** −0.004***
(−2.917) (−2.121) (−2.632)
ECN 0.010**
(2.641)

Panel B: 2007 to 2012


BETA −0.112 −0.019 0.128 −0.065 0.148 0.150 0.124
(−0.446) (−0.077) (0.405) (−0.209) (0.553) (0.561) (0.486)
lnSIZE −0.183*** −0.178*** −0.292*** −0.151** 0.039* 0.037* 0.047**
(−3.774) (−3.635) (−4.139) (−2.608) (1.766) (1.734) (2.185)
lnBM 0.178*** 0.176*** 0.131 0.113 0.400*** 0.400*** 0.407***
(4.340) (4.623) (1.519) (1.476) (6.968) (6.935) (6.795)
MOM 0.130*** 0.106*** 0.063 0.036 0.039 0.039 0.037
(3.213) (2.654) (1.263) (0.557) (1.211) (1.210) (1.150)
ESG 0.006** 0.013*** −0.010
(2.198) (2.890) (−1.429)
ESG* −0.011
(−1.325)
ENV 0.001 −0.002 0.004
(0.498) (−0.505) (0.642)
SOC −0.007*** 0.004 −0.014***
(−2.895) (1.518) (−4.806)
GOV −0.002 0.007 −0.000
(−0.829) (1.358) (−0.048)
ECN 0.018***
(7.554)

This table presents the results of the adjusted Fama and MacBeth (1973) model over the subperiods from 2002 to 2006 (panel A) and from 2007 to 2012 (panel B). Model (I) only considers
the overall ESG score while model (II) investigates the impact of the particular pillars. BETA, lnSIZE, lnBM and MOM are control variables with regard to beta, market capitalization, book-
to-market ratio and average returns of the last 12 months. The dependent variable is given in percentage points. Standard errors are adjusted for autocorrelation. ***, ** and * indicate a
significance level of 1%, 5% and 10%.

governance score loses statistical significance. Overall, the findings emphasize that the evidence in favor of a link between the social and financial performance is strongly dependent on the underlying sample and rating provider, consistent with the results of Dorfleitner, Halbritter, and Nguyen (2014).

We also split the full sample into two subperiods, because in the context of the ESG portfolios the implications are conditional on the sample period. Table 9 presents the estimated coefficients for the Fama and MacBeth (1973) model separately for the years from 2002 to 2006 (panel A) and from 2007 to 2012 (panel B). Consistent with the ESG portfolios, model (I) suggests a slow decline in the explanatory power of the overall ESG variables. Nevertheless, the total ESG score of ASSET4 and Bloomberg still has significant power for explaining returns in the cross-section. Model (II) reveals the determinants of the total variables. The environmental and corporate governance scores of ASSET4 become less important whereas the economic performance gathers strength. In the case of Bloomberg, all ESG subcriteria lose their statistical significance from panel A to panel B. Except for the social score, this is also valid for the KLD data. Altogether, these results indicate a slightly declining impact of sustainability issues on the financial performance.

Summarizing the findings of the cross-sectional analysis, we find significant differences between the ESG concepts of the three providers. The overall ESG scores of ASSET4 and Bloomberg both have a significant influence on the returns. These results are robust for different subperiods. This indicates that there could be a relationship between ESG ratings and returns, even if investors are not able to exploit it through an ESG portfolio strategy. As opposed to this, the overall KLD scores do not provide evidence for a link between the ESG level and the financial performance. This is consistent with Galema et al. (2008) and Manescu (2011). Considering the individual pillars shows that the impact is mainly driven by only half of the ESG indicators. Aggregating the subcriteria leads to overlaying effects while some scores even cancel each other out. In terms of the particular pillars, we do not find consistent patterns among the three data sets. This emphasizes that the rating concept choice has a significant impact on the implications. Restricting the sample to firms covered by all three providers leads to significant changes of some coefficients. Furthermore, the subperiods indicate a slowly diminishing explanatory power of ESG variables. As a robustness check we also estimate the models using pooled OLS regressions with SIC-clustered standard errors. The results are fairly similar to those of the Fama and MacBeth (1973) procedure.
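The second stage of the cross-sectional procedure, namely averaging the monthly coefficients as in Eq. (3) and forming the autocorrelation-adjusted test statistic of Eq. (4), can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, ρ̂ is assumed to be the first-order autocorrelation of the monthly coefficients, and σ is assumed to be their sample standard deviation.

```python
import math
from statistics import mean, stdev
from typing import Sequence


def first_order_autocorr(x: Sequence[float]) -> float:
    """Lag-1 autocorrelation of a series (assumed estimator for rho-hat)."""
    m = mean(x)
    denominator = sum((v - m) ** 2 for v in x)  # zero for a constant series
    numerator = sum((x[t] - m) * (x[t - 1] - m) for t in range(1, len(x)))
    return numerator / denominator


def adjusted_fm_tstat(gammas: Sequence[float]) -> float:
    """t-statistic of the mean monthly coefficient with the Eq. (4) correction."""
    T = len(gammas)
    gamma_bar = mean(gammas)                      # Eq. (3): average over the T months
    standard_error = stdev(gammas) / math.sqrt(T)  # unadjusted Fama-MacBeth standard error
    rho = first_order_autocorr(gammas)
    inflation = math.sqrt((1 + rho) / (1 - rho))   # exceeds 1 when rho is positive
    return gamma_bar / (standard_error * inflation)


# Illustrative inputs only: hypothetical monthly coefficient estimates.
t_uncorrelated = adjusted_fm_tstat([3.0, 2.0, 1.0, 2.0])           # rho-hat is 0 here
t_persistent = adjusted_fm_tstat([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])   # rho-hat is 0.5 here
```

With positively autocorrelated coefficients the correction inflates the standard error, so the adjusted statistic is smaller in absolute value than the unadjusted one; with zero autocorrelation the two coincide.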

6. Conclusion

Applying two different approaches, this paper investigates the relationship between the corporate social and financial performance based on ESG ratings. Although previous empirical literature suggests a positive link between ESG rating levels and returns, we provide a critical review due to a number of concerns. Previous research identifies significant variations in the characteristics of several ESG rating concepts. As a consequence, it is crucial to address this question using more than a single ESG data set. Furthermore, most studies rely on data ending before 2007. Given the enormous growth of the SRI market over the last decade, a current sample is essential. To gain evidence from different perspectives, we employ an ESG portfolio strategy as well as Fama and MacBeth's (1973) cross-sectional regressions.

The ESG portfolios do not show significant return differences between companies featuring high and low ESG rating levels. This applies both to the overall scores and to the particular pillars. This finding is robust to variations in portfolio cut-offs as well as weightings. A best-in-class approach using sector-specific ESG scores does not generate abnormal returns either. These results strongly argue against previous studies suggesting abnormal returns of an ESG portfolio strategy (Derwall et al., 2005; Eccles et al., 2014; Kempf & Osthoff, 2007; Statman & Glushkov, 2009). Splitting the sample into three subperiods reveals the main determinant of this apparent contradiction. The Carhart (1997) four-factor model shows an obvious decline of the preceding outperformance over recent years.

Although ESG portfolios are not able to detect a link between the social and financial performance, the Fama and MacBeth (1973) regressions suggest a significant but ambiguous influence of some ESG variables in the cross-section. Nevertheless, the influence is strongly dependent on the particular ESG rating provider. Furthermore, we do not identify a systematic pattern concerning the individual ESG dimensions among the three data sets. Although most effects are robust across a number of subperiods, the results also provide evidence of a decreasing influence of ESG variables on the returns. This is consistent with the outcomes of the Carhart (1997) four-factor model.

In summary, this study strongly questions whether there is actually a relationship between ESG ratings and returns which is exploitable with a trading strategy in the sense of the Carhart (1997) four-factor model. This result is relevant both for researchers and for investors who focus on a portfolio composition based on ESG ratings. Even if recent evidence by Edmans (2011) or Dorfleitner, Utz, and Wimmer (2014) indicates that financial benefits of a strong CSP may only become visible if the respective stocks are held for a long time, social responsibility is considered worthwhile as a category of its own by many investors, even without yielding an additional return.

References

Asness, C.S., & Frazzini, A. (2013). The devil in HML's details. Journal of Portfolio Management, 39(4), 49–68.
Bauer, R., Günster, N., & Otten, R. (2003). Empirical evidence on corporate governance. The effect on stock returns, firm value and performance. Journal of Asset Management, 5(2), 91–104.
Bauer, R., Koedijk, K., & Otten, R. (2005). International evidence on ethical mutual fund performance and investment style. Journal of Banking and Finance, 29(7), 1751–1767.
Belghitar, Y., Clark, E., & Deshmukh, N. (2014). Does it pay to be ethical? Evidence from the FTSE4Good. Journal of Banking and Finance, 47, 54–62.
Bello, Z.Y. (2005). Socially responsible investing and portfolio diversification. Journal of Financial Research, 28(1), 41–57.
Black, F., Jensen, M.C., & Scholes, M. (1972). The capital asset pricing model: Some empirical tests. In M.C. Jensen (Ed.), Studies in the Theory of Capital Markets (pp. 72–194). New York: Praeger.
Breusch, T.S., & Pagan, A.R. (1979). A simple test for heteroscedasticity and random coefficient variation. Econometrica: Journal of the Econometric Society, 1287–1294.
Carhart, M.M. (1997). On persistence in mutual fund performance. Journal of Finance, 52(1), 57–82.
Derwall, J., Guenster, N., Bauer, R., & Koedijk, K. (2005). The eco-efficiency premium puzzle. Financial Analysts Journal, 61(2), 51–63.
Dorfleitner, G., Halbritter, G., & Nguyen, M. (2014). Measuring the level and risk of corporate responsibility – An empirical comparison of different ESG rating approaches. Working Paper. Universität Regensburg.
Dorfleitner, G., Utz, S., & Wimmer, M. (2014). Patience pays off – Financial long-term benefits of sustainable management decisions. Working Paper. Universität Regensburg.
Durbin, J., & Watson, G.S. (1971). Testing for serial correlation in least squares regression iii. Biometrika, 58(1), 1–19.
Eccles, R.G., Ioannou, I., & Serafeim, G. (2014). The impact of corporate sustainability on organizational processes and performance. Management Science, 60(11), 2835–2857.
Edmans, A. (2011). Does the stock market fully value intangibles? Employee satisfaction and equity prices. Journal of Financial Economics, 101(3), 621–640.
Fama, E.F. (1991). Efficient capital markets: II. Journal of Finance, 46(5), 1575–1617.
Fama, E.F., & French, K.R. (1992). The cross-section of expected stock returns. Journal of Finance, 47(2), 427–465.
Fama, E.F., & French, K.R. (1993). Common risk factors in the returns on stocks and bonds. Journal of Financial Economics, 33(1), 3–56.
Fama, E.F., & French, K.R. (2002). Testing trade-off and pecking order predictions about dividends and debt. Review of Financial Studies, 15(1), 1–33.
Fama, E.F., & MacBeth, J.D. (1973). Risk, return, and equilibrium: Empirical tests. Journal of Political Economy, 81(3), 607–636.
Galema, R., Plantinga, A., & Scholtens, B. (2008). The stocks at stake: Return and risk in socially responsible investment. Journal of Banking and Finance, 32(12), 2646–2654.
Hamilton, S., Jo, H., & Statman, M. (1993). Doing well while doing good? The investment performance of socially responsible mutual funds. Financial Analysts Journal, 49(6), 62–66.
Hong, H., & Kacperczyk, M. (2009). The price of sin: The effects of social norms on markets. Journal of Financial Economics, 93(1), 15–36.
Humphrey, J.E., Lee, D.D., & Shen, Y. (2012). Does it cost to be sustainable? Journal of Corporate Finance, 18(3), 626–639.
Kempf, A., & Osthoff, P. (2007). The effect of socially responsible investing on portfolio performance. European Financial Management, 13(5), 908–922.
Kreander, N., Gray, R., Power, D., & Sinclair, C. (2005). Evaluating the performance of ethical and non-ethical funds: A matched pair analysis. Journal of Business Finance and Accounting, 32(7/8), 1465–1493.
Lee, D.D., & Faff, R.W. (2009). Corporate sustainability performance and idiosyncratic risk: A global perspective. Financial Review, 44(2), 213–237.
Lee, D.D., Faff, R.W., & Rekker, S.A. (2013). Do high and low-ranked sustainability stocks perform differently? International Journal of Accounting and Information Management, 21(2), 116–132.
Manescu, C. (2011). Stock returns in relation to environmental, social and governance performance: Mispricing or compensation for risk? Sustainable Development, 19(2), 95–118.
Newey, W.K., & West, K.D. (1987). A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix. Econometrica, 55(3), 703–708.
Orlitzky, M., Schmidt, F.L., & Rynes, S.L. (2003). Corporate social and financial performance: A meta-analysis. Organization Studies, 24(3), 403–441.
Petersen, M.A. (2009). Estimating standard errors in finance panel data sets: Comparing approaches. Review of Financial Studies, 22(1), 435–480.
Sauer, D.A. (1997). The impact of social-responsibility screens on investment performance: Evidence from the Domini 400 social index and Domini equity mutual fund. Review of Financial Economics, 6(2), 137–149.
Schröder, M. (2004). The performance of socially responsible investments: Investment funds and indices. Financial Markets and Portfolio Management, 18(2), 122–142.
Schröder, M. (2007). Is there a difference? The performance characteristics of SRI equity indices. Journal of Business Finance and Accounting, 34(1), 331–348.
Statman, M. (2000). Socially responsible mutual funds. Financial Analysts Journal, 56(3), 30–39.
Statman, M. (2006). Socially responsible indexes: Composition, performance, and tracking error. Journal of Portfolio Management, 32(3), 100–109.
Statman, M., & Glushkov, D. (2009). The wages of social responsibility. Financial Analysts Journal, 65(4), 33–46.
USSIF (2012). Report on sustainable and responsible investing trends in the United States. Forum for Sustainable and Responsible Investment.
Utz, S., & Wimmer, M. (2014). Are they any good at all? A financial and ethical analysis of socially responsible mutual funds. Journal of Asset Management, 15(1), 72–82.
Wallis, M. v., & Klein, C. (2014). Ethical requirement and financial interest: A literature review on socially responsible investing. Business Research (http://dx.doi.org/10.1007/s40685-014-0015-7, in press).
Wooldridge, J.M. (2002). Econometric analysis of cross section and panel data. MIT Press.
