Tobit model: Difference between revisions

Revision as of 17:19, 10 June 2019

The Tobit model is a statistical model proposed by James Tobin (1958) to describe the relationship between a non-negative dependent variable $y_{i}$ and an independent variable (or vector) $x_{i}$.^[1] The term Tobit was derived from Tobin's name by truncating it and adding -it, by analogy with the probit model.^[2] The Tobit model is distinct from the truncated regression model, in which observations beyond the limit are dropped from the sample rather than recorded at the limit, and which therefore requires a different estimator.^[3]

The model supposes that there is a latent (i.e. unobservable) variable $y_{i}^{*}$. This variable depends linearly on $x_{i}$ via a parameter (vector) $\beta$, which determines the relationship between the independent variable (or vector) $x_{i}$ and the latent variable $y_{i}^{*}$ (just as in a linear model). In addition, there is a normally distributed error term $u_{i}$ to capture random influences on this relationship. The observable variable $y_{i}$ is defined as the ramp function of the latent variable: equal to the latent variable whenever the latent variable is above zero, and zero otherwise.

y_{i}={\begin{cases}y_{i}^{*}&{\text{if }}y_{i}^{*}>0,\\0&{\text{if }}y_{i}^{*}\leq 0,\end{cases}}

where $y_{i}^{*}$ is a latent variable:

y_{i}^{*}=\beta x_{i}+u_{i},\qquad u_{i}\sim N(0,\sigma ^{2}).\,
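The definition can be illustrated by simulating the data-generating process. This is a minimal sketch: the function name `simulate_tobit`, the scalar regressor, its uniform distribution, and the parameter values are all assumptions made for the example, not part of the model.

```python
import random

def simulate_tobit(n, beta, sigma, seed=0):
    """Draw (x, y_star, y) triples from a scalar Tobit model censored at zero."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(-2.0, 2.0)   # illustrative regressor distribution (assumption)
        u = rng.gauss(0.0, sigma)    # normally distributed error term u_i
        y_star = beta * x + u        # latent variable y_i^*
        y = max(0.0, y_star)         # observed variable: ramp function of y_i^*
        rows.append((x, y_star, y))
    return rows

rows = simulate_tobit(n=1000, beta=1.0, sigma=1.0)
share_censored = sum(1 for _, _, y in rows if y == 0.0) / len(rows)
print(f"share of observations censored at zero: {share_censored:.2f}")
```

In the simulated data the latent values are known, which is never the case in practice; only the pairs of `x` and `y` would be available to a researcher.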

Etymology

When asked why it was called the "Tobit" model, instead of Tobin, James Tobin explained that this term was introduced by Arthur Goldberger, either as a portmanteau of "Tobin probit", or as a reference to The Caine Mutiny, a novel by Tobin's friend Herman Wouk, in which Tobin makes a cameo as "Mr Tobit". Tobin reports having actually asked Goldberger which it was, and the man refused to say.^[4]

Consistency

If the relationship parameter $\beta$ is estimated by regressing the observed $y_{i}$ on $x_{i}$ , the resulting ordinary least squares regression estimator is inconsistent. It will yield a downwards-biased estimate of the slope coefficient and an upward-biased estimate of the intercept. Takeshi Amemiya (1973) has proven that the maximum likelihood estimator suggested by Tobin for this model is consistent.^[5]
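The inconsistency of least squares can be seen in a small simulation. The sketch below assumes a scalar regressor, a true intercept of 0 and a true slope of 1; the helper `ols_line` and all numerical values are illustrative choices, not taken from the cited sources.

```python
import random

def ols_line(xs, ys):
    """Return (intercept, slope) of a simple least-squares fit of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

rng = random.Random(1)
beta_true, sigma = 1.0, 1.0                       # illustrative values (assumption)
xs = [rng.uniform(-2.0, 2.0) for _ in range(5000)]
y_star = [beta_true * x + rng.gauss(0.0, sigma) for x in xs]
y_obs = [max(0.0, v) for v in y_star]             # censoring at zero

a_latent, b_latent = ols_line(xs, y_star)         # infeasible: uses the latent variable
a_obs, b_obs = ols_line(xs, y_obs)                # feasible: uses the censored variable
print(f"latent fit:   intercept={a_latent:.2f}, slope={b_latent:.2f}")
print(f"censored fit: intercept={a_obs:.2f}, slope={b_obs:.2f}")
```

Under these assumptions the fit on the censored variable has a flatter slope and a higher intercept than the fit on the latent variable, and increasing the sample size does not remove the gap.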

Interpretation

The $\beta$ coefficient should not be interpreted as the effect of $x_{i}$ on $y_{i}$, as one would with a linear regression model; this is a common error. Instead, the effect of $x_{i}$ on the expected value of $y_{i}$ is the combination of (1) the change in $y_{i}$ of those above the limit, weighted by the probability of being above the limit; and (2) the change in the probability of being above the limit, weighted by the expected value of $y_{i}$ if above.^[6]
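The two components can be computed explicitly for the model censored at zero. The sketch below assumes a scalar regressor and uses the standard expressions for a normal variable censored at zero, $E[y\mid x]=\Phi(z)x\beta+\sigma\varphi(z)$ with $z=x\beta/\sigma$; the function names and the numerical values are illustrative.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def expected_y(x, beta, sigma):
    """E[y | x] for a Tobit model censored from below at zero."""
    z = x * beta / sigma
    return _STD.cdf(z) * x * beta + sigma * _STD.pdf(z)

def decomposition(x, beta, sigma):
    """Return the two components of dE[y | x]/dx and their sum."""
    z = x * beta / sigma
    prob_above = _STD.cdf(z)                       # P(y > 0 | x)
    lam = _STD.pdf(z) / prob_above                 # inverse Mills ratio
    mean_above = x * beta + sigma * lam            # E[y | y > 0, x]
    d_mean_above = beta * (1.0 - z * lam - lam ** 2)
    d_prob_above = beta * _STD.pdf(z) / sigma
    part1 = prob_above * d_mean_above              # change above the limit, weighted by probability
    part2 = mean_above * d_prob_above              # change in probability, weighted by mean above
    return part1, part2, part1 + part2

part1, part2, total = decomposition(x=0.5, beta=2.0, sigma=1.5)
print(f"component 1: {part1:.4f}, component 2: {part2:.4f}, total: {total:.4f}")
```

The two components sum to $\beta \Phi(x\beta/\sigma)$, which is smaller in magnitude than $\beta$ whenever some probability mass lies at the limit.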

Variations of the Tobit model

Variations of the Tobit model can be produced by changing where and when censoring occurs. Amemiya (1985, p. 384) classifies these variations into five categories (Tobit type I – Tobit type V), where Tobit type I stands for the first model described above. Schnedler (2005) provides a general formula to obtain consistent likelihood estimators for these and other variations of the Tobit model.^[7]

Type I

The Tobit model is a special case of a censored regression model, because the latent variable $y_{i}^{*}$ cannot always be observed while the independent variable $x_{i}$ is observable. A common variation of the Tobit model is censoring at a value $y_{L}$ different from zero:

y_{i}={\begin{cases}y_{i}^{*}&{\text{if }}y_{i}^{*}>y_{L},\\y_{L}&{\text{if }}y_{i}^{*}\leq y_{L}.\end{cases}}

Another example is censoring of values above $y_{U}$ .

y_{i}={\begin{cases}y_{i}^{*}&{\text{if }}y_{i}^{*}<y_{U},\\y_{U}&{\text{if }}y_{i}^{*}\geq y_{U}.\end{cases}}

Yet another model results when $y_{i}$ is censored from above and below at the same time.

y_{i}={\begin{cases}y_{i}^{*}&{\text{if }}y_{L}<y_{i}^{*}<y_{U},\\y_{L}&{\text{if }}y_{i}^{*}\leq y_{L},\\y_{U}&{\text{if }}y_{i}^{*}\geq y_{U}.\end{cases}}
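The three censoring rules of this section can be written as one mapping from the latent to the observed value. This is a minimal sketch; the function name `censor` and its optional arguments are assumptions made for illustration, and it presumes that the lower limit is below the upper limit when both are given.

```python
def censor(y_star, lower=None, upper=None):
    """Map a latent value to its observed value under optional censoring limits."""
    if lower is not None and y_star <= lower:
        return lower          # censored from below at y_L
    if upper is not None and y_star >= upper:
        return upper          # censored from above at y_U
    return y_star             # uncensored: the latent value is observed

print(censor(-1.2, lower=0.0))             # censoring from below
print(censor(7.5, upper=5.0))              # censoring from above
print(censor(2.0, lower=0.0, upper=5.0))   # inside both limits
```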

The rest of the models will be presented as being bounded from below at 0, though this can be generalized as done for Type I.

Type II

Type II Tobit models introduce a second latent variable.^[8]

y_{2i}={\begin{cases}y_{2i}^{*}&{\text{if }}y_{1i}^{*}>0,\\0&{\text{if }}y_{1i}^{*}\leq 0.\end{cases}}

In Type I Tobit, the latent variable absorbs both the process of participation and the outcome of interest. Type II Tobit allows the process of participation (selection) and the outcome of interest to be independent, conditional on observable data.

The Heckman selection model falls into the Type II Tobit,^[9] which is sometimes called Heckit after James Heckman.^[10]

Type III

Type III introduces a second observed dependent variable.

y_{1i}={\begin{cases}y_{1i}^{*}&{\text{if }}y_{1i}^{*}>0,\\0&{\text{if }}y_{1i}^{*}\leq 0.\end{cases}}

y_{2i}={\begin{cases}y_{2i}^{*}&{\text{if }}y_{1i}^{*}>0,\\0&{\text{if }}y_{1i}^{*}\leq 0.\end{cases}}

The Heckman model falls into this type.

Type IV

Type IV introduces a third observed dependent variable and a third latent variable.

y_{1i}={\begin{cases}y_{1i}^{*}&{\text{if }}y_{1i}^{*}>0,\\0&{\text{if }}y_{1i}^{*}\leq 0.\end{cases}}

y_{2i}={\begin{cases}y_{2i}^{*}&{\text{if }}y_{1i}^{*}>0,\\0&{\text{if }}y_{1i}^{*}\leq 0.\end{cases}}

y_{3i}={\begin{cases}y_{3i}^{*}&{\text{if }}y_{1i}^{*}>0,\\0&{\text{if }}y_{1i}^{*}\leq 0.\end{cases}}

Type V

Similar to Type II, in Type V only the sign of $y_{1i}^{*}$ is observed.

y_{2i}={\begin{cases}y_{2i}^{*}&{\text{if }}y_{1i}^{*}>0,\\0&{\text{if }}y_{1i}^{*}\leq 0.\end{cases}}

y_{3i}={\begin{cases}y_{3i}^{*}&{\text{if }}y_{1i}^{*}\leq 0,\\0&{\text{if }}y_{1i}^{*}>0.\end{cases}}

The likelihood function

Below are the likelihood and log likelihood functions for a type I Tobit. This is a Tobit that is censored from below at $y_{L}$ when the latent variable $y_{j}^{*}\leq y_{L}$ . In writing out the likelihood function, we first define an indicator function $I(y_{j})$ where:

I(y_{j})={\begin{cases}0&{\text{if }}y_{j}\leq y_{L},\\1&{\text{if }}y_{j}>y_{L}.\end{cases}}

Next, let $\Phi$ be the standard normal cumulative distribution function and $\varphi$ be the standard normal probability density function. For a data set with N observations the likelihood function for a type I Tobit is

{\mathcal {L}}(\beta ,\sigma )=\prod _{j=1}^{N}\left({\frac {1}{\sigma }}\varphi \left({\frac {y_{j}-X_{j}\beta }{\sigma }}\right)\right)^{I\left(y_{j}\right)}\left(1-\Phi \left({\frac {X_{j}\beta -y_{L}}{\sigma }}\right)\right)^{1-I\left(y_{j}\right)}

and the log likelihood is given by

\log {\mathcal {L}}(\beta ,\sigma )=\sum _{j=1}^{N}I(y_{j})\log \left({\frac {1}{\sigma }}\varphi \left({\frac {y_{j}-X_{j}\beta }{\sigma }}\right)\right)+(1-I(y_{j}))\log \left(1-\Phi \left({\frac {X_{j}\beta -y_{L}}{\sigma }}\right)\right)

Note that this is different from the likelihood function of the truncated regression model.^[3]
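The log likelihood above translates directly into code. The sketch below assumes a scalar regressor without an intercept; the function name `tobit_loglik`, the simulated data, and the parameter values are illustrative, and no optimizer is included.

```python
import math
import random
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def tobit_loglik(beta, sigma, xs, ys, y_lower=0.0):
    """Log likelihood of a scalar type I Tobit censored from below at y_lower."""
    total = 0.0
    for x, y in zip(xs, ys):
        if y > y_lower:
            # I(y_j) = 1: log of (1/sigma) * phi((y_j - x_j*beta) / sigma)
            z = (y - x * beta) / sigma
            total += -0.5 * z * z - 0.5 * math.log(2.0 * math.pi) - math.log(sigma)
        else:
            # I(y_j) = 0: log of 1 - Phi((x_j*beta - y_L) / sigma),
            # written as Phi((y_L - x_j*beta) / sigma) by symmetry
            p = _STD.cdf((y_lower - x * beta) / sigma)
            total += math.log(max(p, 1e-300))  # guard against log(0)
    return total

# Simulated data with illustrative true values beta = 1, sigma = 1
rng = random.Random(2)
xs = [rng.uniform(-2.0, 2.0) for _ in range(2000)]
ys = [max(0.0, 1.0 * x + rng.gauss(0.0, 1.0)) for x in xs]

ll_true = tobit_loglik(1.0, 1.0, xs, ys)
ll_wrong = tobit_loglik(0.3, 1.0, xs, ys)
print(f"log likelihood at beta=1.0: {ll_true:.1f}; at beta=0.3: {ll_wrong:.1f}")
```

A maximum likelihood estimate would be obtained by maximizing this function over `beta` and `sigma` with a numerical optimizer.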

Non-parametric version

If the underlying latent variable $y_{i}^{*}$ is not normally distributed, one must use quantiles instead of moments to analyze the observable variable $y_{i}$ . Powell's CLAD estimator offers a possible way to achieve this.^[11]
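The censored least absolute deviations (CLAD) idea can be sketched as minimizing the sum of absolute deviations between $y_{i}$ and $\max(0,x_{i}\beta )$. The example below assumes a scalar regressor without an intercept and Laplace-distributed errors; the grid search is a simple stand-in chosen for brevity and is not the computational method used in practice.

```python
import random

def clad_objective(beta, xs, ys):
    """Sum of absolute deviations between y and the censored index max(0, x*beta)."""
    return sum(abs(y - max(0.0, x * beta)) for x, y in zip(xs, ys))

def clad_grid_search(xs, ys, grid):
    """Return the grid value of beta with the smallest CLAD objective."""
    return min(grid, key=lambda b: clad_objective(b, xs, ys))

rng = random.Random(3)
beta_true = 1.0                                   # illustrative value (assumption)
xs = [rng.uniform(-1.0, 3.0) for _ in range(2000)]
# Laplace errors (median zero, not normal): difference of two exponential draws
errors = [rng.expovariate(1.0) - rng.expovariate(1.0) for _ in xs]
ys = [max(0.0, beta_true * x + u) for x, u in zip(xs, errors)]

grid = [k / 100.0 for k in range(0, 201)]         # candidate values 0.00, 0.01, ..., 2.00
beta_clad = clad_grid_search(xs, ys, grid)
print(f"CLAD estimate on the grid: {beta_clad:.2f}")
```

The approach relies on the conditional median of $y_{i}$ being $\max(0,x_{i}\beta )$ when the error has median zero, which does not require normality.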

Dynamic unobserved effects Tobit model

In a panel data Tobit model,^[12]^[13] if the outcome $y_{i,t}$ partially depends on the previous outcome history $y_{i,0},\ldots ,y_{i,t-1}$, the model is called "dynamic". For instance, a person who finds a job with a high salary this year will find it easier to get a job with a high salary next year, because holding a high-wage job this year is a very positive signal to potential employers. The essence of this type of dynamic effect is the state dependence of the outcome. The "unobserved effects" here are the factors that partially determine an individual's outcome but cannot be observed in the data. For instance, the ability of a person is very important in job-hunting, but it is not observable for researchers. A typical dynamic unobserved effects Tobit model can be represented as

y_{i,t}=y_{i,t}^{*}\,1[y_{i,t}^{*}>0];

y_{i,t}^{*}=z_{i,t}\delta +\rho y_{i,t-1}+c_{i}+u_{i,t};

c_{i}\mid y_{i,0},x_{i}\sim F(y_{i,0},x_{i});

u_{i,t}\mid z_{i,t},y_{i,0},\ldots ,y_{i,t-1}\sim N(0,1).

In this specific model, $\rho y_{i,t-1}$ is the dynamic effect part and $c_{i}$ is the unobserved effect part whose distribution is determined by the initial outcome of individual i and some exogenous features of individual i.

Based on this setup, the likelihood function conditional on $\{y_{i,0}\}_{i=1}^{N}$ can be given as

\prod _{i=1}^{N}\int f_{\theta }(c_{i}\mid y_{i,0},x_{i})\left[\prod _{t=1}^{T}{\Bigl (}1[y_{i,t}=0]\left[1-\Phi (z_{i,t}\delta +\rho y_{i,t-1}+c_{i})\right]+1[y_{i,t}>0]\,\varphi (y_{i,t}-z_{i,t}\delta -\rho y_{i,t-1}-c_{i}){\Bigr )}\right]\,dc_{i}

There are two ways to treat the initial values $\{y_{i,0}\}_{i=1}^{N}$ in the construction of the likelihood function: treating them as constants, or imposing a distribution on them and computing the unconditional likelihood function. Whichever way is chosen, the integration inside the likelihood function cannot be avoided when estimating the model by maximum likelihood estimation (MLE). The expectation-maximization (EM) algorithm is usually a good solution for this computational issue.^[14] Based on the consistent point estimates from MLE, the average partial effect (APE)^[15] can be calculated correspondingly.^[16]
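To make the integration concrete, the sketch below evaluates one individual's likelihood contribution by numerical quadrature over $c_{i}$. It is not the EM approach mentioned above. It assumes scalar $z_{i,t}$, a unit-variance normal error (so that a zero outcome contributes $1-\Phi (\cdot )$ and a positive outcome contributes $\varphi (y_{i,t}-\cdot )$), and, purely for illustration, a normal distribution for $c_{i}$ with mean `c_mean` and standard deviation `c_sd`; in an application these could depend on $y_{i,0}$ and $x_{i}$. All function and parameter names are hypothetical.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution

def period_density(y, index):
    """One period's contribution given index = z*delta + rho*y_lag + c and u ~ N(0, 1)."""
    if y == 0.0:
        return _STD.cdf(-index)     # P(y* <= 0) = 1 - Phi(index)
    return _STD.pdf(y - index)      # density of an uncensored outcome

def individual_likelihood(ys, zs, y0, delta, rho, c_mean, c_sd, n_nodes=400):
    """Integrate the product over periods against an assumed N(c_mean, c_sd^2) density for c_i."""
    heterogeneity = NormalDist(c_mean, c_sd)
    lower = c_mean - 6.0 * c_sd                   # truncate the integral at +/- 6 sd
    step = 12.0 * c_sd / n_nodes
    total = 0.0
    for k in range(n_nodes):
        c = lower + (k + 0.5) * step              # midpoint rule node
        product = 1.0
        y_lag = y0
        for y, z in zip(ys, zs):
            product *= period_density(y, z * delta + rho * y_lag + c)
            y_lag = y                             # state dependence: this outcome feeds the next period
        total += heterogeneity.pdf(c) * product * step
    return total

lik = individual_likelihood(ys=[0.0, 1.3, 0.7], zs=[1.0, 0.5, -0.2], y0=0.5,
                            delta=0.3, rho=0.2, c_mean=0.1, c_sd=0.8)
print(f"likelihood contribution of one individual: {lik:.6f}")
```

The full likelihood would be the product of such contributions over individuals, so every evaluation during maximization requires one integral per individual.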

Applications

Tobit models have, for example, been applied to estimate factors that impact grant receipt, including financial transfers distributed to sub-national governments who may apply for these grants. In these cases, grant recipients cannot receive negative amounts, and the data is thus left-censored. For instance, Dahlberg and Johansson (2002)^[17] analyse a sample of 115 municipalities (42 of which received a grant). Dubois and Fattore (2011)^[18] use a Tobit model to investigate the role of various factors in European Union fund receipt by applying Polish sub-national governments. The data may however be left-censored at a point higher than zero, with the risk of mis-specification. Both studies apply Probit and other models to check for robustness. Tobit models have also been applied in demand analysis to accommodate observations with zero expenditures on some goods. In a related application of Tobit models, a system of nonlinear Tobit regression models has been used to jointly estimate a brand demand system with homoscedastic, heteroscedastic and generalized heteroscedastic variants.^[19]

References

^ Tobin, James (1958). "Estimation of relationships for limited dependent variables". Econometrica. 26 (1): 24–36. doi:10.2307/1907382. JSTOR 1907382.
^ International Encyclopedia of the Social Sciences (2008)
^ ^a ^b Park, B. U.; Simar, L.; Zelenyuk, V. (2008). "Local Likelihood Estimation of Truncated Regression and its Partial Derivatives: Theory and Application". Journal of Econometrics. 146 (1): 185–198. doi:10.1016/j.jeconom.2008.08.007.
^ The ET Interview: Professor James Tobin
^ Amemiya, Takeshi (1973). "Regression analysis when the dependent variable is truncated normal". Econometrica. 41 (6): 997–1016. doi:10.2307/1914031. JSTOR 1914031.
^ McDonald, John F.; Moffitt, Robert A. (1980). "The Uses of Tobit Analysis". The Review of Economics and Statistics. 62 (2): 318–321. doi:10.2307/1924766. JSTOR 1924766.
^ Schnedler, Wendelin (2005). "Likelihood estimation for censored random vectors". Econometric Reviews. 24 (2): 195–217. doi:10.1081/ETC-200067925.
^ Amemiya, Takeshi (1985). Advanced econometrics. Cambridge, Mass: Harvard University Press. p. 384. ISBN 0-674-00560-0. OCLC 11728277.
^ Heckman, James J. (1979). "Sample Selection Bias as a Specification Error". Econometrica. 47 (1): 153–161. doi:10.2307/1912352. ISSN 0012-9682. JSTOR 1912352.
^ Sigelman, Lee; Zeng, Langche (1999). "Analyzing Censored and Sample-Selected Data with Tobit and Heckit Models". Political Analysis. 8 (2): 167–182. doi:10.1093/oxfordjournals.pan.a029811. ISSN 1047-1987. JSTOR 25791605.
^ Powell, James L (1 July 1984). "Least absolute deviations estimation for the censored regression model". Journal of Econometrics. 25 (3): 303–325. CiteSeerX 10.1.1.461.4302. doi:10.1016/0304-4076(84)90004-6.
^ Greene, W. H. (2003). Econometric Analysis. Upper Saddle River, NJ: Prentice Hall.
^ The model framework comes from Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, Mass: MIT Press. p. 542. The model is presented here in a more general form.
^ For more details, refer to: Cappé, O.; Moulines, E.; Ryden, T. (2005). "Part II: Parameter Inference". Inference in Hidden Markov Models. New York: Springer-Verlag.
^ Wooldridge, J. (2002). Econometric Analysis of Cross Section and Panel Data. Cambridge, Mass: MIT Press. p. 22.
^ For more details, refer to: Amemiya, Takeshi (1984). "Tobit models: A survey". Journal of Econometrics. 24 (1–2): 3–61. doi:10.1016/0304-4076(84)90074-5.
^ Dahlberg, Matz; Johansson, Eva (2002-03-01). "On the Vote-Purchasing Behavior of Incumbent Governments". American Political Science Review. null (1): 27–40. CiteSeerX 10.1.1.198.4112. doi:10.1017/S0003055402004215. ISSN 1537-5943.
^ Dubois, Hans F. W.; Fattore, Giovanni (2011-07-01). "Public Fund Assignment through Project Evaluation". Regional & Federal Studies. 21 (3): 355–374. doi:10.1080/13597566.2011.578827. ISSN 1359-7566.
^ Baltas, George (2001). "Utility-consistent Brand Demand Systems with Endogenous Category Consumption: Principles and Marketing Applications". Decision Sciences. 32 (3): 399–422. doi:10.1111/j.1540-5915.2001.tb00965.x. ISSN 0011-7315.

@@ Line 63: / Line 63: @@
 ===Type II===
 {{mergefrom|Generalized Tobit|discuss=Talk:Tobit model#Generalized Tobit merge|section=yes|date=May 2019}}
-Type II Tobit models introduce a second latent variable.<ref>{{cite book | last=Amemiya | first=Takeshi | title=Advanced econometrics | publisher=Harvard University Press | publication-place=Cambridge, Mass | year=1985 | isbn=0-674-00560-0 | oclc=11728277 | page=384}}</ref>
+Type II Tobit models introduce a second latent variable.<ref>{{cite book | last=Amemiya | first=Takeshi | title=Advanced econometrics | publisher=Harvard University Press | location=Cambridge, Mass | year=1985 | isbn=0-674-00560-0 | oclc=11728277 | page=384}}</ref>
 : <math> y_{2i} = \begin{cases}
@@ Line 72: / Line 72: @@
 In Type I Tobit, the latent variable absorbs both the process of participation and the outcome of interest. Type II Tobit allows the process of participation (selection) and the outcome of interest to be independent, conditional on observable data.
-The [[Heckman correction|Heckman selection model]] falls into the Type II Tobit,<ref name="Heckman1979">{{cite journal|last1=Heckman|first1=James J.|title=Sample Selection Bias as a Specification Error|journal=Econometrica|volume=47|issue=1|year=1979|pages=153|issn=00129682|doi=10.2307/1912352|jstor=1912352}}</ref> which is sometimes called Heckit after [[James Heckman]].<ref>{{cite journal|last1=Sigelman|first1=Lee|last2=Zeng|first2=Langche|title=Analyzing Censored and Sample-Selected Data with Tobit and Heckit Models|journal=Political Analysis|volume=8|issue=2|year=1999|pages=167–182|issn=1047-1987|doi=10.1093/oxfordjournals.pan.a029811|jstor=25791605}}</ref>
+The [[Heckman correction|Heckman selection model]] falls into the Type II Tobit,<ref name="Heckman1979">{{cite journal|last1=Heckman|first1=James J.|title=Sample Selection Bias as a Specification Error|journal=Econometrica|volume=47|issue=1|year=1979|pages=153–161|issn=00129682|doi=10.2307/1912352|jstor=1912352}}</ref> which is sometimes called Heckit after [[James Heckman]].<ref>{{cite journal|last1=Sigelman|first1=Lee|last2=Zeng|first2=Langche|title=Analyzing Censored and Sample-Selected Data with Tobit and Heckit Models|journal=Political Analysis|volume=8|issue=2|year=1999|pages=167–182|issn=1047-1987|doi=10.1093/oxfordjournals.pan.a029811|jstor=25791605}}</ref>
 ===Type III===
@@ Line 155: / Line 155: @@
 == Applications ==
-Tobit models have, for example, been applied to estimate factors that impact grant receipt, including financial transfers distributed to sub-national governments who may apply for these grants.  In these cases, grant recipients cannot receive negative amounts, and the data is thus left-censored. For instance, Dahlberg and Johansson (2002)<ref>{{Cite journal|last=Dahlberg|first=Matz|last2=Johansson|first2=Eva|date=2002-03-01|title=On the Vote-Purchasing Behavior of Incumbent Governments|url=http://journals.cambridge.org/article_S0003055402004215|journal=American Political Science Review|volume=null|issue=1|pages=27–40|doi=10.1017/S0003055402004215|issn=1537-5943|citeseerx=10.1.1.198.4112}}</ref> analyse a sample of 115 municipalities (42 of which received a grant). Dubois and Fattore (2011)<ref>{{Cite journal|last=Dubois|first=Hans F. W.|last2=Fattore|first2=Giovanni|date=2011-07-01|title=Public Fund Assignment through Project Evaluation|journal=Regional & Federal Studies|volume=21|issue=3|pages=355–374|doi=10.1080/13597566.2011.578827|issn=1359-7566}}</ref> use a Tobit model to investigate the role of various factors in European Union fund receipt by applying Polish sub-national governments. The data may however be left-censored at a point higher than zero, with the risk of mis-specification. Both studies apply Probit and other models to check for robustness. Tobit models have also been applied in demand analysis to accommodate observations with zero expenditures on some goods. 
In a related application of Tobit models, a system of nonlinear Tobit regressions models has been used to jointly estimate a brand demand system with homoscedastic, heteroscedastic and generalized heteroscedastic variants.<ref>{{Cite journal|last=Baltas|first=George|date=2001|title=Utility-consistent Brand Demand Systems with Endogenous Category Consumption: Principles and Marketing Applications|journal=Decision Sciences|language=en|volume=32|issue=3|pages=399–422|doi=10.1111/j.1540-5915.2001.tb00965.x|issn=0011-7315}}</ref>
+Tobit models have, for example, been applied to estimate factors that impact grant receipt, including financial transfers distributed to sub-national governments who may apply for these grants.  In these cases, grant recipients cannot receive negative amounts, and the data is thus left-censored. For instance, Dahlberg and Johansson (2002)<ref>{{Cite journal|last=Dahlberg|first=Matz|last2=Johansson|first2=Eva|date=2002-03-01|title=On the Vote-Purchasing Behavior of Incumbent Governments|journal=American Political Science Review|volume=null|issue=1|pages=27–40|doi=10.1017/S0003055402004215|issn=1537-5943|citeseerx=10.1.1.198.4112}}</ref> analyse a sample of 115 municipalities (42 of which received a grant). Dubois and Fattore (2011)<ref>{{Cite journal|last=Dubois|first=Hans F. W.|last2=Fattore|first2=Giovanni|date=2011-07-01|title=Public Fund Assignment through Project Evaluation|journal=Regional & Federal Studies|volume=21|issue=3|pages=355–374|doi=10.1080/13597566.2011.578827|issn=1359-7566}}</ref> use a Tobit model to investigate the role of various factors in European Union fund receipt by applying Polish sub-national governments. The data may however be left-censored at a point higher than zero, with the risk of mis-specification. Both studies apply Probit and other models to check for robustness. Tobit models have also been applied in demand analysis to accommodate observations with zero expenditures on some goods. In a related application of Tobit models, a system of nonlinear Tobit regressions models has been used to jointly estimate a brand demand system with homoscedastic, heteroscedastic and generalized heteroscedastic variants.<ref>{{Cite journal|last=Baltas|first=George|date=2001|title=Utility-consistent Brand Demand Systems with Endogenous Category Consumption: Principles and Marketing Applications|journal=Decision Sciences|language=en|volume=32|issue=3|pages=399–422|doi=10.1111/j.1540-5915.2001.tb00965.x|issn=0011-7315}}</ref>
 ==See also==
