Econometrics of Event Studies

S. P. Kothari and Jerold B. Warner. Forthcoming in B. Espen Eckbo (ed.), Handbook of Corporate Finance: Empirical Corporate Finance, Volume A (Handbooks in Finance Series, Elsevier/North-Holland), Ch. 1, 2006.
S.P. Kothari
Sloan School of Management, MIT

Jerold B. Warner
William E. Simon Graduate School of Business Administration, University of Rochester

May 19, 2006

Key words: Event study, abnormal returns, short-horizon tests, long-horizon tests, cross-sectional tests, risk adjustment

This article will appear in the Handbook of Corporate Finance: Empirical Corporate Finance (Elsevier/North-Holland), which is edited by B. Espen Eckbo. We thank Espen Eckbo, Jon Lewellen, Adam Kolasinski, and Jay Ritter for insightful comments, and Irfan Safdar and Alan Wancier for research assistance.
ABSTRACT

The number of published event studies exceeds 500, and the literature continues to grow. We provide an overview of event study methods. Short-horizon methods are quite reliable. While long-horizon methods have improved, serious limitations remain. A challenge is to continue to refine long-horizon methods. We present new evidence illustrating that properties of event study methods can vary by calendar time period and can depend on event sample firm characteristics such as volatility. This reinforces the importance of using stratified samples to examine event study statistical properties.
Table of Contents

1. Introduction and Background
2. The Event Study Literature
   2.1 The stock and flow of event studies
   2.2 Changes in event study methods: the big picture
3. Characterizing Event Study Methods
   3.1 An event study: the model
   3.2 Statistical and economic hypotheses
   3.3 Sampling distributions and test statistics
   3.4 Criteria for "reliable" event study tests
   3.5 Determining specification and power
   3.6 A quick survey of our knowledge
   3.7 Cross-sectional tests
4. Long-Horizon Event Studies
   4.1 Background
   4.2 Risk adjustment and expected returns
   4.3 Approaches to Abnormal Performance Measurement
   4.4 Significance tests for BHAR and Jensen-alpha measures
      4.4.1 Skewness
      4.4.2 Cross-correlation
      4.4.3 The bottom line
1. Introduction and Background

This chapter focuses on the design and statistical properties of event study methods. Event studies examine the behavior of firms' stock prices around corporate events.1 A vast literature written over the past several decades has become an important part of financial economics. Prior to that time, "there was little evidence on the central issues of corporate finance. Now we are overwhelmed with results, mostly from event studies" (Fama, 1991, p. 1600). In a corporate context, the usefulness of event studies arises from the fact that the magnitude of abnormal performance at the time of an event provides a measure of the (unanticipated) impact of this type of event on the wealth of the firms' claimholders. Thus, event studies focusing on announcement effects over a short horizon around an event provide evidence relevant for understanding corporate policy decisions.

1 We discuss event studies that focus only on the mean stock price effects. Many other types of event studies also appear in the literature, including event studies that examine return variances (e.g., Beaver, 1968, and Patell, 1976), trading volume (e.g., Beaver, 1968, and Campbell and Wasley, 1996), operating (accounting) performance (e.g., Barber and Lyon, 1996), and earnings management via discretionary accruals (e.g., Dechow, Sloan, and Sweeney, 1995, and Kothari, Leone, and Wasley, 2005).

Event studies also serve an important purpose in capital market research as a way of testing market efficiency. Systematically nonzero abnormal security returns that persist after a particular type of corporate event are inconsistent with market efficiency. Accordingly, event studies focusing on long horizons following an event can provide key evidence on market efficiency (Brown and Warner, 1980, and Fama, 1991).

Beyond financial economics, event studies are useful in related areas. For example, in the accounting literature, the effect of earnings announcements on stock prices has received much attention. In the field of law and economics, event studies are used to examine the effect of regulation, as well as to assess damages in legal liability cases.

The number of published event studies easily exceeds 500 (see section 2), and continues to grow. A second and parallel literature, which concentrates on the methodology of event studies, began in the 1980s. Dozens of papers have now explicitly studied statistical properties of event study methods. Both literatures are mature. From the methodology papers, much is known about how to do - and how not to do - an event study. While the profession's thinking about event study methods has evolved over time, there seems to be relatively little controversy about statistical properties of event study methods. The conditions under which event studies provide information and permit reliable inferences are well-understood.

This chapter highlights key econometric issues in event study methods, and summarizes what we know about the statistical design and the interpretation of event study experiments. Based on the theoretical and empirical findings of the methodology literature, we provide clear guidelines both for producers and consumers of event studies. Rather than provide a comprehensive survey of event study methods, we seek to sift through and synthesize existing work on the subject. We provide many references and borrow heavily from the contributions of published papers.

Two early papers that cover a wide range of issues are by Brown and Warner (1980, 1985). More recently, an excellent chapter in the textbook of Campbell, Lo, and MacKinlay (1997) is a careful and broad outline of key research design issues. These standard references are recommended reading, but predate important advances in our understanding of event study methods, in particular on long-horizon methods. We provide an updated and much-needed overview, and include a bit of new evidence as well.
Although much emphasis will be on the statistical issues, we do not view our mission as narrowly technical. As financial economists, our ultimate interest is in how to best specify and test interesting economic hypotheses using event studies. Thus, the econometric and economic issues are interrelated, and we will try to keep sight of the interrelation.

In section 2, we briefly review the event study literature and describe the changes in event study methodology over time. In section 3 we discuss how to use event studies to test economic hypotheses. We also characterize the properties of the event study tests and how these properties depend on variables such as security volatility, sample size, horizon length, and the process generating abnormal returns. Section 4 is devoted to issues most likely encountered when conducting long-horizon event studies. The main issues are risk adjustment, cross-correlation in returns, and changes in volatility during the event period.

2. The Event Study Literature: Basic Facts

2.1 The stock and flow of event studies

To quantify the size of the event study literature, we conducted a census of event studies published in five leading journals: the Journal of Business (JB), Journal of Finance (JF), Journal of Financial Economics (JFE), Journal of Financial and Quantitative Analysis (JFQA), and the Review of Financial Studies (RFS). We began in 1974, the first year the JFE was published. Table 1 reports the results for the years 1974 through 2000. The total number of papers reporting event study results is 565. Since many academic and practitioner-oriented journals are excluded, these figures provide a lower bound on the size of the literature. The number of papers published per year increased in the 1980s, and the flow of papers has since been stable. The peak years are 1983 (38 papers), 1990 (37 papers), and 2000 (37 papers). All five journals have significant representation. The JFE and JF lead, with over 200 papers each.

Table 1 makes no distinction between long-horizon and short-horizon studies. While the exact definition of "long horizon" is arbitrary, it generally applies to event windows of 1 year or more. Approximately 200 of the 565 event studies listed in Table 1 use a maximum window length of 12 months or more, with no obvious time trend in the year-by-year proportion of studies reporting a long-horizon result.

No survey of these 565 event study papers is attempted here. For the interested reader, the following are some examples of event study surveys. MacKinlay (1997) and Campbell, Lo, and MacKinlay (1997) document the origins and breadth of event studies. The relation of event studies to tests of market efficiency receives considerable attention in Fama (1991), and in recent summaries of long-horizon tests in Kothari and Warner (1997) and Fama (1998). Smith (1986) presents reviews of event studies of financing decisions. Jensen and Ruback (1983), Jensen and Warner (1988), and Jarrell, Brickley and Netter (1988) survey corporate control events. Recently, Kothari (2001) reviews event studies in the accounting literature.

2.2 Changes in event study methods: the big picture

Even the most cursory perusal of event studies done over the past 30 years reveals a striking fact: the basic statistical format of event studies has not changed over time. It is still based on the table layout in the classic stock split event study of Fama, Fisher, Jensen, and Roll (1969). The key focus is still on measuring the sample securities' mean and cumulative mean abnormal return around the time of an event.
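The two quantities at the center of this format, the mean and the cumulative mean abnormal return, can be illustrated with a minimal sketch. It is not the procedure of any particular study: it assumes that abnormal returns have already been estimated and aligned in event time, and the function names and sample numbers are hypothetical. Formal definitions appear in section 3.

```python
from itertools import accumulate


def mean_abnormal_returns(abnormal_returns):
    """Cross-sectional mean abnormal return for each event period.

    abnormal_returns: list of rows, one row per security, each row holding
    that security's abnormal returns in event-time order (an assumed layout).
    """
    n_securities = len(abnormal_returns)
    if n_securities == 0:
        raise ValueError("need at least one security")
    # zip(*rows) walks the data period by period (column by column).
    return [sum(period) / n_securities for period in zip(*abnormal_returns)]


def cumulative_mean_abnormal_returns(mean_ar):
    """Running sum of the mean abnormal returns over event time."""
    return list(accumulate(mean_ar))


# Hypothetical abnormal returns for 3 securities over event periods -1, 0, +1.
sample = [
    [0.01, 0.04, -0.01],
    [0.00, 0.02, 0.01],
    [-0.01, 0.03, 0.00],
]

mean_ar = mean_abnormal_returns(sample)
cumulative = cumulative_mean_abnormal_returns(mean_ar)
```

In this made-up sample the mean abnormal return is concentrated in period 0, so the cumulative series rises at the event and is flat afterward, which is the pattern such tables are designed to reveal.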
Two main changes in methodology have taken place, however. First, the use of daily (and sometimes intraday) rather than monthly security return data has become prevalent, which permits more precise measurement of abnormal returns and more informative studies of announcement effects. Second, the methods used to estimate abnormal returns and calibrate their statistical significance have become more sophisticated. This second change is of particular importance for long-horizon event studies. The changes in long-horizon event study methods reflect new findings in the late 1990s on the statistical properties of long-horizon security returns. The change also parallels developments in the asset pricing literature, particularly the Fama-French 3-factor model.

While long-horizon methods have improved, serious limitations of long-horizon methods have been brought to light and still remain. We now know that inferences from long-horizon tests "require extreme caution" (Kothari and Warner, 1997, p. 301) and even using the best methods "the analysis of long-run abnormal returns is treacherous" (Lyon, Barber, and Tsai, 1999, p. 165). These developments underscore and dramatically strengthen earlier warnings (e.g., Brown and Warner, 1980, p. 225) about the reliability - or lack of reliability - of long-horizon methods. This contrasts with short-horizon methods, which are relatively straightforward and trouble-free. As a result, we can have more confidence and put more weight on the results of short-horizon tests than long-horizon tests. Short-horizon tests represent the "cleanest evidence we have on efficiency" (Fama, 1991, p. 1602), but the interpretation of long-horizon results is problematic. As discussed later, long-horizon tests are highly susceptible to the joint-test problem, and have low power.

Of course these statements about properties of event study tests are very general.
To provide a meaningful basis for assessing the usefulness of event studies - both short- and long-horizon - it is necessary to have a framework that specifies: i) the economic and statistical hypotheses in an event study, and ii) an objective basis for measuring and comparing the performance of event study methods. Section 3 lays out this framework, and summarizes general conclusions from the methodology literature. In the remainder of the chapter, additional issues and problems are considered with more specificity.

3. Characterizing Event Study Methods

3.1 An event study: the model

An event study typically tries to examine return behavior for a sample of firms experiencing a common type of event (e.g., a stock split). The event might take place at different points in calendar time or it might be clustered at a particular date (e.g., a regulatory event affecting an industry or a subset of the population of firms). Let $t = 0$ represent the time of the event. For each sample security $i$, the return on the security for time period $t$ relative to the event, $R_{it}$, is:

$$R_{it} = K_{it} + e_{it} \qquad (1)$$

where $K_{it}$ is the "normal" (i.e., expected or predicted) return given a particular model of expected returns, and $e_{it}$ is the component of returns which is abnormal or unexpected.2 Given this return decomposition, the abnormal return, $e_{it}$, is the difference between the observed return and the predicted return:

$$e_{it} = R_{it} - K_{it} \qquad (2)$$

2 This framework is from Brown and Warner (1980) and Campbell, Lo, and MacKinlay (1997).

Equivalently, $e_{it}$ is the difference between the return conditional on the event and the expected return unconditional on the event. Thus, the abnormal return is a direct measure of the
(unexpected) change in securityholder wealth associated with the event. The security is typically a common stock, although some event studies look at wealth changes for firms' preferred or debt claims.

A model of normal returns (i.e., expected returns unconditional on the event but conditional on other information) must be specified before an abnormal return can be defined. A variety of expected return models (e.g., market model, constant expected returns model, capital asset pricing model) have been used in event studies.3 Across alternative methods, both the bias and precision of the expected return measure can differ, affecting the properties of the abnormal return measures. Properties of different methods have been studied extensively, and are discussed later.

3 For descriptions of each of these models, see Brown and Warner (1985) or Campbell, Lo, and MacKinlay (1997).

3.2 Statistical and economic hypotheses

Cross-sectional aggregation. An event study seeks to establish whether the cross-sectional distribution of returns at the time of an event is abnormal (i.e., systematically different from predicted). Such an exercise can be conducted in many ways. One could, for example, examine the entire distribution of abnormal returns. This is equivalent to comparing the distribution of actual returns with the distribution of predicted returns and asking whether the two distributions are the same. In the event study literature, the focus almost always is on the mean of the distribution of abnormal returns. Typically, the specific null hypothesis to be tested is whether the mean abnormal return (sometimes referred to as the average residual, AR) at time $t$ is equal to zero. Other parameters of the cross-sectional distribution (e.g., median, variance) and determinants of the cross-sectional variation in abnormal returns are sometimes studied as well. The focus on mean effects, i.e., the first moment of the return distribution, makes sense if one
wants to understand whether the event is, on average, associated with a change in security holder wealth, and if one is testing economic models and alternative hypotheses that predict the sign of the average effect. For a sample of $N$ securities, the cross-sectional mean abnormal return for any period $t$ is:

$$AR_t = \frac{1}{N} \sum_{i=1}^{N} e_{it}. \qquad (3)$$

Time-series aggregation. It is also of interest to examine whether mean abnormal returns for periods around the event are equal to zero. First, if the event is partially anticipated, some of the abnormal return behavior related to the event should show up in the pre-event period. Second, in testing market efficiency, the speed of adjustment to the information revealed at the time of the event is an empirical question. Thus, examination of post-event returns provides information on market efficiency.

In estimating the performance measure over any multi-period interval (e.g., time 0 through +6), there are a number of methods for time-series aggregation over the period of interest. The cumulative average residual method (CAR) uses as the abnormal performance measure the sum of each month's average abnormal performance. Later, we also consider the buy-and-hold method, which first compounds each security's abnormal returns and then uses the mean compounded abnormal return as the performance measure. The CAR starting at time $t_1$ through time $t_2$ (i.e., horizon length $L = t_2 - t_1 + 1$) is defined as:

$$CAR(t_1, t_2) = \sum_{t=t_1}^{t_2} AR_t. \qquad (4)$$

Both CAR and buy-and-hold methods test the null hypothesis that mean abnormal performance is equal to zero. Under each method, the abnormal return measured is the same as
the returns to a trading rule which buys sample securities at the beginning of the first period, and holds through the end of the last period. CARs and buy-and-hold abnormal returns correspond to security holder wealth changes around an event. Further, when applied to post-event periods, tests using these measures provide information about market efficiency, since systematically nonzero abnormal returns following an event are inconsistent with efficiency and imply a profitable trading rule (ignoring trading costs).

3.3 Sampling distributions of test statistics

For a given performance measure, such as the CAR, a test statistic is typically computed and compared to its assumed distribution under the null hypothesis that mean abnormal performance equals zero.4 The null hypothesis is rejected if the test statistic exceeds a critical value, typically corresponding to the 5% or 1% tail region (i.e., the test level or size of the test is 0.05 or 0.01).

4 Standard tests are "classical" rather than "Bayesian." A Bayesian treatment of event studies is beyond the scope of this chapter.

The test statistic is a random variable because abnormal returns are measured with error. Two factors contribute to this error. First, predictions about securities' unconditional expected returns are imprecise. Second, individual firms' realized returns at the time of an event are affected for reasons unrelated to the event, and this component of the abnormal return does not average to literally zero in the cross-section.

For the CAR shown in equation (4), a standard test statistic is the CAR divided by an estimate of its standard deviation.5 Many alternative ways to estimate this standard deviation have been examined in the literature (see, for example, Campbell, Lo, and MacKinlay, 1997). The test statistic is given by:

$$\frac{CAR(t_1, t_2)}{[\sigma^2(t_1, t_2)]^{1/2}}, \qquad (5)$$

where

$$\sigma^2(t_1, t_2) = L \, \sigma^2(AR_t) \qquad (6)$$

and $\sigma^2(AR_t)$ is the variance of the one-period mean abnormal return. Equation (6) simply says that the variance of the CAR increases with the horizon length $L$, and assumes time-series independence of the one-period mean abnormal return. The test statistic is typically assumed unit normal in the absence of abnormal performance. This is only an approximation, however, since estimates of the standard deviation are used.

5 An alternative would be a test statistic that aggregates standardized abnormal returns, which means each observation is weighted in inverse proportion to the standard deviation of the estimated abnormal return. The standard deviation of abnormal returns is estimated using time-series return data on each firm. While a test using standardized abnormal returns is in principle superior under certain conditions, empirically in short-horizon event studies it typically makes little difference (see Brown and Warner, 1980, and 1985).

The test statistic in eq. (5) is well-specified provided the variance of the one-period mean abnormal return is estimated correctly. Event-time clustering renders the independence assumption for the abnormal returns in the cross-section incorrect (see Collins and Dent, 1984, Bernard, 1987, and Petersen, 2005, and the more detailed discussion in section 4 below). This would bias the standard deviation estimate downward and the test statistic given in eq. (5) upward. To address the bias, the significance of the event-period average abnormal return can be and often is gauged using the variability of the time series of event portfolio returns in the period preceding or after the event date. For example, the researcher can construct a portfolio of event firms and obtain a time series of daily abnormal returns on the portfolio for a number of days (e.g., 180 days) around the event date. The standard deviation of the portfolio returns can be used to assess the significance of the event-window average abnormal return. The cross-sectional dependence is accounted for because the variability of the portfolio returns through
time incorporates whatever cross-dependence exists among the returns on individual event securities.

The portfolio return approach has a drawback, however. To the extent the event period is associated with increased uncertainty, i.e., greater return variability, the use of historical or post-event time-series variability might understate the true variability of the event-period abnormal performance. An increase in event-period return variability is economically intuitive. The event might have been triggered by uncertainty-increasing factors and/or the event itself causes uncertainty in the economic environment for the firm. In either case, the event-period return variability is likely to exceed that during other time periods for the event firms. Therefore, the statistical significance of the event-window abnormal performance would be overstated if it is evaluated on the basis of historical variability of the event-firm portfolio returns (see Brown and Warner, 1980, and 1985, and Collins and Dent, 1984). One means of estimating the likely increase in the variability of event-period returns is to estimate the cross-sectional variability of returns during the event and non-event periods. The ratio of the variances during the event period and non-event periods might serve as an estimate of the degree of increase in the variability of returns during the event period, which can be used to adjust for the bias in the test statistic calculated ignoring the increased event-period uncertainty.6

3.4 Criteria for "reliable" event study tests

Using the test statistics, errors of inference are of two types. A Type I error occurs when the null hypothesis is falsely rejected. A Type II error occurs when the null is falsely accepted. Accordingly, two key properties of event study tests have been investigated. The first is whether the test statistic is correctly specified. A correctly-specified test statistic yields a Type I error
A correctly-specified test statistic yields a Type I error 6 Use of non-parametric tests of significance, as suggested in Corrado (1989), might also be effective in performing well-specified tests in the presence of increased event-period uncertainty.