A PROPOSED ARCHIVAL EMPIRICAL RESEARCH METHODOLOGY TO TEST THE RELIABILITY AND VALIDITY OF THE DISCOUNTED RESIDUAL INCOME MODEL

Kerlinger and Lee (2000) define reliability as “the proportion of the ‘true’ variance to the total obtained variance of the data yielded by a measuring instrument” and content validity as “representativeness or sampling adequacy of the content—the substance, the matter, the topic of measuring instrument”. The goal of this research is to propose an empirical method for quantifying the reliability and validity of the residual income model in predicting the value of equity (stock price), using all active U.S. firms traded on the NYSE and the AMEX from 1981 to 2005 (the time period and the listed stocks are subject to change depending on the availability of data from different sources).


INTRODUCTION
The key research methodology in this paper is to repeat experiments on different data sets, employing the existing model, in order to investigate the reliability and validity of the model (the measuring instrument). This research methodology is also promoted by Beaver (1998).
Most financial economists agree that a stock's intrinsic value is the present value of its expected future dividends (or cash flows) to common shareholders, based on currently available information. Only relatively recently, however, have academic studies begun to pay attention to the practical problem of measuring intrinsic value, and a vast literature has since explored methods of estimating the relationship between stock price and accounting information. Previous empirical and analytical studies suggest a focus on the discounted dividend and residual income models. The discounted residual income model (RIM) has been one of the main approaches since Ohlson (1995) reintroduced it into the literature. It provides a framework for analyzing the relationship between accounting numbers and the intrinsic value of the firm. I will use RIM because it requires estimates of neither factor loadings nor factor risk premiums, and because it is less sensitive to the daily price fluctuations of individual firms. Existing studies have extensively compared RIM with other methods of estimating the cost of equity or the value of the firm (stock price) when each is applied with a finite-horizon forecast, and have provided various techniques for linking currently available information to future estimates. This link is the key to each of those models. To use an algebraic analogy, the link works as a transformation matrix projecting the estimated future value of the firm (the space of unknown information) onto currently available accounting information (the space of known information). If the projection is one-to-one, then analysts can obtain a unique answer close to the true future value of the firm. Thus, methods of finding an efficient estimation link and of measuring its explanatory power have been the central focus of discussion. This study will mainly discuss the validity and the reliability of RIM.
The remainder of the proposal is organized as follows. Section 1 briefly reviews the literature related to RIM. Section 2 introduces the dividend discount model and RIM. Section 3 discusses data and model implementation issues. Section 4 introduces possible datasets that could be used to test the model. Section 5 proposes possible hypotheses and test methods. Section 6 discusses potential measurement errors and concerns. Section 7 discusses some possible future research topics.

BACKGROUND OF THE DISCOUNTED RESIDUAL INCOME MODEL AND RELEVANT RESEARCH
Various formulations of the residual income model have previously been used to assess cost of capital directly, to assess the cost of capital by assuming a given terminal growth in abnormal earnings (e.g., Claus & Thomas, 2001; Gebhardt, Lee, & Swaminathan, 2001; Lee, Myers, & Swaminathan, 1999), or to assess both growth and cost of capital simultaneously (e.g., Easton, Taylor, Shroff, & Sougiannis, 2002).
In the residual income model, the dividend payment is split into earnings and capital components that allow different valuation coefficients that improve the explanatory power of the model. This approach was supported by Frankel and Lee (1998). They show that the valuations generated from the RIM are not only comparable to the discounted cash flow model tested by Kaplan and Ruback (1995), but also generate valuations closer to intrinsic value than those generated by models using earnings, book value or dividends alone. The use of the RIM was further supported by the consistent finding that the components of the dividend (earnings and book value) take on different valuation multiples (e.g., Collins, Maydew, & Weiss, 1997). The residual income model is both an alternative accounting formulation of the dividend discount model and a non-different version of the earnings level and changes returns model (Easton, 1999).
RIM is an infinite-horizon forecast model. For empirical purposes, the infinite residual income and dividend models are traditionally converted into finite ones by truncating the models: forecasts are generated for periods up to and including the forecast horizon (i.e., the point of truncation), and the excluded forecasts for periods beyond the forecast horizon are absorbed into a terminal value. Some studies document the significance of the terminal value to the accuracy of the valuation model (e.g., Penman & Sougiannis, 1998; Francis, Olsson, & Oswald, 2000) and examine the different approaches adopted to generate this terminal value. Claus and Thomas (2001) provide an economic argument for estimating the rate of growth beyond the forecast horizon. Frankel and Lee (1998) and Gebhardt et al. (1999) assume that current growth rates fade to some (industry) norm following the forecast horizon. Easton et al. (2002) generate an implicit growth rate at the same time as the cost of capital estimate and then use this estimated value to calculate the terminal value.

MODEL SPECIFICATION
For the sake of brevity and illustration, I adopt a model as in .
Under the dividend discount model (DDM), a stock's intrinsic value is the present value of its expected future dividends (free cash flows to equity) based on all currently available information. The dividend discount model can be expressed as:

$$V_t^* = \sum_{i=1}^{\infty} \frac{E_t(D_{t+i})}{(1 + r_e)^i} \qquad \text{(1a)}$$

The standard academic view among financial economists is that a security's price ($P_t$) at time $t$ is the best empirical proxy for the stock's intrinsic value ($V_t^*$) at the same time. Indeed, many studies in finance and accounting begin with the presumption that $P_t = V_t^*$. Therefore, equation (1a) can be written as

$$P_t = \sum_{i=1}^{\infty} \frac{E_t(D_{t+i})}{(1 + r_e)^i} \qquad \text{(1b)}$$

where $E_t(D_{t+i})$ is the expected future dividend for period $t+i$ conditional on information available at time $t$, and $r_e$ is the cost of equity capital based on the information set at time $t$. This definition assumes a flat term structure of discount rates. Thus, one could measure the (unobserved) intrinsic value by the (observed) stock price. Accordingly, in estimating the intrinsic value of equity, one could use $P_t$ in place of $V_t^*$ in the remainder of this study.
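Many studies show that, provided a firm's earnings and book value are forecast in a manner consistent with "clean surplus" accounting, the stock price defined in equation (1a) can be written as the reported book value plus an infinite sum of discounted residual income (economic profits) (Ohlson, 1995; Feltham & Ohlson, 1995):

$$P_t = B_t + \sum_{i=1}^{\infty} \frac{E_t\left[NI_{t+i} - r_e B_{t+i-1}\right]}{(1 + r_e)^i} \qquad \text{(2)}$$

where $B_t$ is the book value of equity at time $t$, $NI_{t+i}$ is net income for period $t+i$, $E_t[\cdot]$ is the expectation based on information available at time $t$, and $r_e$ is the cost of equity capital. This residual income model is algebraically identical to the dividend discount model, but converts dividend values into accounting numbers. Therefore, equation (2) relies on the same theory and is subject to the same theoretical limitations as the DDM.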
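The algebraic equivalence between the DDM and the RIM can be checked numerically. The following is a minimal sketch in Python and is not part of the original proposal: the function names, the three-year horizon, and all figures are hypothetical. It assumes that book value evolves by the clean surplus relation (ending book value = beginning book value + earnings - dividends) and that the comparison is made over a finite horizon, so the present value of the book value remaining at the horizon is subtracted from the residual income sum.

```python
def ddm_value(dividends, r_e):
    """Present value of a finite stream of dividends D_1..D_T."""
    return sum(d / (1 + r_e) ** i for i, d in enumerate(dividends, start=1))


def rim_value_finite(b0, earnings, dividends, r_e):
    """Book value plus discounted residual income over T periods.

    Book value is rolled forward with the clean surplus relation, and the
    present value of the book value left at the horizon is subtracted so that
    the finite sum can be compared with the dividend stream directly.
    """
    value = b0
    book = b0
    for i, (ni, d) in enumerate(zip(earnings, dividends), start=1):
        residual_income = ni - r_e * book          # NI_i - r_e * B_{i-1}
        value += residual_income / (1 + r_e) ** i
        book = book + ni - d                       # clean surplus relation
    horizon = len(earnings)
    return value - book / (1 + r_e) ** horizon


# Hypothetical firm that pays out all remaining book value in year 3,
# so that the book value at the horizon is zero.
earnings = [12.0, 14.0, 16.0]
dividends = [5.0, 6.0, 131.0]
print(ddm_value(dividends, 0.10))
print(rim_value_finite(100.0, earnings, dividends, 0.10))
```

The two printed values should agree up to floating-point error. In the infinite-horizon model the subtracted term is expected to vanish as the horizon grows, provided book value grows more slowly than the discount factor.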

DATA AND MODEL IMPLEMENTATION ISSUES

Forecast Horizons and Terminal Values
One could adopt a two-stage approach to estimate the intrinsic value: 1) forecast earnings explicitly for the next three years, and 2) forecast earnings beyond year three implicitly, by linearly fading the period t+3 ROE to the median industry ROE by period t+T. The "fade rate" attempts to capture the long-term erosion of abnormal ROE over time. The terminal value beyond period T is estimated by treating the period T residual income as a perpetuity. This procedure assumes that there is no value-relevant growth in cash flows after period T.
The finite-horizon estimate for each firm is computed from the forecasted return on equity (FROE) for each period t+i and the book value per share for year t+i-1. Beyond the third year, FROE is forecast using a linear fade rate to the industry median ROE.
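The two-stage procedure described above can be sketched as follows. This is an illustrative Python sketch under stated assumptions rather than the exact specification used in the literature: it assumes the finite-horizon form $P_t = B_t + \sum_{i=1}^{T-1} \frac{(FROE_{t+i} - r_e)\,B_{t+i-1}}{(1+r_e)^i} + \frac{(FROE_{t+T} - r_e)\,B_{t+T-1}}{r_e (1+r_e)^{T-1}}$, a constant payout ratio for rolling book value forward, and a default horizon of T = 12. The function names and default values are hypothetical.

```python
def fade_path(froe_explicit, industry_roe, horizon):
    """FROE for periods 1..horizon: explicit forecasts, then a linear fade."""
    n = len(froe_explicit)
    if horizon < n:
        raise ValueError("horizon must cover the explicit forecast years")
    path = list(froe_explicit)
    last = froe_explicit[-1]
    for i in range(n + 1, horizon + 1):
        weight = (i - n) / (horizon - n)   # reaches 1.0 at the horizon
        path.append(last + weight * (industry_roe - last))
    return path


def rim_finite_horizon(b0, froe_explicit, industry_roe, r_e, payout, horizon=12):
    """Finite-horizon RIM value per share with a perpetuity terminal value."""
    value = b0
    book = b0                               # B_{t+i-1}, beginning book value
    path = fade_path(froe_explicit, industry_roe, horizon)
    for i, froe in enumerate(path, start=1):
        residual_income = (froe - r_e) * book
        if i < horizon:
            value += residual_income / (1 + r_e) ** i
        else:
            # Period-T residual income treated as a perpetuity.
            value += residual_income / (r_e * (1 + r_e) ** (horizon - 1))
        book += (1 - payout) * froe * book  # retained earnings raise book value
    return value


print(rim_finite_horizon(10.0, [0.18, 0.17, 0.16], 0.12, 0.10, 0.4))
```

Two sanity checks follow from the formula: when every FROE equals the cost of equity, the value collapses to book value, and with full payout and a constant FROE the value equals book value times FROE divided by the cost of equity.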

Cost of Equity Capital
The RIM calls for a discount rate that corresponds to the risk of future cash flows to shareholders. Following the literature (e.g., Fama & French, 1995), and depending on whether a short-term or a long-term risk-free rate is used, two classes of cost-of-capital estimates can be generated:

$r_e(TB)$ = monthly annualized one-month T-bill rate + market risk premium relative to returns on one-month T-bills ($R_m - R_{1tb}$);
$r_e(LT)$ = monthly annualized long-term Treasury bond rate + market risk premium relative to returns on long-term Treasury bonds ($R_m - R_{ltb}$).

Readers who are interested in applying RIM to compute the cost of capital over various forecast horizons could refer to the prior literature (e.g., Alam, Liu, & Peng, 2015; Liu, 2018).
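As an illustration of how either estimate could be assembled, the sketch below adds an annualized risk-free rate to an average market risk premium. It is a hypothetical Python example: annualizing by monthly compounding and measuring the premium as the arithmetic mean of historical annual excess returns are assumptions made here, and the function names and return series are invented for illustration.

```python
def annualize_monthly_rate(monthly_rate):
    """Annualize a monthly rate by compounding (an assumption of this sketch)."""
    return (1 + monthly_rate) ** 12 - 1


def cost_of_equity(monthly_risk_free, market_returns, risk_free_returns):
    """Risk-free rate plus the mean historical excess return of the market.

    Passing T-bill data gives an estimate in the spirit of r_e(TB); passing
    long-term Treasury bond data gives one in the spirit of r_e(LT).
    """
    excess = [rm - rf for rm, rf in zip(market_returns, risk_free_returns)]
    premium = sum(excess) / len(excess)
    return annualize_monthly_rate(monthly_risk_free) + premium


# Hypothetical annual return series.
market = [0.08, 0.12, -0.02]
t_bills = [0.03, 0.04, 0.05]
print(cost_of_equity(0.005, market, t_bills))
```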

Explicit Earnings Forecasts
One could obtain earnings forecasts for the next three years from I/B/E/S. I/B/E/S analysts provide a one-year-ahead ($FEPS_{t+1}$) and a two-year-ahead ($FEPS_{t+2}$) EPS forecast, as well as an estimate of the long-term growth rate ($Ltg$), which is used to compute a three-year-ahead earnings forecast:

$$FEPS_{t+3} = FEPS_{t+2}\,(1 + Ltg)$$

Combining these earnings forecasts with the dividend payout ratio, one can obtain explicit forecasts of future book values and ROEs using the clean surplus relation (CSR).

Matching Book Value to I/B/E/S
I/B/E/S provides monthly consensus forecasts as of the third Thursday of each month. A problem can arise when I/B/E/S has updated its forecast but the company has not yet released its annual report. To keep the data consistent, from the month of the earnings announcement until four months after the fiscal year end, the new book value can be estimated as the book value for year t-1 plus earnings minus dividends ($B_t = B_{t-1} + NI_t - D_t$).

Dividend Payout Ratio
To estimate the expected proportion of earnings to be paid out in net dividends, which is the $k$ in the model, one could use the ratio of actual dividends in the last fiscal year to earnings over the same period. Then, to estimate the future book value, one could use

$$B_{t+i} = B_{t+i-1} + (1 - k)\,FEPS_{t+i}$$

The above discussion of the application of RIM is only a brief introduction to the model. Readers who are interested in broader applications of RIM could refer to the prior literature in addition to the studies documented in this paper (e.g., Penman, 1998a, 1998b; Dechow, Hutton, & Sloan, 1999).
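The explicit three-year forecasts can be put together in a few lines. The Python sketch below is illustrative only: it assumes that the three-year-ahead EPS forecast is the two-year-ahead forecast grown at the long-term growth rate, that forecasted ROE is forecasted EPS divided by beginning-of-year book value per share, and that book value grows by retained earnings under the clean surplus relation. The function names and input figures are hypothetical.

```python
def payout_ratio(dividends, earnings):
    """Payout ratio k from last fiscal year's dividends and earnings."""
    if earnings <= 0:
        # Assumption of this sketch: the ratio is undefined for loss firms.
        raise ValueError("earnings must be positive to compute a payout ratio")
    return dividends / earnings


def explicit_forecasts(b0, feps1, feps2, ltg, payout):
    """Return (FEPS, FROE, end-of-year book value) for years t+1 to t+3."""
    feps3 = feps2 * (1 + ltg)              # three-year-ahead EPS forecast
    book = b0                              # beginning book value per share
    rows = []
    for feps in (feps1, feps2, feps3):
        froe = feps / book                 # forecasted ROE on beginning book value
        end_book = book + (1 - payout) * feps
        rows.append((feps, froe, end_book))
        book = end_book
    return rows


k = payout_ratio(1.0, 2.0)
for feps, froe, end_book in explicit_forecasts(20.0, 2.0, 2.2, 0.10, k):
    print(round(feps, 4), round(froe, 4), round(end_book, 4))
```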

THOUGHTS ABOUT DATA SELECTION
This section briefly discusses a couple of possible sources of sample data. Of course, a researcher could select the sample for the test from any source that is suitable for the research purpose.
First, if the researcher(s) would like to use U.S. firms to conduct the test, the researcher(s) could select active U.S. firms traded on the NYSE and AMEX at least once on the last day of any month between May 1981 and June 2005 (the time period is subject to change depending on the data available from different sources). Financial data on those companies can be extracted from the COMPUSTAT annual industrial file. The firms in the sample should have book values, earnings, dividends, and long-term debt in COMPUSTAT; have the necessary CRSP (Center for Research in Security Prices) stock price, trading volume, and shares outstanding information; and have one-year-ahead and two-year-ahead earnings-per-share (EPS) forecasts from I/B/E/S.
Second, if the researcher(s) would like to use international sample data, meaning that the sample firms are publicly listed outside the U.S., the researcher(s) could use Global Advantages in the Compustat, which contains financial and security data for international firms, or World Scope, which contains financial data for international and U.S. firms, together with DataStream, which contains security data for international firms. The availability of variables in World Scope and DataStream depends on the institution's subscription. In general, World Scope includes more variables and more firms for international companies than Global Advantages in the Compustat. An application of RIM to the computation of the cost of capital using international data can be found in John, Liu, and Sunder (2017).
Whether the sample firms are selected from U.S.-listed or internationally listed companies, they must meet the selection criteria that the research model requires, which could make the final sample smaller than the population in the database.

HYPOTHESIS AND MEASUREMENT METHODS
All data manipulation can be performed using any statistical computing language or software (e.g., SAS, Stata, Python, R, MATLAB). The hypothesis testing could employ multiple regression analysis.

H1: The discounted residual income model is a statistically valid method for analyzing the intrinsic value of equity.
H2: The discounted residual income model is a statistically reliable method for analyzing the intrinsic value of equity.
From the statistical results of the hypothesis tests, the researcher(s) can assess the reliability and validity of RIM. Note that the researcher(s) can test additional hypotheses as the purpose of the research requires: once the researcher(s) know whether RIM is a good model for explaining the intrinsic value of equity, the model can be used to predict items related to the intrinsic value of equity. Thus, as in the analogy introduced at the beginning, RIM can act as a transformation matrix that carries information from the observed space to the unobserved space and thereby predicts the unknown information of interest. Furthermore, the projection has to be one-to-one; otherwise, RIM is a poor instrument for forecasting.

Correlation with Future Returns: One-dimensional Analyses
Several measures of intrinsic value will be considered:
- End-of-month dividend yield on the sample (D/P ratio), defined as the dividends from the most recent fiscal year divided by the end-of-month whole sample value;
- End-of-month earnings-to-price ratio on the sample (E/P ratio), defined as the earnings from the most recent fiscal year divided by the end-of-month whole sample value;
- End-of-month book-to-market ratio (B/M), based on the latest available book value and shares outstanding; and
- Variations of the sample value-to-price ratio (V/P).
To find the relationship between future returns and those independent variables, the researcher(s) would regress stock price on each of the explanatory variables and on combinations of those independent variables.
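The one-variable regressions described above could be run along the following lines. This is a minimal Python sketch using ordinary least squares with a single regressor; the dependent variable (here labeled as a future return), the variable names, and the data are all hypothetical and serve only to show the mechanics. In practice a statistical package would also report standard errors and test statistics.

```python
def ols_univariate(x, y):
    """Slope, intercept, and R-squared from regressing y on a single x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared


# Hypothetical monthly observations of each candidate measure.
measures = {
    "D/P": [0.020, 0.025, 0.030, 0.022, 0.028],
    "E/P": [0.050, 0.060, 0.075, 0.055, 0.070],
    "B/M": [0.40, 0.55, 0.70, 0.45, 0.65],
    "V/P": [0.90, 1.10, 1.30, 0.95, 1.20],
}
future_return = [0.04, 0.07, 0.11, 0.05, 0.09]

for name, values in measures.items():
    slope, intercept, r_squared = ols_univariate(values, future_return)
    print(name, round(slope, 3), round(r_squared, 3))
```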
Based on the findings of previous literature, one would expect those ratios to be positive in general. If they are positive, the mathematical function used to test the hypotheses could have a real-number solution; then, one could expect that the unobserved intrinsic value of equity can be explained by the observed price. The researcher(s) can then interchange $P_t$ and $V_t^*$ in the model as the research requires.

Correlation with Future Returns: Two-dimensional Analyses
One would consider how much of the explanatory power of V/P for long-term returns is due to its correlation with firm size and B/P. In other words, the researcher(s) could examine the interactions between those factors. The researcher(s) could construct two pairs of interaction factors, V/P and size, and V/P and B/P, to see which pair best improves the power to predict future returns.
There are several considerations related to the experiment:
- How long a time period should be defined as long-term?
- What is the best way to construct portfolios (such as basing them on deciles or quintiles of certain variables)?
- What is the best statistical estimation method (e.g., the Monte Carlo simulation technique) for constructing the portfolios?
- How should the regression be run (e.g., adding one variable at a time or adding all variables into the regression at once)?
- Which control variables should be included in the model (such as controls for macroeconomic conditions, industry business performance, or geographical factors)?
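One way to examine the interaction between two factors is an independent two-way sort. The Python sketch below is a hypothetical illustration, not a prescribed design: firms are ranked separately on V/P and on size, assigned to equal-sized groups, and the average return is computed for each combination of groups. The number of groups, the function names, and the data are assumptions.

```python
def quantile_ranks(values, n_groups):
    """Assign each observation to one of n_groups groups by ascending rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position * n_groups // len(values)
    return ranks


def two_way_sort(first, second, returns, n_groups=2):
    """Mean return for every (first-factor group, second-factor group) cell."""
    first_rank = quantile_ranks(first, n_groups)
    second_rank = quantile_ranks(second, n_groups)
    cells = {}
    for a, b, r in zip(first_rank, second_rank, returns):
        cells.setdefault((a, b), []).append(r)
    return {cell: sum(rs) / len(rs) for cell, rs in cells.items()}


# Hypothetical firms: V/P ratio, market capitalization, and subsequent return.
vp = [0.5, 0.8, 1.2, 1.5]
size = [10.0, 200.0, 20.0, 300.0]
future_return = [0.02, 0.01, 0.08, 0.05]

for cell, mean_return in sorted(two_way_sort(vp, size, future_return).items()):
    print(cell, mean_return)
```

Replacing `size` with B/P values gives the second pair of factors; with real data, quintiles or deciles (`n_groups=5` or `10`) would be the more usual choice.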
Because the sample consists of time-series cross-sectional data, the variables may be autocorrelated. To detrend, one could regress the difference (change) in the dependent variable on the difference (change) in each independent variable. Such first-differencing might reduce autocorrelation.
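The effect of differencing can be illustrated with a simple diagnostic. The Python sketch below is a hypothetical example: it computes the lag-1 autocorrelation of a series before and after first-differencing. The lag-1 autocorrelation is used here only as a convenient summary; it is an assumption of this sketch and not a test prescribed by the proposal, and the series is invented.

```python
def first_difference(series):
    """Change in the series from one period to the next."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def lag1_autocorrelation(series):
    """Sample lag-1 autocorrelation; returns 0.0 for a constant series."""
    n = len(series)
    mean = sum(series) / n
    denominator = sum((v - mean) ** 2 for v in series)
    if denominator == 0:
        return 0.0
    numerator = sum(
        (series[i] - mean) * (series[i - 1] - mean) for i in range(1, n)
    )
    return numerator / denominator


# Hypothetical trending price series.
price = [10.0, 10.8, 11.5, 12.6, 13.1, 14.3, 15.0, 16.2]
print(lag1_autocorrelation(price))
print(lag1_autocorrelation(first_difference(price)))
```

The differenced series can then be used in place of the levels in the regressions, at the cost of losing one observation per firm.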

POTENTIAL MEASUREMENT ERRORS OR CONCERNS
First, this research relies heavily on the assumption of the clean surplus accounting condition. However, in the real world, the clean surplus assumption is often violated. Penman (2013) discusses the situations under which clean surplus assumption could be violated.
Second, using a terminal value to approximate the infinite-horizon value is another concern, because the explicit forecast horizon of up to three years is simply a conventional choice in the literature. Therefore, one could investigate which forecast horizon yields the most accurate terminal value.
Third, whether adding another variable, such as the growth rate of the firm, could improve the forecasting power of the existing model is another concern. How to incorporate this variable into the model could be a good research question, since current studies treat it as a flat rate that does not vary with time, although the growth rates of companies do change over time.
Fourth, there is the question of how to deal with missing data. Some studies assign a certain value to missing observations, but I am concerned about whether the statistical results remain reliable when part of the data is not actual but imputed. Of course, results based on imputed data, which can still support some informative inference, are better than no results at all.

FUTURE RESEARCH INTERESTS
The concerns mentioned in section 6 are quite common in the current literature and are good topics for future studies. Moreover, most current studies concentrate on the application and implications of RIM and test whether RIM is superior to the DDM (e.g., Liu & Rhodd, 2018). Very few researchers try to reconcile the two methods. Russell (2001) argued that the cash flow model and the RIM should provide the same results when estimating equity value, but he did not provide sizable empirical evidence and used only a couple of firms as examples to support his conclusion. I think it is a good argument, but it is not sufficient. Whether and how the two methods can be reconciled could be a good research topic. Finally, researchers are still trying to minimize the concerns mentioned in the previous section.
Of course, this paper merely proposes a possible research method for validating a popular research model; it does not claim that current research performs poorly in empirical tests of RIM. In fact, RIM has become more popular over the past two decades. In my opinion, RIM is an excellent valuation method that uses accounting data to estimate the value of a firm when the criteria for using the DDM are not met. My thoughts here are simply an attempt to provide some useful inferences to people who are interested in valuation models.