Shrinkage estimation of covariance matrix for portfolio selection on Vietnam stock market

List of Abbreviations

List of Figures

List of Tables

CHAPTER 1: INTRODUCTION

1.1 Vietnam stock market overview

1.2 Problem statements

1.3 Objectives and research questions

1.4 Research Methodology

1.5 Expected contributions

1.6 Disposition of the dissertation

CHAPTER 2: LITERATURE REVIEW

2.1 Modern Portfolio Theory Framework

2.1.1 Concept of risk and return

2.1.2 Assumptions of the modern portfolio theory

2.1.3 MPT investment process

2.1.4 Criticism of the theory

2.2 Parameter estimation

2.2.1 Expected returns parameter

2.2.2 The covariance matrix parameter

2.3 Portfolio Selection

2.3.1 Mean-Variance Model

2.3.2 Global Minimum Variance Model (GMV)

CHAPTER 3: THEORETICAL FRAMEWORK

3.1 Basic preliminaries

3.1.1 Return

3.1.2 Variance


[...] estimator for γ is stated as:

$$\hat{\gamma} = \sum_{i=1}^{N} \sum_{j=1}^{N} (f_{ij} - s_{ij})^2$$

The estimate $\hat{\gamma}$ relies on $f_{ij}$ and $s_{ij}$ being consistent estimators of $\phi_{ij}$ and $\sigma_{ij}$, respectively. Once the three components have been estimated, the shrinkage coefficient of the SCCM is determined in practice as follows:

$$\hat{\delta}^{*} = \max\left\{0, \min\left\{\frac{\hat{\kappa}}{T}, 1\right\}\right\}$$

In which:

$$\hat{\kappa} = \frac{\hat{\pi} - \hat{\rho}}{\hat{\gamma}}$$

3.3.6 Shrinkage to identity matrix (STIM)

This section presents the shrinkage approach of Ledoit and Wolf (2004), which uses the identity matrix as the target when estimating the covariance matrix. The main difference between the STIM and the two other shrinkage methods lies in the target matrix used in the linear shrinkage. The SSIM and SCCM require “a judicious choice of the shrinkage target, which must be based on known features of the true covariance matrix for the application at hand”. For example, Ledoit and Wolf exploited a first known feature, that stock returns have a factor-model structure, to build the target in the SSIM, and a second known feature, that “the average correlation of stock returns is positive”, to build the target in the SCCM. By contrast, shrinkage towards the identity matrix is “the natural choice for a generic target”: the identity matrix is simply a square matrix whose main diagonal elements all equal one and whose remaining elements are zero, so this choice of target draws no benefit from application-specific knowledge.

The shrinkage towards the identity matrix is a weighted combination of the SCM and an identity matrix that serves as the shrinkage target. The identity matrix has no variance but a lot of bias relative to the true covariance matrix. The sample covariance matrix is the opposite: it carries a lot of variance but is unbiased.
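The weighted combination of the sample covariance matrix and the identity described above can be sketched in a few lines. This is only a minimal illustration, not the dissertation's implementation: the function name `shrink_to_identity` is hypothetical, the data are demeaned before estimation (an assumption), and the weight on the target is estimated with the sample quantities of Ledoit and Wolf (2004) under the normalized norm $\|A\|^2 = \mathrm{tr}(AA')/N$, which this section does not spell out.

```python
import numpy as np


def shrink_to_identity(returns):
    """Shrink the sample covariance matrix towards a scaled identity.

    returns: (T, N) array of asset returns, one row per period.
    Gives back (sigma_star, rho), where rho is the estimated weight
    placed on the identity target (hypothetical helper, for illustration).
    """
    x = np.asarray(returns, dtype=float)
    t, n = x.shape
    x = x - x.mean(axis=0)              # assumption: demean each asset
    s = x.T @ x / t                     # sample covariance matrix (SCM)
    m = np.trace(s) / n                 # scale of the identity target
    target = m * np.eye(n)

    # Squared distance between the SCM and the target (normalized norm).
    d2 = np.sum((s - target) ** 2) / n

    # Average squared distance between each x_t x_t' and the SCM,
    # which estimates the sampling error of the SCM.
    b2_bar = sum(np.sum((np.outer(row, row) - s) ** 2) for row in x) / (n * t ** 2)
    b2 = min(b2_bar, d2)                # keeps the weight inside [0, 1]

    rho = b2 / d2 if d2 > 0 else 0.0
    sigma_star = rho * target + (1.0 - rho) * s
    return sigma_star, rho
```

Because the target is scaled by the average sample variance, the trace of the estimate equals the trace of the sample covariance matrix, and the result stays invertible even when there are fewer observations than assets, provided the estimated weight is positive.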
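Combining the two matrices can therefore produce a better covariance matrix estimate than either matrix alone. When Ledoit and Wolf proposed the STIM method in 2004, they asked whether investors could select optimized portfolios in the absence of financial knowledge. If the target matrix of the shrinkage method is the identity matrix, the optimal shrinkage intensity is estimated as follows.

As mentioned in Ledoit and Wolf (2004), four scalars play a significant role in the analysis:

- $\mu = \langle \Sigma, I \rangle$
- $\alpha^2 = \|\Sigma - \mu I\|^2$
- $\beta^2 = E[\|S - \Sigma\|^2]$
- $\delta^2 = E[\|S - \mu I\|^2]$

They are related as follows:

$$E[\|S - \mu I\|^2] = E[\|S - \Sigma + \Sigma - \mu I\|^2] = E[\|S - \Sigma\|^2] + E[\|\Sigma - \mu I\|^2] + 2E[\langle S - \Sigma, \Sigma - \mu I \rangle]$$

$$= E[\|S - \Sigma\|^2] + \|\Sigma - \mu I\|^2 + 2\langle E[S - \Sigma], \Sigma - \mu I \rangle \qquad (3.3.11)$$

Note that $E[S] = \Sigma$, so that $E[S - \Sigma] = 0$. As a result, equation (3.3.11) becomes:

$$E[\|S - \mu I\|^2] = E[\|S - \Sigma\|^2] + \|\Sigma - \mu I\|^2$$

that is, $\delta^2 = \beta^2 + \alpha^2$.

Now reconsider the optimization problem for the covariance matrix using the identity matrix (I):

$$\min_{\rho_1, \rho_2} E[\|\Sigma^{*} - \Sigma\|^2] \quad \text{s.t.} \quad \Sigma^{*} = \rho_1 I + \rho_2 S \qquad (3.3.12)$$

By a change of variables, the constraint in (3.3.12) can be expressed as:

$$\Sigma^{*} = \rho \nu I + (1 - \rho) S$$

so that the optimization problem becomes:

$$\min_{\rho, \nu} E[\|\Sigma^{*} - \Sigma\|^2] \quad \text{s.t.} \quad \Sigma^{*} = \rho \nu I + (1 - \rho) S$$

Following the proof in Ledoit and Wolf (2004), the objective function can be written as:

$$E[\|\Sigma^{*} - \Sigma\|^2] = \rho^2 \|\Sigma - \nu I\|^2 + (1 - \rho)^2 E[\|S - \Sigma\|^2]$$

For a given ρ, the optimal value of ν is obtained from:

$$\min_{\nu} \|\Sigma - \nu I\|^2$$

Then, since $\|I\|^2 = 1$ under the norm of Ledoit and Wolf (2004),

$$\|\Sigma - \nu I\|^2 = \|\Sigma\|^2 - 2\nu \langle \Sigma, I \rangle + \nu^2 \qquad (3.3.13)$$

Taking the first derivative of equation (3.3.13) with respect to ν and setting it equal to zero, the value of ν is obtained as:

$$-2\langle \Sigma, I \rangle + 2\nu = 0 \quad \Rightarrow \quad \nu = \langle \Sigma, I \rangle = \mu$$

Substituting this value of ν into the objective function of (3.3.12), the objective function becomes:

$$E[\|\Sigma^{*} - \Sigma\|^2] = \rho^2 \alpha^2 + (1 - \rho)^2 \beta^2$$

The first derivative with respect to ρ is:

$$2\rho \alpha^2 - 2(1 - \rho) \beta^2 \qquad (3.3.14)$$

Setting it equal to zero, the solution of the optimization problem is:

$$\rho = \frac{\beta^2}{\alpha^2 + \beta^2} = \frac{\beta^2}{\delta^2}, \qquad 1 - \rho = \frac{\alpha^2}{\delta^2}$$

To sum up, the shrinkage identity covariance matrix ($\Sigma^{*}$) is determined as follows:

$$\Sigma^{*} = \frac{\beta^2}{\delta^2} \mu I + \frac{\alpha^2}{\delta^2} S, \qquad E[\|\Sigma^{*} - \Sigma\|^2] = \frac{\alpha^2 \beta^2}{\delta^2}$$

CHAPTER 4: METHODOLOGY

This chapter presents the methodology used to answer the research questions stated above. First, the study clarifies how the input data for the portfolio optimization procedure are collected and processed.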
Second, the chapter presents the portfolio performance evaluation methodology and the performance metrics used to measure portfolio performance. Third, it introduces the way p-values are computed to measure the statistical significance of the differences among the performance metrics. Finally, the VN-Index and 1/N portfolios are introduced as benchmarks against which the performance of the different strategies is compared.

4.1 Input Data

The input data for the optimization procedure are weekly stock price series, from which weekly returns are calculated for all stocks involved. Returns are measured from adjusted prices, which account for dividends and changes in capital from stock splits. The author divides the observation sample D(t) into two parts, W(t) and V(t). W(t) is the initialization phase used to estimate the covariance matrix and initialize the first portfolio; this period is called the in-sample period. V(t) is the evaluation period used to test the performance of the estimation methods; it is called the out-of-sample period. In more detail, the total number of observed data points in this dissertation is D(t) = 468, corresponding to 468 weeks from January 2011 to December 2019. The initialization period W(t) = 104 weeks corresponds to the two years from January 2011 to January 2013. The remaining data, from January 2013 to December 2019, form the evaluation period V(t) = 364 weeks.

All companies listed on the Ho Chi Minh City Stock Exchange (HOSE) are considered, excluding those that have been listed for less than two years since their Initial Public Offering (IPO). As a result, the total number of stocks selected for the optimization procedure is 350. The data are taken from HOSE and are denominated in VND. The VN-Index, the Vietnamese stock index, is used as the reference index in the SIM and SSIM methods.
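The return calculation and the sample split described above can be sketched as follows. This is a minimal illustration under stated assumptions: simple (not logarithmic) returns are used, the function names `weekly_returns` and `split_sample` are hypothetical, and prices are assumed to be already cleaned and arranged with one row per week.

```python
import numpy as np


def weekly_returns(adj_close):
    """Simple returns r_t = P_t / P_{t-1} - 1 from adjusted closing prices.

    adj_close: (weeks, stocks) array; assumption: no missing values.
    """
    p = np.asarray(adj_close, dtype=float)
    return p[1:] / p[:-1] - 1.0


def split_sample(observations, n_init=104):
    """Split D into the initialization part W and the evaluation part V."""
    d = np.asarray(observations)
    return d[:n_init], d[n_init:]
```

With 468 weekly observations and `n_init=104`, the split gives 104 in-sample and 364 out-of-sample rows, matching the figures quoted in the text.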
The whole dataset was taken directly from HOSE and checked carefully against other data sources. In the preprocessing step, the author encountered some errors caused by data ingestion issues on the server. The two most common cases are missing prices and/or volumes, and multiple successive days with the same price and zero volume. Hence, after crawling the data and updating the database at the end of each trading day, the author needs to match the stock prices and volumes against other sources and/or use different techniques to impute the data before moving on to the later computational steps. A sample of the dataset is shown below:

Table 4.1: The sample dataset collected in the period 2011 - 2019

| Stock ticker | Date | Opening price (VND) | Closing price (VND) | Adjusted opening price (VND) | Adjusted closing price (VND) | Daily volume (shares) |
|---|---|---|---|---|---|---|
| AAA | 31/12/2019 | 12,650 | 12,700 | 12,650 | 12,700 | 1,258,300 |
| BID | 31/12/2019 | 46,200 | 46,150 | 46,200 | 46,150 | 630,410 |
| CNG | 31/12/2019 | 24,400 | 25,000 | 23,370 | 23,940 | 63,350 |
| DHG | 31/12/2019 | 92,000 | 91,500 | 92,000 | 91,500 | 13,530 |
| EIB | 31/12/2019 | 17,800 | 17,800 | 17,800 | 17,800 | 149,850 |
| FPT | 31/12/2019 | 58,600 | 58,300 | 58,600 | 58,300 | 689,030 |
| GAS | 31/12/2019 | 96,800 | 93,700 | 96,800 | 93,700 | 388,160 |
| .. | .. | .. | .. | .. | .. | .. |

Source: Ho Chi Minh City Stock Exchange (HOSE) 2011-2019

By the end of 2019, data on a total of 382 companies listed on HOSE had been collected. The number of listed companies in 2019 had increased by roughly 90 compared with 2013, when only 292 companies were listed (Figure 4.1). However, the maximum number of companies selected for the back-testing period of 2013 - 2019 is only 350, because companies selected in the optimization process must satisfy the following requirements. First, the liquidity of a company must be guaranteed, meaning that its daily trading volume must be greater than its average trading volume over the previous 20 days.
Second, the company must have been listed on HOSE long enough, at least 2 years, corresponding to 104 weeks. Third, the reliability of the company's data must also be guaranteed.

[Figure 4.1: The universe of stocks on HOSE from 2013 - 2019. Source: Ho Chi Minh City Stock Exchange (HOSE) 2013-2019]

Among the selected companies, the one with the highest market capitalization is VIC, at nearly 377 trillion VND, while ICF has the smallest market capitalization, at 10.5 billion VND. Moreover, YEG is the stock with the highest trading price, at 343,000 VND, and VHG is the one with the lowest, at 370 VND, during the back-testing period. The stock with the largest daily trading volume is DIG, at 12.8 million shares.

[Figure 4.2: The number of listed companies by industry group on HOSE, 2019. Chart values: Utilities 25; Information Technology 4; Materials 59; Health Care 12; Consumer Staples 35; Financial 28; Energy 10; Communication Services 2; Industrials 105; Consumer Discretionary 40; Real Estate 47]

Furthermore, the selected companies are divided into 11 industry groups based on the Global Industry Classification Standard (GICS): Utilities, Information Technology, Materials, Health Care, Consumer Staples, Financial, Energy, Communication Services, Industrials, Consumer Discretionary and Real Estate. The number of listed companies is most concentrated in Industrials, with a total of 105 companies, accounting for 27%. Meanwhile, Communication Services is the industry with the fewest listed companies, with only 2 companies.
[Figure 4.3: The market capitalization of industry groups on HOSE, 2019. Chart values: Utilities 7%; Information Technology 2%; Materials 5%; Health Care 1%; Consumer Staples 16%; Financial 30%; Energy 2%; Communication Services 0%; Industrials 8%; Consumer Discretionary 3%; Real Estate 26%]

Despite being the industry with the largest number of listed companies, Industrials accounts for only 8% of total market capitalization. The industry with the largest market capitalization is Financial, which accounts for 30%, followed by Real Estate with 26%. Communication Services remains the industry with the lowest market capitalization, as only 2 companies are listed.

4.2 Portfolio performance evaluation methodology

To examine the performance of different covariance matrix estimators in a portfolio optimization problem, a back-testing process is applied in this dissertation. Algorithmic trading differs from other kinds of investment classes in that an investor can make more reliable forecasts of future performance from past performance because of the abundance of available data. The process by which this is done is called back-testing. In more detail, back-testing is performed by providing “the particular strategy algorithm to a stream of historical financial data, which leads to a set of trading signals” (Ernest, 2009). Every trade, that is, every selling or buying signal, brings an associated profit or loss. The total profit and loss is the accumulation of these profits and losses over the back-testing period. Back-testing allows “the (prior) statistical properties of the strategy to be examined, providing insight into whether a strategy can be profitable in the future” (Ernest, 2013). More details of the back-testing process are presented in the following sections of this research.
Based on the back-testing system, the author compares the different policies, that is, the covariance matrix estimators, using a “rolling-horizon” procedure. First, the author chooses an estimation window, whose length is denoted by T < L, where L is “the total number of samples” in the data set. Second, the return data in the estimation window are used to compute the different optimal portfolios. Third, the author repeats this “rolling-window” process for the next period by adding the newest data point and dropping the earliest one. The process is repeated until the last data point of the data set. At the end of this process, the author has therefore computed “L−T portfolio weight vectors” for each covariance matrix estimator, that is, “$w_t^k$ for t = T, ..., L − 1 and for each estimator k”.

In more detail, beginning on 1st January 2013, weekly historical data from the two preceding years are used to calculate the covariance matrix in the minimum-variance optimization procedure that initializes the first portfolio. This period is called the in-sample period. Thereafter, the minimum-variance optimized portfolio is held for one week, which is the out-of-sample period. Specifically, the portfolios are rebalanced weekly and the covariance matrix is re-calculated at every rebalancing point. This process is repeated until the last rebalancing point, 31st December 2019. To assess whether altering the covariance matrix estimator improves the minimum-variance optimized portfolios, the back-testing process is applied to all the covariance matrix estimators presented in the dissertation, while everything else is kept unchanged.
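The rolling-window procedure above can be sketched as a short loop. This is an illustration under stated assumptions rather than the platform used in the dissertation: the names `gmv_weights` and `rolling_backtest` are hypothetical, the minimum-variance weights use the unconstrained closed form (short sales allowed, fully invested), and the covariance estimator is passed in as a function so that different estimators can be compared with everything else unchanged.

```python
import numpy as np


def gmv_weights(cov):
    """Unconstrained minimum-variance weights: C^{-1} 1 / (1' C^{-1} 1).

    Assumption: short sales are allowed and cov is invertible.
    """
    ones = np.ones(cov.shape[0])
    z = np.linalg.solve(cov, ones)
    return z / z.sum()


def rolling_backtest(returns, window, estimator):
    """Re-estimate on the latest `window` rows, hold one period, roll on.

    returns:   (L, N) array of periodic returns.
    estimator: function mapping a (window, N) array to an (N, N) covariance.
    Gives back the (L - window, N) weights and the L - window realized
    out-of-sample portfolio returns (before transaction costs).
    """
    r = np.asarray(returns, dtype=float)
    n_obs = r.shape[0]
    weights, realized = [], []
    for t in range(window, n_obs):
        cov = estimator(r[t - window:t])    # only data before period t
        w = gmv_weights(cov)
        weights.append(w)
        realized.append(float(w @ r[t]))    # return over the held period
    return np.array(weights), np.array(realized)
```

For example, `rolling_backtest(r, 104, lambda x: np.cov(x, rowvar=False))` would run the procedure with the plain sample covariance matrix; swapping in another estimator changes only the third argument.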
In summary, to evaluate the efficiency of the covariance matrix shrinkage methods, a back-testing process is built and applied in this research using the back-testing platform of Tran et al. (2020). Back-testing supports the author in appraising the feasibility and potential application of the estimation methods in the near future, given the price series of the stocks in the portfolio. The back-testing process is conducted as follows:

Step 1: Divide the observations D(t) into two parts, W and V. W is the initial stage used to estimate the covariance matrix, usually called the in-sample period, and V is the testing stage for the portfolio selection methods, usually called the out-of-sample period. In this study, based on the policies and settings of the Vietnam stock market (for example, three days are required for selling or buying stocks), the author chooses weekly rather than daily trading. Hence, the total number of observations is D(t) = 416, where each data point corresponds to one week. The initial stage is W = 104 weeks, covering 2 years, and the testing stage is V = 312 weeks.

Step 2: Use the data in the initial part W to estimate the covariance matrix and use this matrix as the input to the portfolio optimization that selects the optimal portfolio. The optimal portfolio is then tested on data point $t_{w+1}$ according to the portfolio performance criteria.

Step 3: Replace the earliest data point in the initial part W with data point $t_{w+1}$ to create $W_1$, then repeat the optimal portfolio selection and evaluate its results as in Step 2 on data point $t_{w+2}$. This process is repeated throughout the testing stage V and ends at data point $t_{w+v}$.

Step 4: Calculate and extract the results over the testing stage V. The portfolio performance criteria applied to evaluate the portfolio selection over V include: average portfolio return, portfolio volatility, portfolio turnover, maximum drawdown, winning rate and Jensen’s Alpha.
Moreover, transaction costs are also considered during the testing procedure of this study. Each time the portfolio changes according to the optimal results, transaction costs are incurred. The trading cost is assumed to be 0.3% of the total buying value or selling value of the portfolio each time. This figure follows the actual percentage applied by most securities firms on the Vietnam equity exchange.

Finally, instead of checking the estimation methods on a single portfolio (N = 350 stocks), the author checks them on four portfolios of different sizes (N = 50, 100, 200, 350). Stocks are allocated to the portfolios according to their market capitalization. For example, N = 50 means that the portfolio consists of the 50 securities with the largest market capitalization, N = 100 is a portfolio of the 100 securities with the largest market capitalization, and likewise for N = 200 and N = 350. The market capitalization of a company is measured as its trading price multiplied by its number of shares outstanding.

The testing process is presented in the diagram below (see Figure 4.4):

[Figure 4.4: Back-testing procedure]

4.3 Transaction costs

Transaction costs were ignored in the original MVO problem of Markowitz (1952). However, when the portfolio is rebalanced frequently, transaction costs have a very large influence on portfolio returns. Considering transaction costs and incorporating them into the optimization procedure therefore helps investors control the number of trades and rebalancing points. This is attractive to investors because it lowers portfolio turnover. Ledoit and Wolf (2003a) did not address transaction costs, which can have a large influence on portfolio performance.
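Two of the settings above, the selection of the N largest stocks by market capitalization and the 0.3% trading cost, can be sketched as follows. This is a minimal illustration: the function names are hypothetical, and expressing the cost as a fraction of portfolio value (0.3% times the total weight bought plus the total weight sold) is an assumption about how the stated rate is applied.

```python
import numpy as np

COST_RATE = 0.003  # 0.3% of traded value, the rate quoted in the text


def top_n_by_market_cap(price, shares_outstanding, n):
    """Indices of the n stocks with the largest price x shares outstanding."""
    cap = np.asarray(price, dtype=float) * np.asarray(shares_outstanding, dtype=float)
    return np.argsort(-cap, kind="stable")[:n]


def rebalancing_cost(w_old, w_new, rate=COST_RATE):
    """Cost of moving from w_old to w_new, as a fraction of portfolio value.

    Assumption: the rate applies to every unit of weight bought or sold.
    """
    traded = np.abs(np.asarray(w_new, dtype=float) - np.asarray(w_old, dtype=float))
    return rate * traded.sum()
```

Under this assumption, a net return for a holding period would be the gross portfolio return minus `rebalancing_cost(w_old, w_new)` for the rebalance that opened the period.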
In recent research, more and more researchers and practitioners consider transaction costs in portfolio optimization in order to understand their optimization methods better, for example DeMiguel et al. (2009) and Han (2018). These papers use a turnover indicator that represents the change in portfolio composition between periods, and from it they determine how transaction costs affect their optimization strategies. In this study, the author also uses this indicator, which is presented in more detail in the next part, to measure the influence of transaction costs on portfolio performance. Moreover, to bring the optimization strategies closer to the real world, each transaction cost is calculated as 0.3% of the total buying or selling value of the portfolio. This percentage is applied by many securities companies on the Vietnam stock market.

4.4 Performance metrics

In this part, the author presents the performance metrics used to measure the efficiency of the optimal portfolios. The most commonly applied performance metrics are portfolio return and portfolio risk, which were introduced in the previous sections. The return of a portfolio is its gain or loss over a given period of time, while the risk of a portfolio is the volatility of its return and can be measured by the variance of that return. Besides these common performance metrics, this dissertation also considers other criteria: Sharpe ratio, Maximum drawdown, Portfolio turnover, Winning rate and Jensen’s Alpha.

4.4.1 Sharpe ratio (SR)

The Sharpe ratio, which was introduced by Sharpe in 1966, measures the ratio between the excess return (return after subtracting the risk-free rate) and volatility. It is defined as:

$$SR_i = \frac{\bar{r}_i - \bar{r}_f}{\bar{\sigma}_i}$$

Where:

$\bar{r}_i$: the annualized net return (after transaction costs) for strategy i.

$\bar{r}_f$: the annualized risk-free rate for the evaluated period.

$\bar{\sigma}_i$: the annualized volatility for strategy i.

This indicator was developed to help investors weigh the return of an investment against its risk. It is useful for considering the return per unit of risk when investing in a portfolio. Thus, the higher the indicator, the better for investors. From Sharpe’s point of view, this is a robust indicator under the assumption that “volatility (standard deviation) as a good proxy for risk holds true”.

4.4.2 Maximum drawdown (MDD)

The maximum drawdown is also an important indicator used to evaluate portfolio performance. This indicator is discussed by Chekhlov and Uryasev in 2005. It reflects the maximum loss from a peak to the lowest subsequent point of a portfolio over a time period. It gives investors more information about the volatility of their portfolio when the market is going down. Specifically, this metric estimates the accumulated loss that investors can suffer from their investment, and investors usually treat it as an indicator of downside risk. The lower this indicator, the safer the portfolio. The maximum drawdown of the value series of strategy i, expressed in return terms, is stated as follows:

$$MDD_i = \max_{\tau \in (0, T)} \left[ \max_{t \in (0, \tau)} P_i(t) - P_i(\tau) \right]$$

where $P_i(t)$ represents the portfolio value at period t, when the portfolio of covariance matrix estimator i is rebalanced. In other words, a lower maximum drawdown attracts investors because it shows that the investment strategy is less risky.

4.4.3 Portfolio turnover (PT)

As DeMiguel and Nogales (2009) state, “Portfolio turnover provides information regarding the stability of a strategy i that rebalance portfolios over an investment horizon. It measures the extent of trading that has to be done to implement the strategy”. In other words, this indicator shows how stable the portfolio is each time its composition changes according to the optimal strategy.
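The Sharpe ratio and maximum drawdown defined in Sections 4.4.1 and 4.4.2 can be sketched as follows. This is only an illustration under stated assumptions: weekly returns are annualized arithmetically with 52 periods per year, volatility uses the sample standard deviation, the drawdown is measured as an absolute peak-to-trough drop in the units of the input series, and the function names are hypothetical.

```python
import numpy as np

WEEKS_PER_YEAR = 52  # assumption: weekly data, 52 periods per year


def sharpe_ratio(net_returns, rf_annual=0.0, periods=WEEKS_PER_YEAR):
    """Annualized excess return divided by annualized volatility."""
    r = np.asarray(net_returns, dtype=float)
    ann_return = r.mean() * periods              # arithmetic annualization
    ann_vol = r.std(ddof=1) * np.sqrt(periods)   # sample standard deviation
    return (ann_return - rf_annual) / ann_vol


def max_drawdown(values):
    """Largest drop from a running peak to a later value of the series."""
    v = np.asarray(values, dtype=float)
    running_peak = np.maximum.accumulate(v)      # highest value seen so far
    return float(np.max(running_peak - v))
```

For instance, `max_drawdown([100, 120, 90, 110, 80])` compares each value with the highest value reached before it, so the drop from 120 to 80 determines the result. Dividing by the running peak instead would give the drawdown as a percentage, which is a common alternative convention.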
A high value of the indicator means that the structure of the portfolio changes significantly each time the portfolio is optimized. This creates liquidity-related risks, because large numbers of stocks have to be sold or bought, and it also incurs significant transaction costs that reduce portfolio return. Investors therefore prefer a lower turnover, because it implies lower liquidity risk and lower transaction costs. DeMiguel and Nogales (2009) introduced the following formula for the portfolio weight turnover ($PT_i$) of a strategy i:

$$PT_i = \frac{1}{T} \sum_{t=1}^{T} \sum_{j=1}^{N} \left( \left| w^i_{j,t+1} - w^i_{j,t} \right| \right)$$

“where T is the number of rebalancing points and $w^i_{j,t+1}$ is the weight of asset j under strategy i at time t + 1. N is the size of the considered asset universe. i.e., the equation measures the average absolute changes of the portfolio weights over the T rebalancing points”.

4.4.4 Winning rate (WR)

Most traders focus on the winning rate, or win/loss ratio, introduced by Nick Radge (2006). The attraction is to eventually reach the stage where almost all of their trades are winners. In this dissertation, the author uses the winning rate, which shows how many trades the investors win out of all their trades. A high winning rate does not guarantee profitability for investors, but it can increase the probability that the investment wins. Therefore, a higher winning rate is better for a portfolio. The winning rate of strategy i ($WR_i$) is defined as follows:

$$WR_i = \frac{\text{number of winning trades}}{\text{total number of trades}}$$

4.4.5 Jensen’s Alpha

This metric was developed by the American economist Michael Jensen in 1968. Jensen’s metric is defined as a “risk-adjusted performance” measure that reflects how far the average return on an investment portfolio lies above or below the return predicted by the CAPM, given the beta of the investment portfolio and the average return of the market. In practice, investors often call this metric “Jensen’s alpha”, or simply “alpha”.
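The portfolio turnover and winning rate of Sections 4.4.3 and 4.4.4 can be sketched as follows. This is a minimal illustration with hypothetical function names. Two assumptions are made: turnover compares consecutive target weights without adjusting for price drift between rebalancing points, and each holding period is treated as one trade that counts as a win when its return is strictly positive.

```python
import numpy as np


def portfolio_turnover(weights):
    """Average, over rebalancing points, of the summed absolute weight changes.

    weights: (T + 1, N) array, one row of portfolio weights per rebalancing date.
    """
    w = np.asarray(weights, dtype=float)
    changes = np.abs(np.diff(w, axis=0)).sum(axis=1)   # one value per rebalance
    return float(changes.mean())


def winning_rate(period_returns):
    """Share of periods (treated as trades) with a strictly positive return."""
    r = np.asarray(period_returns, dtype=float)
    return float((r > 0).mean())
```

A turnover of 0 means the weights never change, while a value of 2 would mean the whole portfolio is sold and replaced at every rebalancing point.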
Thus, Jensen’s alpha is calculated as follows:

$$\alpha = R_p - [R_f + \beta (R_m - R_f)]$$

Where: $R_p$ is the portfolio’s return, $R_f$ is the risk-free rate, β is the beta coefficient of the portfolio and $R_m$ is the market’s return.

The value of Alpha (α) falls into one of three cases. If Alpha is positive, the investment portfolio has earned a return superior to the theoretical portfolio return calculated from the CAPM, because of selection skill, timing skill, or both. If Alpha is zero, the performance is what portfolio managers call neutral, because the investment portfolio has performed just as well as “the unmanaged portfolios with buy and hold stocks” that are selected randomly. If Alpha is negative, the portfolio managers have performed worse than the market, because their investment portfolios have generated returns lower than the CAPM’s return.

4.4.6 The statistical significance of the differences between two strategies on the performance measures

After the performance measures are computed by the back-testing process on the out-of-sample period, the author uses p-values that evaluate the statistical sign
