List of Abbreviations
List of Figures
List of Tables
CHAPTER 1: INTRODUCTION
1.1 Vietnam stock market overview
1.2 Problem statements
1.3 Objectives and research questions
1.4 Research Methodology
1.5 Expected contributions
1.6 Disposition of the dissertation
CHAPTER 2: LITERATURE REVIEW
2.1 Modern Portfolio Theory Framework
2.1.1 Concept of risk and return
2.1.2 Assumptions of the modern portfolio theory
2.1.3 MPT investment process
2.1.4 Criticism of the theory
2.2 Parameter estimation
2.2.1 Expected returns parameter
2.2.2 The covariance matrix parameter
2.3 Portfolio Selection
2.3.1 Mean-Variance Model
2.3.2 Global Minimum Variance Model (GMV)
CHAPTER 3: THEORETICAL FRAMEWORK
3.1 Basic preliminaries
3.1.1 Return
3.1.2 Variance
Shrinkage estimation of covariance matrix for portfolio selection on Vietnam stock market (preview excerpt)
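[...] estimator for γ is stated:
$$\hat{\gamma} = \sum_{i=1}^{N} \sum_{j=1}^{N} (f_{ij} - s_{ij})^2$$
The estimate $\hat{\gamma}$ is consistent because $f_{ij}$ and $s_{ij}$ are consistent estimators of $\phi_{ij}$ and $\sigma_{ij}$, respectively.
After the three components have been estimated, the shrinkage coefficient of SCCM is determined in practice as follows:
$$\hat{\delta}^{*} = \max\left\{0, \min\left\{\frac{\hat{\kappa}}{T}, 1\right\}\right\}$$
In which:
$$\hat{\kappa} = \frac{\hat{\pi} - \hat{\rho}}{\hat{\gamma}}$$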
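As a minimal sketch of how such a shrinkage coefficient can be applied, the snippet below uses the form given by Ledoit and Wolf (2004): the ratio (π − ρ)/γ is divided by the number of observations T and clipped to the interval [0, 1]. The function names and argument names are illustrative assumptions, and the three components are taken as already estimated.

```python
def shrinkage_intensity(pi_hat, rho_hat, gamma_hat, n_obs):
    """Clip kappa_hat / T to [0, 1], with kappa_hat = (pi_hat - rho_hat) / gamma_hat."""
    kappa_hat = (pi_hat - rho_hat) / gamma_hat
    return max(0.0, min(kappa_hat / n_obs, 1.0))


def shrink(sample_cov, target, delta):
    """Element-wise convex combination: delta * target + (1 - delta) * sample_cov."""
    return [
        [delta * f + (1.0 - delta) * s for f, s in zip(f_row, s_row)]
        for f_row, s_row in zip(target, sample_cov)
    ]
```

Clipping keeps the estimator a convex combination of the target and the sample covariance matrix even when the finite-sample estimate of the ratio falls outside [0, 1].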
3.3.6 Shrinkage to identity matrix (STIM)
This section presents the shrinkage method of Ledoit and Wolf (2004), which uses the identity matrix as the shrinkage target for the covariance matrix. The main difference between STIM and the two other shrinkage methods lies in the target matrix used in the linear shrinkage. SSIM and SCCM require “a judicious choice of the shrinkage target, which must be based on known features of the true covariance matrix for the application at hand”. For example, Ledoit and Wolf exploited a first known feature, that stock returns have a factor-model structure, to estimate the covariance matrix of stock returns in SSIM, and a second known feature, that “the average correlation of stock returns is positive”, to estimate the covariance matrix in SCCM. By contrast, the identity matrix is “the natural choice for a generic target”: it is simply a square matrix whose main diagonal elements all equal one and whose remaining elements are zero, so this choice of target does not draw on any application-specific knowledge.
Besides, shrinkage towards the identity matrix is a weighted combination of the sample covariance matrix (SCM) and an identity matrix, which serves as the shrinkage target. As an estimator of the true covariance matrix, the identity matrix has no variance but a lot of bias. The sample covariance matrix is the opposite: it has a lot of variance but no bias. Therefore, a combination of the two matrices can yield a better covariance matrix estimate than either matrix alone. When Ledoit and Wolf proposed the STIM method in 2004, they asked whether investors could select optimized portfolios in the absence of financial knowledge.
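If the target matrix of the shrinkage method is the identity matrix, the optimal shrinkage intensity is estimated as follows.
As mentioned in Ledoit and Wolf (2004), four scalars play a significant role in the analysis:
$$\mu = \langle \Sigma, I \rangle$$
$$\alpha^2 = \|\Sigma - \mu I\|^2$$
$$\beta^2 = E[\|S - \Sigma\|^2]$$
$$\delta^2 = E[\|S - \mu I\|^2]$$
Here $\langle A, B \rangle = \mathrm{tr}(AB^{\top})/N$ is the normalized Frobenius inner product used by Ledoit and Wolf (2004), so that $\|I\|^2 = 1$.
These scalars satisfy:
$$E[\|S - \mu I\|^2] = E[\|S - \Sigma + \Sigma - \mu I\|^2]$$
$$= E[\|S - \Sigma\|^2] + E[\|\Sigma - \mu I\|^2] + 2E[\langle S - \Sigma, \Sigma - \mu I \rangle]$$
$$= E[\|S - \Sigma\|^2] + \|\Sigma - \mu I\|^2 + 2\langle E[S - \Sigma], \Sigma - \mu I \rangle \quad (3.3.11)$$
Note that E[S] = Σ, so E[S − Σ] = 0.
As a result, equation (3.3.11) becomes:
$$E[\|S - \mu I\|^2] = E[\|S - \Sigma\|^2] + \|\Sigma - \mu I\|^2$$
that is, $\delta^2 = \beta^2 + \alpha^2$.
Now reconsider the optimization problem for the covariance matrix using the identity matrix (I):
$$\min_{\rho_1, \rho_2} E[\|\Sigma^{*} - \Sigma\|^2] \quad \text{s.t.} \quad \Sigma^{*} = \rho_1 I + \rho_2 S \quad (3.3.12)$$
By a change of variables, the constraint in (3.3.12) can be expressed as:
$$\Sigma^{*} = \rho \nu I + (1 - \rho) S$$
so that the optimization problem becomes:
$$\min_{\rho, \nu} E[\|\Sigma^{*} - \Sigma\|^2] \quad \text{s.t.} \quad \Sigma^{*} = \rho \nu I + (1 - \rho) S$$
As proved in Ledoit and Wolf (2004), the objective function can be written as:
$$E[\|\Sigma^{*} - \Sigma\|^2] = \rho^2 \|\Sigma - \nu I\|^2 + (1 - \rho)^2 E[\|S - \Sigma\|^2]$$
The optimal value of ν therefore solves a reduced problem that does not depend on ρ:
$$\min_{\nu} \|\Sigma - \nu I\|^2$$
Then,
$$\|\Sigma - \nu I\|^2 = \|\Sigma\|^2 - 2\nu \langle \Sigma, I \rangle + \nu^2 \quad (3.3.13)$$
Taking the first derivative of equation (3.3.13) with respect to ν and setting it equal to zero, the value of ν is obtained as:
$$-2\langle \Sigma, I \rangle + 2\nu = 0$$
$$\nu = \langle \Sigma, I \rangle = \mu$$
Substituting ν = μ into the objective function of problem (3.3.12), the objective function becomes:
$$E[\|\Sigma^{*} - \Sigma\|^2] = \rho^2 \alpha^2 + (1 - \rho)^2 \beta^2$$
Setting the first derivative with respect to ρ equal to zero gives:
$$2\rho \alpha^2 - 2(1 - \rho) \beta^2 = 0 \quad (3.3.14)$$
so the optimal intensity is $\rho = \beta^2 / (\alpha^2 + \beta^2) = \beta^2 / \delta^2$, and the optimization problem attains the value:
$$E[\|\Sigma^{*} - \Sigma\|^2] = \left(\frac{\beta^2}{\delta^2}\right)^2 \alpha^2 + \left(\frac{\alpha^2}{\delta^2}\right)^2 \beta^2 = \frac{\alpha^2 \beta^2}{\delta^2}$$
To sum up, the shrinkage identity covariance matrix ($\Sigma^{*}$) is determined as follows:
$$\Sigma^{*} = \frac{\beta^2}{\delta^2} \mu I + \frac{\alpha^2}{\delta^2} S, \qquad E[\|\Sigma^{*} - \Sigma\|^2] = \frac{\alpha^2 \beta^2}{\delta^2}$$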
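The four scalars above depend on the unknown true covariance matrix, so in practice they have to be replaced by sample counterparts. The sketch below follows the sample estimators proposed in Ledoit and Wolf (2004) under the normalized Frobenius norm (squared norm equal to the sum of squared entries divided by N). It assumes NumPy is available and that `returns` is a T × N array of asset returns; the function name `shrink_to_identity` and the variable names are illustrative, not the author's code.

```python
import numpy as np


def shrink_to_identity(returns):
    """Shrink the sample covariance matrix towards a scaled identity matrix.

    returns: array of shape (T, N), one row per period, one column per asset.
    Gives back the shrunk covariance matrix and the shrinkage intensity rho.
    """
    X = np.asarray(returns, dtype=float)
    T, N = X.shape
    X = X - X.mean(axis=0)                 # demean each asset
    S = X.T @ X / T                        # sample covariance matrix (1/T scaling)
    m = np.trace(S) / N                    # sample counterpart of mu = <Sigma, I>
    d2 = np.sum((S - m * np.eye(N)) ** 2) / N          # counterpart of delta^2
    # counterpart of beta^2: average squared distance of x x' from S, over T
    b_bar2 = sum(np.sum((np.outer(x, x) - S) ** 2) for x in X) / N / T ** 2
    b2 = min(b_bar2, d2)                   # keeps the intensity inside [0, 1]
    rho = b2 / d2                          # weight on the identity target
    sigma_star = rho * m * np.eye(N) + (1.0 - rho) * S
    return sigma_star, rho
```

Because the target is scaled by the average sample variance m, the shrunk matrix keeps the same trace as S while pulling the extreme eigenvalues towards m, which makes it invertible even when T is smaller than N.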
CHAPTER 4: METHODOLOGY
This chapter presents the methodology used to answer the research questions stated above. First, the study clarifies how the input data for the portfolio optimization procedure are collected and processed. Second, the author presents the portfolio performance evaluation methodology and the performance metrics employed to measure portfolio performance. Third, the author introduces the way p-values are computed to measure the statistical significance of the differences among the performance metrics. Finally, the VN-Index and 1/N portfolios are introduced as benchmarks against which the performance of the different strategies is compared.
4.1 Input Data
The input data for the optimization procedure are weekly stock price series, from which weekly returns are calculated for all stocks involved. Returns are measured on adjusted prices, which account for dividends and for changes in capital due to stock splits. The author divides the observation sample D(t) into two parts, W(t) and V(t). W(t) is the initialization phase, used to estimate the covariance matrix and initialize the first portfolio; this period is called the in-sample period. V(t) is the evaluation period, used to test the performance of the estimation methods; it is called the out-of-sample period.
In more detail, the data set D(t) in this dissertation contains 468 observations, corresponding to 468 weeks from January 2011 to December 2019. The initialization period W(t) = 104 weeks corresponds to the two years from January 2011 to January 2013. The remaining data, from January 2013 to December 2019, form the evaluation period V(t) = 364 weeks. All companies listed on the Ho Chi Minh City Stock Exchange (HOSE) are considered, except those listed for less than two years since their Initial Public Offering (IPO). As a result, 350 stocks are selected for the optimization procedure. The data are taken from HOSE and are denominated in VND. The VN-Index, the Vietnam stock index, is used as the reference index in the SIM and SSIM methods.
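To make the data preparation concrete, the following sketch computes weekly returns from a series of adjusted closing prices and splits a series into the initialization part W and the evaluation part V. It assumes simple (arithmetic) returns, which may differ from the return definition used elsewhere in the dissertation, and the function names are illustrative.

```python
def simple_returns(adjusted_prices):
    """Period-over-period simple returns from a chronological price list."""
    return [
        current / previous - 1.0
        for previous, current in zip(adjusted_prices, adjusted_prices[1:])
    ]


def split_sample(observations, n_init=104):
    """Split a series into the initialization part W and the evaluation part V."""
    return observations[:n_init], observations[n_init:]


weeks = list(range(468))                 # placeholder for 468 weekly data points
w_part, v_part = split_sample(weeks)     # 104 weeks in W, 364 weeks in V
```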
The whole dataset was taken directly from HOSE and checked carefully against other data sources. In the preprocessing step, the author encountered some errors caused by data ingestion issues on the server. The two most common scenarios are missing prices and/or volumes, and multiple successive days with the same price and zero volume. Hence, after crawling the data and updating the database at the end of each trading day, the author needs to match the stocks’ prices and volumes against other sources and/or use different techniques to impute the data before moving on to the later computational steps. A sample of the collected dataset is shown below:
Table 4.1: Sample of the dataset collected in the period 2011-2019

Stock    Date         Opening   Closing   Adjusted        Adjusted        Daily
ticker                price     price     opening price   closing price   volume
                      (VND)     (VND)     (VND)           (VND)           (shares)
AAA      31/12/2019   12,650    12,700    12,650          12,700          1,258,300
BID      31/12/2019   46,200    46,150    46,200          46,150          630,410
CNG      31/12/2019   24,400    25,000    23,370          23,940          63,350
DHG      31/12/2019   92,000    91,500    92,000          91,500          13,530
EIB      31/12/2019   17,800    17,800    17,800          17,800          149,850
FPT      31/12/2019   58,600    58,300    58,600          58,300          689,030
GAS      31/12/2019   96,800    93,700    96,800          93,700          388,160
..       ..           ..        ..        ..              ..              ..

Source: Ho Chi Minh City Stock Exchange (HOSE) 2011-2019
By the end of 2019, data had been collected for a total of 382 companies listed on HOSE. The number of listed companies in 2019 was 90 higher than in 2013, when only 292 companies were listed (Figure 4.1). However, at most 350 companies are selected for the back-testing period 2013-2019, because companies entering the optimization process must satisfy the following requirements. First, the liquidity of a company must be guaranteed, meaning that its daily trading volume must be greater than its average trading volume over the previous 20 days. Second, the company must have been listed on HOSE for long enough, at least 2 years, corresponding to 104 weeks. Third, the reliability of the company's data must also be guaranteed.
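The first two requirements can be expressed as a simple filter. The sketch below is only an illustration under stated assumptions: `daily_volumes` is a chronological list of one stock's daily trading volumes with the latest day last, `weeks_listed` is the number of weeks since listing, and the function name is hypothetical. The third requirement (data reliability) is a manual check and is not modelled here.

```python
def is_eligible(daily_volumes, weeks_listed, lookback_days=20, min_weeks=104):
    """Apply the liquidity rule and the listing-age rule to one stock."""
    if weeks_listed < min_weeks or len(daily_volumes) <= lookback_days:
        return False
    latest = daily_volumes[-1]
    # the 20 trading days immediately before the latest day
    previous = daily_volumes[-lookback_days - 1:-1]
    return latest > sum(previous) / lookback_days
```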
Source: Ho Chi Minh City Stock Exchange (HOSE) 2013-2019
Figure 4.1: The universe of stocks on HOSE from 2013 - 2019
Besides, among the selected companies, the stock with the highest market capitalization is VIC, at nearly 377 trillion VND, while ICF has the smallest market capitalization, at 10.5 billion VND. Moreover, during the back-testing period YEG had the highest trading price, at 343,000 VND, and VHG the lowest, at 370 VND. The stock with the largest daily trading volume is DIG, with a volume of 12.8 million shares.
Figure 4.2: The number of listed companies by industry group on HOSE, 2019
[Pie chart data, number of companies: Utilities 25; Information Technology 4; Materials 59; Health Care 12; Consumer Staples 35; Financial 28; Energy 10; Communication Services 2; Industrials 105; Consumer Discretionary 40; Real Estate 47]
Furthermore, the selected companies are divided into 11 industry groups based on the Global Industry Classification Standard (GICS): Utilities, Information Technology, Materials, Health Care, Consumer Staples, Financial, Energy, Communication Services, Industrials, Consumer Discretionary and Real Estate. The number of listed companies is most concentrated in Industrials, with a total of 105 companies, accounting for 27%. Meanwhile, Communication Services is the industry with the fewest listed companies, with only 2.
Figure 4.3: The market capitalization of industry groups on HOSE, 2019
[Pie chart data, share of total market capitalization: Utilities 7%; Information Technology 2%; Materials 5%; Health Care 1%; Consumer Staples 16%; Financial 30%; Energy 2%; Communication Services 0%; Industrials 8%; Consumer Discretionary 3%; Real Estate 26%]
Despite being the industry with the largest number of listed companies, Industrials accounts for only 8% of total market capitalization. The industry with the largest market capitalization is Financial, which accounts for 30%, followed by Real Estate with 26%. Communication Services remains the industry with the lowest market capitalization, as only 2 companies are listed.
4.2 Portfolio performance evaluation methodology
To examine the performance of different covariance matrix estimators in a portfolio optimization problem, a back-testing process is applied in this dissertation. Algorithmic trading differs from other investment classes in that the abundance of available data allows an investor to make more reliable forecasts of future performance from past performance. The process by which this is done is called back-testing. In more detail, back-testing is performed by providing “the particular strategy algorithm to a stream of historical financial data, which leads to a set of trading signals” (Ernest, 2009). Every trade, that is, every selling or buying signal, brings an associated profit or loss. The total profit and loss is the accumulation of these profits and losses over the back-testing period. The back-testing process allows “the (prior) statistical properties of the strategy to be examined, providing insight into whether a strategy can be profitable in the future” (Ernest, 2013). More details of the back-testing process are presented in the following sections of this research.
Based on the back-testing system, the author compares the different policies, or covariance matrix estimators, using a “rolling-horizon” procedure. First, the author chooses an estimation window of length T < L, where L is “the total number of samples” in the data set. Second, the return data in the estimation window are used to compute the different optimal portfolios. Third, the author repeats this “rolling-window” process for the next period by adding the newest data point and dropping the earliest one. The process is repeated until the last data point of the data set is reached. At the end of this process, the author has computed “L−T portfolio weight vectors” for each covariance matrix estimator, that is, “$w_t^k$ for t = T, …, L − 1 and for each estimator k”.
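The rolling-horizon bookkeeping can be sketched as a small generator. This is an illustration only: it uses 0-based indices, the function name is hypothetical, and the window length of 104 and sample size of 468 are the values stated in Section 4.1.

```python
def rolling_windows(n_samples, window):
    """Yield (start, end, test_index) for each rolling estimation window.

    Observations start .. end - 1 form the estimation window, and test_index
    is the next observation, on which the resulting portfolio is evaluated.
    """
    for end in range(window, n_samples):
        yield end - window, end, end


windows = list(rolling_windows(n_samples=468, window=104))
```

With these values the generator produces 468 − 104 = 364 windows, one per out-of-sample week, which matches the count of L − T weight vectors described above.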
In more detail, beginning on 1 January 2013, weekly historical data from the previous two years are used to calculate the covariance matrix in the minimum-variance optimization procedure and so initialize the first portfolio. This period is called the in-sample period. Thereafter, the minimum-variance optimized portfolio is held for one week, which is the out-of-sample period. Specifically, portfolios are rebalanced weekly and the covariance matrix is recalculated at every rebalancing point. This process is repeated until the last rebalancing point, 31 December 2019. To assess whether altering the covariance matrix estimator improves the minimum-variance optimized portfolios, the back-testing process is applied to all the covariance matrix estimators presented in the dissertation, with everything else held constant.
In summary, to evaluate the efficiency of the covariance matrix shrinkage methods, a back-testing process is built and applied in this research using the back-testing platform of Tran et al. (2020). The back-testing process helps the author appraise the feasibility and potential application of each estimation method in the near future, given the price series of the stocks in the portfolio. The back-testing process is conducted as follows:
Step 1: Divide the observations D(t) into two parts, W and V. W is the initial stage used to estimate the covariance matrix, usually called the in-sample period, and V is the testing stage for the portfolio selection methods, usually called the out-of-sample period. In this study, based on the policies and settings of the Vietnam stock market (for example, three days are required to settle a sale or purchase of stocks), the author chooses weekly rather than daily trading. Hence, as stated in Section 4.1, the total number of observations is D(t) = 468, each data point being one week. The initial stage is W = 104 weeks (2 years) and the testing stage is V = 364 weeks.
Step 2: Use the data in the initial part W to estimate the covariance matrix, and use this matrix as the input to the portfolio optimization that selects the optimal portfolio. The optimal portfolio is then tested on data point $t_{w+1}$ according to the portfolio performance criteria.
Step 3: Replace the oldest data point in the initial part W with data point $t_{w+1}$ to create $W_2$, then repeat the optimal portfolio selection and evaluate its results as in Step 2 on data point $t_{w+2}$. This process is repeated throughout the testing stage V and ends at data point $t_{w+v}$.
Step 4: Calculate and extract the results over the testing stage V. The portfolio performance criteria applied to evaluate the portfolio selection process over V include: average portfolio return, portfolio volatility, portfolio turnover, maximum drawdown, winning rate and Jensen’s Alpha. Moreover, transaction costs are also considered during the testing procedure. Each time the portfolio composition changes according to the optimal results, transaction costs are incurred. The trading cost is assumed to be 0.3% of the total buying value or selling value of the portfolio each time, in line with the rate actually charged by most securities firms on the Vietnam equity market. Finally, instead of checking the estimation methods on a single portfolio (N = 350 stocks), the author checks them on four portfolios of different sizes (N = 50, 100, 200, 350). Stocks are allocated to portfolios according to their market capitalization. For example, N = 50 means that the portfolio consists of the 50 stocks with the largest market capitalization, N = 100 the 100 stocks with the largest market capitalization, and likewise for N = 200 and N = 350. The market capitalization of a company is measured as its trading price multiplied by its number of shares outstanding.
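The four steps can be sketched as a short loop. This is a simplified illustration, not the back-testing platform of Tran et al. (2020): it assumes NumPy, uses the closed-form unconstrained global minimum-variance weights (the dissertation may impose further constraints such as no short selling), charges the 0.3% cost on the sum of absolute weight changes, and ignores the drift of weights between rebalancing points. All function names are hypothetical.

```python
import numpy as np


def gmv_weights(cov):
    """Unconstrained minimum-variance weights: inv(cov) 1 / (1' inv(cov) 1)."""
    ones = np.ones(cov.shape[0])
    raw = np.linalg.solve(cov, ones)
    return raw / raw.sum()


def sample_cov(window_returns):
    """Plain sample covariance matrix; any other estimator can be swapped in."""
    return np.cov(window_returns, rowvar=False)


def backtest(returns, window, estimator, cost_rate=0.003):
    """Rolling one-period-ahead back-test; gives back net portfolio returns."""
    returns = np.asarray(returns, dtype=float)
    n_periods, n_assets = returns.shape
    previous = np.zeros(n_assets)                    # start from cash
    net_returns = []
    for t in range(window, n_periods):
        cov = estimator(returns[t - window:t])       # Step 2: estimate on the window
        weights = gmv_weights(cov)
        traded = np.abs(weights - previous).sum()    # fraction bought plus sold
        gross = float(weights @ returns[t])          # test on the next data point
        net_returns.append(gross - cost_rate * traded)
        previous = weights                           # Step 3: roll the window forward
    return np.array(net_returns)                     # Step 4: input to the metrics
```

Passing a different `estimator` function (for example a shrinkage estimator) while keeping everything else fixed is what allows the estimators to be compared on equal terms.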
The testing process is presented in the diagram below (see Figure 4.4):
Figure 4.4: Back-testing procedure
[Diagram: the data set D(t) is split into W and V; at iteration i = 1 the window W is tested on point $t_{w+1}$, at i = 2 the window $W_2$ is tested on point $t_{w+2}$, and so on until iteration i = V, where the window $W_v$ is tested on point $t_{w+v}$]
4.3 Transaction costs
Transaction costs were ignored in the original MVO problem of Markowitz (1952). However, when a portfolio is rebalanced frequently, transaction costs have a very large influence on its returns. Incorporating transaction costs into the optimization procedure therefore helps investors control the number of trades and rebalancing points. This is attractive to investors because it lowers their portfolio turnover.
Ledoit and Wolf (2003a) did not consider transaction costs, which can have a large influence on portfolio performance. In recent research, more and more researchers and practitioners account for transaction costs in portfolio optimization to better understand their optimization methods, for example DeMiguel et al. (2009) and Han (2018). These papers use a turnover indicator, which represents the change in portfolio composition between periods, to determine how transaction costs affect their optimization strategies. In this study, the author also uses this indicator, presented in more detail in the next part, to measure the influence of transaction costs on portfolio performance. Moreover, to bring the optimization strategies closer to the real world, each transaction cost is calculated as 0.3% of the total buying or selling value of the portfolio. This percentage is applied by many securities companies on the Vietnam stock market.
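As a small worked example of the 0.3% rule, the sketch below charges the cost rate on the value bought plus the value sold when moving from one set of weights to another. The function name and the choice to charge both legs of the trade are assumptions made for illustration.

```python
COST_RATE = 0.003  # 0.3% of the value bought or sold, as stated in the text


def rebalancing_cost(old_weights, new_weights, portfolio_value, cost_rate=COST_RATE):
    """Transaction cost of moving from old_weights to new_weights.

    Buys and sells are both charged, so the cost base is the sum of absolute
    weight changes multiplied by the portfolio value.
    """
    traded_fraction = sum(
        abs(new - old) for old, new in zip(old_weights, new_weights)
    )
    return cost_rate * traded_fraction * portfolio_value
```

For a portfolio worth 1,000,000 VND that moves from weights (0.5, 0.5) to (0.7, 0.3), the traded fraction is 0.4, so the cost is 0.003 × 0.4 × 1,000,000 = 1,200 VND.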
4.4 Performance metrics
In this part, the author presents the performance metrics used to measure the efficiency of the optimal portfolios. The most common performance metrics are the return and the risk of a portfolio, which were introduced in the previous sections. The return of a portfolio is its gain or loss over a given period of time, while the risk of a portfolio is the volatility of its return, which can be measured by the variance of the portfolio’s return. Besides these common metrics, this dissertation also considers other criteria: the Sharpe ratio, maximum drawdown, portfolio turnover, winning rate and Jensen’s Alpha.
4.4.1 Sharpe ratio (SR)
The Sharpe ratio, introduced by Sharpe (1966), measures the ratio between the excess return (the return after subtracting the risk-free rate) and volatility. It is defined as:
$$SR_i = \frac{\bar{R}_i - \bar{R}_f}{\bar{\sigma}_i}$$
Where:
$\bar{R}_i$: the annualized net return (after transaction costs) for strategy i.
$\bar{R}_f$: the annualized risk-free rate for the evaluated period.
$\bar{\sigma}_i$: the annualized volatility for strategy i.
This indicator helps investors weigh the return of an investment against its risk: it measures the return earned per unit of risk when investing in a portfolio. Thus, the higher the indicator, the better for investors. In Sharpe’s view, it is a robust indicator under the assumption that “volatility (standard deviation) as a good proxy for risk holds true”.
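A minimal sketch of the Sharpe ratio computed from weekly net returns follows. The annualization convention is an assumption made here (mean return times 52 and standard deviation times the square root of 52); the dissertation may annualize differently, and the function name is illustrative.

```python
import math
import statistics


def sharpe_ratio(weekly_net_returns, annual_risk_free_rate, periods_per_year=52):
    """Annualized excess return divided by annualized volatility."""
    annual_return = statistics.mean(weekly_net_returns) * periods_per_year
    annual_vol = statistics.stdev(weekly_net_returns) * math.sqrt(periods_per_year)
    return (annual_return - annual_risk_free_rate) / annual_vol
```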
4.4.2 Maximum drawdown (MDD)
The maximum drawdown is another important indicator used to evaluate portfolio performance. It was discussed by Chekhlov and Uryasev in 2005. It reflects the maximum loss of a portfolio from a peak to the subsequent lowest point over a given time period. It gives investors more information about the behaviour of their portfolio when the market is going down. Specifically, this metric estimates the accumulated loss that investors can suffer on their investment, and investors usually regard it as an indicator of downside risk. The lower this indicator is, the safer the portfolio.
The maximum drawdown of the value series of strategy i, expressed in return terms, is stated as follows:
$$MDD_i = \max_{\tau \in (0, T)} \left[ \max_{t \in (0, \tau)} \left( \frac{P_{i,t} - P_{i,\tau}}{P_{i,t}} \right) \right]$$
Where $P_{i,t}$ represents the portfolio value at period t, when the portfolio based on covariance matrix estimator i is rebalanced. In other words, a lower maximum drawdown attracts investors because it shows that the investment strategy is less risky.
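The peak-to-trough definition can be computed in a single pass over the portfolio value series by tracking the running peak. The sketch below expresses the drawdown as a fraction of the peak value; the function name is illustrative.

```python
def max_drawdown(values):
    """Largest relative fall from a running peak in a series of portfolio values."""
    peak = values[0]
    worst = 0.0
    for value in values:
        peak = max(peak, value)                    # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # fall from that peak
    return worst
```

For the value series 100, 120, 90, 110, 80 the peak is 120 and the later trough is 80, so the maximum drawdown is 40/120, or about 33%.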
4.4.3 Portfolio turnover (PT)
As DeMiguel and Nogales (2009) state, “Portfolio turnover provides information regarding the stability of a strategy i that rebalance portfolios over an investment horizon. It measures the extent of trading that has to be done to implement the strategy”. In other words, this indicator shows how stable the portfolio is each time its composition changes according to the optimal strategy. A high value means that the structure of the portfolio changes significantly at each re-optimization, which creates liquidity risk, because large numbers of stocks must be bought or sold, and significant transaction costs, which reduce the portfolio return. Therefore, investors prefer a lower turnover, because it implies lower liquidity risk and lower transaction costs. DeMiguel and Nogales (2009) introduced the following formula for the portfolio weight turnover ($PT_i$) of a strategy i:
$$PT_i = \frac{1}{T} \sum_{t=1}^{T} \sum_{j=1}^{N} \left( \left| w_{j,t+1}^{i} - w_{j,t}^{i} \right| \right)$$
“where T is the number of rebalancing points and $w_{j,t+1}^{i}$ is the weight of asset j under strategy [i] at time t + 1. N is the size of the considered asset universe. i.e., the equation measures the average absolute changes of the portfolio weights over the T rebalancing points”.
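The turnover measure can be sketched as the average, over consecutive rebalancing points, of the summed absolute weight changes. The example below compares the target weights at successive rebalances and ignores the drift of weights caused by price changes in between, which is a simplifying assumption; the function name is illustrative.

```python
def portfolio_turnover(weight_history):
    """Average sum of absolute weight changes between consecutive rebalances.

    weight_history: list of weight vectors, one per rebalancing point.
    """
    changes = [
        sum(abs(new - old) for old, new in zip(previous, current))
        for previous, current in zip(weight_history, weight_history[1:])
    ]
    return sum(changes) / len(changes)
```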
4.4.4 Winning rate (WR)
Most traders focus on the winning rate, or win/loss ratio, introduced by Nick Radge (2006). The attraction is to eventually reach the stage where almost all of their trades are winners. In this study, the author uses the winning rate, which shows how many trades the investors win out of all their trades. A high winning rate does not guarantee profitability for the investors, but it increases the probability that the investment wins. Therefore, a higher winning rate is better for a portfolio. The winning rate of strategy i ($WR_i$) is defined as follows:
$$WR_i = \frac{\text{number of winning trades}}{\text{total number of trades}}$$
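As a brief illustration, the winning rate can be computed as the share of trades with a strictly positive net return. Treating each holding period as one trade and counting a zero return as a non-win are assumptions made here, and the function name is illustrative.

```python
def winning_rate(trade_returns):
    """Fraction of trades whose net return is strictly positive."""
    wins = sum(1 for r in trade_returns if r > 0)
    return wins / len(trade_returns)
```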
4.4.5 Jensen’s Alpha
This metric was developed by the American economist Michael Jensen in 1968. Jensen’s metric is a “risk-adjusted performance” measure that reflects how far the average return on an investment portfolio lies above or below the return predicted by the CAPM, given the beta of the investment portfolio and the average return of the market. In practice, investors often call this metric “Jensen’s alpha”, or simply “alpha”.
Jensen’s alpha is calculated as follows:
$$\alpha = R_p - [R_f + \beta (R_m - R_f)]$$
Where $R_p$ is the portfolio’s return, $R_f$ is the risk-free rate, β is the beta coefficient of the portfolio and $R_m$ is the market’s return.
The value of alpha (α) falls into one of three cases. If alpha is positive, the investment portfolio has earned a return superior to the theoretical portfolio return calculated from the CAPM, because of selection skills, timing skills, or both. If alpha is zero, portfolio managers regard the performance as neutral, because the investment portfolio has performed just as well as “the unmanaged portfolios with buy and hold stocks” that are selected randomly. If alpha is negative, the portfolio managers performed worse than the market, because their investment portfolios generated returns lower than the CAPM return.
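The calculation can be sketched in two small functions: one estimating beta as the covariance of portfolio and market returns divided by the variance of market returns, and one applying the CAPM formula. Estimating beta this way from the same return series is an assumption made for illustration, as are the function names.

```python
def beta(portfolio_returns, market_returns):
    """Slope of portfolio returns on market returns: cov(p, m) / var(m)."""
    n = len(market_returns)
    mean_p = sum(portfolio_returns) / n
    mean_m = sum(market_returns) / n
    cov = sum(
        (p - mean_p) * (m - mean_m)
        for p, m in zip(portfolio_returns, market_returns)
    )
    var = sum((m - mean_m) ** 2 for m in market_returns)
    return cov / var


def jensens_alpha(portfolio_return, market_return, risk_free_rate, beta_value):
    """Portfolio return minus the return predicted by the CAPM."""
    capm_return = risk_free_rate + beta_value * (market_return - risk_free_rate)
    return portfolio_return - capm_return
```

For example, with a portfolio return of 12%, a market return of 10%, a risk-free rate of 4% and a beta of 1.2, the CAPM return is 4% + 1.2 × 6% = 11.2%, so alpha is 0.8%.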
4.4.6 The statistical significance of the differences between two strategies on the performance measures
After the performance measures are computed by the back-testing process on the out-of-sample period, the author uses p-values that evaluate the statistical sign