The identification of factors that predict the cross-section of stock returns has been a focus of asset pricing theory for decades. We address this challenging problem for both equity performance and risk, the latter through the maximum drawdown measure. We test a variety of regression-based models used in the field of supervised learning including penalized linear regression, tree-based models, and neural networks. Using empirical data in the US market from January 1980 to June 2018, we find that a number of firm characteristics succeed in explaining the cross-sectional variation of active returns and maximum drawdown, and that the latter has substantially better predictability. Non-linear models materially add to the predictive power of linear models. Finally, environmental, social, and governance impact enhances predictive power for non-linear models when the number of variables is reduced.

Here are my notes on Saad Mouti's presentation at UC Berkeley

Tuesday, February 19th, 2019
11:00 AM - 12:30 PM
1011 Evans Hall

Question: Can government issues and politics drive stock prices?

Goal: Analyze the cross-section of firms' returns and maximum drawdown using firm-level characteristics. Explore whether ESG scores improve the prediction of the dependent variables: future abnormal returns (AR) and maximum drawdown (MDD) over one year.

E (environmental)
S (social)
G (governance)

Regress one of the dependent variables (AR or MDD) on the firms' characteristics using a set of machine learning methods
Look at the predictive power of firms' characteristics through an out-of-sample (oos) analysis

Current Results: MDD presents the best forecast results, with an overall $R^2$ increase of 17% over the three-year oos period.

Motivation and Literature

"In the short run, the market is a voting machine but in the long run, it is a weighing machine" - Benjamin Graham

Long term investors need to be able to choose their stocks wisely based on a projection of long term return and risk to allocate their portfolio.

From a risk perspective:

  • MDD is a major risk, as it can take years for long buy-and-hold portfolios to break even
  • The S&P 500 took about 4 years to break even after the bottom of March 2009
  • Single stocks like JPM (JP Morgan) also took 4 years to break even, while BAC (Bank of America) hasn't reached its previous high since the subprime crisis

The proliferation of quality data and decent computational performance, along with advances in machine learning, has opened the door to extensive empirical analyses

Environmental, Social and Governance issues can be a reason for a stock's cumulative negative performance (e.g. Facebook losing investors' trust due to data (mis)management)

Some signals can indicate a stock's potential to outperform the market, or its risk of a high drawdown

Explore which ones are dominant and whether ESG data improves the forecast of future returns and MDD with a focus on MDD

In the following we will focus mainly on MDD before presenting the results for abnormal returns as well.


  • Study of its distributional properties: under a Brownian diffusion framework, analyze the MDD and give an infinite representation of its distribution ...
  • Analysis of risk measures derived from MDD (introduce the Conditional Drawdown at Risk (CDaR), formalize the Conditional Expected Drawdown (CED), and explore its properties as a generalized deviation measure)
  • ...

A fourth direction of drawdown analysis focuses on factors that drive asset returns in crash periods:

  • conduct an empirical analysis of maximum drawdown determinants for funds of hedge funds (FHF): find that size is NOT a factor, while age increases the time an FHF needs to recover from a maximum drawdown
  • explore portfolio performance using drawdown-based measures: find that momentum, value, and size lead to a different ranking than the Sharpe ratio for middle- and high-performance portfolios, but leave the ranking of the worst-performing portfolios unchanged
  • focus on momentum strategies: find that these strategies reverse following drawdowns and are marked by an asymmetry between winner and loser exposure during extreme times

[Data Collection]

  • consider companies with a positive book-to-market ratio and eliminate the 0.05% smallest ones -> this results in 1.97M data points for the whole period
  • We consider 81 characteristics: 74 are numeric (e.g. volatility, beta) and 7 are categorical (e.g. Sector, Sin)
  • Some firm characteristics are based on returns, others on accounting variables or are out
  • 52 variables are updated annually (e.g. dividend-to-price, book-to-market, leverage), 12 are updated quarterly (e.g. return on assets, return on equity), and 31 are updated monthly (e.g. beta, volatility, momentums, ESG data)

(ESG scores are provided by Owl Analytics)

Dependent variables

  • define abnormal returns as the stock's log return in excess of the (value-weighted) market log return
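  • use the maximum drawdown (MDD), defined as the maximum negative performance of a portfolio over a fixed time horizon $T$. In its usual form, with $P_t$ denoting the price at time $t$: $$ MDD = \max_{1 \le s \le t \le T} \frac{P_s - P_t}{P_s} $$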
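
As a minimal sketch of the MDD definition above, the largest peak-to-trough loss can be computed from a price series by tracking the running peak. The function name `max_drawdown` and the price values are hypothetical and only for illustration; the notes do not say how the speaker implemented this.

```python
import numpy as np

def max_drawdown(prices):
    """Largest peak-to-trough loss, as a fraction of the peak, over the series."""
    prices = np.asarray(prices, dtype=float)
    running_peak = np.maximum.accumulate(prices)        # highest price seen so far
    drawdowns = (running_peak - prices) / running_peak  # loss relative to that peak
    return float(drawdowns.max())

# Hypothetical daily prices: the peak of 120 is followed by a trough of 90.
example_prices = [100, 120, 110, 90, 105, 130]
mdd = max_drawdown(example_prices)
print(round(mdd, 4))  # 0.25, i.e. (120 - 90) / 120
```

Tracking the running peak gives the same result as maximizing over all pairs of dates, but in a single pass over the series.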


  • explore which characteristics help forecast future risk and return, and conduct the following analyses:
  • exploratory analysis of Pearson correlation for a given date and on pooled data based on normalized MDD
  • ...

[Steps in Machine Learning Methods]

  • Goal: Predict $y$ given $p$ regressors $x_1, \dots , x_p$
  • We are interested in finding whether some firm characteristics explain the cross-section of the dependent variable using pooled regression
  • In its general form, $$ y_{i,t+1} = E_t (y_{i,t+1}) + \epsilon_{i, t+1}, $$ where $$ E_t (y_{i,t+1}) = g^{*}(z_{i,t}). $$ Here $i = 1,\dots, N$ indexes stocks and $t = 1 ,\dots, T$ indexes months, $z$ denotes the cross-sectionally (CS) normalized firm characteristics, and $MDD$ is calculated from daily observed stock prices
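
To make the pooled setup concrete, here is a minimal sketch that stacks a panel of characteristics $z_{i,t}$ and next-period targets $y_{i,t+1}$ into one sample and fits the simplest choice of $g^{*}$, a linear map. The array shapes, the helper `pool_panel`, and the simulated coefficients are assumptions made for illustration, not the data or code from the talk.

```python
import numpy as np

def pool_panel(z, y_next):
    """Stack a (T, N, P) characteristics panel and a (T, N) target panel
    into one (T*N, P) design matrix and one (T*N,) target vector."""
    T, N, P = z.shape
    return z.reshape(T * N, P), y_next.reshape(T * N)

rng = np.random.default_rng(0)
T, N, P = 24, 50, 3                      # months, stocks, characteristics (toy sizes)
z = rng.standard_normal((T, N, P))       # z[t, i] = characteristics of stock i in month t
true_beta = np.array([0.5, -0.2, 0.0])   # hypothetical coefficients
y_next = z @ true_beta + 0.1 * rng.standard_normal((T, N))  # y_next[t, i] = y_{i,t+1}

X, y = pool_panel(z, y_next)
# Linear g*: a single coefficient vector shared by all stocks and all months.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta_hat)
```

Pooling means every (stock, month) pair is one row, so the fitted function is the same across firms and dates; non-linear models replace only the least-squares step.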

Our objective is to map $y$ to a set of predictors through the function $g^{*}$ using pooled data.

We compare a set of ML regression models (Linear Model, Ridge, Lasso, Elastic-Net, Principal Component Regression, Partial Least Squares, Multi-Layer Perceptron, Random Forest, XGBoost)

Goal: We want to compare the forecast of $y$ to the true value $y_{i,t+1}$.

We normalized factors cross-sectionally for each given date by: $$X_{SC} = \frac{X - E(X)}{\sigma(X)}$$
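
The normalization formula above can be sketched as follows for a single characteristic stored as a (dates, stocks) array. The function name, the toy values, and the use of the population standard deviation (NumPy's default `ddof=0`) are assumptions; the notes do not specify which estimator of $\sigma(X)$ was used.

```python
import numpy as np

def cross_sectional_zscore(x):
    """Standardize each row (one date) of a (dates, stocks) array across stocks,
    ignoring missing values (NaN) when computing the mean and standard deviation."""
    x = np.asarray(x, dtype=float)
    mean = np.nanmean(x, axis=1, keepdims=True)
    std = np.nanstd(x, axis=1, keepdims=True)   # population std (ddof=0), an assumption
    return (x - mean) / std

# Hypothetical book-to-market values for 4 stocks on 2 dates, with gaps.
book_to_market = np.array([[0.5, 1.0, 1.5, np.nan],
                           [2.0, 4.0, np.nan, 6.0]])
z = cross_sectional_zscore(book_to_market)
print(z)
```

Each date is standardized on its own, so a value of 0 always means "equal to that date's cross-sectional average"; missing entries remain NaN at this stage.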

  • We replace missing values by $0$ (the mean after rescaling),
  • reduce the universe of factors to a few (beta, bm, ep, mve, retvol, momentums, roaq, roeq, E, S, G, and ESG) and reapply the framework; split the data into training, validation, and test sets,
  • use the training and validation sets to choose the parameters of the models, and then concatenate the two sets to train the model for the oos analysis.
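
The steps above can be sketched end to end with a closed-form ridge regression standing in for any of the models. The split proportions (60/20/20), the penalty grid, and the simulated data are hypothetical; the notes do not give the actual split dates or tuning grids.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge coefficients (no intercept; inputs are already standardized)."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)

def mse(X, y, beta):
    """Mean squared forecast error of the linear forecast X @ beta."""
    return float(np.mean((y - X @ beta) ** 2))

rng = np.random.default_rng(1)
n, p = 600, 5                                   # toy pooled sample, rows sorted by date
X = rng.standard_normal((n, p))
y = X @ np.array([0.4, 0.0, -0.3, 0.0, 0.1]) + 0.5 * rng.standard_normal(n)
X[rng.random((n, p)) < 0.05] = np.nan           # inject some missing characteristics
X = np.nan_to_num(X, nan=0.0)                   # missing -> 0, the mean after rescaling

# Chronological split (hypothetical 60/20/20 proportions).
i_train, i_val = int(0.6 * n), int(0.8 * n)
X_tr, y_tr = X[:i_train], y[:i_train]
X_va, y_va = X[i_train:i_val], y[i_train:i_val]
X_te, y_te = X[i_val:], y[i_val:]

# Choose the penalty by its error on the validation set.
grid = [0.01, 0.1, 1.0, 10.0, 100.0]
best_lam = min(grid, key=lambda lam: mse(X_va, y_va, ridge_fit(X_tr, y_tr, lam)))

# Refit on training + validation, then evaluate once on the held-out test set.
X_fit = np.concatenate([X_tr, X_va])
y_fit = np.concatenate([y_tr, y_va])
beta = ridge_fit(X_fit, y_fit, best_lam)
test_mse = mse(X_te, y_te, beta)
print(best_lam, test_mse)
```

Splitting by time rather than at random keeps the test period strictly after the data used for fitting, which is what makes the final evaluation out-of-sample.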

Maybe add Differential privacy?


  • FC = firm characteristics
  • trimmed-FC = a trimmed set of twelve firm characteristics: sector, book-to-market (bm), volatility (retvol), beta, size (mve), return on equity (roeq), return on assets (roaq), momentums by month (mom1m, mom6m, mom12m, mom36m), and earnings-to-price (ep) (or case 1)
  • refined-E/S/G: Non-aggregated environmental, social, and governance scores totaling 12 scores
  • E/S/G: defined as before
  • ESG: single score averaging E,S,G scores

Performance measures:

  • A higher R-squared indicates a better model, to which we assign higher credibility
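
For reference, a minimal sketch of an out-of-sample R-squared is given below. It uses the textbook definition, one minus the ratio of squared forecast errors to squared deviations from the test-sample mean; this choice of benchmark is an assumption, since the notes do not state which variant the speaker used. The function name and the sample values are hypothetical.

```python
import numpy as np

def r2_oos(y_true, y_pred):
    """1 - SSE / SST on held-out data; negative when the forecast is worse
    than predicting the held-out sample mean."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)

# Hypothetical realized one-year MDD values for four stocks.
y_true = [0.125, 0.375, 0.25, 0.5]
print(r2_oos(y_true, y_true))        # 1.0: perfect forecast
print(r2_oos(y_true, [0.3125] * 4))  # 0.0: forecasting the sample mean
```

Unlike in-sample R-squared, this quantity is not bounded below by zero, so a negative value is a signal that the model adds no forecasting value on the test period.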

With FC+ESG, the Multi-Layer Perceptron (MLP) keeps more independent variables best, and RF (random forest) worst.

Observed limitations:

  • consider a performance measure that focuses on the ranking of stocks and their MDD
  • compare the results in terms of investment strategies
  • consider a variation of MDD that is more centered around 0: we consider active maximum drawdown and repeat the analysis