Wednesday 6 November 2019

The meat of Markowitz 1952

In the end, what Markowitz 1952 does is twofold:

First, it introduces the problem of minimisation of variance subject to constraints, in the application context of portfolios of return-bearing entities.  Once the problem is introduced, the case of a small number of entities is solved geometrically.  By 1959 the preferred solution to this was the simplex method.  By 1972 Black had noted that all you need is two points on the efficient frontier to be able to recover every other point.  By 2019 you have a plethora of R (and other) libraries which can do this for you.
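As a minimal sketch of both the minimisation problem and Black's two-fund observation, here is the closed-form frontier for the unconstrained case (budget constraint only, short sales allowed) in Python; every number is invented for illustration, and this is a sketch rather than what any particular library does.  The final assert checks that any frontier portfolio is an affine combination of two others.

```python
import numpy as np

# Invented inputs for illustration: expected returns and covariance of 3 assets.
mu = np.array([0.05, 0.08, 0.12])
Sigma = np.array([[0.040, 0.006, 0.012],
                  [0.006, 0.090, 0.018],
                  [0.012, 0.018, 0.160]])
ones = np.ones(len(mu))
inv = np.linalg.inv(Sigma)

def frontier_weights(target):
    """Minimum-variance weights for a target expected return, imposing only
    the budget constraint (so short sales are allowed)."""
    A = ones @ inv @ mu
    B = mu @ inv @ mu
    C = ones @ inv @ ones
    D = B * C - A ** 2
    return ((B - A * target) * (inv @ ones) + (C * target - A) * (inv @ mu)) / D

# Black 1972: two frontier portfolios span the whole frontier.
w1, w2 = frontier_weights(0.06), frontier_weights(0.10)
alpha = (0.09 - 0.10) / (0.06 - 0.10)   # mix of w1 and w2 hitting a 9% target
assert np.allclose(alpha * w1 + (1 - alpha) * w2, frontier_weights(0.09))
```

The two-fund result drops out of the algebra here: the weights are affine in the target return, so any two distinct frontier portfolios span the lot.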

Second, a connection is established with the then-current economic theory of rational utility.  Here he sketches the briefest of arguments for whether his maxim (expected mean maximisation together with expected variance minimisation) is a decent model of investment behaviour.  He claims that his rule is more like investment behaviour than speculative behaviour.  However he makes a typo (one of several I spotted): he states that, for his maxim, $\frac{\partial U}{\partial E} > 0$ but also that $\frac{\partial U}{\partial E} < 0$, where that second inequality should read $\frac{\partial U}{\partial V} < 0$.  He also claims that his approximation to the wealth utility function, having no third moment, distinguishes it from the propensity to gamble.  It would be over a decade before a proper mathematical analysis appeared of how E-V shaped up as a possible candidate investor utility function and, if so, what an equilibrium world would look like if every investor operated under the same utility function.
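To make the corrected signs concrete: a standard two-parameter utility of the E-V type (an illustration only; the 1952 paper does not commit to a specific functional form) is

$$U(E, V) = E - \lambda V, \qquad \lambda > 0,$$

which gives $\frac{\partial U}{\partial E} = 1 > 0$ and $\frac{\partial U}{\partial V} = -\lambda < 0$.  Being a function of the first two moments only, it carries no third-moment term through which a taste for gambling (a preference for positive skew) could express itself.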

Markowitz and expectation

One of Harry Markowitz's aha moments comes when he reads John Burr Williams, on equity prices being the present value of future dividends received.  Markowitz rightly tightened this definition up to foreground the fact that this model works under future uncertainty, so the phrase 'present value of future dividends' ought to be 'expected present value of future dividends'.  We are dealing with a probability distribution here, together with some variance expressing our current uncertainty.  When variance here is a now-fact, representing our own measure of ignorance, that fits well with a Bayesian/information-theoretic framework.
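In symbols, and assuming for this sketch a constant known discount rate $r$ (a simplification of mine, not of Williams or Markowitz), the tightened claim is

$$P_0 = \mathbb{E}\left[\sum_{t=1}^{\infty} \frac{D_t}{(1+r)^t}\right] = \sum_{t=1}^{\infty} \frac{\mathbb{E}[D_t]}{(1+r)^t},$$

with the future dividends $D_t$ as random variables: the price is a statement about a probability distribution, not about a known cash-flow schedule.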

I note that in 1952 the idea of future expected volatility was dramatically under-developed.  It was still two decades away from the Black-Scholes paper and from the trading of listed equity options on exchange.  The term 'implied volatility' was not yet in common finance parlance.

The other interpretation of variance in Markowitz's classic Portfolio Selection, 1952 is that it ought to be the expected future variability in the stock's (or portfolio's, or asset's, or factor's) return.  That is, the first of Markowitz's two stages in selecting a portfolio is making an estimate of the expected return and expected variance of the return stream.

He says:
The process of selecting a portfolio may be divided into two stages. The first stage starts with observation and experience and ends with beliefs about the future performances of available securities. The second stage starts with the relevant beliefs about future performances and ends with the choice of portfolio. 
I'm mentioning this since I think Markowitz thought of minimum variance as a tool in the 'decision making under uncertainty' toolbox, namely that it in effect operationalises diversification, something he comes into the discussion wanting to foreground more than it had been foregrounded in the past.

What has largely happened since then is that maximum likelihood historical estimates of expected return and expected variance have taken precedence.  Of course this is convenient, but it doesn't need to be so.  For example, imagine that a pair of companies have just entered into an M&A arrangement.  In this case historical returns tell only a part of the story.
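Concretely, the default estimates are just the sample moments; a sketch with simulated returns standing in for a real price history:

```python
import numpy as np

# Simulated stand-in for a T x N history of returns (rows: periods, cols: assets).
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.01, scale=0.05, size=(250, 3))

mu_hat = returns.mean(axis=0)                         # ML estimate of expected returns
Sigma_hat = np.cov(returns, rowvar=False, bias=True)  # ML (1/T) covariance estimate
```

Nothing in the two-stage programme obliges us to use these: for the M&A pair, beliefs about deal completion dominate anything the sample moments can say.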

Also, if you believe Shiller 1981, the realised volatility of stock prices in general over the next time period will be much greater than the volatility on show in dividends, and perhaps also not much like the realised volatility of the time period just past.

Taking a step back even further, we are assuming that the relevant expected distribution has a shape which can appropriately be summarised by a unimodal distribution with finite variance, and that its first two moments give us a meaningful flavour of the whole.  But again, just think of the expected distribution for an acquired company halfway through an M&A deal.  This isn't likely to be normal-like, and may well be bimodal.
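A toy version of that mid-deal distribution, with every number invented:

```python
import numpy as np

# Mid-deal M&A target: the deal closes at the offer with probability p,
# otherwise the price falls back to (roughly) its pre-announcement level.
rng = np.random.default_rng(1)
p, offer, fallback = 0.8, 110.0, 80.0
outcomes = np.where(rng.random(100_000) < p, offer, fallback)

print(outcomes.mean(), outcomes.std())  # about 104 and 12
# 'Mean 104, standard deviation 12' reads like a hump around 104, but the
# distribution is two spikes, at 80 and 110; the price is never near 104.
```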