Sunday, September 23, 2007

Black Swans, Instantons, Hedge Funds and Network Collapse

On my flight to Europe last July, I read The Black Swan: The Impact of the Highly Improbable by N. Taleb. Unfortunately, I found the book irksome for several reasons:
  • I already knew the mathematical underpinnings of the metaphors used in the book (more on that below).
  • Taleb's writing style is unnecessarily condescending toward others mentioned in the book and to the reader.
  • Some rather obvious points are labored. The weirdest of these comes in the form of an entirely fictitious character to whom an entire chapter is devoted.
  • Many of his often poor and sometimes inaccurate examples kept reminding me of something a Stanford mathematician once told me: "Economists are mathematically unsophisticated."
  • He describes a general problem or syndrome related to how people assess risk incorrectly, but he doesn't really offer any solutions (or maybe I missed it in the chapter entitled, "How to Look for Bird Poop" ... seriously).
I must say this book was a disappointment because it stood in stark contrast to seeing him interviewed months earlier on PBS, where he came across as more thoughtful and measured. My opinion notwithstanding, you might find the book worth reading: it's an easy read, it covers many topics (mostly with a financial slant, reflecting the author's background), and it warns the reader about the dangers of things like high-risk hedge funds. Moreover, as I shall try to demonstrate here, these same concepts also impinge on performance analysis (not that Taleb is aware of that), and whereas they might otherwise be impenetrable to the non-mathematician, a book like this possibly makes them a little more accessible. In a nutshell, I believe he is saying: Think wild, not mild. That is easy to say but hard to do, as I shall try to explain.

The black swan of the title is Taleb's emblem for the outlier: the rare event that can have very significant consequences when it does occur. Black swans (the ornithological type) were literally unknown to Europeans until they were discovered by early explorers in Australia, so the prevailing wisdom in Europe until then was that all swans were white. The point is that people inferred that all swans were white based on sampled data. They became complacent because that inference continued to be reinforced by the data until suddenly, out of nowhere, new data violated that conclusion. The financial analogues include: the unanticipated stock-market crash of 1929, Black Monday 1987, the Russian ruble tanking in 1998, the U.S. Federal Reserve Bank's bail-out of Long-Term Capital Management's "Trillion Dollar Bet", the current sub-prime mortgage implosion, and so on.

From a mathematical statistics standpoint, Taleb is underscoring our tendency to think that most events are neatly described by a "bell curve" (i.e., a Normal or Gaussian distribution) with a well-defined statistical mean. In other words, we assume most outcomes lie within a few standard deviations of the mean or average. As a very simple example (which I don't recall Taleb mentioning), we expect that tossing a fair coin will produce heads half the time and tails the other half. But that's not what actually happens, even with a fair coin. As most people are aware, the heads and tails come up in some random sequence such that the proportion of heads approaches the average value 1/2 the longer the coin is tossed, i.e., in the long run. But that's not exactly the whole story either, and this is the main concept that Taleb is trying to convey. It is possible to observe what is known as a run of either heads or tails, i.e., a sub-sequence of all heads (or all tails) that:

  1. can occur spontaneously and therefore unexpectedly
  2. can be quite long (persistent) once it does occur
even with a fair coin. Such runs are rare but they can and do occur as a normal part of many statistical processes. The complement of this situation is the Gambler's fallacy: "I've had a string of losses up until now, therefore I'm bound to win the next bet." Gambling casinos are financed by such people. Mathematically, rare events like runs are described by the theory of large deviations (as opposed to standard deviations) which formalizes how they are controlled by the tail of a probability distribution rather than its mean.
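
To make the point about runs concrete, here is a minimal Python sketch that tosses a simulated fair coin and records the longest run of identical outcomes in each sequence. The function names, the seed, and the choice of 1000 tosses per sequence are my own illustrative assumptions, not anything from Taleb's book. It is well known that the longest run in n fair tosses grows roughly like log2(n), so the last printed line is included only as a rough yardstick.

```python
import math
import random


def longest_run(tosses):
    """Return the length of the longest run of identical outcomes."""
    best = current = 0
    previous = None
    for toss in tosses:
        # Extend the current run if the outcome repeats, otherwise restart it.
        current = current + 1 if toss == previous else 1
        best = max(best, current)
        previous = toss
    return best


def simulate_longest_runs(n_tosses, n_trials, seed=0):
    """Longest run seen in each of n_trials sequences of fair coin tosses."""
    rng = random.Random(seed)  # fixed seed only for repeatability
    return [
        longest_run([rng.random() < 0.5 for _ in range(n_tosses)])
        for _ in range(n_trials)
    ]


if __name__ == "__main__":
    n_tosses = 1000
    runs = simulate_longest_runs(n_tosses, n_trials=200)
    print("mean longest run :", sum(runs) / len(runs))
    print("largest observed :", max(runs))
    print("log2(n_tosses)   :", math.log2(n_tosses))
```

Even though every toss is fair and independent, each sequence contains a run far longer than intuition suggests, and the occasional sequence contains a run that looks like a rigged coin.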

How does all this relate to performance analysis? In Part III of The Practical Performance Analyst I discuss, amongst other things, the packet dynamics that led to the collapse of the Internet in 1986. TCP slow start is the control mechanism that was introduced in the hope of avoiding such network congestion in the future (could there be other black swans in the Internet?). The diagram gives a schematic idea of how network congestion can appear spontaneously and also remain persistent. This sudden congestion manifests itself as a very significant drop in delivered packet throughput and a concomitant increase in routing delays.

The Internet, for example, can be represented as a circuit of queues, each of which corresponds to a router or bridge. Packets arrive at a router, enqueue, receive service, and depart to the next routing stage. The simultaneous arrivals and departures of packets cause the queue length at a router to fluctuate about some average value. Average queue length determines system performance metrics like response time and throughput. Under certain conditions, the queue length can fluctuate about more than one stable value, i.e., a relatively short stable average and a relatively long stable average. A long queue means that it will take longer for a packet to be serviced, on average. More importantly, the presence of two stable queue lengths (the two "valleys" in the diagram) implies that the system can move dynamically between these two extremes. Given that there are two valleys, it follows that there must be a "hill" separating them, i.e., an intermediate unstable queue length. A ball balanced on top of the hill will tend to roll into one valley or the other. It turns out that such a transition between valleys can occur very suddenly in real networks, and once the system reaches the deeper valley, it is likely to stay there for a long time.
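
The two-valleys picture can be sketched numerically. The Python toy below is not the queueing model from the book: it assumes a generic double-well potential U(x) = x^4/4 - x^2/2 (valleys at x = -1 and x = +1, hill at x = 0) and a noise strength `sigma`, both chosen by me purely for illustration, with x standing in for the fluctuating queue length. It integrates the noisy dynamics with the Euler-Maruyama method and counts how often the state hops from one valley to the other.

```python
import math
import random


def drift(x):
    """Restoring force -U'(x) for the toy potential U(x) = x**4/4 - x**2/2."""
    return x - x**3


def simulate(n_steps, sigma, x0=-1.0, dt=0.01, seed=0):
    """Euler-Maruyama path of dx = -U'(x) dt + sigma dW."""
    rng = random.Random(seed)
    x = x0
    path = []
    noise_scale = sigma * math.sqrt(dt)
    for _ in range(n_steps):
        x += drift(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path


def count_transitions(path, threshold=0.5):
    """Count valley-to-valley switches, using hysteresis to ignore hill-top jitter."""
    side = 0  # -1 = left valley, +1 = right valley, 0 = not yet known
    switches = 0
    for x in path:
        if x > threshold:
            new_side = 1
        elif x < -threshold:
            new_side = -1
        else:
            continue  # still on the hill: no decision yet
        if side != 0 and new_side != side:
            switches += 1
        side = new_side
    return switches


if __name__ == "__main__":
    for sigma in (0.3, 0.5):
        path = simulate(200_000, sigma)
        print("sigma =", sigma, "transitions =", count_transitions(path))
```

With the noise switched off (`sigma=0.0`) the state simply rolls into the nearest valley and stays there. With noise, it rattles around one valley for a long time and then crosses the hill abruptly, and lowering `sigma` should make those crossings much rarer, which is the qualitative behavior described above.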

In 1987, I recognized that the problem of estimating the average time until such a transition actually occurs between bistable queue lengths is analogous to calculating the decay rate of an atom in quantum mechanics, a transition known as quantum tunneling. Although a network does not exhibit quantum behavior in the strict sense of that term, the mathematics is very similar because computer systems are stochastic, not deterministic. Tunneling or probability leakage between the valleys corresponds to finding something called the instanton solution. If you imagine the two valleys projecting into the plane of your browser (along the z-axis), the instanton solution is like a piece of string that straddles the hill from the floor of one valley to the floor of the other. This string-like solution and its wobbles control the leakage rate between the valleys, with the wobbles providing corrections to conventional large deviations theory. It's a rather astounding and underappreciated fact that such things (black swans?) exist in computer systems.
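
For readers who want a formula to hang this on, the simplest textbook analogue is a particle diffusing in a double-well potential, which is not the queueing calculation itself. Assume overdamped dynamics dx = -U'(x) dt + sqrt(2D) dW, where the potential U, the noise strength D, the valley floor x_min and the hilltop x_max are all symbols I am introducing here for illustration. For weak noise, the mean time to escape from the valley is then given by the standard Kramers estimate:

```latex
\tau \;\approx\; \frac{2\pi}{\sqrt{U''(x_{\min})\,\lvert U''(x_{\max})\rvert}}\;
\exp\!\left(\frac{\Delta U}{D}\right),
\qquad \Delta U = U(x_{\max}) - U(x_{\min}).
```

In this analogue, the exponent is the contribution of the most probable escape path over the hill (the "string"), while the prefactor comes from small fluctuations about that path (the "wobbles"). The exponential dependence on the hill height is why the system can sit quietly in one valley for a very long time and then switch without warning.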


Returning to Taleb's book, with the instanton picture in mind, it seems that to avoid being surprised by black swans or to see them flying on the horizon is equivalent to first of all knowing about the existence of a deeper valley (seeing what's on the other side of the hill) and second, finding the instanton solution (measuring the profile of the hill with very high precision) before the spontaneous transition suddenly takes us over that hill. The likelihood of being able to do all that seems like a bit of a black swan in itself; even for a Bayesianist.
