When 'Average' is a meaningless number

We talk about the 'average' size of a task so often that we forget to ask whether 'taking the average' produces a number that describes anything useful. More often than not, I believe it doesn't! Let's look at the Cauchy distribution. It looks so much like the famous Bell Curve (aka the Gaussian, aka the Normal distribution), symmetric and with a peak at 0, that we might easily be misled into talking about the *average* number drawn from it.
"<img src='https://upload.wikimedia.org/wikipedia/commons/thumb/8/8c/Cauchy_pdf.svg/300px-Cauchy_pdf.svg.png'/>"
The interpretation of the Cauchy I give, for my Ultimate Frisbee loving peeps (shoutout to Mr. Josh Seamon!), is this: suppose I stand across from you and try to throw you a frisbee, but I simply choose an angle at random between -90 and +90 degrees, where 0 points straight at you. The Cauchy is the distribution of how far from you the frisbee lands in the horizontal direction. We tend to believe that numbers 'average out': that if we repeat our frisbee experiment, the average distance away from you will 'settle down', and that positive deviations will cancel out negative ones. Before throwing any frisbees, here is a baseline experiment where we average 10,000 uniformly random numbers in 2 different trials:
var math = require("mathjs")
var numSamples = 10000

// Trial 1: add up 10,000 uniform random numbers between -1 and 1
var total1 = 0
for (var i = 0; i < numSamples; i++) {
  total1 = total1 + math.random(-1, 1)
}

// Trial 2: same experiment, fresh random numbers
var total2 = 0
for (var i = 0; i < numSamples; i++) {
  total2 = total2 + math.random(-1, 1)
}

var ave1 = total1 / numSamples
var ave2 = total2 / numSamples
"Average 1 is " + ave1 + " Average 2 is " + ave2
And we see here that the two averages are very near each other, and close to 0. They'd be even closer to 0 with 100,000 tries, we'd probably believe. And indeed, for a uniformly distributed random number (this is what math.random computes), the more numbers we average in, the closer to zero the average tends to be.
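To check the 'even closer with more tries' hunch, here's a minimal sketch that repeats the uniform experiment at a few sample sizes. It uses the built-in Math.random() rather than mathjs so it runs with no dependencies; the helper name uniformAverage and the sample sizes are just illustrative choices.

```javascript
// Average n draws from a uniform distribution on [-1, 1).
// Math.random() stands in for mathjs's math.random(-1, 1) here.
function uniformAverage(n) {
  let total = 0
  for (let i = 0; i < n; i++) {
    total += 2 * Math.random() - 1
  }
  return total / n
}

// Arbitrary sample sizes; each one is 100x the one before.
const sizes = [100, 10000, 1000000]
for (const n of sizes) {
  console.log(n + " samples: average is " + uniformAverage(n))
}
```

For uniform numbers the spread of the average shrinks roughly like 1 over the square root of the sample count, so each 100x jump in samples should buy about one more digit of closeness to 0.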
Things get interesting, though, when we use our frisbee-throwing function to come up with a random number. Since we're talking about angles, we'll range from -PI/2 to PI/2, and we'll take the tangent of the angle, because trigonometry tells us that the tangent is how far the frisbee lands from you (taking the distance between us as 1). Other than that, our experiment is exactly the same.
var math = require("mathjs")
var numSamples = 10000

// Trial 1: pick a uniform angle in (-PI/2, PI/2) and add up its tangent
var total1 = 0
for (var i = 0; i < numSamples; i++) {
  total1 = total1 + math.tan(math.PI * 0.5 * math.random(-1, 1))
}

// Trial 2: same experiment, fresh random angles
var total2 = 0
for (var i = 0; i < numSamples; i++) {
  total2 = total2 + math.tan(math.PI * 0.5 * math.random(-1, 1))
}

var ave1 = total1 / numSamples
var ave2 = total2 / numSamples
"Average 1 is " + ave1 + " Average 2 is " + ave2
What do we have here? Now, one run is way far away from zero, and the averages of our two runs are not even near each other!
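If you want to see that the frisbee distribution still has a perfectly good center, here is a rough sketch that reports the median next to the mean for two trials. The helpers throwFrisbee and meanAndMedian are hypothetical names, and Math.random() stands in for math.random so the snippet has no dependencies.

```javascript
// One frisbee throw: a uniform angle in [-PI/2, PI/2), turned into a
// landing distance with the tangent. This is a standard Cauchy draw.
function throwFrisbee() {
  return Math.tan(Math.PI * (Math.random() - 0.5))
}

// Draw n throws and report both the mean and the median.
function meanAndMedian(n) {
  const samples = []
  let total = 0
  for (let i = 0; i < n; i++) {
    const x = throwFrisbee()
    samples.push(x)
    total += x
  }
  // Sort numerically; the default sort would compare as strings.
  samples.sort(function (a, b) { return a - b })
  const mid = Math.floor(n / 2)
  const median = n % 2 === 1
    ? samples[mid]
    : (samples[mid - 1] + samples[mid]) / 2
  return { mean: total / n, median: median }
}

for (let trial = 1; trial <= 2; trial++) {
  const result = meanAndMedian(10001)
  console.log("Trial " + trial + ": mean " + result.mean + ", median " + result.median)
}
```

The median should land close to 0 in both trials while the mean jumps around, since a single near-sideways throw can dominate the whole sum.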
"<img src='https://media.giphy.com/media/h2OLfcSKKthRK/giphy.gif'/>"
You might be thinking I did something tricky with the math - that ON AVERAGE, most randomness does not behave this way. But what makes you so sure? 'Wild Randomness', or heavy-tailed distributions, may in fact be the norm, and our thinking about what average is may be wrong. Read Mandelbrot's The (Mis)Behavior of Markets, or google Long-Tailed distributions if you want to really rock your world.
Oftentimes it is the outliers that define the norm. Maybe that's true in business. Maybe that's true in Ultimate Frisbee. Maybe it's true of quantities like Intelligence, whatever that is, which are allowed to grow and compound. Maybe it's true of income distributions.
So the takeaway is - next time you catch yourself using the term 'on average' - recognize that even Statisticians don't pin themselves down to one definition of average. And if you have the Calculus chops to investigate moments, and *prove* that the average of the Cauchy distribution does not 'settle down' like that of the normal distribution - please drop me a line with a write-up or video, and I'll shoot you something fun back.
Cheers, and Happy Math-ing!
Dean, President, Deanius Solutions
Further Reading: When Randomness Gets Wild: https://softwaredevelopmentperestroika.wordpress.com/2013/12/18/the-blindfolded-archer-cauchy-distribution-when-randomness-gets-wild/