Thursday, March 15, 2012

Book Review: Think Stats

I reviewed 'Think Stats' by Allen Downey on Amazon.com.

One-line summary: Quickest introduction to Bayesian stats if you know some Python programming.

Friday, March 9, 2012

Pre budget rally - what pre budget rally?

The media has moved on from the UP elections, and the talking heads are now talking about the budget. I keep hearing the term 'pre-budget rally' being tossed around.


Do they actually bother to go and look at the data? In this case the data is easily available and the calculation is trivial. You don't even need Excel; a calculator is sufficient.

Year                 Budget date   Return 1 week before (%)   Return 1 week after (%)
2000-2001            29-Feb-2000                      -4.84                      2.90
2001-2002            28-Feb-2001                      -1.36                     -4.51
2002-2003            28-Feb-2002                      -0.68                      4.47
2003-2004            28-Feb-2003                      -0.26                     -4.35
2004-2005 (interim)  3-Feb-2004                       -7.12                      6.32
2004-2005            8-Jul-2004                       -1.24                      1.40
2005-2006            28-Feb-2005                       2.94                      2.70
2006-2007            28-Feb-2006                       1.29                      3.52
2007-2008            28-Feb-2007                      -8.57                     -3.16
2008-2009            29-Feb-2008                       2.21                     -8.65
2009-2010 (interim)  16-Feb-2009                      -2.45                     -4.02
2009-2010            6-Jul-2009                       -5.13                     -4.60
2010-2011            26-Feb-2010                       1.60                      3.38
2011-2012            28-Feb-2011                      -3.36                      2.44
2012-2013            16-Mar-2012

Average (%)                                           -1.93                     -0.15
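
The averages can be checked in a few lines of R. This is only a minimal sketch: it re-types the 14 completed budget weeks from the table into a data frame, using the column names ret.before and ret.after that appear in the t.test call further down. Nothing else is assumed beyond base R.

```r
# Weekly returns (%) around each budget, re-typed from the table above.
# The 2012-2013 row is left out because its returns are not known yet.
df <- data.frame(
  ret.before = c(-4.84, -1.36, -0.68, -0.26, -7.12, -1.24, 2.94,
                 1.29, -8.57, 2.21, -2.45, -5.13, 1.60, -3.36),
  ret.after  = c(2.90, -4.51, 4.47, -4.35, 6.32, 1.40, 2.70,
                 3.52, -3.16, -8.65, -4.02, -4.60, 3.38, 2.44)
)

# Average return in the week before and the week after the budget
avg <- colMeans(df)
print(round(avg, 2))

# How many budgets actually saw a positive week before budget day
n.up.before <- sum(df$ret.before > 0)
print(n.up.before)
```

Counting the positive weeks is a quick sanity check alongside the mean, since a single large fall can drag an average down on its own.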

What pre-budget rally? It hasn't happened, at least not in the last decade or so. On average the market has fallen in the week before the budget. If anything, the week after the budget has done better than the week before.

However, since the number of data points is so small (a grand total of 14, including the interim budgets), it is hard to say by eye whether the difference between the pre-budget week and the post-budget week is statistically significant.

I prefer not to dwell in such a 'tiny data' domain. Visualizations can actually mislead you here: for example, you might conclude that the week before the budget looks bad. This is where classical statistics can help. The t-test was designed precisely for this purpose.

> t.test(df$ret.before, df$ret.after)

 Welch Two Sample t-test

data:  df$ret.before and df$ret.after 
t = -1.1539, df = 24.465, p-value = 0.2597
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -4.938753  1.394467 
sample estimates:
 mean of x  mean of y 
-1.9264286 -0.1542857 
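
For anyone curious what t.test is doing here, the same numbers can be reproduced by hand. This is a rough sketch, assuming the returns are re-typed from the table above. The variable names (x, y, t.stat, dof, p.value) are my own and not part of any library; var and pt are base R functions.

```r
# Returns (%) in the week before (x) and after (y) each budget
x <- c(-4.84, -1.36, -0.68, -0.26, -7.12, -1.24, 2.94,
       1.29, -8.57, 2.21, -2.45, -5.13, 1.60, -3.36)
y <- c(2.90, -4.51, 4.47, -4.35, 6.32, 1.40, 2.70,
       3.52, -3.16, -8.65, -4.02, -4.60, 3.38, 2.44)

# Squared standard error of each sample mean
se2.x <- var(x) / length(x)
se2.y <- var(y) / length(y)

# Welch t statistic: difference in means over the combined standard error
t.stat <- (mean(x) - mean(y)) / sqrt(se2.x + se2.y)

# Welch-Satterthwaite approximation for the degrees of freedom
dof <- (se2.x + se2.y)^2 /
  (se2.x^2 / (length(x) - 1) + se2.y^2 / (length(y) - 1))

# Two-sided p-value from the t distribution
p.value <- 2 * pt(-abs(t.stat), dof)

print(c(t = t.stat, df = dof, p = p.value))
```

The three values should agree with the t, df and p-value line in the output above.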

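And the t-test tells us that the difference is not statistically significant: the p-value is 0.26, and the 95 percent confidence interval for the difference includes zero. So what should you do this week? Whatever you do, don't cite the budget week as the reason.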