Listening to data for fun and profit: 2012

Wednesday, October 10, 2012

Nifty flash crash analysis - part 2

When we look at the order book in real time, we get to see only the top 5 buy and sell orders in the book. That is just the tip of the iceberg, and one is left wondering how much of the iceberg is really below the surface. As it turns out, not much!

Thanks to the flash crash, we just got a sneak peek below the surface.

Take a look at the plot below, which depicts the transacted value (in Cr) during the 43rd minute plotted against the price drop during that minute. Click the image for more resolution.

Two things stand out:
1) As shown by the blue line, for the smaller transacted values, the drop is roughly proportional to the transacted value. Even in this range, the impact cost is already above 5%. The market simply cannot absorb more than 5 Cr worth of transactions in a single Nifty component.

2) There is a cluster of names around the 20% drop. These are likely to be passive limit orders in the system where the buyer didn't really expect to be hit. Since Emkay did not ask for trades to be reversed, that's one lucky and/or smart buyer.

I wonder what would have happened if there were limit buy orders at totally ridiculous prices. Maybe we could have seen trades like Accenture trading at 1 cent, which happened in the US market flash crash on May 6, 2010.

Note to self: Figure out how to automatically put in ridiculous buy orders every day at open. Chances of getting lucky are non-zero.

Update (11/Oct/2012):
In the post I speculated about a smart/lucky guy having limit orders 20% below CMP. Today's Economic Times story confirms that hypothesis.

How Inventure's client made a killing from Nifty's Friday crash

"He randomly puts many buy orders in stocks that are part of the Nifty at 20% below the market value. On Friday, this bet clicked."

Nifty flash crash analysis - part 1

Investing is an observational science, i.e. we cannot run controlled experiments and see what happens. Sometimes a natural experiment takes place and we get to see some of the market mechanisms.

The Nifty flash crash of October 5 offers just such an opportunity for a data-driven investigation. For detailed analysis, we will have to wait till NSE releases the trade-by-trade data. In the meanwhile, we can look at the minute-by-minute data (available via Google Finance or vendors like Global datafeed) and make some conjectures.

From the stories in the media, we know that a trader at Emkay Securities punched in wrong orders, which sent roughly Rs 650 Cr of sell orders into the market and led to the flash crash.

From the minute-by-minute charts, it is clear that all hell broke loose in the 43rd minute. Let's piece together the minute-by-minute traded value for Nifty components for the minutes leading up to the 43rd minute.

Minute	Approximate traded value (Cr)
40	15.9
41	19.5
42	35.7
43	610.8
44	39.6
... Market halted

The minute data that we have roughly matches the news report value of 650 Cr of transactions happening in a small time window. Let's drill down into minute number 43.

We know that what Emkay tried to execute was a Nifty basket sell order, i.e. all the Nifty components should be sold in proportion to their weight in the Nifty. For example, Hindustan Lever, which has a 3.2% weight in the Nifty, should have seen approx 19.5 Cr of selling (610 Cr basket order * 3.2%). And this is exactly what we find in the minute-by-minute data.
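The weight arithmetic above can be sketched in a few lines of Python. This is only an illustration, not the code behind the chart: the 610 Cr basket size and the 3.2% weight come from the post, while the second entry in the weights table is a made-up placeholder.

```python
# Expected per-stock sell value for an index basket order:
# expected value = basket size x index weight.

BASKET_CR = 610.0  # approximate value traded in the 43rd minute (from the post)

# Only the Hindustan Lever weight (3.2%) is from the post.
# "Example stock" is a hypothetical entry, included to show the shape of the data.
weights_pct = {
    "Hindustan Lever": 3.2,
    "Example stock": 1.0,  # hypothetical weight
}


def expected_value_cr(basket_cr, weight_pct):
    """Expected traded value (in Cr) for one index component."""
    return basket_cr * weight_pct / 100.0


expected = {
    name: expected_value_cr(BASKET_CR, weight)
    for name, weight in weights_pct.items()
}

for name, value in expected.items():
    print(f"{name}: {value:.1f} Cr expected")
```

Comparing these expected values with the actual traded value per stock gives the scatter described next.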

Here is a chart of expected traded value vs actual traded value for the 43rd minute. (Click the image for higher resolution)

The blue line indicates the line where expected traded value is equal to actual traded value. This is the line on which the order book presumably had sufficient depth (obviously at atrocious impact cost – but that is the topic of next post).

The interesting part is to look at the outliers and speculate what could have caused them. For scrips below the blue line - such as ITC, HDFCBANK, RELIANCE, INFY, ICICIBANK - it is very likely that there was not enough depth in the limit buy order book at any price. In other words, order books were burned in the 43rd minute.

What about scrips above the blue line, such as HDFC and LT, which have transacted value above the one predicted by our model? There must have been some other big traders in the market in these scrips in the 43rd minute, apart from Emkay.

Bulk trade data corroborates this theory for HDFC. There was a bulk sale trade by Carlyle on October 5th.

Date	Symbol	Security Name	Client Name	Buy / Sell	Quantity Traded	Trade Price / Wght. Avg. Price	Remarks
05-Oct-2012	HDFC	HDFC Ltd.	CMP ASIA LIMITED	SELL	430,00,000	761.08	-

Takeaways:

1) Indian markets hardly have any depth. It is surprising and worrying to see that a mere 15 Cr worth of sell orders was sufficient to knock 44,000 Cr off ITC's market cap for a few seconds. What if this happens in the last couple of minutes on an F&O expiry date?

2) It is also interesting (well, not really!) to note that some players in the market were already aware of the Carlyle trade in HDFC and were positioned with buy orders.

Monday, August 6, 2012

Sell in May and go away?

Recently came across a paper which claims that the 'Sell in May and go away' strategy continues to work in many markets.

Here is the paper's abstract:
We perform the first out-of-sample test of the Sell in May effect studied by Bouman and Jacobsen (American Economic Review, 2002). Surprisingly to us, the old adage "Sell in May and Go Away" remains good advice. Reducing equity exposure starting in May and levering it up starting in November persists a profitable market timing strategy. The economic magnitude of the effect is the same in- and out-of-sample: on average, stock returns are about 10 percentage points higher in November to April semesters than in May to October semesters.

10 percentage points out-performance with just 2 trades per year? I had to try this at home.

After about 30 minutes on a rainy afternoon with R and some friendly packages (hat tip to ggplot2, plyr, lubridate, xts) my hopes were dashed.

Nope, doesn't work in India.

If anything, it is: Buy in May and stay put till Santa Claus!
(sigh, doesn't rhyme!)
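The shape of the check can be sketched as follows. This is an illustrative Python sketch, not the R code used for this post. The monthly returns are made-up numbers, and it compares average monthly returns in the two halves of the year rather than compounding each six-month semester as the paper does.

```python
from statistics import mean

# Made-up monthly % returns for a single year, as (month, return) pairs.
# Real use would load many years of index data instead.
sample_monthly_returns = [
    (1, 1.0), (2, -2.0), (3, 3.0), (4, 0.5),
    (5, 2.0), (6, 1.0), (7, -1.0), (8, 4.0),
    (9, 0.0), (10, 3.0), (11, -1.5), (12, 2.0),
]

WINTER_MONTHS = {11, 12, 1, 2, 3, 4}  # November to April


def split_by_semester(monthly_returns):
    """Split monthly returns into Nov-Apr and May-Oct buckets."""
    winter = [r for month, r in monthly_returns if month in WINTER_MONTHS]
    summer = [r for month, r in monthly_returns if month not in WINTER_MONTHS]
    return winter, summer


winter, summer = split_by_semester(sample_monthly_returns)
print(f"Nov-Apr average monthly return: {mean(winter):.2f}%")
print(f"May-Oct average monthly return: {mean(summer):.2f}%")
```

If 'Sell in May' held, the Nov-Apr average would come out clearly higher than the May-Oct average.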

R code for those interested follows:

Thursday, March 15, 2012

Book Review: Think Stats

I reviewed 'Think Stats' by Allen Downey on Amazon.com

One line summary: Quickest introduction to Bayesian stats if you know some Python programming

Friday, March 9, 2012

Pre budget rally - what pre budget rally?

Media has moved away from UP elections and the talking heads are now talking about budget. I heard this term 'pre budget rally' being tossed around.

Do they actually bother to go and look at the data? I mean in this case, the data is easily available and calculation is trivial. You don't even need Excel, a calculator is sufficient.

Year	Budget Date	Return 1 week before (%)	Return 1 week after (%)
2000-2001	29-Feb-2000	-4.84	2.90
2001-2002	28-Feb-2001	-1.36	-4.51
2002-2003	28-Feb-2002	-0.68	4.47
2003-2004	28-Feb-2003	-0.26	-4.35
2004-2005 (interim)	3-Feb-2004	-7.12	6.32
2004-2005	8-Jul-2004	-1.24	1.40
2005-2006	28-Feb-2005	2.94	2.70
2006-2007	28-Feb-2006	1.29	3.52
2007-2008	28-Feb-2007	-8.57	-3.16
2008-2009	29-Feb-2008	2.21	-8.65
2009-2010 (interim)	16-Feb-2009	-2.45	-4.02
2009-2010	6-Jul-2009	-5.13	-4.60
2010-2011	26-Feb-2010	1.60	3.38
2011-2012	28-Feb-2011	-3.36	2.44
2012-2013	16-Mar-2012

Average (%)		-1.93	-0.16

What pre-budget rally? Didn't happen at least in last decade. If anything, the result in the week after budget has been better than the week before.

However, since the number of data points in this case is too small (a grand total of 14 points including the interim budgets), it is hard to say whether the difference between pre-budget week and post-budget week is statistically significant.

I prefer not to dwell in such a 'tiny data' domain. Visualizations can actually mislead you in this case, e.g. you might conclude that the week before the budget looks bad. This is where classical statistics can help. The t-test was designed precisely for such a purpose.

> t.test(df$ret.before, df$ret.after)

 Welch Two Sample t-test

data:  df$ret.before and df$ret.after 
t = -1.1539, df = 24.465, p-value = 0.2597
alternative hypothesis: true difference in means is not equal to 0 
95 percent confidence interval:
 -4.938753  1.394467 
sample estimates:
 mean of x  mean of y 
-1.9264286 -0.1542857

And the t-test tells us that the difference is not statistically significant. So what to do this week? Whatever you do, don't cite the budget week as the reason.
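For readers without R, the same Welch statistic can be computed by hand. The sketch below is Python using only the standard library, with the 14 rows of the table above typed in. It should reproduce the t and df values in the R output, but it stops short of the p-value, which needs a t-distribution table or a stats library.

```python
from math import sqrt
from statistics import mean, variance

# Weekly % returns from the table above (14 budgets, including interim ones).
ret_before = [-4.84, -1.36, -0.68, -0.26, -7.12, -1.24, 2.94,
              1.29, -8.57, 2.21, -2.45, -5.13, 1.60, -3.36]
ret_after = [2.90, -4.51, 4.47, -4.35, 6.32, 1.40, 2.70,
             3.52, -3.16, -8.65, -4.02, -4.60, 3.38, 2.44]


def welch_t(x, y):
    """Welch's two-sample t statistic and its approximate degrees of freedom."""
    vx = variance(x) / len(x)  # squared standard error of mean(x)
    vy = variance(y) / len(y)  # squared standard error of mean(y)
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    # Welch-Satterthwaite approximation for the degrees of freedom
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, df


t_stat, dof = welch_t(ret_before, ret_after)
print(f"mean before = {mean(ret_before):.4f}, mean after = {mean(ret_after):.4f}")
print(f"t = {t_stat:.4f}, df = {dof:.3f}")
```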

Thursday, February 23, 2012

Book review: machine learning for hackers

There is a new book out titled 'Machine learning for hackers' from the publishing house O'Reilly.

I have written a review which is now live on Amazon.

One line summary: 5-stars if you already know some R.

I will be using some of the machine learning techniques in the forthcoming series of posts. Since this is a vast field, there is always that nagging feeling at the back of your mind that maybe what you are doing is naive and not at all what the guru data miners will do. Maybe there is some complex technique that can bypass all the grunt work of data scraping / cleaning / plotting and just transform the data into some n-dimensional space where the problem is trivially solved :)

The book served as a morale boosting benchmark. Authors are well known bloggers in R community. It is reassuring to learn that every traveler to machine learning promised land has to muddle through the messy swamp of unclean data and foggy relationships.

Monday, January 30, 2012

Calendar effects in USD-INR exchange rate

In a previous post, I looked at calendar effects in Indian markets using Nifty index data and concluded that the end-of-month effect is very much alive in Indian markets.

The beauty of doing this analysis in a higher level language like R is that the work becomes easily reusable. Let's see if we can find some other nails to hit using this hammer. Such as - currencies.

The reference exchange rates between INR and USD, Euro and JPY are available on the RBI site. Since late 2008, retail investors have been able to trade currencies on NSE and MCX-SX through their regular brokerage accounts.

Before we look at the data, let's look at conventional wisdom. Here is a news feed from last week (26 January 2012) by Reuters which was picked up by ET, Mint and Moneycontrol. Quoting:

The rupee fell on Wednesday as dollar demand from oil importers for month-end payments offset a rise in the local share market, ...
...
Oil is India's largest import item and oil refiners, the largest buyers of dollar in the local market, step up demand towards the end of every month to meet payment requirements.

Going by this, we would expect that rupee would depreciate toward the end of the month. Time for the evidence.

Let's first look at last 3 years when the currency markets were opened up for retail participation.

The chart below is that of long Rupee trade i.e. what return you would have made by going long rupee (which is equivalent to short USD) on various calendar dates. Such a strategy would have lost money when rupee depreciated - remember, we are long rupee - and made money when rupee appreciated against dollar.

So, going by conventional wisdom, one would expect this strategy to lose money toward end of the month.
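The long-rupee return itself is easy to get wrong in sign, so here is a small Python sketch of the calculation. It is illustrative only: the rates are made-up numbers, not RBI data, and grouping by individual day of the month is an assumption about how the chart was built.

```python
from collections import defaultdict
from statistics import mean

# Made-up USD-INR reference rates as (day_of_month, rate), in date order.
sample_rates = [(27, 50.0), (28, 49.0), (29, 50.0), (30, 49.5)]


def long_inr_returns(rates):
    """Daily % return of a long-rupee (short USD) position.

    One rupee is worth 1/rate dollars, so the return is prev_rate/rate - 1.
    A falling USD-INR rate (rupee appreciation) gives a positive return.
    """
    returns = []
    for (_, prev_rate), (day, rate) in zip(rates, rates[1:]):
        returns.append((day, (prev_rate / rate - 1.0) * 100.0))
    return returns


def average_by_day(returns):
    """Average the daily returns for each day of the month."""
    buckets = defaultdict(list)
    for day, ret in returns:
        buckets[day].append(ret)
    return {day: mean(rets) for day, rets in sorted(buckets.items())}


daily = long_inr_returns(sample_rates)
by_day = average_by_day(daily)
for day, avg in by_day.items():
    print(f"day {day}: {avg:+.2f}%")
```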

Wow. We find that rupee actually appreciates toward the end of the month.

Maybe market anticipates the oil dollar buying and moves ahead of it.
Or, maybe, this effect is related to end-of-month effect in Nifty i.e. FIIs bring in the money toward the end of the month lifting both the index and the rupee.
Or, maybe there is some other reason. All that is fodder for next few posts.

In any case, let's explore what happened in 2011, when the rupee fell all the way from the 45 level to the 53 level. A long rupee strategy would have killed you, right? Right. Except, if you did it only during the last 3 days of the month and the first 3 days of the month. These turn-of-month days were still good in such a horrendous year.

Just in case you are interested, here is data for all of last 10 years: from 2002 to 2011.

Long live the calendar effects!

Wednesday, January 25, 2012

(OT) Tools of the trade

I posted on my other blog about the free online valuation courses being offered by Prof. Aswath Damodaran.

Prof Damodaran is using an online system (again, free) called coursekit and I decided to give it a go for the 'introduction to R' course I am conducting.

Coursekit has a Facebook-like look and feel. Students love it. The interaction among them has gone up by an order of magnitude compared to the old system of using Google Groups, which was mainly email oriented.

I am totally impressed with coursekit.

Tuesday, January 17, 2012

A technical market timing method for Nifty

A technical investor wants to be in sync with the market. 'Go with the flow' is their motto.

On the other hand, a fundamental investor wants to zag when the market zigs. Be fearful when others are greedy and be greedy when others are fearful, says Warren Buffett. Invest when there is blood in the streets, says Sir John Templeton.

Who is right? I have a hunch that maybe both are right - at different time scales. Anyhow, in the spirit of this blog, let's test the hunch and see what the data has to say.

In this post, I will look at one of the most common technical methods for market timing, viz. the moving average method. (Fundamentals-based market timing methods are a topic for another post)

A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends. Traders most commonly use moving averages of 50-period, 100-period and 200-period.

The strategy is very simple:
i) If today's closing price for Nifty is above N days moving average, then stay in the market.
ii) If today's closing price falls below N days moving average, sell out and stay out of the market till condition i signals you to get back in again.
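The two rules can be sketched in Python as below. This is a minimal illustration, not the code posted for this article. It assumes you act at the same day's close on which the signal appears, and it ignores transaction costs and any interest earned while out of the market. The prices are made-up numbers and a 3-day average is used only to keep the example short.

```python
def sma_signals(closes, n):
    """For each day: True if the close is above its n-day simple moving average.

    Days with fewer than n closes of history get None (no signal yet).
    """
    signals = []
    for i in range(len(closes)):
        if i + 1 < n:
            signals.append(None)
        else:
            window = closes[i + 1 - n : i + 1]
            signals.append(closes[i] > sum(window) / n)
    return signals


def strategy_value(closes, n, start=100.0):
    """Value of `start` rupees that is invested only while the signal is True."""
    signals = sma_signals(closes, n)
    value = start
    for i in range(len(closes) - 1):
        if signals[i]:  # in the market from close i to close i+1
            value *= closes[i + 1] / closes[i]
    return value


# Made-up closing prices, for illustration only.
sample_closes = [100, 102, 104, 101, 99, 103, 106]
signals = sma_signals(sample_closes, 3)
final_value = strategy_value(sample_closes, 3)
buy_and_hold = 100.0 * sample_closes[-1] / sample_closes[0]

print(signals)
print(f"SMA strategy: {final_value:.2f}, buy and hold: {buy_and_hold:.2f}")
```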

How did this strategy do in Indian markets over last 10 years? The chart below shows the results of 100 Rupees invested according to 50 day / 100 day and 200 day simple moving average (SMA) strategy. The start date is 1st January 2002 and end date is 31 Dec 2011.

The benchmark is the buy-and-hold strategy i.e. you just invested 100 Rupees in an index fund and stayed invested through thick and thin. This is shown by the third line from the top in the graph.

Click on the graph to see bigger picture.

Both 50 and 100 day SMA strategies did better than buy and hold. They did this primarily by stepping out of the market for a time in 2008. However, the really good part about them is that they allowed one to participate fully during the bull markets. Participate in the upside, minimize damage during downside ... what more can one ask for ;)

Several caveats are in order:
i) Look at 200 day SMA. That strategy didn't work in Indian markets in last 10 years. You left way too much money on the table during bull runs. You were better off just buying and holding through thick and thin.

Note that the 200 day strategy is the most well known. See Mebane Faber's paper on SSRN where he sings the praises of the 10-month moving average (essentially 200 day). This paper is the 2nd most downloaded paper on SSRN.

With the benefit of hindsight, one can come up with arguments such as 200-day is too slow for today's markets and 50 or 100 days is better. But you didn't know that back then.

ii) Even with a 100 day SMA strategy, there were 6 signals on average per year. That means you had to switch in and out of the markets 6 times a year. Imagine churning your entire portfolio that many times. (There is a workaround which will get you most of the benefits without churning your portfolio - see post on my other blog)

iii) It wasn't very effective in 2011. By the way, I consider that good news.
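On caveat ii, the number of signals is just the number of times the in/out state flips. A rough Python sketch, with a made-up sequence of daily states standing in for real strategy output:

```python
def count_switches(in_market):
    """Count how many times the position flips between in and out of the market."""
    return sum(1 for prev, cur in zip(in_market, in_market[1:]) if prev != cur)


# Made-up daily states: True = invested, False = out of the market.
sample_states = [True, True, False, False, True, False]
switches = count_switches(sample_states)
print(f"{switches} switches")
```

Dividing the total count by the number of years in the sample gives the signals-per-year figure.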

So there it is.

As usual, all the Nifty index data is from NSE and I have posted the code to github. I have also posted an Excel file so that people who are not familiar with R could replicate my work in Excel and come up with even better timing models.

Tuesday, January 10, 2012

Looking at calendar effects in Indian Markets

What is 'too good to be true' is generally so - especially so, in stock markets.

You come across something like Calendar Effects and your first reaction is: nice try. Maybe it used to work but cannot work anymore. After all, markets are efficient and if everybody knows something to work, lo and behold, it will stop working.

In this post, let's look at the so-called 'turn of the month effect'. For an accessible commentary, see Larry Connors' 'Do Stocks show a bullish bias at month end?'. Larry concludes: 'We show that over the past 11 3/4 years, stocks have been much stronger near months-end than they have been any other time of the month ... The fund managers like to buy near months-end, ...'.

So here we have:
a) an effect which has been documented in other markets &
b) there is a plausible reason why it works and might continue to work

Let's test it in Indian markets.

I split the data into 10 groups so that group 1 contains dates 1st - 3rd, group 2 contains dates 4th - 6th and so on.

If there is any turn of the month effect, we should be able to see it when we plot average returns for each group: group 10 (dates 28 to 31 of the month) should have higher average returns. See the graphs for yourself.
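The grouping described above can be sketched in Python as follows. This is an illustration rather than the R code hosted for this post, and the daily returns are made-up numbers.

```python
from collections import defaultdict
from statistics import mean


def day_group(day_of_month):
    """Map a calendar day to groups 1..10: 1-3 -> 1, 4-6 -> 2, ..., 28-31 -> 10."""
    return min((day_of_month - 1) // 3 + 1, 10)


def average_return_by_group(daily_returns):
    """Average % return per group from (day_of_month, pct_return) pairs."""
    buckets = defaultdict(list)
    for day, ret in daily_returns:
        buckets[day_group(day)].append(ret)
    return {group: mean(rets) for group, rets in sorted(buckets.items())}


# Made-up daily % returns, for illustration only.
sample_returns = [(1, 0.5), (2, -0.1), (15, 0.2), (29, 1.0), (31, 0.4)]
group_averages = average_return_by_group(sample_returns)
for group, avg in group_averages.items():
    print(f"group {group}: {avg:+.2f}%")
```

Note that group 10 is the only group that can hold four calendar days (28 to 31).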

Conclusion: Yes Virginia, there is a turn of the month effect in Indian equity markets. Last 3 days of month have been unusually good historically.

This is the graph for last 11 years: 2001 to 2011

Does it still work? Maybe the average is driven by earlier period?
Let's plot data for last 3 years i.e. 2009, 2010 and 2011.

It continued to work in last 3 years.
How about the really bad 2008 - when nothing worked?

Cool. Even in the horrendous 2008, you actually made money if you were in the market only for last calendar week.

Notes:
(1) Data: S&P Nifty index data for last 11 years viz. from 1st January 2001 to 31 Dec 2011. You can download it from NSE. The data is already clean and in CSV format.
(2) For the geek reader, the code in R is hosted here

Sunday, January 8, 2012

Welcome guest

In this blog, I want to look at publicly available datasets to verify the "conventional wisdom" or, maybe, refute it. If the data refutes conventional wisdom, that should be fun.

Typically I will look at equity market data as a) it is readily available in abundance and b) there is a lot of conventional wisdom i.e. folklore around how to make money in the markets.

I am looking for occasional profit in addition to regular fun. Wait a minute, maybe I should reverse it - that sounds better - I am looking for regular profit and occasional fun.