Scaling a very seasonal business.

A note from ClearTax Engineering.

Income Tax Return filing is a very seasonal business. It peaks from 15th July to 31st July in India.

I have started tuning my controllers and their actions to make sure I extract the most juice from each server. We also scale horizontally when needed.

But here we are talking about getting more juice out of a single server instance.

Mostly we are not CPU intensive. We are I/O bound, because we write the forms you fill out to the DB very conservatively. Nothing is worse than a pissed-off user who did a bunch of work only to see it disappear when the server is re-deployed or moved around by AppHarbor.

Strategies:
1. Using AsyncController: don’t tie up IIS threads that are just waiting on I/O. So far I do synchronous I/O. An average read takes 223 ms on a shared SQL server with almost no load, and a write takes around 450 ms. So an IIS thread is basically tied up for 223 ms on every read and 450 ms on every write. Definitely scope for improvement.

2. Low-hanging fruit: I am going to measure how a dedicated SQL instance performs compared with the shared SQL server.

3. Doing fewer reads from the database: totally possible. I think memcached is in order. The slight worry is how the server instances degrade when the memcached server crashes. Do they start reading and writing directly against the database? Or do we first measure how often the memcached server actually crashes, and only then worry? Empirical FTW.

4. Writing optimally to the database: we have some very good techniques for writing only the dirty data to the DB. We don’t use an ORM, so this too is hand-rolled. It is also conservative: we’d rather write more than lose an update. Tuning this takes a lot of time, so we won’t do it unless absolutely needed.

5. Writing less to the database: least appealing at this point. I guess empirical is better than speculation.

6. Most user behavior will be bi-modal. Either a given Income Tax section is not useful to them at all, so they’ll skip through it quickly and generate load in quick succession, or they’ll spend a bit of time reading and puzzling over the section and then fill it out. I’d put most normal users in the latter group.
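
To put rough numbers on strategy 1, here is a back-of-the-envelope sketch. It is written in Python rather than our actual ASP.NET code, purely to illustrate the arithmetic. It assumes each request does exactly one DB call and nothing else. The function names (max_requests_per_second, fake_db_call, run_overlapped) are made up for this example, and asyncio.sleep is only a stand-in for a real non-blocking DB call.

```python
import asyncio
import time

READ_MS = 223   # average read latency quoted in strategy 1
WRITE_MS = 450  # average write latency quoted in strategy 1


def max_requests_per_second(threads: int, latency_ms: float) -> float:
    """Upper bound on throughput when every request blocks one thread
    for latency_ms of synchronous I/O (assumes no other work per request)."""
    return threads * 1000.0 / latency_ms


async def fake_db_call(latency_ms: float) -> None:
    # Hypothetical stand-in for an awaitable DB call. While it "waits",
    # the thread is free to start other calls.
    await asyncio.sleep(latency_ms / 1000.0)


async def run_overlapped(calls: int, latency_ms: float) -> float:
    """Issue `calls` simulated DB calls concurrently on a single thread
    and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    await asyncio.gather(*(fake_db_call(latency_ms) for _ in range(calls)))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"reads/s per blocked thread:  {max_requests_per_second(1, READ_MS):.2f}")
    print(f"writes/s per blocked thread: {max_requests_per_second(1, WRITE_MS):.2f}")
    elapsed = asyncio.run(run_overlapped(20, READ_MS))
    print(f"20 overlapped reads took {elapsed:.2f} s on one thread")
```

With synchronous I/O a single thread tops out at roughly 4.5 reads or 2.2 writes per second under these assumptions. When the waits overlap, the same thread can have many calls in flight at once, which is the whole point of going async.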

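For the memcached worry in strategy 3, one possible answer is to fall through to the database whenever the cache is unreachable, and to count the failures so we can stay empirical about how often it happens. The sketch below is in Python and is not our code: FlakyCache, CacheDown and FormStore are hypothetical names, the cache is an in-memory stand-in rather than a real memcached client, and a plain dict plays the role of the SQL database.

```python
class CacheDown(Exception):
    """Raised by the hypothetical cache client when the cache is unreachable."""


class FlakyCache:
    """In-memory stand-in for a memcached client (illustration only)."""

    def __init__(self):
        self.data = {}
        self.up = True  # flip to False to simulate a crashed cache server

    def get(self, key):
        if not self.up:
            raise CacheDown(key)
        return self.data.get(key)

    def set(self, key, value):
        if not self.up:
            raise CacheDown(key)
        self.data[key] = value


class FormStore:
    """Read-through cache in front of a dict that plays the role of the DB."""

    def __init__(self, cache, db):
        self.cache = cache
        self.db = db
        self.db_reads = 0
        self.cache_failures = 0

    def read(self, key):
        try:
            hit = self.cache.get(key)
            if hit is not None:
                return hit
        except CacheDown:
            self.cache_failures += 1  # record the outage instead of failing
        self.db_reads += 1
        value = self.db[key]          # degrade to a direct DB read
        try:
            self.cache.set(key, value)
        except CacheDown:
            pass                      # cache still down; the read succeeded anyway
        return value

    def write(self, key, value):
        self.db[key] = value          # the DB stays the source of truth
        try:
            self.cache.set(key, value)
        except CacheDown:
            self.cache_failures += 1


if __name__ == "__main__":
    store = FormStore(FlakyCache(), {"form:1": "draft"})
    store.read("form:1")      # miss, goes to the DB and fills the cache
    store.read("form:1")      # served from the cache
    store.cache.up = False
    store.read("form:1")      # cache down, falls back to the DB
    print(store.db_reads, store.cache_failures)
```

This keeps the conservative stance on writes: every write still goes to the DB first, so a cache crash costs speed, not data. A real setup would also have to handle stale entries when the cache misses a write, which this sketch ignores.
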
Aah, as a special reward for the readers of the blog, I’ll let you monitor the read / write performance of the SQL server (unless adult supervision asks us to kill this page).

Shh, don’t tell others.