Sunday, March 7, 2010

Obsession with API Performance

Over the last few weeks, I've noticed that the different point apps at work have significantly different response times for API calls made against them. For example, a ListManager API call using gSOAP has a round-trip time of about 10-15 ms (at least to my recollection). API calls to our core authentication engine take on the order of 300 ms, and the round-trip time to the email engine is about 1 to 2 seconds. Over the next few weeks, I plan to explore various technology stacks for implementing an API service. As a follow-up, I'll look at optimizations for the most promising stacks, and then at API clients (e.g., JavaScript vs. Flex).

For the API servers, we'll look at 3 different API calls:
  1. A simple API call that has no database interaction (e.g., a function that just returns the current time)
  2. A function that implements basic authentication (e.g., inputs: username and password, output: auth token)
  3. A function that pulls a fair amount of data from a database, packages it up, and ships the data back.
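To make the three calls concrete, here's a rough Python sketch of what each one might look like. Python isn't one of the stacks under test; it's just for illustration. Everything named here is an assumption of mine: the function names, the in-memory user store, the `members` table, and the use of SQLite as a stand-in for MySQL so the example runs on its own.

```python
import hashlib
import secrets
import sqlite3
import time

# Hypothetical in-memory user store (unsalted hash for brevity only);
# a real service would check credentials against its auth backend.
_USERS = {"alice": hashlib.sha256(b"s3cret").hexdigest()}


def get_current_time():
    """Call 1: no database interaction, just return the server time."""
    return time.time()


def authenticate(username, password):
    """Call 2: check credentials; return an auth token, or None on failure."""
    stored = _USERS.get(username)
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    if stored is None or not secrets.compare_digest(stored, supplied):
        return None
    return secrets.token_hex(16)  # 32 hex characters


def fetch_rows(conn, limit):
    """Call 3: pull a batch of rows from the database and package them up."""
    cur = conn.execute(
        "SELECT id, name FROM members ORDER BY id LIMIT ?", (limit,)
    )
    return [{"id": row[0], "name": row[1]} for row in cur.fetchall()]


def make_demo_db(n):
    """Build a throwaway SQLite database (stand-in for the MySQL dataset)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE members (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO members (id, name) VALUES (?, ?)",
        [(i, "member%d" % i) for i in range(1, n + 1)],
    )
    conn.commit()
    return conn
```

The point of splitting things this way is that call 1 measures pure protocol and server overhead, call 2 adds a small amount of logic, and call 3 adds database and serialization cost on top.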
Here are some ideas for some of the service stacks to try out:
  1. LAMP stack using POX (plain old XML) - The P being PHP in this stack.
  2. Tomcat (Java) using servlets to process POX
  3. Tomcat (Java) SOAP using Axis2
  4. C++ using libxml++ -- stand-alone server using POX
  5. C++ using gSOAP
The POX protocols will be identical across all technology stacks, and identical to the SOAP versions in terms of arguments and results.  I'll investigate using a REST-like protocol to the extent possible.  For the back-end database, we'll use MySQL, and use the same dataset/database for each experiment. 
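I haven't settled on the POX message format yet, so treat the following as a sketch of one possible shape rather than the actual protocol. The `request`, `param`, and `response` element names and the `method` attribute are all placeholders I made up for illustration; the code uses Python's standard `xml.etree.ElementTree` module.

```python
import xml.etree.ElementTree as ET


def build_request(method, params):
    """Serialize a call as <request method="..."><param name="...">v</param>...</request>."""
    root = ET.Element("request", {"method": method})
    for name, value in params.items():
        child = ET.SubElement(root, "param", {"name": name})
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_request(xml_text):
    """Return (method, params) from a request document built as above."""
    root = ET.fromstring(xml_text)
    params = {p.get("name"): (p.text or "") for p in root.findall("param")}
    return root.get("method"), params


def build_response(method, result):
    """Serialize results as child elements of <response method="...">."""
    root = ET.Element("response", {"method": method})
    for name, value in result.items():
        ET.SubElement(root, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


def parse_response(xml_text):
    """Return (method, result) from a response document built as above."""
    root = ET.fromstring(xml_text)
    return root.get("method"), {child.tag: (child.text or "") for child in root}
```

Keeping the arguments and results as flat name/value pairs should make it easier to mirror the same calls in the SOAP versions.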

I've dabbled in the various areas above, but I'm ready for a deep dive into each stack to gain the expertise needed for a real apples-to-apples comparison. The tests will consist of running each API call a number of times in a row (say 100) and measuring the average, min, max, and standard deviation of the round-trip time, to see how consistent each stack is. We'll also be running MySQL with caching disabled for the database calls, to eliminate caching as a source of variation (we'll turn it back on when we look at optimizing some of the stacks).
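The measurement loop itself is simple. Here's a minimal Python sketch of the kind of harness I have in mind, using the standard `time` and `statistics` modules. The `benchmark` function and its result keys are my own naming, and `call` is whatever zero-argument function performs one round trip against the service under test.

```python
import statistics
import time


def benchmark(call, runs=100):
    """Time `call` `runs` times back to back and summarize the results in ms."""
    samples_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        call()
        samples_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": runs,
        "mean_ms": statistics.mean(samples_ms),
        "min_ms": min(samples_ms),
        "max_ms": max(samples_ms),
        # Sample standard deviation; needs at least two runs.
        "stdev_ms": statistics.stdev(samples_ms),
    }


if __name__ == "__main__":
    # Placeholder workload; swap in a function that makes a real API call.
    stats = benchmark(lambda: sum(range(1000)), runs=100)
    for key, value in stats.items():
        print(key, value)
```

Timing on the client side like this captures the full round trip, which is what matters for comparing the stacks; the first run will often be slower than the rest because of connection setup and warm-up, so it's worth watching the max as well as the mean.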
