Sunday, November 8, 2009

Extensive functional testing vs. formal verification

Found this on James Hamilton's blog (quoting Butler Lampson at SOSP):
"Problem is that testing is expensive, so there is a trade-off between extensive testing and cost. Essentially you can’t test everything. This is part of the reason why the lowest levels of the system need to be simple enough to formally reason about their behavior."
A similar thought had been running through my mind following the repeated discovery of some "corner cases" in a recently released feature. The problem was that, in the absence of QA engineers (yes, can you believe it!), testing by developers alone was never going to be sufficient. In response, I had proposed:
1) Listing system invariants.
2) Formally proving that, given the current implementation of the various operations exposed by the interface, these invariants could not be violated. This would have helped us single out systemic issues much earlier and spared us from hand-writing expensive integration tests for various scenarios (see the sketch below). That way, the functional tests could be put to better use covering other scenarios, rather than verifying that system invariants hold at all times.
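To make step 1 concrete, here is a minimal sketch in Python against a purely hypothetical bounded key-value store (the store, its operations, and the invariants are all made up for illustration): the invariants are listed in one place as executable predicates, so a machine can check them instead of a hand-written test per scenario.

# A minimal sketch (hypothetical system, not our actual one): invariants for
# a toy bounded key-value store, written as executable predicates so they
# can be checked mechanically.

class BoundedStore:
    """Toy store: holds at most `capacity` keys; None values are rejected."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = {}

    def put(self, key, value):
        if value is None:
            raise ValueError("None values are not allowed")
        if key not in self.data and len(self.data) >= self.capacity:
            raise OverflowError("store is full")
        self.data[key] = value

    def delete(self, key):
        self.data.pop(key, None)

# Step 1: the system invariants, listed in one place.
INVARIANTS = [
    ("size never exceeds capacity", lambda s: len(s.data) <= s.capacity),
    ("no stored value is None", lambda s: all(v is not None for v in s.data.values())),
]

def check_invariants(store):
    for name, holds in INVARIANTS:
        assert holds(store), "invariant violated: " + name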
On a related note, surely there must be a way to specify these invariants formally and have a utility generate test cases against a given API? Have you heard of any such tool?
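For concreteness, here is a rough sketch of the kind of utility I have in mind, reusing the toy store and invariant list from the sketch above: a driver that generates random operation sequences against the API and checks every invariant after each step. (QuickCheck-style property-based testing tools work along roughly these lines.)

# A rough sketch of such a utility (assumes BoundedStore, INVARIANTS and
# check_invariants from the previous snippet): generate random operation
# sequences against the API, checking every invariant after each step.
import random

def random_operation(store, rng):
    # Pick and apply one random API call. Exceptions that the API itself
    # raises by design (rejecting None, refusing writes when full) are
    # expected here; the invariants must hold regardless.
    op = rng.choice(["put", "delete"])
    key = rng.randint(0, 9)
    try:
        if op == "put":
            store.put(key, rng.choice([rng.random(), None]))
        else:
            store.delete(key)
    except (ValueError, OverflowError):
        pass

def run_generated_tests(num_cases=1000, ops_per_case=50, seed=42):
    rng = random.Random(seed)
    for _ in range(num_cases):
        store = BoundedStore(capacity=5)
        for _ in range(ops_per_case):
            random_operation(store, rng)
            check_invariants(store)

run_generated_tests()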


Monday, August 3, 2009

Review comments and turning off the brain

I'd like to say +1 to this post on 37signals about the pitfalls of working in a zero-brains, "follow the review comments" mode while incorporating feedback on your design or code. I have fallen into this trap myself, and have often realized later that I could've come up with a much better fix or revision of my design/code if I hadn't just followed every one of the reviewer's comments to the letter.

Saturday, March 21, 2009

Hybrid infrastructure with "the cloud" and a private data center

Let's say you started up and initially didn't have the time or resources to host and run your own infrastructure. So you launched your service on AWS, the de facto infrastructure-as-a-service provider. You became wildly successful, are now flush with cash, and believe you could realize some savings (and gain greater control?) by running your own infrastructure. But you're also smart enough to realize that no matter how hard you try, you cannot possibly achieve the elasticity or the scale of EC2 and S3. More significantly, it makes little sense to procure and provision hardware for peaks like the one Animoto experienced in April '08 (slide 17 in Jeff Bezos' presentation), only to have your CPU utilization hover at 2% for most of the year. So what do you do? No sweat: you keep your private infrastructure provisioned to handle regular workloads, but configure your load balancer (fronting your web-facing servers) to offload to EC2 when there are unexpected peaks. Sounds too far-fetched? Well, that's precisely what SmartSheet and Picnik did: use private infrastructure for the base load, with everything beyond offloaded to AWS. Perfect.
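To illustrate, here is a minimal sketch of the spillover routing decision in Python, with hypothetical pools, hostnames, and thresholds; a real deployment would express this in the load balancer's own configuration rather than in application code.

# A minimal sketch of "private first, cloud on overflow" backend selection.
# All names, addresses, and limits below are hypothetical.
import itertools

PRIVATE_POOL = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]      # own data center
CLOUD_POOL = ["ec2-a.example.com", "ec2-b.example.com"]  # EC2 instances
MAX_CONNS_PER_PRIVATE_SERVER = 100                       # assumed capacity

# Callers would increment/decrement these counts as connections open/close.
active_conns = {server: 0 for server in PRIVATE_POOL + CLOUD_POOL}
_private_rr = itertools.cycle(PRIVATE_POOL)
_cloud_rr = itertools.cycle(CLOUD_POOL)

def pick_backend():
    # Round-robin over the private pool while it has headroom; once every
    # private server is at its limit, overflow spills to the cloud pool.
    if all(active_conns[s] >= MAX_CONNS_PER_PRIVATE_SERVER for s in PRIVATE_POOL):
        return next(_cloud_rr)
    for _ in range(len(PRIVATE_POOL)):
        server = next(_private_rr)
        if active_conns[server] < MAX_CONNS_PER_PRIVATE_SERVER:
            return server
    return next(_cloud_rr)

The design choice is simply that the private pool absorbs the base load you provisioned for, and only saturation spills traffic to EC2, so the cloud bill tracks the peaks rather than the whole year.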
I found it a pleasant coincidence that this was posted on James Hamilton's blog just weeks after I had my own eureka moment arriving at exactly the same hybrid model (see? great minds think alike :-)) while thinking about how a present-day startup might transition from running entirely on the cloud to running on its own infrastructure while staying resilient to traffic spikes all along.