Cheating Tests

Another week has gone by and more leading consumer websites have crashed and burned, causing losses both in revenue (millions) and, perhaps more importantly, in consumer confidence in ecommerce. This week fail whales were sighted swimming in the oceans of one of the world’s largest retailers, a leading children’s toy manufacturer and the industry’s leading online payment processor. PayPal alone (which was down for several hours this past Friday) may have lost hundreds of millions of dollars for itself and the enormous network of retail outlets that rely on it for their financial transaction services.

So why are fail whales continuing to happen so frequently? Don’t these companies test their sites? The answer, for load and performance testing, is of course they do…most of the time. In fact, some companies are spending millions of dollars on people, hardware and tools to load and performance test their websites. So why all the fail whales? One answer may be that organizations that have chosen the wrong testing tools or test service are, in fact, cheating on their tests!

Consumer-facing websites have become the primary channel for revenue and product information for millions of companies around the globe. So why cheat on testing? The pressure to deliver (business agility) is enormous today for all IT organizations, and may well be a key reason cheating has become a common practice. Many test subcontractors and test companies cheat on their tests simply because they run out of time. According to PCWeek, this is what happened to AT&T on the pre-registration site for the iPhone4 launch. Due to a last-minute feature upgrade, they had no time to adequately performance test the site, which, of course, crashed an hour after it was launched due to a 10x spike in traffic, creating a PR and revenue nightmare! Other organizations cheat because they just can’t afford the resources (people, hardware and tools) to properly test their sites in the first place.

The most common way to defend this cheating is to use semantics to obscure the shortcuts taken. When an enraged business owner asks “did you test the site before going live?”, the answer is always “of course we did”. The problem is that it’s the wrong question. The right question is “did you test the site by accurately simulating real users performing both normal and unusual tasks, at and above expected volumes?”. For instance, if the goal was to simulate 5,000 concurrent users, a tester may respond that they tested for 5,000 “page views”! This is when language matters, since 5,000 page loads rarely equal the activity of 5,000 real users. In fact, they likely represent only a fraction of the target volume. A test that substitutes page views or transactions for an accurate simulation of user activity on the site almost certainly won’t reach the goal set by the business owner.
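To put rough numbers on the gap between “page views” and “concurrent users”, here is a minimal back-of-the-envelope sketch. The 30 seconds per page and the one-hour test window are assumptions chosen purely for illustration (neither figure comes from a real test), and the function name is hypothetical.

```python
def page_views_for_users(concurrent_users: int,
                         seconds_per_page: float,
                         test_duration_s: float) -> float:
    """Page views produced when every user loads one page per interval."""
    return concurrent_users * test_duration_s / seconds_per_page


# Assumed figures, for illustration only.
TARGET_USERS = 5_000      # the goal set by the business owner
SECONDS_PER_PAGE = 30     # assumption: a user requests a page every 30 s
TEST_DURATION_S = 3_600   # assumption: a one-hour test

expected_views = page_views_for_users(
    TARGET_USERS, SECONDS_PER_PAGE, TEST_DURATION_S
)

# What fraction of that load does a "5,000 page views" test cover?
fraction_covered = 5_000 / expected_views

print(f"Page views from {TARGET_USERS} real users: {expected_views:,.0f}")
print(f"Share covered by 5,000 page views: {fraction_covered:.2%}")
```

Under these assumptions, 5,000 page views amount to less than one percent of what 5,000 concurrent users would generate in an hour. Different think times change the exact figure, but not the conclusion.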

Another method of cheating the system is to adjust the timings of test scenarios. This practice is widely used by the testing community, primarily due to the cost of hardware and software when using traditional testing tools. For example, if buying a plane ticket online typically takes about 10 minutes, a clever tester may reduce the timing of this process in the test to just 1 minute. This, on the face of it, allows many more “users” into the system, but it doesn’t accurately simulate real-world conditions. Finally, many leading-edge companies are beginning to realize that the only way to accurately test a site is to include production testing. Testing only in the lab, and then extrapolating the results to the production environment, leaves far too many variables unaccounted for in the complex deployment environment that is the web. Skipping production testing is, again, cheating the testing system.
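The timing shortcut in the plane ticket example can be checked with simple queueing arithmetic (Little's Law: sessions open at once = arrival rate × session length). The sketch below assumes a target of 5,000 concurrent users and a tool limited to 500 virtual users. Both numbers and both function names are hypothetical.

```python
def completions_per_minute(concurrent_users: float,
                           session_minutes: float) -> float:
    """Throughput when each user finishes one session per session length."""
    return concurrent_users / session_minutes


def open_sessions(arrivals_per_minute: float,
                  session_minutes: float) -> float:
    """Little's Law: sessions open at once = arrival rate * session length."""
    return arrivals_per_minute * session_minutes


# Real world (assumed): 5,000 users, each taking 10 minutes to buy a ticket.
real_throughput = completions_per_minute(5_000, 10)

# Compressed test (assumed): 500 virtual users, script shortened to 1 minute.
test_throughput = completions_per_minute(500, 1)

# Same purchases per minute, but far fewer sessions held open at once.
real_open = open_sessions(real_throughput, 10)
test_open = open_sessions(test_throughput, 1)

print(f"Purchases per minute: real {real_throughput:.0f}, "
      f"test {test_throughput:.0f}")
print(f"Sessions open at once: real {real_open:.0f}, test {test_open:.0f}")
```

With these assumed figures the compressed test matches the purchase rate of the real workload while holding only a tenth of the sessions open at once. Session memory, connection pools and anything else that scales with concurrent sessions therefore goes largely unexercised.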

Whatever the reason for the cheating (lack of time, people, or resources), we must change this game now to maintain a high level of consumer confidence and keep online commerce growing.

