Finally I get to write a post about testing. Here's the scenario I had to troubleshoot yesterday: a client of ours has a web app that uses a Java applet for FTP transfers to a back-end server. The Java applet presents a nice GUI to end-users, allowing them to drag and drop files from their local workstation to the server.
The problem was that some file transfers were failing in a mysterious way. We started with the obvious: we looked at the network connectivity between the user who first reported the problem and our data center, then at the size of the files he was trying to transfer (he thought files over 10 MB were the culprit). We also looked at the number of files transferred, both multiple files in one operation and single files in consecutive operations. We tried transferring files using both a normal FTP client and the Java applet. Everything seemed to point in the direction of 'works for me' -- a stance well-known to testers around the world.
All of a sudden, around an hour after I started using the Java applet to transfer files, I got the error 'unable to upload one or more files', followed by the message 'software caused connection abort: software write error'. I thought OK, this may be due to web sessions timing out after an hour. I did some more testing, and the second time I got the error after half an hour. I also noticed that I had let some time pass between transfers. This gave me the idea of investigating the timeout settings on the FTP server side (which was running vsftpd). And lo and behold, here's what I found in the man page for vsftpd.conf:
idle_session_timeout: The timeout, in seconds, which is the maximum time a remote client may spend between FTP commands. If the timeout triggers, the remote client is kicked off. Default: 300
My next step was of course to wait 5 minutes between file transfers, and sure enough, I got the 'unable to upload one or more files' error.
Lesson learned: pay close attention to the timing of your tests. Also look for timeout settings both on the client and on the server side, and write corner test cases accordingly.
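A timing corner case like this one can be scripted rather than discovered by accident. Here is a minimal sketch, assuming Python's standard ftplib and a server that answers NOOP; the helper names survives_idle and find_idle_cutoff, the host, and the credentials are all made up for illustration and are not part of the client's setup. The idea is to open a fresh session, sit idle for a chosen period, and then send a cheap command to see whether the server has already dropped the connection.

```python
import ftplib
import time


def survives_idle(ftp, idle_seconds, sleep=time.sleep):
    """Return True if the FTP session still answers after sitting idle."""
    sleep(idle_seconds)          # simulate a user pausing between transfers
    try:
        ftp.voidcmd("NOOP")      # cheap command; fails if the server dropped us
    except ftplib.all_errors:
        return False
    return True


def find_idle_cutoff(connect, waits, sleep=time.sleep):
    """Try each idle period on a fresh session.

    Returns the first wait (in seconds) after which the session was dead,
    or None if every wait was survived.
    """
    for wait in waits:
        ftp = connect()
        try:
            if not survives_idle(ftp, wait, sleep):
                return wait
        finally:
            try:
                ftp.close()
            except ftplib.all_errors:
                pass             # the connection may already be gone
    return None


# Hypothetical usage against a real server (host and login are placeholders):
# connect = lambda: ftplib.FTP("ftp.example.com", "user", "password")
# print(find_idle_cutoff(connect, [60, 240, 290, 310]))
```

Picking waits just below and just above the suspected timeout turns the 'wait and see' approach into a repeatable test. The sleep parameter is injectable so the logic can be checked quickly against a fake session without actually waiting.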
In the end, it was by luck that I discovered the cause of the problems we had, but as Louis Pasteur said, "Chance favors the prepared mind". I'll surely be better prepared next time, timing-wise.