Friday, February 25, 2005

Web app testing with Python part 1: MaxQ

I intend to write a series of posts on various Web app testing tools that use Python/Jython. I'll start by covering MaxQ, then I'll talk about mechanize, Pamie, Selenium and possibly other tools.

First of all, I'll borrow Bret Pettichord's terminology by saying that there are two main classes of Web app testing tools:
  1. Tools that simulate browsers (Bret calls them "Web protocol drivers") by implementing the HTTP request/response protocol and by parsing the resulting HTML
  2. Tools that automate browsers ("Web browser drivers") by driving them for example via COM calls in the case of Internet Explorer
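
To make the first category concrete, here is a minimal sketch of what a browser simulator does: it issues the HTTP request itself and parses the HTML that comes back, with no browser involved. It is written in plain Python with only the standard library and is not taken from any of the tools discussed here. The page content, the handler class and the function names are all invented for illustration, and a throwaway local server stands in for the site under test.

```python
import threading
import urllib.request
from html.parser import HTMLParser
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical page served by the demo server (not real Bugzilla output).
PAGE = (b"<html><body><a href='/query.cgi'>Query</a> "
        b"<a href='/enter_bug.cgi'>New</a></body></html>")


class DemoHandler(BaseHTTPRequestHandler):
    """Serves one fixed page so the sketch needs no external site."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, *args):
        pass  # keep the demo quiet


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag: the HTML-parsing half."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.extend(v for name, v in attrs if name == "href")


def fetch_links(url):
    """Issue a GET, then parse the returned HTML."""
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url) as response:
        status = response.status
        html = response.read().decode("utf-8")
    parser = LinkCollector()
    parser.feed(html)
    return status, parser.links


def demo():
    """Start the local server, run one simulated visit, shut down."""
    server = HTTPServer(("127.0.0.1", 0), DemoHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return fetch_links("http://127.0.0.1:%d/" % server.server_port)
    finally:
        server.shutdown()
        server.server_close()
```

Calling demo() returns the status code and the list of link targets found on the page. A browser driver, by contrast, would hand the URL to a real browser and inspect the page through the browser's own object model.
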
Examples of browser simulators:
Examples of browser drivers:
  • Pamie (Python), which is based on Samie (Perl): IE automation via COM
  • Watir (Ruby): IE automation via COM
  • JSSh (Mozilla C++ extension module): Mozilla automation via JavaScript shell connections
One tool that doesn't fit neatly in these categories is Selenium from ThoughtWorks. I haven't experimented with it yet, so all I can say is that it has several components:
  • Server-side instrumentation adapted to the particular flavor of the Web server under test
  • JavaScript automation engine -- the "Browser Bot" -- that runs directly in the browser and parses tests written as HTML tables (a la FIT)
The main advantage that Selenium has over the other tools I mentioned is that it's cross-platform and cross-browser. The main disadvantage is that it requires server-side instrumentation. The syntax used by the Selenium JavaScript engine inside the browser is called "Selenese". In Bret Pettichord's words from his blog:

"You can also express Selenium tests in a programming language, taking advantage of language-specific drivers that communicate in Selenese to the browser bot. Java and Ruby drivers have been released, with dot-Net and Python drivers under development. These drivers allow you to write tests in Java, Ruby, Python, C#, or VB.net."

Jason Huggins, the main Selenium developer, is also a Plone developer. He pointed me to the Python code already written for Selenium. Right now it's only available via Subversion from svn://beaver.codehaus.org/selenium/scm/trunk. I checked it out, but I haven't had a chance to try it yet. It's high on my TODO list though, so stay tuned...

One issue that almost all browser simulator tools struggle with is dealing with JavaScript. In my experience, their HTML parsing capabilities tend to break down when faced with rich JavaScript elements. This is one reason why Wilkes Joiner, one of the creators of jWebUnit, said that jWebUnit ended up being used for simple "smoke test"-type testing that automates basic site navigation, rather than for more complicated acceptance/regression testing. No browser simulator tool I know of supports all of the JavaScript constructs yet. But if the Web application you need to test does not make heavy use of JavaScript, then these tools might prove enough for the job.

Browser driver tools such as Watir, Samie and Pamie do not have the JavaScript shortcoming, but of course they are limited to IE and Windows. This may prove too restrictive, especially in view of the recent Firefox resurgence. I haven't used the Mozilla-based JSSh tool yet.

The tool I want to talk about in this post is MaxQ. I found out about it from Titus Brown's blog. MaxQ belongs to the browser simulator category, but it is different from the other tools I mentioned in that it uses a proxy to capture HTTP requests and replies. One of its main capabilities is record/playback of scripts that are automatically written for you in Jython while you are browsing the Web site under test. The tests can then be run either using the GUI version of the tool (which also does the capture), or from the command line.

MaxQ is written in Java, but the test scripts it generates are written in Jython. This is a typical approach taken by other tools such as The Grinder and TestMaker. It combines the availability of test libraries for Java with the agility of a scripting language such as Jython. It is a trend that I see gaining more traction in the testing world as Jython breaks more into the mainstream.

MaxQ's forte is automating HTTP requests (both GET and POST) and capturing the response codes as well as the raw HTML output. It does not attempt to parse HTML into a DOM object, as other tools do, but it can verify that a given text or URI exists in the HTTP response. There is talk on the developers' mailing list about extending MaxQ with HttpUnit, so that it can offer finer-grained control over HTML elements such as frames and tables. MaxQ does not support HTTPS at this time.

One question you might have (I know I had it) is why you should use MaxQ when other tools offer more capabilities, at least in terms of HTML parsing. Here are some reasons:
  • The record/playback feature is very helpful; the fact that the tool generates Jython code makes it very easy to modify it by hand later and maintain it
  • MaxQ retrieves all the elements referenced on a given Web page (images, CSS), so it makes it easy to test that all links to these objects are valid
  • Form posting is easy to automate and verify
  • The fact that MaxQ does not do HTML parsing is sometimes an advantage, since HTML parsing is brittle (especially when dealing with JavaScript), and relying on HTML parsing makes your tests fragile and prone to break whenever the HTML elements are modified
In short, I would say that you should use MaxQ whenever you are more interested in testing the HTTP side of your Web application, and not so much the HTML composition of your pages.
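
As a rough illustration of the "elements referenced on a given Web page" mentioned in the list above, the following sketch builds such a list outside MaxQ by scanning the HTML for image, stylesheet and script references. MaxQ itself obtains these URLs by recording what the browser requests through the proxy, not by parsing, so this is only an approximation of the same idea. The class name, function name and sample markup are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ResourceCollector(HTMLParser):
    """Collects URLs of images, stylesheets and scripts referenced by a page."""

    # tag -> attribute that holds the referenced URL
    REF_ATTRS = {"img": "src", "script": "src", "link": "href"}

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.resources = []

    def handle_starttag(self, tag, attrs):
        wanted = self.REF_ATTRS.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the page URL.
                self.resources.append(urljoin(self.base_url, value))


def referenced_resources(base_url, html):
    """Return absolute URLs of the resources a page refers to."""
    parser = ResourceCollector(base_url)
    parser.feed(html)
    return parser.resources


# Made-up markup, loosely in the spirit of a Bugzilla front page.
SAMPLE = ('<html><head><link rel="stylesheet" href="css/global.css"></head>'
          '<body><img src="/bugs/ant.jpg"></body></html>')
resources = referenced_resources("http://example.com/bugs/", SAMPLE)
```

Each URL in the resulting list could then be requested in turn and its response code checked, which is essentially what a "standard" MaxQ script does for you.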

Short MaxQ tutorial

As an example of the application under test, I will use a fresh installation of Bugzilla and I will use MaxQ to test a simple feature: running a Bugzilla query with a non-existent summary results in an empty results page.

Install MaxQ

I downloaded and installed MaxQ on a Windows XP box. I already had the Java SDK installed. To run MaxQ, go to a command prompt, cd to the bin sub-directory and type maxq.bat. This will launch the proxy process, which by default listens on port 8090. It will also launch the MaxQ Java GUI tool.

In the GUI tool, go to File->New to start either a "standard" script or a "compact" script. The difference is that the standard script will include HTTP requests for all the elements referenced on every Web page you visit (such as images or CSS), whereas the compact script will only include one HTTP request per page, to the page URL. The compact script also lives up to its name by aggregating the execution of the HTTP request and the validation of the response in one line of code.

To start a recording session, go to Test->Start Recording.

Now configure your browser to use a proxy on localhost:8090.
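
A browser is not the only client that can be pointed at a proxy. As a sketch only, this is how a plain Python script could route its HTTP requests through localhost:8090 using the standard library. I am assuming, without having verified it, that MaxQ would record requests from a non-browser client the same way it records a browser's. The function name is made up.

```python
import urllib.request

MAXQ_PROXY = "http://localhost:8090"  # default proxy port mentioned above


def make_recording_opener(proxy_url=MAXQ_PROXY):
    """Build an opener whose plain-HTTP requests go through the proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


opener = make_recording_opener()
# While a recording session is active, a request would look like:
#     opener.open("http://example.com/bugs")
```

The actual open() call is left as a comment because it only works when the proxy is running.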

Record the test script

For my first test, I created a new standard script and MaxQ generated this code:

# Generated by MaxQ [com.bitmechanic.maxq.generator.JythonCodeGenerator]
from PyHttpTestCase import PyHttpTestCase
from com.bitmechanic.maxq import Config
from org.python.modules import re
global validatorPkg
if __name__ == 'main':
    validatorPkg = Config.getValidatorPkgName()
# Determine the validator for this testcase.
exec 'from '+validatorPkg+' import Validator'


# definition of test class
class test_bugzilla_empty_search(PyHttpTestCase):
    def runTest(self):
        self.msg('Test started')

        # ^^^ Insert new recordings here. (Do not remove this line.)


# Code to load and run the test
if __name__ == 'main':
    test = test_bugzilla_empty_search("test_bugzilla_empty_search")
    test.Run()

Note that the test class is derived from PyHttpTestCase, a Jython class that is itself derived from a Java class: HttpTestCase. What HttpTestCase does is encapsulate the HTTP request/response functionality. Its two main methods are get() and post(), but it also offers helper methods such as responseContains(text) or responseContainsURI(uri), which verify that a given text or URI is present in the HTTP response.
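
To show the shape of this interface without needing MaxQ, here is a plain-Python stand-in. It is not MaxQ's HttpTestCase: the class name, the transport callable and the canned response are hypothetical, and treating responseContainsURI as a simple substring match is my assumption about its behavior.

```python
class SketchHttpTestCase:
    """Plain-Python stand-in for the request/response helpers described above."""

    def __init__(self, transport):
        # transport: callable(method, url, params) -> (status_code, body_text)
        self._transport = transport
        self._code = None
        self._body = ""

    def get(self, url, params=None):
        self._code, self._body = self._transport("GET", url, params)

    def post(self, url, params=None):
        self._code, self._body = self._transport("POST", url, params)

    def getResponseCode(self):
        return self._code

    def getResponse(self):
        return self._body

    def responseContains(self, text):
        return text in self._body

    def responseContainsURI(self, uri):
        # Assumption: a plain substring match on the raw body.
        return uri in self._body


def canned_transport(method, url, params):
    """Fake transport that returns a fixed page instead of using the network."""
    return 200, '<a href="query.cgi">Query</a> Zarro Boogs found.'


case = SketchHttpTestCase(canned_transport)
case.get("http://example.com/bugs/buglist.cgi")
```

The point of the design is that the test only ever sees a status code and a raw body, so checks stay independent of the page's HTML structure.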

I started recording, then I went to http://example.com/bugs in my browser (real URL omitted) and got to the main Bugzilla page. I then clicked on the "Query existing bug reports" link to go to the Search page. I entered "nonexistentbug!!!" in the Summary field, then clicked Search. I got back a page containing the text "Zarro Boogs found."

While I was busily navigating the Bugzilla pages and posting the Search query, this is what MaxQ automatically recorded for me:

# Generated by MaxQ [com.bitmechanic.maxq.generator.JythonCodeGenerator]
from PyHttpTestCase import PyHttpTestCase
from com.bitmechanic.maxq import Config
from org.python.modules import re
global validatorPkg
if __name__ == 'main':
    validatorPkg = Config.getValidatorPkgName()
# Determine the validator for this testcase.
exec 'from '+validatorPkg+' import Validator'


# definition of test class
class test_bugzilla_empty_search(PyHttpTestCase):
    def runTest(self):
        self.msg('Test started')
        self.msg("Testing URL: %s" % self.replaceURL('''http://example.com/bugs'''))
        url = "http://example.com/bugs"
        params = None
        Validator.validateRequest(self, self.getMethod(), "get", url, params)
        self.get(url, params)
        self.msg("Response code: %s" % self.getResponseCode())
        self.assertEquals("Assert number 1 failed", 301, self.getResponseCode())
        Validator.validateResponse(self, self.getMethod(), url, params)

        self.msg("Testing URL: %s" % self.replaceURL('''http://example.com/bugs/query.cgi'''))
        url = "http://example.com/bugs/query.cgi"
        params = None
        Validator.validateRequest(self, self.getMethod(), "get", url, params)
        self.get(url, params)
        self.msg("Response code: %s" % self.getResponseCode())
        self.assertEquals("Assert number 2 failed", 200, self.getResponseCode())
        Validator.validateResponse(self, self.getMethod(), url, params)

        params = [
            ('''short_desc_type''', '''allwordssubstr'''),
            ('''short_desc''', '''nonexistentbug!!!'''),
            ('''long_desc_type''', '''allwordssubstr'''),
            ('''long_desc''', ''''''),
            ('''bug_file_loc_type''', '''allwordssubstr'''),
            ('''bug_file_loc''', ''''''),
            ('''bug_status''', '''NEW'''),
            ('''bug_status''', '''ASSIGNED'''),
            ('''bug_status''', '''REOPENED'''),
            ('''emailassigned_to1''', '''1'''),
            ('''emailtype1''', '''substring'''),
            ('''email1''', ''''''),
            ('''emailassigned_to2''', '''1'''),
            ('''emailreporter2''', '''1'''),
            ('''emailcc2''', '''1'''),
            ('''emailtype2''', '''substring'''),
            ('''email2''', ''''''),
            ('''bugidtype''', '''include'''),
            ('''bug_id''', ''''''),
            ('''votes''', ''''''),
            ('''changedin''', ''''''),
            ('''chfieldfrom''', ''''''),
            ('''chfieldto''', '''Now'''),
            ('''chfieldvalue''', ''''''),
            ('''cmdtype''', '''doit'''),
            ('''order''', '''Reuse same sort as last time'''),
            ('''field0-0-0''', '''noop'''),
            ('''type0-0-0''', '''noop'''),
            ('''value0-0-0''', ''''''),]
        self.msg("Testing URL: %s" % self.replaceURL('''http://example.com/bugs/buglist.cgi?short_desc_type=allwordssubstr&short_desc=nonexistentbug!!!&long_desc_type=allwordssubstr&long_desc=&bug_file_loc_type=allwordssubstr&bug_file_loc=&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&emailassigned_to1=1&emailtype1=substring&email1=&emailassigned_to2=1&emailreporter2=1&emailcc2=1&emailtype2=substring&email2=&bugidtype=include&bug_id=&votes=&changedin=&chfieldfrom=&chfieldto=Now&chfieldvalue=&cmdtype=doit&order=Reuse same sort as last time&field0-0-0=noop&type0-0-0=noop&value0-0-0='''))
        url = "http://example.com/bugs/buglist.cgi"
        Validator.validateRequest(self, self.getMethod(), "get", url, params)
        self.get(url, params)
        self.msg("Response code: %s" % self.getResponseCode())
        self.assertEquals("Assert number 3 failed", 200, self.getResponseCode())
        Validator.validateResponse(self, self.getMethod(), url, params)

        # ^^^ Insert new recordings here. (Do not remove this line.)


# Code to load and run the test
if __name__ == 'main':
    test = test_bugzilla_empty_search("test_bugzilla_empty_search")
    test.Run()

The generated script is a bit on the verbose side. Note that getting and verifying the HTTP request for http://example.com/bugs takes 8 lines:

   
self.msg("Testing URL: %s" % self.replaceURL('''http://example.com/bugs'''))
url = "http://example.com/bugs"
params = None
Validator.validateRequest(self, self.getMethod(), "get", url, params)
self.get(url, params)
self.msg("Response code: %s" % self.getResponseCode())
self.assertEquals("Assert number 1 failed", 301, self.getResponseCode())
Validator.validateResponse(self, self.getMethod(), url, params)

This is where the compact script form comes in handy. The equivalent compact expression is:

self.get('http://example.com/bugs', None, 301)
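
To see how a recording maps onto the compact form, here is a hypothetical sketch that renders recorded steps as one-line calls in the style shown above. It is not MaxQ's generator (those are Java classes), and the function name and step format are invented. Only the params=None case is confirmed by the example above; how MaxQ renders a step that carries parameters is not shown in this post.

```python
def render_compact(steps):
    """Render recorded steps as compact-style script lines.

    steps: iterable of (url, params, status_code) tuples, where params is
    None or a list of (name, value) pairs.
    """
    lines = []
    for url, params, status in steps:
        # %r yields a quoted Python literal for the URL and for params.
        lines.append("self.get(%r, %r, %d)" % (url, params, status))
    return lines


recorded = [
    ("http://example.com/bugs", None, 301),
    ("http://example.com/bugs/query.cgi", None, 200),
]
script_lines = render_compact(recorded)
```

The first rendered line is identical to the compact expression above, which shows that the one-liner carries the same three facts as the eight verbose lines: URL, parameters and expected response code.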

MaxQ shines at retrieving form fields (even hidden ones), filling them with the values given by the user and submitting the form. This is what the last part of the generated Jython script does. Note that the Bugzilla query form is submitted with GET rather than POST, which is why the recorded call is self.get(url, params).
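
One detail of the recorded parameters is worth a short illustration: they are kept as a list of (name, value) pairs rather than a dictionary, because a form can send the same field several times (bug_status appears three times above). The sketch below uses the standard Python library, not MaxQ, to show how such a list turns into a query string, using a subset of the recorded fields.

```python
from urllib.parse import parse_qsl, urlencode

# A subset of the fields recorded above; bug_status is deliberately repeated.
params = [
    ("short_desc_type", "allwordssubstr"),
    ("short_desc", "nonexistentbug!!!"),
    ("bug_status", "NEW"),
    ("bug_status", "ASSIGNED"),
    ("bug_status", "REOPENED"),
    ("order", "Reuse same sort as last time"),
]

# urlencode keeps the order and the repeated keys of a list of pairs,
# and percent-encodes characters such as '!' and spaces.
query = urlencode(params)

# Decoding the query string gives back the original pairs.
round_trip = parse_qsl(query)
```

A dictionary would silently keep only one bug_status value, so the search would no longer match what the browser sent.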

I manually added this line before the "Insert new recordings" line:

assert self.responseContains("Zarro Boogs found")

This shows how to use the responseContains helper method from the HttpTestCase class to verify that the returned page contains a given text.

You can also do an ad-hoc validation on the returned HTML by using a regular expression applied to the raw HTML (which can be retrieved via the getResponse() method). So you can do something like this:

assert re.search(r'Zarro', self.getResponse())

Caveat: a simple "import re" will not work; you need to import the re module like this:

from org.python.modules import re
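
Since the check runs against raw HTML, it can pay to make the pattern tolerant of whitespace and line breaks between words. The following sketch can be tried outside MaxQ in a regular Python interpreter, which is why it uses the ordinary "import re"; inside a MaxQ script you would keep the import shown above. The sample HTML and the function name are made up.

```python
import re

# Hypothetical raw HTML with irregular whitespace between the words.
SAMPLE_HTML = "<html><body>\n<p>Zarro   Boogs\nfound.</p>\n</body></html>"


def found_no_bugs(raw_html):
    """True if the page reports an empty result, whatever the spacing."""
    return re.search(r"Zarro\s+Boogs\s+found", raw_html) is not None
```

With \s+ between the words, the check still passes if the page template later changes its spacing or wraps the sentence onto two lines.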

Run the test script

When you are done browsing the target Web site for the functionality you want to test, go to Test->Stop Recording. You will be prompted for a file name. I chose test_bugzilla_empty_search.py. At this point, you can run the Jython test script inside the MaxQ GUI by going to Test->Run. The output is something like:

Test started
Testing URL: http://example.com/bugs
Response code: 301
Testing URL: http://example.com/bugs/query.cgi
Response code: 200
Testing URL: http://example.com/bugs/buglist.cgi?short_desc_type=allwordssubstr&short_desc=nonexistentbug!!!&long_desc_type=allwordssubstr&long_desc=&bug_file_loc_type=allwordssubstr&bug_file_loc=&bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&emailassigned_to1=1&emailtype1=substring&email1=&emailassigned_to2=1&emailreporter2=1&emailcc2=1&emailtype2=substring&email2=&bugidtype=include&bug_id=&votes=&changedin=&chfieldfrom=&chfieldto=Now&chfieldvalue=&cmdtype=doit&order=Reuse same sort as last time&field0-0-0=noop&type0-0-0=noop&value0-0-0=
Response code: 200
Test Ran Successfully

You can also run the script at the command line by invoking:

maxq -r test_bugzilla_empty_search.py
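
If you run scripts from the command line as part of a batch job, you may want to summarize the output automatically. The sketch below parses a run log into (URL, response code) pairs and a pass/fail flag. It assumes that the command-line output has the same format as the GUI output shown above, which I have not verified; the function name and the shortened sample log are for illustration only.

```python
SAMPLE_OUTPUT = """\
Test started
Testing URL: http://example.com/bugs
Response code: 301
Testing URL: http://example.com/bugs/query.cgi
Response code: 200
Test Ran Successfully
"""

URL_PREFIX = "Testing URL: "
CODE_PREFIX = "Response code: "


def summarize_run(output):
    """Return (passed, [(url, status_code), ...]) parsed from a run log."""
    steps = []
    current_url = None
    for line in output.splitlines():
        if line.startswith(URL_PREFIX):
            current_url = line[len(URL_PREFIX):]
        elif line.startswith(CODE_PREFIX) and current_url is not None:
            steps.append((current_url, int(line[len(CODE_PREFIX):])))
            current_url = None
    # Assumption: this marker line signals a passing run.
    passed = "Test Ran Successfully" in output
    return passed, steps


passed, steps = summarize_run(SAMPLE_OUTPUT)
```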

Conclusion

I think MaxQ is a useful tool for regression-testing simple Web site navigation and form processing. Its record/playback feature takes much of the tedium out of writing test scripts by hand. (As an aside, TestMaker uses the MaxQ capture/playback engine for its own functionality.) The fact that the script language is Jython is a big plus, since testers can enhance the generated scripts with custom Python logic. The source code is clean and easy to grasp, and development is active at maxq.tigris.org.

Another nifty feature I haven't mentioned is that it is easy to add your own script generator plugins. All you need to do is write a Java class derived from AbstractCodeGenerator, put it in java/com/bitmechanic/maxq/generator under the main maxq directory, recompile maxq.jar via ant, then add the class to conf/maxq.properties in the generator.classnames section. The MaxQ GUI tool will then automatically pick up your generator at run time and offer it in the File->New menu. For an example of a custom generator, see Titus Brown's PBP script generator.

On the minus side, MaxQ is not the best tool to use if you need fine-grained control over HTML elements such as links, tables and frames. If you need this functionality, you are better off using a tool such as HttpUnit or HtmlUnit and driving it from Jython. If you want to use pure Python instead of Jython, you can use mechanize or webunit, which I'll discuss in a future post.

7 comments:

Paulo Eduardo Neves said...

The last time I saw MaxQ, there was an annoying problem. The generated Jython test class was a subclass of a Java class instead of the PyUnit TestCase class. Since the Java base class can't introspect Python methods, if I remember well (maybe I'm wrong), two problems appear:
1) You can't define your own setUp and tearDown methods in jython, they aren't called.
2) You can't have a lot of testMethods in the same testing class.

So you couldn't record a simple session and programmatically modify it to test a lot of different paths.

Anonymous said...

Forgive me if I engage my typewriter before I engage my brain ;-)

You have pointed out that we have some tools to test web sites, fair enough.

Some seem to good, like pamie, but are early in the development stages and not fully tested.

Why didn't you offer even the smallest critique to the classes Microsoft gives us for navigating controls in web pages?

Anonymous said...

Why didn't you offer even the smallest critique to the classes Microsoft gives us for navigating controls in web pages?

I believe PAMIE does that under the hood. From the PAMIE site -
Pamie allows you to automate I.E. by manipulating I.E.'s Document Object Model via COM.

By all means, do it at the fiddly COM level yourself.

from win32com.client import Dispatch

ie = Dispatch("InternetExplorer.Application")

Go hard. ;)

Giri said...

can you tell me, whether maxq supports java scripts or not.

Anonymous said...

I found this tool for testing. Behind is the COM component of programs.

http://www.tizmoi.net/watsup/intro.html

It goes in section 2.

Lp,
admir.mustafic@xxxxxx.xx

Frederic Torres said...

InCisif.net supports C#, VB.NET and IronPython.

Anonymous said...

thank you for your useful comment about choosing of right approach of web application testing.

An ecommerce web site without a good web application is not possible. Web application development is the backbone of any online business irrespective of the fact whether it is catering to a large, small or medium customer base. There are many good companies in India that make such services easy and affordable, just browse the net and make your pick…
