Thursday, December 16, 2004

STAF/STAX tutorial

Automated test distribution, execution and reporting with STAF/STAX

Assume you are part of a test team whose goal is to automate the distribution of tests to a large set of clients running on various platforms. You want to run an automated 'smoke test' in the following scenario:

  • A nightly build process sends out email notification that a new version of the software is ready to be tested.
  • The notification email triggers a 'Start Smoke Test' request sent to a dedicated machine (I will call it the "test management" machine), which coordinates all clients to be tested
  • The test management machine somehow tells all clients that version x.y.z of the software is available, then tells all clients to run a test harness and report back the results
  • After getting back the test results from all the clients, the test management machine sends out a test summary email containing the overall, failed, and successful test case count

You could try to implement this functionality yourself by writing, for example, a simple XML-RPC agent that runs on every client and accepts commands from the test management machine, but you soon realize that you need something more robust, something that has already been proven in large test environments.

I will show you how to use the STAF/STAX framework from IBM, which offers all the features listed in the smoke-test scenario just described.

The idea behind STAF is to run a very simple agent on all the machines that participate in the STAF testbed. Every machine can then run services on any other machine, subject to a so-called trust level. In practice, one machine will act as what I called the 'test management' machine, and will coordinate the test runs by sending jobs to the test clients. STAX is one of the services offered on top of the low-level STAF plumbing. It greatly facilitates the distribution of jobs to the test clients and the collection and logging of test results. STAX jobs are XML files spiced up with special <script> tags that contain Python code (actually Jython, but there are no differences for the purpose of this tutorial). This in itself was for us a major reason for choosing STAF over other solutions.

Here is the test environment that I will use in my example:

  • 3 clients that will run the test harness: one called win1 running some flavor of Windows, one called linux1 running some flavor of Linux, and one called sol1 running some flavor of Solaris
  • 1 test management machine, called mgmt1
  • 1 desktop PC, called desktop1

What follows is a step-by-step guide to configuring STAF and STAX on the machines in the example testbed:

Step 1: Install and configure STAF on the test clients

Install STAF on all 5 machines (I refer the readers to the STAF User Guide for details on installing STAF). Here is an example of a STAF configuration file (on Unix, it's usually in /usr/local/staf/bin/STAF.cfg) for one of the 3 client machines:


# Enable TCP/IP connections
interface tcpip

# Turn on tracing of internal errors and deprecated options
trace on error deprecated

serviceloader library STAFDSLS

SET CONNECTTIMEOUT 15000
SET MAXQUEUESIZE 10000

TRUST LEVEL 5 MACHINE mgmt1

Note that the 3 client machines need to increase the trust level (default is 3) for the test management machine, so that the latter can initiate jobs on the clients.

Step 2: Install and configure STAX on the management host


Install the STAX service on the test management machine. In STAF parlance, this machine is called the STAX Service machine (readers are referred to the STAX User's Guide for details on STAX). There are a few things to remember in terms of requirements for this machine:
    • Java 1.2 or later needs to be installed
    • The following 2 variables need to be set (for example in .bash_profile):
export CLASSPATH=$CLASSPATH:/usr/local/staf/lib/JSTAF.jar	

export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/staf/lib
The STAF.cfg configuration file needs to have the STAX service added to it (note the increase to trust level 4 for the desktop1 machine, which will act as the monitoring machine and needs special rights to connect to mgmt1):

# Enable TCP/IP connections
interface tcpip

# Turn on tracing of internal errors and deprecated options
trace on error deprecated

serviceloader library STAFDSLS

SERVICE STAX LIBRARY JSTAF EXECUTE /usr/local/staf/services/STAX/STAX.jar

SET MAXQUEUESIZE 10000

TRUST LEVEL 4 MACHINE desktop1


Step 3: Start the STAF agent

Run STAFProc on all 5 machines. STAFProc is the STAF agent that listens on a specific port (default is 6500) for STAF-specific commands.

Step 4: Create STAX job files


Create the STAX XML job files that will be interpreted by the STAX service on mgmt1. Here is an example of a job file, called client_test_harness.xml, that will run a test harness on our 3 clients:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE stax SYSTEM "C:\QA\STAF\stax.dtd">
<stax>
<!--
The following <script> element is overridden if the global_vars.py SCRIPTFILE is used
A SCRIPTFILE can be specified either in the STAX Monitor, or directly when submitting a job to STAX
-->

<script>
  VERSION = '1.0.1'
  HARNESS_TIMER_DURATION = '60m'

  clients_os = {
    'win1': 'win',
    'sol1': 'unix',
    'linux1': 'unix'
  }
  harness_path = {
    'unix': '/qa/harness',
    'win': 'C:/qa/harness'
  }
  tests_unix = [
    ['unix_perms', 'brv_unix_perms.py'],
    ['long_names', 'brv_long_names.py']
  ]
  tests_win = [
    ['unicode_names', 'brv_unicode_names.py']
  ]
</script>

<defaultcall function="Main"/>

<function name="Main">
  <sequence>
    <import machine="'mgmt1'" file="'/QA/STAF/stax_jobs/log_result.xml'"/>
    <call function="'ClientTestHarness'">
      [clients_os, harness_path, tests_unix, tests_win]
    </call>
  </sequence>
</function>

<function name="ClientTestHarness">
  <function-list-args>
    <function-required-arg name='clients_os'/>
    <function-required-arg name='harness_path'/>
    <function-required-arg name='tests_unix'/>
    <function-required-arg name='tests_win'/>
    <function-other-args name='args'/>
  </function-list-args>
  <paralleliterate var="machine" in="clients_os.keys()">
    <sequence>
      <script>
        os_type = clients_os[machine]
        tests = {}
        if os_type == 'unix':
          tests = tests_unix
        if os_type == 'win':
          tests = tests_win
      </script>
      <iterate var="test" in="tests">
        <sequence>
          <script>
            test_name = machine + "_" + test[0]
          </script>
          <testcase name="test_name">
            <sequence>
              <script>
                cmdline = harness_path[os_type] + "/" + test[1]
              </script>
              <timer duration="HARNESS_TIMER_DURATION">
                <process>
                  <location>machine</location>
                  <command>'python'</command>
                  <parms>cmdline</parms>
                  <stderr mode="'stdout'"/>
                  <returnstdout/>
                </process>
              </timer>
              <call function="'LogResult'">machine</call>
            </sequence>
          </testcase>
        </sequence>
      </iterate>
    </sequence>
  </paralleliterate>
</function>
</stax>

The syntax may seem overwhelming at first, but it turns out to be quite manageable once you get the hang of it. Here are the salient points in the above file:
  • The first <script> element sets a number of Python variables which are then used in the body of the XML document; think of them as global constants
  • There is one function called in the <defaultcall> element; this function is called Main and is defined in the first <function> element
  • The Main function imports another XML file (log_result.xml) in order for this job to be able to call a function (LogResult) defined in the imported file
  • The Main function then calls a function called ClientTestHarness, passing it as arguments four Python variables defined at the top
  • Almost all the action in this job happens in the ClientTestHarness function, which starts by declaring its required arguments, then proceeds by running a series of tests in parallel on each of our 3 client machines; the parallelism is achieved by means of the <paralleliterate> element
  • The <script> element that follows is simple Python code that retrieves the test suite to be run from the global dictionaries, via the machine name
  • On each machine, the tests in the test suite are executed sequentially, via the <iterate> element
  • A <testcase> element is defined for each test, so that we can easily retrieve the test statistics at the end of the run, via the LogResult function
  • For each test, the ClientTestHarness function executes a <process> element, which runs a command (for example brv_unix_perms.py) on the target machine; the <process> element is surrounded by a <timer> element which will mark the test as failed if it runs longer than the specified time interval
  • The <process> element also specifies that the command to be executed redirects stderr to stdout and returns stdout
  • Finally, the ClientTestHarness function calls LogResult, passing it the machine name as the only argument
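
The control flow described in the list above can be sketched in plain Python, outside STAX. This is only an illustration under stated assumptions: run_test is a hypothetical stand-in for the <process> element and does not invoke STAF, a thread pool plays the role of <paralleliterate>, and an ordinary for loop plays the role of <iterate>.

```python
from concurrent.futures import ThreadPoolExecutor

clients_os = {'win1': 'win', 'sol1': 'unix', 'linux1': 'unix'}
harness_path = {'unix': '/qa/harness', 'win': 'C:/qa/harness'}
tests_unix = [['unix_perms', 'brv_unix_perms.py'],
              ['long_names', 'brv_long_names.py']]
tests_win = [['unicode_names', 'brv_unicode_names.py']]


def run_test(machine, cmdline):
    # Hypothetical stand-in for the <process> element: it only records
    # what would be run and where; nothing is executed.
    return 'python %s on %s' % (cmdline, machine)


def run_machine(machine):
    # Mirrors the <script> block: pick the suite from the machine's OS type.
    os_type = clients_os[machine]
    tests = tests_unix if os_type == 'unix' else tests_win
    results = []
    # Mirrors <iterate>: tests run one after another on a given machine.
    for name, script in tests:
        test_name = machine + '_' + name
        cmdline = harness_path[os_type] + '/' + script
        results.append((test_name, run_test(machine, cmdline)))
    return results


def run_all():
    # Mirrors <paralleliterate>: one worker per machine, all running at once.
    machines = sorted(clients_os)
    with ThreadPoolExecutor(max_workers=len(machines)) as pool:
        return dict(zip(machines, pool.map(run_machine, machines)))


if __name__ == '__main__':
    for machine, results in run_all().items():
        for test_name, action in results:
            print(test_name, '->', action)
```

The point of the sketch is the shape of the run: machines proceed independently of each other, while the tests on any single machine keep their order.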

The LogResult function is defined in the log_result.xml file. Its tasks are to:
  • interpret the return code (which is a STAF-specific variable called RC) and the output (which is a STAX-specific variable called STAXResult) for each test case
  • set the result of the test run to PASS or FAIL
  • log it accordingly

Here is the log_result.xml file:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE stax SYSTEM "C:\QA\STAF\stax.dtd">
<stax>
<function name="LogResult">
  <function-list-args>
    <function-required-arg name='machine'/>
    <function-other-args name='args'/>
  </function-list-args>
  <if expr="RC != 0">
    <sequence>
      <tcstatus result="'fail'">'Failed with RC=%s' % RC</tcstatus>
      <log level="'error'">'Process failed with RC=%s, Result=%s' % (RC, STAFResult)</log>
    </sequence>
    <elseif expr="STAXResult != None">
      <iterate var="file_info" in="STAXResult" indexvar="i">
        <if expr="file_info[0] == 0">
          <sequence>
            <script>
              import re
              fail = re.search('FAIL', file_info[1])
              log_msg = 'HOST:%s\n\n%s' % (machine, file_info[1])
            </script>
            <if expr="fail">
              <sequence>
                <tcstatus result="'fail'">'Test output contains FAIL'</tcstatus>
                <log level="'error'">log_msg</log>
              </sequence>
              <else>
                <sequence>
                  <tcstatus result="'pass'"></tcstatus>
                  <log level="'info'">log_msg</log>
                </sequence>
              </else>
            </if>
          </sequence>
          <else>
            <log level="'error'">'Retrieval of file %s contents failed with RC=%s' % (i, file_info[0])</log>
          </else>
        </if>
      </iterate>
    </elseif>
    <else>
      <log level="'info'">'STAXResult is None'</log>
    </else>
  </if>
</function>
</stax>
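
For readers more comfortable with Python than with STAX XML, the same decision logic can be sketched as a plain function. The name classify and its return format are my own assumptions, not part of STAX; the sketch assumes, as log_result.xml does, that STAXResult is either None or a list of [return code, file contents] pairs.

```python
import re


def classify(rc, stax_result):
    """Return a list of (status, message) pairs, following log_result.xml."""
    if rc != 0:
        # The process itself failed.
        return [('fail', 'Failed with RC=%s' % rc)]
    if stax_result is None:
        # Nothing was returned, so there is no output to inspect.
        return [('info', 'STAXResult is None')]
    outcomes = []
    for i, (file_rc, contents) in enumerate(stax_result):
        if file_rc != 0:
            # The output file could not be retrieved.
            outcomes.append(('error',
                             'Retrieval of file %s contents failed with RC=%s'
                             % (i, file_rc)))
        elif re.search('FAIL', contents):
            outcomes.append(('fail', 'Test output contains FAIL'))
        else:
            outcomes.append(('pass', contents))
    return outcomes
```

Note that a test is marked as passed only when the process returned 0 and its captured output does not contain the string FAIL, so the test scripts themselves must print FAIL on failure.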

Step 5: Run STAX jobs on the test clients

From the desktop1 machine, which in STAX is called the monitoring machine, send a carefully crafted STAF command to the test management machine, telling it to run the client_test_harness.xml job:

STAF mgmt1 STAX EXECUTE FILE /QA/STAF/stax_jobs/client_test_harness.xml MACHINE mgmt1 SCRIPTFILE /QA/STAF/stax_jobs/global_vars.py JOBNAME "CLIENT_TEST_HARNESS" SCRIPT "VERSION='1.0.2'" CLEARLOGS Enabled

The above incantation runs a STAF command by specifying a service (STAX) and a request (EXECUTE), then passing various arguments to the request, the most common ones being a FILE (the path to the job XML file), a MACHINE to run the job file on (mgmt1), and a JOBNAME (which can be any string value).

Two other arguments, entirely optional, are Python-specific:

  • SCRIPTFILE -- points to a Python file whose code will be interpreted after the code in the top-level <script> element of the job file; in my example, the global_vars.py file contains definitions of Python variables that will override the variables defined in the job's <script> element
  • SCRIPT -- can contain any inline Python code, which will be interpreted after any code in the job's top-level <script> element, and after any code in the SCRIPTFILE; in my example, the VERSION variable is set to 1.0.2 on the command line via the SCRIPT argument, because it is retrieved from the nightly build email notification, and thus is not known in advance. The value 1.0.2 will override whatever values are given in the <script> element and in global_vars.py

To summarize, a SCRIPTFILE file is commonly used as a "static" repository for Python variables that are used across several job files, whereas the SCRIPT inline code is used to pass "dynamic" values for Python variables on the command line.
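
One way to picture this layering is as three pieces of Python code executed in turn against the same namespace, each free to rebind names set by the previous one. The sketch below only models the order described above using exec; it is not how the STAX service is implemented, and the contents shown for global_vars.py are hypothetical.

```python
# Code from the job file's top-level <script> element (abridged).
job_script = """
VERSION = '1.0.1'
HARNESS_TIMER_DURATION = '60m'
"""

# Hypothetical contents of global_vars.py (the SCRIPTFILE).
scriptfile = """
HARNESS_TIMER_DURATION = '90m'
"""

# Inline code passed through the SCRIPT argument.
script_option = "VERSION='1.0.2'"


def evaluate_layers(*layers):
    # Run each layer in order in one shared namespace; later layers win.
    namespace = {}
    for code in layers:
        exec(code, namespace)
    # Drop the __builtins__ entry that exec adds to the namespace.
    return {k: v for k, v in namespace.items() if not k.startswith('__')}


settings = evaluate_layers(job_script, scriptfile, script_option)
print(settings)
```

In this model VERSION ends up with the value from the SCRIPT argument and HARNESS_TIMER_DURATION with the value from the SCRIPTFILE, which is the "static defaults, dynamic overrides" split summarized above.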

The above STAF command, if successful, returns an integer that represents the job ID. Based on this ID, we can query the log service on the STAX machine (mgmt1) by running this command:

STAF mgmt1 LOG QUERY MACHINE mgmt1 LOGNAME STAX_Job_jobID
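
When these commands are issued from a script rather than typed by hand, it helps to build them as argument lists. The sketch below assembles the two commands shown above. The helper names are hypothetical, the job ID 42 is made up for the demonstration, and nothing is executed: the lists would have to be handed to something like subprocess.run on a machine where the STAF executable is on the PATH.

```python
def stax_execute_cmd(stax_machine, job_file, job_name,
                     scriptfile=None, script=None):
    # Build the STAX EXECUTE request as an argument list.
    # Assumes the job file lives on the STAX machine, as in the example.
    cmd = ['STAF', stax_machine, 'STAX', 'EXECUTE',
           'FILE', job_file, 'MACHINE', stax_machine]
    if scriptfile:
        cmd += ['SCRIPTFILE', scriptfile]
    cmd += ['JOBNAME', job_name]
    if script:
        cmd += ['SCRIPT', script]
    cmd += ['CLEARLOGS', 'Enabled']
    return cmd


def log_query_cmd(stax_machine, job_id):
    # Build the LOG QUERY request for the log of a given job.
    return ['STAF', stax_machine, 'LOG', 'QUERY',
            'MACHINE', stax_machine,
            'LOGNAME', 'STAX_Job_%s' % job_id]


execute = stax_execute_cmd(
    'mgmt1', '/QA/STAF/stax_jobs/client_test_harness.xml',
    'CLIENT_TEST_HARNESS',
    scriptfile='/QA/STAF/stax_jobs/global_vars.py',
    script="VERSION='1.0.2'")
print(' '.join(execute))
print(' '.join(log_query_cmd('mgmt1', 42)))  # 42 is a made-up job ID
```

Keeping the arguments in a list avoids shell quoting problems with values such as the SCRIPT string, which itself contains quotes.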

STAX also offers a GUI monitoring tool called the STAX Job Monitor that is usually run on the monitoring machine (desktop1 in our example). The tool is a Java application that is started via the command line (java -jar STAXMon.jar) in the directory which contains the STAX service jar files. The Job Monitor displays the processes that are run within the job, as well as the test case information (test name, pass/fail status, duration) for each test in the test suite.

Conclusion

I will now show how all these steps fit together and give us the capability to run the automated smoke-test scenario I described in the beginning of this section.

  • A build completion message is sent to several distribution lists with a subject that contains the new version of the software
  • The build message is forwarded via a mail alias to an account on the test management machine
  • A .procmailrc file on the test management machine triggers a Python script that runs the "STAF mgmt1 STAX EXECUTE" command shown in the "Run STAX jobs on the test clients" step. The script then sits in a loop and periodically queries the log file (via the LOG QUERY command) for the new job identified by jobID. When it sees a line containing "Stop|JobID: jobID", the script sends an email message with the job log in its body and the test count (overall, pass and fail) in its subject
  • The PARALLELITERATE and ITERATE constructs available in STAX allow us to achieve both parallel and sequential operations for the test run: we run the test harness in parallel on all clients, then on each client we run the individual tests comprising the harness in a sequential order. Another very useful STAX construct is TIMER, which makes it very easy to time out the failed tests so that the whole test run is not held up
  • Since all the individual tests are written using our framework, all the test results are also saved in the Firebird database and can be easily inspected via a Web interface
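
The procmail-triggered script mentioned in the list above can be sketched roughly as follows. This is not the original script: the subject format, the regular expressions, and the helper names are assumptions, and the function that actually talks to STAF is passed in as a parameter so that the polling logic can be read and tried without a STAF installation.

```python
import re
import time


def version_from_subject(subject):
    # Assumes the version appears in the subject as x.y.z; the real
    # build notification format is not shown in this post.
    match = re.search(r'\d+\.\d+\.\d+', subject)
    return match.group(0) if match else None


def wait_for_job(job_id, query_log, poll_seconds=60, max_polls=120):
    # query_log is any callable returning the current job log as text,
    # for example a wrapper around the LOG QUERY command.
    # The \b keeps job 7 from matching the stop line of job 70.
    marker = re.compile(r'Stop\|JobID: %s\b' % re.escape(str(job_id)))
    for _ in range(max_polls):
        log_text = query_log(job_id)
        if marker.search(log_text):
            return log_text
        time.sleep(poll_seconds)
    return None  # the job did not finish within the allotted polls


if __name__ == '__main__':
    # Fake log source for demonstration: the job "finishes" on the third poll.
    polls = iter(['Start|JobID: 7', 'Start|JobID: 7', 'Stop|JobID: 7'])
    print(version_from_subject('Build 1.0.2 is ready'))
    print(wait_for_job(7, lambda job_id: next(polls), poll_seconds=0))
```

Capping the number of polls is a design choice of the sketch: without it, a job that never logs its stop line would keep the script looping forever.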
Two more things are worth mentioning:
  • Support for STAF/STAX is top-notch and comes via the staf-users mailing list from the IBM developers working on this project. I had two questions answered within an hour of each posting.
  • STAF/STAX is used as the test distribution platform for the Linux Test Project. The January 2005 issue of "Linux Journal" has an article on the Linux Test Project that mentions STAF/STAX.

9 comments:

Scott said...

Thanks for the overview.

I am just getting started with STAF/STAX. With XML and Python too. Do you have a sample of the Python file global_vars.py?

Thanks,

Scott

Grig Gheorghiu said...

Scott,

Send me an email to grig at gheorghiu dot net with what exactly you need and I'll reply.

Also, for a pure Python continuous integration system, I recommend buildbot. I have a couple of blog posts on it too.

Grig

Swathi said...

I found your articles very clear and interesting!. I have a question though, in reference to you posting on performance/load/stress testing concepts(whitebox vs blackbox, i guess); How exactly could one establish the performance goals and scenarios wrt. any software application one has developed?. I could as well email you about my project and questions in detail if necessary.Besides, apart from testing web applications, do you have any ideas to test java standalone applications too, in terms of load/stress testing?

Grig Gheorghiu said...

Hi, Swathi -- please add your questions in a comment and leave the comment in the appropriate post (i.e. the one on performance/stress/load testing for example) so that other readers of this blog can benefit from the discussion.

Grig

shi said...

Hello,

thank you for taking the time to write this tutorial. This tutorial did more to make me understand what staf/stax is about than all the official staf/stax documentation together (there's a huge amount of official documentation and it's quite useful too, once you have a basic mental model about why and how you'd use staf/stax).

In my search for syntax highlighting a stax specificiation, I found out nothing existed and I was happy to find out that creating a syntax highlighting definition for stax files in vim is not so hard.

I've posted the result in the staf open discussion forum:
http://sourceforge.net/forum/forum.php?thread_id=3380653&forum_id=104045

arulPrakash said...

superb tut

Seth said...

sorry for comment on a really old post...
what do you think is a good replacement (if any) for STAF nowadays. I have some machines that still use this and am wondering if maybe there's a better way (no problems, just in a review phase is all) I came across your tutorial and thought perhaps you've moved on to something else?
Thanks,
Seth

Grig Gheorghiu said...

Seth -- I haven't used STAF/STAX in a while, it's true, but the project is still actively maintained. I'm not aware of other systems that accomplish exactly the same functionality as STAF/STAX, but many people these days include their automated test suite runs in continuous integration systems such as Hudson.

Seth said...

funny you mention hudson, I actually use it to run these jobs through STAF. For some test jobs the hudson slave is enough but for some others I like the control I have over the STAF client process. Its a bit confusing since some jobs are remote via hudson and some are local for hudson but remote via STAF - I'll leave well enough alone for now. Its always a tough call to decide when to consider that there might be a better way. thanks.