Before I discuss platform-specific issues, I want to mention timeouts. If you run a command that takes a long time on the buildbot slave, you need to increase the default timeout (1200 sec. = 20 min.) for the corresponding ShellCommand definition in the buildmaster's master.cfg file. Otherwise, the master will mark that command as failed once the timeout expires. To change the timeout, add a keyword argument such as timeout=3600 to the ShellCommand (or derived class) instantiation in master.cfg. For example, I have this line in the builders section of my master.cfg file:
client_smoke_tests = s(ClientSmokeTests, command="%s/buildbot/run_smoke_tests.py" % BUILDBOT_PATH, timeout=3600)
where ClientSmokeTests is a class I derived from ShellCommand (for details, see my previous post on buildbot).
Buildbot on Windows
My setup: Windows 2003 server, Active Python 2.4.2
Issue with subprocess module: I couldn't use the subprocess module to run commands on the slave. I got errors such as these:
p = Popen(arglist, stdout=PIPE, stderr=STDOUT)
File "C:\Python24\lib\subprocess.py", line 533, in __init__
(p2cread, p2cwrite,
File "C:\Python24\lib\subprocess.py", line 593, in _get_handles
p2cread = self._make_inheritable(p2cread)
File "C:\Python24\lib\subprocess.py", line 634, in _make_inheritable
DUPLICATE_SAME_ACCESS)
TypeError: an integer is required
I didn't have much time to spend troubleshooting this, so I ended up replacing the calls to subprocess with calls to popen2.popen3(). This solved the problem.
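For reference, a traceback like this one (a failure in _make_inheritable while setting up the child's stdin) is commonly reported when the parent process has no usable standard input handle on Windows. A workaround that is often suggested, and that was not tried on this setup, is to pass all three standard streams explicitly so that nothing is inherited from the parent. The sketch below only illustrates that idea; run_command is a hypothetical helper name, not part of buildbot or of my scripts.

```python
import subprocess
import sys

def run_command(arglist):
    """Run arglist and return (exit_code, combined stdout/stderr output)."""
    p = subprocess.Popen(
        arglist,
        stdin=subprocess.PIPE,     # explicit stdin, so no handle is inherited
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # fold stderr into stdout, as in the failing call
    )
    # communicate() closes the child's stdin, reads all output, and waits for exit.
    output, _ = p.communicate()
    return p.returncode, output

if __name__ == "__main__":
    # Assumed example command: run the current Python interpreter.
    code, out = run_command([sys.executable, "-c", "print('hello')"])
    print(code)
```

The helper returns the exit code alongside the output, which is what a build step needs in order to report success or failure.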
Also, I'm not currently running the buildbot process as a Windows service, although it's on my TODO list. I wrote a simple .bat file which I called startbot.bat:
buildbot start C:\qa\pylts\buildbot\QA
To start buildbot, I launched startbot.bat from the command prompt and I left it running.
Note that on Windows, the buildbot script gets installed in C:\Python24\scripts, and there is also a buildbot.bat batch file in the same scripts directory, which calls the buildbot script.
Issue with buildbot.bat: it contains a hardcoded path to Python23. I had to change that to Python24 so that it correctly finds the buildbot script in C:\Python24\scripts.
Buildbot on Solaris
My setup: one Solaris 9 SPARC server, one Solaris 10 SPARC server, both running Python 2.3.3
Issue with ZopeInterface on Solaris 10: when I tried to install ZopeInterface via 'easy_install http://www.zope.org/Products/ZopeInterface/3.1.0c1/ZopeInterface-3.1.0c1.tgz', a compilation step failed with:
/usr/include/sys/wait.h:86: error: parse error before "siginfo_t"
A Google search revealed that this was a gcc-related issue specific to Solaris 10. Based on this post, I ran:
# cd /usr/local/lib/gcc-lib/sparc-sun-solaris2.10/3.3.2/install-tools
# ./mkheaders
After these steps, I was able to install ZopeInterface and the rest of the packages required by buildbot.
For reference, here is what I have on the Solaris 10 box in terms of gcc packages:
# pkginfo | grep -i gcc
system SFWgcc2 gcc-2 - GNU Compiler Collection
system SFWgcc2l gcc-2 - GNU Compiler Collection Runtime Libraries
system SFWgcc34 gcc-3.4.2 - GNU Compiler Collection
system SFWgcc34l gcc-3.4.2 - GNU Compiler Collection Runtime Libraries
application SMCgcc gcc
system SUNWgcc gcc - The GNU C compiler
system SUNWgccruntime GCC Runtime libraries
Here is what uname -a returns:
# uname -a
SunOS sunv2403 5.10 Generic sun4u sparc SUNW,Sun-Fire-V240
Issue with exit codes from child processes not being intercepted correctly: on both Solaris 9 and Solaris 10, buildbot didn't seem to correctly intercept the exit codes from the scripts running on the build slaves. I verified that the exit codes were correct by running the scripts at the command line, but within buildbot the scripts just hung as if they hadn't finished.
Some searches on the buildbot-devel mailing list later, I found the solution via this post: I replaced usepty = 1 with usepty = 0 in buildbot.tac on the Solaris slaves, then I restarted the buildbot process on the slaves, and everything was fine.
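If several slaves need the same change, the edit can be scripted. The following is a minimal sketch, assuming buildbot.tac contains a line of the form usepty = 1 as described above; disable_pty is a hypothetical helper name, and it only rewrites the text it is given.

```python
import re

# Matches a whole line of the form "usepty = 1" (any spacing around "=").
USEPTY_ON = re.compile(r"^([ \t]*usepty[ \t]*=[ \t]*)1[ \t]*$", re.MULTILINE)

def disable_pty(tac_text):
    """Return the buildbot.tac text with 'usepty = 1' changed to 'usepty = 0'."""
    return USEPTY_ON.sub(r"\g<1>0", tac_text)
```

To apply it, read the slave's buildbot.tac, pass the contents through disable_pty, write the result back, and restart the buildbot process on that slave. Files that already have usepty = 0 come back unchanged.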
Buildbot on AIX
My setup: AIX 5.2 on an IBM P510 server, Python 2.4.1
No problems here. Everything went smoothly.
5 comments:
I'm setting up a buildbot on Debian Etch and I'm going to post my problems here in the hopes that someone else will see them.
Once I got the buildbot running, the buildslave would carry out the first build and then get lost in a nasty connect/disconnect cycle. After fooling around with the keepalive for a while, I changed usePty to 0 and that seems to have worked. I now have a happy green build.
Moral of the story: don't be afraid to fool around with the usePty setting.
Matthew -- thanks for the comment, and for the solution. In my case, I had a different problem that was solved by the same usePty trick on Solaris. Good to see it worked for you.
I've found that on SLES and SUSE, /etc/profile needs to be sourced into the running environment for everything to run smoothly. You can get things to run without it, but it will be a lot of annoying work.
Dan -- thanks a lot for the hint. I'm sure it will be handy for somebody some day :-)
Grig
Hi there
You might want to see "Installing a Buildbot service on Windows" for information about setting up a Windows service for Buildbot.
Cheers
JP