Monday, July 06, 2009
Thursday, July 02, 2009
Dark launching and other lessons from Facebook on massive deployments
In the note I saw for the first time a name for a strategy that teams I've been involved with have applied before: 'dark launching'. Essentially, dark launching is releasing a new feature to a subset of your users, mostly with no UI changes, but otherwise exercising all the parts of your infrastructure involved in serving that feature. A good strategy to apply when you're dealing with massive, large-scale deployments, and when you want to see how your infrastructure behaves in conditions that are as close to production as possible. Because remember, there's nothing like production! Your careful load/stress testing exercises in a lab environment ain't gonna cut it.
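To make the idea concrete, here is a minimal Python sketch of a dark-launch gate. It is not taken from the Facebook note: the function names, the "new-feature" label and the 5% default are all assumptions chosen for illustration. The point it shows is that a stable hash puts each user in or out of the cohort, and that the new backend is exercised while its result is thrown away.

```python
import hashlib


def in_dark_launch(user_id: str, feature: str, percent: float) -> bool:
    """Return True if this user falls inside the dark-launch cohort.

    Hashing the feature name together with the user id gives every user a
    stable bucket in [0, 10000), so the same user always gets the same
    answer, and different features select different subsets of users.
    """
    if percent <= 0:
        return False
    if percent >= 100:
        return True
    digest = hashlib.sha256(f"{feature}:{user_id}".encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 10000
    return bucket < percent * 100


def handle_request(user_id: str, backend_call, percent: float = 5.0) -> str:
    """Serve the normal page; quietly exercise the new backend for the cohort.

    backend_call is a hypothetical callable standing in for the new feature's
    server-side work (queries, cache fills, and so on).
    """
    if in_dark_launch(user_id, "new-feature", percent):
        try:
            backend_call(user_id)  # result is discarded: no UI change
        except Exception:
            pass  # a dark-launched code path must never break the real page
    return "regular page"
```

Raising the percentage in small steps while watching the infrastructure graphs is one way to use such a gate; setting it to 0 turns the feature off without a new deployment.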
The note from Facebook has all sorts of other nuggets of wisdom related to massive infrastructure deployments. I recommend subscribing to the RSS feed for 'Engineering @ Facebook's Notes'.
While googling for 'dark launching', I also came across this very good post by Dare Obasanjo. Recommended reading.
Tuesday, June 30, 2009
A redbot from mnot
redbot == Resource Expert Droid == a testing tool written in Python that "checks HTTP resources to see how they use HTTP, makes suggestions, and finds common protocol mistakes"
Use this tool if you want to see how well your Web application does in terms of HTTP connections, content negotiation, caching and validation. Very useful both for traditional Web sites and for Web services.
Monday, June 29, 2009
Presentations from AWS Start-Up Event
Until John's slides are posted to Slideshare, here are 2 other presentations, one from eHarmony on the way they use Elastic MapReduce, and one from Amazon's own Jinesh Varia on 'Architecting for the AWS cloud'. Interesting stuff. Check out this Slideshare page for more links to other AWS-related presentations.
Thursday, May 21, 2009
The Second Law of Automated Testing
In particular, I want to present here what I claim to be...
The Second Law of Automated Testing
"If you ship versioned software, you need automated tests."
At the talk last night I was waiting to be asked about the first law of automated testing, but nobody ventured to ask that question ;-) (for the record, my answer would have been 'you need to buy me a beer to find that out').
But I strongly believe that if you have software that SHIPS and that is VERSIONED, then you need automated tests for it. Why? Because how would you know otherwise that version 1.4 didn't break things horribly compared to version 1.3? You either employ an army of testers to manually test each and every 1.3 feature that is present in 1.4, or you use a strong suite of automated regression tests that cover all major features of 1.3 and that show you right away if any were broken in 1.4. Your choice.
Notice that I also qualify the software as 'software that ships'. This implies that you hopefully use sound software engineering processes and techniques to build that software. I am not referring to toy projects, or 1-page Web sites for temporary events, or even academic projects that are never shipped widely to users. All these can probably survive with no automated tests.
If you think you have some software that ships and is versioned, but you found that you're doing very well with no automated tests, I'd like to hear about it, so please leave a comment.
Friday, May 15, 2009
MySQL fault-tolerance and disaster recovery techniques
The most common fault-tolerance scenario in a MySQL environment is to have a master database server and a pool of load-balanced slave database servers. Hopefully your application is configurable so it can write to the master DB and read from the slave DB pool. If it is not, you can still use this technique (with some limitations) by going through MySQL Proxy, as detailed in another blog post of mine.
There is plenty of documentation available on setting up MySQL replication. I will jot down here some notes on things I find myself doing over and over again, in a condensed format that hopefully will benefit others too.
Step 0 is to enable binary logging on the master database. That's all you need to do for a MySQL DB server to be able to function as a master. To achieve this, you can add lines like these in /etc/my.cnf and restart mysqld:
server-id = 1
log-bin = /var/lib/mysql/mysql-bin
One other option you might want to set up is the binlog format. For recent MySQL versions, the default is STATEMENT. For some types of updates to the master, I found it is better to specify ROW as the binlog format (for an explanation of the differences between the 2 types, and for more info than you ever wanted about binary logging, see the official documentation):
binlog_format = ROW
You also need to create a MySQL user on the master DB and grant it REPLICATION SLAVE rights. You can use a statement like this:
GRANT REPLICATION SLAVE ON *.* TO 'replicant'@'IP_of_slave_DB' IDENTIFIED BY 'somepassword';
Setting up a MySQL slave when you can lock tables on the master
This is the recommended way of setting up a MySQL slave DB machine. It requires locking the tables for writes on the master DB, which is something you may or may not be able to afford. Here are the steps you need to go through:
1) Connect to the master DB server and issue this command:
FLUSH TABLES WITH READ LOCK;
2) Note the binlog file name and position on the master by running this command:
SHOW MASTER STATUS;
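+------------------+----------+--------------+------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB |
+------------------+----------+--------------+------------------+
| mysql-bin.000004 | 87547369 |              |                  |
+------------------+----------+--------------+------------------+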
1 row in set (0.01 sec)
3) Leave the current mysql session open so that the tables are still locked on the master, and in a different session take a database dump of the mysql database and of the application database on the master. You can use a command line such as:
mysqldump -u root -p$MY_ROOT_PW --databases mysql \
--lock-all-tables | /bin/gzip > mysql.sql.gz
mysqldump -u root -p$MY_ROOT_PW --databases $MYDB \
--lock-all-tables | /bin/gzip > $MYDB.sql.gz
4) Once the dump is done (a process which on a very large database can take hours), go ahead and unlock the tables in the first MySQL session:
UNLOCK TABLES;
5) Now you're ready to set up a MySQL slave database. It's a good idea to set up binary logging on all your slaves, so that if your master DB fails, any slave can be promoted to a master. If you do turn binary logging on, do NOT also enable log-slave-updates (because if you do, and if you promote a slave to a master, then the other slaves might receive some updates twice -- complete explanation available here).
The DB machine you want to set up as a slave should have lines similar to these in its /etc/my.cnf file (server-id needs to be different from the master ID and any other slave IDs that talk to the same master):
server-id = 2
log-bin = /var/lib/mysql/mysql-bin
binlog_format = ROW
6) On the machine you want to set up as a slave, load the mysql dumps of the mysql DB and of your application database (the ones you took in step 3). Note that you may need to create the application database before you can load the application DB dump into it.
7) On the slave, fire up a mysql prompt and use the 'CHANGE MASTER TO' command to specify the master DB, the binlog file and the binlog position (you need to use the values from step 2):
STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO
MASTER_HOST='master_database_server_name',
MASTER_USER='replicant',
MASTER_PASSWORD='somepassword',
MASTER_LOG_FILE='mysql-bin.000004',
MASTER_LOG_POS=87547369;
START SLAVE;
8) Run the 'SHOW SLAVE STATUS \G' command on the newly created slave DB and make sure that the values for both Slave_IO_Running and Slave_SQL_Running show as YES, and that Seconds_Behind_Master is 0 (it can take a while initially for this value to converge to 0, but it should do so). Here is an example of the output of this command:
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: my_master_host
Master_User: replicant
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000004
Read_Master_Log_Pos: 157767054
Relay_Log_File: crt-relay-bin.000012
Relay_Log_Pos: 112340434
Relay_Master_Log_File: mysql-bin.000004
Slave_IO_Running: Yes
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table: MYDB.tmp\_%
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 112340289
Relay_Log_Space: 112340630
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: Yes
Master_SSL_CA_File: /etc/pki/tls/cert.pem
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
1 row in set (0.00 sec)
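The check in step 8 is easy to automate. Below is a rough Python sketch that parses the text of the vertical output shown above and reports what looks wrong. The function names and the max_lag_seconds parameter are assumptions made for this example; how you obtain the text (for instance by running the mysql client from a cron job) is left out.

```python
def parse_slave_status(text: str) -> dict:
    """Turn the 'Key: value' lines of vertical SHOW SLAVE STATUS output into a dict."""
    status = {}
    for line in text.splitlines():
        # Skip the row separator and the trailing "1 row in set" line.
        if ":" not in line or line.lstrip().startswith("*"):
            continue
        key, _, value = line.partition(":")
        status[key.strip()] = value.strip()
    return status


def slave_problems(status: dict, max_lag_seconds: int = 0) -> list:
    """Return a list of human-readable problems; an empty list means healthy."""
    problems = []
    for thread in ("Slave_IO_Running", "Slave_SQL_Running"):
        if status.get(thread) != "Yes":
            problems.append(f"{thread} is {status.get(thread)!r}")
    lag = status.get("Seconds_Behind_Master")
    if lag is None or not lag.isdigit():
        # The field is reported as NULL when replication is not running.
        problems.append(f"Seconds_Behind_Master is {lag!r}")
    elif int(lag) > max_lag_seconds:
        problems.append(f"slave is {lag} seconds behind the master")
    return problems
```

A new slave that is still catching up will fail the default check of zero lag, so a larger max_lag_seconds is probably a better fit for routine monitoring.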
Note that I am explicitly excluding from replication tables that start with tmp, which in my case are temporary tables created by certain operations on the master DB which are not needed on the slaves. To do this, I added this line to /etc/my.cnf on the slaves (all replication filtering is done at the slave level):
replicate-wild-ignore-table = MYDB.tmp\_%
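The pattern in that option uses the same wildcards as SQL LIKE: % matches any run of characters, _ matches exactly one character, and a backslash makes the next character literal. That is why the underscore is escaped above. The Python sketch below is only an approximation written for illustration (it is case-sensitive, and the function names are made up); it shows which table names the pattern would and would not catch.

```python
import re


def wild_pattern_to_regex(pattern: str):
    """Translate a LIKE-style pattern (%, _, backslash escapes) into a regex."""
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            # Escaped character: match it literally.
            parts.append(re.escape(pattern[i + 1]))
            i += 2
            continue
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def is_ignored(db_table: str, pattern: str) -> bool:
    """True if 'db.table' matches the whole pattern."""
    return wild_pattern_to_regex(pattern).fullmatch(db_table) is not None
```

With the pattern from my.cnf, is_ignored("MYDB.tmp_sessions", r"MYDB.tmp\_%") is true, while "MYDB.tmpXsessions" is replicated as usual. Without the backslash, the underscore would match any single character and both tables would be skipped.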
Promoting a slave database to master
Let's say disaster strikes and your master DB goes down. At this point, if you have replication set up as above, you can easily turn one of the slave DB machines into a master, and reconfigure the other slaves to have this newly promoted machine as their master. The official documentation for this scenario is here and it's very good. Let's say you have master M01 and slaves S01, S02 and S03. M01 dies. You want to promote slave S01 to master, and set up S02 and S03 to replicate from S01.
On S01, run these commands at the MySQL prompt:
STOP SLAVE;
RESET MASTER;
CHANGE MASTER TO MASTER_HOST='';
On S02 and S03, run these commands at the MySQL prompt:
STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO MASTER_HOST='S01';
START SLAVE;
Now if you run 'SHOW SLAVE STATUS\G' on the slaves, you should see no errors, and you should also see the master DB hostname shown as 'S01' instead of 'M01'.
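If you keep a runbook for this, it can help to generate the statements per host rather than retype them under pressure. The following Python sketch only builds the list of statements from the procedure above; it does not connect to anything, and the function name and return shape are assumptions for illustration.

```python
def promotion_plan(new_master: str, other_slaves: list) -> dict:
    """Map each host to the statements to run on it, in order."""
    plan = {
        new_master: [
            "STOP SLAVE;",
            "RESET MASTER;",
            "CHANGE MASTER TO MASTER_HOST='';",
        ]
    }
    for slave in other_slaves:
        if slave == new_master:
            raise ValueError("the new master cannot also be one of its slaves")
        plan[slave] = [
            "STOP SLAVE;",
            "RESET SLAVE;",
            f"CHANGE MASTER TO MASTER_HOST='{new_master}';",
            "START SLAVE;",
        ]
    return plan
```

For the example in this section, promotion_plan("S01", ["S02", "S03"]) returns three entries, with S01 first.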
While we're on the subject of switching the master DB, it can happen that the slave DBs will get some updates from the newly promoted master that will conflict with their current view of the database. For example, they can receive from the master a duplicate insert, or a delete on a row that doesn't exist in their database. In these cases, to bring the slave to a sane state, you can issue commands like this one, where N is 1 or 2 (see full explanation here):
STOP SLAVE;
SET GLOBAL SQL_SLAVE_SKIP_COUNTER = N;
START SLAVE;
You can try running the skip command repeatedly until the slave goes back to a successful replication state.
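Since every skipped event is a change the slave never applies, it seems wise to put a cap on the retries. Here is a small Python sketch of such a loop. run_sql and slave_is_healthy are hypothetical callables you would supply (for example thin wrappers around your MySQL client library); nothing here is tied to a specific driver.

```python
def skip_until_healthy(run_sql, slave_is_healthy, max_attempts: int = 10, n: int = 1) -> int:
    """Skip n events at a time until replication recovers.

    Returns the number of skip rounds issued. Raises RuntimeError if the
    slave is still unhealthy after max_attempts rounds, at which point
    rebuilding the slave is probably the safer option.
    """
    attempts = 0
    while not slave_is_healthy():
        if attempts >= max_attempts:
            raise RuntimeError(f"slave still broken after {attempts} skip rounds")
        run_sql("STOP SLAVE;")
        run_sql(f"SET GLOBAL SQL_SLAVE_SKIP_COUNTER = {n};")
        run_sql("START SLAVE;")
        attempts += 1
    return attempts
```

Logging each skipped event before moving on makes it possible to reconcile the data by hand later.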
Setting up a slave database from a hot backup of the master
Let's say you have your master database up and running, and you want to set up a new slave without locking the tables for writes on the master. In this case, you can use a product such as InnoDB Hot Backup, which is very much worth its $500/year/host price. What's more, they provide a 30-day free evaluation binary tied to the host name of your DB machine, which is nice if you need something in a critical situation, or if you want to test it before committing to pay.
Here's a procedure for setting up a new slave DB from a hot backup on the master. The InnoDB Hot Backup documentation is very good, and what follows is a subset I used from that documentation.
1) On the master, create two mini configuration files which are tiny subsets of my.cnf. Call one for example my.cnf.source and the other one my.cnf.destination. The source file needs to contain lines similar to these referring to the location of your live MySQL installation:
# cat /etc/my.cnf.source
[mysqld]
datadir = /var/lib/mysql/
innodb_data_home_dir = /var/lib/mysql/
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /var/lib/mysql/
set-variable = innodb_log_files_in_group=2
set-variable = innodb_log_file_size=512M
The destination file needs to contain similar lines, but pointing to a directory where the backup files will be created (that directory needs to be empty). For example:
# cat /etc/my.cnf.destination
[mysqld]
datadir = /var/hot-backups
innodb_data_home_dir = /var/hot-backups
innodb_data_file_path = ibdata1:10M:autoextend
innodb_log_group_home_dir = /var/hot-backups
set-variable = innodb_log_files_in_group=2
set-variable = innodb_log_file_size=512M
2) On the master, run the ibbackup binary and point it to the 2 configuration files:
# /path/to/ibbackup /etc/my.cnf.source /etc/my.cnf.destination
This step can be quite lengthy, depending on the size of your database, but note that you don't need to lock any tables on the master during this time. Upon the completion of this step, you should see an InnoDB data file (its name is the one you specified in the innodb_data_file_path variable in the config files), and an InnoDB transaction log called ibbackup_logfile. Note that this is not identical to the InnoDB logs on the master. To create those logs, you need to go to the next step.
3) On the master, apply the transaction logs created by the hot backup process by running this command:
# /path/to/ibbackup --apply-log /etc/my.cnf.destination
When this is done (again it can take a while), you should see N log files called ib_logfile0, ib_logfile1, ..., ib_logfile(N-1) in the destination directory -- where N is the value of the variable innodb_log_files_in_group that you set in the configuration file.
4) On the master, do a tar.gz of all directories in the MySQL datadir which contain MyISAM tables, or .frm files for InnoDB tables (the main one being of course the mysql directory, containing the MyISAM tables for the mysql database -- assuming of course you've kept the default of MyISAM for the mysql DB).
5) Now you're ready to transfer the data file created in step 2, the log files created in step 3, and the archives created in step 4 to a new machine running MySQL, which you intend to set up as a slave DB. Simply scp the files over. On the target machine, stop mysql, move /var/lib/mysql (or wherever your datadir is) to /var/lib/mysql.bak, create a brand new /var/lib/mysql directory and drop all the files you transferred into that directory (un-tar-ing the tar.gz files appropriately). Also run 'chown -R mysql.mysql /var/lib/mysql'. Finally, make sure the my.cnf file on the slave has binlog enabled (in case you ever need to promote this slave to a master).
6) Restart the mysqld process on the target machine, and make note of the binlog file and position, which are captured in the mysql log file. You should see a line similar to this:
InnoDB: Last MySQL binlog file position 0 6199825, file name /var/lib/mysql/mysql-bin.000008
Now go to the mysql prompt on the target machine and run:
STOP SLAVE;
RESET SLAVE;
CHANGE MASTER TO
MASTER_HOST='master_database_server_name',
MASTER_USER='replicant',
MASTER_PASSWORD='somepassword',
MASTER_LOG_FILE='mysql-bin.000008',
MASTER_LOG_POS=6199825;
START SLAVE;
At this point, 'SHOW SLAVE STATUS\G' should show no errors, and the new slave should be replicating correctly from the master DB server. It may take a while for the slave to catch up, depending on when you took the hot backup on the master.
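Copying the binlog coordinates out of the log by hand in step 6 is error-prone, so here is a rough Python sketch that pulls them out of the log text. The function name is made up for this example. One assumption to flag: the two numbers in the log line are treated as the high and low 32 bits of the position, which gives the same value used above whenever the first number is 0.

```python
import posixpath
import re

_BINLOG_LINE = re.compile(
    r"Last MySQL binlog file position\s+(\d+)\s+(\d+),\s+file name\s+(\S+)"
)


def binlog_coordinates(log_text: str):
    """Return (binlog_file_name, position) from the last matching log line.

    Returns None if the log text contains no such line.
    """
    matches = _BINLOG_LINE.findall(log_text)
    if not matches:
        return None
    high, low, path = matches[-1]
    # Assumption: high/low are the two 32-bit halves of the binlog offset.
    position = (int(high) << 32) + int(low)
    # CHANGE MASTER TO wants the bare file name, not the full path.
    return posixpath.basename(path), position
```

For the line shown in step 6 this yields ("mysql-bin.000008", 6199825), which are the values for MASTER_LOG_FILE and MASTER_LOG_POS.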
Before I finish this post, one word of advice when it comes to mounting EBS volumes in EC2: do not mount /var by itself on an EBS, because if for some reason the EBS becomes unavailable or fails, you won't be able to ssh back into your instance. Why is that? Because sshd (at least in CentOS) needs /var/empty to be available for privilege separation purposes.
If you want to take advantage of an EBS on an EC2 instance functioning as a MySQL database server, it's better to either mount /var/lib/mysql on an EBS, or specify a non-default data directory for MySQL, which you then mount from an EBS.
UPDATE: EC2 backup strategies
An anonymous comment reminded me that I need to also discuss backups. Doh. In an EC2 environment, it's very easy to back up a whole EBS volume by means of a snapshot.
Of course, if you do a snapshot with no other backups, the database files will be 'live', but I managed in one case to
1) detach an EBS containing /var/lib/mysql from an instance that was failing, and
2) attach the EBS to another instance and mount it in /var/lib/mysql
I then restarted mysqld on the new instance and everything worked as expected. This is NOT the recommended strategy however. What is recommended is to do a database dump (either a hot backup if you can afford it, or a simple mysqldump) to an EBS, and snapshot the EBS periodically.
Alternatively, you can use various S3 utilities to capture the backups directly to S3. The EBS snapshot solution is better IMO because you can quickly recreate an EBS volume from a snapshot, then mount it to either the original instance, or to a new instance.
However, EBS volumes DO sometimes fail, so another thing to think about is to run your EC2 instances (especially your slave DBs) in different availability zones. We had an issue with 2 of our database servers failing at the same time in zone US-East-1a due to EBS issues, and the thing that saved us is that we had slaves in other availability zones that weren't affected.
Thursday, April 30, 2009
MySQL load balancing and read-write splitting with MySQL Proxy
MySQL Proxy is a simple program that sits between your client and MySQL server(s) that can monitor, analyze or transform their communication. Its flexibility allows for unlimited uses; common ones include: load balancing; failover; query analysis; query filtering and modification; and many more.
Two fairly common usage scenarios for MySQL Proxy are:
1) load balancing across MySQL slaves
2) splitting reads and writes so that reads go to the slave DB servers and writes go to the master DB server
Of course, you don't need MySQL Proxy to accomplish these goals. For slave load balancing, you can use a regular load balancer in front of your slaves. For read-write splitting, you can have your application use different DB servers for reads and writes....but that may require significant changes to your application.
If you want to make things faster in terms of read performance by sending reads to a pool of slave DB servers, while still sending writes to a master DB, AND do all this without modifying your application, then MySQL Proxy might be just the ticket for you. Before you go down that path, let me say that if you make heavy use of MySQL prepared statements, you might be out of luck. In my testing, MySQL Proxy did not support prepared statements well.
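For comparison, this is roughly what read-write splitting looks like when it is done in the application instead of in a proxy. It is a deliberately naive Python sketch: the class name is made up, it only looks at the first keyword of each statement, and it ignores transactions and replication lag, which a real implementation would have to handle.

```python
import itertools


class ReadWriteRouter:
    """Pick a backend per statement: reads rotate over slaves, writes go to the master."""

    READ_PREFIXES = ("select", "show")

    def __init__(self, master: str, slaves: list):
        self.master = master
        # itertools.cycle gives simple round-robin over the slave pool.
        self._slaves = itertools.cycle(slaves) if slaves else None

    def backend_for(self, sql: str) -> str:
        stripped = sql.lstrip().lower()
        # SELECT ... FOR UPDATE takes locks, so treat it as a write.
        is_read = stripped.startswith(self.READ_PREFIXES) and "for update" not in stripped
        if is_read and self._slaves is not None:
            return next(self._slaves)
        return self.master
```

Every place in the application that opens a connection would have to go through something like backend_for(), which is exactly the kind of change a proxy lets you avoid.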
Here's a short tutorial on using MySQL Proxy:
1) Download the binary package from the download page. I tried to install it from source, but I ran into some mysterious link issues with Lua libraries. If you didn't know already, MySQL Proxy uses Lua as its scripting language for doing the tricks it's capable of doing; not sure why the authors chose Lua, I suspect it's because of its compactness (the binary version of MySQL Proxy includes the Lua interpreter, so you don't need to install Lua separately.)
In my case I downloaded mysql-proxy-0.6.0-linux-rhas3-x86_64.tar.gz, untar-ed it in ROOT_DIR, then created a symlink called mysql-proxy in ROOT_DIR pointing to mysql-proxy-0.6.0-linux-rhas3-x86_64.
The actual binary is in ROOT_DIR/mysql-proxy/sbin and it's called mysql-proxy. You can run it with --help to see what command-line options it takes.
2) Run mysql-proxy and let it do both slave load balancing and read/write splitting. Load balancing is achieved by specifying the command-line switch --proxy-read-only-backend-addresses, while r/w splitting is achieved by specifying on the command line the script rw-splitting.lua, which is in ROOT_DIR/mysql-proxy/share/mysql-proxy.
Here is a script that I use to run mysql-proxy with the options I need, and in daemon mode. The master DB server is specified with the --proxy-backend-addresses cmdline switch. An important bit in the script is setting LUA_PATH and pointing it to the directory containing the Lua scripts. If you don't do it, the rw-splitting.lua script won't be found, and you won't know about it until you hit mysql-proxy. You'll then see errors around the script not being found.
Note that LUA_PATH is on the same line as the invocation of the mysql-proxy binary.
#!/bin/bash
MASTERDB=10.1.1.1
SLAVEDB01=10.2.1.1
SLAVEDB02=10.3.1.1
SLAVEDB03=10.4.1.1
ROOT_DIR=/usr/local
LUA_PATH="$ROOT_DIR/mysql-proxy/share/mysql-proxy/?.lua" $ROOT_DIR/mysql-proxy/sbin/mysql-proxy \
--daemon \
--proxy-backend-addresses=$MASTERDB:3306 \
--proxy-read-only-backend-addresses=$SLAVEDB01:3306 \
--proxy-read-only-backend-addresses=$SLAVEDB02:3306 \
--proxy-read-only-backend-addresses=$SLAVEDB03:3306 \
--proxy-lua-script=$ROOT_DIR/mysql-proxy/share/mysql-proxy/rw-splitting.lua
3) Now you have mysql-proxy running on its default port 4040 and ready for you to use. To use it, simply point your web application to 127.0.0.1:4040 instead of MASTER_DB_SERVER:3306. You can also connect to mysql-proxy with the regular mysql command-line client by running:
mysql -uroot -p -h127.0.0.1 -P 4040
4) To start mysql-proxy at boot time, here's a very simple init.d script which assumes you saved the script above in /var/scripts/run_mysql_proxy_rw_splitting.sh:
~# cat /etc/init.d/mysql-proxy
#!/bin/bash
#
# mysql-proxy: Start mysql-proxy in daemon mode
#
# Author: OpenX
#
# chkconfig: - 99 01
# description: Start mysql-proxy in daemon mode with r/w splitting
# processname: mysql-proxy
start(){
echo "Starting mysql-proxy..."
/var/scripts/run_mysql_proxy_rw_splitting.sh
}
stop(){
echo "Stopping mysql-proxy..."
killall mysql-proxy
}
case "$1" in
start)
start
;;
stop)
stop
;;
restart)
stop
start
;;
*)
echo "Usage: mysql-proxy {start|stop|restart}"
exit 1
esac
That's about it in a nutshell. There's much more to explore about the capabilities of MySQL Proxy, and I encourage you to read the main page and the articles linked to on that page. In terms of read/write splitting, the most helpful ones are these two blog posts by the author of rw-splitting.lua, Jan Kneschke. For general usage, this O'Reilly article by Giuseppe Maxia is very good.
Thursday, April 16, 2009
Check out OpenX Market
Saturday, April 04, 2009
Experiences deploying a large-scale infrastructure in Amazon EC2
Expect failures; what's more, embrace them
Things are bound to fail when you're dealing with large-scale deployments in any infrastructure setup, but especially when you're deploying virtual servers 'in the cloud', outside of your sphere of influence. You must then be prepared for things to fail. This is a Good Thing, because it forces you to think about failure scenarios upfront, and to design your system infrastructure in a way that minimizes single points of failure.
As an aside, I've been very impressed with the reliability of EC2. Like many other people, I didn't know what to expect, but I've been pleasantly surprised. Very rarely does an EC2 instance fail. In fact I haven't yet seen a total failure, only some instances that were marked as 'deteriorated'. When this happens, you usually get a heads-up via email, and you have a few days to migrate your instance, or launch a similar one and terminate the defective one.
Expecting things to fail at any time leads to and relies heavily on the next lesson learned, which is...
Fully automate your infrastructure deployments
There's simply no way around this. When you need to deal with tens and even hundreds of virtual instances, when you need to scale up and down on demand (after all, this is THE main promise of cloud computing!), then you need to fully automate your infrastructure deployment (servers, load balancers, storage, etc.)
The way we achieved this at OpenX was to write our own custom code on top of the EC2 API in order to launch and destroy AMIs and EBS volumes. We rolled our own AMI, which contains enough bootstrap code to make it 'call home' to a set of servers running slack. When we deploy a machine, we specify a list of slack 'roles' that the machine belongs to (for example 'web-server' or 'master-db-server' or 'slave-db-server'). When the machine boots up, it will run a script that belongs to that specific slack role. In this script we install everything the machine needs to do its job -- pre-requisite packages and the actual application with all its necessary configuration files.
I will blog separately about how exactly slack works for us, but let me just say that it is an extremely simple tool. It may seem overly simple, but that's exactly its strength, since it forces you to be creative with your postinstall scripts. I know that other people use puppet, or fabric, or cfengine. Whatever works for you, go ahead and use, just use SOME tool that helps with automated deployments.
The beauty of fully automating your deployments is that it truly allows you to scale infinitely (for some value of 'infinity' of course ;-). It almost goes without saying that your application infrastructure needs to be designed in such a way that allows this type of scaling. But having the building blocks necessary for automatically deploying any type of server that you need is invaluable.
Another thing we do which helps with automating various pieces of our infrastructure is that we keep information about our deployed instances in a database. This allows us to write tools that inspect the database and generate various configuration files (such as the all-important role configuration file used by slack), and other text files such as DNS zone files. This database becomes the one true source of information about our infrastructure. The DRY principle applies to system infrastructure, not only to software development.
Speaking of DNS, specifically in the context of Amazon EC2, it's worth rolling out your own internal DNS servers, with zones that aren't even registered publicly, but for which your internal DNS servers are authoritative. Then all communication within the EC2 cloud can happen via internal DNS names, as opposed to IP addresses. Trust me, your tired brain will thank you. This would be very hard to achieve though if you were to manually edit BIND zone files. Our approach is to automatically generate those files from the master database I mentioned. Works like a charm. Thanks to Jeff Roberts for coming up with this idea and implementing it.
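To give a flavor of generating files from the instance database, here is a simplified Python sketch. It is not our actual tooling: the record layout (name, private_ip, roles), the domain name and both function names are assumptions, and the output is plain zone-file A records plus a role-to-hosts mapping rather than slack's real configuration format.

```python
def zone_records(instances: list, domain: str = "internal.example") -> str:
    """Render one A record per instance, sorted by host name."""
    lines = []
    for inst in sorted(instances, key=lambda i: i["name"]):
        lines.append(f"{inst['name']}.{domain}. IN A {inst['private_ip']}")
    return "\n".join(lines) + "\n"


def role_map(instances: list) -> dict:
    """Group host names by role, the raw material for a role config file."""
    roles = {}
    for inst in instances:
        for role in inst["roles"]:
            roles.setdefault(role, []).append(inst["name"])
    # Sort host lists so regenerated files do not change order between runs.
    return {role: sorted(hosts) for role, hosts in roles.items()}
```

Because both outputs are derived from the same rows, adding an instance to the database is the only manual step; the zone file and the role file are regenerated from it.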
While we're on the subject of fully automated deployments, I'd like to throw an idea out there that I first heard from Mike Todd, my boss at OpenX, who is an ex-Googler. One of his goals is for us never to have to ssh into any production server. We deploy the server using slack, the application gets installed automatically, monitoring agents get set up automatically, so there should really be no need to manually do stuff on the server itself. If you want to make a change, you make it in a slack role on the master slack server, and it gets pushed to production. If the server misbehaves or gets out of line with the other servers, you simply terminate that server instance and launch another one. Since you have everything automated, it's one command line for terminating the instance, and another one for deploying a brand new replacement. It's really beautiful.
Design your infrastructure so that it scales horizontally
There are generally two ways to scale an infrastructure: vertically, by deploying your application on more powerful servers, and horizontally, by increasing the number of servers that support your application. For 'infinite' scaling in a cloud computing environment, you need to design your system infrastructure so that it scales horizontally. Otherwise you're bound to hit limits of individual servers that you will find very hard to get past. Horizontal scaling also eliminates single points of failure.
Here are a few ideas for deploying a Web site with a database back-end so that it uses multiple tiers, with each tier being able to scale horizontally:
1) Deploy multiple Web servers behind one or more load balancers. This is pretty standard these days, and this tier is the easiest to scale. However, you also want to maximize the work done by each Web server, so you need to find the sweet spot of that particular type of server in terms of httpd processes it can handle. Too few processes and you're wasting CPU/RAM on the server, too many and you're overloading the server. You also need to be cognizant of the fact that each EC2 instance costs you money. It can become so easy to launch a new instance that you don't necessarily think of getting the most out of the existing instances. Don't go wild unless absolutely necessary if you don't want to have a sticker shock when you get the bill from Amazon at the end of the month.
2) Deploy multiple load balancers. Amazon doesn't yet offer load balancers, so what we've been doing is using HAProxy-based load balancers. Let's say you have an HAProxy instance that handles traffic for www.yourdomain.com. If your Web site becomes wildly successful, it is imaginable that a single HAProxy instance will not be able to handle all the incoming network traffic. One easy solution for this, which is also useful for eliminating single points of failure, is to use round-robin DNS, pointing www.yourdomain.com to several IP addresses, with each IP address handled by a separate HAProxy instance. All HAProxy instances can be identical in terms of back-end configuration, so your Web server farm will get 1/N of the overall traffic from each of your N load balancers. It worked really well for us, and the traffic was spread out very uniformly among the HAProxies. You do need to make sure the TTL on the DNS record for www.yourdomain.com is low.
3) Deploy several database servers. If you're using MySQL, you can set up a master DB server for writes, and multiple slave DB servers for reads. The slave DBs can sit behind an HAProxy load balancer. In this scenario, you're limited by the capacity of the single master DB server. One thing you can do is to use sharding techniques, meaning you can partition the database into multiple instances that each handle writes for a subset of your application domain. Another thing you can do is to write to local databases deployed on the Web servers, either in memory or on disk, and then periodically write to the master DB server (of course, this assumes that you don't need that data right away; this technique is useful when you have to generate statistics or reports periodically for example).
4) Another way of dealing with databases is to not use them, or at least to avoid the overhead of making a database call each time you need something from the database. A common technique for this is to use memcache. Your application needs to be aware of memcache, but this is easy to implement in all of the popular programming languages. Once implemented, you can have your Web servers first check a value in memcache, and only if it's not there have them hit the database. The more memory you give to the memcached process, the better off you are.
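Two of the ideas in the list above, sharding writes across several databases and checking memcache before hitting the database, can be sketched in a few lines of Python. Both functions are illustrations under stated assumptions: the shard names are invented, and a plain dict stands in for a memcached client so the example runs without any server.

```python
import zlib


def shard_for(key: str, shards: list) -> str:
    """Map a key to one shard using a stable hash.

    crc32 is used instead of Python's built-in hash() because hash() of a
    string can differ between processes, which would scatter the same key
    across shards.
    """
    return shards[zlib.crc32(key.encode("utf-8")) % len(shards)]


def cached_lookup(key: str, cache: dict, load_from_db):
    """Cache-aside read: try the cache first, fall back to the database.

    cache is a dict standing in for memcached; load_from_db is a
    hypothetical callable that runs the real query.
    """
    if key in cache:
        return cache[key]
    value = load_from_db(key)
    cache[key] = value  # later readers skip the database entirely
    return value
```

One caveat with the modulo approach: changing the number of shards remaps most keys, so it suits setups where the shard count is fixed or data can be migrated in bulk.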
Establish clear measurable goals
The most common reason for scaling an Internet infrastructure is to handle increased Web traffic. However, you need to keep in mind the quality of the user experience, which means that you need to keep the response time of the pages you serve under a certain limit which will hopefully meet and surpass the user's expectations. I found it extremely useful to have a very simple script that measures the response time of certain pages and that graphs it inside a dashboard-type page (thanks to Mike Todd for the idea and the implementation). As we deployed more and more servers in order to keep up with the demands of increased traffic, we always kept an eye on our goal: keep response time/latency under N milliseconds (N will vary depending on your application). When we would see spikes in the latency chart, we knew we needed to act at some level of our infrastructure. And this brings me to the next point...
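A response-time probe of this kind can be very small. The Python sketch below is not the script we used; it is a guess at the simplest thing that could work, with made-up function names. The fetch and clock parameters exist only so the timing logic can be exercised without a network.

```python
import time
import urllib.request


def measure_ms(url: str, fetch=None, clock=time.perf_counter) -> float:
    """Return the wall-clock time in milliseconds to fetch url once."""
    if fetch is None:
        # Default: a plain HTTP GET that reads the whole body.
        fetch = lambda u: urllib.request.urlopen(u, timeout=10).read()
    start = clock()
    fetch(url)
    return (clock() - start) * 1000.0


def over_budget(samples_ms: list, budget_ms: float) -> list:
    """Return the samples that exceeded the latency goal."""
    return [s for s in samples_ms if s > budget_ms]
```

Run from cron every minute and appended to a file, the samples are enough to feed a simple chart, and over_budget() gives a cheap alert condition for the "under N milliseconds" goal.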
Be prepared to quickly identify and eliminate bottlenecks
As I already mentioned in the design section above, any large-scale Internet infrastructure will have different types of servers: web servers, application servers, database servers, memcache servers, and the list goes on. As you scale the servers at each tier/level, you need to be prepared to quickly identify bottlenecks. Examples:
1) Keep track of how many httpd processes are running on your Web servers; this depends on the values you set for MaxClients and ServerLimit in your Apache configuration files. If you're using an HAProxy-based load balancer, this also depends on the connection throttling that you might be doing at the backend server level. In any case, the more httpd processes are running on a given server, the more CPU and RAM they will use up. At some point, the server will run out of resources. At that point, you either need to scale the server up (by deploying to a larger EC2 instance, for example an m1.large with more RAM, or a c1.medium with more CPU), or you need to scale your Web server farm horizontally by adding more Web servers, so the load on each server decreases.
2) Keep track of the load on your database servers, and also of slow queries. A great tool for MySQL database servers is innotop, which allows you to see the slowest queries at a glance. Sometimes all it takes is a slow query to throw a spike into your latency chart (can you tell I've been there, done that?). Also keep track of the number of connections into your database servers. If you use MySQL, you will probably need to bump up the max_connections variable in order to be able to handle an increased number of concurrent connections from the Web servers into the database.
Since we're discussing database issues here, I'd be willing to bet that if you were to look for the single biggest bottleneck in your application, you would find it at the database layer. That's why it is especially important to design that layer with scalability in mind (think memcache, and load balanced read-only slaves), and also to monitor the database servers carefully, with an eye towards slow queries that need to be optimized (thanks to Chris Nutting for doing some amazing work in this area!).
3) Use your load balancer's statistics page to keep track of things such as concurrent connections, queued connections, HTTP request or response errors, etc. One of your goals should be never to see queued connections, since that means that some user requests couldn't be serviced in time.
I should mention that a good monitoring system is essential here. We're using Hyperic, and while I'm not happy at all with its limits (in the free version) in defining alerts at a global level, I really like its capabilities in presenting various metrics in both list and chart form: things like Apache bytes and requests served/second, memcached hit ratios, mysql connections, and many other statistics obtained by means of plugins specific to these services.
As you carefully watch various indicators of your systems' health, be prepared to....
Play whack-a-mole for a while, until things get stable
There's nothing like real-world network traffic, and I mean massive traffic -- we're talking hundreds of millions of hits/day -- to exercise your carefully crafted system infrastructure. I can almost guarantee that with all your planning, you'll still feel that a tsunami just hit you, and you'll scramble to solve one issue after another. For example, let's say you notice that your load balancer starts queuing HTTP requests. This means you don't have enough Web servers in the pool. You scramble to add more Web servers. But wait, this increases the number of connections to your database pool! What if you don't have enough servers there? You scramble to add more database servers. You also scramble to increase the memcache settings by giving more memory to memcached, so more items can be stored in the cache. What if you still see requests taking a long time to be serviced? You scramble to optimize slow database queries....and the list goes on.
You'll say that you've done lots of load testing before. This is very good....but it still will not prepare you for the sheer amount of traffic that the internets will throw at your application. That's when all the things I mentioned before -- automated deployment of new instances, charting of the important variables that you want to keep track of, quick identification of bottlenecks -- become very useful.
That's it for this installment. Stay tuned for more lessons learned, as I slowly and sometimes painfully learn them :-) Overall, it's been a blast though. I'm really happy with the infrastructure we've built, and of course with the fact that most if not all of our deployment tools are written in Python.
Tuesday, March 17, 2009
HAProxy and Apache performance tuning tips
Let's assume you have a cluster of Apache servers behind an HAProxy and you want to sustain 500 requests/second with low latency per request. First of all, you need to bump up MaxClients and ServerLimit in your Apache configuration, as I explained in another post. In this case you would set both variables to 500. Note that you actually need to stop and start the httpd service, because simply restarting it won't change the built-in limit (which is 256). Also ignore the warning that Apache gives you on startup:
WARNING: MaxClients of 500 exceeds ServerLimit value of 256 servers,
lowering MaxClients to 256. To increase, please see the ServerLimit
directive.
Note that the more httpd processes you have, the more CPU and RAM will be consumed on the server. You need to decide how much to push the envelope in terms of concurrent httpd processes you can sustain on a given server. A good measure is the latency / responsiveness you expect from your Web application. At some point, it will start to suffer, and that will be a sign that you need to add a new Web server to your server farm (of course, this over-simplifies things a bit, since there's always the question of the database layer; I'm assuming you can use memcache to minimize database access.) Here's a good overview of the trade-offs related to MaxClients.
Other Apache configuration variables I've tweaked are StartServers, MinSpareServers and MaxSpareServers. It sometimes pays to bump up the values for these variables, so you can have spare httpd processes waiting around for those peak times when the requests hitting your server suddenly increase. Again, there's a trade-off here between server resources and number of spare httpd processes you want to maintain.
Assuming you fine-tuned your Apache servers, it's time to tweak some variables in the HAProxy configuration. Perhaps the most important ones for our discussion are the number of maximum connections per server (maxconn), httpclose and abortonclose.
It's a good idea to throttle the maximum number of connections per server and set it to a number related to the request/second rate you're shooting for. In our case, that number is 500. Since HAProxy itself needs some connections for healthchecking and other internal bookkeeping, you should set the maxconn per server to something slightly lower than 500. In terms of syntax, I have something similar to this in the backend section of haproxy.cfg:
server server1 10.1.1.1:80 check maxconn 500
I also have the following 2 lines in the backend section:
option abortonclose
option httpclose
According to the official HAProxy documentation, here's what these options do:
option abortonclose
In presence of very high loads, the servers will take some time to respond.
The per-instance connection queue will inflate, and the response time will
increase respective to the size of the queue times the average per-session
response time. When clients will wait for more than a few seconds, they will
often hit the "STOP" button on their browser, leaving a useless request in
the queue, and slowing down other users, and the servers as well, because the
request will eventually be served, then aborted at the first error
encountered while delivering the response.
As there is no way to distinguish between a full STOP and a simple output
close on the client side, HTTP agents should be conservative and consider
that the client might only have closed its output channel while waiting for
the response. However, this introduces risks of congestion when lots of users
do the same, and is completely useless nowadays because probably no client at
all will close the session while waiting for the response. Some HTTP agents
support this behaviour (Squid, Apache, HAProxy), and others do not (TUX, most
hardware-based load balancers). So the probability for a closed input channel
to represent a user hitting the "STOP" button is close to 100%, and the risk
of being the single component to break rare but valid traffic is extremely
low, which adds to the temptation to be able to abort a session early while
still not served and not pollute the servers.
In HAProxy, the user can choose the desired behaviour using the option
"abortonclose". By default (without the option) the behaviour is HTTP
compliant and aborted requests will be served. But when the option is
specified, a session with an incoming channel closed will be aborted while
it is still possible, either pending in the queue for a connection slot, or
during the connection establishment if the server has not yet acknowledged
the connection request. This considerably reduces the queue size and the load
on saturated servers when users are tempted to click on STOP, which in turn
reduces the response time for other users.
option httpclose
As stated in section 2.1, HAProxy does not yet support the HTTP keep-alive
mode. So by default, if a client communicates with a server in this mode, it
will only analyze, log, and process the first request of each connection. To
workaround this limitation, it is possible to specify "option httpclose". It
will check if a "Connection: close" header is already set in each direction,
and will add one if missing. Each end should react to this by actively
closing the TCP connection after each transfer, thus resulting in a switch to
the HTTP close mode. Any "Connection" header different from "close" will also
be removed.
It seldom happens that some servers incorrectly ignore this header and do not
close the connection even though they reply "Connection: close". For this
reason, they are not compatible with older HTTP 1.0 browsers. If this
happens it is possible to use the "option forceclose" which actively closes
the request connection once the server responds.
And now for something completely different.....TCP stack tuning! Even with all the tuning above, we were still seeing occasional high latency numbers. Willy Tarreau to the rescue again....he was kind enough to troubleshoot things by means of the haproxy log and a tcpdump. It turned out that some of the TCP/IP-related OS variables were set too low. You can find out what those values are by running:
sysctl -a | grep ^net
In our case, the main one that was out of tune was:
net.ipv4.tcp_max_syn_backlog = 1024
Because of this, when there were more than 1,024 concurrent sessions on the machine running HAProxy, the OS had to recycle through the SYN backlog, causing the latency issues. Here are all the variables we set in /etc/sysctl.conf at the advice of Willy:
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 1024 65023
net.ipv4.tcp_max_syn_backlog = 10240
net.ipv4.tcp_max_tw_buckets = 400000
net.ipv4.tcp_max_orphans = 60000
net.ipv4.tcp_synack_retries = 3
net.core.somaxconn = 10000
(To have these values take effect, you need to run 'sysctl -p'.)
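If you want to check a box against these numbers without eyeballing the sysctl output, a small script helps. The sketch below is my own illustration, not something Willy provided: it only compares the settings from the list above where a higher value means more headroom, and the sample output is invented.

```python
# Subset of the values above where "too low" is the thing to look for.
RECOMMENDED = {
    "net.ipv4.tcp_max_syn_backlog": 10240,
    "net.ipv4.tcp_max_tw_buckets": 400000,
    "net.ipv4.tcp_max_orphans": 60000,
    "net.core.somaxconn": 10000,
}


def parse_sysctl(text):
    """Parse 'key = value' lines into a dict, keeping only integer values."""
    settings = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        try:
            settings[key.strip()] = int(value.strip())
        except ValueError:
            continue  # skip multi-valued settings such as ip_local_port_range
    return settings


def below_recommended(current, recommended=RECOMMENDED):
    """Return {name: (current, recommended)} for settings that are too low."""
    return {
        name: (current[name], wanted)
        for name, wanted in recommended.items()
        if name in current and current[name] < wanted
    }


# Invented sample of what 'sysctl -a | grep ^net' might print.
sample_output = """\
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.ip_local_port_range = 32768 61000
net.core.somaxconn = 10000
"""
too_low = below_recommended(parse_sysctl(sample_output))
```

In real use you would feed it the captured output of the sysctl command instead of the sample string.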
That's it for now. As I continue to use HAProxy in production, I'll report back with other tips/tricks/suggestions.
Wednesday, March 04, 2009
HAProxy, X-Forwarded-For, GeoIP, KeepAlive
Here's a mysterious issue that I recently solved with the help of my colleague Chris Nutting:
1) Apache/PHP server sitting behind an HAProxy instance
2) MaxMind's GeoIP module installed in Apache
3) Application making use of the geotargeting features offered by the GeoIP module was sometimes displaying those features in a drop-down, and sometimes not
It turns out that the application was using the X-Forwarded-For headers in the HTTP requests to pass the real source IP of the request to the mod_geoip module and thus obtain geotargeting information about that IP. However, mysteriously, HAProxy was sometimes (once out of every N requests) not sending the X-Forwarded-For headers at all. Why? Because KeepAlive was enabled in Apache, so HAProxy was sending those headers only on the first request of the HTTP connection that was being "kept alive". Subsequent requests in that connection didn't have those headers set, so those requests weren't identified properly by mod_geoip.
The solution in this case was to disable KeepAlive in Apache. Willy Tarreau, the author of HAProxy, also recommends setting 'option httpclose' in the HAProxy configuration file. Here's an excerpt from the official HAProxy documentation:
option forwardfor [ except <network> ] [ header <name> ]
....
It is important to note that as long as HAProxy does not support keep-alive
connections, only the first request of a connection will receive the header.
For this reason, it is important to ensure that "option httpclose" is set
when using this option.
I hope this post will be of some use to people who might run into this issue.
Tuesday, February 24, 2009
You're not a cloud provider if you don't provide an API
A short discussion on 'XaaS' nomenclature is in order here: 'aaS' stands for 'as a Service', and X can take various values, for example P==Platform, S==Software, I==Infrastructure. You will see these acronyms in pretty much every industry-sponsored article about cloud computing. Pundits seem to love this kind of stuff. When I talk about cloud providers in this post, I mean providers of 'Infrastructure as a Service', things like the ones I mentioned above -- virtual servers, networking and storage resources, in short the low-level plumbing of an infrastructure.
A good example of 'Platform as a Service' is Google AppEngine, which offers both a development environment (right now Python-specific), and an API to interact with the 'Google cloud' when deploying your GAE application.
'Software as a Service' is pretty much what 'ASP' used to be in the dot com days (ASP == Application Service Provider if you don't remember your acronyms). The poster child for SaaS these days seems to be salesforce.com. I do however emphasize that one significant difference between SaaS and ASP is that SaaS providers DO offer an API for your application to interact with the resources they expose.
So...the common thread between the XaaS offerings is the existence of an API which allows you, as a systems and/or application architect, to interact with and manage the resources offered by the particular provider.
I've been using two cloud APIs here at OpenX, one from AppNexus and one from Amazon EC2. The AppNexus API allows you to reserve physical servers, start up, shut down and delete virtual instances on each server, clone a virtual instance, manage load balancer pools and SSL certificates at the LB level, etc. In short, it's a very solid and easy to use API.
The Amazon EC2 API is more fine grained than the one from AppNexus, which can be an advantage, but also makes it hard to coordinate the management of various resources. For example, to launch an EC2 instance you first need to create a keypair, potentially a security group, maybe an EBS volume and an elastic IP, and only then can you tie everything together via yet other EC2 API calls. For this reason, we're building our own tools around the Amazon API, tools which allow us to deploy an instance with all its associated resources via a single command-line script (and yes, we call this collection of tools the MCP). We're also using slack to deploy specific packages and applications to each instance we launch, but that's a topic for another post.
So what does all this mean to you as a systems or application architect? For a system administrator, I think it means that you need to shore up your programming skills so that you will be able to take advantage of these APIs and automate the deployment, testing and scaling of your infrastructure. For an application architect, it means that you need to shore up your sysadmin skills so you can understand the lower-level resources exposed by cloud APIs and use them to your full advantage. I think the future is bright for people who possess both sets of skills.
Tuesday, February 17, 2009
Helping the 'printable world wide web' movement
BTW, here's all I had to add to my Blogger template to make the content printable:
<style type="text/css">
@media print {
#sidebar, #navbar-iframe, #blog-header,
#comments h4, #comments-block, #footer,
span.statcounter, #b-backlink {display: none;}
#wrap, #content, #main-content {width: 100%; margin: 0; background: #FFFFFF;}
}
</style>
Wednesday, February 04, 2009
Load Balancing in Amazon EC2 with HAProxy
Installation
I installed HAProxy via yum. Here's the version that was installed using the default CentOS repositories on a CentOS 5.x box:
# yum list installed | grep haproxy
haproxy.i386 1.3.14.6-1.el5 installed
The RPM installs an init.d service called haproxy that you can use to start/stop the haproxy process.
Basic Configuration
In true Unix fashion, all configuration is done via a text file: /etc/haproxy/haproxy.cfg. It's very important that you read the documentation for the configuration file. The official documentation for HAProxy 1.3 is here.
Emulating virtual servers
In version 1.3, you can specify a frontend section, which defines an IP address/port pair for requests coming into the load balancer (think of it as a way to specify a virtual server/virtual port pair on a traditional load balancer), and multiple backend sections for each frontend, which correspond to the real IP addresses and ports of the backend servers handling the requests. If you can assign multiple external IP addresses to your HAProxy server, then you can have each one of these IPs function as a virtual server (via a frontend declaration), sending traffic to real servers declared in a backend.
However, one fairly large limitation of EC2 instances is that you only get one external IP address per instance. This means that you can have HAProxy listen on port 80 on a single IP address in EC2. How then can you have multiple 'virtual servers' on an EC2 HAProxy load balancer? The answer is in a new feature of HAProxy called ACLs.
Here's what the official documentation says:
2.3) Using ACLs
---------------
The use of Access Control Lists (ACL) provides a flexible solution to perform
content switching and generally to take decisions based on content extracted
from the request, the response or any environmental status. The principle is
simple :
- define test criteria with sets of values
- perform actions only if a set of tests is valid
The actions generally consist in blocking the request, or selecting a backend.
So let's say for example that you want to handle both www.example1.com and www.example2.com using the same HAProxy instance, but you want to load balance traffic for www.example1.com to server1 and server2 with IP addresses 192.168.1.1 and 192.168.1.2, while traffic for www.example2.com gets load balanced to server3 and server4 with IP addresses 10.0.0.3 and 10.0.0.4. Traffic for other domains will be sent to a default backend.
First, you define a frontend section in haproxy.cfg similar to this:
frontend myfrontend *:80
log global
maxconn 25000
option forwardfor
acl acl_example1 url_sub example1
acl acl_example2 url_sub example2
use_backend example1_farm if acl_example1
use_backend example2_farm if acl_example2
default_backend default_farm
This tells haproxy that there are 2 ACLs defined -- one called acl_example1, which is triggered if the incoming HTTP request is for a URL that contains the expression 'example1', and one called acl_example2, which is triggered if the incoming HTTP request is for a URL that contains the expression 'example2'.
If acl_example1 is triggered, the backend used will be example1_farm. If acl_example2 is triggered, the backend used will be example2_farm. If no ACL is triggered, the default backend used will be default_farm.
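This is the simplest form of ACLs. HAProxy supports many more, and you're strongly advised to read the ACL section in the documentation for a more in-depth discussion. However, the URL-based ACLs are very useful especially in an EC2 environment. One caveat: url_sub matches a substring of the request URL, which for most requests carries only the path and query string. To switch on the requested hostname itself, an ACL on the Host header, for example one using hdr_end(host), is the more direct test.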
The backend sections of haproxy.cfg will look similar to this:
backend example1_farm
mode http
balance roundrobin
server server1 192.168.1.1:80 check
server server2 192.168.1.2:80 check
backend example2_farm
mode http
balance roundrobin
server server3 10.0.0.3:80 check
server server4 10.0.0.4:80 check
backend default_farm
mode http
balance roundrobin
server server5 192.168.1.5:80 check
server server6 192.168.1.6:80 check
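To make the selection logic concrete, here is a toy model of it in Python. This is not how HAProxy is implemented; it only mimics the behavior described above, where the first matching use_backend rule wins and the default backend catches everything else. The sample URLs are made up, and I'm assuming the string being tested contains the site name.

```python
def choose_backend(url, acls, default_backend):
    """Return the backend of the first ACL whose substring occurs in the URL."""
    for substring, backend in acls:
        if substring in url:
            return backend
    return default_backend


# Same pairs as the acl / use_backend lines in the frontend section.
ACLS = [
    ("example1", "example1_farm"),
    ("example2", "example2_farm"),
]

farm_a = choose_backend("http://www.example1.com/index.html", ACLS, "default_farm")
farm_b = choose_backend("http://www.example2.com/about", ACLS, "default_farm")
farm_c = choose_backend("http://www.other.org/", ACLS, "default_farm")
```

Note that substring matching is greedy in its own way: a URL that happens to contain both 'example1' and 'example2' goes to whichever rule is listed first.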
Logging
You can have haproxy log to syslog, but first you need to allow syslog to receive UDP traffic from 127.0.0.1 on port 514. I'll discuss syslog-ng here, with its configuration file in /etc/syslog-ng/syslog-ng.conf. To allow the UDP traffic I mention, add the line 'udp(ip(127.0.0.1) port(514));' to the source s_sys section, which in my case looks like this:
source s_sys {
file ("/proc/kmsg" log_prefix("kernel: "));
unix-stream ("/dev/log");
internal();
udp(ip(127.0.0.1) port(514));
};
Also add a filter for facility local0:
filter f_filter9 { facility(local0); };
And finally associate that filter with the d_mesg destination, which sends messages to /var/log/messages:
log { source(s_sys); filter(f_filter9); destination(d_mesg); };
Restart syslog-ng via its init.d script.
Now for the HAProxy configuration -- you need to have a line similar to this in the 'global' section of haproxy.cfg:
global
log 127.0.0.1 local0 info
This tells haproxy to log to facility 'local0' on the localhost using the severity 'info'. You could send logs to a remote syslog server just as well.
Once you define this in the global section, you can specify the logging mechanism either in the defaults section (which means that all frontends will log in this way), or on a frontend-by-frontend basis. If you want to have it in the defaults section, just write:
defaults
log global
Once you restart haproxy, you should see messages like this in /var/log/messages:
Feb 2 22:39:49 127.0.0.1 haproxy[19150]: Connect from A.B.C.D:44463 to 10.0.0.1:80 (your_frontend_name/HTTP)
However, if you're handling HTTP traffic and you would like to see the exact HTTP requests handled by HAProxy, you need to add this line either to the defaults section, or to a specific frontend:
option httplog
In this case, the log will contain lines that look like a regular Apache combined log line.
A caveat: if you do enable logging in httplog mode, make sure /var has lots of disk space. If your HAProxy will handle a lot of traffic, the messages file will become very large, very fast. Just don't have /var be part of the typically small / partition, or you can be in a world of trouble.
Logging the client source IP in the backend web logs
One issue with load balancers and reverse proxies is that the backend servers will see traffic as always originating from the IP address of the LB or reverse proxy. This is obviously a problem when you're trying to get stats from your web logs. To mitigate this issue, many LBs/proxies use the X-Forwarded-For header to send the IP address of the client to the destination server. HAProxy offers this functionality via the forwardfor option. You can simply declare
option forwardfor
in your backend, and all your backend servers will receive the X-Forwarded-For header.
Of course, you also have to tell your Web server to handle this header in its log file. In Apache you need to modify the LogFormat directive and replace %h with %{X-Forwarded-For}i.
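If your application (rather than just the Apache log) needs the client address, the same header has to be read in code. Here is a minimal sketch in Python, assuming the request headers arrive as a plain dict with the exact key 'X-Forwarded-For'; the function name and the sample addresses are mine.

```python
def client_ip(headers, peer_ip):
    """Pick the client address: first X-Forwarded-For entry, else the TCP peer."""
    forwarded = headers.get("X-Forwarded-For", "")
    # The header may hold a comma-separated chain; the first entry is the
    # address reported for the original client.
    first = forwarded.split(",")[0].strip()
    return first or peer_ip


via_lb = client_ip({"X-Forwarded-For": "203.0.113.7, 10.0.0.1"}, peer_ip="10.0.0.1")
no_header = client_ip({}, peer_ip="10.0.0.1")
```

The fallback matters: when the header is missing, the best you can do is the address of the load balancer itself, which is exactly the symptom you see in the logs when the header isn't being forwarded.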
SSL
To handle SSL traffic in HAProxy, you need 3 things:
1) Define a frontend with a unique name which handles *:443
2) Send traffic to real_server_IP_1:443 through real_server_IP_N:443 in the backend(s) associated with the frontend
3) Specify 'mode tcp' instead of 'mode http' both in the frontend section and in the backend section(s) which handle port 443. Otherwise you won't see any SSL traffic hitting your real servers, and you'll wonder why....
Load balancing algorithms
HAProxy can handle several load balancing algorithms:
- round-robin: requests are rotated among the servers in the backend -- note that servers declared in the backend section also accept a weight parameter which specifies their relative weight in that backend; the round-robin algorithm will respect that weight ratio
- leastconn: the request is sent to the server with the lowest number of connections; round-robin is used if servers are similarly loaded
- source: a hash of the source IP is divided by the total weight of the running servers to determine which server will receive the request; this ensures that clients from the same IP address always hit the same server, which is a poor man's session persistence solution
- uri: the part of the URL up to a question mark is hashed and used to choose a server that will handle the request; this is useful when you want certain sub-parts of your web site to be served by certain servers (this is used with proxy caches to maximize the cache hit rate)
- url_param: can be used to check certain parts of the URL, for example values sent via POST requests; for example a request which specifies a user_id parameter with a certain value can get directed to the same server using the url_param method -- so this is another form of achieving session persistence in some cases (see the documentation for more details)
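Two of these algorithms are simple enough to sketch in Python. Treat this as an illustration of the ideas only: HAProxy's real round-robin interleaves weighted servers more smoothly, and its source hash is not CRC32 (I picked zlib.crc32 just to get a stable number). The server names and weights are made up.

```python
import itertools
import zlib


def weighted_round_robin(servers):
    """Return an endless iterator where each server appears 'weight' times per cycle."""
    expanded = [name for name, weight in servers for _ in range(weight)]
    return itertools.cycle(expanded)


def pick_by_source(client_ip, servers):
    """Map a client IP to a server: hash modulo total weight, then walk the weights."""
    total = sum(weight for _, weight in servers)
    slot = zlib.crc32(client_ip.encode("ascii")) % total
    for name, weight in servers:
        if slot < weight:
            return name
        slot -= weight


SERVERS = [("server1", 2), ("server2", 1)]
rotation = list(itertools.islice(weighted_round_robin(SERVERS), 6))
sticky = pick_by_source("198.51.100.9", SERVERS)
```

The point of the second function is that the same IP always lands in the same slot, as long as the set of running servers and their weights doesn't change.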
If you're OK with the fact that not all client browsers accept cookies, and you still want to use cookies as a session persistence mechanism, then HAProxy offers an easy way to do so. If you add this line to the backend section:
cookie SERVERID insert nocache indirect
then you're telling HAProxy to insert a cookie named SERVERID in the HTTP response; the cookie is sent to the client browser via a Set-Cookie header in the response, and is sent back by the client in a Cookie header in all subsequent requests. Note that this cookie is only a session cookie, and will not be written to disk by the client browser. For this reason, and for issues related to caching, the documentation recommends specifying the other 2 options 'nocache' and 'indirect'. In particular, 'indirect' means that the cookie will be removed from the HTTP request once it is processed by HAProxy, so your application running on the backend servers will never see it.
Once you define the cookie, you need to associate it with the servers in the backend, like this:
server server1 10.1.1.1:80 cookie server01 check
server server2 10.1.1.2:80 cookie server02 check
If a client request is initially sent to server serverN, HAProxy will insert a SERVERID cookie with the value configured for serverN in the response. In the requests that follow, the client will send back this SERVERID in the cookie and hence will be directed to the same server for the duration of the session.
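Here is the persistence logic as a toy Python function, using the cookie values from the two server lines above. It only models the decision (reuse the server named by the cookie, otherwise pick one and set the cookie); the `route` function and the `pick_server` callback are hypothetical names of mine, not anything in HAProxy.

```python
COOKIE_NAME = "SERVERID"
# Cookie value -> server name, as declared on the 'server' lines.
COOKIE_TO_SERVER = {"server01": "server1", "server02": "server2"}


def route(request_cookies, pick_server):
    """Return (server, cookie_value_to_set_or_None) for one request."""
    value = request_cookies.get(COOKIE_NAME)
    if value in COOKIE_TO_SERVER:
        return COOKIE_TO_SERVER[value], None  # sticky: reuse the same server
    server = pick_server()  # no usable cookie: fall back to load balancing
    for cookie_value, name in COOKIE_TO_SERVER.items():
        if name == server:
            return server, cookie_value  # first visit: ask the client to remember it
    raise ValueError("unknown server: %s" % server)


first_visit = route({}, pick_server=lambda: "server2")
next_visit = route({"SERVERID": "server02"}, pick_server=lambda: "server1")
```

On the second call the balancer would have chosen server1, but the cookie wins and the client stays on server2.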
Server health checks
HAProxy verifies the health of the servers declared in the backend section by sending them periodic HTTP requests. You need to specify 'check' in the server declaration line. Here is the appropriate section from the official documentation:
check
This option enables health checks on the server. By default, a server is
always considered available. If "check" is set, the server will receive
periodic health checks to ensure that it is really able to serve requests.
The default address and port to send the tests to are those of the server,
and the default source is the same as the one defined in the backend. It is
possible to change the address using the "addr" parameter, the port using the
"port" parameter, the source address using the "source" address, and the
interval and timers using the "inter", "rise" and "fall" parameters. The
request method is define in the backend using the "httpchk", "smtpchk",
and "ssl-hello-chk" options. Please refer to those options and parameters for
more information.
Performance tuning
Section 1.2 of the official documentation details the variables you can set to tweak maximum performance out of your HAProxy. The only parameter I found critical so far is maxconn, which in some of the sample configuration files was set to 2,000. This means that if HAProxy is hit with more than 2,000 concurrent connections, only the first 2,000 will be serviced, and the subsequent ones will be queued. For this reason, I recommend you set maxconn to a high number (such as 25,000 for example) in all the sections of your haproxy.cfg file: default, frontend and backend.
From what I've seen so far, the performance of HAProxy itself is very satisfactory. Even on an EC2 m1.small instance, HAProxy took less than 1% CPU for a web site we maintain that was hit with around 20,000 connections. I can guarantee that you will discover many other bottlenecks in your infrastructure long before HAProxy itself becomes your bottleneck. The only caveat in all this is the maxconn parameter above, which you do need to set to a high value to avoid unnecessary throttling of connections at the HAProxy layer.
Utilization statistics
HAProxy offers very nice utilization statistics, with tables showing the servers in all declared backends. Here's what these tables look like:
| my_website | |||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Queue | Sessions | Bytes | Denied | Errors | Warnings | Server | |||||||||||||||||||
| Cur | Max | Limit | Cur | Max | Limit | Total | LbTot | In | Out | Req | Resp | Req | Conn | Resp | Retr | Redis | Status | Wght | Act | Bck | Chk | Dwn | Dwntme | Thrtle | |
| server01 | 0 | 0 | - | 12 | 704 | - | 177432 | 175934 | 132022286 | 108951418 | 0 | 56 | 72 | 1498 | 1h6m UP | 1 | Y | - | 86 | 10 | 56s | - | |||
| server02 | 0 | 0 | - | 13 | 716 | - | 177364 | 176665 | 132655773 | 110034272 | 0 | 13 | 218 | 699 | 10m48s UP | 1 | Y | - | 54 | 5 | 25s | ||||
To enable stats, add lines such as these to either the 'defaults' section, or to a specific backend section:
stats enable
stats uri /lb?stats
stats realm Haproxy\ Statistics
stats auth myusername:mypassword
Then hit http://external.ip.of.haproxy/lb?stats and you'll be presented with a basic HTTP authentication dialog. Log in with the credentials you specified.
High-availability strategies
In an ideal situation, you would have 2 HAProxy instances using a heartbeat-type protocol and sharing an external IP address. In case one of them goes down, the other one would assume the IP and your site will be available at all times. You could use Linux-HA, or Wackamole and the Spread toolkit. However, this is not possible in Amazon EC2 because IP addresses cannot be shared among instances in the manner that heartbeat-type protocols expect.
What you can do instead is to use an Elastic IP and associate it with your HAProxy instance. Then you can have another stand-by HAProxy instance kept in sync with the live one (only the haproxy.cfg needs to be rsync-ed across). Your monitoring system can then detect when the live HAProxy instance goes down, and automatically assign the Elastic IP address to the other instance using for example the EC2 API Tools command ec2-associate-address.
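The monitoring side of this can be sketched as a small Python function. Everything here is an assumption made for illustration: the instance IDs are invented, `is_healthy` would be whatever check your monitoring system runs, and `reassign_ip` stands in for the step that actually moves the Elastic IP (for example by invoking ec2-associate-address).

```python
def monitor_once(state, is_healthy, reassign_ip, max_failures=3):
    """Run one monitoring pass; fail over after max_failures consecutive bad checks."""
    if is_healthy(state["live"]):
        state["failures"] = 0
        return False
    state["failures"] += 1
    if state["failures"] < max_failures:
        return False  # could be a blip; wait for more evidence
    # Move the Elastic IP to the stand-by instance and swap the roles.
    reassign_ip(state["standby"])
    state["live"], state["standby"] = state["standby"], state["live"]
    state["failures"] = 0
    return True


moved_to = []  # records which instance received the Elastic IP
state = {"live": "i-live", "standby": "i-standby", "failures": 0}
results = [monitor_once(state, lambda inst: False, moved_to.append) for _ in range(3)]
```

Requiring a few consecutive failures before moving the IP is a design choice: it trades a slightly longer outage for protection against flapping on a single missed check.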
Tuesday, January 27, 2009
Tadalist -- simple but powerful task management
In Tadalist you can only do a few things: create a list, add an item to a list, edit the list title, edit the description of an item, check off an item as done, and reorder the items in the list. It turns out this is really all you need. I especially like the feeling of checking off an item and seeing it drop to the bottom of the list, in smaller font, joining the list of tasks that are DONE! It's almost as addictive as seeing those dots when you run unit tests. Reordering items is also a very nice feature, because an item that wasn't so hot yesterday can become really critical today, in which case you want it at the top of the list.
One feature I'd like to see is for checked off items to also get a timestamp, so you can go back and see when exactly you completed a given task.
If you're not using any task management software (in which case I hope you're still using old-school pen and paper), then give Tadalist a try.
BTW -- what task management software have YOU used successfully? Please leave a comment.
Saturday, January 24, 2009
Book review: "Pro Django" by Marty Alchin
If you are serious about developing Web applications in Django, then "Pro Django" will be a great addition to your technical library. Note, however, that the "Pro" in the title really means "professional", "in-depth", at times even "obscure" -- so please, do not pick up this book if you're just starting out with Django. To really get the most out of this book, you need to already have at least one, and preferably several Django applications under your belt.
I personally just finished a small fun project for my daughter's 8th grade Science Fair: a Web site written of course in Django where her friends can take a fun science-related quiz and see if they improve their score the second time around, after being told the correct answer for each question. It was my first Django application, and I used the online documentation and tutorial (both very good), as well as the online Django book. I also used Sams' "Teach yourself Django in 24 hours" by Brad Dayley, which was very helpful for a beginner like me.
I say all this because in reading "Pro Django", you need to be already familiar with the core concepts of Django: models, views, templates, forms, and the all-important URL configuration. You won't get a feel for these concepts unless you actually start writing a Web application and understand the hard way how everything fits together. Once you have this understanding, and if you want to continue on the path of creating more Django apps, it's time for you to pick up "Pro Django".
Marty Alchin wastes no time in delving into aspects of Python that many people (including me) typically don't use to their full potential: metaclasses, introspection, decorators, descriptors. In fact, the themes of introspection, customization and extension (which all take advantage of the dynamic nature of Python) keep coming up in almost every chapter of the book.
For example, the 'Models' chapter shows how to subclass model fields and how the use of metaclasses allows a field to know its name, and the class it was assigned to. The chapter also talks about the nifty technique of creating models dynamically at runtime. The 'URLs and Views' chapter goes into the gory details of the Django URL configuration mechanism, and shows how to use decorators to make views as generic as possible.
My favorite chapter was 'Handling HTTP'. It exemplifies what for me is the best part about Alchin's book: showing readers where and how to insert their own advanced processing code into the hooks provided by Django, without disturbing the flow of the framework. This is typically one of the hard parts of learning a Web framework, and Marty Alchin does a great job of explaining how to achieve a maximum of effect with a minimum of effort in this area, for example by writing your own middleware modules and inserting them into Django.
I also liked the last two chapters, 'Coordinating applications' and 'Enhancing applications', which show practical examples of code aggregated in mini-applications. In fact, this is also the main gripe I have about this book: I wish the author used more mini-applications throughout the book to explain the advanced concepts he described. He did show code snippets for each concept, but they were all isolated, and sometimes hard to place into the context of an application. I realize that space was limited, but it would have been so much nicer to see a real application being built and described throughout the book, with more and more functionality added at each stage.
Overall, I really enjoyed reading "Pro Django". However, reading such a book is just a start. What I really need to do is to start writing code and applying some of the new techniques I learned. I can't wait to do it!
Wednesday, January 21, 2009
Watch that Apache KeepAlive setting!
If, however, your Apache server handles small individual resources (such as images), then KeepAlive is overkill, since it will make every TCP connection linger for N seconds. Given a lot of clients, this can quickly saturate your Apache server in terms of network connections.
So...if you have a decent server that doesn't seem to be overloaded in terms of CPU/memory, yet Apache is slow-to-unresponsive, check out the KeepAlive directive and try setting it to Off. Note that the default value is On.
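To see what a given configuration file currently says, something like the following sketch can help. It only looks at one file (it does not follow Include directives), and the function name is made up for this example.

```shell
# Print the KeepAlive value found in an Apache config file.
# The last uncommented directive wins; if there is none, report the
# default value (On).
keepalive_setting() {
    awk 'tolower($1) == "keepalive" { v = $2 }
         END { print (v == "" ? "On" : v) }' "$1"
}

# Usage (path is the Fedora/CentOS default; adjust for your distro):
#   keepalive_setting /etc/httpd/conf/httpd.conf
```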
More Apache performance tuning tips are in the official Apache documentation.
Sunday, January 04, 2009
Happy New Year and....Teach Me Web Testing!
Now for the 'Teach Me Web Testing' part: Steve Holden graciously offered to be the host of an Open Space at PyCon 2009 on this topic. Steve started the 'Teach Me...' series at the last PyCon, with his now famous 'Teach Me Twisted' session.
For this format to work, we need to put together an audience which is formed of at least 3 types of people:
1) people interested in learning about Web testing in Python
2) people who write Python Web testing tools for fun and profit
3) people who use Python Web testing tools extensively for fun and profit
My role here is to rally people in categories 2 and 3. So if you're either a Web testing tool author or somebody who uses Web testing tools extensively in your job, please either comment on this post, or send me email at grig at gheorghiu dot net and let me know if you'd be interested in attending this Open Space session. Knowing Steve, I can guarantee it will be LOTS of fun.
Tuesday, December 16, 2008
Some issues when restoring files using duplicity
By default, duplicity will use the system default temporary directory, which on Unix is usually /tmp. If you have insufficient disk space in /tmp for the files you're trying to restore from S3, the restore operation will eventually fail with "IOError: [Errno 28] No space left on device".
One thing you can do is create another directory on a partition with lots of disk space, and specify that directory in the duplicity command line using the --tempdir command line option. Something like: /usr/local/bin/duplicity --tempdir=/lotsofspace/temp
However, it turns out that this is not sufficient. There's still a call to os.tmpfile() buried in the patchdir.py module installed by duplicity. Consequently, duplicity will still try to create temporary files in /tmp, and the restore operation will still fail. As a workaround, I solved the issue in a brute-force kind of way by editing /usr/local/lib/python2.5/site-packages/duplicity/patchdir.py (the path is obviously dependent on your Python installation directory) and replacing the line:
tempfp = os.tmpfile()
with the line:
tempfp, filename = tempdir.default().mkstemp_file()
(I also needed to import tempdir at the top of patchdir.py; tempdir is a module which is part of duplicity and which deals with temporary file and directory management -- I guess the author of duplicity just forgot to replace the call to os.tmpfile() with the proper calls to the tempdir methods such as mkstemp_file).
This solved the issue. I'll try to open a bug somehow with the duplicity author.
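Since the failure mode here is simply running out of room, a quick pre-flight check before a big restore can save some time. This is a small sketch using df; the function names and the 20 GB figure in the usage line are only examples.

```shell
# Print the free space, in KB, on the file system holding a directory.
free_kb() {
    # -P forces one line per file system; -k reports 1 KB blocks
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Succeed only if the directory has at least the required KB free.
has_space() {
    local dir=$1 needed_kb=$2
    [ "$(free_kb "$dir")" -ge "$needed_kb" ]
}

# Usage:
#   has_space /lotsofspace/temp $((20 * 1024 * 1024)) || echo "not enough room"
```

Without the patchdir.py change described above, /tmp needs the same check as the directory passed to --tempdir.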
Friday, December 12, 2008
Working with Amazon EC2 regions
Each region has several availability zones. You can see the current ones in this nice article from the AWS Developer Zone. The default region is us-east-1, with 3 availability zones (us-east-1a, 1b and 1c). If you don't specify a region when you call an EC2 API tool, then the tool will query the default region. That's why I was baffled when I tried to launch a new AMI in Europe; I was calling 'ec2-describe-availability-zones' and it was returning only the US ones. After reading the article I mentioned, I realized I need to have 2 versions of my scripts: the old one I had will deal with the default US-based region, and the new one will deal with the Europe region by adding '--region eu-west-1' to all EC2 API calls (you need the latest version of the EC2 API tools from here).
You can list the zones available in a given region by running:
# ec2-describe-availability-zones --region eu-west-1
AVAILABILITYZONE eu-west-1a available eu-west-1
AVAILABILITYZONE eu-west-1b available eu-west-1
Note that all AWS resources that you manage belong to a given region. So if you want to launch an AMI in Europe, you have to create a keypair in Europe, a security group in Europe, find available AMIs in Europe, and launch a given AMI in Europe. As I said, all this is accomplished by adding '--region eu-west-1' to all EC2 API calls in your scripts.
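Instead of keeping two copies of every script, one option is a thin wrapper that appends the region flag when asked. This is only a sketch: the function name ec2r and the EC2_REGION and DRY_RUN variables are inventions of this example, not something the EC2 API tools define.

```shell
# Run an EC2 API tool, appending --region when EC2_REGION is set.
# With DRY_RUN set, the command is printed instead of executed.
ec2r() {
    local tool=$1
    shift
    ${DRY_RUN:+echo} "$tool" "$@" ${EC2_REGION:+--region "$EC2_REGION"}
}

# Usage:
#   EC2_REGION=eu-west-1 ec2r ec2-describe-availability-zones
#   ec2r ec2-describe-availability-zones      # default US region
```

The same scripts can then target either region depending on how the variable is set in the environment.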
Another thing to note is that the regions are separated in terms of internal DNS too. While you can access instances within the same region based on their internal DNS names, this access doesn't work across regions. You need to use the external DNS name of an instance in Europe if you want to ssh into it from an instance in the US (and you also need to allow the external IP of the US instance to access port 22 in the security group for the European instance).
All this introduces more headaches from a management/automation point of view, but the benefits obviously outweigh the cost. You get low latency for your European customers, and you get more disaster recovery options.
Thursday, December 11, 2008
Deploying EC2 instances from the command line
After downloading and unpacking the EC2 API tools, you need to set the following environment variables in your .bash_profile file:
export EC2_HOME=/path/to/where/you/unpacked/the/tools/api
export EC2_PRIVATE_KEY=/path/to/pem/file/containing/your/ec2/private/key
export EC2_CERT=/path/to/pem/file/containing/your/ec2/cert
You also need to add $EC2_HOME/bin to your PATH, so the command-line tools can be found by your scripts.
At this point, you should be ready to run for example:
# ec2-describe-images -o amazon
which lists the AMIs available from Amazon.
If you manage more than a handful of running EC2 AMIs (AMI stands for Amazon Machine Image; I use the term loosely here for the instances launched from an image), it quickly becomes hard to keep track of them. When you look at them for example using the Firefox Elasticfox extension, it's very hard to tell which is which. One solution I found to this is to create a separate keypair for each AMI, and give the keypair a name that specifies the purpose of that AMI (for example mysite-db01). This way, you can eyeball the list of AMIs in Elasticfox and make sense of them.
So the very first step for me in launching and deploying a new AMI is to create a new keypair, using the ec2-add-keypair API call. Here's what I have, in a script called create_keypair.sh:
# cat create_keypair.sh
#!/bin/bash
KEYNAME=$1
if [ -z "$KEYNAME" ]
then
echo "You must specify a key name"
exit 1
fi
ec2-add-keypair $KEYNAME.keypair > ~/.ssh/$KEYNAME.pem
chmod 600 ~/.ssh/$KEYNAME.pem
Now I have a pem file called $KEYNAME.pem containing my private key, and Amazon has my public key called $KEYNAME.keypair.
The next step for me is to launch an 'm1.small' instance (the smallest instance you can get from EC2) whose AMI ID I know in advance (it's a 32-bit Fedora Core 8 image from Amazon with an AMI ID of ami-5647a33f). I am also using the key I just created. My script calls the ec2-run-instances API.
# cat launch_ami_small.sh
#!/bin/bash
KEYNAME=$1
if [ -z "$KEYNAME" ]
then
echo "You must specify a key name"
exit 1
fi
# We launch a Fedora Core 8 32 bit AMI from Amazon
ec2-run-instances ami-5647a33f -k $KEYNAME.keypair --instance-type m1.small -z us-east-1a
Note that the script makes some assumptions -- such as the fact that I want my AMI to reside in the us-east-1a availability zone. You can obviously add command-line parameters for the availability zone, and also for the instance type (which I intend to do when I rewrite this in Python).
Next, I create an EBS volume which I will attach to the AMI I just launched. My create_volume.sh script takes an optional argument which specifies the size in GB of the volume (and otherwise sets it to 50 GB):
# cat create_volume.sh
#!/bin/bash
SIZE=$1
if [ -z "$SIZE" ]
then
SIZE=50
fi
ec2-create-volume -s $SIZE -z us-east-1a
The volume should be created in the same availability zone as the instance you intend to attach it to -- in my case, us-east-1a.
My next step is to attach the volume to the instance I just launched. For this, I need to specify the instance ID and the volume ID -- both values are returned in the output of the calls to ec2-run-instances and ec2-create-volume respectively.
Here is my script:
# cat attach_volume_to_ami.sh
#!/bin/bash
VOLUME_ID=$1
AMI_ID=$2
if [ -z "$VOLUME_ID" ] || [ -z "$AMI_ID" ]
then
echo "You must specify a volume ID followed by an AMI ID"
exit 1
fi
ec2-attach-volume $VOLUME_ID -i $AMI_ID -d /dev/sdh
This attaches the volume I just created to the AMI I launched and makes it available as /dev/sdh.
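Rather than copying the two IDs by hand, they can be picked out of the tool output. The helpers below are a sketch that assumes the ID is the second field on the INSTANCE and VOLUME lines respectively; check this against the output of your version of the tools. The function names are made up for this example.

```shell
# Pull the instance ID out of ec2-run-instances output (INSTANCE line).
instance_id() { awk '$1 == "INSTANCE" { print $2; exit }'; }

# Pull the volume ID out of ec2-create-volume output (VOLUME line).
volume_id() { awk '$1 == "VOLUME" { print $2; exit }'; }

# Usage, chaining the steps above:
#   INSTANCE=$(ec2-run-instances ami-5647a33f -k mysite-db01.keypair \
#                --instance-type m1.small -z us-east-1a | instance_id)
#   VOLUME=$(ec2-create-volume -s 50 -z us-east-1a | volume_id)
#   ec2-attach-volume "$VOLUME" -i "$INSTANCE" -d /dev/sdh
```

In practice the attach step has to wait until the instance is running and the volume is available, so a real script would poll in between.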
The next script I use does a lot of stuff. It connects to the new AMI via ssh and performs a series of commands:
* format the EBS volume /dev/sdh as an ext3 file system
* mount /dev/sdh as /var2, and copy the contents of /var to /var2
* move /var to /var.orig, create new /var
* unmount /var2 and re-mount /dev/sdh as /var
* append an entry mounting /dev/sdh as /var to /etc/fstab so that it happens upon reboot
Before connecting via ssh to the new AMI, I need to know its internal DNS name or IP address. I use ec2-describe-instances to list all my running AMIs, then I copy and paste the internal DNS name of my newly launched instance (which I can isolate because I know the keypair name it runs with).
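The copy-and-paste step can be replaced with a small filter keyed on the keypair name. This sketch assumes the INSTANCE lines printed by ec2-describe-instances are tab-separated with the internal DNS name in the fifth field; that layout is an assumption here, so verify it against your own output. The function name is hypothetical.

```shell
# Print the internal DNS name of the instance launched with a given keypair.
# Reads ec2-describe-instances output on stdin.
# ASSUMPTION: tab-separated INSTANCE lines, internal DNS name in field 5.
internal_dns_for_key() {
    awk -F'\t' -v key="$1" '
        $1 == "INSTANCE" {
            for (i = 2; i <= NF; i++)
                if ($i == key) { print $5; break }
        }'
}

# Usage:
#   ec2-describe-instances | internal_dns_for_key mysite-db01.keypair
```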
Here is the script which formats and mounts the new EBS volume:
# cat format_mount_ebs_as_var_on_ami.sh
#!/bin/bash
AMI=$1
KEYNAME=$2
if [ -z "$AMI" ] || [ -z "$KEYNAME" ]
then
echo "You must specify an AMI DNS name or IP followed by a keypair name"
exit 1
fi
# Commands to run as root on the remote instance
CMD='mkdir /var2; mkfs.ext3 /dev/sdh; mount -t ext3 /dev/sdh /var2; \
mv /var/* /var2/; mv /var /var.orig; mkdir /var; umount /var2; \
echo "/dev/sdh /var ext3 defaults 0 0" >>/etc/fstab; mount /var'
# Quote $CMD so that /var/* is expanded on the remote side, not locally
ssh -i ~/.ssh/$KEYNAME.pem root@$AMI "$CMD"
The effect is that /var is now mapped to a persistent EBS volume. So if I install MySQL for example, the /var/lib/mysql directory (where the data resides by default in Fedora/CentOS) will be automatically persistent. All this is done without interactively logging in to the new instance, so it can be easily scripted as part of a larger deployment procedure.
That's about it for the bare-bones stuff you have to do. I purposely kept my scripts simple, since I use them more to remember what EC2 API tools I need to run than anything else. I don't do a lot of command-line option stuff and error-checking stuff, but they do their job.
If you run scripts similar to what I have, you should have at this point a running AMI with a 50 GB EBS volume mounted as /var. Total running time of all these scripts -- 5 minutes at most.
As soon as I have a nicer Python script which will do all this and more, I'll post it here.
Thursday, December 04, 2008
New job at OpenX
Lots of Python involved in this, lots of automation, lots of testing, so all this makes me really happy :-)
Here is some stuff I've been working on, which I intend to post on with more details as time permits:
* command-line provisioning of EC2 instances
* automating the deployment of the OpenX application and its pre-requisites
* load balancing in EC2 using HAProxy
* monitoring with Hyperic
* working with S3-backed file systems
I'll also start working soon with slack, a system developed at Google for automatic provisioning of files via the interesting concept of 'roles'. It's in the same family as cfengine or puppet, but simpler to use and with a powerful inheritance concept applied to roles.
All in all, it's been a fun and intense 2 weeks :-)
Sunday, November 30, 2008
The sad state of open source monitoring tools
A slew of other tools are based on the Nagios engine, and are trying hard to be more pleasing to the eye -- Opsview and GroundWork are some examples. Opsview seems to be just a wrapper around Nagios, without much improvement in either functionality or UI.
I looked at the GroundWork screencast and it seemed promising, but when I tried to install it I had a very unpleasant experience. First of all, the install script uses curses (did those guys hear about unattended installs?), and requires Java 1.5. Although I had both Java 1.5 and 1.6 on my CentOS server, and JAVA_HOME set correctly, it didn't stop the installer from complaining and exiting. Good riddance.
I should say that the first open source network monitoring tool that I tried was Zenoss, which is supposed to be the poster child for Python-based monitoring tools. Believe me, I tried hard to like it. I even went back and gave it a second chance, after noticing that other tools aren't any better. But to no avail -- I couldn't get past the sensation that it's a half-baked tool, with poor documentation and obscure user interface. It could work fine if you just want to monitor some devices with SNMP, but as soon as you try to extend it with your own plugins (called Zen Packs), or if you try to use their agents (called Zen Plugins), you run into a wall. At least I did. I got tired of Python tracebacks, obscure references to 'restarting Zope' (I thought it's based on twisted), fiddling with values for the so-called zProperties of a device, trying unsuccessfully to get ssh key authentication to work with the Zen Plugins, etc, etc. I'm not the only one who went through these frustrations either -- there are plenty of other users saying in the Zenoss forums that they've had it, and that they're going to look for something else. Which is what I did too.
I also tried OpenNMS, which was better than Zenoss, but it still had a CGI feel in terms of its Web interface.
So...for now I settled on Hyperic. It's a Java-based tool with a modern Web interface, very good documentation, and it's extensible via your own plugins (which you can write in any language you want, as long as you conform to some conventions which are not overly restrictive). Hyperic uses agents that you install on every server you need to monitor. I don't mind this, I find it better than configuring SNMP to death. It does have its quirks -- for example it calls devices that it monitors 'platforms' (instead of just 'devices' or 'servers'), and it calls the plugins that monitor specific services 'servers' (instead of services). Once you get used to it, it's not that bad. However, I wish there was a standard nomenclature for this stuff, as well as a standard way for these tools to inter-operate. As it is, you have to learn each tool and train your brain to ignore all the weirdness that it encounters. Not an optimal scenario by any means.
I'm very curious to see what tools other people use. If you care to leave a comment about your monitoring tool of choice, please do so!
I'll report back with more stuff about my experiences with Hyperic.
Friday, November 21, 2008
Issues with Ubuntu 8.10 on Lenovo T61p laptop
System lock-ups with Intel 4965 wireless
The version of the iwlagn wireless driver for Intel 4965 wireless chipsets included in Linux kernel version 2.6.27 causes kernel panics when used with 802.11n or 802.11g networks. Users affected by this issue can install the linux-backports-modules-intrepid package, to install a newer version of this driver that corrects the bug. (Because the known fix requires a new version of the driver, it is not expected to be possible to include this fix in the main kernel package.)
As recommended, I did 'apt-get install linux-backports-modules-intrepid' and I rebooted. That was around 1 hour ago, and I haven't seen any issues since. Hopefully that was it. BTW, when the Caps Lock light blinks, it means 'kernel panic'. Who knew.
Thursday, November 13, 2008
Python and MS Azure
"Windows Azure is an open platform that will support both Microsoft and non-Microsoft languages and environments. Windows Azure welcomes third party tools and languages such as Eclipse, Ruby, PHP, and Python."
While you and I may think MS says this just for marketing/PR purposes, it turns out they are walking the walk a bit. I was glad to see in the InfoQ article that a Microsoft guy wrote a Python wrapper on top of the Azure Data Storage APIs. Note that this is classic CPython, not IronPython. I assume more interesting stuff can be done with IronPython.
Wednesday, November 12, 2008
"phrase from nearest book" meme
- Grab the nearest book.
- Open it to page 56.
- Find the fifth sentence.
- Post the text of the sentence in your journal along with these instructions.
- Don’t dig for your favorite book, the cool book, or the intellectual one: pick the CLOSEST.
"A little later a marriage procession would strike into the Grand Trunk with music and shoutings, and a smell of marigold and jasmine stronger even than the reek of the dust."
Not bad, I like it :-)
Friday, October 31, 2008
Migrating SSL certs from IIS to Apache
Monday, October 27, 2008
This is depressing: Ken Thompson is also a googler
So let's see, Google has hired:
* Ken Thompson == Unix
* Vint Cerf == TCP/IP
* Andrew Morton == #2 in Linux
* Guido van Rossum == Python
* Ben Collins-Sussman and Brian Fitzpatrick == subversion
* Bram Moolenaar == vim
...and I'm sure there are countless others that I missed.
If this isn't a march towards world domination, I don't know what is :-)
Thursday, October 16, 2008
The case of the missing profile photo
echo "/dev/sds /ebs1 ext3 defaults 0 0" >> /etc/fstab
I connected to my EC2 environment with ElasticFox and saw that the EBS volume was still attached to my machine instance as /dev/sds, so I mounted it via 'mount /dev/sds /ebs1', then restarted httpd and mysqld, and all my sites were again up and running.
I tested my setup by rebooting. After the reboot, another surprise: httpd and mysqld were not chkconfig-ed on, so they didn't start automatically. I fixed that, I rebooted again, and finally everything came back as expected.
A few lessons learned here in terms of hosting your web sites in 'the cloud':
1) you need to test your machine setup across reboots
2) you need automated tests for your machine setup -- things like 'is httpd chkconfig-ed on?'; 'is /dev/sds mounted as /ebs1 in /etc/fstab?'
3) you need to monitor your sites from a location outside the cloud which hosts your sites; I shouldn't have to eyeball a profile photo to realize that my EC2 instance is not functioning properly!
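The two checks quoted in lesson 2 are easy to turn into code. Here is a minimal sketch; the function names are made up for this example, and the chkconfig check assumes a Fedora/CentOS-style system where runlevel 3 is the one you care about.

```shell
# Succeed if an fstab file mounts the given device on the given mount point.
fstab_has_mount() {
    local device=$1 mountpoint=$2 fstab=${3:-/etc/fstab}
    awk -v d="$device" -v m="$mountpoint" '
        $1 == d && $2 == m { found = 1 }
        END { exit !found }' "$fstab"
}

# Succeed if a service is enabled in runlevel 3 according to chkconfig.
service_is_on() {
    chkconfig --list "$1" 2>/dev/null | grep -q '3:on'
}

# Usage:
#   fstab_has_mount /dev/sds /ebs1 || echo "FAIL: /ebs1 not in /etc/fstab"
#   service_is_on httpd            || echo "FAIL: httpd not chkconfig-ed on"
#   service_is_on mysqld           || echo "FAIL: mysqld not chkconfig-ed on"
```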
I'll cover all these topics and more soon in some other posts, so stay tuned!
Recommended book: "Scalable Internet Architectures"
I wish the database chapter contained more in-depth architectural discussions; instead, the author spends a lot of time showing a Perl script that is supposed to illustrate some of the concepts in the chapter, but falls very short of that in my opinion.
Overall though, highly recommended.
Wednesday, October 08, 2008
Example Django app needed
Comments with suggestions would be greatly appreciated!
Thursday, October 02, 2008
Update on EC2 and EBS
Greetings from Amazon Web Services,
This e-mail confirms that your latest billing statement is available on the AWS web site. Your account will be charged the following:
Total: $73.74
So there you have it. That's how much it cost me to run the new SoCal Piggies wiki, as well as some other small sites, with very little traffic. Your mileage will definitely vary, especially if you run a high-traffic site.
I also said I'll give an update on running a MySQL database on EBS. It turns out it's really easy. On my Fedora Core 8 AMI, I did this:
* installed mysql packages via yum:
yum -y install mysql mysql-server mysql-devel
* moved the default data directory for mysql (/var/lib/mysql) to /ebs1/mysql (where /ebs1 is the mount point of my 10 GB EBS volume), then symlinked /ebs1/mysql back to /var/lib, so that everything continues to work as expected as far as MySQL is concerned:
service mysqld stop
mv /var/lib/mysql /ebs1/mysql
ln -s /ebs1/mysql /var/lib
service mysqld start
That's about it. I also used the handy snapshot functionality in the ElasticFox plugin and backed up the EBS volume to S3. In case you lose your existing EBS volume, you just create another volume from the snapshot, specify a size for it, and associate it with your AMI instance. Then you mount it as usual.
Update 10/03/08
In response to comments inquiring about a more precise breakdown of the monthly cost, here it is:
$0.10 per Small Instance (m1.small) instance-hour (or partial hour) x 721 hours = $72.10
$0.100 per GB Internet Data Transfer - all data transfer into Amazon EC2 x 0.607 GB = $0.06
$0.170 per GB Internet Data Transfer - first 10 TB / month data transfer out of Amazon EC2 x 2.719 GB = $0.46
$0.010 per GB Regional Data Transfer - in/out between Availability Zones or when using public IP or Elastic IP addresses x 0.002 GB = $0.01
$0.10 per GB-Month of EBS provisioned storage x 9.958 GB-Mo = $1.00
$0.10 per 1 million EBS I/O requests x 266,331 IOs = $0.03
$0.15 per GB-Month of EBS snapshot data stored x 0.104 GB-Mo = $0.02
$0.01 per 1,000 EBS PUT requests (when saving a snapshot) x 159 Requests = $0.01
EC2 TOTAL: $73.69
Other S3 costs (outside of EC2): $0.05
GRAND TOTAL: $73.74
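As a sanity check on the arithmetic, the billed line items above can be added up with a few lines of shell. The helper name sum_usd is made up for this example; the amounts are the per-line charges from the breakdown.

```shell
# Add up dollar amounts given as arguments and print the total to the cent.
sum_usd() {
    printf '%s\n' "$@" | awk '{ total += $1 } END { printf "%.2f\n", total }'
}

# The eight EC2 line items, then the S3 charge on top
ec2_total=$(sum_usd 72.10 0.06 0.46 0.01 1.00 0.03 0.02 0.01)
grand_total=$(sum_usd "$ec2_total" 0.05)
echo "EC2 total:   \$$ec2_total"
echo "Grand total: \$$grand_total"
```

The instance-hours dominate: $72.10 of the total is just the m1.small running for the whole month.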
Friday, September 19, 2008
Presubmit testing at Google
Monday, September 15, 2008
"Unmaintained Free Software" wiki
Saturday, September 13, 2008
Know of any Open Source projects that need maintainers?
"Folks,
As I am interested in brushing up on my coding skills, so I would
appreciate your help in identifying an existing orphan/dormant
open-source tool/toolset project who needs an owner/maintainer.
I am especially interested in software process-oriented tools that
fill a hole in an agile development/test/management tool stack."
If anybody knows of such projects, especially with a testing or agile bent, please leave a comment here. Thanks!
Tuesday, September 02, 2008
Getting around the Firefox port-blocking annoyance
1) go to about:config in the Firefox address bar
2) right click, choose new->string
3) enter the name network.security.ports.banned.override and the value 1-65535
4) there is no step 4
Monday, September 01, 2008
Experiences with Amazon EC2 and EBS
To get started, I used a great blog post on 'Persistent Django on Amazon EC2 and EBS' by Thomas Brox Røst. I will refer here to some of the steps that Thomas details in his post; if you want to follow along, you're advised to read his post.
1) Create an AWS account and sign up for the EC2 service.
2) Install the ElasticFox Firefox extension -- the greatest thing since sliced bread in terms of managing EC2 AMIs. To run the ElasticFox GUI, go to Tools->ElasticFox in Firefox; this will launch a new tabbed window showing the GUI. From now on, I will abbreviate ElasticFox as EF.
3) Add your AWS user name and access keys in EF (use the Credentials button).
4) Add an EC2 security group (click on the 'Security Groups' tab in EF); this can be thought of as a firewall rule that will replace the default one. In my case, I called my group 'gg' and I allowed ports 80 and 443 (http and https) and 22 (ssh).
5) Add a keypair to be used when you ssh into your AMI (click on the 'KeyPairs' tab in EF). I named mine gg-ec2-keypair and I saved the private key in my .ssh folder on my local machine (.ssh/gg-ec2-keypair.pem).
6) Get a fixed external IP (click on the 'Elastic IPs' tab in EF). You will be assigned an IP which is not yet associated with any AMI.
7) Get a block-based storage volume that you can format later into a file system (click on the 'Volumes and Snapshots' tab in EF). I got a 10 GB volume.
These 7 steps are the foundation of everything else you need to do when running an AMI. Choosing and launching the AMI itself is the next step, which you can run any time you want to launch an AMI.
I followed Thomas's example and chose a 32-bit Fedora Core 8 image for my AMI. In EF, you can search for Fedora 8 images by going to the 'AMIs and Instances' tab and typing fedora-8 in the search box. Right click on the desired image (mine was called ec2-public-images/fedora-8-i386-base-v1.07.manifest.xml) and choose 'Launch instance(s) of this AMI'. You will need to choose a keypair (I chose the one I created earlier, gg-ec2-keypair), an availability zone (I chose the 'us-east-1a') and a security group (I removed the default one and added the one I created earlier).
You should immediately see the instance in a 'pending' state in the Instances list. After a couple of minutes, if you click Refresh you'll see it in the 'running' state, which means it's ready for you to access and work with.
Once my AMI was running, I right-clicked it and chose 'copy instance ID to clipboard'. The instance ID is needed to associate the EBS volume and the Elastic IP to this instance.
To associate the fixed external IP, I went to the 'Elastic IPs' tab in EF, right clicked on the Elastic IP I was assigned and chose 'Associate this address', then I indicated the instance ID of my running AMI. As a side note, if you don't see anything in a given EF list (such as Elastic IPs or Volumes), click Refresh and you should see it.
To associate the EBS volume, I went to the 'Volumes and Snapshots' tab in EF, right clicked on the volume I had created, then chose 'Attach this volume'. In the next dialog box, I specified the instance ID of my AMI, then /dev/sdh as the volume name.
The next step is to ssh into your AMI and format the raw block storage into a file system. You can use the Elastic IP you were assigned (let's call it A.B.C.D), and run:
$ ssh -i .ssh/your-private-key.pem root@A.B.C.D
At this point, you should be logged in to your AMI. To format the EBS volume as ext3 and mount it, run:
# mkfs.ext3 /dev/sdh
# mkdir /ebs1; mount -t ext3 /dev/sdh /ebs1
If you want the mount point to persist across reboots, also add this line to /etc/fstab:
# echo "/dev/sdh /ebs1 ext3 noatime 0 0" >> /etc/fstab
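One small trap with the echo approach is that running it twice leaves a duplicate line in /etc/fstab. A sketch of a guard, with a made-up function name, that appends the entry only when that exact line is missing:

```shell
# Append an fstab entry only if that exact line is not already present.
add_fstab_entry() {
    local entry=$1 fstab=${2:-/etc/fstab}
    # -x: match the whole line; -F: fixed string, not a regex
    grep -qxF "$entry" "$fstab" 2>/dev/null || echo "$entry" >> "$fstab"
}

# Usage:
#   add_fstab_entry "/dev/sdh /ebs1 ext3 noatime 0 0"
```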
At this point, you have a bare-bones Fedora Core 8 instance accessible via HTTP, HTTPS and SSH at the IP address A.B.C.D. Not very useful in and of itself, unless you install your application.
In my case, the first Web site I wanted to port over was the SoCal Piggies wiki, at www.socal-piggies.org. I used to run it on MoinMoin 1.3.1 on my old server, but for this brand-new AMI experiment I installed MoinMoin 1.7.1. I also had to install httpd and python-devel via yum. And since we're talking about package installs, here's the main point you should take away from this post: you need to install all required packages every time you re-launch your AMI. I'm not talking about rebooting your AMI, which preserves your file systems; I'm talking about terminating your AMI for any reason, then re-launching a new AMI instance. This operation will start your AMI with a clean slate in terms of packages that are installed. You can obviously re-mount the EBS volume that you created, and all your files will still be there, but those are typically application or database files, and not the actual required packages themselves (such as httpd or python-devel).
So, very important point: as soon as you start porting applications over to your AMI, you'd better start designing the layout of your apps so that they take full advantage of the EBS volume(s) you created. You'll also have to script the installation of the required packages, so you can easily run the script every time you launch a new instance of your AMI. This can be seen as a curse, but to me it's a blessing in disguise, because it forces you to automate the installation of your applications. Automation entails faster deployment, fewer errors, better testability. In short, you win in the long run.
For the first application I ported, the SoCal Piggies wiki, I made the following design decisions:
a) I chose to install MoinMoin 1.7.1 from scratch every time I launch a new AMI instance; I also install httpd, httpd-devel and python-devel from scratch every time
b) I chose to point the specific instance of the Piggies wiki to /ebs1/wikis/socal-piggies, so all the actual content of the wiki is kept persistently in the EBS volume
c) I moved /etc/httpd to /ebs1/httpd, then I created a symlink at /etc/httpd pointing to /ebs1/httpd, so all the Apache configuration files are kept persistently in the EBS volume
d) I pointed the DocumentRoot of the Apache virtual host for the Piggies wiki to /ebs1/www/socal-piggies, so that all the static files that need to be accessed via the www.socal-piggies.org domain are kept persistently in the EBS volume
So what do I have to do if I decide to terminate the current AMI instance, and launch a new one? Simple -- I first associate the Elastic IP and the EBS volume with the new instance via EF, then I ssh into the new AMI (which has the same external IP as the old one) and run this command line:
# mkdir /ebs1; mount -t ext3 /dev/sdh /ebs1
Then I go to /ebs1/scripts and run this script:
# cat mysetup.sh
#!/bin/bash
# Install various packages via yum
yum -y install python-devel
yum -y install httpd httpd-devel
# Create symlinks
mv /etc/httpd /etc/httpd.orig
ln -s /ebs1/httpd /etc
# Download and install MoinMoin
cd /tmp
rm -rf moin*
wget http://static.moinmo.in/files/moin-1.7.1.tar.gz
tar xvfz moin-1.7.1.tar.gz
cd moin-1.7.1
python setup.py install
# Start apache
service httpd start
# Make sure /ebs1 is mounted across reboots
echo "/dev/sdh /ebs1 ext3 noatime 0 0" >> /etc/fstab
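One thing to watch in the script above: the final echo appends to /etc/fstab every time the script runs, so running it twice on the same instance would leave a duplicate entry. Here is a minimal sketch of an idempotent append in Python. The helper name ensure_line is made up for illustration and is not part of any library.

```python
def ensure_line(path, line):
    """Append `line` to the file at `path` unless an identical line is already there.

    Returns True if the file was modified, False if the line was already present.
    """
    try:
        with open(path) as f:
            content = f.read()
    except FileNotFoundError:
        content = ""
    if line in content.splitlines():
        return False
    with open(path, "a") as f:
        # Start on a fresh line if the existing file lacks a trailing newline.
        if content and not content.endswith("\n"):
            f.write("\n")
        f.write(line + "\n")
    return True

# Intended use (as root), mirroring the last line of mysetup.sh:
# ensure_line("/etc/fstab", "/dev/sdh /ebs1 ext3 noatime 0 0")
```

On a freshly launched instance the difference does not matter, but it makes the setup script safe to re-run after a partial failure.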
Even better, I can script all this on my local machine, so I don't even have to log in via ssh. This is the command I run on my local machine:
ssh -i ~/.ssh/gg-ec2-keypair.pem 75.101.140.75 'mkdir /ebs1; mount -t ext3 /dev/sdh /ebs1; /ebs1/scripts/mysetup.sh'
That's it! At this point, I have the Piggies wiki running on a brand-new AMI.
Two caveats here:
1) the ssh fingerprint of the old instance, saved in .ssh/known_hosts on your local machine, will no longer be valid, so you'll get a big security warning the first time you try ssh-ing into your new AMI. Just delete that line from known_hosts and ssh again.
2) it takes a while (for me it was up to 5 minutes) for the Elastic IP to be ready for you to ssh into after you associate it with a brand-new AMI; so in a disaster recovery situation, keep in mind that your site can potentially be down for 10-15 minutes, time in which you launch a new AMI, associate the Elastic IP and the EBS volume with it, and run your setup scripts.
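Rather than retrying ssh by hand while the Elastic IP settles, the wait in the second caveat can be scripted. The following is a minimal sketch that polls a TCP port until it accepts connections. The function name, the 10-minute default timeout and the 5-second polling interval are assumptions chosen for illustration.

```python
import socket
import time

def wait_for_port(host, port=22, timeout=600.0, interval=5.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns the number of seconds waited; raises TimeoutError if the port
    is still unreachable once `timeout` seconds have elapsed.
    """
    start = time.monotonic()
    while True:
        try:
            # Each attempt gives up after `interval` seconds.
            with socket.create_connection((host, port), timeout=interval):
                return time.monotonic() - start
        except OSError:
            if time.monotonic() - start >= timeout:
                raise TimeoutError(
                    "%s:%d not reachable after %.0f s" % (host, port, timeout))
            time.sleep(interval)

# Example (hypothetical address): block until sshd answers, then run the setup.
# waited = wait_for_port("A.B.C.D", 22)
```

A successful TCP connection only shows that something is listening on the port; the setup script can still fail afterwards, so its exit status is worth checking separately.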
My experience so far with EC2 and EBS has been positive. As I already mentioned, the fact that it forces you to design your application to take advantage of the persistent EBS volume, and to script the installation of the pre-requisite packages, is a net positive in my opinion.
The next step for me will be to port other sites with a MySQL database backend. Fun fun fun! I will blog soon about my experiences. In the meantime, go ahead and browse the brand-new SoCal Piggies wiki :-)
Thursday, August 28, 2008
Back up your Windows desktop to S3 with SecoBackup
I think this is a good tool for backing up certain files on Windows-based desktops. For example I back up my Quicken files from within a Windows XP virtual image that I run inside VMWare workstation on top of my regular Ubuntu Hardy desktop.
Tuesday, August 26, 2008
Ruby refugees flocking to Python?
RTFL
Here are some recent examples from my work.
Apache wouldn't start properly
A 'ps -def | grep http' would show only the main httpd process, with no worker processes. The Apache error log showed these lines:
Digest: generating secret for digest authentication
A google search for this line revealed this article:
http://www.raptorized.com/2006/08/11/apache-hangs-on-digest-secret-generation/
It turns out the randomness/entropy on that box had been exhausted. I grabbed the rng-tools tar.gz from sourceforge, compiled and installed it, then ran
rngd -r /dev/urandom
...and apache started its worker processes instantly.
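On Linux, the kernel's entropy estimate can be read from /proc/sys/kernel/random/entropy_avail, which makes this condition easy to check before digging further. A minimal sketch follows. The threshold of 100 bits is an arbitrary value picked for illustration, and recent kernels may report this number differently than the kernels of that era did.

```python
ENTROPY_PATH = "/proc/sys/kernel/random/entropy_avail"

def available_entropy(path=ENTROPY_PATH):
    """Return the kernel's current entropy estimate, in bits (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())

def looks_starved(bits, threshold=100):
    """Rough heuristic: flag the pool as starved below `threshold` bits.

    The default threshold is an assumption, not a kernel-defined limit.
    """
    return bits < threshold

# Example:
# bits = available_entropy()
# print(bits, "starved" if looks_starved(bits) else "ok")
```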
Cannot create InnoDB tables in MySQL
Here, all it took was to read the MySQL error log in /var/lib/mysql. It's very friendly indeed, and tells you exactly what to do!
InnoDB: Error: data file ./ibdata1 is of a different size
InnoDB: 2176 pages (rounded down to MB)
InnoDB: than specified in the .cnf file 128000 pages!
InnoDB: Could not open or create data files.
InnoDB: If you tried to add new data files, and it failed here,
InnoDB: you should now edit innodb_data_file_path in my.cnf back
InnoDB: to what it was, and remove the new ibdata files InnoDB created
InnoDB: in this failed attempt. InnoDB only wrote those files full of
InnoDB: zeros, but did not yet use them in any way. But be careful: do not
InnoDB: remove old data files which contain your precious data!
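The page counts in that message are easier to interpret once converted to megabytes. This is a small worked example, assuming InnoDB's default page size of 16 KB (the error message itself does not state the page size):

```python
INNODB_PAGE_SIZE = 16 * 1024  # bytes; assumed default InnoDB page size

def pages_to_mb(pages, page_size=INNODB_PAGE_SIZE):
    """Convert an InnoDB page count to whole megabytes."""
    return pages * page_size // (1024 * 1024)

print(pages_to_mb(2176))    # size of the existing ibdata1 file -> 34
print(pages_to_mb(128000))  # size requested in my.cnf         -> 2000
```

Under that assumption, the existing ibdata1 is 34 MB while the configuration asks for a 2000 MB file, which is exactly the mismatch the log complains about.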
Windows-based Web sites are displaying errors
Many times I've seen Windows/IIS based Web sites displaying cryptic errors such as:
In conclusion -- RTFL and google it! You'll be surprised how large of a percentage of issues you can solve this way.
Wednesday, July 16, 2008
Monday, July 14, 2008
Zach and sugarbot going strong in Google SoC
You can see a screencast that Zach put together, as well as a list of his accomplishments so far, in this blog post. In the screencast, Zach shows how he automates the launching and testing of two Sugar activities, the Calculator and the Terminal. Very cool stuff.
It's been a pleasure mentoring Zach on his SoC project. He has already proven himself to possess strong software engineering skills, not only in programming, but also in designing complex pieces of software. I only had to provide minimal guidance to Zach, and he has been very receptive with all the advice I have given him. I liked the fact that he implemented an automated test suite for sugarbot, and he included it in a buildbot continuous integration process, only days after I suggested that to him. It has also been very satisfying to me as a mentor to see his progress as exemplified by his almost-daily blog posts. I believe he is the most active blogger on Planet SoC. Good job, Zach!
Wednesday, June 18, 2008
Celtics use Ubuntu to beat Lakers
"It was a group effort by this gang in green, which bonded behind Rivers, who borrowed an African word ubuntu (pronounced Ooh-BOON-too) and roughly means "I am, because we are" in English, as the Celtics' unifying team motto.
The Celtics gave the Lakers a 12-minute crash course of ubuntu in the second quarter.
Boston outscored Los Angeles 34-19, getting 11 field goals on 11 assists. The Celtics toyed with the Lakers, outworking the Western Conference's best inside and out and showing the same kind of heart that made Boston the center of pro basketball's universe in the '60s. "
It's not what you thought, but it's still nice to see that the ubuntu concept is used successfully in sports too. I wonder what parallel we can make between the Lakers' game last night and an operating system. The Windows Blue Screen of Death comes to mind.
Tuesday, June 17, 2008
Security testing for agile testers
Security testing is a broad topic that cannot be possibly covered in a few paragraphs. Whole books have been devoted to this subject. Here we will try to at least provide some guidelines and pointers to books and tools that might prove useful to agile teams interested in security testing.
Just like functional testing, security testing can be viewed and conducted from two perspectives: from the inside out (white-box testing) and from the outside in (black-box testing).
Inside-out security testing assumes that the source code for the application under test is available to the testers. The code can be analyzed statically with a variety of tools that try to discover common coding errors which can make the application vulnerable to attacks such as buffer overflows or format string attacks. (Resources:
http://en.wikipedia.org/wiki/Buffer_overflow and
http://en.wikipedia.org/wiki/Format_string_vulnerabilities)
A list of tools that can be used for static code analysis can be found here:
http://en.wikipedia.org/wiki/List_of_tools_for_static_code_analysis
The fact that the testers have access to the source code of the application also means that they can map what some books call "the attack surface" of the application, which is the list of all the inputs and resources used by the program under test. Armed with a knowledge of the attack surface, testers can then apply a variety of techniques that attempt to break the security of the application. A very effective class of such techniques is called fuzzing and is based on fault injection. Using this technique, the testers try to make the application fail by feeding it various types of inputs (hence the term fault injection). These inputs range from carefully crafted strings used in SQL Injection attacks, to random byte changes in given input files, to random strings fed as command line arguments. (Resources:
http://www.fuzzing.org/category/fuzzing-book/ and
http://www.fuzzing.org/fuzzing-software)
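To make the "random byte changes in given input files" idea concrete, here is a minimal mutation-fuzzing sketch. It is a toy, not a substitute for the dedicated fuzzing tools listed above; the function names and the default of 3 mutated bytes per input are assumptions made for illustration.

```python
import random

def mutate(data, n_changes=1, seed=None):
    """Return a copy of `data` (bytes) with up to n_changes random bytes replaced."""
    rng = random.Random(seed)
    buf = bytearray(data)
    if not buf:
        return bytes(buf)
    for _ in range(n_changes):
        pos = rng.randrange(len(buf))
        buf[pos] = rng.randrange(256)  # may coincide with the original byte
    return bytes(buf)

def fuzz(target, sample, rounds=100, seed=0):
    """Call target() on mutated copies of `sample`; collect the inputs that raise.

    `target` is any callable taking bytes, e.g. a file parser under test.
    """
    failures = []
    for i in range(rounds):
        candidate = mutate(sample, n_changes=3, seed=seed + i)
        try:
            target(candidate)
        except Exception as exc:
            failures.append((candidate, exc))
    return failures
```

Seeding each round makes every failing input reproducible, which matters when a failure found in a CI run has to be replayed on a developer's machine.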
The outside-in approach is the one mostly used by attackers that try to penetrate into the servers or the network hosting your application. As a security tester, you need to have the same mindset that attackers do, which means that you have to use your creativity in discovering and exploiting vulnerabilities in your own application. You also need to stay up to date with the latest security news and updates related to the platform/operating system your application runs on. These tasks are by no means easy, they require extensive knowledge, and as such are mostly outsourced to third parties that specialize in security testing.
So what are agile testers to do when faced with the apparently insurmountable task of testing the security of their application? Here are some practical, pragmatic steps that anybody can follow:
1. Adopt a continuous integration (CI) process that periodically runs a suite of automated tests against your application.
2. Learn how to use one or more open source static code analysis tools. Add a step to your CI process which consists of running these tools against your application code. Mark the step as failed if the tools find any critical vulnerabilities.
3. Install an automated security vulnerability scanner such as Nessus (http://www.nessus.org/nessus/). Nessus can be run in a command-line, non-GUI mode, which makes it suitable for inclusion in a CI tool. Add a step to your CI process which consists of running Nessus against your application. Capture the Nessus output in a file and parse that file for any high importance security holes found by the scanner. Mark the step as failed when any such holes are found.
4. Learn how to use one or more open source fuzzing tools. Add a step to your CI process which consists of running these tools against your application code. Mark the step as failed if the tools find any critical vulnerabilities.
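The "mark the step as failed" part of the steps above can be sketched as a small gate function. This example assumes the tool's report has already been parsed into (severity, description) pairs; that parsing is tool-specific and is not shown here. The severity names and the set of blocking levels are assumptions for illustration, not the output format of any particular scanner.

```python
BLOCKING = {"critical", "high"}  # assumption: severities that should fail the build

def blocking_findings(findings):
    """Return the findings whose severity should fail the CI step.

    `findings` is an iterable of (severity, description) tuples.
    """
    return [(sev, desc) for sev, desc in findings if sev.lower() in BLOCKING]

def ci_exit_code(findings):
    """0 if the step passes, 1 if any blocking finding is present."""
    return 1 if blocking_findings(findings) else 0

# In a CI step, after parsing the report into `findings`:
# import sys; sys.exit(ci_exit_code(findings))
```

Most CI tools, buildbot included, treat a non-zero exit status as a failed step, so returning 1 is usually all the integration that is needed.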
As with any automated testing effort, running these tools is no guarantee that your code and your application will be free of security defects. However, running these tools will go a long way towards improving the quality of your application in terms of security. As always, the 80/20 rule applies. These tools will probably find the 80% most common security bugs out there while requiring 20% of your security budget.
To find the remaining 20% security defects, you're well advised to spend the other 80% of your security budget on high quality security experts. They will be able to test your application security thoroughly by the use of techniques such as SQL injection, code injection, remote code inclusion and cross-site scripting. While there are some tools that try to automate some of these techniques, they are no match for a trained professional who takes the time to understand the inner workings of your application in order to craft the perfect attack against it.
Tools for troubleshooting Web app performance
Thursday, June 12, 2008
What does your Wordle look like?
You would think I'm very self-centered, since my first and last names appear so prominently. But I think it's because every blog post ends with "posted by Grig Gheorghiu at
Friday, May 23, 2008
Incremental backups to Amazon S3
Here's what I did in order to get all this going on a CentOS 5.1 server running Python 2.5.
1) Signed up for Amazon S3 and got the AWS_ACCESS_KEY_ID and the AWS_SECRET_ACCESS_KEY.
2) Downloaded and installed the following packages: boto, GnuPGInterface, librsync, duplicity. All of them except librsync are Python-based, so they can be installed via 'python setup.py install'. For librsync you need to use './configure; make; make install'.
3) Generated a GPG key pair using "gpg --gen-key". Made a note of the hex fingerprint of the key (you can list the fingerprints of your keys via "gpg --fingerprint").
4) Wrote a simple boto-based Python script to create and list S3 buckets (the equivalent of directories in S3 parlance). Note that boto uses SSL, so your Python installation needs to have SSL enabled.
Here's how the script looks:
#!/usr/bin/env python
ACCESS_KEY_ID = 'theaccesskeyid'
SECRET_ACCESS_KEY = 'thesecretaccesskey'
from boto.s3.connection import S3Connection
conn = S3Connection(ACCESS_KEY_ID, SECRET_ACCESS_KEY)
buckets = [
    'mybuckets_myserver_mysqldump',
    'mybuckets_myserver_full',
]
for bucket in buckets:
    conn.create_bucket(bucket)
rs = conn.get_all_buckets()
print 'Bucket listing:'
for b in rs:
    print b.name
5) Wrote a bash script (heavily influenced by Tim McCormack's post) that runs duplicity and backs up the root partition of my Linux server (minus some directories) to S3. The nice thing about duplicity is that it uses the rsync algorithm (via librsync), so it only transfers the diffs over the wire. Here's what my script looks like:
NOTE: duplicity will interactively prompt you for your GPG key's passphrase, unless you have a variable called PASSPHRASE that contains the passphrase. Since I wanted to run this script as a cron job, I chose the less secure way of specifying the passphrase in clear inside the script. YMMV.
export myEncryptionKeyFingerprint=somehexnumber
export mySigningKeyFingerprint=somehexnumber
export AWS_ACCESS_KEY_ID=accesskeyid
export AWS_SECRET_ACCESS_KEY=secretaccesskey
export PASSPHRASE=mypassphrase
/usr/local/bin/duplicity --encrypt-key=$myEncryptionKeyFingerprint \
    --sign-key=$mySigningKeyFingerprint --exclude=/sys --exclude=/dev \
    --exclude=/proc --exclude=/tmp --exclude=/mnt --exclude=/media / \
    s3+http://mybuckets_myserver_full
export AWS_ACCESS_KEY_ID=
export AWS_SECRET_ACCESS_KEY=
export PASSPHRASE=
That's about it. The first time you run the script it will take a while, but subsequent runs will only back up the files that were changed since the last run. Here is the output of my first run, followed by that of my second run, which transferred only 19.3 MB:
--------------[ Backup Statistics ]--------------
StartTime 1211482825.55 (Thu May 22 12:00:25 2008)
EndTime 1211488426.17 (Thu May 22 13:33:46 2008)
ElapsedTime 5600.62 (1 hour 33 minutes 20.62 seconds)
SourceFiles 174531
SourceFileSize 5080402735 (4.73 GB)
NewFiles 174531
NewFileSize 5080402735 (4.73 GB)
DeletedFiles 0
ChangedFiles 0
ChangedFileSize 0 (0 bytes)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 174531
RawDeltaSize 1200920038 (1.12 GB)
TotalDestinationSizeChange 2702953170 (2.52 GB)
Errors 0
-------------------------------------------------
--------------[ Backup Statistics ]--------------
StartTime 1211529638.99 (Fri May 23 01:00:38 2008)
EndTime 1211529784.18 (Fri May 23 01:03:04 2008)
ElapsedTime 145.19 (2 minutes 25.19 seconds)
SourceFiles 174522
SourceFileSize 5084478500 (4.74 GB)
NewFiles 64
NewFileSize 2280357 (2.17 MB)
DeletedFiles 28
ChangedFiles 418
ChangedFileSize 217974696 (208 MB)
ChangedDeltaSize 0 (0 bytes)
DeltaEntries 510
RawDeltaSize 2465010 (2.35 MB)
TotalDestinationSizeChange 20211663 (19.3 MB)
Errors 0
-------------------------------------------------
To restore files from S3, you use duplicity and specify the source as s3+http://mybuckets_myserver_full and the destination as a local directory.
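To put a number on the savings, the statistics blocks above can be parsed and compared. Here is a minimal sketch that relies only on the "Name value (human-readable)" layout visible in the output; the function name parse_stats is made up for illustration.

```python
import re

def parse_stats(text):
    """Map each statistic name to its first numeric field.

    Lines that do not look like 'Name 123.45 ...' (such as the dashed
    header and footer) are ignored.
    """
    stats = {}
    for line in text.splitlines():
        m = re.match(r"^(\w+)\s+([0-9.]+)\b", line)
        if m:
            stats[m.group(1)] = float(m.group(2))
    return stats

# Example, with `first_run` and `second_run` holding the two blocks as strings:
# full = parse_stats(first_run)["TotalDestinationSizeChange"]
# incr = parse_stats(second_run)["TotalDestinationSizeChange"]
# print("incremental run uploaded %.2f%% of the full run" % (100 * incr / full))
```

With the figures shown above (20211663 bytes against 2702953170 bytes), the second run uploaded less than 1% of what the first run did.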
Thanks to Tim McCormack for his detailed blog post; it made things so much easier than digging up all this info via Google-fu.
Monday, May 19, 2008
Compiling Python 2.5 with SSL support
In my case, I needed to enable SSL support for Python 2.5.2 on CentOS 5.1. I already had the openssl development libraries installed:
# yum list installed | grep ssl
mod_ssl.i386 1:2.2.3-11.el5_1.cento installed
openssl.i686 0.9.8b-8.3.el5_0.2 installed
openssl-devel.i386 0.9.8b-8.3.el5_0.2 installed
Here's what I did next, following Patrick's post:
1) edited Modules/Setup.dist from the Python 2.5.2 source distribution and made sure the correct lines were put back in (they were commented out by default):
_socket socketmodule.c
# Socket module helper for SSL support; you must comment out the other
# socket line above, and possibly edit the SSL variable:
#SSL=/usr/local/ssl
_ssl _ssl.c \
-DUSE_SSL -I$(SSL)/include -I$(SSL)/include/openssl \
-L$(SSL)/lib -lssl -lcrypto
2) ran ./configure; make; make install
3) verified that I can access socket.ssl:
# python2.5
Python 2.5.2 (r252:60911, May 19 2008, 14:23:27)
[GCC 4.1.2 20070626 (Red Hat 4.1.2-14)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import socket
>>> socket.ssl
That's it. Not sure why it's so non-intuitive though.
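The socket.ssl check above is specific to Python 2.5-era interpreters; socket.ssl no longer exists in Python 3, where the thing to test is the ssl module. A minimal sketch of the equivalent check on a current interpreter (the function name is made up for illustration):

```python
def ssl_support():
    """Return the linked OpenSSL version string, or None if SSL support is missing."""
    try:
        import ssl
    except ImportError:
        # The interpreter was built without the _ssl extension module.
        return None
    return ssl.OPENSSL_VERSION

# Example:
# print(ssl_support() or "no SSL support compiled in")
```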
Thursday, May 15, 2008
Encrypting a Linux root partition with LUKS and DM-CRYPT
* Boot off of a Live CD, I used Fedora Core 9 Preview
* Find out which disk is which; for me /dev/sda was the external usb, and /dev/sdb was the internal
sfdisk -d /dev/sdb | sfdisk /dev/sda
* Change the partition type to 83 for /dev/sdb2
pvcreate --verbose /dev/sda2
vgextend --verbose VolGroup00 /dev/sda2
pvmove --verbose /dev/sdb2 /dev/sda2 # This takes ages
vgreduce --verbose VolGroup00 /dev/sdb2
pvremove --verbose /dev/sdb2
fdisk /dev/sdb
* Here is when you get to choose the password that will protect your partition:
cryptsetup --verify-passphrase --key-size 256 luksFormat /dev/sdb2
cryptsetup luksOpen /dev/sdb2 cryptroot
pvcreate --verbose /dev/mapper/cryptroot
vgextend --verbose VolGroup00 /dev/mapper/cryptroot
pvmove --verbose /dev/sda2 /dev/mapper/cryptroot # This takes ages
vgreduce --verbose VolGroup00 /dev/sda2
pvremove --verbose /dev/sda2
mkdir /mnt/tmp
mount /dev/VolGroup00/LogVol00 /mnt/tmp
cp -ax /dev/* /mnt/tmp/dev # I said no to overwriting any files
chroot /mnt/tmp/
(chroot) # mount -t proc proc /proc
(chroot) # mount -t sysfs sysfs /sys
(chroot) # mount /boot
(chroot) # swapon -a
(chroot) # vgcfgbackup
For the initrd, the blog mentions /etc/sysconfig/mkinitrd as a file, but on CentOS it is a directory. I tried creating the suggested file inside that directory, and I also tried moving the directory out of the way and creating the file as they suggested. Both attempts failed, so I ran the following command:
(chroot) # mkinitrd -v /boot/initrd-2.6.18-53.el5.crypt.img --with=aes --with=sha256 --with=dm-crypt 2.6.18-53.el5
Now we need to modify the initrd so that it will decrypt the partition at boot time
(chroot) # cd /boot
(chroot) # mkdir /boot/initrd-2.6.18-53.el5.crypt.dir
(chroot) # cd /boot/initrd-2.6.18-53.el5.crypt.dir
(chroot) # gunzip < ../initrd-2.6.18-53.el5.crypt.img | cpio -ivd
Now, we need to modify init by adding the following lines after the line which reads “mkblkdevs” and before “echo Scanning and configuring dmraid supported devices.”:
echo Decrypting root device
cryptsetup luksOpen /dev/sda2 cryptroot
echo Scanning logical volumes
lvm vgscan --ignorelockingfailure
echo Activating logical volumes
lvm vgchange -ay --ignorelockingfailure VolGroup00
Copy cryptsetup and lvm into the initrd directory. The blog doesn't mention this step, but I'm sure both binaries are needed.
cp /sbin/cryptsetup bin/
cp /sbin/lvm bin/
Compress the new initrd
find ./ | cpio -H newc -o | gzip -9 > /boot/initrd-2.6.18-53.el5.crypt.img
Modify the grub.conf. Copy the grub entry for the current kernel, and change as follows
title Centos Encrypted Server (2.6.18-53.1.4.el5)
initrd /initrd-2.6.18-53.el5.crypt.img
Unmount the fs's in the chroot, and exit
cd /
umount /boot
umount /proc
umount /sys
exit
NOTE: Don't upgrade the kernel without upgrading the initrd and grub.conf.
Reboot and test :)
After you have crypto set up, you can find out information about it (such as the crypto algorithm used) via this command:
# cryptsetup luksDump /dev/sda2
LUKS header information for /dev/sda2
Version: 1
Cipher name: aes
Cipher mode: cbc-essiv:sha256
Hash spec: sha1
Payload offset: 2056
MK bits: 256
MK digest: af 2e e6 39 3e 79 60 bb 4a 2b 33 05 1c 86 3a 83 bc a0 ef c1
MK salt: 79 b2 13 53 6f 52 72 a1 b5 3d dc d3 72 cd d6 f4
e3 25 3c 6e 08 00 f3 1d 44 1e 90 47 bc 43 e7 07
MK iterations: 10
UUID: 721abe52-5122-447b-8ed0-5ca3b2b32366
Key Slot 0: ENABLED
Iterations: 247223
Salt: 86 c7 53 6a 13 a9 77 81 89 ec 90 b3 e5 6a ea 8d
da 0c 6f ad ec 3e 3c 47 2d 6e 5f 59 28 4e 7c 63
Key material offset: 8
AF stripes: 4000
Key Slot 1: DISABLED
Key Slot 2: DISABLED
Key Slot 3: DISABLED
Key Slot 4: DISABLED
Key Slot 5: DISABLED
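If you want to check these settings from a script (for example, to confirm the cipher and see which key slots are in use), the luksDump text above is simple enough to parse. Here is a minimal sketch based only on the layout shown in this output; the function name and the choice of fields to keep are assumptions for illustration.

```python
import re

WANTED = ("Cipher name", "Cipher mode", "Hash spec", "MK bits")

def luks_summary(dump_text):
    """Extract selected header fields and key-slot states from luksDump output.

    Returns (fields, slots): fields maps a field name to its string value,
    slots maps a slot number to True (ENABLED) or False (DISABLED).
    """
    fields = {}
    slots = {}
    for line in dump_text.splitlines():
        m = re.match(r"^Key Slot (\d+):\s*(ENABLED|DISABLED)", line.strip())
        if m:
            slots[int(m.group(1))] = (m.group(2) == "ENABLED")
            continue
        # Split on the first colon only, so 'cbc-essiv:sha256' stays intact.
        key, sep, value = line.partition(":")
        if sep and key.strip() in WANTED:
            fields[key.strip()] = value.strip()
    return fields, slots
```

For the dump above this gives aes / cbc-essiv:sha256 as the cipher settings, with slot 0 as the only enabled key slot.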
Thursday, May 08, 2008
Monday, May 05, 2008
Guido open sources Code Review app running on GAPE
The code for Code Review is part of a Google code project called Rietveld. I haven't looked at it yet, but I'll certainly do so soon, just to see the master's view on how to write a GAPE application.
Ruby to Python bytecode compiler
You know, it's crazy that Python and Ruby fans find themselves battling so much. While syntax is different, this exercise proves how close they are to each other! And, yes, I like Ruby's syntax and can think much better in it, but it would be nice to share libs with Python folk and not have to wait forever for a mythical VM that runs all possible languages.
Tuesday, April 29, 2008
Special guest for next SoCal Piggies meeting
BTW, I am putting together a Google code project for mock testing techniques in Python, in preparation for a presentation I would like to give to the group at some point. I called the project moctep, in honor of that ancient Egyptian deity, the protector of testers (or mockers, or maybe both). It doesn't have much so far, but there's some sample code you can browse through in the svn repository if you're curious. I'll be adding more meat to it soon.
Anyway, if you're a Pythonista who happens to be in the L.A. area on Thursday, please consider attending our meeting. It will be lots of fun, guaranteed.
Tuesday, April 22, 2008
"OLPC Automated Testing" project accepted for SoC
Thursday, April 17, 2008
Come work for RIS Technology
Open Source Tech Top Guns Wanted
Are you a passionate Linux user? Are you running the latest Ubuntu alpha release on your laptop just because you can? Are you wired to the latest technologies -- things like Amazon EC2/S3 and Google AppEngine? Are you a virtuoso when it comes to virtualization (Xen/VMWare)?
Do you program in Python? Do you take hard problems as personal challenges and don't give up until you solve them?
RIS Technology Inc. is a rapidly growing Los Angeles-based premium managed hosting provider that hosts and manages internet applications for medium to large size organizations nationwide. We have grown consistently at 100% each of the past four years and are currently hiring for additional growth at our corporate operations center near LAX, in Los Angeles, CA. We have immediate openings for dedicated and knowledgeable technology engineers. If the answer to the questions above is YES, then we'd like to extend an invitation to interview with us.
We are an equal opportunity employer and have excellent benefits. We realize that one of the main things that makes us excellent are the people we choose to work with. We look for the best and brightest and our goal is to make work less "work" and more fun.
Wednesday, April 16, 2008
Google App Engine feels constrictive
Also, I was talking to Michał on rewriting the Cheesecake service to run on Google App Engine, but he pointed out that cron jobs are not allowed, so that won't work either... It seems that with everything I've tried with GAE I've run into a wall so far. I know it's a 'paradigm change' for Web development, but still, I can't help wishing I had my favorite Python modules to play with.
What has your experience been with GAE so far? I know Kumar wrote a cool PyPI mirror in GAE, but I haven't seen many other 'real life' applications mentioned on Planet Python.
Friday, April 11, 2008
Ubuntu Gutsy woes with Intel 801 graphics card
Thursday, April 10, 2008
Meme du jour: shell history
$ history|awk '{a[$2]++ } END{for(i in a){print a[i] " " i}}' |sort -rn|head
121 cd
91 ssh
82 ls
46 vi
28 python
26 scp
16 dig
12 more
7 twistd
6 rm
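For the Python-inclined, the same tally can be computed with collections.Counter. This is a minimal sketch that mirrors the awk one-liner by counting the second whitespace-separated field of each history line (the first field being the history number); the function name is made up for illustration.

```python
from collections import Counter

def top_commands(history_lines, n=10):
    """Return the n most frequent commands as (command, count) pairs."""
    counts = Counter()
    for line in history_lines:
        fields = line.split()
        if len(fields) >= 2:      # skip blank or malformed lines
            counts[fields[1]] += 1
    return counts.most_common(n)

# Example: save the block above as top.py, add
#   import sys
#   for cmd, count in top_commands(sys.stdin): print(count, cmd)
# and run:  history | python top.py
```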
Thursday, April 03, 2008
Steve Loughran on 'Farms, Fabrics and Clouds'
To come back to Steve's presentation -- here are the slides from a previous version. I hope he will soon post the updated version we saw yesterday, but the differences are not major. The co-author of the talk is Julio Guijarro. Their area of interest within HP Labs is the deployment of large applications across distributed resources and the management of these apps/resources with an eye to maximizing their output and minimizing their cost. A familiar (and hard) problem for everybody who works in the hosting industry.
Steve talked about how the infrastructure architectures have changed over the years from a single web server talking to a single database server, to clustering, and finally to server farms and computing-on-demand. The challenge for us 'server farmers' is to figure a way to manage thousands of servers, heaps of storage, a myriad of network infrastructure devices, and large distributed applications on top of that -- all while keeping everything purring and happy, running to their maximum potential. Sounds impossible, but Amazon seems to be doing a decent job at it. And in fact Steve spent quite some time talking about how Amazon changed the game by their S3 and EC2 offerings. Even though they're not quite ready for prime time in terms of production deployments, Amazon will soon get there. As a proof, see their recent introduction of static IP addresses in EC2, and of the possibility of running your application in different data centers.
In my opinion, the best of Steve's slides are the 'Assumptions that are now invalid' ones. They really turn the 'established facts and best practices' of infrastructure and application design on their heads. Here are some examples of assumptions that don't hold anymore in our day and time:
- it is expensive to create, deploy and duplicate a new system, running a Linux image of your choice (see Instalinux as a counter-example)
- system failure is unusual and 100% availability can be achieved
- databases are the best form of storage
- you need physical access to the data center
- a single server farm needs to scale to infinity
I really recommend that you check out Steve's slides. There's a lot to chew on, but you can't afford not to chew on it, if you have anything to do with the IT industry these days.
Here are a couple more links that might prove useful:
- Anubis: a tuple-space implementation that uses multicast to share information between hosts within a site
- SmartFrog: a technology from HP used to distribute and manage applications (think puppet but geared towards application deployment); see also Google video
Update: Steve has some more thoughts on the Agile Infrastructure concept. Intriguing. This is something I'll definitely keep a very close eye on and tinker with.
Wednesday, April 02, 2008
For you students interested in GSoC
Tuesday, April 01, 2008
TurboGears and Pylons finally merging
Monday, March 31, 2008
ReviewBoard: open source code review tool
Python code complexity metrics and tools
Update: David Goodger left a comment pointing me to Martin Blais's snakefood package, which computes and shows dependencies for your Python code. It's a good complement to the tools I mentioned above.
Friday, March 28, 2008
Recommended testing conference: CAST 2008
It's a good time to be a Python programmer
I'll probably blog separately about the technical content of the presentations, but for now I just wanted to comment on the fact that everybody seems to be hiring Python programmers -- Gorilla Nation and Virgin Charter are just two companies in the L.A. area that are aggressively looking to hire Python talent. Another thing: we used to have difficulties in finding venues for our meetings. We used to meet at either USC or Caltech, and around 10-12 people max. would show up. Now companies are clamoring for organizing the meetings at their offices, and we have 20-30 people in the audience, with many new faces at every meeting. Even more: Ruby on Rails programmers are showing up at our meetings, looking for an opportunity to be more involved with Python!
I take that as a sign that Python has arrived. It's a good time to be a Python programmer (or tester, for that matter.)
Tuesday, March 25, 2008
Easy parsing with pyparsing
I had the need to parse a load balancer configuration file and save certain values in a database. Most of the stuff I needed was fairly easily obtainable with regular expressions or Python string operations. However, I was stumped when I encountered a line such as:
bind http "Customer Server 1" http "Customer Server 2" http
This line 'binds' a 'virtual server' port to one or more 'real servers' and their ports (I'm using here this particular load balancer's jargon, but the concepts are the same for all load balancers.)
The syntax is 'bind' followed by a word denoting the virtual server port, followed by one or more pairs of real server names and ports. The kicker is that the real server names can be either a single word containing no whitespace, or multiple words enclosed in double quotes.
Splitting the line by spaces or double quotes is not the solution in this case. I started out by rolling my own little algorithm and keeping track of where I am inside the string, then I realized that I'm actually writing my own parser at this point. Time to reach for pyparsing.
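To see why naive splitting falls short, and for comparison with a standard-library alternative, here is a minimal sketch using shlex, which understands shell-style double quotes. It assumes the load balancer's quoting rules match shell quoting closely enough, which may not hold for every configuration file, and it does none of the grammar-level validation that a real parser provides. The function name is made up for illustration.

```python
import shlex

def parse_bind_with_shlex(line):
    """Split a 'bind' line into (virtual_port, [(real_server, port), ...])."""
    tokens = shlex.split(line)
    # Expect: bind <vport> followed by one or more <server> <port> pairs.
    if len(tokens) < 4 or tokens[0] != "bind" or len(tokens) % 2 != 0:
        raise ValueError("not a bind line: %r" % line)
    vport = tokens[1]
    rest = tokens[2:]
    pairs = list(zip(rest[0::2], rest[1::2]))
    return vport, pairs

line = 'bind http "Customer Server 1" http "Customer Server 2" http'
print(line.split())                  # naive split breaks the quoted names apart
print(parse_bind_with_shlex(line))   # quoted names stay whole
```

The naive split yields ten tokens, with the quoted server names broken into pieces, whereas shlex.split yields six tokens that pair up cleanly.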
I won't go into the details of how to use pyparsing, since there is great documentation available (see Paul's PyCon06 presentation, the examples on the pyparsing site, and also Paul's O'Reilly Shortcut book). Basically you need to define your grammar for the expression you need to parse, then translate it into pyparsing-specific constructs. Because pyparsing's API is so intuitive and powerful, the translation process is straightforward.
Here's how I ended up implementing my pyparsing grammar:
from pyparsing import *
def parse_bind_line(line):
    quoted_real_server = dblQuotedString.setParseAction(removeQuotes)
    real_server = Word(alphas, printables) | quoted_real_server
    port = Word(alphanums)
    real_server_port = Group(real_server + port)
    bind_expr = Suppress(Literal("bind")) + \
                port + \
                OneOrMore(real_server_port)
    return bind_expr.parseString(line)
That's all there is to it. You need to read it from the bottom up to see how the expression gets decomposed into elements, and elements get decomposed into sub-elements.
I'll explain each line, starting with the last one before the return:
bind_expr = Suppress(Literal("bind")) + \
            port + \
            OneOrMore(real_server_port)
A bind expression starts with the literal "bind", followed by a port, followed by one or more real server/port pairs. That's pretty much what the line above actually says, isn't it. The Suppress construct tells pyparsing that we're not interested in returning the literal "bind" in the final token list.
real_server_port = Group(real_server + port)
A real server/port pair is simply a real server name followed by a port. The Group construct tells pyparsing that we want to group these 2 tokens in a list inside the final token list.
port = Word(alphanums)
A port is a word composed of alphanumeric characters. In general, word means 'a sequence of characters containing no whitespace'. The 'alphanums' variable is a special pyparsing variable already containing the list of alphanumeric characters.
real_server = Word(alphas, printables) | quoted_real_server
A real server is either a single word, or an expression in quotes. Note that we can declare a pyparsing Word with 2 arguments; the 1st argument specifies the allowed characters for the initial character of the word, whereas the 2nd argument specifies the allowed characters for the body of the word. In this case, we're saying that we want a real server name to start with an alphabetical character, but other than that it can contain any printable character.
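Here is a short sketch of the two-argument form of Word in action. The helper first_token is hypothetical, written just for this example; it returns the matched text, or None when the expression does not match.

```python
from pyparsing import ParseException, Word, alphas, printables

# First character must be a letter; the rest may be any printable character.
real_server = Word(alphas, printables)

def first_token(expr, text):
    """Hypothetical helper: return the first matched token, or None on failure."""
    try:
        return expr.parseString(text)[0]
    except ParseException:
        return None

dotted = first_token(real_server, "www.mywebsite.com-server1")
leading_digit = first_token(real_server, "1server")

print(dotted)         # www.mywebsite.com-server1
print(leading_digit)  # None
```

Dots and dashes are accepted in the body because they are printable characters, while a name starting with a digit is rejected by the first-character rule.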
quoted_real_server = dblQuotedString.setParseAction(removeQuotes)
Here is where you can glimpse the power of pyparsing. With this single statement we're parsing a sequence of words enclosed in double quotes, and we're saying that we're not interested in the quotes. There's also a predefined sglQuotedString expression for words enclosed in single quotes. Thanks to Paul for bringing this to my attention. My clumsy attempt at manually declaring a sequence of words enclosed in double quotes ran something like this:
no_quote_word = Word(alphanums+"-.")
quoted_real_server = Suppress(Literal("\"")) + \
                     OneOrMore(no_quote_word) + \
                     Suppress(Literal("\""))
quoted_real_server.setParseAction(lambda tokens: " ".join(tokens))
The only useful thing you can take away from this mumbo-jumbo is that you can associate an action with each token. When pyparsing encounters that token, it applies the action (a function or other callable) you specified to that token. This is useful for validating your tokens, for example a date. Very powerful stuff.
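To illustrate parse actions, the sketch below shows the built-in removeQuotes action next to a custom validating action for dates. The validate_date function, the iso_date expression, and the YYYY-MM-DD format are assumptions made up for this example; they are not part of the bind grammar. The copy() call is also my addition, so that attaching an action does not change the shared module-level dblQuotedString.

```python
from datetime import datetime
from pyparsing import ParseException, Word, dblQuotedString, nums, removeQuotes

# Built-in action: removeQuotes strips the enclosing quotes from the token.
quoted = dblQuotedString.copy().setParseAction(removeQuotes)
server_name = quoted.parseString('"Customer Server 1"')[0]
print(server_name)  # Customer Server 1

def validate_date(text, loc, tokens):
    """Hypothetical action: turn 'YYYY-MM-DD' into a date, or fail the parse."""
    try:
        return datetime.strptime(tokens[0], "%Y-%m-%d").date()
    except ValueError as err:
        raise ParseException(text, loc, "invalid date: %s" % err)

# The Word only checks the character set; the action checks the actual value.
iso_date = Word(nums + "-").setParseAction(validate_date)

parsed = iso_date.parseString("2009-07-06")[0]
print(parsed)  # 2009-07-06

try:
    iso_date.parseString("2009-13-45")
    rejected = False
except ParseException:
    rejected = True
print(rejected)  # True
```

Whatever the action returns replaces the original token, which is why the parsed result here is a date object rather than a string.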
Now it's time to test my function on a few strings:
if __name__ == "__main__":
    tests = """\
bind http "Customer Server 1" http "Customer Server 2" http
bind http "Customer Server - 11" 81 "Customer Server 12" 82
bind http www.mywebsite.com-server1 http www.mywebsite.com-server2 http
bind ssl www.mywebsite.com-server1 ssl www.mywebsite.com-server2 ssl
bind http TEST-server http
bind http MY-cluster-web11 83 MY-cluster-web-12 83
bind http cust1-server1.site.com http cust1-server2.site.com http
""".splitlines()
    for t in tests:
        print(parse_bind_line(t))
Running the code above produces this output:
$ ./parse_bind.py
['http', ['Customer Server 1', 'http'], ['Customer Server 2', 'http']]
['http', ['Customer Server - 11', '81'], ['Customer Server 12', '82']]
['http', ['www.mywebsite.com-server1', 'http'], ['www.mywebsite.com-server2', 'http']]
['ssl', ['www.mywebsite.com-server1', 'ssl'], ['www.mywebsite.com-server2', 'ssl']]
['http', ['TEST-server', 'http']]
['http', ['MY-cluster-web11', '83'], ['MY-cluster-web-12', '83']]
['http', ['cust1-server1.site.com', 'http'], ['cust1-server2.site.com', 'http']]
From here, I was able to quickly identify for a given virtual server everything I needed: a virtual server port, and all the real server/port pairs associated with it. Inserting all this into a database was just another step. The hard work had already been done by pyparsing.
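For the curious, the database step could look roughly like the sketch below. It is not the code I actually used: the bindings table, its column names, the bind_rows helper, and the in-memory SQLite database are all assumptions chosen to keep the example self-contained.

```python
import sqlite3
from pyparsing import (Group, Literal, OneOrMore, Suppress, Word, alphanums,
                       alphas, dblQuotedString, printables, removeQuotes)

def parse_bind_line(line):
    # Same grammar as above; copy() leaves the shared dblQuotedString untouched.
    quoted_real_server = dblQuotedString.copy().setParseAction(removeQuotes)
    real_server = Word(alphas, printables) | quoted_real_server
    port = Word(alphanums)
    real_server_port = Group(real_server + port)
    bind_expr = Suppress(Literal("bind")) + port + OneOrMore(real_server_port)
    return bind_expr.parseString(line)

def bind_rows(line):
    """Hypothetical helper: one (virtual_port, real_server, real_port) row per pair."""
    tokens = parse_bind_line(line).asList()
    virtual_port, pairs = tokens[0], tokens[1:]
    return [(virtual_port, server, port) for server, port in pairs]

# Hypothetical schema, using an in-memory database for the demo.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bindings (virtual_port TEXT, real_server TEXT, real_port TEXT)")

rows = bind_rows('bind http "Customer Server - 11" 81 "Customer Server 12" 82')
conn.executemany("INSERT INTO bindings VALUES (?, ?, ?)", rows)

stored = conn.execute(
    "SELECT virtual_port, real_server, real_port FROM bindings "
    "ORDER BY real_port").fetchall()
print(stored)
# [('http', 'Customer Server - 11', '81'), ('http', 'Customer Server 12', '82')]
```

Because each server/port pair is already grouped in the token list, turning a parsed line into rows is a one-line list comprehension.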
Once more, kudos to Paul McGuire for creating such a useful and fun tool.




