For the last couple of weeks we've been working on setting up a bunch of Dell C2100 servers with Ubuntu 10.04. These servers come with 2 x 500 GB internal disks that can be set up in RAID1 with the on-board RAID controller. However, when we did that during the Ubuntu install, we never managed to get back into the OS after the initial reboot, or in some cases GRUB refused to write to the array (I think that was when we tried 11.04 out of desperation). To make matters worse, even software RAID via mdadm stopped working, with the servers dropping to the initramfs BusyBox prompt after the initial reboot. My guess is that it all stems from GRUB not writing properly to /dev/md0 (the root partition RAID-ed during the install) and instead writing to /dev/sda1 and /dev/sdb1. So we decided to install the root partition and the swap space on /dev/sda and leave /dev/sdb alone.
I started to look for articles on how to set up RAID1 post-install, when you have the OS installed on /dev/sda, and you want to add /dev/sdb to a RAID1 setup with mdadm. After some fruitless searches, I finally hit the jackpot with this article on HowtoForge written by Falko Timme. I really don't have much to add, just follow the instructions closely and it will work ;-) Kudos to Falko for a great resource.
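For reference, here is a very rough sketch of the shape of that procedure, written as Python subprocess calls; the device and partition names (/dev/sda1, /dev/sdb1, /dev/md0) are assumptions based on our layout, and Falko's article remains the authoritative step-by-step reference -- don't run this blindly.
# Rough sketch only -- follow the HowtoForge article for the full procedure
# (copying data, updating fstab/mdadm.conf, reinstalling GRUB on both disks).
import subprocess

def run(cmd):
    print "+", cmd
    subprocess.check_call(cmd, shell=True)

# Copy the partition table from the OS disk to the empty second disk.
run("sfdisk -d /dev/sda | sfdisk --force /dev/sdb")

# Create a degraded RAID1 array that contains only /dev/sdb1 for now.
run("mdadm --create /dev/md0 --level=1 --raid-devices=2 missing /dev/sdb1")

# Later, after the root filesystem has been copied to /dev/md0 and GRUB has
# been reinstalled on both disks, the original partition completes the mirror:
# run("mdadm --add /dev/md0 /dev/sda1")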
Thursday, September 08, 2011
Friday, August 19, 2011
New location for the Python Testing Tools Taxonomy
I was taken by surprise by Baiju Muthukadan's announcement which I read on Planet Python -- the Python Testing Tools Taxonomy page which I started years ago has a new incarnation on the Python wiki. I think it's a good thing (although I still wish I had been notified as a courtesy). In any case, feel free to add more tools to the page!
Wednesday, August 17, 2011
Anybody using lxc or OpenVZ in production?
I asked a similar question yesterday on Twitter ("Anybody using Linux Containers (lxc) in production, preferably with Ubuntu?") and it seemed to have struck a chord, because many people asked me to post the answers to this question, and many other people answered the question.
"I'm using straight cgroups without namespaces in production. It's pretty nice for fine-grained scheduler control."
From ohlol:
"I just began using lxc. Have three hosts in it so far as a test run. Not doing NAT, just plain bridging right now."
"I have been using it for about a week on my laptop to replace Vagrant/Virtualbox. Works great so far."
"I just posted a short write up of how I use LXC on my laptop http://t.co/CQXTPMv"
From ohlol:
"Have you tried lxc without libvirt? I found it to be a bit easier to deal with."
From vvuksan:
"Yes that is a red herring. You do not need libvirt. I had it installed already so went with it by default."
"btw, if you ever get to that point: http://t.co/2WvYaeX helped get me to a working solution"
From ichilton:
"ive been using OpenVZ in production with Debian Stable (on the host and guests) for over a year with no problems...."
From griggheo:
"@ichilton you had to recompile the kernel for OpenVZ support in Debian right?"
From ichilton:
"I didn't, there was an OpenVZ kernel package but it was Lenny at the time and not upgraded yet - will have to check Squeeze."
From ichilton:
"@vvuksan interested why you did that originally and what the advantages are in hindsight?"
From vvuksan:
"Speed. The dev env needs 5-6 boxes running at the same time and with Vbox my laptop becomes really slow. With LXC not so much."
From sstatik:
"LXC should be considerably smoother in 11.10 for both 11.10/10.04 guests. I want to see laptop-based microclouds become common."
From heckj:
"@sstatik @griggheo documentation and details getting better? its arcane to use in 11.04, and that is 1000x better than 10.x..."
So there you have it, a small snapshot of why and how people are using lxc/OpenVZ, especially on Ubuntu. I'll post my own experiences as I start playing with lxc and potentially OpenVZ.
Both Linux Containers (or lxc, as the project is known) and OpenVZ are lightweight virtualization systems that operate at the operating system level rather than emulating hardware, which makes them attractive if you want to split a big physical server into containers while achieving resource isolation per container. I personally want to look into both primarily as a means to run several MySQL instances per physical server while ensuring better resource isolation, especially in regard to RAM.
In any case, I thought it would be interesting to post the replies I got on Twitter to my question.
From AlTobey:
"I'm using straight cgroups without namespaces in production. It's pretty nice for fine-grained scheduler control."
From ohlol:
"I just began using lxc. Have three hosts in it so far as a test run. Not doing NAT, just plain bridging right now."
From vvuksan:
"I have been using it for about a week on my laptop to replace Vagrant/Virtualbox. Works great so far."
"I just posted a short write up of how I use LXC on my laptop http://t.co/CQXTPMv"
From ohlol:
"Have you tried lxc without libvirt? I found it to be a bit easier to deal with."
From vvuksan:
"Yes that is a red herring. You do not need libvirt. I had it installed already so went with it by default."
"It just helps me not have to set up dnsmasq, iptables etc. :-) But you can certainly do away with it."
From ohlol:
"Have you tried doing an apt-get upgrade in lxc? What a PITA :)"
From ohlol:
"btw, if you ever get to that point: http://t.co/2WvYaeX helped get me to a working solution"
From ichilton:
"ive been using OpenVZ in production with Debian Stable (on the host and guests) for over a year with no problems...."
From griggheo:
"@ichilton you had to recompile the kernel for OpenVZ support in Debian right?"
From ichilton:
"I didn't, there was an OpenVZ kernel package but it was Lenny at the time and not upgraded yet - will have to check Squeeze."
From ichilton:
"@vvuksan interested why you did that originally and what the advantages are in hindsight?"
From vvuksan:
"Speed. The dev env needs 5-6 boxes running at the same time and with Vbox my laptop becomes really slow. With LXC not so much."
From sstatik:
"LXC should be considerably smoother in 11.10 for both 11.10/10.04 guests. I want to see laptop-based microclouds become common."
From mitchellh:
"@sstatik @griggheo Laptop based microclouds are the future. We're just missing quality software to help manage it."
From heckj:
"@sstatik @griggheo documentation and details getting better? its arcane to use in 11.04, and that is 1000x better than 10.x..."
So there you have it, a small snapshot of why and how people are using lxc/OpenVZ, especially on Ubuntu. I'll post my own experiences as I start playing with lxc and potentially OpenVZ.
Wednesday, July 27, 2011
Processing mail logs with Elastic MapReduce and Pig
These are some notes I took while trying out Elastic MapReduce (EMR), and more specifically its Pig functionality, by processing sendmail mail logs. A big help was Eric Lubow's blog post on EMR and Pig. Before I go into details, here's my general processing flow:
- N mail servers (running sendmail) send their mail logs to a central server running syslog-ng.
- A process running on the central logging server tails the aggregated mail log (at 5 minute intervals), parses the lines it finds, extracts relevant information from each line, and saves the output in JSON format to a local file (actually there are 2 types of files generated, one for sender information and one for recipient information, corresponding to the 'from' and 'to' lines in the mail log -- see below)
- Another process compresses the generated files in bzip2 format and uploads them to S3.
I have 2 sets of files, one set with names similar to "from-2011-07-12-20-58" and containing JSON records of the following form, one per line:
{"nrcpts": "1", "src": "info@example.com", "sendmailid": "p6D0r0u1006229", "relay": "app03.example.com", "classnumber": "0", "msgid": "WARQZCXAEMSSVWPPOOYZXRLQIKMFUY.155763@example.com", "
pid": "6229", "month": "Jul", "time": "20:53:00", "day": "12", "mailserver": "mail5", "size": "57395"}
The second set contains files with names similar to "to-2011-07-12-20-58" and containing JSON records of the following form, one per line:
{"sendmailid": "p6D0qwvm006395", "relay": "gmail-smtp-in.l.google.com.", "dest": "somebody@gmail.com", "pid": "6406", "stat": "Sent (OK 1310518380 pd12si6025606vcb.162)", "month": "Jul", "delay": "00:00:02", "time": "20:53:00", "xdelay": "00:00:02", "day": "12", "mailserver": "mail2"}
For the initial EMR/Pig setup, I followed "Parsing Logs with Apache Pig and Elastic MapReduce". It's fairly simple to end up with an EC2 instance running Hadoop and Pig that you can play with.
I then ssh-ed into the EMR master instance (note that it was still shown in 'Waiting' state in the EMR console, but once it got assigned an IP and internal name I was able to ssh into it).
In order for Pig to be able to process input in JSON format, you need to use Kevin Weil's elephant-bird library. I followed Eric Lubow's post to get that set up:
$ mkdir git && mkdir pig-jars
$ cd git && wget --no-check-certificate https://github.com/kevinweil/elephant-bird/tarball/eb1.2.1_with_jsonloader
$ tar xvfz eb1.2.1_with_jsonloader
$ cd kevinweil-elephant-bird-ecf8356/
$ cp lib/google-collect-1.0.jar ~/pig-jars/
$ cp lib/json-simple-1.1.jar ~/pig-jars/
$ ant nonothing
$ cd build/classes/
$ jar -cf ../elephant-bird-1.2.1-SNAPSHOT.jar com
$ cp ../elephant-bird-1.2.1-SNAPSHOT.jar ~/pig-jars/
I then copied 3 elephant-bird jar files to S3 so I can register them every time I run Pig. I did that via the grunt command prompt:
$ pig -x local
grunt> cp file:///home/hadoop/pig-jars/google-collect-1.0.jar s3://MY_S3_BUCKET/jars/pig/
grunt> cp file:///home/hadoop/pig-jars/json-simple-1.1.jar s3://MY_S3_BUCKET/jars/pig/
grunt> cp file:///home/hadoop/pig-jars/elephant-bird-1.2.1-SNAPSHOT.jar s3://MY_S3_BUCKET/jars/pig/
At this point, I was ready to process some of the files I uploaded to S3.
I first tried processing a single file, using Pig's local mode (which doesn't involve HDFS). It turns out that Pig doesn't load compressed files correctly via elephant-bird when you run in local mode, so I tested this on an uncompressed file previously uploaded to S3:
$ pig -x local
grunt> REGISTER s3://MY_S3_BUCKET/jars/pig/google-collect-1.0.jar;
grunt> REGISTER s3://MY_S3_BUCKET/jars/pig/json-simple-1.1.jar;
grunt> REGISTER s3://MY_S3_BUCKET/jars/pig/elephant-bird-1.2.1-SNAPSHOT.jar;
grunt> json = LOAD 's3://MY_S3_BUCKET/mail_logs/2011-07-12/to-2011-07-12-16-49' USING com.twitter.elephantbird.pig.load.JsonLoader();
Note that I used the JSON loader from the elephant-bird JAR file.
I wanted to know the top 3 mail servers from the file I loaded (this is again heavily inspired by Eric Lubow's example in his blog post):
grunt> mailservers = FOREACH json GENERATE $0#'mailserver' AS mailserver;
grunt> mailserver_count = FOREACH (GROUP mailservers BY $0) GENERATE $0, COUNT($1) AS cnt;
grunt> mailserver_sorted_count = LIMIT(ORDER mailserver_count BY cnt DESC) 3;
grunt> DUMP mailserver_sorted_count;
I won't go into detail about the actual Pig operations I ran -- I recommend going through some Pig Latin tutorials or buying the O'Reilly 'Programming Pig' book. Suffice it to say that I extracted the 'mailserver' JSON field, grouped the records by mail server, and counted how many records there are in each group. Finally, I dumped the top 3 mail servers found.
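As a sanity check, here is the equivalent of that Pig pipeline in plain Python against one of the uncompressed files -- obviously this only makes sense locally, the whole point of Pig/EMR being to run it over much larger data sets.
# Plain-Python equivalent of the GROUP/COUNT/ORDER/LIMIT pipeline above,
# run locally against one uncompressed JSON-per-line file.
import json
from collections import defaultdict

counts = defaultdict(int)
for line in open('to-2011-07-12-16-49'):
    record = json.loads(line)
    counts[record['mailserver']] += 1

top3 = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)[:3]
for mailserver, cnt in top3:
    print mailserver, cnt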
Here's a slightly more interesting exercise: finding out the top 10 mail recipients by looking at all the to-* files uploaded to S3 (still uncompressed in this case):
grunt> to = LOAD 's3://MY_S3_BUCKET/mail_logs/2011-07-13/to*' USING com.twitter.elephantbird.pig.load.JsonLoader();
grunt> to_emails = FOREACH to GENERATE $0#'dest' AS dest;
grunt> to_count = FOREACH (GROUP to_emails BY $0) GENERATE $0, COUNT($1) AS cnt;
grunt> to_sorted_count = LIMIT(ORDER to_count BY cnt DESC) 10;
grunt> DUMP to_sorted_count;
I tried the same processing steps on bzip2-compressed files using Pig's Hadoop mode (which you invoke by just running 'pig' and not 'pig -x local'). The files were loaded correctly this time, but the MapReduce phase failed with messages similar to this in /mnt/var/log/apps/pig.log:
Pig Stack Trace
---------------
ERROR 6015: During execution, encountered a Hadoop error.
org.apache.pig.impl.logicalLayer.FrontendException: ERROR 1066: Unable to open iterator for alias to_sorted_count
at org.apache.pig.PigServer.openIterator(PigServer.java:482)
at org.apache.pig.tools.grunt.GruntParser.processDump(GruntParser.java:546)
at org.apache.pig.tools.pigscript.parser.PigScriptParser.parse(PigScriptParser.java:241)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:165)
at org.apache.pig.tools.grunt.GruntParser.parseStopOnError(GruntParser.java:141)
at org.apache.pig.tools.grunt.Grunt.run(Grunt.java:75)
at org.apache.pig.Main.main(Main.java:374)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: org.apache.pig.backend.executionengine.ExecException: ERROR 6015: During execution, encountered a Hadoop error.
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:862)
at org.apache.hadoop.mapred.MapTask$OldOutputCollector.collect(MapTask.java:474)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.collect(PigMapReduce.java:109)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.runPipeline(PigMapBase.java:255)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapBase.map(PigMapBase.java:244)
at org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigMapReduce$Map.map(PigMapReduce.java:94)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:363)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:312)
Caused by: java.io.IOException: Type mismatch in key from map: expected org.apache.pig.impl.io.NullableBytesWritable, recieved org.apache.pig.impl.io.NullableText
... 9 more
A quick Google search revealed JIRA Pig ticket #919 which offered a workaround. Basically this happens when a value coming out of a map is used in a group/cogroup/join. By default the type of that value is bytearray, and you need to cast it to chararray to make things work (I confess I didn't dig too much into the nitty-gritty of this issue yet, I was just happy I made it work).
So what I had to do was to modify a single line and cast the value used in the GROUP BY clause to chararray:
grunt> to_count = FOREACH (GROUP to_emails BY (chararray)$0) GENERATE $0, COUNT($1) AS cnt;
At this point, I was able to watch Elastic MapReduce in action, although it was slower than in local mode because I only had 1 m1.small instance. I'll try it next with several instances and hopefully see a near-linear improvement.
That's it for now. This was just a toy example, but it got me started with EMR and Pig. Hopefully I'll follow up with more interesting log processing and analysis.
Friday, July 22, 2011
Results of a survey of the SoCal Piggies group
My colleague Warren Runk had the idea of putting together a survey to be sent to the mailing list of the SoCal Python Interest Group (aka SoCal Piggies), with the purpose of finding out which topics or activities would be most interesting to the members of the group in terms of future meetings. We had 10 topics in the survey, and people responded by choosing their top 5. We also had free-form response fields for 2 questions: "What do you like most about the meetings?" and "What meeting improvements are most important to you?".
We had 26 responses. Here are the voting results for the 10 topics we proposed:
#1 (18 votes): "Good practice, pitfall avoidance, and module introductions for beginners"
#2 (17 votes): "5 minute lightning talks"
#3 - #4 (15 votes): "Excellent code examples from established Python projects" and "New and upcoming Python open source projects"
#5 (14 votes): "30 minute presentations"
#6 (13 votes): "Ice breakers/new member introductions"
#7 (12 votes): "Algorithm discussions and dissections"
#8 (11 votes): "Good testing practices and pointers to new methods/tools"
#9 (10 votes): "Moderated relevant/cutting edge general tech discussions"
#10 (9 votes): "Short small group programming exercises"
It's pretty clear that people are interested most of all in good Python programming practices and practical examples of 'excellent' code from established projects. Presentations are popular too, with lightning talks edging the longer 30-minute talks. A pretty good percentage of the people attending our meetings are beginners, so we're going to try to focus on making our meetings more beginner-friendly.
As far as what people like most about the meetings, here are a few things:
- "I love hearing about how Python is being used in multiple locations throughout large corporations. It helps me to promote Python at every opportunity when I can say that Python is being used at Acme Corp for XYZ!"
- "High level introductions to Python modules. Often this is not the main thrust of a talk, but the speaker chose some module for a given task and that helps me expand my horizon."
- "Becoming aware of how various companies use python, which libraries and tools are used most often, the opportunity to connect with members during breaks."
- "I like being exposed to things I don't normally see at work of if I've seen them I get to see them from a different angle. "
- "I don't have other geeks at my office so I like having the chance to hang out and get to know other Python programmers."
- ...and many people expressed their satisfaction in seeing Raymond Hettinger's presentation at Disney Animation Studios (thanks to Paul Hildebrandt for putting that together!)
Here's what people said when asked about possible improvements:
- "More best practices and module intros."
- "Keep the meetings loose, don't have too many controls. "
- "In addition to the aptly proposed "ice-breakers / introductions" how can we current members more-actively welcome beginners?"
- "Time (and some format) to discuss the issues brought up in the talks. Sometimes I think it'd be useful for the group to get more directly involved in vetting/providing critique for some of the decisions a speaker made. Controversial points made in talks are great, but sometimes I think everyone might benefit from a few other perspectives."
- "Friendlier onboarding of new members would be great."
- "Keeping the total noobs in mind"
- "I would like introductions. I have met a couple people at each of the meetings that I have attended, but I would also like to know who else is there."
- "I would like the opportunity to meet resourceful programmers and learn techniques and abilities that I can't pick up from youtube or online tutorials!"
- "I think we should try to come up with and stick with a consistent format. I like the discussion-style presentation so long as it does not detract from the topic at hand. I think we need to make sure that people stick with shorter presentations, so that there is plenty of time for Q&A without the risk of running on too long. 30 minutes should really be 30 minutes! "
- "It would be good to identify the difficulty/skill level of a presentation ahead of time so that beginners are not scared off or at least know what they're getting into. Perhaps we could try to always mix it up by warming up with a beginner/intermediate preso and follow up with an intermediate/advanced."
We had a meeting last night where we discussed some of these topics. We tried to appoint point persons for given topics. These persons would be responsible for doing research on that topic (for example 'New and upcoming Python open source projects') and give a short presentation to the group at every meeting, while also looking for other group members to delegate this responsibility to in the future. I think this 'lieutenant' system will work well, but time will tell. My personal observation from the 7 years I've been organizing this group is that the hardest part is to get people to volunteer in any capacity, and most of all in presenting to the group. But this infusion of new ideas is very welcome, and I hope it will invigorate the participation in our group.
I hope the results of this survey and the feedback we got will be useful to other Python user groups out there.
I want to thank Warren Runk and Danny Greenfeld for their feedback, ideas and participation in making the SoCal Piggies Group better.
Wednesday, July 20, 2011
Accessing the data center from the cloud with OpenVPN
This post was inspired by a recent exercise I went through at the prompting of my colleague Dan Mesh. The goal was to have Amazon EC2 instances connect securely to servers at a data center using OpenVPN.
In this scenario, we have a server within the data center running OpenVPN in server mode. The server has a publicly accessible IP (via a firewall NAT) with port 1194 exposed via UDP. Cloud instances which run OpenVPN in client mode are connecting to the server, get a route pushed to them to an internal network within the data center, and are then able to access servers on that internal network over a VPN tunnel.
Here are some concrete details about the network topology that I'm going to discuss.
Server A at the data center has an internal IP address of 10.10.10.10 and is part of the internal network 10.10.10.0/24. There is a NAT on the firewall mapping external IP X.Y.Z.W to the internal IP of server A. There is also a rule that allows UDP traffic on port 1194 to X.Y.Z.W.
I have an EC2 instance from which I want to reach server B on the internal data center network, with IP 10.10.10.20.
Install and configure OpenVPN on server A
Since server A is running Ubuntu (10.04 to be exact), I used this very good guide, with an important exception: I didn't want to configure the server in bridging mode, I preferred the simpler tunneling mode. In bridging mode, the internal network which server A is part of (10.10.10.0/24 in my case) is directly exposed to OpenVPN clients. In tunneling mode, there is a tunnel created between clients and server A on a separated dedicated network. I preferred the tunneling option because it doesn't require any modifications to the network setup of server A (no bridging interface required), and because it provides better security for my requirements (I can target individual servers on the internal network and configure them to be accessed via VPN). YMMV of course.
For the initial installation and key creation for OpenVPN, I followed the guide. When it came to configuring the OpenVPN server, I created these entries in /etc/openvpn/server.conf:
server 172.16.0.0 255.255.255.0
push "route 10.10.10.0 255.255.255.0"
tls-auth ta.key 0
The first directive specifies that the OpenVPN tunnel will be established on a new 172.16.0.0/24 network. The server will get the IP 172.16.0.1, while OpenVPN clients that connect to the server will get 172.16.0.6 etc.
The second directive pushes a static route to the internal data center network 10.10.10.0/24 to all connected OpenVPN clients. This way each client will know how to get to machines on that internal network, without the need to create static routes manually on the client.
The tls-auth entry provides extra security to help prevent DoS attacks and UDP port flooding.
Note that I didn't have to include any bridging-related scripts or other information in server.conf.
At this point, if you start the OpenVPN service on server A via 'service openvpn start', you should see an extra tun0 network interface when you run ifconfig. Something like this:
tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:172.16.0.1 P-t-P:172.16.0.2 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:168 (168.0 B) TX bytes:168 (168.0 B)
Also, the routing information will now include the 172.16.0.0 network:
# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
172.16.0.2 0.0.0.0 255.255.255.255 UH 0 0 0 tun0
172.16.0.0 172.16.0.2 255.255.255.0 UG 0 0 0 tun0
...etc
Install and configure OpenVPN on clients
Here again I followed the Ubuntu OpenVPN guide. The steps are very simple:
1) apt-get install openvpn
2) scp the following files (which were created on the server during the OpenVPN server install process above) from server A to the client, into the /etc/openvpn directory:
ca.crt
ta.key
client_hostname.crt
client_hostname.key
3) Customize client.conf:
# cp /usr/share/doc/openvpn/examples/sample-config-files/client.conf /etc/openvpn
Edit client.conf and specify:
remote X.Y.Z.W 1194 (where X.Y.Z.W is the external IP of server A)
cert client_hostname.crt
key client_hostname.key
tls-auth ta.key 1
Now if you start the OpenVPN service on the client via 'service openvpn start', you should see a tun0 interface when you run ifconfig:
tun0 Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
inet addr:172.16.0.6 P-t-P:172.16.0.5 Mask:255.255.255.255
UP POINTOPOINT RUNNING NOARP MULTICAST MTU:1500 Metric:1
RX packets:2 errors:0 dropped:0 overruns:0 frame:0
TX packets:2 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:100
RX bytes:168 (168.0 B) TX bytes:168 (168.0 B)
You should also see routing information related to both the tunneling network 172.16.0.0/24 and to the internal data center network 10.10.10.0/24 (whose route was pushed from the server):
# netstat -rn
Kernel IP routing table
Destination Gateway Genmask Flags MSS Window irtt Iface
172.16.0.5 0.0.0.0 255.255.255.255 UH 0 0 0 tun0
172.16.0.1 172.16.0.5 255.255.255.255 UGH 0 0 0 tun0
10.10.10.0 172.16.0.5 255.255.255.0 UG 0 0 0 tun0
....etc
At this point, the client and server A should be able to ping each other on their 172.16 IP addresses. From the client you should be able to ping server A's IP 172.16.0.1, and from server A you should be able to ping the client's IP 172.16.0.6.
Create static route to tunneling network on server B and enable IP forwarding on server A
Remember that the goal was for the client to access server B on the internal data center network, with IP address 10.10.10.20. For this to happen, I needed to add a static route on server B to the tunneling network 172.16.0.0/24, with server A's IP 10.10.10.10 as the gateway:
# route add -net 172.16.0.0/24 gw 10.10.10.10
The final piece of the puzzle is to allow server A to act as a router at this point, by enabling IP forwarding (which is disabled by default). So on server A I did:
# sysctl -w net.ipv4.ip_forward=1
# echo "net.ipv4.ip_forward=1" >> /etc/sysctl.conf
At this point, I was able to access server B from the client by using server B's 10.10.10.20 IP address.
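A quick way to verify this beyond ping is a plain TCP connection from the client to a port you know is open on server B (I'm assuming SSH on port 22 here):
# Minimal connectivity check over the VPN tunnel: TCP connect from the
# EC2 client to server B on the internal network (port 22 assumed open).
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.settimeout(5)
try:
    s.connect(('10.10.10.20', 22))
    print "reached server B over the VPN tunnel"
except socket.error, e:
    print "could not reach server B:", e
finally:
    s.close()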
We've just started to experiment with this setup, so I'm not yet sure if it's production ready. I wanted to jot down these things though because they weren't necessarily obvious, despite some decent blog posts and OpenVPN documentation. Hopefully they'll help somebody else out there too.
Thursday, June 30, 2011
A strategy for handling DNS in EC2 with Route 53
In my previous post I showed how to use the boto library to manage Route 53 DNS zones. Here I will show a strategy for handling DNS within an EC2 infrastructure using Route 53.
Let's assume you have a registered domain name called mycompanycloud.com. You want all your EC2 instances to use that domain name to communicate with each other. Assume you launch a database instance that you want to refer to as db01.mycompanycloud.com. What you do is you add a CNAME record in the DNS zone for mycompanycloud.com and point it to the external AWS name assigned to that instance. For example:
# route53 add_record ZONEID db01.mycompanycloud.com CNAME ec2-51-10-11-89.compute-1.amazonaws.com 3600
The advantage of this method is that DNS queries for db01.mycompanycloud.com from within EC2 will eventually resolve the CNAME to the internal IP address of the instance, while DNS queries from outside EC2 will resolve it to the external IP address -- which is in general exactly what you want.
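If you want to automate this from a boot script or a Chef/Puppet recipe, the instance can look up its own external DNS name via the EC2 metadata service and register the CNAME itself with boto (the same Route53Connection/ResourceRecordSets calls from the previous post, shown further down this page); here's a sketch, with the zone ID and hostname as placeholders:
# Sketch: have an instance register its own CNAME at boot. ZONE_ID and
# HOSTNAME are placeholders; boto credentials are assumed to be in ~/.boto.
import urllib2
from boto.route53.connection import Route53Connection
from boto.route53.record import ResourceRecordSets

ZONE_ID = 'MYZONEID'
HOSTNAME = 'db01.mycompanycloud.com'

# The EC2 metadata service exposes the instance's external DNS name.
public_hostname = urllib2.urlopen(
    'http://169.254.169.254/latest/meta-data/public-hostname').read().strip()

conn = Route53Connection()
changes = ResourceRecordSets(conn, ZONE_ID)
change = changes.add_change("CREATE", HOSTNAME, "CNAME", 3600)
change.add_value(public_hostname)
changes.commit()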
There's one more caveat: if you need the default DNS and search domain in /etc/resolv.conf to be mycompanycloud.com, you need to configure the DHCP client to use that domain, by adding this line to /etc/dhcp3/dhclient.conf:
supersede domain-name "mycompanycloud.com ec2.internal compute-1.internal" ;
Then edit/overwrite /etc/resolv.conf and specify:
nameserver 172.16.0.23
domain mycompanycloud.com
search mycompanycloud.com ec2.internal compute-1.internal
The line in dhclient.conf will ensure that your custom resolv.conf file will be preserved across reboots -- which is not usually the case in EC2 with the default DHCP behavior (thanks to Gerald Chao for pointing out this solution to me).
Of course, you should have all this in the Chef or Puppet recipes you use when you build out a new instance.
I've been applying this strategy for a while and it works out really well, and it also allows me to not run and take care of my own BIND servers in EC2.
Monday, June 20, 2011
Managing Amazon Route 53 DNS with boto
Here's a quick post that shows how to manage Amazon Route 53 DNS zones and records using the ever-useful boto library from Mitch Garnaat. Route 53 is a typical pay-as-you-go inexpensive AWS service which you can use to host your DNS zones. I wanted to play with it a bit, and some Google searches revealed two good blog posts: "Boto and Amazon Route53" by Chris Moyer and "Using boto to manage Route 53" by Rob Ballou. I want to thank those two guys for blogging about Route 53, their posts were a great help to me in figuring things out.
Install boto
My machine is running Ubuntu 10.04 with Python 2.6. I ran 'easy_install boto', which installed boto-2.0rc1. This also installs several utilities in /usr/local/bin, of interest to this article being /usr/local/bin/route53 which provides an easy command-line-oriented way of interacting with Route 53.
Create boto configuration file
I created ~/.boto containing the Credentials section with the AWS access key and secret key:
# cat ~/.boto
[Credentials]
aws_access_key_id = "YOUR_ACCESS_KEY"
aws_secret_access_key = "YOUR_SECRET_KEY"
Interact with Route 53 via the route53 utility
If you just run 'route53', the command will print the help text for its usage. For our purpose, we'll make sure there are no errors when we run:
# route53 ls
If you don't have any DNS zones already created, this will return nothing.
Create a new DNS zone with route53
We'll create a zone called 'mytestzone':
# route53 create mytestzone.com
Pending, please add the following Name Servers:
ns-674.awsdns-20.net
ns-1285.awsdns-32.org
ns-1986.awsdns-56.co.uk
ns-3.awsdns-00.com
Note that you will have to properly register 'mytestzone.com' with a registrar, then set the name server information at that registrar to the name servers returned when the Route 53 zone was created (in our case the 4 name servers above).
At this point, if you run 'route53 ls' again, you should see your newly created zone. You need to make note of the zone ID:
root@m2:~# route53 ls
================================================================================
| ID: MYZONEID
| Name: mytestzone.com.
| Ref: my-ref-number
================================================================================
{}
You can also get the existing records from a given zone by running the 'route53 get' command which also takes the zone ID as an argument:
# route53 get MYZONEID
Name Type TTL Value(s)
mytestzone.com. NS 172800 ns-674.awsdns-20.net.,ns-1285.awsdns-32.org.,ns-1986.awsdns-56.co.uk.,ns-3.awsdns-00.com.
mytestzone.com. SOA 900 ns-674.awsdns-20.net. awsdns-hostmaster.amazon.com. 1 7200 900 1209600 86400
Adding and deleting DNS records using route53
Let's add an A record to the zone we just created. The route53 utility provides an 'add_record' command which takes the zone ID as an argument, followed by the name, type, value and TTL of the new record, and an optional comment. The TTL is also optional, and defaults to 600 seconds if not specified. Here's how to add an A record with a TTL of 3600 seconds:
# route53 add_record MYZONEID test.mytestzone.com A SOME_IP_ADDRESS 3600
{u'ChangeResourceRecordSetsResponse': {u'ChangeInfo': {u'Status': u'PENDING', u'SubmittedAt': u'2011-06-20T23:01:23.851Z', u'Id': u'/change/CJ2GH5O38HYKP0'}}}
Now if you run 'route53 get MYZONEID' you should see your newly added record.
To delete a record, use the 'route53 del_record' command, which takes the same arguments as add_record. Here's how to delete the record we just added:
# route53 del_record Z247A81E3SXPCR test.mytestzone.com. A SOME_IP_ADDRESS
{u'ChangeResourceRecordSetsResponse': {u'ChangeInfo': {u'Status': u'PENDING', u'SubmittedAt': u'2011-06-21T01:14:35.343Z', u'Id': u'/change/C2B0EHROD8HEG8'}}}
Managing Route 53 programmatically with boto
As useful as the route53 command-line utility is, sometimes you need to interact with the Route 53 service from within your program. Since this post is about boto, I'll show some Python code that uses the Route 53 functionality.
Here's how you open a connection to the Route 53 service:
from boto.route53.connection import Route53Connection
conn = Route53Connection()
(this assumes you have the AWS credentials in the ~/.boto configuration file)
Here's how you retrieve and walk through all your Route 53 DNS zones, selecting a zone by name:
ROUTE53_ZONE_NAME = "mytestzone.com." zones = {} conn = Route53Connection() results = conn.get_all_hosted_zones() zones = results['ListHostedZonesResponse']['HostedZones'] found = 0 for zone in zones: print zone if zone['Name'] == ROUTE53_ZONE_NAME: found = 1 break if not found: print "No Route53 zone found for %s" % ROUTE53_ZONE_NAME
(note that you need the ending period in the zone name that you're looking for, as in "mytestzone.com.")
Here's how you add a CNAME record with a TTL of 60 seconds to an existing zone (assuming the 'zone' variable contains the zone you're looking for). You need to operate on the zone ID, which is the identifier following the text '/hostedzone/' in the 'Id' field of the variable 'zone'.
from boto.route53.record import ResourceRecordSets
zone_id = zone['Id'].replace('/hostedzone/', '')
changes = ResourceRecordSets(conn, zone_id)
change = changes.add_change("CREATE", 'test2.%s' % ROUTE53_ZONE_NAME, "CNAME", 60)
change.add_value("some_other_name")
changes.commit()
To delete a record, you use the exact same code as above, but with "DELETE" instead of "CREATE".
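For completeness, the delete looks like this, continuing with the conn, zone_id and ROUTE53_ZONE_NAME variables from above (note that the name, type, TTL and value have to match the record as it exists in the zone):
# Deleting the CNAME created above: same ResourceRecordSets code, with "DELETE".
from boto.route53.record import ResourceRecordSets
changes = ResourceRecordSets(conn, zone_id)
change = changes.add_change("DELETE", 'test2.%s' % ROUTE53_ZONE_NAME, "CNAME", 60)
change.add_value("some_other_name")
changes.commit()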
I leave other uses of the 'route53' utility and of the boto Route 53 API as an exercise to the reader.
Wednesday, June 01, 2011
Technical books that influenced my career
Here's a list of 25 technical books that had a strong influence on my career, presented in a somewhat chronological order of my encounters with them:
- "The Art of Computer Programming", esp. vol. 3 "Sorting and Searching" - Donald Knuth
- "Operating Systems" - William Stallings
- "Introduction to Algorithms" - Thomas Cormen et al.
- "The C Programming Language" - Brian Kernighan and Dennis Ritchie
- "Programming Windows" - Charles Petzold
- "Writing Solid Code" - Steve Maguire
- "The Practice of Programming" - Brian Kernighan and Rob Pike
- "Computer Networks - a Systems Approach" - Larry Peterson and Bruce Davie
- "TCP/IP Illustrated" - W. Richard Stevens
- "Distributed Systems - Concepts And Design" - George Coulouris et al.
- "DNS and BIND" - Cricket Liu and Paul Albitz
- "UNIX and Linux System Administration Handbook" - Evi Nemeth et al.
- "The Mythical Man-Month" - Fred Brooks
- "Programming Perl" - Larry Wall et al.
- "Counter Hack Reloaded: a Step-by-Step Guide to Computer Attacks and Effective Defenses" - Edward Skoudis and Tom Liston
- "Programming Python" - Mark Lutz
- "Lessons Learned in Software Testing" - Cem Kaner, James Bach, Bret Pettichord
- "Refactoring - Improving the Design of Existing Code" - Martin Fowler
- "The Pragmatic Programmer" - Andrew Hunt and David Thomas
- "Becoming a Technical Leader" - Gerald Weinberg
- "Extreme Programming Explained" - Kent Beck
- "Programming Amazon Web Services" - James Murty
- "Building Scalable Web Sites" - Cal Henderson
- "RESTful Web Services" - Leonard Richardson, Sam Ruby
- "The Art of Capacity Planning" - John Allspaw
What is your list?
Tuesday, May 24, 2011
Setting up RAID 0 across ephemeral drives on EC2 instances (and surviving reboots!)
I've been experimenting with setting up RAID 0 across ephemeral drives on EC2 instances. The initial setup, be it with mdadm and lvm, or directly with lvm, is not that hard -- what has proven challenging is surviving reboots. Unless you perform certain tricks, your EC2 instance will be blissfully unaware of its new setup after a reboot. What's more, if you try to mount the new striped volume at boot time by adding it to /etc/fstab, chances are you won't even be able to ssh into the instance anymore. It happened to me many times while experimenting, hence this blog post.
Update: I realize I didn't go into details about the use case of this type of setup. This is useful if you don't want to incur EBS performance and reliability penalties, and yet you have a data set that is larger than the 400 GB offered by an individual ephemeral drive. Of course, if your instance dies, so do the ephemeral drives (after all they are named like this for a reason...) -- so make sure you have a good backup/disaster recovery strategy for the data you store there!
In the following, I will assume you want to set up RAID 0 across the four ephemeral drives that come with an EC2 m1.xlarge instance, and which are exposed as devices /dev/sdb through /dev/sde. By default, /dev/sdb is mounted as /mnt, while the other drives aren't mounted.
I also assume you want to create 1 volume group encompassing the RAID 0 array, and within that volume group you want to create 2 logical volumes with associated XFS file systems, and also 1 logical volume for swap.
Step 1 - unmount /dev/sdb
# umount /dev/sdb
(also comment out the entry corresponding to /dev/sdb in /etc/fstab)
Step 2 - install lvm2 and mdadm
For an unattended install of these packages (slightly complicated by the fact that mdadm also needs postfix), I do:
# DEBIAN_FRONTEND=noninteractive apt-get -y install mdadm lvm2
Step 3 - manually load the dm-mod module
# modprobe dm-mod
(this seems to be a bug in devmapper in Ubuntu)
If you want to set up RAID 0 via lvm directly, you can skip steps 4 and 5. From what I've read, you get better performance if you do the RAID 0 setup with mdadm. Also, if you need any other RAID level, you need to use mdadm.
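One note on the lvm-only route: by default lvcreate allocates linear (concatenated) volumes, so to actually stripe across the 4 drives you need the -i/-I options. Here's a rough sketch of that alternative, wrapped in Python subprocess calls for consistency with the other snippets in this post; the 256 KB stripe size mirrors the mdadm chunk size below and is otherwise an assumption.
# Sketch of the lvm-only alternative (no mdadm): physical volumes on each
# ephemeral drive, one volume group, and striped logical volumes.
import subprocess

DEVICES = ['/dev/sdb', '/dev/sdc', '/dev/sdd', '/dev/sde']

def run(cmd):
    print "+", " ".join(cmd)
    subprocess.check_call(cmd)

for dev in DEVICES:
    run(['pvcreate', dev])
run(['vgcreate', 'vg0'] + DEVICES)

# -i = number of stripes, -I = stripe size in KB; without these, the logical
# volumes would be linear instead of striped across the drives.
run(['lvcreate', '--name', 'data1', '--size', '500G', '-i', '4', '-I', '256', 'vg0'])
run(['lvcreate', '--name', 'data2', '--size', '500G', '-i', '4', '-I', '256', 'vg0'])
run(['lvcreate', '--name', 'swap', '--size', '10G', '-i', '4', '-I', '256', 'vg0'])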
Step 4 - configure RAID 0 array via mdadm
# mdadm --create /dev/md0 --level=0 --chunk=256 --raid-devices=4 /dev/sdb /dev/sdc /dev/sdd /dev/sde
Verify:
# mdadm --detail /dev/md0
/dev/md0:
Version : 00.90
Creation Time : Mon May 23 22:35:20 2011
Raid Level : raid0
Array Size : 1761463296 (1679.86 GiB 1803.74 GB)
Raid Devices : 4
Total Devices : 4
Preferred Minor : 0
Persistence : Superblock is persistent
Update Time : Mon May 23 22:35:20 2011
State : clean
Active Devices : 4
Working Devices : 4
Failed Devices : 0
Spare Devices : 0
Chunk Size : 256K
UUID : 03f63ee3:607fb777:f9441841:42247c4d (local to host adb08lvm)
Events : 0.1
Number Major Minor RaidDevice State
0 8 16 0 active sync /dev/sdb
1 8 32 1 active sync /dev/sdc
2 8 48 2 active sync /dev/sdd
3 8 64 3 active sync /dev/sde
Step 5 - increase the read-ahead value on the array for better performance
# blockdev --setra 65536 /dev/md0
Step 6 - create physical volume from the RAID 0 array
# pvcreate /dev/md0
(if you didn't want to use mdadm, you would call pvcreate against each of the /dev/sdb through /dev/sde devices)
Step 7 - create volume group called vg0 spanning the RAID 0 array
# vgcreate vg0 /dev/md0
(if you didn't want to use mdadm, you would run vgcreate and specify the 4 devices /dev/sdb through /dev/sde)
Verify:
# vgscan
Reading all physical volumes. This may take a while...
Found volume group "vg0" using metadata type lvm2
# pvscan
PV /dev/md0 VG vg0 lvm2 [1.64 TiB / 679.86 GiB free]
Total: 1 [1.64 TiB] / in use: 1 [1.64 TiB] / in no VG: 0 [0 ]
Step 8 - create 3 logical volumes within the vg0 volume group
Each local drive is 400 GB, so the total size for the volume group is 1.6 TB. I'll create 2 logical volumes at 500 GB each, and a 10 GB logical volume for swap.
# lvcreate --name data1 --size 500G vg0
# lvcreate --name data2 --size 500G vg0
# lvcreate --name swap --size 10G vg0
Verify:
# lvscan
ACTIVE '/dev/vg0/data1' [500.00 GiB] inherit
ACTIVE '/dev/vg0/data2' [500.00 GiB] inherit
ACTIVE '/dev/vg0/swap' [10.00 GiB] inherit
Step 9 - create XFS file systems and mount them
We'll create XFS file systems for the data1 and data2 logical volumes. The names of the devices used for mkfs are the ones displayed via the lvscan command above. Then we'll mount the 2 file systems as /data1 and /data2.
# mkfs.xfs /dev/vg0/data1
# mkfs.xfs /dev/vg0/data2
# mkdir /data1
# mkdir /data2
# mount -t xfs -o noatime /dev/vg0/data1 /data1
# mount -t xfs -o noatime /dev/vg0/data2 /data2
Step 10 - create and enable swap partition
# mkswap /dev/vg0/swap
# swapon /dev/vg0/swap
At this point, you should have a fully functional setup. The slight problem is that if you add the newly created file systems to /etc/fstab and reboot, you may not be able to ssh back into your instance -- at least that's what happened to me. I was able to ping the IP of the instance, but ssh would fail.
I finally redid the whole thing on a new instance (I created the RAID 0 directly with lvm, bypassing the mdadm step), but didn't add the file systems to /etc/fstab. After rebooting and running lvscan, I noticed that the logical volumes I had created were all marked as 'inactive':
# lvscan
inactive '/dev/vg0/data1' [500.00 GiB] inherit
inactive '/dev/vg0/data2' [500.00 GiB] inherit
inactive '/dev/vg0/swap' [10.00 GiB] inherit
This was after I ran 'modprobe dm-mod' manually, otherwise the lvscan command would complain:
/proc/misc: No entry for device-mapper found
Is device-mapper driver missing from kernel?
Failure to communicate with kernel device-mapper driver.
A Google search revealed this thread, which offered a solution: run 'lvchange -ay' against each logical volume so that the volume becomes active. Only after doing this was I able to see the logical volumes and mount them.
So I added these lines to /etc/rc.local:
/sbin/modprobe dm-mod
/sbin/lvscan
/sbin/lvchange -ay /dev/vg0/data1
/sbin/lvchange -ay /dev/vg0/data2
/sbin/lvchange -ay /dev/vg0/swap
/bin/mount -t xfs -o noatime /dev/vg0/data1 /data1
/bin/mount -t xfs -o noatime /dev/vg0/data2 /data2
/sbin/swapon /dev/vg0/swap
After a reboot, everything was working as expected. Note that I am doing the mounting of the file systems and the enabling of the swap within the rc.local script, and not via /etc/fstab. If you try to do it in fstab, it is too early in the boot sequence, so the logical volumes will be inactive and the mount will fail, with the dire consequence that you won't be able to ssh back into your instance (at least in my case).
This was still not enough when creating the RAID 0 array with mdadm. When I used mdadm, even when adding the lines above to /etc/rc.local, the /dev/md0 device was not there after the reboot, so the mount would still fail. The thread I mentioned above does discuss this case at some point, and I also found a Server Fault thread on this topic. The solution in my case was to modify the mdadm configuration file /etc/mdadm/mdadm.conf and:
a) change the DEVICE variable to point to my 4 devices:
DEVICE /dev/sdb /dev/sdc /dev/sdd /dev/sde
b) add an ARRAY variable containing the UUID of /dev/md0 (which you can get via 'mdadm --detail /dev/md0'):
ARRAY /dev/md0 level=raid0 num-devices=4 UUID=03f63ee3:607fb777:f9441841:42247c4d
This change, together with the custom lines in /etc/rc.local, finally enabled me to have a functional RAID 0 array and functional file systems and swap across the ephemeral drives in my EC2 instance.
I hope this will be useful to somebody out there and will avoid some head-against-the-wall moments that I had to go through....