I started to look into EnterpriseDB recently. Pretty pleased with it so far. At first I launched the beta version of their Postgres Plus Cloud Database product, but since this version is in the process of being decomissioned, I've had to transfer the database I had already created to a newly created cluster in their DBaaS model -- which basically means that the cluster manager is maintained by them, and the cluster member servers (1 master + N replicas) are part of your EC2 footprint.
In any case, I did a pg_dump of the database from the initial master, then I tried to load the dump via psql into a newly created database on the new master. However, the client_encoding parameter in postgresql.conf was SQL_ASCII on the first master, and UTF8 on the second. This posed a problem. The psql load operation failed with errors of the type:
ERROR: invalid byte sequence for encoding "UTF8": 0xe92044
CONTEXT: COPY table1, line 6606
ERROR: invalid byte sequence for encoding "UTF8": 0xa0
CONTEXT: COPY table2, line 978
ERROR: invalid byte sequence for encoding "UTF8": 0xd454
CONTEXT: COPY table3, line 3295
Obviously the encodings didn't match. I first tried to re-run the pg_dump on the first master (which had client_encode = 'SQL_ASCII') and specified "--encoding utf8" on the pg_dump command line. This didn't do the trick. I had the same exact errors when loading the dump on the second master.
One solution suggested by EnterpriseDB was to set client_encoding to SQL_ASCII on the new master, restart Postgres and retry the load. I found another solution though in a blog post very aptly titled (for my purposes) 'PostgreSQL database migration, the SQL_ASCII to UTF8 problem'. What I ended up doing, following the advice in the post, was to install the GNU recode utility (I did a yum install recode), then run the initial dump through recode, converting it from ascii to utf8. Something like this:
cat dump.sql | recode iso-8859-1..u8 > utf8_withrecode.sql
Then I modified the line
SET client_encoding = 'SQL_ASCII';
and turned it into:
SET client_encoding = 'UTF8';
after which loading the dump into the new master with psql worked like a charm.
Anyway...now for the fun part of doing some load testing against this Postgres cluster!
Thursday, March 08, 2012
PostgreSQL dump/restore and client_encoding
Subscribe to: Post Comments (Atom)
Modifying EC2 security groups via AWS Lambda functions
One task that comes up again and again is adding, removing or updating source CIDR blocks in various security groups in an EC2 infrastructur...
This post is a continuation of my previous one on " Running Gatling tests in Docker containers via Jenkins ". As I continued to se...
For the last month or so I've been experimenting with Rancher as the orchestration layer for Docker-based deployments. I've been pr...
Here's a good interview question for a tester: how do you define performance/load/stress testing? Many times people use these terms inte...
Thanks for this interesting one. It is quite helpful update.
Post a Comment