Upgrading PostgreSQL from 8.4 to 9.1

A few annoyances can be found along the way, but overall it's a reasonable process and I should have RTFM before diving in.

I was sitting back the other day and realized that I hadn't done much with Zotonic in a while.  This kind of time away from it was making me worry that I was losing my edge and would be of little help to the community.

As something of a minor foray back into Zotonic this evening I decided I would upgrade PostgreSQL and then Zotonic to see what has changed.  I started by installing the PostgreSQL 9.1 packages.  The trouble started immediately.  I got a glaring big warning that my 8.4 binaries were deprecated and that my cluster would need to be upgraded.  Some good news came with that: a tool called "pg_upgradecluster" was supposed to make this process a breeze.

I started pg_upgradecluster with great hopes of success.  First problem: dies because the 9.1 main cluster already exists.  No problem I pg_dropcluster that and try again.  Sweet!  Everything seems to be ticking along nicely now, then BANG!

Nothing actually is ever as simple as it seems however and I ran into a fun crash with SHMMAX and SHMMIN warnings.  I wind up having to alter sysctl to get it working.  This seems like a relatively extreme requirement for a database migration.  A quick trip to http://www.postgresql.org/docs/8.2/static/kernel-resources.html quickly answers that problem.

Now I'm really rolling. Right?  Actually pg_hba.conf got nuked during the pg_upgradecluster.  Not just the 9.1 main upgraded version, but the original in 8.4 main which seems nuts to me.  That would be a very frustrating situation if this were a serious business server I am running.  Then again, I'd have everything in version control and restore in.  I think this may be a case of "the cobblers children go unshod," but I digress.  I restore pg_hba.conf to it's former glory and move on to try it out.

Very odd, I'm getting pg_hba errors.  Check pg_hba.conf, all good, restart postgres, try again, fail!  I go through this cycle like a hamster in a wheel several times.  What is they say about people who try the same thing more than three times and expect different results?  Oh yeah, back on topic.  I then did the IT Crowd solution of "try turning it off and then on again" and rebooted the server.  All should be perfect now, right?

Wrong!  psql is now reporting the dreaded PostgreSQL not running with reference to a missing pipe file.  This is starting to be deja vu for MySQL bumbling eleven years ago for me.  I run the init script and all seems well.  I check /var/log for postgres and everything seems okay in there minus the not logging at all bit.  I try to run pg_ctlcluster 9.1 main start directly and got the SHMMAX and SHMMIN story again.  Clearly my sysctl skills are rusty and I failed to make the changes stick.  I revisited http://www.postgresql.org/docs/8.2/static/kernel-resources.html and ran the echos that work on /proc instead and then ran sysctl -p to reload it.  This actually fixed everything including running psql from my zotonic user account.

The moral of this story is read the upgrade documentation before making seriously unfounded assumptions about the directness of an upgrade and the failure modes of tools.  None of the postgres tools outside pg_upgradecluster and pg_ctlcluster indicated a memory allocation failure and none of the logs I could see showed it either.  Had I read the docs ahead of time I could have saved myself quite a bit of trouble.

Always review the upgrade documentation for software carefully before performing an upgrade and you will save yourself time and headaches in troubleshooting.