Saturday, February 28, 2009

Openldap migration notes

Overview:
In much the same vein as my previous post, I will be migrating an openldap directory from one server to another (zone to zone in actual fact, not that the implementation is any different).
Unlike the previous mysql migration however, the source zone is running snv_95, and at that time Sun wasn't bundling openldap at all; the snv_95 zone is running a blastwave packaged copy of openldap, and it's at version 2.3.39.

The destination zone is running snv_105, which now comes with a native SUNWopenldap(r|u) 2.4.11 implementation, so this should be a nice upgrade along the way.

The source server was a dedicated zone named ldap. The destination will be my database zone (db) that we created earlier. I plan on eventually moving away from using openldap altogether, but for tonight I just simply needed to move it between servers so I could remove the last of the zones from what is actually primarily my file server.

The migration
On the source server, I dumped the entire directory using:
[root@ldap ~]# ldapsearch -b dc=griffous,dc=net -h ldap.griffous.net 'objectClass=*' > ldapdump.ldif
As it happens, this works in much the same way as mysqldump.
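Before moving on, it can be worth counting the entries in the dump so the number can be compared against the new server later. The following is only a sketch: count_entries is a made-up helper name, and it assumes each entry in the file starts with a dn: line, as ldapsearch writes them.

```shell
#!/bin/sh
# count_entries is a hypothetical helper, not part of openldap.
# It counts the entries in an LDIF file by counting lines that begin
# with "dn:" (this also matches base64-encoded "dn::" lines).
count_entries() {
    grep -c '^dn:' "$1"
}

# Example: count_entries ldapdump.ldif
```

The comment and result lines that ldapsearch adds to its output do not start with dn:, so they are not counted.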

Next I manually copied my changes from the defaults across to the new openldap config files in /etc:
/opt/csw/etc/openldap/ldap.conf to the new /etc/openldap/ldap.conf
/opt/csw/etc/openldap/slapd.conf to the new /etc/openldap/slapd.conf (the main file)

I'd just like to highlight again that this openldap directory exists solely to serve my existing mail infrastructure, so I was able to configure it in an identical fashion, including all schemas and copying across password hashes.
Were I starting again with this, I would probably look at setting up full SSL and better security. Tonight we're just moving it, warts and all.

I also had to scp across authldap.schema, since that's a non-default schema.

With the config files modified to suit, I started the server with:
root@db:/etc/openldap# svcadm enable ldap/server

I ran into issues at this point, and the smf and dmesg logs were no help at all. I ran the command that smf was using to start the server, with -d 5 (debug) to print out some extra information.
This printed out:
....
<= ldap_dn2bv(cn=manager,dc=griffous,dc=net)=0
<<< <cn=Manager,dc=griffous,dc=net>,
/etc/openldap/slapd.conf: line 63: <rootpw> can only be set when rootdn is under suffix
slapd destroy: freeing system resources.
slapd stopped.

My initial suspicion was that it was something to do with capitalisation, but this turned out not to be the case. Line 63 was the rootpw line, which also threw me for a bit.
After some head scratching and a look at the rest of the debug log, I spotted the problem further up. In the openldap config file I had entered:
database bdb
suffix "dc=griffous.net,dc=net"
rootdn "cn=manager,dc=griffous,dc=net"

(The suffix line is wrong, and should have read "dc=griffous,dc=net".) Whoops, that one was clearly a PEBKAC mistake.
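This kind of mismatch is easy to check for before starting the server. The sketch below is not an openldap tool; check_rootdn is a hypothetical helper that assumes a single database section, with the suffix and rootdn values double-quoted and free of spaces, as in the fragment above.

```shell
#!/bin/sh
# check_rootdn is a hypothetical helper, not an openldap tool.
# It reads the first suffix and rootdn directives from a slapd.conf-style
# file and reports whether the rootdn equals the suffix or ends with
# ",<suffix>", compared case-insensitively. Exits non-zero on a mismatch.
check_rootdn() {
    awk '
        $1 == "suffix" && suffix == "" { suffix = tolower($2) }
        $1 == "rootdn" && rootdn == "" { rootdn = tolower($2) }
        END {
            gsub(/"/, "", suffix)
            gsub(/"/, "", rootdn)
            n = length(rootdn) - length(suffix)
            if (suffix != "" && (rootdn == suffix || (n > 0 && substr(rootdn, n) == "," suffix))) {
                print "ok: rootdn is under suffix"
            } else {
                print "mismatch: rootdn=" rootdn " suffix=" suffix
                exit 1
            }
        }
    ' "$1"
}

# Example: check_rootdn /etc/openldap/slapd.conf
```

Run against the broken config it reports the mismatch straight away, which is rather quicker than reading a level 5 debug log.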

I fixed that, but still the server wouldn't start! The tail end of the debug this time printed:
unable to open pid file "/var/run/openldap/slapd.pid": 2 (No such file or directory)
slapd destroy: freeing system resources.
slapd stopped.

Sure enough, that directory doesn't exist... Given that this was a new install using default paths, this felt like a fairly systemic issue, so I quickly searched the opensolaris bug repository and found this bug which is exactly my problem. OKayyyyyy, so I'm on to an OS level problem rather than my own now!
I manually created the directory, and ran the command listed in the bug to check on the permissions.
As expected that directory needed to be openldap:openldap, which I quickly changed.
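For reference, the manual fix amounts to something like the following. This is a sketch only: make_run_dir is a hypothetical helper, and the ls -ld at the end is my stand-in for checking the result, not the command from the bug report.

```shell
#!/bin/sh
# make_run_dir is a hypothetical helper: it creates a directory, hands it
# to the given owner (user:group), then prints the resulting listing.
make_run_dir() {
    dir=$1
    owner=$2
    mkdir -p "$dir" && chown "$owner" "$dir" && ls -ld "$dir"
}

# On the db zone this would be run as root:
#   make_run_dir /var/run/openldap openldap:openldap
```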

Once more with feeling!
I svcadm cleared the service once more, and finally the server started properly!

The data import

Next I loaded in my data with ldapadd:
root@db:~# ldapadd -h db.griffous.net -D cn=manager,dc=griffous,dc=net -w passwordhere -f ldapdump.ldif
adding new entry dc=griffous,dc=net
adding new entry cn=Manager,dc=griffous,dc=net
adding new entry o=hosting,dc=griffous,dc=net
adding new entry ....
...

Excellent, that bit was trouble free.
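One way to double check an import like this is to dump the new directory the same way and compare the list of DNs against the original dump. The sketch below assumes a second dump file (called newdump.ldif here purely for illustration); ldif_dns is a hypothetical helper, not an openldap command.

```shell
#!/bin/sh
# ldif_dns is a hypothetical helper: it prints the "dn:" line of every
# entry in an LDIF file, sorted, so two dumps can be compared with diff.
# Folded lines (continuations start with a single space) are joined first.
# Base64-encoded "dn::" lines are printed as-is, not decoded.
ldif_dns() {
    awk '
        /^ / { line = line substr($0, 2); next }   # continuation of previous line
        { if (line ~ /^dn:/) print line; line = $0 }
        END { if (line ~ /^dn:/) print line }
    ' "$1" | sort
}

# Example, assuming newdump.ldif was taken from the new server:
#   ldif_dns ldapdump.ldif > old.dns
#   ldif_dns newdump.ldif  > new.dns
#   diff old.dns new.dns && echo "same entries"
```

Joining folded lines matters because ldapsearch wraps long lines, so a long DN can be split across two lines in the dump.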
In theory this new directory should be ready for production use, but given that rather worrisome bug, I wanted to test that it survived a reboot.

It didn't.

Under Solaris, /var/run is actually mounted on swap, which means that unless openldap is creating the /var/run/openldap directory when it starts, then it's going to be lost across reboots (as anything in /var/run should be).
It would seem then that openldap is not creating this directory for one reason or another when it starts each time.
The simplest answer to me seemed to be to tell openldap to store these 2 files (the pid file and the args file) in the /var/openldap directory that is already used by the BDB backend for directory data.

That way I don't need to mess about with changing openldap startup scripts or playing with the permissions in /var/run, as I would have needed to do if I simply changed the path to /var/run/* rather than /var/run/openldap/*.

This solved the problem, and openldap now works across reboots.
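A quick way to catch this class of problem is to check that the directories named by the pidfile and argsfile directives in slapd.conf actually exist. The sketch below assumes unquoted paths on those two directives; missing_run_dirs is a hypothetical helper, and the slapd.args file name in the comment is an assumption on my part (only slapd.pid appears in the log above).

```shell
#!/bin/sh
# missing_run_dirs is a hypothetical helper: for each pidfile/argsfile
# directive in a slapd.conf-style file, it prints the parent directory
# if that directory does not currently exist.
#
# After the change described above, the directives would look roughly like:
#   pidfile  /var/openldap/slapd.pid
#   argsfile /var/openldap/slapd.args   (file name assumed)
missing_run_dirs() {
    awk '$1 == "pidfile" || $1 == "argsfile" { print $2 }' "$1" |
    while read -r path; do
        dir=$(dirname "$path")
        [ -d "$dir" ] || echo "$dir"
    done | sort -u
}

# Example: missing_run_dirs /etc/openldap/slapd.conf
```

No output means both directories are present. Running it again after a reboot shows whether they survived.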

Testing

I updated the CNAME for ldap.griffous.net to point to the db zone, and carefully watched the ldap traffic while sending myself a few test emails and logging into the server with imap.
Everything seemed fine, so I shut down the original zone.

If everything stays working for another couple of days, I'll delete the old zone, formally completing the migration.

And that's how you move an openldap directory between hosts!
