Jun 7, 03:42 PM

Well, I just lost my Linode. About 3AM this morning it “crashed,” remounting the root filesystem read-only. I tried to fsck it:

/dev/xvda: recovering journal
fsck.ext3: Bad magic number in super-block while trying to re-open /dev/xvda
e2fsck: io manager magic bad!

I tried to reboot it:

JobID: 563947 – System Boot – My Debian 4.0 Profile
Job Entered 06/07/2008 08:29:04 AM Status Success
Host Start Date 06/07/2008 08:29:22 AM Host Finish Date 06/07/2008 08:29:25 AM
Host Duration 3 seconds Host Message helper_main(linode9813, /dev/vg1/linode9813-65249, 1, 0, 1): mount failed: File exists

I submitted a ticket and got back:

You hit a bug in Xen that causes the filesystem to go into a bad state when it’s been formatted with a blocksize of 1k. We’re still trying to track it down.
I’ve moved your data to a new partition with 4k blocksize, and your Linode has now booted correctly.
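If you want to check whether another ext3 filesystem has the 1k block size the ticket mentions, tune2fs -l prints it in its superblock summary. A small sketch that pulls the number out of that output; the function name is made up, and the device in the usage line is just the one from the fsck output above:

```shell
# Hypothetical helper: read `tune2fs -l` output on stdin and print the block size in bytes.
block_size() {
    awk -F: '/^Block size:/ { gsub(/[ \t]/, "", $2); print $2 }'
}

# Usage (as root):
#   tune2fs -l /dev/xvda | block_size
```

A result of 1024 would mean the 1k layout; 4096 is the 4k layout the data was moved to.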

It booted.. but I couldn’t log in. Single-user mode didn’t work either:

Starting SASL Authentication Daemon: saslauthdinstall: invalid user `root’
Starting system log daemon: syslogdchown: `root:adm’: invalid user

But booting with init=/bin/bash did. There, I could replace /etc/passwd.

But as I came to find out, I also lost /etc/php/cgi/, ~/Maildir/[tmp|new], half of /var/www/, /etc/mrtg.cfg.. at this point I can’t really be sure what’s there and what’s not. Since Linode doesn’t (yet) offer full-disk backups, to be safe I have to blow it away and start from scratch. Thankfully, I have a backup from three days ago, and I was able to recover this blog, where I wrote down everything I did to set up the server. Also thankfully, redoing this won’t be as bad as going through each step, because among the things I do have backed up are /var/lib/dpkg/ and /etc/. And /home/.

The first step after deploying a new Linode image is to log in via Lish and transfer the backup over. Debian comes with scp, so it’s easy.

Important: make sure your backup routine runs dpkg --get-selections > /root/installed-software.log and that the resulting file is included in the backup. If you have all of /var/lib/dpkg the list can be regenerated, but it requires trickery.
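One way to do that regeneration, not necessarily the trickery meant above: the selections list comes from /var/lib/dpkg/status, so a backed-up copy of that file is enough to rebuild it. dpkg’s --admindir option should be able to do this directly when pointed at the backup; the awk sketch below does roughly the same without dpkg. The function name and the /backup path are assumptions, and the output only approximates what dpkg prints (each package’s wanted state, skipping entries marked not-installed).

```shell
# Hypothetical helper: rebuild a selections list from a dpkg status file ($1).
# Prints "package<TAB>wanted-state" for every stanza that is not "not-installed".
selections_from_status() {
    awk '
        function flush() {
            if (pkg != "" && want != "" && state != "not-installed")
                printf "%s\t%s\n", pkg, want
            pkg = ""; want = ""; state = ""
        }
        /^Package: / { pkg = $2 }
        /^Status: /  { want = $2; state = $4 }   # e.g. "Status: install ok installed"
        /^$/         { flush() }                 # a blank line ends a stanza
        END          { flush() }
    ' "$1"
}

# Usage:
#   selections_from_status /backup/var/lib/dpkg/status > /backup/installed-software.log
```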

The next step is restoring the installed packages. Redo /etc/apt/sources.list, preferences, and apt.conf manually, then run aptitude update. Then copy installed-software.log over, and

# dpkg --set-selections < /backup/installed-software.log
# apt-get dselect-upgrade

30 upgraded, 213 newly installed, 2 to remove and 0 not upgraded.
Need to get 160MB of archives
After unpacking 375MB of additional disk space will be used.
Do you want to continue [Y/n]?

Y.

The packages I took from etch-backports and volatile have to be reinstalled manually:

# aptitude install -t etch-backports postfix lighttpd postgrey rsync irssi vim dnsmasq mysql-server-5.0 mysql-client-5.0 nmap mutt
# aptitude install -t volatile clamav clamav-freshclam clamav-daemon 
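Once the installs finish, it’s worth checking that nothing from the old list got left behind. A rough sketch, assuming you dump the new system’s selections to a second file; the function name and the file paths are made up for illustration:

```shell
# Hypothetical helper: print packages marked "install" in the backup list ($1)
# that are not marked "install" in the current list ($2).
missing_packages() {
    awk '
        FILENAME == ARGV[1] { if ($2 == "install") have[$1] = 1; next }
        $2 == "install" && !($1 in have) { print $1 }
    ' "$2" "$1"
}

# Usage:
#   dpkg --get-selections > /root/current-software.log
#   missing_packages /backup/installed-software.log /root/current-software.log
```

No output means every package from the backup list is marked for install on the new system.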

Restore /etc/passwd, /etc/shadow, and /etc/group. Restoring the home directories is easy: they just need to be copied over.

Restoring /etc/ may not be so easy. I’m going to go ahead and copy everything over instead of manually redoing everything, but I will use rsync so I don’t blindly cp -ar. This may or may not be a bad idea; we’ll find out later.

# rsync -av /backup/etc/ /etc/

Using rsync locally like this has the advantage of only updating changed files, and not blowing away files that aren’t in the backup.
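If you’d rather see what the copy would touch before running it, rsync’s -n (--dry-run) flag gives a preview. The sketch below gets a similar list with plain find and cmp; the function name is made up, and it only looks at regular files (symlinks and permissions are ignored), so treat it as a rough check rather than a replacement for the dry run.

```shell
# Hypothetical helper: list regular files under the backup tree ($1) that are
# missing from, or differ in content from, the target tree ($2).
# Files that exist only in the target are not reported, matching rsync without --delete.
preview_restore() {
    src=$1
    dst=$2
    ( cd "$src" && find . -type f ) | sort | while IFS= read -r f; do
        rel=${f#./}
        if [ ! -e "$dst/$rel" ]; then
            echo "new: $rel"
        elif ! cmp -s "$src/$rel" "$dst/$rel"; then
            echo "changed: $rel"
        fi
    done
}

# Usage:
#   preview_restore /backup/etc /etc
```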

Reboot and cross your fingers.

For me, only a few things broke. I had to redo the dpkg-statoverride line for saslauthd, and redo the crontabs. I also ran into dnsmasq having clamav’s user ID and vice versa, because I must have reinstalled them in a different order. Purging and reinstalling both fixed this. Edit: Actually, looks like more than that broke; any daemon that creates and runs as its own user needed looking at.
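A quick way to spot this kind of ID swap is to compare the backed-up passwd file against the one the fresh install generated, before overwriting it. A minimal sketch, assuming you saved the fresh file somewhere first; the /root/passwd.fresh path and the function name are made up:

```shell
# Hypothetical helper: report accounts whose numeric ID differs between the
# backed-up passwd file ($1) and the freshly generated one ($2).
uid_mismatches() {
    awk -F: '
        FILENAME == ARGV[1] { old[$1] = $3; next }
        ($1 in old) && old[$1] != $3 {
            printf "%s: backup uid %s, fresh uid %s\n", $1, old[$1], $3
        }
    ' "$1" "$2"
}

# Usage:
#   uid_mismatches /backup/etc/passwd /root/passwd.fresh
```

Since the third field of /etc/group is the GID, the same function should work on a pair of group files too (the output will still say “uid”). Any account it reports probably has files on disk owned by the wrong ID after the restore.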

My backup script

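#!/bin/bash
# Dump the package list and databases, tar everything up, push a copy offsite.
date >> /root/backup.log
NAME="karrdebackup-$(date +%F).tar.gz"
# Package selections, for dpkg --set-selections on restore
dpkg --get-selections > /root/karrde-installed-software.log
# /root/dbp holds the MySQL root password
mysqldump -u root -p"$(cat /root/dbp)" --all-databases > /root/mysqldump.sql
nice ionice -c3 tar czf "/backup/$NAME" /root/ /home/ /etc/ /var/log/ /var/lib/dpkg/ /var/lib/textpattern /var/lib/mysql /var/spool/cron/crontabs >> /root/backup.log 2>&1
# Keep the archives private; the directory itself needs the execute bit to stay searchable
chmod 700 /backup
find /backup -type f -exec chmod 600 {} \;
# "rsync" here is an ssh host alias from /root/.ssh/config
rsync -av --stats "/backup/$NAME" rsync:karrde.kiserai.net-backup.tar.gz >> /root/backup.log 2>&1
date >> /root/backup.log
echo -e '\n\n' >> /root/backup.log
# Drop local archives older than 21 days; the pattern is quoted so the shell doesn't expand it
find /backup/ -type f -name 'karrde*' -mtime +21 -exec rm {} \;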

This depends on the schedutils package for ionice, and on having a proper definition for the “rsync” target in /root/.ssh/config. Though it shares its name with the program, that target is an ssh host alias: it defines the user and hostname that the ssh connection underneath rsync uses.
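For illustration, a stanza along these lines would define such a target. The hostname, user, and key path are placeholders, not my real values, and the function wrapper is only there so the fragment can be appended to a file of your choosing:

```shell
# Hypothetical helper: append an example "rsync" host alias to the ssh config file given as $1.
# HostName, User and IdentityFile values are placeholders.
append_rsync_host() {
    cat >> "$1" <<'EOF'
Host rsync
    HostName backup.example.com
    User backupuser
    IdentityFile ~/.ssh/id_backup
EOF
}

# Usage:
#   append_rsync_host /root/.ssh/config
```

With that in place, a destination written as rsync:some-file resolves to backupuser@backup.example.com. For an unattended cron run the key would need to be usable without a passphrase prompt.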

It’s kind of simple in that it doesn’t exclude any file types. But since none of my server’s users should have any rich media bigger than pictures and since my upload speed is nothing to sneeze at, it’s good enough.
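If that ever stops being true, tar can skip file types with --exclude patterns. A minimal sketch, with a made-up function name and an arbitrary list of extensions:

```shell
# Hypothetical helper: create a gzipped archive ($1) of the remaining arguments,
# leaving out a few bulky media types. The extension list is only an example.
archive_without_media() {
    out=$1
    shift
    tar czf "$out" --exclude='*.avi' --exclude='*.iso' --exclude='*.mp3' "$@"
}

# Usage:
#   archive_without_media /backup/example.tar.gz /home/ /etc/
```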

All things considered, restoring a hosed remote server today didn’t go so badly. Still broken are mysql and textpattern, but those will be done after dinner.
