2008-12-29

How to force a reboot when you have a RAID array attached

When the system gets stuck (yes, this happens in our beloved operating system too) who do you call?

- Sysrq of course!

My usual approach would be:
1) sync all filesystems: alt + sysrq + s (multiple times, the exact count depends on my mood :-) )
2) remount all filesystems read-only: alt + sysrq + u
3) cause a reboot: alt + sysrq + b
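
If the keyboard is out of reach but you can still get a root shell (over ssh, say), the same commands can be sent by writing the command letter to /proc/sysrq-trigger. Below is a minimal Python sketch of the three steps above. The function name, the dry_run switch and the one-second pause are my own assumptions for illustration, not anything the kernel requires; the pause is only there to give the emergency sync some time before the next command.

```python
import time

SYSRQ_TRIGGER = "/proc/sysrq-trigger"

# The usual sequence from the list above: sync, remount read-only, reboot.
USUAL_SEQUENCE = ["s", "u", "b"]


def send_sysrq(keys, trigger_path=SYSRQ_TRIGGER, delay=1.0, dry_run=True):
    """Write each SysRq command letter to trigger_path, in order.

    With dry_run=True (the default) nothing is written; the function
    only returns the keys it would have sent. Hypothetical helper,
    needs root when used against the real trigger file.
    """
    sent = []
    for key in keys:
        if not dry_run:
            # One letter per write, as the trigger file expects.
            with open(trigger_path, "w") as trigger:
                trigger.write(key)
            # Arbitrary pause (an assumption) between commands.
            time.sleep(delay)
        sent.append(key)
    return sent


if __name__ == "__main__":
    # Dry run only: prints the sequence without touching the system.
    print(send_sysrq(USUAL_SEQUENCE))
```

Calling send_sysrq(USUAL_SEQUENCE, dry_run=False) as root would really reboot the box, so keep the dry run on until you mean it.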

BUT, this comes with an issue if you're doing software RAID (md).

md devices register themselves (via register_reboot_notifier) to be sync'ed whenever a reboot is about to occur. Also, if the array was being reconstructed (or re-synced) when the reboot was requested, you get a nice checkpoint, so that after the reboot the operation picks up where it left off. This is all fine and dandy, but it seems that the sync/checkpointing operation does not occur if you've previously called for an emergency read-only remount of filesystems.

[ It seems that alt + sysrq + u also makes the RAID devices read-only, so they can no longer perform any write operation (such as a sync or a checkpoint). BUT, I've yet to find the part of the code that causes this. I'll have another look though, as soon as I figure out what's going on with libata lately. Since 2.6.25 I've been having timeout troubles with my SATA drive (yes, a Seagate one). The 'noncq' option suggested on LKML hasn't helped much.. ]

ANYWAY, back to our subject. So you have a RAID array and you want to reboot?
- Just sync (alt + sysrq + s) and reboot (alt + sysrq + b) without doing any RO-remounting.
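
To make the rule above concrete, here is a small Python sketch that picks the sysrq sequence depending on whether any md arrays show up in /proc/mdstat. The function names and the sample text are made up for illustration, and the parsing assumes the usual "mdN : ..." line format; treat it as a sketch of the decision, not a tested tool.

```python
import re


def md_arrays(mdstat_text):
    """Return the md device names found in /proc/mdstat-style text."""
    # Array lines look like "md0 : active raid1 sdb1[1] sda1[0]".
    return re.findall(r"^(md\w+)\s*:", mdstat_text, flags=re.MULTILINE)


def reboot_keys(mdstat_text):
    """Pick the sysrq letters to send, following the rule above."""
    if md_arrays(mdstat_text):
        # md array present: sync and reboot, skip the RO remount.
        return ["s", "b"]
    # No md array: the usual sync, remount read-only, reboot.
    return ["s", "u", "b"]


# Made-up sample in the usual /proc/mdstat layout (an assumption).
SAMPLE = """Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      104320 blocks [2/2] [UU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(md_arrays(SAMPLE))
    print(reboot_keys(SAMPLE))
```

On a real box you would feed it open("/proc/mdstat").read() instead of the sample.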
