Oct 11 2012
 

Have you ever had your root hard drive fail and found that your shell prompt has forgotten every command? It isn’t fun. The first thing you notice is that whatever you type, you get the same answer.

bash: /bin/vi: Input/output error

Eek.

A similar effect follows the famous “rm -rf *” being run under the /usr directory. This kind of thing happens:

  # ls 
bash: ls: command not found

This is because most of the commands you use at the shell command line are binary files residing under /usr/sbin, /usr/bin, /sbin or /bin, depending on your distribution. Once they’re gone, you’re dead in the water.

So if all you have is a CLI shell prompt, but no commands work, what can you actually do?

If the problem is due to a corrupted disk, you may be able to resolve the issue by rebooting into single user mode and running an fsck on the root filesystem. In the case of missing binary directories you’d probably need to rebuild your host completely. But before you restore from your backups (you’ve got backups, right?), you may need to make a few tweaks or check a few file contents in order to get things back up and running again.

So you can’t use “ls” or “man”. Nor will you be able to view the contents of any file – “cat”, “less”, “more” and “vi” are all binary executables. But try changing directory:

  /var# cd /tmp
  /tmp#

Success! So some commands do work.

The reason that cd works, but the other commands don’t, is that cd is a shell internal command. It’s built in to the shell itself: Bash, Korn, or whatever you’re using. Other commands are actually binary files that get called implicitly because their directories (/bin, /usr/bin and so on) are listed in the PATH environment variable. The distinction between the two is unnoticeable under normal operation, but it becomes painfully apparent when tragedy strikes.

To further investigate which commands are built-ins, and which are binaries, use the “type” command:

Some are binaries:

  # type -a vi
vi is /usr/bin/vi

Some are builtins:

  # type -a getopts
getopts is a shell builtin

And some are both:

# type -a echo
echo is a shell builtin
echo is /bin/echo

The full list is documented here, and further examples given here. You can also type help at the Bash prompt to display a list of commands. Most of these are for low level execution control like if, while, and for, but others like getopts or echo have a more composite function. Input and files can also be reached through the redirection operators (>, <, |).
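If you’d rather ask the shell itself, the sketch below lists the builtins of the running Bash and checks individual names. It assumes Bash specifically (compgen and type -t are Bash builtins), and is_builtin is just a helper name I’ve made up for the example.

```shell
# compgen is itself a builtin, so this still works when /bin and /usr/bin are gone.
builtin_list=$(compgen -b)      # one builtin name per line
printf '%s\n' "$builtin_list"

# is_builtin (hypothetical helper): succeeds if NAME is a shell builtin.
is_builtin() {
    [[ $(type -t "$1") == builtin ]]
}

is_builtin cd && echo "cd is a builtin"
is_builtin ls || echo "ls is not a builtin"
```

Anything that passes this check should keep working however badly the disk is damaged, as long as the shell process itself stays alive.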

So the question now is, given this rather paltry list of shell controls and conditionals, what can you achieve in an emergency, when all you have is a logged in bash shell? The below examples offer a few possibilities.

Change directory

You can navigate around the filesystem with:

  # cd /tmp

As per usual.

List files

List files in the current directory with:

  # echo *
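Note that echo * skips hidden files and runs everything together on one line. A slightly more readable listing can be put together from a for loop and the [[ ]] file tests. This is only a sketch: bls is a made-up name, and it assumes Bash with its default globbing options.

```shell
# bls (hypothetical name): rough stand-in for "ls -A -F" using only builtins.
bls() {
    local dir=${1:-.} entry
    # Three globs: ordinary names, dotfiles, and names starting with two dots.
    for entry in "$dir"/* "$dir"/.[!.]* "$dir"/..?*; do
        # A glob that matches nothing is left as literal text, so skip
        # anything that does not actually exist.
        [[ -e $entry || -L $entry ]] || continue
        if [[ -d $entry ]]; then
            printf '%s/\n' "${entry##*/}"   # mark directories with a slash
        else
            printf '%s\n' "${entry##*/}"
        fi
    done
}

bls /tmp
```

With no argument it lists the current directory.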

Display file contents

Cat, less, more and vi won’t work, but you can output the contents of files like this:

# while IFS= read -r line; do printf '%s\n' "$line"; done < file.txt

Better yet, create an alias for this so you can reuse it:

# alias newcat='while IFS= read -r line; do printf "%s\n" "$line"; done'
# newcat < file.txt
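A shell function is a little more flexible than an alias, because it can take several file names as arguments. The following is a sketch rather than a drop-in cat replacement, and bcat is a name I’ve invented for it. It only copes with text: read cannot carry the NUL bytes found in binary files.

```shell
# bcat (hypothetical name): print each named text file using only builtins.
bcat() {
    local file line
    for file in "$@"; do
        # IFS= keeps leading and trailing whitespace; -r keeps backslashes literal.
        # The "|| [[ -n $line ]]" clause catches a last line with no trailing newline.
        while IFS= read -r line || [[ -n $line ]]; do
            printf '%s\n' "$line"
        done < "$file"
    done
}

# Demonstration on a throwaway file.
printf '%s\n' '  two leading spaces' 'a back\slash' > /tmp/bcat_demo.txt
bcat /tmp/bcat_demo.txt
```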

Copy a file

Redirect the above command into a new file:

# while IFS= read -r line; do printf '%s\n' "$line"; done < file.txt > newfile.txt
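Since there is no cp to fall back on, it may be worth wrapping the copy in a function that refuses to overwrite an existing file and then checks its own work. This is a sketch under the same text-only assumption as before, and bcp is a hypothetical name.

```shell
# bcp (hypothetical name): copy a text file using only builtins.
bcp() {
    local src=$1 dst=$2 line
    [[ -f $src ]] || { echo "bcp: no such file: $src" >&2; return 1; }
    [[ -e $dst ]] && { echo "bcp: refusing to overwrite $dst" >&2; return 1; }
    while IFS= read -r line || [[ -n $line ]]; do
        printf '%s\n' "$line"
    done < "$src" > "$dst"
    # $(<file) reads a file without any external command. Trailing newlines
    # are stripped on both sides, so a missing final newline goes unnoticed.
    [[ $(<"$src") == "$(<"$dst")" ]]
}

printf '%s\n' 'line one' 'line two' > /tmp/bcp_demo.txt
bcp /tmp/bcp_demo.txt /tmp/bcp_demo.copy && echo "copy verified"
```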

Create a file

echo and output redirection can be used to create a file:

  # echo "Sample text" > file.txt
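For a file of more than one line, printf (also a builtin) is handy, because it repeats its format for each argument. The sketch below writes a small resolver file; the path and the addresses are placeholders I’ve picked for illustration, not values from a real system.

```shell
# Each argument becomes one line of the new file.
printf '%s\n' \
    'nameserver 192.0.2.1' \
    'nameserver 192.0.2.2' > /tmp/resolv.conf.new

# >> appends to the file instead of truncating it.
printf '%s\n' 'options timeout:2' >> /tmp/resolv.conf.new
```

I’d lean towards this rather than a here-document, since Bash may need to create a temporary file for a here-document, and that can fail on a disk that is already in trouble.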

Edit a file

Editing a file is a bit trickier, but can be done by identifying the line you want to change by a unique string, and then completely replacing that line:

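# IFS= and -r stop read from trimming whitespace or interpreting backslashes
while IFS= read -r line; do
   if [[ "$line" =~ .*STRING_FIND.* ]]; then
      # Replace the whole matching line
      echo "STRING_REPLACEMENT"
   else
      # Pass every other line through unchanged
      printf '%s\n' "$line"
   fi
done < /tmp/resolv.txt > /tmp/newfile.txt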

This may be useful, for example, if you needed to edit your /etc/fstab before a reboot.
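If you only want to change part of a line rather than replace all of it, Bash’s ${var//old/new} expansion can stand in for a simple sed. The sketch below assumes Bash; bsub is a made-up name, and the fstab lines are invented sample data.

```shell
# bsub (hypothetical name): replace every occurrence of the fixed string OLD
# with NEW in INFILE, writing the result to OUTFILE.
bsub() {
    local old=$1 new=$2 infile=$3 outfile=$4 line
    while IFS= read -r line || [[ -n $line ]]; do
        # Quoting "$old" makes it match literally rather than as a glob pattern.
        line=${line//"$old"/"$new"}
        printf '%s\n' "$line"
    done < "$infile" > "$outfile"
}

# Demonstration: comment out the entry for a failed second disk.
printf '%s\n' \
    '/dev/sda1 / ext4 defaults 0 1' \
    '/dev/sdb1 /data ext4 defaults 0 2' > /tmp/fstab.demo
bsub '/dev/sdb1' '#/dev/sdb1' /tmp/fstab.demo /tmp/fstab.new
```

Writing to a separate output file matters here: redirecting back onto the input file would truncate it before the loop had read anything.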

Delete file contents

Without an rm command you can’t remove a file, but you can empty it. A bare redirect, with no command in front of it, truncates the file to zero length, much as copying /dev/null over it would:

  #  > /tmp/file.txt

Reboot

Most frustrating of all, you won’t be able to use any of the reboot commands – reboot, shutdown or init. If you’ve got console access, or are on a VM, fine, but otherwise, there’s only that sinking feeling you get when you realise a trip to the data centre is in order again.

But there is a little-known kernel feature called the “Magic SysRq”. Enable it like this:

  #  echo 1 > /proc/sys/kernel/sysrq

Then reboot with this. Be aware that b restarts the machine immediately, without syncing or unmounting the disks:

  # echo b > /proc/sysrq-trigger

Hopefully, at this point you’ll be able to boot back into single-user maintenance mode and fsck the filesystem. Or, if things are really messed up, reinstall your operating system completely and restore from backup. Recovery procedures are outside the scope of this little post though.

If you’re feeling adventurous, you could test this out in a virtual machine, preferably after snapshotting it, and see what you can actually manage to do without the binary directories. It could be good practice for an impending catastrophe.

I’d welcome any comments below. I’m curious to discover what other useful things you would be able to do in an emergency using only bash builtins.



Matt Parsons is a freelance Linux specialist who has designed, built and supported Unix and Linux systems in the finance, telecommunications and media industries.

He lives and works in London.

  5 Responses to “Negotiating a damaged Linux filesystem using only shell builtins”

  1. Very good article Matt.
    But how about having a knoppix boot disk as another tool??
    Perhaps this is another post topic?

    • Thanks for the kind words, John. A Knoppix boot disk is an outstanding solution to recovering a stricken system, but when I wrote the post, I wanted to focus on what to do _before_ you bail out. It always seemed to me that there is plenty of information on the Web about booting into single user mode and remounting disks, but I wanted to look at what, if anything, could be done beforehand. Turns out there’s not a lot. Still, you never know when a knowledge of bash builtins can be useful.

  2. your ‘newcat’ doesn’t work for binary files ;/

    • That’s true. But then, “cat”, “less” and “more” won’t display binary files either.

      • No, but binary files can still be copied using cat. However, ‘newcat’ cannot copy binary files correctly. Seems like there is no shell-builtin that can do this.
