For the purpose of this post, I’m going to pretend I don’t know what Mongo DB is. It’s a NoSQL database, very good at horizontal scaling. But let’s pretend. I’m selecting it as an example because it’s not too untidy, and helps me make my point.
You’re in the deep end at a new job, you’re on your own, there’s no documentation, and the monitoring shows that a host is running at high load. You’ve never come across this program before, and have no idea where to start. First thing to do, top:
We can see that the mongod process is running pretty hot. We’re not concerned at this point why it’s having problems. I just want to know what it is, what it does, and where its files are.
I’ll run ps with a full listing to find out what its command line arguments look like, and if it’s been invoked with a full path, which may indicate where it’s installed.
This show that the executable resides at /usr/bin/mongod and that it takes a configuration file /etc/mongod.conf. So not too perplexing.
Search on pattern
But let’s assume instead that we couldn’t obtain the full path of the mongod process. The next thing I like to do is just a big find command on the root directory and see what turns up, using “*mongo*” as our search pattern:
That’s actually pretty comprehensive (I’m liking mongo more and more) and logical. We can see that the log files are in the right place, and there are init scripts. But if these executables and logs weren’t named *mongo*, then this wouldn’t have been so straightforward.
So a better way to get the list of files associated with an application is to find out which package the /usr/bin/mongod comes from. On RPM based systems, like this:
Or on Debian-based systems, you’d use:
# dpkg -S /usr/bin/mongod
Turns out the software is called mongo-10gen-server. Let’s query the contents of that package (and, in a slight jump, a related package) to find out what other files are installed out of this box:
This is the official list of files that were created at installation time.
At this point, it would probably be worth perusing the configuration file /etc/mongod.conf and the log file /var/log/mongod.log for clues and comments about what the program does, and how.
List all open filehandles
The tool lsof is fantastic for seeing everything a process has open – files, sockets and pipes. Executing ps, we can see that the process ID of mongod is 6569, so we invoke lsof with the PID as an argument:
Again, we can see that mongod is holding open its logfile, /var/log/mongod.log. Also, the two lines with LISTEN show that the process is listening on TCP ports 27017 and 28017. This information could also have been obtained by typing:
# netstat -nlp
Application Network Traffic
We know from lsof and netstat that the process is listening on port 27017 and 28017. Let’s just take port 27017 and see if there’s any traffic coming in on this, and from where.
Bearing in mind that my testbox has the IP address of 10.243.52.51, we can see that incoming traffic is emanating from 10.243.24.69. I’d be logging in to that host to find out more about it and what it thinks it’s doing talking to this mongo thing.
So we’ve got this far, and we know how the process is invoked and where it writes to. I’ve often found that the tricky part about lsof is that it only show files that are held open. If a configuration file is read once on startup, then it won’t show up in lsof. It can be handy to know where a program’s inputs are coming from. Attaching the strace program to a process when it starts up can reveal all sorts of information. In this example, yes it’s obvious that mongod has had the /etc/mongod.conf configuration file passed on the command line. But the point is that even if it hadn’t, strace would reveal that the file had been opened.
There are many options that can be passed to strace, but “-e open” narrows it to filehandles being opened only, which is a bit more manageable. By running it with the “-f” option, it will also drill down to any forked processes.
Reading embedded text in the application binary
Here’s one more trick that I like to use when I’m desperate. If you’ve got no manpage, a feeble “–help” and no “Usage”, then sometimes this may be of assistance. Run the strings command against the application binary, use a grep and a less for practicality, and see if you can extract anything useful – comments, expected arguments, anything:
In the mongod case, we get a few command line options (–replSet, etc). If I was really trying to ascertain how to use a program, some of these may be helpful. Again, not the best example, but it’s sometimes worth a try.
Of course, it goes without saying that you should try the man page, although for Mongo DB I only get this:
But you’ll find that the man pages for lsof, strace, tcpdump and find are extremely comprehensive and packed with great examples.
So feel free to share a few of your favorite debugging tips in the Comments. Bob knows, I could use them.
[flattr uid=’matthewparsons’ /]
Matt Parsons is a freelance Linux specialist who has designed, built and supported Unix and Linux systems in the finance, telecommunications and media industries.
He lives and works in London.