Nov 03 2012
 
Sometimes – whether it be a new job or an inherited system or a colleague who’s taken time off to conceive a baby – you just don’t know what the heck is going on. There’s an application running with an unfamiliar name, there’s no man page, and the font of knowledge that is Google can only spit out cryptic snatches of email from mail-archive.com. I’m going to offer a few techniques and tips to illustrate how I gather information about unfamiliar software, its files, ports and command line arguments. It starts off simple and gets a little more interesting towards the end.

For the purpose of this post, I’m going to pretend I don’t know what MongoDB is. It’s a NoSQL database, very good at horizontal scaling. But let’s pretend. I’m selecting it as an example because it’s not too untidy, and helps me make my point.

Hypothetical Situation

You’re in the deep end at a new job, you’re on your own, there’s no documentation, and the monitoring shows that a host is running at high load. You’ve never come across this program before, and have no idea where to start. First thing to do, top:

Good ol' top

We can see that the mongod process is running pretty hot. We’re not concerned at this point why it’s having problems. I just want to know what it is, what it does, and where its files are.
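When there’s no interactive terminal to hand (a cron job, a quick ssh one-liner), the same ranking can be taken as a one-off snapshot. This is only a sketch: it assumes the procps version of ps found on most Linux systems, and busiest is a made-up helper name.

```shell
# Show the N busiest processes by CPU (default 5), header line included.
# Assumes procps ps, which supports --sort; other ps variants may not.
busiest() {
    ps -eo pid,user,pcpu,pmem,args --sort=-pcpu | head -n "$(( ${1:-5} + 1 ))"
}

# Example: busiest 5
# A one-shot run of top itself: top -b -n 1 | head -n 15
```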

Process listing

I’ll run ps with a full listing to find out what its command line arguments look like, and if it’s been invoked with a full path, which may indicate where it’s installed.
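A minimal sketch of that step, assuming a Linux box with /proc mounted; describe_pid is a hypothetical helper name. Alongside the full ps listing it resolves /proc/PID/exe, which gives the real path of the binary even when the process was started with a relative one.

```shell
# Print the full-format ps line for one PID, then the resolved path of
# its executable. Reading /proc/PID/exe for a process owned by another
# user usually needs root.
describe_pid() {
    ps -f -p "$1"
    readlink "/proc/$1/exe"
}

# Example: describe_pid "$(pgrep -o mongod)"
```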

This shows that the executable resides at /usr/bin/mongod and that it takes a configuration file /etc/mongod.conf. So not too perplexing.

Search on pattern

But let’s assume instead that we couldn’t obtain the full path of the mongod process. The next thing I like to do is just a big find command on the root directory and see what turns up, using “*mongo*” as our search pattern:
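Something along these lines does the job. The helper name find_by_name is mine, and the optional second argument (a starting directory other than /) is only there to make it easy to try on a smaller tree.

```shell
# List everything under a directory (default /) whose name contains the
# given pattern. Permission errors from unreadable directories are hidden.
find_by_name() {
    find "${2:-/}" -name "*$1*" 2>/dev/null
}

# Example: find_by_name mongo
```

Searching from / can take a while on a big filesystem, so it pays to narrow the starting directory once you have a hunch.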

That’s actually pretty comprehensive (I’m liking mongo more and more) and logical. We can see that the log files are in the right place, and there are init scripts. But if these executables and logs weren’t named *mongo*, then this wouldn’t have been so straightforward.

Package Listing

So a better way to get the list of files associated with an application is to find out which package /usr/bin/mongod comes from. On RPM-based systems, rpm -qf /usr/bin/mongod reports the owning package.

Or on Debian-based systems, you’d use:

   # dpkg -S /usr/bin/mongod

Turns out the software is called mongo-10gen-server. Let’s query the contents of that package (and, in a slight jump, a related package) to find out what other files it installs:
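The two steps (find the owning package, then list what it ships) chain together nicely. Here is a rough sketch that works with either rpm or dpkg; files_for is a hypothetical helper name, and it assumes the path is owned by exactly one package.

```shell
# List every file installed by the package that owns the given path.
files_for() {
    if command -v rpm >/dev/null 2>&1; then
        pkg="$(rpm -qf "$1")" || return 1      # e.g. mongo-10gen-server
        rpm -ql "$pkg"
    elif command -v dpkg >/dev/null 2>&1; then
        # dpkg -S prints "package: /path"; keep only the package name.
        pkg="$(dpkg -S "$1" | cut -d: -f1)"
        [ -n "$pkg" ] || return 1
        dpkg -L "$pkg"
    else
        echo "neither rpm nor dpkg found" >&2
        return 1
    fi
}

# Example: files_for /usr/bin/mongod
# Configuration files only, on RPM systems: rpm -qc mongo-10gen-server
```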

This is the official list of files that were created at installation time.

At this point, it would probably be worth perusing the configuration file /etc/mongod.conf and the log file /var/log/mongod.log for clues and comments about what the program does, and how.

List all open filehandles

The tool lsof is fantastic for seeing everything a process has open – files, sockets and pipes. Executing ps, we can see that the process ID of mongod is 6569, so we invoke lsof with the PID as an argument:
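A sketch of the invocation, using the PID from this example. The -n and -P flags are my addition: they stop lsof resolving host names and port names, which keeps the output fast and numeric. open_files is a made-up helper name.

```shell
# Everything a process holds open: regular files, sockets and pipes.
# Run as root to inspect processes owned by other users.
open_files() {
    lsof -n -P -p "$1"
}

# Example: open_files 6569
# Listening TCP sockets only: lsof -n -P -a -p 6569 -iTCP -sTCP:LISTEN
```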

Again, we can see that mongod is holding open its logfile, /var/log/mongod.log. Also, the two lines with LISTEN show that the process is listening on TCP ports 27017 and 28017. This information could also have been obtained by typing:

   # netstat -nlp

Application Network Traffic

We know from lsof and netstat that the process is listening on port 27017 and 28017. Let’s just take port 27017 and see if there’s any traffic coming in on this, and from where.
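tcpdump is the usual tool for this. The sketch below is an assumption about how you might bound the capture rather than the exact command behind this post: it stops after a fixed number of packets so it can’t flood the terminal, and sample_port is a hypothetical helper name. It needs root, and the "any" pseudo-interface is Linux-specific.

```shell
# Capture a bounded sample of traffic arriving on one TCP port.
#   $1 = port, $2 = packet count (default 20), $3 = interface (default any)
# -n prints addresses numerically so the source IPs are easy to read.
sample_port() {
    tcpdump -n -i "${3:-any}" -c "${2:-20}" "tcp dst port $1"
}

# Example: sample_port 27017
```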

Bearing in mind that my testbox has the IP address of 10.243.52.51, we can see that incoming traffic is emanating from 10.243.24.69. I’d be logging in to that host to find out more about it and what it thinks it’s doing talking to this mongo thing.

Process internals

So we’ve got this far, and we know how the process is invoked and where it writes to. I’ve often found that the tricky part about lsof is that it only shows files that are held open. If a configuration file is read once on startup, then it won’t show up in lsof. It can be handy to know where a program’s inputs are coming from. Attaching the strace program to a process when it starts up can reveal all sorts of information. In this example, yes it’s obvious that mongod has had the /etc/mongod.conf configuration file passed on the command line. But the point is that even if it hadn’t, strace would reveal that the file had been opened.

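There are many options that can be passed to strace, but “-e open” (short for “-e trace=open”) narrows it to filehandles being opened only, which is a bit more manageable. On newer systems, where the C library opens files with openat rather than open, use “-e trace=open,openat” instead. By running it with the “-f” option, it will also drill down to any forked processes.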
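Put together, a startup trace might look like the sketch below. It assumes x86-64 Linux, where both system call names exist, and writes the trace to a file with -o so it doesn’t get mixed up with the program’s own output. trace_opens is a hypothetical helper name, and “-e trace=file” is a broader alternative that catches every call taking a filename.

```shell
# Start a command under strace, logging only the calls that open files.
#   $1 = trace log path, remaining arguments = the command to run
# -f follows forked children; -o writes the trace to the log file.
trace_opens() {
    log="$1"
    shift
    strace -f -e trace=open,openat -o "$log" "$@"
}

# Example (assumes mongod takes its config file with -f):
#   trace_opens /tmp/mongod.trace /usr/bin/mongod -f /etc/mongod.conf
# Attach to an already-running process instead:
#   strace -f -e trace=open,openat -p 6569
```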

Reading embedded text in the application binary

Here’s one more trick that I like to use when I’m desperate. If you’ve got no manpage, a feeble “--help” and no “Usage”, then sometimes this may be of assistance. Run the strings command against the application binary, use a grep and a less for practicality, and see if you can extract anything useful – comments, expected arguments, anything:
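What to grep for is guesswork and depends on the binary. As one possible sketch, the filter below keeps only strings that look like long options (two dashes followed by a letter). The pattern and the helper name binary_options are my own assumptions; strings itself comes from binutils.

```shell
# Extract printable strings from a binary and keep the ones that look
# like long command line options, e.g. --replSet.
binary_options() {
    strings "$1" | grep -E -e '--[A-Za-z][A-Za-z0-9_-]+'
}

# Example: binary_options /usr/bin/mongod | less
```

If that turns up nothing, try a looser filter such as grep -i usage, or just page through the raw strings output.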

In the mongod case, we get a few command line options (--replSet, etc). If I was really trying to ascertain how to use a program, some of these may be helpful. Again, not the best example, but it’s sometimes worth a try.

Of course, it goes without saying that you should try the man page, although for MongoDB I only get this:

But you’ll find that the man pages for lsof, strace, tcpdump and find are extremely comprehensive and packed with great examples.

So feel free to share a few of your favorite debugging tips in the Comments. Bob knows, I could use them.


Matt Parsons is a freelance Linux specialist who has designed, built and supported Unix and Linux systems in the finance, telecommunications and media industries.

He lives and works in London.

  2 Responses to “What the…? Tips for investigating mystery processes”

  1. Hi Matt,

    Good list! I also sometimes use the following to find out more about a package on Debian based distros (there’s likely an RPM equivalent):

    $ apt-cache depends FOO
    # show all packages that are required before installing FOO.
    …and
    $ apt-cache rdepends FOO
    # show all packages that depend on FOO being installed first.

