The Solaris Service Management Facility (SMF) is a great improvement on the old-style Unix RC script, which is still so popular. As well as allowing greater control and granularity over application startup, SMF can also function as a replacement for inittab entries: it keeps processes running and reacts to crashes.
In this post, I’ll be running through an illustration of how I converted a standard RC script to an SMF manifest.
I use the ORCA utility to graph various system and application metrics, and have recently run into a few problems. The application periodically crashes for no apparent reason, and I haven’t had time to debug the issue (once I get a core file I will figure this out). Since I rely on the graphs to trend server and application capacity, I want to ensure that the application gets restarted each time a failure occurs. Since ORCA is running on a Solaris 10 server, I decided to convert the existing start/stop scripts to a Solaris 10 SMF manifest. To begin the conversion process, I first created a shell script that starts ORCA and clears any lockfile that is present:
$ cat /usr/local/bin/orca.start
#!/bin/sh

if [ -d /var/orca/configs/orcallator.cfg.lock ]
then
    logger -p daemon.notice "Removing orca lockfile"
    rm -rf /var/orca/configs/orcallator.cfg.lock
fi

logger -p daemon.notice "Starting orca in daemon mode"
/usr/local/bin/orca -logfile /var/logs/orcallator.log \
    -daemon /var/orca/configs/orcallator.cfg
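A start method can also tell SMF why it failed. As a minimal sketch, the script could refuse to launch when the configuration file is unreadable. The `check_config` helper is a hypothetical name, and the exit codes are hard-coded copies of the values Solaris defines in /lib/svc/share/smf_include.sh, so treat this as an illustration rather than part of my actual script:

```shell
#!/bin/sh
# Exit codes mirroring /lib/svc/share/smf_include.sh on Solaris 10.
# They are hard-coded here so the sketch runs anywhere; a real method
# script would normally source that file instead.
SMF_EXIT_OK=0
SMF_EXIT_ERR_CONFIG=96

# check_config is a hypothetical helper, not part of ORCA or SMF.
# It returns SMF_EXIT_ERR_CONFIG when the config file is unreadable,
# which signals an unrecoverable configuration error to SMF.
check_config() {
    cfg=$1
    if [ ! -r "$cfg" ]; then
        echo "orca.start: cannot read $cfg" >&2
        return $SMF_EXIT_ERR_CONFIG
    fi
    return $SMF_EXIT_OK
}
```

In orca.start this could be called as `check_config /var/orca/configs/orcallator.cfg || exit $?` just before the line that launches orca.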
Once I verified that the script worked correctly, I created an SMF manifest with a stop and start method and no dependencies:
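$ cat orca.xml
<?xml version="1.0"?>
<!DOCTYPE service_bundle SYSTEM "/usr/share/lib/xml/dtd/service_bundle.dtd.1">
<service_bundle type='manifest' name='orca'>
  <service name='application/orca' type='service' version='1'>
    <!-- Default instance, enabled on import (assumed; the original listing does not show it) -->
    <create_default_instance enabled='true'/>
    <exec_method type='method' name='start'
        exec='/usr/local/bin/orca.start' timeout_seconds='0'/>
    <exec_method type='method' name='stop'
        exec=':kill -15' timeout_seconds='3'/>
  </service>
</service_bundle>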
After the manifest was created, I used the svccfg 'validate' option to verify the structure of the XML document:
$ svccfg validate orca.xml
$ echo $?
0
If svccfg encounters an error, it prints a message on the console and exits with a non-zero status. To track down XML syntax problems, run the manifest through xmllint:
$ xmllint orca.xml
If the XML document validates, the svccfg import option can be used to import the manifest into the SMF repository:
$ svccfg import orca.xml
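The validate and import steps can be chained so that a broken manifest never reaches the repository. This is only a sketch: `import_manifest` is a hypothetical wrapper name, and it relies on nothing beyond the two svccfg subcommands used above and their exit status:

```shell
#!/bin/sh
# import_manifest is a hypothetical wrapper, not an SMF command.
# It imports the manifest only when `svccfg validate` exits with 0.
import_manifest() {
    manifest=$1
    if svccfg validate "$manifest"; then
        svccfg import "$manifest"
    else
        echo "not importing $manifest: validation failed" >&2
        return 1
    fi
}
```

It would be run as `import_manifest orca.xml`, and returns non-zero if either step fails.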
Once the manifest has been imported into the SMF repository, the svccfg listprop option can be used to display the service’s properties:
$ svccfg -s application/orca listprop
start                    method
start/exec               astring  /opt/data/orca/scripts/orca.start
start/timeout_seconds    count    0
start/type               astring  method
stop                     method
stop/exec                astring  ":kill -15"
stop/timeout_seconds     count    3
stop/type                astring  method
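The steps so far don't show the service being enabled or checked. Assuming the service has a default instance, something like the following could confirm that it reaches the online state after `svcadm enable application/orca` (which is harmless if the instance is already enabled). The `wait_online` helper is a hypothetical name; it polls `svcs -H -o state`, which prints only the state column without a header:

```shell
#!/bin/sh
# wait_online is a hypothetical helper, not an SMF command.
# It polls the service state once a second until it reads "online"
# or the given number of attempts (default 10) is used up.
wait_online() {
    fmri=$1
    tries=${2:-10}
    i=0
    state=unknown
    while [ "$i" -lt "$tries" ]; do
        state=`svcs -H -o state "$fmri" 2>/dev/null`
        if [ "$state" = "online" ]; then
            return 0
        fi
        i=`expr "$i" + 1`
        sleep 1
    done
    echo "$fmri is $state, not online" >&2
    return 1
}
```

A typical run would be `svcadm enable application/orca` followed by `wait_online application/orca 30`. If the service lands in maintenance instead, `svcs -x` explains why.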
All of this took me about 15 minutes, and now when ORCA crashes, SMF restarts the process, which generates the following messages in the system logfile:
Nov 9 11:56:03 winnie root: [ID 702911 daemon.notice] Removing orca lockfile
Nov 9 11:56:03 winnie root: [ID 702911 daemon.notice] Starting orca in daemon mode
And once again, everything is as it should be.