Monit (fix /var/log/messages)

Recently I discovered monit for FreeBSD, a highly configurable monitoring system that can watch services, disk space usage and process health. I installed it on all of our systems that require such monitoring, and it has been very helpful in letting us know when things go offline.

The one issue I have noticed with it has to do with monitoring PostgreSQL. The PostgreSQL part of our config is set up like so:

# POSTGRESQL
check host PostgreSQL with address 127.0.0.1
    if failed ping then alert
    if failed port 5432 protocol pgsql then alert

So when it can’t connect to PostgreSQL we receive an email about it. Great! However, if you look in /var/log/messages you will notice something like this every 30 seconds (since we have monit set up to check everything at 30-second intervals):

Nov  1 00:12:41 blackbox postgres[70271]: [2-1] FATAL:  role "root" does not exist

To fix this and start cleaning up our log, it’s pretty straightforward. Create that role:

psql -U pgsql template1
create role root login nocreaterole nocreatedb nosuperuser noinherit;

After this we now start seeing this error in the log:

Nov  9 12:47:50 blackbox postgres[94875]: [2-1] FATAL:  database "root" does not exist

To fix this we simply create that database:

create database root;

I also added this to the pg_hba.conf:

# TYPE  DATABASE        USER            ADDRESS                 METHOD
...
host    root            root            127.0.0.1/32            trust
...
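PostgreSQL only picks up pg_hba.conf changes after a reload, so it can be worth confirming the entry is really there (and not commented out) first. A minimal sketch, assuming a hypothetical helper named has_root_trust_rule that is given the path to the file:

```shell
# Succeeds if the pg_hba.conf file in $1 contains an uncommented
# "host root root 127.0.0.1/32 trust" entry. Fields may be separated
# by any amount of whitespace. The function name is hypothetical.
has_root_trust_rule() {
    awk '$1 == "host" && $2 == "root" && $3 == "root" &&
         $4 == "127.0.0.1/32" && $5 == "trust" { found = 1 }
         END { exit !found }' "$1"
}
```

Call it with wherever your pg_hba.conf lives, for example has_root_trust_rule "$PGDATA/pg_hba.conf" && echo "rule present".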

That fixed our issues and we no longer see those messages in /var/log/messages. Monit always wants to connect as the “root” user, and I found no way to configure it to use a different one, so I came up with this workaround.
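To check whether the noise has really stopped, you can count the offending lines before and after the change. This is only a sketch: count_root_fatals is a made-up helper name, and the pattern is based on the two log lines shown above.

```shell
# Print how many PostgreSQL FATAL lines in the log file $1 complain
# about a missing "root" role or database. Hypothetical helper.
count_root_fatals() {
    # grep -c prints the match count but exits non-zero when it is 0,
    # so "|| true" keeps the function usable under "set -e".
    grep -c -E 'postgres\[[0-9]+\]: .*FATAL: +(role|database) "root" does not exist' "$1" || true
}
```

Running count_root_fatals /var/log/messages a few minutes apart should show the number holding steady once the role, database and pg_hba.conf entry are in place.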

FreeBSD minor version upgrading

Upgrading FreeBSD to a new minor version is relatively simple. Here are the commands that will get you there: [upgrading from 10.1-p16]


[steve@blackbox:/var/empty]%uname -a
FreeBSD blackbox.cello.com 10.1-RELEASE-p16 FreeBSD 10.1-RELEASE-p16 #0: Tue Jul 17 05:25:45 UTC 2015 root@amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64

sudo freebsd-update upgrade -r 10.2-RELEASE
sudo freebsd-update install
sudo reboot now
sudo freebsd-update install
sudo reboot now
sudo pkg upgrade

I experienced a couple of issues along the way, but they were easily corrected. During the upgrade (first command) the process hung while downloading the files it needs, and I completely lost my connection, so I logged back in and ran it again with no issues. Then during the second “freebsd-update install” the same thing happened. Again, I logged back in and ran it again. No issues. Finally, while running the last command to upgrade the packages on the system, I ran into one package it would not upgrade. Running “pkg upgrade” a second time skipped past the problem and reported the packages as up to date.

[steve@blackbox:/var/empty]%sudo pkg upgrade
Updating FreeBSD repository catalogue...
Fetching meta.txz: 100% 944 B 0.9kB/s 00:01
Fetching packagesite.txz: 100% 5 MiB 1.1MB/s 00:05
Processing entries: 100%
FreeBSD repository update completed. 24442 packages processed.
New version of pkg detected; it needs to be installed first.
The following 1 package(s) will be affected (of 0 checked):

Installed packages to be UPGRADED:
pkg: 1.5.5 -> 1.5.6

The process will require 319 B more space.
2 MiB to be downloaded.

Proceed with this action? [y/N]: y
Fetching pkg-1.5.6.txz: 100% 2 MiB 1.2MB/s 00:02
Checking integrity... done (0 conflicting)
[1/1] Upgrading pkg from 1.5.5 to 1.5.6...
[1/1] Extracting pkg-1.5.6: 100%
Message for pkg-1.5.6:
If you are upgrading from the old package format, first run:

# pkg2ng
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
Checking for upgrades (32 candidates): 100%
Processing candidates (32 candidates): 100%
The following 32 package(s) will be affected (of 0 checked):

New packages to be INSTALLED:
postgresql93-client: 9.3.9

Installed packages to be UPGRADED:
...
curl: 7.43.0_2 -> 7.44.0
ca_root_nss: 3.19.2 -> 3.19.3
apache24: 2.4.16 -> 2.4.16_1

Installed packages to be REINSTALLED:
...
p5-DBD-Pg-3.5.1 (direct dependency changed: postgresql93-client)

The process will require 9 MiB more space.
43 MiB to be downloaded.

Proceed with this action? [y/N]: y
...
Fetching curl-7.44.0.txz: 100% 1 MiB 1.4MB/s 00:01
Fetching ca_root_nss-3.19.3.txz: 100% 334 KiB 341.7kB/s 00:01
Fetching apache24-2.4.16_1.txz: 100% 4 MiB 1.3MB/s 00:03
Fetching postgresql93-client-9.3.9.txz: 100% 2 MiB 2.0MB/s 00:01
Checking integrity... done (1 conflicting)
pkg: Cannot solve problem using SAT solver:
upgrade rule: upgrade local p5-DBD-Pg-3.5.1 to remote p5-DBD-Pg-3.5.1
cannot install package p5-DBD-Pg, remove it from request? [Y/n]: y
pkg: cannot find p5-DBD-Pg in the request
pkg: cannot solve job using SAT solver
Checking integrity... done (0 conflicting)
Conflicts with the existing packages have been found.
One more solver iteration is needed to resolve them.

[steve@nprod:/var/empty]%sudo pkg upgrade
Updating FreeBSD repository catalogue...
FreeBSD repository is up-to-date.
All repositories are up-to-date.
Checking for upgrades (32 candidates): 100%
Processing candidates (32 candidates): 100%
Checking integrity... done (1 conflicting)
pkg: Cannot solve problem using SAT solver:
upgrade rule: upgrade local p5-DBD-Pg-3.5.1 to remote p5-DBD-Pg-3.5.1
cannot install package p5-DBD-Pg, remove it from request? [Y/n]:
pkg: cannot find p5-DBD-Pg in the request
pkg: cannot solve job using SAT solver
Checking integrity... done (0 conflicting)
Your packages are up to date.
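If you save the pkg output to a file, the packages the solver gave up on can be pulled out afterwards so they are not forgotten. A rough sketch, where unsolved_packages is a hypothetical helper and the pattern is taken from the "cannot install package" lines above:

```shell
# Read "pkg upgrade" output on stdin and print each package name the
# solver said it could not install, one per line, without duplicates.
# Hypothetical helper; the pattern is based on the output shown above.
unsolved_packages() {
    sed -n 's/^cannot install package \([^,]*\),.*/\1/p' | sort -u
}
```

For the run above, unsolved_packages < saved-output.txt would print p5-DBD-Pg (saved-output.txt being whatever file you captured the output in).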

And there you have it.

[steve@blackbox:/var/empty]%uname -a
FreeBSD blackbox.cello.com 10.2-RELEASE FreeBSD 10.2-RELEASE #0 r286666: Wed Aug 09 09:55:07 UTC 2015 root@releng1.nyi.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
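The same check can be scripted if you have several machines to upgrade. This is a small sketch, not part of the original procedure; on_release is a made-up name, and it only compares strings.

```shell
# Succeeds when the release string in $2 (as printed by "uname -r")
# is the target release in $1, with or without a patch-level suffix
# such as "-p3". Hypothetical helper.
on_release() {
    case "$2" in
        "$1" | "$1"-p[0-9]*) return 0 ;;
        *) return 1 ;;
    esac
}
```

For example: on_release 10.2-RELEASE "$(uname -r)" && echo "upgrade complete". Note that uname -r reports the running kernel, so it only changes after the reboot.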

Cricket monitoring – old library reference

We recently shut down one of the many systems we monitor with the free and open source Cricket monitoring software. I noticed we got no notification from Cricket about this shutdown, which was puzzling. Looking through the logs I saw this error message:

Can't load '/usr/local/lib/perl5/site_perl/5.16/mach/auto/RRDs/RRDs.so' for module RRDs: Shared object "libpixman-1.so.30" not found, required by "libpangocairo-1.0.so.0" at /usr/local/lib/perl5/5.16/mach/DynaLoader.pm line 190.
 at /usr/local/cricket/cricket/./collector line 32.
Compilation failed in require at /usr/local/cricket/cricket/./collector line 32.
BEGIN failed--compilation aborted at /usr/local/cricket/cricket/./collector line 32.

So it was clear that the pango library file libpangocairo-1.0.so.0 was looking for a pixman library file it couldn’t find. The first thing I did was check out which version of pixman we currently had installed:

steve@someserver-335: pkg info pixman
pixman-0.32.4_2
Name           : pixman
Version        : 0.32.4_2
Installed on   : Mon May  5 15:30:07 EDT 2014
Origin         : x11/pixman
Architecture   : freebsd:10:x86:32
Prefix         : /usr/local
Categories     : x11
Maintainer     : x11@FreeBSD.org
WWW            : http://www.freedesktop.org/Software/xlibs
Comment        : Low-level pixel manipulation library
Shared Libs provided:
        libpixman-1.so.0.32.4
Flat size      : 1.43MiB
Description    :
This package contains the pixman library.

OK, so now I was able to confirm that pango was trying to reference the old version (libpixman-1.so.30) of the pixman library, even though we have a newer version (libpixman-1.so.0.32.4) installed. Time to rebuild the pango port to update that reference.

$ sudo portmaster pango

And finally I ran the following to test Cricket out:

$ cd /usr/local/cricket/cricket
$ ./collect-subtrees normal

All was back to normal!
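The loader error earlier in this post names both the missing library and the library that wants it. If you hit this kind of breakage often, a small filter can pull those two names out of a log. A sketch only; missing_shared_objects is a hypothetical name and the pattern is taken from the message above.

```shell
# Read log text on stdin and print "missing-library required-by" pairs
# taken from dynamic-loader errors, one pair per line.
# Hypothetical helper; the pattern is based on the error shown above.
missing_shared_objects() {
    sed -n 's/.*Shared object "\([^"]*\)" not found, required by "\([^"]*\)".*/\1 \2/p'
}
```

The second name on each line is the library whose port needs rebuilding; pkg which on that file's full path tells you which package owns it.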

FreeBSD update broke Apache perl modules

After a recent freebsd-update fetch && freebsd-update install, Apache would not restart properly. It complained about missing Perl modules it relied on, so we rebuilt the perl port and every port that depends on it.

sudo portmaster -m BATCH=yes --no-confirm -D -r perl

The -m BATCH=yes passes BATCH=yes to make so the ports accept their default options instead of stopping at configuration screens, --no-confirm skips the confirmation prompt at the command line, -D keeps the distfiles intact and -r rebuilds the named port along with all ports that depend on it. Once this was done Apache fired right up!

ZFS zpool crash

Recently we had an issue where one of our servers unexpectedly crashed and we had to hard reboot it. When it rebooted we began seeing kernel panics and it would not complete its boot cycle. In comes my mfsbsd stick to the rescue! The machine in question was running FreeBSD 10. We had just built it so we had one of these sticks handy. To create an mfsbsd stick yourself grab the image here:
http://mfsbsd.vx.sk

FreeBSD Build Image Instructions:
http://www.freebsd.org/doc/en/articles/remote-install/preparation.html

We ended up using Win32 Disk Imager:
https://sourceforge.net/projects/win32diskimager/

Once we had booted off the USB stick to a command prompt (beyond the scope of this post but simple, just follow the prompts) we simply did the following to re-import the ZFS pools that were fubar’d:
zpool import -f -a

All ZFS pools imported with no problems. Then we rebooted the machine as normal and voila, no errors!
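Before rebooting it does not hurt to confirm the pools came back clean. When nothing is wrong, zpool status -x prints the single line "all pools are healthy", so a check can be as small as the sketch below (pools_healthy is a made-up helper name):

```shell
# Read "zpool status -x" output on stdin; succeed only when it
# reports that every imported pool is healthy. Hypothetical helper.
pools_healthy() {
    grep -q -x 'all pools are healthy'
}
```

Used as: zpool status -x | pools_healthy || zpool status -v, which falls back to the verbose listing when something is degraded.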

Fun with a FreeBSD named port upgrade.

::Begin Prologue::
Recently we performed the following on our DNS server running FreeBSD 9.1:
freebsd-update fetch && freebsd-update install
Always smooth sailing. 🙂
::End Prologue::

So, today we got a new laptop which meant setting up the DNS records for it to run on our network. We added the necessary forward and reverse DNS records in named. After doing so, the normal practice is to run rndc reload so the system will reload and recognize the new DNS records. A very simple process. But today was a very different story. Today, running rndc reload produced this error message:

rndc: neither /usr/local/etc/rndc.conf nor /usr/local/etc/rndc.key was found

Huh? Out of the blue rndc does this? Well guess what, the named port got upgraded, so go figure.

Now we noticed that the symbolic link at /usr/local/etc/rndc.key, pointing to /etc/named/rndc.key, was missing, so let’s create it:
ln -s /etc/named/rndc.key /usr/local/etc/rndc.key
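It is easy to get the argument order of ln -s backwards (the existing file comes first, the link name second), so a quick check that the link ended up pointing where you meant is cheap insurance. A minimal sketch, with link_points_to being a hypothetical helper:

```shell
# Succeeds when $1 is a symbolic link whose stored target is exactly $2.
# Hypothetical helper; it compares the link text, it does not resolve it.
link_points_to() {
    [ -L "$1" ] && [ "$(readlink "$1")" = "$2" ]
}
```

Pass the link path first and the key file you linked to second. Because only the stored link text is compared, a separate test -e on the link is still worthwhile to confirm the target actually exists.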

Now running rndc reload produced this error:

rndc: 'reload' failed: not found

It would have been nice if the message had told me exactly what file was not found. But looking in the /var/log/daemon.log file pointed me in the right direction as to what was going wrong:

May 19 15:11:19 ns named[61129]: received control channel command 'reload'
May 19 15:11:19 ns named[61129]: loading configuration from '/usr/local/etc/named.conf'
May 19 15:11:19 ns named[61129]: open: /usr/local/etc/named.conf: file not found
May 19 15:11:19 ns named[61129]: reloading configuration failed: file not found

Normally it should look in /etc/named/named.conf for the configuration file, but the upgraded port now looks in /usr/local/etc/named.conf instead.
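Since rndc itself will not say which file is missing, the daemon log is the place to look. A small filter for that, sketched under the assumption that the log lines look like the ones above (named_missing_files is a made-up name):

```shell
# Read named log lines on stdin and print each file path that named
# tried to open but could not find, without duplicates.
# Hypothetical helper; the pattern is based on the log lines above.
named_missing_files() {
    sed -n 's/.* named\[[0-9]*\]: open: \(.*\): file not found$/\1/p' | sort -u
}
```

For the excerpt above, named_missing_files < /var/log/daemon.log would print /usr/local/etc/named.conf.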

So we then added the following to the /etc/rc.conf file:

named_conf="/etc/named/named.conf"

And ran:

service named restart
Stopping named.
Starting named.
/etc/rc.d/named: WARNING: failed to start named

Blamo! Named stopped and could not start back up, no thanks to our new config line. It again produced the same error in the log file. And now, there was no named service running. Double ugh.

This command fixed that and pointed it to the correct config file:

/usr/local/sbin/named -t /var/named -u bind -c /etc/namedb/named.conf

Now the named service is up and running and rndc reload runs as normal. Great!
Now we just need to do a reboot of the system (when there is little traffic) to see if named starts up normally after adding that line to /etc/rc.conf.