October 6, 2011

Always Connected

I’ve talked to some of my super-wired friends about what I call “digital fatigue”. This is it.

October 3, 2011

Matt Cutts Copying Me

Seems one of Google’s best-known bloggers is copying Tammy and me on his most recent 30-day challenge: going vegan!

Vegan Dear Diary
October 2, 2011

Soccer with Puddles

Mazie had a soccer game on Saturday. The field had a few puddles on it. The kids had no issues playing through.

October 2, 2011

Remodel Update Week 17

This was a big week for the project! The painting is mostly done now. The final colors are on all the walls, the tape is off, the trim is all painted. The railing is finished above the mudroom and I’m really excited about the space up there. I’m completely sold on putting a door in next year. The utility and electric rooms are complete enough that we were able to move a bunch of stuff back in. I was super excited to get the network and low voltage systems coming back together. Both built-ins were completed as well. The fence was installed around the egress window wells. One of the carpenters even came on Sunday and finished the railing on the front door and the new surround around the front door.

The coming week should be the last big week. The final electrical work starts on Monday. The cork flooring goes into the exercise room on Wednesday, and the carpet follows the next day. The theatre and network wiring should be all done as well. We’ve taken down the plastic barrier in the kitchen, so we can now get into the new space without going outside. Just a bit more now!

Morgan Remodel 2011 Dear Diary
October 2, 2011

Marathon Morning

It was a beautiful morning to walk over to Minnehaha Creek and watch the Twin Cities Marathon.

Dear Diary
October 2, 2011

Detect Slow Cacti Polling

I’m a big fan of graphs, and as such am a big fan of Cacti. I’ve used it at work and at home. It’s a wonderfully powerful and ridiculously complicated front end to RRDTool, which is itself wonderfully powerful and ridiculously complicated. I’ve used Cacti to graph hundreds of servers, the temperature in my house, heat collected from solar panels, and Twitter followers.

By default Cacti runs a poller every 5 minutes to collect data. Cacti gets very unhappy if the time required to run the poller exceeds that 5-minute interval: you get blank gaps in your data, and no alarm goes off when it happens. Polling times can also vary without any changes in Cacti. If you are polling an external service and it gets slow, that alone can spike your polling times in a terrible way.

I decided to write a quick script to catch this. Rather than integrate it into Cacti’s poller I made it completely separate. Save it as an executable shell script in /etc/cron.hourly and you can rest easy knowing your Cacti poller is healthy.

#!/bin/bash
# by Jamie Thingelstad

# Where is the log?
# (Placeholder: the original value is missing here; point this at your cacti.log.
#  The variable name CACTI_LOG is also an assumption.)
CACTI_LOG="/path/to/cacti/log/cacti.log"

# Sample up to how many lines?
CACTI_SAMPLE=24

# Warning level for 5 minute poll interval, in seconds.
# Nonsensical to set this higher than 300.
# (Assumed value: the original setting is missing here.)
WARN_AT=240

# Where is mail?
# (Assumed: use whichever mail command is on the PATH.)
MAIL=$(command -v mail)

# Where to send warning? What subject?
# (Placeholder recipient: the original address is missing here.)
ERR_TO="root"
ERR_SUBJECT="[Cacti] Cacti Poller time is too high"

# Scan the cacti log and see how things are
# Grep for STATS \
# Get the last CACTI_SAMPLE samples \
# Find the duration field (cut) \
# Remove the label (cut) \
# Strip the decimal points, make it an integer (cut) \
# Add the total of all the samples and count the number of samples (awk) \
#   - this step is needed because the file may have < CACTI_SAMPLE lines in it
POLL_DATA=$(grep STATS "$CACTI_LOG" | \
            tail -n "$CACTI_SAMPLE" | \
            cut -d " " -f 7 | \
            cut -d ":" -f 2 | \
            cut -d "." -f 1 | \
            awk '{ sum += $1; count++ } END { print sum, count }')

# Get the two variables we want
TOTAL_TIME=$(echo "$POLL_DATA" | awk '{print $1}')
TOTAL_COUNT=$(echo "$POLL_DATA" | awk '{print $2}')

# Nothing to check if the log has no STATS lines yet
[ "${TOTAL_COUNT:-0}" -gt 0 ] || exit 0

# Figure out the average poll time across the samples, in whole seconds
AVERAGE_TIME=$((TOTAL_TIME / TOTAL_COUNT))

# Check how long it took and warn if too long, in seconds
if [ "$AVERAGE_TIME" -gt "$WARN_AT" ]; then
    # Polling has exceeded the limit, send warning
    ERR_MESSAGE="Polling time for cacti is dangerously high. Last $TOTAL_COUNT polling periods averaged $AVERAGE_TIME seconds, higher than $WARN_AT second threshold."
    echo "$ERR_MESSAGE" | "$MAIL" -s "$ERR_SUBJECT" "$ERR_TO" >/dev/null 2>&1
fi

Some notes on what this script is doing:

  • Sampling 24 lines means 24 five-minute polling periods, in other words, the last 24 * 5 minutes, or 2 hours.

  • The script will calculate the average of the last 24 (or fewer) samples and alarm if it is over threshold. One sample over the threshold won’t matter since it uses the average.

  • The awk complexity is used to deal with a new log file that has fewer than CACTI_SAMPLE events.
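
To see the averaging behaviour from the notes above without touching a real Cacti install, here is a minimal sketch that runs the same grep/tail/cut/awk pipeline over three made-up log lines. The function names average_poll_time and sample_log are hypothetical, the log lines are invented, and the line layout (duration as "Time:<seconds>" in the 7th space-separated field) is an assumption based on the fields the script cuts out.

```shell
#!/bin/sh
# Hypothetical helper: read Cacti log lines on stdin and print
# "<average seconds> <sample count>" for the last $1 STATS lines.
# Assumes the 7th space-separated field looks like "Time:12.3456".
average_poll_time() {
    samples=$1
    grep STATS |
        tail -n "$samples" |
        cut -d " " -f 7 |
        cut -d ":" -f 2 |
        cut -d "." -f 1 |
        awk '{ sum += $1; count++ }
             END { if (count > 0) print int(sum / count), count; else print 0, 0 }'
}

# Invented sample data: two normal polls and one spike past 300 seconds.
sample_log() {
    cat <<'EOF'
10/02/2011 10:00:12 AM - SYSTEM STATS: Time:10.5012 Method:cmd.php Processes:1 Threads:N/A Hosts:5 HostsPerProcess:5 DataSources:40 RRDsProcessed:20
10/02/2011 10:05:14 AM - SYSTEM STATS: Time:12.9301 Method:cmd.php Processes:1 Threads:N/A Hosts:5 HostsPerProcess:5 DataSources:40 RRDsProcessed:20
10/02/2011 10:15:11 AM - SYSTEM STATS: Time:310.2200 Method:cmd.php Processes:1 Threads:N/A Hosts:5 HostsPerProcess:5 DataSources:40 RRDsProcessed:20
EOF
}

# Ask for 24 samples even though only 3 exist; the count reflects what was found.
sample_log | average_poll_time 24
```

This prints "110 3": only three samples were available, and although one of them ran for 310 seconds, the average of 110 seconds stays well under any sensible threshold, so a single slow poll would not trigger the warning.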