#142 closed enhancement (fixed)

Add monitoring for failure of the backend network

Reported by:	mitchb	Owned by:
Priority:	minor	Milestone:
Component:	internals	Keywords:	sipb-noc
Cc:

Description

We don't presently have a Nagios test that will alert us if there's a failure of the backend network switch, or the backend interface on an individual server. All the probes for sql.mit.edu will still pass because they run over the public network.

We should use a Nagios plugin to run 'select 1;' (or a similarly trivial query) from each scripts server, so that the check exercises the backend network rather than the public one.
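For illustration, here is a minimal sketch of what such a check might look like as a Nagios-style plugin in Python. It is not necessarily how the eventual fix was implemented. The names `check_select_one` and `mysql_cli_runner` are hypothetical, and the sketch assumes the `mysql` command-line client is installed and can authenticate without a prompt. The exit codes (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN) follow the standard Nagios plugin convention.

```python
import subprocess
from typing import Callable, Tuple

# Standard Nagios plugin exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_select_one(run_query: Callable[[str], object]) -> Tuple[int, str]:
    """Run a trivial query and map the outcome to a Nagios status.

    `run_query` is any callable that takes a SQL string and returns the
    result; it is injected so the logic can be exercised without a database.
    """
    try:
        result = run_query("SELECT 1;")
    except Exception as exc:  # any failure to reach or query the server
        return CRITICAL, f"SQL CRITICAL - query failed: {exc}"
    if str(result).strip() == "1":
        return OK, "SQL OK - 'SELECT 1' returned 1"
    return WARNING, f"SQL WARNING - unexpected result: {result!r}"


def mysql_cli_runner(host: str, timeout: float = 10.0) -> Callable[[str], str]:
    """Build a runner that shells out to the `mysql` client (assumed installed).

    `host` is whatever name or address the scripts server uses to reach the
    SQL server; the caller supplies it, nothing is hard-coded here.
    """
    def run(query: str) -> str:
        completed = subprocess.run(
            ["mysql", "--host", host, "--batch", "--skip-column-names",
             "--execute", query],
            capture_output=True, text=True, timeout=timeout, check=True,
        )
        return completed.stdout
    return run
```

A wrapper script would call `check_select_one(mysql_cli_runner(host))`, print the message, and exit with the returned code so that Nagios can interpret the result.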

Change History (4)

comment:1 Changed 13 years ago by adehnert

Keywords sipb-noc added

comment:2 Changed 12 years ago by adehnert

Resolution set to fixed
Status changed from new to closed

Fixed (see sipb-nagios commit 7d9206eae4e48824e0203d1ce19c4563f9bb664b and scripts r2190).

comment:3 Changed 12 years ago by quentin

Resolution fixed deleted
Status changed from closed to reopened

This isn't good enough; if the routes over the backend interface disappear, we will happily talk to sql over the frontend network and not notice the outage.

Unfortunately, it doesn't look like check_ping supports specifying an interface to check from. I guess we could pretend and ping the backend IP of sql instead.
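One way to catch the "routes silently fall back to the frontend" case is to ask the kernel which interface it would use to reach the backend address, and alert if it is not the backend interface. The sketch below is an alternative to the ping idea above, not a description of any fix that was committed. It parses the output of `ip route get <address>` (a standard iproute2 command); the function names, interface names, and addresses are hypothetical, with addresses taken from documentation ranges.

```python
import re
from typing import Optional


def egress_device(route_get_output: str) -> Optional[str]:
    """Extract the interface name following 'dev' in `ip route get` output."""
    match = re.search(r"\bdev\s+(\S+)", route_get_output)
    return match.group(1) if match else None


def uses_backend(route_get_output: str, backend_dev: str) -> bool:
    """True if the kernel would send this traffic out the backend interface."""
    return egress_device(route_get_output) == backend_dev


# Example inputs, shaped like typical `ip route get` output (assumed values).
direct = "192.0.2.52 dev eth1 src 192.0.2.53 uid 0\n    cache"
fallback = "192.0.2.52 via 198.51.100.1 dev eth0 src 198.51.100.46 uid 0\n    cache"
```

A check built on this would run `ip route get` against the backend IP of sql on each scripts server and report CRITICAL when `uses_backend` returns false.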

comment:4 Changed 12 years ago by adehnert

Resolution set to fixed
Status changed from reopened to closed

Fixed in r2192.
