Nagios Core — Classic Open‑Source Monitoring Engine
Overview
Nagios Core is the long‑lived heart of many on‑prem monitoring setups. It schedules checks, records states, and raises alerts when hosts or services drift from normal. No glamor, little ceremony — just a reliable polling engine with a huge plugin surface. In mixed estates (old switches here, new Linux clusters there), that stability still matters.
Why It Matters
Modern observability stacks excel at time‑series and traces, yet plenty of issues still come down to simple reachability or a service not answering. Nagios Core keeps that baseline honest. It watches uptime, latency, disk headroom, SSL expiry — the unglamorous things that break weekends. Teams use it when a deterministic “check → state → notify” loop is needed and change control prefers text files over magical autoconfig.
How It Works
The daemon runs a scheduler. At defined intervals it executes plugins: small programs that test something and return an exit code plus a line of text. Results land in status files, states flip between OK, WARNING, CRITICAL, and UNKNOWN (plugin exit codes 0 to 3), and notification rules decide who gets pinged. Active checks handle most polling; passive checks let external systems feed results back. Add‑ons such as NRPE or modern agents run plugins on the remote host when a check needs local access (disk, load, processes) that can’t be tested over the network.
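To make the plugin contract concrete, here is a minimal sketch of a disk check written in Python. The thresholds, the `DISK` prefix, and the `used_pct` perfdata label are illustrative assumptions, not part of any official plugin; only the exit-code convention (0 to 3) and the "text | perfdata" output shape follow the usual plugin guidelines.

```python
"""Minimal Nagios-style plugin sketch: report disk usage for one path."""
import shutil

# Conventional plugin exit codes that the scheduler maps to states.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(used_pct, warn, crit):
    """Map a usage percentage onto a plugin exit code."""
    if used_pct >= crit:
        return CRITICAL
    if used_pct >= warn:
        return WARNING
    return OK


def check_disk(path="/", warn=80.0, crit=90.0):
    """Return (exit_code, one_line_output) for the given path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        # Anything the check cannot measure is UNKNOWN, not CRITICAL.
        return UNKNOWN, f"DISK UNKNOWN - cannot stat {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero capacity"
    used_pct = 100.0 * usage.used / usage.total
    code = evaluate(used_pct, warn, crit)
    # Text before the pipe is for humans; text after it is performance data.
    text = (f"DISK {STATE_NAMES[code]} - {used_pct:.1f}% used on {path}"
            f" | used_pct={used_pct:.1f}%;{warn};{crit};0;100")
    return code, text


def main(argv):
    """Print the status line and return the exit code for the caller."""
    path = argv[1] if len(argv) > 1 else "/"
    code, text = check_disk(path)
    print(text)
    return code

# Deployed as a real plugin, the script would end with:
#   import sys; sys.exit(main(sys.argv))
```

The scheduler only cares about the exit code and the first line of output, which is why the logic lives in a pure function (`evaluate`) that is easy to test apart from the filesystem call.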
Deployment / Installation Guide
– Linux: typical flow is packages or source build, Apache + PHP for the web UI, and a dedicated user.
– Plugins: install the official nagios‑plugins set first; add community or in‑house checks as needed.
– Configuration: hosts, services, contacts, and escalations are plain text (objects/*.cfg). Version control them; many teams template with Ansible.
– Scaling: distribute load with remote pollers or gearman‑style workers; keep the central node for state and notifications.
– Hardening: restrict CGI actions, enforce TLS on the UI, and separate service accounts from plugin credentials.
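Because object definitions are plain text, they are easy to generate from an inventory. The sketch below renders one host and its services in Python; the host name, address, and the `linux-server` / `generic-service` templates and `check_ping` / `check_ssh` commands are assumptions that match the stock sample configuration and may not exist in a given install. Teams that template with Ansible would express the same idea in Jinja2.

```python
def render_object(kind, directives):
    """Render one 'define <kind> { ... }' block with aligned directives."""
    width = max(len(key) for key in directives)
    lines = [f"define {kind} {{"]
    lines += [f"    {key.ljust(width)}  {value}" for key, value in directives.items()]
    lines.append("}")
    return "\n".join(lines)


def render_host(name, address, services):
    """Render a host plus its services; services is a list of (description, command)."""
    blocks = [render_object("host", {
        "use": "linux-server",        # assumed template from the sample config
        "host_name": name,
        "address": address,
    })]
    for description, command in services:
        blocks.append(render_object("service", {
            "use": "generic-service",  # assumed template from the sample config
            "host_name": name,
            "service_description": description,
            "check_command": command,
        }))
    return "\n\n".join(blocks) + "\n"


# Hypothetical inventory entry; 192.0.2.0/24 is a documentation-only range.
print(render_host("web01", "192.0.2.10", [
    ("PING", "check_ping!100.0,20%!500.0,60%"),
    ("SSH", "check_ssh"),
]))
```

Writing the output to a file under objects/ and committing it keeps the generated config reviewable in the same change process as hand-edited files.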
Integrations
Nagios Core ties into old and new worlds without much fuss. Network gear talks SNMP; servers expose checks via NRPE/NSClient++/NCPA; ticketing hooks create incidents; mail/SMS gateways deliver pages. When dashboards are required, bridge scripts can push results into InfluxDB or another lightweight time‑series store, with Grafana on top for visualization.
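A bridge script of the kind mentioned above mostly converts plugin performance data into whatever the time-series store ingests. The following sketch parses a perfdata string (the part after the pipe in plugin output) and emits InfluxDB line protocol. The measurement name `nagios_perfdata` and the tag layout are arbitrary choices for illustration, thresholds are ignored, and only plain decimal values are handled.

```python
import re

# One perfdata item looks like: label=value[UOM];warn;crit;min;max
# Labels containing spaces are wrapped in single quotes.
PERF_ITEM = re.compile(r"(?:'([^']+)'|([^\s=']+))=(-?\d+(?:\.\d+)?)([^;\s]*)")


def parse_perfdata(perfdata):
    """Return a list of (label, value, unit) tuples; thresholds are dropped."""
    return [(quoted or bare, float(value), unit)
            for quoted, bare, value, unit in PERF_ITEM.findall(perfdata)]


def escape_tag(value):
    """Backslash-escape commas, equals signs, and spaces in a tag value."""
    return re.sub(r"([,= ])", r"\\\1", value)


def to_line_protocol(host, service, perfdata, timestamp_ns):
    """Build one line-protocol record per perfdata item."""
    lines = []
    for label, value, unit in parse_perfdata(perfdata):
        tags = (f"host={escape_tag(host)},service={escape_tag(service)}"
                f",label={escape_tag(label)}")
        if unit:
            tags += f",unit={escape_tag(unit)}"
        lines.append(f"nagios_perfdata,{tags} value={value} {timestamp_ns}")
    return lines


# Hypothetical host and service names with check_ping-style perfdata.
for line in to_line_protocol("web01", "PING",
                             "rta=0.512ms;100.000;500.000;0; pl=0%;20;60;;",
                             1700000000000000000):
    print(line)
```

How the records reach the store (HTTP write, file spool, message queue) is left out on purpose, since that depends on the store and on how perfdata is exported from the central node.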
Real‑World Applications
– Baseline monitoring for legacy network hardware where SNMP polling is still the ground truth.
– Clean separation of duties: Nagios Core handles availability alerts; a separate stack (Prometheus, Elastic, or VictoriaMetrics) handles rich metrics.
– Regulated environments that favor auditable text configs and explicit change requests.
– Small colo sites and branch offices where a single VM must do the job for years.
Limitations
– Text config scales, but slowly; large estates need templating and discipline.
– No built‑in time‑series storage or analytics; graphing and trending require add‑ons.
– The UI is functional, not modern.
– Very large check volumes require careful scheduling and multiple workers.
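A rough sizing estimate helps show when check volume starts to matter. The sketch below applies Little's law: average concurrent checks equal the check rate multiplied by the average execution time. The 2x headroom factor and the sample numbers are assumptions for illustration, not Nagios defaults.

```python
import math


def estimate_load(num_checks, interval_s, avg_exec_s, headroom=2.0):
    """Return (checks_per_second, avg_concurrent_checks, suggested_workers)."""
    rate = num_checks / interval_s
    # Little's law: items in flight = arrival rate * time each item takes.
    concurrent = num_checks * avg_exec_s / interval_s
    # Headroom absorbs bursts and slow checks that run into their timeout.
    workers = math.ceil(concurrent * headroom)
    return rate, concurrent, workers


# Hypothetical estate: 20,000 checks, 5-minute interval, 1.5 s per check.
rate, concurrent, workers = estimate_load(20000, 300, 1.5)
print(f"{rate:.1f} checks/s, {concurrent:.0f} in flight, ~{workers} worker slots")
```

If the suggested worker count exceeds what one node can fork comfortably, that is the signal to lengthen intervals, trim slow checks, or spread execution across remote pollers.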
Snapshot Comparison
Tool | Role | Strengths | Best Fit |
Nagios Core | Polling & alert engine | Mature, predictable, plugin‑rich | Mixed or legacy estates needing deterministic checks |
Icinga 2 | Rewrite with Nagios lineage | Cleaner config, improved API/UI | Teams wanting Nagios roots with updated ergonomics |
Zabbix | NMS + metrics | Discovery, built‑in DB & dashboards | Enterprises seeking an all‑in‑one platform |
Prometheus | Metrics + pull model | Cloud‑native, powerful queries | Kubernetes and microservice environments |