Alerts
An alert rule is the core of monitoring: it watches one metric on one resource and triggers a response when that metric crosses a threshold you set. Alerts are where you decide what counts as a problem worth knowing about. This page lists your rules and is where you create, edit, and enable them.
Existing rules are shown in a table, one row per rule. The columns name the rule and the Product, Resource, and Metric it watches, and preview its Thresholds:
- The Thresholds preview lets you confirm where a rule triggers without opening it, so you can scan the table and spot a threshold that drifted out of date.
Creating an alert
A rule is organized into three parts: General sets its identity and timing, Data Source picks what it watches, and Thresholds define when it triggers.
General
Identity and timing for the rule. Past the Name and Enabled fields, two settings shape how the rule behaves when it fires:
- Default actions are the actions a threshold runs when it sets none of its own. Setting them here lets you reuse one response across every threshold instead of wiring each one up by hand, so a rule with five thresholds can share a single notification. Actions come from the Actions page.
- Cooldown (seconds) is how long to wait after firing before the rule can fire again, so a metric sitting over the line does not bury you in notifications. A cooldown of 300 means a rule that fires at 10:00 stays quiet until 10:05 even if the metric never drops. Set it too high and you miss a metric that recovers and spikes again inside the window; too low and a flapping metric notifies on every check.
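The cooldown check can be sketched in a few lines. This is a minimal illustration, not the monitor's actual code: `can_fire` and its parameters are hypothetical names, and it assumes the rule may fire again exactly when the cooldown has elapsed.

```python
from datetime import datetime, timedelta
from typing import Optional


def can_fire(now: datetime, last_fired: Optional[datetime], cooldown_seconds: int) -> bool:
    """Return True when the rule is outside its cooldown window.

    Assumption: a rule that has never fired is always allowed to fire,
    and the window ends exactly cooldown_seconds after the last firing.
    """
    if last_fired is None:
        return True
    return now - last_fired >= timedelta(seconds=cooldown_seconds)


fired_at = datetime(2024, 1, 1, 10, 0, 0)  # the rule fired at 10:00

print(can_fire(datetime(2024, 1, 1, 10, 4, 59), fired_at, 300))  # False
print(can_fire(datetime(2024, 1, 1, 10, 5, 0), fired_at, 300))   # True
```

With a cooldown of 300, a check at 10:04:59 is suppressed and a check at 10:05:00 is allowed, whatever the metric did in between.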
Data Source
The data source ties the rule to one resource and one metric. You pick them in order: choosing a service narrows the resources, and choosing a resource narrows the metrics.
- Service: a service in the project, such as TagoIO Platform, an installed middleware, or an MQTT broker, plus Account for billing.
- Resource: the specific resource within that service, for example the Main Database under TagoIO Platform, or Billing under Account.
- Metric: the measurement to evaluate, such as CPU or memory utilization.
Budget alerts
When the service is Account and the metric is Budget, a Budget (USD) field appears. You set a budget amount in dollars, and the rule's thresholds are read as a percentage of it. A threshold at 50% of a 100 USD budget fires once month-to-date spend reaches 50 USD, which gives you an early warning before spend reaches the limit.
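The percentage-to-dollar conversion is simple arithmetic. The sketch below uses hypothetical function names and assumes the threshold uses the greater than or equal comparison:

```python
def budget_threshold_usd(budget_usd: float, threshold_percent: float) -> float:
    """Convert a percentage threshold into a dollar amount of the budget."""
    return budget_usd * threshold_percent / 100.0


def budget_threshold_crossed(spend_usd: float, budget_usd: float, threshold_percent: float) -> bool:
    """Assumption: the threshold is crossed once month-to-date spend reaches the level."""
    return spend_usd >= budget_threshold_usd(budget_usd, threshold_percent)


print(budget_threshold_usd(100, 50))             # 50.0
print(budget_threshold_crossed(49.99, 100, 50))  # False
print(budget_threshold_crossed(50.00, 100, 50))  # True
```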
Thresholds
A threshold is a level on the metric that should trigger a response, set on a slider. Percentage metrics such as CPU and budget run from 0 to 100%; other metrics, like counts, seconds, or bytes, scale the slider to a range above their highest threshold instead. Beyond its name and color, two parts decide how a threshold behaves:
- The comparison can be greater than or equal (≥) or less than or equal (≤). Greater than or equal fits metrics you want to cap, such as CPU; less than or equal fits metrics where a drop is the problem, such as available disk or a connection count that should stay up.
- The actions run when the metric crosses the threshold. Left empty, the threshold falls back to the rule's default actions, so set them here only when this level needs a different response. Either way a threshold must resolve to at least one action.
A rule can hold more than one threshold, so the same metric can warn at one level and escalate at a higher one. A Main Database CPU rule might warn at 70% or above with an email and escalate at 90% or above with a webhook. When a value crosses more than one threshold at once, only the one nearest the current value runs its actions: a jump straight to 95% fires the 90% threshold, not the 70% one, so you get the most severe alert without the lower levels adding noise. The thresholds that were crossed but skipped are still recorded in the Logs.
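The selection rule and the fallback to default actions can be sketched as follows. Everything here is illustrative: the `Threshold` class, the function names, and the action names are hypothetical, and "nearest" is assumed to mean the smallest absolute distance between the crossed level and the current value.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Threshold:
    name: str
    level: float
    comparison: str  # ">=" or "<="
    actions: List[str] = field(default_factory=list)  # empty means "use the rule's defaults"


def is_crossed(threshold: Threshold, value: float) -> bool:
    """Apply the threshold's comparison to the current metric value."""
    if threshold.comparison == ">=":
        return value >= threshold.level
    if threshold.comparison == "<=":
        return value <= threshold.level
    raise ValueError(f"unknown comparison: {threshold.comparison}")


def select_threshold(thresholds: List[Threshold], value: float) -> Optional[Threshold]:
    """Of all crossed thresholds, pick the one nearest the current value."""
    crossed = [t for t in thresholds if is_crossed(t, value)]
    if not crossed:
        return None
    return min(crossed, key=lambda t: abs(value - t.level))


def resolve_actions(threshold: Threshold, default_actions: List[str]) -> List[str]:
    """A threshold's own actions win; otherwise fall back to the rule's defaults."""
    return threshold.actions or default_actions


defaults = ["email"]
thresholds = [
    Threshold("warn", 70, ">="),                    # no own actions, uses defaults
    Threshold("critical", 90, ">=", ["webhook"]),   # own action overrides defaults
]

hit = select_threshold(thresholds, 95)
print(hit.name, resolve_actions(hit, defaults))  # critical ['webhook']

hit = select_threshold(thresholds, 75)
print(hit.name, resolve_actions(hit, defaults))  # warn ['email']

print(select_threshold(thresholds, 50))          # None
```

The same selection works for less than or equal thresholds: with levels at 20 and 10 on available disk, a value of 5 crosses both and selects the 10 level.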
How alerts run
The monitor reads the rule's metric regularly and compares it against each threshold. When a threshold is crossed, it runs that threshold's actions and records the event in the Logs. The cooldown then holds off repeat notifications while the condition lasts. A rule starts being evaluated as soon as it is enabled, and changes take effect as soon as you save them.
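One evaluation pass, as described above, might look like the sketch below. It is a simplified model under stated assumptions, not the monitor's implementation: the `Rule` class and its fields are hypothetical, all thresholds use the greater than or equal comparison, time is passed in as plain seconds, and nothing is run or logged while the cooldown is active.

```python
from typing import Dict, List, Optional, Tuple


class Rule:
    def __init__(self, thresholds: List[Tuple[float, List[str]]],
                 cooldown_seconds: float, enabled: bool = True) -> None:
        self.thresholds = thresholds          # (level, actions) pairs, all ">="
        self.cooldown_seconds = cooldown_seconds
        self.enabled = enabled
        self.last_fired: Optional[float] = None
        self.logs: List[Dict[str, object]] = []

    def evaluate(self, value: float, now: float) -> List[str]:
        """Run one check and return the actions that should run."""
        if not self.enabled:
            return []
        # Assumption: during the cooldown nothing is run or logged.
        if self.last_fired is not None and now - self.last_fired < self.cooldown_seconds:
            return []
        crossed = [pair for pair in self.thresholds if value >= pair[0]]
        if not crossed:
            return []
        # With only ">=" thresholds, the nearest crossed level is the highest one.
        fired_level, actions = max(crossed, key=lambda pair: pair[0])
        self.logs.append({"level": fired_level, "status": "fired", "value": value})
        for level, _ in crossed:
            if level != fired_level:
                self.logs.append({"level": level, "status": "skipped", "value": value})
        self.last_fired = now
        return actions


rule = Rule([(70, ["email"]), (90, ["webhook"])], cooldown_seconds=300)
print(rule.evaluate(95, now=0))    # ['webhook']
print(rule.evaluate(96, now=100))  # [] (still inside the cooldown)
print(rule.evaluate(75, now=300))  # ['email']
print(len(rule.logs))              # 3
```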
Use cases
- Catch a database or service running hot before it slows your application.
- Get an early warning as monthly spend approaches your budget.
- Warn at one threshold and run an automated response at a higher one.
- Trigger a webhook that runs an Analysis or calls an external system, so the response happens without anyone watching.