A system should explain itself
When production fails, the worst answer is silence. Observability lets engineers ask questions that nobody anticipated when the code was written.
Signals before dashboards
Start with signals that reveal user impact. Dashboards are useful only when they are built from meaningful signals and linked to ownership.
| Signal | Question | Owner |
|---|---|---|
| Latency | Are users waiting? | API team |
| Queue age | Is work stuck? | Worker team |
| Error budget | Are we spending reliability too fast? | Service owner |
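The error budget row asks whether reliability is being spent too fast. One common way to put a number on that is a burn rate: the observed error ratio divided by the error ratio the SLO allows. The sketch below is a minimal illustration, not a prescribed method. The helper name `BurnRate`, the 99.9% target, and the request counts are all assumptions chosen for the example.

```csharp
using System;

// Hypothetical helper: how fast is the error budget being spent?
// A burn rate of 1.0 spends the budget exactly over the SLO window;
// anything above 1.0 spends it faster than planned.
static double BurnRate(long totalRequests, long failedRequests, double sloTarget)
{
    if (totalRequests <= 0) return 0.0; // no traffic, nothing spent
    if (sloTarget <= 0.0 || sloTarget >= 1.0)
        throw new ArgumentOutOfRangeException(nameof(sloTarget), "Expected a value between 0 and 1.");

    double observedErrorRatio = (double)failedRequests / totalRequests;
    double allowedErrorRatio = 1.0 - sloTarget; // the error budget expressed as a ratio
    return observedErrorRatio / allowedErrorRatio;
}

// Assumed numbers for illustration only: a 99.9% SLO with 0.2% of requests failing.
double rate = BurnRate(totalRequests: 1_000_000, failedRequests: 2_000, sloTarget: 0.999);
Console.WriteLine($"Burn rate: {rate:F2}");
```

With these assumed inputs the service fails about twice as often as the SLO allows, so the budget would run out in roughly half the window. Whether that warrants an alert is a decision for the service owner named in the table.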
Trace the boundary
Tracing is strongest around boundaries: API calls, message handlers, database queries, background jobs, and third-party integrations.
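```csharp
// Tag the current span so traces can be filtered by tenant, operation, and queue delay.
activity?.SetTag("tenant.id", tenantId);
activity?.SetTag("operation.name", "invoice.calculate");
activity?.SetTag("queue.age_ms", queueAge.TotalMilliseconds);
```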
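For context, here is one way the `activity` in those tags might come into existence at a message-handler boundary, using .NET's built-in `ActivitySource`. This is a sketch under assumptions: the source name `Billing.Invoices`, the `CalculateInvoice` function, the sample tenant and queue age, and the choice of `ActivityKind.Consumer` are hypothetical. The in-process listener exists only so the example produces an activity when run on its own. A real service would more likely register an exporter instead.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical source name; it usually matches the service or library name.
var source = new ActivitySource("Billing.Invoices");

// StartActivity returns null when nothing is listening, which is why the
// tags use the null-conditional operator. This listener samples everything
// and collects finished activities so they can be inspected.
var stopped = new List<Activity>();
using var listener = new ActivityListener
{
    ShouldListenTo = s => s.Name == "Billing.Invoices",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) => ActivitySamplingResult.AllData,
    ActivityStopped = stopped.Add
};
ActivitySource.AddActivityListener(listener);

// Hypothetical handler: the activity covers exactly the boundary being crossed.
static void CalculateInvoice(ActivitySource source, string tenantId, TimeSpan queueAge)
{
    using var activity = source.StartActivity("invoice.calculate", ActivityKind.Consumer);
    activity?.SetTag("tenant.id", tenantId);
    activity?.SetTag("operation.name", "invoice.calculate");
    activity?.SetTag("queue.age_ms", queueAge.TotalMilliseconds);

    // The invoice calculation itself would run here.
}

// Assumed inputs for illustration only.
CalculateInvoice(source, "tenant-42", TimeSpan.FromMilliseconds(1250));
Console.WriteLine($"Recorded {stopped.Count} activity named {stopped[0].OperationName}");
```

Scoping the activity with `using` ends the span when the handler returns, so its duration matches the boundary rather than the lifetime of the process.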