Observability
UCS ships a self-contained observability stack — metrics, logs, and
distributed traces — as the optional insoft-telemetry package. It is
not installed automatically with insoft-ucs; install it manually on
the hosts where you want local telemetry collection and visualization:
apt install insoft-telemetry # Ubuntu
dnf install insoft-telemetry # RHEL
insoft-telemetry pulls in Grafana, Loki, Tempo, Prometheus, and the
OpenTelemetry Collector as dependencies and configures them. A typical
deployment installs it on one or more central observability hosts; other
UCS hosts can forward telemetry there via the Collector.
What's included
| Component | Role | Package |
|---|---|---|
| Grafana | Web UI for dashboards, log/trace explorer | grafana |
| Loki | Log aggregation backend | loki |
| Tempo | Distributed tracing backend | tempo |
| Prometheus | Metrics storage and query engine | prometheus |
| OpenTelemetry Collector | Telemetry agent (receives OTLP, routes to backends) | insoft-otelcol-contrib |
| Insoft configuration | Drop-in configs that wire everything together | insoft-telemetry |
The collector receives telemetry from applications and forwards each signal to the appropriate backend (traces → Tempo, metrics → Prometheus, logs → Loki). Grafana then queries all three for visualization.
Accessing Grafana
Recommended URL (via Traefik on HTTPS):
https://<server-address>/grafana/
Direct URL (no proxy):
http://<server-address>:3030/grafana/
Grafana is configured to serve from the /grafana/ sub-path so it can sit
behind Traefik on the same hostname as the rest of UCS. The /grafana/
prefix is required in both URLs. The default port :3000 is also changed
to :3030 because :3000 collides with uauth-fa on UCS hosts.
Initial login: admin / admin — Grafana forces a password change on
first login.
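To confirm that Grafana is reachable at either URL, a probe along the following lines may help. This is a minimal sketch: it assumes Grafana's standard `/api/health` endpoint is exposed under the /grafana/ sub-path, and `grafana_health_url` and the host name `ucs.example.com` are hypothetical names used only for illustration.

```shell
# Build the Grafana health-check URL for a UCS host.
# Assumption: Grafana's standard /api/health endpoint is served under the
# /grafana/ sub-path. grafana_health_url is a hypothetical helper name.
grafana_health_url() {
  local host="$1" mode="${2:-proxy}"
  if [ "$mode" = "direct" ]; then
    # Direct access bypasses Traefik and uses plain HTTP on port 3030.
    printf 'http://%s:3030/grafana/api/health\n' "$host"
  else
    # Recommended path: HTTPS through Traefik.
    printf 'https://%s/grafana/api/health\n' "$host"
  fi
}

# Usage (replace the host name with your server address):
#   curl -fsS "$(grafana_health_url ucs.example.com)"
#   curl -fsS "$(grafana_health_url ucs.example.com direct)"
```

If the direct URL answers but the proxied one does not, the problem is more likely in the Traefik route than in Grafana itself.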
Pre-setting the admin password
For automated provisioning (Ansible, etc.), set the password before the first install by writing to the Grafana environment file:
echo 'GF_SECURITY_ADMIN_PASSWORD=<generated>' >> /etc/default/grafana-server
Generate a unique password per host and store it in a secret manager — do not bake passwords into shared scripts.
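A plain `>>` append adds a second line every time a provisioning run repeats. The sketch below shows one way to keep the entry unique. `set_grafana_admin_password` is a hypothetical helper, not something shipped with insoft-telemetry, and the password is expected to come from your secret manager.

```shell
# Write GF_SECURITY_ADMIN_PASSWORD to an environment file without duplicating
# the line on repeated runs. set_grafana_admin_password is a hypothetical
# helper; the password value is supplied by the caller.
set_grafana_admin_password() {
  local env_file="$1" password="$2" tmp
  tmp="$(mktemp)" || return 1
  # Keep every existing line except a previous password entry.
  if [ -f "$env_file" ]; then
    grep -v '^GF_SECURITY_ADMIN_PASSWORD=' "$env_file" > "$tmp"
  fi
  printf 'GF_SECURITY_ADMIN_PASSWORD=%s\n' "$password" >> "$tmp"
  # Overwrite the contents in place so an existing file keeps its owner and mode.
  cat "$tmp" > "$env_file" && rm -f "$tmp"
}

# Usage (openssl is one possible generator; store the value in a secret manager):
#   set_grafana_admin_password /etc/default/grafana-server "$(openssl rand -base64 24)"
```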
LDAP / SSO
For Microsoft Active Directory or LDAP integration, configure Grafana's
LDAP authentication in /etc/grafana/ldap.toml and enable [auth.ldap] in
grafana.ini. The local admin then becomes a break-glass account only.
Configuration locations
The Insoft configs live at the standard upstream paths — open
/etc/loki/config.yml, /etc/tempo/config.yml, or
/etc/prometheus/prometheus.yml in your editor as usual.
Behind the scenes each upstream path is a symlink to an Insoft-shipped
conffile. On install, the upstream original is renamed aside as .dist
and a symlink is created:
| Active path (edit this) | Symlinked to | Upstream original |
|---|---|---|
| /etc/loki/config.yml | /etc/loki/config-insoft.yml | /etc/loki/config.yml.dist |
| /etc/tempo/config.yml | /etc/tempo/config-insoft.yml | /etc/tempo/config.yml.dist |
| /etc/prometheus/prometheus.yml | /etc/prometheus/prometheus-insoft.yml | /etc/prometheus/prometheus.yml.dist |
| /etc/default/prometheus | /etc/default/prometheus-insoft | /etc/default/prometheus.dist |
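To see this layout on a live host, something like the following can print where each active path points and whether the upstream original was kept aside. It is an illustrative sketch; `describe_config` is a hypothetical helper name.

```shell
# Report where a config path points and whether the upstream original was
# kept aside as <path>.dist. describe_config is a hypothetical helper name.
describe_config() {
  local path="$1"
  if [ -L "$path" ]; then
    printf '%s -> %s\n' "$path" "$(readlink "$path")"
  else
    printf '%s is not a symlink\n' "$path"
  fi
  if [ -e "$path.dist" ]; then
    printf '  upstream original: %s.dist\n' "$path"
  fi
}

# Usage on a host with insoft-telemetry installed:
#   for p in /etc/loki/config.yml /etc/tempo/config.yml \
#            /etc/prometheus/prometheus.yml /etc/default/prometheus; do
#     describe_config "$p"
#   done
```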
The OpenTelemetry Collector's config lives at the upstream path with no
redirection: /etc/otelcol-contrib/config.yaml (shipped by
insoft-otelcol-contrib).
On upgrades: the symlink target (*-insoft.yml) is a dpkg/rpm
conffile of insoft-telemetry. If you've edited it through the symlink,
your changes survive apt upgrade / dnf upgrade — the package manager
saves any new shipped version alongside as .dpkg-dist / .rpmnew so
you can diff and merge if you want our new defaults.
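After an upgrade, a quick scan can show whether the package manager left new shipped versions to review. The sketch below assumes the side files sit in the same directories as the conffiles, as described above; `list_pending_configs` is a hypothetical helper name.

```shell
# List package-manager side files (.dpkg-dist on Ubuntu, .rpmnew on RHEL)
# directly inside the given directories. list_pending_configs is a
# hypothetical helper name.
list_pending_configs() {
  find "$@" -maxdepth 1 -type f \
    \( -name '*.dpkg-dist' -o -name '*.rpmnew' \) 2>/dev/null | sort
}

# Usage:
#   list_pending_configs /etc/loki /etc/tempo /etc/prometheus /etc/default
#   # Then compare your edited file with a reported side file, for example:
#   diff -u /etc/loki/config-insoft.yml /etc/loki/config-insoft.yml.dpkg-dist
```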
On removal: the symlink is removed and the .dist file is renamed
back to the original path — upstream defaults are restored.
After editing a config, restart the relevant service:
systemctl restart loki # or tempo / prometheus / grafana-server / otelcol-contrib
Retention
| Backend | Default retention | Configured in |
|---|---|---|
| Loki | 240 hours (10 days) | /etc/loki/config.yml (limits_config.retention_period) |
| Tempo | 240 hours (10 days) | /etc/tempo/config.yml (compactor.compaction.block_retention) |
| Prometheus | Unlimited (no time-based eviction) | --storage.tsdb.retention.time=0 in /etc/default/prometheus |
Prometheus retains everything by default — monitor disk usage and adjust if needed.
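One simple way to monitor disk usage is a periodic check of the filesystem that holds the metrics data. The following is a sketch only: this page does not state where Prometheus stores its data on UCS hosts, so the path in the usage line is an assumption, and `check_disk_usage` is a hypothetical helper name.

```shell
# Print how full the filesystem holding a directory is, and return non-zero
# at or above a threshold (percent). check_disk_usage is a hypothetical helper.
check_disk_usage() {
  local dir="$1" limit="${2:-80}" used
  # df -P prints one line per filesystem; column 5 is the capacity, e.g. "42%".
  used="$(df -P "$dir" | awk 'NR==2 { gsub("%", "", $5); print $5 }')"
  if [ "$used" -ge "$limit" ]; then
    printf 'WARNING: %s is on a filesystem that is %s%% full\n' "$dir" "$used"
    return 1
  fi
  printf 'OK: %s is on a filesystem that is %s%% full\n' "$dir" "$used"
}

# Usage (the path is an assumption; adjust it to your Prometheus data directory):
#   check_disk_usage /var/lib/prometheus 80
```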
Sending telemetry to the stack
Applications running on the same host should ship to the local collector:
| Endpoint | Purpose |
|---|---|
| localhost:4317 (gRPC) | OTLP — accepts traces, metrics, logs |
| localhost:4318 (HTTP) | OTLP — accepts traces, metrics, logs |
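Applications instrumented with an OpenTelemetry SDK can usually be pointed at these endpoints through the standard `OTEL_*` environment variables. The sketch below assumes the local collector accepts plaintext connections on localhost; the service name is a placeholder.

```shell
# Point an OpenTelemetry SDK at the local collector using the standard
# OTEL_* environment variables.
# Assumption: the local collector accepts plaintext OTLP on localhost.
export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"
export OTEL_SERVICE_NAME="my-service"   # hypothetical service name

# For OTLP over HTTP instead, use port 4318:
#   export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
#   export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
```

Not every SDK supports every protocol value, so check the SDK documentation for your language.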
External clients (web pages, mobile apps) push over HTTPS through Traefik:
https://<server-address>/otel/v1/{traces,metrics,logs}
For details on instrumenting applications and the exact payload formats, see the developer documentation (Telemetry).
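As a rough connectivity probe for the external path, the per-signal URL can be built and exercised with curl. This is a sketch under assumptions: it presumes the route accepts the OTLP JSON encoding without extra authentication, and `otlp_url` and `ucs.example.com` are hypothetical names. The developer documentation remains the reference for real payloads.

```shell
# Build the external OTLP/HTTP URL for one signal type.
# otlp_url is a hypothetical helper name.
otlp_url() {
  local host="$1" signal="$2"
  case "$signal" in
    traces|metrics|logs)
      printf 'https://%s/otel/v1/%s\n' "$host" "$signal" ;;
    *)
      printf 'unknown signal: %s\n' "$signal" >&2
      return 1 ;;
  esac
}

# Usage: post an empty OTLP/JSON trace export request as a connectivity probe.
#   curl -sS -X POST -H 'Content-Type: application/json' \
#        -d '{"resourceSpans":[]}' "$(otlp_url ucs.example.com traces)"
```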
Troubleshooting
Grafana page returns "no session id" or a similarly unexpected response
on :3000: that response comes from uauth-fa, not Grafana. Grafana on UCS
hosts is at https://<host>/grafana/ (or http://<host>:3030/grafana/ directly).
Loki / Tempo says "Ingester not ready: waiting for 15s after being ready" — normal 15-second readiness grace period after each restart. Wait and retry.
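In scripts, the wait can be automated by polling the readiness endpoint instead of sleeping for a fixed time. The sketch below assumes Loki and Tempo expose `/ready` on HTTP ports 3100 and 3200 (the ports used in the listening-port check on this page); `wait_ready` is a hypothetical helper name.

```shell
# Poll a readiness URL until it answers successfully or the attempts run out.
# Assumption: Loki and Tempo serve /ready on ports 3100 and 3200.
# wait_ready is a hypothetical helper name.
wait_ready() {
  local url="$1" attempts="${2:-10}" delay="${3:-3}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl fail on HTTP errors such as 503 during the grace period.
    if curl -fs -o /dev/null "$url"; then
      printf 'ready after %s attempt(s): %s\n' "$i" "$url"
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  printf 'still not ready after %s attempts: %s\n' "$attempts" "$url" >&2
  return 1
}

# Usage:
#   wait_ready http://localhost:3100/ready   # Loki
#   wait_ready http://localhost:3200/ready   # Tempo
```

With the defaults (10 attempts, 3 seconds apart) the loop covers the 15-second grace period with some margin.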
Tempo fails to start with "Permission denied": check that /var/tempo/
and its subdirectories exist and are owned by the tempo user. The
insoft-telemetry post-install script creates them; if the directory
permissions drift, recreate them manually:
mkdir -p /var/tempo/{wal,blocks,generator/wal,generator/traces}
chown -R tempo: /var/tempo
systemctl restart tempo
Loki fails with "mkdir /var/lib/loki: permission denied" — same pattern:
mkdir -p /var/lib/loki/{chunks,rules} /var/lib/compactor
chown -R loki: /var/lib/loki /var/lib/compactor
systemctl restart loki
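Before recreating anything, it can be useful to see which directories are actually missing or owned by the wrong user. The following sketch uses the GNU coreutils form of `stat`; `check_owner` is a hypothetical helper name.

```shell
# Report directories that are missing or not owned by the expected user.
# check_owner is a hypothetical helper; stat -c is the GNU coreutils form.
check_owner() {
  local user="$1" dir owner rc=0
  shift
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      printf 'missing: %s\n' "$dir"
      rc=1
      continue
    fi
    owner="$(stat -c '%U' "$dir")"
    if [ "$owner" != "$user" ]; then
      printf 'wrong owner (%s): %s\n' "$owner" "$dir"
      rc=1
    fi
  done
  return "$rc"
}

# Usage:
#   check_owner tempo /var/tempo /var/tempo/wal /var/tempo/blocks
#   check_owner loki /var/lib/loki /var/lib/loki/chunks /var/lib/loki/rules
```

The function prints nothing and returns 0 when every directory exists with the expected owner.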
No data appearing in Grafana:
- Confirm services are running:
  systemctl status grafana-server loki tempo prometheus otelcol-contrib
- Confirm ports are listening:
  ss -lntp | grep -E ":3030|:3100|:3200|:4317|:4318|:9091"
- Check collector logs for export errors:
  journalctl -u otelcol-contrib -n 50 --no-pager
- Test directly with curl through the collector; see the Telemetry developer docs.
Ports
See Communication for the full port list.