Observability

UCS ships a self-contained observability stack — metrics, logs, and distributed traces — as the optional insoft-telemetry package. It is not installed automatically with insoft-ucs; install it manually on the hosts where you want local telemetry collection and visualization:

apt install insoft-telemetry    # Ubuntu
dnf install insoft-telemetry    # RHEL

insoft-telemetry pulls in Grafana, Loki, Tempo, Prometheus, and the OpenTelemetry Collector as dependencies and configures them. A typical deployment installs it on one or more central observability hosts; other UCS hosts can forward telemetry there via the Collector.

What's included

| Component | Role | Package |
| --- | --- | --- |
| Grafana | Web UI for dashboards, log/trace explorer | grafana |
| Loki | Log aggregation backend | loki |
| Tempo | Distributed tracing backend | tempo |
| Prometheus | Metrics storage and query engine | prometheus |
| OpenTelemetry Collector | Telemetry agent (receives OTLP, routes to backends) | insoft-otelcol-contrib |
| Insoft configuration | Drop-in configs that wire everything together | insoft-telemetry |

The collector receives telemetry from applications and forwards each signal to the appropriate backend (traces → Tempo, metrics → Prometheus, logs → Loki). Grafana then queries all three for visualization.

Accessing Grafana

Recommended URL (via Traefik on HTTPS):

https://<server-address>/grafana/

Direct URL (no proxy):

http://<server-address>:3030/grafana/

Grafana is configured to serve from the /grafana/ sub-path so it can sit behind Traefik on the same hostname as the rest of UCS. The /grafana/ prefix is required in both URLs. The default port :3000 is also changed to :3030 because :3000 collides with uauth-fa on UCS hosts.
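
To confirm the sub-path and port from the command line, one option is Grafana's health endpoint. This is a minimal sketch: grafana_health_url is an illustrative helper name, and it assumes the health endpoint is served under the same /grafana/ prefix as the UI.

```shell
# Sketch: build the direct (no proxy) health-check URL for a UCS Grafana host.
# grafana_health_url is an illustrative helper, not part of UCS.
grafana_health_url() {
    printf 'http://%s:3030/grafana/api/health\n' "${1:-localhost}"
}

# Example:
#   curl -fsS "$(grafana_health_url)"                 # local host
#   curl -fsS "$(grafana_health_url obs1.example)"    # hypothetical remote host
```

A request to the same path without the /grafana/ prefix, or on :3000, should not reach Grafana.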

Initial login: admin / admin. Grafana prompts for a password change on first login; change it right away.

Pre-setting the admin password

For automated provisioning (Ansible, etc.), set the password before the first install by writing to the Grafana environment file:

echo 'GF_SECURITY_ADMIN_PASSWORD=<generated>' >> /etc/default/grafana-server

Generate a unique password per host and store it in a secret manager — do not bake passwords into shared scripts.
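
Because >> appends, running the one-liner twice leaves two assignments in the file. The following is a minimal sketch of an idempotent variant. The helper names generate_password and set_grafana_admin_password are illustrative, not part of UCS, and the sketch assumes the environment file holds plain KEY=value lines.

```shell
# Sketch: set GF_SECURITY_ADMIN_PASSWORD exactly once in an environment file.
# Both helper names are illustrative; they are not shipped by UCS or Grafana.

# Print a random 32-character hex string read from /dev/urandom.
generate_password() {
    head -c 16 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Usage: set_grafana_admin_password <env-file> <password>
set_grafana_admin_password() {
    env_file=$1
    password=$2
    scratch=$(mktemp)    # created with owner-only permissions
    if [ -f "$env_file" ]; then
        # Keep every line except an earlier assignment of the variable.
        grep -v '^GF_SECURITY_ADMIN_PASSWORD=' "$env_file" > "$scratch" || true
    fi
    printf 'GF_SECURITY_ADMIN_PASSWORD=%s\n' "$password" >> "$scratch"
    # Write back in place so the file's owner and mode are left untouched.
    cat "$scratch" > "$env_file"
    rm -f "$scratch"
}

# Example (as root):
#   set_grafana_admin_password /etc/default/grafana-server "$(generate_password)"
```

In a real playbook the generated value would come from the secret manager rather than being created on the host.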

LDAP / SSO

For Microsoft Active Directory or LDAP integration, configure Grafana's LDAP authentication in /etc/grafana/ldap.toml and enable [auth.ldap] in grafana.ini. The local admin then becomes a break-glass account only.

Configuration locations

The Insoft configs live at the standard upstream paths — open /etc/loki/config.yml, /etc/tempo/config.yml, or /etc/prometheus/prometheus.yml in your editor as usual.

Behind the scenes each upstream path is a symlink to an Insoft-shipped conffile. On install, the upstream original is renamed aside as .dist and a symlink is created:

| Active path (edit this) | Symlinked to | Upstream original |
| --- | --- | --- |
| /etc/loki/config.yml | /etc/loki/config-insoft.yml | /etc/loki/config.yml.dist |
| /etc/tempo/config.yml | /etc/tempo/config-insoft.yml | /etc/tempo/config.yml.dist |
| /etc/prometheus/prometheus.yml | /etc/prometheus/prometheus-insoft.yml | /etc/prometheus/prometheus.yml.dist |
| /etc/default/prometheus | /etc/default/prometheus-insoft | /etc/default/prometheus.dist |
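
To see this layout on a host, the active paths can be inspected directly. A minimal sketch, where describe_config is an illustrative helper name:

```shell
# Sketch: report whether a config path is a symlink and where it points.
# describe_config is an illustrative helper, not part of UCS.
describe_config() {
    cfg=$1
    if [ -L "$cfg" ]; then
        printf '%s -> %s\n' "$cfg" "$(readlink "$cfg")"
    elif [ -e "$cfg" ]; then
        printf '%s is a regular file (not redirected)\n' "$cfg"
    else
        printf '%s is missing\n' "$cfg"
    fi
}

# Example:
#   for f in /etc/loki/config.yml /etc/tempo/config.yml \
#            /etc/prometheus/prometheus.yml /etc/default/prometheus; do
#       describe_config "$f"
#   done
```

A path reported as a regular file suggests the link was replaced, for example by an editor or tool that writes a new file instead of editing in place.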

The OpenTelemetry Collector's config lives at the upstream path with no redirection: /etc/otelcol-contrib/config.yaml (shipped by insoft-otelcol-contrib).

On upgrades: the symlink target (*-insoft.yml) is a dpkg/rpm conffile of insoft-telemetry. If you've edited it through the symlink, your changes survive apt upgrade / dnf upgrade — the package manager saves any new shipped version alongside as .dpkg-dist / .rpmnew so you can diff and merge if you want our new defaults.
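
After an upgrade, a quick scan shows whether the package manager left new defaults to merge. This is a sketch only: list_pending_configs is an illustrative helper name, and the directories in the example are taken from the table above.

```shell
# Sketch: list .dpkg-dist / .rpmnew files left next to edited conffiles.
# list_pending_configs is an illustrative helper; pass the directories to scan.
list_pending_configs() {
    for dir in "$@"; do
        [ -d "$dir" ] || continue    # skip directories that do not exist
        find "$dir" -maxdepth 1 -type f \( -name '*.dpkg-dist' -o -name '*.rpmnew' \)
    done
}

# Example: diff each edited file against the newly shipped version.
#   list_pending_configs /etc/loki /etc/tempo /etc/prometheus /etc/default |
#   while read -r new; do
#       diff -u "${new%.*}" "$new"    # strip the suffix to get the active file
#   done
```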

On removal: the symlink is removed and the .dist file is renamed back to the original path — upstream defaults are restored.

After editing a config, restart the relevant service:

systemctl restart loki     # or tempo / prometheus / grafana-server / otelcol-contrib

Retention

| Backend | Default retention | Configured in |
| --- | --- | --- |
| Loki | 240 hours (10 days) | /etc/loki/config.yml (limits_config.retention_period) |
| Tempo | 240 hours (10 days) | /etc/tempo/config.yml (compactor.compaction.block_retention) |
| Prometheus | Unlimited (no time-based eviction) | --storage.tsdb.retention.time=0 in /etc/default/prometheus |

Prometheus retains everything by default — monitor disk usage and adjust if needed.
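
As an illustration of adjusting it, the sketch below rewrites the retention flag in place. It assumes the flag appears literally as --storage.tsdb.retention.time=<value> in the file and that GNU sed and readlink are available; set_prometheus_retention is an illustrative helper name. Since /etc/default/prometheus is a symlink, the sketch resolves it first so the edit lands in the target file and the link itself stays intact.

```shell
# Sketch: change the value of --storage.tsdb.retention.time in an args file.
# Assumes the flag is present as --storage.tsdb.retention.time=<value>.
# set_prometheus_retention is an illustrative helper, not part of UCS.
set_prometheus_retention() {
    file=$(readlink -f "$1")    # follow the symlink to the real conffile
    retention=$2                # a Prometheus duration such as 30d or 12w
    sed -i -E "s/(--storage\.tsdb\.retention\.time=)[^ \"']*/\1${retention}/" "$file"
}

# Example (as root), followed by a restart:
#   set_prometheus_retention /etc/default/prometheus 30d
#   systemctl restart prometheus
```

After the restart, the Prometheus startup log is a reasonable place to confirm the retention that is actually in effect.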

Sending telemetry to the stack

Applications running on the same host should ship to the local collector:

| Endpoint | Purpose |
| --- | --- |
| localhost:4317 (gRPC) | OTLP: accepts traces, metrics, logs |
| localhost:4318 (HTTP) | OTLP: accepts traces, metrics, logs |
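
Many OpenTelemetry SDKs read their exporter settings from the standard OTEL_* environment variables, so pointing an application at the local collector can be as small as the sketch below. The helper name otel_env_local and the default service name my-service are placeholders; whether a given application honours these variables depends on its SDK.

```shell
# Sketch: export the standard OpenTelemetry SDK variables for the local collector.
# otel_env_local and "my-service" are placeholders, not UCS conventions.
otel_env_local() {
    export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4318"
    export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"
    export OTEL_SERVICE_NAME="${1:-my-service}"
}

# For the gRPC endpoint instead:
#   export OTEL_EXPORTER_OTLP_ENDPOINT="http://localhost:4317"
#   export OTEL_EXPORTER_OTLP_PROTOCOL="grpc"

# Example: otel_env_local billing-api && ./start-my-app    # hypothetical app
```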

External clients (web pages, mobile apps) push over HTTPS through Traefik:

https://<server-address>/otel/v1/{traces,metrics,logs}

For details on instrumenting applications and the exact payload formats, see the developer documentation (Telemetry).

Troubleshooting

Port :3000 returns "no session id" or a similarly unexpected response instead of Grafana. That port belongs to uauth-fa; Grafana on UCS hosts is at https://<host>/grafana/ (or http://<host>:3030/grafana/ directly).

Loki / Tempo says "Ingester not ready: waiting for 15s after being ready" — normal 15-second readiness grace period after each restart. Wait and retry.

Tempo fails to start with "Permission denied". Check that /var/tempo/ and its subdirectories exist and are owned by the tempo user. The insoft-telemetry post-install script creates them; if they go missing or their ownership changes, recreate them manually:

mkdir -p /var/tempo/{wal,blocks,generator/wal,generator/traces}
chown -R tempo: /var/tempo
systemctl restart tempo

Loki fails with "mkdir /var/lib/loki: permission denied" — same pattern:

mkdir -p /var/lib/loki/{chunks,rules} /var/lib/compactor
chown -R loki: /var/lib/loki /var/lib/compactor
systemctl restart loki

No data appearing in Grafana:

  1. Confirm services are running: systemctl status grafana-server loki tempo prometheus otelcol-contrib
  2. Confirm ports are listening: ss -lntp | grep -E ":3030|:3100|:3200|:4317|:4318|:9091"
  3. Check collector logs for export errors: journalctl -u otelcol-contrib -n 50 --no-pager
  4. Test directly with curl through the collector — see Telemetry developer docs.
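
For step 4, a smoke test could look like the sketch below, which posts a single span as OTLP/JSON to the collector's HTTP endpoint. The service name ucs-smoke-test, the span name, and the helper names are placeholders chosen for illustration; the developer documentation remains the reference for payload formats.

```shell
# Sketch: build a one-span OTLP/JSON payload and post it to the local collector.
# All names below are placeholders; trace and span IDs are random hex.

# Print N random bytes as lowercase hex (2N characters).
random_hex() {
    head -c "$1" /dev/urandom | od -An -tx1 | tr -d ' \n'
}

# Print a minimal OTLP/JSON trace export request containing one span.
build_test_span() {
    now=$(date +%s)
    cat <<EOF
{
  "resourceSpans": [{
    "resource": {"attributes": [
      {"key": "service.name", "value": {"stringValue": "ucs-smoke-test"}}
    ]},
    "scopeSpans": [{
      "scope": {"name": "manual-check"},
      "spans": [{
        "traceId": "$(random_hex 16)",
        "spanId": "$(random_hex 8)",
        "name": "smoke-test-span",
        "kind": 1,
        "startTimeUnixNano": "$((now - 1))000000000",
        "endTimeUnixNano": "${now}000000000"
      }]
    }]
  }]
}
EOF
}

# Post the payload; curl exits non-zero on connection errors or HTTP >= 400.
send_test_span() {
    build_test_span | curl -fsS -X POST \
        -H 'Content-Type: application/json' \
        --data-binary @- \
        "${1:-http://localhost:4318}/v1/traces"
}

# Example:
#   send_test_span    # then look for service "ucs-smoke-test" in Grafana
```

If the request succeeds but nothing shows up in Grafana, the collector log from step 3 is the next place to look for export errors toward Tempo.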

Ports

See Communication for the full port list.