Telemetry
UCS hosts run an embedded observability stack (Grafana / Loki / Tempo / Prometheus / OpenTelemetry Collector). Applications should ship metrics, logs, and traces to the local OpenTelemetry Collector, which forwards each signal to the appropriate backend.
For the operator-facing overview, see Observability.
Endpoints
The OpenTelemetry Collector listens locally on:
| Endpoint | Protocol | Use it for |
|---|---|---|
| localhost:4317 | OTLP / gRPC | Server-side apps with an OTel SDK |
| localhost:4318 | OTLP / HTTP | Server-side apps that prefer HTTP, or curl/manual sends |
| https://<host>/otel/v1/* | OTLP / HTTPS | External clients (browsers, mobile apps), routed through Traefik |
All three accept all three signals (traces, metrics, logs). HTTP paths for the OTLP/HTTP endpoint are standard:
- POST /v1/traces
- POST /v1/metrics
- POST /v1/logs
The Traefik route strips the /otel prefix before forwarding, so
POST /otel/v1/traces is forwarded as POST /v1/traces to the local
collector.
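As a rough illustration of the rewrite described above, the sketch below mimics the prefix stripping in plain Python. The function name strip_otel_prefix is hypothetical; the real rewrite is performed by the Traefik route, not by application code.

```python
def strip_otel_prefix(path: str, prefix: str = "/otel") -> str:
    """Return the path the collector would see once the prefix is removed."""
    if path == prefix:
        return "/"
    if path.startswith(prefix + "/"):
        return path[len(prefix):]
    # Paths outside the prefix are returned unchanged in this sketch.
    return path


for public_path in ("/otel/v1/traces", "/otel/v1/metrics", "/otel/v1/logs"):
    print(public_path, "->", strip_otel_prefix(public_path))
```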
Signal → backend routing
The collector's pipelines (defined in /etc/otelcol-contrib/config.yaml)
route each signal as follows:
| Signal | OTLP path | Backend | Where to view |
|---|---|---|---|
| Traces | /v1/traces | Tempo | Grafana → Explore → Tempo |
| Metrics | /v1/metrics | Prometheus (via remote-write) | Grafana → Explore → Prometheus |
| Logs | /v1/logs | Loki | Grafana → Explore → Loki |
Use the same single endpoint regardless of signal type — the collector demultiplexes by the HTTP path or gRPC method.
Instrumenting an application
Python
Install the OTel SDK and OTLP exporter:
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
Minimal example — emit a trace from a Python service:
from opentelemetry import trace
from opentelemetry.sdk.resources import Resource
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# service.name is the attribute Grafana groups data by.
resource = Resource.create({"service.name": "ucs-api"})
provider = TracerProvider(resource=resource)
# Batch spans and export them to the local collector over plaintext gRPC.
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("handle-request"):
    # ... your code ...
    pass
For Python apps, prefer auto-instrumentation for HTTP frameworks, databases, etc. — it adds no boilerplate per call site:
pip install opentelemetry-instrumentation
opentelemetry-bootstrap --action=install # installs instrumentations for detected libs
opentelemetry-instrument --traces_exporter otlp \
    --metrics_exporter otlp \
    --logs_exporter otlp \
    --exporter_otlp_endpoint http://localhost:4317 \
    python your_app.py
Node.js
npm install @opentelemetry/api @opentelemetry/sdk-node @opentelemetry/auto-instrumentations-node @opentelemetry/exporter-trace-otlp-grpc
const { NodeSDK } = require('@opentelemetry/sdk-node')
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc')
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node')

const sdk = new NodeSDK({
  serviceName: 'ucs-operator',
  traceExporter: new OTLPTraceExporter({ url: 'http://localhost:4317' }),
  instrumentations: [getNodeAutoInstrumentations()],
})
sdk.start()
Other languages
OpenTelemetry has SDKs for most major languages —
Java, Go, .NET, Rust, PHP, Ruby. The OTLP endpoint URL is the same in
all of them: http://localhost:4317 (gRPC) or http://localhost:4318 (HTTP).
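Most OTel SDKs also read their exporter settings from the standard OTEL_* environment variables, so the endpoint does not have to be hard-coded per language. The sketch below builds such an environment in Python. The helper name otel_env is an assumption for illustration, and which variables a given SDK honors can vary by language and version.

```python
import os


def otel_env(service_name: str, protocol: str = "grpc") -> dict:
    """Build OTEL_* variables that point an SDK at the local collector."""
    endpoints = {
        "grpc": "http://localhost:4317",
        "http/protobuf": "http://localhost:4318",
    }
    if protocol not in endpoints:
        raise ValueError(f"unsupported protocol: {protocol}")
    return {
        "OTEL_SERVICE_NAME": service_name,
        "OTEL_EXPORTER_OTLP_PROTOCOL": protocol,
        "OTEL_EXPORTER_OTLP_ENDPOINT": endpoints[protocol],
    }


# Merge with the current environment before starting the service.
env = {**os.environ, **otel_env("ucs-api")}
print(env["OTEL_EXPORTER_OTLP_ENDPOINT"])
```

The resulting dictionary can be passed as the env argument of subprocess.run when launching a service written in any language.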
Browser / mobile client telemetry
For client-side code (web pages, mobile apps), use the public HTTPS endpoint through Traefik:
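// Written against the 1.x JS SDK API (new Resource, addSpanProcessor).
import { Resource } from '@opentelemetry/resources'
import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base'
import { WebTracerProvider } from '@opentelemetry/sdk-trace-web'
import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http'

const provider = new WebTracerProvider({
  resource: new Resource({ 'service.name': 'ucs-admin-ui' }),
})
// Batch spans and send them to the public HTTPS endpoint behind Traefik.
provider.addSpanProcessor(new BatchSpanProcessor(new OTLPTraceExporter({
  url: 'https://your-ucs-host/otel/v1/traces',
})))
provider.register()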
The Traefik route includes CORS middleware, rate limiting, and a request
body size cap — see /etc/traefik/conf.d/07_otel.yaml.
Manual sends (testing / one-offs)
Useful for verifying connectivity, debugging the pipeline, or sending ad-hoc events from shell scripts.
Send a log via curl:
NOW=$(date +%s%N)
curl -X POST -H "Content-Type: application/json" \
http://localhost:4318/v1/logs \
-d "{\"resourceLogs\":[{\"resource\":{\"attributes\":[{\"key\":\"service.name\",\"value\":{\"stringValue\":\"shell-script\"}}]},\"scopeLogs\":[{\"scope\":{},\"logRecords\":[{\"timeUnixNano\":\"$NOW\",\"severityNumber\":9,\"severityText\":\"INFO\",\"body\":{\"stringValue\":\"hello from shell\"}}]}]}]}"
Send a trace:
TRACE=$(openssl rand -hex 16); SPAN=$(openssl rand -hex 8)
NOW=$(date +%s%N); END=$((NOW + 1000000))
curl -X POST -H "Content-Type: application/json" \
http://localhost:4318/v1/traces \
-d "{\"resourceSpans\":[{\"resource\":{\"attributes\":[{\"key\":\"service.name\",\"value\":{\"stringValue\":\"shell-script\"}}]},\"scopeSpans\":[{\"scope\":{},\"spans\":[{\"traceId\":\"$TRACE\",\"spanId\":\"$SPAN\",\"name\":\"test-span\",\"kind\":1,\"startTimeUnixNano\":\"$NOW\",\"endTimeUnixNano\":\"$END\"}]}]}]}"
echo "TRACE=$TRACE"
Send a metric:
NOW=$(date +%s%N)
curl -X POST -H "Content-Type: application/json" \
http://localhost:4318/v1/metrics \
-d "{\"resourceMetrics\":[{\"resource\":{\"attributes\":[{\"key\":\"service.name\",\"value\":{\"stringValue\":\"shell-script\"}}]},\"scopeMetrics\":[{\"scope\":{},\"metrics\":[{\"name\":\"shell_test_counter\",\"sum\":{\"dataPoints\":[{\"asInt\":\"1\",\"timeUnixNano\":\"$NOW\"}],\"aggregationTemporality\":2,\"isMonotonic\":true}}]}]}]}"
Each request should return HTTP 200 with body {"partialSuccess":{}}.
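The same log payload can be built from Python with only the standard library, which avoids the shell quoting. This is a minimal sketch, assuming the collector is reachable on localhost:4318 as described above; build_log_payload and send_log are hypothetical helper names.

```python
import json
import time
import urllib.request


def build_log_payload(body: str, service_name: str = "shell-script") -> dict:
    """Build an OTLP/JSON request body containing one INFO log record."""
    return {
        "resourceLogs": [{
            "resource": {"attributes": [
                {"key": "service.name", "value": {"stringValue": service_name}},
            ]},
            "scopeLogs": [{
                "scope": {},
                "logRecords": [{
                    "timeUnixNano": str(time.time_ns()),
                    "severityNumber": 9,  # INFO in the OTel log data model
                    "severityText": "INFO",
                    "body": {"stringValue": body},
                }],
            }],
        }],
    }


def send_log(body: str, url: str = "http://localhost:4318/v1/logs") -> int:
    """POST one log record to the collector and return the HTTP status."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_log_payload(body)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=5) as response:
        return response.status
```

With the collector running, send_log("hello from python") should return 200, matching the curl examples.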
Conventions
Resource attributes
Always set at minimum service.name on every emitted signal — Grafana
groups data by this attribute. Strongly recommended additions:
- service.namespace (e.g. ucs, uphone, operator)
- service.version (release version of your service)
- deployment.environment (production, staging, development)
These are part of the OpenTelemetry Resource semantic conventions.
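One way to apply these attributes without touching application code is the standard OTEL_RESOURCE_ATTRIBUTES variable, which takes comma-separated key=value pairs. The sketch below formats that string in Python; the helper name and the example values are illustrative assumptions.

```python
from urllib.parse import quote


def resource_attributes_env(service_name: str, namespace: str,
                            version: str, environment: str) -> str:
    """Format resource attributes for OTEL_RESOURCE_ATTRIBUTES."""
    attributes = {
        "service.name": service_name,
        "service.namespace": namespace,
        "service.version": version,
        "deployment.environment": environment,
    }
    # Percent-encode values so that "," and "=" cannot break the list.
    return ",".join(f"{key}={quote(value, safe='')}"
                    for key, value in attributes.items())


print(resource_attributes_env("ucs-api", "ucs", "1.4.2", "production"))
```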
Span / metric naming
Follow OTel semantic conventions
for common attributes (HTTP, DB, RPC). For example HTTP servers should
emit spans named <METHOD> <route> with attributes like http.request.method,
http.response.status_code, etc. Auto-instrumentation packages do this
for you on supported libraries.
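For hand-written instrumentation, the naming rule above can be sketched as a small helper. The function name http_server_span is hypothetical; the attribute keys are the ones named above plus http.route from the same semantic conventions. The route should be the template (for example /users/{id}), not the concrete URL, so that span names stay low-cardinality.

```python
def http_server_span(method: str, route: str, status_code: int):
    """Return a span name and attributes for one handled HTTP request."""
    method = method.upper()
    name = f"{method} {route}"
    attributes = {
        "http.request.method": method,
        "http.route": route,
        "http.response.status_code": status_code,
    }
    return name, attributes


name, attributes = http_server_span("get", "/users/{id}", 200)
print(name, attributes)
```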
Sampling
For high-traffic services, configure head-based sampling in the SDK, or
tail-based sampling in the collector, to limit volume. The default OTel SDK
sampler is parentbased_always_on, which emits every span: fine for
low-traffic backends, too expensive for hot HTTP servers.
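Head-based sampling can usually be switched on through the standard OTEL_TRACES_SAMPLER and OTEL_TRACES_SAMPLER_ARG variables rather than in code. The sketch below only computes the two values; the helper name sampler_env and the 10% ratio are assumptions chosen for illustration.

```python
def sampler_env(ratio: float) -> dict:
    """Return sampler variables that keep roughly `ratio` of new traces."""
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must be between 0 and 1")
    if ratio == 1.0:
        # Same behaviour as the SDK default: keep every trace.
        return {"OTEL_TRACES_SAMPLER": "parentbased_always_on"}
    return {
        # Child spans follow the decision already made for their parent.
        "OTEL_TRACES_SAMPLER": "parentbased_traceidratio",
        "OTEL_TRACES_SAMPLER_ARG": str(ratio),
    }


print(sampler_env(0.1))
```

Set the returned variables in the service's environment before the SDK initializes.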
Log/trace correlation
When emitting logs from a context that has an active trace (typical inside
an instrumented HTTP handler), the SDK will automatically attach the
trace_id and span_id to log records. In Grafana's Loki view, you can
then click a log line and jump to the corresponding trace in Tempo.
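Where the IDs have to be attached by hand, for example with a plain logging setup, the sketch below shows the usual hex encoding: 32 lowercase hex characters for the trace ID and 16 for the span ID. It uses only the standard library. The function name and the sample integers are hypothetical; in an instrumented service the IDs would come from the active span context.

```python
import logging


def correlation_fields(trace_id: int, span_id: int) -> dict:
    """Hex-encode a 128-bit trace ID and a 64-bit span ID for log output."""
    return {
        "trace_id": format(trace_id, "032x"),
        "span_id": format(span_id, "016x"),
    }


logging.basicConfig(
    format="%(levelname)s %(message)s trace_id=%(trace_id)s span_id=%(span_id)s"
)
fields = correlation_fields(0x5B8EFFF798038103D269B633813FC60C, 0xEEE19B7EC3C1B174)
# `extra` adds the two IDs as attributes on the log record.
logging.getLogger("ucs-api").warning("payment retry", extra=fields)
```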
Viewing the data
Open Grafana at https://<server-address>/grafana/ (or
http://<server-address>:3030/grafana/ direct, no proxy). Default
credentials admin/admin. The Loki, Prometheus, and Tempo datasources
are pre-provisioned.
Explore (free-form queries): ☰ → Explore, then pick a datasource.
- Loki query example:
  {service_name="ucs-api"} |= "error"
- Prometheus query example:
  rate(http_server_request_duration_seconds_count{service_name="ucs-api"}[5m])
- Tempo query example: paste a trace_id directly, or use TraceQL.
Troubleshooting
Application appears to send telemetry but nothing shows in Grafana:
- Tail the collector logs for export errors:
  journalctl -u otelcol-contrib -n 50 --no-pager
- Confirm the collector accepted the data: a POST returning 200 means it was received. If the backend exporter (Loki / Tempo / Prometheus) is failing, errors appear in the collector logs.
- Verify the time window in Grafana includes your test send — by default Explore shows the last hour.
OTLP HTTP returns 415 Unsupported Media Type — make sure the
Content-Type header is application/json (not text/plain).
Trace ID lookup returns 404 in Tempo — Tempo has a brief ingest delay (seconds). Wait and retry. Also verify the trace ID is exactly 32 lowercase hex characters.
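A quick format check can rule out a malformed ID before retrying the lookup. This is a small sketch with a hypothetical helper name; it also rejects the all-zero ID, which the trace context specification treats as invalid.

```python
import re

_TRACE_ID = re.compile(r"[0-9a-f]{32}")


def is_valid_trace_id(value: str) -> bool:
    """True for exactly 32 lowercase hex characters that are not all zero."""
    return bool(_TRACE_ID.fullmatch(value)) and value != "0" * 32


print(is_valid_trace_id("5b8efff798038103d269b633813fc60c"))
print(is_valid_trace_id("5B8EFFF798038103D269B633813FC60C"))
```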
Logs show up but the severity field is empty — set both
severityText (string) and severityNumber (integer) in the OTLP log
record. The OTel SDKs handle this automatically; manual payloads need both.
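For hand-built payloads, a small lookup keeps the two fields in sync. The numbers below are the base values of each severity range in the OpenTelemetry log data model; the helper name with_severity is hypothetical.

```python
SEVERITY_NUMBER = {
    "TRACE": 1,
    "DEBUG": 5,
    "INFO": 9,
    "WARN": 13,
    "ERROR": 17,
    "FATAL": 21,
}


def with_severity(log_record: dict, severity_text: str) -> dict:
    """Return a copy of an OTLP/JSON log record with both severity fields set."""
    text = severity_text.upper()
    if text not in SEVERITY_NUMBER:
        raise ValueError(f"unknown severity: {severity_text}")
    return {
        **log_record,
        "severityText": text,
        "severityNumber": SEVERITY_NUMBER[text],
    }


print(with_severity({"body": {"stringValue": "disk almost full"}}, "warn"))
```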