
Galaxy Monitoring with Telegraf and Grafana




Published: Jan 31, 2019
Last Updated: May 31, 2024

Telegraf, InfluxDB, and Grafana

General-purpose tools for monitoring systems and services.

Tool: Use
Telegraf: plugin-driven server agent for collecting and reporting metrics
InfluxDB: purpose-built time series database
Grafana: dashboard for beautiful analytics and monitoring
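To make the division of labour concrete: Telegraf collects measurements and writes them to InfluxDB, commonly as InfluxDB line protocol text, which Grafana then queries and plots. The sketch below formats a single data point in that style. The helper name `to_line_protocol` and the measurement, tag, and field names in the usage note are hypothetical, and the escaping is simplified rather than a complete implementation of the format.

```python
def _escape(value: str, chars: str) -> str:
    """Backslash-escape each character listed in `chars` (simplified)."""
    for ch in chars:
        value = value.replace(ch, "\\" + ch)
    return value


def _format_field(value) -> str:
    """Render a field value using line protocol type conventions."""
    # bool must be checked before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "true" if value else "false"
    if isinstance(value, int):
        return f"{value}i"  # integers carry an "i" suffix
    if isinstance(value, float):
        return repr(value)
    # Anything else is written as a double-quoted string.
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build one line: measurement,tag=value field=value [timestamp]."""
    if not fields:
        raise ValueError("at least one field is required")
    head = _escape(measurement, ", ")
    for key in sorted(tags):  # sorted tag keys give a stable output
        head += f",{_escape(key, ',= ')}={_escape(str(tags[key]), ',= ')}"
    body = ",".join(
        f"{_escape(key, ',= ')}={_format_field(value)}"
        for key, value in fields.items()
    )
    line = f"{head} {body}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line
```

For example, `to_line_protocol("galaxy_jobs", {"state": "running"}, {"count": 42})` yields `galaxy_jobs,state=running count=42i`. In practice Telegraf's input and output plugins produce these lines for you; the sketch only illustrates the shape of the data that ends up in InfluxDB.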



Grafana showcase

If you see a dashboard you like, you can export its configuration and import it into your own Grafana to use with your own data. Copy away!
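Copying a dashboard between Grafana instances can be sketched with Grafana's HTTP API, which serves a dashboard at `GET /api/dashboards/uid/<uid>` and accepts one at `POST /api/dashboards/db`. The function names, the URLs, and the use of bearer tokens below are assumptions for illustration; your instances may need different authentication, and data source names inside the dashboard usually have to be adapted to the target.

```python
import copy
import json
import urllib.request


def build_import_payload(exported: dict, overwrite: bool = False) -> dict:
    """Turn an exported dashboard response into a body for POST /api/dashboards/db."""
    dashboard = copy.deepcopy(exported["dashboard"])
    # Clear the numeric id so the target Grafana assigns its own.
    dashboard["id"] = None
    return {"dashboard": dashboard, "overwrite": overwrite}


def _request(url: str, token: str, payload=None) -> dict:
    """Send a JSON request; a non-None payload makes urllib issue a POST."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(
        url,
        data=data,
        headers={
            "Authorization": f"Bearer {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def copy_dashboard(source_url, source_token, target_url, target_token, uid):
    """Export dashboard `uid` from one Grafana and import it into another."""
    exported = _request(f"{source_url}/api/dashboards/uid/{uid}", source_token)
    payload = build_import_payload(exported)
    return _request(f"{target_url}/api/dashboards/db", target_token, payload)
```

The same result is available without code through the export and import options in the Grafana web interface; the sketch is mainly useful when you want to script the copy.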


Galaxy dashboard showing route timings, user counts, job counts, etc.


Node detail dashboard with filesystem usage, process states, CPU, memory, load, network, etc.


DB dashboard showing transactions, tuples fetched/modified, and index sizes for each database


User statistics page for EU with 23k users, 30k workflows, 400k histories, 13M jobs, and 30M datasets. Additional breakdowns show years of compute time on various clusters, including 1k years on the de.NBI cloud.


CVMFS dashboard showing which repositories each server supports in green and which are missing in white. About 90% of repositories are supported.


Key Points

Thank you!

This material is the result of collaborative work. Thanks to the Galaxy Training Network and all the contributors! Tutorial content is licensed under the Creative Commons Attribution 4.0 International License.