# How to Monitor a Teamscale Instance

To check and monitor the state of a Teamscale instance, the following options are available:

# Monitoring via Web UI

Teamscale provides many helpful metrics via the System Information view in the System perspective. In addition, the logs (e.g., Worker Log) can be helpful for diagnosis.

# API Endpoint for Nagios

The URLs api/health-check and api/health-metrics provide check results and metrics in the Nagios format. They can be used with Nagios or compatible solutions, such as Sensu, to monitor the current health status of Teamscale. For command-line integration, see the monitoring directory in your Teamscale distribution.

Exposed Metrics

General information about the instance

  • State of the scheduler
  • If the instance has run out of memory
  • If the instance is running out of disk space
  • If the Java VM is running out of memory
  • If the license is valid or outdated
  • If the server certificate is valid
  • The version of Teamscale
  • Number of workers
  • The load of the worker
  • Number of licensed users
  • Number of users
  • Number of active users in specific time frames
  • Number of committers in specific time frames
  • Number of projects
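
As an illustration of how a monitoring script might consume such a check result, the following minimal Python sketch parses one line in the conventional Nagios plugin output layout (status text, optionally followed by `|` and performance data). The exact text returned by api/health-check is not shown here, so the sample line, the labels in it, and the function name `parse_nagios_line` are assumptions made for this example.

```python
import re
from typing import Dict, Tuple

# Conventional Nagios states and their plugin exit codes.
NAGIOS_STATES = {"OK": 0, "WARNING": 1, "CRITICAL": 2, "UNKNOWN": 3}

# One performance data item: label=value, where the label may be single-quoted.
# Units and the ;warn;crit;min;max thresholds after the value are ignored.
_PERF_ITEM = re.compile(r"(?:'([^']+)'|([^\s=']+))=(-?\d+(?:\.\d+)?)")
_STATE = re.compile(r"\b(OK|WARNING|CRITICAL|UNKNOWN)\b")


def parse_nagios_line(line: str) -> Tuple[str, str, Dict[str, float]]:
    """Split a plugin output line into (state, message, perfdata values)."""
    message, _, perf = line.partition("|")
    message = message.strip()
    # Use the first state keyword in the message; fall back to UNKNOWN.
    found = _STATE.search(message)
    state = found.group(1) if found else "UNKNOWN"
    values = {}
    for quoted, plain, number in _PERF_ITEM.findall(perf):
        values[quoted or plain] = float(number)
    return state, message, values


if __name__ == "__main__":
    # Hypothetical sample line, not actual Teamscale output.
    sample = "OK: scheduler running | workers=4;;;0; load=0.75;0.9;1.0;0;"
    state, message, values = parse_nagios_line(sample)
    print(state, NAGIOS_STATES[state], values)
```

A wrapper script could exit with `NAGIOS_STATES[state]` so that Nagios-compatible tools pick up the status through the exit code.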

# API Endpoint for Prometheus

The URL api/monitoring/prometheus exposes various metrics of Teamscale in the Prometheus format. To enable this service, the environment variable TS_PROMETHEUS_ENABLED must be set to true. The metrics can be protected further by setting a secret token in the environment variable TS_PROMETHEUS_TOKEN. If this variable is set, the URL becomes api/monitoring/prometheus?token=<secret-token>.
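
The URL scheme described above can be sketched in Python as follows. The host name `teamscale.example.org` and the helper names `prometheus_url` and `fetch_metrics` are hypothetical. Whether the endpoint needs further authentication depends on the instance and is not covered here.

```python
from typing import Optional
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen


def prometheus_url(base_url: str, token: Optional[str] = None) -> str:
    """Build the metrics URL, appending ?token=... only if a token is given."""
    url = urljoin(base_url.rstrip("/") + "/", "api/monitoring/prometheus")
    if token:
        # urlencode escapes characters that are not safe in a query string.
        url += "?" + urlencode({"token": token})
    return url


def fetch_metrics(base_url: str, token: Optional[str] = None,
                  timeout: float = 10.0) -> str:
    """Download the raw metrics text (performs a network request)."""
    with urlopen(prometheus_url(base_url, token), timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Hypothetical host and token, for illustration only.
    print(prometheus_url("https://teamscale.example.org", "my-secret"))
```

Because the token travels in the query string, it can end up in proxy and access logs, so serving the endpoint over HTTPS is advisable.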

Exposed Metrics

General information about the instance

  • Name of the instance
  • Name of the process
  • The version of Teamscale
  • If the instance is in shadow mode
  • Number of workers
  • The load of the worker
  • Number of licensed users
  • Number of users
  • Number of active users in specific time frames
  • Number of committers in specific time frames
  • Number of projects
  • CPU load
  • Number of logical CPU cores
  • Size of the RAM
  • Amount of used RAM
  • Used RAM of the Java VM
  • Statistics of the Internal String Abbreviator Cache

Metrics for each project.

  • Primary public ID of the project
  • Number of connectors in specific states
  • Number of files
  • Number of lines of code
  • Number of commits in specific states
  • Number of the different log entries

Storage performance metrics. These metrics are disabled by default because they are expensive to collect and are only useful for debugging. To enable them, set the flag -Dcom.teamscale.storage-metrics.enabled=true. More information on how to set a flag can be found here.

  • Number of opening operations for a store
  • Number of storage operations
  • Number of overall affected keys in storage operations
  • Number of overall bytes in keys
  • Number of overall bytes in values
  • Duration in milliseconds of storage operations
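
Metrics like those listed above arrive in the Prometheus text exposition format: one sample per line, with optional labels in braces. The following minimal Python sketch reads such text outside of a Prometheus server, for example in an ad-hoc health script. It is a simplified parser that does not unescape label values. The metric and label names in the sample are invented for illustration and are not the names Teamscale actually exposes.

```python
import re
from typing import Dict, List, Tuple

Sample = Tuple[str, Dict[str, str], float]

# metric_name{labels} value [timestamp]
_SAMPLE = re.compile(
    r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)(?:\s+-?\d+)?$'
)
# label="value" pairs; escaped characters are kept as written.
_LABEL = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="((?:[^"\\]|\\.)*)"')


def parse_exposition(text: str) -> List[Sample]:
    """Return (name, labels, value) for every sample line in the text."""
    samples: List[Sample] = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and # HELP / # TYPE comments.
        if not line or line.startswith("#"):
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue  # Ignore lines this simplified parser cannot read.
        name, label_text, value = match.groups()
        labels = dict(_LABEL.findall(label_text or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    # Hypothetical metric names, not actual Teamscale output.
    demo = (
        "# TYPE demo_workers gauge\n"
        "demo_workers 4\n"
        'demo_project_files{project="demo"} 1234\n'
    )
    for name, labels, value in parse_exposition(demo):
        print(name, labels, value)
```

For production monitoring, a Prometheus server or an official client library is the more robust choice, since they handle the full format.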

# Forwarding Teamscale Logs to Splunk

Teamscale uses the Log4J logging framework and supports forwarding the generated logs to a Splunk server. Log forwarding can be configured using the Splunk logging for Java integration. Teamscale fully supports HTTP Event Collector (recommended) and TCP data inputs. See the default log4j2.yaml Log4J configuration file in the Teamscale distribution ZIP for an example configuration. For further configuration options, refer to the official How to use Splunk logging for Java page. To reduce load on the Splunk server, consider increasing the batch_interval, batch_size_bytes, or batch_size_count settings of the Log4J appender for Splunk so that logs are forwarded less frequently.