Skip to content

Direct Database Monitoring

The KloudMate Agent can connect directly to your database using a dedicated monitoring user to collect deep query metrics, execution plans, and schema metadata. This approach provides the richest level of database insight — including per-query latency breakdowns, explain plans, and wait event analysis — across both self-hosted and managed database services.

  1. You create a dedicated kloudmate database user with minimal read-only permissions
  2. Helper procedures/functions (using SECURITY DEFINER) allow the agent to access query stats and run EXPLAIN without needing superuser privileges
  3. The KloudMate Agent connects using this user and periodically collects query metrics, live queries, execution plans, and schema metadata

Connect as a user with admin privileges (for RDS/Aurora, use the master user):

CREATE USER IF NOT EXISTS 'kloudmate'@'%'
  IDENTIFIED BY '';

-- Limit connections to prevent exhausting the pool
ALTER USER 'kloudmate'@'%' WITH MAX_USER_CONNECTIONS 5;
-- PROCESS: See all running threads and queries
-- REPLICATION CLIENT: Check replication lag and binary log position
-- SELECT on perf_schema: Read query digest stats, wait events, etc.

GRANT REPLICATION CLIENT ON *.* TO 'kloudmate'@'%';
GRANT PROCESS ON *.* TO 'kloudmate'@'%';
GRANT SELECT ON performance_schema.* TO 'kloudmate'@'%';

Create a dedicated schema and the helper procedures the agent uses:

CREATE SCHEMA IF NOT EXISTS kloudmate;
GRANT EXECUTE ON kloudmate.* TO 'kloudmate'@'%';

Enable Performance Schema, Consumers

On RDS/Aurora/Cloud SQL, you cannot permanently enable performance_schema consumers via config. This procedure lets the agent enable them at runtime:

DELIMITER $$
CREATE PROCEDURE IF NOT EXISTS kloudmate.enable_events_statements_consumers()
SQL SECURITY DEFINER
BEGIN
    UPDATE performance_schema.setup_consumers
       SET enabled = 'YES'
     WHERE name LIKE 'events_statements_%';

    UPDATE performance_schema.setup_consumers
       SET enabled = 'YES'
     WHERE name = 'events_waits_current';
END $$
DELIMITER ;

GRANT EXECUTE ON PROCEDURE kloudmate.enable_events_statements_consumers TO 'kloudmate'@'%';

Explain Statement

Collects JSON execution plans for sampled queries:

DELIMITER $$
CREATE PROCEDURE IF NOT EXISTS kloudmate.explain_statement(IN query TEXT)
SQL SECURITY DEFINER
BEGIN
    SET @explain := CONCAT('EXPLAIN FORMAT=json ', query);
    PREPARE stmt FROM @explain;
    EXECUTE stmt;
    DEALLOCATE PREPARE stmt;
END $$
DELIMITER ;

GRANT EXECUTE ON PROCEDURE kloudmate.explain_statement TO 'kloudmate'@'%';

Self-Hosted — add to your my.cnf / mysqld.cnf:

[mysqld]
performance_schema = ON
performance-schema-consumer-events-statements-current = ON
performance-schema-consumer-events-statements-history = ON
performance-schema-consumer-events-statements-history-long = ON
performance-schema-consumer-events-waits-current = ON
max_digest_length = 4096
performance_schema_max_digest_length = 4096
performance_schema_max_sql_text_length = 4096

To enable the KloudMate Schema Explorer, grant SELECT on the databases the agent should discover:

-- Option A: All databases
GRANT SELECT ON *.* TO 'kloudmate'@'%';

-- Option B: Specific databases only
GRANT SELECT ON your_app_db.* TO 'kloudmate'@'%';
FLUSH PRIVILEGES;

Run these as the kloudmate user:

-- Check performance_schema is enabled
SELECT @@performance_schema;

-- Check consumers are active
SELECT name, enabled FROM performance_schema.setup_consumers
 WHERE name LIKE 'events_statements_%' OR name = 'events_waits_current';

-- Check query digests are being collected
SELECT COUNT(*) FROM performance_schema.events_statements_summary_by_digest;

-- Verify procedures exist
SHOW PROCEDURE STATUS WHERE db = 'kloudmate';

-- Test explain
CALL kloudmate.explain_statement('SELECT 1');
receivers:
  kloudmate_mysql:
    endpoint: ":3306"
    username: "kloudmate"
    password: "${KLOUDMATE_DB_PASSWORD}"
    database: ""                           # empty = monitor all databases
    collect_query_metrics: true
    collect_query_samples: true
    collect_explain_plans: true
    collect_schema: true                   # requires SELECT grants (step 5)
    collection_interval: 10s
    query_samples_rate: 1                  # sample every Nth query
    tls:
      enabled: true                        # recommended for RDS
      skip_verify: false

For RDS with IAM authentication:

receivers:
  kloudmate_mysql:
    endpoint: "<RDS_ENDPOINT>:3306"
    auth_type: "iam"
    username: "kloudmate"
    aws:
      region: "ap-south-1"
      instance_endpoint: "<RDS_ENDPOINT>"

Connect as a superuser (or rds_superuser on RDS/Aurora) to the postgres database:

CREATE USER kloudmate WITH PASSWORD '';
-- pg_monitor: Built-in role (PG 10+) granting read access to
-- pg_stat_activity, pg_stat_replication, pg_stat_database, and more.
GRANT pg_monitor TO kloudmate;

pg_stat_statements is the primary data source for query metrics (latency, calls, rows):

CREATE EXTENSION IF NOT EXISTS pg_stat_statements SCHEMA public;

Self-Hosted — add to postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
track_activity_query_size = 4096

Create a dedicated schema and the SECURITY DEFINER wrapper functions:

CREATE SCHEMA IF NOT EXISTS kloudmate;
GRANT USAGE ON SCHEMA kloudmate TO kloudmate;
GRANT USAGE ON SCHEMA public TO kloudmate;

pg_stat_activity Wrapper

Ensures full visibility into all sessions’ query text (required on RDS where non-superusers have restricted access):

CREATE OR REPLACE FUNCTION kloudmate.pg_stat_activity()
RETURNS SETOF pg_stat_activity
AS $$
    SELECT * FROM pg_catalog.pg_stat_activity;
$$
LANGUAGE sql
SECURITY DEFINER;

pg_stat_statements Wrapper

Ensures full visibility into all query statistics:

CREATE OR REPLACE FUNCTION kloudmate.pg_stat_statements()
RETURNS SETOF pg_stat_statements
AS $$
    SELECT * FROM pg_stat_statements;
$$
LANGUAGE sql
SECURITY DEFINER;

Explain Statement

Collects JSON execution plans for sampled queries:

CREATE OR REPLACE FUNCTION kloudmate.explain_statement(
    l_query TEXT,
    OUT explain JSON
)
RETURNS SETOF JSON
AS $$
DECLARE
    curs REFCURSOR;
    plan JSON;
BEGIN
    OPEN curs FOR EXECUTE pg_catalog.concat('EXPLAIN (FORMAT JSON) ', l_query);
    LOOP
        FETCH curs INTO plan;
        EXIT WHEN NOT FOUND;
        RETURN QUERY SELECT plan;
    END LOOP;
    CLOSE curs;
    RETURN;
END;
$$
LANGUAGE plpgsql
SECURITY DEFINER;

Stats Reset

Safe way to reset pg_stat_statements counters without direct superuser access:

CREATE OR REPLACE FUNCTION kloudmate.pg_stat_statements_reset()
RETURNS VOID
AS $$
    SELECT pg_stat_statements_reset();
$$
LANGUAGE sql
SECURITY DEFINER;

Replication Monitoring Only needed on the primary / writer instance for monitoring replication lag:

CREATE OR REPLACE FUNCTION kloudmate.pg_stat_replication()
RETURNS SETOF pg_stat_replication
AS $$
    SELECT * FROM pg_catalog.pg_stat_replication;
$$
LANGUAGE sql
SECURITY DEFINER;

Self-Hosted — add to postgresql.conf:

shared_preload_libraries = 'pg_stat_statements'
pg_stat_statements.track = all
pg_stat_statements.max = 10000
track_activity_query_size = 4096
track_io_timing = on                     # required for I/O metrics
log_min_duration_statement = 1000        # log slow queries (ms)

For full EXPLAIN ANALYZE plans (actual execution times, not just estimates):

shared_preload_libraries = 'pg_stat_statements,auto_explain'
auto_explain.log_min_duration = 1000
auto_explain.log_analyze = true
auto_explain.log_buffers = true
auto_explain.log_timing = true
auto_explain.log_triggers = true
auto_explain.log_verbose = true
auto_explain.log_format = json
auto_explain.log_nested_statements = true
auto_explain.sample_rate = 1             # 1 = all, 0.1 = 10% sample

For each additional database (e.g. app_db, analytics_db), connect and re-run the extension + function creation:

for db in $(psql -t -c "SELECT datname FROM pg_database WHERE datistemplate = false"); do
  psql -d "$db" -f kloudmate_functions.sql
done

Run as the kloudmate user:

-- Check pg_stat_statements is loaded
SELECT count(*) FROM pg_stat_statements;

-- Check functions exist
SELECT routine_name FROM information_schema.routines
 WHERE routine_schema = 'kloudmate';

-- Test pg_stat_activity wrapper
SELECT count(*) FROM kloudmate.pg_stat_activity();

-- Test explain
SELECT * FROM kloudmate.explain_statement('SELECT 1');

-- Check track_activity_query_size
SHOW track_activity_query_size;

-- Check track_io_timing
SHOW track_io_timing;

PostgreSQL Agent Configuration

receivers:
  kloudmate_postgresql:
    endpoint: ":5432"
    username: "kloudmate"
    password: "${KLOUDMATE_DB_PASSWORD}"
    database: "postgres"                   # primary connection database
    databases_to_monitor:                  # additional databases
      - "app_db"
      - "analytics_db"
    collect_query_metrics: true
    collect_query_samples: true
    collect_explain_plans: true
    collect_schema: true
    collect_replication_stats: true
    collection_interval: 10s
    tls:
      mode: "require"                      # require | verify-ca | verify-full
      ca_cert: "/path/to/rds-ca-bundle.pem"

For RDS with IAM authentication:

receivers:
  kloudmate_postgresql:
    endpoint: "<RDS_ENDPOINT>:5432"
    auth_type: "iam"
    username: "kloudmate"
    database: "postgres"
    aws:
      region: "ap-south-1"
      instance_endpoint: "<RDS_ENDPOINT>"

DROP PROCEDURE IF EXISTS kloudmate.enable_events_statements_consumers;
DROP PROCEDURE IF EXISTS kloudmate.explain_statement;
DROP SCHEMA IF EXISTS kloudmate;
DROP USER IF EXISTS 'kloudmate'@'%';
DROP FUNCTION IF EXISTS kloudmate.pg_stat_activity();
DROP FUNCTION IF EXISTS kloudmate.pg_stat_statements();
DROP FUNCTION IF EXISTS kloudmate.explain_statement(TEXT);
DROP FUNCTION IF EXISTS kloudmate.pg_stat_statements_reset();
DROP FUNCTION IF EXISTS kloudmate.pg_stat_replication();
DROP SCHEMA IF EXISTS kloudmate;
DROP USER IF EXISTS kloudmate;