MongoDB integration with KloudMate using OpenTelemetry
MongoDB is a widely-used NoSQL database designed for scalability and flexibility. It stores data in a JSON-like format (BSON), which allows for dynamic schema design and handling of unstructured data. Known for its high performance in read and write-heavy environments, MongoDB is ideal for applications like real-time analytics, content management, and IoT systems. Available in both open-source and enterprise editions, it offers features like horizontal scaling through sharding and built-in replication for high availability.
This guide explains how to collect MongoDB server metrics using the OpenTelemetry (OTel) MongoDB metrics receiver. The receiver can also capture metrics on a per-database or per-collection basis from MongoDB instances running on AWS EC2, Azure VMs, or on-premises servers.
Pre-requisites:
- The MongoDB server must be running.
- This receiver supports MongoDB versions: 4.0+, 5.0, 6.0, 7.0
- Install the OpenTelemetry collector on the specific server that requires metric monitoring. See Installing and Configuring OpenTelemetry Collector.
- MongoDB recommends setting up a least privilege user (LPU) with the clusterMonitor role in order to collect metrics. Refer to lpu.sh for an example of how to configure these permissions. To create this user, run the following steps in the MongoDB shell, replacing USER and PASSWORD with your own values:
    $ mongosh
    > use admin
    > db.createUser({ user: "USER", pwd: "PASSWORD", roles: [{ role: "clusterMonitor", db: "admin" }] })
    > exit
Step 1: Configure the receivers to collect metrics and logs.
- To start monitoring MongoDB with the OTel Collector, configure the MongoDB receiver in the collector's configuration file:
- Linux Users: Open the file located at /etc/otelcol-contrib/config.yaml using your preferred text editor.
- Windows Users: Create a new file called config.yaml in the C:\Program Files\OpenTelemetry Collector folder. You can use Notepad or any text editor to do this.
Add the suitable extensions:
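As a minimal sketch, assuming only the collector's health check extension is needed (the choice of extension and the endpoint shown here are assumptions, not requirements stated in this guide), the extensions section of config.yaml could look like this:

```yaml
extensions:
  # health_check exposes a simple HTTP endpoint that reports collector status.
  # The address below is an assumed value; change it to suit your host.
  health_check:
    endpoint: localhost:13133
```

An extension only takes effect once it is also listed under service.extensions in the same file.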
In this step, two receivers are integrated to retrieve telemetry data from MongoDB.
- MongoDB Receiver: To retrieve metrics
- File Log Receiver: To retrieve logs
To collect metrics only, configure just the MongoDB receiver and omit the File Log receiver.
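The two receivers described above could be configured roughly as follows. This is a sketch under stated assumptions: the endpoint, the collection interval, the log file path, and the instance name filelog/mongodb are illustrative values, and USER and PASSWORD stand for the least privilege user created in the prerequisites.

```yaml
receivers:
  mongodb:
    hosts:
      - endpoint: localhost:27017    # assumed host and default MongoDB port
    username: USER                   # least privilege user from the prerequisites
    password: PASSWORD
    collection_interval: 60s         # assumed scrape interval
    tls:
      insecure: true                 # assumes TLS is not enabled on this MongoDB instance
  filelog/mongodb:
    include:
      - /var/log/mongodb/mongod.log  # assumed log location; adjust to your installation
    start_at: end                    # read only new log lines after the collector starts
```

If only metrics are required, the filelog/mongodb entry can be left out.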
Step 2: Set up the processor component to identify resource information from the host and either append or replace the resource values in the telemetry data with this information.
Please choose one of the following configuration options based on your provider (AWS EC2, Azure VM, or on-premises server).
- Server (can be on-premises, non-cloud, or cloud)
- AWS EC2:
(Optional) To retrieve AWS EC2 instance tags along with logs and metrics, associate an IAM role with the EC2 instance that grants the ec2:DescribeTags permission.
- Azure Virtual Machines:
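The three options above map onto the resource detection processor's detectors. The sketch below shows one named processor instance per option; keep only the one that matches your environment. The instance names after the slash and the tag pattern are assumptions chosen for illustration.

```yaml
processors:
  # Option 1: generic server (on-premises, non-cloud, or cloud)
  resourcedetection/system:
    detectors: [system]
    system:
      hostname_sources: [os]

  # Option 2: AWS EC2
  resourcedetection/ec2:
    detectors: [ec2]
    ec2:
      # Optional: regular expressions for the instance tags to attach.
      # Requires the ec2:DescribeTags permission. "^Name$" is an example pattern.
      tags:
        - "^Name$"

  # Option 3: Azure Virtual Machines
  resourcedetection/azure:
    detectors: [azure]
```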
Step 3: Configure the exporter and extensions, then save the configuration
Set up the KloudMate backend in the exporters section of the OpenTelemetry configuration file and configure the pipelines.
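A sketch of the exporter and pipeline wiring is shown below. The endpoint and API key are placeholders to be replaced with the values from your KloudMate account, and sending the key in an Authorization header is an assumption. The component names (mongodb, filelog/mongodb, resourcedetection/system, health_check) are also assumed and must match whatever names are defined elsewhere in your config.yaml.

```yaml
exporters:
  otlphttp:
    endpoint: "<KLOUDMATE_OTLP_ENDPOINT>"   # placeholder, not a real URL
    headers:
      Authorization: "<KLOUDMATE_API_KEY>"  # placeholder; header name is an assumption

service:
  extensions: [health_check]
  pipelines:
    metrics:
      receivers: [mongodb]
      processors: [resourcedetection/system]
      exporters: [otlphttp]
    logs:
      receivers: [filelog/mongodb]
      processors: [resourcedetection/system]
      exporters: [otlphttp]
```

For a metrics-only setup, the logs pipeline can be removed along with the File Log receiver.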
Step 4: Restart the OpenTelemetry (OTel) Collector and verify its status:
For Linux:
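- Restart the collector service and then check its status using your service manager. On systemd-based distributions this is typically sudo systemctl restart otelcol-contrib followed by sudo systemctl status otelcol-contrib, assuming the service unit is named otelcol-contrib to match the configuration path above.
These commands restart the OTel Collector and display its current status.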
For Windows:
- Open the Services window:
- Press Win + R, type services.msc, and press OK.
- Alternatively, search for "Services" in the Windows Start menu.
- In the Services window, locate the "OpenTelemetry Collector" service.
- Right-click the service and select "Restart."
Once the collector is running, monitor the metrics on the KloudMate dashboard and set up alarms to be notified when key metrics for an application cross the thresholds you define.
Name | Description | Type | Unit |
---|---|---|---|
mongodb_cache_operations | The number of cache operations of the instance. | Sum | number |
mongodb_collection_count | The number of collections. | Sum | number |
mongodb_data_size | The size of the collection. Data compression does not affect this value. | Sum | Bytes |
mongodb_connection_count | The number of connections. | Sum | number |
mongodb_extent_count | The number of extents. | Sum | number |
mongodb_global_lock_time | The time the global lock has been held. | Sum | milliseconds |
mongodb_index_count | The number of indexes. | Sum | number |
mongodb_index_size | The sum of the space allocated to all indexes in the database, including free index space. | Sum | Bytes |
mongodb_memory_usage | The amount of memory used. | Sum | Bytes |
mongodb_object_count | The number of objects. | Sum | number |
mongodb_operation_latency_time | The latency of operations. | Gauge | microseconds |
mongodb_operation_count | The number of operations executed. | Sum | number |
mongodb_operation_repl_count | The number of replicated operations executed. | Sum | number |
mongodb_storage_size | The total amount of storage allocated to this collection. | Sum | Bytes |
mongodb_database_count | The number of existing databases. | Sum | number |
mongodb_index_access_count | The number of times an index has been accessed. | Sum | number |
mongodb_document_operation_count | The number of document operations executed. | Sum | number |
mongodb_network_io_receive | The number of bytes received. | Sum | Bytes |
mongodb_network_io_transmit | The number of bytes transmitted. | Sum | Bytes |
mongodb_network_request_count | The number of requests received by the server. | Sum | number |
mongodb_operation_time | The total time spent performing operations. | Sum | milliseconds |
mongodb_session_count | The total number of active sessions. | Sum | number |
mongodb_cursor_count | The number of open cursors maintained for clients. | Sum | number |
mongodb_cursor_timeout_count | The number of cursors that have timed out. | Sum | number |
mongodb_lock_acquire_count | Number of times the lock was acquired in the specified mode. | Sum | number |
mongodb_lock_acquire_wait_count | The number of times the lock acquisitions encountered waits because the locks were held in a conflicting mode. | Sum | number |
mongodb_lock_acquire_time | Cumulative wait time for the lock acquisitions. | Sum | microseconds |
mongodb_lock_deadlock_count | The number of times the lock acquisitions encountered deadlocks. | Sum | number |
mongodb_health | The health status of the server. | Gauge | number |
mongodb_uptime | The amount of time that the server has been running. | Sum | milliseconds |