Setting Up KloudMate Alarms
KloudMate lets users create and configure alarms for events that are critical to their application. With alarms in place, users know when a metric rises above or falls below a pre-defined threshold and can take the necessary action as soon as possible. Alarms are entirely customizable and can be configured for any AWS service available in KloudMate Inventory, using the metrics associated with that service. For example, the following are some of the Lambda Function metrics for which users can set up alarms:
- Memory Used
- Coldstart Count
- Coldstart Duration
Users can create, delete, and edit their alarms in the Alarms section.
- From the home screen, click on the bell icon and navigate to the Alarms section
- The Alarms section displays a list of existing alarms (if any)
- You can view, pause, delete, or edit any of these alarms using their corresponding more-options icon
To learn about the key concepts of KloudMate Alarms, see Understanding KloudMate Alarms
To create a new alarm, click on the Create Alarm button located at the top right corner of the alarms screen. It will open the Create Alarm page which lets you add Queries and Expressions.
- You can create and set up multiple queries and expressions using the Add Query and Add Expression buttons
- You can also replicate a query or expression by clicking on the copy icon available at the top right corner of each query or expression
- For each query and expression, there will be a unique alphabetical notation associated with it, such as A, B, C, and so on.
- Choose OpenTelemetry or KloudMate as the data source in the first dropdown menu.
1.1. Setting up Query conditions for AWS:
- Set the time duration for which you want the data to be fetched, using either the dropdown menu or the custom field. For a custom time, enter the duration in seconds
- Select the Region from its corresponding dropdown menu. It specifies the AWS region of the service you want to create an alarm for
- Select the Namespace from its corresponding dropdown menu. The namespace is the service you want to create an alarm for
- Select the Metric from its corresponding dropdown menu. It specifies the metric associated with the selected namespace that you want to monitor
- Select the Statistic from its corresponding dropdown menu. It is the statistical function you want to use to calculate data points
- You can use Dimensions to configure the alarm for a group of resources in the selected namespace.
- For example, for EC2, you can choose an autoscaling group name, image id, instance type, and more in the Dimensions dropdown menu to configure the alarm for all the instances in the same dimension
- Click on the Run Query button
1.2. Setting up Query conditions for OpenTelemetry:
- Select the Data Set that you want to retrieve and work with from your OpenTelemetry data source.
- Under the Metric to aggregate option, select the metric. It specifies the metric associated with the selected data set that you want to monitor
- For the Group By option, enter the attributes to group the data points
- You can add Filters to narrow down the retrieved data points
1.3. Use Time Range Expressions for Custom Time
Alarm query time ranges support the following:
- Operators: - (Subtract time)
- Examples: now, now-5m
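As a rough illustration of how such expressions resolve to an absolute time, here is a minimal Python sketch. It is not KloudMate's parser. The function name resolve_time_expression is hypothetical, and the unit suffixes other than "m" (s, h, d) are assumptions, since only now and now-5m appear above.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed unit suffixes; only "m" (as in now-5m) is shown in the examples above.
_UNITS = {"s": "seconds", "m": "minutes", "h": "hours", "d": "days"}
_PATTERN = re.compile(r"^now(?:-(\d+)([smhd]))?$")


def resolve_time_expression(expr, now=None):
    """Resolve 'now' or 'now-<N><unit>' to an absolute UTC datetime."""
    now = now or datetime.now(timezone.utc)
    match = _PATTERN.match(expr.strip())
    if match is None:
        raise ValueError(f"unsupported time expression: {expr!r}")
    amount, unit = match.groups()
    if amount is None:
        # Plain "now": no offset to subtract.
        return now
    # The "-" operator subtracts the given duration from the current time.
    return now - timedelta(**{_UNITS[unit]: int(amount)})
```

Passing a fixed reference time as the second argument makes the result reproducible, which is convenient when checking a time range before saving an alarm.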
1.4. Setting Up Evaluation expressions:
- In an evaluation expression, the value of a previously configured query condition or expression can be passed as a parameter using its alphabetical notation, such as A, B, C, and so on. Note that an expression can be passed as a parameter only when there are multiple expressions
- For the expression type, choose from the Math expression, Reduce, or Condition expression options
- In the case of Math expression, enter the expression that you want to apply to the value of a query/expression in the given text field. For example: $A+1, $A<$B, $A && $C
- In the case of Reduce expression, use the first dropdown menu to choose the function you want to use to aggregate the values of a query/expression into a single value, and select the desired query/expression from the Input dropdown menu
Note: The following are some of the available functions:
- mean() : Get average value
- max() : Get maximum value
- min() : Get minimum value
- sum() : Get the sum of all values
- last() : Get the last value
- count() : Get the total number of values
- In the case of Condition expression, select the function and a query/expression from their respective dropdown menus. Select the desired condition from its corresponding dropdown menu and provide a value against which the condition will be evaluated
- You can add multiple conditions in a condition expression and choose how they work together using the logical operators OR and AND
- Click on the Run Queries button
For more information, see Writing Expressions for KloudMate Alarms.
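To make the Reduce and Condition steps concrete, the following Python sketch mimics what each listed function computes over a query's data points and how a condition is then checked against the reduced value. It is an illustration only, not KloudMate code. The names REDUCERS, CONDITIONS, reduce_series, and check_condition are hypothetical, as are the condition labels "is above" and "is below" and the sample values.

```python
import operator
from statistics import mean

# Plain-Python stand-ins for the Reduce functions listed above.
REDUCERS = {
    "mean": mean,                         # average value
    "max": max,                           # maximum value
    "min": min,                           # minimum value
    "sum": sum,                           # sum of all values
    "last": lambda values: values[-1],    # last value
    "count": len,                         # total number of values
}

# Hypothetical condition labels; the dropdown options in KloudMate may differ.
CONDITIONS = {"is above": operator.gt, "is below": operator.lt}


def reduce_series(values, function):
    """Collapse a list of data points into a single value."""
    return REDUCERS[function](values)


def check_condition(value, condition, threshold):
    """Compare a reduced value against a threshold."""
    return CONDITIONS[condition](value, threshold)


memory_used_mb = [120.0, 180.0, 260.0, 240.0]        # query A (sample data)
average = reduce_series(memory_used_mb, "mean")      # expression B: Reduce over A
alarm = check_condition(average, "is above", 150)    # expression C: Condition on B
```

Two conditions joined with AND or OR correspond to combining two check_condition results with Python's and / or operators.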
- For alarm condition, select the query or the expression to trigger the alarm
- In the Evaluate every (seconds) field, define how frequently the alarm condition should be evaluated
- In the Evaluate for (seconds) field, define how long the alarm condition must remain true before the alarm is triggered
- In the Alert state dropdown menus, select the alarm behavior if the query results in no data or in an error
- Click on the Preview alarms button to run the query instantaneously and check the result
- Enter the Alarm name and a description of the alarm.
- Add tags to the alarm.
- You can configure a notification policy to send notifications to a particular channel when an alarm with matching tags is triggered.
- Click on the Save button to save the alarm, or click on the Save & Close button to save the alarm and go back to the alarms section
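The interaction between the Evaluate every and Evaluate for fields can be sketched in Python as follows. This is one plausible reading of the behaviour described above, not KloudMate's actual evaluator: the condition is checked once per interval, and the alarm triggers only after the condition has stayed true for the full Evaluate for duration. The function name first_trigger_time and the sample data are hypothetical.

```python
def first_trigger_time(samples, evaluate_every, evaluate_for):
    """Return the time in seconds at which the alarm would trigger, or None.

    samples: one boolean per evaluation (True means the condition was met),
             spaced evaluate_every seconds apart, starting at time 0.
    """
    breach_started = None
    for index, breached in enumerate(samples):
        current_time = index * evaluate_every
        if not breached:
            # The condition recovered, so the waiting period starts over.
            breach_started = None
            continue
        if breach_started is None:
            breach_started = current_time
        if current_time - breach_started >= evaluate_for:
            return current_time
    return None


# Evaluate every 60 seconds, require the condition to hold for 120 seconds.
results = [False, True, True, False, True, True, True, True]
triggered_at = first_trigger_time(results, evaluate_every=60, evaluate_for=120)
```

In this sample the first breach (at 60 and 120 seconds) recovers before 120 seconds have elapsed, so the alarm triggers only during the second, longer breach.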