Alerting

FusionReactor Cloud Alerting allows you to create automated checks on the values of metric data gathered by a FusionReactor agent, and generate alerts when user-made conditions are met. The alerting system consists of three parts:

  • Alerts are generated if the conditions of a check are met.
  • Checks are the conditions that are processed by the alerting engine.
  • Subscriptions can be added to a check to notify via other services when an alert is generated.

The alerting area can be found by clicking Alerting in the navigation bar, while alerting service configurations can be found by clicking Configurations. You must set up alerting services before you can use them to create Subscriptions.

!Screenshot

Alerts

!Screenshot

This is the area where all alerts can be viewed and dismissed.

Here you will see all the alerts for your account made by the alerting engine. There are various options to filter the alert log, found in the filter bar at the top left of the screen. The filters can be reset by clicking the broom icon on the opposite side of the screen.

When a check has been processed, it can fall into one of the following states:

!Screenshot

This panel can be viewed by hovering over the wand icon in the top right of the Alerts and Checks pages.

When a check falls into or out of the OK, WARNING or ERROR state, by default the alert is sent to every subscription on that check. However, you may change this for each subscription in the subscription's settings.

Viewing alerts

!Screenshot

You can click on View at the end of a row to see a more detailed overview of the alert listed.

Depending on the type of event that happened, different information will be available from the alert details.

If the settings of a check are changed, the changes will be shown as well as the user who made the changes.

For both status and threshold checks, you are likely to be provided with a list of the entities involved in the check, and their states before and after the time of the alert.

If a threshold check has an error event, details will be provided showing the check period where the change in state happened as a graph.

NOTE: If the check was deleted, the graph will be unavailable.

Dismissing alerts

Alerts can be dismissed by clicking the dismiss button on the same row as the alert, or in the bottom corner of the detailed alert view.

You can also dismiss them in bulk. This is done using the eye icon found in the top right of the screen. Clicking on the eye will reveal 3 options:

  • All alerts
    • Dismisses all alerts.
  • Filtered alerts
    • Dismisses all alerts currently visible under the filter settings. If there are no active filters, it will behave the same as All alerts.
  • Alerts from deleted checks
    • Dismisses all alerts that came from checks which no longer exist.

Checks

All checks on your account can be viewed, edited and deleted under the Checks tab (shown below), where you can also toggle their enabled state. You may view a check in more detail by clicking its name on the left side of the row.

!Screenshot

To start receiving alerts, you must first create and enable checks. To create a check, click + Check in the top-right corner of the screen.

!Screenshot

When creating a check, it is important to note that the alerting engine runs every 60 seconds. However, if data is already available for the time period a check watches over, the check will be processed immediately.

There are two types of check:

  • Threshold check
  • Status check

Section 1 of the check creation form lets you choose between them, as well as providing a name and description. The rest of the form will change depending on the type of check you wish to configure.

Threshold check

!Screenshot

Threshold checks are used to generate an alert when a metric value crosses a defined threshold.

Configure your threshold check

In this section you choose which instances, groups or applications you wish to have checked. Your choice will affect the available settings for triggering an alert under the check conditions.

Set threshold check conditions

The dropdown menu in this section lets you pick which metric you want to have checked by the alerting engine. You may then set a warning and an error threshold. The warning threshold is disabled by default and can be enabled via the tickbox.

Some metrics will ask for a percentage threshold and provide sliders, while more open metrics will simply let you enter threshold values.

Instances

Allows you to check against instance metrics such as CPU and Memory usage.

  • a single one of
    • Alert if any one chosen instance meets the threshold for the given amount of time
  • all
    • Alert if all chosen instances meet the threshold for the given amount of time
  • a count of
    • Alert if a given number of chosen instances meet the threshold for the given amount of time
  • an average of
    • Alert if the average value between chosen instances meets the threshold for the given amount of time
Groups

Allows you to check against instance metrics such as CPU and Memory usage. You can also choose to have an aggregate function applied to the data. This function will, for each instance, add together all of its data-points and divide them by the size of the group.

  • a single one of
    • Alert if any one group member meets the threshold for the given amount of time
  • all
    • Alert if all group members meet the threshold for the given amount of time
  • a count of
    • Alert if a given number of group members meet the threshold for the given amount of time
  • an average of
    • Alert if the average value between group members meets the threshold for the given amount of time
Applications

Allows you to check against the application metrics such as request statuses, or throughput per minute.

  • a single one of
    • Alert if any one application meets the threshold for the given amount of time
  • all
    • Alert if all applications meet the threshold for the given amount of time
  • a count of
    • Alert if a given number of applications meet the threshold for the given amount of time
  • an average of
    • Alert if the average value between applications meets the threshold for the given amount of time
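The four modes listed above for instances, groups and applications can be sketched in a few lines of Python. This is only an illustration of the descriptions, not the alerting engine's implementation: the function names are hypothetical, "meets the threshold" is assumed to mean "at or above", and "for the given amount of time" is assumed to mean that every sample in the check period meets it.

```python
from statistics import mean
from typing import Dict, List


def breaches(samples: List[float], threshold: float) -> bool:
    """Assumption: a series breaches when every sample is at or above the threshold."""
    return bool(samples) and all(value >= threshold for value in samples)


def evaluate(mode: str, series: Dict[str, List[float]],
             threshold: float, count: int = 1) -> bool:
    """Decide whether a check would alert under the given mode.

    `series` maps an entity name (instance, group member or application)
    to its metric samples within the check period.
    """
    if not series:
        return False
    breaching = sum(breaches(s, threshold) for s in series.values())
    if mode == "a single one of":
        return breaching >= 1
    if mode == "all":
        return breaching == len(series)
    if mode == "a count of":
        return breaching >= count
    if mode == "an average of":
        # Assumption: average the entities sample by sample,
        # then test the averaged series against the threshold.
        averaged = [mean(values) for values in zip(*series.values())]
        return breaches(averaged, threshold)
    raise ValueError(f"unknown mode: {mode}")


cpu = {
    "instance-a": [92.0, 95.0, 97.0],
    "instance-b": [40.0, 45.0, 50.0],
}
print(evaluate("a single one of", cpu, 90.0))  # True: instance-a stays above 90
print(evaluate("all", cpu, 90.0))              # False: instance-b never reaches 90
print(evaluate("an average of", cpu, 60.0))    # True: averages are 66, 70 and 73.5
```

The example shows why the same data can alert under one mode and stay quiet under another, which is worth keeping in mind when choosing between "a single one of" and "an average of" for noisy metrics.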

The conditions set will only check data within a given time period. By default this is 5 minutes, but it can be set to anything up to an hour. You may also tick a checkbox on the row below to enable a cooldown period for alerts, which can likewise be set between 5 minutes and an hour. During the cooldown period, the check's state will still be processed by the alerting engine, but will not generate any alerts.
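The cooldown behaviour can be pictured as a gate in front of alert generation: state changes are still evaluated, but alerts are suppressed until the cooldown has elapsed. The sketch below is a minimal model of that idea. The class name `CooldownGate` is hypothetical, and it assumes the cooldown is measured from the last alert that was generated, which the text above does not confirm.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CooldownGate:
    cooldown_seconds: int
    last_alert_at: Optional[float] = None  # time of the last generated alert

    def should_alert(self, state_changed: bool, now: float) -> bool:
        """Return True if an alert should be generated at time `now` (seconds)."""
        if not state_changed:
            return False
        in_cooldown = (
            self.last_alert_at is not None
            and now - self.last_alert_at < self.cooldown_seconds
        )
        if in_cooldown:
            return False  # the check was processed, but the alert is suppressed
        self.last_alert_at = now
        return True


gate = CooldownGate(cooldown_seconds=300)    # 5 minute cooldown
print(gate.should_alert(True, now=0.0))      # True: first alert
print(gate.should_alert(True, now=120.0))    # False: inside the cooldown
print(gate.should_alert(True, now=300.0))    # True: cooldown has elapsed
```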

Status check

!Screenshot

Status checks are used to generate an alert when the status (online / offline) of a server or group of servers changes.

Configure your status check

In this section you choose which instances or groups you wish to check the status of. Your choice will affect the available settings for triggering an alert under the check conditions.

Set status check conditions

Instances
  • my selected instances
    • You may choose the length of time in which they can be offline before changing to the Error state.
    • A cooldown period may be set in which the check will still be processed by the alerting engine, but not trigger any alerts.
  • any instances
    • Disregards the instances selected above, and instead checks all instances on the account.
    • Allows you to set an exclusion period, in which instances offline for longer than the set time are no longer considered an error.
    • A cooldown period may be set in which the check will still be processed by the alerting engine, but not trigger any alerts.
Groups
  • All options allow you to set an exclusion period, in which instances offline for longer than the set time are no longer considered an error.
  • All options can have a cooldown period in which the check will still be processed by the alerting engine, but not trigger any alerts.
  • a single
    • Alert if any one instance in the group is offline for the given amount of time.
  • a count of
    • Alert if a given number of instances are offline for the given amount of time.
  • a percentage of
    • Alert if the given percentage of instances are offline for the given amount of time.
  • all
    • Alert if all instances in the group are offline for the given amount of time.
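The group options above, together with the exclusion period, can be sketched as follows. This is an illustrative model only: the function name, the short mode names and the data layout are hypothetical, and it assumes an instance counts as offline once it has been down for at least the configured time and stops counting once it has been down for longer than the exclusion period.

```python
from typing import Dict, Optional


def group_status_alert(mode: str, offline_seconds: Dict[str, float],
                       min_offline: float, value: float = 0,
                       exclusion: Optional[float] = None) -> bool:
    """Decide whether a group status check would alert.

    `offline_seconds` maps an instance name to how long it has been
    offline, in seconds (0 means the instance is online).
    """
    total = len(offline_seconds)
    if total == 0:
        return False
    down = sum(
        1
        for secs in offline_seconds.values()
        if secs >= min_offline and (exclusion is None or secs <= exclusion)
    )
    if mode == "single":
        return down >= 1
    if mode == "count":
        return down >= value
    if mode == "percentage":
        return 100.0 * down / total >= value
    if mode == "all":
        return down == total
    raise ValueError(f"unknown mode: {mode}")


group = {"web-1": 0, "web-2": 400, "web-3": 90000}

# Offline for 5 minutes counts; offline for more than a day is excluded.
print(group_status_alert("single", group, 300, exclusion=86400))          # True
print(group_status_alert("count", group, 300, value=2, exclusion=86400))  # False
print(group_status_alert("count", group, 300, value=2))                   # True
```

In the example, web-3 has been offline for more than a day, so with the exclusion period set it no longer counts towards the alert.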

Add subscriptions

You can add configured subscriptions to the check, so that when an alert is triggered, the subscribed services will also receive it. The configuration link on the right of this section will let you set up a new subscription without losing your current settings for a new check.

Preview and save

The final section shows a preview of the check that has been created on a graph with the metric being checked. This graph will only appear once you have a valid configuration, or will otherwise be hidden.

!Screenshot

Editing and duplicating checks

Editing a check will provide you with a form identical to the form for creating one; the difference being that you will see the configuration of the check already there, and when you save it the original check is overwritten.

Duplicate similarly shows the same form with the check's settings already filled-in. Clicking save when duplicating will create a new check instead.

Deleting and disabling checks

Clicking delete will bring up a confirmation pop-up. Choosing Ok will remove the check. If a check is deleted, an alert will be generated to show that it changed to the deleted state. Once a check is deleted, any alerts that came from it will no longer have full details and can be dismissed in bulk.

If the check is enabled, its toggle will appear blue. Disabled checks will not be processed by the alerting engine, and instead enter a Disabled state. When a check enters or leaves the disabled state, an alert is generated to show the event.

Alerting services

Subscriptions require a service they can send their subscribed alerts to. These are set up in the configuration menu under Alert services.

!Screenshot

The following alerting services are currently supported:

  • Email
  • Flowdock
  • HTTP (Webhook)
  • OpsGenie
  • PagerDuty
  • Pushover
  • Slack
  • VictorOps

Configuring alerting services

Service configurations can be found under alert services in the Configuration menu. From there you can configure the services you wish to use when creating subscriptions.

To configure a service simply click the Configuration button next to that service.

!Screenshot

A menu like the one shown above will appear. The required information differs for each service, and is explained in more detail in the sections below:

Email
  • Simply enable the service.
  • Email addresses are set on the subscriptions themselves.
Flowdock

More information: https://www.flowdock.com/

  • From your main Flowdock screen (after logging in), click your username on the navigation bar and select 'Account'
  • Select 'API tokens' on the left navigation
  • Scroll down until you see 'Flow API tokens' heading
  • Select the inbox you wish to use and take the API Token provided
HTTP (Webhook)
  • Enter the Webhook URL in the individual services after creating an alert check.
OpsGenie

More information: https://www.opsgenie.com/

  • After you log in to OpsGenie, click 'Integrations' on the left navigation bar
  • Select the 'Configure Integrations' tab
  • Select Default API (Rest API HTTPS over JSON)
  • Use the API Key listed on that page.
PagerDuty

More information: https://www.pagerduty.com/

  • After logging into the PagerDuty service, you can select 'Configuration' on the top navigation bar
  • Here you can create a service to represent your FRCloud alerts
  • View a service and select the 'Settings' tab
  • Provide us with the integration key
Pushover

More information: https://pushover.net/

  • After you log in to Pushover, select 'Register an Application'
  • Enter the details required
  • After redirection, use 'API Token/Key'
Slack

More information: https://slack.com/

NOTE: This is due to be updated in the future to allow for the non-legacy tokens.

VictorOps

More information: https://portal.victorops.com

  • Sign into VictorOps and click 'Settings' on the top navigation bar
  • Select the 'Integrations' tab
  • Select 'REST Endpoint' on the list to the right
  • Select 'Enable Integration'
  • Provide the Post URL

Disabling alerting services

Clicking the toggle next to the service will switch it between being enabled or disabled.

When a service configuration is disabled, the following will happen:

  • Each subscription that is linked to that service will be disabled. You will need to manually re-enable them after re-enabling the service.
    • You cannot re-enable the subscriptions until the service is enabled.
  • The settings saved on that service will be kept, so no extra set-up is required when re-enabling them.

Resetting alerting services

!Screenshot

You can reset the configuration for a service by clicking the Reset button next to that service.

When a service configuration is reset, the following will happen:

  • Each subscription that is linked to that service will be disabled. You will need to manually re-enable them after re-configuring the service.
    • You cannot re-enable the subscriptions until the service is configured and enabled.
  • The settings saved on that service will be deleted, so any API keys and other information will need to be entered again when re-configuring them.

Subscriptions

The subscriptions tab lets you manage and test your subscriptions.

!Screenshot

You may view a subscription in more detail by clicking its name on the left-side of the row.

Creating subscriptions

Clicking the button shown below will open the form for creating subscriptions.

!Screenshot

Within this form you will be able to set the active times and days of the subscription, as well as which alert states it watches for.

!Screenshot

Messages will only be sent to the subscribed service if those conditions are met by an alert. Each service has different properties that may be configured, which are explained in more detail below.
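The way active days, active times and alert states combine can be modelled as a simple filter. The sketch below is illustrative only: the `Subscription` class and its field names are hypothetical, time zones are ignored, and active windows that span midnight are not handled.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import Set


@dataclass
class Subscription:
    active_days: Set[int]   # Monday=0 .. Sunday=6, as in datetime.weekday()
    active_from: time
    active_until: time
    states: Set[str]        # alert states this subscription watches for

    def accepts(self, state: str, when: datetime) -> bool:
        """True if an alert with this state, raised at `when`, should be sent."""
        return (
            state in self.states
            and when.weekday() in self.active_days
            and self.active_from <= when.time() <= self.active_until
        )


office_hours = Subscription(
    active_days={0, 1, 2, 3, 4},
    active_from=time(9, 0),
    active_until=time(17, 30),
    states={"WARNING", "ERROR"},
)

print(office_hours.accepts("ERROR", datetime(2024, 1, 8, 10, 15)))   # True: Monday morning
print(office_hours.accepts("ERROR", datetime(2024, 1, 13, 10, 15)))  # False: Saturday
print(office_hours.accepts("OK", datetime(2024, 1, 8, 10, 15)))      # False: state not watched
```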

Email

  • Subject is for the email subject, which is "FusionReactor alert" by default.
  • Addresses is where you list the email addresses of recipients for the subscription.
  • key and value fields let you enter internet/email message headers as key:value pairs.
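To illustrate what the key and value fields mean, the sketch below builds a message with Python's standard `email` package. FusionReactor Cloud sends the email itself, so this is not something you need to run; it only shows how key:value pairs end up as headers on a message. The addresses and header values are made-up examples, and `build_alert_email` is a hypothetical helper.

```python
from email.message import EmailMessage
from typing import Dict, List, Optional


def build_alert_email(addresses: List[str],
                      subject: str = "FusionReactor alert",
                      headers: Optional[Dict[str, str]] = None,
                      body: str = "") -> EmailMessage:
    """Build a message with the recipients, subject and extra headers set."""
    msg = EmailMessage()
    msg["Subject"] = subject
    msg["To"] = ", ".join(addresses)
    for key, value in (headers or {}).items():
        msg[key] = value  # each key:value pair becomes one message header
    msg.set_content(body)
    return msg


msg = build_alert_email(
    ["ops@example.com", "oncall@example.com"],
    headers={"X-Priority": "1", "Reply-To": "noreply@example.com"},
    body="CPU check entered the ERROR state.",
)
print(msg.as_string())
```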

GitLab

GitLab supports creating issues via email. Follow this guide to find out how.

Flowdock

Flowdock doesn't require any additional configuration beyond the service configuration.

HTTP Webhooks

  • Target URL is required, as it is the URL that the alerts will be sent to. Your endpoint should provide this URL, which can be copied and pasted into the Target URL field.
  • Header is for the internet message headers attached to requests.
  • Body is for the body of the request.

What a webhook endpoint requires will vary, so be sure to check the documentation of your endpoint to see what schema it follows. Then you can fill in the Body and Header to fit your needs.
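Before pasting a Body and Header into the subscription, it can help to try them against your endpoint yourself. The sketch below builds such a request with Python's standard library. The URL, the `Authorization` header and the body are placeholders for whatever your endpoint expects, and `build_webhook_request` is a hypothetical helper; the exact request FusionReactor Cloud sends may differ.

```python
import json
import urllib.request
from typing import Dict, Optional


def build_webhook_request(target_url: str, body: dict,
                          headers: Optional[Dict[str, str]] = None
                          ) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    data = json.dumps(body).encode("utf-8")
    all_headers = {"Content-Type": "application/json"}
    all_headers.update(headers or {})
    return urllib.request.Request(
        target_url, data=data, headers=all_headers, method="POST"
    )


req = build_webhook_request(
    "https://example.com/hooks/alerts",           # placeholder Target URL
    {"text": "FusionReactor Cloud test alert"},   # placeholder Body
    {"Authorization": "Bearer <token>"},          # placeholder Header
)
print(req.get_method(), req.full_url)

# To actually send the request, uncomment the next line:
# urllib.request.urlopen(req, timeout=10)
```

If the endpoint accepts the request, the same Body and Header values are a reasonable starting point for the subscription form.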

Microsoft Teams

To set up alerting for MS Teams, you will need to set up a Connector. Follow this guide to find out how. When done, it should provide the URL for your webhook.

Once you have the URL, you can enter it into your subscription as the Target URL, as mentioned above for webhooks.

Below is an example of the JSON schema used for MS Teams message cards. Copy and paste this into the Body of the webhook, or create your own. For more info on creating your own MessageCard follow this guide.

{
    "@type": "MessageCard",
    "@context": "https://schema.org/extensions",
    "summary": "FusionReactor Cloud Alert",
    "themeColor": "479eff",
    "sections": [
        {
            "activityTitle": "FusionReactor Cloud",
            "activitySubtitle" : "My alert title",
            "activityImage": "https://www.fusion-reactor.com/wp-content/uploads/2013/01/frico1.png",
            "text": "This text is here as a placeholder example of the webhook schema for MS Teams."
        }
    ],
    "potentialAction": [
        {
            "@type": "OpenUri",
            "name": "View alerts",
            "targets": [
                {
                    "os": "default",
                    "uri": "https://app.fusionreactor.io/alerting/alerts"
                }
            ]
        }
    ]
}

That looks like this:

!Screenshot
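If you want to change the title or text of the card, generating the Body with a short script avoids hand-editing JSON and the quoting mistakes that come with it. The sketch below reproduces the structure of the example above; `make_teams_card` is a hypothetical helper, and the image field is left out for brevity.

```python
import json


def make_teams_card(title: str, text: str,
                    alerts_url: str = "https://app.fusionreactor.io/alerting/alerts"
                    ) -> str:
    """Return a MessageCard body as a JSON string, ready to paste into Body."""
    card = {
        "@type": "MessageCard",
        "@context": "https://schema.org/extensions",
        "summary": "FusionReactor Cloud Alert",
        "themeColor": "479eff",
        "sections": [
            {
                "activityTitle": "FusionReactor Cloud",
                "activitySubtitle": title,
                "text": text,
            }
        ],
        "potentialAction": [
            {
                "@type": "OpenUri",
                "name": "View alerts",
                "targets": [{"os": "default", "uri": alerts_url}],
            }
        ],
    }
    # json.dumps escapes quotes and other special characters in the text.
    return json.dumps(card, indent=4)


body = make_teams_card("High CPU", 'Check "CPU usage" entered the ERROR state.')
print(body)
```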

OpsGenie

  • Alias is the client-defined identifier of the alert, which is also the key element of alert de-duplication. It is optional and has a limit of 512 characters.

For more information see this guide.
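Because the alias drives de-duplication, it is worth choosing one that is stable for the same check and stays within the 512 character limit. One possible way to derive such a value is sketched below. The naming scheme and the `make_alias` helper are assumptions for illustration, not something OpsGenie or FusionReactor prescribes.

```python
import hashlib

MAX_ALIAS_LENGTH = 512  # limit stated for the Alias field


def make_alias(*parts: str) -> str:
    """Join the parts into a stable alias no longer than 512 characters."""
    alias = ":".join(p.strip().lower().replace(" ", "-") for p in parts)
    if len(alias) <= MAX_ALIAS_LENGTH:
        return alias
    # Too long: truncate and append a short hash so distinct inputs stay distinct.
    digest = hashlib.sha256(alias.encode("utf-8")).hexdigest()[:16]
    return alias[: MAX_ALIAS_LENGTH - 17] + "-" + digest


print(make_alias("FusionReactor", "High CPU"))  # fusionreactor:high-cpu
```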

PagerDuty

  • Description is for the body of the request. This field is required.
  • Incident Key is used by the PagerDuty Incident Creation API as a unique identifier for incidents. One will be generated for you if you do not provide one.

For more information see this guide.

Pushover

  • Title is required, and will be the name of the alert created.
  • User/Group Key is where you put your Pushover user or group key.
  • Priority sets the Message Priority in Pushover.

For more information see this guide.

Slack

  • Channel is a required field, and takes the name of the channel you want alerts to appear in. Channels prefixed with a # in Slack will require the # here.
  • Icon URL is for a URL link to an image. Any image format supported by Slack for profile pictures can be used here. The alert will appear with the linked image as its picture. Defaults to our logo.

For more information see this guide.

VictorOps

  • Entity ID is used to identify incidents in VictorOps. An ID is automatically generated if not provided.
  • Entity Display Name is the name used to display your incidents in a timeline, with the intention of being more human-readable than an ID.

For more information see this guide.

Editing and duplicating subscriptions

Clicking the edit button on the row of the subscription you wish to edit will present you with an interface just like the one used to create subscriptions. The values shown will match what's currently set on the subscription. Once you've made your desired changes, click save and your changes will overwrite the existing subscription.

If you don't wish to overwrite the existing subscription you can click the duplicate button. The values shown will match what's currently set on the subscription you chose. You may make changes before saving it. When you click save at the end of the form, a new subscription will be made.

Deleting and disabling subscriptions

Clicking the delete button will cause a message to appear confirming that you wish to delete the subscription. Choosing Ok will delete the subscription from your account.

NOTE: This cannot be undone.

Clicking the toggle under 'Enabled' will enable or disable the subscription. The toggle will appear blue when enabled. While disabled, checks will not send alerts to that subscription and the subscription will not appear when looking at a check in the detailed view. If the alert service used by the subscription is not configured, or is disabled, the subscription will be automatically disabled and cannot be enabled until the service is configured and enabled.