Appendix¶

A. List of all supported detection rules:¶

1. Percentage Rule (type: PERCENTAGE_RULE)¶

Compares the current time series to a baseline. If the percentage change is above a certain threshold, it is detected as an anomaly.

Example:

rules:
- detection:
    - name: detection_rule_1
      type: PERCENTAGE_RULE
      params:
        offset: wo1w
        percentageChange: 0.1
        pattern: UP_OR_DOWN

Parameters:

param	description	default value	supported values
offset	The baseline time series to compare with.	wo1w	* hoXh hour-over-hour data points with a lag of X hours * doXd day-over-day data points with a lag of X days * woXw week-over-week data points with a lag of X weeks * moXm month-over-month data points with a lag of X months * meanXU average of data points from the past X units (hour, day, month, week), with a lag of 1 unit * medianXU median of data points from the past X units (hour, day, month, week), with a lag of 1 unit * minXU minimum of data points from the past X units (hour, day, month, week), with a lag of 1 unit * maxXU maximum of data points from the past X units (hour, day, month, week), with a lag of 1 unit
percentageChange	The percentage threshold. If the percentage change is above this threshold, detect it as an anomaly.	NaN	double values. NaN means no threshold set.
pattern	Whether to detect an anomaly when the metric drops, rises, or moves in either direction.	UP_OR_DOWN	UP: detect as an anomaly only if the current time series is above the baseline. DOWN: detect as an anomaly only if the current time series is below the baseline. UP_OR_DOWN: detect as an anomaly in both directions.
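
The comparison described in this section can be sketched in a few lines of Python. This is an illustration only, not ThirdEye's implementation: the function name `is_percentage_anomaly` is hypothetical, and the sketch assumes the change is computed as (current - baseline) / baseline for a single pair of values.

```python
import math


def is_percentage_anomaly(current, baseline, percentage_change, pattern="UP_OR_DOWN"):
    """Return True if `current` deviates from `baseline` by more than the threshold.

    Hypothetical helper; assumes change = (current - baseline) / baseline.
    """
    if math.isnan(percentage_change) or baseline == 0:
        # NaN means no threshold is set; a zero baseline has no defined percentage change.
        return False
    change = (current - baseline) / baseline
    if pattern == "UP":
        return change > percentage_change
    if pattern == "DOWN":
        return change < -percentage_change
    return abs(change) > percentage_change  # UP_OR_DOWN


# With offset wo1w, the baseline would be the value from one week earlier.
print(is_percentage_anomaly(120.0, 100.0, 0.1))                  # True
print(is_percentage_anomaly(120.0, 100.0, 0.1, pattern="DOWN"))  # False
```

A 20% rise exceeds the 10% threshold, so the first call reports an anomaly; with pattern DOWN the same rise is ignored.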

2. Threshold Rule (type: THRESHOLD)¶

If the metric is above the max threshold or below the min threshold, detect it as an anomaly.

Example:

rules:
- detection:
    - name: detection_rule_1
      type: THRESHOLD
      params:
        max: 1000
        min: NaN

Parameters:

params	description	default value	supported values
max	If the metric goes above this value, detect it as an anomaly.	NaN	double values. NaN means no threshold set.
min	If the metric goes below this value, detect it as an anomaly.	NaN	double values. NaN means no threshold set.
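
As a rough illustration of how an unset (NaN) bound behaves, the sketch below uses a hypothetical helper, `is_threshold_anomaly`. It relies on the fact that any comparison with NaN evaluates to false, so a NaN bound never triggers. ThirdEye's own handling of NaN may differ in detail.

```python
import math


def is_threshold_anomaly(value, min_value=math.nan, max_value=math.nan):
    """Hypothetical helper: True if `value` is above max or below min.

    Any comparison with NaN is False, so an unset (NaN) bound never triggers.
    """
    return value > max_value or value < min_value


# Mirrors the example above: max is 1000 and min is left unset.
print(is_threshold_anomaly(1500.0, max_value=1000.0))  # True
print(is_threshold_anomaly(500.0, max_value=1000.0))   # False
```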

3. Holt-Winters Algorithm (type: HOLT_WINTERS_RULE)¶

Holt-Winters is a commonly used statistical forecasting algorithm for anomaly detection.

This algorithm performs very well for daily and monthly data.

For hourly and minutely data, expect more trial and error, and tune the results with duration filters and percentage filters.

Minimal configuration (for any granularity):

rules:
- detection:
    - name: detection_rule_1
      type: HOLT_WINTERS_RULE
      params:
        sensitivity: 6 # Detection sensitivity scale from 0 - 10, mapping z-score from 1 to 3.
        pattern: UP_OR_DOWN # Alert when value goes up or down by the configured threshold. (Values supported - UP, DOWN, UP_OR_DOWN)

Optional Parameters:

param	description	default value	supported values
sensitivity	Detection sensitivity scale from 0 - 10, mapping z-score from 1 to 3.	5	any double in [0, 10]
pattern	Whether to detect an anomaly when the metric drops, rises, or moves in either direction.	UP_OR_DOWN	UP, DOWN, UP_OR_DOWN
alpha	level smoothing factor	Optimized by BOBYQA optimizer to minimize error	any double in [0, 1]
beta	trend smoothing factor	Optimized by BOBYQA optimizer to minimize error	any double in [0, 1]
gamma	seasonal smoothing factor	Optimized by BOBYQA optimizer to minimize error	any double in [0, 1]
period	Seasonality period. Defaults to 7 for daily, hourly and minutely data. For monthly data, set it to 12. For non-seasonal data, set it to 1.	7	any positive integer
smoothing	For smoothing of hourly and minutely data to reduce noise	true	true or false
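
The table states only that sensitivity in [0, 10] maps to a z-score in [1, 3]. The sketch below shows one plausible reading. It assumes a linear mapping in which a higher sensitivity yields a lower z-score threshold and therefore more anomalies. Both the direction and the function name `sensitivity_to_zscore` are assumptions, not taken from this document.

```python
def sensitivity_to_zscore(sensitivity):
    """Map sensitivity in [0, 10] linearly onto a z-score in [1, 3].

    Assumption: higher sensitivity means a lower z-score threshold
    (more anomalies). The direction is not stated in this document.
    """
    clamped = min(max(sensitivity, 0.0), 10.0)  # keep the input inside [0, 10]
    return 3.0 - 0.2 * clamped


for s in (0, 5, 10):
    print(s, sensitivity_to_zscore(s))
```

Under this reading, the default sensitivity of 5 corresponds to a z-score of 2.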

4. Absolute change Rule (Type: ABSOLUTE_CHANGE_RULE)¶

Compares the current time series to a baseline. If the absolute change is above a certain threshold, it is detected as an anomaly.

Example:

rules:
- detection:
    - name: detection_rule_1
      type: ABSOLUTE_CHANGE_RULE
      params:
        offset: wo1w
        absoluteChange: 100
        pattern: UP_OR_DOWN

Parameters:

param	description	default value	supported values
offset	The baseline time series to compare with.	wo1w	* hoXh hour-over-hour data points with a lag of X hours * doXd day-over-day data points with a lag of X days * woXw week-over-week data points with a lag of X weeks * moXm month-over-month data points with a lag of X months * meanXU average of data points from the past X units (hour, day, month, week), with a lag of 1 unit * medianXU median of data points from the past X units (hour, day, month, week), with a lag of 1 unit * minXU minimum of data points from the past X units (hour, day, month, week), with a lag of 1 unit * maxXU maximum of data points from the past X units (hour, day, month, week), with a lag of 1 unit
absoluteChange	The absolute change threshold. If the absolute change when compared to the baseline is above this threshold, detect it as an anomaly.	NaN	double values. NaN means no threshold set.
pattern	Whether to detect an anomaly when the metric drops, rises, or moves in either direction.	UP_OR_DOWN	UP: detect as an anomaly only if the current time series is above the baseline. DOWN: detect as an anomaly only if the current time series is below the baseline. UP_OR_DOWN: detect as an anomaly in both directions.
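
The offset values listed for this rule follow a regular shape, which the sketch below tries to capture. It is not ThirdEye's parser: the names `Offset` and `parse_offset` are hypothetical, and the unit letters h, d, w and m are inferred from the hoXh, doXd, woXw and moXm forms and from the examples wo1w and median4w.

```python
import re
from typing import NamedTuple


class Offset(NamedTuple):
    kind: str   # "ho", "do", "wo", "mo", "mean", "median", "min" or "max"
    count: int  # the X in the offset string
    unit: str   # "h", "d", "w" or "m" (assumed unit letters)


_OFFSET_RE = re.compile(r"^(ho|do|wo|mo|mean|median|min|max)(\d+)([hdwm])$")


def parse_offset(text):
    """Split an offset string such as 'wo1w' or 'median4w' into its parts."""
    match = _OFFSET_RE.match(text)
    if match is None:
        raise ValueError(f"unsupported offset: {text!r}")
    kind, count, unit = match.groups()
    # For the "XoX" forms the unit letter repeats the prefix (wo1w, not wo1d).
    if kind in ("ho", "do", "wo", "mo") and kind[0] != unit:
        raise ValueError(f"unit does not match prefix: {text!r}")
    return Offset(kind, int(count), unit)


print(parse_offset("wo1w"))
print(parse_offset("median4w"))
```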

B. List of all supported filter rules¶

1. Percentage change anomaly filter (type: PERCENTAGE_CHANGE_FILTER)¶

Filter out the anomaly if its percentage change compared to the baseline is below a certain threshold.

Example:

filter:
  - name: filter_rule_1
    type: PERCENTAGE_CHANGE_FILTER
    params:
      threshold: 0.1 # filter out all changes less than 10%

Parameters:

params	description	default value	supported values
threshold	The percentage threshold. If the percentage change is below this threshold, filter the anomaly.	NaN	double values. NaN means no threshold set.
offset	The baseline time series used to calculate the baseline value.	The default baseline used in the detection algorithm.	* hoXh hour-over-hour data points with a lag of X hours * doXd day-over-day data points with a lag of X days * woXw week-over-week data points with a lag of X weeks * moXm month-over-month data points with a lag of X months * meanXU average of data points from the past X units (hour, day, month, week), with a lag of 1 unit * medianXU median of data points from the past X units (hour, day, month, week), with a lag of 1 unit * minXU minimum of data points from the past X units (hour, day, month, week), with a lag of 1 unit * maxXU maximum of data points from the past X units (hour, day, month, week), with a lag of 1 unit. If this value is not set, the default baseline is used. For example, if the detection uses PERCENTAGE_RULE and offset is wo1w, then the baseline is last week’s value. If the detection type is ALGORITHM, then the baseline is generated by the algorithm.
pattern	Whether to keep the anomaly when the metric drops, rises, or moves in either direction.	UP_OR_DOWN	UP: keep the anomaly only if the current value is above the baseline and passes the threshold. DOWN: keep the anomaly only if the current value is below the baseline and passes the threshold. UP_OR_DOWN: keep the anomaly if it passes the threshold, regardless of the direction in which the metric moves.

2. Site wide impact anomaly filter (Type: SITEWIDE_IMPACT_FILTER)¶

Filter the anomaly if its site wide impact is below a certain threshold.

How is site wide impact calculated?

SWI = (currentValue of the anomaly - baselineValue of the anomaly) / (current value of the site wide metric in the anomaly range)
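
The formula can be checked with a small calculation. The helper name `site_wide_impact` and all numbers below are hypothetical and serve only to show the arithmetic.

```python
def site_wide_impact(current_value, baseline_value, site_wide_current_value):
    """SWI = (current - baseline) / site-wide current value, as in the formula above."""
    if site_wide_current_value == 0:
        raise ValueError("site-wide value must be non-zero")
    return (current_value - baseline_value) / site_wide_current_value


# Hypothetical numbers: one segment rose from 400 to 500 sign-ups while the
# whole site recorded 20,000 sign-ups in the same time range.
swi = site_wide_impact(500.0, 400.0, 20000.0)
print(swi)  # 0.005
```

With a threshold of 0.01, an impact of 0.005 falls below the threshold, so under this reading the anomaly would be filtered out even though the segment itself moved by 25%.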

Example:

In the following example, we set up an anomaly detection pipeline for all possible platforms (such as ios, android, windows, etc.) in the US. We use the percentage rule to detect anomalies: if the metric is up or down by more than 1% compared to the median over the past 4 weeks, and the site wide impact of the anomaly is larger than 1%, we report it as an anomaly.

For example, suppose an anomaly is detected on the iOS platform between 2 pm and 3 pm. The site wide impact is calculated by taking the total number of sign-ups on iOS in the US between 2 pm and 3 pm, subtracting the week-over-week baseline value for the same hour, and dividing the result by the current sign-up value for the US across all platforms.

detectionName: swi_monitor
metric: signups
dataset: registration_metrics_v2_additive
dimensionExploration:
 dimensions:
    platform
filters:
    country:
      us
rules:
- detection:
    - name: detection_rule1
      type: PERCENTAGE_RULE
      params:
        offset: median4w
        percentageChange: 0.01
  filter:
   - type: SITEWIDE_IMPACT_FILTER
     name: filter_rule_1
     params:
        threshold: 0.01
        pattern: UP_OR_DOWN
        offset: wo1w
        sitewideMetricName: signups
        sitewideCollection: registration_metrics_v2_additive
        filters:
            country:
                us

Parameters:

params	description	default value	supported values & descriptions
threshold	The site wide impact threshold. If the site wide impact is below this threshold, filter the anomaly.	NaN	double values. NaN means no threshold set.
pattern	Whether to keep the anomaly when the metric drops, rises, or moves in either direction.	UP_OR_DOWN	UP: keep the anomaly only if the current value is above the baseline and passes the threshold. DOWN: keep the anomaly only if the current value is below the baseline and passes the threshold. UP_OR_DOWN: keep the anomaly if it passes the threshold, regardless of the direction in which the metric moves.
sitewideMetricName	The metric to calculate the site wide baseline value	By default, use the same metric as the anomaly without the dimension filters.	All metric names in ThirdEye.
sitewideCollection	The dataset of the metric used to calculate the site wide baseline value	By default, use the same metric as the anomaly without the dimension filters.	The dataset name for the site wide metric. The sitewideCollection must be configured together with the sitewideMetricName.
filters	The dimension filter for the site wide metric	By default, use the same metric as the anomaly without the dimension filters.	See Dimension filter to configure the filters for the site wide metric. These filters must be configured together with the sitewideMetricName.
offset	The baseline time series used to calculate the baseline value.	Use the baseline value generated in detection for the anomaly.	* hoXh hour-over-hour data points with a lag of X hours * doXd day-over-day data points with a lag of X days * woXw week-over-week data points with a lag of X weeks * moXm month-over-month data points with a lag of X months * meanXU average of data points from the past X units (hour, day, month, week), with a lag of 1 unit * medianXU median of data points from the past X units (hour, day, month, week), with a lag of 1 unit * minXU minimum of data points from the past X units (hour, day, month, week), with a lag of 1 unit * maxXU maximum of data points from the past X units (hour, day, month, week), with a lag of 1 unit

3. Threshold-based anomaly filter (Type: THRESHOLD_RULE_FILTER)¶

Filter the anomaly if the metric's current value during the anomaly is outside the allowed range.

For example:

Filter the anomaly if its current value per hour is less than 1000 or larger than 2000.

filter:
    - name: filter_rule_1
      type: THRESHOLD_RULE_FILTER
      params:
        minValueHourly: 1000
        maxValueHourly: 2000

Parameters:

params	description	default value	supported values
minValueHourly	The minimum value allowed for an anomaly on an hourly basis. If the current value per hour in the anomaly duration is less than this value, filter the anomaly.	NaN	double values. NaN means no threshold set.
maxValueHourly	The maximum value allowed for an anomaly on an hourly basis. If the current value per hour in the anomaly duration is larger than this value, filter the anomaly.	NaN	double values. NaN means no threshold set.
minValueDaily	The minimum value allowed for an anomaly on a daily basis. If the current value per day in the anomaly duration is less than this value, filter the anomaly.	NaN	double values. NaN means no threshold set.
maxValueDaily	The maximum value allowed for an anomaly on a daily basis. If the current value per day in the anomaly duration is larger than this value, filter the anomaly.	NaN	double values. NaN means no threshold set.

4. Anomaly duration filter (Type: DURATION_FILTER)¶

Filter the anomalies based on the anomaly duration.

Parameters:

params	description	default value	supported values
minDuration	The minimum duration allowed for an anomaly. If the anomaly’s duration is less than this value, filter the anomaly.	null	String representation of Java duration. See examples here: http://www.java2s.com/Tutorials/Java_Date_Time/java.time/Duration/Duration_parse_CharSequence_text_example.htm
maxDuration	The maximum duration allowed for an anomaly. If the anomaly’s duration is larger than this value, filter the anomaly.	null	String representation of Java duration.

For example:

Filter the anomaly if its duration is less than 15 minutes.

filter:
    - name: filter_rule_1
      type: DURATION_FILTER
      params:
        minDuration: PT15M

If the duration filter is set, override the default merger configuration in the YAML. Otherwise, merging may extend anomaly durations and interfere with the filter.

merger:
  maxGap: 0  # prevent potential anomaly duration extension
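
Duration strings such as PT15M use the ISO-8601 format accepted by Java's `Duration.parse`. The sketch below shows how such a value could be interpreted and applied. It handles only a subset of the format (whole days, hours, minutes and seconds), and the helper names `parse_duration` and `keeps_anomaly` are hypothetical.

```python
import re
from datetime import timedelta

_DURATION_RE = re.compile(
    r"^P(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$"
)


def parse_duration(text):
    """Parse a subset of ISO-8601 durations, e.g. 'PT15M' or 'P1DT2H'."""
    match = _DURATION_RE.match(text)
    if match is None or text == "P" or text.endswith("T"):
        raise ValueError(f"unsupported duration: {text!r}")
    parts = {name: int(value) for name, value in match.groupdict().items() if value}
    return timedelta(**parts)


def keeps_anomaly(duration, min_duration=None, max_duration=None):
    """Return False if the anomaly duration is outside the allowed range."""
    if min_duration is not None and duration < parse_duration(min_duration):
        return False
    if max_duration is not None and duration > parse_duration(max_duration):
        return False
    return True


print(keeps_anomaly(timedelta(minutes=10), min_duration="PT15M"))  # False
print(keeps_anomaly(timedelta(minutes=30), min_duration="PT15M"))  # True
```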

5. Absolute change anomaly filter (Type: ABSOLUTE_CHANGE_FILTER)¶

Checks whether the anomaly’s absolute change compared to the baseline is above the threshold. If not, the anomaly is filtered out.

Example:

filter:
    - name: filter_rule_1
      type: ABSOLUTE_CHANGE_FILTER
      params:
        threshold: 0.1 # filter out all changes with an absolute change less than 0.1

Parameters:

params	description	default value	supported values
threshold	The absolute change threshold. If the absolute change is below this threshold, filter the anomaly.	NaN	double values. NaN means no threshold set.
offset	The baseline time series used to calculate the baseline value.	The default baseline used in the detection algorithm.	* hoXh hour-over-hour data points with a lag of X hours * doXd day-over-day data points with a lag of X days * woXw week-over-week data points with a lag of X weeks * moXm month-over-month data points with a lag of X months * meanXU average of data points from the past X units (hour, day, month, week), with a lag of 1 unit * medianXU median of data points from the past X units (hour, day, month, week), with a lag of 1 unit * minXU minimum of data points from the past X units (hour, day, month, week), with a lag of 1 unit * maxXU maximum of data points from the past X units (hour, day, month, week), with a lag of 1 unit. If this value is not set, the default baseline is used. For example, if the detection uses PERCENTAGE_RULE and offset is wo1w, then the baseline is last week’s value. If the detection type is ALGORITHM, then the baseline is generated by the algorithm.
pattern	Whether to keep the anomaly when the metric drops, rises, or moves in either direction.	UP_OR_DOWN	UP: keep the anomaly only if the current value is above the baseline and passes the threshold. DOWN: keep the anomaly only if the current value is below the baseline and passes the threshold. UP_OR_DOWN: keep the anomaly if it passes the threshold, regardless of the direction in which the metric moves.

C. List of all supported Subscription group Types¶

1. Default Alerter (type: DEFAULT_ALERTER_PIPELINE)¶

The default notification type, which lets you configure a set of recipients and sends anomaly notifications to all of them.

type: DEFAULT_ALERTER_PIPELINE

2. Dimension Alerter (type: DIMENSION_ALERTER_PIPELINE)¶

This lets you alert different people, groups, or teams based on dimension values. It is a special notification type that sends the anomaly email to a set of unconditional recipients and to a set of conditional recipients chosen by the value of a specified anomaly dimension.

type: DIMENSION_ALERTER_PIPELINE
dimension: app_name
dimensionRecipients:
 "android":
  - "android-oncall@linkedin.com"
 "ios":
  - "ios-oncall@linkedin.com"
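
To illustrate how unconditional and conditional recipients could combine, the sketch below resolves the recipient list for one anomaly. The helper `resolve_recipients` and the unconditional address are hypothetical. The dimension mapping mirrors the configuration above.

```python
def resolve_recipients(unconditional, dimension_recipients, dimension_value):
    """Hypothetical helper: unconditional recipients plus those mapped to the value."""
    conditional = dimension_recipients.get(dimension_value, [])
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys([*unconditional, *conditional]))


dimension_recipients = {
    "android": ["android-oncall@linkedin.com"],
    "ios": ["ios-oncall@linkedin.com"],
}
unconditional = ["team-lead@example.com"]  # hypothetical address

# An anomaly whose app_name dimension is "ios" reaches both groups.
print(resolve_recipients(unconditional, dimension_recipients, "ios"))
# A dimension value without a mapping reaches only the unconditional recipients.
print(resolve_recipients(unconditional, dimension_recipients, "windows"))
```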