Rule Types and Configuration Options

Examples of several types of rule configuration can be found in the example_rules folder.

Note

All "time" formats are expressed as unit: quantity, where "unit" is one of weeks, days, hours, minutes, or seconds. For example, minutes: 15 or hours: 1.
Rule Configuration Cheat Sheet

All rules

Required |
---|
es_host (string) |
es_port (number) |
index (string) |
type (string) |
alert (string or list) |

Optional |
---|
name (string, defaults to the filename) |
use_strftime_index (boolean, default False) |
use_ssl (boolean, default False) |
verify_certs (boolean, default True) |
es_username (string, no default) |
es_password (string, no default) |
es_url_prefix (string, no default) |
es_send_get_body_as (string, default "GET") |
aggregation (time, no default) |
description (string, default empty string) |
generate_kibana_link (boolean, default False) |
use_kibana_dashboard (string, no default) |
kibana_url (string, default from es_host) |
use_kibana4_dashboard (string, no default) |
kibana4_start_timedelta (time, default: 10 min) |
kibana4_end_timedelta (time, default: 10 min) |
use_local_time (boolean, default True) |
realert (time, default: 1 min) |
exponential_realert (time, no default) |
match_enhancements (list of strs, no default) |
top_count_number (int, default 5) |
top_count_keys (list of strs) |
raw_count_keys (boolean, default True) |
include (list of strs, default ["*"]) |
filter (ES filter DSL, no default) |
max_query_size (int, default global max_query_size) |
query_delay (time, default 0 min) |
owner (string, default empty string) |
priority (int, default 2) |
import (string) |
buffer_time (time, default from config.yaml; IGNORED IF use_count_query or use_terms_query is true) |
timestamp_type (string, default iso) |
timestamp_format (string, default "%Y-%m-%dT%H:%M:%SZ") |
_source_enabled (boolean, default True) |
alert_text_args (array of strs) |
alert_text_kw (object) |
Note: In the table below, Req (Required) means the option must be set for that rule type, and Opt (Optional) means the option may be used with that rule type.

RULE TYPE | Any | Blacklist | Whitelist | Change | Frequency | Spike | Flatline | New_term | Cardinality
---|---|---|---|---|---|---|---|---|---
compare_key (string, no default) | | Req | Req | Req | | | | |
blacklist (list of strs, no default) | | Req | | | | | | |
whitelist (list of strs, no default) | | | Req | | | | | |
ignore_null (boolean, no default) | | | Req | Req | | | | |
query_key (string, no default) | Opt | | | Req | Opt | Opt | Opt | Req | Opt
aggregation_key (string, no default) | Opt | | | | | | | |
summary_table_fields (list, no default) | Opt | | | | | | | |
timeframe (time, no default) | | | | Opt | Req | Req | Req | | Req
num_events (int, no default) | | | | | Req | | | |
attach_related (boolean, no default) | | | | | Opt | | | |
use_count_query (boolean, no default); doc_type (string, no default) | | | | | Opt | Opt | Opt | |
use_terms_query (boolean, no default); doc_type (string, no default); query_key (string, no default); terms_size (int, default 50) | | | | | Opt | Opt | Opt | |
spike_height (int, no default) | | | | | | Req | | |
spike_type ([up\|down\|both], no default) | | | | | | Req | | |
alert_on_new_data (boolean, default False) | | | | | | Opt | | |
threshold_ref (int, no default) | | | | | | Opt | | |
threshold_cur (int, no default) | | | | | | Opt | | |
threshold (int, no default) | | | | | | | Req | |
fields (string or list, no default) | | | | | | | | Req |
terms_window_size (time, default 30 days) | | | | | | | | Opt |
window_step_size (time, default 1 day) | | | | | | | | Opt |
alert_on_missing_fields (boolean, default False) | | | | | | | | Opt |
cardinality_field (string, no default) | | | | | | | | | Req
max_cardinality (int, no default) | | | | | | | | | Opt
min_cardinality (int, no default) | | | | | | | | | Opt
Common Configuration Options

Every file that ends in .yaml in the folder pointed to by the rules_folder option will be run by default. The following configuration settings apply to all rule types.
Required Settings

es_host

es_host: The hostname of the Elasticsearch cluster the rule will use to query. (Required, string, no default) The environment variable ES_HOST will override this field.

es_port

es_port: The port of the Elasticsearch cluster. (Required, number, no default) The environment variable ES_PORT will override this field.

index

index: The name of the index that will be searched. Wildcards can be used here, such as index: my-index-*, which will match my-index-2014-10-05. You can also use a format string containing %Y, %m, and %d for year, month, and day respectively. To use this, you must set use_strftime_index to true. (Required, string, no default)

name

name: The name of the rule. This must be unique across all rules. The name will be used in alerts and used as a key when writing and reading search metadata back from Elasticsearch.

type

type: The RuleType to use. This may either be one of the built-in rule types (see Rule Types for more information) or loaded from a module. For loading from a module, the type should be specified as module.file.RuleName. (Required, string, no default)

alert

alert: The Alerter type to use. This may be one or more of the built-in alerts (see Alert Types for more information) or loaded from a module. For loading from a module, the alert should be specified as module.file.AlertName. (Required, string or list, no default)
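Taken together, the required settings above form the skeleton of every rule file. A minimal sketch (the host, index, and rule type here are illustrative placeholders, not defaults):

```yaml
# Minimal rule file containing only the required options.
es_host: elasticsearch.example.com   # hypothetical host
es_port: 9200
index: logstash-*                    # hypothetical index pattern
type: any                            # one of the built-in rule types
alert:
  - debug                            # log matches instead of sending a real alert
```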
Optional Settings

import

import: If specified, includes all the settings from this yaml file. This allows common configuration options to be shared. Note that imported files that aren't complete rules should not have a .yml or .yaml suffix so that ElastAlert doesn't treat them as rules. Filters in the imported files are merged (ANDed) with any filters in the rule. (Optional, string, no default)
use_ssl

use_ssl: Whether or not to connect to es_host using TLS. (Optional, boolean, default False) The environment variable ES_USE_SSL will override this field.

verify_certs

verify_certs: Whether or not to verify TLS certificates. (Optional, boolean, default True)

es_username

es_username: Username for basic authentication when connecting to es_host. (Optional, string, no default) The environment variable ES_USERNAME will override this field.

es_password

es_password: Password for basic authentication when connecting to es_host. (Optional, string, no default) The environment variable ES_PASSWORD will override this field.

es_url_prefix

es_url_prefix: URL prefix for the Elasticsearch endpoint. (Optional, string, no default)

es_send_get_body_as

es_send_get_body_as: The HTTP method used to query Elasticsearch. (Optional, string, default "GET")
use_strftime_index

use_strftime_index: If this is true, ElastAlert will format the index using datetime.strftime for each query. See https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior for more details. If a query spans multiple days, the formatted indexes will be concatenated with commas. This narrows the number of indices searched, which gives a significant speedup compared to matching with a wildcard. For example, if index is logstash-%Y.%m.%d, the query url will be similar to elasticsearch.example.com/logstash-2015.02.03/... or elasticsearch.example.com/logstash-2015.02.03,logstash-2015.02.04/....
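For instance, a rule that queries one daily logstash index per day, rather than a wildcard over all indices, might set (the index pattern is illustrative):

```yaml
# Query one dated index per day instead of every index matching a wildcard.
index: logstash-%Y.%m.%d
use_strftime_index: true
```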
aggregation

aggregation: This option allows you to aggregate multiple matches together into one alert. Every time a match is found, ElastAlert will wait for the aggregation period, and send all of the matches that have occurred in that time for a particular rule together.

For example:

aggregation:
  hours: 2

means that if matches occurred at 12:00, 1:00, and 2:30, one alert would be sent at 2:00 containing the first two matches, and another at 4:30 containing the third match plus any additional matches occurring before 4:30. This can be very useful if you expect a large number of matches and only want a periodic report. (Optional, time, default none)

If you wish to aggregate all your alerts and send them on a recurring interval (daily/weekly/monthly), you can use the schedule field instead.

For example, if you wish to receive alerts every Monday and Friday:

aggregation:
  schedule: '2 4 * * mon,fri'

This uses Cron syntax, which you can read more about here. Make sure to only include either a schedule field or time-unit fields (hours, minutes, days), not both.
By default, all events that occur during an aggregation window are grouped together. However, if your rule has the aggregation_key field set, then each event sharing a common key value will be grouped together, and a new aggregation window will be created for each newly encountered key value.

For example, if you wish to receive alerts grouped by the user who triggered the event, you can set:

aggregation_key: 'my_data.username'

Then, assuming an aggregation window of 10 minutes, if you receive the following data points:

{
  'my_data': {
    'username': 'alice',
    'event_type': 'login'
  },
  '@timestamp': '2016-09-20T00:00:00'
}

{
  'my_data': {
    'username': 'bob',
    'event_type': 'something'
  },
  '@timestamp': '2016-09-20T00:05:00'
}

{
  'my_data': {
    'username': 'alice',
    'event_type': 'something else'
  },
  '@timestamp': '2016-09-20T00:06:00'
}

This should result in two alerts: one containing alice's two events, sent at 2016-09-20T00:10:00, and one containing bob's one event, sent at 2016-09-20T00:15:00.

For aggregations, there can sometimes be a large number of documents present in the viewing medium (email, jira ticket, etc.). If you set the summary_table_fields field, ElastAlert will provide a summary of the specified fields from all the results.

For example, if you wish to summarize the usernames and event_types that appear in the documents so that you can quickly see the most relevant fields:

summary_table_fields:
  - my_data.username
  - my_data.event_type

Then, for the same sample data of alice's and bob's events listed above, ElastAlert will provide the following summary table in the alert medium:
+------------------+--------------------+
| my_data.username | my_data.event_type |
+------------------+--------------------+
|      alice       |       login        |
|       bob        |     something      |
|      alice       |   something else   |
+------------------+--------------------+
Note

By default, aggregation time is relative to the current system time, not the time of the match. This means that running ElastAlert over past events will result in different alerts than if ElastAlert had been running while those events occurred. This behavior can be changed by setting aggregate_by_match_time.

aggregate_by_match_time

Setting this to true will cause aggregations to be created relative to the timestamp of the first event, rather than the current time. This is useful for querying over historic data or if a single query, using a very large buffer_time, is expected to trigger multiple aggregations.
realert

realert: This option allows you to ignore repeating alerts for a period of time. If the rule uses a query_key, this option will be applied on a per-key basis. All matches for a given rule, or for matches with the same query_key, will be ignored for the given time. All matches with a missing query_key will be grouped together using a value of _missing. This is applied to the time the alert is sent, not to the time of the event. It defaults to one minute, which means that if ElastAlert is run over a large time period which triggers many matches, only the first alert will be sent by default. If you want every alert, set realert to 0 minutes. (Optional, time, default 1 minute)
exponential_realert

exponential_realert: This option causes the value of realert to exponentially increase while alerts continue to fire. If set, the value of exponential_realert is the maximum realert will increase to. If the time between alerts is less than twice realert, realert will double. For example, if realert: minutes: 10 and exponential_realert: hours: 1, and alerts fire at 1:00 and 1:15, the next alert will not be sent until at least 1:35. If another alert fires between 1:35 and 2:15, realert will increase to the maximum of one hour. If more than two hours elapse before the next alert, realert will go back down. Note that alerts that are ignored (e.g. one that occurred at 1:05) would not change realert. (Optional, time, no default)
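As a sketch, the two options from the example above would appear in a rule file as:

```yaml
# Suppress duplicate alerts for 10 minutes, doubling the suppression
# window (up to a maximum of 1 hour) while alerts keep firing.
realert:
  minutes: 10
exponential_realert:
  hours: 1
```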
buffer_time

buffer_time: This option allows the rule to override the global buffer_time set in config.yaml. If use_count_query or use_terms_query is true, this value is ignored.

query_delay

query_delay: This option will cause ElastAlert to subtract a time delta from every query, causing the rule to run with a delay. This is useful if documents are not indexed into Elasticsearch in real time. (Optional, time, default 0 minutes)
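For example, to query a wide window while lagging behind real time so that slowly indexed documents are not missed (the values here are illustrative):

```yaml
# Look back 15 minutes per query, but run each query 5 minutes
# behind real time to let slow documents finish indexing.
buffer_time:
  minutes: 15
query_delay:
  minutes: 5
```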
owner

owner: This value will be used to identify the stakeholder of the alert. Optionally, this field can be included in any alert type. (Optional, string, default empty string)

priority

priority: This value will be used to identify the relative priority of the alert. Optionally, this field can be included in any alert type (e.g. in the email subject/body text). (Optional, int, default 2)

max_query_size

max_query_size: The maximum number of documents that will be downloaded from Elasticsearch in a single query. If you expect a large number of results, consider using use_count_query for the rule. If this limit is reached, a warning will be logged but ElastAlert will continue without downloading more documents. (Optional, int, default global max_query_size)
filter

filter: A list of Elasticsearch query DSL filters that is used to query Elasticsearch. ElastAlert will query Elasticsearch using the format {'filter': {'bool': {'must': [config.filter]}}} with an additional timestamp range filter. All of the results of querying with these filters are passed to the RuleType for analysis. For more information on writing filters, see Writing Filters. (Required, Elasticsearch query DSL, no default)
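A hedged sketch of what such a filter list can look like (the field names and values are hypothetical):

```yaml
# Match documents whose event field is "error" AND whose
# text matches a query string; entries in the list are ANDed.
filter:
  - term:
      event: "error"
  - query:
      query_string:
        query: "username: bob"
```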
include

include: A list of terms that should be included in query results and passed to rule types and alerts. When set, only those fields, along with '@timestamp', query_key, compare_key, and top_count_keys, are included, if present. (Optional, list of strings, default ["*"])
top_count_keys

top_count_keys: A list of fields. ElastAlert will perform a terms query for the top X most common values for each of the fields, where X is 5 by default, or top_count_number if it exists. For example, if num_events is 100 and top_count_keys is - "username", the alert will say how many of the 100 events have each of the top 5 usernames. When this is computed, the time range used is from timeframe before the most recent event to 10 minutes past the most recent event. Because ElastAlert uses an aggregation query to compute this, it will attempt to use the field name plus ".raw" to count unanalyzed terms. To turn this off, set raw_count_keys to false.

top_count_number

top_count_number: The number of terms to list if top_count_keys is set. (Optional, int, default 5)

raw_count_keys

raw_count_keys: If true, all fields in top_count_keys will have .raw appended to them. (Optional, boolean, default true)
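A sketch combining these options (the field name is illustrative):

```yaml
# Summarize the 3 most common usernames among the matched events.
top_count_keys:
  - "username"
top_count_number: 3
```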
description

description: Text describing the purpose of the rule. (Optional, string, default empty string) Can be referenced in custom alerters to provide context as to why a rule fired.

generate_kibana_link

generate_kibana_link: This option is for Kibana 3 only. If true, ElastAlert will generate a temporary Kibana dashboard and include a link to it in alerts. The dashboard consists of an events-over-time graph and a table with the fields from include shown. If the rule uses query_key, the dashboard will also contain a filter for the query_key of the alert. The dashboard schema will be uploaded to the kibana-int index as a temporary dashboard. (Optional, boolean, default False)
kibana_url

kibana_url: The url to access Kibana. This will be used if generate_kibana_link or use_kibana_dashboard is true. If not specified, a URL will be constructed using es_host and es_port. (Optional, string, default http://<es_host>:<es_port>/_plugin/kibana/)

use_kibana_dashboard

use_kibana_dashboard: The name of a Kibana 3 dashboard to link to. Instead of generating a dashboard from a template, ElastAlert can use an existing dashboard. It will set the time range on the dashboard to around the match time, upload it as a temporary dashboard, add a filter to the query_key of the alert if applicable, and put the url to the dashboard in the alert. (Optional, string, no default)
use_kibana4_dashboard

use_kibana4_dashboard: A link to a Kibana 4 dashboard. For example, "https://kibana.example.com/#/dashboard/My-Dashboard". This will set the time setting on the dashboard from the match time minus the timeframe, to 10 minutes after the match time. Note that this does not support filtering by query_key like Kibana 3.
kibana4_start_timedelta

kibana4_start_timedelta: Defaults to 10 minutes. This option allows you to specify the start time for the generated kibana4 dashboard. This value is added in front of the event. For example, kibana4_start_timedelta: minutes: 2

kibana4_end_timedelta

kibana4_end_timedelta: Defaults to 10 minutes. This option allows you to specify the end time for the generated kibana4 dashboard. This value is added in back of the event. For example, kibana4_end_timedelta: minutes: 2
use_local_time

use_local_time: Whether to convert timestamps to the local time zone in alerts. If false, timestamps will be converted to UTC, which is what ElastAlert uses internally. (Optional, boolean, default true)
match_enhancements

match_enhancements: A list of enhancement modules to use with this rule. An enhancement module is a subclass of enhancements.BaseEnhancement that will be given the match dictionary and can modify it before it is given to the alerter. The enhancements will be run after silence and realert are calculated and, in the case of aggregated alerts, right before the alert is sent. This behavior can be changed by setting run_enhancements_first. The enhancements should be specified as module.file.EnhancementName. See Enhancements for more information. (Optional, list of strings, no default)

run_enhancements_first

run_enhancements_first: If set to true, enhancements will be run as soon as a match is found. This means that they can be changed or dropped before affecting realert or being added to an aggregation. Silence stashes will still be created before the enhancement runs, meaning even if a DropMatchException is raised, the rule will still be silenced. (Optional, boolean, default false)
query_key

query_key: Having a query key means that realert time will be counted separately for each unique value of query_key. For rule types which count documents, such as spike, frequency and flatline, it also means that these counts will be independent for each unique value of query_key.

For example, if query_key is set to username and realert is set, and an alert triggers on a document with {'username': 'bob'}, additional alerts for {'username': 'bob'} will be ignored while other usernames will trigger alerts.

Documents which are missing the query_key will be grouped together. A list of fields may also be used, which will create a compound query key. This compound key is treated as if it were a single field whose value is the component values, or "None", joined by commas. A new field with the key "field1,field2,etc" will be created in each document and may conflict with existing fields of the same name.
aggregation_key

aggregation_key: Having an aggregation key in conjunction with an aggregation will make it so that each new value encountered for the aggregation_key field will result in a new, separate aggregation window.

summary_table_fields

summary_table_fields: Specifying the summary_table_fields in conjunction with an aggregation will make it so that each aggregated alert will contain a table summarizing the values for the specified fields in all the matches that were aggregated together.
timestamp_type

timestamp_type: One of iso, unix, unix_ms, or custom. This option will set the type of @timestamp (or timestamp_field) used to query Elasticsearch. iso will use ISO8601 timestamps, which will work with most Elasticsearch date type fields. unix will query using an integer unix (seconds since 1/1/1970) timestamp. unix_ms will use a milliseconds unix timestamp. custom allows you to define your own timestamp_format. The default is iso. (Optional, string enum, default iso)

timestamp_format

timestamp_format: In case Elasticsearch uses a custom date format for a date type field, this option provides a way to define a custom timestamp format to match the format used for the Elasticsearch date type field. This option is only valid if timestamp_type is set to custom. (Optional, string, default '%Y-%m-%dT%H:%M:%SZ')
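For example, a hedged sketch for an index whose date field lacks the "T" separator and timezone suffix (the format string is illustrative):

```yaml
# Parse timestamps like "2016-09-20 00:00:00" instead of ISO8601.
timestamp_type: custom
timestamp_format: '%Y-%m-%d %H:%M:%S'
```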
_source_enabled

_source_enabled: If true, ElastAlert will use _source to retrieve fields from documents in Elasticsearch. If false, ElastAlert will use fields to retrieve stored fields. Both of these are represented internally as if they came from _source. See https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-fields.html for more details. The fields used come from include; see above for more details. (Optional, boolean, default True)
Some rules and alerts require additional options, which also go in the top level of the rule configuration file.
Testing Your Rule
Once you've written a rule configuration, you will want to validate it. To do so, you can either run ElastAlert in debug mode, or use elastalert-test-rule, a script that makes various aspects of testing easier.
It can:
- Check that the configuration file loaded successfully.
- Check that the Elasticsearch filter parses.
- Run against the last X day(s) and show the number of hits that match your filter.
- Show the available terms in one of the results.
- Save documents returned to a JSON file.
- Run ElastAlert using either a JSON file or actual results from Elasticsearch.
- Print out debug alerts or trigger real alerts.
- Check that, if they exist, the primary_key, compare_key and include terms are in the results.
- Show what metadata documents would be written to elastalert_status.
Without any optional arguments, it will run ElastAlert over the last 24 hours and print out any alerts that would have occurred. Here is an example test run which triggered an alert:
$
elastalert-test-rule my_rules/rule1.yaml
Successfully Loaded Example rule1
Got 105 hits from the last 1 day
Available terms in first hit:
@timestamp
field1
field2
...
Included term this_field_doesnt_exist may be missing or null
INFO:root:Queried rule Example rule1 from 6-16 15:21 PDT to 6-17 15:21 PDT: 105 hits
INFO:root:Alert for Example rule1 at 2015-06-16T23:53:12Z:
INFO:root:Example rule1
At least 50 events occurred between 6-16 18:30 PDT and 6-16 20:30 PDT
field1:
value1: 25
value2: 25
@timestamp: 2015-06-16T20:30:04-07:00
field1: value1
field2: something
Would have written the following documents to elastalert_status:
silence - {'rule_name': 'Example rule1', '@timestamp': datetime.datetime( ... ), 'exponent': 0, 'until':
datetime.datetime( ... )}
elastalert_status - {'hits': 105, 'matches': 1, '@timestamp': datetime.datetime( ... ), 'rule_name': 'Example rule1',
'starttime': datetime.datetime( ... ), 'endtime': datetime.datetime( ... ), 'time_taken': 3.1415926}
Note that everything between "Alert for Example rule1 at ..." and "Would have written the following ..." is the exact text body that an alert would have. See the section below on alert content for more details. Also note that datetime objects are converted to ISO8601 timestamps when uploaded to Elasticsearch. See the section on metadata for more details.
Other options include:
--schema-only
: Only perform schema validation on the file. It will not load modules or query Elasticsearch. This may catch invalid YAML and missing or misconfigured fields.
--count-only
: Only find the number of matching documents and list available fields. ElastAlert will not be run and documents will not be downloaded.
--days N
: Instead of the default 1 day, query N days. For selecting more specific time ranges, you must run ElastAlert itself and use --start and --end.
--save-json FILE
: Save all documents downloaded to a file as JSON. This is useful if you wish to modify data while testing or do offline testing in conjunction with --data FILE. A maximum of 10,000 documents will be downloaded.
--data FILE
: Use a JSON file as a data source instead of Elasticsearch. The file should be a single list containing objects, rather than objects on separate lines. Note that this uses mock functions which mimic some Elasticsearch query methods and is not guaranteed to have the exact same results as with Elasticsearch. For example, analyzed string fields may behave differently.
--alert
: Trigger real alerts instead of the debug (logging text) alert.
Note
Results from running this script may not always be the same as if an actual ElastAlert instance was running. Some rule types, such as spike and flatline require a minimum elapsed time before they begin alerting, based on their timeframe. In addition, use_count_query and use_terms_query rely on run_every to determine their resolution. This script uses a fixed 5 minute window, which is the same as the default.
Rule Types
The various RuleType classes, defined in elastalert/ruletypes.py, form the main logic behind ElastAlert. An instance is held in memory for each rule, passed all of the data returned by querying Elasticsearch with a given filter, and generates matches based on that data.
To select a rule type, set the type option to the name of the rule type in the rule configuration file:
type: <ruletype>
Any
any
: The any rule will match everything. Every hit that the query returns will generate an alert.
Blacklist
blacklist
: The blacklist rule will check a certain field against a blacklist, and match if it is in the blacklist.
This rule requires two additional options:
compare_key: The name of the field to use to compare to the blacklist. If the field is null, those events will be ignored.

blacklist: A list of blacklisted values, and/or a list of paths to flat files which contain the blacklisted values using - "!file /path/to/file"; for example:

blacklist:
  - value1
  - value2
  - "!file /tmp/blacklist1.txt"
  - "!file /tmp/blacklist2.txt"

It is possible to mix between blacklist value definitions, or use either one. The compare_key term must be equal to one of these values for it to match.
Whitelist
whitelist: Similar to blacklist, this rule will compare a certain field to a whitelist, and match if the list does not contain the term.

This rule requires three additional options:

compare_key: The name of the field to use to compare to the whitelist.

ignore_null: If true, events without a compare_key field will not match.

whitelist: A list of whitelisted values, and/or a list of paths to flat files which contain the whitelisted values using - "!file /path/to/file"; for example:

whitelist:
  - value1
  - value2
  - "!file /tmp/whitelist1.txt"
  - "!file /tmp/whitelist2.txt"

It is possible to mix between whitelisted value definitions, or use either one. The compare_key term must be in this list or else it will match.
Change
For an example configuration file using this rule type, look at example_rules/example_change.yaml.

change: This rule will monitor a certain field and match if that field changes. The field must change with respect to the last event with the same query_key.

This rule requires three additional options:

compare_key: The name of the field to monitor for changes.

ignore_null: If true, events without a compare_key field will not count as changed.

query_key: This rule is applied on a per-query_key basis. This field must be present in all of the events that are checked.

There is also an optional field:

timeframe: The maximum time between changes. After this time period, ElastAlert will forget the old value of the compare_key field.
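A minimal sketch of a change rule (the field names and rule name are illustrative, not taken from example_change.yaml):

```yaml
# Alert when a user's country_name differs from their previous event.
name: Hypothetical change rule
type: change
index: logstash-*
compare_key: country_name
ignore_null: true
query_key: username
timeframe:
  days: 1
alert:
  - debug
```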
Frequency
For an example configuration file using this rule type, look at example_rules/example_frequency.yaml.

frequency: This rule matches when there are at least a certain number of events in a given time frame. This may be counted on a per-query_key basis.

This rule requires two additional options:

num_events: The number of events which will trigger an alert.

timeframe: The time that num_events must occur within.

Optional:

use_count_query: If true, ElastAlert will poll Elasticsearch using the count api, and not download all of the matching documents. This is useful if you care only about numbers and not the actual data. It should also be used if you expect a large number of query hits, in the order of tens of thousands or more. doc_type must be set to use this.

doc_type: Specify the _type of document to search for. This must be present if use_count_query or use_terms_query is set.

use_terms_query: If true, ElastAlert will make an aggregation query against Elasticsearch to get counts of documents matching each unique value of query_key. This must be used with query_key and doc_type. This will only return a maximum of terms_size, default 50, unique terms.

terms_size: When used with use_terms_query, this is the maximum number of terms returned per query. Default is 50.

query_key: Counts of documents will be stored independently for each value of query_key. Only num_events documents, all with the same value of query_key, will trigger an alert.

attach_related: Will attach all the related events to the event that triggered the frequency alert. For example, in an alert triggered with num_events: 3, the 3rd event will trigger the alert on itself and add the other 2 events in a key named related_events that can be accessed in the alerter.
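A minimal sketch of a frequency rule (the threshold and filter values are illustrative, not taken from example_frequency.yaml):

```yaml
# Alert when 50 or more error events occur within one hour.
name: Hypothetical frequency rule
type: frequency
index: logstash-*
num_events: 50
timeframe:
  hours: 1
filter:
  - term:
      level: "error"
alert:
  - debug
```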
Spike
spike: This rule matches when the volume of events during a given time period is spike_height times larger or smaller than during the previous time period. It uses two sliding windows to compare the current and reference frequency of events. We will call these two windows "reference" and "current".

This rule requires three additional options:

spike_height: The ratio of the number of events in the last timeframe to the previous timeframe that, when hit, will trigger an alert.

spike_type: Either 'up', 'down' or 'both'. 'Up' means the rule will only match when the number of events is spike_height times higher. 'Down' means the reference number is spike_height times higher than the current number. 'Both' will match either.

timeframe: The rule will average out the rate of events over this time period. For example, hours: 1 means that the 'current' window will span from present to one hour ago, and the 'reference' window will span from one hour ago to two hours ago. The rule will not be active until the time elapsed from the first event is at least two timeframes. This is to prevent an alert being triggered before a baseline rate has been established. This can be overridden using alert_on_new_data.

Optional:

threshold_ref: The minimum number of events that must exist in the reference window for an alert to trigger. For example, if spike_height: 3 and threshold_ref: 10, then the 'reference' window must contain at least 10 events and the 'current' window at least three times that for an alert to be triggered.

threshold_cur: The minimum number of events that must exist in the current window for an alert to trigger. For example, if spike_height: 3 and threshold_cur: 60, then an alert will occur if the current window has more than 60 events and the reference window has less than a third as many.

To illustrate the use of threshold_ref, threshold_cur, alert_on_new_data, timeframe and spike_height together, consider the following examples:
" Alert if at least 15 events occur within two hours and less than a quarter of that number occurred within the previous two hours. "
timeframe: hours: 2
spike_height: 4
spike_type: up
threshold_cur: 15
hour1: 5 events (ref: 0, cur: 5) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour2: 5 events (ref: 0, cur: 10) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour3: 10 events (ref: 5, cur: 15) - No alert because (a) spike_height not met, (b) ref window not filled
hour4: 35 events (ref: 10, cur: 45) - Alert because (a) spike_height met, (b) threshold_cur met, (c) ref window filled
hour1: 20 events (ref: 0, cur: 20) - No alert because ref window not filled
hour2: 21 events (ref: 0, cur: 41) - No alert because ref window not filled
hour3: 19 events (ref: 20, cur: 40) - No alert because (a) spike_height not met, (b) ref window not filled
hour4: 23 events (ref: 41, cur: 42) - No alert because spike_height not met
hour1: 10 events (ref: 0, cur: 10) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour2: 0 events (ref: 0, cur: 10) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour3: 0 events (ref: 10, cur: 0) - No alert because (a) threshold_cur not met, (b) ref window not filled, (c) spike_height not met
hour4: 30 events (ref: 10, cur: 30) - No alert because spike_height not met
hour5: 5 events (ref: 0, cur: 35) - Alert because (a) spike_height met, (b) threshold_cur met, (c) ref window filled
" Alert if at least 5 events occur within two hours, and twice as many events occur within the next two hours. "
timeframe: hours: 2
spike_height: 2
spike_type: up
threshold_ref: 5
hour1: 20 events (ref: 0, cur: 20) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour2: 100 events (ref: 0, cur: 120) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour3: 100 events (ref: 20, cur: 200) - No alert because ref window not filled
hour4: 100 events (ref: 120, cur: 200) - No alert because spike_height not met
hour1: 0 events (ref: 0, cur: 0) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour2: 20 events (ref: 0, cur: 20) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour3: 100 events (ref: 0, cur: 120) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour4: 100 events (ref: 20, cur: 200) - Alert because (a) spike_height met, (b) threshold_ref met, (c) ref window filled
hour1: 1 events (ref: 0, cur: 1) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour2: 2 events (ref: 0, cur: 3) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour3: 2 events (ref: 1, cur: 4) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour4: 1000 events (ref: 3, cur: 1002) - No alert because threshold_ref not met
hour5: 2 events (ref: 4, cur: 1002) - No alert because threshold_ref not met
hour6: 4 events: (ref: 1002, cur: 6) - No alert because spike_height not met
hour1: 1000 events (ref: 0, cur: 1000) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour2: 0 events (ref: 0, cur: 1000) - No alert because (a) threshold_ref not met, (b) ref window not filled
hour3: 0 events (ref: 1000, cur: 0) - No alert because (a) spike_height not met, (b) ref window not filled
hour4: 0 events (ref: 1000, cur: 0) - No alert because spike_height not met
hour5: 1000 events (ref: 0, cur: 1000) - No alert because threshold_ref not met
hour6: 1050 events (ref: 0, cur: 2050)- No alert because threshold_ref not met
hour7: 1075 events (ref: 1000, cur: 2125) Alert because (a) spike_height met, (b) threshold_ref met, (c) ref window filled
" Alert if at least 100 events occur within two hours and less than a fifth of that number occurred in the previous two hours. "
timeframe: hours: 2
spike_height: 5
spike_type: up
threshold_cur: 100
hour1: 1000 events (ref: 0, cur: 1000) - No alert because ref window not filled
hour1: 2 events (ref: 0, cur: 2) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour2: 1 events (ref: 0, cur: 3) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour3: 20 events (ref: 2, cur: 21) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour4: 81 events (ref: 3, cur: 101) - Alert because (a) spike_height met, (b) threshold_cur met, (c) ref window filled
hour1: 10 events (ref: 0, cur: 10) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour2: 20 events (ref: 0, cur: 30) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour3: 40 events (ref: 10, cur: 60) - No alert because (a) threshold_cur not met, (b) ref window not filled
hour4: 80 events (ref: 30, cur: 120) - No alert because spike_height not met
hour5: 200 events (ref: 60, cur: 280) - No alert because spike_height not met
alert_on_new_data: This option is only used if query_key is set. When this is set to true, any new query_key encountered may trigger an immediate alert. When set to false, a baseline must be established for each new query_key value, after which subsequent spikes may cause alerts. The baseline is established after timeframe has elapsed twice since the first occurrence.
use_count_query: If true, ElastAlert will poll Elasticsearch using the count API rather than downloading all of the matching documents. This is useful if you only care about numbers and not the actual data. It should also be used if you expect a large number of query hits, on the order of tens of thousands or more. doc_type must be set to use this.
doc_type: Specify the _type of document to search for. This must be present if use_count_query or use_terms_query is set.
use_terms_query: If true, ElastAlert will make an aggregation query against Elasticsearch to get counts of documents matching each unique value of query_key. This must be used with query_key and doc_type, and will only return a maximum of terms_size (default 50) unique terms.
terms_size: When used with use_terms_query, this is the maximum number of terms returned per query. Default is 50.
query_key: Counts of documents will be stored independently for each value of query_key.
Flatline
flatline: This rule matches when the total number of events is under a given threshold for a time period.
This rule requires two additional options:
threshold: The minimum number of events for an alert not to be triggered.
timeframe: The time period that must contain less than threshold events.
Optional:
use_count_query: If true, ElastAlert will poll Elasticsearch using the count API rather than downloading all of the matching documents. This is useful if you only care about numbers and not the actual data. It should also be used if you expect a large number of query hits, on the order of tens of thousands or more. doc_type must be set to use this.
doc_type: Specify the _type of document to search for. This must be present if use_count_query or use_terms_query is set.
use_terms_query: If true, ElastAlert will make an aggregation query against Elasticsearch to get counts of documents matching each unique value of query_key. This must be used with query_key and doc_type, and will only return a maximum of terms_size (default 50) unique terms.
terms_size: When used with use_terms_query, this is the maximum number of terms returned per query. Default is 50.
query_key: With the flatline rule, query_key means that an alert will be triggered if any value of query_key has been seen at least once and then falls below the threshold.
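A minimal flatline rule sketch; the name, index, and filter values are hypothetical:

```yaml
# Hypothetical flatline rule: alert if fewer than 10 heartbeat
# events arrive within any one-hour window.
name: heartbeat-flatline
type: flatline
index: logstash-*
threshold: 10
timeframe:
  hours: 1
use_count_query: true
doc_type: logs
filter:
- term:
    event: heartbeat
alert:
- email
```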
New Term
new_term: This rule matches when a new value appears in a field that has never been seen before. When ElastAlert starts, it will use an aggregation query to gather all known terms for a list of fields.
This rule requires one additional option:
fields: A list of fields to monitor for new terms. query_key will be used if fields is not set. Each entry in the list of fields can itself be a list; if a field entry is provided as a list, it will be interpreted as a set of fields that compose a composite key used for the Elasticsearch query.
Note
The composite fields may only refer to primitive types, otherwise the initial Elasticsearch query will not properly return the aggregation results, causing alerts to fire every time the ElastAlert service initially launches with the rule. A warning will be logged to the console if this scenario is encountered; however, future alerts will work as expected after the initial flurry.
Optional:
terms_window_size: The amount of time used for the initial query to find existing terms. No term that has occurred within this time frame will trigger an alert. The default is 30 days.
window_step_size: When querying for existing terms, split up the time range into steps of this size. For example, using the default 30 day window size and the default 1 day step size, 30 individual queries will be made. This helps to avoid timeouts for very expensive aggregation queries. The default is 1 day.
alert_on_missing_field: Whether or not to alert when a field is missing from a document. The default is false.
use_terms_query: If true, ElastAlert will use aggregation queries to get terms instead of regular search queries. This is faster than regular searching if there is a large number of documents. If this is used, you may only specify a single field, and must also set query_key to that field. Also note that terms_size (the number of buckets returned per query) defaults to 50. This means that if a new term appears but there are at least 50 terms which appear more frequently, it will not be found.
Cardinality
cardinality: This rule matches when the total number of unique values for a certain field within a time frame is higher or lower than a threshold.
This rule requires:
timeframe: The time period in which the number of unique values will be counted.
cardinality_field: Which field to count the cardinality for.
This rule requires one of the two following options:
max_cardinality: If the cardinality of the data is greater than this number, an alert will be triggered. Each new event that raises the cardinality will trigger an alert.
min_cardinality: If the cardinality of the data is lower than this number, an alert will be triggered. The timeframe must have elapsed since the first event before any alerts will be sent. When a match occurs, the timeframe will be reset and must elapse again before additional alerts.
Optional:
query_key: Group cardinality counts by this field. For each unique value of the query_key field, cardinality will be counted separately.
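A minimal cardinality rule sketch; the field names and threshold are hypothetical:

```yaml
# Hypothetical cardinality rule: alert when more than 100 distinct
# source IPs are seen for a single username within 4 hours.
name: many-source-ips
type: cardinality
index: logstash-*
cardinality_field: source_ip
max_cardinality: 100
timeframe:
  hours: 4
query_key: username   # count cardinality separately per user
alert:
- email
```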
Metric Aggregation
metric_aggregation: This rule matches when the value of a metric within the calculation window is higher or lower than a threshold. By default this window is buffer_time.
This rule requires:
metric_agg_key: The name of the field over which the metric value will be calculated. The underlying type of this field must be supported by the specified aggregation type.
metric_agg_type: The type of metric aggregation to perform on the metric_agg_key field. This must be one of 'min', 'max', 'avg', 'sum', 'cardinality', 'value_count'.
doc_type: Specify the _type of document to search for.
This rule also requires at least one of the two following options:
max_threshold: If the calculated metric value is greater than this number, an alert will be triggered. This threshold is exclusive.
min_threshold: If the calculated metric value is less than this number, an alert will be triggered. This threshold is exclusive.
Optional:
query_key: Group metric calculations by this field. For each unique value of the query_key field, the metric will be calculated and evaluated separately against the threshold(s).
use_run_every_query_size: By default the metric value is calculated over a buffer_time sized window. If this parameter is true, the rule will use run_every as the calculation window.
allow_buffer_time_overlap: This setting only has an effect if use_run_every_query_size is false and buffer_time is greater than run_every. If true, the start of the metric calculation window is allowed to overlap the end time of a previous run. By default the start and end times will not overlap, so if the time elapsed since the last run is less than the metric calculation window size, rule execution will be skipped (to avoid calculations on partial data).
bucket_interval: If present, this will divide the metric calculation window into bucket_interval sized segments. The metric value will be calculated and evaluated against the threshold(s) for each segment. If bucket_interval is specified, then buffer_time must be a multiple of bucket_interval (or run_every, if use_run_every_query_size is true).
sync_bucket_interval: This only has an effect if bucket_interval is present. If true, it will sync the start and end times of the metric calculation window to the keys (timestamps) of the underlying date_histogram buckets. Because of the way Elasticsearch calculates date_histogram bucket keys, these usually round evenly to the nearest minute, hour, day, etc. (depending on the bucket size). By default the bucket keys are offset to align with the time ElastAlert runs (this both avoids calculations on partial data and ensures the very latest documents are included). See https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html#_offset for a more comprehensive explanation.
Percentage Match
percentage_match: This rule matches when the percentage of documents in the match bucket within a calculation window is higher or lower than a threshold. By default the calculation window is buffer_time.
This rule requires:
match_bucket_filter: ES filter DSL. This defines a filter for the match bucket, which should match a subset of the documents returned by the main query filter.
doc_type: Specify the _type of document to search for.
This rule also requires at least one of the two following options:
min_percentage: If the percentage of matching documents is less than this number, an alert will be triggered.
max_percentage: If the percentage of matching documents is greater than this number, an alert will be triggered.
Optional:
query_key: Group percentages by this field. For each unique value of the query_key field, the percentage will be calculated and evaluated separately against the threshold(s).
use_run_every_query_size: See use_run_every_query_size in the Metric Aggregation rule.
allow_buffer_time_overlap: See allow_buffer_time_overlap in the Metric Aggregation rule.
bucket_interval: See bucket_interval in the Metric Aggregation rule.
sync_bucket_interval: See sync_bucket_interval in the Metric Aggregation rule.
Alerts
Each rule may have any number of alerts attached to it. Alerts are subclasses of Alerter and are passed a dictionary, or list of dictionaries, from ElastAlert which contain relevant information. They are configured in the rule configuration file similarly to rule types.
To set the alerts for a rule, set the alert option to the name of the alert, or a list of the names of alerts:
alert: email
or
alert:
- email
- jira
E-mail subject or JIRA issue summary can also be customized by adding an alert_subject that contains a custom summary. It can be further formatted using standard Python formatting syntax:
alert_subject: "Issue {0} occurred at {1}"
The arguments for the formatter will be fed from the matched objects related to the alert. The field names whose values will be used as the arguments can be passed with alert_subject_args:
alert_subject_args:
- issue.name
- "@timestamp"
It is mandatory to enclose the @timestamp field in quotes, since in YAML a token cannot begin with the @ character; not using the quotation marks will trigger a YAML parse error.
In case the rule matches multiple objects in the index, only the first match is used to populate the arguments for the formatter.
If the field(s) mentioned in the arguments list are missing, the email alert will have the text <MISSING VALUE> in place of the expected value. This will also occur if use_count_query is set to true.
Alert Content
There are several ways to format the body text of the various types of events. In EBNF:
rule_name = name
alert_text = alert_text
ruletype_text = Depends on type
top_counts_header = top_count_key, ":"
top_counts_value = Value, ": ", Count
top_counts = top_counts_header, LF, top_counts_value
field_values = Field, ": ", Value
Similarly to alert_subject, alert_text can be further formatted using standard Python formatting syntax. The field names whose values will be used as the arguments can be passed with alert_text_args or alert_text_kw. You may also refer to any top-level rule property in the alert_subject_args, alert_text_args, and alert_text_kw fields. However, if the matched document has a key with the same name, that will take preference over the rule property.
By default:
body = rule_name [alert_text] ruletype_text {top_counts} {field_values}
With alert_text_type: alert_text_only:
body = rule_name alert_text
With alert_text_type: exclude_fields:
body = rule_name [alert_text] ruletype_text {top_counts}
ruletype_text is the string returned by RuleType.get_match_str.
field_values will contain every key value pair included in the results from Elasticsearch. These fields include "@timestamp" (or the value of timestamp_field), every key in include, every key in top_count_keys, query_key, and compare_key. If the alert spans multiple events, these values may come from an individual event, usually the one which triggers the alert.
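The alert_text formatting described above can be sketched in a rule file as follows; the field names used as arguments are hypothetical:

```yaml
# Hypothetical alert body customization using positional arguments.
alert_text_type: alert_text_only   # body = rule_name + alert_text
alert_text: "User {0} failed to log in from {1}"
alert_text_args:
- username
- source_ip
```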
Command
The command alert allows you to execute an arbitrary command and pass arguments or stdin from the match. Arguments to the command can use Python format string syntax to access parts of the match. The alerter will open a subprocess and optionally pass the match, or matches in the case of an aggregated alert, as a JSON array, to the stdin of the process.
This alert requires one option:
command: A list of arguments to execute or a string to execute. If in list format, the first argument is the name of the program to execute. If passed a string, the command is executed through the shell.
Strings can be formatted using the old-style format (%) or the new-style format (.format()). When the old-style format is used, fields are accessed using %(field_name)s. When the new-style format is used, fields are accessed using {match[field_name]}. New-style formatting allows accessing nested fields (e.g., {match[field_1_name][field_2_name]}).
In an aggregated alert, these fields come from the first match.
Optional:
new_style_string_format: If true, arguments are formatted using .format() rather than %. The default is false.
pipe_match_json: If true, the match will be converted to JSON and passed to stdin of the command. Note that this will cause ElastAlert to block until the command exits or sends an EOF to stdout.
Example usage using old-style format:
alert:
- command
command: ["/bin/send_alert", "--username", "%(username)s"]
Warning
Executing commands with untrusted data can make your system vulnerable to shell injection! If you use formatted data in your command, it is highly recommended that you use an args list format instead of a shell string.
Example usage using new-style format:
alert:
- command
command: ["/bin/send_alert", "--username", "{match[username]}"]
Email
This alert will send an email. It connects to an SMTP server located at smtp_host, or localhost by default. If available, it will use STARTTLS.
This alert requires one additional option:
email: An address or list of addresses to send the alert to.
Optional:
email_from_field: Use a field from the document that triggered the alert as the recipient. If the field cannot be found, the email value will be used as a default. Note that this field will not be available in every rule type; for example, if you have use_count_query or if it's type: flatline. You can optionally add a domain suffix to the field to generate the address using email_add_domain. For example, with the following settings:
email_from_field: "data.user"
email_add_domain: "@example.com"
and a match {"@timestamp": "2017", "data": {"foo": "bar", "user": "qlo"}}, an email would be sent to qlo@example.com
smtp_host: The SMTP host to use, defaults to localhost.
smtp_port: The port to use. Default is 25.
smtp_ssl: Connect to the SMTP host using TLS, defaults to false. If smtp_ssl is not used, ElastAlert will still attempt STARTTLS.
smtp_auth_file: The path to a file which contains SMTP authentication credentials. It should be YAML formatted and contain two fields, user and password. If this is not present, no authentication will be attempted.
email_reply_to: This sets the Reply-To header in the email. By default, the from address is ElastAlert@ and the domain will be set by the SMTP server.
from_addr: This sets the From header in the email. By default, the from address is ElastAlert@ and the domain will be set by the SMTP server.
cc: This adds the CC emails to the list of recipients. By default, this is left empty.
bcc: This adds the BCC emails to the list of recipients but does not show up in the email message. By default, this is left empty.
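A minimal email alert sketch combining the options above; the hostnames, file path, and addresses are hypothetical placeholders:

```yaml
# Hypothetical email alert configuration.
alert:
- email
email:
- "oncall@example.com"
cc:
- "team@example.com"
smtp_host: "mail.example.com"
smtp_port: 25
smtp_auth_file: "/etc/elastalert/smtp_auth.yaml"   # YAML with user/password
from_addr: "elastalert@example.com"
email_reply_to: "noreply@example.com"
```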
Jira
The JIRA alerter will open a ticket on JIRA whenever an alert is triggered. You must have a service account for ElastAlert to connect with. The credentials of the service account are loaded from a separate file. The ticket number will be written to the alert pipeline, and if it is followed by an email alerter, a link will be included in the email.
This alert requires four additional options:
jira_server: The hostname of the JIRA server.
jira_project: The project to open the ticket under.
jira_issuetype: The type of issue that the ticket will be filed as. Note that this is case sensitive.
jira_account_file: The path to the file which contains JIRA account credentials.
For an example JIRA account file, see example_rules/jira_acct.yaml. The account file is also YAML formatted and must contain two fields:
user: The username.
password: The password.
Optional:
jira_component: The name of the component or components to set the ticket to. This can be a single string or a list of strings. This is provided for backwards compatibility and will eventually be deprecated. It is preferable to use the plural jira_components instead.
jira_components: The name of the component or components to set the ticket to. This can be a single string or a list of strings.
jira_description: Similar to alert_text, this text is prepended to the JIRA description.
jira_label: The label or labels to add to the JIRA ticket. This can be a single string or a list of strings. This is provided for backwards compatibility and will eventually be deprecated. It is preferable to use the plural jira_labels instead.
jira_labels: The label or labels to add to the JIRA ticket. This can be a single string or a list of strings.
jira_priority: The index of the priority to set the issue to. In the JIRA dropdown for priorities, 0 would represent the first priority, 1 the second, etc.
jira_watchers: A list of user names to add as watchers on a JIRA ticket. This can be a single string or a list of strings.
jira_bump_tickets: If true, ElastAlert will search for existing tickets newer than jira_max_age and comment on the ticket with information about the alert instead of opening another ticket. ElastAlert finds the existing ticket by searching by summary. If the summary has changed or contains special characters, it may fail to find the ticket. If you are using a custom alert_subject, the two summaries must be exact matches, except that by setting jira_ignore_in_title, you can ignore the value of a field when searching. For example, if the custom subject is "foo occurred at bar", and "foo" is the value of field X in the match, you can set jira_ignore_in_title to "X" and it will only bump tickets with "bar" in the subject. Defaults to false.
jira_ignore_in_title: ElastAlert will attempt to remove the value for this field from the JIRA subject when searching for tickets to bump. See the jira_bump_tickets description above for an example.
jira_max_age: If jira_bump_tickets is true, the maximum age of a ticket, in days, such that ElastAlert will comment on the ticket instead of opening a new one. Default is 30 days.
jira_bump_not_in_statuses: If jira_bump_tickets is true, a list of statuses the ticket must not be in for ElastAlert to comment on the ticket instead of opening a new one. For example, to prevent comments being added to resolved or closed tickets, set this to 'Resolved' and 'Closed'. This option should not be set if the jira_bump_in_statuses option is set.
Example usage:
jira_bump_not_in_statuses:
- Resolved
- Closed
jira_bump_in_statuses: If jira_bump_tickets is true, a list of statuses the ticket must be in for ElastAlert to comment on the ticket instead of opening a new one. For example, to only comment on 'Open' tickets – and thus not 'In Progress', 'Analyzing', 'Resolved', etc. tickets – set this to 'Open'. This option should not be set if the jira_bump_not_in_statuses option is set.
Example usage:
jira_bump_in_statuses:
- Open
Arbitrary Jira fields:
ElastAlert supports setting any arbitrary JIRA field that your JIRA issue supports. For example, if you had a custom field called "Affected User", you can set it by providing that field name in snake_case prefixed with jira_. These fields can contain primitive strings or arrays of strings. Note that when you create a custom field in your JIRA server, internally the field is represented as customfield_1111. In ElastAlert, you may refer to either the public facing name OR the internal representation.
Example usage:
jira_arbitrary_singular_field: My Name
jira_arbitrary_multivalue_field:
- Name 1
- Name 2
jira_customfield_12345: My Custom Value
jira_customfield_9999:
- My Custom Value 1
- My Custom Value 2
OpsGenie
The OpsGenie alerter will create an alert which can be used to notify operations people of issues or log information. An OpsGenie API integration must be created in order to acquire the necessary opsgenie_key rule variable. Currently the OpsGenie alerter only creates an alert; however, it could be extended to update or close existing alerts.
It is necessary for the user to create an OpsGenie REST HTTPS API integration (via the integration page) in order to create alerts.
The OpsGenie alert requires one option:
opsgenie_key: The randomly generated API Integration key created by OpsGenie.
Optional:
opsgenie_account: The OpsGenie account to integrate with.
opsgenie_recipients: A list of OpsGenie recipients who will be notified by the alert.
opsgenie_teams: A list of OpsGenie teams to notify (useful for schedules with escalation).
opsgenie_tags: A list of tags for this alert.
opsgenie_message: Set the OpsGenie message to something other than the rule name. The message can be formatted with fields from the first match, e.g. "Error occurred for {app_name} at {timestamp}.".
opsgenie_alias: Set the OpsGenie alias. The alias can be formatted with fields from the first match, e.g. "{app_name} error".
SNS
The SNS alerter will send an SNS notification. The body of the notification is formatted the same as with other alerters. The SNS alerter uses boto3 and can use credentials in the rule yaml, in standard AWS credential and config files, or via environment variables. See http://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html for details.
SNS requires one option:
sns_topic_arn: The SNS topic's ARN. For example, arn:aws:sns:us-east-1:123456789:somesnstopic
Optional:
aws_access_key: An access key to connect to SNS with.
aws_secret_key: The secret key associated with the access key.
aws_region: The AWS region in which the SNS resource is located. Default is us-east-1.
profile: The AWS profile to use. If none is specified, the default will be used.
HipChat
The HipChat alerter will send a notification to a predefined HipChat room. The body of the notification is formatted the same as with other alerters.
The alerter requires the following two options:
hipchat_auth_token: The randomly generated notification token created by HipChat. Go to https://XXXXX.hipchat.com/account/api and use the 'Create new token' section, choosing 'Send notification' in the Scopes list.
hipchat_room_id: The id associated with the HipChat room you want to send the alert to. Go to https://XXXXX.hipchat.com/rooms and choose the room you want to post to. The room ID will be the numeric part of the URL.
Optional:
hipchat_msg_color: The color of the message background that is sent to HipChat. May be set to green, yellow or red. Default is red.
hipchat_domain: The custom domain, in case you have your own HipChat server deployment. Default is api.hipchat.com.
hipchat_ignore_ssl_errors: Ignore TLS errors (self-signed certificates, etc.). Default is false.
hipchat_proxy: By default ElastAlert will not use a network proxy to send notifications to HipChat. Set this option using hostname:port if you need to use a proxy.
hipchat_notify: When set to true, triggers a HipChat bell as if it were a user. Default is true.
hipchat_from: When humans post to HipChat, a timestamp appears next to their name. For bots, the name is the name of the token. The from value, instead of a timestamp, defaults to empty unless set, which you can do here. This is optional.
hipchat_message_format: Determines how the message is treated by HipChat and rendered inside HipChat applications. html - the message is rendered as HTML and receives no special treatment. Must be valid HTML and entities must be escaped (e.g. '&amp;' instead of '&'). May contain basic tags: a, b, i, strong, em, br, img, pre, code, lists, tables. text - the message is treated just like a message sent by a user. Can include @mentions, emoticons, pastes, and auto-detected URLs (Twitter, YouTube, images, etc.). Valid values: html, text. Defaults to 'html'.
Slack
The Slack alerter will send a notification to a predefined Slack channel. The body of the notification is formatted the same as with other alerters.
The alerter requires the following option:
slack_webhook_url: The webhook URL that includes your auth data and the ID of the channel (room) you want to post to. Go to the Incoming Webhooks section in your Slack account https://XXXXX.slack.com/services/new/incoming-webhook, choose the channel, click 'Add Incoming Webhooks Integration' and copy the resulting URL. You can use a list of URLs to send to multiple channels.
Optional:
slack_username_override: By default Slack will use your username when posting to the channel. Use this option to change it (free text).
slack_channel_override: Incoming webhooks have a default channel, but it can be overridden. A public channel can be specified with "#other-channel", and a direct message with "@username".
slack_emoji_override: By default ElastAlert will use the :ghost: emoji when posting to the channel. You can use a different emoji per ElastAlert rule. Any Apple emoji can be used, see http://emojipedia.org/apple/. If the slack_icon_url_override parameter is provided, the emoji is ignored.
slack_icon_url_override: By default ElastAlert will use the :ghost: emoji when posting to the channel. You can provide an icon_url to use a custom image. Provide the absolute address of the picture, for example: http://some.address.com/image.jpg.
slack_msg_color: By default the alert will be posted with the 'danger' color. You can also use 'good' or 'warning' colors.
slack_proxy: By default ElastAlert will not use a network proxy to send notifications to Slack. Set this option using hostname:port if you need to use a proxy.
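A minimal Slack alert sketch; the webhook URL and names are hypothetical placeholders:

```yaml
# Hypothetical Slack alert configuration.
alert:
- slack
slack_webhook_url: "https://hooks.slack.com/services/T00000000/B00000000/XXXXXXXX"
slack_username_override: "elastalert-bot"
slack_channel_override: "#alerts"
slack_msg_color: "warning"
```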
Telegram
The Telegram alerter will send a notification to a predefined Telegram username or channel. The body of the notification is formatted the same as with other alerters.
The alerter requires the following two options:
telegram_bot_token: The token is a string along the lines of 110201543:AAHdqTcvCH1vGWJxfSeofSAs0K5PALDsaw that will be required to authorize the bot and send requests to the Bot API. You can learn about obtaining tokens and generating new ones in this document: https://core.telegram.org/bots#botfather
telegram_room_id: Unique identifier for the target chat, or username of the target channel (in the format @channelusername)
Optional:
telegram_api_url: Custom domain to call the Telegram Bot API. Defaults to api.telegram.org
telegram_proxy: By default ElastAlert will not use a network proxy to send notifications to Telegram. Set this option using hostname:port if you need to use a proxy.
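A minimal Telegram alert sketch; the token and channel name are hypothetical placeholders:

```yaml
# Hypothetical Telegram alert configuration.
alert:
- telegram
telegram_bot_token: "110201543:AAHdqTcvCH1vGWJxfSeofSAs0K5PALDsaw"
telegram_room_id: "@mychannel"
```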
PagerDuty
The PagerDuty alerter will trigger an incident to a predefined PagerDuty service. The body of the notification is formatted the same as with other alerters.
The alerter requires the following options:
pagerduty_service_key: Integration Key generated after creating a service with the 'Use our API directly' option at Integration Settings.
pagerduty_client_name: The name of the monitoring client that is triggering this event.
Optional:
pagerduty_incident_key: If not set, PagerDuty will trigger a new incident for each alert sent. If set to a unique string per rule, PagerDuty will identify the incident that this event should be applied to. If there's no open (i.e. unresolved) incident with this key, a new one will be created. If there's already an open incident with a matching key, this event will be appended to that incident's log.
pagerduty_incident_key_args: If set, and pagerduty_incident_key is a formattable string, ElastAlert will format the incident key based on the provided array of fields from the rule or match.
pagerduty_proxy: By default ElastAlert will not use a network proxy to send notifications to PagerDuty. Set this option using hostname:port if you need to use a proxy.
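A minimal PagerDuty alert sketch; the service key and field names are hypothetical placeholders, and the incident key example assumes positional formatting as with alert_subject_args:

```yaml
# Hypothetical PagerDuty alert configuration.
alert:
- pagerduty
pagerduty_service_key: "0123456789abcdef0123456789abcdef"
pagerduty_client_name: "ElastAlert"
pagerduty_incident_key: "db-outage-{0}"   # deduplicate incidents per host
pagerduty_incident_key_args:
- hostname
```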
Exotel
Developers in India can use the Exotel alerter; it will trigger an incident to a mobile phone as SMS from your ExoPhone. The alert name along with the message body will be sent as an SMS.
The alerter requires the following options:
exotel_accout_sid: The SID of your Exotel account.
exotel_auth_token: Auth token associated with your Exotel account.
If you don't know how to find your account SID and auth token, refer to http://support.exotel.in/support/solutions/articles/3000023019-how-to-find-my-exotel-token-and-exotel-sid-
exotel_to_number: The phone number to which you would like to send the notification.
exotel_from_number: Your ExoPhone number from which the message will be sent.
The alerter has one optional argument:
exotel_message_body: The message you want to send in the SMS. If you don't specify this argument, only the rule name is sent.
Twilio
The Twilio alerter will trigger an incident to a mobile phone as SMS from your Twilio phone number. The alert name will arrive as an SMS once this option is chosen.
The alerter requires the following options:
twilio_accout_sid: The SID of your Twilio account.
twilio_auth_token: Auth token associated with your Twilio account.
twilio_to_number: The phone number to which you would like to send the notification.
twilio_from_number: Your Twilio phone number from which the message will be sent.
VictorOps
The VictorOps alerter will trigger an incident to a predefined VictorOps routing key. The body of the notification is formatted the same as with other alerters.
The alerter requires the following options:
victorops_api_key: API key generated under the 'REST Endpoint' in the Integrations settings.
victorops_routing_key: VictorOps routing key to route the alert to.
victorops_message_type: VictorOps field to specify severity level. Must be one of the following: INFO, WARNING, ACKNOWLEDGEMENT, CRITICAL, RECOVERY
Optional:
victorops_entity_display_name: Human-readable name of the alerting entity. Used by VictorOps to correlate incidents by host throughout the alert lifecycle.
victorops_proxy: By default ElastAlert will not use a network proxy to send notifications to VictorOps. Set this option using hostname:port if you need to use a proxy.
Gitter
The Gitter alerter will send a notification to a predefined Gitter channel. The body of the notification is formatted the same as with other alerters.
The alerter requires the following option:
gitter_webhook_url: The webhook URL that includes your auth data and the ID of the channel (room) you want to post to. Go to the Integration Settings of the channel https://gitter.im/ORGA/CHANNEL#integrations, click 'CUSTOM' and copy the resulting URL.
Optional:
gitter_msg_level: By default the alert will be posted with the 'error' level. You can use 'info' if you want the messages to be black instead of red.
gitter_proxy: By default ElastAlert will not use a network proxy to send notifications to Gitter. Set this option using hostname:port if you need to use a proxy.
ServiceNow
The ServiceNow alerter will create a new Incident in ServiceNow. The body of the notification is formatted the same as with other alerters.
The alerter requires the following options:
servicenow_rest_url: The ServiceNow REST API URL; this will look like https://instancename.service-now.com/api/now/v1/table/incident
username: The ServiceNow username to access the API.
password: The ServiceNow password to access the API.
short_description: The short description to set on the incident.
comments: Comments to be attached to the incident; this is the equivalent of work notes.
assignment_group: The group to assign the incident to.
category: The category to attach the incident to; use an existing category.
subcategory: The subcategory to attach the incident to; use an existing subcategory.
cmdb_ci: The configuration item to attach the incident to.
caller_id: The caller id (email address) of the user that created the incident ([email protected]).
Optional:
servicenow_proxy: By default ElastAlert will not use a network proxy to send notifications to ServiceNow. Set this option using hostname:port if you need to use a proxy.
Debug
The debug alerter will log the alert information using the Python logger at the info level. It is logged to a Python Logger object with the name elastalert that can easily be accessed using the getLogger command.
Stomp
This alert type will use the STOMP protocol in order to push a message to a broker like ActiveMQ or RabbitMQ. The message body is a JSON string containing the alert details. The default values will work with a pristine ActiveMQ installation.
Optional:
stomp_hostname: The STOMP host to use, defaults to localhost.
stomp_hostport: The STOMP port to use, defaults to 61613.
stomp_login: The STOMP login to use, defaults to admin.
stomp_password: The STOMP password to use, defaults to admin.
stomp_destination: The STOMP destination to use, defaults to /queue/ALERT
The stomp_destination field depends on the broker; the /queue/ALERT example is the nomenclature used by ActiveMQ. Each broker has its own logic.
Alerter
For all Alerter subclasses, you may reference values from a top-level rule property in your Alerter fields by referring to the property name surrounded by dollar signs. This can be useful when you have rule-level properties that you would like to reference many times in your alert. For example:
Example usage:
jira_priority: $priority$
jira_alert_owner: $owner$