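Logstash grok filter – dynamically named fields

I have log lines in the following format and want to extract the fields:

[field1: content1] [field2: content2] [field3: content3] ...

I know neither the field names nor the number of fields.

I tried backreferences and the sprintf format, but without success: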

match => [ "message", "(?:\[(\w+): %{DATA:\k<-1>}\])+" ] # not working
match => [ "message", "(?:\[%{WORD:fieldname}: %{DATA:%{fieldname}}\])+" ] # not working

This seems to work for one field only, not for more:

match => [ "message", "(?:\[%{WORD:field}: %{DATA:content}\] ?)+" ]
add_field => { "%{field}" => "%{content}" }
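
A plain-Ruby sketch of why this probably happens, assuming grok behaves like the Oniguruma-style regex engine it is built on: a capture group inside a repeated group keeps only the value from its last iteration, so `field` and `content` end up holding a single pair. The method names and patterns below are illustrative stand-ins for the grok pattern, not Logstash code.

```ruby
# Assumption: these regexes only approximate %{WORD} and %{DATA} from grok.

# A repeated group overwrites its captures on every iteration,
# so only the last name/content pair is left after the match.
def last_pair_only(line)
  m = line.match(/(?:\[(?<field>\w+): (?<content>.*?)\] ?)+/)
  [m[:field], m[:content]]
end

# scan applies the unrepeated pattern over and over and returns every pair.
def all_pairs(line)
  line.scan(/\[(\w+): (.*?)\]/)
end

line = '[field1: content1] [field2: content2] [field3: content3]'
p last_pair_only(line) # => ["field3", "content3"]
p all_pairs(line)      # => [["field1", "content1"], ["field2", "content2"], ["field3", "content3"]]
```

Collecting every pair therefore needs something that iterates over the matches, which a single grok match does not appear to offer.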

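The kv filter is not suitable either, because the field contents may contain spaces.

Is there any plugin or strategy that solves this problem?

Best answer
The Logstash ruby filter plugin can help you. 🙂

Here is the configuration: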

input {
    stdin {}
}

filter {
    ruby {
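        code => "
            # Split the line into '[name: content' chunks at the '] [' separators.
            # (Logstash 5.0+ needs event.get('message') and event.set(name, value) here.)
            event['message'].split('] [').each do |field|
                # Strip the remaining brackets, then split name from content once,
                # so that content containing ': ' is kept intact.
                name, content = field.delete('[').delete(']').split(': ', 2)
                event[name] = content
            end
        "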
    }
}

output {
    stdout {
        codec => rubydebug
    }
}

With your log line:

[field1: content1] [field2: content2] [field3: content3]

This is the output:

{
       "message" => "[field1: content1] [field2: content2] [field3: content3]",
      "@version" => "1",
    "@timestamp" => "2014-07-07T08:49:28.543Z",
          "host" => "abc",
        "field1" => "content1",
        "field2" => "content2",
        "field3" => "content3"
}

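I tried it with 4 fields and it works too.

Note that event in the ruby code is the Logstash event. You can use it to access all event fields, such as message, @timestamp, etc.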
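
To try the parsing logic outside Logstash, here is a minimal standalone sketch. The helper name `parse_bracket_fields` is hypothetical, and the sketch assumes that field names match `\w+` and that contents never contain a `]`. It uses `scan` rather than the split-based approach in the configuration, so it is a variation, not the same code.

```ruby
# Hypothetical helper: turn one log line into a name => content hash.
# Assumes names match \w+ and contents contain no closing bracket.
def parse_bracket_fields(line)
  # Each match yields a [name, content] pair; to_h turns the pairs into a hash.
  line.scan(/\[(\w+): ([^\]]*)\]/).to_h
end

fields = parse_bracket_fields('[field1: content with spaces] [field2: a: b] [field3: content3]')
fields.each { |name, value| puts "#{name} => #{value}" }
# Prints:
# field1 => content with spaces
# field2 => a: b
# field3 => content3
```

Inside a ruby filter on Logstash 5.0 or later, where fields are read with event.get and written with event.set, the resulting hash could be copied onto the event with something like `fields.each { |k, v| event.set(k, v) }`.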

Enjoy!!!
