|NO.Z.00011|Deployment|Hadoop&ElasticSearch Centralized Log Analysis System.v11|Elasticsearch.v11|Logstash
I. Filter plugins
Walter Savage Landor:strove with none,for none was worth my strife.Nature I loved and, next to Nature, Art:I warm'd both hands before the fire of life.It sinks, and I am ready to depart ——W.S.Landor
### --- Filter plugins
~~~ Logstash owes most of its power to its filter plugins;
~~~ by combining filters, we can turn raw input into the structured data we want.
~~~ Official documentation: https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html
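### --- grok regular expressions
~~~ grok patterns are a central part of Logstash; they make it easy to split a raw line into named fields that can then be indexed
~~~ # Syntax:
~~~ (?<field_name>pattern)
~~~ ?<field_name> names the field that receives the captured value; pattern is the regular expression to match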
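To make the named-capture idea concrete, here is a minimal Python sketch of the same technique. It is only an analogy: Python's `re` module spells a named group `(?P<name>...)` rather than grok's `(?<name>...)`, and the field name `date`, the helper `extract_fields`, and the sample lines are assumptions chosen for illustration.

```python
import re

# Python spells a named group (?P<name>...); grok uses (?<name>...).
# "date" is an illustrative field name, not something Logstash requires.
DATE_PATTERN = re.compile(r"(?P<date>\d+\.\d+)\s+")


def extract_fields(message: str) -> dict:
    """Return the named captures from the first match, or {} if none."""
    match = DATE_PATTERN.search(message)
    return match.groupdict() if match else {}


if __name__ == "__main__":
    print(extract_fields("11.11 sale starts"))
    print(extract_fields("no digits here"))
```

Each named group becomes one key in the result, which is roughly how a grok capture becomes one field on the Logstash event.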
II. Collect console input and extract the date
### --- Write the configuration file
~~~ # Create the configuration file
[root@hadoop02 ~]# vim /opt/yanqi/servers/es/Logstash/config/filter.conf
~~~ # Add the following settings
input { stdin {} }
filter {
  grok {
    match => { "message" => "(?<date>\d+\.\d+)\s+" }
  }
}
output { stdout { codec => rubydebug } }
### --- Validate the configuration file
~~~ # Validate the configuration file
[root@hadoop02 ~]# /opt/yanqi/servers/es/Logstash/bin/logstash \
-f /opt/yanqi/servers/es/Logstash/config/filter.conf -t
~~~ # Expected output
Configuration OK
Config Validation Result: OK. Exiting Logstash
### --- Start Logstash
~~~ # Start Logstash
[root@hadoop02 ~]# /opt/yanqi/servers/es/Logstash/bin/logstash \
-f /opt/yanqi/servers/es/Logstash/config/filter.conf
~~~ # Type a line at the console
11.11 神棍节!!
~~~ # Output:
{
"date" => "11.11",
"message" => "11.11 神棍节!!",
"@version" => "1",
"@timestamp" => 2021-11-26T09:06:02.387Z,
"host" => "hadoop02"
}
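III. Parse nginx logs with grok
### --- The typical nginx log format
~~~ nginx access logs like the one below are unstructured; after collecting them,
~~~ we would normally run a MapReduce or Spark job to clean them, that is, to turn the unstructured logs into structured records;
~~~ with large log volumes, that cleaning step takes a noticeable amount of time;
~~~ so we can use Logstash's grok filter instead to turn the unstructured nginx data into structured data at collection time:
~~~ # Sample input and its parsed output (details in section IV, step 6)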
36.157.150.1 - - [05/Nov/2019:12:59:28 +0800] "GET/phpmyadmin_8c1019c9c0de7a0f/js/get_scripts.js.php?scripts%5B%5D=jquery/jquery-1.11.1.min.js&scripts%5B%5D=sprintf.js&scripts%5B%5D=ajax.js&scripts%5B%5D=keyhandler.js&scripts%5B%5D=jquery/jquery-ui-1.11.2.min.js&scripts%5B%5D=jquery/jquery.cookie.js&scripts%5B%5D=jquery/jquery.mousewheel.js&scripts%5B%5D=jquery/jquery.event.drag-2.2.js&scripts%5B%5D=jquery/jquery-ui-timepickeraddon.js&scripts%5B%5D=jquery/jquery.ba-hashchange-1.3.js HTTP/1.1" 200 139613 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"
~~~ # Output
{
"time_local" => "05/Nov/2019:12:59:28 +0800",
"@version" => "1",
"host" => "hadoop02",
"message" => "36.157.150.1 - - [05/Nov/2019:12:59:28 +0800] \"GET/phpmyadmin_8c1019c9c0de7a0f/js/get_scripts.js.php?scripts%5B%5D=jquery/jquery-1.11.1.min.js&scripts%5B%5D=sprintf.js&scripts%5B%5D=ajax.js&scripts%5B%5D=keyhandler.js&scripts%5B%5D=jquery/jquery-ui-1.11.2.min.js&scripts%5B%5D=jquery/jquery.cookie.js&scripts%5B%5D=jquery/jquery.mousewheel.js&scripts%5B%5D=jquery/jquery.event.drag-2.2.js&scripts%5B%5D=jquery/jquery-ui-timepickeraddon.js&scripts%5B%5D=jquery/jquery.ba-hashchange-1.3.js HTTP/1.1\" 200 139613 \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36\"",
"rawrequest" => "GET/phpmyadmin_8c1019c9c0de7a0f/js/get_scripts.js.php?scripts%5B%5D=jquery/jquery-1.11.1.min.js&scripts%5B%5D=sprintf.js&scripts%5B%5D=ajax.js&scripts%5B%5D=keyhandler.js&scripts%5B%5D=jquery/jquery-ui-1.11.2.min.js&scripts%5B%5D=jquery/jquery.cookie.js&scripts%5B%5D=jquery/jquery.mousewheel.js&scripts%5B%5D=jquery/jquery.event.drag-2.2.js&scripts%5B%5D=jquery/jquery-ui-timepickeraddon.js&scripts%5B%5D=jquery/jquery.ba-hashchange-1.3.js HTTP/1.1",
"@timestamp" => 2021-11-26T09:40:40.657Z,
"clientip" => "36.157.150.1",
"http_referer" => "\"-\"",
"status" => "200",
"body_bytes_sent" => "139613",
"agent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36\""
}
IV. Install the grok plugin online
### --- Install the grok plugin online
~~~ # Change the gem source
[root@hadoop02 ~]# vim /opt/yanqi/servers/es/Logstash/Gemfile
~~~ # Set lines 4 and 5 as follows
# source "https://rubygems.org" # comment out this source
source "https://gems.ruby-china.com/" # use this China mirror instead
### --- Run the online installation
~~~ # Install the grok plugin online
[root@hadoop02 ~]# cd /opt/yanqi/servers/es/Logstash/
[root@hadoop02 Logstash]# bin/logstash-plugin install logstash-filter-grok
~~~ # Output
Validating logstash-filter-grok
Installing logstash-filter-grok
Installation successful
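### --- Write the Logstash configuration file
~~~ # Define the Logstash configuration below: nginx log lines are read from the console and passed through the filter, which converts them into a standard structured format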
[root@hadoop02 ~]# vim /opt/yanqi/servers/es/Logstash/config/monitor_nginx.conf
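~~~ # Add the following settings
~~~ # Note: the optional group below has no space before "HTTP/", so the method/request branch cannot match a request line such as "GET /path HTTP/1.1"; the whole request is captured as rawrequest instead, as the output in step 6 shows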
input {stdin{}}
filter {
grok {
match => {"message" => "%{IPORHOST:clientip} \- \- \[%{HTTPDATE:time_local}\] \"(?:%{WORD:method} %{NOTSPACE:request}(?:HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:status} %{NUMBER:body_bytes_sent} %{QS:http_referer} %{QS:agent}"
}
}
}
output {stdout{codec => rubydebug}}
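The request part of this expression is an alternation: either `method` plus `request` (plus an optional HTTP version), or a catch-all `rawrequest`. The Python sketch below mimics that alternation to show when each branch wins. It is a simplified stand-in under stated assumptions: the sub-patterns for `IPORHOST`, `HTTPDATE`, `NUMBER` and `QS` are rough approximations rather than the real grok definitions, Python's `re` is not the engine Logstash uses, and the `space_before_http` switch, the helper names, and the sample line are hypothetical.

```python
import re


def build_pattern(space_before_http: bool) -> re.Pattern:
    """Build a simplified stand-in for the grok expression above."""
    sep = " " if space_before_http else ""
    return re.compile(
        r'(?P<clientip>\S+) - - \[(?P<time_local>[^\]]+)\] '
        # First branch: method, request, optional HTTP version.
        r'"(?:(?P<method>\w+) (?P<request>\S+)'
        r'(?:' + sep + r'HTTP/(?P<httpversion>[\d.]+))?'
        # Fallback branch: keep the whole request line in one field.
        r'|(?P<rawrequest>.*?))" '
        r'(?P<status>\d+) (?P<body_bytes_sent>\d+) '
        r'(?P<http_referer>"[^"]*") (?P<agent>"[^"]*")'
    )


def parse(line: str, space_before_http: bool) -> dict:
    """Return the captured fields, dropping groups that did not match."""
    match = build_pattern(space_before_http).match(line)
    if match is None:
        return {}
    return {k: v for k, v in match.groupdict().items() if v is not None}


if __name__ == "__main__":
    # Hypothetical sample line, shortened for readability.
    line = ('1.2.3.4 - - [05/Nov/2019:12:59:28 +0800] '
            '"GET /index.html HTTP/1.1" 200 512 "-" "curl/7.0"')
    print(sorted(parse(line, space_before_http=False)))
    print(sorted(parse(line, space_before_http=True)))
```

In this sketch, the variant without the separator falls through to `rawrequest`, while the variant with a space before `HTTP/` yields `method`, `request` and `httpversion`. If separate request fields are wanted in Logstash, adding that space to the grok expression is the likely fix, though it is worth confirming with the `-t` check and a sample line.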
### --- Validate the configuration file
~~~ # Validate the configuration file
[root@hadoop02 ~]# /opt/yanqi/servers/es/Logstash/bin/logstash \
-f /opt/yanqi/servers/es/Logstash/config/monitor_nginx.conf -t
~~~ # Output
Configuration OK
Config Validation Result: OK. Exiting Logstash
### --- Start Logstash
~~~ # Run the following command to start Logstash
[root@hadoop02 ~]# /opt/yanqi/servers/es/Logstash/bin/logstash \
-f /opt/yanqi/servers/es/Logstash/config/monitor_nginx.conf
~~~ # Output: see the parsed events in step 6 (console input) below
### --- Enter nginx log data at the console
~~~ # Enter the first log line
113.31.119.183 - - [05/Nov/2019:12:59:27 +0800] "GET /phpmyadmin_8c1019c9c0de7a0f/js/messages.php? lang=zh_CN&db=&collation_connection=utf8_unicode_ci&token=6a44d72481633c90bffcfd42f11e25a1 HTTP/1.1" 200 8131 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"
~~~ # Output
{
"time_local" => "05/Nov/2019:12:59:27 +0800",
"@version" => "1",
"host" => "hadoop02",
"message" => "113.31.119.183 - - [05/Nov/2019:12:59:27 +0800] \"GET /phpmyadmin_8c1019c9c0de7a0f/js/messages.php? lang=zh_CN&db=&collation_connection=utf8_unicode_ci&token=6a44d72481633c90bffcfd42f11e25a1 HTTP/1.1\" 200 8131 \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36\"",
"rawrequest" => "GET /phpmyadmin_8c1019c9c0de7a0f/js/messages.php? lang=zh_CN&db=&collation_connection=utf8_unicode_ci&token=6a44d72481633c90bffcfd42f11e25a1 HTTP/1.1",
"@timestamp" => 2021-11-26T09:35:04.242Z,
"clientip" => "113.31.119.183",
"http_referer" => "\"-\"",
"status" => "200",
"body_bytes_sent" => "8131",
"agent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36\""
}
~~~ # Enter the second log line
36.157.150.1 - - [05/Nov/2019:12:59:28 +0800] "GET /phpmyadmin_8c1019c9c0de7a0f/js/get_scripts.js.php?scripts%5B%5D=jquery/jquery-1.11.1.min.js&scripts%5B%5D=sprintf.js&scripts%5B%5D=ajax.js&scripts%5B%5D=keyhandler.js&scripts%5B%5D=jquery/jquery-ui-1.11.2.min.js&scripts%5B%5D=jquery/jquery.cookie.js&scripts%5B%5D=jquery/jquery.mousewheel.js&scripts%5B%5D=jquery/jquery.event.drag-2.2.js&scripts%5B%5D=jquery/jquery-ui-timepickeraddon.js&scripts%5B%5D=jquery/jquery.ba-hashchange-1.3.js HTTP/1.1" 200 139613 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36"
~~~ # Output
{
"time_local" => "05/Nov/2019:12:59:28 +0800",
"@version" => "1",
"host" => "hadoop02",
"message" => "36.157.150.1 - - [05/Nov/2019:12:59:28 +0800] \"GET /phpmyadmin_8c1019c9c0de7a0f/js/get_scripts.js.php?scripts%5B%5D=jquery/jquery-1.11.1.min.js&scripts%5B%5D=sprintf.js&scripts%5B%5D=ajax.js&scripts%5B%5D=keyhandler.js&scripts%5B%5D=jquery/jquery-ui-1.11.2.min.js&scripts%5B%5D=jquery/jquery.cookie.js&scripts%5B%5D=jquery/jquery.mousewheel.js&scripts%5B%5D=jquery/jquery.event.drag-2.2.js&scripts%5B%5D=jquery/jquery-ui-timepickeraddon.js&scripts%5B%5D=jquery/jquery.ba-hashchange-1.3.js HTTP/1.1\" 200 139613 \"-\" \"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36\"",
"rawrequest" => "GET /phpmyadmin_8c1019c9c0de7a0f/js/get_scripts.js.php?scripts%5B%5D=jquery/jquery-1.11.1.min.js&scripts%5B%5D=sprintf.js&scripts%5B%5D=ajax.js&scripts%5B%5D=keyhandler.js&scripts%5B%5D=jquery/jquery-ui-1.11.2.min.js&scripts%5B%5D=jquery/jquery.cookie.js&scripts%5B%5D=jquery/jquery.mousewheel.js&scripts%5B%5D=jquery/jquery.event.drag-2.2.js&scripts%5B%5D=jquery/jquery-ui-timepickeraddon.js&scripts%5B%5D=jquery/jquery.ba-hashchange-1.3.js HTTP/1.1",
"@timestamp" => 2021-11-26T09:35:28.894Z,
"clientip" => "36.157.150.1",
"http_referer" => "\"-\"",
"status" => "200",
"body_bytes_sent" => "139613",
"agent" => "\"Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36\""
}
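In the events above every captured value is a string, and the two quoted-string fields keep their surrounding double quotes. As a rough illustration of what a downstream consumer might do with such an event, the Python sketch below casts the numeric fields, strips the quotes, and parses `time_local`. The function name `normalize_event` and the choice of target types are assumptions, not part of the original setup, and `%b` relies on an English locale for month names such as `Nov`.

```python
from datetime import datetime


def normalize_event(event: dict) -> dict:
    """Return a copy of a grok-style event with typed values."""
    typed = dict(event)
    # The captures shown above arrive as strings; cast the numeric ones.
    for key in ("status", "body_bytes_sent"):
        if key in typed:
            typed[key] = int(typed[key])
    # The quoted-string captures keep their double quotes; drop them.
    for key in ("http_referer", "agent"):
        if key in typed:
            typed[key] = typed[key].strip('"')
    # %b assumes English month abbreviations; %z parses offsets like +0800.
    if "time_local" in typed:
        typed["time_local"] = datetime.strptime(
            typed["time_local"], "%d/%b/%Y:%H:%M:%S %z"
        )
    return typed


if __name__ == "__main__":
    sample = {
        "time_local": "05/Nov/2019:12:59:28 +0800",
        "status": "200",
        "body_bytes_sent": "139613",
        "http_referer": '"-"',
    }
    print(normalize_event(sample))
```

Doing this kind of typing inside the pipeline itself is usually preferable to post-processing; the sketch only shows which fields would benefit from it.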