Container Monitoring in Practice: node-exporter

VPointer, published 2019-06-28 17:58 / 1357 reads


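Overview

Prometheus joined the CNCF in 2016 and graduated in August 2018. It has since become the de facto standard monitoring solution for Kubernetes. The next few articles in this series take a detailed look at Prometheus (2.x).

Prometheus can collect data from the various components of a Kubernetes cluster, such as the cAdvisor built into the kubelet and the api-server. node-exporter is one such source.

"Exporter" is the general term for a class of Prometheus data-collection components. An exporter gathers data from its target and converts it into a format Prometheus understands. Unlike traditional collection agents, it does not push data to a central server; it waits for the central server to come and scrape it. For node-exporter, the default scrape address is http://CURRENT_IP:9100/metrics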

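node-exporter collects host-level runtime metrics, including basic monitoring data such as loadavg, filesystem, and meminfo. It plays a role similar to zabbix-agent in traditional host monitoring.

node-exporter is provided and maintained by the Prometheus project. It is not bundled with Prometheus, but it is practically a must-have exporter.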

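Features

node-exporter exposes hardware and operating system metrics provided by *NIX kernels.

On Windows, use the WMI exporter instead.

To collect NVIDIA GPU metrics, use prometheus-dcgm.

Collector support differs across *NIX operating systems, for example:

diskstats: supported on Darwin, Linux

cpu: supported on Darwin, Dragonfly, FreeBSD, Linux, Solaris, and others

For details, see: node_exporter

The set of collectors node_exporter runs can be chosen on the command line. Versions before 0.15 take a --collectors.enabled flag; from 0.15 on, each collector is enabled with --collector.<name> and disabled with --no-collector.<name>. If nothing is specified, the default set is used.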

Deployment

Binary deployment:

Download: from https://github.com/prometheus...

Extract the archive: tar -xvzf **.tar.gz

Run it: ./node_exporter

Run ./node_exporter -h to view the help:

usage: node_exporter [<flags>]

Flags:
  -h, --help
  --collector.diskstats.ignored-devices
  --collector.filesystem.ignored-mount-points
  --collector.filesystem.ignored-fs-types      
  --collector.netdev.ignored-devices      
  --collector.netstat.fields      
  --collector.ntp.server="127.0.0.1"
  .....

Once ./node_exporter is running, open http://${IP}:9100/metrics to see the list of metrics it exposes.

Docker installation:

docker run -d \
  --net="host" \
  --pid="host" \
  -v "/:/host:ro,rslave" \
  quay.io/prometheus/node-exporter \
  --path.rootfs /host

Installing in k8s:

The node-exporter.yaml file:

apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/scrape: "true"
  labels:
    app: node-exporter
    name: node-exporter
  name: node-exporter
spec:
  clusterIP: None
  ports:
  - name: scrape
    port: 9100
    protocol: TCP
  selector:
    app: node-exporter
  type: ClusterIP
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
      name: node-exporter
    spec:
      containers:
      - image: registry.cn-hangzhou.aliyuncs.com/tryk8s/node-exporter:latest
        name: node-exporter
        ports:
        - containerPort: 9100
          hostPort: 9100
          name: scrape
      hostNetwork: true
      hostPID: true

kubectl create -f node-exporter.yaml

This creates a DaemonSet and a Service object. After deployment, the Prometheus configuration has to be updated so that Prometheus can scrape monitoring data from node exporter. Edit prometheus.yml and add the following under the scrape_configs section:

scrape_configs:
  # Scrape node exporter monitoring data
  - job_name: "node"
    static_configs:
      - targets: ["localhost:9100"]

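Alternatively, Prometheus can discover the Service's metrics endpoint automatically through the prometheus.io/scrape: "true" annotation. With Kubernetes service discovery, a relabel rule matches on the meta label generated from that annotation: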

- source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]

Once the configuration is in place, restart Prometheus and the corresponding metrics will appear.

Viewing metrics:

Directly:

With a binary or Docker deployment, once node-exporter is running you can open: http://${IP}:9100/metrics

The output has the following format and contains every metric node-exporter exposes:

# HELP go_gc_duration_seconds A summary of the GC invocation durations.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 6.1872e-05
go_gc_duration_seconds{quantile="0.25"} 0.000119463
go_gc_duration_seconds{quantile="0.5"} 0.000151156
go_gc_duration_seconds{quantile="0.75"} 0.000198764
go_gc_duration_seconds{quantile="1"} 0.009889647
go_gc_duration_seconds_sum 0.257232201
go_gc_duration_seconds_count 1187

# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="guest"} 0
node_cpu{cpu="cpu0",mode="guest_nice"} 0
node_cpu{cpu="cpu0",mode="idle"} 68859.19
node_cpu{cpu="cpu0",mode="iowait"} 167.22
node_cpu{cpu="cpu0",mode="irq"} 0
node_cpu{cpu="cpu0",mode="nice"} 19.92
node_cpu{cpu="cpu0",mode="softirq"} 17.05
node_cpu{cpu="cpu0",mode="steal"} 28.1
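To make the format of this output concrete, here is a minimal sketch of a parser for it in Go. It is an illustration under simplifying assumptions, not the parser Prometheus uses: the names Sample, parseLine, and parseText are invented here, and the code assumes label values contain no commas or escaped quotes and that no timestamp follows the value.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// Sample is one parsed line: metric name, label set, and value.
type Sample struct {
	Name   string
	Labels map[string]string
	Value  float64
}

// parseLine handles the simple cases shown in the output above.
func parseLine(line string) (Sample, error) {
	s := Sample{Labels: map[string]string{}}
	idx := strings.LastIndex(line, " ")
	if idx < 0 {
		return s, fmt.Errorf("no value in %q", line)
	}
	v, err := strconv.ParseFloat(line[idx+1:], 64)
	if err != nil {
		return s, err
	}
	s.Value = v
	series := line[:idx]
	open := strings.Index(series, "{")
	if open < 0 {
		s.Name = series // a metric without labels
		return s, nil
	}
	if !strings.HasSuffix(series, "}") {
		return s, fmt.Errorf("unterminated label set in %q", line)
	}
	s.Name = series[:open]
	for _, pair := range strings.Split(series[open+1:len(series)-1], ",") {
		if pair == "" {
			continue
		}
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			return s, fmt.Errorf("bad label %q", pair)
		}
		s.Labels[kv[0]] = strings.Trim(kv[1], `"`)
	}
	return s, nil
}

// parseText skips blank lines and the "# HELP" / "# TYPE" comment lines.
func parseText(text string) ([]Sample, error) {
	var out []Sample
	sc := bufio.NewScanner(strings.NewReader(text))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		s, err := parseLine(line)
		if err != nil {
			return nil, err
		}
		out = append(out, s)
	}
	return out, sc.Err()
}

func main() {
	text := `# HELP node_cpu Seconds the cpus spent in each mode.
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 68859.19
go_gc_duration_seconds_count 1187
`
	samples, err := parseText(text)
	if err != nil {
		panic(err)
	}
	for _, s := range samples {
		fmt.Println(s.Name, s.Labels, s.Value)
	}
}
```

Each sample line is a metric name, an optional label set in braces, and a numeric value. The HELP and TYPE comments carry the description and the metric type.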

In Prometheus:

Names such as go_gc_duration_seconds and node_cpu are metric names. If you run Prometheus, you can search for these metrics on the page at http://${IP}:9090/

Commonly used metrics include:

node_cpu: system CPU usage (cumulative seconds per mode)
node_disk*: disk I/O
node_filesystem*: filesystem usage
node_load1: system load (1-minute average)
node_memory*: memory usage
node_network*: network traffic
node_time: current system time
go_*: Go runtime metrics of node exporter
process_*: metrics about the node exporter process itself
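Because node_cpu is a counter of cumulative seconds rather than a percentage, CPU usage has to be derived from the difference between two scrapes. The following Go sketch shows the arithmetic, busy = 1 - Δidle / Δtotal. The type cpuSnapshot and the function utilization are hypothetical names, the user and system values in main are made up, and only the idle mode is treated as idle time, which is one possible choice.

```go
package main

import "fmt"

// cpuSnapshot holds cumulative seconds per mode, as node_cpu reports them.
type cpuSnapshot map[string]float64

// utilization returns the busy fraction (0..1) between two snapshots.
// ok is false if no time elapsed or a counter went backwards (e.g. after a reboot).
func utilization(prev, cur cpuSnapshot) (busy float64, ok bool) {
	var total float64
	for mode, v := range cur {
		d := v - prev[mode]
		if d < 0 {
			return 0, false
		}
		total += d
	}
	if total == 0 {
		return 0, false
	}
	idle := cur["idle"] - prev["idle"]
	return 1 - idle/total, true
}

func main() {
	// The idle and iowait values echo the sample output; the rest are invented.
	prev := cpuSnapshot{"idle": 68859.19, "iowait": 167.22, "user": 1200, "system": 300}
	cur := cpuSnapshot{"idle": 68868.19, "iowait": 167.22, "user": 1200.75, "system": 300.25}
	if u, ok := utilization(prev, cur); ok {
		fmt.Printf("cpu busy: %.1f%%\n", u*100)
	}
}
```

In Prometheus itself the same calculation is normally expressed as a rate over the counter in a query instead of being done by hand.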

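In Grafana:

Prometheus ships with its own web UI, but it is usually paired with the more specialized Grafana for visualization. Grafana has many dashboard templates that present metrics in a friendlier way, such as Node Exporter for Prometheus.

Configure the variables in Grafana and import the template to get a ready-made dashboard.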

A Deeper Look

node-exporter is one of the exporters officially recommended by Prometheus. Others include:

HAProxy exporter

Collectd exporter

SNMP exporter

MySQL server exporter

....

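The officially recommended exporters all live under https://github.com/prometheus. The exporter listing page also includes many third-party exporters developed and published by individuals or organizations. If you have custom collection needs, you can write your own exporter; for a concrete example, see the later [Custom Exporter] article in this series.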

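Version Issues

Because node_exporter is a fairly old component, some best practices were not merged into it for a long time, such as conforming to the Prometheus naming conventions (https://prometheus.io/docs/pr...). Version 0.16 renamed many metrics to follow them. As of January 2019 the latest version is 0.17.

Some of the metric name changes (see the detailed comparison):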

* node_cpu ->  node_cpu_seconds_total
* node_memory_MemTotal -> node_memory_MemTotal_bytes
* node_memory_MemFree -> node_memory_MemFree_bytes
* node_filesystem_avail -> node_filesystem_avail_bytes
* node_filesystem_size -> node_filesystem_size_bytes
* node_disk_io_time_ms -> node_disk_io_time_seconds_total
* node_disk_reads_completed -> node_disk_reads_completed_total
* node_disk_sectors_written -> node_disk_written_bytes_total
* node_time -> node_time_seconds
* node_boot_time -> node_boot_time_seconds
* node_intr -> node_intr_total

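There are two ways to deal with the version issue:

The first is to run both versions of node-exporter on the machine and have Prometheus scrape both.

The second is to use a metric converter, which translates old metric names into the new ones.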
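The idea behind such a converter can be sketched in Go as a lookup table. This is an illustration only, not the actual converter tool: the names rename, renames, and convert are invented, only a few of the renamed metrics are listed, and the 512-byte sector size is an assumption based on the unit Linux uses in /proc/diskstats. Some renames also change the unit, so the table carries a scale factor next to the new name.

```go
package main

import "fmt"

// rename describes how one old-style metric maps onto its new name.
// Scale converts the value where the rename also changed the unit.
type rename struct {
	NewName string
	Scale   float64
}

// Assumption: one sector is 512 bytes.
const sectorBytes = 512

// renames covers a subset of the name changes listed above.
var renames = map[string]rename{
	"node_cpu":                  {"node_cpu_seconds_total", 1},
	"node_memory_MemTotal":      {"node_memory_MemTotal_bytes", 1},
	"node_filesystem_avail":     {"node_filesystem_avail_bytes", 1},
	"node_disk_io_time_ms":      {"node_disk_io_time_seconds_total", 0.001},
	"node_disk_sectors_written": {"node_disk_written_bytes_total", sectorBytes},
	"node_time":                 {"node_time_seconds", 1},
	"node_boot_time":            {"node_boot_time_seconds", 1},
	"node_intr":                 {"node_intr_total", 1},
}

// convert returns the new name and value; unknown names pass through unchanged.
func convert(name string, value float64) (string, float64) {
	if r, ok := renames[name]; ok {
		return r.NewName, value * r.Scale
	}
	return name, value
}

func main() {
	n, v := convert("node_disk_sectors_written", 8)
	fmt.Println(n, v)
	n, v = convert("node_load1", 0.42) // not renamed, passes through
	fmt.Println(n, v)
}
```

Keeping the mapping in data rather than code makes it easy to extend the table as more renamed metrics are needed.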

For Grafana, look for a dashboard template that supports both sets of metric names.

Collector

The entry point of node-exporter's collector package (excerpt):

// Package collector includes all individual collectors to gather and export system metrics.
package collector

import (
    "fmt"
    "sync"
    "time"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/common/log"
    "gopkg.in/alecthomas/kingpin.v2"
)

// Namespace defines the common namespace to be used by all metrics.
const namespace = "node"

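As the imports show, an exporter implementation pulls in the github.com/prometheus/client_golang/prometheus library. client_golang is the official Prometheus Go library. It can be used both to instrument existing applications and as a base library for talking to the Prometheus HTTP API.

For example, it defines the basic metric types and their methods:

Counter: tracks monotonically increasing values such as event counts
Gauge: tracks a current state that can go up or down, such as the number of database connections
Histogram: samples observations, such as response latency, into configurable buckets
Summary: samples observations like a Histogram, but reports quantiles computed on the client side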

// Map the metric family's type to the matching client_golang value type.
switch metricType {
case dto.MetricType_COUNTER:
    valType = prometheus.CounterValue
    val = metric.Counter.GetValue()

case dto.MetricType_GAUGE:
    valType = prometheus.GaugeValue
    val = metric.Gauge.GetValue()

case dto.MetricType_UNTYPED:
    valType = prometheus.UntypedValue
    val = metric.Untyped.GetValue()

// (excerpt: handling of the remaining metric types is omitted)
}
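The semantics of the metric types listed above can be sketched with the Go standard library alone. These toy types are not the client_golang API: the names Counter, Gauge, Histogram, and NewHistogram are reused here only for illustration, the code is not safe for concurrent use, and Summary is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// Counter only ever goes up.
type Counter struct{ v float64 }

// Add rejects negative deltas, which would break monotonicity.
func (c *Counter) Add(d float64) {
	if d < 0 {
		panic("counter cannot decrease")
	}
	c.v += d
}

// Gauge can move in either direction.
type Gauge struct{ v float64 }

func (g *Gauge) Set(v float64) { g.v = v }

// Histogram counts observations into cumulative buckets:
// counts[i] is the number of observations <= bounds[i].
type Histogram struct {
	bounds []float64 // sorted upper bounds
	counts []uint64
	sum    float64
	count  uint64 // total observations (the implicit +Inf bucket)
}

func NewHistogram(bounds ...float64) *Histogram {
	b := append([]float64(nil), bounds...)
	sort.Float64s(b)
	return &Histogram{bounds: b, counts: make([]uint64, len(b))}
}

func (h *Histogram) Observe(v float64) {
	// Find the first bucket whose bound is >= v; it and all larger buckets include v.
	for i := sort.SearchFloat64s(h.bounds, v); i < len(h.counts); i++ {
		h.counts[i]++
	}
	h.sum += v
	h.count++
}

func main() {
	requests := &Counter{}
	requests.Add(1)

	connections := &Gauge{}
	connections.Set(12)
	connections.Set(9) // a gauge may drop

	latency := NewHistogram(0.1, 0.5, 1)
	for _, v := range []float64{0.05, 0.3, 1.2} {
		latency.Observe(v)
	}
	fmt.Println(requests.v, connections.v, latency.counts, latency.count)
}
```

The cumulative buckets are what allow quantiles to be estimated on the server side later, whereas a Summary computes its quantiles inside the instrumented process.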

For a detailed analysis of the client_golang library, see: theory-source-code

This article is part of the Container Monitoring in Practice series. The full content is at: container-monitor-book
