
Zabbix 3.0 Administrator

WWW.YOURCOMPANY.COM
ABOUT US
AGENDA
❖ Advanced data collection - Java monitoring (JMX), database monitoring, SSH/Telnet checks.
❖ Monitoring of VMWare - vCenter, vSphere, discovery of VMs and hypervisors.
❖ User macros functionality.
❖ Advanced topics - Zabbix server processes, front-end customization, configuration parameters of Zabbix daemons, protocols.
❖ Low level discovery - discovery of host resources, Windows services, discovery using SNMP and SQL, also writing your own LLD rules.
❖ Distributed monitoring - Active/Passive proxies, performance considerations.

AGENDA
❖ Advanced problem detection - detecting anomalies, elimination of flapping.
❖ Advanced trigger function - trend prediction, percentile.
❖ Zabbix API - functionality, examples (Introduction Kibana/Grafana, module Python).
❖ Scaling and performance tuning - how to monitor millions of metrics and millions of triggers.
❖ Problems, challenges and solutions - integration with 3rd party systems, performance, maintenance issues.
❖ High availability and redundancy.

JAVA MONITORING

MONITORING JAVA APPLICATIONS

[Diagram: Zabbix server, Zabbix Java gateway, and the monitored Java applications (Apache Tomcat, GlassFish)]

ZABBIX 3.0 -
ADMINISTRATOR 6
www.nubiral.com
JAVA GATEWAY

❖ Requires Java

❖ Can run on the same or separate system

❖ Is polled by the server

❖ Bundles requests for performance reasons

MULTIPLE AND REMOTE GATEWAYS

[Diagram: the Zabbix server and each Zabbix proxy use their own Zabbix Java gateway to monitor Java applications (Apache Tomcat, GlassFish)]

ODBC MONITORING

GET DATA FROM EXTERNAL DATABASE
ODBC connectivity is required
Depends on UnixODBC
db.odbc.select[<unique short description>,<dsn>]

SSH/TELNET MONITORING

GET DATA FROM EXTERNAL DEVICE
Password or public key authentication (SSHKeyLocation)
Ability to run any command and return result back to Zabbix
ssh.run[<unique short description>,<ip>,<port>,<encoding>]

MONITORING OF VMWARE

MONITORING OF VMWARE

Monitoring of vCenter and vSphere
Auto-discovery of hypervisors and guest VMs
Support of host prototypes; can be extended to Xen, KVM, Linux Containers, etc.
MONITORING OF VMWARE

Works out of the box

Does not require any 3rd party tools

It is based on native VMWare API


Optimized to generate as few API calls as possible

Configuration and performance data are two separate requests (2.4)

READY TO USE TEMPLATES

Template Virt VMWare

Template Virt VMWare Guest

Template Virt VMWare Hypervisor

USER MACROS

USER MACRO FUNCTIONALITY

Easier maintenance: one template and
different item key parameters:
net.tcp.service[ssh,{$SSH_PORT}]
different trigger expression values:
{server:system.cpu.load[,avg1].last(0)} > {$CPU_LOAD}

Syntax:
{$NAME}

Overwrites upstream. Priority (highest first):
Host macro
Template macro
Global macro

Example:
Global macro {$CPU_LOAD}=5
Template macro {$CPU_LOAD}=20
Host macro {$CPU_LOAD}=1
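
The override order can be illustrated with a small lookup sketch. This is not Zabbix code: `resolve_macro` and the three dictionaries are hypothetical names, and the values are the ones read from the slide. It only shows that a host-level value wins over a template-level value, which in turn wins over a global value.

```python
def resolve_macro(name, host_macros, template_macros, global_macros):
    """Return a user macro value using host > template > global priority."""
    for scope in (host_macros, template_macros, global_macros):
        if name in scope:
            return scope[name]
    return None  # the macro is not defined at any level


# Example values as read from the slide
global_macros = {"{$CPU_LOAD}": "5"}
template_macros = {"{$CPU_LOAD}": "20"}
host_macros = {"{$CPU_LOAD}": "1"}

value_on_host = resolve_macro("{$CPU_LOAD}", host_macros, template_macros, global_macros)
value_without_host_macro = resolve_macro("{$CPU_LOAD}", {}, template_macros, global_macros)
print(value_on_host, value_without_host_macro)
```

A host that defines no macro of its own would therefore inherit the template value, and a template without the macro would fall back to the global one.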

USER MACRO CONFIGURATION

Global: Administration → General → Macros

Host: Host and template properties

LOW LEVEL DISCOVERY

BUILT-IN ENTITIES FOR LLD
❖ Agent – filesystems
❖ Agent – network interfaces
❖ Agent – Windows services
❖ Agent - CPUs and CPU cores
❖ SQL entities
❖ SNMP entities
...anything using scripting

LLD COMPONENTS
LLD rule
Item prototypes
Trigger prototypes
Graph prototypes
Host prototypes

NETWORK INTERFACES DISCOVERY

Network interfaces discovery rule
Item prototype: Traffic in
Item prototype: Traffic out
Trigger prototype: high traffic in
Trigger prototype: high traffic out
Graph prototype: traffic on interface

Discovered eth0
Item: Traffic in
Item: Traffic out
Trigger: high traffic in
Trigger: high traffic out
Graph: traffic on interface

Discovered eth1
Item: Traffic in
Item: Traffic out
Trigger: high traffic in
Trigger: high traffic out
Graph: traffic on interface
CREATING PROTOTYPES

LLD rules return data in variables (macros):

Disks: {#FSNAME}, {#FSTYPE}
Interfaces: {#IFNAME}
CPU: {#CPU.NUMBER}, {#CPU.STATUS}
SNMP: {#SNMPINDEX}, {#SNMPVALUE}, ...
ODBC: column names become macro names
Windows services: {#SERVICE.NAME}, {#SERVICE.STATE}, ...

Example key:
vfs.fs.size[{#FSNAME},free]

LLD macros can be used in trigger expressions (2.2)

{Template_OS_Linux:vfs.fs.size[{#FSNAME},pused].last(0)}>{#MY_CUSTOM_MACRO}

CONTEXT SUPPORT IN USER MACROS

Macro context is a text value
A typical use case for macro context is an LLD macro value

Example:
{ca_001:vfs.fs.size[{#FSNAME},pfree].last()} < {$LOW_SPACE_LIMIT:"{#FSNAME}"}

where:
{$LOW_SPACE_LIMIT} = 10
{$LOW_SPACE_LIMIT:"/opt"} = 25

Events will be created when "/" and "/home" have less than 10% or the "/opt" filesystem has less than 25% of free disk space.

LLD RULE PROPERTIES
Can use any "item" type for data collection
Update interval concerns
Filtering by regexp

DEPENDENCIES BETWEEN TRIGGER PROTOTYPES

CREATED ENTITIES

Denoted in configuration

ENTITIES TO BE REMOVED

Denoted in configuration for an item or host

Linking to applications based on discovery values

LLD OF WINDOWS SERVICES
LLD for Windows services:
service.discovery
LLD rule returns data in macros:
{#SERVICE.NAME}
{#SERVICE.DISPLAYNAME}
{#SERVICE.DESCRIPTION}
{#SERVICE.STATE}
{#SERVICE.STATENAME}
{#SERVICE.PATH}
{#SERVICE.USER}
{#SERVICE.STARTUP}
{#SERVICE.STARTUPNAME}

Item key:
service.info[service,<param>]
Example:
service.info[{#SERVICE.NAME},state]

CUSTOM LLD DATA

{
"data":[
{ "{#FSNAME}":"/", "{#FSTYPE}":"rootfs"},
{ "{#FSNAME}":"/sys", "{#FSTYPE}":"sysfs"},
{ "{#FSNAME}":"/proc", "{#FSTYPE}":"proc"},
{ "{#FSNAME}":"/dev", "{#FSTYPE}":"devtmpfs"},
{ "{#FSNAME}":"/dev/pts", "{#FSTYPE}":"devpts"}
]
}
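
A custom LLD rule only needs a script that prints JSON in the shape shown above. The sketch below is one possible way to produce it, not an official Zabbix script: `parse_mounts` and `build_lld_json` are hypothetical helper names, and the sample text imitates the `/proc/mounts` layout instead of reading the real file.

```python
import json


def parse_mounts(text):
    """Parse /proc/mounts-style text into (mount point, filesystem type) pairs."""
    pairs = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            pairs.append((fields[1], fields[2]))
    return pairs


def build_lld_json(filesystems):
    """Wrap the pairs in the {"data": [...]} structure expected by an LLD rule."""
    rows = [{"{#FSNAME}": name, "{#FSTYPE}": fstype} for name, fstype in filesystems]
    return json.dumps({"data": rows})


# Sample input; a real script would read the mount table of the monitored system
sample = "rootfs / rootfs rw 0 0\nsysfs /sys sysfs rw 0 0\nproc /proc proc rw 0 0\n"
lld_document = build_lld_json(parse_mounts(sample))
print(lld_document)
```

Such a script could be exposed to Zabbix through any item type that can run it, since an LLD rule can use any item type for data collection.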

LLD OF SNMP OIDS
Syntax for SNMP discovery rules:
SNMP OID before 3.0: SNMP OID
SNMP OID from 3.0: discovery[{#SNMPVALUE}, SNMP OID]

Example:

discovery[{#IFDESCR}, IF-MIB::ifDescr, {#IFALIAS}, IF-MIB::ifAlias]

{ "data":[
{"{#SNMPINDEX}":1,"{#IFDESCR}":"Interface #1","{#IFALIAS}":"eth1"},
{"{#SNMPINDEX}":2,"{#IFDESCR}":"Interface #2","{#IFALIAS}":"eth2"},
{"{#SNMPINDEX}":3,"{#IFALIAS}":"eth3"},
{"{#SNMPINDEX}":4,"{#IFDESCR}":"Interface #4"},
{"{#SNMPINDEX}":5,"{#IFALIAS}":"eth5"}
] }

LLD USING SQL

LLD VIA SQL QUERIES

LLD via SQL queries:

db.odbc.discovery[<description>,<dsn>]

Results are automatically transformed into JSON

Column names become macro names and selected rows become the values of these macros
Use column aliases to define macro names:
mysql> SELECT c.name, c.loc AS location FROM customers c;

Be aware: the discovery rule becomes not supported if a macro name is not valid
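
The row-to-macro transformation can be sketched as follows. This is an illustration, not the Zabbix implementation: an in-memory SQLite database stands in for the ODBC data source, `rows_to_lld` is a hypothetical helper, and both the upper-casing of column names and the set of characters accepted in a macro name are assumptions.

```python
import json
import re
import sqlite3

# Assumption: a macro name may contain only A-Z, 0-9, "_" and "."
VALID_MACRO = re.compile(r"^[A-Z0-9_.]+$")


def rows_to_lld(cursor):
    """Convert the rows of an executed query into LLD JSON, one macro per column."""
    names = [column[0].upper() for column in cursor.description]
    for name in names:
        if not VALID_MACRO.match(name):
            raise ValueError("invalid macro name: %s" % name)
    data = [
        {"{#%s}" % name: str(value) for name, value in zip(names, row)}
        for row in cursor.fetchall()
    ]
    return json.dumps({"data": data})


# In-memory SQLite stands in for the ODBC data source
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE customers (name TEXT, loc TEXT)")
connection.execute("INSERT INTO customers VALUES ('Acme', 'Riga')")
cursor = connection.execute("SELECT c.name, c.loc AS location FROM customers c")
lld_from_sql = rows_to_lld(cursor)
print(lld_from_sql)
```

The alias `location` is what turns the `loc` column into a `{#LOCATION}` macro; a column name with characters outside the accepted set would make this sketch raise an error, mirroring the rule becoming not supported.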

DISTRIBUTED MONITORING
THE PROBLEM

Zabbix Server

THE SOLUTION: ZABBIX PROXY

Zabbix Server Zabbix Proxy

PASSIVE PROXY

Zabbix Server Zabbix Proxy

PROXY OVERVIEW

❖ Centralised monitoring
❖ Zabbix server controls configuration of all proxies
❖ Supports any platform server supports
❖ Supports any database server supports
❖ Can create SQLite DB automatically
❖ Can buffer data in case of communication problems
❖ Choose the direction of the connection
❖ Can use different DBs on server & proxies

Don't use the same DB for proxy as server

INITIAL SETUP

Compile binary (--enable-proxy)

Create proxy database (optional for SQLite)


Update configuration file
Start the proxy
Add proxy in the frontend (Administration → Proxies)

Configure hosts to be monitored by the proxy

PROXIES
Proxy list in frontend
Shows proxy mode
Encryption
Last seen
Host, item count
Required performance

ACTIVE ZABBIX PROXY CONFIGURATION

ProxyMode must be set to 0 (active)
Hostname must match the proxy name as configured in the frontend
ProxyOfflineBuffer controls how long data is kept locally if the proxy can't contact the server (one hour by default)
ProxyLocalBuffer allows data to be preserved in the proxy database for later processing
ConfigFrequency controls how often the proxy requests configuration information from the Zabbix server
DataSenderFrequency controls how often data is sent to the Zabbix server
HeartbeatFrequency makes the proxy contact the Zabbix server even if there is no new data to transmit

PASSIVE ZABBIX PROXY CONFIGURATION

Zabbix proxy configuration file:
ProxyMode must be set to 1 (passive)

Zabbix server configuration file:
StartProxyPollers controls how many pollers contact proxies
ProxyConfigFrequency: how often Zabbix server sends configuration changes to passive proxies
ProxyDataFrequency: how often data is requested from passive proxies

PROXY COMMUNICATION

Passive proxy:
Server connects every ProxyConfigFrequency seconds and sends configuration (1 hour by default)
Server connects every ProxyDataFrequency seconds and retrieves data (1 second by default)

Active proxy:
Proxy connects every ConfigFrequency seconds and retrieves configuration (1 hour by default)
Proxy connects every DataSenderFrequency seconds and sends data, if any (1 second by default)

RELOAD ACTIVE PROXY CONFIGURATION

# zabbix_proxy --runtime-control config_cache_reload

Sends signal to current proxy to reload configuration cache
Makes active proxy also request configuration from server
Is ignored for passive proxy

MONITORING PROXY AVAILABILITY
HeartbeatFrequency ensures server will notice proxy
missing even if no data has to be sent. One minute by
default.

Internal item zabbix[proxy,"Proxy name",lastaccess]

Trigger based on function fuzzytime:


{server:zabbix[proxy,"Proxy name",lastaccess].fuzzytime(3m)}=0

PER-PROXY QUEUE

Shows Zabbix server and per-proxy performance
No details on item categories

PERFORMANCE CONSIDERATIONS
Zabbix server
Fast CPU
Fast storage for processing of historical information
Number of trappers should be higher than number of Proxies
Performance after downtime

Proxy
Low hardware requirements
Embedded hardware can be used

LIMITATIONS

Proxy does not support remote command relaying (yet)


Auto-creation of database is for SQLite only

No alerting or triggers, proxy is used for data collection only

ADVANCED TOPICS

ZABBIX SERVER COMPONENTS

FRONTEND CUSTOMIZATION

In include/defines.inc.php

Protection against password guessing:
ZBX_LOGIN_ATTEMPTS
ZBX_LOGIN_BLOCK

Graph limits & defaults:
ZBX_MIN_PERIOD
ZBX_MAX_PERIOD
ZBX_PERIOD_DEFAULT
GRAPH_YAXIS_SIDE_DEFAULT

Popup row limit:
ZBX_WIDGET_ROWS

Rounding:
ZBX_UNITS_ROUNDOFF_THRESHOLD
ZBX_UNITS_ROUNDOFF_UPPER_LIMIT
ZBX_UNITS_ROUNDOFF_LOWER_LIMIT

Latest data and item overview:
ZBX_HISTORY_PERIOD

CONFIGURATION IN DETAIL

Zabbix server configuration parameters:
zabbix_server.conf

Zabbix agent daemon configuration parameters:
zabbix_agentd.conf

ZABBIX PROTOCOLS

See https://zabbix.org/wiki/Docs/protocols

ADVANCED PROBLEM DETECTION

DETECTING ANOMALIES

Time shift is available for the functions min, max, avg, last and count

Trigger if the load average for the past hour is more than twice the average load of the same hour yesterday:
{host:system.cpu.load.avg(1h)} / {host:system.cpu.load.avg(1h,24h)} > 2
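
The comparison made by this expression can be written out as a short sketch. The function names and sample values are hypothetical; the zero-baseline guard is an addition of the sketch and says nothing about how Zabbix itself treats that case.

```python
def average(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / float(len(values))


def load_anomaly(last_hour, same_hour_yesterday, factor=2.0):
    """True when the current hourly average exceeds yesterday's by more than `factor` times."""
    baseline = average(same_hour_yesterday)
    if baseline == 0:
        return False  # sketch-only guard against division by zero
    return average(last_hour) / baseline > factor


# Hypothetical CPU load samples for the same hour on two consecutive days
today = [2.4, 2.6, 2.5]
yesterday = [1.0, 1.1, 0.9]
is_anomaly = load_anomaly(today, yesterday)
print(is_anomaly)
```

Comparing against the same hour of the previous day makes the threshold relative to the usual daily pattern instead of a fixed number.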

ANOMALY

Compare with one day ago

HYSTERESIS

Different conditions for PROBLEM and OK states


Simple:
{server:system.cpu.load.last(0)}>5

Hysteresis:
({TRIGGER.VALUE}=0 and {server:system.cpu.load.last(0)}>5) or
({TRIGGER.VALUE}=1 and {server:system.cpu.load.last(0)}>1)

Another way to reduce trigger flapping

HYSTERESIS

{server:system.cpu.load.last()} > 5 … {server:system.cpu.load.last()} > 1

HYSTERESIS

NO HYSTERESIS  TIMESTAMP            VALUE  HYSTERESIS
               2016-01-30 10:55:50  0.2
               2016-01-30 10:55:20  0.8    OK
               2016-01-30 10:54:50  2.5
OK             2016-01-30 10:54:20  4.7    -
PROBLEM        2016-01-30 10:53:50  10     -
OK             2016-01-30 10:53:20  4.6    -
PROBLEM        2016-01-30 10:52:50  5.2    -
OK             2016-01-30 10:52:20  2.4    -
PROBLEM        2016-01-30 10:51:50  5.3    PROBLEM
               2016-01-30 10:51:20  2.5
               2016-01-30 10:50:50  1.2
               2016-01-30 10:50:20  0.5
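
The effect shown in the table can be reproduced with a small simulation. This is a sketch, not Zabbix code: `evaluate` and `count_changes` are hypothetical names, the thresholds 5 and 1 come from the hysteresis expression, and the values are the ones from the table, oldest first.

```python
def evaluate(values, problem_above=5.0, recover_below=None):
    """Return the trigger state after each value.

    With recover_below=None the trigger is a plain threshold. Otherwise an
    active PROBLEM persists for as long as the value stays above recover_below.
    """
    states = []
    in_problem = False
    for value in values:
        if in_problem and recover_below is not None:
            in_problem = value > recover_below
        else:
            in_problem = value > problem_above
        states.append("PROBLEM" if in_problem else "OK")
    return states


def count_changes(states):
    """Number of OK/PROBLEM transitions in a sequence of states."""
    return sum(1 for a, b in zip(states, states[1:]) if a != b)


# CPU load values from the table, oldest first
loads = [0.5, 1.2, 2.5, 5.3, 2.4, 5.2, 4.6, 10, 4.7, 2.5, 0.8, 0.2]
simple = evaluate(loads)
hysteresis = evaluate(loads, recover_below=1.0)
print(count_changes(simple), count_changes(hysteresis))
```

The plain threshold changes state six times on this data, while the hysteresis variant raises one PROBLEM and one OK.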
ADVANCED TRIGGER FUNCTIONS

TREND PREDICTION (VALUE)

Function: forecast(sec|#num,<time_shift>,time,<fit>,<mode>)

Parameters:
sec - time period
#num - number of values
<time_shift> - evaluation period
time - forecasting horizon in seconds
<fit> - function used (linear, polynomialN, exponential, logarithmic, power)
<mode> - demanded output (value, max, min, delta, avg)

Example:
{ora01_bi:vfs.fs.size[/,free].forecast(7d,,7d)}<100M
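
To give an intuition for what a linear fit forecast computes, here is a minimal least-squares sketch. It is an illustration under stated assumptions, not the Zabbix implementation: `linear_fit` and `forecast_linear` are hypothetical names, the samples are invented, and only the "value" output is modelled.

```python
def linear_fit(points):
    """Least-squares line through (time, value) points; returns (slope, intercept)."""
    n = float(len(points))
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    variance = sum((t - mean_t) ** 2 for t, _ in points)
    covariance = sum((t - mean_t) * (v - mean_v) for t, v in points)
    slope = covariance / variance
    return slope, mean_v - slope * mean_t


def forecast_linear(points, now, horizon):
    """Extrapolate the fitted line to `horizon` seconds after `now`."""
    slope, intercept = linear_fit(points)
    return slope * (now + horizon) + intercept


# Hypothetical free-space samples: one per day, shrinking by 10 MB a day
DAY = 86400
samples = [(day * DAY, 500e6 - day * 10e6) for day in range(8)]
predicted_free = forecast_linear(samples, now=7 * DAY, horizon=7 * DAY)
print(predicted_free)
```

With these samples the line is extended seven days past the last measurement, which corresponds to the `forecast(7d,,7d)` call in the example. The sketch needs at least two samples with different timestamps.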

TREND PREDICTION (TIME)

Function:
timeleft(sec|#num,<time_shift>,threshold,<fit>)

Parameters:
sec - time period
#num - number of values
<time_shift> - evaluation period
threshold - value to reach
<fit> - function used (linear, polynomialN, exponential, logarithmic, power)

Example:
{ora01_bi:vfs.fs.size[/,free].timeleft(1d,,104857600)}<1h
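
The same idea applies to estimating the time left until a threshold is reached. The sketch below fits a straight line and solves for the crossing point. `time_left_linear` is a hypothetical name, the samples are invented, and returning `None` when the threshold is never reached is a simplification of the sketch, not the value Zabbix reports.

```python
def time_left_linear(points, now, threshold):
    """Seconds until a least-squares line through (time, value) points reaches `threshold`.

    Returns None when the fitted trend never reaches the threshold after `now`.
    """
    n = float(len(points))
    mean_t = sum(t for t, _ in points) / n
    mean_v = sum(v for _, v in points) / n
    slope = (sum((t - mean_t) * (v - mean_v) for t, v in points)
             / sum((t - mean_t) ** 2 for t, _ in points))
    if slope == 0:
        return None  # flat trend: the threshold is never crossed
    intercept = mean_v - slope * mean_t
    crossing = (threshold - intercept) / slope
    return crossing - now if crossing >= now else None


HOUR = 3600
MIB = 1024 * 1024
# Hypothetical samples: free space falls by 100 MiB per hour, starting at 3000 MiB
samples = [(h * HOUR, (3000 - 100 * h) * MIB) for h in range(25)]
seconds_left = time_left_linear(samples, now=24 * HOUR, threshold=104857600)
print(seconds_left)
```

The threshold 104857600 bytes is the 100 MiB used in the example expression; with this invented data the line reaches it about five hours after the last sample.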

TREND PREDICTION
Use calculated items to visualize values

Examples: forecast("vfs.fs.size[/,free]",1d,,1d)
timeleft("vfs.fs.size[/,free]",1d,,104857600)

TREND PREDICTION NOTES

Data from the "trends*" tables is not used
Forecast shows what value the item is expected to have after some time
Some metrics are unfortunately unpredictable (for example, CPU)
forecast() and timeleft() with the default linear fit and polynomial2–3 are cheap performance-wise

TREND PREDICTION TIPS

If you have no insight into how your monitored system behaves, start with linear (the default fit)
If your data is curved rather than straight, you may want to try polynomial
Power fit may be useful when your data has "ups" and "downs"
Exponential fit may be used for peak detection
Use longer intervals with more data points to obtain more accurate long-term forecasts
Forecasts based on longer intervals can be very slow to respond to a rapid change in trend

TREND PREDICTION

Additional reading:

https://www.zabbix.com/documentation/3.0/manual/config/triggers/prediction

http://zabbix.org/mw/images/1/18/Prediction_docs.pdf

PERCENTILE
The function is used to determine the percent of acceptability. The 95th percentile is the value which is greater than 95% of the observed values. Typical uses:
measure bandwidth level without random peaks
do not take peak traffic into account
detect various anomalies

PERCENTILE
Function:
percentile(period/#num, time_shift, percentage)

Parameters:
period - time period
#num - number of values
time_shift - time shift period
percentage - range of 0 to 100

Example:
{crtr05_rix:net.if.in[eth0,bytes].percentile(10m,,95)}>10M
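
A short sketch shows why a percentile ignores short bursts. It uses the nearest-rank definition, which is an assumption about the exact method and may differ from what Zabbix does internally. The function name mirrors the trigger function for readability only, and the traffic samples are invented.

```python
import math


def percentile(values, percentage):
    """Nearest-rank percentile: the smallest value with at least `percentage`% of the data at or below it."""
    if not values or not 0 <= percentage <= 100:
        raise ValueError("need at least one value and a percentage in 0..100")
    ordered = sorted(values)
    rank = max(1, int(math.ceil(len(ordered) * percentage / 100.0)))
    return ordered[rank - 1]


# Hypothetical traffic samples in Mbps with one short burst
traffic = [4, 5, 5, 6, 4, 5, 6, 5, 4, 5, 6, 5, 4, 5, 6, 5, 4, 5, 6, 95]
p95 = percentile(traffic, 95)
print(max(traffic), p95)
```

The single burst dominates the maximum but does not affect the 95th percentile, so a trigger on the percentile reacts to sustained traffic only.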

ZABBIX API
API OVERVIEW

Backend is based on web server
Secure: SSL, password authentication, audit
Based on JSON-RPC v2.0 specification
Respects permissions

API STRUCTURE

232 different methods


Each of the methods performs one specific task

Examples:
host.create - creates new host
history.get - retrieves history data
item.update - updates existing items

API FUNCTIONALITY

Integration of Zabbix with 3rd party applications


● Integrate with Puppet/CFEngine/Chef/bcfg2 or other system
management tools
● Integrate with ticketing systems
● Integrate with inventory systems to populate inventory data

Build services on top of Zabbix


Large scale configuration changes or implementation
More powerful operations
Shell scripting
… more

API MESSAGE FLOW

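API client → Zabbix API: Authenticate
Zabbix API → API client: Session ID
API client → Zabbix API: Method A
Zabbix API → API client: Result A
API client → Zabbix API: Method B
Zabbix API → API client: Result B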

EXAMPLE - AUTHENTICATING

Authenticating via curl:

$ curl -i -X POST -H 'Content-Type:application/json' -d'
{"jsonrpc": "2.0",
"method":"user.login",
"params":{
"user":"Admin",
"password":"zabbix"},
"auth": null,"id":0}
' http://195.13.231.163/zabbix/api_jsonrpc.php

Response:

HTTP/1.1 200 OK
Date: Wed, 11 Nov 2015 09:32:41 GMT
Server: Apache/2.2.15 (CentOS)
X-Powered-By: PHP/5.3.3
Access-Control-Allow-Origin: *
Access-Control-Allow-Headers: Content-Type
Access-Control-Allow-Methods: POST
Access-Control-Max-Age: 1000
Content-Length: 68
Connection: close
Content-Type: application/json

{"jsonrpc":"2.0",
"result":"2f2ec4720863281c34cdd3c4c8a5de46","id":0}

EXAMPLE - GETTING HOST

Getting host via curl:

$ curl -i -X POST -H 'Content-Type: application/json' -d '
{"jsonrpc":"2.0",
"method":"host.get",
"params":{
"filter":{
"host":"Zabbix server"
}
},
"auth":"2f2ec4720863281c34cdd3c4c8a5de46", "id":1}
' http://195.13.231.163/zabbix/api_jsonrpc.php

Response:

{
"jsonrpc":"2.0",
"result":[
{
"hostid":"10126",
"proxy_hostid":"0",
"host":"Zabbix server",
"status":"0",
"disable_until":"0",
...
"name":"Zabbix server",
"flags":"0",
"templateid":"0",
"description":""
}
],
"id":1
}
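
The two curl calls follow one pattern: build a JSON-RPC 2.0 body, send it, read the `result` member, and reuse the session ID as `auth`. The sketch below shows only the message handling and performs no network access. `build_request` and `extract_result` are hypothetical helper names, and the response text is the one from the authentication example.

```python
import json


def build_request(method, params, auth=None, request_id=0):
    """Build a JSON-RPC 2.0 request body in the shape used by the curl examples."""
    return json.dumps({
        "jsonrpc": "2.0",
        "method": method,
        "params": params,
        "auth": auth,
        "id": request_id,
    })


def extract_result(response_body):
    """Return the "result" member of a response, or raise if an error was reported."""
    response = json.loads(response_body)
    if "error" in response:
        raise RuntimeError(response["error"])
    return response["result"]


login_body = build_request("user.login", {"user": "Admin", "password": "zabbix"})
# Response text copied from the authentication example
token = extract_result(
    '{"jsonrpc":"2.0","result":"2f2ec4720863281c34cdd3c4c8a5de46","id":0}'
)
host_body = build_request(
    "host.get", {"filter": {"host": "Zabbix server"}}, auth=token, request_id=1
)
print(host_body)
```

In a real client each body would be sent as an HTTP POST with the `Content-Type: application/json` header to the frontend's `api_jsonrpc.php`, as the curl commands do.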

EXAMPLE - GETTING HOST USING DIFFERENT LANGUAGES

Install PyZabbix Python library using pip:
# yum install python-pip
# pip install pyzabbix

Get auth & host via custom script:
#!/usr/bin/env python
from pyzabbix import ZabbixAPI

# Connect to the frontend and log in
zapi = ZabbixAPI("http://195.13.231.163/zabbix")
zapi.login("Admin", "zabbix")

# Fetch the host by its technical name and print every property
result = zapi.host.get(filter={"host" : "Zabbix server"})
for h in result:
    for key in sorted(h):
        print("%s: %s " % (key, h[key]))

Response:
$ ./host_get.py
available: 1
description:
disable_until: 0
error:
errors_from: 0
flags: 0
host: Zabbix server
hostid: 10126
...
snmp_available: 0
snmp_disable_until: 0
snmp_error:
snmp_errors_from: 0
status: 0
templateid: 0

INTEGRATION WITH GRAFANA

PERFORMANCE
BASIC DATA FLOW

[Diagram: basic data flow between the Zabbix server and the database: data collection, analysis, history, visualization, notifications]


METRIC OF ZABBIX SIZING
Number of values processed per second (NVPS)
A rough estimate of NVPS is visible in Zabbix Dashboard

PERFORMANCE DELIVERED BY ZABBIX

Hardware: 10 Core CPU, 32GB, RAID10 BBWC
Budget: around 4K EUR
Zabbix is able to deliver 2 million values per minute, or around 30,000 values per second
In real life performance would be worse. Why?!

FACTORS MAKING PERFORMANCE LOWER

Type of items, value types, SNMPv3, number of triggers and complexity of triggers
Housekeeper settings and thus size of the database
Number of users working with the WEB interface

PERFORMANCE VS NUMBER OF HOSTS

60 items per host, update frequency once per minute

Number of hosts    Performance (values per second)
100                100
1 000              1 000
10 000             10 000

300 items per host, update frequency once per minute

Number of hosts    Performance (values per second)
100                500
1 000              5 000
10 000             50 000
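
The figures in both tables follow from simple arithmetic: hosts times items per host, divided by the update interval in seconds. A minimal sketch, with `required_nvps` as a hypothetical helper name and the simplifying assumption that every item uses the same interval:

```python
def required_nvps(hosts, items_per_host, update_interval_seconds):
    """Estimate new values per second when all items share one update interval."""
    return hosts * items_per_host / float(update_interval_seconds)


for hosts in (100, 1000, 10000):
    print(hosts, required_nvps(hosts, 60, 60), required_nvps(hosts, 300, 60))
```

Real installations mix many intervals, so the estimate would be summed per interval group.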

SLOW VS FAST

What                 Slow                                        Fast
Database size        Large                                       Fits into memory
Trigger expressions  min(), max(), avg()                         last(), nodata()
Data collection      Polling (SNMP, Agent-less, Passive agent)   Trapping (active agents)
Data types           Text, string                                Numeric

The history analysis does affect performance of Zabbix, but not so much, especially starting from Zabbix 2.2

VISIBLE SYMPTOMS OF BAD PERFORMANCE

Zabbix Queue has too many delayed items
Administration → Queue
Frequent gaps in graphs, no data for some of the items
False positives for triggers having nodata() function
Unresponsive WEB interface
No notifications

NICE LOOKING QUEUE

IDENTIFY AND FIX COMMON PROBLEMS
GENERIC TOOLS
top, ntop...
iostat, vmstat, sar
Zabbix itself
Database statistics, innotop
ps x|grep zabbix_server

DIFFERENT VIEWS ON PERFORMANCE

"I just added 5 new hosts, Zabbix is very sluggish" :-(
"Zabbix is so slooooow, I have only 48 hosts" :-(

however:
"Zabbix Milestone achieved - 1000 hosts and growing" :-)
"Our status update: 8500 hosts, 950400 items, 670340 triggers, 9550 vps" :-)

What's the difference?


COMMON PROBLEMS OF INITIAL SETUP

Use of default templates
Make your own smarter templates

Default database settings
Tune database for the best performance

Not optimal configuration of Zabbix Server
Tune Zabbix Server configuration

Housekeeper settings do not match hardware spec

Use of older releases
Always use the latest stable one!
HOW DO I KNOW DB PERFORMANCE IS BAD?

Zabbix Server configuration file, zabbix_server.conf

LogSlowQueries=3000

TUNE ZABBIX CONFIGURATION
GET INITIAL STATS
Real number of VPS
zabbix[wcache,values,all]
zabbix[queue,1m]: number of items delayed for more than 1 minute

Zabbix Server components:
Alerter, Configuration syncer, DB watchdog, discoverer, escalator, history syncer, http poller, housekeeper, icmp pinger, ipmi poller, poller, trapper, etc.

Zabbix Server cache:
history write cache, value cache, trend write cache, vmware cache, etc.

GET INTERNAL STATS

Ready to use templates:
Template App Zabbix Server
Template App Zabbix Proxy
Template App Zabbix Agent

INTERNAL STATS: OVERVIEW

Write cache: free history/text buffer (in %)


zabbix[wcache,history,pfree]
Write cache: free trend buffer (in %)
zabbix[wcache,trend,pfree]
Write cache: number of values expected/processed by Zabbix
zabbix[requiredperformance]
zabbix[wcache,values]
Configuration cache
zabbix[rcache,buffer,pfree]
Value cache
zabbix[vcache,cache,mode]
Number of values in the wait queue
zabbix[queue,<from>,<to>]

HOW IT LOOKS

TUNE NUMBER OF PROCESSES (EXAMPLE)

Zabbix Server configuration file, zabbix_server.conf:

StartPollers=80
StartPingers=10
StartPollersUnreachable=80
StartIPMIPollers=10
StartTrappers=20
StartDBSyncers=8
TUNE SIZE OF IN-MEMORY CACHE (EXAMPLE)

Zabbix Server configuration file, zabbix_server.conf:

VMwareCacheSize=64M
CacheSize=64M
HistoryCacheSize=128M
TrendCacheSize=64M
HistoryIndexCacheSize=64M
ValueCacheSize=64M

TABLE PARTITIONING

It is a way to split large tables into smaller partitions. It makes sense for historical tables:
history*, trends*, events

Benefits:
Easy to remove older data
Much better performance

NO TABLE PARTITIONING

[Diagram: Zabbix Server & GUI access a single History table]

WITH TABLE PARTITIONING

[Diagram: Zabbix Server & GUI access history split into partitions 2016_02, 2016_01, 2015_12, 2015_11]

MYSQL SPECIFIC

InnoDB is better than MyISAM
In-memory tmpfs for tmpdir
Peek at the data:
mysqladmin status
mysqladmin variables

InnoDB:
innodb_file_per_table = 1
innodb_buffer_pool_size=<large> (~75% of total RAM)
innodb_buffer_pool_instances = 4 (MySQL 5.6 - 8)
innodb_flush_log_at_trx_commit = 2
innodb_flush_method = O_DIRECT
innodb_log_file_size = 256M

Do not use:
query log
binary logs if no replication is used (sync_binlog = 0)

I STILL NEED BETTER PERFORMANCE!

Run all Zabbix components on separate hardware!

Component      CPU          RAM          Storage
Zabbix Server  8 core CPU   4GB of RAM
Database       16 core CPU  64GB of RAM  Fast storage
Frontend       Fast CPU     4GB of RAM

HIGH-AVAILABILITY AND REDUNDANCY

HIGH AVAILABILITY AND REDUNDANCY
General
Shared storage
Virtual IP

Cluster solutions
Linux HA (OpenAIS/Corosync, Pacemaker)

Database replication if no shared storage is used

Master-master replication for redundancy

Galera Cluster for MySQL

FAILOVER SETUP

PRACTICAL SETUP

PRACTICAL SETUP

❖ Add an additional item in "Template Basic": "HTTP service availability"
❖ Make sure that the item is receiving data and is shown in a human readable format

PRACTICAL SETUP

❖ Add simple item in "Template Basic": "ICMP lost packets"
❖ Add "Ping loss is too high on <host>" trigger; use 5 as threshold
❖ Make sure that the item receives data
❖ Use the following command to simulate dropped packets and test the trigger (run it once!):
# iptables -A INPUT -m statistic --mode random --probability 0.1 -j DROP

PRACTICAL SETUP

❖ Create "Production cluster" dummy host which will represent your "production HA cluster"
❖ Create "Template Aggregate Check" template
❖ Add aggregate item in "Template Aggregate Check": "Average CPU load in cluster" which calculates average CPU load on all systems from "Training servers" host group
❖ Link "Template Aggregate Check" to "Production cluster" host
❖ Make sure that the item receives data

PRACTICAL SETUP

❖ Add calculated item in "Template Basic": "Total throughput on eth0" (sum of "Incoming traffic on eth0" and "Outgoing traffic on eth0")
❖ Make sure that the item receives data

PRACTICAL SETUP

❖ Configure SNMP protocol on your VM
❖ Add SNMP interface on the host
❖ Link "Template SNMP Device"
❖ Make sure that the item receives data

PRACTICAL SETUP

❖ Add Zabbix sender item in "Template Basic": "Number of persons in the room"
❖ Use item key "persons"
❖ Send a value via Zabbix sender
❖ Make sure that the item receives data
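
For the exercise itself the `zabbix_sender` utility is enough, for example `zabbix_sender -z <server> -s <host> -k persons -o 12`. To show what travels over the wire, the sketch below frames one value the way the sender protocol is commonly described for this Zabbix generation: a `ZBXD` signature, a protocol byte, an 8-byte little-endian length, and a JSON body. Treat that layout as an assumption to verify against the protocol documentation. `build_sender_packet` is a hypothetical helper, and nothing is sent to a server.

```python
import json
import struct


def build_sender_packet(host, key, value):
    """Frame one value as a "sender data" request: header, length, JSON body."""
    body = json.dumps({
        "request": "sender data",
        "data": [{"host": host, "key": key, "value": str(value)}],
    }).encode("utf-8")
    # Assumed layout: "ZBXD" signature, protocol version 1, 8-byte little-endian length
    header = b"ZBXD\x01" + struct.pack("<Q", len(body))
    return header + body


# "Training host" is a hypothetical host name; "persons" is the item key from the exercise
packet = build_sender_packet("Training host", "persons", 12)
print(len(packet))
```

The host name in the packet has to match the host name configured in the frontend, and the item must be of the Zabbix trapper type to accept the value.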

PRACTICAL SETUP

❖ Create a new web scenario to monitor your Zabbix frontend
❖ Add four steps:
log in
check login
logout
check logout
❖ Use macro in URLs to get the IP of the frontend
❖ Use macros in variables for the user name and password

PRACTICAL SETUP

❖ Create a new auto-registration rule

❖ Configure HostMetadata Value

❖ Create a new action for the discovery rule

❖ Add the discovered hosts to "Training servers" host group

❖ Link "Template Basic" to the discovered hosts

PRACTICAL SETUP

❖ Install Tomcat

❖ Install Zabbix Java Gateway

❖ Configure Java Gateway to monitor Tomcat with "Template JMX Generic"
❖ Check JMX agent icon status and "Latest data" for the values

PRACTICAL SETUP

❖ Add SSH item with password authentication in "Template Basic": "Zabbix server daemon version"
❖ Make sure that the item receives data

PRACTICAL SETUP

❖ Configure unixODBC for MySQL
❖ Get list of tables for Zabbix database using SQL query:
SELECT TABLE_NAME FROM information_schema.tables WHERE TABLE_SCHEMA = 'zabbix'
❖ Use tables in the application prototype
❖ Monitor data and index length for each table:
➢ SELECT data_length FROM information_schema.partitions WHERE table_name LIKE '<TABLE_NAME>'
➢ SELECT index_length FROM information_schema.partitions WHERE table_name LIKE '<TABLE_NAME>'
❖ Make sure that the items receive data


PRACTICAL SETUP

Create the following calculated items in "Template Basic":
forecast(system.cpu.load,30m,,30m)
last(<forecast.item.key>,#1,30m)

Create a custom graph with three items (the two mentioned above + the original CPU load item)
Add "Future value for CPU load after 30 minutes will be more than 1" trigger
Check latest data

QUESTIONS?
Thank you
