Professional Documents
Culture Documents
for Databases
Sanjay Rao
Principal Performance Engineer, Red Hat
June 13, 2013
RHEL 6 scaling
Aspects of tuning
Tuning parameters
Bare metal
KVM Virtualization
Tools
Scalability
RHEL 6 is a lot more scalable but it also offers many opportunites for tuning
What To Tune
I/O
Memory
CPU
Network
Multiple HBAs
Device-mapper multipath
Deadline
Two queues per device, one for read and one for writes
CFQ
Each process queue gets fixed time slice (based on process priority)
Noop
FIFO
CFQ vs Deadline
1 thread per multipath device (4 devices)
450M
400M
350M
300M
250M
CFQ M P de v
De adline M P dev
200M
150M
100M
50M
0M
250
500M
200
400M
150
300M
100
200M
50
100M
64K-RR
64K-RW
64K-SR
64K-SW
32K-RR
32K-RW
32K-SR
32K-SW
16K-RR
16K-RW
16K-SR
16K-SW
8K-RR
8K-RW
8K-SR
-50
8K-SW
0M
CFQ
De adline
% diff (CFQ vs De adline)
Boot-time
Transactions / Minute
CFQ
Deadline
No-op
10U
20U
40U
60U
80U
100U
70
10:05
58.4
60
54.01
08:38
50
47.69
Time Elapsed
07:12
40
05:46
30
04:19
20
02:53
10
01:26
00:00
Parallel degree
16
32
CFQ
Deadline
% diff
Direct I/O
Predictable performance
Memory
(file cache)
Asynchronous I/O
DB Cache
Flat files on
file systems
Database
80K
70K
60K
Setall
DIO only
AIO only
No AIO DIO
Trans/min
50K
40K
30K
20K
10K
K
40U
00:05:46
00:05:02
Completion Time
00:04:19
00:03:36
00:02:53
00:02:10
00:01:26
00:00:43
00:00:00
256
1024
23.77
22.3
21.01
Transactions / Min
20
15
10
0
10U
40U
80U
Logs Fusion-io
Logs FC
% diff
Transactions / Min
4 database instances
FC
Fusion-io
01:40:48
01:26:24
Elapsed Time
01:12:00
00:57:36
00:43:12
00:28:48
00:14:24
00:00:00
Fusion-io
FC
Memory Tuning
NUMA
Huge Pages
Manage Virtual Memory pages
Swapping behavior
M1
M3
S1
c1 c2
c3 c4
S2
c1 c2
c3 c4
c1 c2
c3 c4
S3
c1 c2
c3 c4
S4
M1 M2 M3 M4
M1 M2 M3 M4
D1 D2 D3 D4
D1 D2 D3 D4
M2
M4
S1 S2 S3 S4
S1 S2 S3 S4
No NUMA optimization
NUMA optimization
numactl
Taskset
cpusets
Libvirt
Trans/Min
Non NUMA
NUMA
What is numad?
Added to Fedora 17
1250000
1200000
4 Guest NoNUMApin
4 Guest NumaD
4 Guest NUMApin
1150000
1100000
1050000
1000000
40U
vi /etc/sysctl.conf (vm.nr_hugepages=8192)
25
23.96
20
19.32
15
13.23
10
0
10U
40U
80U
4 Inst no pin
4 inst no pin Hugepages
4 inst numa pin Hugepages
% diff huge pgs pin Vs no huge pgs no pin
File cache
Free pagecache
Free slabcache
cpuspeed off
performance
ondemand
powersave
How To
Trans / Min
powersave
ondemand
performance
10U / 15%
20U / 23%
40U / 40%
60U / 54%
80U / 63%
100U / 76%
Transactions / Minute
ondemand
performance
Bios-no power savings
10U / 15%
40U / 40 %
80U / 63%
10U / 76%
08:38
07:12
05:46
04:19
02:53
01:26
00:00
performance
ondemand
powersave
0
0
0
0
0
4
4
4
4
5
1 89
2 87
1 90
1 90
2 86
6
6
5
5
7
0
0
0
0
0
00:07:12.00
00:06:28.80
00:05:45.60
00:05:02.40
00:04:19.20
00:03:36.00
00:06:47.02
00:06:27.17
00:02:52.80
00:02:09.60
00:01:26.40
00:00:43.20
00:00:00.00
16
00:06:24.53
How To
turbostat -i <interval>
Multi-queue
Tools to monitor dropped packets tc, dropwatch.
RCU adoption in stack
Multi-CPU receive to pull in from the wire faster.
10GbE driver improvements.
Data center bridging in ixbge driver.
FcoE performance improvements throughout the stack.
Network Performance
Private
Public
H2
Hardware
10GigE
Transactions / min
OLTP Workload
1500
9000
10U
40U
100U
DSS workloads
9000MTU
1500MTU
laptop-ac-powersave
spindown-disk
latency-performance
laptop-battery-powersave
server-powersave
throughput-performance
desktop-powersave
enterprise-storage
Default
default
kernel.sched_min_
granularity_ns
4ms
10ms
kernel.sched_wakeup
_granularity_ns
4ms
vm.dirty_ratio
20% RAM
vm.dirty_background
_ratio
10% RAM
5%
vm.swappiness
60
10
30
I/O Scheduler
(Elevator)
CFQ
deadline
deadline
deadline
Filesystem
Barriers
On
Off
Off
Off
CPU Governor
Disk Read-ahead
deadline
virtual-host
virtual-guest
10ms
10ms
10ms
15ms
15ms
15ms
15ms
40%
40%
10%
40%
deadline
4x
performance performance
4x
4x
Turning tuned on
tuned-adm profile enterprise-storage
Important tuned files
/etc/tune-profiles/enterprise-storage/ktune.sh
/etc/tune-profiles/enterprise-storage/sysctl.ktune
/etc/tune-profiles/enterprise-storage/ktune.sysconfig
Transations / Minute
TuneD-Default
TuneD-Lat_Perf
TuneD-Ent Stor
TuneD-Thru Perf
default
latency-performance
enterprise-storage
throughput-performance
Power with RF
Power
Throughput with RF
2500
Time (Seconds)
2000
TuneD-off
TuneD-ES
1500
1000
500
Insert
Update
Operation
Reads
Database Performance
Application tuning
Design
Resource Management
For performance
Application Isolation
I/O Cgroups
Instance
Instance
Instance
Instance
Unrestricted Resources
Controlled Resources
1
2
3
4
OLTP Workload
Instance 4
Instance 3
Instance 2
Instance 1
4 Instance
4 Instance Cgroup
OLTP Workload
Instance 1
Instance 2
89564.67
98176.57
101395.83
115343.13
98042.67
98054.07
Regular memory
Throttled memory
Instance 4
Instance 3
Instance 2
Instance 1
Swap In
Swap Out
NUMA
Huge Pages
VM2
VM1
Host
File Cache
VM2
Figure 2
How To
31.67
1Guest
4Guests
7.05
450
400
350
300
250
200
150
100
50
no NUMA
Manual Pin
NUMAD
RHEV Migration
Transactions / minute
TPM- regular
TPM Mig BW - 32
TPM Mig BW - 0
VirtIO
VirtIO drivers for network
Latency (usecs)
300
250
200
150
100
50
0
1
16
27
45
64
99
189
256
387
765
4
102
9
153
9
306
6
409
7
5
4
9
9
6
7
614 1228 1638 2457 4914 6553 9830
virtio RX
vhost RX
SR-IOV RX
Monitoring tools
Kernel tools
Networking
ethtool, ifconfig
Profiling
#
#
#
#
#
#
#
#
#
#
#
9000MTU
06:58:43PMIFACErxpck/stxpck/srxkB/stxkB/srxcmp/stxcmp/srxmcst/s
06:58:46PMeth00.910.000.070.000.000.000.00
06:58:46PMeth5104816.3648617.27900444.383431.150.000.000.91
06:58:49PMeth00.000.000.000.000.000.000.00
06:58:49PMeth5118269.8054965.841016151.643867.910.000.000.50
06:58:52PMeth00.000.000.000.000.000.000.00
06:58:52PMeth5118470.7354382.441017676.213818.350.000.000.98
06:58:55PMeth00.940.000.060.000.000.000.00
06:58:55PMeth5115853.0553515.49995087.673766.280.000.000.47
-----io---- --system------cpu----bi bo
in
cs
us sy id wa st
7952 163312 40235 97711 81 11 7 1 0
8150 162642 40035 97585 82 11 6 1 0
7498 166835 40494 97413 82 12 6 1 0
6841 159781 40150 95170 83 12 5 1 0
5950 138597 40117 94040 85 12 4 0 0
5895 139294 40487 94423 84 12 4 0 0
5906 136871 40401 94313 84 11 4 0 0
5890 135288 40744 95360 84 12 4 0 0
7866 174045 39702 97930 80 11 8 1 0
7217 174223 40469 95697 82 11 5 1 0
5682 218917 41571 94576 84 12 4 1 0
5488 357287 45871 96181 83 12 4 1 0
5456 257984 45617 97021 84 12 3 1 0
5518 161104 41572 96639 84 12 4 1 0
5440 159580 41191 95308 85 11 3 0 0
I/O
Virtualization Caching
Memory
NUMA
Huge Pages
Swapping
Managing Caches
CPU
Network
Separate networks
arp_filter
Packet size
New Tools
tuned
Wrap Up Virtualization
VirtIO drivers
aio (native)
NUMA
Cache options (none, writethrough)
Network (vhost-net)
access.redhat.com