ZFS is a modern 128-bit open-source file system utilizing the copy-on-write model. This presentation is going to cover the following topics:
- How can we tune ZFS?
- When should ZFS be tuned?
Is ZFS slow?
Help, my ZFS is slow!
- compared to ... ?
- this depends on many factors (workload, data access, ...)
- tradeoff: speed vs. data consistency + features
- maybe auto-tuning does not work well in your case ...
ZFS checksum
We need the checksums for:
- data and metadata integrity verification
- self-healing (scrub, mirror, raidz)
- System memory
- Access time
- Dataset compression
- Deduplication
- ZFS send and receive
ZFS performance and stability depend on the amount of system RAM.
- recommended minimum: 1GB
- 4GB is ok
- 8GB and more is good
Access time
Due to copy-on-write, enabled access time:
- reduces performance on read-intensive workloads
- increases space used by snapshots

    # zfs set atime=off dataset
Dataset compression
Dataset compression:
- saves space (LZJB less, gzip more)
- increases CPU usage (LZJB less, gzip much more)
- increases data throughput (LZJB, relative)

Therefore I recommend using compression primarily for archiving purposes (e.g. system or webserver logfiles).

    # zfs set compression=[on|off|gzip] dataset
Deduplication
Deduplication saves space by keeping just a single copy of identical data blocks. But remember that deduplication:
- requires even more memory (fast only if the table fits in RAM)
- increases CPU usage

Command to simulate the effect of enabling deduplication:

    # zdb -S pool

Commands for viewing detailed deduplication information:

    # zdb -D pool
    # zdb -DD pool
I highly recommend using an intermediate buffering solution for sending and receiving large streams:
- misc/buffer
- misc/mbuffer (network capable)
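A minimal sketch of what such a buffered pipeline could look like, assuming mbuffer is installed. The function only prints the command instead of running it. The snapshot and target names are hypothetical placeholders, and the 128k block size and 1G buffer are illustrative values, not recommendations from this presentation.

```shell
# Dry-run sketch: print (do not execute) a buffered send/receive pipeline.
# The names and buffer sizes used here are hypothetical placeholders.
buffered_send_cmd() {
    snap=$1      # source snapshot, e.g. tank/data@monday
    target=$2    # receiving dataset, e.g. backup/data
    # mbuffer: -q = quiet, -s = block size, -m = total buffer memory
    printf 'zfs send %s | mbuffer -q -s 128k -m 1G | zfs receive %s\n' \
        "$snap" "$target"
}

buffered_send_cmd tank/data@monday backup/data
```

Review the printed command and run it by hand once it looks right for your pools.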
We are going to look at how to optimize the following applications for ZFS:
- Web servers
- Database servers
- File servers
Webservers
As of the current ZFS implementation it is recommended to disable sendfile and mmap to avoid redundant data caching.

Apache:

    EnableMMAP Off
    EnableSendfile Off

Nginx:

    sendfile off;

Lighttpd:

    server.network-backend = "writev"
Database servers
For PostgreSQL and MySQL users I recommend using a different recordsize than the default 128k:
- PostgreSQL: 8k
- MySQL MyISAM storage: 8k
- MySQL InnoDB storage: 16k

    # zfs create -o recordsize=8k tank/mysql
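The recordsize values above can be turned into a small lookup, sketched here as a dry run that only prints the `zfs create` commands. The function name and the `tank/<engine>` dataset names are hypothetical and chosen for illustration only.

```shell
# Map a database storage engine to the recordsize suggested above.
# Unknown engines fall back to the ZFS default of 128k.
recordsize_for() {
    case "$1" in
        postgresql|myisam) echo 8k ;;
        innodb)            echo 16k ;;
        *)                 echo 128k ;;
    esac
}

# Dry run: print the commands instead of creating datasets.
# The dataset names (tank/<engine>) are placeholders.
for engine in postgresql myisam innodb; do
    printf 'zfs create -o recordsize=%s tank/%s\n' \
        "$(recordsize_for "$engine")" "$engine"
done
```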
File Servers
Here are some tips for file servers:
- disable access time
- keep the number of snapshots low
- dedup only if you have lots of RAM
- for heavy write workloads move the ZIL to separate SSD drives
- optionally disable the ZIL for datasets (beware of the consequences)
We are going to look at the following:
- Adaptive Replacement Cache (ARC)
- Level 2 Adaptive Replacement Cache (L2ARC)
- ZFS Intent Log (ZIL)
- File-level Prefetch (zfetch)
- Device-level Prefetch (vdev prefetch)
- ZFS Statistics Tools
ARC resides in system RAM, provides a major speedup to ZFS, and its size is auto-tuned. Default values are:
- maximum: physical RAM less 1GB (or 1/2 of all memory)
- metadata limit: arc_meta_limit = 1/4 of arc_max
- minimum: 1/2 of arc_meta_limit (but at least 16MB)
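The default sizing rules above can be worked through with plain shell arithmetic. This is a sketch, not the kernel's actual code: it assumes that "physical RAM less 1GB (or 1/2 of all memory)" means whichever of the two is larger, and the function name is made up for illustration.

```shell
# Compute the ARC defaults described above from physical RAM in bytes.
# Assumption: arc_max is the larger of (RAM - 1GB) and (RAM / 2).
arc_defaults() {
    ram=$1
    gib=$((1024 * 1024 * 1024))
    floor=$((16 * 1024 * 1024))     # 16MB lower bound for the minimum

    max=$((ram - gib))
    half=$((ram / 2))
    if [ "$half" -gt "$max" ]; then
        max=$half
    fi

    meta=$((max / 4))               # arc_meta_limit = 1/4 of arc_max
    min=$((meta / 2))               # minimum = 1/2 of arc_meta_limit
    if [ "$min" -lt "$floor" ]; then
        min=$floor
    fi

    echo "arc_max=$max arc_meta_limit=$meta arc_min=$min"
}

# Example: a machine with 8GB of RAM
arc_defaults $((8 * 1024 * 1024 * 1024))
```

For 8GB of RAM this gives a 7GB maximum, a 1.75GB metadata limit, and an 896MB minimum.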
How to tune the ARC:
- you can disable ARC on a per-dataset level
- the maximum can be limited to reserve memory for other tasks
- increasing arc_meta_limit may help if working with many files

    # sysctl kstat.zfs.misc.arcstats.size
    # sysctl vfs.zfs.arc_meta_used
    # sysctl vfs.zfs.arc_meta_limit
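To illustrate limiting the maximum, the sketch below computes an ARC limit that leaves a fixed amount of RAM to other tasks and prints it as a tunable line. It assumes the limit is set through the FreeBSD `vfs.zfs.arc_max` loader tunable (typically in /boot/loader.conf). The function name and the 2GB reserve are hypothetical.

```shell
# Print a loader.conf-style line that caps the ARC so that
# reserve_bytes of RAM stay available for other tasks.
# Assumption: vfs.zfs.arc_max is the tunable that sets the ARC maximum.
arc_max_line() {
    total=$1      # physical RAM in bytes
    reserve=$2    # bytes to keep free for applications
    printf 'vfs.zfs.arc_max="%s"\n' "$((total - reserve))"
}

# Example: 8GB machine, 2GB reserved for a database or web server
arc_max_line $((8 * 1024 * 1024 * 1024)) $((2 * 1024 * 1024 * 1024))
```

The printed line is only a suggestion to review before adding it to your configuration.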
Some facts about L2ARC:
- is designed to run on fast block devices (SSD)
- helps primarily read-intensive workloads
- each device can be attached to only one ZFS pool

    # zpool add pool cache device
    # zpool remove pool device
How to tune the L2ARC:
- enable prefetch for streaming or serving of large files
- configurable on a per-dataset basis
- turbo warmup phase may require tuning (e.g. set to 16MB)

    vfs.zfs.l2arc_noprefetch
    vfs.zfs.l2arc_write_max
    vfs.zfs.l2arc_write_boost
File-level prefetching:
- analyses read patterns of files
- tries to predict next reads
- goal: reduce application response times

Loader tunable to enable/disable zfetch:

    vfs.zfs.prefetch_disable
Device-level prefetching:
- reads data after small reads from pool devices
- may be useful for drives with higher latency
- consumes constant RAM per vdev
- is disabled by default

Loader tunable to enable/disable vdev prefetch:

    vfs.zfs.vdev.cache.size=[bytes]
zfs-stats: overview
The zfs-stats utility is based on Ben Rockwood's arc-summary.pl and includes modifications by Jason J. Hellenthal and myself. It provides information about:
- ARC structure and efficiency
- L2ARC structure and efficiency
- ZFETCH efficiency
- values of ZFS tunables
- system memory (overview)
zfs-mon: overview
The zfs-mon utility:
- polls ZFS counters in real-time
- analyzes ARC, L2ARC, ZFETCH and vdev prefetch
- displays absolute and relative values
- displays output in varnishstat(1) style
ZFS real-time cache activity monitor
Seconds elapsed: 120

Cache hits and misses:
                                     1s    10s    60s    tot
ARC hits:                           259    431    418    466
ARC misses:                          51     40     49     52
ARC demand data hits:               223    417    390    437
ARC demand data misses:              36     20     17     16
ARC demand metadata hits:            36     11     25     25
ARC demand metadata misses:          15     19     21     25
ARC prefetch data hits:               0      4      3      4
ARC prefetch data misses:             0      1     10      8
ARC prefetch metadata hits:           0      0      0      0
ARC prefetch metadata misses:         0      0      1      3
L2ARC hits:                          47     34     40     37
L2ARC misses:                         4      5      9     15
ZFETCH hits:                      47903  47294  48155  47138
ZFETCH misses:                      272    449   1147   3593

Cache efficiency percentage:
                                    10s
ARC:                              91.51
ARC demand data:                  95.42
ARC demand metadata:              36.67
ARC prefetch data:                80.00
ARC prefetch metadata:             0.00
L2ARC:                            87.18
ZFETCH:                           99.06
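The efficiency percentages in this output appear to be hits divided by hits plus misses over the same interval. The sketch below reproduces the 10s values from the 10s hit and miss counters. The helper name is hypothetical, and treating an interval with no activity as 0.00 is an assumption based on the "ARC prefetch metadata" row.

```shell
# Cache efficiency in percent: 100 * hits / (hits + misses).
# Assumption: an interval with no hits and no misses is reported as 0.00.
hit_ratio() {
    awk -v h="$1" -v m="$2" 'BEGIN {
        if (h + m == 0) { print "0.00"; exit }
        printf "%.2f\n", 100 * h / (h + m)
    }'
}

hit_ratio 431 40      # ARC, 10s column
hit_ratio 34 5        # L2ARC, 10s column
hit_ratio 47294 449   # ZFETCH, 10s column
```

On a live FreeBSD system the counters could be read with `sysctl -n` from the `kstat.zfs.misc.arcstats` tree, sampled twice and subtracted to get per-interval values.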