by John R Avery [Last Updated: 2009/May/31] [Last Updated (mostly formatting): 2013/Jan/15]
==
Table of Contents

Preface
    Disclaimer
    About This Document
Introduction to ZFS
    ZFS Pooled Storage
    ZFS is a Transactional File-System
    Checksums
    Self-Healing Data
    ZFS Scalability
    ZFS Snapshots
    ZFS Simplified Administration
IMPORTANT System-Config Considerations for ZFS
    ZFS Hardware & Software: Requirements & Recommendations
    From Sun's "ZFS Best-Practices Guide"
ZFS Terminology: An Intro
    Glossary of General ZFS Terms
    ZFS Component-Naming Syntax
Introductory Cheatsheet: Create a simple ZFS Storage-Pool and Basic File-System
    Create and Modify a Simple Striped Storage-Pool
        Create the New Pool
        Add Components to the Pool
        Attempt Various Operations on the Pool
        Export and Import the Pool
        Add (more) File-Systems to the Pool
        Create Snapshots of the Pool and Its File-Systems
        Create & Delete a Clone
        Create & Promote a Clone
        Destroy some [or all?] Snapshots
        Destroy the Pool
ZFS Storage-Pool: Types; Considerations; How to Create
    Identify Technical and Logistics Requirements for your Storage-Pool
        Determine the Available Underlying Devices to be Used
            What are the Device-Type Choices?
            Aspects of Using Entire Physical Hard-Disks
            Aspects of Using Individual Slices of Preformatted Hard-Disks
            Aspects of Using LUNs from Hardware-RAID Arrays
            Aspects of Using Volumes from Software-Based Volume-Managers
            Aspects of Using Files
        Choose the Type of Data-Replication for the Storage-Pool
            What are the Replication-Type Choices?
            Mirrors
            RAID-Z: Single and Double Parity
            Striped (Simple; no Parity)
            Choosing between Mirrored; RAID-Z; and RAID-Z2
        Logistical & Administrative Considerations for the Storage-Pool
    Basic Commands to Create a Storage-Pool
        Create Striped Storage-Pools (Simple; no Parity) (Command Examples)
        Create Mirrored Storage-Pools (Command Examples)
        Create RAID-Z Storage-Pools (Command Examples)
        Create RAID-Z2 Storage-Pools (Command Examples)
    Managing Storage-Pool-Creation Errors
        Detecting In-Use Devices
        Mismatched Replication-Types
            Example: Combined Simple-Striping and Mirroring in Same Pool
            Example: Combined 2-Way Mirror and 3-Way Mirror in Same Pool
            Example: Combined RAID-Z(2) vdevs w/ Differently-Sized-Devices in Same Pool
    Controlling ZFS Mount-Point at Pool-Creation: Alternate "Default Mount-Point"; Alternate Root
        Alternate "Default Mount-Point"
        Alternate Root-Directory
ZFS Intent-Log (ZIL)
    Considerations for Using Separate ZIL Devices
    Command-Syntax to Create or Add Separate ZIL Devices
Storage-Pool Management
    Create Storage-Pools
    View Storage-Pool Data / Information
        "zpool list": Simple View of Storage-Pools
        "zpool status": Storage-Pool / VDev Details/Status
        About "zpool status" Output-Sections
        About ZFS-Storage-Pool Device-States (from "zpool status")
        Examples of "zpool status" Syntax and Output
        "zpool iostat": Storage-Pool I/O Statistics
        Pool-Summary Statistics
        Pool Statistics per VDev
        ["zpool history": ]
        ["zpool get": ]
    When & Why to Export / Import a Storage-Pool
[......]
    [.....]
        [....]
            [...]
==
Preface
Disclaimer
Naturally and predictably, I make no guarantees that the information in this document is either accurate or appropriate to your needs. This document is intended (A) partly as a means by which I can learn and retain information about ZFS; (B) partly as an easy-to-use reference for myself; (C) partly in the hope that other people will find it useful. --John Reed Avery, 2013/Jan/15
Introduction to ZFS
"ZFS" stands for (or originally stood for) "ZettaByte File-System", where "zetta" is an SI prefix (look it up on Wikipedia) that refers to 1,000^7, in other words (iow) 10^21. Strictly speaking in computer terminology, a "ZettaByte" should be the equivalent of 1,024^7 Bytes (iow 2^70) but, according to Wikipedia, "ZettaByte" is rarely used to mean specifically that, and supposedly somebody has proposed the term "ZebiByte" --aka ZiB-- to refer specifically to the latter. ZFS is a type of file-system that works in such a way as to eliminate the need for a separate volume-management-layer to allow the spreading of file-system data across multiple disks and controllers. Originally, a file-system was relegated to a single disk-device, or a portion ("slice" or "partition") of that device. When people began wanting the dataredundancy and performance-improvements of spreading a file-system's data across multiple disks and/or multiple controllers, volume-management software was developed, to provide a control-layer between the file-system layer and the physical-device layer. The advantage was the avoidance of the hassle of redesigning the file-system infrastructure to account for the multiple devices and multiple controllers: install and implement the volume-manager software and one could then apply the same familiar file-system infrastructures, to those volumes, as had been used for years. The disadvantage was the addition of another layer of complexity and the prevention of various file-system advances that otherwise might have been possible without that extra layer: the file-system could not directly control the placement of data within the volumes. A major point of ZFS is to overcome these disadvantages.
ZFS Pooled Storage
  Multiple file-systems can be assigned to a single storage-pool. Any file-system can use space from any device within the storage-pool to which it is assigned.
  There is no need to predetermine a specific size for a file-system when you create it, because each file-system, from all those assigned to a storage-pool, is able to dynamically use whatever extra space it needs from all the available space in the entire storage-pool, with no extra steps to assign the extra space.
  When new devices are added to a storage-pool, all of its file-systems can immediately use the additional space, with no extra commands beyond those used to attach the new devices to the pool in the first place.
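For example, a hedged sketch of that last point (the pool-name and device-name here are illustrative, not from Sun's docs):

# zpool add fishlegs c2t0d0
# zfs list
[every file-system in the "fishlegs" pool immediately shows the enlarged AVAIL space]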
Mirroring:  Multiple identical copies of the same data, maintained on separate devices --i.e., RAID-1.
RAID-Z:     A variation on RAID-5, which provides a single parity-stripe for each such "volume" (aka "vdev", as ZFS calls it) within a storage-pool.
RAID-Z2:    A variation on RAID-6, which provides two parity-stripes (aka "double parity") for each such "volume" (aka "vdev", as ZFS calls it) within a storage-pool.

!NOTE!: It is technically possible but distinctly not recommended to mix & match any of these different "replication types" --as Sun calls them-- in a single ZFS Storage-Pool.
Checksums
[Again, not satisfied with the way the Sun manual explains this stuff; so, waiting to check other sources.]
Self-Healing Data
ZFS storage-pools can include any of 3 different types of data-redundancy:
  mirroring
  RAID-Z (a variation of RAID-5)
  RAID-Z2 (a variation of RAID-6, which is simply double-parity RAID-5)
When a bad data-block is detected, its contents are automatically recovered from an alternate copy of the data held by one of the redundancy elements in the pool, and the bad block is repaired in place.
ZFS Scalability
ZFS is designed to be an extremely scalable, 128-bit file-system, so its theoretical size-limits are astronomically large.
ZFS Snapshots
[quoting from the Sun PDF]: "A snapshot is a read-only [point-in-time] copy of a file system or volume. Snapshots can be created quickly and easily. Initially, snapshots consume no additional space within the pool. As data within the active dataset changes, the snapshot consumes space by continuing to reference the old data. As a result, the snapshot prevents the data from being freed back to the pool." Purpose of a Snapshot: Apparently, the primary purpose of a snapshot is to preserve the state of a filesystem at a particular point in time, in case one needs to abandon/reverse all changes made to the file-system, after the snapshot was taken, and revert the filesystem back to its complete state at the time the snapshot was taken. [My interpretation, based on what I've read. I've not yet found any documentation that clearly states this but it is implied in Sun's documentation regarding ZFS snapshots. --JRAvery]
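A hedged sketch of that revert operation (the file-system and snapshot names here are illustrative):

# zfs snapshot fishlegs@before_changes
[ ... make changes; later decide to abandon them ... ]
# zfs rollback fishlegs@before_changes

(If snapshots exist that are newer than the one being rolled back to, "zfs rollback" requires the "-r" switch, which destroys those intervening snapshots.)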
ZFS Simplified Administration
ZFS lets you manage file-systems with fewer [sets of?] commands and without editing config-files:
  hierarchical file-system layout
  property inheritance
  auto-management of mount-points and NFS-share semantics

Other simplified-admin advantages of ZFS (see the hedged sketch after this list):
  easy to set quotas or reservations
  easy to turn compression on & off
  can manage mount-points for multiple file-systems with a single command
  file-systems are very simple/easy to create, and a new file-system incurs little overhead; so, admins are encouraged to create separate file-systems for many things for which, outside ZFS, simple directories and subdirectories would be used instead
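A hedged sketch of those simplified-admin points (the "fishlegs/home" file-system name is illustrative):

# zfs set quota=10G fishlegs/home        <-- set a quota
# zfs set compression=on fishlegs/home   <-- turn compression on (or off again)
# zfs set mountpoint=/export/home fishlegs/home
                                         <-- re-point the mount; descendent
                                             file-systems inherit the change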
IMPORTANT System-Config Considerations for ZFS

ZFS Hardware & Software: Requirements & Recommendations
The minimum disk-size that ZFS will use is 128-MB, and the minimum amount of disk-space for a storage-pool is approximately 64-MB. [It seems doubtful that anybody, in the real world, would encounter a situation in which these small numbers of MB actually matter.]
  Run ZFS only on systems running the 64-bit kernel.
  Have at least 1-GB of RAM.
  Provision at least 1 extra GB of RAM for each 10,000 mounted ZFS file-systems, including snapshots. Roughly 64-KB of RAM is used up for each mounted ZFS fs. (Also be prepared for longer boot-times on systems with 1000's of ZFS fs's.)
  So: if you have limited RAM, it's probably a good idea to configure more disk-based swap-space to help make up for the relative lack of RAM.
  For all the disks you use as basic devices in your ZFS Storage-Pools, it's best to have as many disk-controllers as feasible: 2 instead of 1; 3 or 4 instead of 2. (This is not so much related specifically to ZFS but, rather, to storage data-redundancy and high-availability (HA) issues in general; it's also related to read/write performance. It's a general best-practices recommendation for any virtual-volume configuration.)
checksum
In ZFS, a checksum is a 256-bit hash of the data in a file system block. The checksum capability can range from fletcher2 (the default; simple and fast) to cryptographically-strong hashes, such as SHA256.
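For example, a hedged sketch (pool-name illustrative) of switching a dataset to the SHA256 checksum and verifying the setting:

# zfs set checksum=sha256 fishlegs
# zfs get checksum fishlegs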
clone
In ZFS, a clone is a file-system whose initial contents are identical to the contents of a snapshot. ("Initial" meaning that, after you've created the clone, you can make changes to the file-system, so that its present contents are no longer its "initial contents".)
dataset
In ZFS, a dataset is the generic term that covers the following ZFS entities:
  clones       --a file-system whose initial contents are identical to the contents of some snapshot
  file-systems --a ZFS file-system mounted within the standard system namespace
  snapshots    --a read-only image of a file-system or volume at a given point in time
  volumes      --a dataset used to emulate a physical device
Each dataset is identified by a unique name within the namespace of any ZFS installation, using the following format:
    pool/path[@snapshot]
Where ...
  pool     = name of the storage-pool containing the dataset
  path     = slash-delimited pathname for the dataset
  snapshot = optional component that identifies a snapshot
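Examples (using names that appear in the cheatsheet later in this document):

  fishlegs                  <-- the pool's top-level dataset
  fishlegs/heels/calluses   <-- a file-system two levels below the pool
  fishlegs/toes@knuckles    <-- a snapshot of the "toes" file-system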
default file-systems
In ZFS, the default file-systems are the file-systems that are created by default when using Live Upgrade (LU) to migrate from UFS to a ZFS root. The current set of default file-systems is: [...]
intent-log (ZIL)
Starting with the 10/08 release of Solaris-10, a new feature of ZFS, the ZFS Intent-Log (ZIL), was added to comply with POSIX standards for synchronous transactions. The creation of the ZIL is automatic when a Storage-Pool is created. By default, the ZIL is allocated from blocks within the main storage-pool. However, better performance might be possible by using separate intent-log devices in your ZFS storage-pool, such as NVRAM or a dedicated disk. Later in this document, there is a section explaining considerations for setting up a separate device for the ZIL, and commands for how to do it. Examples of this specific type of POSIX-compliance are these (from Sun's ZFS Admin Guide PDF): It is common for a database to require that its transactions be on stable storage-devices when returning from a system-call. The fsync() function is used by various applications to ensure that all data associated with a particular file-descriptor has been reliably transferred to the device associated with that file-descriptor, or else to generate an error.
mirror
The term "mirror" means the same thing in ZFS as elsewhere in file-system & virtual-volume configurations --i.e., RAID-1.
primary boot-environment
"primary boot-environment" (PBE) is actually a Sun Live Upgrade (LU) term but LU is so prevalent in the typical implementation of ZFS that some of LUs terms must be effectively considered as practically applying to ZFS. In LU, the PBE is a boot-environment that (A) has been used by the lucreate command to build an "alternate boot-environment" and (B) which, by default, is the present boot-environment, but which setting can be overridden, for example, by using the lucreate -s command.
RAID-Z
In ZFS, RAID-Z refers to a virtual device that stores data and single-parity on multiple disks, similar to RAID-5.
RAID-Z2
In ZFS, RAID-Z2 refers to a virtual device that stores data and double-parity on multiple disks, similar to RAID-6.
resilvering
In ZFS, resilvering is the process of transferring data from one device to another. For example: when you have a mirrored dataset --with some actual data in it-- and one component of that mirror is being brought online, the data on one of the mirror-components that never went offline gets copied to the newly-onlined component. In traditional virtual-volume terms, this is called mirror-resynchronization.
shared file-systems
In a Sun Live Upgrade (LU) context, shared file-systems are those that the ABEs (Alternate Boot-Environments) and PBE (Primary Boot-Environment) share in common and that do not need to be changed when switching the PBE from one to another.
snapshot
In ZFS, a snapshot is a read-only image of a file-system or volume at any given point in time.
Storage-Pool
See "pool".
virtual device (vdev)
Partly quoted and partly paraphrased from Sun's Solaris-10 ZFS Admin Guide: In ZFS, a virtual device is a logical device, as defined in a pool, which can refer to a physical device, a file, or a collection of devices. Each storage-pool is comprised of one or more "virtual devices", aka "vdevs". A virtual device (vdev) is an internal representation, within the storage-pool, that describes the layout of physical storage and its fault characteristics. As such, a vdev represents the disk devices or files that are used to create the storage-pool. A pool can have any number of vdevs at the top of the storage-pool's configuration, known as "top-level vdevs", aka "root vdevs".
volume
In ZFS, a volume is a dataset used to emulate a physical device.
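A hedged sketch (names and size illustrative) of creating a volume and using it as a swap-device, one of the uses Sun documents:

# zfs create -V 2gb fishlegs/swapvol
# swap -a /dev/zvol/dsk/fishlegs/swapvol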
ZIL
See "intent-log (ZIL)", above; also see the "ZFS Intent-Log (ZIL)" section later in this document.

ZFS Component-Naming Syntax
[!! I've never seen examples of these types of names, either in manuals or from the various times that I have created ZFS components myself. --JRA !!]
Each ZFS component must have an assigned name for which the syntax must adhere to the following rules:
  Unnamed components are not allowed.
  The only characters allowed in component-names are alpha and numeric characters, plus the following 4 special characters:
      underscore  ("_")
      hyphen      ("-")
      colon       (":")
      period      (".")
  Any pool-name must begin with an alpha character and also has the following restrictions:
      If you begin a pool-name with the character "c", then you cannot immediately follow it with any numeric character (0-9): names of the form "c[0-9]..." are reserved, because they look like disk-device names.
      [Per zpool(1M), the names "mirror", "raidz", "spare", and "log" are also reserved.]
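Illustrative examples (names of my own making, not from Sun's docs):

  Acceptable:     tank; fishlegs; pool_01; backup-2; my.pool
  Not acceptable: 2tank (does not begin with an alpha character); c1t0d0 (begins with "c" immediately followed by a numeral)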
Introductory Cheatsheet: Create a simple ZFS Storage-Pool and Basic File-System*
*-- This is being done because, even though the use of individual slices is officially not recommended, my available disks, at the time I created these cheatsheet-examples, were too few to give me the variety of examples I wanted. --JRAvery
Create and Modify a Simple Striped Storage-Pool

Create the New Pool
1) Create a simple striped pool named "fishlegs" from two slices:
# zpool create fishlegs c0t2d0s3 c0t3d0s3

2) [If zpool complains that the slices are in use, force the creation with "-f":]
# zpool create -f fishlegs c0t2d0s3 c0t3d0s3
View the new pool:
# zpool list
NAME       SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
fishlegs   1.58G   91K   1.58G   0%  ONLINE  -

# zpool status
  pool: fishlegs
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        fishlegs      ONLINE       0     0     0
          c0t2d0s3    ONLINE       0     0     0
          c0t3d0s3    ONLINE       0     0     0
# zfs get "all" fishlegs
NAME      PROPERTY         VALUE                  SOURCE
fishlegs  type             filesystem             -
fishlegs  creation         Sat May 30 16:23 2009  -
fishlegs  used             108K                   -
fishlegs  available        1.55G                  -
fishlegs  referenced       18K                    -
fishlegs  compressratio    1.00x                  -
fishlegs  mounted          yes                    -
fishlegs  quota            none                   default
fishlegs  reservation      none                   default
fishlegs  recordsize       128K                   default
fishlegs  mountpoint       /fishlegs              default
fishlegs  sharenfs         off                    default
fishlegs  checksum         on                     default
fishlegs  compression      off                    default
fishlegs  atime            on                     default
fishlegs  devices          on                     default
fishlegs  exec             on                     default
fishlegs  setuid           on                     default
fishlegs  readonly         off                    default
fishlegs  zoned            off                    default
fishlegs  snapdir          hidden                 default
fishlegs  aclmode          groupmask              default
fishlegs  aclinherit       restricted             default
fishlegs  canmount         on                     default
fishlegs  shareiscsi       off                    default
fishlegs  xattr            on                     default
fishlegs  copies           1                      default
fishlegs  version          3                      -
fishlegs  utf8only         off                    -
fishlegs  normalization    none                   -
fishlegs  casesensitivity  sensitive              -
fishlegs  vscan            off                    default
fishlegs  nbmand           off                    default
fishlegs  sharesmb         off                    default
fishlegs  refquota         none                   default
fishlegs  refreservation   none                   default
[Will not repeat this command unless important, because of the verbosity of the output. --JRAvery]
# zpool get "all" fishlegs
NAME      PROPERTY     VALUE                 SOURCE
fishlegs  size         1.58G                 -
fishlegs  used         95.5K                 -
fishlegs  available    1.58G                 -
fishlegs  capacity     0%                    -
fishlegs  altroot      -                     default
fishlegs  health       ONLINE                -
fishlegs  guid         17063033720074425605  -
fishlegs  version      10                    default
fishlegs  bootfs       -                     default
fishlegs  delegation   on                    default
fishlegs  autoreplace  off                   default
fishlegs  cachefile    -                     default
fishlegs  failmode     wait                  default
# df -h fishlegs
Filesystem   size  used  avail  capacity  Mounted on
fishlegs     1.5G   18K   1.5G        1%  /fishlegs

# df -h /fishlegs
Filesystem   size  used  avail  capacity  Mounted on
fishlegs     1.5G   18K   1.5G        1%  /fishlegs

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root  root     2 May 30 14:49 .
drwxr-xr-x  37 root  root  1024 May 30 14:49 ..
Add Components to the Pool
3) Add two more slices to the pool:
# zpool add fishlegs c0t4d0s3 c0t5d0s3

4) [If zpool complains that the slices are in use, force the addition with "-f":]
# zpool add -f fishlegs c0t4d0s3 c0t5d0s3
5) View the expanded pool:
# zpool list
NAME       SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
fishlegs   3.16G  126K  3.16G   0%  ONLINE  -

# zpool status
  pool: fishlegs
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        fishlegs      ONLINE       0     0     0
          c0t2d0s3    ONLINE       0     0     0
          c0t3d0s3    ONLINE       0     0     0
          c0t4d0s3    ONLINE       0     0     0
          c0t5d0s3    ONLINE       0     0     0

# zpool get "all" fishlegs
NAME      PROPERTY     VALUE                 SOURCE
fishlegs  size         3.16G                 -
fishlegs  used         178K                  -
fishlegs  available    3.16G                 -
fishlegs  capacity     0%                    -
fishlegs  altroot      -                     default
fishlegs  health       ONLINE                -
fishlegs  guid         17063033720074425605  -
fishlegs  version      10                    default
fishlegs  bootfs       -                     default
fishlegs  delegation   on                    default
fishlegs  autoreplace  off                   default
fishlegs  cachefile    -                     default
fishlegs  failmode     wait                  default

# df -h /fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    3.1G   18K   3.1G        1%  /fishlegs
Attempt Various Operations on the Pool
6) Attempt to "detach" one of the component-devices in the pool [this fails: "detach" applies only to mirrored (or replacing) vdevs]:
# zpool detach fishlegs c0t3d0s3

7) Attempt to "remove" one of the component-devices in the pool:
# zpool remove fishlegs c0t3d0s3
cannot remove c0t3d0s3: only inactive hot spares or cache devices can be removed
8) [...]
9) [...]
# zpool status

10) Attempt to "replace" one of the component-devices in the pool:
# zpool replace fishlegs c0t3d0s3 c0t2d0s4

11) Check the pool immediately:
# zpool status
[output shows the resilver still in progress] <--[You see this if you run "zpool status" quickly enough after you get the prompt back from the "replace" command.]
After the resilver completes:
# zpool list
NAME       SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
fishlegs   3.16G  228K  3.16G   0%  ONLINE  -

# zpool status
  pool: fishlegs
 state: ONLINE
 scrub: resilver completed after 0h0m with 0 errors on Sat May 30 16:14:02 2009
config:

        NAME          STATE     READ WRITE CKSUM
        fishlegs      ONLINE       0     0     0
          c0t2d0s3    ONLINE       0     0     0
          c0t2d0s4    ONLINE       0     0     0   <--[replaced c0t3d0s3]
          c0t4d0s3    ONLINE       0     0     0
          c0t5d0s3    ONLINE       0     0     0
# zpool get "all" fishlegs
NAME      PROPERTY     VALUE                 SOURCE
fishlegs  size         3.16G                 -
fishlegs  used         178K                  -
fishlegs  available    3.16G                 -
fishlegs  capacity     0%                    -
fishlegs  altroot      -                     default
fishlegs  health       ONLINE                -
fishlegs  guid         17063033720074425605  -
fishlegs  version      10                    default
fishlegs  bootfs       -                     default
fishlegs  delegation   on                    default
fishlegs  autoreplace  off                   default
fishlegs  cachefile    -                     default
fishlegs  failmode     wait                  default
[no change from Step #4]

# df -h /fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    3.1G   18K   3.1G        1%  /fishlegs
[no change from Step #4]
NOTE: [...]

Export and Import the Pool
12) Export the pool:
# zpool export fishlegs

13) Confirm that the pool is no longer visible to the system:
# zpool list
no pools available
# zpool status fishlegs
cannot open 'fishlegs': no such pool
# df -h /fishlegs
df: (/fishlegs ) not a block device, directory or mounted resource
# df -h fishlegs
df: (fishlegs ) not a block device, directory or mounted resource
14) Import the pool:
# zpool import fishlegs

15) Confirm that the pool is back:
# zpool list
NAME       SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
fishlegs   3.16G  126K  3.16G   0%  ONLINE  -

# zpool status
  pool: fishlegs
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        fishlegs      ONLINE       0     0     0
          c0t2d0s3    ONLINE       0     0     0
          c0t2d0s4    ONLINE       0     0     0
          c0t4d0s3    ONLINE       0     0     0
          c0t5d0s3    ONLINE       0     0     0

# zpool get "all" fishlegs
NAME      PROPERTY     VALUE                 SOURCE
fishlegs  size         3.16G                 -
fishlegs  used         178K                  -
fishlegs  available    3.16G                 -
fishlegs  capacity     0%                    -
fishlegs  altroot      -                     default
fishlegs  health       ONLINE                -
fishlegs  guid         13624054972794092142  -
fishlegs  version      10                    default
fishlegs  bootfs       -                     default
fishlegs  delegation   on                    default
fishlegs  autoreplace  off                   default
fishlegs  cachefile    -                     default
fishlegs  failmode     wait                  default

# df -h /fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    3.1G   18K   3.1G        1%  /fishlegs
16) Check the contents of /fishlegs and FS-mounts related to the fishlegs pool:
# ls -al /fishlegs
total 5
drwxr-xr-x   2 root  root     2 May 30 16:47 .
drwxr-xr-x  37 root  root  1024 May 30 16:23 ..

# df -h /fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    3.1G   18K   3.1G        1%  /fishlegs

# mount
[... output abbreviated ...]
/fishlegs on fishlegs read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010009 on Sat May 30 16:23:35 2009
Add (more) File-Systems to the Pool
17) Add a new file-system, "toes", to the fishlegs pool:
# zfs create fishlegs/toes

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root  root     2 May 30 16:47 .
drwxr-xr-x  37 root  root  1024 May 30 16:23 ..
drwxr-xr-x   2 root  root     2 May 30 16:47 toes

# df -h
[... output abbreviated ...]
fishlegs       3.1G  18K  3.1G  1%  /fishlegs
fishlegs/toes  3.1G  18K  3.1G  1%  /fishlegs/toes

# mount
[... output abbreviated ...]
/fishlegs on fishlegs read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010009 on Sat May 30 16:23:35 2009
/fishlegs/toes on fishlegs/toes read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000c on Sat May 30 16:48:05 2009
18) Add three more new file-systems to the fishlegs pool:
# zfs create fishlegs/heels
# zfs create fishlegs/heels/calluses
# zfs create fishlegs/claws

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root  root     2 May 30 16:47 .
drwxr-xr-x  37 root  root  1024 May 30 16:23 ..
drwxr-xr-x   2 root  root     2 May 30 16:58 claws
drwxr-xr-x   2 root  root     2 May 30 16:54 heels
drwxr-xr-x   2 root  root     2 May 30 16:47 toes

# ls -al /fishlegs/heels
total 9
drwxr-xr-x   3 root  root     3 May 30 17:35 .
drwxr-xr-x   5 root  root     5 May 30 16:58 ..
drwxr-xr-x   2 root  root     2 May 30 17:34 calluses

# df -h
[... output abbreviated ...]
fishlegs                 3.1G  18K  3.1G  1%  /fishlegs
fishlegs/toes            3.1G  18K  3.1G  1%  /fishlegs/toes
fishlegs/heels           3.1G  18K  3.1G  1%  /fishlegs/heels
fishlegs/claws           3.1G  18K  3.1G  1%  /fishlegs/claws
fishlegs/heels/calluses  3.1G  18K  3.1G  1%  /fishlegs/heels/calluses

# mount
[... output abbreviated ...]
/fishlegs on fishlegs read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010009 on Sat May 30 16:23:35 2009
/fishlegs/toes on fishlegs/toes read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000c on Sat May 30 16:48:05 2009
/fishlegs/heels on fishlegs/heels read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000d on Sat May 30 16:54:47 2009
/fishlegs/claws on fishlegs/claws read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000f on Sat May 30 16:58:18 2009
/fishlegs/heels/calluses on fishlegs/heels/calluses read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010010 on Sat May 30 17:35:00 2009
Create Snapshots of the Pool and Its File-Systems
20) Create a snapshot of the pool's top-level file-system:
# zfs snapshot fishlegs@seafloor

21) List the snapshot:
# zfs list -t snapshot
NAME               USED  AVAIL  REFER  MOUNTPOINT
fishlegs@seafloor     0      -    22K  -
22) Create snapshots of individual file-systems within the fishlegs pool:
# zfs snapshot fishlegs/claws@nails
# zfs snapshot fishlegs/toes@knuckles
# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
fishlegs                  233K  3.11G    22K  /fishlegs
fishlegs@seafloor            0      -    22K  -
fishlegs/claws             18K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails         0      -    18K  -
fishlegs/heels             37K  3.11G    19K  /fishlegs/heels
fishlegs/heels/calluses    18K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/toes              18K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles       0      -    18K  -

# zfs list -t snapshot
NAME                     USED  AVAIL  REFER  MOUNTPOINT
fishlegs@seafloor           0      -    22K  -
fishlegs/claws@nails        0      -    18K  -
fishlegs/toes@knuckles      0      -    18K  -
23) Create individual snapshots of all the "descendent" file-systems with a single command, using the "-r" (recursive) switch:
# zfs snapshot -r fishlegs@all
# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
fishlegs                      234K  3.11G    22K  /fishlegs
fishlegs@seafloor                0      -    22K  -
fishlegs@all                     0      -    22K  -
fishlegs/claws                 18K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails             0      -    18K  -
fishlegs/claws@all               0      -    18K  -
fishlegs/heels                 37K  3.11G    19K  /fishlegs/heels
fishlegs/heels@all               0      -    19K  -
fishlegs/heels/calluses        18K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/heels/calluses@all      0      -    18K  -
fishlegs/toes                  18K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles           0      -    18K  -
fishlegs/toes@all                0      -    18K  -

# zfs list -t snapshot
NAME                          USED  AVAIL  REFER  MOUNTPOINT
fishlegs@seafloor                0      -    22K  -
fishlegs@all                     0      -    22K  -
fishlegs/claws@nails             0      -    18K  -
fishlegs/claws@all               0      -    18K  -
fishlegs/heels@all               0      -    19K  -
fishlegs/heels/calluses@all      0      -    18K  -
fishlegs/toes@knuckles           0      -    18K  -
fishlegs/toes@all                0      -    18K  -
24) Create individual snapshots, recursively, for only the file-systems from "heels" on down:
# zfs snapshot -r fishlegs/heels@rear
# zfs list
NAME                           USED  AVAIL  REFER  MOUNTPOINT
[ ... output abbreviated ... ]
fishlegs/heels@rear               0      -    19K  -
[...]
fishlegs/heels/calluses@rear      0      -    18K  -
[...]

# zfs list -t snapshot
NAME                           USED  AVAIL  REFER  MOUNTPOINT
[ ... output abbreviated ... ]
fishlegs/heels@rear               0      -    19K  -
fishlegs/heels/calluses@rear      0      -    18K  -

# zfs list -r -t snapshot -o name,creation fishlegs
NAME                          CREATION
[ ... output abbreviated; lists every snapshot in the pool with its creation time, e.g. Sat May 30 17:44 2009 ... ]
# zfs list -r -t snapshot -o name,creation fishlegs/heels
NAME                          CREATION
fishlegs/heels@all            Sat May 30 17:52 2009
fishlegs/heels@rear           Sat May 30 17:56 2009
fishlegs/heels/calluses@all   Sat May 30 17:52 2009
fishlegs/heels/calluses@rear  Sat May 30 17:56 2009
Create & Delete a Clone
25) Check the current contents of /fishlegs, then create a clone from the "knuckles" snapshot:
# ls -al /fishlegs/
total 14
drwxr-xr-x   5 root  root     5 May 31 18:54 .
drwxr-xr-x  37 root  root  1024 May 30 16:23 ..
drwxr-xr-x   2 root  root     2 May 30 16:58 claws
drwxr-xr-x   3 root  root     4 May 31 18:52 heels
drwxr-xr-x   2 root  root     2 May 30 16:48 toes

# zfs clone fishlegs/toes@knuckles fishlegs/bunions

# ls -al /fishlegs/
total 17
drwxr-xr-x   6 root  root     6 May 31 22:16 .
drwxr-xr-x  37 root  root  1024 May 30 16:23 ..
drwxr-xr-x   2 root  root     2 May 30 16:48 bunions
drwxr-xr-x   2 root  root     2 May 30 16:58 claws
drwxr-xr-x   3 root  root     4 May 31 18:52 heels
drwxr-xr-x   2 root  root     2 May 30 16:48 toes

# df -h
Filesystem        size  used  avail  capacity  Mounted on
[ ... output abbreviated ... ]
fishlegs/bunions  3.1G   18K   3.1G        1%  /fishlegs/bunions

# zfs list fishlegs/bunions
NAME              USED  AVAIL  REFER  MOUNTPOINT
fishlegs/bunions     0  3.11G    18K  /fishlegs/bunions
26) Try to "destroy" (remove) the snapshot from which the "bunions" clone was created:
# zfs destroy fishlegs/toes@knuckles
cannot destroy 'fishlegs/toes@knuckles': snapshot has dependent clones
use '-R' to destroy the following datasets:
fishlegs/bunions
27) Instead, destroy the "bunions" clone itself:
# zfs destroy fishlegs/bunions

Confirm:
# ls -al /fishlegs/
drwxr-xr-x   6 root  root     6 May 31 22:16 .
drwxr-xr-x  37 root  root  1024 May 30 16:23 ..
drwxr-xr-x   2 root  root     2 May 30 16:58 claws
drwxr-xr-x   3 root  root     4 May 31 18:52 heels
drwxr-xr-x   2 root  root     2 May 30 16:48 toes

# df -h /fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    3.1G   25K   3.1G        1%  /fishlegs

# zfs list fishlegs/bunions
[reports that the dataset no longer exists]
28) Now again try to "destroy" (remove) the snapshot from which the "bunions" clone was created:
# zfs destroy fishlegs/toes@knuckles

Confirm:
# zfs list -t snapshot
NAME                          USED  AVAIL  REFER  MOUNTPOINT
[ ... output abbreviated; "fishlegs/toes@knuckles" is gone ... ]
fishlegs/toes@all              15K      -    18K  -

Create & Promote a Clone
29) Check the contents of /fishlegs/heels, then create a clone from the "rear" snapshot:
# ls -al /fishlegs/heels/
total 9
drwxr-xr-x   3 root  root     3 May 31 17:49 .
drwxr-xr-x   5 root  root     5 May 30 16:58 ..
drwxr-xr-x   2 root  root     2 May 30 17:34 calluses
Now, create the clone [It's not necessary to name the clone the same as the snapshot; I'm simply choosing to do that here. --JRAvery]:
# zfs clone fishlegs/heels@rear fishlegs/heels/rear

# ls -al /fishlegs/heels/
drwxr-xr-x   4 root  root     4 May 31 17:52 .
drwxr-xr-x   5 root  root     5 May 30 16:58 ..
drwxr-xr-x   2 root  root     2 May 30 17:34 calluses
drwxr-xr-x   3 root  root     3 May 30 17:35 rear

# ls -al /fishlegs/heels/rear/
total 9
drwxr-xr-x   3 root  root     3 May 30 17:35 .
drwxr-xr-x   4 root  root     4 May 31 17:52 ..
drwxr-xr-x   2 root  root     2 May 30 17:35 calluses

# df -h
Filesystem           size  used  avail  capacity  Mounted on
[ ... output truncated ... ]
fishlegs/heels/rear  3.1G   19K   3.1G        1%  /fishlegs/heels/rear

# zfs list fishlegs/heels/rear
NAME                 USED  AVAIL  REFER  MOUNTPOINT
fishlegs/heels/rear     0  3.11G    19K  /fishlegs/heels/rear
30) [...]
31) Promote the "fishlegs/heels/rear" clone to replace the "fishlegs/heels" file-system:
# zfs promote fishlegs/heels/rear

32) [...]

33) Complete the promotion by renaming the file-systems:
# zfs rename fishlegs/heels fishlegs/heels_Orig
# zfs list -r fishlegs
NAME                               USED  AVAIL  REFER  MOUNTPOINT
fishlegs                           402K  3.11G    22K  /fishlegs
fishlegs@seafloor                     0      -    22K  -
fishlegs@all                          0      -    22K  -
fishlegs/claws                      33K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails                  0      -    18K  -
fishlegs/claws@all                    0      -    18K  -
fishlegs/heels_Orig               88.5K  3.11G    21K  /fishlegs/heels_Orig
fishlegs/heels_Orig/calluses        33K  3.11G    18K  /fishlegs/heels_Orig/calluses
fishlegs/heels_Orig/calluses@all      0      -    18K  -
fishlegs/heels_Orig/calluses@rear     0      -    18K  -
fishlegs/heels_Orig/rear          36.5K  3.11G  20.5K  /fishlegs/heels_Orig/rear
fishlegs/heels_Orig/rear@all          0      -    19K  -
fishlegs/heels_Orig/rear@rear         0      -    19K  -
fishlegs/toes                       33K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles                0      -    18K  -
fishlegs/toes@all                     0      -    18K  -

# zfs rename fishlegs/heels_Orig/rear fishlegs/heels
# zfs list -r fishlegs
NAME                               USED  AVAIL  REFER  MOUNTPOINT
fishlegs                           434K  3.11G    23K  /fishlegs
fishlegs@seafloor                     0      -    22K  -
fishlegs@all                          0      -    22K  -
fishlegs/claws                      33K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails                  0      -    18K  -
fishlegs/claws@all                    0      -    18K  -
fishlegs/heels                    36.5K  3.11G  20.5K  /fishlegs/heels
fishlegs/heels@all                    0      -    19K  -
fishlegs/heels@rear                   0      -    19K  -
fishlegs/heels_Orig                 52K  3.11G    21K  /fishlegs/heels_Orig
fishlegs/heels_Orig/calluses        33K  3.11G    18K  /fishlegs/heels_Orig/calluses
fishlegs/heels_Orig/calluses@all      0      -    18K  -
fishlegs/heels_Orig/calluses@rear     0      -    18K  -
fishlegs/toes                       33K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles                0      -    18K  -
fishlegs/toes@all                     0      -    18K  -

# zfs rename fishlegs/heels_Orig/calluses fishlegs/heels/calluses
# zfs list -r fishlegs
NAME                           USED  AVAIL  REFER  MOUNTPOINT
fishlegs                       440K  3.11G    26K  /fishlegs
fishlegs@seafloor                 0      -    22K  -
fishlegs@all                      0      -    22K  -
fishlegs/claws                  33K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails              0      -    18K  -
fishlegs/claws@all                0      -    18K  -
fishlegs/heels                69.5K  3.11G  20.5K  /fishlegs/heels
fishlegs/heels@all                0      -    19K  -
fishlegs/heels@rear               0      -    19K  -
fishlegs/heels/calluses         33K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/heels/calluses@all       0      -    18K  -
fishlegs/heels/calluses@rear      0      -    18K  -
fishlegs/heels_Orig             19K  3.11G    21K  /fishlegs/heels_Orig
fishlegs/toes                   33K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles            0      -    18K  -
fishlegs/toes@all                 0      -    18K  -
34) Confirm the new layout:
# df -h
[ ... output abbreviated; now shows "fishlegs/heels_Orig" (among the others) mounted ... ]

# ls -al /fishlegs/
drwxr-xr-x   5 root  root     5 May 31 18:54 .
drwxr-xr-x  37 root  root  1024 May 30 16:23 ..
drwxr-xr-x   2 root  root     2 May 30 16:58 claws
drwxr-xr-x   3 root  root     4 May 31 18:52 heels
drwxr-xr-x   2 root  root     2 May 30 16:48 toes
Destroy the Pool
40) Destroy the pool, then confirm that it's gone:
# zpool destroy fishlegs
# zpool list
no pools available
################################################################################
[The following few bullets and command-examples have not yet been fleshed out, regarding their final & proper placement and presentation in this cheatsheet. --JRAvery]

  Assume that you will be naming the storage-pool and the file-system with the same name --in this case, "fishlegs".
  Assume that you have two physical disks available, one on each of two different controllers (ideal for mirrored virtual devices):
      c2t1d0
      c3t2d0

The command, for accomplishing the above, is simply this:
# zpool create fishlegs mirror c2t1d0 c3t2d0

# df -h /fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs     40G    1M    40G        0%  /fishlegs

Now assume you want to create one more file-system, within the "fishlegs" pool, to be mounted directly beneath the "/fishlegs" mount-point. The command, for accomplishing the above, is simply this:
# zfs create fishlegs/nosuch

# df -h /fishlegs/nosuch
Filesystem       size  used  avail  capacity  Mounted on
fishlegs/nosuch   40G    1M    40G        0%  /fishlegs/nosuch
################################################################################
==
ZFS Storage-Pool: Types; Considerations; How to Create

Identify Technical and Logistics Requirements for your Storage-Pool

Determine the Available Underlying Devices to be Used

What are the Device-Type Choices?
  Entire physical hard-disks (the best-practices recommendation)
  Individual slices of preformatted hard-disks (possible, but Distinctly *NOT* recommended)
  LUNs from hardware-RAID arrays
  Volumes from software-based volume-managers
  Files (experimental/testing purposes only)
Devices can be referred to by several forms of name, for example:
  c0t0d0
  /dev/dsk/c0t1d0
  c1t0d1s2
  /dev/foo/disk  <--[this one needs more explanation but not yet available]
From Sun's PDF on ZFS: "Disks are identified both by their path and by their device ID, if available. This method allows devices to be reconfigured on a system without having to update any ZFS state. If a disk is switched between controller 1 and controller 2, ZFS uses the device ID to detect that the disk has moved and should now be accessed using controller 2. The device ID is unique to the drive's firmware. While unlikely, some firmware updates have been known to change device IDs. If this situation happens, ZFS can still access the device by path and update the stored device ID automatically. If you inadvertently change both the path and the ID of the device, then export and re-import the pool in order to use it." [Unfortunately, the above explanation begs as many questions as it answers; hope to get better info later.]
Aspects of Using Entire Physical Hard-Disks
Where you choose to use entire physical hard-disks as your basic devices in the pool (the best-practices recommendation), ZFS formats the disk with an EFI (Extensible Firmware Interface) label.

According to an introductory paragraph on Wikipedia on 2009/Mar/26: "The Extensible Firmware Interface (EFI) is a specification that defines a software interface between an operating system and platform firmware. EFI is intended as a significantly improved replacement of the old legacy BIOS firmware interface historically used by all IBM PC-compatible personal computers. The EFI specification was originally developed by Intel, and is now managed by the Unified EFI Forum and is officially known as Unified EFI (UEFI)."

According to documentation at docs.sun.com, the EFI label is distinct from the traditional VTOC (Volume Table of Contents) disk-label, partly in that the VTOC apparently cannot support disk-sizes of 1 TeraByte or larger, while EFI can. Sun also says that one can apply EFI labels to < 1-TB disks by using the "format -e" command on the disk(s) in question. When an EFI label is used on a disk, the format-command's partition-table output appears similar to the following:

Current partition table (original):
Total disk sectors available: 71670953 + 16384 (reserved sectors)

Part  Tag         Flag  First Sector  Size     Last Sector
 0    usr         wm              34  34.18GB     71670953
 1    unassigned  wm               0  0                  0
 2    unassigned  wm               0  0                  0
 3    unassigned  wm               0  0                  0
 4    unassigned  wm               0  0                  0
 5    unassigned  wm               0  0                  0
 6    unassigned  wm               0  0                  0
 7    unassigned  wm               0  0                  0
 8    reserved    wm        71670954  8.00MB      71687337
Aspects of Using Individual Slices of Preformatted Hard-Disks
Situations in which slices are used:
  For a bootable ZFS root pool, the disks in the pool must contain slices. The simplest configuration would be to put the entire disk capacity in slice0 and use that slice for the root pool. (See the hedged sketch after this list.)
  You have a serious need for a single disk to be shared between UFS (on one or more slices) and ZFS (on others).
  A disk is already being used, in part, as a swap- or dump-device.
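A hedged sketch (disk names illustrative) of creating a root pool on slice0 --mirrored, since root pools may be single-disk or mirrored but not RAID-Z:

# zpool create rpool mirror c0t0d0s0 c1t0d0s0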
Aspects of Using LUNs from Hardware-RAID Arrays
Questions about redundancy and performance:
From Sun's ZFS Admin Guide PDF: "If you construct ZFS configurations on top of LUNs from hardware RAID arrays, you need to understand the relationship between ZFS redundancy features and the redundancy features offered by the array. Certain configurations might provide adequate redundancy and performance, but other configurations might not." [Sun's manual provides no specifics for this type of scenario. --JRAvery]
From Sun's "ZFS Best-Practices Guide": "ZFS works well with storage based protected LUNs (RAID-5 or mirrored LUNs from intelligent storage arrays). However, [with this configuration] ZFS cannot heal corrupted blocks that are detected by ZFS checksums."
Storage-Pool performance-consideration:
From Sun's "ZFS Best-Practices Guide": If you must use LUNs then try at least to use "LUNs made up of [only] a few disks. By providing ZFS with more visibility into the LUNs setup, ZFS is able to make better I/O scheduling decisions."
Aspects of Using Files
  Intended only for experimental and testing purposes.
  Files must be at least 64-MB.
  If a file is moved or otherwise renamed, the storage-pool must be exported and re-imported to be able to use the changed file.
  [Sun's PDF does not say what type(s) of files can be used, nor what special command to use, if any, to create these files. A natural assumption is the same command for creating files that can be used as swap-files, but not yet sure.]
I found evidence that the "dd" command can be used, like this (see the Blog: http://i18n-freedom.blogspot.com/search?q=%22How+to+turn+a+mirror+in+to+a+RAID%22):

Create a file of at least 64-MB with "dd" [a sketch that creates a 128-MB file; the blog's exact command was not preserved here]:
# dd if=/dev/zero of=/xenophanes/disk.img bs=1024k count=128

Attach the file to a loopback block-device with "lofiadm":
# lofiadm -a /xenophanes/disk.img
/dev/lofi/1

Create the ZFS RAID-Z Storage-Pool "heraclitus" with a RAID-Z vdev that includes /dev/lofi/1 as one of the vdev-components [the other device names here are illustrative]:
# zpool create heraclitus raidz c1t0d0 c2t0d0 /dev/lofi/1
(I've sent the blogger a message, asking whether or not he knows whether or not the "dd" command is the only way to create files for this purpose: never got back any response. --JRAvery) ==
Choose the Type of Data-Replication for the Storage-Pool

What are the Replication-Type Choices?
  Striped (aka "Dynamic Striping") (non-redundant)
  Mirrored (multiple identical copies of the same data, dynamically maintained)
  RAID-Z (single parity) and RAID-Z2 (double parity)
Mirrors
You cannot outright remove a device from a mirrored storage-pool; "An RFE is filed for this feature." <--[The Sun guide does *not* clarify the difference between this, which is not supported, and the above-mentioned "replace" and "detach" operations, which are supported!]
RAID-Z: Single and Double Parity
Single-parity RAID-Z is basically like original RAID-5. It requires at least 2 disks (though more are recommended and typically used): 1 for an original-data stripe and 1 more for 1 parity-stripe, for each striped volume or vdev. Use the "raidz" or "raidz1" option on the command-line.

Double-parity RAID-Z (RAID-Z2) is basically like RAID-6. It requires at least 3 disks (though more are recommended and typically used): 1 for an original-data stripe and 2 more for each of two parity-stripes, for each striped volume or vdev. Use the "raidz2" option on the command-line.

(Per Sun's ZFS Admin Guide): "A RAID-Z configuration with N disks of size X with P parity disks can hold approximately (N-P)*X bytes and can withstand P device(s) failing before data integrity is compromised."

(Per Sun's ZFS Admin Guide): If you are creating a RAID-Z Storage-Pool with more than 10 disks, it is better to create 2 or more RAID-Z top-level vdevs within the storage-pool, to get better performance. For example, with 12 disks: create 2 RAID-Z top-level vdevs with 6 disks each, rather than a single top-level vdev with 12 disks. In other words: the recommended number of disks, per RAID-Z top-level vdev, is from 3 to 9. (A hedged command-sketch follows.)
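A hedged command-sketch of that 12-disk recommendation (pool- and device-names illustrative):

# zpool create bigpool raidz c0t0d0 c0t1d0 c0t2d0 c0t3d0 c0t4d0 c0t5d0 \
>        raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0

The result is a single storage-pool containing two 6-disk RAID-Z top-level vdevs; ZFS dynamically stripes data across the two vdevs.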
Striped (Simple; no Parity)
Provides *NO* data-redundancy: *NO* protection from failed devices!

Choosing between Mirrored; RAID-Z; and RAID-Z2

Mirrored:
Advantages:
  Generally the best performance, particularly for small/random reads.
Disadvantages:
  To store X amount of data, uses more disk-space than RAID-Z or RAID-Z2.
  Must use at least 3-way mirrors --which requires more disk-space than 2-way mirrors-- to approach the MTTDL (Mean-Time To Data-Loss) of RAID-Z and RAID-Z2. (This is not necessarily a notable disadvantage, given that some shops assume the need for 3-way mirrors, as a matter of course.)

RAID-Z:
Advantages:
  Uses the least amount of disk-space of the ZFS replication models that provide redundancy.
  Performs well *WHEN* data reads & writes occur in large chunks of 128-K or more.
Disadvantages:
  Not quite as good MTTDL (Mean-Time To Data-Loss) as RAID-Z2.
  Performs less well for random reads, compared to mirrors.

RAID-Z2:
Advantages:
  Distinctly the best MTTDL (Mean-Time To Data-Loss) when compared to RAID-Z or 2-way mirrors.
  Performance similar to RAID-Z.
Disadvantages:
  Performs less well for random reads, compared to mirrors.
A formula for sizing RAID-Z vdevs by IOPS:
  Determine the number of disks you have and, specifically, what their IOPS (I/O Operations-Per-Second) capability is: X. (You need to have a bunch of disks that share the same IOPS stats, for this to work.)
  Determine your target rate of file-system blocks-per-second: Y.
  Divide Y by X (Y/X) --in other words, divide your FS-blocks/sec target by the IOPS of which your disks are capable.
  The result is the number of disks you should include in each RAID-Z grouping [not RAID-Z2, I think, which uses a slightly different formula --JRA].

EXAMPLE:
  50 disks, each capable of 250 IOPS. Target FS-blocks/sec = 1000. Y/X = 1000/250 = 4 disks for each RAID-Z top-level vdev in the ZFS Storage-Pool.
RAID-Z is a great technology not only when disk blocks are your most precious
resources but also when your available IOPS far exceed your expected needs. But beware that if you get your hands on fewer very large disks, the IOPS capacity can easily become your most precious resource. Under those conditions, mirroring should be strongly favored or alternatively a dynamic stripe of RAID-Z groups each made up of a small number of devices.
Basic Commands to Create a Storage-Pool

Create Striped Storage-Pools (Simple; no Parity) (Command Examples)
Create a simple striped pool from two disks [device names illustrative]:
# zpool create fishy c0t0d0 c1t0d0

The data will be dynamically striped across the two-or-more designated disks (or whatever types of devices you specified, on the command-line, as the components of the "fishy" pool). [*NOTE*: Sun's PDF gives only one example, which is a variation of the above. When creating a mirrored or RAID-Z ZFS-storage-pool, one can create multiple top-level (aka "root") vdevs during the original creation of the pool. Given the above example (and the variation in Sun's PDF), it's not at all clear how one would do this with a simple Striped Storage-Pool. Granted, this seems not particularly important, given that it would probably be very rare for anybody to actually want a simple Striped Storage-Pool, unless maybe they were going only for performance and no HA (High Availability). But, if it's possible, I don't yet know how. --JRA]
Create Mirrored Storage-Pools (Command Examples)
[Device names in these examples are illustrative.]

Create a pool with one 2-way mirror:
# zpool create puppy mirror c0t1d0 c1t1d0

Create a pool with one 3-way mirror:
# zpool create puppy mirror c0t1d0 c1t1d0 c2t1d0

Create a mirrored pool consisting of two 2-way mirrors:
# zpool create puppy mirror c0t1d0 c1t1d0 mirror c0t2d0 c1t2d0

Create a mirrored pool consisting of two 3-way mirrors:
# zpool create puppy mirror c0t1d0 c1t1d0 c2t1d0 mirror c0t2d0 c1t2d0 c2t2d0

NOTE that the above commands also create a single corresponding ZFS file-system, by the name of "puppy", and automatically mount that file-system as "/puppy".
Create RAID-Z Storage-Pools (Command Examples)
[Device names in these examples are illustrative.]

Create a RAID-Z pool consisting of 1 vdev with 3 disks:
# zpool create puppy raidz c0t1d0 c1t0d0 c2t0d0

Create a RAID-Z pool consisting of 2 vdevs with 3 disks each:
# zpool create puppy raidz c0t1d0 c1t0d0 c2t0d0 raidz c0t2d0 c1t1d0 c2t1d0

Create RAID-Z2 Storage-Pools (Command Examples)
Create a RAID-Z2 pool consisting of 1 vdev with 4 disks:
# zpool create puppy raidz2 c0t1d0 c1t0d0 c2t0d0 c3t1d0

Create a RAID-Z2 pool consisting of 2 vdevs with 4 disks each:
# zpool create puppy raidz2 c0t1d0 c1t0d0 c2t0d0 c3t1d0 \
>        raidz2 c0t2d0 c1t1d0 c2t1d0 c3t2d0
Managing Storage-Pool-Creation Errors

Detecting In-Use Devices
# zpool create headcase c1t0d0 c1t1d0
invalid vdev specification
use -f to override the following errors:
/dev/dsk/c1t0d0s0 is currently mounted on /. Please see umount(1M).
/dev/dsk/c1t0d0s1 is currently mounted on swap. Please see swap(1M).
/dev/dsk/c1t1d0s0 is part of active ZFS pool zeepool. Please see zpool(1M).

Some errors can be overridden by using the -f switch, but most cannot.
Examples of errors that cannot be overridden:
  The device is a dedicated dump-device: You cannot use a dedicated dump-device as a ZFS component. Correct the situation with dumpadm, or choose a different device.
  The device is part of another active ZFS pool: To overcome this error, choose a different device; or use "zpool destroy" to destroy the other pool, if it's no longer needed; or use "zpool detach" to detach the device from the other pool.
The following storage-pool-create errors CAN be overridden with the "-f" switch:

Contains a file-system:       The disk, or one or more of its slices, contains a known file-system that is not mounted and not in /etc/vfstab.
Part of an SVM volume:        The disk, or one or more of its slices, is part of an SVM volume.
Live Upgrade ABE:             The disk is designated as an Alternate Boot-Environment (ABE) for Sun's Live Upgrade.
Part of an exported ZFS pool: The disk is part of a storage-pool that has been exported or manually removed from a system. In the latter case, the pool is reported as "potentially active", because the disk might be a network-attached drive in use by another system. Be cautious when overriding a "potentially active" pool.
Mismatched Replication-Types
ZFS will allow you to create a storage-pool whose vdevs have mismatched replication-types, but it does not like it: it will print an error and try to prevent you from doing so.

Example: Combined Simple-Striping and Mirroring in Same Pool
[Device names here are illustrative.] Attempt to combine a plain striped device and a mirror in the same pool:
# zpool create hepburn mirror c0t0d0 c1t0d0 c1t1d0
[fails with a "mismatched replication level" error; force it with "-f":]
# zpool create -f hepburn mirror c0t0d0 c1t0d0 c1t1d0
Controlling ZFS Mount-Point at Pool-Creation: Alternate "Default Mount-Point"; Alternate Root

By default, "zpool create" mounts the pool's top-level file-system at /<pool-name>:
# zpool create fishlegs c0t2d0s3 c0t3d0s3

# df -h fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    1.5G   18K   1.5G        1%  /fishlegs

# df -h /fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    1.5G   18K   1.5G        1%  /fishlegs

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root  root     2 May 30 14:49 .
drwxr-xr-x  37 root  root  1024 May 30 14:49 ..
If you want to customize the top-level mount-point for the top-level file-system (aka, the "Root-Directory") of the Storage-Pool, there are two ways to do it:
Generally, there are 2 scenarios in which you might want to use either of these techniques:
When you designate an "Alternate Root-Directory", the name of the StoragePool will not necessarily appear anywhere in the path to that mount-point, unless you decide to specifically include it on the command-line after the "-R" switch.
Alternate "Default Mount-Point"
If the pool's default mount-point directory already exists and is not empty, "zpool create" complains:
# zpool create topcat c0t3d0
default mountpoint /topcat exists and is not empty
use -m option to specify a different default

EXAMPLE:
# zpool create -m /export/zfs topcat c0t3d0
This uses --and creates, if necessary-- /export/zfs as the mount-point for the pool's top-level file-system.

EXAMPLE:
# zpool create -m /stuff lowdog c0t3d0
This uses --and creates, if necessary-- /stuff as the mount-point.

As far as I can tell, there is no way to prevent ZFS from mounting a related file-system somewhere when the "zpool create" command is executed. --JRA
Alternate Root-Directory
EXAMPLE:
# zpool create -R /panda fishlegs c0t2d0s3 c0t3d0s3

# df -h fishlegs
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    1.5G   18K   1.5G        1%  /panda

# df -h /panda
Filesystem  size  used  avail  capacity  Mounted on
fishlegs    1.5G   18K   1.5G        1%  /panda

# ls -al /panda
total 5
drwxr-xr-x   2 root  root     2 May 30 14:49 .
drwxr-xr-x  37 root  root  1024 May 30 14:49 ..
==
ZFS Intent-Log (ZIL)

Considerations for Using Separate ZIL Devices
The amount of in-play data that might be stored on a log-device is relatively small. Log-blocks are freed when the log transaction (system call) is committed. [NOTE that this can result in the fragmenting to which Neil Perrin refers in his blog, as noted above. --JRA]
Command-Syntax to Create or Add Separate ZIL Devices
[Device names in these examples are illustrative.]

1) Example 1: Create a storage-pool with a single, separate (non-mirrored) log-device:
# zpool create fishy c0t1d0 log c0t2d0

2) Example 2: Starting with the storage-pool and single log-device created in Example 1 above, add a second and separate log-device, not mirrored:
# zpool add fishy log c0t3d0
NOTE that, if the "fishy" pool had not been created with any log-device in the first place, the above command would simply have added a first-ever log-device to the "fishy" pool.

3) Example 3: Starting again with the storage-pool and single log-device created in Example 1 above (pretend that we have not done Example 2), add a second log-device to the pool but, this time, in such a way as to create a mirror, using the original log-device and the newly-added log-device (from the following command) as the two submirrors of the log-device-mirror:
# zpool attach fishy c0t2d0 c0t3d0
Again: now the "fishy" pool's log-device is a mirrored vdev, with c0t2d0 and c0t3d0 acting as the two submirrors of that mirrored vdev.

4) Example 4: Create a new storage-pool that, from the beginning, contains a separate and mirrored log-device:
# zpool create birdy c0t1d0 c1t0d0 log mirror c0t2d0 c1t2d0
5) Example 5: Create a new storage-pool that, from the beginning, contains two separate and mirrored log-devices:
# zpool create doggy c0t1d0 c1t0d0 \
>        log mirror c0t2d0 c1t2d0 log mirror c1t3d0 c2t3d0
6) Example 6: For an existing storage-pool, detach a log-device from a mirrored log-device:
# zpool detach fishy c0t3d0

7) Example 7: For an existing storage-pool, replace one log-device with a different log-device:
# zpool replace fishy c0t2d0 c0t3d0
Again: the result of the above command is that the previous log-device, c0t2d0, has now been replaced with the newly-designated log-device, c0t3d0. [What is not completely clear is this: Before the replacement, was c0t3d0 not part of the pool at all, or did it need to be part of the pool first? Also, does this replace action remove c0t2d0 completely from the pool or does it simply deactivate its status as a log-device? I think the answer to these two is that c0t3d0 was not part of the pool before the replace and c0t2d0 is not part of the pool after the replace but I'm not yet certain.]
==
Storage-Pool Management
[!*NOTE*!: At the moment this note is being added, it is my intention probably to have ZFS File-System Management covered under a different major heading in this document. I might change my mind but that's it for now. --JRAvery]
Create Storage-Pools
[This topic is covered under the major heading "ZFS Storage-Pool: Types; Considerations; How to Create" and the 2nd-level heading "Basic Commands to Create a Storage-Pool".]
View Storage-Pool Data / Information

"zpool list": Simple View of Storage-Pools
# zpool list
NAME     SIZE   USED   AVAIL  CAP  HEALTH  ALTROOT
bump      80G   274K     80G   0%  ONLINE  -
zeeter  42.9G  70.0K   42.8G   0%  ONLINE  /super
hershey   90G    45G     45G  50%  ONLINE  -

# zpool list hershey
NAME     SIZE  USED  AVAIL  CAP  HEALTH  ALTROOT
hershey   90G   45G    45G  50%  ONLINE  -

# zpool list -o name,size,health
NAME     SIZE   HEALTH
bump     80.0G  ONLINE
zeeter   2.3T   ONLINE
hershey  90.0G  ONLINE
"zpool status": Storage-Pool / VDev Details/Status

About "zpool status" Output-Sections
pool:    The name of the pool.
state:   The current health of the pool. NOTE: even pools whose state is ONLINE might still have failing devices or data corruption. See the table under the SubHeader "About ZFS-Storage-Pool Device-States (from "zpool status")", below.
status:  If and only if there is a problem with the pool, this section is displayed to provide a description of the problem.
action:  If and only if there is a problem with the pool, this section is displayed to provide a recommended action for repairing the errors. This field might be in an abbreviated form, directing the user to one of the following sections.
see:     If and only if there is a problem with the pool, this section is displayed to provide a reference to a knowledge-article that contains detailed repair-info.
scrub:   Identifies the current status of a scrub operation, which might include the date and time that the last scrub was completed, a scrub in progress, or that no scrubbing was requested.
config:  Describes the config-layout of the devices in the pool, plus their state and any device-errors. Device-states are any of the following: ONLINE; OFFLINE; FAULTED; DEGRADED; AVAIL; UNAVAILABLE. See the table under the SubHeader "About ZFS-Storage-Pool Device-States (from "zpool status")", below.
         Also, a 2nd subsection of the config section displays error-statistics:
           READ  --I/O errors occurred while issuing a read request.
           WRITE --I/O errors occurred while issuing a write request.
           CKSUM --Checksum errors: the device returned corrupted data as the result of a read request.
         NOTE: These errors can help indicate whether or not the damage is permanent. Only a few I/O errors might indicate a temporary outage; a large number might indicate a permanent device problem.
         NOTE ALSO: Any such errors might not correspond to errors as seen either (A) by the application(s) using the data on the affected vdevs or (B) even at the top level of the vdevs --that is, at the mirror or raidz level. This is because, with these redundant configurations, it's possible that the remaining underlying devices, in the mirror or raidz vdev, were sufficient to prevent any loss of data or functionality when the error occurred. After all, that's what the redundancy is for. ZFS's self-healing-data feature might also have been able to kick in.
errors:  Identifies known data errors or the fact that there are none.
About ZFS-Storage-Pool Device-States (from "zpool status")
ONLINE:      The device is in normal working order. (Some transient errors might still have occurred.)
DEGRADED:    The virtual device (vdev) has experienced a failure but is still able to function, most commonly when a mirror or RAID-Z vdev has lost one or more of its component-devices.
FAULTED:     The device or virtual device (vdev) is completely inaccessible. If a top-level vdev is in this state, then the pool is completely inaccessible. [That makes sense to me if there is only 1 top-level vdev w/in the pool, but what if there are two or more root vdevs?!?!: it should still work, I believe!]
OFFLINE:     The virtual device (vdev) has been deliberately taken offline by the administrator.
UNAVAILABLE: The device or virtual device (vdev) cannot be opened. In some cases, pools with UNAVAILABLE devices appear in DEGRADED mode. If a top-level vdev is unavailable, then nothing in the pool can be accessed. [AGAIN!: Is that last sentence really true?!?! That makes sense to me if there is only 1 top-level vdev w/in the pool, but what if there are two or more root vdevs?!?!: it should still work, I believe!]
AVAIL:       [This is not included in this section in Sun's PDF on ZFS but, elsewhere in the same PDF, I saw examples of "zpool status" output that clearly displayed this state for spare devices, obviously indicating that these spare devices were available for use within the pool as needed.]
REMOVED:     The device was physically removed while the system was running. Device-removal detection is hardware-dependent and might not be supported on all platforms.
Examples of "zpool status" Syntax and Output
Example 1: Request "status" for all pools:
# zpool status
  pool: rootpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rootpool      ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t4d0s0  ONLINE       0     0     0
            c1t3d0s0  ONLINE       0     0     0

  pool: hepcat
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE
        hepcat      ONLINE
          mirror    ONLINE
            c1t2d0  ONLINE
            c2t3d0  ONLINE
          mirror    ONLINE
            c1t3d0  ONLINE
            c2t3d0  ONLINE
          mirror    ONLINE
            c1t4d0  ONLINE
            c2t4d0  ONLINE
        logs        ONLINE
          mirror    ONLINE
            c0t3d0  ONLINE
            c1t5d0  ONLINE
        spares
          c0t5d0    AVAIL
          c1t6d0    AVAIL

[Sorry but I don't remember why I have the above in red-font. --JRAvery]
Example 2: Request "status" for a single pool:
# zpool status rootpool
  pool: rootpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rootpool      ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t4d0s0  ONLINE       0     0     0
            c1t3d0s0  ONLINE       0     0     0
Example 3: Request "status" with "-v" for "verbose" output. (Note: You might not see anything different from not specifying "-v"; it depends on the state(s) of the pool(s).)
# zpool status -v
[no sample output yet]
errors: No known data errors <--[supposedly this will appear after other output, *if* actually no errors]
Example 4: Request "status" with "-x" to get only pools that are erroring or unavailable. Here we get an "UNAVAIL" and "cannot open":
# zpool status -x
  pool: zeepool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using zpool online.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed after 0h12m with 0 errors on Thu Aug 28 09:29:43 2008
config:

        NAME          STATE     READ WRITE CKSUM
        zeepool       DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c1t2d0    ONLINE       0     0     0
            spare     DEGRADED     0     0     0
              c2t1d0  UNAVAIL      0     0     0  cannot open
              c2t3d0  ONLINE       0     0     0
        spares
          c1t3d0      AVAIL
          c2t3d0      INUSE     currently in use

errors: No known data errors <--[not sure this really appears in this type of situation]

*NOTE*: Similar detail can be seen when using "zpool status -v". The difference is that, with "-v", you'll get info about all the pools; with "-x", you get info only about the pools with problems.
Example 5: Request "status" with "-x" to get only pools that are erroring or unavailable. In this example:
# zpool status -x
all pools are healthy
"zpool iostat": Storage-Pool I/O Statistics

Pool-Summary Statistics
# zpool iostat bump
            capacity     operations    bandwidth
pool       used  avail   read  write   read  write
---------  ----  -----  -----  -----  -----  -----
bump       100G  20.0G      0     22      0  6.00K

Pool Statistics per VDev
[The leaf-device names below are illustrative; the originals were not preserved.]
# zpool iostat -v bump
            capacity     operations    bandwidth
pool       used  avail   read  write   read  write
---------  ----  -----  -----  -----  -----  -----
bump       100G  20.0G      0     22      0  6.00K
  mirror   100G  20.0G      0     22      0  6.00K
    c1t0d0    -      -      1    295  11.2K   148K
    c1t1d0    -      -      1    299  11.2K   148K
---------  ----  -----  -----  -----  -----  -----
*NOTE* 1: Space-usage is available only for top-level vdevs (aka "root" vdevs), such as "mirror" in the above output. *NOTE* 2: The numbers might not add up exactly correctly, particularly with Mirrors and RAID-Z/Z2 (as opposed to simple striping). The apparent discrepancy is particularly noticeable immediately and shortly after a pool is first created, because of all the I/O necessary for the pool-creation activities at the individual disk (or partition or file) level.
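Also useful (a hedged sketch; "zpool iostat" accepts an optional interval and count, like the regular iostat command):

# zpool iostat bump 5        <-- print a fresh sample every 5 seconds, until interrupted
# zpool iostat -v bump 5 3   <-- per-vdev stats, every 5 seconds, 3 samples total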
["zpool history": ]
[nothing yet. --jra]
["zpool get": ]
[nothing yet. --jra]
When & Why to Export / Import a Storage-Pool
Scenarios in which a pool must (or should) be exported and imported (a hedged command-sketch follows this list):

  Moving a Storage-Pool from one computer-system to another: export the pool from the original system; perform whatever physical moves might be required; import the pool on the different system.

  You perform a "zpool replace" command and the replacement device has a larger capacity than the device that was replaced: if you do not export and import the pool, the increased size of the replacement device will not be seen/acknowledged by the system.

  In a rare situation in which both the path and Device-ID of a disk are changed without the disk's being technically removed from the Storage-Pool, the pool must be exported and then imported, or else the disk cannot be used in ZFS. From the Solaris ZFS Administrator Guide of 2008/September: "Disks are identified both by their path and by their device ID, if available. ... The device ID is unique to the drive's firmware. While unlikely, some firmware updates have been known to change device IDs. If this situation happens, ZFS can still access the device by path and update the stored device ID automatically. If you inadvertently change both the path and the ID of the device, then export and re-import the pool in order to use it."

  You are using UFS-files as virtual-devices (vdevs) in a Storage-Pool, for testing purposes, and at some point you move or rename one of those vdev-files: the Storage-Pool must then be exported and imported, in order to continue using the pool.

  When growing the size of a LUN. From the Solaris ZFS Administrator Guide of 2008/September: "Currently, when growing the size of an existing LUN that is part of a storage pool, you must also perform the export and import steps to see the expanded disk capacity."

  You previously "offline[d]" a disk from a Storage-Pool (which does not technically remove that disk from that Storage-Pool); later, you destroy that particular Storage-Pool; later again, you want to re-use that disk in a different Storage-Pool but, when you run the "zpool add" command, you get the error "{device} is part of exported or potentially active ZFS pool. Please see zpool(1M)." At this point, you must "import" that previously-destroyed pool again, which will make that disk active again and also make the previously-destroyed pool active again. Now destroy that pool again, and the disk will be freed up for use in the different pool where you now want it.
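A hedged command-sketch of the basic export/import cycle (pool-name from the cheatsheet above; directory illustrative):

# zpool export fishlegs
# zpool import              <-- with no arguments: scans attached devices and lists
                                the pools that are available for import
# zpool import fishlegs     <-- import it, on the same or a different system
# zpool import -d /testdir  <-- search a specific directory, e.g. when files are
                                being used as vdevs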
==
[......]
[.....]
[....]
[...]
==