
SOLARIS-10 ZFS TUTORIALS & CHEATSHEETS

by John R Avery
[Last Updated: 2009/May/31]
[Last Updated (mostly formatting): 2013/Jan/15]
==


Table of Contents

Preface
    Disclaimer
    About This Document
Introduction to ZFS
    ZFS Pooled Storage
    ZFS is a Transactional File-System
    Checksums
    Self-Healing Data
    ZFS Scalability
    ZFS Snapshots
    ZFS Simplified Administration
IMPORTANT System-Config Considerations for ZFS
    ZFS Hardware & Software: Requirements & Recommendations
    From Sun's "ZFS Best-Practices Guide"
ZFS Terminology: An Intro
    Glossary of General ZFS Terms
    ZFS Component-Naming Syntax
Introductory Cheatsheet: Create a simple ZFS Storage-Pool and Basic File-System
    Create and Modify a Simple Striped Storage-Pool
        Create the New Pool
        Add Components to the Pool
        Attempt Various Operations on the Pool
        Export and Import the Pool
        Add (more) File-Systems to the Pool
        Create Snapshots of the Pool and Its File-Systems
        Create & Delete a Clone
        Create & Promote a Clone
        Destroy some [or all?] Snapshots
        Destroy the Pool
ZFS Storage-Pool: Types; Considerations; How to Create
    Identify Technical and Logistics Requirements for your Storage-Pool
        Determine the Available Underlying Devices to be Used
            What are the Device-Type Choices?
            Aspects of Using Entire Physical Hard-Disks
            Aspects of Using Individual Slices of Preformatted Hard-Disks
            Aspects of Using LUNs from Hardware-RAID Arrays
            Aspects of Using Volumes from Software-Based Volume-Managers
            Aspects of Using Files
        Choose the Type of Data-Replication for the Storage-Pool
            What are the Replication-Type Choices?
            Mirrors
            RAID-Z: Single and Double Parity
            Striped (Simple; no Parity)
            Choosing between Mirrored; RAID-Z; and RAID-Z2
        Logistical & Administrative Considerations for the Storage-Pool
    Basic Commands to Create a Storage-Pool
        Create Striped Storage-Pools (Simple; no Parity) (Command Examples)
        Create Mirrored Storage-Pools (Command Examples)
        Create RAID-Z Storage-Pools (Command Examples)
        Create RAID-Z2 Storage-Pools (Command Examples)
    Managing Storage-Pool-Creation Errors
        Detecting In-Use Devices
        Mismatched Replication-Types
            Example:  Combined Simple-Striping and Mirroring in Same Pool
            Example:  Combined 2-Way Mirror and 3-Way Mirror in Same Pool
            Example:  Combined RAID-Z(2) vdevs w/ Differently-Sized-Devices in Same Pool
    Controlling ZFS Mount-Point at Pool-Creation: Alternate "Default Mount-Point"; Alternate Root
        Alternate "Default Mount-Point"
        Alternate Root-Directory
ZFS Intent-Log (ZIL)
    Considerations for Using Separate ZIL Devices
    Command-Syntax to Create or Add Separate ZIL Devices
Storage-Pool Management
    Create Storage-Pools
    View Storage-Pool Data / Information
        "zpool list": Simple View of Storage-Pools
        "zpool status": Storage-Pool / VDev Details/Status
        About "zpool status" Output-Sections
        About ZFS-Storage-Pool Device-States (from "zpool status")
        Examples of "zpool status" Syntax and Output
        "zpool iostat": Storage-Pool I/O Statistics
        Pool-Summary Statistics
        Pool Statistics per VDev
        ["zpool history": ]
        ["zpool get": ]
    When & Why to Export / Import a Storage-Pool
[......]
    [.....]
        [....]
            [...]
==


Preface
Disclaimer
Naturally and predictably, I make no guarantees that the information in this document is either accurate or appropriate to your needs. This document is intended (A) partly as a means by which I can learn and retain information about ZFS; (B) partly as an easy-to-use reference for myself; (C) partly in the hope that other people will find it useful. --John Reed Avery, 2013/Jan/15

About This Document


In general, I have a tendency to be somewhat meticulous in the ways I express things, whether in written or oral form, at least if the topic is technical. I also tend to think in a foundational and hierarchical way about the topics I'm learning and/or teaching/presenting. And I tend to remember, long after the fact, the emotion and process of first learning something. This means that I tend to notice the things that a beginner is likely to wonder about near the start of learning something new. If this type of approach seems reasonable to you, then you might like to start with this document and then move on to Sun's official documentation for details that I don't cover in these 50+ pages.

I also tend to think of things in a three-dimensional-puzzle manner. I know how people can tend to learn a few things --just enough to be dangerous, as the saying goes-- and then run off to try to make things happen, without having spent enough time preparing. I can't prevent you, dear reader, from doing this but, to me, the big picture is part of the details and the details are part of the big picture. So, I sometimes try to include notes, in certain places, that will preemptively warn readers of something of which they might not otherwise have thought when reading about a particular topic. Again, if this type of approach seems reasonable to you, this might be another reason to start with this document, to learn ZFS, and then move on to other resources for details that I don't cover here.

Now that I'm finally publishing this doc on my meager website, I'm perceiving it as deserving a grade of about C+ or B- for what I regard as an ideal document of this type. The color-coding of the headers has nothing to do with that grade; I regard that feature simply as an experiment in variations on how to help the reader keep track of where he is in the doc. This is not how I do section-headers in all my documents. There are a few places where I've left place-holders for info that I've not yet filled in, sometimes because I've not yet been able to find the info; sometimes because I've not yet had time to find it.

So, take it as it is. I hope you find it helpful.
==
--John Reed Avery, 2013/Jan/15


Introduction to ZFS
"ZFS" stands for (or originally stood for) "ZettaByte File-System", where "zetta" is an SI prefix (look it up on Wikipedia) that refers to 1,000^7, in other words (iow) 10^21. Strictly speaking in computer terminology, a "ZettaByte" should be the equivalent of 1,024^7 Bytes (iow 2^70) but, according to Wikipedia, "ZettaByte" is rarely used to mean specifically that, and supposedly somebody has proposed the term "ZebiByte" --aka ZiB-- to refer specifically to the latter. ZFS is a type of file-system that works in such a way as to eliminate the need for a separate volume-management-layer to allow the spreading of file-system data across multiple disks and controllers. Originally, a file-system was relegated to a single disk-device, or a portion ("slice" or "partition") of that device. When people began wanting the dataredundancy and performance-improvements of spreading a file-system's data across multiple disks and/or multiple controllers, volume-management software was developed, to provide a control-layer between the file-system layer and the physical-device layer. The advantage was the avoidance of the hassle of redesigning the file-system infrastructure to account for the multiple devices and multiple controllers: install and implement the volume-manager software and one could then apply the same familiar file-system infrastructures, to those volumes, as had been used for years. The disadvantage was the addition of another layer of complexity and the prevention of various file-system advances that otherwise might have been possible without that extra layer: the file-system could not directly control the placement of data within the volumes. A major point of ZFS is to overcome these disadvantages.

ZFS Pooled Storage


ZFS introduces the concept of storage-pools for managing physical storage. The physical storage-devices are grouped into storage-pools, which describe the various physical attributes of the storage, such as device-layout and data-redundancy. Here is an overview of the relationship between the file-systems and the storage-pools:

- Multiple file-systems can be assigned to a single storage-pool.  Any
  file-system can use space from any device within the storage-pool to which it
  is assigned.

- There is no need to predetermine a specific size for a file-system when you
  create it, because each file-system, from all those assigned to a
  storage-pool, is able to dynamically use whatever extra space it needs from
  all the available space in the entire storage-pool, with no extra steps to
  assign the extra space.

- When new storage-devices are added to the storage-pool, each of the attached
  file-systems can automatically use the added storage-space without any further
  special steps or commands, other than those used to attach the new devices to
  the pool in the first place.

- Storage-Pools automatically and inextricably provide at least one of 4
  different RAID-like features:

  Dynamic Striping:  A variation of RAID-0, which "stripes" data across multiple
                     disks.  This can provide improved read/write performance if
                     each of those multiple disks is on a different controller.
                     [NOTE: Sun's 2008/September ZFS Admin Guide seems not to
                     talk about advantages of having the dynamic striping.
                     Maybe the authors assume that people know enough about RAID
                     to understand this without explanation. --JRAvery, 2009/May]

  Mirroring:         A variation on RAID-1.  This works generally as one expects
                     RAID-1 mirroring to work.

  RAID-Z:            A variation on RAID-5, which provides a single
                     parity-stripe for each such "volume" (aka "vdev", as ZFS
                     calls it) within a storage-pool.

  RAID-Z2:           A variation on RAID-6, which provides two parity-stripes
                     (aka "double parity") for each such "volume" (aka "vdev",
                     as ZFS calls it) within a storage-pool.

  !NOTE!: It is technically possible but distinctly not recommended to mix &
  match any of these different "replication types" --as Sun calls them-- in a
  single ZFS Storage-Pool.
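[For quick orientation, here is a minimal sketch of the "zpool create" syntax
behind each of those four replication-types.  The pool-name "tank" and the
cNtNdN device-names are placeholders of my own, not devices from the examples
later in this document; the real command-examples appear in the cheatsheet
sections below.]

# zpool create tank c1t0d0 c2t0d0                        <--[striped; no parity]
# zpool create tank mirror c1t0d0 c2t0d0                 <--[2-way mirror]
# zpool create tank raidz c1t0d0 c2t0d0 c3t0d0           <--[RAID-Z; single parity]
# zpool create tank raidz2 c1t0d0 c2t0d0 c3t0d0 c4t0d0   <--[RAID-Z2; double parity]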

ZFS is a Transactional File-System


[Sun's manual contrasts "traditional" file-systems and "journaling" file-systems with "transactional" file-systems but I'm not satisfied with the descriptions; so, I'm not gonna write anything here yet, until I check out other descriptions. According to Wikipedia's page on File Systems, a "journaling file-system" is simply one type of implementation of a "transactional file-system". Given that Sun's manual contrasts "journaling" and "transactional", Sun's authors obviously disagree with that categorization.]

Checksums
[Again, not satisfied with the way the Sun manual explains this stuff; so, waiting to check other sources.]

Self-Healing Data
ZFS storage-pools can include any of 3 different types of data-redundancy:

- mirroring
- RAID-Z (a variation of RAID-5)
- RAID-Z2 (a variation of RAID-6, which is simply double-parity RAID-5)

When a bad data-block is detected, its contents are automatically recoverable from an alternate copy of the data from one of the redundancy elements [like a hotspare] in the pool.
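[A minimal sketch of how to exercise this checking/repair by hand: a "scrub"
asks ZFS to read and verify the checksum of every block in the pool, repairing
from the pool's redundancy where it can, and "zpool status" reports the results.
The pool-name "tank" is a placeholder of my own.]

# zpool scrub tank
# zpool status tank      <--[shows scrub progress/results and any errors found]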

ZFS Scalability
ZFS is the most-scalable file-system, with the following characteristics:

- ZFS is a 128-bit FS, which means that it is capable of supporting 256
  quadrillion zettabytes of storage.

- All metadata is allocated dynamically; no need to pre-allocate inodes or
  preset other potentially-limiting aspects of the file-system's size when it is
  created.

- Each directory/subdirectory can have up to 256 trillion (2^48) entries in it.

- No limit on the number of files that can exist in a single file-system.

- No limit on the number of file-systems [within a storage-pool?  within a
  particular system? --Sun's documentation is not clear on this detail].

ZFS Snapshots
[quoting from the Sun PDF]: "A snapshot is a read-only [point-in-time] copy of a file system or volume. Snapshots can be created quickly and easily. Initially, snapshots consume no additional space within the pool. As data within the active dataset changes, the snapshot consumes space by continuing to reference the old data. As a result, the snapshot prevents the data from being freed back to the pool."

Purpose of a Snapshot: Apparently, the primary purpose of a snapshot is to preserve the state of a file-system at a particular point in time, in case one needs to abandon/reverse all changes made to the file-system after the snapshot was taken, and revert the file-system back to its complete state at the time the snapshot was taken. [My interpretation, based on what I've read. I've not yet found any documentation that clearly states this but it is implied in Sun's documentation regarding ZFS snapshots. --JRAvery]
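[A minimal sketch of that preserve-and-revert usage, with a hypothetical
file-system "tank/home" as a placeholder; rolling back to the most-recent
snapshot needs no extra switches.]

# zfs snapshot tank/home@before_change    <--[preserve the present state]
  (... make, then decide to abandon, changes to files in the file-system ...)
# zfs rollback tank/home@before_change    <--[revert the file-system to the snapshot]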

ZFS Simplified Administration


Through use of the following features, ZFS [to paraphrase the Sun manual] provides a simplified model of administration, making it easier to create and manage file-systems with fewer [sets of?] commands and without editing config-files:

- hierarchical file-system layout
- property inheritance
- auto-management of mount-points and NFS-share semantics

Other simplified-admin advantages of ZFS:

- easy to set quotas or reservations
- easy to turn compression on & off
- can manage mount-points for multiple file-systems with a single command
- examine and/or repair devices without using a separate set of commands for
  volume-management
- file-systems are very simple/easy to create and a new file-system incurs
  little overhead; so, admins are encouraged to create separate file-systems for
  many things for which, outside ZFS, simple directories and subdirectories
  would be used instead
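[A minimal sketch of a few of those one-command administrative tasks, using a
hypothetical file-system "tank/home" as a placeholder:]

# zfs set quota=10G tank/home                  <--[set a quota]
# zfs set compression=on tank/home             <--[turn compression on]
# zfs set mountpoint=/export/home tank/home    <--[change the mount-point; descendent file-systems inherit it]
# zfs set sharenfs=on tank/home                <--[share via NFS without editing config-files]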

IMPORTANT System-Config Considerations for ZFS


ZFS Hardware & Software: Requirements & Recommendations
- A SPARC(TM) or x86 system running the Solaris-10 6/06 release or later.

- The minimum disk-size [or slice-size?, as is implied elsewhere in Sun's
  documentation!] is 128 MBytes.  The minimum amount of disk-space required for
  ["to define"? --Sun PDF not clear here] a storage-pool is approximately 64
  MBytes.  [NOTE that these 2 sentences are quotes from Sun's "Solaris ZFS
  Administration Guide" from 2008/September.  There seems to be a glaring
  inconsistency in the instructions: that you need fewer MB of disk-space (64)
  to define a ZFS pool than is required of the minimum size of any disk (128 MB)
  that can be used in a ZFS pool.  I don't yet have clarification on this.
  However, it does seem extremely unlikely that anybody, in the real world,
  would encounter a situation in which these small numbers of MB actually
  matter.]

- For good ZFS performance, at least 1 GB of memory is recommended, though 768
  MB is the minimum recommended to install/run Solaris.

- Multiple controllers are recommended for the different submirrors of a mirror
  virtual device.

- [Not mentioned, at least in this same portion of Sun's 2008/September doc, is
  the fact that it is also generally recommended, with RAID in general, for each
  different disk, used to create any striped volume, to be on a different
  controller, for best read/write performance.  However, I'm not yet sure how
  much ZFS is designed to either (a) automatically spread the multiple disks
  across multiple controllers, when they are available, or (b) allow the
  administrator to custom-assign the disks, to ensure that configuration for a
  striped volume. --JRAvery]

From Sun's "ZFS Best-Practices Guide


(http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide)

Run ZFS only on systems running the 64-bit kernel. Have at least 1-GB of RAM. Provision at least 1 extra GB of RAM for each 10,000 mounted ZFS
file-systems, including snapshots.
Roughly 64-KB of RAM is used up for each mounted ZFS fs. (Also be prepared for longer boot-times on systems with 1000's of ZFS fs's.

ZFS caches data in kernel-addressable RAM.

So: if you have limited RAM, it's probably a good idea to configure more disk-based swap-space to help make up for the relative lack of RAM.

See the "Memory and Dynamic Reconfiguration Recommendations"


section of Sun's "ZFS Best-Practices Guide" for more-detailed information about ZFS's Adaptive-Replacement Cache (ARC) and how this affects memory.

For all the disks you use as basic devices in your ZFS StoragePools, it's best to have as many disk-controllers as feasible: 2 instead of 1; 3 or 4 instead of 2.
(This is not so much related specifically to ZFS but, rather, to storage-dataredundancy and high-availability (HA) issues in general; also, related to Page 9 of 59

read/write performance. It's a general best-practices recommendation for any virtual-volume configuration.)

ZFS Terminology: An Intro


Glossary of General ZFS Terms
alternate boot-environment
"alternate boot-environment" (ABE) is actually a Sun Live Upgrade (LU) term but LU is so prevalent in the typical implementation of ZFS that some of LUs terms must be effectively considered as practically applying to ZFS. In LU, the ABE is a potential boot-environment that (A) has been created by the lucreate command and (B) might have been updated by the luupgrade command but is not presently set to be the "primary boot-environment" (PBE), which is the bootenvironment from which the Solaris system is presently configured to boot.

checksum
In ZFS, a checksum is a 256-bit hash of the data in a file system block. The checksum capability can range from fletcher2 (the default; simple and fast) to cryptographically-strong hashes, such as SHA256.

clone

In ZFS, a clone is a file-system whose initial contents are identical to the contents of a snapshot. ("Initial" meaning that, after you've created the clone, you can make changes to the file-system, so that its present contents are no longer its "initial contents".)

dataset

In ZFS, a dataset is the generic term that covers the following ZFS entities:

- clones        --a file-system whose initial contents are identical to the
                  contents of some snapshot
- file-systems
- snapshots     --a read-only image of a file-system or volume, at a given point
                  in time
- volumes

Each dataset is identified by a unique name within the namespace of any ZFS installation, using the following format:

    pool/path[@snapshot]

Where ...
    pool     = name of the storage-pool containing the dataset
    path     = slash-delimited pathname for the dataset
    snapshot = optional component to identify a snapshot

default file-systems


In ZFS, the default file-systems are the file systems that are created by default when using Live Upgrade (LU) to migrate from UFS to a ZFS root. The current set of default file-systems is:

    /
    /usr
    /opt
    /var

Intent Log; aka ZFS Intent-Log (ZIL)

Starting with the 10/08 release of Solaris-10, a new feature, the ZFS Intent-Log (ZIL), was added to ZFS, to comply with POSIX standards for synchronous transactions. The creation of the ZIL is automatic when a Storage-Pool is created. By default, the ZIL is allocated from blocks within the main storage pool. However, better performance might be possible by using separate intent-log devices in your ZFS storage pool, such as with NVRAM or a dedicated disk. Later in this document, there is a section explaining considerations for setting up a separate device for the ZIL and commands for how to do it.

Other examples of this specific type of POSIX-compliance are these (from Sun's ZFS Admin Guide PDF): It is common for a database to require that its transactions be on stable storage-devices when they are returning from a system call. The fsync() function is used by various applications to ensure that all data associated with a particular file-descriptor is reliably transferred to the device associated with that file-descriptor, or an error is generated.
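[A minimal sketch of the command-syntax for a separate ZIL device, with a
placeholder pool-name ("tank") and placeholder devices; see the later section
mentioned above for the real considerations and examples.]

# zpool create tank mirror c1t0d0 c2t0d0 log c3t0d0    <--[separate log device at pool-creation]
# zpool add tank log mirror c4t0d0 c5t0d0              <--[or add a mirrored log device to an existing pool]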

mirror
The term "mirror" means the same thing in ZFS as elsewhere in file-system & virtual-volume configurations --i.e., RAID-1.

pool, aka "storage-pool"


In ZFS, a pool is a logical group of storage-devices and related hardware that describes the layout and physical characteristics of the available storage within that pool. Space for datasets is allocated from a pool.

primary boot-environment

"primary boot-environment" (PBE) is actually a Sun Live Upgrade (LU) term but LU is so prevalent in the typical implementation of ZFS that some of LUs terms must be effectively considered as practically applying to ZFS. In LU, the PBE is a boot-environment that (A) has been used by the lucreate command to build an "alternate boot-environment" and (B) which, by default, is the present boot-environment, but which setting can be overridden, for example, by using the lucreate -s command.

RAID-Z

In ZFS, RAID-Z refers to a virtual device that stores data and single-parity on multiple disks, similar to RAID-5.


RAID-Z2
In ZFS, RAID-Z2 refers to a virtual device that stores data and double-parity on multiple disks, similar to RAID-6.

resilvering
In ZFS, resilvering is the process of transferring data from one device to another. For example: when you have a mirrored dataset --with some actual data in it-- and one component of that mirror is being brought online, the data on one of the mirror-components that never went offline gets copied to the newly-onlined component. In traditional virtual-volume terms, this is called mirror-resynchronization.

shared file-systems
In a Sun Live Upgrade (LU) context, shared file-systems are those that the ABEs (Alternate Boot-Environments) and PBE (Primary Boot-Environment) share in common and that do not need to be changed when switching the PBE from one to another.

snapshot
In ZFS, a snapshot is a read-only image of a file-system or volume at any given point in time.

Storage-Pool
See "pool".

virtual device --aka "vdev"

Partly quoted and partly paraphrased from Sun's Solaris-10 ZFS Admin Guide: In ZFS, a virtual device is a logical device, as defined in a pool, which can refer to a physical device, a file, or a collection of devices. Each storage pool is comprised of one or more "virtual devices", aka "vdevs". A virtual device (vdev) is an internal representation, of the storage-pool, that describes the layout of physical storage and its fault characteristics. As such, a vdev represents the disk devices or files that are used to create the storage pool. A pool can have any number of vdevs at the top of the storagepool's configuration, known as "top-level vdevs"; aka "root vdevs".

volume
In ZFS, a volume is a dataset used to emulate a physical device.
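[A minimal sketch, with placeholder names of my own: creating an emulated volume (a "zvol") of a fixed size. The resulting block-device appears under /dev/zvol/dsk/<pool>/<volume>.]

# zfs create -V 2g tank/vol1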

ZIL

See Intent Log.

ZFS Component-Naming Syntax


[!! Double-check the consistency of the following description(s) w/ the examples and your own experience. To the best of my recollection: except for pool names, I've never seen examples of these types of names, either in manuals or from the various times that I have created ZFS components myself. --JRA !!]

Each ZFS component must have an assigned name for which the syntax must adhere to the following rules:

- Unnamed components are not allowed.

- The only characters allowed, in the component-names, are alpha and numeric
  characters, plus the following 4 special characters:

      Underscore:            "_"
      Hyphen:                "-"
      Colon:                 ":"
      Period (aka "dot"):    "."

- Any pool name must begin with an alpha character and also has the following
  restrictions:

      If you begin a pool-name with the character "c" then you cannot
      immediately follow it with any numeric character (0-9), because such
      names are reserved.

      You cannot use the name "log", because it is reserved.

      You cannot start any pool-name with "mirror" or "raidz" or "spare",
      because these are reserved.

- Dataset names must begin with an alpha or numeric character.
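[A few illustrative names of my own, held against the rules above:]

    fishlegs        <--[valid pool-name: begins with an alpha character]
    tank_01.bkup    <--[valid: only alpha, numeric, and the 4 allowed special characters]
    c1pool          <--[invalid pool-name: "c" immediately followed by a digit is reserved]
    log             <--[invalid pool-name: reserved]
    mirror2         <--[invalid pool-name: begins with "mirror"]
    2009data        <--[invalid as a pool-name (must begin with alpha) but acceptable as a dataset-name]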


[The Sun PDF mentions that the "%" sign is disallowed at the beginning of any pool-name or anywhere in any dataset-name. But this is stupid to mention, because they already disallowed the use of the "%" sign anywhere in any ZFS-component-name, at the beginning of the description of ZFS-component-naming.]
==


Introductory Cheatsheet: Create a simple ZFS Storage-Pool and Basic File-System


NOTE: The following 35-40 steps are a sort of cross between a cheatsheet and a tutorial. There's a lot more detail than a prototypical cheatsheet contains but not quite as much detail as you might expect from a full tutorial --though it's probably close. Regarding detail, these 35-40 steps probably err on the side of being more like a cheatsheet: assuming either that you know what the commands mean or that you can easily figure it out on your own, outside of these steps. (On the other hand, I do show a lot of screen-output to the commands.) --JRAvery

Each of the following examples uses individual slices-* of preformatted VTOC 4.2-GB hard-disks, each slice sized 776 cylinders = 848.44 MB. The disks and slices are:

    c0t2d0, s3 - s7
    c0t3d0, s3 - s7
    c0t4d0, s3 - s7
    c0t5d0, s3 - s7

*-- This is being done because, even though the use of individual slices is officially not recommended, my available disks, at the time I created these cheatsheet-examples, were too few to give me the variety of examples I wanted. --JRAvery

Create and Modify a Simple Striped Storage-Pool


[It is not typical to create or use a simple Striped Pool but I'm including these examples here for the sake of thoroughness and just in case somebody decides they want/need it. --JRAvery]

NOTE: To use the "zpool create" command to create different types of Storage-Pools, such as Mirrored or RAID-Z or RAID-Z2, see the following subsections:

    Create Mirrored Storage-Pools (Command Examples)
    Create RAID-Z Storage-Pools (Command Examples)
    Create RAID-Z2 Storage-Pools (Command Examples)
    Example:  Combined Simple-Striping and Mirroring in Same Pool
    Example:  Combined 2-Way Mirror and 3-Way Mirror in Same Pool

Besides the "zpool create" command, most or all of the other commands, in the following 35-40 steps, should be basically the same, regardless of which replication-type you choose. --JRAvery


Create the New Pool


1)  Perform the initial creation with only two components (slices here, which
    could be entire disks in different circumstances):

# zpool create fishlegs c0t2d0s3 c0t3d0s3
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c0t3d0s3 contains a ufs filesystem.

# zpool create -f fishlegs c0t2d0s3 c0t3d0s3

2)  Verify the new pool's config & stats:

# zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
fishlegs    91K  1.55G     1K  /fishlegs

# zpool list
NAME       SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
fishlegs  1.58G   112K  1.58G    0%  ONLINE  -

# zpool status
  pool: fishlegs
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        fishlegs    ONLINE       0     0     0
          c0t2d0s3  ONLINE       0     0     0
          c0t3d0s3  ONLINE       0     0     0

errors: No known data errors

# zfs get "all" fishlegs
NAME      PROPERTY         VALUE                  SOURCE
fishlegs  type             filesystem             -
fishlegs  creation         Sat May 30 16:23 2009  -
fishlegs  used             108K                   -
fishlegs  available        1.55G                  -
fishlegs  referenced       18K                    -
fishlegs  compressratio    1.00x                  -
fishlegs  mounted          yes                    -
fishlegs  quota            none                   default
fishlegs  reservation      none                   default
fishlegs  recordsize       128K                   default
fishlegs  mountpoint       /fishlegs              default
fishlegs  sharenfs         off                    default
fishlegs  checksum         on                     default
fishlegs  compression      off                    default
fishlegs  atime            on                     default
fishlegs  devices          on                     default
fishlegs  exec             on                     default
fishlegs  setuid           on                     default
fishlegs  readonly         off                    default
fishlegs  zoned            off                    default
fishlegs  snapdir          hidden                 default
fishlegs  aclmode          groupmask              default
fishlegs  aclinherit       restricted             default
fishlegs  canmount         on                     default
fishlegs  shareiscsi       off                    default
fishlegs  xattr            on                     default
fishlegs  copies           1                      default
fishlegs  version          3                      -
fishlegs  utf8only         off                    -
fishlegs  normalization    none                   -
fishlegs  casesensitivity  sensitive              -
fishlegs  vscan            off                    default
fishlegs  nbmand           off                    default
fishlegs  sharesmb         off                    default
fishlegs  refquota         none                   default
fishlegs  refreservation   none                   default
[Will not repeat this command unless important, because of the verbosity of the output. --JRAvery]

# zpool get "all" fishlegs
NAME      PROPERTY     VALUE                 SOURCE
fishlegs  size         1.58G                 -
fishlegs  used         95.5K                 -
fishlegs  available    1.58G                 -
fishlegs  capacity     0%                    -
fishlegs  altroot      -                     default
fishlegs  health       ONLINE                -
fishlegs  guid         17063033720074425605  -
fishlegs  version      10                    default
fishlegs  bootfs       -                     default
fishlegs  delegation   on                    default
fishlegs  autoreplace  off                   default
fishlegs  cachefile    -                     default
fishlegs  failmode     wait                  default

# df -h fishlegs
Filesystem   size   used  avail capacity  Mounted on
fishlegs     1.5G    18K   1.5G     1%    /fishlegs

# df -h /fishlegs
Filesystem   size   used  avail capacity  Mounted on
fishlegs     1.5G    18K   1.5G     1%    /fishlegs

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root     root           2 May 30 14:49 .
drwxr-xr-x  37 root     root        1024 May 30 14:49 ..

Add Components to the Pool


3)  Add two new components (slices, in this case) to the "fishlegs" pool:

# zpool add fishlegs c0t4d0s3 c0t5d0s3
invalid vdev specification
use '-f' to override the following errors:
/dev/dsk/c0t4d0s3 contains a ufs filesystem.

# zpool add -f fishlegs c0t4d0s3 c0t5d0s3

4)  Verify the updated pool's config & stats:

# zpool list
NAME       SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
fishlegs  3.16G   126K  3.16G    0%  ONLINE  -

# zpool status
  pool: fishlegs
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        fishlegs    ONLINE       0     0     0
          c0t2d0s3  ONLINE       0     0     0
          c0t3d0s3  ONLINE       0     0     0
          c0t4d0s3  ONLINE       0     0     0
          c0t5d0s3  ONLINE       0     0     0

errors: No known data errors

# zpool get "all" fishlegs
NAME      PROPERTY     VALUE                 SOURCE
fishlegs  size         3.16G                 -
fishlegs  used         178K                  -
fishlegs  available    3.16G                 -
fishlegs  capacity     0%                    -
fishlegs  altroot      -                     default
fishlegs  health       ONLINE                -
fishlegs  guid         17063033720074425605  -
fishlegs  version      10                    default
fishlegs  bootfs       -                     default
fishlegs  delegation   on                    default
fishlegs  autoreplace  off                   default
fishlegs  cachefile    -                     default
fishlegs  failmode     wait                  default

# df -h fishlegs
Filesystem   size   used  avail capacity  Mounted on
fishlegs     3.1G    18K   3.1G     1%    /fishlegs

Attempt Various Operations on the Pool


5) Attempt to "offline" one of the component-devices in the pool: # zpool offline fishlegs c0t3d0s3
cannot offline c0t3d0s3: no valid replicas-* *--Apparently, the "offline" operation is valid only in a mirrored pool.

6)

Attempt to "detach" one of the component-devices in the pool:


Page 17 of 59

# zpool

cannot detach c0t3d0s3: only applicable to mirror and replacing vdevs

detach

fishlegs

c0t3d0s3

7)

Attempt to "remove" one of the component-devices in the pool: # zpool remove fishlegs c0t3d0s3
cannot remove c0t3d0s3: only inactive hot spares or cache devices can be removed

8)

Attempt to "scrub" the "fishlegs" pool: # zpool scrub fishlegs


<--[finished almost immediately because no data]

9)

Check results of scrub operation: # zpool


pool: fishlegs state: ONLINE scrub: resilver completed after 0h0m with 0 errors on Sat May 30 15:51:06 2009 config: NAME fishlegs c0t2d0s3 c0t3d0s3 c0t4d0s3 c0t5d0s3 STATE ONLINE ONLINE ONLINE ONLINE ONLINE READ WRITE CKSUM 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

status

errors: No known data errors

10)

Attempt to "replace" one of the component-devices in the pool: # zpool replace fishlegs c0t3d0s3 c0t2d0s4

11)

Check the results of the "replace" operation: # zpool status


pool: fishlegs state: ONLINE scrub: resilver completed after 0h0m with 0 errors on Sat May 30 16:14:02 2009 config: NAME fishlegs c0t2d0s3 replacing c0t3d0s3 c0t2d0s4 c0t4d0s3 c0t5d0s3 STATE ONLINE ONLINE ONLINE ONLINE ONLINE ONLINE ONLINE READ WRITE CKSUM 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

12

<--[You see this if you "spool status" quickly enough after you get the prompt back from the "replace" command.]

errors: No known data errors

# zpool
NAME

list

SIZE

USED

AVAIL Page

CAP 18

HEALTH of 59

ALTROOT

fishlegs

3.16G

228K

3.16G

0%

ONLINE

<--[no change from Step #4]

# zpool

pool: fishlegs state: ONLINE scrub: resilver completed after 0h0m with 0 errors on Sat May 30 16:14:02 2009 config: NAME fishlegs c0t2d0s3 c0t2d0s4 c0t4d0s3 c0t5d0s3 STATE ONLINE ONLINE ONLINE ONLINE ONLINE READ WRITE CKSUM 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

status

<--[replaced c0t3d0s3]

errors: No known data errors

# zpool
NAME fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs [no

get "all"

fishlegs

PROPERTY VALUE SOURCE size 3.16G used 178K available 3.16G capacity 0% altroot default health ONLINE guid 17063033720074425605 version 10 default bootfs default delegation on default autoreplace off default cachefile default failmode wait default change from Step #4]

# df -h

Filesystem size used fishlegs 3.1G 18K [no change from Step #4]

/fishlegs

avail capacity 3.1G 1%

Mounted on /fishlegs

# zpool
NAME fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs fishlegs

get "all"
PROPERTY size used available capacity altroot health guid version bootfs delegation autoreplace cachefile failmode

fishlegs
VALUE SOURCE 3.16G 178K 3.16G 0% default ONLINE 13624054972794092142 10 default default on default off default default wait default

Export and Import the Pool



NOTE:  See the subsection "When & Why to Export / Import a Storage-Pool".

12)  Attempt to "export" the "fishlegs" pool:

# zpool export fishlegs

13)  Check the results of the "export" operation:

# zpool list
no pools available

# zpool status
no pools available

# zpool get "all" fishlegs
cannot open 'fishlegs': no such pool

# df -h /fishlegs
df: (/fishlegs ) not a block device, directory or mounted resource

# df -h fishlegs
df: (fishlegs ) not a block device, directory or mounted resource

14)  Attempt to "import" the "fishlegs" pool:

# zpool import fishlegs

15)  Check the results of the "import" operation:

# zpool list
NAME       SIZE   USED  AVAIL   CAP  HEALTH  ALTROOT
fishlegs  3.16G   126K  3.16G    0%  ONLINE  -

# zpool status
  pool: fishlegs
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        fishlegs    ONLINE       0     0     0
          c0t2d0s3  ONLINE       0     0     0
          c0t2d0s4  ONLINE       0     0     0
          c0t4d0s3  ONLINE       0     0     0
          c0t5d0s3  ONLINE       0     0     0

errors: No known data errors

# zpool get "all" fishlegs
NAME      PROPERTY     VALUE                 SOURCE
fishlegs  size         3.16G                 -
fishlegs  used         178K                  -
fishlegs  available    3.16G                 -
fishlegs  capacity     0%                    -
fishlegs  altroot      -                     default
fishlegs  health       ONLINE                -
fishlegs  guid         13624054972794092142  -
fishlegs  version      10                    default
fishlegs  bootfs       -                     default
fishlegs  delegation   on                    default
fishlegs  autoreplace  off                   default
fishlegs  cachefile    -                     default
fishlegs  failmode     wait                  default

# df -h fishlegs
Filesystem   size   used  avail capacity  Mounted on
fishlegs     3.1G    18K   3.1G     1%    /fishlegs

Add (more) File-Systems to the Pool


The ZFS convention seems to be that, by default, all new file-systems, added to the pool, are mounted somewhere underneath the top-level mount-point for the root-level file-system of that pool. If /fishlegs is the mount-point of the root-level file-system of the "fishlegs" pool, new file-systems are conventionally mounted somewhere underneath /fishlegs, as reflected in the command-syntax for creating them.

16)  Check the contents of /fishlegs and FS-mounts related to the fishlegs pool:

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root     root           2 May 30 16:47 .
drwxr-xr-x  37 root     root        1024 May 30 16:23 ..

# df -h
[... output abbreviated ...]
fishlegs                  3.1G    18K   3.1G     1%    /fishlegs

# mount
[... output abbreviated ...]
/fishlegs on fishlegs read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010009 on Sat May 30 16:23:35 2009

17)  Add a new file-system to the fishlegs pool:

# zfs create fishlegs/toes

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root     root           2 May 30 16:47 .
drwxr-xr-x  37 root     root        1024 May 30 16:23 ..
drwxr-xr-x   2 root     root           2 May 30 16:47 toes

# df -h
[... output abbreviated ...]
fishlegs                  3.1G    18K   3.1G     1%    /fishlegs
fishlegs/toes             3.1G    18K   3.1G     1%    /fishlegs/toes

# mount
[... output abbreviated ...]
/fishlegs on fishlegs read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010009 on Sat May 30 16:23:35 2009
/fishlegs/toes on fishlegs/toes read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000c on Sat May 30 16:48:05 2009

18)  Add three more new file-systems to the fishlegs pool:

# zfs create fishlegs/heels
# zfs create fishlegs/heels/calluses
# zfs create fishlegs/claws

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root     root           2 May 30 16:47 .
drwxr-xr-x  37 root     root        1024 May 30 16:23 ..
drwxr-xr-x   2 root     root           2 May 30 16:58 claws
drwxr-xr-x   2 root     root           2 May 30 16:54 heels
drwxr-xr-x   2 root     root           2 May 30 16:47 toes

# ls -al /fishlegs/heels
total 9
drwxr-xr-x   3 root     root           3 May 30 17:35 .
drwxr-xr-x   5 root     root           5 May 30 16:58 ..
drwxr-xr-x   2 root     root           2 May 30 17:34 calluses

# df -h
[... output abbreviated ...]
fishlegs                  3.1G    18K   3.1G     1%    /fishlegs
fishlegs/toes             3.1G    18K   3.1G     1%    /fishlegs/toes
fishlegs/heels            3.1G    18K   3.1G     1%    /fishlegs/heels
fishlegs/claws            3.1G    18K   3.1G     1%    /fishlegs/claws
fishlegs/heels/calluses   3.1G    18K   3.1G     1%    /fishlegs/heels/calluses

# mount
[... output abbreviated ...]
/fishlegs on fishlegs read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010009 on Sat May 30 16:23:35 2009
/fishlegs/toes on fishlegs/toes read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000c on Sat May 30 16:48:05 2009
/fishlegs/heels on fishlegs/heels read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000d on Sat May 30 16:54:47 2009
/fishlegs/claws on fishlegs/claws read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=401000f on Sat May 30 16:58:18 2009
/fishlegs/heels/calluses on fishlegs/heels/calluses read/write/setuid/devices/nonbmand/exec/xattr/atime/dev=4010010 on Sat May 30 17:35:00 2009

Create Snapshots of the Pool and Its File-Systems


19)  List the present pool-elements:

# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
fishlegs                  230K  3.11G    22K  /fishlegs
fishlegs/claws             18K  3.11G    18K  /fishlegs/claws
fishlegs/heels             37K  3.11G    19K  /fishlegs/heels
fishlegs/heels/calluses    18K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/toes              18K  3.11G    18K  /fishlegs/toes

20)  Create a snapshot of the entire "fishlegs" pool:

# zfs snapshot fishlegs@seafloor

21)  Check the results of the snapshot-creation operation:

# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
fishlegs                  230K  3.11G    22K  /fishlegs
fishlegs@seafloor            0      -    22K  -            <--[snapshot]
fishlegs/claws             18K  3.11G    18K  /fishlegs/claws
fishlegs/heels             37K  3.11G    19K  /fishlegs/heels
fishlegs/heels/calluses    18K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/toes              18K  3.11G    18K  /fishlegs/toes

# zfs list -t snapshot
NAME               USED  AVAIL  REFER  MOUNTPOINT
fishlegs@seafloor     0      -    18K  -

22)  Create snapshots of individual file-systems within the fishlegs pool:

# zfs snapshot fishlegs/claws@nails
# zfs snapshot fishlegs/toes@knuckles
# zfs list
NAME                      USED  AVAIL  REFER  MOUNTPOINT
fishlegs                  233K  3.11G    22K  /fishlegs
fishlegs@seafloor            0      -    22K  -
fishlegs/claws             18K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails         0      -    18K  -
fishlegs/heels             37K  3.11G    19K  /fishlegs/heels
fishlegs/heels/calluses    18K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/toes              18K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles       0      -    18K  -

# zfs list -t snapshot
NAME                     USED  AVAIL  REFER  MOUNTPOINT
fishlegs@seafloor           0      -    22K  -
fishlegs/claws@nails        0      -    18K  -
fishlegs/toes@knuckles      0      -    18K  -

23)  Create individual snapshots of all the "descendent" file-systems with a
     single command, using the "-r" (recursive) switch:

# zfs snapshot -r fishlegs@all
# zfs list
NAME                          USED  AVAIL  REFER  MOUNTPOINT
fishlegs                      234K  3.11G    22K  /fishlegs
fishlegs@seafloor                0      -    22K  -
fishlegs@all                     0      -    22K  -
fishlegs/claws                 18K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails             0      -    18K  -
fishlegs/claws@all               0      -    18K  -
fishlegs/heels                 37K  3.11G    19K  /fishlegs/heels
fishlegs/heels@all               0      -    19K  -
fishlegs/heels/calluses        18K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/heels/calluses@all      0      -    18K  -
fishlegs/toes                  18K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles           0      -    18K  -
fishlegs/toes@all                0      -    18K  -

# zfs list -t snapshot
NAME                          USED  AVAIL  REFER  MOUNTPOINT
fishlegs@seafloor                0      -    22K  -
fishlegs@all                     0      -    22K  -
fishlegs/claws@nails             0      -    18K  -
fishlegs/claws@all               0      -    18K  -
fishlegs/heels@all               0      -    19K  -
fishlegs/heels/calluses@all      0      -    18K  -
fishlegs/toes@knuckles           0      -    18K  -
fishlegs/toes@all                0      -    18K  -

24)  Create individual snapshots, recursively, for only the file-systems from
     "heels" on down:

# zfs snapshot -r fishlegs/heels@rear
# zfs list
NAME                            USED  AVAIL  REFER  MOUNTPOINT
[ ... output abbreviated ... ]
fishlegs/heels@rear                0      -    19K  -
[...]
fishlegs/heels/calluses@rear       0      -    18K  -
[...]

# zfs list -t snapshot
NAME                            USED  AVAIL  REFER  MOUNTPOINT
[ ... output abbreviated ... ]
fishlegs/heels@rear                0      -    19K  [...]
fishlegs/heels/calluses@rear       0      -    18K  [...]

# zfs list -r -t snapshot -o name,creation fishlegs
NAME                          CREATION
fishlegs@seafloor             Sat May 30 17:44 2009
fishlegs@all                  Sat May 30 17:52 2009
fishlegs/claws@nails          Sat May 30 17:47 2009
fishlegs/claws@all            Sat May 30 17:52 2009
fishlegs/heels@all            Sat May 30 17:52 2009
fishlegs/heels@rear           Sat May 30 17:56 2009
fishlegs/heels/calluses@all   Sat May 30 17:52 2009
fishlegs/heels/calluses@rear  Sat May 30 17:56 2009
fishlegs/toes@knuckles        Sat May 30 17:48 2009
fishlegs/toes@all             Sat May 30 17:52 2009

# zfs list -r -t snapshot -o name,creation fishlegs/heels
NAME                          CREATION
fishlegs/heels@all            Sat May 30 17:52 2009
fishlegs/heels@rear           Sat May 30 17:56 2009
fishlegs/heels/calluses@all   Sat May 30 17:52 2009
fishlegs/heels/calluses@rear  Sat May 30 17:56 2009

# zfs list -r -t snapshot -o name,creation,used,referenced fishlegs/heels
NAME                          CREATION               USED  REFER
fishlegs/heels@all            Sat May 30 17:52 2009     0    19K
fishlegs/heels@rear           Sat May 30 17:56 2009     0    19K
fishlegs/heels/calluses@all   Sat May 30 17:52 2009     0    18K
fishlegs/heels/calluses@rear  Sat May 30 17:56 2009     0    18K

Create & Delete a Clone


25)  Create a clone.
     In this example, create the clone "fishlegs/bunions" from the snapshot
     "fishlegs/toes@knuckles".

First, verify that there is no "bunions" subdirectory underneath the "/fishlegs" directory:

# ls -al /fishlegs/
total 14
drwxr-xr-x   5 root     root           5 May 31 18:54 .
drwxr-xr-x  37 root     root        1024 May 30 16:23 ..
drwxr-xr-x   2 root     root           2 May 30 16:58 claws
drwxr-xr-x   3 root     root           4 May 31 18:52 heels
drwxr-xr-x   2 root     root           2 May 30 16:48 toes

Now, create the clone:

# zfs clone fishlegs/toes@knuckles fishlegs/bunions

Now, verify that the clone exists:

# ls -al /fishlegs/
total 17
drwxr-xr-x   6 root     root           6 May 31 22:16 .
drwxr-xr-x  37 root     root        1024 May 30 16:23 ..
drwxr-xr-x   2 root     root           2 May 30 16:48 bunions
drwxr-xr-x   2 root     root           2 May 30 16:58 claws
drwxr-xr-x   3 root     root           4 May 31 18:52 heels
drwxr-xr-x   2 root     root           2 May 30 16:48 toes

# df -h
Filesystem        size   used  avail capacity  Mounted on
[output truncated]
fishlegs/bunions  3.1G    18K   3.1G     1%    /fishlegs/bunions

# zfs list fishlegs/bunions
NAME              USED  AVAIL  REFER  MOUNTPOINT
fishlegs/bunions     0  3.11G    18K  /fishlegs/bunions

26)  Try to "destroy" (remove) the snapshot from which the "bunions" clone was
     created:

# zfs destroy fishlegs/toes@knuckles
cannot destroy 'fishlegs/toes@knuckles': snapshot has dependent clones
use '-R' to destroy the following datasets:
fishlegs/bunions

27)  "Destroy" (remove) the "bunions" clone:

# zfs destroy fishlegs/bunions

Confirm:

# ls -al /fishlegs/
total 17
drwxr-xr-x   6 root     root           6 May 31 22:16 .
drwxr-xr-x  37 root     root        1024 May 30 16:23 ..
drwxr-xr-x   2 root     root           2 May 30 16:58 claws
drwxr-xr-x   3 root     root           4 May 31 18:52 heels
drwxr-xr-x   2 root     root           2 May 30 16:48 toes

# df -h /fishlegs/bunions
df: (/fishlegs/bunions) not a block device, directory or mounted resource

# df -h fishlegs/bunions
Filesystem   size   used  avail capacity  Mounted on
fishlegs     3.1G    25K   3.1G     1%    /fishlegs

# zfs list fishlegs/bunions
cannot open 'fishlegs/bunions': dataset does not exist

28)  Now again try to "destroy" (remove) the snapshot from which the "bunions"
     clone was created:

# zfs destroy fishlegs/toes@knuckles

Confirm:

# zfs list -t snapshot
NAME                          USED  AVAIL  REFER  MOUNTPOINT
fishlegs@seafloor                0      -    22K  -
fishlegs@all                     0      -    22K  -
fishlegs/claws@nails             0      -    18K  -
fishlegs/claws@all               0      -    18K  -
fishlegs/heels@all               0      -    19K  -
fishlegs/heels@rear              0      -    19K  -
fishlegs/heels/calluses@all      0      -    18K  -
fishlegs/heels/calluses@rear     0      -    18K  -
fishlegs/toes@all              15K      -    18K  -

Create & Promote a Clone


29)  Create a clone.
     In this example, create the clone "fishlegs/heels/rear", based on the
     "rear" snapshot of the "fishlegs/heels" file-system.

First, verify that there is presently no /fishlegs/heels/rear subdirectory:

# ls -al /fishlegs/heels/
total 9
drwxr-xr-x   3 root     root           3 May 31 17:49 .
drwxr-xr-x   5 root     root           5 May 30 16:58 ..
drwxr-xr-x   2 root     root           2 May 30 17:34 calluses

Now, create the clone [It's not necessary to name the clone the same as the snapshot; I'm simply choosing to do that here. --JRAvery]:

# zfs clone fishlegs/heels@rear fishlegs/heels/rear

Now, verify that the clone exists:

# ls -al /fishlegs/heels/
total 12
drwxr-xr-x   4 root     root           4 May 31 17:52 .
drwxr-xr-x   5 root     root           5 May 30 16:58 ..
drwxr-xr-x   2 root     root           2 May 30 17:34 calluses
drwxr-xr-x   3 root     root           3 May 30 17:35 rear

# ls -al /fishlegs/heels/rear/
total 9
drwxr-xr-x   3 root     root           3 May 30 17:35 .
drwxr-xr-x   4 root     root           4 May 31 17:52 ..
drwxr-xr-x   2 root     root           2 May 30 17:35 calluses

# df -h
Filesystem           size   used  avail capacity  Mounted on
[output truncated]
fishlegs/heels/rear  3.1G    19K   3.1G     1%    /fishlegs/heels/rear

# zfs list fishlegs/heels/rear
NAME                 USED  AVAIL  REFER  MOUNTPOINT
fishlegs/heels/rear     0  3.11G    19K  /fishlegs/heels/rear

30)  Confirm the status of the two snapshots for "fishlegs/heels":

# zfs list -r -t snapshot -o name,creation,used,referenced fishlegs/heels
NAME                          CREATION               USED  REFER
fishlegs/heels@all            Sat May 30 17:52 2009     0    19K
fishlegs/heels@rear           Sat May 30 17:56 2009     0    19K
fishlegs/heels/calluses@all   Sat May 30 17:52 2009     0    18K
fishlegs/heels/calluses@rear  Sat May 30 17:56 2009     0    18K

31)  Promote the "fishlegs/heels/rear" clone to replace the "fishlegs/heels"
     file-system:

# zfs promote fishlegs/heels/rear

32)  Check that the "promote" operation has succeeded:

# zfs list -r -t snapshot -o name,creation,used,referenced fishlegs/heels
NAME                          CREATION               USED  REFER
fishlegs/heels/calluses@all   Sat May 30 17:52 2009     0    18K
fishlegs/heels/calluses@rear  Sat May 30 17:56 2009     0    18K
fishlegs/heels/rear@all       Sat May 30 17:52 2009     0    19K
fishlegs/heels/rear@rear      Sat May 30 17:56 2009     0    19K

Previously, there were snapshots "fishlegs/heels@all" and "fishlegs/heels@rear"; now they have been replaced by "fishlegs/heels/rear@all" and "fishlegs/heels/rear@rear".

33)  Complete the promotion by renaming the file-systems:

# zfs rename fishlegs/heels fishlegs/heels_Orig
# zfs list -r fishlegs
NAME                                USED  AVAIL  REFER  MOUNTPOINT
fishlegs                            402K  3.11G    22K  /fishlegs
fishlegs@seafloor                      0      -    22K  -
fishlegs@all                           0      -    22K  -
fishlegs/claws                       33K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails                   0      -    18K  -
fishlegs/claws@all                     0      -    18K  -
fishlegs/heels_Orig                88.5K  3.11G    21K  /fishlegs/heels_Orig
fishlegs/heels_Orig/calluses         33K  3.11G    18K  /fishlegs/heels_Orig/calluses
fishlegs/heels_Orig/calluses@all       0      -    18K  -
fishlegs/heels_Orig/calluses@rear      0      -    18K  -
fishlegs/heels_Orig/rear           36.5K  3.11G  20.5K  /fishlegs/heels_Orig/rear
fishlegs/heels_Orig/rear@all           0      -    19K  -
fishlegs/heels_Orig/rear@rear          0      -    19K  -
fishlegs/toes                        33K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles                 0      -    18K  -
fishlegs/toes@all                      0      -    18K  -

# zfs rename fishlegs/heels_Orig/rear fishlegs/heels
# zfs list -r fishlegs
NAME                                USED  AVAIL  REFER  MOUNTPOINT
fishlegs                            434K  3.11G    23K  /fishlegs
fishlegs@seafloor                      0      -    22K  -
fishlegs@all                           0      -    22K  -
fishlegs/claws                       33K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails                   0      -    18K  -
fishlegs/claws@all                     0      -    18K  -
fishlegs/heels                     36.5K  3.11G  20.5K  /fishlegs/heels
fishlegs/heels@all                     0      -    19K  -
fishlegs/heels@rear                    0      -    19K  -
fishlegs/heels_Orig                  52K  3.11G    21K  /fishlegs/heels_Orig
fishlegs/heels_Orig/calluses         33K  3.11G    18K  /fishlegs/heels_Orig/calluses
fishlegs/heels_Orig/calluses@all       0      -    18K  -
fishlegs/heels_Orig/calluses@rear      0      -    18K  -
fishlegs/toes                        33K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles                 0      -    18K  -
fishlegs/toes@all                      0      -    18K  -

# zfs rename fishlegs/heels_Orig/calluses fishlegs/heels/calluses
# zfs list -r fishlegs
NAME                                USED  AVAIL  REFER  MOUNTPOINT
fishlegs                            440K  3.11G    26K  /fishlegs
fishlegs@seafloor                      0      -    22K  -
fishlegs@all                           0      -    22K  -
fishlegs/claws                       33K  3.11G    18K  /fishlegs/claws
fishlegs/claws@nails                   0      -    18K  -
fishlegs/claws@all                     0      -    18K  -
fishlegs/heels                     69.5K  3.11G  20.5K  /fishlegs/heels
fishlegs/heels@all                     0      -    19K  -
fishlegs/heels@rear                    0      -    19K  -
fishlegs/heels/calluses              33K  3.11G    18K  /fishlegs/heels/calluses
fishlegs/heels/calluses@all            0      -    18K  -
fishlegs/heels/calluses@rear           0      -    18K  -
fishlegs/heels_Orig                  19K  3.11G    21K  /fishlegs/heels_Orig
fishlegs/toes                        33K  3.11G    18K  /fishlegs/toes
fishlegs/toes@knuckles                 0      -    18K  -
fishlegs/toes@all                      0      -    18K  -

34)  Remove the original file-system (optional):

# zfs destroy fishlegs/heels_Orig

# df -h fishlegs/heels_Orig
Filesystem   size   used  avail capacity  Mounted on
fishlegs     3.1G    26K   3.1G     1%    /fishlegs

# ls -al /fishlegs/
total 14
drwxr-xr-x   5 root     root           5 May 31 18:54 .
drwxr-xr-x  37 root     root        1024 May 30 16:23 ..
drwxr-xr-x   2 root     root           2 May 30 16:58 claws
drwxr-xr-x   3 root     root           4 May 31 18:52 heels
drwxr-xr-x   2 root     root           2 May 30 16:48 toes

Destroy some [or all?] Snapshots


[Nothing here yet. --JRAvery]

35)  .

36)  .

37)  .
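[A minimal sketch, continuing the "fishlegs" examples above, of what destroying
some or all of these snapshots could look like; these commands are not part of
the original captured output:]

# zfs destroy fishlegs/claws@nails     <--[destroy a single snapshot]
# zfs destroy -r fishlegs@all          <--[destroy the "@all" snapshot of fishlegs and of all its descendent file-systems]
# zfs list -t snapshot                 <--[confirm which snapshots remain]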


Destroy the Pool


38)  Attempt to "destroy" the "fishlegs" pool:

# zpool destroy fishlegs

39)  Check the results of the "destroy" operation:

# zpool list
no pools available

# zpool status
no pools available

# zpool get "all" fishlegs
cannot open 'fishlegs': no such pool

# df -h /fishlegs
df: (/fishlegs ) not a block device, directory or mounted resource

# df -h fishlegs
df: (fishlegs ) not a block device, directory or mounted resource

40)  .

################################################################################
################################################################################
[The following few bullets and command-examples have not yet been fleshed out, regarding their final & proper placement and presentation in this cheatsheet. --JRAvery]

- Assume that you want to create a simple mirrored storage-pool and a single
  corresponding mirrored file-system attached to that pool.  This can all be
  done with a single command.

- Assume that you will be naming the storage-pool and the file-system with the
  same name --in this case, "fishlegs".

- Assume that you have two physical disks available, one on each of two
  different controllers (ideal for mirrored virtual devices):

      c2t1d0
      c3t2d0

The command, for accomplishing the above, is simply this:

# zpool create fishlegs mirror c2t1d0 c3t2d0

Check the results:

# df -h /fishlegs
Filesystem   size   used  avail capacity  Mounted on
fishlegs      40G     1M    40G     0%    /fishlegs

- Now assume you want to create one more file-system, within the "fishlegs"
  pool, to be mounted directly beneath the "/fishlegs" mount-point.

The command, for accomplishing the above, is simply this:

# zfs create fishlegs/nosuch

Check the results:

# df -h /fishlegs/nosuch
Filesystem        size   used  avail capacity  Mounted on
fishlegs/nosuch    40G     1M    40G     0%    /fishlegs/nosuch

################################################################################ ################################################################################

==


ZFS Storage-Pool: Types; Considerations; How to Create


This chapter covers the following topics/considerations:

    Identify Technical and Logistics Requirements for your Storage-Pool
        Determine the Available Underlying Devices to be Used
            What are the Device-Type Choices?
            Aspects of Using Entire Physical Hard-Disks
            Aspects of Using Individual Slices of Preformatted Hard-Disks
            Aspects of Using LUNs from Hardware-RAID Arrays
            Aspects of Using Volumes from Software-Based Volume-Managers
            Aspects of Using Files
        Choose the Type of Data-Replication for the Storage-Pool
            What are the Replication-Type Choices?
            Mirrors
            RAID-Z: Single and Double Parity
            Striped (Simple; no Parity)
            Choosing between Mirrored; RAID-Z; and RAID-Z2
        Logistical & Administrative Considerations for the Storage-Pool

    Basic Commands to Create a Storage-Pool
        Create Striped Storage-Pools (Simple; no Parity) (Command Examples)
        Create Mirrored Storage-Pools (Command Examples)
        Create RAID-Z Storage-Pools (Command Examples)
        Create RAID-Z2 Storage-Pools (Command Examples)

    Managing Storage-Pool-Creation Errors
        Detecting In-Use Devices
        Mismatched Replication-Types
            Example:  Combined Simple-Striping and Mirroring in Same Pool
            Example:  Combined 2-Way Mirror and 3-Way Mirror in Same Pool
            Example:  Combined RAID-Z(2) vdevs w/ Differently-Sized-Devices in Same Pool
==


Identify Technical and Logistics Requirements for your Storage-Pool


Determine the Available Underlying Devices to be Used
What are the Device-Type Choices?

Available devices must each be at least 128-MB in size.  Technically and in
general, devices can be any one of the following types:

- Entire physical hard-disks --This is the only device-type recommended for
  best-practices.  !*NOTE*!: The one exception to this is that a bootable ZFS
  root-pool *MUST* be made of disks with the traditional predefined
  slices/partitions (created by the format utility, for example).

- Individual slices of preformatted hard-disks --Not recommended, due to
  increased complexity of management, particularly in the case of disk-failures.

- LUNs from hardware-RAID arrays --Not recommended, partly because of a need to
  understand the interaction between ZFS's redundancy features and the hw-RAID's
  redundancy features, due to potential conflicts and a degrading of the desired
  results; partly because, although ZFS is considered to "work well" with
  "RAID-5 or mirrored LUNs from intelligent storage arrays", it cannot heal
  corrupted blocks detected by ZFS checksums when such underlying devices are
  used.

- Volumes/metadevices constructed by software-based volume-managers, such as SVM
  or VxVM --Not recommended, because of potential performance problems.

- Files on a UFS file-system --Only for testing purposes.  Distinctly *NOT*
  recommended for production.

Aspects of Using Entire Physical Hard-Disks

    Disk-designation arguments on the Command-Line:
        When executing ZFS CLI-commands, entire disks can be designated with any device-name as it appears within the /dev/dsk/ subdirectory, either with or without the full pathname. For examples:


            c0t0d0
            /dev/dsk/c0t1d0
            c1t0d1s2
            /dev/foo/disk   <--[this one needs more explanation but not yet available]
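        A minimal sketch of mixing those designation forms on one command-line; the pool-name "tank1" and the device-names are hypothetical, chosen only for illustration:

        # zpool create tank1 c0t0d0 /dev/dsk/c0t1d0   <--[the short form and the full-pathname form designate devices the same way]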

Note about device-path and device-ID for identifying disks:

From Sun's PDF on ZFS: "Disks are identified both by their path and by their device ID, if available. This method allows devices to be reconfigured on a system without having to update any ZFS state. If a disk is switched between controller 1 and controller 2, ZFS uses the device ID to detect that the disk has moved and should now be accessed using controller 2. The device ID is unique to the drive's firmware. While unlikely, some firmware updates have been known to change device IDs. If this situation happens, ZFS can still access the device by path and update the stored device ID automatically. If you inadvertently change both the path and the ID of the device, then export and re-import the pool in order to use it." [Unfortunately, the above explanation begs as many questions as it answers; hope to get better info later.]

Note about the EFI (Extensible Firmware Interface) label:

    Where you choose to use entire physical hard-disks as your basic devices in the pool (best-practices recommendation), ZFS formats the disk with an EFI (Extensible Firmware Interface) label.

    According to an introductory paragraph on Wikipedia on 2009/Mar/26: "The Extensible Firmware Interface (EFI) is a specification that defines a software interface between an operating system and platform firmware. EFI is intended as a significantly improved replacement of the old legacy BIOS firmware interface historically used by all IBM PC-compatible personal computers. The EFI specification was originally developed by Intel, and is now managed by the Unified EFI Forum and is officially known as Unified EFI (UEFI)."

    According to documentation at docs.sun.com, the EFI label is distinct from the traditional VTOC (Volume Table of Contents) disk-label, partly in that the VTOC apparently cannot support disk-sizes of 1 TeraByte or larger, while EFI can. Sun also says that one can apply EFI labels to < 1-TB disks by using the "format -e" command on the disk(s) in question.

    When an EFI label is used on a disk, the format-command's partition-table output appears similar to the following:

    Current partition table (original):
    Total disk sectors available: 71670953 + 16384 (reserved sectors)

    Part  Tag         Flag  First Sector  Size     Last Sector
     0    usr         wm    34            34.18GB  71670953
     1    unassigned  wm    0             0        0
     2    unassigned  wm    0             0        0
     3    unassigned  wm    0             0        0
     4    unassigned  wm    0             0        0
     5    unassigned  wm    0             0        0
     6    unassigned  wm    0             0        0
     7    unassigned  wm    0             0        0
     8    reserved    wm    71670954      8.00MB   71687337
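    A minimal sketch of invoking the format utility in its EFI-capable mode against a specific disk, as mentioned above (the disk-name c2t1d0 is hypothetical; inside the utility you would use its "label" menu-item and choose the EFI label when prompted --the interactive menus are not reproduced here):

    # format -e c2t1d0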

Aspects of Using Individual Slices of Preformatted Hard-Disks



Disks can be labeled with a traditional Solaris VTOC label when you create a storage pool with a disk slice.
    (This is true, at least, of disks < 1-TB in size. Some documentation clearly indicates that VTOCs cannot be used on disks of >= 1-TB in size.)

For a bootable ZFS root pool, the disks in the pool must contain slices. The simplest configuration would be to put the entire disk capacity in slice 0 and use that slice for the root pool.
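    A minimal sketch of creating a pool on such a slice (the pool-name "rpool" and the device c0t0d0s0 are hypothetical; in practice a bootable root pool is normally set up by the Solaris installer or by Live Upgrade rather than by hand, and the disk must already carry a traditional VTOC label with the desired slice layout):

    # zpool create rpool c0t0d0s0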

The use of individual slices, for ZFS, might be necessary when ...
    The device-name is nonstandard. <--[Sun PDF does not clarify!!!]
    You have a serious need for a single disk to be shared between UFS (on one or more slices) and ZFS.
    A disk is already being used, in part, as a swap- or dump-device.

If you experience a failure with a disk that contains different types of slices --some UFS; some ZFS; some swap; etc-- then replacing and recovering from such a failure is extra complicated.
    [Sun's manual provides no specifics for this type of scenario. --JRAvery]

Aspects of Using LUNs from Hardware-RAID Arrays

    Questions about redundancy and performance:
        From Sun's ZFS Admin Guide PDF: "If you construct ZFS configurations on top of LUNs from hardware RAID arrays, you need to understand the relationship between ZFS redundancy features and the redundancy features offered by the array. Certain configurations might provide adequate redundancy and performance, but other configurations might not."
        [Sun's manual provides no specifics for this type of scenario. --JRAvery]

Problems with Self-Healing Data:

From Sun's "ZFS Best-Practices Guide": "ZFS works well with storage based protected LUNs (RAID-5 or mirrored LUNs from intelligent storage arrays). However, [with this configuration] ZFS cannot heal corrupted blocks that are detected by ZFS checksums."

Storage-Pool performance-consideration:
From Sun's "ZFS Best-Practices Guide": If you must use LUNs then try at least to use "LUNs made up of [only] a few disks. By providing ZFS with more visibility into the LUNs setup, ZFS is able to make better I/O scheduling decisions."


Aspects of Using Volumes from Software-Based Volume-Managers

    Potential performance issues:
        From Sun's ZFS Admin Guide PDF: "You can construct logical devices for ZFS using volumes presented by software-based volume managers, such as SolarisTM Volume Manager (SVM) or Veritas Volume Manager (VxVM). However, these configurations are not recommended [because, although] ZFS functions properly on such devices, less-than-optimal performance might be [one] result."

Potential performance and efficiency issues:


From Sun's "ZFS Best-Practices Guide": "ZFS works best without any additional volume management software. If you must use ZFS with SVM because you need an extra level of volume management, ZFS expects that 1 to 4 Mbytes of consecutive logical block[s] [be mapped] to consecutive physical blocks. Keeping to this rule allows ZFS to drive the volume with efficiency."

Aspects of Using Files

    Intended only for experimental and testing purposes.
    Files must be at least 64-MB.
    If a file is moved or otherwise renamed, the storage-pool must be exported and re-imported to be able to use the changed file.

    [Sun's PDF does not say what type(s) of files can be used, nor what special command to use, if any, to create these files. A natural assumption is the same command used to create files that can serve as swap-files, but I'm not yet sure.]

    I found evidence that the "dd" command can be used, like this (see the Blog: http://i18n-freedom.blogspot.com/search?q=%22How+to+turn+a+mirror+in+to+a+RAID%22):

Create a Sparse File:


$ dd if=/dev/zero of=/xenophanes/disk.img bs=1024k seek=149k count=1

Use the "lofiadm" utility to add the /xenophanes/disk.img file as a blockdevice:

# lofiadm -a /xenophanes/disk.img
/dev/lofi/1

Create the ZFS RAID-Z Storage-Pool "heraclitus" with a RAID-Z vdev that
includes /dev/lofi/1 as one of the vdev-components:

# zpool create heraclitus raidz c2d0 c4d1 /dev/lofi/1
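(As an alternative sketch, and assuming the "natural assumption" mentioned above is correct, plain files created with "mkfile" can also be handed to "zpool create" directly --file-vdevs must be given as absolute paths. The directory /zfsfiles and the pool-name "filepool" are hypothetical, and this is for testing only, per the warnings above:)

# mkfile 128m /zfsfiles/file1 /zfsfiles/file2   <--[creates two 128-MB files; comfortably above the 64-MB minimum]
# zpool create filepool /zfsfiles/file1 /zfsfiles/file2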

Page

36

of

59

(I've sent the blogger a message, asking whether he knows if the "dd" command is the only way to create files for this purpose; I never got back any response. --JRAvery)

==


Choose the Type of Data-Replication for the Storage-Pool


What are the Replication-Type Choices?
A single ZFS Storage-Pool can support any one of the following Replication-Types (aka "RAID-level variations"):

    Striped (aka "Dynamic Striping") (non-redundant)
    Mirrored (multiple identical copies of the same data, dynamically maintained)
    RAID-Z and -Z2 (variations of RAID-5 and RAID-6, respectively)


!*NOTE*!: ZFS does officially support a single Storage-Pool that contains virtual-devices (vdevs) of multiple replication-types. !*HOWEVER*!, it is distinctly recommended that you *NOT* configure multiple replication-types within a single Storage-Pool! In other words, you could build a Storage-Pool containing some simple-stripe vdevs plus one or more mirrored vdevs plus one or more RAID-Z and/or RAID-Z2 vdevs --but it is highly recommended that you don't do that.

Mirrors

    As of 2008/September, the following operations are supported for ZFS mirrored storage-pools (per Sun's 2008/September ZFS Admin Guide) (command sketches for these operations appear after the two lists below):

    Add another set of disks for an additional top-level vdev to an existing mirrored configuration.

    Attach additional disks to an existing mirrored storage-pool, or attach additional disks to a non-replicated configuration to create a mirrored storage-pool.

    Replace a disk or disks in an existing mirrored storage-pool, as long as the replacement disks' sizes are greater-than or equal-to the size of the device to be replaced.

    Detach a disk or disks in a mirrored storage-pool, as long as the remaining devices provide adequate redundancy for the configuration.

    As of 2008/September, the following operations are NOT supported for ZFS mirrored storage-pools (per Sun's ZFS Admin Guide):


    You cannot outright remove a device from a mirrored storage pool. An RFE is filed for this feature.
        <--[The Sun guide does *not* clarify the difference between this, which is not supported, and the above-mentioned "replace" and "detach" operations, which are supported!]

    You cannot split or break a mirror for backup purposes. An RFE is filed for this feature.
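    The following are minimal command sketches of the four supported operations listed above, run against the "puppy" 2-way-mirror examples used later in this chapter; the additional device-names are hypothetical and chosen only for illustration:

    # zpool add puppy mirror c3t0d0 c4t0d0      <--[add another 2-way-mirror top-level vdev to the pool]
    # zpool attach puppy c0t1d0 c5t0d0          <--[attach c5t0d0 to the mirror that contains c0t1d0, making it a 3-way mirror]
    # zpool replace puppy c1t0d0 c6t0d0         <--[replace c1t0d0 with c6t0d0, which must be at least as large]
    # zpool detach puppy c5t0d0                 <--[detach one side of a mirror, as long as redundancy remains adequate]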

RAID-Z: Single and Double Parity

    Single-parity RAID-Z is basically like original RAID-5.
        It requires at least 2 disks (though more are recommended and typically used): 1 for an original-data stripe and 1 more for 1 parity stripe, for each striped volume or vdev.
        Use the "raidz" or "raidz1" option on the command-line.

[graphic borrowed from Wikipedia --JRA]

    Double-parity RAID-Z --aka RAID-Z2-- is basically like original RAID-6.


        It requires at least 3 disks (though more are recommended and typically used): 1 for an original-data stripe and 2 more, one for each of two parity stripes, for each striped volume or vdev.
        Use the "raidz2" option on the command-line.

[graphic borrowed from Wikipedia --JRA]

    RAID-5 write-hole:
        (Per Sun's ZFS Admin Guide): RAID-Z overcomes the "RAID-5 write-hole" by using "variable-width RAID stripes so that all writes are full-stripe writes." "This design is only possible because ZFS integrates file system and device management in such a way that the file system's metadata has enough information about the underlying data redundancy model to handle variable-width RAID stripes. RAID-Z is the world's first software-only solution to the RAID-5 write hole."

RAID-Z Data-Capacity and Failure-Resistance:

(Per Sun's ZFS Admin Guide): "A RAID-Z configuration with N disks of size X with P parity disks can hold approximately (N-P)*X bytes and can withstand P device(s) failing before data integrity is compromised."

Recommended Number-of-Disks Range for RAID-Z:

(Per Sun's ZFS Admin Guide): If you are creating a RAID-Z Storage-Pool with >10 disks, it is better to create 2 or more RAID-Z top-level vdevs within the storage-pool, to get better performance. For example: 12 disks: create 2 RAID-Z top-level vdevs with 6 disks each, rather than a single top-level vdev with 12 disks. In other words: the recommended number of disks, per RAID-Z top-level vdev, is from 3 to 9.
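    A minimal sketch of that 12-disk recommendation (the pool-name "bigz" and all device-names are hypothetical):

    # zpool create bigz raidz c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 \
    >                   raidz c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0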

Striped (Simple; no Parity)

    Provides *NO* data-redundancy: *NO* protection from failed devices!


Choosing between Mirrored; RAID-Z; and RAID-Z2

    Mirrors:
        Advantages:
            Better performance "particularly" *IF* your I/O predominantly consists of "large, uncacheable, random read loads" *OR* "small random reads".
        Disadvantages:
            To store X amount of data, uses more disk-space than RAID-Z or RAID-Z2.
            Must use at least 3-way mirrors --which requires more disk-space than 2-way mirrors-- to approach the MTTDL (Mean-Time To Data-Loss) of RAID-Z and RAID-Z2. (This is not necessarily a notable disadvantage, given that some shops assume the need for 3-way mirrors, as a matter of course.)

    RAID-Z:
        Advantages:
            Uses the least amount of disk-space among the ZFS replication models that provide redundancy.
            Performs well *WHEN* data reads & writes occur in large chunks of 128 K or more.
        Disadvantages:
            Not quite as good MTTDL (Mean-Time To Data-Loss) as RAID-Z2.
            Performs less well for random reads, compared to mirrors.

    RAID-Z2:
        Advantages:
            Distinctly the best MTTDL (Mean-Time To Data-Loss) when compared to RAID-Z or 2-way mirrors.
            Performance similar to RAID-Z.
        Disadvantages:
            Performs less well for random reads, compared to mirrors.

    When It Doesn't Matter (relatively speaking):
        "If your I/Os are large, sequential, or write-mostly, then ZFS's I/O scheduler aggregates them in such a way that you'll get very efficient use of the disks regardless of the data-replication model."


    When a high MTTDL (Mean-Time to Data-Loss) is of prime importance, and/or you believe you can't afford the storage-capacity loss involved with mirroring but you need to maximize your I/O within these constraints (from a blog-posting by Roch (rhymes with Spock) Bourbonnais, on blogs.sun.com, 2006/May):

        Choose RAID-Z --aka "Z1" (same as RAID-5).

        Determine the number of disks you have and, specifically, what their IOPS (I/O Operations-Per-Second) capability is: X. (You need to have a bunch of disks that share the same IOPS stats, for this to work.)

        Decide on a target of Y number of FS-blocks per second in your I/O operations.

        Divide Y by X (Y/X) --in other words, divide your FS-blocks/sec target by the IOPS of which your disks are capable.

        The result is the number of disks you should include in each RAID-Z grouping [not RAID-Z2 I think, which uses a slightly different formula, I think --JRA].

        EXAMPLE:
            50 disks, each capable of 250 IOPS.
            Target FS-blocks/sec = 1000.
            Y/X = 1000/250 = 4 disks for each RAID-Z top-level vdev in the ZFS Storage-Pool.

    RAID-Z is a great technology not only when disk blocks are your most precious resource but also when your available IOPS far exceed your expected needs. But beware: if you end up with a smaller number of very large disks, IOPS capacity can easily become your most precious resource. Under those conditions, mirroring should be strongly favored --or, alternatively, a dynamic stripe of RAID-Z groups, each made up of a small number of devices.

Logistical & Administrative Considerations for the Storage-Pool


1) Choose a name for your Storage-Pool.
   This is not a technical consideration but it might be important to you from a logistical or administrative standpoint, to help you and your colleagues keep track of your pools and file-systems by having names that are easy to remember and/or have some bearing on the types of data that you expect to store on them.
   (This is not always the better approach to such naming conventions. Sometimes, the specific use of a resource changes over time and, when this occurs, it is sometimes more trouble than desired to change the name of that resource to reflect the new usage, in which case an arbitrary naming convention might be better.)

2) . .


Basic Commands to Create a Storage-Pool


Create Striped Storage-Pools (Simple; no Parity) (Command Examples)

    Use the following command syntax to create a simple-striped storage-pool. (Each of the following examples uses entire physical disks as the elements.)

    Create a simple-striped pool with a single vdev of 3 disks:

# zpool create fishy c0t1d0 c1t0d0 c2t1d0

The data will be dynamically striped across the two-or-more designated disks (or whatever types of devices you specified, on the command-line, as the components of the "fishy" pool).

    [*NOTE*: Sun's PDF gives only one example, which is a variation of the above. When creating a mirrored or RAID-Z ZFS-storage-pool, one can create multiple top-level (aka "root") vdevs during the original creation of the pool. Given the above example (and the variation in Sun's PDF), it's not at all clear how one would do this with a simple Striped Storage-Pool. Granted, this seems not particularly important, given that it would probably be very rare for anybody to actually want to have a simple Striped Storage-Pool, unless maybe they were going only for performance and no HA (High Availability). But, if it's possible, I don't yet know how. --JRA]
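    Although the creation-time question above remains open, a simple striped pool can at least be grown after creation; a minimal sketch follows (the added device-name is hypothetical):

    # zpool add fishy c3t0d0      <--[adds another disk to the "fishy" pool; ZFS then dynamically stripes new data across all of the pool's vdevs]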

Create Mirrored Storage-Pools (Command Examples)


    Use the following command syntax to create a mirrored storage-pool. (Each of the following examples uses entire physical disks as the elements.)

    Create a mirrored pool consisting of a single 2-way mirror:

# zpool create puppy mirror c0t1d0 c1t0d0

Create a mirrored pool consisting of a single 3-way mirror:

# zpool create puppy mirror c0t1d0 c1t0d0 c2t1d0

Create a mirrored pool consisting of two 2-way mirrors:

# zpool create puppy mirror c0t1d0 c1t0d0 mirror c2t1d0 c3t1d0

    Create a mirrored pool consisting of two 3-way mirrors:

# zpool create puppy mirror c0t1d0 c1t0d0 c2t1d0 mirror c0t2d0 c1t1d0 c2t2d0

NOTE that the above commands also create a single corresponding ZFS file-system, by the name of "puppy", and automatically mount that file-system as "/puppy".


Create RAID-Z Storage-Pools (Command Examples)


    Use the following command syntax to create a RAID-Z storage-pool.

    Create a RAID-Z pool consisting of a single vdev with 4 disks:

# zpool create puppy raidz c0t1d0 c1t0d0 c2t0d0 c3t1d0

    Create a RAID-Z pool consisting of 2 vdevs with 3 disks each:

# zpool create puppy raidz c0t1d0 c1t0d0 c2t0d0 raidz c0t2d0 c2t1d0 c3t0d0

Create RAID-Z2 Storage-Pools (Command Examples)


    Use the following command syntax to create a RAID-Z2 storage-pool.

    Create a RAID-Z2 pool consisting of a single vdev with 5 disks:

# zpool create puppy raidz2 c0t1d0 c1t0d0 c2t0d0 c3t1d0 c0t2d0

    Create a RAID-Z2 pool consisting of 2 vdevs with 4 disks each:

# zpool create puppy raidz2 c0t1d0 c1t0d0 c2t0d0 c3t1d0 \
> raidz2 c0t2d0 c1t1d0 c2t1d0 c3t2d0

Managing Storage-Pool-Creation Errors


Detecting In-Use Devices
When you are trying to "zpool create" a new Storage-Pool, ZFS first attempts to determine whether a device (included on the command-line) is already in use by ZFS or already in use by some other part of the OS or not presently in use at all. If it determines that the device is already in use, the error-message might appear similar to the following:

# zpool create headcase c1t0d0 c1t1d0

invalid vdev specification
use -f to override the following errors:
/dev/dsk/c1t0d0s0 is currently mounted on /. Please see umount(1M).
/dev/dsk/c1t0d0s1 is currently mounted on swap. Please see swap(1M).
/dev/dsk/c1t1d0s0 is part of active ZFS pool zeepool. Please see zpool(1M).

Some errors can be overridden by using the -f switch but most cannot.

The following storage-pool-create errors must be fixed manually:

    Mounted File-System:
        The disk, or one or more of its partitions, has a file-system that is presently mounted. You must "umount" those file-systems.

    File-System in /etc/vfstab:
        Even if the file-system is not presently mounted, it must not be listed in /etc/vfstab. Manually remove it or comment it out.

    Dedicated dump device:
        You cannot use a dedicated dump-device as a ZFS element. Correct the situation with dumpadm or choose a different device.

    Already part of a ZFS pool:
        To overcome this error: choose a different device; or use "zpool destroy" to destroy the other pool if it's no longer needed; or use "zpool detach" to detach the device from the other pool.

The following storage-pool-create errors can be overridden with the "-f" switch:

    Contains a File-System:
        The disk, or one or more of its slices, contains a known file-system that is not mounted and not in /etc/vfstab.

    Part of an SVM volume:
        The disk, or one or more of its slices, is part of an SVM volume.

    Live Upgrade ABE:
        The disk is designated as an Alternate Boot Environment (ABE) for Sun's Live Upgrade.

    Part of an exported ZFS pool:
        The disk is part of a storage pool that has been exported or manually removed from a system. In the latter case, the pool is reported as "potentially active", because the disk might be a network-attached drive in use by another system. Be cautious when overriding a "potentially active" pool.
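A minimal sketch of pre-checks you might run on a candidate device before handing it to "zpool create" (the device-name c1t1d0 is hypothetical; these commands only inspect, they change nothing):

# grep c1t1d0 /etc/vfstab        <--[is any slice of the disk listed in /etc/vfstab?]
# swap -l                        <--[is any slice of the disk in use as a swap-device?]
# dumpadm                        <--[is any slice of the disk the dedicated dump-device?]
# zpool status | grep c1t1d0     <--[is the disk already part of an active ZFS pool on this system?]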

Mismatched Replication-Types
ZFS will allow you to create a storage-pool with vdevs of mismatched replication-types, but it does not like this: it will error and try to prevent you from doing so unless you override with -f.

Example: Combined Simple-Striping and Mirroring in Same Pool


# zpool create hepburn c1t1d0 mirror c2t2d0 c3t3d0
invalid vdev specification
use -f to override the following errors:
mismatched replication level: both disk and mirror vdevs are present

To override:

# zpool create -f hepburn c1t1d0 mirror c2t2d0 c3t3d0

Example: Combined 2-Way Mirror and 3-Way Mirror in Same Pool


# zpool create fishhead mirror c0t1d0 c0t2d0 mirror c0t3d0 c0t4d0 c0t5d0
invalid vdev specification
use -f to override the following errors:
mismatched replication level: 2-way mirror and 3-way mirror vdevs are present

To override:

# zpool create -f fishhead mirror c0t1d0 c0t2d0 mirror c0t3d0 c0t4d0 c0t5d0


Example: Combined RAID-Z(2) vdevs w/ Differently-Sized-Devices in Same Pool


    [don't presently have an example of this]

    To override:

# zpool create -f [...]
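    A hypothetical sketch of a command that could run into this class of complaint --assume c1t1d0 and c2t1d0 are 72-GB disks while c3t1d0 is a 146-GB disk, so the RAID-Z vdev mixes differently-sized devices (device-names and sizes are invented, and the exact error-text is not reproduced here; ZFS would be expected to warn and require -f):

# zpool create -f mixedpool raidz c1t1d0 c2t1d0 c3t1d0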

Controlling ZFS Mount-Point at Pool-Creation: Alternate "Default Mount-Point"; Alternate Root


The default behavior, for the "zpool create" command, is that whatever custom-name you place after the "create" switch, on the command-line, becomes not only the name of the Storage-Pool but also the name of the top-level directory and the top-level mount-point of the first file-system in the Storage-Pool. For examples:

# zpool create fishlegs c0t2d0s3 c0t3d0s3

# df -h
Filesystem             size   used  avail  capacity  Mounted on
fishlegs               1.5G   18K   1.5G   1%        /fishlegs

# df -h /fishlegs
Filesystem             size   used  avail  capacity  Mounted on
fishlegs               1.5G   18K   1.5G   1%        /fishlegs

# ls -al /fishlegs
total 5
drwxr-xr-x   2 root     root           2 May 30 14:49 .
drwxr-xr-x  37 root     root        1024 May 30 14:49 ..

If you want to customize the top-level mount-point for the top-level file-system (aka, the "Root-Directory") of the Storage-Pool, there are two ways to do it:

    Customize the Default Mount-Point
    Designate an "Alternate Root-Directory"
        (This latter means an "alternate root-directory" specifically for the Storage-Pool; *not* for the entire computer-system.)

Generally, there are 2 scenarios in which you might want to use either of these techniques:

    A top-level directory already exists, and has files and/or subdirectories in it, where that directory's name matches the name that you are giving to the Storage-Pool.
        For example: You are creating a new Storage-Pool called "hardrock" but there is already a "/hardrock" directory and it already has some files and/or subdirectories in it.


    For whatever reason, you simply want your Storage-Pool to have a "Default Mount-Point" that has a different name from the name of the Storage-Pool.
*NOTE*: The difference, between using an alternate "Default Mount-Point" and using an "Alternate Root-Directory", is this:

    When you designate an alternate "Default Mount-Point", that alternate will be the parent-directory of a mount-point subdirectory that still bears the same name as the Storage-Pool.

    When you designate an "Alternate Root-Directory", the name of the Storage-Pool will not necessarily appear anywhere in the path to that mount-point, unless you decide to specifically include it on the command-line after the "-R" switch.

Alternate "Default Mount-Point"


EXAMPLE:

# zpool create topcat c0t3d0
default mountpoint /topcat exists and is not empty
use -m option to specify a different default

# zpool create -m /export/zfs topcat c0t3d0

This uses --and creates, if necessary-- the /export/zfs/topcat/ subdirectory for the mount-point.

EXAMPLE:

# zpool create -m /stuff lowdog c0t3d0

This uses --and creates, if necessary-- the /stuff/lowdog/ subdirectory for the mount-point.

As far as I can tell, there is no way to prevent ZFS from mounting a related file-system somewhere when the "zpool create" command is executed. --JRA

Alternate Root-Directory
EXAMPLE:

# zpool create -R /panda fishlegs c0t2d0s3 c0t3d0s3

# df -h
Filesystem             size   used  avail  capacity  Mounted on
fishlegs               1.5G   18K   1.5G   1%        /panda

# df -h /panda
Filesystem             size   used  avail  capacity  Mounted on
fishlegs               1.5G   18K   1.5G   1%        /panda

# ls -al /panda
total 5
drwxr-xr-x   2 root     root           2 May 30 14:49 .
drwxr-xr-x  37 root     root        1024 May 30 14:49 ..

==


ZFS Intent-Log (ZIL)


Starting with the 10/08 release of Solaris-10, a new feature of the ZFS Intent-Log (ZIL) is added to ZFS, to comply with POSIX standards for synchronous transactions. The creation of the ZIL is automatic when a Storage-Pool is created. By default, the ZIL is allocated from blocks within the main storage pool. However, better performance might be possible by using separate intent-log devices in your ZFS storage pool, such as with NVRAM or a dedicated disk.

Considerations for Using Separate ZIL Devices


(Some of the following information is gleaned from Neil Perrin's Weblog, at "http://blogs.sun.com/perrin/entry/slog_blog_or_blogging_on".)

    Any performance improvement seen by implementing a separate log device depends on the device type, the hardware configuration of the pool, and the application workload.

    Neil Perrin, of Sun Microsystems, refers to ZILs on separate devices as "slogs", which is short for "separate intent-logs".

    By default, the intent-log consists of "a chain of varying block-sizes[,] anchored in fixed objects". The problem with this chain is that log-blocks get allocated for a transaction but then "freed as soon as the pool transaction-group has completed", which can easily result in fragmentation within the pool.

    Regarding performance, Neil mentions this: "The performance of databases and NFS is dictated by the latency of making data stable. They need to be assured that their transactions are not lost on power or system failure. So they are heavily dependent on the speed of the intent log devices."

    Neil ran some tests that give preliminary indications that, if you can specify an NVRAM-device as your slog-device (separate device for a ZIL), it's worth having a separate slog; but, if your slog-device is simply a regular disk, there are no clear performance benefits. In his test, his NVRAM-device was a "battery backed pci Micro Memory pci1332,5425 card".

    Separate log-devices can be unreplicated or mirrored, but not RAID-Z.

    If a separate log-device is not mirrored and that log-device fails, ZFS reverts to the default behavior of storing the intent-log on blocks in the storage-pool.

    Log devices can be added, replaced, attached, detached, and imported and exported as part of the larger storage pool. As of 2008/September, log devices cannot be removed.
        [Again, Sun's PDF does not clarify the difference between what it means to "detach" and what it means to "remove".]

    The minimum size of a log-device is the same as the minimum size of each device in a pool, which is 64 MBytes.

    The amount of in-play data that might be stored on a log-device is relatively small. Log-blocks are freed when the log transaction (system call) is committed.
        [NOTE that this can result in the fragmenting to which Neil Perrin refers in his blog, as noted above. --JRA]

    The maximum size of a log-device should be approximately 1/2 the size of physical memory, because that is the maximum amount of potential in-play data that can be stored. For example, if a system has 16 Gbytes of physical memory, consider a maximum log-device-size of 8 Gbytes.
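    A quick sketch of checking a system's physical-memory size when applying that 1/2-of-RAM guideline (the grep pattern is just one convenient way to trim prtconf's output):

    # prtconf | grep -i "memory size"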

Command-Syntax to Create or Add Separate ZIL Devices


    In each of the following command-examples, the command-elements that are specific to creating or manipulating the log-device(s) are displayed in bold-blue font.

1)  Example 1: Create a simple-striped pool with a single vdev of 3 disks, plus a single-disk log-device:

# zpool create fishy c0t1d0 c1t0d0 c2t1d0 log c0t2d0

2)  Example 2: Starting with the storage-pool and single-log-device created in Example 1 above, add a second and separate log-device, not mirrored:

# zpool add fishy log c0t3d0

    NOTE that, if the "fishy" pool had not been created with any log-device in the first place, the above command would simply have added a first-ever log-device to the "fishy" pool.

3)  Example 3: Starting again with the storage-pool and single-log-device created in Example 1 above (pretend that we have not done Example 2), add a second log-device to the pool but, this time, in such a way as to create a mirror, using the original log-device and the newly-added log-device (from the following command) as the two submirrors of the log-device-mirror:

# zpool attach fishy c0t2d0 c0t3d0

    Again: now the "fishy" pool's log-device is a mirrored vdev, with c0t2d0 and c0t3d0 acting as the two submirrors of that mirrored vdev.

4)  Example 4: Create a new storage-pool that, from the beginning, contains a separate and mirrored log-device.

# zpool create birdy c0t1d0 c1t0d0 log mirror c0t2d0 c1t2d0


5)  Example 5: Create a new storage-pool that, from the beginning, contains two separate and mirrored log-devices.

# zpool create doggy c0t1d0 c1t0d0 \
> log mirror c0t2d0 c1t2d0 log mirror c1t3d0 c2t3d0
6)  Example 6: For an existing storage-pool, detach a log-device from a mirrored log-device.

# zpool detach fishy c0t3d0

7)  Example 7: For an existing storage-pool, replace one log-device with a different log-device.

# zpool replace fishy c0t2d0 c0t3d0

    Again: the result of the above command is that the previous log-device, c0t2d0, has now been replaced with the newly-designated log-device, c0t3d0.

    [What is not completely clear is this: Before the replacement, was c0t3d0 not part of the pool at all, or did it need to be part of the pool first? Also, does this replace action remove c0t2d0 completely from the pool or does it simply deactivate its status as a log-device? I think the answer to these two is that c0t3d0 was not part of the pool before the replace and c0t2d0 is not part of the pool after the replace, but I'm not yet certain.]

==


Storage-Pool Management
[!*NOTE*!: At the moment this note is being added, it is my intention probably to have ZFS File-System Management covered under a different major heading in this document. I might change my mind but that's it for now. --JRAvery]

Create Storage-Pools
[This topic is covered under the major heading "ZFS Storage-Pool: Types; Considerations; How to Create" and the 2nd-level heading "Basic Commands to Create a Storage-Pool".]

View Storage-Pool Data / Information


"zpool list": Simple View of Storage-Pools
# zpool list
NAME        SIZE    USED    AVAIL   CAP   HEALTH   ALTROOT
bump        80G     274K    80G     0%    ONLINE   /super
radical     42.9G   70.0K   42.8G   0%    ONLINE   -
hershey     90G     45G     45G     50%   ONLINE   -

# zpool list hershey
NAME        SIZE    USED    AVAIL   CAP   HEALTH   ALTROOT
hershey     90G     45G     45G     50%   ONLINE   -

# zpool list -o name,size,health
NAME        SIZE    HEALTH
bump        80.0G   ONLINE
zeeter      2.3T    ONLINE
hershey     90.0G   ONLINE

"zpool status": Storage-Pool / VDev Details/Status


About "zpool status" Output-Sections
The output of the "zpool status" command is divided into several sections. Not all of these sections will always appear, each time you execute the command. Here is a brief description of each section:

    pool:
        The name of the pool.

    state:
        A one-word label as to the present health of the pool. This information refers only to the ability of the pool to provide the necessary replication level. Pools that are ONLINE might still have failing devices or data corruption. See the table under the SubHeader "About ZFS-Storage-Pool Device-States (from "zpool status")", below.

    status:
        If and only if there is a problem with the pool, this section is displayed to provide a description of the problem.

    action:
        If and only if there is a problem with the pool, this section is displayed to provide a recommended action for repairing the errors. This field might be in an abbreviated form, directing the user to one of the following sections.

    see:
        If and only if there is a problem with the pool, this section is displayed to provide a reference to a knowledge article that contains detailed repair-info.

    scrub:
        Identifies the current status of a scrub operation, which might include the date and time that the last scrub was completed, a scrub in progress, or if no scrubbing was requested.

    config:
        Describes the config-layout of the devices in the pool, plus their state and any device-errors. Device-states are any of the following: ONLINE, OFFLINE, FAULTED, DEGRADED, AVAIL, UNAVAILABLE. See the table under the SubHeader "About ZFS-Storage-Pool Device-States (from "zpool status")", below.

        Also, a 2nd subsection of the config section displays error-statistics:
            READ  --I/O errors occurred while issuing a read request.
            WRITE --I/O errors occurred while issuing a write request.
            CKSUM --Checksum errors. The device returned corrupted data as the result of a read request.

        NOTE: These errors can help indicate whether or not the damage is permanent. Only a few I/O errors might indicate a temporary outage; a large number might indicate a permanent device problem.

        NOTE ALSO: Any such errors might not correspond to errors as seen either (A) by the application(s) using the data on the affected vdevs or (B) even at the top level of the vdevs --that is, at the mirror or raidz level. This is because, with these redundant configurations, it's possible that the remaining underlying devices, in the mirror or raidz vdev, were sufficient to prevent any loss of data or functionality when the error occurred. After all, that's what the redundancy is for. Also, ZFS's self-healing-data feature might have been able to kick in.

    errors:
        Identifies known data errors or the fact that there are none.

About ZFS-Storage-Pool Device-States (from "zpool status")


    ONLINE:
        The device is in normal working order. While some transient errors might still occur, the device is otherwise in working order.

    DEGRADED:
        The virtual device (vdev) has experienced failure but is still able to function. This state is most common when a mirror or RAID-Z device has lost one or more constituent devices. The fault tolerance of the pool might be compromised, as a subsequent fault in another device might be unrecoverable.

    FAULTED:
        The virtual device (vdev) is completely inaccessible. This status typically indicates total failure of the device, such that ZFS is incapable of sending/receiving data to/from it. If a top-level (aka "root") vdev is in this state, the pool is completely inaccessible.
        [Is that last sentence really true?!?! That makes sense to me if there is only 1 top-level vdev w/in the pool but what if there are two or more root vdevs?!?!: it should still work, I believe!]

    OFFLINE:
        The virtual device (vdev) has been deliberately taken offline by the administrator.

    UNAVAILABLE:
        The device or virtual device (vdev) cannot be opened. In some cases, pools with UNAVAILABLE devices appear in DEGRADED mode. If a top-level vdev is unavailable, then nothing in the pool can be accessed.
        [AGAIN!: Is that last sentence really true?!?! That makes sense to me if there is only 1 top-level vdev w/in the pool but what if there are two or more root vdevs?!?!: it should still work, I believe!]

    AVAIL:
        [This state is not included in this section of Sun's PDF on ZFS but, elsewhere in the same PDF, I saw examples of "zpool status" output that clearly displayed this state for spare devices, obviously indicating that these spare devices were available for use within the pool as needed.]

    REMOVED:
        The device was physically removed while the system was running. Device removal detection is hardware-dependent and might not be supported on all platforms.

Examples of "zpool status" Syntax and Output


Example 1: Request "status" without specifying a particular pool.

# zpool status

  pool: rootpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rootpool      ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t4d0s0  ONLINE       0     0     0
            c1t3d0s0  ONLINE       0     0     0

  pool: hepcat
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        hepcat      ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t2d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t3d0  ONLINE       0     0     0
            c2t3d0  ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c1t4d0  ONLINE       0     0     0
            c2t4d0  ONLINE       0     0     0
        logs        ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            c0t3d0  ONLINE       0     0     0
            c1t5d0  ONLINE       0     0     0
        spares
          c0t5d0    AVAIL
          c1t6d0    AVAIL

[Sorry but I don't remember why I have the above in red-font. --JRAvery]

Example 2:

Request "status", specifying a pool.

# zpool status rootpool

  pool: rootpool
 state: ONLINE
 scrub: none requested
config:

        NAME          STATE     READ WRITE CKSUM
        rootpool      ONLINE       0     0     0
          mirror      ONLINE       0     0     0
            c0t4d0s0  ONLINE       0     0     0
            c1t3d0s0  ONLINE       0     0     0

Example 3: Request "status" with "-v" for "verbose" output. (Note: Might not see anything different from not specifying "-v"; depends on the state(s) of the pool(s).)

# zpool status -v

[no sample output yet]

errors: No known data errors   <--[supposedly this will appear after other output, *if* actually no errors]

Example 4: Request "status" with "-x" to get only pools that are erroring or unavailable. Here we get an "UNAVAIL" and "cannot open":

# zpool status -x

  pool: zeepool
 state: DEGRADED
status: One or more devices could not be opened. Sufficient replicas exist for
        the pool to continue functioning in a degraded state.
action: Attach the missing device and online it using zpool online.
   see: http://www.sun.com/msg/ZFS-8000-D3
 scrub: resilver completed after 0h12m with 0 errors on Thu Aug 28 09:29:43 2008
config:

        NAME          STATE     READ WRITE CKSUM
        zeepool       DEGRADED     0     0     0
          mirror      DEGRADED     0     0     0
            c1t2d0    ONLINE       0     0     0
            spare     DEGRADED     0     0     0
              c2t1d0  UNAVAIL      0     0     0  cannot open
              c2t3d0  ONLINE       0     0     0
        spares
          c1t3d0      AVAIL
          c2t3d0      INUSE     currently in use

errors: No known data errors   <--[not sure this really appears in this type of situation]

*NOTE*: Similar detail can be seen when using "zpool status -v". The difference is that, with "-v", you'll get info about all the pools; with "-x", you get info only about the pools with problems.


Example 5: Request "status" with "-x" to get only pools that are erroring or unavailable. In this example, "all pools are healthy":

# zpool status -x
all pools are healthy

"zpool iostat": Storage-Pool I/O Statistics


Pool-Summary Statistics
Example 1:

# zpool iostat bump 2     <--[the "2" is to repeat the stats every 2 seconds]

               capacity     operations    bandwidth
pool         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
bump         100G  20.0G   1.2M   102K   1.2M  3.45K
bump         100G  20.0G    134      0  1.34K      0
bump         100G  20.0G     94    342  1.06K   4.1M

Pool Statistics per VDev


Example 1:

# zpool iostat -v

               capacity     operations    bandwidth
tank         used  avail   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
mirror      20.4G  59.6G      0     22      0  6.00K
  c1t0d0        -      -      1    295  11.2K   148K
  c1t1d0        -      -      1    299  11.2K   148K
----------  -----  -----  -----  -----  -----  -----
total       24.5K   149M      0     22      0  6.00K

*NOTE* 1: Space-usage is available only for top-level vdevs (aka "root" vdevs), such as "mirror" in the above output.

*NOTE* 2: The numbers might not add up exactly correctly, particularly with Mirrors and RAID-Z/Z2 (as opposed to simple striping). The apparent discrepancy is particularly noticeable immediately and shortly after a pool is first created, because of all the I/O necessary for the pool-creation activities at the individual disk (or partition or file) level.

["zpool history": ]
[nothing yet. --jra]

["zpool get": ]
[nothing yet. --jra]


When & Why to Export / Import a Storage-Pool

    Moving a Storage-Pool from one computer-system to another.
        Export the pool from the original system; perform whatever physical moves might be required; then import the pool on the different system.

    Performing a "zpool replace" command where the replacement device has a larger capacity than the device that was replaced.
        If you do not export and import the pool, the increased size of the replacement device will not be seen/acknowledged by the system.

    In the rare situation in which both the path and the Device-ID of a disk are changed without the disk's being technically removed from the Storage-Pool, the pool must be exported and then imported, or else the disk cannot be used in ZFS.
        From the Solaris ZFS Administrator Guide of 2008/September: "Disks are identified both by their path and by their device ID, if available. This method allows devices to be reconfigured on a system without having to update any ZFS state. If [for example] a disk is switched between controller 1 and controller 2, ZFS uses the device ID to detect that the disk has moved and should now be accessed using controller 2. The device ID is unique to the drive's firmware. While unlikely, some firmware updates have been known to change device IDs. If this situation happens, ZFS can still access the device by path and update the stored device ID automatically. If you inadvertently change both the path and the ID of the device, then export and re-import the pool in order to use it."

    You are using UFS-files as virtual-devices (vdevs) in a Storage-Pool, for testing purposes, and at some point you move or rename one of those vdev files.
        At this point, the Storage-Pool must be exported and then imported, in order to continue using the pool.

    When growing the size of a LUN.
        From the Solaris ZFS Administrator Guide of 2008/September: "Currently, when growing the size of an existing LUN that is part of a storage pool, you must also perform the export and import steps to see the expanded disk capacity."

    You previously "offline[d]" a disk from a Storage-Pool --which does not technically remove that disk from that Storage-Pool-- and later you destroy that particular Storage-Pool. Later again, you want to re-use that disk in a different Storage-Pool but, when you run the "zpool add" command, you get the error "{device} is part of exported or potentially active ZFS pool. Please see zpool(1M)."
        At this point, you must "import" that previously-destroyed pool again, which will make that disk active again and also make the previously-destroyed pool active again. Now destroy that pool again and the disk will be freed up for use in the different pool where you now want it.
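    A minimal sketch of the basic export/import sequence (the pool-name "fishlegs" follows the earlier examples; the -d form is only needed when the pool's vdevs are files, so the directory /zfsfiles is hypothetical):

    # zpool export fishlegs
    # zpool import                         <--[with no pool-name, lists the pools that are available to import]
    # zpool import fishlegs
    # zpool import -d /zfsfiles fishlegs   <--[search a specific directory for file-based vdevs]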

==


[......]
[.....]
[....]
[...]
==



