You are on page 1of 22

FIOS: A Fair, Ecient Flash I/O Scheduler

Stan Park Kai Shen


University of Rochester
1 / 21
Background
Flash is widely available as mass storage, e.g. SSD
$/GB still dropping, aordable high-performance I/O
Deployed in data centers as well low-power platforms
Adoption continues to grow but very little work in robust I/O
scheduling for Flash I/O
Synchronous writes are still a major factor in I/O bottlenecks
2 / 21
Flash: Characteristics & Challenges
No seek latency, low latency variance
I/O granularity: Flash page, 28KB
Large erase granularity: Flash block, 64256 pages
Architecture parallelism
Erase-before-write limitation
I/O asymmetry
Wide variation in performance across vendors!
3 / 21
Motivation
Disk is slow scheduling has largely been
performance-oriented
Flash scheduling for high performance ALONE is easy (just
use noop)
Now fairness can be a rst-class concern
Fairness must account for unique Flash characteristics
4 / 21
Motivation: Prior Schedulers
Fairness-oriented Schedulers: Linux CFQ, SFQ(D), Argon
Lack Flash-awareness and appropriate anticipation support
Linux CFQ, SFQ(D): fail to recognize the need for anticipation
Argon: overly aggressive anticipation support
Flash I/O Scheduling: write bundling, write block preferential, and
page-aligned request merging/splitting
Limited applicability to modern SSDs, performance-oriented
5 / 21
Motivation: Read-Write Interference
0 1 2
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Intel SSD read (alone)
0 0.2 0.4 0.6
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Vertex SSD read (alone)
0 100 200 300
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
all respond quickly
I/O response time (in msecs)
CompactFlash read (alone)
6 / 21
Motivation: Read-Write Interference
0 1 2
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Intel SSD read (alone)
0 0.2 0.4 0.6
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Vertex SSD read (alone)
0 100 200 300
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
all respond quickly
I/O response time (in msecs)
CompactFlash read (alone)
0 1 2
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Intel SSD read (with write)
0 0.2 0.4 0.6
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Vertex SSD read (with write)
0 100 200 300
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
CompactFlash read (with write)
Fast read response is disrupted by interfering writes.
6 / 21
Motivation: Parallelism
1 2 4 8 16 32 64
0
2
4
6
8
Number of concurrent I/O operations
S
p
e
e
d
u
p

o
v
e
r

s
e
r
i
a
l

I
/
O
Intel SSD


Read I/O parallelism Write I/O parallelism
1 2 4 8 16 32 64
0
1
2
3
4
Number of concurrent I/O operations
S
p
e
e
d
u
p

o
v
e
r

s
e
r
i
a
l

I
/
O
Vertex SSD
SSDs can support varying levels of read and write parallelism.
7 / 21
Motivation: I/O Anticipation Support
Reduces potential seek cost for mechanical disks
...but largely negative performance eect on Flash
Flash has no seek latency: no need for anticipation?
No anticipation can result in unfairness: premature service
change, read-write interference
8 / 21
Motivation: I/O Anticipation Support
0
1
2
3
4
I
/
O

s
l
o
w
d
o
w
n

r
a
t
i
o


L
i
n
u
x

C
F
Q
,

n
o

a
n
t
i
c
.
S
F
Q
(
D
)
,

n
o

a
n
t
i
c
.
F
u
l
l

q
u
a
n
t
u
m

a
n
t
i
c
.
Read slowdown
Write slowdown
Lack of anticipation can lead to unfairness; aggressive anticipation
makes fairness costly.
9 / 21
FIOS: Policy
Fair timeslice management: Basis of fairness
Read-write interference management: Account for Flash I/O
asymmetry
I/O parallelism: Recognize and exploit SSD internal
parallelism while maintaining fairness
I/O anticipation: Prevent disruption to fairness mechanisms
10 / 21
FIOS: Timeslice Management
Equal timeslices: amount of time to access device
Non-contiguous usage
Multiple tasks can be serviced simultaneously
Collection of timeslices = epoch; Epoch ends when:
No task with a remaining timeslice issues a request, or
No task has a remaining timeslice
11 / 21
FIOS: Interference Management
0 1 2
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Intel SSD read (with write)
0 0.2 0.4 0.6
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
Vertex SSD read (with write)
0 100 200 300
P
r
o
b
a
b
i
l
i
t
y

d
e
n
s
i
t
y
I/O response time (in msecs)
CompactFlash read (with write)
Reads are faster than writes interference penalizes reads
more
Preference for servicing reads
Delay writes until reads complete
12 / 21
FIOS: I/O Parallelism
SSDs utilize multiple independent channels
Exploit internal parallelism when possible, minding timeslice
and interference management
Parallel cost accounting: New problem in Flash scheduling
Linear cost model, using time to service a given request size
Probabilistic fair sharing: Share perceived device time usage
among concurrent users/tasks
Cost =
T
elapsed
P
issuance
T
elapsed
is the requests elapsed time from its issuance to its
completion, and P
issuance
is the number of outstanding requests
(including the new request) at the issuance time
13 / 21
FIOS: I/O Anticipation - When to anticipate?
Anticipatory I/O originally used for improving performance on disk
to handle deceptive idleness: wait for a desirable request.
Anticipatory I/O on Flash used to preserve fairness.
Deceptive idleness may break:
timeslice management
interference management
14 / 21
FIOS: I/O Anticipation - How long to anticipate?
Must be much shorter than the typical idle period for disks
Relative anticipation cost is bounded by , where idle period is
T
service


1
where T
service
is per-task exponentially-weighted
moving average of per-request service time
(Default = 0.5)
ex. I/O anticipation I/O anticipation I/O
15 / 21
Implementation Issues
Linux coarse tick timer High resolution timer for I/O
anticipation
Inconsistent synchronous write handling across le system and
I/O layers
ext4 nanosecond timestamps lead to excessive metadata
updates for write-intensive applications
16 / 21
Results: Read-Write Fairness
0
8
16
24
32
I
/
O

s
l
o
w
d
o
w
n

r
a
t
i
o
4reader 4writer on Intel SSD


proportional
slowdown

R
a
w

d
e
v
i
c
e

I
/
O
L
i
n
u
x

C
F
Q
S
F
Q
(
D
)
Q
u
a
n
t
a
F
I
O
S
0
8
16
24
32
proportional
slowdown

R
a
w

d
e
v
i
c
e

I
/
O
L
i
n
u
x

C
F
Q
S
F
Q
(
D
)
Q
u
a
n
t
a
F
I
O
S
I
/
O

s
l
o
w
d
o
w
n

r
a
t
i
o
4reader 4writer (with thinktime) on Intel SSD
Average read latency Average write latency
Only FIOS provides fairness with good eciency under diering
I/O load conditions.
17 / 21
Results: Beyond Read-Write Fairness
0
8
16
proportional
slowdown

I
/
O

s
l
o
w
d
o
w
n

r
a
t
i
o
4reader 4writer on Vertex SSD


R
a
w

d
e
v
i
c
e

I
/
O
L
i
n
u
x

C
F
Q
S
F
Q
(
D
)
Q
u
a
n
t
a
F
I
O
S
Mean read latency
Mean write latency
0
2
4
6
8
proportional
slowdown

I
/
O

s
l
o
w
d
o
w
n

r
a
t
i
o
4KBreader and 128KBreader on Vertex SSD


R
a
w

d
e
v
i
c
e

I
/
O
L
i
n
u
x

C
F
Q
S
F
Q
(
D
)
Q
u
a
n
t
a
F
I
O
S
Mean latency 4KB read
Mean latency 128KB read
FIOS achieves fairness not only with read-write asymmetry but
also requests of varying cost.
18 / 21
Results: SPECweb co-run TPC-C
0
2
4
6
8
10
12
R
e
s
p
o
n
s
e

t
i
m
e

s
l
o
w
d
o
w
n

r
a
t
i
o
SPECweb and TPCC on Intel SSD


28
R
a
w

d
e
v
i
c
e

I
/
O
L
i
n
u
x

C
F
Q
S
F
Q
(
D
)
Q
u
a
n
t
a
F
I
O
S
SPECweb
TPCC
FIOS exhibits the best fairness compared to the alternatives.
19 / 21
Results: FAWNDS (CMU, SOSP09) on CompactFlash
0
1
2
3
T
a
s
k

s
l
o
w
d
o
w
n

r
a
t
i
o


proportional slowdown
R
a
w

d
e
v
i
c
e

I
/
O
L
i
n
u
x

C
F
Q
S
F
Q
(
D
)
Q
u
a
n
t
a
F
I
O
S
FAWNDS hash gets FAWNDS hash puts
FIOS also applies to low-power Flash and provides ecient fairness.
20 / 21
Conclusion
Fairness and eciency in Flash I/O scheduling
Fairness is a primary concern
New challenge for fairness AND high eciency (parallelism)
I/O anticipation is ALSO important for fairness
I/O scheduler support must be robust in the face of varied
performance and evolving hardware
Read/write fairness and BEYOND
May support other resource principals (VMs in cloud).
21 / 21

You might also like