1

Performing Lisp
LUV ‘94 Tutorial
Kenneth R. Anderson
BBN STD
6/4a
10 Moulton St.
Cambridge, MA 02138
KAnderson@BBN.COM
2
Some Questions
Q: Is Lisp Slow?
A: No, Lisp can be as fast as C, even faster.
Q: If Lisp is so fast, why is my application so
slow?
A: It is easy to write Lisp. It is easy to write
slow Lisp.
Q: If I write fast Lisp, will it look like C?
A: Sometimes it can, but you can also use
features unique to Lisp.
3
You will learn
• about performance issues in Lisp and other
high level languages, like C and C++.
• how to identify performance problems in your
software.
• how to avoid performance pitfalls.
4
Performance
• Performance is not the only -- or even the
most important -- measure of a Lisp
implementation. [RPG85, p.1]
• Performance problems are easily noticed.
• Other measures are less obvious, but become
important in the long run.
– Clarity
– Explicit knowledge
– Extensibility
– Habitability
• How do we deliver such things, and
performance?
5
Rough Course Outline
• Performance myths persist.
• How do we learn slow Lisp?
• Understand the architecture of a Lisp system.
• Take a ruler to a spitting contest.
• Read the fine print.
• Your application is the ultimate benchmark.
• Performance expert systems.
• Delivering performance.
• Getting faster by getting higher.
6
Spot Quiz
1. Which is faster, A or B?
        Lisp            C
   A:   (mod i 4)       i%4
   B:   (logand i 3)    i&3
2. Which is faster, Lisp’s SORT or C’s QSORT?
   (sort #(1 2 3 1 2 3 ...))
3. What are the relative times of:
   (defun chaos (n x)
     (dotimes (i n)
       (setq x (+ 1 x (- (* x x)))))
     x)
   (chaos 5 1)      1
   (chaos 5 0.33)   ___
   (chaos 5 1/3)    ___
7
Robust Line Fit to 50 Points
• Given a list of (x y) points, estimate a line fit
as follows:
• Slope is the median of the slope between
each pair of points.
• Intercept is the median of y - x*Slope of each
point.
• Error is the median of the absolute value of
the error of each point,
error = y - (Intercept + x*Slope)
8
Robust Line Fit to 50 points
(defun median (sequence &optional (order #'>))
  ;; Destructively sort SEQUENCE and return the median element.
  (setq sequence (sort sequence order))
  (values
   (multiple-value-bind (n/2 remainder)
       (truncate (length sequence) 2)
     (if (zerop remainder)
         (* 0.5 (+ (elt sequence (1- n/2)) (elt sequence n/2)))
         (elt sequence n/2)))
   sequence))

(setq points
      (let ((points ()))
        (dotimes (i 50)
          (let ((p (random 100)))
            (push (list p (+ (* 2 p) (- (random 5) 2)))
                  points)))
        points))
9
What single change leads to the
most improvement?
(defun line-fit-1 (points)
  ;; Fits a robust line to a list of (x y) pairs, POINTS.
  ;; Returns intercept, slope, and median absolute error.
  (let*
      ((slope (median
               (mapcon #'(lambda (L) ; Slope of each pair of points.
                           (let ((p1 (first L)))
                             (mapcan #'(lambda (p2)
                                         (if (/= (first p2) (first p1))
                                             (list
                                              (/ (- (second p2) (second p1))
                                                 (- (first p2) (first p1))))))
                                     (rest L))))
                       points)))
       (intercept (median
                   (mapcar #'(lambda (p) (- (second p) (* slope (first p))))
                           points)))
       (error (median
               (mapcar #'(lambda (p)
                           (abs (- (second p)
                                   (+ (* slope (first p)) intercept))))
                       points))))
    (values intercept slope error)))
10
Myth: Lisp is slow because it is
interpreted.
• Recent quotes:
“Lisp, Smalltalk, and other interpreted languages ...” [Udell]
“Interpreted languages, such as Smalltalk adapt this
approach ...” [Meyers]
“Often the complete system and the genetic operators
themselves are written in an interpreting language like
LISP [Koza 1992, Page 71]. This reduces performance in
most hardware environments.” [Nordin]
• Many people only experience Lisp through an
interpreter, such as EMACS.
• Lisp is compiled.
• Some modern Lisps don’t have interpreters
(MCL, Franz for Windows, Scheme)
• Lisp semantics are described in terms of EVAL.
11
Myth: Automatic garbage
collection is slow.
• Noticeable pauses, often at the wrong time.
• But ephemeral GC makes pauses quite short.
• Explicit GC cost is hidden.
• With enough memory, GC becomes almost free
[Appel].
• Reference counting can be quite costly (commonly
advocated in C++).
• GC quite competitive with hand optimized methods
[Zorn]
• However, consing is not free!
12
Myth: Lisp is slow on stock
hardware.
• High performance Lisp implementations have
existed for some time.
• Stock hardware does provide some support
for non-C languages.
• Stock hardware may provide less support for
object-oriented languages, like Smalltalk, C++
or Lisp [Zorn94]
– Branches less common.
– Indirect function calling (dispatch) more common.
• RISC architectures generally match Lisp
better than CISC because complex
instructions may not match Lisp well
[RAM93].
13
Myth: Performance of low level
languages is easier to estimate.
• Estimates from C experts can vary by a factor
of 5 [BKV].
• Today’s RISC computers make performance
issues harder to expose.
– Register declarations in C are better made by the
compiler.
• C++ performance issues are much less clear
than C’s.
– “->” Smart pointers are neither.
– C++ overhead can be 30% - 100% of C’s.
14
Myth: Recode in C or C++ to
make your algorithm faster.
• Recoding in general can make an algorithm
faster because you think about it.
– Recoding for a parallel processor can make
parallelization unnecessary.
• Recoding in C requires declarations that
could be added to Lisp just as easily.
• You are on your own for the hard stuff that
the Lisp implementor provides.
• Recode in C, then recode in Lisp.
15
Spot Quiz Results
1. Which is faster, A or B?
        Lisp            C
   A:   (mod i 4)       i%4
   B:   (logand i 3)    i&3
2. Which is faster, Lisp’s SORT or C’s QSORT?
   (sort #(1 2 3 1 2 3 ...)) [Some Lisps]
3. What are the relative times of:
   (defun chaos (n x)
     (dotimes (i n)
       (setq x (+ 1 x (- (* x x)))))
     x)
   (chaos 5 1)      1
   (chaos 5 0.33)   6
   (chaos 5 1/3)    400
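The third quiz item can be checked with a small timing harness. The sketch below assumes CHAOS is meant to return X after the loop; TIME-CHAOS and its repetition count are hypothetical helpers, and the ratios you see will depend on the implementation and machine.

```lisp
(defun chaos (n x)
  ;; Iterate x <- 1 + x - x^2, N times, and return the final X.
  (dotimes (i n)
    (setq x (+ 1 x (- (* x x)))))
  x)

(defun time-chaos (x &optional (repetitions 10000))
  ;; Hypothetical helper: run time, in internal time units, of
  ;; REPETITIONS calls to (chaos 5 x).
  (let ((start (get-internal-run-time)))
    (dotimes (i repetitions)
      (chaos 5 x))
    (- (get-internal-run-time) start)))

;; Compare the three argument types from the quiz.
(dolist (x (list 1 0.33 1/3))
  (format t "~&(chaos 5 ~S): ~D time units~%" x (time-chaos x)))
```

The rational case is expensive because the denominator squares on every iteration: starting from 1/3, five steps give a denominator of 3^32, which no longer fits in a 32-bit word.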
16
How do we learn slow Lisp?
• Lisp programming is easy, but only to a point
[RGB91].
• In Lisp, writing a slow program is easy.
• C programming is difficult.
– Programmer provides all the details (declaration).
– Few datatypes.
• In C, writing a slow program is almost
impossible.
17
How do we learn slow Lisp?
• Simple object model.
– Everything is an object.
– Functions operate on objects.
• Uniform cost assumption.
– ELT works on any sequence, but costs O(1) on a vector and O(n) on a list.
• Mapping style garden path.
– (mapcar #'feature objects) ; Maybe OK.
– (reduce #'+ (mapcar #'square (mapcar #'x-pos data))) ; YOW!
– (remove-duplicates
    (apply #'append (mapcar #'attributes objects))) ; YOW!
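Each nested MAPCAR in the "YOW!" examples allocates an intermediate list that is thrown away at once. As a sketch of the alternative, assuming DATA is a list of (x y) points and X-POS is a hypothetical accessor (here simply FIRST), a single pass computes the same sum without consing:

```lisp
(defun x-pos (point)
  ;; Hypothetical accessor: a point is a list (x y).
  (first point))

(defun square (x)
  (* x x))

(defun sum-squares-mapping (data)
  ;; Mapping style: conses two temporary lists, each as long as DATA.
  (reduce #'+ (mapcar #'square (mapcar #'x-pos data))))

(defun sum-squares-loop (data)
  ;; One pass over DATA, no intermediate lists.
  (let ((sum 0))
    (dolist (point data sum)
      (incf sum (square (x-pos point))))))
```

Both functions return the same value, for example 10 for the points ((1 2) (3 4)); only the amount of garbage differs.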
18
How do we learn slow Lisp?
• Learn general (~ equivalent) subset of Lisp.
– SETF vs SET
– EQUAL vs EQ
– ELT vs AREF
– APPLY vs FUNCALL
– READ-FROM-STRING vs INTERN
• Overly general interface bug.
– Lots of keyword and optional arguments.
– Lots of hooks.
– Lots of methods to override.
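The general-versus-specific pairs listed on this slide can be illustrated with ELT and AREF. This is a minimal sketch; the function names are hypothetical, and how much the specific version gains depends on the compiler.

```lisp
(defun sum-with-elt (seq)
  ;; General: ELT accepts any sequence, so it must check the type of
  ;; SEQ on every call, and each access is O(n) when SEQ is a list.
  (let ((sum 0))
    (dotimes (i (length seq) sum)
      (incf sum (elt seq i)))))

(defun sum-with-aref (vec)
  ;; Specific: VEC is declared a simple vector, so AREF is an O(1)
  ;; access that a compiler may open-code.
  (declare (type simple-vector vec))
  (let ((sum 0))
    (dotimes (i (length vec) sum)
      (incf sum (aref vec i)))))
```

SUM-WITH-ELT works on both '(1 2 3) and #(1 2 3), but on a list the loop as a whole is quadratic; SUM-WITH-AREF gives up that generality for predictable cost.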
19
How do we learn slow Lisp?
• Sufficiently smart compiler bug.
– (dotimes (i N) (foo i)) ; Is i a fixnum?
– (length (the array a)) ; Is length inlined?
– (+ (the fixnum i) (the fixnum j)) ; Is + inlined?
• Bad declaration bug.
(declare (type (array fixnum *) a1 a2 a3))
(declare (type (simple-array t *) a1 a2 a3))
• Inappropriate data structures
– Matrix multiply with matrix as nested lists.
– Long division with roman numerals [Allen].
• Lisp (almost) lets you forget about computer science.
– Easy to do things that are hard in other languages.
– Lets you worry about the details later.
– The ultimate time space tradeoff.
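The "sufficiently smart compiler" questions come down to whether declarations actually reach the generated code. A minimal sketch, assuming a compiler that honors type and OPTIMIZE declarations (not all do to the same degree); the function name SUM-FIXNUMS is hypothetical:

```lisp
(defun sum-fixnums (a)
  ;; A is declared a simple one-dimensional array of fixnums, so a
  ;; compiler may inline AREF and LENGTH instead of calling generic code.
  (declare (type (simple-array fixnum (*)) a)
           (optimize (speed 3) (safety 1)))
  (let ((sum 0))
    (declare (type fixnum sum))
    (dotimes (i (length a) sum)
      ;; Assumption: the running sum stays within fixnum range.
      (setq sum (the fixnum (+ sum (aref a i)))))))
```

Calling (disassemble 'sum-fixnums) is one way to see whether the additions and array references were in fact open-coded in a given implementation.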
20
Levels of Lisp System
Architecture
• Hardware Level
• Lisp ‘Instruction’ Level
– Variable/constant reference
– Free/Special Variable lookup and binding
– Function Call/Return
– Data structure manipulation
– Type computation
– Arithmetic
• Lisp operation level
– MAPCAR, ASSOC, APPEND, REVERSE
• Major facilities
– Interpreter
– I/O
– Garbage Collection
– Compiler
21
Objects
• Lisp
– Variables can have variable type at run time.
– All objects have a type at runtime.
– Objects are passed to functions.
• C++
– Variables have a fixed type at compile time.
– Primitive objects have no runtime type information.
– User defined classes have type identity only for method
dispatch and exception handling.
– Runtime type information (RTTI) must be provided by the user.
– Objects are passed to functions by
• Copying
• Pointer
• Reference
22
Object Representations
Pointer -> [Header | Data]
+ Obvious representation for C or C++.
- Level of indirection.
- Objects are fatter.
- Type check requires page reference.
23
Object Representations (cont.)
Object description: [Pointer | Tag] -> [Header? | Data] or [CDR | CAR]
Immediate object description: [Data | Tag]
+ Some type info without page reference.
+ Small, immediate objects.
+ Some indirect objects without header (cons).
+/- Tag manipulation required, but can be
combined with access.
24
Object Representation (cont.)
BIBOP - Big Bag of Pages
512-byte page: [Header | Data ...]
Pointer -> Data
+ No per object header.
- Type check requires page reference.
25
Object Representation (cont.)
Object Table
[Header | Pointer] -> data
[Header | Pointer] -> data
[Header | Pointer] ...
+ Two levels (types) of storage management can be used.
- Extra level of indirection.
26
MCL Type Tags
000 Fixnum
001 Uvector - vectors, structures, instances
010 Symbol
011 Double-float
100 Cons, Nil
101 Short-float
110 Function
111 Immediate - Characters, and others.
27
Allegro type tags
000 Even fixnum
001 List
010 Other
100 Odd fixnum
101 Nil
110 Char
111 Symbol
28
Allegro primitive operations
GetTag(v)         (long) v & 7
FixnumToInt(v)    v >> 2
IntToFixnum(v)    v << 2
ScharToChar(v)    (char) (v >> 3)
CharToSchar(v)    (LispVal) ((((int) v) << 3) | CharType)
SymbolValue(s)    (*(LispVal *) (s - SymbolOff + 8))
Cdr(x)            (*(LispVal *) ((LispVal) x - ListOff + CdrOff))
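The first three macros are plain bit arithmetic, which can be modeled in Lisp by treating a machine word as an integer. This only illustrates the tagging scheme from the Allegro tag table and is not Allegro's implementation; all function names are hypothetical.

```lisp
(defun get-tag (word)
  ;; Models GetTag(v): the low three bits of the word.
  (logand word 7))

(defun int-to-fixnum-word (n)
  ;; Models IntToFixnum(v): shift left by two, leaving the low two
  ;; bits zero.
  (ash n 2))

(defun fixnum-word-to-int (word)
  ;; Models FixnumToInt(v): arithmetic shift right by two.
  (ash word -2))
```

Because only two bits are shifted, the third tag bit is the low bit of the integer itself: 6 becomes the word #b11000 with tag 000 (even fixnum), and 5 becomes #b10100 with tag 100 (odd fixnum). Tagged fixnums can also be added directly, since (a << 2) + (b << 2) = (a + b) << 2.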
29
Allegro Arrays
• Two structures used:
– Fixed size: [Type | Size | Data ...]
• simple-array
• vector
– Array: [Type | NDim | Fill Ptr | Data Ptr | Displace | Flags] -> [Type | Size | Data ...]
• Everything else
