
CSCI 315: Artificial Intelligence through Deep Learning

W&L Winter Term 2017


Prof. Levy

Introduction to Deep Learning with TensorFlow
Why TensorFlow (vs. just NumPy)?
• Recall two main essentials: the dot product and the activation-function derivative.
• Dot product:

  $net_j = \sum_{i=0}^{n} a_i w_{ij}, \quad a_0 \equiv 1$

  hnet = np.dot(np.append(Ij, 1), self.wih)
• “Embarrassingly parallel”: since each unit j has its own incoming weights, net_j can be computed independently from / simultaneously with all other units in its layer.
• On an ordinary computer, we (NumPy dot) must compute one net_j after another, sequentially.
Ordinary dot product computation for a layer

[Figure: units computing their net inputs one at a time (“First me! Then me!”)]

Exploiting Parallelism

[Figure: all units computing their net inputs at once (“All together now!”)]


GPU to the Rescue!

• Graphics Processing Unit: designed for videogames, to exploit the parallelism in pixel-level updates.
• NVIDIA offers the CUDA API for programmers, but it's wicked hard: you need to keep track of the locations of values in memory.
• TensorFlow exploits GPU / CUDA if they're available.
GPU: A Multi-threaded Architecture

• A traditional architecture has one processor, one memory, and one process at a time:

[Figure: a single CPU connected to Memory through the “Von Neumann Bottleneck”]
http://web.eecs.utk.edu/~plank/plank/classes/cs360/360/notes/Memory/lecture.html
• A distributed architecture (e.g., a Beowulf cluster) has several processors, each with its own memory.
• Communication among processors uses message-passing (e.g., MPI).

[Figure: several CPUs, each with its own Memory, linked by a Connecting Network]

• A shared memory architecture allows several processes to access the same memory, either from a single CPU or several CPUs.
• Typically, a single process launches several “lightweight processes” called threads, which all share the same heap and global memory, with each having its own stack.
• Ideally, each thread runs on its own processor (“core”).

[Figure: Core 1, Core 2, …, Core n sharing Memory (Heap / Globals). Examples: NVIDIA Jetson TK1, 192 cores; NVIDIA GeForce GTX 1080Ti, 3584 cores]
Python vs. NumPy vs. TensorFlow
• Dot product in “naive” Python:
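For instance, a naive loop might look like this (a minimal sketch; the names a, b, and c match the next bullet):

    def naive_dot(a, b):
        # Multiply corresponding elements and accumulate, one pair at a time
        c = 0
        for k in range(len(a)):
            c += a[k] * b[k]
        return c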

• This will be slow, because the interpreter is executing the loop code c += a[k] * b[k] over and over.
• Some speedup is likely once the interpreter has compiled your code into a .pyc (bytecode) file.
Python vs. NumPy vs. TensorFlow
• Dot in NumPy: c = np.dot(a, b)
• “Under the hood”: your arrays a and b are passed to a pre-compiled C program that computes the dot product, typically much faster than you would get with your own code.
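A rough timing comparison (an illustrative sketch, not the slide's original code; the array size is arbitrary):

    import time
    import numpy as np

    n = 1000000
    a = np.random.rand(n)
    b = np.random.rand(n)

    t0 = time.perf_counter()
    c = 0.0
    for k in range(n):               # the naive interpreted loop
        c += a[k] * b[k]
    t1 = time.perf_counter()

    c2 = np.dot(a, b)                # the pre-compiled C version
    t2 = time.perf_counter()

    print(t1 - t0, t2 - t1)          # np.dot is typically orders of magnitude faster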

• Hence, TensorFlow will require us to specify info about types and memory in order to exploit the GPU.
Why TensorFlow (vs. just NumPy)?
• Recall two main essentials: the dot product and the activation-function derivative.
• Activation function derivative:

  Logistic sigmoid:  $f(x) = \frac{1}{1+e^{-x}} \qquad f'(x) = \frac{df(x)}{dx} = \frac{e^x}{(1+e^x)^2} = f(x)\,(1-f(x))$

  Hyperbolic tangent:  $f(x) = \tanh(x) \qquad f'(x) = \mathrm{sech}^2(x)$

  Softmax:  $y_i = f(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}} \qquad \frac{\partial y_i}{\partial x_j} = y_i(1-y_i) \text{ if } i = j, \; -y_i y_j \text{ if } i \neq j$

• This is called symbolic differentiation, and it requires us to use our calculus, or a special computation tool, case by case. TensorFlow will automate this for us!
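For instance, the sigmoid identity above is easy to check numerically (a minimal sketch in NumPy; a central difference stands in for the true derivative):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    x = np.array([-2.0, 0.0, 2.0])
    h = 1e-6
    numeric = (sigmoid(x + h) - sigmoid(x - h)) / (2 * h)   # central difference
    analytic = sigmoid(x) * (1 - sigmoid(x))                # f(x)(1 - f(x))
    print(np.allclose(numeric, analytic))                   # True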
Tensor + Flow = TensorFlow
• Scalar: a single number (rank = zero)
• Vector: a sequence of numbers (rank = one)
• Matrix: a rectangular array of numbers (rank = two)
• Tensor: any rank

https://www.mathworks.com/help/matlab/math/ch_data_struct5.gif
Rank as Bracket Count
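A sketch of the idea in NumPy (counting levels of brackets gives the rank):

    import numpy as np

    print(np.array(5.0).ndim)               # no brackets  -> rank 0 (scalar)
    print(np.array([1.0, 2.0]).ndim)        # one level    -> rank 1 (vector)
    print(np.array([[1.0], [2.0]]).ndim)    # two levels   -> rank 2 (matrix)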
TensorFlow: First Program
Line-by-line analysis

• Our usual import-and-abbreviate (cf. import numpy as np):
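That is:

    import tensorflow as tf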
• These are the parameters (weights and biases) of our familiar neural-net layer.
• 28x28 = 784 pixels for the input image; 10 possible digits at the output.
• Like NumPy, TensorFlow provides some useful generator functions (random_uniform, zeros) that we can call directly.
• What’s really new is the Variable object: this is the component from which we will build our networks.
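A sketch of what the parameter declarations plausibly look like (TensorFlow 1.x API; the names W and b and the uniform range [-1, 1) are assumptions based on the bullets above):

    W = tf.Variable(tf.random_uniform([784, 10], -1, 1), name="W")   # 784 inputs x 10 outputs
    b = tf.Variable(tf.zeros([10]), name="b")                        # one bias per output digit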
• For input data, TensorFlow requires a special kind of object called a placeholder.
• Note the mandatory data type (32-bit float): essential for the GPU and related high-performance tricks!
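A sketch of the placeholder declaration (the name x is an assumption; a shape could also be supplied):

    x = tf.placeholder(tf.float32, name="x")   # input data arrives here at run time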
• So matmul is pretty clearly the TensorFlow equivalent of NumPy dot.
• Unlike dot, however, matmul does not return an immediate result; instead, it gives us the ability to compute a result, in a Session (up next).
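A sketch of the layer itself (calling the output y to match the dataflow graph below; adding the bias b is an assumption):

    y = tf.matmul(x, W) + b   # builds a graph node; nothing is computed yet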
• We do, however, have enough to visualize our model, using a traditional dataflow graph (hence the “Flow” part of TensorFlow)...

[Figure: dataflow graph of the model, with output y. Adapted from Buduma (2017) Fig. 3.2]


TensorFlow Sessions: Getting the Job Done

https://stackoverflow.com/questions/44433438/understanding-tf-global-variables-initializer
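A sketch of the Session boilerplate under discussion (TensorFlow 1.x):

    init = tf.global_variables_initializer()   # an op that gives every Variable its starting value

    with tf.Session() as sess:
        sess.run(init)                          # W and b now actually contain numbers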
Finishing Up
• An ordinary Python list, containing 784 ones.
• Note the name agreement.
• So what output do we expect?
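Putting the pieces together, a sketch of what the slide's code plausibly does (the name x_in is an assumption):

    x_in = [1.0] * 784                          # an ordinary Python list containing 784 ones

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        # Name agreement: the feed_dict key x is the placeholder defined above
        print(sess.run(y, feed_dict={x: [x_in]}))

Since W was initialized uniformly at random and b is all zeros, we should expect a 1x10 array of random-looking values, different on every run.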

