
An asm Introduction And The Embedded "Hello World"
Assembler is a low-level language. It consists of a list of instructions that are in no way comparable to
anything you might know from C, Basic or Pascal. The AVR has about 120 of them, depending on the type
and its peripherals and size. You can download the instruction set from Atmel's website and print out the
summary (a list of all instructions and what type of operands they have). If you download it, I suggest
printing out pages 1 and 10 to 15. That's not too much and yet is everything you need for a start. Here's a
link to it. The document is about 150 pages in total.
Let's have a look at probably the easiest program possible:
main:                   ; this is the label "main"
    rjmp main           ; Relative JuMP to main

"Main" (the first line) is not translated by the assembler, but used as a label. This label replaces an address
in the AVR's code space (FLASH memory). At exactly this address the next instruction (rjmp main) is
placed. When the instruction is executed, the CPU will jump to "main" again. The jump will be repeated over
and over, resulting in an infinite loop.
After power-up, or when a reset has occurred, the micro will always start program execution from address
0x0000. The first bytes in code space are the "Interrupt Vector Table". The AVR has internal peripherals,
like timers, a UART or an analog-to-digital converter. These can generate interrupts which will stop normal
code execution in order to react to certain events as fast as possible. This is good to know.
See Architecture -> Interrupts for more!
The interrupt vector table can be used by you to tell the micro what it has to do when a specific interrupt
occurs. The normal AVRs have space for one instruction per interrupt vector (an rjmp for example). This
instruction will be executed when the interrupt occurs (There's more to tell you about this, but not now...)
The first interrupt vector is the "reset vector". It contains the instruction the cpu should execute when a
reset occurs. We will use it to jump to our program we already had above:
.org 0x0000             ; the next instruction has to be written to address 0x0000
    rjmp main           ; the reset vector: jump to "main"

main:                   ; this is the label "main"
    rjmp main           ; Relative JuMP to main

Assuming that our AVR is running at 4 MHz (4 Million clock cycles per second), how long does all this
take? AVRs are pretty fast - most instructions can be executed in one or two clock cycles. Some
instructions need a bit more time, but these are not important now. As the external clock is not divided
internally (some other microcontrollers do that, like the HC11 from Motorola, but that's an old one), two
clock cycles at 4 MHz means that the instruction takes 0.0000005 seconds. Pretty fast!
"main" itself only needs 0.0000005 seconds per round, as it only consists of a single rjmp. Right now, our
main program doesn't actually DO anything.
The first thing I did when I started on AVRs was making an LED flash. LEDs can be connected to the
AVR's I/O ports (Architecture -> I/O Ports). These can be set to be input or output individually for each pin
and you can also enable an internal pull-up resistor if the port pin is set to input. Each I/O port has three
registers you can work with: The port's data register (for example PortB), the Data Direction Register (for
example DDRB) and the Pin register (for example PinB). To configure a pin as an output pin, set its
corresponding bit in the data direction register. The output value (0 or 1) can then be set in the Port Data
register.
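As a minimal sketch of how the three registers work together (assuming an ATmega8 and the register names from Atmel's m8def.inc; using PB0 as an input is only an illustration and is not part of the LED example in this text):

```asm
.include "m8def.inc"    ; assumed to be on the assembler's include path

    ldi r16, 0b00001000 ; bit 3 set, all other bits cleared
    out DDRB, r16       ; PB3 is an output, all other port B pins are inputs
    ldi r16, 0b00000001 ; bit 0 set, bit 3 cleared
    out PortB, r16      ; PB3 outputs 0 (low), PB0 (an input) gets its pull-up enabled
    in  r16, PinB       ; read the actual pin levels of port B into r16
```

Writing a 1 to a PortB bit therefore means two different things, depending on the direction bit: "output high" for an output pin, "pull-up on" for an input pin.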
If you have an STK500, get yourself an ATmega8 in a DIP package and a 4 MHz crystal. The micro can be
plugged into the green target socket named "SCKT3200A2". Connect the 6-pin cable from "ISP6PIN" to
the ISP header for the green socket. That's the one named "SPROG2". The default jumper setting for the
oscillator system is the software oscillator. We want to use the crystal oscillator instead. Set the "OSCSEL"
jumper to close pins 2 and 3 (pin 1 is marked with a "1"). "Vtarget", "AREF", "RESET" and "XTAL1" should
be closed. Of course, the mega8 should be the only micro plugged into the STK. Do not insert more than
one AVR at a time! Now take one of the two-wire cables and use it to connect "PB3" (that's one of the
"PORTB" pins) to "LED0" (that's one of the "LEDS" pins). The crystal belongs in the crystal socket on the
STK500.
The STK500 LEDs are connected to the AVR via some extra components. If you want, you can take a look
at the LED circuit in the STK500 documentation (it's included in the AVR Studio help file and also came
with the STK). The most important fact is that the LED is connected to be "active low", which means that if
PortB.3 is low now (we connected it to one of the LEDs), the LED will be ON.
If you don't have an STK, connect the LED to PortB.3 via a current limiting resistor (about 470 Ohms is
OK, the value is not critical). Connect the resistor to the port pin and the LED's cathode, the anode goes to
Vcc. This will result in the same "active low" behavior.
Let's now change our program a bit so that it configures all PortB pins as output pins. After a reset, all Data
bits are set to zero, so the LED should be ON when the program is executed:
.org 0x0000             ; the next instruction has to be written to address 0x0000
    rjmp main           ; the reset vector: jump to "main"

main:                   ; this is the label "main"
    ldi r16, 0xFF       ; load register 16 with 0xFF (all bits are 1)
    out DDRB, r16       ; write the value in r16 (0xFF) to Data Direction Register B
loop:                   ; this is a new label we use for a "do nothing loop"
    rjmp loop           ; jump to loop

The new loop was inserted so that we can set the Data Direction bits to our needs and then loop without
doing that again. We could, however, include the load and store instructions (ldi and out) in the loop. It
wouldn't hurt, but the micro would configure the Ports to the same value over and over again.
If you haven't done it already, download AVR Studio (I prefer version 3.5, not 4!) from Atmel's website and
install it. Now read "Creating a Project" in the AVR Studio Section and create a new project, choose your
favourite name and take into account that the LED will flash when it's finished :-)
It's now time to start a new page (it's already long enough). Our code will grow a bit...it might be useful to
have a calculator for all the timing stuff...
On the first page I told you how a simple program works, how to create a project and some details about
the AVR. Now it's time to have a look at what we want to do and how it can be done.
We want to make an LED flash. Basically we want to switch it on and off in a loop. The LED is connected
to PortB.3, one of the AVR's I/O pins we configured as an output.
Single bits in the I/O space can be cleared and set with the cbi (clear bit in I/O) and sbi (set bit in I/O)
instructions. Unfortunately, these don't work for all I/O registers. Open the AVR instruction set (the
complete one, not just the 7 pages you printed out) and look for cbi: It can only clear bits in I/O registers 0-31
(Hex: 0x00...0x1F). PortB is in this range (0x18), so we can use cbi (same for sbi).
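For I/O registers above 0x1F, a common workaround is to read the register into a working register, change the bit there and write it back. A minimal sketch, assuming an ATmega8 with m8def.inc, where TIMSK sits at I/O address 0x39 and TOIE0 is one of its bits (neither is used elsewhere in this text):

```asm
.include "m8def.inc"        ; assumed to be on the assembler's include path

    in  r16, TIMSK          ; read the I/O register (0x39 is out of range for sbi/cbi)
    sbr r16, (1<<TOIE0)     ; set the bit in the working register
    out TIMSK, r16          ; write the result back

    in  r16, TIMSK          ; the same again for clearing the bit
    cbr r16, (1<<TOIE0)     ; clear the bit in the working register
    out TIMSK, r16          ; write the result back
```

Unlike sbi and cbi, this takes three instructions and needs a free register from r16 to r31. An interrupt can also occur in the middle of the sequence, which matters if an ISR changes the same I/O register.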
As the LED is ON when PortB.3 is low, the LED can be switched on with cbi and off with sbi. Let's add that
to the loop:
.org 0x0000             ; the next instruction has to be written to address 0x0000
    rjmp main           ; the reset vector: jump to "main"

main:                   ; this is the label "main"
    ldi r16, 0xFF       ; load register 16 with 0xFF (all bits are 1)
    out DDRB, r16       ; write the value in r16 (0xFF) to Data Direction Register B
loop:
    sbi PortB, 3        ; switch off the LED
    cbi PortB, 3        ; switch it on
    rjmp loop           ; jump to loop

The reason why we first switch it off is that it was already on when entering the loop for the first time: After
reset, all port data bits are zero. By setting bit 3 in DDRB we configured portB.3 as an output and switched
the LED on.
You could now program your mega8 with this code, but you would only see the LED being on all the time.
The loop switches the LED off (sbi PortB, 3), which takes 2 clock cycles on the mega8. Then, after
0.0000005 seconds (again 2 clock cycles), the LED is on again (cbi PortB, 3). The rjmp takes another 2
clock cycles (0.0000005 seconds). That's a bit too fast for the eye. We need to waste some time between
switching the LED on and off (let's say 0.5 seconds).
0.5 seconds at 4 MHz equals 2,000,000 clock cycles. Generating such a long delay requires either a timer
(which would use interrupts) or delay code that just takes lots of time for execution while occupying only
small space. This example will use a delay loop.
Keeping track of how many times the loop has been executed is done with a counter. As the AVR is an
8-bit microcontroller, the registers can only hold the values 0 to 255. That's far less than the 2 million clock
cycles we need to wait between toggling the LED output, but we'll see how far we can get with that and
some tricks... The delay loop will be in a separate subroutine we can call in order to wait for half a second.
The AVR Assembler -> jump and subroutine call pages might be worth looking at now.
Registers can be used in pairs as well, which allows working with values from 0 to 65535. The following piece of
code clears registers 24 and 25 and increments them in a loop until they overflow to zero again. When that
condition occurs, the loop doesn't go around again:
    clr r24             ; clear register 24
    clr r25             ; clear register 25
delay_loop:             ; the loop label
    adiw r24, 1         ; "add immediate to word": r24:r25 are incremented
    brne delay_loop     ; if no overflow ("branch if not equal"), go back to "delay_loop"

This little loop takes a lot of time: clr needs 1 cycle, adiw needs two cycles and brne needs 2 cycles if the
branch is done and 1 otherwise. Every time the registers don't overflow the loop takes adiw(2) + brne(2) =
4 cycles. This is done 0xFFFF times before the overflow occurs. The next time the loop only needs 3
cycles, because no branch is done. This adds up to 4*0xFFFF(looping) + 3(overflow) + 2(clr) = 262145
cycles. This is still not enough: 2,000,000/262,145 ~ 7.63
We need to tweak the loop a bit and also create a loop "around" it which will contain our 262,145 cycle loop.
For fine-tuning the inner loop we need to change the clr instructions to ldi so that we can use a different
start value than 0. The "outer" loop will be down-counting from 8 to zero using r16. This is how the delay
code looks now:
    ldi r16, 8          ; load r16 with 8
outer_loop:             ; outer loop label
    ldi r24, 0          ; clear register 24
    ldi r25, 0          ; clear register 25
delay_loop:             ; the loop label
    adiw r24, 1         ; "add immediate to word": r24:r25 are incremented
    brne delay_loop     ; if no overflow ("branch if not equal"), go back to "delay_loop"
    dec r16             ; decrement r16
    brne outer_loop     ; and loop if outer loop not finished

Again, some calculations: The inner loop is now treated like one BIG instruction needing 262,145 clock
cycles. ldi needs 1 clock cycle, dec also needs 1 clock cycle and brne needs 1 or 2 cycles (see above).
One pass of the outer loop needs 262,145 (inner loop) + 1 (dec) + 2 (brne) = 262,148 cycles. Eight passes
need 262,148 * 8 = 2,097,184 cycles, plus the initial ldi = 2,097,185. Wait. Subtract one because the last
brne didn't result in a branch, so it needs 2,097,184 cycles. This is more like what we want, but 97,184
cycles too long. This is where the fine-tuning comes in - we need to change the initial value of r24:r25.
The outer loop is executed 8 times and includes the "big-inner-loop-instruction". We have to subtract some
cycles from the inner loop: 97,184 / 8 = 12,148 cycles per inner loop. This is how much shorter the inner
loop has to be. Every iteration of the inner loop takes 4 cycles (the last one takes 3 but that's not so
important), so let's divide those 12,148 by 4. That's 3037 fewer iterations. This is our new initialisation
value for r24:r25!
Now, if you want, do all those calculations again: The result is 2,000,000 clock cycles! Now just put this
into a separate routine and call it from the main LED flashing loop:
.org 0x0000                 ; the next instruction has to be written to address 0x0000
    rjmp main               ; the reset vector: jump to "main"

main:
    ldi r16, low(RAMEND)    ; set up the stack
    out SPL, r16
    ldi r16, high(RAMEND)
    out SPH, r16

    ldi r16, 0xFF           ; load register 16 with 0xFF (all bits are 1)
    out DDRB, r16           ; write the value in r16 (0xFF) to Data Direction Register B
loop:
    sbi PortB, 3            ; switch off the LED
    rcall delay_05          ; wait for half a second
    cbi PortB, 3            ; switch it on
    rcall delay_05          ; wait for half a second
    rjmp loop               ; jump to loop

delay_05:                   ; the subroutine:
    ldi r16, 8              ; load r16 with 8
outer_loop:                 ; outer loop label
    ldi r24, low(3037)      ; load registers r24:r25 with 3037, our new init value
    ldi r25, high(3037)
delay_loop:                 ; the loop label
    adiw r24, 1             ; "add immediate to word": r24:r25 are incremented
    brne delay_loop         ; if no overflow ("branch if not equal"), go back to "delay_loop"
    dec r16                 ; decrement r16
    brne outer_loop         ; and loop if outer loop not finished
    ret                     ; return from subroutine
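
For those who want to check the 2,000,000 cycles, here is the delay_05 routine once more with a cycle tally in the comments. This is only a sketch of the calculation described above, using the cycle counts quoted in this text (ldi 1, adiw 2, dec 1, brne 2 if the branch is taken and 1 otherwise):

```asm
delay_05:
    ldi  r16, 8             ; 1 cycle, executed once
outer_loop:
    ldi  r24, low(3037)     ; 1 cycle per outer pass
    ldi  r25, high(3037)    ; 1 cycle per outer pass
delay_loop:
    adiw r24, 1             ; 2 cycles, executed 65536 - 3037 = 62499 times
    brne delay_loop         ; 2 cycles (62498 times), 1 cycle on the last pass
                            ; inner loop: 62498 * 4 + 3 = 249995 cycles
    dec  r16                ; 1 cycle per outer pass
    brne outer_loop         ; 2 cycles (7 times), 1 cycle on the last pass
                            ; one outer pass: 2 + 249995 + 1 + 2 = 250000 cycles
    ret                     ; routine body: 1 + 8 * 250000 - 1 = 2000000 cycles
```

The rcall and the ret add a few more cycles per call (3 and 4 on the mega8), which doesn't matter for a flashing LED.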

For trying this in the simulator or assembling it, also add this line at the beginning of the text:
.include "c:\avr_studio_working_directory\appnotes\m8def.inc"
This will include the ATmega8 def file from Atmel which includes things like the PortB address definition.
Without this file the assembler will spit out error warnings!

Assembler Basics
Assembler is a low-level language which does not know any C-like commands like for{;;} or while{}.
Assembler instructions are small, for example out PortD, r15 writes the contents of register 15 (which in an
AVR can hold one byte) to PortD (which is 8 I/O lines handled as one I/O register).
Other assembler instructions only work on registers, not on I/O registers or SRAM.
"inc r15" is one of them. It increments the value register 15 holds by one. This is useful for loops (like
for{;;}).

Almost every instruction leaves certain bits in the Status Register set or cleared based on the instruction's
result. These bits can be used by branch instructions or arithmetic instructions in order to perform correctly
(branch/don't branch, increment result etc).
Branch instructions jump to a specific code address (or code line) if the microcontroller is in a specific state
or just go on with the next code line if this state is not present. If the counting variable in a loop has not
reached the desired value, they can let the mcu repeat the loop.
Here is a small example code snippet showing how arithmetic, I/O and branch instructions work together:
    ldi r16, 0          ; load register 16 with zero
for_loop:               ; this is a label we can jump or branch to
    inc r16             ; increment register 16
    out PortD, r16      ; write contents of r16 to PortD
    cpi r16, 10         ; compare value in r16 with 10 (this leaves a status for brlo)
    brlo for_loop       ; if value 10 not reached, repeat loop

In the loop the counter (r16) is increased in every iteration and written to PortD. When it reaches 10, brlo
will not jump to the beginning of the loop, but let the mcu go on with the next instruction. This small example gives a good
impression of how small the steps you can take in assembler are. It can't get smaller, as this is what the
mcu does. None of the instructions will be split into smaller ones by the assembler. With the comments in
mind have a look at the AVR instruction set and also have a look at other instructions of the same type.
Assembler is very sensitive to programming errors. Try the above example with the increase instruction
and the compare instruction swapped. What happens? The first value on PortD is 0, the last one is 9.
Now have a look at the "Flow Charts" section and try to write a flow chart of the code above. You'll see that
flow charts make code less "cryptic" and more readable. And keep them up to date every time you make a big
enhancement to your code so that you can still read it after two weeks. Comments are also very important,
especially if you can't make a flow chart every time your code changes. In assembler, a comment can be
written for almost every line of code, especially when tricks are used.

Flow Charts
Flow charts are a graphical representation of code, program states or even SRAM contents, if used in a
creative way. Once you know how to use them for code you'll quickly develop your own style to create flow
charts for almost anything.
When you think about implementing a special algorithm or peripheral driver it might be better to have a
flow chart already done before you start hacking code. That will save lots of time. Trust me. I know. If you
have code that is not sufficiently commented or just BIG, analyse it by making up a flow chart. Very often
that helps, especially when you got it from the web.
Especially when writing code in assembler they are a great help, because assembler instructions are not
always self-explanatory and even well-structured code will get hard to read once it has grown to a certain
size.
Here is a small example flow chart:
[Figure: example flow chart]
You can find a good program for editing flow charts at www.rff.com. But a piece of paper will do
the job too if you need to make up one.

MCU Status
The microcontroller operates based on the Status Register (SREG) and other internal registers or
components. Most important is the Status Register which holds information on the last instruction and its
result and Interrupt enable status.
The SREG holds 8 Flags:
bit:    Name:
0       C (Carry bit)
1       Z (Zero Flag)
2       N (Negative Flag)
3       V (Overflow Flag)
4       S (Sign Flag)
5       H (Half Carry bit)
6       T (Bit store Flag)
7       I (Global Interrupt enable Flag)

The Carry flag is used for shift and rotate instructions. See the Basic Maths section for this. If the Global
Interrupt Flag is set, the mcu will perform the Interrupt Service Routine corresponding to the interrupt that
occurred. Detailed knowledge about the other flags is not essential - most of the compare and branch
instructions can be used without looking at the flags in detail.
Just remember: They are important for mathematical operations, and changing them between calculating a
value and comparing it with something else might be fatal. That's why Interrupt Service routines should
preserve the SREG and any other registers they use (unless these registers are unused in normal code).
An interrupt might occur between comparing two values with each other and a following branch - the ISR
might change status flags and corrupt the flags the branch relies on.
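A common way of preserving the SREG in an ISR looks like the sketch below. It assumes that the stack has been set up, that the register names come from m8def.inc, and that r16 is the only register the ISR uses; my_isr is a made-up label:

```asm
.include "m8def.inc"    ; assumed to be on the assembler's include path

my_isr:                 ; hypothetical ISR entry label
    push r16            ; save r16 on the stack
    in   r16, SREG      ; copy the status register to r16 ...
    push r16            ; ... and save it on the stack as well

    ; the actual interrupt handling goes here,
    ; it is free to use r16 and to change any flag

    pop  r16            ; get the saved status register back
    out  SREG, r16      ; restore the flags
    pop  r16            ; restore r16
    reti                ; return from interrupt
```

The pops have to be done in the reverse order of the pushes. Every further register the ISR uses needs its own push/pop pair.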

Basic Mathematical Operations


Adding/subtracting, multiplying, shifting, rotating and bit manipulation of registers are essential steps when
calculating addresses, offsets or other values at runtime and when converting strings to data we want to
work on or data to values we can display.
Most of these instructions rely on specific bits in the Status Register (SREG), so you should have a look at
Assembler Basics -> MCU Status first.
Shifting and Rotating
Shifting and rotating registers is good for handling serial data. The following example shows how to rotate a
bit into r16 from a serial data stream present at PinD, 0:
shift_in:               ;when datastream bit is valid, go here
    clc                 ;clear carry flag
    sbic PinD, 0        ;if PinD, 0 is cleared skip next
    sec                 ;set carry flag
    rol r16             ;shift in carry flag

When a register is rotated left (rol), the MSB is shifted into the Carry Bit, bit 6 goes to bit 7, bit 5 to bit 6
and so on and bit 0 is replaced by the old Carry Bit.
C <- b7 <------ b0 <- C (Rotate left, rol)
C -> b7 ------> b0 -> C (Rotate right, ror)
When a register is shifted, the bit shifted out replaces the Carry Bit, the bit shifted in is 0.
C <- b7 <------ b0 <- 0 (Shift left, lsl)
0 -> b7 ------> b0 -> C (Shift right, lsr)
Another shift operation is asr (arithmetic shift right), which works like lsr, but bit 7 remains unchanged. The
rest is shifted right and the Carry bit is replaced by bit 0. This effectively divides a signed number by 2 (bit
7 holds the sign) and the Carry Bit can be used to round the result.
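Two small sketches of how these instructions are typically combined. The register choice is arbitrary, and r16:r17 is meant as low byte:high byte, as in the rest of this text:

```asm
    ; shift the 16-bit value r16:r17 left by one bit (multiply by 2)
    lsl r16             ; bit 7 of the low byte goes to the Carry Bit, bit 0 becomes 0
    rol r17             ; the high byte takes the Carry Bit into its bit 0

    ; divide a signed number by 2 and use the Carry Bit for rounding
    ldi r18, -7         ; r18 = 0xF9
    asr r18             ; r18 = 0xFC (-4), the Carry Bit holds the old bit 0 (1)
    brcc no_round       ; Carry Bit clear: the result is exact
    inc r18             ; Carry Bit set: round up, r18 = 0xFD (-3)
no_round:
```

For more than two bytes, the lsl on the lowest byte is simply followed by one rol per additional byte.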
Bit Manipulation
cbr and sbr clear or set one or multiple bit(s) in a register. These instructions only work on registers r16 to
r31. They do not use single bits as an argument, but masks which can contain multiple bits:
    sbr r16, (1<<5)+(1<<3)  ;set bits 5 and 3 in register 16
    cbr r16, 0x03           ;clear bits 1 and 0 in register 16
See the instruction set summary or AVRStudio assembler help for details on Status Register flags changed
by these instructions. You can for example use breq after one of these instructions when the result is zero.
See instruction set for logical instructions like and, andi, or, ori, eor, com and so on. These are pretty
simple to understand.

Adding and Subtracting


There is a whole bunch of add/subtract instructions available in AVRs which all have advantages and
disadvantages. Some can be used with all registers, others only with r16 to r31. I'll first list the add
instructions:
    add r16, r17        ;r16 = r16 + r17
    adc r16, r2         ;r16 = r16 + r2 + C
    adiw YL, 3          ;YL:YH = YL:YH + 3

add is easy to understand. The second register is added to the first register and appropriate flags in the Status
register are set.
adc is a bit fancier. It also uses the Carry flag of the previous operation to increase the result if it is set.
Good for multiple-byte operations (see Advanced Assembler section).
adiw is the only add instruction which takes a constant as an argument. It only works on the low bytes of
the register pairs (24, 26, 28, 30) and adds the constant (0 to 63) to the pair.
Subtracting can be done with some more instructions:
    sub r16, r17        ;r16 = r16 - r17
    sbc r16, r2         ;r16 = r16 - r2 - C
    sbiw ZL, 5          ;ZL:ZH = ZL:ZH - 5
    subi r16, 30        ;r16 = r16 - 30
    sbci r16, 4         ;r16 = r16 - 4 - C

The subtract instructions work like the add instructions, so I won't explain them in detail. The "subtract
immediate" (subi) instruction can be used to make up an "addi" instruction by subtracting a negative
constant (this can also be used to make an "addi" macro):
    subi r16, -5        ; r16 = r16 + 5
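
Such an "addi" macro could look like the sketch below. It uses the macro syntax of the Atmel assembler (.macro/.endmacro with @0 and @1 as parameters); the name addi is of course not a real instruction:

```asm
.macro addi             ; usage: addi Rd, K   (Rd has to be r16..r31)
    subi @0, -(@1)      ; subtracting -K is the same as adding K
.endmacro

    ldi  r16, 10        ; r16 = 10
    addi r16, 5         ; expands to "subi r16, -5", r16 = 15
```

One thing to keep in mind: as this is still a subtraction, the Carry flag comes out inverted compared to a real add. So don't let an adc rely on the carry of this macro.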
Multiplying
Multiplication is a bit more difficult than adding or subtracting. Classic AVRs (AT90S Series) don't support
the mul instruction, ATmegas have 6 different multiply instructions (multiply, multiply signed with unsigned,
multiply signed, fractional mul, fractional mul signed, fractional mul signed with unsigned).
Classic AVRs need some extra coding to perform multiplications, here is some pseudo-code:
for r16 = 1 to multiplier do
result = result + multiplicand
r16 = r16 + 1
repeat
This is more a loop adding than a multiplication, but it will do the job. In most cases the result will be
16 bits wide, so have a look at the Advanced Assembler Section as well if you're not familiar with 16-bit
operations.
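In AVR assembler the pseudo-code above could be sketched as the following subroutine. The register assignment and the labels are just my choice for this example, and the stack has to be set up for the ret to work:

```asm
; in:  r16 = multiplicand, r17 = multiplier (is counted down to zero)
; out: r18:r19 = 16-bit result (low byte:high byte)
soft_mul:
    clr  r18            ; result low byte = 0
    clr  r19            ; result high byte = 0
    clr  r20            ; r20 = 0, only needed for adding the carry
    tst  r17            ; is the multiplier zero?
    breq soft_mul_done  ; then the result stays zero
soft_mul_loop:
    add  r18, r16       ; result = result + multiplicand (low byte)
    adc  r19, r20       ; add the carry to the high byte
    dec  r17            ; one addition less to do
    brne soft_mul_loop  ; repeat until the multiplier is used up
soft_mul_done:
    ret                 ; return with the result in r18:r19
```

With a big multiplier this loop goes around up to 255 times, so it is simple rather than fast.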
As ATmegas have the mul instruction, they don't need 16-bit operations and loops to perform a
multiplication. The result of mul instructions is always returned in r0:r1.
    mul r16, r17        ;r0:r1 = r16 * r17
    muls r16, r17       ;r0:r1 = r16(signed) * r17(signed)
    mulsu r16, r17      ;r0:r1 = r16(signed) * r17(unsigned)

As I mentioned above the megas also have fractional multiply instructions, but these are more advanced
and I never used them. So I can't tell you how they work. And I won't.

Multiple byte maths

When the 8 bit range is not enough, multiple bytes are needed to hold a value. Performing mathematical
operations on them requires more work than maths on single bytes. Most of these operations can be written
as a macro and then be used just like normal instructions. The Carry Bit (SREG) is the reason for all this to
work - if an add instruction resulted in a number that is greater than 255, the carry bit is set. It can then be
used by adc (add with carry) to increment the high byte (16-bit example):
    ldi r16, 1          ; load r16 with 1
    ldi r17, 0          ; load r17 with 0 (r16:r17 = 1)
    ldi r18, 255        ; load r18 with 255
    ldi r19, 0          ; load r19 with 0 (r18:r19 = 255)
    add r16, r18        ; add low bytes (= 256 => r16 = 0)
    add r17, r19        ; add high bytes (= 0)

The result of the operation above is 0 because the carry of the low byte add was not used when adding the
high bytes.
                        ; r16:r17 = 1
                        ; r18:r19 = 255
    add r16, r18        ; add low bytes (= 256 => r16 = 0)
    adc r17, r19        ; add high byte with carry
                        ; (= 0 + 1 (from carry) = 1)
                        ; => r16:r17 = 256

Subtracting words from each other works just like adding: Subtract (sub) the low bytes from each other,
then subtract the high bytes with carry (sbc). This also works with subi (subtract immediate) and sbci
(subtract with carry immediate). A close look at the instruction set reveals some useful compare
instructions as well! A normal compare instruction (cp) is a subtraction that only changes the flags: if the
first operand is lower than the second, the carry bit is set. This can then be taken into account by cpc
(compare with carry) to compare the high bytes of two words. cpc also leaves the Zero flag set only if all
bytes compared so far were equal, so breq and brne work on the whole word. So what about 32 bit values?
It's the same, but with 4 bytes:
                        ; r16..r19 = 0x00000100
                        ; r20..r23 = 0x002000FF
    add r16, r20        ; add bytes0
    adc r17, r21        ; add bytes1 with carry
    adc r18, r22        ; add bytes2 with carry
    adc r19, r23        ; add bytes3 with carry
                        ; result:
                        ; r16..r19 = 0x002001FF
    cp r16, r20         ; perform 32-bit compare
    cpc r17, r21        ; (result: greater than)
    cpc r18, r22
    cpc r19, r23
    brne not_eq         ; jump to "not equal"-code
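
The same pattern for 16-bit values, as a short sketch. The constant 1000 and the branch labels are made up for this example; r16:r17 and r18:r19 are low byte:high byte again:

```asm
    subi r16, low(1000)     ; subtract the low byte of the constant
    sbci r17, high(1000)    ; subtract the high byte and the borrow from above

    cp   r16, r18           ; compare the low bytes
    cpc  r17, r19           ; compare the high bytes, taking the borrow into account
    breq words_equal        ; Z is only set if both bytes were equal
    brlo word_is_lower      ; C is set if r16:r17 is lower than r18:r19 (unsigned)
                            ; if we get here, r16:r17 is higher than r18:r19

words_equal:                ; hypothetical branch targets, only defined here
word_is_lower:              ; so that the fragment can be assembled
```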

Multiply and divide operations are not as easy as adding and subtracting. The bigger AVRs have a
hardware multiplier, but dividing values still has to be done in software. For those AVRs without a HW mul,
you'll have to write a software multiply routine (which is not difficult, I will add one here). If you need
multiply and divide operations, see The Atmel Appnote page and look for AVR200, AVR201 and AVR202.

The Different Jumps


The variety of jump instructions the AVR has looks a bit frightening to beginners. On this page, they're
described in the order they appear in the instruction set summary:
rjmp
"Relative Jump". This instruction performs a jump within a range of +/- 2k words. Added together, it can
reach 4k words or 8k bytes of program memory, so it's possible to reach the whole program space of the
8515 or mega8 with it, as well as any other 8k-AVR. You can use it on other AVRs as well, of course!

The advantage of rjmp over jmp is that rjmp only needs 1 word of code space, while jmp needs 2 words.
Example:
rjmp go_here
ijmp
"Indirect Jump" to (Z). This instruction performs a jump to the address pointed to by the Z index register
pair. As Z is 16 bits wide, ijmp allows jumps within the lower 64k words range of code space (big enough
for a mega128). This instruction is especially cool for jumping to calculated addresses, or addresses from
a lookup table. Of course, special care has to be taken when setting up Z. Example:
ldi ZL, low(go_there)
ldi ZH, high(go_there)
ijmp
jmp
"Jump". While rjmp is limited to +/- 2k words, jmp can be used to jump anywhere within the code space.
The address operand of jmp can be as big as 22 bits, resulting in jumps of up to 4M words. The
disadvantage over rjmp is that jmp needs 2 words of code space, while rjmp needs just one word.
Example:
jmp go_far

Subroutine Calls
The AVR also has various subroutine call instructions. These are now described in the order they appear in
the instruction set summary. IMPORTANT: Subroutine calls require a proper stack setup and use of the
return instructions (which are described at the end of this page). For more about the stack,
read Architecture -> The Stack. That section also provides some more information on subroutines.
rcall
"Relative Call Subroutine". Just as rjmp, rcall can reach addresses within +/- 2k words. When rcall is
executed, the return address is pushed onto the stack. It needs 1 word of program space. Example:
rcall my_subroutine
icall
"Indirect Call to (Z)". This instruction works similarly to ijmp, but as a subroutine call. The subroutine pointed
to by the Z index register pair is called. As Z is 16 bits wide, the lower 64k words of code space can be
addressed. The return address is pushed onto the stack. icall needs one word of code space. Example:
ldi ZL, low(my_subroutine)
ldi ZH, high(my_subroutine)
icall
call
"Call Subroutine". Like jmp, call takes an absolute address of up to 22 bits, so it can reach the whole code
space of any AVR (including the mega128 with its 64k words). It works just like rcall (regarding the stack)
and needs 2 words of code space.
Example:
call my_subroutine
The Return Instructions ret And reti

These instructions have to be placed at the end of any subroutine or interrupt service routine (ISR). The return
address is popped from the stack and program execution goes on from there. This is what ret does.
reti is used after ISRs. Basically it works like ret, but it also sets the I Flag (Global Interrupt Enable Flag) in
the status register. When an ISR is entered, this bit is cleared by hardware.

Indirect Calls/Jumps
Indirect calls or jumps are needed when a computed value determines where the CPU has to proceed
executing code. They are fairly easy to understand.
Indirect calls/jumps don't use a constant address as a target, but have the Z index register pair as an
argument instead. As the program memory is organized in 16-bit words, they also don't need an extension
for 128kbyte devices such as the mega128. lpm has elpm to reach the whole program space, because
lpm uses byte addresses. ijmp and icall use word addresses, so no eijmp or eicall is needed there. While the
label for setting up Z for lpm needs to be multiplied by two (to have byte addresses), this doesn't have to be
done for ijmp/icall.
Example:
    ldi ZL, low(led_on)     ; load Z with address to call
    ldi ZH, high(led_on)
    icall                   ; call led_on

led_on:                     ;this is where Z points at and therefore the address to call
    ldi r16, 0b11111110
    out PortA, r16
    ret

Indirect jumps/calls can also be used to make big case structures faster: If 20 different cases can occur
and the case we have is only found by the last check, it takes longer to be processed than the first
one, as 19 checks have already been done before.
If the value to be checked for different case values is used to perform an indirect jump or call, life is easier,
faster and more effective regarding code space usage:
The value we want to process is multiplied by the number of words a jmp needs (which is two) and then
added to the base address of our table. The following interrupt routine loads r16 with the current UDR data
and calls the appropriate subroutine:
    in r16, UDR                 ; get data from UDR
    dec r16                     ; the table starts with the cell for the value 1
    lsl r16                     ; multiply by two
    clr r17                     ; zero reg for 16-bit addition
    ldi ZL, low(case_table)     ; load jump table base address
    ldi ZH, high(case_table)
    add ZL, r16                 ; add (UDR - 1)*(jmp size) to Z
    adc ZH, r17
    icall                       ; call table cell
    reti                        ; and return

case_table:
    jmp UDR_is_one              ; these are jumps to subroutines which in turn handle the
    jmp UDR_is_two              ; case that occurred.
    jmp UDR_is_three            ; when returning, the CPU will jump to the reti of the ISR above again.

One fundamental thing is not shown here: The ISR has to check for values not handled by the table before
the icall is done. The table as it is above does not have entries for the values zero and 4 to 255, so these
can result in an error! Also watch out for the table cell size: A jmp needs two words and is slower than an
rjmp. An rjmp needs one word only, so if it is used in the table, it must be followed by a nop so that every
cell is still two words long (the nop should follow the rjmp for speed reasons). Without the nop, the
calculated address would point to the wrong table cell.
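Such a range check could look like the sketch below. It assumes the case_table from the listing above with cells for the values 1 to 3, and register names from the device's def file; the labels uart_rx_isr and rx_done are made up. Saving SREG and the registers used is left out to keep it short:

```asm
uart_rx_isr:                    ; hypothetical ISR label
    in   r16, UDR               ; get data from UDR
    cpi  r16, 1                 ; lower than 1 (that is: zero)?
    brlo rx_done                ; no table cell for zero, do nothing
    cpi  r16, 4                 ; 4 or higher?
    brsh rx_done                ; no table cell for 4 to 255, do nothing
    dec  r16                    ; value 1 belongs to the first table cell
    lsl  r16                    ; multiply by the cell size (two words)
    clr  r17                    ; zero reg for 16-bit addition
    ldi  ZL, low(case_table)    ; load jump table base address
    ldi  ZH, high(case_table)
    add  ZL, r16                ; add the cell offset to Z
    adc  ZH, r17
    icall                       ; call table cell
rx_done:
    reti                        ; and return
```

brlo and brsh treat the value as unsigned, which is what we want for a received byte.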

Indirect addressing is also used by operating systems to let tasks install themselves into interrupt jump
tables: The operating system cares for the interrupt being serviced, the tasks leave their own ISR address
in a table and the operating system can call each routine. Such a table could be an array of addresses in
SRAM at an address known when the code is written:
.org 0x0000                     ; setup reset interrupt vector
    rjmp reset

reset:                          ; the reset vector (in this case) calls the first routine installed in the
                                ; reset vector table:
    lds ZL, reset_table         ; load Z with first address
    lds ZH, reset_table + 1
    icall                       ; call the routine

.dseg
reset_table: .byte 8            ; make a table for 4 addresses which can be filled at runtime
.cseg

                                ; this is what the task does for installing itself in the reset table:
    ldi XL, low(my_reset_ISR)   ; load X with my_reset_ISR address
    ldi XH, high(my_reset_ISR)
    sts reset_table, XL         ; store the address at reset_table[0]
    sts reset_table + 1, XH

Of course, the task needs some more information: Which table position is free (not used by other tasks)
and what information is provided by the OS and so on, but that's not the problem now, this page is just
meant to illustrate how indirect jumps and calls work.
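To illustrate the "which table position is free" part, here is a minimal sketch in C of such a table being filled at runtime. It is only a model under assumptions: the names (reset_table, install_reset_hook, run_reset_hooks, my_reset_hook) are hypothetical, and an empty slot is assumed to hold a null address.

```c
#include <stddef.h>

#define RESET_TABLE_SLOTS 4   /* the listing reserves 8 bytes = 4 addresses */

typedef void (*task_hook_t)(void);

task_hook_t reset_table[RESET_TABLE_SLOTS];   /* zero-initialised = all slots free */

/* A sample task routine, only here so the sketch can be tried out. */
int hook_calls = 0;
void my_reset_hook(void) { hook_calls++; }

/* Install a routine in the first free slot; returns the slot or -1 if full. */
int install_reset_hook(task_hook_t hook)
{
    for (int i = 0; i < RESET_TABLE_SLOTS; i++) {
        if (reset_table[i] == NULL) {
            reset_table[i] = hook;
            return i;
        }
    }
    return -1;
}

/* Call every installed routine in table order; returns how many were called. */
int run_reset_hooks(void)
{
    int called = 0;
    for (int i = 0; i < RESET_TABLE_SLOTS; i++) {
        if (reset_table[i] != NULL) {
            reset_table[i]();     /* the indirect call */
            called++;
        }
    }
    return called;
}
```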

Conditional Branches
Conditional branches are branches based on the micro's Status Register. If the result of a previous
operation left a status (for example "Zero"), this can be used to jump to code handling this result. Loops
(for, while...) make use of this.
Any add, subtract, increment, decrement or logic instruction for example leaves a status that can be used
for almost any branch instruction the AVR offers. There are as well some tests which set status flags based
on their arguments. Basically they are just a subtraction: Comparing two numbers to each other is done by
subtracting one from the other. The result of a - b can be negative (b > a), positive (b < a) and zero (b = a).
This information is stored in the status register. When two numbers are added to each other, it can happen
that the 8-bit result is greater than 255 and therefore "rolls over". In this case, the carry bit in SREG is set.
Some examples:
subi r16, 5          ; r16 = r16 - 5
breq r16_is_0        ; r16 was 5, handle that
brlo r16_is_lower0   ; r16 was lower than 5
r16_is_greater5:     ; r16 was higher than 5

Now some examples for conditional branches in loops:


ldi r16, 5    ; load desired loop count into r16
loop:         ; loop label
dec r16       ; decrease loop count
brne loop     ; if not zero (result <> 0), loop again

clr r16       ; clear counter
loop:         ; loop label
inc r16       ; increase loop count
cpi r16, 5    ; compare to desired loop count
brne loop     ; if not reached, loop again

Here is a list of simple tests which can also be used for branch instructions (exception: cpse - this
instruction performs a compare and skips the next instruction if equal):
Valid SREG flags after test ("<>" = changed depending on the result, "0" = cleared, "-" = not changed):

instruction:  arg 1:  arg 2:  "action":       I   T   H   S   V   N   Z   C
cpi           reg     const   reg - const     -   -   <>  <>  <>  <>  <>  <>
cp            reg     reg     reg - reg       -   -   <>  <>  <>  <>  <>  <>
cpc           reg     reg     reg - reg - C   -   -   <>  <>  <>  <>  <>  <>
tst           reg     --      reg AND reg     -   -   -   <>  0   <>  <>  -
cpse          reg     reg     reg - reg       -   -   -   -   -   -   -   -

If you want more info on which results change which SREG flag, see the AVRStudio assembler help. Here
is a list of a few branch instructions and what they do based on the flags:
breq    ; branch if equal (Z = 1)
brne    ; branch if not equal (Z = 0)
brsh    ; branch if same or higher (unsigned, C = 0)
brlo    ; branch if lower (unsigned, C = 1)
brmi    ; branch if minus (N = 1)
brpl    ; branch if plus (N = 0)
brge    ; branch if greater than or equal (signed, S = 0)
brlt    ; branch if less than (signed, S = 1)

AVR assembler has more branches which test the interrupt flag or single other status flags. If you need
one of them, see the AVRStudio assembler help. Branch instructions leave the flags they test untouched,
so the code branched to or the code following the branch can use them without restriction.
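If you want to see how a compare turns into flags and the flags into branch decisions, the following C model may help. It is a sketch for an 8-bit compare (a - b) only; the struct and function names are made up for this example, and the half carry flag is left out.

```c
#include <stdint.h>

/* Flags left behind by an 8-bit compare (cp a, b computes a - b and discards it). */
struct flags { int z, c, n, v, s; };

struct flags compare8(uint8_t a, uint8_t b)
{
    uint8_t r = (uint8_t)(a - b);
    struct flags f;
    f.z = (r == 0);                            /* Zero: a equals b */
    f.c = (a < b);                             /* Carry (borrow): a lower than b, unsigned */
    f.n = (r & 0x80) != 0;                     /* Negative: bit 7 of the result */
    f.v = (((a ^ b) & (a ^ r)) & 0x80) != 0;   /* two's complement overflow */
    f.s = f.n ^ f.v;                           /* Sign: a less than b, signed */
    return f;
}

/* Which branch would be taken for a given set of flags? */
int breq_taken(struct flags f) { return f.z; }
int brlo_taken(struct flags f) { return f.c; }
int brsh_taken(struct flags f) { return !f.c; }
int brlt_taken(struct flags f) { return f.s; }
int brge_taken(struct flags f) { return !f.s; }
```

Note how 200 compared with 100 takes brsh (unsigned: 200 is higher) and also brlt (signed: 200 is read as -56). That is why the signed and unsigned branches must not be mixed up.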

Case Structures
Quite often, for example when receiving command values via the UART, it is necessary to build up a case
structure to determine which function needs to be called. The case structure compares a value to various
case values. As the branch instructions don't change any flags, this can be implemented in a
straightforward way:
in r16, UDR    ; get UART data
cpi r16, 0     ; compare with case_0 value (0)
breq case_0    ; if case_0, jump there
cpi r16, 1     ; compare with case_1 value (1)
breq case_1    ; if case_1, jump there
cpi r16, 2     ; ...and so on
breq case_2

After all the case tests you can write the "default" code that will be executed if none of the tests finds
a match. Case structures don't necessarily test for single values, but can also test for values within a
specific range, or compare strings to each other. Here is a value range example:
in r16, UDR      ; get UART data
chk_case_03:
cpi r16, 4       ; compare with 4
brlo case_03     ; if lower (0, 1, 2, 3) jump to case_03
cpi r16, 20      ; compare with 20
brlo case_419    ; if lower (4 to 19), jump to case_419
default:         ; if none of the tests was successful,
; (default code) ; execute default code

The second example could also use brlt (branch if less than) if signed numbers are used. Advanced users
can write compare routines for any data structure they want. If these return usable flags in SREG,
conditional branch instructions can of course be used then.
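The range example only works because the tests are done in ascending order: the second test never sees the values 0 to 3. A minimal C sketch of the same logic (the enum and the function name classify are made up here):

```c
#include <stdint.h>

enum { CASE_03, CASE_419, CASE_DEFAULT };

/* Same order of tests as in the listing: lowest limit first. */
int classify(uint8_t value)
{
    if (value < 4)  return CASE_03;     /* cpi r16, 4  / brlo case_03  */
    if (value < 20) return CASE_419;    /* cpi r16, 20 / brlo case_419 */
    return CASE_DEFAULT;                /* none of the tests was successful */
}
```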

For Loops
I don't think I need to explain how a for loop works, but in assembler we need to take care of the counting
register, which we wouldn't need to do in C or Pascal. For loops can work in many different ways. Some
are more code efficient, some are more flexible.
The flexible version counts from zero up to the required number of iterations. It is possible to use the
counting register to address for example array elements.
ldi r16, 0        ; clear counting register
loop:
out PortB, r16    ; write counting register to PortB
inc r16           ; increase counter
cpi r16, 10       ; compare counter with 10
brne loop         ; if <>10, repeat

ldi r16, 0        ; basically this loop acts like
loop1:            ; the first one, with one exception.
inc r16           ; Find it.
out PortB, r16
cpi r16, 10
brne loop1

What is the difference between the two example loops? The first loop increments the counter after writing
the counter value to PortB. So the values we can see on that port are 0..9. The second loop increments
the counter before writing it to PortB. We can see the values 1..10. So whenever you plan to use the
counter register within the loop (for whatever you can think of) remember to check where the counter has
to be incremented.
If the counter value is only important for counting purposes (not used from within the loop), you can use a
decrementing version:
ldi r16, 10            ; load r16 with desired number of iterations
loop2:
; (insert loop code)   ; do whatever the loop does...
dec r16                ; decrement loop counter
brne loop2             ; if not zero, repeat loop

You might have noticed that there is no compare instruction. dec leaves the status manipulated in such a
way that we can use brne (or breq) to determine whether the result was zero or not. That saves 1 word of
program space compared to the up-counting version of a for-loop.
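If you want to check the three loop versions without a simulator, here is a small C model. It is only a sketch: port_b_log is a made-up array that records what would appear on PortB, and the function names are not from the AVR tools.

```c
#include <stdint.h>

uint8_t port_b_log[16];   /* hypothetical record of the values written to PortB */

/* First loop: write, then increment. Returns the number of writes. */
int write_then_inc(void)
{
    int n = 0;
    uint8_t r16 = 0;
    do {
        port_b_log[n++] = r16;   /* out PortB, r16 */
        r16++;                   /* inc r16 */
    } while (r16 != 10);         /* cpi r16, 10 / brne loop */
    return n;
}

/* Second loop: increment, then write. */
int inc_then_write(void)
{
    int n = 0;
    uint8_t r16 = 0;
    do {
        r16++;                   /* inc r16 */
        port_b_log[n++] = r16;   /* out PortB, r16 */
    } while (r16 != 10);         /* cpi r16, 10 / brne loop1 */
    return n;
}

/* Decrementing version: returns the number of passes through the loop. */
int count_down(uint8_t start)
{
    int passes = 0;
    uint8_t r16 = start;
    do {
        passes++;                /* (loop code) */
        r16--;                   /* dec r16 */
    } while (r16 != 0);          /* brne loop2 */
    return passes;
}
```

One thing the model shows as well: a start value of 0 in the decrementing version does not skip the loop, it runs 256 times, because the counter wraps around to 255 first.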

While Loops
Just as for loops, while loops come in different variations. This time, we don't have to care about a counter
register.
The while()...do{}-loop checks if a certain test result is true and performs the loop instructions.
while1:            ; while(PinD = 1) rcall port_is_1
in r16, PinD       ; perform check by reading pin value and
cpi r16, 1         ; comparing it to 1
brne while1_end    ; if not true, end the loop
rcall port_is_1    ; rcall port_is_1
rjmp while1        ; and repeat loop
while1_end:

This type of while-loop will only execute the loop instructions if the condition is true, else it will never do
that.
The do{}...while()-loop executes the loop instruction at least once:
while2:            ; Do rcall port_is_1 while(PinD = 1)
rcall port_is_1    ; rcall port_is_1
in r16, PinD       ; check, see above
cpi r16, 1
breq while2        ; if true, repeat
while2_end:        ; not true; proceed with following code

These two examples also demonstrate two different branch instructions used for (basically) the same
thing. You'll easily find that out by yourself. Little helper: Conditional Branches.

Macros in AVR Assembler


Macros are a good way to make code more readable, for example if it contains code that is often reused or
if a lot of 16-bit calculations are done.
Macros in AVR assembler can be defined everywhere in the code as long as they're not used at a location
before the macro definition. They can take arguments which are replaced during assembly and can't be
changed during runtime. The arguments can only be used in the form @0 or @1 (where 0 and 1 are the
argument numbers, starting from 0). The arguments can be almost everything the assembler can handle:
integers, characters, registers, I/O addresses, 16 or 32-bit integers, binary expressions...
This works:
.macro ldi16           ; let's make a macro for loading two registers with a 16-bit immediate
ldi @0, low(@2)        ; load the first argument (@0) with the low byte of @2
ldi @1, high(@2)       ; same with second arg (@1) and high byte of @2
.endmacro              ; end the macro definition

ldi16 r16, r17, 1024   ; r16 = 0x00 r17 = 0x04

While this does not:


ldi16 r16, r17, 1024   ; meant to give r16 = 0x00 r17 = 0x04, but ldi16 is not defined yet here

.macro ldi16
ldi @0, low(@2)
ldi @1, high(@2)
.endmacro
Above, I wrote that arguments are replaced during assembly. The following should make it clear:
ldi16 r16, r17, 1024
; is assembled to:
ldi r16, 0             ; in the macro, this was: ldi @0, low(@2)
ldi r17, 0x04          ;                         ldi @1, high(@2)

As I said, macros can also be used to replace 16-bit calculations. This is one example (along with ldi16):
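.macro addi            ; This is the "Add Immediate to register" instruction we all
subi @0, -(@1)         ; missed in the instruction set!
.endmacro

.macro addi16          ; Now here's the 16-bit version:
subi @0, low(-(@2))    ; subtract the low byte of the negated constant from the low register
sbci @1, high(-(@2))   ; same for the high register, including the carry from the low byte
.endmacro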
Macros can of course be more complex, take more arguments and crash the assembler. If too many
macros are defined in one file, the last ones can't be found. I've had this with more than 7 I think. Just split
them into more files, that helps sometimes. Or just don't be that lazy and write the code yourself...
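Why does subtracting the negated constant add it? The following C model of addi16 walks through the two instructions byte by byte. It is only a sketch: LOW and HIGH mimic the assembler's low() and high(), the function name and argument order are chosen for this example, and the constant is assumed to fit into 16 bits.

```c
#include <stdint.h>

#define LOW(x)  ((uint8_t)((x) & 0xFF))
#define HIGH(x) ((uint8_t)(((x) >> 8) & 0xFF))

/* Model of: subi lo, low(-k)  followed by  sbci hi, high(-k) */
uint16_t addi16(uint8_t lo, uint8_t hi, uint16_t k)
{
    uint16_t neg    = (uint16_t)(0u - k);       /* two's complement of k */
    uint8_t  sub_lo = LOW(neg);
    uint8_t  sub_hi = HIGH(neg);
    uint8_t  borrow = lo < sub_lo;              /* carry flag after subi */
    uint8_t  new_lo = (uint8_t)(lo - sub_lo);   /* subi */
    uint8_t  new_hi = (uint8_t)(hi - sub_hi - borrow);   /* sbci */
    return (uint16_t)((new_hi << 8) | new_lo);
}
```

For example, adding 1 to 0x00FF subtracts 0xFF from the low byte (result 0x00, no borrow) and 0xFF from the high byte (result 0x01), which gives 0x0100.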

AVR Assembler Directives and Expressions

[ Directives ] [ Expressions ]
Assembler Directives
Assembler Directives change or adjust the way the assembler works with your code. For example, you can
change the location of your code in program memory, assign labels to SRAM addresses or define constant
values. ".macro" is also an assembler directive. Assembler directives can be divided into the following
groups:
[ Program Memory ] [ SRAM ] [ EEPROM ] [ Registers and Constants ] [ Coding ] [ Assembler Output ]
You will see that some directives can be used in more than one context, but that makes sense as soon as
you understand them. They always do basically the same thing.
Program Memory Directives
.cseg
"Code Segment"; This directive tells the assembler that the following code/expressions/whatever is to be
put into program memory. This is necessary when the .dseg directive was used before.
Syntax:
.cseg
.db
"Data Byte"; With this directive you can place constant values in program memory at a known address, for
example serial numbers, strings for a menu, lookup tables. They are treated byte-wise and therefore have
to be within the 8-bit range. Almost any expression can be used with the .db directive.
CAUTION!
Every .db directive will place its expressions starting at a new word in program memory! So two .db
directives each with a single byte as an argument will use two words, while one .db with two bytes as
arguments will use only one word. See the example below.
Syntax:
.db expression1, expression2, expression3, ...
Examples:

.org 0x0100          ;set program memory address counter to 0x0100
.db 128              ;place the number 128 at low byte of address 0x0100 in program memory
.db low(1000)        ;place the low byte of 1000 at low byte of address 0x0101
.db 128, low(1000)   ;place 128 at the low byte and the low byte of 1000 at the high byte of
                     ;address 0x0102 in program memory

Strings can of course be placed in program memory with only one .db directive:
.db "Hello World"
This will fill 6 words and a 0 will be added by the assembler. If your string processing routine looks for 0
terminated strings, this is no problem, as the 0 is already there. If the string is
.db "Hello World!"
no 0 will be added, so
.db "Hello World!", 0
is better.
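The word counting behind this can be written down in two lines. Here is a small C sketch (the function names are made up) that tells you how many words one .db directive uses and whether the assembler has to add a padding byte:

```c
#include <stddef.h>

/* Words of program memory used by one .db directive holding n_bytes bytes. */
size_t db_words(size_t n_bytes)
{
    return (n_bytes + 1) / 2;    /* an odd byte count is rounded up to a full word */
}

/* 1 if a padding byte is added (odd byte count), 0 if not. */
int db_is_padded(size_t n_bytes)
{
    return n_bytes % 2 != 0;
}
```

"Hello World" has 11 characters, so it fills 6 words with one padding byte. "Hello World!" has 12 characters and fills 6 words exactly, which is why no 0 is added there.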
.dw
"Data Word"; .dw works just like .db, but will use one word for every value.
.org
.org can be used to set the program counter to a specific value.
.org 0x01 is the Interrupt Vector for external interrupt 0 in devices with 1-word interrupt tables. The
mega128 has two words for each interrupt, so for setting the program counter to the external interrupt 0
you have to use .org 0x02 in this case.
Syntax:
.org location (location is the word address of where the following instructions/data tables are to be placed)
SRAM Directives
.byte
Reserves a given number of bytes of SRAM space for a label. This might sound a bit complicated, but the
syntax example will make it clear... This directive is only allowed in data segments (see .dseg).
Syntax:
.byte size
array_5: .byte 5    ; array_5 is a 5-byte SRAM segment
my_word: .byte 2    ; and is followed by my_word

.dseg
"Data Segment"; Tells the assembler that the following text is meant to be used for setting up the SRAM.
To switch to code again, use .cseg.

.org
Use this directive to set the SRAM location counter to a specific value within a .dseg. (see .org in "Program
Memory Directives"). Together with .byte you can define SRAM locations at a specific address with a
specific size.
EEPROM Directives
The EEPROM Directives work just like the directives for program memory and SRAM. I won't go into detail
here. As EEPROM values can be downloaded to EEPROM to be stored there, the .db and .dw directives
can be used for storing calibration values in EEPROM during programming.
.db
.dw
.eseg
.org
Register and Constant Directives
.def
"Define (register)"; With this directive you can assign names to registers.
Syntax:
.def name = register
Example:
.def temp = r16
.equ
This directive assigns a name to a constant value which can't be changed later:
.equ max_byte = 255
.set
This works similarly to .equ, but the value of the label can be changed later:
.set counter = 1
.set counter = 2
can occur in the same piece of code and they're each valid until a new .set is found, so .set counter = 1 is
overridden by .set counter = 2.
Coding Directives
.endm / .endmacro
"End Macro"; This tells the assembler that a macro previously started ends here. Only use after you've
also started a macro with .macro :-)

.macro
This will start a piece of macro code. See Assembler -> Macros for examples and usage suggestions.
.include
Including files (for example the part specific definition files for each AVR) makes code more readable and
gives you the possibility to split code into separate files.
Syntax:
.include path
Example:
.include "c:\program files\avr studio\assembler\8515def.inc"
.include "\drv_routines\lcd.inc"
Assembler Output Directives
.device
This directive tells the assembler which AVR this code is for and only has effect on the AVR Studio
Simulator settings and does not affect the way your code will run on the actual device. Possible arguments
(device codes) are (list not complete, but you'll get the picture....):
AT90S1200
AT90S2313
ATmega8535
ATmega128
ATtiny11
ATtiny26
Syntax:
.device devicecode
.exit
Tells the assembler to stop assembling the current file. WHAT FOR????? Well, if include files contain text at
the end (explanations of routines, constants and so on), the .exit directive can be used to let the assembler
proceed with the file in which the .include directive occurred without any warnings or errors caused by the
text.
Example:
.equ byte_max = 255
.equ clock = 8000000
.exit
The maximum value a byte can hold is 255 and the device is clocked at 8 MHz
.list
The assembler by default creates a listfile (a combination of source code, opcodes, constants and so on).
Together with .nolist you can specify which parts of your file are to be shown in the listfile.
.listmac

This directive will turn macro expansion in the listfile on. By default, you'll only see which macro is called
and which arguments are used. As it can be useful to see what's going on (for debugging purposes), it's
possible to get expanded macros.
.nolist
Turns listfile generation off (see .list)
Assembler Expressions
The AVR assembler supports many expressions for calculating constants or manipulating other values to
suit your needs. I don't want to cut n paste the AVR Studio assembler help file (as I did with the directives
part.......), so I'll just give you some pointers.
The possible operands are labels (addresses in Flash, SRAM or EEPROM), variables defined by the .SET
directive (see above), constants defined by the .EQU directive (also see above), integer constants
(decimal, hexadecimal, binary or octal) and the program counter (PC).
How to use the labels, variables, constants and integer values should be clear. The PC is interesting
and can be quite handy. Many loops only consist of one instruction and thinking of a label for all those little
things can be nasty, especially if the label you just want to use is already in use in some include file. The
code ends up separated by ugly labels nobody can understand. Using the program counter is much better
(That's my opinion and this has been subject to heavy discussions...) for those loops (add a comment to
be on the safe side). Here's an example:
ldi r16, 0        ; load r16 with zero
inc r16           ; increment r16
out PortD, r16    ; write value of r16 to PortD
cpi r16, 10       ; compare r16 with 10
brne PC - 3       ; branch to inc r16 if not equal

When the PC is used for calculations, the current PC value is used. That means that in the example above
the location of the branch instruction is used for the calculation (not the location afterwards). The
instructions we want to include in the loop are inc, out and cpi and each occupies one word of program
memory. So we need to jump back over those three words: PC - 3. Always verify your code by simulating it!

Conversions...
As you read through these pages you might want to have a look at an ascii table. You can find one in the
Banner frame ("quick links").
Conversions are very important for user interaction: If a register has the value 0x30 its corresponding ascii
character is '0'. But if you want it to be displayed as '48' (0x30 is 48), you need to convert the number. In
the case of '0' this is not that important, but 0xFF is displayed as a block. And '48' is better than '0' if you're
displaying a temperature...
Some protocols, such as Ymodem, also use strings of values we have to convert first before we can
perform calculations on them: Ymodem sends a file size of 512 bytes as '512'. An AVR has to convert this
from ascii coded decimal to 16bit int first before it knows what '512' means.
Some number formats you should have in mind when doing calculations:
128           ; normal decimal value
'128'         ; ascii coded decimal. In this case you need three bytes ('1', '2' and
              ; '8') to store that number.
0x30          ; hex value
0b11001010    ; binary value
'11001010'    ; ascii coded binary

It's up to you which number format you use for a specific task. Ascii coded hex is quite often used for
debugging purposes, because the numbers are all of the same size (number of characters needed) and
because the conversion always takes the same number of cpu cycles and doesn't require much space. Ascii
coded decimal is better for things like temperatures or rpm of a motor. Ascii coded binary is good for
displaying flag registers (SREG, Interrupt flag registers and so on).
I'll show you ways to convert numbers in both directions: From int to something you can display and back
(remember the Ymodem example).

Commonly Used Number Formats


The ALU of an AVR only knows integer numbers, unsigned as well as signed, and only 8 bits wide. The
8 bit limit is not that bad, as we can still use the carry bit to make 16-, 24- and 32 bit operations possible.
Converting numbers from one format to another is not as easy and requires the person writing the code to
understand the number formats first.
All conversions explained on these pages have the integer as one "end" (source or result). This is the
number the AVR actually deals with.
Other formats use ASCII characters or the fact that a digit (which has a range of 0 to 9) only uses one
nibble of a byte.
HEX format:
In AVR Assembler (and on this site) HEX numbers are written with the "$"-sign or "0x" at the beginning:
$10 is equal to 16 and 0x20 is equal to 32.
The Hex format splits the 8 bits of a byte into "nibbles" of 4 bits (the high nibble and the low nibble) and
displays them with a number or character:
Nibble value:  0  1  2  3  4  5  6  7  8  9  10  11  12  13  14  15
Hex:           0  1  2  3  4  5  6  7  8  9  A   B   C   D   E   F

If an 8-bit number is sent or printed as ASCII Coded Hex, the number is split into high nibble and low
nibble (in the case of 0x20 these are 2 and 0). Then the nibbles are converted to their ASCII
representative: 0x32 for 2 and 0x30 for 0. These values can be printed on screen. The values from the
table above can not be printed on the screen directly: In the ASCII table these are either not defined or
control characters. The nibble value A (10) won't be displayed as 'A'.
Binary format:
The binary format should be quite clear: 0b00001000 is equal to 8, 0b00011000 is equal to 24. Easy.
When a number comes as ASCII coded binary, the 1s and 0s are sent as their ASCII representative, 0x30
and 0x31, and thus have to be converted before they are "real" bits. The binary format also requires bit
shifting for the conversion.
Binary Coded Decimal (BCD) format:
Binary coded decimal is very handy for storing two digits (0..9) in one byte without much coding. The digits
are directly written to a byte nibble.
0x22 means that the low nibble contains the number 2 and the high nibble contains the number 2 as well.
A consequence of this is that a byte can only hold values in the range of 0 to 99: The values 10 to 15 (A to F
in Hex format) are not allowed in BCD format.
This format can for example be written to a port which has a 7447 connected to it. This IC is a 7-segment
LED driver which converts this format so that the segments of the LED display show the number of the
nibble.

The Ascii Table


Very often you'll need to convert ascii to hex or decimal numbers and back. An ascii table is THE tool you'll
need for that. Here is one.
If you need the hex value of 'H', look for H. It's in column $4x and row $x8. 'H' = $48 or 0x48. Other way
around: You need to know what 0x69 is when shown as a character. Column $6x, row $x9: 'i'
The "ctrl" column contains the control characters in short form. The real name can be found further down
on this page in a seperate table.

       $0x         $1x         $2x         $3x         $4x
       dec  ctrl   dec  ctrl   dec  char   dec  char   dec  char
$x0    000  NUL    016  DLE    032  spc    048  0      064  @
$x1    001  SOH    017  DC1    033  !      049  1      065  A
$x2    002  STX    018  DC2    034  "      050  2      066  B
$x3    003  ETX    019  DC3    035  #      051  3      067  C
$x4    004  EOT    020  DC4    036  $      052  4      068  D
$x5    005  ENQ    021  NAK    037  %      053  5      069  E
$x6    006  ACK    022  SYN    038  &      054  6      070  F
$x7    007  BEL    023  ETB    039  '      055  7      071  G
$x8    008  BS     024  CAN    040  (      056  8      072  H
$x9    009  HT     025  EM     041  )      057  9      073  I
$xA    010  LF     026  SUB    042  *      058  :      074  J
$xB    011  VT     027  ESC    043  +      059  ;      075  K
$xC    012  FF     028  FS     044  ,      060  <      076  L
$xD    013  CR     029  GS     045  -      061  =      077  M
$xE    014  SO     030  RS     046  .      062  >      078  N
$xF    015  SI     031  US     047  /      063  ?      079  O
0x20 ("spc") means space, of course.


Here is the control character table.
SOH - Start Of Header
STX - Start Of teXt
ETX - End Of teXt
EOT - End Of Transmission
ENQ - ENQuiry
ACK - ACKnowledge
BEL - BELl
BS - BackSpace
HT - Horizontal Tabulation
LF - Line Feed
VT - Vertical Tabulation
FF - Form Feed
CR - Carriage Return
SO - Shift Out
SI - Shift In
DLE - Data Link Escape
DC1 - Device Control 1
DC2 - Device Control 2
DC3 - Device Control 3
DC4 - Device Control 4
NAK - Negative AcKnowledge
SYN - SYNchronous idle
ETB - End of Transmission Block
CAN - CANcel
EM - End of Medium
SUB - SUBstitute
ESC - ESCape
FS - File Separator
GS - Group Separator
RS - Record Separator
US - Unit Separator

Converting Int to ASCII Coded Decimal


Converting integers to ASCII coded decimal is pretty simple. To understand how it is done, you first have to
think about how the numbers we're using in written documents are built up:

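(ASCII table, continued: column $5x)
       $5x
       dec  char
$x0    080  P
$x1    081  Q
$x2    082  R
$x3    083  S
$x4    084  T
$x5    085  U
$x6    086  V
$x7    087  W
$x8    088  X
$x9    089  Y
$xA    090  Z
$xB    091  [
$xC    092  \
$xD    093  ]
$xE    094  ^
$xF    095  _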

Let's take the number 233 and rip it apart:


Multiply number   with decimal   result
2                 100            200
3                 10             30
3                 1              3

These add up to 233. The number consists of single digits which specify the number of hundreds, tens,
and ones we add together.
To get these results, we need to divide the number; first by 100, then by 10 and then by 1. As the AVR
doesn't have a divide instruction, this has to be done manually:
Divide by 100:
- copy the number into a temporary register
- compare the number with 100
- if greater or equal, increase the hundreds count and subtract 100 from the temporary register
- go to the compare again
When this is done, the number in the temporary register is lower than 100. Now we can proceed with 10s
and 1s. Instead of dividing it by 1 we can just copy the remaining number to the register that holds the
ones.
Unfortunately, this is not enough to convert a number to decimal coded ASCII. In an ASCII table we can
see that '0' is 0x30. So we add 0x30 to the single digits (hundreds, tens, ones) and can now print it on the
screen (via UART, USB, LCD interface, whatever).
It's now also possible to reformat the number, delete characters we don't need (print a space instead of 0
hundreds if the number was lower than 100) or add additional characters in between.
Here's a flow chart of how the conversion can be done:

It should be pretty easy for you to write the code for this yourself.

Doing this with a 16-bit number is just the same, but with 5 digits and 16-bit compares. The code space
needed (as well as cpu time) is 40% bigger. If you have a lot of free program space, you can build up a
case-like structure to do the conversion: If the number is 200 or greater, the hundreds counter is loaded
with 2 and 200 is subtracted from the original number. This is faster but requires more space. It's up to
you.
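If you want to compare your own code against something, here is a minimal sketch of the 8-bit conversion in C. It follows the steps from above (repeated subtraction instead of division, then adding '0'); the function name and the output buffer are assumptions made for this example.

```c
#include <stdint.h>

/* Writes three ASCII digits (hundreds, tens, ones) plus a terminating 0. */
void u8_to_ascii_dec(uint8_t value, char out[4])
{
    uint8_t hundreds = 0;
    uint8_t tens = 0;

    while (value >= 100) {      /* "divide" by 100 */
        value -= 100;
        hundreds++;
    }
    while (value >= 10) {       /* "divide" by 10 */
        value -= 10;
        tens++;
    }
    /* what is left in value are the ones */
    out[0] = (char)('0' + hundreds);   /* '0' is 0x30 */
    out[1] = (char)('0' + tens);
    out[2] = (char)('0' + value);
    out[3] = '\0';
}
```

This version always prints three characters, so 48 comes out as "048". Replacing leading zeros by spaces is the reformatting step mentioned above.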

Converting Int to ASCII Coded Hex


This conversion is a bit more difficult than int to ASCII coded decimal, as you don't only have to display
numbers, but characters as well. In the ASCII table, these are not found directly after the numbers.
However, the first task is to load the two nibbles that make up an 8 bit integer into separate registers:

The reason why you have to swap the nibbles in reg A is that the register holding the high nibble should
have a value between 0x00 and 0x0F (15). If we didn't swap the nibbles, their value would be 0x00..0xF0
which we can't convert to ASCII.
Now we have two nibbles, each in a separate register, which are between 0x00 (0) and 0x0F (15). These
must now be converted into their ASCII representative: For 0 it's '0', for 10 it's 'A' and for 15 it's 'F'. This can
be done with a lookup table or by using a case structure.
A lookup table for this would have the ASCII values of the possible nibble values at the nibble positions:
Table Position:  1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16
Nibble value:    0    1    2    3    4    5    6    7    8    9    10   11   12   13   14   15
ASCII:           '0'  '1'  '2'  '3'  '4'  '5'  '6'  '7'  '8'  '9'  'A'  'B'  'C'  'D'  'E'  'F'

The conversion only consists of replacing the register value by a corresponding table value (which is the
overall concept of a lookup table).
When this is done and the number we had was 128, we now have reg A holding '8' and reg B holding '0'
because 128 is 0x80.
Converting 16-bit numbers from int to ASCII coded hex is not much harder. The result needs 4 registers
and we need to convert two ints without having to use any 16-bit instructions:
1024 (hex 0x0400) converts to '0' and '4' for the high byte and '0' and '0' for the low byte.
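Here is a minimal C sketch of the lookup table version for one byte. The names hex_table and u8_to_ascii_hex are made up; on the AVR the shift by four would be the nibble swap followed by masking off the upper nibble.

```c
#include <stdint.h>

/* The ASCII value of each possible nibble value, at the nibble's position. */
const char hex_table[] = "0123456789ABCDEF";

/* Splits value into its two nibbles and looks up their ASCII characters. */
void u8_to_ascii_hex(uint8_t value, char *high, char *low)
{
    *high = hex_table[value >> 4];      /* high nibble, moved down to 0..15 */
    *low  = hex_table[value & 0x0F];    /* low nibble */
}
```

For a 16-bit number the function is simply called twice, once for the high byte and once for the low byte.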

Converting Int to ASCII Coded Binary


This is the only conversion that needs byte shifting and doesn't destroy the original value. As one byte has
8 bits, you might want to reuse the register used for printing the character instead of using 8 individual
registers.
The shift direction depends on how the number is to be displayed (which bit first - most significant or least
significant?). We do 8 shifts, load the register with the corresponding value (0 or 1) and add ASCII '0' to it.
It's also possible to load the register with '0' and then increase it if the bit shifted out is 1. This is what I'd
do. Here's an example code snippet:
ldi r16, 8            ; load a counter with 8
itoacb_loop:          ; a label for looping 8 times
ldi result, '0'       ; load '0' into result
ror value             ; get next bit by rotating right, new bit is in carry
brcc is_0             ; check if carry bit is 0
inc result            ; if carry was set, increase result (0x30 + 1 = 0x31)
is_0:
rcall print_result    ; call the print routine (for UART, LCD or whatever)
dec r16               ; decrease the counter
brne itoacb_loop      ; and, if necessary, loop
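The same idea as a C sketch, this time for the other shift direction (most significant bit first), so that the string reads the way the number is normally written. The function name and the output buffer are assumptions for this example; instead of calling a print routine it collects the characters in a string.

```c
#include <stdint.h>

/* Writes 8 ASCII characters ('0' or '1'), most significant bit first. */
void u8_to_ascii_bin(uint8_t value, char out[9])
{
    for (int i = 0; i < 8; i++) {
        char result = '0';                /* load '0' into result */
        if (value & 0x80) {
            result++;                     /* bit shifted out is 1: '0' becomes '1' */
        }
        value = (uint8_t)(value << 1);    /* move the next bit to the top */
        out[i] = result;
    }
    out[8] = '\0';
}
```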

Converting Int to Binary Coded Decimal


Binary coded decimal is a number format useful for storing single decimal numbers without converting
them to ASCII and therefore needing more space. It's actually a step of the conversion from int to ASCII
coded decimal and similar to the hex format:
22 converted to BCD (binary coded decimal) is 0x22.
Numbers greater than 99 can't be converted to BCD if only one byte is to be used as the result. 100 needs
one more byte and would be converted to 0x0100 (256).
For the conversion, the algorithm of the int to ASCII coded decimal conversion is used, but some more
coding is to be done because the number is packed into one byte if it had two digits before.

Example: 45->BCD
45 divided by 10 is 4; 5 remains. 4 is stored in the result register and the nibbles are swapped. The result
is now 0x40. Now 5 is added and the result is 0x45, which is exactly the result we want.
Solving the problem that occurs when converting numbers >99 is up to you. A second byte is needed.
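For the one-byte case, the example above can be sketched in C like this (the function name is made up, and the input is assumed to be in the range 0 to 99):

```c
#include <stdint.h>

/* Packs a value of 0..99 into one BCD byte: tens in the high nibble, ones in the low. */
uint8_t u8_to_bcd(uint8_t value)
{
    uint8_t tens = 0;
    while (value >= 10) {                     /* "divide" by 10 by subtracting */
        value -= 10;
        tens++;
    }
    return (uint8_t)((tens << 4) | value);    /* swap tens into the high nibble, add ones */
}
```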

Converting ASCII Coded Decimal to INT


Ascii Coded Decimal strings are, for example, used in the Ymodem protocol for sending file size
information. To use that string as a file size it has to be converted to an integer, so that it can be
processed quickly and with little code.
The conversion is not as straight-forward as others, as the string length has to be taken into account:
'512' has a length of 3 characters, and therefore the '5' means 500, and not 50 or 5. As the most significant
digits are usually sent first, the string is not too hard to convert. If the least significant digit is sent first
the string has to be stored first and can't be processed byte for byte.
Given that the most significant digit is sent first, the following steps are necessary for each character
received:

For the '512' string, this happens:


'5' is converted to 5 by subtracting ASCII '0' from '5'. Then the result (which is still zero) is multiplied by 10.
The result is zero.
Now 5 is added. Result = 5
Now the '1' is received and converted to 1. The result (which is 5) is multiplied by 10, so we now have 50.
1 is added which equals 51.
The '2', being the last character of the string, is again converted, the result is multiplied by 10 (=510) and 2
is added. The result is 512. Easy huh?
The only thing I didn't mention until now is how to determine when the string ends. This depends on the
transmitter or source of the string. If it is a null-terminated string, we can use the following program flow:
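The '512' walk-through can be sketched in C as follows. This is a minimal sketch under assumptions: the string is null-terminated, contains only the characters '0' to '9' and fits into 16 bits; the function name is made up.

```c
#include <stdint.h>

/* Converts a null-terminated ASCII coded decimal string, most significant digit first. */
uint16_t ascii_dec_to_u16(const char *s)
{
    uint16_t result = 0;
    while (*s != '\0') {                            /* 0 marks the end of the string */
        uint16_t digit = (uint16_t)(*s - '0');      /* '5' -> 5 */
        result = (uint16_t)(result * 10 + digit);   /* multiply by 10, then add */
        s++;
    }
    return result;
}
```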

Converting ASCII Coded Hex to Int


For the conversion from ASCII coded hex to int we first need some code that converts one ASCII character
(which represents one hex nibble) to its decimal value. 'A' -> 10. This can't (or shouldn't) be done with a
lookup table, because the ASCII value of 'A' is bigger than the size of efficient code doing the same thing.
Therefore, the lookup table would also be quite big. So, it's better to choose the case structure.
Then the second nibble is converted in the same way and the two nibbles are combined to form one int.

This conversion is done for the two nibble characters. These are then combined in one byte:

Maybe the nibbles have to be swapped again, depending on how the two nibbles were sent: If the high
nibble was sent first, the received byte can be left as it is: The high nibble was added, the nibbles were
swapped and then the low nibble was added.
Consequently, if the low nibble is sent first, the nibbles have to be swapped again.
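The following sketch shows one way this could look when the high nibble is sent first. The routine names
are made up for this example, receive_character is again a hypothetical routine that returns a character in
r16 (and must not change r17), and only '0'..'9' and upper case 'A'..'F' are handled:

```asm
; Sketch only: receive_character is hypothetical, result byte ends up in r17
receive_hex_byte:
    rcall receive_character   ; first character = high nibble
    rcall hex_char_to_nibble
    mov   r17, r16            ; nibble value is now in the low half of r17
    swap  r17                 ; move it to the high half
    rcall receive_character   ; second character = low nibble
    rcall hex_char_to_nibble
    or    r17, r16            ; combine both nibbles in one byte
    ret

hex_char_to_nibble:
    subi  r16, '0'            ; '0'..'9' -> 0..9, 'A'..'F' -> 17..22
    cpi   r16, 10
    brlo  hex_nibble_done     ; it was a digit, nothing more to do
    subi  r16, 7              ; 'A' (17) -> 10 ... 'F' (22) -> 15
hex_nibble_done:
    ret
```

The characters '3' and 'F' give 0x3F this way. If the low nibble is sent first, a final swap r17 puts the
nibbles in the right order.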

Converting ASCII Coded Binary to INT


This one is mean. Like with the conversion from ASCII coded decimal to int, we need the string length to
determine whether we're receiving an 8-, 16-, or even 32-bit number. If the string is sent most significant bit
first, this is no problem, but if the value is sent least significant bit first, we need to store the entire string
first (or need some neat trick). To determine the string end of a null-terminated string, the following check
is necessary:

I'll only discuss the case of the string being sent most significant bit first here.

What is to be done? The bit is received, converted to a 'real' bit (1 or 0) and then shifted into the result.
rcall receive_character   ; get the next character
tst character
breq end_of_string        ; if it is zero, the string ended
clc                       ; clear carry bit
subi character, '0'       ; convert character to 1 or 0
lsr character             ; shift least significant bit from converted character into carry bit
rol result                ; and rotate it into the result

What happens during the shift operations? These are most important for the conversion, as they combine
the result with the converted character.
The lsr instruction shifts the byte one place to the right. The bit that is shifted out is placed in the carry bit
and the most significant bit is replaced by 0.
The carry bit is then rotated (rol) into the result: The most significant bit of the result is placed in the carry
bit and the least significant bit is replaced by the old carry bit which we got from the character.
But how can we determine when to begin a new byte? We don't know the exact string length yet and
have to start a new byte after 8 bits have been received and shifted into the result. The code above has to
be altered so that it handles the highest possible number of bits that might be sent by the source of the
string.
This can be done with multiple rol instructions. One for each byte, beginning with the least significant one:
rol result_0   ; get new bit into lowest byte of result
rol result_1   ; rotate most significant bit of result_0 into result_1
rol result_2   ; rotate most significant bit of result_1 into result_2
rol result_3   ; and do the same with result_3

Now the bit from the character is shifted into the 32-bit result. Of course, the result can also be left 16 bits
wide.
Make sure that the result is initialised to 0 before starting to shift in bits: Some bits might not be overwritten
by the bits from the string!
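Put together, a complete loop for a 32-bit result could look like the sketch below. The register choice
(r23:r20 for the result) is just an assumption for this example, and receive_character is a hypothetical
routine that returns the next character in r16 without changing r20..r23:

```asm
; Sketch only: converts a null-terminated string of '0'/'1' characters, MSB first
ascii_bin_to_int32:
    clr   r20                 ; result_0 (least significant byte)
    clr   r21                 ; result_1
    clr   r22                 ; result_2
    clr   r23                 ; result_3 (most significant byte)
bin_loop:
    rcall receive_character   ; get the next character
    tst   r16
    breq  bin_done            ; zero terminator: the string ended
    subi  r16, '0'            ; convert character to 1 or 0
    lsr   r16                 ; move that bit into the carry bit
    rol   r20                 ; and rotate it through all four result bytes
    rol   r21
    rol   r22
    rol   r23
    rjmp  bin_loop
bin_done:
    ret                       ; result in r23:r22:r21:r20
```

Because the result is cleared first, a shorter string (for example 8 characters) simply leaves the upper
bytes at zero.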

Converting Binary Coded Decimal to Int


The BCD format is easily converted to integer.
A byte formatted as binary coded decimal contains two digits, which can be directly taken from one nibble
each:
0x22 means 22. So the conversion is clear just by looking at this example: Multiply the high nibble by 10
and then add the low nibble! If a BCD word is to be converted to int, just do the same: Take the high nibble
of the high byte, multiply by 1000, low nibble of high byte by 100, high nibble of low byte by 10 and
then add the low nibble of the low byte.
Here is an example code snippet that converts a BCD byte to int:
r16: BCD byte
r17: int result
r18: temporary register
r19: another temporary register
An AVR that supports the mul instruction has to be used for this example! If your AVR doesn't support it,
you have to either write a mul macro or multiply the high nibble by 10 another way.

mov r18, r16            ; copy BCD byte into temp register
cbr r18, 0b00001111     ; clear low nibble of temp register
swap r18                ; swap nibbles
ldi r19, 10             ; load r19 with 10 (for multiplication)
mul r18, r19            ; r1:r0 = r18 * r19
mov r17, r0             ; copy result of multiplication (low byte) into r17

mov r18, r16            ; again copy BCD byte into temp register
cbr r18, 0b11110000     ; clear high nibble (no swap needed this time)
add r17, r18            ; and add the low nibble to the result
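For AVRs without mul, here is a sketch of one possible way to multiply the high nibble by 10 using shifts
only. It relies on the masked byte already being the high nibble times 16, so one shift right gives
nibble*8 and two more give nibble*2. The register use follows the example above (r16: BCD byte,
r17: result, r18: temporary register):

```asm
; Sketch only: BCD byte in r16 -> int in r17, no mul needed
    mov   r17, r16
    andi  r17, 0xF0           ; keep high nibble: this is nibble * 16
    lsr   r17                 ; nibble * 8
    mov   r18, r17
    lsr   r18
    lsr   r18                 ; nibble * 2
    add   r17, r18            ; nibble * 8 + nibble * 2 = nibble * 10
    mov   r18, r16
    andi  r18, 0x0F           ; keep low nibble
    add   r17, r18            ; and add it to the result
```

With 0x22 in r16 this gives 16 + 4 + 2 = 22 in r17.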

The Arithmetic Logic Unit (ALU)


The ALU is what you might also know as the CPU on a normal computer. It is connected to the registers,
program memory (FLASH), SRAM, and internal peripherals. That's not completely correct (external
peripherals can also be connected to the ALU, such as external SRAM), but it's enough for now. Take a
look at your AVR's datasheet to get a more complete view of things. You can only write effective code if you
know how the peripherals work and which instructions can be performed by the ALU. The ALU is what you
basically write your code for. Loading a register with an immediate value is done by the ALU, writing that
value to an I/O port is done by the ALU as well, just as any other code.
The program counter is the ALU's connection to the program memory. It's basically just the address of the
instruction to be executed and is changed by jump or subroutine call instructions. Other instructions also
work on the program memory without changing the program counter, like lpm (load program memory).
The lower registers (0..15) can not be loaded by ldi and also don't support andi and some other
instructions. More information on this can be found in the AVRStudio help. But still the AVR ALU is very
powerful: Some other microcontrollers only have a few registers (often referred to as accumulators), which
slows down calculations if multiple byte values are handled and have to be stored in SRAM between
calculations. With the 32 registers provided by AVRs, this is no problem if the registers are used well.
The registers are connected to the SRAM of the AVR. All registers can be stored in SRAM and some can
also be used as pointers (registers 26..31) to handle indirect addresses, arrays and so on.
The internal peripherals are function blocks such as the UART (Universal asynchronous receiver and
transmitter), Timers, SPI (Serial Peripheral Interface), EEPROM and so on. These are described in their
own AVR Architecture section. Not all AVRs have all internal peripherals described here, and some have
even more of them. Refer to the datasheet of your AVR to get an overview.

SRAM and Flash


SRAM
The SRAM is the actual AVR memory. While the registers are used to perform calculations, the SRAM is
used to store data during runtime. In small projects, all variables can be kept in the registers and things are
easier. In order to use SRAM effectively, you need to know some things about the AVR address space,
the assembler directives and instructions operating on SRAM.
All memory locations can be reached by either direct or indirect addressing. Direct addressing is simpler,
so I'll explain it first.

When storing data to/loading data from a direct address you know exactly where the data is stored. When
using address 0x60 to store a byte value (let's call it "hour"), you can use the sts and lds instructions to
handle that data:
lds r16, hour
That's pretty simple. r16 is loaded with the SRAM contents at address 0x60. Same procedure for sts:
sts hour, r16
Indirect addressing is done similarly to using pointers in C or Pascal: The Index Register Pairs (r26:r27 are
called X, r28:r29 Y and r30:r31 Z) can be used to point at the AVR address space. If X (r26:r27) holds the
value 0x60, it will point to "hour" and can be used to handle that value. This is what the indirect addressing
instructions are made for:
ldi XL, 0x60   ; load r26 (XL) with low(0x60)
ldi XH, 0x00   ; load r27 (XH) with high(0x60)
ld r16, X      ; load r16 with value from X, which points to 0x60

Indirect loading of a register is pretty useless without indirect storing, so there's also a store instruction
which is used just like sts but with X, Y or Z instead of a direct address. It's called st.
It's now time to explain the address space in a bit more detail before proceeding with more advanced
load/store instructions.
The AVR address space consists of 3 major regions: The register file (r0..r31), I/O registers (Timers, UART
and so on) and internal SRAM. Here is a diagram showing how it's organised:

In this diagram you can see why the first SRAM address is 0x60. The AVR registers and I/O registers are
also located in the data space and occupy the low addresses. This has an advantage: You can access the
I/O registers via the index register pairs X, Y, and Z as well. The only thing to remember is that the I/O
addresses you can use with in and out don't work in this case, as in and out work with 0x00 to 0x3F (see
left column). In this case you must use 0x20 to 0x5F instead (see Address Space column). The working
registers can also be accessed using indirect addressing. The code below demonstrates the difference
between addressing I/O registers indirectly and writing to them using out.
ldi XL, 0x3B    ; load X with PortA data space address
ldi XH, 0x00
ld r16, X       ; load r16 with PortA value (ld, not lds: lds only takes a direct address)

out 0x1B, r16   ; write r16 to PortA

I hope this clarifies the difference...


In the diagram you see "RAMEND" at the end of the internal SRAM space. This value varies between the
different AVRs, as they all have different SRAM sizes. External SRAM can also be connected to some of
them. The External SRAM data space begins at RAMEND + 1.
The advanced SRAM instructions are very useful when it comes to handling strings or in any other case
when data is stored one element after the other (arrays, structures....). In this case, we can just tell the
AVR to not only handle data, but also to manipulate the index register that was used:
st X+, r16
This means that r16 is stored at the address X is pointing to. Then X is incremented by 1. It's also
possible to use pre-decrementing (st -X, r16).
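As a small illustration of post-incrementing, the sketch below clears a block of SRAM. The start address
(0x60) and the block length (16 bytes) are only example values:

```asm
; Sketch only: clear 16 bytes of SRAM starting at 0x60
    ldi   XL, low(0x60)       ; X points at the first byte of the block
    ldi   XH, high(0x60)
    ldi   r17, 16             ; number of bytes to clear (example value)
    clr   r16                 ; the value to store
clear_loop:
    st    X+, r16             ; store r16 at X, then increment X
    dec   r17                 ; one byte less to go
    brne  clear_loop          ; repeat until the counter reaches zero
```

After the loop X points at the first address behind the block (0x70 here).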
Flash
Flash is organised in words of 2 bytes. All instructions need one or more words of it, never single bytes.
That's nothing we have to care about.
But what if we wanted to use a lookup table in Flash? Here's an example of such a lookup table:
string:            ; the table's label (which can be used as an address)
.db "Hello!", 0    ; the table: A null-terminated string (Hello!)

The assembler will store the table in Flash. This will use up 4 words: 6 characters + 1 zero. That's 7 bytes.
The last one will be padded with another zero so that only whole words are used: 8 bytes. To get the
BYTE address of the table, its address has to be multiplied by 2. Let's load the table's address into the
index register pair Z:
ldi ZL, low(2*string)    ; load low reg with low byte of the address*2
ldi ZH, high(2*string)   ; load high reg with high byte of the address*2
string:                  ; the table's label (which can be used as an address)
.db "Hello!", 0          ; the table: A null-terminated string (Hello!)

SRAM And Flash Operations


The SRAM contents can't be changed as easily as the register contents, but there are a few powerful
instructions. You can either use direct addresses (like 0x60 or a label which has been assigned to an
address) or index registers to have indirect addressing.
Direct Addressing:

lds and sts load from and store to a direct address.


lds r16, 0x60 will load register 16 with the data stored at address 0x60 in SRAM.
sts 0x60, r16 will store the register 16 data at SRAM address 0x60.
Indirect Addressing:
I'll first explain the AVR's index register pairs. The index register pairs are registers 26 to 31, with register
pair 26:27 named X, 28:29 named Y and 30:31 named Z. These can be used as pointers to SRAM. The
instructions for index registers not only support loading/storing, but also act on the index register pair itself
if needed.
ld r16, X loads register 16 with data pointed at by X (r26:r27).
ld r16, X+ loads register 16 with data pointed at by X and also increments the register pair afterwards. This
is handy for processing strings or arrays of bytes. The index register pair will then point at the next address
(the next character of the string).
ld r16, -X decrements the register pair before loading r16 with data pointed to by X. If the register pair
pointed at 0x61, r16 will have the value of address 0x60. X will point at 0x60 after the instruction is
completed.
Y and Z support these instructions as well, but have more features. Y and Z can also be used with a
constant displacement (ldd and std): the data is loaded from the address the register pair points at plus a
constant between 0 and 63, and the register pair itself is not changed. If an array consists of 3-byte values,
Y can be used to handle the first byte of consecutive array elements in the following way:
ldd r16, Y+3 If Y points at array[0].byte0, this loads array[1].byte0 into r16. Y still points at array[0].byte0
afterwards. Do not try ld for this, it causes the assembler to give an error message (a good source for
errors which is not easy to find sometimes...)
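Here is a short sketch of displacement addressing on an array of 3-byte values. The label array and its
size are made up for the example. ldd leaves Y unchanged, so adiw is used to step to the next element:

```asm
; Sketch only: "array" is a hypothetical SRAM array of 3-byte elements
.dseg
array:  .byte 6               ; room for two elements of 3 bytes each
.cseg
    ldi   YL, low(array)      ; Y points at array[0].byte0
    ldi   YH, high(array)
    ldd   r16, Y+0            ; array[0].byte0
    ldd   r17, Y+1            ; array[0].byte1
    ldd   r18, Y+2            ; array[0].byte2 (Y is still unchanged)
    adiw  YL, 3               ; now let Y point at array[1].byte0
    ldd   r19, Y+0            ; array[1].byte0
```

std works the same way for storing, for example std Y+1, r17.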
The Z register has even more features: It can be used to work with Flash contents.
lpm loads register 0 with the Flash byte Z points at. As Flash is arranged (and addressed) in words (byte
pairs) you have to multiply constant addresses by 2 to get a byte address.
ldi ZL, low(2*label)
ldi ZH, high(2*label)
lpm
label:
.db "Hello world", 0
If the address is not multiplied by two and label is at byte address 0x60 (word address 0x30), Z will hold
0x30 and point at whatever code is stored at byte address 0x30 instead of at the table. I hope this clarified
the addressing problem. Other versions are
lpm r16, Z
lpm r16, Z+
which work like ld and st, but in Flash. The ATmega series also supports spm (store program memory).
This is an advanced feature and not as easy to use as lpm.
push and pop work on the Stack Pointer. The Stack points at SRAM as well and is also modified by
function call and return instructions (for a description of the stack pointer (SP) see the AVR Architecture
section). push stores a register on the stack and pop loads a value from the stack into a register. This
feature is usually used in ISRs (Interrupt Service Routines) to preserve registers also used in normal code.
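A sketch of how an ISR could preserve a register with push and pop. The label timer_isr is made up for
this example. Saving the status register (SREG) as well is not mentioned above, but it is common practice
because most instructions in the ISR change the flags:

```asm
; Sketch only: timer_isr is a hypothetical interrupt service routine
timer_isr:
    push  r16                 ; save r16, it is also used by the normal code
    in    r16, SREG           ; read the status register...
    push  r16                 ; ...and save it too
    ; (the actual work of the ISR goes here and may use r16)
    pop   r16                 ; restore in reverse order: first the flags
    out   SREG, r16
    pop   r16                 ; then the original r16
    reti                      ; return from interrupt
```

Every push has exactly one matching pop, in reverse order.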

Lookup Tables

Lookup tables are an easy way to convert numbers to other formats and they also provide a way to make
calculations faster by providing basic values. A sine table could look like this (see second example below
for an AVR assembler version of this):
Angle:        10   20   30   40   50   60   70   80   90
Sine (*100):  17   34   50   64   77   87   94   98  100

This table is pretty rough and won't give you a very good result if you really need to calculate values based
on a sine. On the other hand this table would only use 8 bits for the lookup value and doesn't even really
use the 8 bit range as it could: By multiplying the sine of an angle by 200 we would still move within the 8
bit range while having a higher precision: sine(60) * 200 is 173, not 174 as the table above would
give us (when multiplied by two). With 16 bits the precision is fairly good, as we can even multiply the sine
by 40,000 for storing it in a table! When the values have to be used however we still need to keep in mind
that the AVR needs them for calculations: A multiplication by 2 or 4 is good as we can just shift the result
right one or two places for dividing it again. But that's not the problem for now....
Lookup tables are usually stored in program memory using the .db or .dw directives and have their own
label for addressing:
sine_table:
.db 0, 17, 34, 50, 64, 77, 87, 94, 98, 100
The label is used to have an index register pair point at the table. As lpm reads through Z, the index
register pair used has to be Z. Assuming the angle we need the sine of is 40, the following code returns
sine(40) * 100:
ldi ZL, low(2*sine_table)    ; load BYTE ADDRESS (word address*2) of the table into Z
ldi ZH, high(2*sine_table)
ldi r16, 4                   ; load r16 with 4 (which is the offset to the sine of 40)
ldi r17, 0                   ; load a dummy with 0 for 16-bit addition
add ZL, r16                  ; add the offset to the sine table base address
adc ZH, r17
lpm                          ; and load sine(40)*100 from program memory into r0
This example was fairly simple, but it shows how to get a value from a table we first made up in program
memory. If multiple values are to be read from the table, the AVR has two powerful instructions for us: adiw
(add immediate to word) and sbiw (subtract immediate from word). These only take the lower register of a
word as an argument and can only operate on r24, r26, r28 and r30 (which includes X, Y and Z). The
advantage of these instructions over normal 16-bit additions and subtractions is that they don't need any
registers for holding the add/sub value (a constant from 0 to 63). The example above needs those registers,
as the value we want to add is not known at the time when we're writing the code.
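If the offset is already known when writing the code, the lookup can be sketched with adiw instead. The
routine name get_sine_40 is made up for this example:

```asm
; Sketch only: read sine(40)*100 with a constant offset, no extra registers needed
get_sine_40:
    ldi   ZL, low(2*sine_table)   ; load BYTE ADDRESS of the table into Z
    ldi   ZH, high(2*sine_table)
    adiw  ZL, 4                   ; add the constant offset 4 to Z (r31:r30)
    lpm                           ; load sine(40)*100 from program memory into r0
    ret

sine_table:
    .db 0, 17, 34, 50, 64, 77, 87, 94, 98, 100
```

The table entry at offset 4 is 64, so that is what ends up in r0.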
On some AVRs lpm can also load the program memory contents to a register different than r0 and can
post-increment Z. Possible ways of using lpm are:
lpm
lpm r16, Z
lpm r16, Z+ (while r16 can in both cases be replaced by any other register)
This makes usage of adiw or sbiw not necessary and saves code space on the devices which support the
lpm rd, Z+ instruction. The AT90S1200 doesn't have lpm at all, the 2313 only supports a bare lpm. Look at
your device's specific instruction set for details.

Working With Strings


Strings in Flash

Strings can be stored in Flash using the .db directive. At runtime, they can be loaded from Flash by using
the lpm (Load Program Memory) instruction in order to work with them. There are many ways to write
routines to transfer them to SRAM, and some of them are really cool. Basically, you need to write a loop
that loads the character from Flash and processes it (by transferring it to SRAM or something else
depending on the application) until the terminating character is reached (usually zero).
The following example uses a routine to move a string from Flash (Z) to a known memory location (Y) and
returns a pointer (Y) to the string. The address of the string in SRAM shall be 0x100 after the transfer. The
terminating character (zero) shall not be cut off the string.
transfer_string:            ; this is the label we call to transfer a string from Flash (Z) to SRAM (Y)
push YL                     ; first, we save the SRAM pointer so that we can return it again
push YH
clr r1                      ; as r0 doesn't support cpi, we need a zero register for the compare, which is r1
transfer_loop:              ; this is a do...while loop:
lpm                         ; load character from Flash to r0
adiw ZL, 1                  ; increment Flash pointer
st Y+, r0                   ; store in SRAM and increment Y
cp r0, r1                   ; check if terminator reached
brne transfer_loop          ; if not, go back to the loop
pop YH                      ; restore SRAM pointer
pop YL
ret                         ; and return.
;Usage:                     ; this is how you use this routine:
ldi YL, low(0x100)          ; load Y with destination address
ldi YH, high(0x100)
ldi ZL, low(2*mystring)     ; load Z with source address
ldi ZH, high(2*mystring)
rcall transfer_string       ; call the routine
mystring:                   ; this is our string in flash:
.db "Hello", 0              ; "Hello" with terminating zero

Not all AVRs support lpm rd, Z+ (load program memory to register and post-increment Z), so I used lpm
together with a separate add immediate to word (adiw). Check the device specific datasheet of your AVR
to see if it supports lpm rd, Z+.
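On a device that does support lpm rd, Z+, the same routine could be sketched as follows. Using r16 as the
working register is an assumption of this example (it gets overwritten). The zero register is no longer
needed, because tst works on any register:

```asm
; Sketch only: transfer_string for devices that support lpm rd, Z+
transfer_string:
    push  YL                  ; save the SRAM pointer so we can return it again
    push  YH
transfer_loop:
    lpm   r16, Z+             ; load character from Flash and increment Z
    st    Y+, r16             ; store in SRAM and increment Y
    tst   r16                 ; terminator reached?
    brne  transfer_loop       ; if not, go back to the loop
    pop   YH                  ; restore SRAM pointer
    pop   YL
    ret
```

The loop is two instructions shorter per character than the lpm/adiw version.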
Making it More Elegant
There are of course more elegant ways to implement this. A design note (#043) on www.avrfreaks.net by
Kelly Small shows how to use the stack to get the source address of the string. The
string is stored in Flash directly after the routine call instruction:
rcall transfer_string
.db "Hello", 0
The cool thing about this is that the string can be found where it is used by the code. The example above
would most probably use a string that is stored together with some other strings that might be a few pages
away.
When the routine is called, the rcall instruction will store the address of the NEXT instruction (in this case
the string). Therefore, the address of the string can just be popped off the stack. When the string has been
processed and Z points at the next data byte after the zero, this will be the next address after the string,
where normal code can follow (read the design note for a better explanation). So Z can be pushed onto the
stack and ret will return to it.
rcall stores the low byte of the return address first, so popping has to be done in high byte, then low byte
order. As the return address is a word address, it has to be multiplied by 2 to get a byte address for lpm,
which can be done by shifting the address left once. For devices with 128KB of program memory, elpm
has to be used then (together with RAMPZ).
Messing around with return addresses is quite dangerous and has to be done carefully to avoid serious
errors. The design note mentioned above does not mention why it's safe, so I'll explain that here. First of
all, the routine has to be explained though:
The return address is pushed onto the stack by the call instruction. The routine pops it from the stack and
multiplies it by two to make it a byte address:
pop ZH
pop ZL
lsl ZL
rol ZH
Z now points at the first character of the string, which can now be processed by the process_string routine
in a do...while loop that post-increments Z after every load-'n-process. The loop in this example is slightly
different from the one above (uses r16). process_string uses the data in r16 to process it (send to UART,
LCD, whatever).
read_string:
lpm
mov r16, r0
adiw ZL, 1
rcall process_string
cpi r16, 0
brne read_string
Once the string is processed, the routine has to finish its job by translating the address Z is pointing at to a
word address again. Then the return address has to be pushed onto the stack again (for the ret
instruction). BUT: What if the string had an even number of characters? Then the overall length of the
string including the terminating zero is odd, so that Z still points into the string data (at the padding byte).
The following two drawings assume that the instruction following the string is "inc r16":

The second zero in the second example is added by the assembler (padding to achieve even number of
bytes in the .db directive). As the routine stops upon reaching the first zero, Z will point to a zero in the
second example and not at the increment instruction.
After the string has been processed, the routine will (as written above) push the return address onto the
stack again. If Z is now shifted one place right (division by 2 to convert the byte address to a word address
for ret), it will (second example) point to the first zero again and this word (two zeros) will be executed after
the return, but that's not too bad as the opcode 0x0000 is a nop. So any string with an even number of
characters plus the terminating zero will result in an additional nop being executed.
Therefore, the routine can end with
lsr ZH
ror ZL
push ZL
push ZH
ret
The three code snippets pasted together work without problems, just process_string has to be added by
you. Here's a working example with comments: flash_string.asm
This example stores the string at the memory address Y points to and works well in the simulator.

Memory Copy Routines

Copying data from one memory location to the other can cause serious headaches. When only one byte is
copied, that's no problem. Assuming the source pointer is X and the destination pointer is Y, we can just
load a register from X and save it to Y:
ld r16, X
st Y, r16
When more than one byte (e.g. a string) has to be copied/moved, things can get more difficult. If the
source and destination memory areas don't overlap, the copy routine is still fairly simple. The following
routine copies from X to Y, while the number of bytes to be copied is in r16. To ensure that even 0 bytes
can be copied, the loop first checks for r16=0, then copies and post-decrements. When everything is
finished, the number of bytes copied is subtracted from both pointers so that they each point to the first
byte of the data block again:
copy_mem:               ; routine to call, X=source, Y=destination, r16=length
mov r17, r16            ; save number of bytes to be copied
copy_mem_loop:          ; the loop label
tst r17                 ; if no data to be copied/left, end the routine
breq end_copy_mem
ld r18, X+              ; load data from X and post-inc
st Y+, r18              ; store at Y and post-inc
dec r17                 ; decrement number of bytes left
rjmp copy_mem_loop      ; and loop
end_copy_mem:           ; this is what we do when we've finished
sub XL, r16             ; subtract number of bytes copied from X
sbci XH, 0
sub YL, r16             ; and Y
sbci YH, 0
ret                     ; and then return.

A disadvantage of the above is that a zero-copy call needs more cycles than actually needed, because it
subtracts 0 from both pointers. With an extra label just before the return instruction and an extra test
before copy_mem_loop this can be improved at the cost of code space.
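One possible way to write that improved version is sketched below. The label copy_mem_ret is made up for
this example. As the length is known to be non-zero once the loop is entered, the loop can also test at its
end, which saves the rjmp:

```asm
; Sketch only: X=source, Y=destination, r16=length, zero-length calls return at once
copy_mem:
    tst   r16                 ; anything to copy at all?
    breq  copy_mem_ret        ; no: return without touching the pointers
    mov   r17, r16            ; save number of bytes to be copied
copy_mem_loop:
    ld    r18, X+             ; load data from X and post-inc
    st    Y+, r18             ; store at Y and post-inc
    dec   r17                 ; decrement number of bytes left
    brne  copy_mem_loop       ; and loop while bytes are left
    sub   XL, r16             ; subtract number of bytes copied from X
    sbci  XH, 0
    sub   YL, r16             ; and Y
    sbci  YH, 0
copy_mem_ret:
    ret
```

Like the routine above, this one is only safe if the source and destination areas don't overlap.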
The Problem...
...is that during runtime it can happen that the source space and the destination space overlap. Assume
that an array of 10 bytes has to be moved, the source being at 0x65 and the destination being at 0x60.
That's no problem:

"A" is copied to 0x60 and so on without problems because only the addresses from 0x65 upwards overlap.
Where we had "A" before, the routine will now copy the "F" from address 0x6A and "A" will be overwritten.
That's not bad because we already copied it to its destination.
Now swap source and destination: The array has to be copied from 0x60 to 0x65. What happens? The
result is "ABCDEABCDE" - why?
Remember that the copy routine uses load - post-increment(source) - store - post-increment(destination).
Here's a diagram of the memory contents before copying the block:

When "A" is copied from 0x60 to 0x65, "F" (which we want to copy later) will be overwritten. The result is
that at the address where the "F" was, an "A" will be read and stored at the "F" destination address.
I've made up an example asm file which can be simulated in AVR Studio. Before running it please put a
10-byte array into sram beginning at address 0x65. The code will copy it to 0x60 and then back to 0x65
which will result in the error described above.
A mem copy routine that is supposed to correctly copy the data has to be able to handle the two cases
above: First, it has to use post-increment addressing if the two blocks overlap and the source is at a higher
address than the destination (first diagram). Second, it can happen that they overlap and the source is at a
lower address. Then, pre-decrement addressing has to be used (second diagram).
Testing for all these cases requires more code and time than necessary - the only information required is if
the source address is higher or lower than the destination address: If the source address is lower,
pre-decrement addressing has to be used. If the two blocks overlap we're on the safe side then. Otherwise
(source address > destination address; first diagram) post-increment addressing can be used. Here's the
complete one:
copy_mem:
ldi r18, 0                  ; create zero register
mov r17, r16                ; save block size
cp XL, YL                   ; source >= destination?
cpc XH, YH
brsh copy_mem_inc_loop      ; if so, use incrementing addressing
add XL, r17                 ; for pre-decrement addressing, we have to copy from top to bottom
adc XH, r18                 ; so we have to add the blocksize to the two pointers
add YL, r17
adc YH, r18
copy_mem_dec_loop:          ; here's the loop:
tst r17                     ; first, check if the copy is done
breq end_copy_mem_dec       ; if so, return
ld r18, -X                  ; load from source
st -Y, r18                  ; store at destination
dec r17                     ; and decrement the to-be-done counter
rjmp copy_mem_dec_loop      ; loop
end_copy_mem_dec:           ; when the whole thing is finished, we can just return, because the
ret                         ; pointers were pre-decremented to their original value
copy_mem_inc_loop:          ; in case the source address is not smaller than the destination
tst r17                     ; address, we can use post-inc. Again: the copy-done-check
breq end_copy_mem_inc
ld r18, X+                  ; get data...
st Y+, r18                  ; ...store it again
dec r17                     ; and decrement the counter
rjmp copy_mem_inc_loop      ; loop
end_copy_mem_inc:           ; when the copy has been finished, we have to subtract the block size
sub XL, r16                 ; from the two pointers again, because this time we used post-inc
sbci XH, 0
sub YL, r16
sbci YH, 0
ret                         ; return.

As always, there are ways to make this routine faster. A 2313 for example only has 128 bytes of SRAM which
only have 8-bit addresses, so all 16-bit calculations can be converted to 8-bit. If only even numbers of
bytes have to be copied, two can be subtracted from the loop counter in every run if two bytes are copied
(load-store-load-store -> loop).

The Stack
The Stack is used by the ALU to store return addresses from subroutines.
Imagine you can't remember where you just left. You'd have to write down where you left and, if you're
visiting several locations, put the notes onto a stack. Your stack pointer tells you where that stack is. A
microcontroller is just doing that - when a subroutine is called, it leaves the place in flash where it was just
working and saves the return address on the stack.
The Stack needs a stack pointer (SP) and space in SRAM (the stack pointer must point above the first
SRAM address). When a return address is stored, the SP is post-decremented (!!!!!!). In other words: The
stack is growing towards smaller SRAM addresses. The biggest stack possible is initialised to RAMEND. It
can then grow all the way down to the first SRAM address.
Here's a table/diagram/figure/whatever of how the stack is changed by rcall and ret.
.org 0x00
ldi r16, low(RAMEND)     ; the SP can't be loaded with ldi directly, so r16 is used
out SPL, r16
ldi r16, high(RAMEND)
out SPH, r16
rcall subrtn_1           ; this rcall is at address 0x0004
.org 0x100
subrtn_1:
rcall subrtn_2
ret
.org 0x140
subrtn_2:
ret

          Stack value   SP value   Comment
layer 0:  ---           SP = ???   Stack before init
layer 1:  ---
layer 2:  ---

Then the SP is set to RAMEND:

          Stack value   SP value   Comment
layer 0:  ---           <- SP      Stack after init
layer 1:  ---
layer 2:  ---

Stack state after rcall subrtn_1:

          Stack value   SP value   Comment
layer 0:  0x0005                   return address
layer 1:  ---           <- SP      SP=SP-1
layer 2:  ---

Stack state after rcall subrtn_2:

          Stack value   SP value   Comment
layer 0:  0x0005                   return address
layer 1:  0x0101                   return address
layer 2:  ---           <- SP      SP=SP-1

When the return is executed, the return address is popped from the stack and the SP is incremented. In
the example, when returning from subrtn_2, the micro jumps to 0x101 (the ret instruction in subrtn_1) and
the Stack Pointer points to stack layer 1 again. I didn't make a table for that as it should be easy to
understand now.

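The stack can also be used to pass arguments to subroutines using push and pop. Keep in mind that rcall
puts the return address on top of the arguments, so the subroutine has to take it off the stack first and
put it back before ret. If a subroutine has a 16-bit argument, passing it would look like this (assuming a
2-byte return address):
push r16                ; push 16-bit argument r16:r17
push r17
rcall set_TCNT1         ; and call the subroutine

set_TCNT1:              ; our subroutine writes its 16-bit argument to the Timer 1 counter register
pop r19                 ; the return address is on top of the stack, so take it off first
pop r18                 ; (high byte, then low byte; r18 and r19 are used as temporary registers)
pop r17                 ; now pop the argument from the stack (reversed order!)
pop r16
push r18                ; put the return address back for ret (low byte first)
push r19
out TCNT1H, r17         ; and use the argument
out TCNT1L, r16
ret                     ; now it returns.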

It's important to keep the push and pop instructions balanced to each other. If a value is pushed on the
stack as an argument followed by a subroutine call, the next ret can result in unexpected behavior if the
subroutine popped too many or no arguments at all. One push, one pop. This bug is often hard to find.
Why can't the subroutine just use r16:r17 instead of the stack as a base for passing arguments? Good
question. By using the stack, you can use any register to push the value on the stack. You're not limited to
r16 and r17. You can also push an argument and then use the registers to calculate the next one (file
systems for example need lots of registers for calculations). You can also use a heap to pass arguments.
This has the advantage that you can't mess up your return addresses.
Let's take a closer look at how the return address is stored on the stack by simulating it in AVR Studio. I've
not included images of this in order to save space, but it's quite simple. This is the code for finding out how
return addresses are pushed on the stack:
.include "2313def.inc"
.org 0x0000
rjmp reset               ; reset interrupt vector
reset:                   ; initialisation:
ldi r16, low(RAMEND)     ; stack pointer to RAMEND
out SPL, r16
rcall dummy              ; this will push 0x0004 on the stack (note 1)
.org 0x0123
dummy:                   ; first dummy routine
rcall dummy2             ; address on stack: 0x0124 [Break Point]
ret                      ; the ret is at address 0x0124
dummy2:                  ; second dummy routine
ret                      ; [Break point]

note 1: rcall dummy will push 0x0004 on the stack because there are 3 instructions before it that use one
word of code space each (rjmp; ldi; out; + rcall) so the next address after the subroutine call instruction is
0x0004.
The simulator is set up as follows: 2313 @ 1MHz, one memory window (Data) for viewing SRAM contents.
Now run the code. After the first break the SRAM will hold 0x04 at address 0xDF and 0x00 at address
0xDE. That means that the low byte of the address (which is 0x04) is at the higher address.
After the second rcall (second break) the return address to dummy's ret is also pushed on the stack: 0x24
at address 0xDD and 0x01 at address 0xDC.
The low address byte is pushed first, as the simulation shows. If you wanted to do calculations on that
address, you'd have to pop the high byte first. Beware: Messing with the stack is not easy and should be
done with caution!

Subroutines
Subroutines are code segments you can call and return from. That's cool, because you can reuse the code
from every point in your program without wasting program space. For subroutines to work, some
preparation is needed.

In order to know where to store and find return addresses (where to go on when returning from a
subroutine), we need to set up the Stack Pointer (SP). When a return address is stored, the SP is set to the
location before the stored address, so setting the SP to the last SRAM location (RAMEND) for initialisation
is (in most cases) best (see upper part of image). In the lower part of the image you can see how the
address is stored. If external memory is connected, the SP should be set to the last internal address for
speed reasons (accessing external SRAM takes longer).
ldi r16, low(RAMEND)    ;load r16 with low byte of last sram address
out SPL, r16            ;setup SP low byte
ldi r16, high(RAMEND)   ;same with high byte
out SPH, r16

RAMEND is defined in the micro's include file you get with AVR Studio and equal to the last available
internal SRAM address.
A subroutine begins with a label which is the subroutine's name. The following example routine writes the
value of r16 to PortA and then returns:
out_PortA:              ;this is the routine's name
out PortA, r16          ;output r16 to PortA
ret                     ;return to code where this routine was called from
main:                   ;again, a label
rcall out_PortA         ;relative call subroutine
rjmp main               ;repeat forever

In the example above rcall is used. This instruction jumps to a relative address, is 2 bytes long and
needs 3 cycles for execution. The disadvantage is that the subroutine has to be located within +/- 2k words
of the rcall. Another possible instruction is call. This instruction jumps to an absolute address and therefore
needs more code space: 4 bytes, 4 cycles. It can reach the whole code space, which is important in devices
with more than 8kB of program space. The 8k AVRs only need rjmp and rcall, as all addresses can be reached
with +/- 2k word jumps.
In the Advanced Assembler section you will find an introduction to icall. It uses an address stored in the Z
register pair to call a subroutine.

Interrupts
Interrupts, as the name suggests, interrupt the normal program flow. When an interrupt occurs, the CPU
jumps to the corresponding interrupt vector and executes the code at that address. As the interrupt vectors
are each only one word long (classic AVRs; two words for some megas), you'd usually put a jump instruction
there which goes to an Interrupt Service Routine.
The Interrupt vectors start at address 0x0000. The very first one (at 0x0000) is the reset vector. When a
reset (internal or external) occurs, this is where the program counter will be set to. That's why almost all
programs begin with
.org 0x0000             ; tell the assembler that the following is supposed to start at address 0x0000.
rjmp reset              ; At 0x0000, jump to "reset"
;(maybe something       ; other interrupt vectors can be put here, as well as any other code
;in between...)
reset:
...                     ; do this after a reset occurs

Other interrupt vectors will follow the reset interrupt vector. The first ones are the external interrupt lines
(INT0, INT1 and so on), then there are timers, UART and other peripherals. Every AVR datasheet has an
"Interrupts" section somewhere which will include a list of the available interrupts and their vector
addresses. If the table is not entirely filled, you can use single .org statements to set the program counter
of the assembler to the right interrupt vector address instead of filling up the table with other useless code.
Here are two examples for the 8515 doing the same thing:
.org 0x0000             ; reset vector address (0x0000)
rjmp reset              ; upon reset, jump to "reset"
rjmp Ext_Int0           ; external interrupt 0 vector address (0x0001)
rjmp Ext_Int1           ; external interrupt 1 vector address (0x0002)
reti                    ; (timer 1 capture event)
reti                    ; (timer 1 compare match A)
reti                    ; (timer 1 compare match B)
reti                    ; (timer 1 overflow)
reti                    ; (timer 0 overflow)
reti                    ; SPI transfer complete
rjmp UART_RxC           ; UART Receive Complete vector address (0x0009)

.org 0x0000             ; reset vector address (0x0000)
rjmp reset              ; upon reset, jump to "reset"
rjmp Ext_Int0
rjmp Ext_Int1
.org 0x0009
rjmp UART_RxC

So why do some people use the first version? The second one is shorter and, if many interrupt sources are
available (have a look at the mega128!), better to look at if only a few are used.
The first one is safer. If an interrupt occurs (by error) that has no instruction at its vector address, the
next valid one will be called. So if in the second table the SPI transfer complete interrupt occurs for some
unknown reason, the UART_RxC ISR is called. Not good.
Interrupts can occur at any time (unless the Interrupt Enable Bit in the SREG is cleared). Consequently
they can also occur if the code is just doing some calculations. These calculations change flags in the
status register and are used for the next step of the calculation, or some branch. If the ISR is also
changing flags in SREG (for example by testing a register for zero) it can corrupt the calculation that is
taking place in the normal application. That's why ISRs should take some precautionary steps:
- Preserve the status register (calculation flags might be corrupted in ISR)
- Preserve any registers it uses (as long as they're used in main code as well)
Of course you can also skip all that, given that the following is ensured:
- The ISR doesn't change any status flags
- The ISRs are given dedicated working registers which are not used in the main code.
An ISR will only be called if its corresponding interrupt enable bit is set AND if the global interrupt enable
bit is set. This gives you the possibility to select the interrupts you allow. The following flow chart might
clarify things a bit more:

If interrupts are not wanted during a particular code segment (when doing time critical stuff or calculations),
just disable the Global Interrupt Enable Bit (GIE bit) in SREG.
When an ISR is called, the GIE bit is cleared, so that no int can interrupt the ISR. ISRs should return with
reti instead of ret, as reti reenables the GIE bit automatically.
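The precautions above can be sketched as follows. This is only a minimal outline, assuming r16 is the only register the ISR touches; the labels some_isr and critical are made up for illustration.

```asm
some_isr:               ; hypothetical ISR label
push r16                ; preserve the working register
in r16, SREG            ; copy the status register ...
push r16                ; ... and keep the copy on the stack
; (the actual interrupt work goes here and may use r16 and change flags)
pop r16
out SREG, r16           ; restore the flags
pop r16                 ; restore r16
reti                    ; return and set the global interrupt enable bit again

critical:               ; hypothetical routine in the main code
cli                     ; clear the global interrupt enable bit
; (time critical instructions go here)
sei                     ; allow interrupts again
ret
```

Note that the sei in this sketch switches interrupts on unconditionally. If they might have been off before the routine was called, save SREG before the cli and restore it afterwards instead.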

I/O Ports
The AVR I/O Ports are pretty simple to understand. They have a Port register, a Data Direction register and
a Pin register. These are part of every I/O port in every AVR. Here's a drawing of their basic functionality:

As you can see, there's an internal pull-up for every pin. It can be activated by setting the DDR bit of the
pin to 0 and the Port bit to 1. A cleared DDR bit means that the pin is an input pin. So the pin is
disconnected from the Port register (see the driver in the drawing?) and the pin is floating. In this case the
Port bit controls the pull-up. I have not drawn boxes for writing/reading the data direction bit, because it
would only make the drawing more complex.
The drawing is not only there to describe the pull-up, but also to explain a mistake many
people make, even experienced programmers, just because it doesn't "hit the eye":
When the actual state of a port pin is needed (which is not necessarily the Port bit value), often the Port bit
is read instead of the Pin bit (by mistake). The Pin bit is directly connected to the physical pin. The Port bit
can be disconnected from the pin via the data direction register. So if you have problems with your I/O
code, check for this mistake first.
Here's a table with the possible Port/DDR combinations and what they do to the pin:
               DDR bit = 1    DDR bit = 0
Port bit = 1   High           pull-up
Port bit = 0   Low            floating

Reading from/writing to the ports can be done bit-wise or byte-wise (whole port), on Pin, Port, and Data
Direction registers.
The drawing above is just a simple one. As many Port pins have special functions, their values are also
controlled by the internal peripherals, like the pins of the UART or SPI. These are more complex and can
be looked at in the datasheets.
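As a small illustration of the Port/DDR/Pin relationship, here is a sketch that turns PortD.2 into an input with pull-up and then reads the real pin state. It assumes a 2313 (where DDRD, PortD and PinD can all be reached by the bit instructions); the labels are invented for the example.

```asm
cbi DDRD, 2             ; DDR bit = 0: the pin is an input
sbi PortD, 2            ; Port bit = 1: the pull-up is switched on
nop                     ; one spare cycle before reading the pin (a cautious assumption)
sbic PinD, 2            ; read the Pin register, not Port: skip next if the pin is low
rjmp pin_is_high        ; hypothetical label
pin_is_low:             ; hypothetical label: something pulls the pin to ground
; (react to the low level here)
pin_is_high:
; (code for the high level, or common code, continues here)
```

Reading PortD instead of PinD at this point would always return a 1 for bit 2, because that is simply the pull-up setting written a few lines earlier.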

I/O Instructions
The simplest I/O instructions are in and out.
in reads the value of an I/O Port or internal peripheral register (Timers, UART and so on) into a register.

Example: in r16, PinD


out writes the value of a register to an I/O Port or an internal peripheral register.
Example: out PortD, r16
If only one bit of an I/O register is needed, the AVR has some bit instructions: sbi and cbi for manipulating
single bits of an I/O register (though these instructions don't operate on all I/O registers) and sbic and sbis
for testing bits of an I/O register.
sbi PortD, 7            ; set bit 7 of PortD (same for cbi)
sbic PortD, 7           ; skip the next instruction if bit 7 of PortD is cleared
rjmp bit_is_set
sbic/sbis (skip if bit in I/O register is cleared/set) skips the next instruction depending on the I/O bit's state.
In the example above I added a relative jump (rjmp) to show you how you could use this instruction. In this
case, the mcu will jump to bit_is_set if bit 7 of PortD is not cleared (->set) and proceed with the instruction
following the relative jump if that bit is cleared.
As I mentioned above, sbi and cbi don't operate on all I/O registers. The same is true for sbic/sbis. These
can only be used for "classic" I/O Ports and other peripheral registers with addresses from 0 to 31 (0x00 to
0x1F).
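For I/O registers above 0x1F the usual workaround is to go through a working register. The following sketch assumes a 2313, where TIMSK sits at I/O address 0x39, and uses the bit name TOIE1 from 2313def.inc; the label bit_is_clear is made up.

```asm
in r16, TIMSK           ; read the whole register
ori r16, (1<<TOIE1)     ; set one bit, leave the others alone
out TIMSK, r16          ; write it back (replaces "sbi TIMSK, TOIE1", which is not possible)

in r16, TIMSK           ; testing works the same way:
sbrs r16, TOIE1         ; skip the next instruction if the bit in r16 is set
rjmp bit_is_clear       ; hypothetical label
; (bit is set, continue here)
bit_is_clear:
```

Unlike sbi, this read-modify-write sequence takes three instructions, so an interrupt can occur in between. If an ISR changes the same register, disable interrupts around it.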

Timers
The AVR has different Timer types. Not all AVRs have all Timers, so look at the datasheet of your AVR
before trying to use a timer it doesn't have... This description is based on the AT90S2313.
I will only describe the "simple" timer modes all timers have. Some AVRs have special timers which
support many more modes than the ones described here, but they are also a bit more difficult to handle,
and as this is a beginners' site, I will not explain them here.
The timers basically only count clock cycles. The timer clock can be equal to the system clock (from the
crystal or whatever clocking option is used) or it can be slowed down by the prescaler first. When using the
prescaler you can achieve greater timer values, while precision goes down.
The prescaler can be set to 8, 64, 256 or 1024 compared to the system clock. An AVR at 8 MHz with a
timer prescaler of 1024 can count (when using a 16-bit timer) (0xFFFF + 1) * 1024 clock cycles = 67108864
clock cycles, which is 8.388608 seconds. As the prescaler increments the timer every 1024 clock cycles, the
resolution is 1024 clock cycles as well: 1024 clock cycles = 0.000128 seconds. Without the prescaler, the
resolution is 0.125 microseconds (one clock cycle) and the range is 0.008192 seconds. It's also possible to
use an external pin for the timer clock or stop the timer via the prescaler.
The timers are realized as up-counters. Here's a diagram of the basic timer hardware. Don't panic, I'll
explain the registers below.
The 8-bit Timer:

The 8-bit timer is pretty simple: The timer clock (from System Clock, prescaled System Clock or External
Pin T0) counts up the Timer/Counter Register (TCNT0). When it rolls over (0xFF -> 0x00) the
Timer/Counter 0 Overflow Flag (TOV0 in TIFR) is set. If the corresponding bit in TIMSK (Timer
Interrupt Mask Register) is set (in this case the bit is named "TOIE0") and global Interrupts are enabled,
the micro will jump to the corresponding interrupt vector (in the 2313 this is vector number 7, at address 0x0006).
The 16-bit Timer
is a little more complex, as it has more modes of operation:

Register Overview:



The register names vary from timer to timer and from AVR to AVR! Have a look at the datasheet of the AVR
you're using first.
Important note: All 16-bit registers can only be accessed one byte at a time. To ensure precise timing,
a temporary register (TEMP) is used when accessing the timer registers.
Write:
When writing the high byte (e.g. TCNT1H), the data is placed in the TEMP register. When the low byte is
written, the data is transferred to the actual registers simultaneously. So the high byte must be written first
to perform a true 16-bit write.
Read:
When reading the low byte, the high byte is copied from TCNT1 into the TEMP register at the same moment
and can be read afterwards. So when reading, access the low byte first for a true 16-bit read.
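In code, the two access orders look like this. The value 1000 and the choice of r17:r16 are arbitrary; the register names are the ones from 2313def.inc.

```asm
; 16-bit write: high byte first
ldi r17, high(1000)
ldi r16, low(1000)
out TCNT1H, r17         ; goes into TEMP only
out TCNT1L, r16         ; now both bytes are written to TCNT1 together

; 16-bit read: low byte first
in r16, TCNT1L          ; reads the low byte and copies the high byte to TEMP
in r17, TCNT1H          ; reads the copy from TEMP
```

The same order applies to the other 16-bit timer registers. If an ISR also reads or writes them, it is safer to disable interrupts around the two instructions, because the temporary register is shared.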
TCNT
Most important is the Timer/Counter Register (TCNT1) itself. This is what all timer modes are based on. It
counts system clock ticks, prescaled system clock ticks or edges on the external pin.
TCCR
The Timer/Counter Control register is used to set the timer mode, prescaler and other options.
TCCR1A:
Bit 7                                               Bit 0
COM1A1  COM1A0  ---  ---  ---  ---  PWM11  PWM10

Here's what the individual bits do:


COM1A1/COM1A0: Compare Output Mode bits 1/0; These bits control if and how the Compare Output pin
is connected to Timer1.
COM1A1  COM1A0  Compare Output Mode
0       0       Disconnect Pin OC1 from Timer/Counter 1
0       1       Toggle OC1 on compare match
1       0       Clear OC1 on compare match
1       1       Set OC1 on compare match

With these bits you can connect the OC1 Pin to the Timer and generate pulses based on the timer. It's
further described below.
PWM11/PWM10: Pulse Width Modulator select bits; These bits select if Timer1 is a PWM and its
resolution from 8 to 10 bits:
PWM11  PWM10  PWM Mode
0      0      PWM operation disabled
0      1      Timer/Counter 1 is an 8-bit PWM
1      0      Timer/Counter 1 is a 9-bit PWM
1      1      Timer/Counter 1 is a 10-bit PWM

The PWM mode of Timer1 is described below.


TCCR1B:
Bit 7                                             Bit 0
ICNC1  ICES1  ---  ---  CTC1  CS12  CS11  CS10

ICNC1: Input Capture Noise Canceler; If set, the Noise Canceler on the ICP pin is activated. It will trigger
the input capture after 4 equal samples. The edge to be triggered on is selected by the ICES1 bit.
ICES1: Input Capture Edge Select;
When cleared, the contents of TCNT1 are transferred to ICR (Input Capture Register) on the falling edge
of the ICP pin.
If set, the contents of TCNT1 are transferred on the rising edge of the ICP pin.
CTC1: Clear Timer/Counter 1 on Compare Match; If set, the TCNT1 register is cleared on compare match.
Use this bit to create repeated interrupts after a certain time, e.g. to handle button debouncing or other
frequently occurring events. If Timer 1 is also used in normal mode, remember to clear this bit when leaving
compare match mode if it was set. Otherwise the timer will never overflow and the timing is corrupted.
CS12..10: Clock Select bits; These three bits control the prescaler of timer/counter 1 and the connection to
an external clock on Pin T1.
CS12  CS11  CS10  Mode Description
0     0     0     Stop Timer/Counter 1
0     0     1     No Prescaler (Timer Clock = System Clock)
0     1     0     divide clock by 8
0     1     1     divide clock by 64
1     0     0     divide clock by 256
1     0     1     divide clock by 1024
1     1     0     increment timer 1 on T1 Pin falling edge
1     1     1     increment timer 1 on T1 Pin rising edge

OCR1
The Output Compare register can be used to generate an Interrupt after the number of clock ticks written
to it. It is permanently compared to TCNT1. When both match, the compare match interrupt is triggered. If
the time between interrupts is supposed to be equal every time, the CTC bit has to be set (TCCR1B). It is
a 16-bit register (see note at the beginning of the register section).
ICR1
The Input Capture register can be used to measure the time between pulses on the external ICP pin (Input
Capture Pin). How this pin is connected to ICR is set with the ICNC1 and ICES1 bits in TCCR1B. When the
edge selected is detected on the ICP, the contents of TCNT1 are transferred to the ICR and an interrupt is
triggered.
TIMSK and TIFR
The Timer Interrupt Mask Register (TIMSK) and Timer Interrupt Flag (TIFR) Register are used to control
which interrupts are "valid" by setting their bits in TIMSK and to determine which interrupts are currently
pending (TIFR).
Bit 7                                            Bit 0
TOIE1  OCIE1A  ---  ---  TICIE1  ---  TOIE0  ---

TOIE1: Timer Overflow Interrupt Enable (Timer 1); If this bit is set and if global interrupts are enabled, the
micro will jump to the Timer Overflow 1 interrupt vector upon Timer 1 Overflow.

OCIE1A: Output Compare Interrupt Enable 1 A; If set and if global Interrupts are enabled, the micro will
jump to the Output Compare A Interrupt vector upon compare match.
TICIE1: Timer 1 Input Capture Interrupt Enable; If set and if global Interrupts are enabled, the micro will
jump to the Input Capture Interrupt vector upon an Input Capture event.
TOIE0: Timer Overflow Interrupt Enable (Timer 0); Same as TOIE1, but for the 8-bit Timer 0.
TIFR is not really necessary for controlling and using the timers. It holds the Timer Interrupt Flags
corresponding to their enable bits in TIMSK. If an Interrupt is not enabled your code can check TIFR to
determine whether an interrupt has occurred and clear the interrupt flags. Clearing the interrupt flags is
usually done by writing a logical 1 to them (see datasheet).
Timer Modes
Normal Mode:
In normal mode, TCNT1 counts up and triggers the Timer/Counter 1 Overflow interrupt when it rolls over
from 0xFFFF to 0x0000. Quite often, beginners assume that they can just load the desired number of clock
ticks into TCNT1 and wait for the interrupt (that's what I did...). This would be true if the timer counted
downwards, but as it counts upwards, you have to load 0x0000 - (timer value) into TCNT1. Assuming a
system clock of 8 MHz and a desired timer of 1 second, you need 8 Million System clock cycles. As this is
too big for the 16-bit range of the timer, set the prescaler to 1024 (256 is possible as well).
8,000,000/1024 = 7812.5 ~ 7813
0x0000 - 7813 = 57723 <- Value for TCNT1 which will result in an overflow after 1 second (1.000064
seconds as we rounded up before)
So we now know the value we have to write to the TCNT1 register. So? What else? This is not enough to
trigger the interrupt after one second. We also have to enable the corresponding interrupt and the global
interrupt enable bit. Here's a flow chart of what happens:

The only steps you have to do are:


- set prescaler to 1024 (set bits CS12 and CS10 in TCCR1B)
- write 57723 to TCNT1
- enable TOIE1 in TIMSK
- enable global interrupt bit in SREG
- wait. Or do anything else. All the counting and checking flags is done in hardware.
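Written out, the steps above could look like this. It is a sketch for the 8 MHz case discussed here; the symbol reload is made up, the bit names come from 2313def.inc, and a jump to an ISR at the Timer 1 overflow vector is assumed to exist elsewhere.

```asm
.equ reload = 0x10000 - 7813    ; = 57723, hypothetical symbol name

ldi r16, high(reload)
out TCNT1H, r16                 ; high byte first
ldi r16, low(reload)
out TCNT1L, r16                 ; write 57723 to TCNT1
ldi r16, (1<<TOIE1)
out TIMSK, r16                  ; enable TOIE1 in TIMSK
ldi r16, (1<<CS12)|(1<<CS10)
out TCCR1B, r16                 ; set prescaler to 1024, the timer runs from here on
sei                             ; enable global interrupt bit in SREG
wait:                           ; hypothetical label
rjmp wait                       ; wait. Or do anything else.
```

Letting the assembler do the subtraction keeps the number of timer ticks (7813) visible in the source instead of the less obvious 57723.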
Output Compare Mode
The Output Compare mode is used to perform repeated timing. The value in TCNT1 (which is counting up
if not stopped by the prescaler select bits) is permanently compared to the value in OCR1A. When these
values are equal to each other, the Output Compare Interrupt Flag (OCF in TIFR) is set and an ISR can be
called. By setting the CTC1 bit in TCCR1B, the timer can be automatically cleared upon compare match.

The flow chart should show an arrow from CTC1 set? "Yes" to set OCF instead of a line, but somehow the
Flow Charting Program didn't think that was a good idea. Bad luck.
Let's discuss a small example: We want the Timer to fire an int every 10ms. At 8 MHz that's 80,000 clock
cycles, so we need a prescaler (80,000 is out of the 16-bit range).
In this case, 8 is enough. Don't use more, as that would just pull down accuracy. With a prescaler of 8, we
need to count up to 10,000. As the value of TCNT1 is permanently compared to OCR1A and TCNT1 is
up-counting, the value we need to write to OCR1A is actually 10,000 and not 0x0000-10,000, as it would be
when using the timer in normal mode.
Also, we need to set CTC1: If we didn't, the timer would keep on counting after reaching 10,000, roll over
and then fire the next int when reaching 10,000 again, which would then occur after (0xFFFF + 1) * 8 clock
cycles. That's after 0.065536 seconds. Not after 10ms. If CTC1 is set, TCNT1 is cleared after compare match,
so it will count from 0 to 10,000 again without rolling over first.
What is to be done when those 10ms Interrupts occur? That depends on the application. If code is to be
executed, the corresponding interrupt enable bit has to be set, in this case it is OCIE1A in TIMSK. Also
check that global interrupts are enabled.
If the OC1 (Output Compare 1) pin is to be used, specify the mode in TCCR1A. You can set, clear or
toggle the pin. If you decide that you want to toggle it, think about your timing twice: If you want a normal
pulse which occurs every 10ms, the timer cycle must be 5ms: 5ms -> toggle on -> 5ms -> toggle off. With
the 10ms example above and OC1 set up to be toggled, the pulse would have a cycle time of 20ms.
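A possible setup for the 10ms example, written as a sketch: it assumes a 2313 at 8 MHz with the names from 2313def.inc, and an rjmp to your ISR at the Timer 1 compare match vector (0x0004 on the 2313). The symbol compare_value is made up.

```asm
.equ compare_value = 10000          ; 80,000 system clock cycles / prescaler 8

ldi r16, high(compare_value)
out OCR1AH, r16                     ; high byte first
ldi r16, low(compare_value)
out OCR1AL, r16
ldi r16, (1<<OCIE1A)
out TIMSK, r16                      ; enable the compare match interrupt
ldi r16, (1<<CTC1)|(1<<CS11)
out TCCR1B, r16                     ; clear timer on compare match, prescaler 8
sei                                 ; enable global interrupts
```

To drive the OC1 pin instead of (or in addition to) calling an ISR, TCCR1A would also have to be written with the desired COM1A1/COM1A0 setting and the pin set to output.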
Input Capture Mode
The Input Capture Mode can be used to measure the time between two edges on the ICP pin (Input
Capture Pin). Some external circuits make pulses which can be used in just that way. Or you can measure
the rpm of a motor with it. You can set it up to measure the time between either rising or falling edges on
the pin. If you change this setting within the ISR, you can also measure the length of a pulse. Combine these
two methods and you have completely analysed a pulse. How does it work?
Here's a flow chart of its basic functionality:

You see that it's actually pretty simple. I left out the low level stuff such as Interrupt validation
(enabled/global enable), as you should understand that by now. The contents of TCNT1 are transferred to
ICR1 when the selected edge occurs on the Input Capture Pin and an ISR can be called in order to clear
TCNT1 or set it to a specific value. The ISR can also change the edge which is used to generate the next
interrupt.
You can measure the length of a pulse if you change the edge select bit from within the ISR. This can be
done the following way:
Set the ICES (Input Capture Edge Select) bit to 1 (detect rising edge)
When the ISR occurs, set TCNT1 to zero and set ICES to 0 to detect the negative edge
When the next ISR is called, the pin changed from high to low. The ICR1 now contains the number of
(prescaled) cycles the pin was high. If the ISR again sets the edge to be detected to rising (ICES=1), the
low pulse time is measured. Now we have the high time AND the low time: We can calculate the total cycle
time and the duty cycle.
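An ISR doing this could be sketched as follows. The assumptions are loud ones: a 2313 with the names from 2313def.inc, an rjmp to t1_capt (an invented label) at the input capture vector (0x0003 on the 2313), and r25:r24 reserved for the result and not used anywhere else.

```asm
t1_capt:                ; hypothetical ISR name
push r16
in r16, SREG
push r16
push r17
in r24, ICR1L           ; captured time since the last edge, low byte first
in r25, ICR1H
ldi r16, 0
out TCNT1H, r16         ; restart the count at zero (high byte first)
out TCNT1L, r16
in r16, TCCR1B
ldi r17, (1<<ICES1)
eor r16, r17            ; flip the edge select bit
out TCCR1B, r16         ; the next capture happens on the opposite edge
pop r17
pop r16
out SREG, r16
pop r16
reti
```

After the ISR has run, the main code can look at ICES1: if it is now 1, a falling edge was just captured and r25:r24 holds the high time, otherwise it holds the low time. The very first capture after start-up has no previous edge to refer to and should be thrown away.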
It's also possible to connect the Analog Comparator to the input capture trigger line. That means that you
can use the Analog Comparator output to measure analog signal frequencies or other data sources which
need an analog comparator for timing analysis. See the Analog Comparator page for more.
PWM Mode
The Pulse Width Modulator (PWM) Mode of the 16-bit timer is the most complex one of the timer modes
available. That's why it's down here.
The PWM can be set up to have a resolution of either 8, 9 or 10 bits. The resolution has a direct effect on
the PWM frequency (the number of PWM cycles per second) and is selected via the PWM11 and PWM10 bits
in TCCR1A. Here's a table showing how the resolution select bits act. Right now the TOP value might
puzzle you, but you'll see what it's there for. The PWM Frequency column shows the PWM frequency in
relation to the timer clock (which can be prescaled) and NOT the system clock.
PWM11  PWM10  Resolution  TOP-value  PWM Frequency
0      0      PWM function disabled
0      1      8 bits      $00FF      fclock/510
1      0      9 bits      $01FF      fclock/1022
1      1      10 bits     $03FF      fclock/2046

To understand the next possible PWM settings, I should explain how the PWM mode works. The PWM is
an enhanced Output Compare Mode. In this mode, the timer can also count down, as opposed to the other
modes which only use an up-counting timer. In PWM mode, the timer counts up until it reaches the TOP
value (which is also the resolution of the timer and has effect on the frequency).
When the TCNT1 contents are equal to the OCR1 value, the corresponding output pin is set or cleared,
depending on the selected PWM mode: You can select a normal and an inverted PWM. This is selected
with the COM1A1 and COM1A0 bits (TCCR1A register). The possible settings are:
COM1A1  COM1A0  Effect:
0       0       PWM disabled
0       1       PWM disabled
1       0       Non-inverting PWM
1       1       inverting PWM

Non-inverted PWM means that the Output Compare Pin is CLEARED when the timer is up-counting and
reaches the OCR1 value. When the timer reaches the TOP value, it switches to down-counting and the
Output Compare Pin is SET when the timer value matches the OCR1 value.
Inverted PWM is, of course, the opposite: The Output Compare Pin is set upon an up-counting match and
cleared when the down-counting timer matches the OCR1 value. Here are two diagrams showing what this
looks like:

The reason why you can select between inverting and non-inverting pwm is that some external hardware
might need an active-low pwm signal. Having the option to invert the PWM signal in hardware saves code
space and processing time.
The PWM is also glitch-free. A glitch could occur when the OCR1 value is changed: Imagine the PWM
counting down to 0. After the pin was set, the OCR1 value is changed to some other value. The next pulse
would have an undefined length because only the second half of the pulse had the specified new length.
That's why the PWM automatically writes the new value of OCR1 upon reaching the TOP value and therefore
prevents glitches.

Typical applications for the PWM are motor speed controlling, driving LEDs at variable brightness and so
on. Make sure you have appropriate drivers and protection circuitry if you're using motors!
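Here is a sketch of a PWM setup, assuming a 2313 (where OC1 is PortB.3) and the bit names from 2313def.inc. The duty value 64 is just an example.

```asm
sbi DDRB, 3                         ; the OC1 pin must be an output
ldi r16, (1<<COM1A1)|(1<<PWM10)
out TCCR1A, r16                     ; non-inverting PWM, 8 bits resolution (TOP = $00FF)
ldi r16, 0
out OCR1AH, r16                     ; high byte first
ldi r16, 64
out OCR1AL, r16                     ; pin is high for about 64/255 of the time (roughly 25 %)
ldi r16, (1<<CS10)
out TCCR1B, r16                     ; timer clock = system clock, the PWM is running now
```

At a system clock of 4 MHz this would give a PWM frequency of 4,000,000 / 510, which is about 7.8 kHz. To change the brightness or speed later, only OCR1A has to be rewritten.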
Some Examples...
Some simple examples can also be found here...
- Setting up a timer
- Flashing LED using the Timer Overflow Interrupt and the Output compare mode
- Pulse Width Modulated LED demo with two timers

Setting up a Timer
Setting up a timer is pretty simple - once you know how it basically works. Once you've set up a timer
successfully you can also use the other modes without much learning as all timer modes are based on the
same principles.
Right now, we just want to let an LED light up for 1 second after reset. What do we need for that? The best
way is to set up a timer (Timer1, the 16 bit timer), switch the LED on and wait. As the timer overflow has its
own interrupt, we can write an ISR that switches the LED off again.
First, some stuff that has to be prepared/kept in mind. The following is assumed:
- The program is running on an AT90S2313, the LED is connected to PortB.4 (this pin doesn't have any
special functions), cathode connected to PortB.4 via a current limiting resistor and the anode connected to
Vcc. That means that the LED is ON when the port pin is LOW.
- The micro is running at 4MHz
- How do we get the timer to overflow after 1 second? 1 second means 4 Million cycles, so we need a big
prescaler: 1024 seems to be good. 4,000,000 / 1024 = 3906.25; so after 3906 timer clock cycles the timer
has to overflow. As the timers count UP and then overflow from $FFFF to 0 (that's when the ISR is called),
we have to load TCNT1 with -3906 (=0xF0BE)
- The interrupt vector for Timer1 overflow is at address 0x0005
Here's the code: (don't forget to include 2313def.inc!!!)
.org 0x0000             ; reset vector address
rjmp reset              ; when reset occurs, jump to label "reset"
.org 0x0005             ; Timer 1 overflow interrupt vector address
rjmp led_off            ; jump to "led_off"
reset:                  ; reset handler:
ldi r16, low(RAMEND)    ; setup stack pointer
out SPL, r16
ldi r16, high(0xF0BE)   ; load timer 1 register (TCNT1) with 0xF0BE
out TCNT1H, r16
ldi r16, low(0xF0BE)
out TCNT1L, r16
ldi r16, 0b00000101     ; set CS10 and CS12 for 1024 cycle prescaler
out TCCR1B, r16
ldi r16, 0b10000000     ; set bit 7 in TIMSK to enable Timer 1 overflow interrupt
out TIMSK, r16
sei                     ; and enable global interrupts
sbi DDRB, 4             ; set PortB.4 to output
cbi PortB, 4            ; switch LED on
loop:
rjmp loop               ; loop forever (we're waiting for the interrupt)
led_off:                ; This is the ISR:
push r16                ; preserve r16
in r16, SREG            ; save MCU status register
push r16
ldi r16, 0              ; stop Timer 1 (clear CS10 and CS12)
out TCCR1B, r16
sbi PortB, 4            ; Turn off LED
pop r16                 ; restore SREG
out SREG, r16
pop r16                 ; restore r16
reti                    ; return from interrupt

Simulating this code in AVR Studio showed that the LED is turned off after 3999759 cycles. When
changing the timer value to 0xF0BD the simulator turns the LED off after 4000783 cycles (3999759 +
1024).
This is not the fastest code you can write for this specific problem. As the micro isn't doing anything during
the loop, the ISR doesn't need to preserve any register or the SREG, but I included this anyway to remind
you of that important step.

Now it's up to you to alter this code to use a prescaler of 256 or turn the LED off after some other time
interval. Then the TCCR1B and TCNT1 values change. You can also connect the LED to PortB.3 (Output
Compare Pin) and use the Output Compare mode of Timer 1 to make the LED flash! It's now just a matter of
reading the datasheet or the AVR Architecture -> Timers page (see "modes").

Embedded "Hello World" - The Flashing LED (Timer Version)
Making an LED flash is not only possible with delay loops, but also with timers, of course. The good thing
about timers is that they need little tweaking (or none at all) and don't block the CPU.
This example does the same as the Getting Started Example did. We let an LED flash once per second.
The AVR (2313) it is connected to is running at 4 MHz. The LED is again (via a current limiting resistor)
connected to PortB.3 (cathode) and to Vcc (anode). The LED will therefore be ON when the port pin is
LOW.
Setting up the timer requires some quick calculations. First of all, we need the 16 bit timer for this. Why?
Half a second at 4 MHz is 2 million clock cycles. With a prescaler of 1024 (the biggest possible), we still
need 1953 counts, which is more than the 8 bit range can offer. The 16-bit timer can do that. We can even
choose a smaller prescaler for higher accuracy. With a prescaler value of 64, 31250 timer clock cycles are
needed, which will result in exactly 2,000,000 clock cycles. That's good.
0x0000 - 31250 = 0x85EE is what we're loading into TCNT1. The timer is up-counting and will generate an
overflow when counting from 0xFFFF to 0x0000, that's why we start counting at "-31250".
The timer 1 overflow interrupt ISR has to reload timer 1 with this value every time the ISR is called.
Otherwise, the timer will start from 0 again and the next interrupt will occur after 1.049 seconds.
In order to enable the timer, we need to set up the interrupt vector, initialise the timer and write an ISR that
toggles the LED port pin. For the timer stuff, re-read "Setting up a Timer". On that page, the example will
switch on the LED for one second after reset. If you understand that example, this one will be no match for
you.
Here's the example code, again don't forget to include 2313def.inc or whatever include files you need for
your AVR!
.equ timer_value = 0x85EE       ; value to be loaded into timer 1
.org 0x0000                     ; reset vector
rjmp reset
.org 0x0005                     ; Timer 1 Overflow interrupt vector
rjmp Timer1_ovf                 ; this is where we jump to on overflow
reset:                          ; jump here after reset and begin init:
ldi r16, low(RAMEND)            ; set up stack pointer
out SPL, r16                    ; remember SPH=high(RAMEND) as well for bigger devices!!!
                                ; the timer 1 setup code:
ldi r16, 0b10000000             ; enable Timer 1 overflow int in TIMSK
out TIMSK, r16
ldi r16, high(timer_value)      ; set timer 1 counting value
out TCNT1H, r16
ldi r16, low(timer_value)
out TCNT1L, r16
ldi r16, 0b00000011             ; and start timer 1 with a prescaler of 64
out TCCR1B, r16
sei                             ; enable global interrupts
sbi DDRB, 3                     ; PortB.3 = output
cbi PortB, 3                    ; PortB.3 = low => LED on
loop:
rjmp loop                       ; loop forever (do nothing but wait for the ISR)
Timer1_ovf:                     ; the ISR:
in r16, SREG                    ; save SREG on stack
push r16
ldi r16, high(timer_value)      ; reload timer value
out TCNT1H, r16
ldi r16, low(timer_value)
out TCNT1L, r16
in r16, PortB                   ; get PortB status
ldi r17, 0b00001000             ; load r17 with bit 3 = "1"
eor r16, r17                    ; toggle bit 3 in r16
out PortB, r16                  ; and write to PortB again
pop r16                         ; restore SREG
out SREG, r16
reti                            ; and return from ISR

When simulating this in AVR Studio the ISR toggles the Port pin every 2,000,000 clock cycles.
There's an even better way to make an LED flash! Timer 1 has a mode called "Output Compare Mode".
You can read about this on the AVR Architecture -> Timers page or in the datasheet. On the 2313, PortB.3
is also the Output Compare pin, that's why I chose it for this example. I'll not go into detail, just give you
some code and explain what it does.
.equ timer_value = 31250        ; value timer 1 is constantly compared to

.org 0x0000
    rjmp reset                  ; (same as above)

reset:                          ; jump here after reset and begin init:
    ldi r16, low(RAMEND)        ; set up stack pointer
    out SPL, r16
    ldi r16, 0b01000000         ; enable Timer 1 Compare match interrupt
    out TIMSK, r16
    ldi r16, high(timer_value)  ; load OC value
    out OCR1AH, r16             ; and store it in Output Compare Register
    ldi r16, low(timer_value)
    out OCR1AL, r16
    ldi r16, 0b01000000         ; set OC1 mode to "toggle"
    out TCCR1A, r16
    ldi r16, 0b00001011         ; start timer 1 with a prescaler of 64; bit 3 (CTC1) clears
    out TCCR1B, r16             ; the timer on compare match, else it runs on to 0xFFFF
    sbi DDRB, 3                 ; set output compare pin to output
    sbi PortB, 3                ; turn off the LED

loop:                           ; loop
    rjmp loop

This time, as the timer value is compared to the Output Compare value, we use 31250 to create the
required timing of 0.5 seconds. The timer counts up, and when it reaches 31250, the Output Compare
Match occurs. Then, the OC1 pin is toggled (we told the AVR to do so via TCCR1A). We need no ISR, as
the LED is toggled by hardware. It's also possible to use the ISR from the first example. You then need to
delete the lines which set up TCCR1A (these connect the timer to the OC pin) and set up the interrupt
vector at 0x0004 (the Timer 1 Compare Match vector on the 2313; 0x0003 is the Capture vector).

Pulse Width Modulated LED Demo

This example uses the Timer 1 PWM mode to make an LED sweep through different brightness levels.
The LED shall reach its full brightness after 1 second. After another second it shall be off again and so on.
The PWM resolution is 8 bits. A second timer (Timer 0) is used to update the PWM value 256 times per
second. Therefore the whole PWM range will be gone through once per second: up-counting during the
first second, down-counting during the second one and so on.
The LED is connected to the PWM output pin OC1 with anode to Vcc and cathode via a current limiting
resistor to the output pin. That means that the LED is switched on when the output pin is low (if you have
an STK500 just connect one of the LEDs to the output compare pin) and we need an inverted PWM.
We're using two timers and a flag register for this. The flag register does nothing but signal if the LED is
currently getting brighter or not. Let's choose r2 for this. If it is cleared (=0), the LED is getting less power
over time (OC1 value is decreasing). If it is set (=0xFF), the LED is getting brighter (OC1 value is
increasing).
All this can be done interrupt driven. After setting up the cpu and the timers nothing needs to be done:

The AVR (AT90S2313) shall run at 7.3728 MHz


As we need to go through 256 PWM settings (8-bit range) within one second, we need to set up timer 0 as
follows:
- 1 sec/256 = 0.00390625 sec; at 7.3728 MHz this equals 28800 cycles

- for 28800 cycles we'll need a prescaler of 1024 or 256 (28800 / 1024 = 28.125; 28800 / 256 = 112.5;
28800 / 64 = 450 which is out of 8-bit range). Let's choose 256.
- As the timer is up-counting, we need to set TCNT0 to (256 - 113 =) 143 every time the Timer 0 overflow
ISR is called. Unfortunately, timer 0 does not support an output compare mode.
Timer 1 is responsible for generating the PWM output for the LED.
- For simple PWM output we don't need a prescaler.
- To enable inverted PWM operation of timer 1, we need to set COM1A1 and COM1A0 in TCCR1A.
- The resolution shall be 8 bits. For an LED this doesn't really matter, we could also choose a higher
resolution. But 8 bits require less calculations at runtime. That means that the PWM10 bit in TCCR1A has
to be set, while PWM11 is cleared.
TCCR1B has to be set to 1 for enabling the timer 1 clock.
Remember to include 2313def.inc from a path that works on your system. With the one from my example it
most probably won't work. Here's the code, with enough comments to make everything clear.
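What follows is a minimal sketch of how the setup described above could be put together, assuming an AT90S2313 and the vector addresses and bit positions from its datasheet. Using r18 for the PWM value, the name t0_reload and all labels are made up for this illustration; r2 is the flag register chosen above. It is not the original listing.

```asm
.include "2313def.inc"          ; adjust the path for your system

.def dir     = r2               ; flag: 0x00 = getting darker, 0xFF = getting brighter
.def pwm_val = r18              ; current PWM value (assumption: r18 is free)
.equ t0_reload = 143            ; 256 - 113 counts at a prescaler of 256

.org 0x0000
    rjmp reset                  ; reset vector
.org 0x0006
    rjmp Timer0_ovf             ; Timer 0 overflow vector

reset:
    ldi r16, low(RAMEND)        ; set up stack pointer
    out SPL, r16
    sbi DDRB, 3                 ; OC1 (PortB.3) = output
    ser r16
    mov dir, r16                ; start with the LED getting brighter
    clr pwm_val                 ; PWM value starts at 0
    clr r16
    out OCR1AH, r16             ; high byte first, it is 0 in 8-bit mode
    out OCR1AL, pwm_val
    ldi r16, 0b11000001         ; COM1A1 + COM1A0 = inverted PWM, PWM10 = 8 bits
    out TCCR1A, r16
    ldi r16, 0b00000001         ; timer 1 runs at CPU clock (no prescaler)
    out TCCR1B, r16
    ldi r16, t0_reload          ; first timer 0 period
    out TCNT0, r16
    ldi r16, 0b00000100         ; timer 0 prescaler = 256
    out TCCR0, r16
    ldi r16, 0b00000010         ; enable Timer 0 overflow int in TIMSK
    out TIMSK, r16
    sei                         ; enable global interrupts

loop:
    rjmp loop                   ; everything else happens in the ISR

Timer0_ovf:                     ; called 256 times per second
    in r16, SREG                ; save SREG (r16 is not used by the main loop)
    push r16
    ldi r16, t0_reload          ; reload timer 0
    out TCNT0, r16
    tst dir                     ; which direction?
    breq count_down
    inc pwm_val                 ; brighter
    cpi pwm_val, 0xFF
    brne write_pwm
    clr dir                     ; top reached: get darker from now on
    rjmp write_pwm
count_down:
    dec pwm_val                 ; darker
    brne write_pwm
    com dir                     ; bottom reached: 0x00 becomes 0xFF
write_pwm:
    clr r16
    out OCR1AH, r16
    out OCR1AL, pwm_val         ; hand the new value to timer 1
    pop r16                     ; restore SREG
    out SREG, r16
    reti
```

Rounding 112.5 up to 113 counts makes each step a little too long, so one sweep takes slightly more than a second. For an LED this is not visible.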

The AVR TWI (I2C) Interface


[ Overview ] [ Bus Hardware ] [ Start and Stop ] [ Addressing ] [ Data transfer ] [ How It's Done ] [ TWI
Clock Speed ]
Overview
Some AVRs, such as the mega8, have a built-in hardware I2C interface (Atmel calls it TWI, "Two-Wire
Interface"). The TWI is a two-wire synchronous serial bus which can be used to connect up to 127 slaves
to one or more masters. Devices can also be slave AND master if wanted. For now, we'll only talk about
single masters. This is the setup that is usually used: The AVR controls the slaves and is the only device
doing so.
A typical bus transfer consists of a Start condition, a slave address plus read/write bit, one or more bytes of
data (direction depending on the R/W bit) and a Stop condition. The TWI hardware takes care of all bus
activities, such as generating Start and Stop, clock generation and data transfer. Before digging into the
actual data transfer, we'll have to explain some things regarding "Which device is what, what is it doing
and why does it bother me"-problems.
Every bus action that is performed by the TWI returns a status byte in TWSR (TWI Status Register). These
status codes can be used to determine if something went wrong (or not). A common pitfall when writing
TWI apps is checking for the *right* status code. It can happen that an application spits out an error
message because it checked for the wrong status code. Which status codes are returned by the TWI
hardware depends on whether the AVR is master or slave and also whether it was the transmitting or the
receiving device. The status codes are divided into four groups: Master Transmitter Mode (MT), Master
Receiver Mode (MR), Slave Transmitter Mode (ST) and Slave Receiver Mode (SR).
Example: Reading a data byte from page 0, address 0 from an external 24C16 (2 kBytes) EEPROM (slave
address 0xA0 for writing, 0xA1 for reading):
Master generates Start (MT), status code 0x08 is returned
Master sends slave address (0xA0, MT), EEPROM returns ACK, status code 0x18
Master sends 0x00 (MT), EEPROM returns ACK, status code 0x28
Master generates repeated start (MT), status code 0x10 is returned
Master sends slave address (0xA1, MR), EEPROM returns ACK, status code 0x40
Master reads data (MR) from EEPROM, and returns NACK, status code 0x58
Master generates Stop (MT), no status code returned

The problem with this is the following: Though the Master transmits the slave address (see line five), it is
doing that in Master Receiver Mode, because the read bit is set. This is important, because the status codes
returned in Master Receiver mode are not in the same table as those in Master Transmitter mode in the
datasheet! Both tables should be printed out to have them ready for programming, as all TWI operations
should be initiated only if the status codes were those that had been expected.
One more thing though: The short TWI action list above mentions ACK and NACK. These are transmitted
as a 9th data bit and indicate whether the device receiving data accepted the data transfer or address.
More on that in the Addressing and Data Transfer parts of this page.
Bus Hardware
All devices connected to the bus must be capable of driving the bus lines SCL (clock) and SDA (data).
That's why the bus is externally pulled up by resistors. The devices connected to the bus only pull it low.
The following figure (Figure 68 from the mega8 datasheet) shows how devices are connected to the bus:

...pretty simple actually. In fact, this is just about everything you'll need for a start. If you're missing the
master in the figure, remember that all devices can be the master if programmed to. Usually there will be
just one master (your AVR), which might be device 2 or 6 or 120 or n. It doesn't matter. The value of the
Pull-Up resistors depends on your bus capacitance (HAHA I know) and can be calculated with a formula
in the datasheet (TWI Characteristics). 4K7 works.
When idle, the bus lines are high (pulled high by the resistors). When SCL is high, SDA must not change
except for Start and Stop. Data on SDA can change while SCL is low and must be valid when SCL goes
high. SCL is pulsed high to clock in the data. SCL is ALWAYS controlled by the master. Both master and
slaves can control SDA.
Start and Stop Conditions
Before any address or data transmission takes place, the master generates a Start condition. This is done
by taking SDA low while SCL remains high. After a transmission is complete, the master has to generate a
stop condition. This is done by taking SDA high while SCL is high. Again, a figure from the datasheet:

Oh, the repeated start... In multi-master systems it can happen that when a master generates a stop
condition, another master will take control over the bus, though the first master actually wanted to transfer
some more data to a different device than before (more on this is in the datasheet). This doesn't happen if
a repeated start is generated. It works just like a normal start, but the current master remains master.
Addressing the Slaves
After a start condition has been generated by the master, it has to send the address of the slave it wants
to address. The slave address byte consists of a 7-bit slave address plus a transfer direction bit (R/W; 0 for
writing, 1 for reading). Again, from the datasheet:

The main part of the slave address is in the high nibble (bits 4..7). A 24C16 EEPROM for example has the
slave address 0xA0/0xA1 (for write and read, respectively) and uses the lower three bits (1..3) for page
addressing. The lowermost bit (bit 0) is the R/W bit as explained.
If a slave recognizes its own address, it will pull SDA low in the 9th SCL cycle. This is called an ACK
(acknowledge pulse), and is also used for verifying data transfers. If no slave (with the right address) is
present (or if the slave doesn't want to ACK or if it's busy), SDA will stay high during the 9th SCL cycle
(Pull-Up!). That would be a NACK (Not ACK).
A special case of slave addressing is the "General Call". A general call is done by addressing slave 0x00
(write). It can be used for all sorts of stuff depending on the slaves.
Data Transfer
Data transfers work just like address transfers, but they can be done in both directions (address transfers
always go from the master to the slave). They are as well terminated by an ACK/NACK. The ACK or NACK
is generated by the receiving device. This can be the master or the slave, depending on the transfer
direction (depends on the previous address + R/W transfer!). Multiple data transfers can be done after
transmitting the slave address (for EEPROM page writes, or for reading multiple slave status registers for
example). When a master reads data from a slave, it has to generate ACKs after every byte received and
a NACK after the last byte. The data transfer(s) is/are followed by a Stop condition or a repeated start. A
figure shouldn't be necessary here (similar to the address transfer).
How It's Done

It's time to show some example code! First, some important notes:
- The TWI operates based on the TWINT (TWI Interrupt) Flag in TWCR. This flag is set when an operation
has been finished by the TWI hardware. While it is set, no new operation can be started. It is cleared by
being written to 1. As every TWI operation is started by setting appropriate flags in TWCR, TWINT has to
be written as well for EVERY operation.
- The TWI Enable bit (TWEN) is also located in TWCR and also has to be written to one for starting an
operation.
Here's a TWCR description:
Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
TWINT   TWEA    TWSTA   TWSTO   TWWC    TWEN    ---     TWIE
Bit 7 - TWINT: As described above; This is the TWI Interrupt Flag. It is set when the TWI finishes ANY bus
operation and has to be cleared (by writing a 1 to it) before a new operation can be started.
Bit 6 - TWEA: TWI Enable Acknowledge; When the device receives data (as slave or as master), this bit
has to be set if the next incoming byte should be ACKed and cleared for a NACK.
Bit 5 - TWSTA: TWI Start; When a master has to generate a start condition, write this bit 1 together with
TWEN and TWINT. The TWI hardware will generate a start condition and return the appropriate status
code.
Bit 4 - TWSTO: TWI Stop; Similar to TWSTA, but generates a Stop condition on the bus. TWINT is not set
after generating a Stop condition.
Bit 3 - TWWC: TWI Write Collision; Set by the TWI hardware when writing to the TWI Data Register TWDR
while TWINT is high.
Bit 2 - TWEN: Any bus operation only takes place when TWEN is written to 1 when accessing TWCR.
Bit 0 - TWIE: TWI Interrupt Enable; If this bit is set, the CPU will jump to the TWI interrupt vector when a TWI
interrupt occurs.
As all TWI operations are determined by the value written to TWCR, they're all similar. Here's the usual
structure:
ldi r16, (1<<TWINT)+(1<<TWEN)+(1<<TWSTA)
out TWCR, r16
This will generate a Start condition. After that, you might want to wait for TWINT to be set and then check the TWI
status register (TWSR) if everything is right:
TWI_wait:
in r16, TWCR
sbrs r16, TWINT
rjmp TWI_wait
in r16, TWSR
andi r16, 0xF8
cpi r16, 0x08
brne TWI_error
HA! Something I didn't tell you above: TWSR also contains the clock prescaler bits (TWSR:0..1). These
have to be masked away for checking the status value. More on the TWI clock rate below. What this piece of
code does is:
- Wait for TWINT to be set (after generating the start condition above)
- get the status value
- mask away the prescaler bits

- compare the status value to the status value expected. The expected status values are in 4 tables in the
datasheet. The first two tables are for master transmitter and receiver mode. Print out the tables for
programming! You'll need them.
- If the TWI status is not as expected, jump to TWI_error. This can occur for example if a master that is
NOT master tries to control the bus (-> datasheet!)
Sending data or an address is similar, but you'll have to load the address/data into TWDR first. Assuming a
start condition has just been generated, this piece of code will send slave address 0xA1 (24C16 EEPROM
read):
ldi r16, 0xA1
out TWDR, r16
ldi r16, (1<<TWINT)+(1<<TWEN)
out TWCR, r16
Now the same wait-and-check procedure as above will follow. The status code expected is 0x40 (see
master receiver mode status code table). Again: it's VERY important to be absolutely sure what's
happening on the bus for checking for the correct status value!
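Put together, the EEPROM read from the Overview could look roughly like this. It is a sketch under several assumptions: a mega8 with m8def.inc included, the stack pointer set up (rcall/ret are used), and the TWI clock already configured. The labels EEPROM_read_byte, TWI_send and TWI_stop are made up for this example, and TWI_error is only a placeholder that releases the bus.

```asm
EEPROM_read_byte:
    ldi r16, (1<<TWINT)|(1<<TWEN)|(1<<TWSTA)
    out TWCR, r16               ; generate Start
    rcall TWI_wait
    cpi r16, 0x08               ; Start sent?
    brne TWI_error
    ldi r16, 0xA0               ; slave address + write
    rcall TWI_send
    cpi r16, 0x18               ; address sent, ACK received? (MT table)
    brne TWI_error
    ldi r16, 0x00               ; EEPROM byte address 0
    rcall TWI_send
    cpi r16, 0x28               ; data sent, ACK received? (MT table)
    brne TWI_error
    ldi r16, (1<<TWINT)|(1<<TWEN)|(1<<TWSTA)
    out TWCR, r16               ; generate repeated Start
    rcall TWI_wait
    cpi r16, 0x10               ; repeated Start sent?
    brne TWI_error
    ldi r16, 0xA1               ; slave address + read
    rcall TWI_send
    cpi r16, 0x40               ; address sent, ACK received? (MR table!)
    brne TWI_error
    ldi r16, (1<<TWINT)|(1<<TWEN)   ; TWEA stays cleared: answer with NACK
    out TWCR, r16               ; start receiving one byte
    rcall TWI_wait
    cpi r16, 0x58               ; data received, NACK returned? (MR table)
    brne TWI_error
    in r17, TWDR                ; the EEPROM data ends up in r17
TWI_stop:
    ldi r16, (1<<TWINT)|(1<<TWEN)|(1<<TWSTO)
    out TWCR, r16               ; generate Stop (TWINT is not set afterwards)
    ret

TWI_error:                      ; placeholder: just release the bus
    rjmp TWI_stop

TWI_send:                       ; send the byte in r16, then fall into TWI_wait
    out TWDR, r16
    ldi r16, (1<<TWINT)|(1<<TWEN)
    out TWCR, r16
TWI_wait:                       ; wait for TWINT, return masked status in r16
    in r16, TWCR
    sbrs r16, TWINT
    rjmp TWI_wait
    in r16, TWSR
    andi r16, 0xF8              ; mask away the prescaler bits
    ret
```

As written, the caller cannot tell a failed transfer from a successful one. A real error handler would at least set a flag before generating the Stop.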
TWI Clock Speed
The TWI clock speed is usually 100kHz or 400kHz. It is set by writing proper prescaler and clock rate
values to TWSR (bits 0 and 1: prescaler) and TWBR (TWI bit rate register). The formula for the resulting
TWI clock speed is:
CPU_clock/(16 + 2*TWBR*(4^prescaler))
At 8 MHz, a prescaler of 0 (4^0 = 1) and TWBR = 32 will result in the clock speed being 100kHz. The
mega8 datasheet says that TWBR values <10 should not be used.
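As a short sketch of the 100 kHz example (assuming a mega8 running at 8 MHz with m8def.inc included), the two registers can be set like this:

```asm
    ldi r16, 0                  ; TWSR bits 1..0 = 00: prescaler 4^0 = 1
    out TWSR, r16
    ldi r16, 32                 ; 8000000 / (16 + 2*32*1) = 100000 Hz
    out TWBR, r16
```

The status bits in TWSR are read-only, so writing 0 only affects the prescaler bits.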

The EEPROM
EEPROM (Electrically Erasable Programmable Read Only Memory) is one of the three memory types of
AVRs (the others are the Flash memory and SRAM). EEPROM is able to retain its contents when there is
no supply voltage. You can also change the EEPROM contents at runtime, so EEPROM is useful to store
information like calibration values, ID numbers, etc.
Most AVRs have some amount of EEPROM (the exceptions are ATtiny11 and ATtiny28). You must check
the corresponding datasheet to know the exact amount of memory of your particular device.
To write in the EEPROM, you need to specify the data you want to write and the address at which you
want to write this data. In order to prevent unintentional EEPROM writes (for instance, during power supply
power up/down), a specific write procedure must be followed. The write process is not instantaneous; it
takes between 2.5 and 4 ms. For this reason, your software must check if the EEPROM is ready to write a
new byte (maybe a previous write operation is not finished yet).
The address of the byte you want to write is specified in the EEPROM Address Register (EEAR). If the
AVR you are using has more than 256 bytes, the EEAR register is divided in the EEARH and EEARL
registers. The EEPROM Data Register (EEDR) contains the data you want to store.
The EEPROM Control Register (EECR) is used to control the operation of the EEPROM. It has three bits :
EEMWE, EEWE and EERE. The EERE (EEPROM Read Enable) bit is used to read the EEPROM and is
discussed later. In order to issue an EEPROM write, you must first set the EEMWE (EEPROM Master
Write Enable) bit, and then set the EEWE (EEPROM write enable) bit. If you don't set EEMWE first, setting
EEWE will have no effect. The EEWE bit is also used to know if the EEPROM is ready to write a new byte.
While the EEPROM is busy, EEWE is set to one, and is cleared by hardware when the EEPROM is ready.
So, your program can poll this bit and wait until it is cleared before writing the next byte.
The following is a code snippet for writing the data 0xAA in address 0x10 :

    cli                         ; disable interrupts
EEPROM_write:
    sbic EECR, EEWE             ; if write enable bit is cleared, EEPROM is ready to be written to
    rjmp EEPROM_write           ; else loop until EEWE cleared
    ldi r16, 0x10               ; load r16 with address (0x10)
    out EEAR, r16               ; and write it to the address register
    ldi r16, 0xAA               ; load with the data we want to write (0xAA)
    out EEDR, r16               ; and write it to the data register
    sbi EECR, EEMWE             ; set master write enable bit
    sbi EECR, EEWE              ; set write enable bit
    sei                         ; and allow interrupts again
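The snippet above always switches interrupts on at the end, even if they were off before. A variant that could be used as a subroutine is sketched below. The register assignment (data in r16, address in r17, r18 as scratch) and the label name are assumptions for this example, and the stack pointer has to be set up for rcall/ret.

```asm
EEPROM_write_byte:
    sbic EECR, EEWE             ; wait until a previous write has finished
    rjmp EEPROM_write_byte
    out EEAR, r17               ; address from r17
    out EEDR, r16               ; data from r16
    in r18, SREG                ; remember whether interrupts were enabled
    cli                         ; nothing may get between the next two instructions
    sbi EECR, EEMWE             ; set master write enable bit
    sbi EECR, EEWE              ; set write enable bit right after it
    out SREG, r18               ; restore the previous interrupt state
    ret
```

Interrupts are only disabled around the EEMWE/EEWE pair, because EEWE has to be set within a few clock cycles after EEMWE (see the datasheet for the exact number).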

To read data from the EEPROM, you must first check that the EEPROM is not busy by polling the EEWE
bit, then you set the EEAR register with the address you want to read, and then set the EERE bit in the
EECR register. After that, the requested data is found in the EEDR register.
The following is a code snippet for reading the data stored in address 0x10. The read data is stored in r16.
EEPROM_read:
    sbic EECR, EEWE             ; check if the EEPROM is busy writing a byte
    rjmp EEPROM_read
    ldi r16, 0x10               ; load address register with 0x10
    out EEAR, r16
    sbi EECR, EERE              ; set read enable bit
    in r16, EEDR                ; and get the data from address 0x10

Quite often people report problems reading the data at EEPROM address 0. The data is corrupted or
appears not to be written correctly after a reset. The cause is power related: If the AVR does not have
enough power to run (during times of low supply voltage) it can perform unexpected instructions and
corrupt the first EEPROM address. You either need a good reset circuit which can do a reset whenever
needed or just don't use address 0.

The Serial Peripheral Interface (SPI)


[Overview] [Bus Description] [Registers] [ISP]
Overview
The SPI (Serial Peripheral Interface) is a peripheral used to communicate between the AVR and other
devices, like other AVRs, external EEPROMs, DACs, ADCs, etc. With this interface, you have one Master
device which initiates and controls the communication, and one or more slaves which receive and transmit
to the Master.
The core of the SPI is an 8-bit shift register in both the Master and the Slave, and a clock signal generated
by the Master. Let's say the Master wants to send a byte of data (call it A) to the Slave and at the same
time receive another byte of data from the Slave (call it B). Before starting the communication, the Master
places A in its shift register, and the Slave places B in its shift register. (Figure 1-a). Then the Master
generates 8 clock pulses, and the contents of the Master's shift register are transferred to the Slave's shift
register and vice versa (Figure 1-b to 1-e). So, at the end of the clock pulses, the Master has completely
received B, and the Slave has received A. As you can see, the transmission and reception occur at the
same time, so it is a full duplex data transfer.
The first image is the state of the two devices before the transfer:

Bus Description
Before you can successfully communicate through the SPI, both the Master and Slave must agree on
some clock signal settings. Details on how to configure this in the AVR will be discussed later.
Please note that not all AVRs have an SPI (you must check the particular datasheet). If your AVR doesn't
have an SPI, you still can implement it in software (the details are not discussed here).
In an AVR, four signals (pins) are used for the SPI: MISO, MOSI, SCK and SS' (SS' means SS
complemented). Here is a brief description of the function of each signal:
MISO (Master In Slave Out): the input of the Master's shift register, and the output of the Slave's shift
register.
MOSI (Master Out Slave In): the output of the Master's shift register, and the input of the Slave's shift
register.
SCK (Serial Clock): In the Master, this is the output of the clock generator. In the Slave, it is the input clock
signal.
SS' (Slave Select): Since in an SPI setup you can have several slaves at the same time, you need a way
to select which Slave you want to communicate to. This is what SS' is used for. If SS' is held in a high
state, all Slave SPI pins are normal inputs, and will not receive incoming SPI data. On the other hand, if
SS' is held in a low state, the SPI is activated. The software of the Master must control the SS'-line of each
Slave.
If the SPI-device is configured as a Master, the behavior of the SS' pin depends on the configured data
direction of the pin. If SS' is configured as an output, the pin does not affect the SPI. If SS' is configured as
an input, it must be held high to ensure Master SPI operation. If the SS' pin is driven low, the SPI system
interprets this as another Master selecting the SPI as a Slave and starting to send data to it. Having two
SPI Masters is quite unusual, so the details of how to manage this are not discussed here (if you are
curious, read the datasheet). So, if you want to keep your life simple, configure the Master's SS' pin as an
output.
The following figures show a typical setup used with SPI:

A word of caution about the SPI pin names. MISO, MOSI, SCK and SS' are the names used by AVRs.
Other devices may use a different set of names. You must check the data sheet of the particular device
you are using to get them right.
What are the data directions of the SPI pins? It depends on the particular pin and on whether the SPI is set
as a Master or Slave. In general, there are two possibilities. A pin is configured as an input regardless of
the setting of the Data Direction Register of the port, or the pin must be configured by the user according
to its function. The following table summarizes this:
Pin     Direction, Master SPI   Direction, Slave SPI
MOSI    User defined            Input
MISO    Input                   User defined
SCK     User defined            Input
SS'     User defined            Input

Registers
[SPCR] [SPSR] [SPDR]
SPCR
(SPI Control Register)
Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
SPIE    SPE     DORD    MSTR    CPOL    CPHA    SPR1    SPR0

SPIE (SPI Interrupt Enable) bit: Set SPIE to one if you want the SPI interrupt to be executed when a serial
transfer is completed.
SPE (SPI Enable) bit: If you want to use the SPI, you must set this bit.
DORD (Data Order) bit: You can choose in which order the data will be transmitted. Set DORD to one to
send the least significant bit (LSB) first. Set DORD to zero to send the most significant bit (MSB) first.
MSTR (Master/Slave Select) bit: Set MSTR to configure the AVR as a Master SPI device. Clear MSTR to
configure it as a Slave.
CPOL (Clock Polarity) and CPHA (Clock Phase) bits: As stated previously, Master and Slave must agree
on how to interpret the clock signal. The first thing to do is to configure which logic level the clock will be in
when the SPI is idle. If CPOL is set to one, SCK is high when idle, and if CPOL is set to zero, SCK is low
when idle. The second thing is to configure during which clock transition the data will be sampled. Set
CPHA to sample the data on the trailing (last) edge, and clear CPHA to sample the data on the leading
(first) edge.
So, there are four different ways of configuring the clock generation, which are known as 'SPI modes'. The
following table summarizes the four SPI modes.
SPI Mode   CPOL   CPHA   Sample
0          0      0      Leading (Rising) Edge
1          0      1      Trailing (Falling) Edge
2          1      0      Leading (Falling) Edge
3          1      1      Trailing (Rising) Edge

The following image shows figure 76 and 77 from the mega128 datasheet:

SPR1 and SPR0 (SPI Clock Rate Select) bits: The SPR bits configure the frequency of the clock signal.
Since the Slave reads the clock from an input pin, the SPR bits have no effect on the Slave. The frequency
of the SPI clock is related to the frequency of the AVR oscillator. The faster the SPI clock signal is, the
faster the data transfer will be; however, you must respect the maximum clock frequency specified by the
Slave (as usual, read the datasheet). The following table summarizes the relationship between the SCK
frequency and the SPR bits:
SPR1   SPR0   SCK frequency
0      0      fosc/4
0      1      fosc/16
1      0      fosc/64
1      1      fosc/128
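Instead of counting bit positions, the SPCR value can be built from the bit names. This sketch assumes that the include file for your AVR defines SPE, MSTR, CPOL, CPHA and SPR0 (the Atmel def.inc files for devices with an SPI normally do). It sets up a Master with MSB first, SPI mode 3 and a clock of fosc/16:

```asm
    ldi r16, (1<<SPE)|(1<<MSTR)|(1<<CPOL)|(1<<CPHA)|(1<<SPR0)
    out SPCR, r16               ; same value as 0b01011101
```

Bits that are left out (SPIE, DORD, SPR1) stay 0: no interrupt, MSB first, and together with SPR0 = 1 a clock of fosc/16.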

SPSR
(SPI Status Register)
Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
SPIF    WCOL    ---     ---     ---     ---     ---     (SPI2x)

SPIF (SPI Interrupt Flag) bit: This is a read only bit. It is set by hardware when a serial transfer is
complete. SPIF is cleared by hardware when the SPI interrupt handling vector is executed, or when the
SPIF bit and the SPDR register are read.

WCOL (Write Collision Flag) bit: This is a read only bit. The WCOL bit is set if the SPDR register is written
to during a data transfer. The WCOL bit (and the SPIF bit) are cleared by first reading the SPI Status
Register with WCOL set, and then accessing the SPI Data Register.
SPI2x (Double SPI Speed) bit: This feature is not implemented in all AVRs (check the particular data
sheet). When this bit is set to one, the SPI speed will be doubled when the SPI is in Master mode.
SPDR
(SPI Data Register)
The SPI Data Register is a read/write register used for data transfer between the Register File and the SPI
Shift Register. Writing to the register initiates data transmission. Reading the register causes the Shift
Register receive buffer to be read.
Finally, here is a code snippet to generate a data transfer between a Master and a Slave. Both Master and
Slave are configured to send the MSB first and to use SPI mode 3. The clock frequency of the Master is
fosc/16. The Master will send the data 0xAA, and the Slave the data 0x55.
Master code:
SPI_Init:
    sbi DDRB,DDB5               ; Set MOSI as output.
    sbi DDRB,DDB7               ; Set SCK as output.
    sbi DDRB,DDB4               ; Set SS' as output.
    ldi r16,0b01011101          ; Set SPI as a Master, with interrupt disabled,
    out SPCR,r16                ; MSB first, SPI mode 3 and clock frequency fosc/16.
SPI_Send:
    ldi r16,0xAA
    out SPDR,r16                ; Initiate data transfer.
Wait:
    sbis SPSR,SPIF              ; Wait for transmission to complete.
    rjmp Wait
    in r16,SPDR                 ; The received data is placed in r16.

Slave code:
SPI_Init:
    sbi DDRB,DDB6               ; Set MISO as an output.
    ldi r16,0b01001100          ; Set SPI as a Slave, with interrupt disabled,
    out SPCR,r16                ; MSB first and SPI mode 3.
    ldi r16,0x55                ; Send 0x55 on Master request.
    out SPDR,r16
SPI_Receive:
    sbis SPSR,SPIF              ; Wait for reception to complete.
    rjmp SPI_Receive
    in r16,SPDR                 ; The received data is placed in r16.

SPI and In Circuit Programming (ISP)


The SPI interface is also used to program the AVR. If you want to program your AVR in-circuit and are
using the SPI interface, a series resistor should be placed on each of the three dedicated lines to avoid
'driver contention' (see figure below). A driver contention is the situation you get if two outputs are
connected together. For more details, see AVR application note No. 910.

The UART
[Registers] [Baud Rate Generator] [TX] [RX] [Important Hardware Note]
The AVR UART is a very powerful and useful peripheral and used in most projects. It can be used for
debugging code, user interaction, or just sending data for logging it on a PC. Here's an image of how it
basically is built up (based on the AT90S2313 UART):

The AVR UART can be set up to transmit 8 or 9 bits, no parity, one Stop bit. It filters the data received and
also detects overrun conditions and framing errors. It has three interrupts and allows highly efficient data
stream handling with software buffers.
From the diagram you see that the transmitter and receiver share the UDR (UART Data Register). Actually
they only share the UDR address: The "real" register is divided into the transmitter and receiver register so
that received data cannot overwrite data being written into the transmit register. Consequently you can't
read back data you wrote into the transmitter register.
As both parts of the UART, the transmitter and the receiver, share the Baud Rate Generator and the
control registers, I'll explain them first before showing you the basics of transferring data via the UART.
UART Registers

[UDR] [UBRR] [UCR] [USR]


UDR
Of course, the UART has a Data Register (UDR). It is buffered in receive direction, so that a completely
received byte can be read while the next one is being shifted in. The transmitter part of this register is not
buffered (what for?). A transmission is initiated when data is written to UDR. When reading from UDR, the
byte shifted in by the receiver part of the UART is read. You can not read back the last byte transmitted.
UBRR
The Uart Baud Rate Register is used to set the clock for the UART shift registers. See The Baud Rate
Generator part of this page for details on how it works and what to do with it. In fast AVRs (megas) it is a
16-bit register that allows low baud rates at high CPU speeds.
UCR
The UART Control Register controls the receiver and transmitter functions and interrupts.
Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
RXCIE   TXCIE   UDRIE   RXEN    TXEN    CHR9    RXB8    TXB8

RXCIE: Receive Complete Interrupt Enable; If this bit is set, the reception of a byte via the UART will
cause an Interrupt if global Ints are enabled.
TXCIE: Just the same as RXCIE, but will allow a transmit complete Interrupt.
UDRIE: UART Data Register Empty Interrupt Enable; If this bit is set, an interrupt occurs if UDR is empty.
That allows writing the next byte to UDR while the currently being sent byte is still in the shift register. Also
good if the transmit complete interrupt doesn't write the next byte to UDR. It also allows interrupt driven
start of a transmission if nothing was sent before and a transmit complete interrupt therefore can't occur.
RXEN: Receiver Enable; If this bit is set, the UART receiver is enabled and the RXD pin is set up as an
input pin connected to the UART. All the previous port settings are now disabled, but not overwritten:
Disabling the receiver again will restore the old port settings.
TXEN: Transmitter Enable; If this bit is set, the UART transmitter is enabled and the TXD pin is set up
as an output pin connected to the transmitter.
CHR9: 9 bit characters; This bit enables the 9-bit character size. By default, it is set to 0 and 8 bits are
used. If 9 bit characters are enabled, the 9th bit is found in RXB8 and TXB8.
RXB8: If CHR9 is set, this is the 9th received bit.
TXB8: If CHR9 is set, this is the 9th bit that is to be transmitted.
If 9 bit transmissions are enabled, TXB8 has to be filled before transmission is started by writing the lower
8 bits to UDR. RXB8 is valid after the received data has been transferred from the rx shift register. It is
buffered as well, so it doesn't change until a new byte is completely received.
USR
The UART status register holds status flags such as interrupt flags, overflow and framing error flags:
Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
RXC     TXC     UDRE    FE      OR      ---     ---     ---

RXC:
Receive Complete; This is the interrupt flag that is set when the UART has completely received a
character. It is cleared by reading UDR, so the Rx Complete ISR has to read UDR: Otherwise the interrupt
will occur again as soon as the ISR returns. You can either use it to let the AVR execute the
interrupt service routine or poll it in a loop with interrupts disabled.
TXC:
Transmit Complete; This flag is set when a transmit is completed. It is cleared by hardware when the
Tx Complete ISR is executed, or in software by writing a 1 to it. Like RXC, it can also be polled.
UDRE:
UART Data Register Empty; This flag is set while the UDR is empty. This condition occurs when a
character is transferred from the UDR to the transmit shift register. If the next character is written to UDR
now, it will not be transferred to the shift register until the character currently being transmitted is
completely shifted out.
This flag can be used to ensure maximum throughput by using a software buffer. Consequently, the UDRE
ISR has to write UDR: Otherwise the interrupt will occur again until data has been written to UDR or the
UDRIE flag has been cleared.
UDRE is set upon reset to indicate that the transmitter is ready.
FE:
Framing Error; This flag is set if the STOP bit is not received correctly. This is the case if it was interpreted
to be low by the data recovery logic. And that's wrong. So if the FE bit is read 1 by your software, you must
have serious noise problems or another hardware error.
OR:
OverRun; The OverRun Flag is very useful for detecting if your code is handling incoming data fast
enough: It is set when a character is transferred from the rx shift register to UDR before the previously
received character is read. It is cleared again when the next character is read.
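As a small illustration of how the FE flag might be used, here is a hedged sketch of a polling receive routine
that also reports a framing error. The label rx_uart_fe and the register choice are assumptions made for this
example; the bit names are those of 2313def.inc. USR is read before UDR so that FE still belongs to the byte
that is about to be read.

```asm
; Sketch: receive one byte by polling and report a framing error.
; Returns: r16 = data, r17 = 0 (Z flag set) if the stop bit was received correctly.
rx_uart_fe:
    sbis USR, RXC           ; wait until a character has arrived
    rjmp rx_uart_fe
    in r17, USR             ; read the status first
    in r16, UDR             ; then read the data (this clears RXC)
    andi r17, (1<<FE)       ; keep only the framing error flag
    ret
```

A caller could follow the rcall with "brne" to a (hypothetical) error handler, as ret does not change the flags.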
The Baud Rate Generator
The UART Baud Rate Generator defines the clock used for transmitting and receiving data via the UART.
Unlike the timer clock, which can be prescaled in some rough steps, the UART clock can be divided very
precisely, resulting in clean and (to some extent) error-free data transfer.

You might have noticed that the baud rate is divided by 16 before it is fed into the Rx/Tx Shift registers.
The clock generated by the UART baud rate generator is 16 times higher than the baud rate we want to
use for transferring data.
This clock is used by the Data Recovery Logic: It samples the data and therefore filters it a bit, so that fewer
errors occur. In the middle of a bit that is to be received, it takes three samples: If two (or three) of the
samples are high, the bit shifted into the Rx Shift register is high as well. If two samples are wrong, the
data in the shift register is also wrong, but that is only possible if the connection is really bad.

The Clock used for shifting in the data is then divided by 16 (see diagram) and therefore corresponds to
the baud rate.
As there's no need to sample data for the Tx shift register, it is directly clocked by the baud rate.
For calculating the baud rate generated from a specific value in UBRR (UART Baud Rate Register), the
AVR datasheets present this formula:
BAUD= fck / (16(UBRR+1))
Example: System Clock is 8 MHz and we need 9600 Baud. Unfortunately, the formula above does not give
us the UBRR value from fck and baud rate, but the baud rate from fck and UBRR. The better version of the
formula for this is:
UBRR = fck / (16 * baud) - 1

Using the values above (8 MHz and 9600 baud) we get the value of 51.08333333 for UBRR. So it's 51. The
error we get is the actual baud rate divided by the desired baud rate: The actual baud rate is (first formula!)
9615 baud, dividing this by 9600 gives 1.0016 and therefore an error of 0.16%.
This will work, but it's not perfect. That's why you can get crystals with funny frequencies, such as 7.3728
MHz: Using that one for 9600 baud gives us (2nd formula) UBRR = 47 and no error. You can find tables
with various clock/baud combinations in the AVR datasheets. If you can't find the one you want to use, just
use the formulas above which will give you the same results.
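If you don't want to do the division by hand, the assembler can do it for you at assembly time. The following is
only a sketch: the symbol names fck, baud and ubrr_val are made up for this example, and the "+ 8*baud" term is
an addition of mine that rounds to the nearest integer instead of simply truncating (assembler expressions are
integer only).

```asm
.equ fck      = 8000000                        ; system clock in Hz (example value from the text)
.equ baud     = 9600                           ; desired baud rate
.equ ubrr_val = (fck + 8*baud) / (16*baud) - 1 ; rounded UBRR value, 51 for these numbers

    ldi r16, ubrr_val                          ; load the calculated divider
    out UBRR, r16                              ; and write it to the baud rate register
```

With fck = 7372800 the same expression gives 47, matching the value from the second formula.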
The UART Transmitter

The UART transmitter sends data from the AVR to some other device (data logger, PC, ...anything) at the
specified Baud Rate. The transmission is initiated by writing data to UDR. This data is then transferred to
the TX shift register when the previously written byte has been shifted out completely. The next byte can
now be written to UDR.

When a byte is transferred to the TX shift register, the UDRE flag is set. The UDRE ISR can write the next
byte to UDR without corrupting the transmission in progress.
When a byte is completely shifted out AND no data has been written to UDR by the UDRE ISR, the TXC
flag is set.
How the transmitter interrupt flags work together can be understood quite easily with the following flow
chart:

This flow chart depends on a software FIFO buffer, which is a somewhat non-trivial task, but it also explains
the flags pretty well I think: The transmission complete flag will only be set if the transmission is really
complete: By writing the buffer software properly YOU tell the UART when the transmission is complete.
Isn't that cool?
The UART Receiver

The UART receiver is basically built up like the transmitter, but with the appropriate extras for receiving
data: Data recovery logic for sampling the data and just one interrupt for the completion of data reception.
It uses the same baud rate setting as the transmitter. The data is sampled in the middle of the bit to be
received:

The small lines at the bottom of the image (three of which are samples) are the clock generated by the
UART Baud Rate Generator. This should also make clear why the baud rate is first generated 16 times
higher than needed and then divided by 16 in order to shift in the data. This higher baud rate is used for
sampling/filtering.
Important Hardware Note

If you want to connect your AVR to a PC you have to use RS-232 voltage levels. The voltage levels used
by an AVR are normal TTL levels (5V or 3.3V for high and 0V for low levels). RS-232 levels are much
different from that.
To convert the logic levels to RS-232 you need a normal level converter such as the MAX232. It's pretty
cheap and only needs a few external caps to work. It comes in a variety of packages and is available almost
everywhere.

Setting up the UART


Voltage Level Conversion
The AVR UART is a very powerful peripheral. You can use it to send messages to your PC and let a
terminal program display them (for debugging purposes or as a user interface), or to communicate with a
self-written program for analyzing logged data. As the UART heavily relies on timing (for generating the
correct baud rate), you have to know which frequency your AVR is running at and what speed you need for
communications.
It's also important to use the correct driver circuit between your AVR and the PC, as the COM port is using
RS232 voltage levels. They are different from CMOS levels and without a driver chip you'll fry your AVR.
That's bad. So use a driver chip. A widely used one is the MAX232 which just needs some caps and
supply voltage to work. Here's a diagram of it:

WARNING! This is a diagram of the MAX202 from the MAX232 datasheet. Use 10µF caps for the
MAX232! The one connected from VCC to ground should be 0.1µF though.
For the UART to work you need one driver per direction only: One Transmitter (T1 or T2 in the diagram)
from AVR to PC and one receiver (R1 or R2 in the diagram) from PC to AVR.
The Cable

The Cable from your circuit to the PC will most probably have a 9-pin D-type connector. The signals we
need are Ground, Receive Data and Transmit Data. Below is a table of the necessary connections. The
signal name refers to the PC side.
Signal   PC side (male)   Device Side (female)   MAX232 pin to connector   AVR pin to MAX232
Ground   5                5                      15                        (ground)
Tx       3                3                      13 or 8                   RxD
Rx       2                2                      14 or 7                   TxD

To find out which pin of the connector has which number, have a close look at it: Most have tiny numbers
next to the pins on the plastic insulator. For more information, see www.hardwarebook.net. If your PC has a
25-pin connector you'll find the pinouts for it on that site as well.
I will not go into detail about the RS232 protocol. The AVR datasheets have a small description of it (in the
2313 datasheet see the "sampling received data" figure), which should be enough for a start. If you want
more, have a look at www.beyondlogic.org.
UART Setup
Setting up the UART is not very hard. You need to know the following:
- Clock frequency of your AVR
- desired baud rate
- data format (how many bits per transmission)
The clock frequency and the desired baud rate are used for calculating the UBRR value. With the formula
from the datasheet or the AVR Architecture -> UART page this can be calculated in no time. Assuming a
clock speed of 3.6864 MHz and a desired baud rate of 38400, we get a value of 5. This must be written to UBRR.
The data format will usually be 8 bits per transfer. Sometimes 9 bits are used, which the 2313 supports as
well. The megas even have more options, but the 8-bit format is enough for now.
The next question we have to answer is: Interrupt driven or polling? Interrupt driven is of course more
efficient, but when sending strings or packets of data, polling is easier, as an interrupt driven UART needs
software buffers for efficient string transfers. These can be added, but then it's not a "simple example" any
more :-) Below the polling example, you will find an interrupt driven version of it.
The example code below shows how to use polling. As we don't use interrupts, these can be left disabled.
The transmitter and receiver have to be enabled though in order to make usage of the UART possible.
The Setup Code
setup_uart:                 ; we can call this as a subroutine during initialisation

    ldi r16, 5              ; write correct clock divider value
    out UBRR, r16           ; to UBRR

    ldi r16, 0b00011000     ; set Rx and Tx enable bits
    out UCR, r16            ; write them to the UART Control Register

    ret                     ; done. Nothing more to do!

So what do we want the AVR to do with the UART? A very simple task is to echo back the data we received
from the PC. When typing in characters in a terminal, we should receive copies from it, so everything we
type in should show up twice (assuming a local echo).
For receiving data we wait until the RXC flag in USR (UART Status Register) is set and then read that data
from UDR (UART Data Register). Then we can transmit it again by writing it to UDR. If we write data to
UDR while a byte is received that won't hurt, as the UDR is divided into two registers, one for each
direction. Good, huh? Before writing it to UDR we need to wait for the UDRE flag to be set, because it
indicates when a character has been transferred to the UART transmit shift register. Then a new character
can be written to UDR.
Example Code
So, enough theory, here's the code. Don't forget to include the 2313def.inc file and the setup routine
above!
.org 0x0000                 ; reset interrupt vector
    rjmp reset              ; for startup

reset:
    ldi r16, low(RAMEND)    ; initialise Stack Pointer
    out SPL, r16

    rcall setup_uart        ; initialise the UART

loop:                       ; then loop back the characters received from the PC
    rcall rx_uart           ; receive data
    rcall tx_uart           ; and transmit it again
    rjmp loop               ; and do this forever

rx_uart:                    ; receive routine:
    in r16, USR             ; get UART Status Register
    sbrs r16, RXC           ; and see if Rx Complete flag is set
    rjmp rx_uart            ; if not, go back to rx_uart

    in r16, UDR             ; data came in. RXC is cleared by reading UDR, UDR is stored in r16
    ret                     ; return

tx_uart:                    ; transmit routine:
    in r17, USR             ; get UART Status Register (r17 this time, the data is in r16!!!)
    sbrs r17, UDRE          ; and see if UDR is free for transfer
    rjmp tx_uart            ; if not, go back to tx_uart

    out UDR, r16            ; send the data, UDRE will be cleared by hardware
    ret                     ; return

;include setup_uart here!

After thinking about the code a bit you might come to the conclusion that the status register check for
transmitting data is not necessary, as the data is coming in at very low speed (as fast as you can type) and
therefore will be echoed back before the next character comes and can be transmitted again. I included
this for showing how this check is done, because other applications might send data at higher speed. This
is the case when sending data packets or strings. In that case, the application would send a character, get
the next one from memory and send it as soon as possible.
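To illustrate the string case, here is a minimal sketch that sends a zero-terminated string from FLASH by
polling. It is not part of the original example: the labels send_string and message are made up, and it relies
on a tx_uart routine like the one in the polling example (data in r16, r17 used as scratch).

```asm
; Sketch: send a zero-terminated string stored in FLASH, one character at a time.
send_string:
    ldi ZL, low(2*message)      ; Z = byte address of the string (label is a word address)
    ldi ZH, high(2*message)
send_next:
    lpm                         ; r0 = character Z is pointing at
    tst r0                      ; zero byte = end of string?
    breq send_done              ; yes, we're done
    mov r16, r0                 ; tx_uart expects the data in r16
    rcall tx_uart               ; wait for UDRE, then write the character to UDR
    adiw ZL, 1                  ; point at the next character
    rjmp send_next
send_done:
    ret

message:
    .db "Hello World!", 0x0D, 0x0A, 0, 0   ; CR, LF, terminator (second 0 pads to an even byte count)
```

Here the UDRE check in tx_uart really matters: the characters are fetched much faster than the UART can shift
them out.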
Interrupt Driven Examples
The interrupt driven example doesn't hang around in loops checking if data has come in. Instead, the Rx
Complete interrupt is used to determine when data is ready. It is then echoed back by the RXC ISR. To
make the interrupt driven echo possible, the RXC Interrupt has to be enabled (RXCIE in UCR is set) and, of
course, global interrupts have to be allowed as well. The correct interrupt vector has to be installed, too.
.org 0x0000                 ; reset vector address:
    rjmp reset              ; handle reset
.org 0x0007                 ; UART Receive Complete Interrupt vector:
    rjmp UART_RXC           ; go to UART_RXC

reset:                      ; jump here at reset
    ldi r16, low(RAMEND)    ; stack setup
    out SPL, r16

    ldi r16, 5              ; clock divider value for 38400 baud @ 3.6864 MHz
    out UBRR, r16

    ldi r16, 0b10011000     ; enable Rx Complete Int, enable receiver and transmitter
    out UCR, r16

    sei                     ; enable interrupts

loop:                       ; loop here (do nothing)
    rjmp loop

UART_RXC:                   ; UART Rx complete interrupt handler:
    in r17, UDR             ; get data we received
    out UDR, r17            ; write it to UDR
    reti                    ; return from int

This example, apart from being interrupt driven, is different from the first one: The ISR doesn't check if it's
allowed to write to UDR, so collisions can occur if the previous character wasn't transferred yet. This check
could be done with an ISR for the UART Data Register Empty interrupt. The flow chart shows how the two
ISRs would communicate via the UDRE Interrupt Enable (UDRIE) bit:

.org 0x0000                 ; same as above
    rjmp reset
.org 0x0007
    rjmp UART_RXC           ; here's the Rx Complete vector
    rjmp UART_DRE           ; here's the UDRE Int vector (.org 0x0008)

reset:
    ldi r16, low(RAMEND)    ; stack setup
    out SPL, r16

    ldi r16, 5              ; set baud rate
    out UBRR, r16

    ldi r16, 0b10011000     ; enable Rx and Tx, enable Rx Complete Interrupt
    out UCR, r16            ; UDRIE is NOT(!) set!!! This is done by the RXC ISR

    sei                     ; enable Interrupts

loop:                       ; do nothing as long as power is present
    rjmp loop

UART_RXC:                   ; UART Rx Complete ISR:
    in r17, UDR             ; get data
    in r16, UCR             ; get UART Control Register
    sbr r16, 0b00100000     ; and set UDRIE bit;
    out UCR, r16            ; store UART Control Register again
    reti                    ; and that's it.

UART_DRE:                   ; UART Data Register Empty ISR: Will be called as soon as UART_RXC returns!
    in r16, UCR             ; Get UCR
    cbr r16, 0b00100000     ; clear UDRIE bit
    out UCR, r16            ; and store UCR again
    out UDR, r17            ; send data
    reti                    ; return from ISR

These three examples should have given you an idea about UART usage and interrupt setup issues. The
last example (with RXC and UDRE interrupts) is almost ready for FIFO buffer usage.

UART Ascii to Hex converter


Maybe I should explain what that means.... Any character you type in your terminal will be converted to
ascii coded hex and sent back to the terminal. What you need for this is basic knowledge about the UART
(Setting up The UART) and about the ascii to hex conversion (see the conversions section). As ascii to
ascii coded hex is not one of the conversions described in the conversions section we'll need to either
combine them or think about something else.
But first things first. Before we can convert any data, we need to receive it. This time, we'll use a 2313 at 4
MHz. At 19200 baud with an UBRR value of 12 that gives an error of 0.2% (see 2313 datasheet for this or
calculate it yourself with the formulas from AVR architecture -> UART "Baud Rate Generator"). We'll use
an interrupt driven UART for this.
The RxC ISR will convert the data to ascii coded hex and send the first (high) nibble character. The UDRE
interrupt will now be enabled. The second (low) nibble character will be stored and then sent by the UDRE
interrupt, which then disables itself. The main loop can still do some other stuff (add the flashing LED
driven by a delay loop!)
Using the UART should be no problem for you once you've read the "Setting up The UART" page. I'll now
concentrate on the conversion:
The data that is received needs to be split into two nibbles we then convert to ascii hex characters. 0xB6
should be sent as "B" "6". How do we do this? We'll use a lookup table. This table contains the ascii
character "0" at address 0 and "F" at address 0x0F, or 15. The single nibbles are added to the base
address of the table and the resulting address is used to read the right character from the table. Example:
0x4F is coming in (r16).
r17 = r16 (get a copy of the data into r17)
r16 AND 0x0F (clear high bits of r16; r16 = low nibble = 0x0F)
swap nibbles in r17 (high nibble = low nibble; low nibble = high nibble; r17 = 0xF4)
r17 AND 0x0F (clear high bits of r17; r17 = high nibble = 0x04)
Z = base address of table + r17
send character Z is pointing at
Z = base address of table + r16
send character Z is pointing at
The lookup table looks like this:

.db "0123456789ABCDEF"
The base address of the table plus 4 is the address where "4" is stored at.
Here's the complete code (without includes!):
.equ baud19200 = 12             ; define the UBRR value for 19200 baud @ 4 MHz

.dseg
low_nibble: .byte 1             ; reserve one byte in SRAM for the low nibble character

.cseg
.org 0x0000                     ; reset vector address
    rjmp reset                  ; jump to "reset" on reset interrupt
.org 0x0007                     ; Rx Complete interrupt vector address
    rjmp UART_RXC               ; jump to "UART_RXC" when Rx complete int occurs
    rjmp UART_UDRE              ; jump to "UART_UDRE" when data register is empty

reset:                          ; reset code:
    ldi r16, low(RAMEND)        ; stack setup
    out SPL, r16

    ldi r16, baud19200          ; UART setup:
    out UBRR, r16               ; set baud rate
    ldi r16, 0b10011000         ; enable receiver, transmitter and Rx Complete Interrupt
    out UCR, r16

    sei                         ; enable interrupts

loop: rjmp loop                 ; loop forever

UART_RXC:                       ; Rx Complete Interrupt Service Routine:
    push r16                    ; save r16
    in r16, SREG                ; use r16 to save SREG
    push r16

    in r16, UDR                 ; get received data
    mov r17, r16                ; and copy to r17
    andi r16, 0x0F              ; clear high nibble of r16 (r16 = low nibble)
    swap r17                    ; swap r17
    andi r17, 0x0F              ; clear high nibble of r17 (r17 = high nibble)

    ldi ZL, low(2*hex_table)    ; load Z with table address; As the table label returns the word
    ldi ZH, high(2*hex_table)   ; address, we need to multiply it with 2 to get the byte address
    add ZL, r17                 ; add high nibble offset
    lpm                         ; load from table to r0 (lpm loads to r0!!!)
    out UDR, r0                 ; and send the data

    ldi ZL, low(2*hex_table)    ; again load Z with table address
    ldi ZH, high(2*hex_table)
    add ZL, r16                 ; but this time add low nibble offset
    lpm                         ; load from table to r0
    sts low_nibble, r0          ; and store r0 at low_nibble

    ldi r16, 0b10111000         ; enable UDRE interrupt
    out UCR, r16

    pop r16                     ; restore SREG
    out SREG, r16
    pop r16                     ; and restore r16
    reti                        ; return from interrupt

UART_UDRE:                      ; UART Data Register Empty ISR:
    push r16                    ; same as above, store r16
    in r16, SREG                ; and SREG
    push r16

    lds r16, low_nibble         ; load r16 with data from low_nibble
    out UDR, r16                ; and send it

    ldi r16, 0b10011000         ; disable UDRE interrupt
    out UCR, r16

    pop r16                     ; restore SREG
    out SREG, r16
    pop r16                     ; and r16
    reti                        ; and return from interrupt

hex_table:                      ; This is the hex table label
.db "0123456789ABCDEF"          ; and this is the hex -> ascii table

Just add the include file for the 2313 and simulate the code now. You'll see that "A" typed in at the terminal
returns "41", which corresponds to "A"'s ascii value. A backspace returns "08", return gives "0D".

Software UART FIFO Buffer


The UART can only send and receive one byte at a time. I'm not talking about the simultaneous
transmission and reception of data now, but the ability to receive data blocks, like strings or commands
with arguments. Wouldn't it be nice to have the ability to read a whole string from UART and then process
it? That's where a FIFO buffer comes in.
A FIFO buffer for the UART can make life easier and doesn't slow things down if it's properly programmed.
While Advanced Assembler -> Buffers and Queues only gives a general overview of this topic, here's a
complete and working example.
To make things faster, interrupts are used for both directions. The Rx FIFO and the Tx FIFO each need
their own space in SRAM. Each need the space for received data / data to be sent, one pointer for writing
(rx_in or tx_in), one pointer for consuming (rx_out or tx_out) and one byte for holding the number of bytes
currently held in the FIFO (rx_n or tx_n).
When a byte is to be stored in the FIFO, the following is done:
The byte is stored at the write pointer address, which is then post-incremented. If needed, the pointer will
have to roll over to the base address of the FIFO data space again. Then the number of bytes in the FIFO
is incremented. As the pointers and the amount of data are stored in sram, we need to take care of restoring them again. To make things safer, the routines have to check if storing/reading is actually possible.
No data is allowed to be stored if the FIFO is full. Similarly, no data can be read if it is empty.
The FIFOs have to be initialised in order to work. The pointers have to point at the FIFO's base address
(where the first byte will be stored) and rx_n or tx_n have to be set to zero before any FIFO operation is
done. This can be placed at the location where the UART is initialised as well.
The initialisation looks like this:
.equ rx_size = 16               ; first set the size of the receiver
.equ tx_size = 16               ; and the transmitter FIFO

.dseg
rx_fifo: .byte rx_size          ; then reserve sram space for the rx FIFO
rx_in:   .byte 2                ; and its pointers
rx_out:  .byte 2
rx_n:    .byte 1                ; and the counter

tx_fifo: .byte tx_size          ; same for the transmitter side
tx_in:   .byte 2
tx_out:  .byte 2
tx_n:    .byte 1

.cseg
init_FIFOs:                     ; this is a routine we can call during init:
    ldi r16, low(rx_fifo)       ; load address of the rx FIFO space to r17:r16
    ldi r17, high(rx_fifo)
    sts rx_in, r16              ; and store it as the in and
    sts rx_in + 1, r17
    sts rx_out, r16             ; out pointer
    sts rx_out + 1, r17         ; (high byte comes from r17, not r16)
    clr r16                     ; clear the counter
    sts rx_n, r16               ; and store it as well.

    ldi r16, low(tx_fifo)       ; same for the transmitter
    ldi r17, high(tx_fifo)
    sts tx_in, r16
    sts tx_in + 1, r17
    sts tx_out, r16
    sts tx_out + 1, r17         ; (high byte again from r17)
    clr r16
    sts tx_n, r16
    ret                         ; return from the routine

Receiver FIFO:
As the UART receiver only has one interrupt source, we don't need to choose one (this will be needed for
the transmitter). The UART Rx interrupt occurs whenever a byte has been received. This byte is then
added to the Rx FIFO by the ISR. Another routine is needed to consume a byte from the buffer again
during normal operation, for example when we need to process some received data.
That makes 2 routines for the Rx side. First, the ISR:
UART_RXC:                       ; UART Rx Complete ISR
    push r16                    ; save r16
    in r16, SREG                ; and SREG (before cpi changes the flags)
    push r16

    lds r16, rx_n               ; get counter
    cpi r16, rx_size            ; if FIFO not full,
    brlo rx_fifo_store          ; store data
    in r16, UDR                 ; else clear interrupt by reading UDR (the byte is lost)
    rjmp rx_fifo_exit           ; and restore SREG and r16

rx_fifo_store:
    push r17                    ; save r17
    push XL                     ; and a pointer
    push XH

    in r16, UDR                 ; get data
    lds XL, rx_in               ; set up pointer
    lds XH, rx_in + 1
    st X+, r16                  ; and store in FIFO

    ldi r16, low(rx_fifo + rx_size)  ; load r17:r16 with first invalid address after FIFO space
    ldi r17, high(rx_fifo + rx_size)
    cp XL, r16                  ; do a 16-bit compare:
    cpc XH, r17                 ; X = r17:r16?
    breq rx_fifo_w_rollover     ; if yes, roll over to beginning of FIFO space

rx_fifo_w_store:                ; store pointer rx_in
    sts rx_in, XL
    sts rx_in + 1, XH

    lds r16, rx_n               ; get counter
    inc r16                     ; increment
    sts rx_n, r16               ; store counter again

    pop XH                      ; restore registers we used
    pop XL
    pop r17
rx_fifo_exit:
    pop r16                     ; restore SREG
    out SREG, r16
    pop r16                     ; and r16
    reti                        ; return

rx_fifo_w_rollover:             ; if X stored the data at the last fifo memory location,
    ldi XL, low(rx_fifo)        ; roll over to the first address again
    ldi XH, high(rx_fifo)
    rjmp rx_fifo_w_store        ; and proceed as usual

Reading from the buffer requires another routine which uses the rx_out pointer to get data from the buffer.
It also doesn't need to save stuff, as it's not an ISR and will be executed at a known time. The routine shall
return the data from the buffer in r18.
UART_read_fifo:                 ; call this from within the application to get UART Rx data to r18
    lds r16, rx_n               ; load number of received bytes
    cpi r16, 1                  ; if one byte or more available,
    brsh rx_fifo_read           ; branch to rx_fifo_read
    ret                         ; else return

rx_fifo_read:                   ; data is available:
    lds XL, rx_out              ; Get the Rx FIFO consume pointer
    lds XH, rx_out + 1
    ld r18, X+                  ; and load data to r18

    ldi r16, low(rx_fifo + rx_size)  ; check if end of mem space reached:
    ldi r17, high(rx_fifo + rx_size) ; r17:r16 = first invalid address above Rx FIFO memory
    cp r16, XL                  ; 16-bit compare: X = invalid address above Rx FIFO memory?
    cpc r17, XH
    breq rx_fifo_r_rollover     ; yes, roll over to base address

rx_fifo_r_store:                ; store the new pointer
    sts rx_out, XL
    sts rx_out + 1, XH

    in r17, SREG                ; remember the interrupt state
    cli                         ; the Rx ISR changes rx_n as well, so keep it out for a moment
    lds r16, rx_n               ; load counter
    dec r16                     ; decrease it
    sts rx_n, r16               ; and store it again
    out SREG, r17               ; restore the interrupt state
    ret                         ; return to application

rx_fifo_r_rollover:             ; roll over to base address:
    ldi XL, low(rx_fifo)        ; load base address to X
    ldi XH, high(rx_fifo)
    rjmp rx_fifo_r_store        ; and store the pointer
Transmitter FIFO
The transmitter FIFO for the UART works just like the receiving one, with a small difference: The ISR
routine in this case reads from the FIFO and writes the data to UDR, while the write routine takes the data
from a specified location or register (let's take r18) and writes it to the FIFO.
So which interrupt do we choose? The UART offers the UART Data Register Empty (UDRE) interrupt and
the UART Transmit Complete (TXC) interrupt. The transmit complete interrupt only occurs when a
transmission is finished, so we can't use it for our purpose for two reasons:
- The transmission finishes and then the ISR is called. So what? Maximum speed can't be achieved when
using this interrupt. By using the UDRE int, the next byte to be transmitted is already in UDR when the
previous transmission finishes and can be transmitted by the hardware right away. If the interrupt occurs
when the previous transmission finishes, the next byte has to be taken from the buffer memory space first
and time is lost between two transmissions.
- If the UDRE interrupt is used and no data is available (last transmission was the last byte in the buffer)
we can just disable the UDRE int and re-enable it as soon as new data is written to the transmit FIFO. By
re-enabling it, the ISR will be called because UDR is empty and transmission will start again. The TXC int
will not provide this automatic transmission start.
The code for the transmit FIFO can be cut 'n pasted from the RX FIFO with the small changes described
above. This will be no problem if you understood the RX FIFO.
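For reference, here is one way the transmit side might look. This is a sketch under assumptions, not the code
from the linked asm file: the labels UART_write_fifo, tx_write_store, tx_write_done, tx_isr_send, tx_isr_store
and tx_isr_exit are made up, the byte to send is expected in r18 as described above, and the FIFO variables are
those from the initialisation code. A byte written to a full FIFO is simply dropped here.

```asm
UART_write_fifo:                ; application routine: queue the byte in r18 for transmission
    lds r16, tx_n               ; get counter
    cpi r16, tx_size            ; FIFO full?
    brsh tx_write_done          ; yes: give up, the byte is dropped
    lds XL, tx_in               ; set up write pointer
    lds XH, tx_in + 1
    st X+, r18                  ; store byte in FIFO
    ldi r16, low(tx_fifo + tx_size)
    ldi r17, high(tx_fifo + tx_size)
    cp XL, r16                  ; end of FIFO space reached?
    cpc XH, r17
    brne tx_write_store
    ldi XL, low(tx_fifo)        ; yes: roll over to base address
    ldi XH, high(tx_fifo)
tx_write_store:
    sts tx_in, XL               ; store the new pointer
    sts tx_in + 1, XH
    in r17, SREG                ; remember interrupt state
    cli                         ; the UDRE ISR changes tx_n as well
    lds r16, tx_n
    inc r16
    sts tx_n, r16
    sbi UCR, UDRIE              ; (re-)enable the UDRE interrupt: transmission starts
    out SREG, r17               ; restore interrupt state
tx_write_done:
    ret

UART_UDRE:                      ; UART Data Register Empty ISR
    push r16
    in r16, SREG
    push r16
    push r17
    push XL
    push XH
    lds r16, tx_n               ; anything left to send?
    tst r16
    brne tx_isr_send
    cbi UCR, UDRIE              ; no: disable the UDRE interrupt until new data is written
    rjmp tx_isr_exit
tx_isr_send:
    lds XL, tx_out              ; get the consume pointer
    lds XH, tx_out + 1
    ld r16, X+                  ; get byte from FIFO
    out UDR, r16                ; and send it
    ldi r16, low(tx_fifo + tx_size)
    ldi r17, high(tx_fifo + tx_size)
    cp XL, r16
    cpc XH, r17
    brne tx_isr_store
    ldi XL, low(tx_fifo)        ; roll over to base address
    ldi XH, high(tx_fifo)
tx_isr_store:
    sts tx_out, XL
    sts tx_out + 1, XH
    lds r16, tx_n               ; one byte less in the FIFO
    dec r16
    sts tx_n, r16
tx_isr_exit:
    pop XH
    pop XL
    pop r17
    pop r16
    out SREG, r16
    pop r16
    reti
```

The ISR needs its vector at 0x0008 on the 2313 (rjmp UART_UDRE), just like in the earlier examples.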
The following code does the following (it's written for a 2313):
Stack and UART setup (38400 baud @ 7.3728 MHz)
FIFO setup
Receive data via Rx FIFO and loop it back via Tx FIFO
If you have an STK 500 you only need to plug in a 2313 and a 7.3728 MHz crystal, connect PD0 to the
RS232 spare RxD pin and PD1 to the TxD pin. Don't forget power and the connection to your PC via a
COM port...
Also change the first line (include directive for 2313def.inc) to suit your system.
Here's the asm file
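The linked file is not reproduced here. As a rough idea of the loop-back part only, a main loop could look like
the sketch below. It assumes the UART_read_fifo routine from above (data returned in r18) and a Tx write
routine that takes the byte in r18, called UART_write_fifo here; that name is an assumption of this example.

```asm
loop:
    lds r16, rx_n               ; anything in the Rx FIFO?
    tst r16
    breq loop                   ; no: keep waiting (or do other work here)
    rcall UART_read_fifo        ; yes: r18 = oldest received byte
    rcall UART_write_fifo       ; queue r18 in the Tx FIFO
    rjmp loop
```

Checking rx_n first matters because UART_read_fifo returns without touching r18 when the FIFO is empty.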

The Analog Comparator


The analog comparator is a useful peripheral to compare two analog signals. For instance, you can
compare the output of a temperature sensor with a reference voltage, and take some action when the
temperature exceeds the level corresponding to the reference voltage.
The Analog Comparator has two stages. The first one is the Analog Comparator itself, which has two
inputs: Analog Input 0 (AIN0) and Analog Input 1 (AIN1). If AIN0 is greater than AIN1, the output of the
Analog Comparator is high. On the other hand, if AIN1 is greater than AIN0, the output of the Analog
Comparator is low. The second stage takes the output of the Analog Comparator and sets the
corresponding interrupt flag (ACI) and the Analog Comparator Output Flag (ACO). The following figure
shows a simplified scheme of the Analog Comparator :

Since AIN0 and AIN1 are the alternate functions of two Port B pins, you must set the data direction bits
accordingly (which pins are connected to the Analog Comparator depends on the particular AVR you are
using, check the datasheet). Clear DDBx and DDBy to set the pins as inputs, and clear PBx and PBy to
disable the internal pullup resistors.
The Analog Comparator is quite simple and has only one register : the Analog Comparator Control and
Status Register (ACSR):

Bit 7   Bit 6   Bit 5   Bit 4   Bit 3   Bit 2   Bit 1   Bit 0
ACD     ---     ACO     ACI     ACIE    ACIC    ACIS1   ACIS0

ACD (Analog Comparator Disable) bit : If you want to disable the Analog Comparator (for instance, to
reduce power consumption), you must set this bit. A word of caution : you must disable the Analog
Comparator interrupt before disabling the Analog Comparator to avoid an unintentional interrupt.
ACO (Analog Comparator Output) bit : Is the output of the Analog Comparator. You can read this bit to
determine the current state of the Analog Inputs. What the output states mean is described above.
ACI (Analog Comparator Interrupt Flag) bit : This bit is set when a comparator output triggers the interrupt
mode defined by the ACIS bits (see below for details). Also, if the Global Interrupt and the Analog Comparator
interrupt are enabled, the Analog Comparator interrupt service routine is executed. ACI is cleared by
hardware when executing the corresponding interrupt handling vector. Alternatively, ACI is cleared by
writing a logical 1 to the flag. YES, it's not a typo: you must write a 1 to clear the flag. This has a nasty side
effect : if you modify some other bit of ACSR using the SBI or the CBI instruction, ACI will be cleared if it
was set before the sbi/cbi operation.
ACIE (Analog Comparator Interrupt Enable) bit : When the ACIE bit is set and global interrupts are
enabled, the Analog Comparator interrupt is activated. When cleared, the interrupt is disabled.
ACIC (Analog Comparator Capture Enable) bit : One interesting thing you can do, is to connect the Analog
Comparator output to the Timer1/Counter1 Input Capture function. In this way, you can measure the time
between two events in the Analog Comparator. If you want to use this feature, set this bit.
ACIS (Analog Comparator Interrupt Mode Select) bits : you can choose when the Analog Comparator
Interrupt Flag (ACI) will be triggered. There are three possibilities: when the Analog Comparator output
changes from 0 to 1 (rising output edge), when the Analog Comparator output changes from 1 to 0 (falling
output edge), or whenever the Analog Comparator output changes (output toggle). As with the ACD bit,
you must disable the Analog Comparator interrupt when you change these bits to avoid unwanted
interrupts.
ACIS1   ACIS0   Interrupt Mode
0       0       Interrupt on output toggle
0       1       Reserved (don't use)
1       0       Interrupt on falling output edge
1       1       Interrupt on rising output edge
Some AVRs have a more complex Analog Comparator (see datasheets). It works like the Analog
Comparator explained here, but in addition, you can use an internal voltage reference or one of the inputs
of the Analog to Digital Converter as one of the Analog Inputs.
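Putting the register description together, a setup for a rising edge interrupt might look like the sketch
below. It is only an example under assumptions: it targets the 2313 (AIN0 = PB0, AIN1 = PB1, comparator vector
at 0x000A, names from 2313def.inc), and the LED on PD6 is hypothetical. Check the datasheet for other devices.
ACSR is written with "out" rather than sbi/cbi, so that ACI is only cleared where this is intended.

```asm
.org 0x0000
    rjmp reset
.org 0x000A                     ; Analog Comparator interrupt vector (2313)
    rjmp ANA_COMP

reset:
    ldi r16, low(RAMEND)        ; stack setup
    out SPL, r16
    cbi DDRB, 0                 ; AIN0 as input
    cbi DDRB, 1                 ; AIN1 as input
    cbi PORTB, 0                ; pullups off
    cbi PORTB, 1
    sbi DDRD, 6                 ; hypothetical LED pin as output
    ldi r16, (1<<ACIS1)|(1<<ACIS0)                      ; rising edge mode, interrupt still disabled
    out ACSR, r16
    ldi r16, (1<<ACI)|(1<<ACIE)|(1<<ACIS1)|(1<<ACIS0)   ; clear a pending flag and enable the interrupt
    out ACSR, r16
    sei                         ; enable interrupts
loop:
    rjmp loop                   ; nothing else to do

ANA_COMP:                       ; called when AIN0 rises above AIN1
    sbi PORTD, 6                ; take some action (sbi doesn't change SREG)
    reti
```

The mode bits are written first with ACIE cleared, following the advice above to change ACIS only while the
comparator interrupt is disabled.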
CAUTION :
You MUST respect the voltage range allowed for the AVR pins (see Absolute Maximum Ratings in the
Electrical Characteristics section of the datasheet). The voltage must be below VCC+0.5V and above -1V.
If you don't respect this, you will blow your AVR. Be sure that the analog signals you are using are in the
right range. If they come from the external world, it is a good idea to use some kind of protection at the
input. See the suggested circuit below (which consists of just one resistor...).

This circuit uses the internal clamping diodes present in all AVR I/O pins. If the analog voltage is higher
than Vcc plus the conduction voltage of the diode (around 0.5V), the upper diode will conduct and the
voltage at the input pin is clamped to Vcc+0.5V. On the other hand, if the analog voltage is lower than 0V
minus the conduction voltage of the diode, the lower diode will conduct, and the voltage at the input pin is
clamped to -0.5V. The resistor will limit the current through the conducting diode, which must not exceed
1mA, so you must design the resistor accordingly. For instance, if you expect that the maximum value the
analog voltage may reach is 24V, the resistor value should be :
R=24V/1mA=24K.

The Analog to Digital Converter (ADC)


[How it works] [Modes] [Registers] [Hardware Issues]
How it works
The Analog to Digital Converter (ADC) is used to convert an analog voltage (a voltage that varies
continuously within a known range) to a 10-bit digital value. For instance, it can be used to log the output
of a sensor (temperature, pressure, etc) at regular intervals, or to take some action depending on the
measured value. There are several types of ADCs. The one used by the AVR is of the "successive
approximation ADC" kind. The following is a simplified scheme of the ADC.

At the input of the ADC itself is an analog multiplexer, which is used to select between eight analog inputs.
That means that you can convert up to eight signals (not at the same time of course). At the end of the
conversion, the corresponding value is transferred to the registers ADCH and ADCL. As the AVR's registers
are 8 bits wide, the 10-bit value needs two registers.
The analog voltage at the input of the ADC must lie between 0V and the ADC's reference
voltage AREF. The reference voltage is an external voltage you must supply at the Aref pin of the chip. The
conversion result for a given input voltage can be calculated with the following formula:
ADC conversion value = round( (vin/vref)*1023)
Since it is a 10-bit ADC, you have 1024(1024=2^10) possible output values (from 0 to 1023). So, if vin is
equal to 0V, the result of the conversion will be 0, if vin is equal to vref, it will be 1023, and if vin is equal to
vref/2 it will be 512. As you can see, since you are converting a continuous variable (with infinite possible
values) to a variable with a finite number of possible values (elegantly called a "discrete variable"), the
ADC conversion produces an error, known as "quantization error".
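To make the formula and the quantization error more concrete, here is a worked example. The values vref = 5V and vin = 1.2V are assumptions chosen for illustration only:

```latex
\begin{align*}
\text{ADC value} &= \operatorname{round}\!\left(\frac{1.2\,\text{V}}{5\,\text{V}} \cdot 1023\right)
                  = \operatorname{round}(245.52) = 246 \\
\text{step size} &= \frac{5\,\text{V}}{1023} \approx 4.89\,\text{mV} \\
v_{246} &= 246 \cdot \frac{5\,\text{V}}{1023} \approx 1.2023\,\text{V}
\end{align*}
```

The value 246 stands for about 1.2023V, so the quantization error in this case is roughly 2.3mV, which is less than half a step.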
Modes of Operation
The ADC has two fundamental operation modes: Single Conversion and Free Running. In Single
Conversion mode, you have to initiate each conversion. When it is done, the result is placed in the ADC
Data register pair and no new conversion is started. In Free Running mode, you start the conversion only
once, and then the ADC automatically starts the next conversion as soon as the previous one is
finished.
The analog to digital conversion is not instantaneous, it takes some time. This time depends on the clock
signal used by the ADC. The conversion time is inversely proportional to the frequency of the ADC clock signal,
which must be between 50kHz and 200kHz for full 10-bit resolution.
If you can live with less than 10-bit resolution, you can reduce the conversion time by increasing the ADC
clock frequency. The ADC module contains a prescaler, which divides the system clock to an acceptable
ADC clock frequency. You configure the division factor of the prescaler using the ADPS bits (see below for
the details).

To know the time that a conversion takes, you just need to divide the number of ADC clock cycles needed for
the conversion by the frequency of the ADC clock. Normally, a conversion takes 13 ADC clock cycles. The first
conversion after the ADC is switched on (by setting the ADEN bit) takes 25 ADC clock cycles. This first
conversion is called an "Extended Conversion". For instance, if you are using a 200kHz ADC clock signal,
a normal conversion will take 65 microseconds (13/200e3=65e-6), and an extended conversion will take
125 microseconds (25/200e3=125e-6).
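The same calculation for a 125kHz ADC clock (for example a 4MHz system clock divided by 32, an assumption that matches the code examples further down):

```latex
\begin{align*}
t_{\text{normal}}   &= \frac{13}{125\,\text{kHz}} = 104\,\mu\text{s} \\
t_{\text{extended}} &= \frac{25}{125\,\text{kHz}} = 200\,\mu\text{s}
\end{align*}
```

With back-to-back normal conversions this works out to about 9600 conversions per second.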
Registers
[ADMUX] [ADCSR] [ADCL/ADCH]
There are four registers related to the operation of the ADC : ADC Multiplexer Select Register (ADMUX),
ADC Control and Status Register (ADCSR), ADC Data Register Low (ADCL) and ADC Data Register High
(ADCH). Let's discuss them in detail.
ADMUX
Bit 7  Bit 6  Bit 5  Bit 4  Bit 3  Bit 2  Bit 1  Bit 0
---    ---    ---    ---    ---    MUX2   MUX1   MUX0

This register is used to select which of the 8 channels (ADC0 to ADC7) will be the input to the
ADC. Since there are 8 possible inputs, only the 3 least significant bits of this register are used. The
following table describes the setting of ADMUX.
MUX2  MUX1  MUX0  Selected Input
0     0     0     ADC0
0     0     1     ADC1
0     1     0     ADC2
0     1     1     ADC3
1     0     0     ADC4
1     0     1     ADC5
1     1     0     ADC6
1     1     1     ADC7

You can see that it's possible to load a register with the desired input number and write it to ADMUX
directly, as the register does not contain any other flags or setting bits.
If these bits are changed during a conversion, the change will have no effect until this conversion is
complete. This is a problem when multiple channels are scanned:

If you can make sure that the ISR always changes the ADMUX value to the next channel (or some other
value that can be reconstructed by the next ISR) the value in the ADC data register pair is always the
conversion result from the last ADMUX change. When the ISR changes ADMUX from 2 to 3, the value in
the data registers is from channel 2.
ADCSR
Bit 7  Bit 6  Bit 5  Bit 4  Bit 3  Bit 2  Bit 1  Bit 0
ADEN   ADSC   ADFR   ADIF   ADIE   ADPS2  ADPS1  ADPS0

ADEN (ADC Enable) bit : Setting this bit enables the ADC. By clearing this bit to zero, the ADC is turned
off. Turning the ADC off while a conversion is in progress will terminate this conversion.

ADSC (ADC Start Conversion) bit : In Free Running Mode, you must set this bit to start the first
conversion. The following conversions will be started automatically. In Single Conversion Mode, you must
set it to start each conversion. This bit will be cleared by hardware when a normal conversion is
completed. Remember that the first conversion after the ADC is enabled is an extended conversion. An
extended conversion will not clear this bit after completion.
ADFR (ADC Free Running Select) bit : If you want to use the Free Running Mode, you must set this bit.
ADIF (ADC Interrupt Flag) bit : This bit is set when an ADC conversion is completed. If the ADIE bit is set
and global interrupts are enabled, the ADC Conversion Complete interrupt is executed. ADIF is cleared by
hardware when executing the corresponding interrupt handling vector. Alternatively, ADIF is cleared by
writing a logical 1 (!) to the flag. This has a nasty side effect : if you modify some other bit of ADCSR using
the SBI or the CBI instruction, ADIF will be cleared if it has become set before the operation.
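A sketch of how to live with this side effect, assuming the bit names come from the device's include file and r16 can be used as a scratch register. The first sequence clears ADIF on purpose, the second one starts a conversion without touching a pending flag:

```asm
; Clear ADIF deliberately and keep all other ADCSR settings.
    in   r16, ADCSR
    sbr  r16, (1<<ADIF)        ; make sure a 1 is written to the flag
    out  ADCSR, r16

; Start a conversion without clearing a pending ADIF:
; use in/out instead of "sbi ADCSR, ADSC" and write a 0 to the flag bit.
    in   r16, ADCSR
    cbr  r16, (1<<ADIF)        ; writing a 0 leaves the flag unchanged
    sbr  r16, (1<<ADSC)        ; set Start Conversion
    out  ADCSR, r16
```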
ADIE (ADC Interrupt Enable) bit : When the ADIE bit is set and global interrupts are enabled, the ADC
interrupt is activated and the ADC interrupt routine is called when a conversion is completed. When
cleared, the interrupt is disabled.
ADPS (ADC Prescaler Select) bits : These bits determine the division factor between the AVR clock
frequency and the ADC clock frequency. The following table describes the setting of these bits :
ADPS2  ADPS1  ADPS0  Division Factor
0      0      0      2
0      0      1      2
0      1      0      4
0      1      1      8
1      0      0      16
1      0      1      32
1      1      0      64
1      1      1      128
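As an example of picking a division factor, assume an AVR running at 8MHz (this clock value is an assumption for illustration). To end up between 50kHz and 200kHz, the factor has to be between 8e6/200e3=40 and 8e6/50e3=160, which leaves 64 (125kHz) and 128 (62.5kHz). A minimal sketch that selects 64, with bit names taken from the device's include file:

```asm
; 8 MHz / 64 = 125 kHz ADC clock (ADPS2..0 = 110)
    ldi  r16, (1<<ADEN)|(1<<ADPS2)|(1<<ADPS1)   ; enable ADC, prescaler 64
    out  ADCSR, r16                             ; no conversion is started yet
```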
ADCL and ADCH
These registers hold the result of the last ADC conversion. ADCH holds the two most significant bits, and
ADCL holds the remaining bits.
When ADCL is read, the ADC Data Register is not updated until ADCH is read. Consequently, it is essential
that both registers are read and that ADCL is read before ADCH.
Here is a code snippet to make a conversion of ADC3. The result is placed in r16 (low byte) and r17
(high byte). The AVR is running at 4MHz:

ADC_Init:
    ldi  r16, 3              ; Select ADC3
    out  ADMUX, r16
    ldi  r16, 0b10000101     ; Enable ADC, Single Conversion mode,
    out  ADCSR, r16          ; ADC interrupt disabled, prescaler division factor = 32;
                             ; this gives an ADC clock frequency of 4e6/32=125kHz.
    sbi  ADCSR, ADSC         ; Start conversion

Wait:
    sbis ADCSR, ADIF         ; Wait until the conversion is completed
    rjmp Wait

    in   r16, ADCL           ; Read the result, low byte first: ADCL into r16,
    in   r17, ADCH           ; then ADCH into r17.

The ATmega series of AVRs has a more complex ADC. It is similar to the ADC explained here, but
has some additional features (see the datasheet for the details), such as:

- 7 Differential Input Channels
- 2 Differential Input Channels with Optional Gain of 10x and 200x
- Optional Left Adjustment for ADC Result Readout
- Selectable 2.56V ADC Reference Voltage
- ADC Start Conversion by Auto Triggering on Interrupt Sources

Hardware issues
Due to the analog nature of the ADC, there are some additional issues you must consider. First of all, the
ADC has two separate analog supply voltage pins, AVCC and AGND. If your application doesn't require
great accuracy, you can keep your life simple and just connect AVCC directly to VCC, and AGND to GND.
However, if you want to get the best performance out of the ADC, you must pay special attention to the ADC
power supply and PCB routing. See the "ADC Noise Canceling Techniques" section of the datasheet to get
the details. Besides that, the CPU core of the AVR also induces some noise during the conversion. For this
reason, the ADC features a noise canceler that enables conversion during Idle Mode. Please see the
datasheet to get the details.
CAUTION :
You MUST respect the voltage range allowed for the AVR pins (see Maximum Absolute Ratings in the
Electrical Characteristics section of the datasheet). The voltage must be below VCC+0.5V and above -1V.
If you don't respect this, you will blow your AVR. Be sure that the analog signals you are using are in the
right range. If they come from the external world, it is a good idea to use some kind of protection at the input.
See the suggested circuit below (which consists of just one resistor...).

This circuit uses the internal clamping diodes present in all AVR I/O pins. If the analog voltage is higher
than Vcc plus the conduction voltage of the diode (around 0.5V), the upper diode will conduct and the
voltage at the input pin is clamped to Vcc+0.5V. On the other hand, if the analog voltage is lower than 0V
minus the conduction voltage of the diode, the lower diode will conduct, and the voltage at the input pin is
clamped to -0.5V. The resistor limits the current through the conducting diode, which must not exceed
1mA, so you must design the resistor accordingly. For instance, if you expect that the maximum value the
analog voltage may reach is 24V, the resistor value should be :
R=24V/1mA=24K.
Common Pitfall

I am not sure if this is a "Common" pitfall, but at least two guys (one of them me) have fallen into it. It is a
common temptation to use the output of the voltage regulator as the voltage reference for the ADC. The
problem is that a typical voltage regulator, like a 7805, has a voltage tolerance of about 5%. This means that
the ADC converted value can be off by up to 5%. Let's take an example. Suppose that the regulator output
voltage is 5.1V, and the input to the ADC is 2.5V. You would expect a converted value of 512, but instead
you get 501. Seeing that, you could think that something is wrong with your ADC, but the problem is with
your reference voltage. Don't worry, there are components designed to produce reference voltages, like the
LM285. There is, however, one exception to this rule: when you are making a ratiometric measurement. In
a ratiometric measurement, the measured voltage is a proportion of the regulator voltage, so any error in the value of
this voltage cancels out. The output of a potentiometer is a typical ratiometric output. The problem
described above typically shows up with external sensors that have their own power supply.
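Why the regulator error drops out when the measured voltage is a fixed fraction of the regulator voltage can be seen from the conversion formula. In this sketch, k is the (assumed) position of a potentiometer wiper between 0 and 1, and the regulator output Vreg is used both as the supply of the pot and as the ADC reference:

```latex
\begin{align*}
v_{in} &= k \cdot V_{reg}, \qquad V_{ref} = V_{reg} \\
\text{ADC value} &= \operatorname{round}\!\left(\frac{k \cdot V_{reg}}{V_{reg}} \cdot 1023\right)
                  = \operatorname{round}(1023\,k) \\
k = 0.25:\quad \text{ADC value} &= \operatorname{round}(255.75) = 256
\end{align*}
```

The result for k = 0.25 is 256 whether the regulator delivers 4.9V, 5.0V or 5.1V, because Vreg no longer appears in the formula.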

ATmega8 ADC Example


This is a very simple example you probably won't see in any serious design, but it's straightforward and
not hard to troubleshoot. We want to read the voltage output of a pot and show the most significant 8 bits
of the conversion result on port D (the ADC has a resolution of 10 bits!).
Choosing ADC Mode And Clock
The mega8 ADC offers 2 modes of operation, the Single Conversion Mode and the Free-Running Mode.
Both have advantages and disadvantages and I will try to discuss them both well enough now.
In Free-Running Mode, a new conversion is started when a conversion finishes. The new conversion is
done for the ADC channel set in ADMUX. If a new channel has to be converted, its number has to be set
*before* the new conversion starts. If an ISR is used to process the ADC results and update ADMUX, care
has to be taken, as a change of ADMUX just after the conversion start can have unpredictable results
(read pages 196/197 of the mega8 datasheet for more). As we only want to work with one ADC channel,
this is no problem for us. Only the very first conversion has to be started by setting the ADSC bit!
In Single Conversion Mode, every conversion has to be started by setting the ADSC (ADC Start
Conversion) bit in ADCSR. The advantage of this is that a new channel can be selected before the
conversion is started without watching out for timing problems or unpredictable results. If the ADC
Conversion Complete ISR is used for this, the loss in speed is quite small.
In this case we can use the Free-Running Mode - we don't need to change ADMUX (just one channel is
used).
The recommended ADC clock range is 50 kHz to 200 kHz. If a faster ADC clock is used, the resolution will
go down. The ADC clock is prescaled from the main clock by a separate ADC prescaler. The division factor
is selected with the ADPS2..0 bits in ADCSR (see pages 203/204 of the datasheet). At 4 MHz, appropriate
values are 32 (resulting ADC clock is 125 kHz) and 64 (resulting ADC clock is 62.5 kHz). We'll use 32.
What The ISR Does...
The mega8 ADC Conversion Complete ISR doesn't have to do much. When it is
called, a fresh ADC result is available in ADCL and ADCH. It is converted to 8 bits (the two LSBs aren't
used) and written to PortD. On the STK500 the LEDs are active low, so inverting the result before writing it
to PortD is a good idea.
The ISR is, of course, only called if the ADIE (ADC Interrupt Enable) bit in ADCSR is set and if global
interrupts are enabled.
The Example Code!
.org 0x0000                  ; reset vector:
    rjmp reset               ; jump to "reset"
.org 0x000E                  ; ADC Conversion Complete Interrupt vector:
    rjmp ADC_ISR             ; jump to "ADC_ISR"

reset:                       ; the reset code:
    ldi  r16, low(RAMEND)    ; stack setup; set SPH:SPL to
    out  SPL, r16            ; RAMEND
    ldi  r16, high(RAMEND)
    out  SPH, r16
    ldi  r16, 0xFF           ; set all PortD pins to output
    out  DDRD, r16
    ldi  r16, 0              ; write zero
    out  ADMUX, r16          ; to ADMUX (select channel 0)
    ldi  r16, 0b11101101     ; from left to right: ADC Enable, Start Conversion, Free-Running Mode,
    out  ADCSR, r16          ; write zero to ADC Int flag, enable int, prescaler: 101 for XTAL/32
    sei                      ; enable interrupts
loop:                        ; and loop
    rjmp loop                ; forever

ADC_ISR:                     ; Here it is, our ISR!
    push r16                 ; save r16
    in   r16, SREG           ; use r16 to save SREG
    push r16                 ; (push both on stack)
    push r17                 ; also save r17
    in   r16, ADCL           ; get the last ADC result, low byte first,
    in   r17, ADCH           ; then high byte
    lsr  r17                 ; shift ADC result right (2 bits)
    ror  r16                 ; by first shifting bit 0 of r17 out into carry, then rotating it into r16
    lsr  r17                 ; (twice)
    ror  r16
    com  r16                 ; now invert result
    out  PortD, r16          ; and write to PortD
    pop  r17                 ; restore r17,
    pop  r16                 ; SREG
    out  SREG, r16
    pop  r16                 ; and r16
    reti                     ; and return

Fuses
Certain features of AVRs are controlled by fuse bits. For example, settings like clock options or the brownout reset circuit are configured with these bits. These configurations differ from the other AVR peripherals
(like SPI, ADC, etc) because you set the fuse bits at program time (with a programmer) instead of writing
to some I/O memory space register at run time. So, for instance, to set the AVR to use an external
oscillator, you must set this at the moment you program it. There is no way to change the clock behavior
through the program code. If you change your mind, you must reprogram the AVR.
The details of how to program the fuse bits in your AVR depend on the particular programmer you are
using. Consult the manual about the details. For instance, if you are using an STK500 with AVR Studio, the
STK500 window has a tab labeled Fuses, where you set the different bits and where you can program,
verify or read the fuse bits.
A little oddity is that to program a feature, you must write a "0" to the particular bit. It's sort of a negative
logic. If you are programming the fuse bits with AVR studio, you don't have to worry about this issue
because the value of the fuse bit is managed by the programmer.
Fuse bits live in a different memory space than the program memory. This means that the fuse bits are not
affected by a program memory erasure. This has the advantage that once you program the correct fuse
bits in your AVR, you can forget about them and don't need to reprogram them each time you alter the
program memory.
Fuse bits differ greatly between different AVR variants. Some AVRs, like the AT90S8535, have only two
fuse bits, while others, like the ATmega128, have 18. I will explain the AT90S4433's fuse bits.
The AT90S4433 AVR has 6 fuse bits. One is related to serial programming (SPIEN), two are related to the
Brown-Out Reset Circuit operation (BODEN and BODLEVEL), and three are for configuring the clock
options (CKSEL2..0).
AVRs have two programming modes, parallel and serial. (See the Memory Programming section of the
datasheet for details). When the SPIEN fuse bit is programmed, serial programming and data downloading
is enabled. The default value for this fuse is programmed. You can change this fuse only if you are
programming the AVR in parallel mode.
The Brown-Out Detector circuit monitors the Vcc voltage. When Vcc drops below the trigger level, this
circuit resets the AVR. When Vcc is above the trigger level, the reset signal is released. If you want to
enable the Brown-Out Detector circuit, you must program the BODEN fuse. The BODLEVEL fuse sets the
trigger level. The following table summarizes this:
Fuse      Programmed                               Unprogrammed                             Default
BODEN     Brown-Out Detector circuit enabled       Brown-Out Detector circuit disabled      Unprogrammed
BODLEVEL  Brown-Out reset threshold voltage 4.0V   Brown-Out reset threshold voltage 2.7V   Unprogrammed

There are several options for the AVR clock which differ in the start-up time after a reset. You need to
adjust the start-up time according to the clock source you are using. See the datasheet and the clock
section for more details. The following table summarizes the setting of CKSEL fuses:
CKSEL[2..0]  Recommended Usage
000          External Clock, slowly rising power
001          External Clock, BOD enabled
010          Crystal Oscillator
011          Crystal Oscillator, fast rising power
100          Crystal Oscillator, BOD enabled
101          Ceramic Resonator
110          Ceramic Resonator, fast rising power
111          Ceramic Resonator, BOD enabled

Common Pitfall
A common pitfall with fuses is to forget about them, so you end up working with the default settings. If you
are lucky, these are the settings you need, but if you are not, strange things will happen. For instance,
many AVRs have an internal oscillator, which is enabled by default. If you have calculated the UART baud rate
for the frequency of an external oscillator, your serial link won't work. Or maybe your ATmega128 is not
working as expected, because you forgot to unprogram the ATmega103 compatibility mode fuse, which is
programmed by default. So, here is my advice: Carefully check the fuse bits before using your AVR!

Lock Bits
Lock Bits are similar to fuse bits. You program them through a programmer, and you can't change them at
run time. Lock Bits are used to set different access levels to Flash and EEPROM memory from an external
programmer.

All AVRs have at least two Lock Bits, LB1 and LB2, which allow you to configure three different Lock Bits
modes, as shown in the following table :
Mode  LB1  LB2  Description
1     1    1    You can read from and write to Flash and EEPROM with a programmer.
2     0    1    You can only read from Flash/EEPROM. Writing is disabled. The Fuse Bits can't be changed either.
3     0    0    Both reading and writing are disabled on Flash, EEPROM and Fuse Bits.

Maybe you are wondering why these different access levels are needed. Let's say that you have finished a
prototype, and don't want to accidentally erase your AVR, but in the future you want to read the program
memory to know the software version used. Then you use Lock Bit Mode 2. Or maybe you are selling a
product and don't want somebody to copy your program. Then you use Lock Bit Mode 3.
If you are suspicious, you may wonder how secure AVRs really are. In my opinion, they are pretty good in
that respect. However, they are not bulletproof. There are companies, like www.chipworks.com, that
specialize in reverse engineering. Their services are VERY expensive, and so it's unlikely that someone
pays them to copy your program. If your work is top secret, or if you are just paranoid, then you need to
use a microcontroller specifically designed with high security in mind, like Atmel's AVR based Secure
Microcontrollers. For more information about this issue, look at this as well.
The only way to unprogram a Lock Bit (change it from 0 to 1) is the Chip Erase command. Also notice
that if you use Lock Bits Mode 2 or 3, you can't change the Fuse Bits anymore. So, if you mess up things
with Lock Bits, don't worry, just erase the chip and start again.
The AVRs with bootloading capabilities (ATmega series), have four additional Lock Bits, which configure in
which section of FLASH the LPM and SPM instructions can be used. For the details, look at the particular
datasheet.
The details of how to program the Lock Bits in your AVR depend on the particular programmer you are
using. Read the manual for details. For instance, if you are using a STK500 with AVR Studio, the STK500
window has a tab labeled Lock Bits, where you set the different modes, and where you can program, verify
or read the Lock Bits.

The Watchdog Timer (WDT)


The Watchdog Timer (WDT) is a counter clocked by a separate On-Chip oscillator (different from the main
clock of the AVR). You can reset the WDT with the WDR (WDT reset) instruction. When the WDT
overflows, the AVR resets and starts the program from the reset vector, which is similar to externally pulling
down the reset pin.
The main purpose of the WDT is to make the device you are building robust and fault tolerant. The general
idea is to reset the WDT regularly. If something goes wrong (for instance when EMI corrupts the Program
Counter), the WDT won't be cleared and will overflow, and consequently reset the AVR. A discussion
about how to properly use the WDT to make your design bulletproof is out of scope. Here are links to
some very good articles about this issue :
- http://www.embedded.com/story/OEG20010920S0064
- http://www.embedded.com/story/OEG20021211S0032
- http://www.embedded.com/story/OEG20030115S0042
A general rule of thumb is that if your code is sprinkled with the WDR instruction, you are doing things
wrong.
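One way to follow this rule of thumb is a main loop with exactly one WDR at a single place, so the watchdog is only reset if every task has returned. This is a rough sketch, not a complete design: read_inputs and update_outputs are hypothetical placeholder routines, and the stack pointer is assumed to be initialised already.

```asm
main_loop:
    rcall read_inputs        ; every task must return for the loop to go on
    rcall update_outputs
    wdr                      ; the only WDR in the whole program
    rjmp  main_loop

read_inputs:                 ; placeholder, does nothing yet
    ret
update_outputs:              ; placeholder, does nothing yet
    ret
```

The WDT timeout then has to be longer than the worst-case time for one pass through the loop.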
Let's discuss in detail how the AVR Watchdog Timer works. It is quite simple, since there is only one
register used : the Watchdog Timer Control Register (WDTCR).
There are three things we can do with the WDT: enable the WDT, disable the WDT and set the prescaler.

You can set the prescaler to divide the WDT oscillator frequency, so that the reset interval can be adjusted.
It is important to notice that the On-chip oscillator frequency is dependent on the Vcc value and the
temperature of the chip. The following table shows the setting of the prescaler for the AT90S4433. All AVRs
behave similarly in that respect, but there are some minor variations between different devices. Check the
datasheet for the details.
WDP2  WDP1  WDP0  Number of Oscillator cycles  Timeout (Vcc = 3V)  Timeout (Vcc = 5V)
0     0     0     16K cycles                   47 ms               15 ms
0     0     1     32K cycles                   94 ms               30 ms
0     1     0     64K cycles                   0.19 s              60 ms
0     1     1     128K cycles                  0.38 s              0.12 s
1     0     0     256K cycles                  0.75 s              0.24 s
1     0     1     512K cycles                  1.5 s               0.49 s
1     1     0     1024K cycles                 3.0 s               0.97 s
1     1     1     2048K cycles                 6.0 s               1.9 s

Enabling the WDT is simple, just set the Watchdog Enable (WDE) bit. Disabling the WDT is not that
simple, you must follow a special procedure. The WDT is designed this way to avoid disabling the WDT
unintentionally by an instruction executed during a fault condition.
The Watchdog Turn-off Enable (WDTOE) bit is used to disable the WDT. To disable the WDT you must
follow the following procedure:
1. Write, with one instruction, a logical "1" to the WDTOE and WDE.
2. Within the next four clock cycles, write a logical "0" to WDE. This disables the Watchdog.
This is a code snippet to disable the WDT:
    ldi  r16, 0x18           ; WDTOE = 1, WDE = 1
    out  WDTCR, r16          ; set WDTOE and WDE
    ldi  r16, 0x10           ; WDTOE = 1, WDE = 0
    out  WDTCR, r16          ; write a 0 to WDE

A word of caution : before turning-on the WDT, and before changing the prescaler, execute a WDR
instruction. In this way you are sure that the WDT starts cleared and does not generate an accidental WDT
overflow.
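A minimal sketch of this advice, assuming the WDTCR layout described above (WDE plus WDP2..0) and bit names from the device's include file. The chosen prescaler value is just an example; some devices put additional restrictions on changing the prescaler, so check the datasheet.

```asm
    wdr                                      ; start from a cleared watchdog counter
    ldi  r16, (1<<WDE)|(1<<WDP2)|(1<<WDP1)   ; enable WDT, 1024K cycles (about 0.97 s at 5V)
    out  WDTCR, r16
```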

Watch Dog Timer Demo Code


The WDT demo code shows how the watch dog timer periodically generates a reset if it's not reset by the
wdr instruction. It uses the delay loop from the Getting Started code example. This delay loop generates a
delay of 0.5 seconds at 4 MHz.
An LED is connected to PortB.3 (active low - anode to Vcc, cathode to the pin via a current limiting
resistor). This example works with the mega8 and should be adaptable to fit other AVRs without problems:
.org 0x0000                  ; reset interrupt vector:
    rjmp reset               ; jump to "reset"

reset:                       ; our code:
    ldi  r16, low(RAMEND)    ; init Stack Pointer to RAMEND
    out  SPL, r16
    ldi  r16, high(RAMEND)
    out  SPH, r16
    sbi  DDRB, 3             ; set PortB.3 Direction bit to 1
    ldi  r16, 0b00001110     ; configure Watch Dog prescaler for 1.0 seconds at 5V, enable WDT
    out  WDTCR, r16
    rcall delay_05           ; delay for 0.5 seconds
    sbi  PortB, 3            ; turn off the LED
loop:
    rjmp loop                ; loop forever (until the WDT resets the AVR)

delay_05:
    ldi  r16, 8
outer_loop:
    ldi  r24, low(3037)
    ldi  r25, high(3037)
delay_loop:
    adiw r24, 1
    brne delay_loop
    dec  r16
    brne outer_loop
    ret

What the code will do is: Turn on the LED, wait for 0.5 seconds and turn it off again. As the Watch Dog is
enabled to generate a reset after 1 second, the reset will occur 0.5 seconds after the LED has been turned
off. The result is an LED flashing at 1 Hz with a duty cycle of 50%.

Connecting a HD44780 compatible LCD to an AVR


[ The LCD Connector ] [ Example Circuit ] [ LCD Command Set ] [ Display Data Addressing ]
These LCDs are the standard LCDs used everywhere. They come in many sizes and one size that's used
a lot is 16x2 characters, so we'll use one of that size for our examples. They're very easy to use in 8 bit
mode (4 bit is a little tricky).
The LCD Connector
The LCD Connector has 16 pins and is usually located at the top of the LCD. It offers pins for power,
contrast, control lines, data lines and the LED backlight (if installed):
Pin  Name
1    Gnd
2    5V
3    Vee
4    RS
5    R/W
6    E
7    D0
8    D1
9    D2
10   D3
11   D4
12   D5
13   D6
14   D7
15   LED+
16   LED-

Gnd and 5V shouldn't need any explanation. Vee is the LCD's contrast voltage and should be connected to
a pot (voltage divider). The voltage should be between 0 and 1.5V (this may vary for different
manufacturers) and the pin *can* also be tied to ground.
RS is the register select pin. To write display data to the LCD (characters), this pin has to be high. For
commands (during the init sequence for example) this pin needs to be low.
R/W is the data direction pin. For WRITING data TO the LCD it has to be low, for READING data FROM
the LCD it has to be high. If you only want to write to the LCD you can tie it to ground. The disadvantage of
this is that you then can't read the LCD's busy flag. This in turn requires wait loops for letting the LCD finish
the current operation, which also means wasting CPU time.
E is the Enable pin. When writing data to the LCD, the LCD will read the data on the falling edge of E. One
possible sequence for writing is:

- Take RW low
- RS as needed for the operation
- Take E high
- put data on bus
- take E low
You can also prepare the data before taking E high, which might save 1 word of code space (more on that
later).
The purpose of the data lines should be obvious. When no read operation is in progress, they're tri-stated
which means that these lines can be shared with other devices. In 4-bit mode only the high nibble (D4..D7)
is used. Bit 7 is the busy flag (more on that later).
The Example Circuit
As we want to concentrate on the mega8, the code examples for the LCD are also written for the mega8.
PortD is the only "complete" port offering us 8 bits, so PortD will be used as the data port. PortB is used by
the ISP, so it will not be used by the LCD. PortC can be used for the LCD control lines (RS, R/W and E):
PortD -> LCD Data
PortC.0 -> RS
PortC.1 -> R/W
PortC.2 -> E
For the LCD to work, three more lines are necessary: Vcc, Ground and Vee. Vee can be tied to ground.
The circuit can be built up using our mega8 board or an STK500.
When powered up, you should see a black bar in the first LCD line. That bar will disappear when the LCD
is being initialised. Init is done by using a special command set. Commands are issued by writing data to
the LCD when RS is low (see above). If the LCD is not initialised correctly, writing display data to it won't
work at all. So I'll briefly describe the command set now, then go on with writing characters on the screen.
Afterwards I'll describe the commands in detail.
The LCD Command Set
Most of the LCD commands don't need more time for the LCD to execute them than writing a character.
The datasheet of the LCD used for writing this code stated 40 µs for a simple command.
Clear Display: 0x01
This command clears the display and returns the cursor to the home position (line 0, column 0). This
command takes 1.64 ms to complete!
Cursor Home: 0b0000001x
This command also sets the cursor position to zero, but the display data remains unchanged. It
takes 1.64 ms for execution as well, and it also shifts the display back to its original position (later!).
Entry Mode:
0  0  0  0  0  1  I/D  S

I/D: Increment/Decrement Cursor bit. If set, the cursor will be post-incremented after any read data or write
data operation. If cleared, the cursor will be post-decremented.
S: If set, the whole display will be shifted, depending on I/D: If I/D and S are set, the display will be shifted to
the left, if I/D is cleared (S set), the display will be shifted to the right. Usually I/D = 1, S = 0 is used
(increment cursor, don't shift display).
Display On/Off:
0  0  0  0  1  D  C  B

D: Display On/Off. If set, the display is turned on. When the display is turned off, character data
remains unchanged!
C: Cursor On/Off. If set, the cursor is visible in the way set by B.
B: Cursor blink on/off. If this bit is set, the cursor will blink as a black block. Otherwise the cursor is shown
as an underscore _.
Shift Cursor/Display:
0  0  0  1  S/C  R/L  x  x

S/C: If set, the display is shifted. If cleared, the cursor is shifted. The direction depends on R/L.
R/L: If set, the display/cursor is shifted to the right. If cleared, it's shifted to the left.
Function Set:
0  0  1  DL  N  F  x  x

DL: Interface length. If set, 8-bit mode is selected (as in this example). If cleared, 4 bit mode is selected.
N: Number of display lines. If cleared, the LCD is a one line display. If set, the display is in 2/4 line mode.
F: Font size. If cleared, 5x7 Font is selected. If set, 5x10 font is selected.
The last two features might lead to the question "Why doesn't my display know what it is?" Well the
controller (HD44780) is always the same, but it works with many different display types (1 to 4 lines, 8 to
40 characters per line) and the displays also come with different character sizes (5x7 or 5x10).
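As an illustration, a Function Set command byte can be put together from its bit fields (0 0 1 DL N F x x). This is only a sketch: the constant names are made up for this example, and LCD_Command stands for a routine that writes the byte in r16 to the LCD with RS low (such a routine is shown in "First LCD steps...").

```asm
.equ LCD_FUNCTION_SET = 0b00100000   ; Function Set command bit
.equ LCD_DL_8BIT      = 0b00010000   ; DL = 1: 8-bit interface
.equ LCD_N_2LINES     = 0b00001000   ; N = 1: 2/4 line mode
.equ LCD_F_5X10       = 0b00000100   ; F = 1: 5x10 font (not used below)

    ldi   r16, LCD_FUNCTION_SET | LCD_DL_8BIT | LCD_N_2LINES   ; = 0x38
    rcall LCD_Command                ; command byte is passed in r16
```

The resulting value 0x38 selects the 8-bit interface, 2 line mode and the 5x7 font, which fits the 16x2 display used here.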
CG Ram Address Set:
0  1  ACG (6 bits)

ACG is the Address to be set for Character Generator Ram access. The CG can be used to configure and
show custom characters.
DD Ram Address Set: 1 ADD (ADD: 7 address bits)
ADD is the address to be set for Display Data Ram access. The display data ram holds the data displayed
(the characters). See below for DD Ram organisation.
Busy Flag/DD Ram Address READ: BF ADD (BF: bit 7, ADD: 7 address bits)
If the command register is read, the value actually returned by the LCD is the DD Ram Address (bits 0..6)
and the busy flag (bit 7). The busy flag should be read after every command to achieve maximum speed. If the
busy flag is set, the controller is still busy executing a command/writing data.
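Splitting the byte read from the command register into its two parts is just a matter of masking. A minimal sketch (the function names are assumptions for illustration):

```c
#include <stdint.h>

/* Hypothetical helpers: split the status byte read with RS low. */

/* Returns 1 while the controller is still busy (bit 7 set). */
int lcd_is_busy(uint8_t status)
{
    return (status & 0x80) != 0;
}

/* Returns the DD Ram address (bits 0..6). */
uint8_t lcd_address(uint8_t status)
{
    return (uint8_t)(status & 0x7F);
}
```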
Display Data Addressing
The display controller has to offer a way of addressing all display characters which works for ALL kinds of
displays (remember: 1 to 4 rows, 8 to 40 characters). That's why the rows don't follow each other. The
rows start at fixed addresses:

Display size   1st row      2nd row      3rd row      4th row
N x 8          $00 - $07    $40 - $47    x            x
N x 16         $00 - $0F    $40 - $4F    $10 - $1F    $50 - $5F
N x 20         $00 - $13    $40 - $53    $14 - $27    $54 - $67

Of course this list is not complete, but it shows a nasty detail about using a 16 x 4 display: The first
address of the second row is bigger than the first address of the third row!
Example: For setting the cursor to the 3rd character of the second row in a 16 x 2 display, write 0x42 |
0b10000000 to the display command register.
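The address table can also be turned into a small calculation. The sketch below is an assumption derived from the table rows for 16 and 20 character displays (rows 3 and 4 continue where rows 1 and 2 end); the function names are made up, and rows and columns are counted from zero.

```c
#include <stdint.h>

/* Hypothetical helper: DD Ram address for a 16 or 20 character wide display.
   row: 0..3, col: 0..width-1 */
uint8_t lcd_ddram_address(uint8_t row, uint8_t col, uint8_t width)
{
    uint8_t base = (row & 1) ? 0x40 : 0x00;  /* 2nd and 4th row start from $40 */
    if (row >= 2)
        base = (uint8_t)(base + width);      /* 3rd/4th row follow the 1st/2nd row */
    return (uint8_t)(base + col);
}

/* "DD Ram Address Set" command: bit 7 set, address in bits 0..6 */
uint8_t lcd_set_cursor_cmd(uint8_t row, uint8_t col, uint8_t width)
{
    return (uint8_t)(0x80 | lcd_ddram_address(row, col, width));
}
```

lcd_set_cursor_cmd(1, 2, 16) gives 0xC2, which is the 0x42 | 0b10000000 from the example above.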
The following page will explain display initialisation and simple read/write operations together with code
examples.

First LCD steps...


Writing Commands
As already described, writing commands is writing data to the LCD with RS low. With PortD -> Data,
PortC.0 -> RS and PortC.2 -> E the code for writing a command can look like this:
LCD_Command:            ;write command to LCD routine (command is in r16)
    cbi PortC, LCD_RS   ;clear RS for command register select
    out PortD, r16      ;put data on bus
    sbi PortC, LCD_E    ;set Enable
    nop                 ;the number of nops here depends on your clock
    nop                 ;speed. One or two works well at 7.3728 MHz
    nop                 ;
    cbi PortC, LCD_E    ;clear Enable
    ret                 ;return from subroutine


The nop delay between setting and clearing E is important for the line to settle. I've tried the code without
the nops and sometimes it didn't work due to the long wires. After a command is issued you have to either
insert a wait routine or check the busy flag (see "Reading Data and Busy Flag").
So if we want to initialise the LCD for 8-bit mode, 2 lines and 5x7 font, we need to write 0b00111000 (0x38)
to the command register:
    ldi r16, 0x38       ;Load command (8bits, 2 lines, 5x7 font)
    rcall LCD_Command   ;and write it to the LCD using the routine above

Writing Data
Writing data to the LCD is just as easy as writing commands to it, but now we have to SET the RS line.
We'll use r16 for the routine argument again:
LCD_w_Data:             ;write data to LCD routine (data is in r16)
    sbi PortC, LCD_RS   ;set RS for data register select
    out PortD, r16      ;put data on bus
    sbi PortC, LCD_E    ;set Enable
    nop                 ;the number of nops here depends on your clock
    nop                 ;speed. One or two works well at 7.3728 MHz
    nop                 ;
    cbi PortC, LCD_E    ;clear Enable
    cbi PortC, LCD_RS   ;and RS
    ret                 ;return from subroutine

As AVR Studio can convert ascii characters to their hex value, we can use 'A' for loading r16:
    ldi r16, 'A'        ;Load character 'A'
    rcall LCD_w_Data    ;and write it to the LCD using the routine above

Reading Data and Busy Flag


For reading data from the LCD we have some more work to do. Again, we'll need to make two routines:
One will access the command register for reading the address counter and the busy flag, one will be used
for reading display data (characters). As reading requires PortD (the data port) to be configured as input,
we'll have to take a close look at the data direction. If it is not reset (to output) correctly before returning
from the routine, the next write routine could run into problems.
The LCD will put its data on the bus while E is high. So we need to take E high, wait a bit (for the LCD to
give us the data), then get the data byte and then take E low again. For the address/busy flag read
(command reg, RS low) the routine would look like:
LCD_r_Addr:             ;read address from the LCD
    cbi PortC, LCD_RS   ;clear RS for command register select
    sbi PortC, LCD_RW   ;set RW for read direction
    ldi r16, 0x00       ;configure Data port
    out DDRD, r16       ;as input
    sbi PortC, LCD_E    ;now take E high
    nop                 ;wait a bit
    in r16, PinD        ;get the address and busy flag
    cbi PortC, LCD_E    ;clear E
    ldi r17, 0xFF       ;PortD as output again
    out DDRD, r17       ;
    cbi PortC, LCD_RW   ;clear RW again
    ret                 ;return

For reading data from the LCD, just rewrite the routine with LCD_RS = 1 (data register select). It will then
read the data at the location the Address counter is pointing to. I won't rewrite this routine now, as you just
have to change one single line.
The good thing is that we can now use the LCD_r_Addr routine to check the LCD's busy flag. Before, we
would have needed to include delays between the command and data writes. Now we can wait until the
LCD has finished (AND NOT ANY LONGER!) and then proceed with the next command. The LCD_wait
routine can have separate read data code (this will speed up things, but require more code space) or it
can use LCD_r_Addr for reading the busy flag:
LCD_wait:               ; LCD_wait: wait for busy flag to clear
    rcall LCD_r_Addr    ; read address and busy flag
    sbrc r16, 7         ; if busy flag cleared, return
    rjmp LCD_wait       ; else repeat read/check
    ret                 ; return when busy flag cleared

This way of writing the routine has a good side effect: When it returns, the busy flag is cleared from r16
(because the LCD cleared it), but r16 still holds the address we just read from the display. It can be used
for other purposes then.
The Init Sequence
The LCD init sequence has to be executed after startup. It tells the LCD which font size it has, what kind of
interface to use, if and how the cursor should be shown and so on. Here's a working init sequence for a 16
x 2 LCD, 8 bit interface, 5 x 7 font; show cursor as underscore; auto-increment cursor:

    ldi r16, 0b00111000 ; 0x38: 8 bit interface, 5 x 7 font, 2 lines
    rcall LCD_Command   ;
    rcall LCD_wait      ;
    ldi r16, 0b00001110 ; display on, show cursor, don't blink
    rcall LCD_Command   ;
    rcall LCD_wait      ;
    ldi r16, 0x01       ; clear display, cursor home
    rcall LCD_Command   ;
    rcall LCD_wait      ;
    ldi r16, 0b00000110 ; auto-increment cursor
    rcall LCD_Command   ;

Though no data has been written to the LCD before issuing the clear display/cursor home command, the
cursor can be at a position that's not visible, so this command is important if you want to see what you
write to the LCD!
Some really slow LCDs might require your app to write 0x30 (8 bit interface) to the LCD before any other
command. If your LCD refuses to work, try that.

LCD C Example Code


This is my first attempt to present C example code, so please don't flame for bad programming style or
something like that. It's written in good old asm style (I just couldn't resist...) and includes lots of
comments. It has been written for AVR-GCC.
Some notes on the hardware this code is written for:
ATmega8 (clock frequency doesn't matter)
LCD data port <-> PortD
LCD_RS <-> PortC.0
LCD_RW <-> PortC.1
LCD_E <-> PortC.2
Also make sure to have a valid contrast voltage at pin 3 of the LCD (0..1.5V), or just tie it to ground. The
LCD this code was tested with is a Displaytech 164A (16x4 characters) LCD.
#include <avr/io.h>
//_delay_loop_2() lives in <util/delay_basic.h> (older avr-libc versions used <avr/delay.h>)
#include <util/delay_basic.h>
#define LCD_RS 0
#define LCD_RW 1
#define LCD_E 2
//LCD_putchar writes a character to the LCD at the current address, no busy flag check is done before or
//after the character is written!
//usage: LCD_putchar('A'); or LCD_putchar(0x55);
void LCD_putchar(char data)
{
    //PortD is output
    DDRD = 0xFF;
    //put data on bus
    PORTD = data;
    //RW low, E low
    PORTC &= ~((1<<LCD_RW)|(1<<LCD_E));
    //RS high, strobe E
    PORTC |= ((1<<LCD_RS)|(1<<LCD_E));
    //the number of nops required varies with your clock frequency, try it out!
    asm volatile ("nop");
    asm volatile ("nop");
    asm volatile ("nop");
    asm volatile ("nop");
    //RS low again, E low (belongs to strobe)
    PORTC &= ~((1<<LCD_RS)|(1<<LCD_E));
    //release bus
    DDRD = 0;
}
//LCD_getaddr reads the address counter and busy flag. For the address only, mask off bit7 of the
//return value.
char LCD_getaddr(void)
{
    //make var for the return value
    char address;
    //PortD is input
    DDRD = 0;
    //RW high, strobe enable
    PORTC |= ((1<<LCD_RW)|(1<<LCD_E));
    asm volatile ("nop");
    asm volatile ("nop");
    //while E is high, get data from LCD
    address = PIND;
    //reset RW to low, E low (for strobe)
    PORTC &= ~((1<<LCD_RW)|(1<<LCD_E));
    //return address and busy flag
    return address;
}
//LCD_wait reads the address counter (which contains the busy flag) and loops until the busy flag is
//cleared.
void LCD_wait(void)
{
    //get address and busy flag
    //and loop until busy flag cleared (empty loop body)
    while((LCD_getaddr() & 0x80) == 0x80);
}
//LCD_command works EXACTLY like LCD_putchar, but takes RS low for accessing the command reg
//see LCD_putchar for details on the code
void LCD_command(char command)
{
    DDRD = 0xFF;
    PORTD = command;
    PORTC &= ~((1<<LCD_RS)|(1<<LCD_RW)|(1<<LCD_E));
    PORTC |= (1<<LCD_E);
    asm volatile ("nop");
    asm volatile ("nop");
    asm volatile ("nop");
    asm volatile ("nop");
    PORTC &= ~(1<<LCD_E);
    DDRD = 0;
}
/*LCD_init initialises the LCD with the following parameters:
8 bit mode, 5*7 font, 2 lines (also for 4 lines)
auto-inc cursor after write and read
cursor and display on, cursor blinking.
*/
void LCD_init(void)
{
    //setup the LCD control signals on PortC
    DDRC |= ((1<<LCD_RS)|(1<<LCD_RW)|(1<<LCD_E));
    PORTC = 0x00;
    //if called right after power-up, we'll have to wait a bit (fine-tune for faster execution)
    _delay_loop_2(0xFFFF);
    //tell the LCD that it's used in 8-bit mode 3 times, each with a delay in between.
    LCD_command(0x30);
    _delay_loop_2(0xFFFF);
    LCD_command(0x30);
    _delay_loop_2(0xFFFF);
    LCD_command(0x30);
    _delay_loop_2(0xFFFF);
    //now: 8 bit interface, 5*7 font, 2 lines.
    LCD_command(0x38);
    //wait until command finished
    LCD_wait();
    //display on, cursor on (blinking)
    LCD_command(0x0F);
    LCD_wait();
    //now clear the display, cursor home
    LCD_command(0x01);
    LCD_wait();
    //cursor auto-inc
    LCD_command(0x06);
}
//now it's time for a simple function for showing strings on the LCD. It uses the low-level functions above.
//usage example: LCD_write("Hello World!");
void LCD_write(char* dstring)
{
    //is the character pointed at by dstring a zero? If not, write character to LCD
    while(*dstring)
    {
        //if the LCD is busy, let it finish the current operation
        LCD_wait();
        //then write the character from dstring to the LCD, then post-inc the pointer dstring.
        LCD_putchar(*dstring++);
    }
}
This code example is also available as a complete .h file with tabs for better reading: lcd.h

Using An LCD In 4 bit Mode


If 8-bit mode cannot be used (not enough free pins on the AVR side or due to PCB design problems), the 4
bit mode of HD44780 compatible LCDs can be used. In this mode only the upper 4 data lines are used and
the control lines stay the same, which makes a total of 7 lines for the LCD. However, the code size and
execution time increase, as each data or control byte has to be sent to the LCD in 2 nibbles.
The basic principles of using LCDs are the same as in 8-bit mode. The commands are the same and the
busy flag should be checked before any operation. Have a look at the pages about LCDs in 8-bit
mode before reading this.
Writing data or commands to the LCD in 4-bit mode is done high nibble first, then low nibble, so the enable
pin (E) has to be strobed twice. Special care should be taken when using a single AVR port, as mixing up
input and output lines can damage both the AVR and the LCD.
Here's a one-port setup for using an LCD in 4-bit mode (PortD):
PortD.4 ... PortD.7 -> Data.4 ... Data.7
PortD.3 -> LCD Enable line
PortD.2 -> LCD RW line
PortD.1 -> LCD RS line
The control lines are always output lines from the controller to the LCD, but the data line direction changes
depending on the current operation (as already noted, special care should be taken here!).

I'll now explain how to write commands and data to the LCD as well as reading the address counter and
data. For a better overview, I suggest opening the LCD 4-bit mode code m8_LCD_4bit.asm in a
separate window. The code contains init and some other routines which are not of interest now. They are
commented though and should be easy to understand. Scroll down to LCD_command8, LCD_putchar and
LCD_command.
For initialising the LCD, we need to write commands to it. So have a look at LCD_command8 and
LCD_command now. LCD_command was written after LCD_putchar, but it shares LCD_putchar's
comments. I didn't repeat them.
After power-up the LCD is in 8-bit mode by default. For switching to 4-bit mode, we need to write an 8-bit
command to it: 0b00100000. Hey! We've just got 4 data lines, so how should we write an 8-bit command?
The good news about this is that the data length bit is in the upper nibble, which is connected to the LCD
(see above). The lower 4 bits are not important now. We just want to set the data length to 4 bits now. As it
is an 8-bit command, we only have to strobe E once, not twice as in 4-bit mode. The extra routine is ONLY
needed for init. The only thing to watch out for is the data direction for each pin, as control and data lines
share the same port. It's all in the comments, really! Have a look (send me an email if it's not commented
enough!).
Now comes the interesting part. Writing characters to the LCD works EXACTLY like writing commands, but
when writing a character RS is taken high, while for commands it is taken low. Nothing special :-) First of all,
the data direction bits for the data lines are set for output, as in
DDRD |= 0xF0;
Then, for safety, all LCD lines are cleared. PortD.0 is not used by the LCD, so this bit is saved through this
process (see code, only the upper 7 bits are cleared). In 4-bit mode, the high nibble of our data byte is
written first, then the low nibble is written. For writing the high nibble, the low nibble of the argument is
cleared. Then the rest (the high nibble) is combined with the PortD data:
PortD |= (argument & 0xF0);
Then the control lines are set as needed: RS high (char) or low (command), RW low. Then E is strobed to
write the high nibble. Now we need the low data nibble, which we destroyed by clearing the low nibble
before for writing the high nibble. This means that the argument has to be saved at the beginning of the
routine (in this case it's pushed onto the stack). For getting the low nibble again, we now pop the argument
again and clear the high nibble. The data lines are on the high port lines though, so we also need to swap
the argument now. The high port data nibble is cleared, then the port data is ORed with the argument to
set the data lines as required. Again, E is strobed. Before returning from the routine, the data direction of
the LCD data port is set to input again, the control lines stay outputs. This procedure is the same for data
and commands, as already mentioned.
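The nibble handling described above can be tried out on a PC without any hardware. The sketch below only calculates the values that would be written to PortD. It uses the pin assignment from the one-port setup (RS = PortD.1, RW = PortD.2, E = PortD.3), the function names are made up for illustration, and E is left low (the strobe itself is not modelled).

```c
#include <stdint.h>

#define LCD_RS 1   /* PortD.1 */
#define LCD_RW 2   /* PortD.2 */
#define LCD_E  3   /* PortD.3 */

/* High nibble of the argument, already sitting on the data lines (bits 4..7). */
uint8_t lcd4_high_nibble(uint8_t arg)
{
    return (uint8_t)(arg & 0xF0);
}

/* Low nibble of the argument, moved up to bits 4..7 (same result as swap + mask). */
uint8_t lcd4_low_nibble(uint8_t arg)
{
    return (uint8_t)((arg & 0x0F) << 4);
}

/* New PortD value for one nibble: PortD.0 is kept, all LCD lines are cleared first,
   then the data nibble and RS are ORed in. RW and E stay low here. */
uint8_t lcd4_port_value(uint8_t port, uint8_t nibble, int rs)
{
    port &= 0x01;                      /* keep PortD.0, clear the upper 7 bits */
    port |= (uint8_t)(nibble & 0xF0);  /* data on PortD.4 ... PortD.7 */
    if (rs)
        port |= (1 << LCD_RS);         /* RS high for characters */
    return port;
}
```

For the character 'A' (0x41) the high nibble value is 0x40 and the low nibble value is 0x10; each of them is put on the port, followed by one E strobe.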
With these tools (LCD_command and LCD_command8) it's possible to init the LCD (a small delay
routine is also needed). First, the LCD is set to 4-bit mode. Then the usual settings are made (-> LCD_init
at the end of the code!).
We still can't check the busy flag or read data from the LCD. Checking the busy flag is especially useful
during LCD init (we can get rid of those looooong delays!). I'll now just describe reading in general, as all
read operations are again almost equal. When reading from the LCD, the high nibble is also transmitted
first. We have to read it while E is high during the E strobe. The read routines first make sure that the data
direction bits for the data lines are zero (input). Then E is taken high and the PIN data is read. Now the pin
data still contains some unknown bits (especially PinD.0, which might be used by other app code!) and
these bits are masked away. The remaining value is the high nibble of the data read, which will be stored
in the high nibble of our return value (mov is used for this so that the return value doesn't have to be
cleared before). Then the low nibble is read in the same manner (read PINs while E is high). For
combining the high nibble with the new low nibble we again have to clear the unused bits in the value from
the PIN register. The low nibble of the LCD data is now in the high nibble of the PIN value, so we need to
swap the pin value. Then OR is used to combine the return value (high nibble) with the new low nibble.
That's it!
A routine that waits for the busy flag is no problem then: Just read the address counter (which includes BF),
mask off the lower 7 bits and see if the result is zero. If it is, the busy flag is cleared. Have a look at
LCD_wait.
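Combining the two PIN reads into one byte, and the busy flag check on the result, could be sketched like this. The function names are assumptions; pin_first and pin_second stand for the two values read from PinD while E was high.

```c
#include <stdint.h>

/* Hypothetical helper: build one byte from the two 4-bit reads.
   The LCD data always arrives on PinD.4 ... PinD.7. */
uint8_t lcd4_combine(uint8_t pin_first, uint8_t pin_second)
{
    uint8_t value = (uint8_t)(pin_first & 0xF0);   /* high nibble, unknown bits masked away */
    value |= (uint8_t)((pin_second & 0xF0) >> 4);  /* low nibble, moved down to bits 0..3 */
    return value;
}

/* Busy flag check: mask off the lower 7 bits; zero means "not busy". */
int lcd4_is_busy(uint8_t status)
{
    return (status & 0x80) != 0;
}
```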

The main code first inits the LCD and then writes 'A' to the first LCD position. Then LCD_command is used
to set the address counter to zero again: If bit 7 of the command is set, the lower 7 bits are interpreted as
an address for the cursor. Then the character at position zero is read (after a read operation, the cursor is
also auto-incremented!) and written to the next position. The LCD now shows "AA".

Interfacing an AVR to a 24C16 TWI EEPROM
(mega8 Example Code in C!)
Hardware
This TWI code example for communicating with a 24C16 EEPROM also uses a 16*4 LCD for debugging.
Please have a look at the LCD C Code in the LCD section for a short description of the functions used
(they're all pretty simple). The LCD is connected as explained there. Everything needed for the TWI
connection is the TWI bus itself, plus ground and supply voltage for the EEPROM and pull-up resistors on
the TWI lines. On the mega8, SCL is at PortC.5 and SDA is at PortC.4. The EEPROM pinout is shown
below:

<- Pinout of the 24C16 DIP package


The EEPROM has more pins than just SCL, SDA and power. First of all, there's three address pins
(A0..A2), which have no effect for this type of EEPROM (just don't connect them). Smaller EEPROMs
(24C08) use them. The memory size of the 24C16 is 16k bits (!) which makes 2k bytes. They are
organised in 128 pages with 16 bytes each. The upper three page address bits are transferred together
with the slave address (which is 0xA0 for writing and 0xA1 for reading).
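Splitting a memory address (0..2047) into the slave address byte and the word address could look like the sketch below. The function names are made up for illustration; the bit positions are the ones described here (0xA0 base, upper three address bits in bits 1..3, R/W in bit 0).

```c
#include <stdint.h>

/* Hypothetical helper: first byte on the bus.
   mem_addr: 0..2047, read: 0 for writing, 1 for reading */
uint8_t ee24c16_slave_byte(uint16_t mem_addr, int read)
{
    uint8_t upper = (uint8_t)((mem_addr >> 8) & 0x07);   /* upper three address bits */
    return (uint8_t)(0xA0 | (upper << 1) | (read ? 1 : 0));
}

/* Hypothetical helper: the 8-bit word address sent as second byte. */
uint8_t ee24c16_word_address(uint16_t mem_addr)
{
    return (uint8_t)(mem_addr & 0xFF);
}
```

For memory address 0 this gives 0xA0 (write) or 0xA1 (read) and word address 0x00, which are the values used in the code examples on this page.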
The WP pin can be used to protect certain memory areas from being overwritten (calibration values for
example). If tied to Vcc, the upper half memory array is protected. When WP is tied to ground, the whole
EEPROM can be written to. We'll tie it to ground for now.
Write Operations
You can write single bytes or whole 16 byte pages to the EEPROM. Both write operations have the same
structure. Here's the byte-write figure:

First of all, a start condition is generated. Then the slave address byte is transferred. It consists of the bare
slave address (upper nibble: 0xA0 = 0b10100000 for better viewing), the upper three page select bits (bits
1..3) and the read/write bit (LSB) which is zero for writing. This address block is ACKed by the EEPROM if
the EEPROM is present and NOT BUSY (!). If the EEPROM is busy, it will not respond with an ACK. Then
the word address is transferred. Don't mind the "*" in the figure, it's for the 1K version of the EEPROM. In
our case, the word address has 8 bits. This address is also ACKed by the EEPROM.
The difference between byte-write and page-write is just the number of data bytes that are transferred now.
For a byte write just transfer one data byte (which is again ACKed by the EEPROM if everything's alright).
For a page write, transfer up to 16 bytes. If more than 16 bytes are transferred, the page address counter
will roll over to the first address of the current page.
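The page roll-over is easy to model: only the lower four address bits count up, the page bits stay where they are. A small sketch (the function name is an assumption):

```c
#include <stdint.h>

/* Hypothetical model of the address counter during a page write:
   the byte offset (lower 4 bits) wraps around, the page number is kept. */
uint16_t ee24c16_next_write_addr(uint16_t addr)
{
    uint16_t page   = (uint16_t)(addr & 0x07F0u);        /* 128 pages of 16 bytes */
    uint16_t offset = (uint16_t)((addr + 1u) & 0x000Fu); /* wraps from 15 to 0 */
    return (uint16_t)(page | offset);
}
```

So writing a 17th byte to a page overwrites the first byte that was sent.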
When everything is done, the master generates a Stop condition. The EEPROM should disconnect itself
from the bus and enter some kind of power-save mode.
After every TWI operation, the TWI will set the TWINT flag and return a status code in TWSR. TWINT is
NOT set after the TWI generated a Stop condition (why should it?). Our code will tell the TWI what to do,
then wait for TWINT being set and then check the status code to see if everything is right. Depending on
the status of the operation that was completed, it will print success/error messages on the LCD.
Before we can do any printing, we'll have to run through some init code though:
LCD_init();
TWBR = 32;
LCD_init() will initialize the LCD (8 bit interface, 2 lines, 5*7 font, auto-inc cursor, cursor on and blinking).
TWBR is the TWI bit rate register. At 8 MHz, a value of 32 will result in a SCL frequency of 100kHz.
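The value 32 comes from the bit rate formula in the mega8 datasheet: SCL = CPU clock / (16 + 2 * TWBR * 4^TWPS), where TWPS are the prescaler bits in TWSR (zero after reset). Here's a small PC-side sketch for checking other clock speeds; the function names are made up for illustration.

```c
#include <stdint.h>

/* SCL frequency for a given CPU clock, TWBR value and prescaler setting (TWPS = 0..3). */
uint32_t twi_scl_hz(uint32_t f_cpu, uint8_t twbr, uint8_t twps)
{
    uint32_t prescaler = 1u << (2u * twps);   /* 4^TWPS: 1, 4, 16 or 64 */
    return f_cpu / (16u + 2u * twbr * prescaler);
}

/* TWBR for a wanted SCL frequency with prescaler 1 (TWPS = 0).
   Assumes f_cpu is at least 16 times scl_hz and that the result fits into 8 bits. */
uint8_t twi_twbr_for(uint32_t f_cpu, uint32_t scl_hz)
{
    return (uint8_t)((f_cpu / scl_hz - 16u) / 2u);
}
```

twi_scl_hz(8000000, 32, 0) gives 100000, the 100kHz mentioned above.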
Write Code Example
The first thing we'll need is some function that initiates TWI operations, such as generating Start and
address transfer. As the TWI won't do anything while TWINT is set, our function will also make sure that
TWINT is cleared when writing TWCR. TWINT is cleared by writing a 1 to it. Then we'll wait for the TWI
hardware to set TWINT again and return the status code from TWSR:
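char TWI_action(char command)
{
    //write command to TWCR and make sure the TWINT bit is written as 1 (this clears the flag)
    TWCR = (command | (1<<TWINT));
    //now wait for TWINT to be set again (when the operation is completed)
    while (!(TWCR & (1<<TWINT)));
    //return the status code (the lower bits of TWSR are prescaler bits, so they are masked off)
    return (TWSR & 0xF8);
}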
The status codes are a good and rich source for errors. If the application checks for errors by looking at
the status codes, it can happen that the *wrong* status code is expected (especially when reading from the
EEPROM). This is not too dangerous now, but I thought it might be worth mentioning. The status codes
are divided into four groups: Master Transmitter Mode (MT), Master Receiver Mode (MR), Slave
Transmitter Mode (ST) and Slave Receiver Mode (SR). The slave modes are not interesting now. The
tables are in the mega8 datasheet (print them out). When switching between these modes, it can happen
that status codes get mixed up. For writing, only MT mode is used. Reading from the EEPROM also uses
MR mode.
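To make the grouping a bit more tangible, here's a sketch that sorts the master mode status codes used on this page into their tables. The enum and function names are made up; the codes are the master transmitter/receiver codes from the datasheet tables (0x08/0x10 for Start and repeated Start, 0x18..0x30 for MT, 0x40..0x58 for MR). Everything else, including the slave mode codes, ends up in TWI_OTHER.

```c
#include <stdint.h>

enum twi_group { TWI_START, TWI_MT, TWI_MR, TWI_OTHER };

/* Hypothetical helper: tell which status table a code belongs to. */
enum twi_group twi_status_group(uint8_t status)
{
    switch (status & 0xF8) {            /* ignore the prescaler bits */
    case 0x08:                          /* Start sent */
    case 0x10:                          /* repeated Start sent */
        return TWI_START;
    case 0x18: case 0x20:               /* SLA+W sent, ACK / NACK received */
    case 0x28: case 0x30:               /* data sent, ACK / NACK received */
        return TWI_MT;
    case 0x40: case 0x48:               /* SLA+R sent, ACK / NACK received */
    case 0x50: case 0x58:               /* data received, ACK / NACK returned */
        return TWI_MR;
    default:
        return TWI_OTHER;
    }
}
```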
Here's the code for sending a Start condition and the slave address. We'll write to page 0, byte 0. The
word address and data transfer are described separately (but they're similar).
//send start. the expected status code is 0x08
if(TWI_action((1<<TWINT)|(1<<TWEN)|(1<<TWSTA)) == 0x08)
    //if that worked, print 'S' on the LCD
    LCD_putchar('S');
else
    //if something went wrong, print 'E'
    LCD_putchar('E');
//wait for the LCD to finish the character write (just for safety...)
LCD_wait();
//now send slave address, expected status code 0x18 (ACK received)
TWDR = 0xA0;
if(TWI_action((1<<TWINT)|(1<<TWEN)) == 0x18)
    LCD_putchar('A');
else
    LCD_putchar('N');
That's all you need for addressing a slave. The TWI hardware will return different status codes for data
sent AFTER the slave address. On the bus side, these transfers are equal, but the status codes are
different. That's why I've divided the code into two parts. The word address and the data byte are true data
transfers:
LCD_wait();
//send word address 0x00, expected status code is 0x28 (ACK)
TWDR = 0;
if(TWI_action((1<<TWINT)|(1<<TWEN)) == 0x28)
    //if word address ACKed, print 'W', else print 'N' on the LCD
    LCD_putchar('W');
else
    LCD_putchar('N');
LCD_wait();
//now send the data byte. We'll use 0x55. Again, the expected status code is 0x28.
TWDR = 0x55;
if(TWI_action((1<<TWINT)|(1<<TWEN)) == 0x28)
    //if data ACKed, print 'D', else print 'N' on the LCD
    LCD_putchar('D');
else
    LCD_putchar('N');
Once the Stop condition (see below) has been sent and the EEPROM has finished writing, the very first
24C16 memory location should hold 0x55.
If we wanted to write the whole page (or parts of it), we would just write more data bytes now (up to 16
bytes in total, plus the word address first):

Both byte write and page write are terminated with a stop condition:
LCD_wait();
TWCR = ((1<<TWINT)|(1<<TWEN)|(1<<TWSTO));
LCD_putchar('P');
The LCD should now show "SAWDP" for Start - Slave Address ACK - Word Address ACK - Data ACK -
Stop. The TWI hardware does not leave any specific status code after generating the Stop condition.
TWINT will also not be set. If the EEPROM is not connected to the bus, the LCD will show "SNNNP".
Read Operations
Read operations will put the TWI in two different states: MT mode and MR mode. There are three different
read operations: The current address read, the random read (from a specific address) and the sequential
read. The current address read just consists of a single read transfer without word address:

The master generates a Start condition, then sends the slave address (now with the R/W bit set for
reading). The slave responds with an ACK. Then the master reads the data from the slave (SCL driven by
the master, SDA driven by the EEPROM) and sends a NACK afterwards indicating that it does not want to
read any more data. Then a Stop is generated by the master.
When reading from a specific address, the word address is transferred first (as in a data write operation).
Then a repeated start is generated and the data is read:

Now it's important to understand which transfer modes the master is in: The first time the device address is
sent, the master is in MT mode for both the address and the word address transfer. Then, after the
repeated start condition, the master is in MR mode. This is important, because the status codes come from
different tables. Here's the complete random read code (read from address 0):
LCD_wait();
//send start
if(TWI_action((1<<TWINT)|(1<<TWSTA)|(1<<TWEN)) == 0x08)
    LCD_putchar('S');
else
    LCD_putchar('E');
LCD_wait();
//send slave address
TWDR = 0xA0;
if(TWI_action((1<<TWINT)|(1<<TWEN)) == 0x18)
    LCD_putchar('A');
else
    LCD_putchar('N');
LCD_wait();
//send word address
TWDR = 0x00;
if(TWI_action((1<<TWINT)|(1<<TWEN)) == 0x28)
    LCD_putchar('W');
else
    LCD_putchar('N');
LCD_wait();
//repeated start
if(TWI_action((1<<TWINT)|(1<<TWSTA)|(1<<TWEN)) == 0x10)
    LCD_putchar('S');
else
    LCD_putchar('E');
LCD_wait();
//send slave address, read bit = 1; MR mode!
TWDR = 0xA1;
if(TWI_action((1<<TWINT)|(1<<TWEN)) == 0x40)
    LCD_putchar('A');
else
    LCD_putchar('N');
LCD_wait();
//now, in MR mode, get data byte. We don't set TWEA, so no ACK is sent afterwards:
TWI_action((1<<TWINT)|(1<<TWEN));
if (TWDR == 0x55)
    LCD_write("read OK");
else
    LCD_write("read error");
//send stop
TWCR = ((1<<TWINT)|(1<<TWSTO)|(1<<TWEN));
For this to work as expected, the first memory location should hold 0x55. If everything is right, the LCD
should show "SAWSAread OK" after such a read operation.
As you can see in this example, it's very important to look at the right status code!
It is also possible to do a sequential read (of the whole EEPROM if required). To do that, just write the
desired start address after sending the slave address, and read as many bytes as you wish, each time
sending an ACK. After the last byte, a NACK has to be sent:

The address pointer will roll over to address 0 after the last byte in memory has been read (it does NOT
roll over page-wise like in a write page operation!). You can read out all 2k bytes in one go if you want.
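For comparison with the page write, here's how the address pointer moves during a sequential read: it counts through the whole 2k bytes and only wraps at the very end. Again just a sketch with a made-up function name.

```c
#include <stdint.h>

/* Hypothetical model of the address pointer during a sequential read:
   it wraps from the last byte (0x7FF) to address 0, not page-wise. */
uint16_t ee24c16_next_read_addr(uint16_t addr)
{
    return (uint16_t)((addr + 1u) & 0x07FFu);
}
```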
I've put these examples in a file, with read and write functions for single bytes and pages. Though not all
functions are used by the main code, they have all been tested.
24C16.c
lcd.h and the makefile (atmega8, avrdude programmer: stk500) are also required. The makefile has been
generated with mfile. mfile can be downloaded from the http://winavr.sourceforge.net/ news page.
The files contain connecting information.

AVR Calc
AVR Calc is a cool tool for calculating values for timers and the UART. It can also convert floating point
values to their hex representation. No need to tell you a lot more, just download and try it. It's available
on www.avrfreaks.net in the tools section (here's a direct link).

Creating a New Project in AVR Studio


Here's how to create a new project in AVR Studio. It's fairly simple and straightforward, but I thought it
might be worth explaining. In the menu, click on "Project -> New". Select a project name and directory (I
suggest placing each project and its files into its own project directory). Also choose "AVR Assembler" as
the Project Type; then hit OK.

You will see the "Project Manager" (right image). It is used to add files to the project, create new ones and
keep track of the files associated with the project. Add a file to the project by right-clicking on the
"Assembler Files" folder and selecting "Create New File...". In the Dialog, check that the directory is correct,
choose a file name and make sure you add a valid extension (.asm), otherwise it won't work. The new file
will show up underneath the "Other Files" folder. Drag it into the "Assembler Files" folder.
You can include files in your assembler project using the .include directive. The definition files for the AVR
types for example have to be included for names like "PortB" to work. The top file, from which all other files
are included, is the "Assembler Entry File", which the assembler starts with when it tries to translate your
code. You can set the assembler entry file (if your project contains more than one file) by right-clicking on it
and checking the "Assembler Entry File" option in the drop-down menu. For the first file this will be
checked by default (what else should the assembler start with?).

This is it. You can now open the new file and add code to it. You can also add an already existing file and
choose that one to be the entry file of course.
The "Project Settings" box is quite important as well.

If you want to simulate your code in AVR Studio, choose "Object format for AVR Studio" in the "Output file
format" box. The assembler will then create files that can be simulated by AVR Studio. If you want to
download your code to a target system such as the STK500 or some other board/project hardware, you'll
have to choose "Intel Hex" as output format. The assembler will then generate a hex file you can
download. Quite often I have wondered why my code didn't work, only to find that an old hex file with the
right file name still existed while I had been simulating newer code. If the output file format isn't changed
back to "Intel Hex", the old code is what gets stored in Flash. Of course there's more you can do with your
projects in the menus, but this will be enough for the project to work.

Software: Terminal Program


First of all: Forget about Hyperterminal! It just doesn't work. That's all I want to say about it and basically
there isn't more to say. The terminal program used for debugging apps or as a user interface should be
simple and be able to show data in various formats, such as hex, binary or simple ascii. There's a few
around that I can recommend:
Bray's AVR Terminal:
Is a simple but powerful terminal. You can use almost any data length you might want (5 to 8 bits), speed,
handshaking and all that in the main window. No searching around in "options"-menus! Decimal, Hex and
binary display can be turned on in addition to the main data window which can be switched from ascii to
HEX. Logging and macros are also implemented. Very good!

SuperTerm V2 and Super Simple Terminal:


These are both available on the same page and also have proven to work well. SuperTerm V2 has a
scripting/macro engine as well. Check them out! I haven't tested them thoroughly, but they are good.
When setting up the terminal programs, the following general settings are suitable for communicating with
an AVR:
Baud rate as required
data length as required (usually 8 bits)
no parity
1 Stop bit
no handshaking
With a properly working cable everything should be fine!

Our ATmega8 Dev Board


The avrbeginners ATmega8 board is a small and simple development platform just for the ATmega8 micro.
We've done this to show you what a simple AVR circuit can look like and to show you that you don't really
need an STK500. Before you can use this board you need a programmer. The web is full of schematics for
easy to build parallel port programmers and most of them work really well. Have a look at PonyProg
(http://www.lancos.com/prog.html) if you want to build your own one. Of course, the mega8 board can also
be connected to the STK500 via the 10-pin ISP header. You can then program the mega8 with the STK500
dialog in AVRStudio.
The first thing you should do after building this board is to try out some of the simple code examples you
can find on this site (architecture section). Once you have verified that the hardware is OK, it's time to get rid of
the external programmer (be it an STK or a parallel port one, it doesn't really matter) by uploading a
bootloader. Code examples for bootloaders can be found on AVRFreaks.net or by googling.
This board offers some of the features the STK500 from Atmel has and, in some cases, code written for
the STK500 can be used for our board without ANY changes. Most of the code examples on this site are
written for or compatible with the mega8 board. The following pages describe each of the subcircuits (like
the power supply, ISP, micro part, port headers and so on) and might give you ideas for your own designs.
Atmel has put together an app note on some design rules: AVR042 (AVR Hardware Design Considerations).
The subcircuit images are screenshots from the layout program (Proteus ISIS Lite) and contain garbage
here and there. As I cannot put all circuits on one sheet (due to the limitations of the lite version), the
reference numbers of ICs, jumpers, resistors and so on repeat from sheet to sheet.
As I can't make .pdfs I took screenshots of the layout and top silk so that you can copy it with your own
layout editor:
Bottom Copper | Top Silk
It's no problem to build the board on perfboard though if you can't etch the PCB yourself.
Here's an overview of the board features with direct links to the subcircuit descriptions:
Power Supply Circuit
mega8 / Reset / Crystal Circuits
ISP Circuit
Port Headers
RS232 Transceiver Circuit
LEDs and Buttons
Other Connectors And Jumpers

The Power Supply Circuit


The power supply circuit of the mega8 board is pretty simple and not optimized for efficiency or anything
like that. It uses a simple 7805 voltage regulator and just 3 caps and a rectifier. A "power on" LED has also
been added:

The Micro And Surrounding Circuits


The AVR needs a reset button. As \Reset is active low, the button is connected from \Reset (Pin 1/PortC.6)
to ground. To keep \Reset high during normal operation a 4K7 pullup resistor is added. For additional
protection a 10nF cap is added from \Reset to Ground.
For crystal operation two load caps of 22pF are needed. These are connected from the crystal pins to
ground. If no crystal is used, the free pins belong to PortB. The header for this is described on the Other
Connectors and Jumpers page.
Aref and Avcc each have a decoupling cap to ground to reject noise on these pins. The connector for an
external Aref is also described together with the other connectors (see link above).

The ISP circuit


This board uses the standard AVR ISP connector (10-pin version) for downloading. The ISP connector
requires no additional circuit if no external SPI slaves are connected to the mega8's hardware SPI.

If you wish to use external slaves and maintain ISP capability, Atmel recommends adding series resistors
of 4.7K in the SPI lines (mega8 -> ISP connector -> resistor -> slave).

The Port Headers


On the STK500 you can see 5 port headers. These are for Ports A to E. As the ATmega8 only has ports B,
C and D this board only has 3 headers. All of them provide Vcc and Gnd for external components, have a
decoupling cap and are compatible with those on the STK:

(note: each connector has its own decoupling cap and connection to the power lines. This is only shown for
PortC here!)
You can see that not all pins are used. The mega8 doesn't have all those port pins. A special case is PortB:
PortB.6 and PortB.7 are the crystal oscillator pins. In order to ensure correct operation of the crystal
oscillator, these are located next to the crystal and NOT routed to the Port header (See Other Connectors).

The RS 232 Circuit


The RS 232 Circuit uses a standard MAX232 transceiver chip and the necessary external circuits.
It has 4 connectors: The female SubD 9-pin connector, jumpers to connect the spare pair of transceivers to
RTS and CTS (for optional flow control), one Rx and Tx connector (like on the STK500) and one RTS/CTS
connector (this may sound confusing, the schematic will make things clear).

J4 is the 2-pin connector we know from the STK500: It can be used to connect PortD.0 (Rxd) and PortD.1
(Txd) to the transceiver IC. On the board you can find it next to the PortD header.
The other pair of transceivers can be used in basically two ways. I have made simple drawings of how they
can be used. They also show how the pins are located on the board:
Flow Control Signals (RTS/CTS): If the spare pair of transceivers should be used for the flow control
signals RTS (from the PC) and CTS (to the PC), close J2/J3 as shown in the drawing and connect J5 to
those pins you want to use for flow control.
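To give you an idea of what the software side of this could look like, here is a small sketch. It assumes that you wired the J5 RTS pin to PortD.2 and the J5 CTS pin to PortD.3 (that choice is mine, any free pins will do), that the UART is already initialised and that the stack is set up. Keep in mind that the MAX232 inverts: an asserted RS232 control line arrives as a LOW level at the AVR pin.

```
.include "m8def.inc"

.equ RTS_PIN = 2            ; ASSUMPTION: J5 RTS (from the PC) is wired to PortD.2
.equ CTS_PIN = 3            ; ASSUMPTION: J5 CTS (to the PC) is wired to PortD.3

flow_init:
    cbi  DDRD, RTS_PIN      ; RTS is an input
    sbi  DDRD, CTS_PIN      ; CTS is an output
    cbi  PORTD, CTS_PIN     ; low = asserted: tell the PC that it may send
    ret

; send the byte in r16, but only while the PC signals that it is ready
putc_rts:
    sbic PIND, RTS_PIN      ; skip the jump if RTS is low (asserted)
    rjmp putc_rts           ; RTS not asserted: keep waiting
putc_wait:
    sbis UCSRA, UDRE        ; wait until the transmit buffer is empty
    rjmp putc_wait
    out  UDR, r16           ; transmit the byte
    ret
```

To hold off the PC (for example while a receive buffer is full), set the CTS pin high with sbi PORTD, CTS_PIN and clear it again when you're ready for more data. Of course the terminal program then has to be set to hardware handshaking.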

If you want to use the transceivers for other purposes, for example a second UART (in software or external
hardware), you can take the RS232 side of the data on J2. The J2 pin next to the MAX232 is for data
FROM the PC and goes to the RTS pin on J5. The CTS pin can be used for data TO the PC, which will
come out on the J2 pin next to J5.

The LEDs And Buttons


The mega8 board has 8 buttons and 8 LEDs. First I'll explain the LEDs. These are connected to Vcc via a
current limiting resistor (330R). This results in an active low operation: When the LED's header pin is held
low by the micro, the LED is ON. When the pin is high, the LED is off. Here's a schematic of one LED and
one button:

The Buttons are just connected to ground and will thus generate a low level when pressed. They are NOT
equipped with a pull-up resistor, so the internal pull-ups of the AVR have to be used.

Both headers provide Vcc and Gnd (just as the port headers) and have a decoupling cap (not shown
here).
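Here's a small sketch of how the active low LEDs and the buttons without pull-ups are handled in code. It assumes that you connected one LED to PortB.0 and one button to PortD.2 with jumper wires (my choice of pins, use whatever you like). The LED is on for as long as the button is pressed.

```
.include "m8def.inc"

.equ LED_PIN = 0            ; ASSUMPTION: an LED header pin is wired to PortB.0
.equ BTN_PIN = 2            ; ASSUMPTION: a button header pin is wired to PortD.2

    sbi  DDRB, LED_PIN      ; LED pin is an output
    sbi  PORTB, LED_PIN     ; high = LED off (the LED sits between Vcc and the pin)
    cbi  DDRD, BTN_PIN      ; button pin is an input
    sbi  PORTD, BTN_PIN     ; enable the internal pull-up (the board has none)

loop:
    sbic PIND, BTN_PIN      ; skip the jump if the pin is low (button pressed)
    rjmp led_off
    cbi  PORTB, LED_PIN     ; button pressed: pin low, LED on
    rjmp loop
led_off:
    sbi  PORTB, LED_PIN     ; button released: pin high, LED off
    rjmp loop
```

Without the sbi PORTD, BTN_PIN line the input would float while the button is open and the LED would flicker more or less randomly.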

Other Connectors and Jumpers


There's not much left to describe. Our mega8 board offers three more jumpers/connectors:
External Aref connector: This connector can be used either to supply an external reference voltage for the
ADC or to route the internal ADC reference voltage of the mega8 (either Avcc or 2.56 V) to external circuits.
It's a 2-pin connector with Aref and Gnd, where Aref is next to the micro and Gnd is next to the ISP header.
TWI connector / Pull-up jumpers: The TWI connector provides the signals SDA (PortC.4), SCL (PortC.5),
Gnd and Vcc so that external TWI slaves can be connected. The two jumpers can be closed to enable the
on-board pull-ups on SCL and SDA.
PortB.6/PortB.7: These pins are also used by the crystal oscillator. As the oscillator traces have to be as
short as possible these Port Pins are located next to the crystal. This also ensures that no external circuitry
connected to the PortB header can disturb the oscillator. If the internal RC oscillator is used these pins can
be used without limitations.
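Which reference the ADC uses (and therefore what you will find on the Aref connector) is selected with the REFS1 and REFS0 bits in ADMUX. The following sketch selects the internal 2.56 V reference and does a single conversion on ADC0 (PortC.0). The input channel and the prescaler of 64 are assumptions of mine. With a 3.6864 MHz clock that prescaler gives an ADC clock of 57.6 kHz, so pick a different one if your clock differs a lot.

```
.include "m8def.inc"

    ; REFS1:0 = 11 -> internal 2.56 V reference
    ; REFS1:0 = 01 -> Avcc as reference
    ; REFS1:0 = 00 -> external voltage on the Aref connector
    ldi  r16, (1<<REFS1)|(1<<REFS0)     ; internal 2.56 V, channel ADC0 (ASSUMPTION)
    out  ADMUX, r16
    ldi  r16, (1<<ADEN)|(1<<ADSC)|(1<<ADPS2)|(1<<ADPS1)  ; enable, start, clock/64
    out  ADCSRA, r16
adc_wait:
    sbic ADCSRA, ADSC                   ; ADSC stays set while the conversion runs
    rjmp adc_wait
    in   r16, ADCL                      ; read the low byte first...
    in   r17, ADCH                      ; ...then the high byte
```

Important: only select one of the two internal references if nothing is driving the Aref connector. If you feed an external voltage into Aref, leave REFS1 and REFS0 at 0.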
This drawing shows the connectors and (in green) how the TWI Pullup jumpers are set (if wanted):

http://www.avrbeginners.net/
