
Arrays (Lists) in Python

one thing after another


Problem
 Given 5 numbers, read them in and
calculate their average
 THEN print out the ones that were
above average
Data Structure Needed
 Need some way to hold onto all the
individual data items after processing
them
 making individual identifiers x1, x2,
x3,... is not practical or flexible
 the answer is to use an ARRAY
 a data structure - bigger than an
individual variable or constant
An Array (a List)
 You need a way to have many variables
all with the same name but
distinguishable!
 In math they do it by subscripts or
indexes
 x1, x2, x3 and so on
 Programming languages cannot easily typeset
subscripts, so they use a bracket syntax instead
 x[1], x[0], table[3], point[i]
Semantics
 numbered from 0 to n-1 where n is the
number of elements
0 1 2 3 4 5
Properties of an array (list)
 Heterogeneous (any data type!)
 Contiguous
 Have random access to any element
 Ordered (numbered from 0 to n-1)
 Number of elements can change very
easily (use method .append)
 Python lists are mutable sequences of
arbitrary objects
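A minimal sketch of these properties in action. The values and the names first and last are only for illustration.

```python
# A list can mix data types and can grow after it is created.
items = [3, 9, 'a', True, 3.92]   # heterogeneous: int, str, bool, float
items.append("new")               # the number of elements changes easily
items[0] = 42                     # mutable: an element can be replaced in place
first = items[0]                  # random access by index
last = items[len(items) - 1]      # elements are numbered 0 to n-1
print(items, first, last)
```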
Syntax
 Use [] to give a list its initial value, like
x = [1, 3, 5]
 refer to individual elements
 use [ ] with the index in the brackets, like x[0]
 most of the time you don’t refer to the
whole array as one thing, or just by the
array name (one time you can is when
passing a whole array to a function as
an argument)
List Operations you know
Operator               Meaning
<seq> + <seq>          Concatenation
<seq> * <int-expr>     Repetition
<seq>[]                Indexing
len(<seq>)             Length
<seq>[:]               Slicing
for <var> in <seq>:    Iteration
<expr> in <seq>        Membership (Boolean)
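Each operation in the table, tried on two small lists. This is only a sketch; the lists a and b and the result names are made up for the example.

```python
a = [1, 3, 5]
b = [7, 9]
joined = a + b          # concatenation -> [1, 3, 5, 7, 9]
repeated = b * 2        # repetition    -> [7, 9, 7, 9]
second = a[1]           # indexing      -> 3
size = len(joined)      # length        -> 5
middle = joined[1:4]    # slicing       -> [3, 5, 7]
total = 0
for value in joined:    # iteration over every element
    total = total + value
has_five = 5 in a       # membership    -> True
print(joined, repeated, second, size, middle, total, has_five)
```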
Indexing an Array
 The index is also called the subscript
 In Python, the first array element always has
subscript 0, the second array element has
subscript 1, etc.
 Subscripts can be variables – they have to
have integer values
 k = 4
 items = [3, 9, 'a', True, 3.92]
 items[k] is 3.92
 items[k-2] is items[2], which is 'a'
List Operations
 Lists are often built up one piece at a
time using append.
nums = []
x = float(input('Enter a number: '))
while x >= 0:
    nums.append(x)
    x = float(input('Enter a number: '))

 Here, nums is being used as an
accumulator, starting out empty, and
each time through the loop a new value
is tacked on.
List Operations
Method                 Meaning
<list>.append(x)       Add element x to end of list.
<list>.sort()          Sort (order) the list. In Python 3 a key function may be passed as the key parameter.
<list>.reverse()       Reverse the list.
<list>.index(x)        Returns index of first occurrence of x.
<list>.insert(i, x)    Insert x into list at index i.
<list>.count(x)        Returns the number of occurrences of x in list.
<list>.remove(x)       Deletes the first occurrence of x in list.
<list>.pop(i)          Deletes the ith element of the list and returns its value.
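The methods above, applied one after another to a small list so the effect of each call can be followed. A sketch only; the data is made up, and the last two lines assume Python 3, where sort accepts a key function.

```python
nums = [4, 1, 3, 1]
nums.append(2)          # [4, 1, 3, 1, 2]
nums.sort()             # [1, 1, 2, 3, 4]
nums.reverse()          # [4, 3, 2, 1, 1]
where = nums.index(1)   # 3, the first position holding a 1
nums.insert(1, 9)       # [4, 9, 3, 2, 1, 1]
ones = nums.count(1)    # 2
nums.remove(1)          # [4, 9, 3, 2, 1]
popped = nums.pop(0)    # returns 4, leaving [9, 3, 2, 1]

words = ["pear", "fig", "banana"]
words.sort(key=len)     # order by length: ['fig', 'pear', 'banana']
print(nums, where, ones, popped, words)
```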



Using a variable for the size
 It is very common to use a variable to
store the size of an array
SIZE = 15
arr = []
for i in range(SIZE):
    arr.append(i)
 Makes it easy to change if size of array
needs to be changed
Solution to starting problem
SIZE = 5
n = [0]*SIZE
total = 0
for ct in range(SIZE):
    n[ct] = float(input("enter a number "))
    total = total + n[ct]

cont'd on next slide


Solution to problem - cont'd
average = total / SIZE
for ct in range(SIZE):
    if n[ct] > average:
        print(n[ct])
Scope of counter in a for loop
 The counter variable has usual scope
(body of the function it’s in)
 for i in range(5):
 the counter does still exist after the for loop finishes
 what's its value after the loop?
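One way to check the answer is to run a tiny loop and look at the counter afterwards. A minimal sketch; the name after_loop is only for illustration.

```python
for i in range(5):
    pass              # the loop body would go here

after_loop = i        # i keeps the last value range(5) gave it
print(after_loop)     # prints 4, not 5
```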
Initialization of arrays
 a = [1, 2, 9, 10] # has 4 elements
 a = [0] * 5 # all are zero
Watch out: index out of range!
 Subscripts range from 0 to n-1
 Interpreter WILL tell you (it raises an
IndexError) if an index goes out of that range
 BUT the negative subscripts from -1 down
to -n work as they do with strings (which
are, after all, arrays of characters)
 x = [5]*5
 x[-1] = 4 # x is [5,5,5,5,4]
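A small sketch of both cases: negative subscripts that work, and a subscript that is out of range. The try/except is only there so the example keeps running; the name message is made up.

```python
x = [5] * 5
x[-1] = 4                 # last element
x[-5] = 1                 # same element as x[0]

try:
    x[5] = 0              # valid subscripts are 0 to 4
    message = "no error"
except IndexError as err:
    message = "IndexError: " + str(err)

print(x)                  # [1, 5, 5, 5, 4]
print(message)
```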
Assigning Values to
Individual Array Elements
temps = [0.0] * 5
m = 4
temps[2] = 98.6
temps[3] = 101.2
temps[0] = 99.4
temps[m] = temps[3] / 2.0
temps[1] = temps[3] - 1.2   # What value is assigned?

temps[0]   temps[1]   temps[2]   temps[3]   temps[4]
  99.4        ?         98.6      101.2       50.6
What values are assigned?
SIZE = 5
temps = [0.0] * SIZE
for m in range(SIZE):
    temps[m] = 100.0 + m * 0.2

for m in range(SIZE-1, -1, -1):
    print(temps[m])

temps[0]   temps[1]   temps[2]   temps[3]   temps[4]
   ?          ?          ?          ?          ?
Indexes
 Subscripts can be constants or variables
or expressions
 If i is 5, a[i-1] refers to a[4] and a[i*2]
refers to a[10]
 you can use i as a subscript at one
point in the program and j as a
subscript for the same array later - only
the value of the variable matters
Variable Subscripts
temps = [0.0]*5
m = 3
. . . . . .
What is temps[m + 1] ?
What is temps[m] + 1 ?

temps[0]   temps[1]   temps[2]   temps[3]   temps[4]
 100.0      100.2      100.4      100.6      100.8
Random access of elements
 Problem : read in numbers from a file,
only single digits - and count them -
report how many of each there were
 Use an array as a set of counters
 ctr[0] is how many zeros, ctr[1] is how
many ones, etc.
 ctr[num] += 1 is the crucial
statement
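A minimal sketch of the counter idea. To keep it self-contained, the list digits is a hypothetical stand-in for the single-digit numbers that would be read from the file.

```python
digits = [3, 0, 7, 3, 3, 9, 0]   # hypothetical data, standing in for the file
ctr = [0] * 10                   # one counter for each digit 0 to 9

for num in digits:
    ctr[num] += 1                # the digit itself picks the counter

for d in range(10):
    if ctr[d] > 0:
        print(d, "appeared", ctr[d], "time(s)")
```

No if/elif chain is needed: the value read is used directly as the subscript.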
Parallel arrays
 Sometimes you have data of different
types that are associated with each
other
 like name (string) and GPA (float)
 You CAN store them in the same array
 ar = ["John", 3.24, "Mary", 3.9, "Bob", 2.7]
 You can also use two different arrays
"side by side"
Parallel arrays, cont'd
for i in range(SIZE):
    name[i] = input("Enter name ")
    gpa[i] = float(input("Enter gpa "))
 Logically the name in position i
corresponds to the gpa in position i
 Nothing in the syntax forces this to be
true, you just have to program it to be
so.
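A sketch of using the two lists "side by side". The data is taken from the earlier name and GPA example; the variable best is made up for illustration.

```python
name = ["John", "Mary", "Bob"]
gpa = [3.24, 3.9, 2.7]

best = 0                          # position of the highest GPA seen so far
for i in range(1, len(gpa)):
    if gpa[best] < gpa[i]:
        best = i

# the same position is used in both lists
print(name[best], gpa[best])
```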
Parallel Arrays

Parallel arrays are two or more arrays that have the
same index range and whose elements contain
related information, possibly of different data
types

EXAMPLE
SIZE = 50
idNumber = [" "]*SIZE        # parallel arrays hold
hourlyWage = [0.0]*SIZE      # related information


idNumber[0]    4562      hourlyWage[0]     9.68
idNumber[1]    1235      hourlyWage[1]    45.75
idNumber[2]    6278      hourlyWage[2]    12.71
    .            .            .             .
    .            .            .             .
    .            .            .             .
idNumber[48]   8754      hourlyWage[48]   67.96
idNumber[49]   2460      hourlyWage[49]    8.97


Selection sort - 1-d array
Algorithm for the sort
1. find the maximum in the list
2. put it in the highest numbered
element by swapping it with the data
that was at that location
3. repeat 1 and 2 for the shorter unsorted list,
not including the highest numbered
location
4. repeat 1-3 until the unsorted list goes down
to one element
Find the maximum in the list
# n is number of elements
max = a[0]             # value of largest element
                       # seen so far
for i in range(1, n):  # note start at 1, not 0
    if max < a[i]:
        max = a[i]
# now max is value of largest element in list
Find the location of the max
max = 0                # max is now location of the
                       # largest seen so far
for i in range(1, n):
    if a[max] < a[i]:
        max = i
# now max is location of the largest in
# array
Swap with highest numbered
Remember element at right end of list is
numbered n-1
temp = a[max]
a[max] = a[n-1]
a[n-1] = temp
# there is a shorter way in Python!
The Python way!
 The previous code of finding the max
and its location will work in ANY high-
level language.
 Python has some nice functions and
methods to make it easier!
 Let’s try that.
The Python Way
 To find the max of the whole list
mx = max(a)
loc = a.index(mx)
Is using index SAFE here? If it doesn't
find mx in a, it will crash with a ValueError!
But you just got mx from the list using
the max function, so it IS in the list a.
The Python Way
 The swap then becomes
a[loc], a[n-1] = a[n-1],a[loc]
 Python “hides” the temporary third
variable
Find next largest element and
swap (generic way)
max = 0
for i in range(1, n-1):  # note n-1, not n
    if a[max] < a[i]:
        max = i
temp = a[max]
a[max] = a[n-2]
a[n-2] = temp
put a loop around the general
code to repeat for n-1 passes
for pss in range(n, 1, -1):
    max = 0
    for i in range(1, pss):
        if a[max] <= a[i]:
            max = i
    temp = a[max]
    a[max] = a[pss-1]
    a[pss-1] = temp
The whole thing the Python
way
for pss in range(n, 1, -1):   # n-1 passes
    mx = max(a[0:pss])
    loc = a.index(mx)
    a[loc], a[pss-1] = a[pss-1], a[loc]
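The passes above can be wrapped in a function and tried on sample data. This is a sketch, not the only way to write it. The function name selection_sort and the name max_loc are made up; max_loc is used instead of max so the built-in max function is not hidden.

```python
def selection_sort(a):
    """Sort list a in place by moving the largest remaining element to the end."""
    n = len(a)
    for pss in range(n, 1, -1):        # n-1 passes over a shrinking unsorted part
        max_loc = 0                    # location of the largest seen so far
        for i in range(1, pss):
            if a[max_loc] < a[i]:
                max_loc = i
        # swap the largest into the last unsorted position
        a[max_loc], a[pss-1] = a[pss-1], a[max_loc]

data = [29, 3, 72, 44, 3, 15]          # made-up sample data
selection_sort(data)
print(data)                            # [3, 3, 15, 29, 44, 72]
```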
2-dimensional arrays
 Data sometimes has more structure to
it than just "a list"
 It has rows and columns
 You use two subscripts to locate an
item
 The first subscript is called "row", the second
is called "column"
2-dimensional arrays
 syntax
 a = [[0]*5 for i in range(4)]
# 5 columns, 4 rows
 Twenty elements, numbered from [0][0] to
[3][4]
 a = [[0]*COLS for i in range(ROWS)]
 Which has ROWS rows and COLS columns in
each row (use of variables to make it easy to
change the size of the array without having to
edit every line of the program)
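A sketch of why the rows are built with a loop inside the brackets. The shorter-looking [[0]*COLS]*ROWS repeats one row object ROWS times, so changing one row appears to change them all. The names good and shared are made up for the comparison.

```python
ROWS = 4
COLS = 5

good = [[0]*COLS for i in range(ROWS)]   # ROWS separate row lists
good[0][0] = 7

shared = [[0]*COLS] * ROWS               # the SAME row list repeated ROWS times
shared[0][0] = 7

print(good[1][0])     # 0, the other rows are untouched
print(shared[1][0])   # 7, every "row" is the same list
```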
EXAMPLE -- Array for monthly high temperatures
for all 50 states

NUM_STATES = 50
NUM_MONTHS = 12
stateHighs = [[0]*NUM_MONTHS for i in range(NUM_STATES)]

        [0]  [1]  [2]  [3]  [4]  [5]  [6]  [7]  [8]  [9] [10] [11]
[0]
[1]
[2]      66   64   72   78   85   90   99  105   98   90   88   80
 .
 .
 .
[48]
[49]

stateHighs[2][7] (row 2, col 7) might be Arizona's high for August
Processing a 2-d array by
rows
finding the total for the first row
total = 0
for i in range(NUM_MONTHS):
    total = total + a[0][i]
finding the total for the second row
total = 0
for i in range(NUM_MONTHS):
    total = total + a[1][i]
Processing a 2-d array by
rows
total for ALL elements by adding first
row, then second row, etc.

total = 0
for i in range(NUM_STATES):
    for j in range(NUM_MONTHS):
        total = total + a[i][j]
Processing a 2-d array by
columns
total for ALL elements by adding first
column, second column, etc.

total = 0
for j in range(NUM_MONTHS):
    for i in range(NUM_STATES):
        total = total + a[i][j]
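The same nested loops can keep a separate total for each row and each column. A minimal sketch on a small made-up table; NUM_ROWS, NUM_COLS, row_totals and col_totals are names chosen for this example.

```python
NUM_ROWS = 3
NUM_COLS = 4
a = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]

row_totals = [0] * NUM_ROWS
col_totals = [0] * NUM_COLS
for i in range(NUM_ROWS):              # by rows: i is the row
    for j in range(NUM_COLS):          # j is the column
        row_totals[i] += a[i][j]
        col_totals[j] += a[i][j]

print(row_totals)   # [10, 26, 42]
print(col_totals)   # [15, 18, 21, 24]
```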
Finding the average high temperature for Arizona

total = 0
for month in range(NUM_MONTHS):
    total = total + stateHighs[2][month]
average = round(total / NUM_MONTHS)

average is 85
Passing an array as an
argument
 Arrays (lists) are passed by reference:
the function works on the same list, so
its elements CAN be changed permanently
by the function
 Definition: def fun1(arr):
 Call the function as
x = fun1(myarr)
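A sketch of what "changed permanently" means in practice. The function names double_all and rebind are made up. Changing elements through the parameter affects the caller's list; assigning a whole new list to the parameter does not.

```python
def double_all(arr):
    # changes the elements of the caller's list
    for i in range(len(arr)):
        arr[i] = arr[i] * 2

def rebind(arr):
    # only makes the local name arr refer to a new list
    arr = [0, 0, 0]

myarr = [1, 2, 3]
double_all(myarr)
print(myarr)        # [2, 4, 6]
rebind(myarr)
print(myarr)        # still [2, 4, 6]
```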
Arrays versus Files
 Arrays are usually smaller than files
 Arrays are faster than files
 Arrays are temporary, in RAM - files are
permanent on secondary storage
 Arrays allow random or sequential access,
files we have seen are only sequential
Using Multidimensional Arrays

Example of three-dimensional array

NUM_DEPTS = 5 # mens, womens, childrens, electronics, furniture
NUM_MONTHS = 12
NUM_STORES = 3 # White Marsh, Owings Mills, Towson
monthlySales = [[[0]*NUM_MONTHS for i in range(NUM_DEPTS)]
                for j in range(NUM_STORES)]
monthlySales[0][3][7]
sales for electronics in August at White Marsh
(store 0, department 3, month 7)
each store has 5 DEPTS rows and 12 MONTHS columns
Example of filling a 3-d array
def main():
    NUM_DEPTS = 5    # mens, womens, childrens, electronics, furniture
    NUM_MONTHS = 12
    NUM_STORES = 3   # White Marsh, Owings Mills, Towson
    monthlySales = [[[0]*NUM_MONTHS for i in range(NUM_DEPTS)]
                    for j in range(NUM_STORES)]
    storeNames = ["White Marsh", "Owings Mills", "Towson"]
    deptNames = ["mens", "womens", "childrens", "electronics", "furniture"]
    for store in range(NUM_STORES):
        print(storeNames[store], end=" ")
        for dept in range(NUM_DEPTS):
            print(deptNames[dept], end=" ")
            for month in range(NUM_MONTHS):
                print("for month number ", month+1)
                monthlySales[store][dept][month] = float(input("Enter the sales "))
            print()
        print()
    print(monthlySales)
Find the average of
monthlySales
total = 0
for m in range(NUM_MONTHS):
    for d in range(NUM_DEPTS):
        for s in range(NUM_STORES):
            total += monthlySales[s][d][m]
average = total / (NUM_MONTHS * NUM_DEPTS * NUM_STORES)
Problem: student data in a file
 The data is laid out as
 Name, section, gpa
 John Smith, 15, 3.2
 Ralph Johnson, 12, 3.9
 Bob Brown, 9, 2.5
 Etc.
Read in the data
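inf = open("students", "r")
studs = []
for line in inf:
    data = line.split(",")
    # split gives strings, so convert the section and gpa to numbers
    studs.append([data[0].strip(), int(data[1]), float(data[2])])
inf.close()
# studs looks like [["John Smith", 15, 3.2],
# ["Ralph Johnson", 12, 3.9], ["Bob Brown"…]]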
Find the student with highest
GPA
max = 0
for j in range(1, len(studs)):
    if studs[max][2] < studs[j][2]:
        max = j
# max is now location of highest gpa
studs[max][0] is the name of the student
studs[max][1] is the student's section
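As with the sort, Python offers a shorter route. A sketch, assuming studs already holds the gpa as a number in position 2; the sample rows come from the data layout shown earlier and the name best is made up.

```python
studs = [["John Smith", 15, 3.2],
         ["Ralph Johnson", 12, 3.9],
         ["Bob Brown", 9, 2.5]]

# max compares the rows by the value the key function returns (the gpa)
best = max(studs, key=lambda s: s[2])
print(best[0], "in section", best[1])
```

Here max returns the whole row rather than its location, so the name and section are read straight from best.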
