
Oliver Tomlinson – Usability testing: self-directed project (Spring term 2010)

The placement of cross-referencing material


within a home reference manual

Summary

This user testing report accompanies a home reference manual on falconry, designed by the author. A feature of the manual's design is its alternative layout, which departs from the conventional approach found in other falconry manuals.

The author tested 12 participants with little or no knowledge of falconry, recording the time taken to find cross-reference page numbers in two spread designs: spread A places the cross references under the body text, and spread B places them on photographs.

Following a log (base 10) transformation to normalise the results, a t-test showed that participants were significantly faster at cross-referencing in spread A than in spread B (one-tailed matched-pairs t-test; t11 = -2.00, p = 0.035).

Test objectives

To enable the user to navigate a falconry book with a new access structure successfully, a system must exist to help them find relevant links and sections quickly and efficiently.

This test aims to determine the effectiveness of two different navigation methods within a single spread of the book, and to establish whether the design approaches produce a significant difference in the time it takes the user to find a page reference. The results will be useful not only for the falconry manual, but for any design requiring complex navigation through its content.



Test method

Design approach

The section of the book describing raptor (bird of prey) flights and hunting techniques has a number of links to another section describing specific falconry tasks. The reader is required to navigate the book using a cross-referencing system. Two design variants were tested: spread A refers to a design where the cross references are located under the body text, whereas spread B places the cross references in the photographs. The cross references in both variants share similar design characteristics (type colour, typeface, and placement within a coloured banner). Appendix A contains the two design approaches seen in the test spreads.

Procedure

Two spreads of the same book section were printed at actual size, one

using spread A references and the other using spread B. The author carried out

the test with each participant within a quiet room with few distractions,

reading out the statement below before commencement:

 
“On the page in front of you there are some references to specific tasks involved in flying a bird of prey. In a moment I would like you to turn over the piece of paper and tell me what page you would find an explanation on hooding/flushing game [choose]. I shall be timing your response but please be aware, this is not a test of your ability, I am testing the design effectiveness of this page. I have two different pages to try.”
   
Initial participant dialogue

Each participant was asked to find a reference to either ‘hooding’ or ‘flushing game’ in each spread; to reduce the impact of learning, they were not asked to search for the same reference in both spreads. To obviate the potential effects of practice and task sequence, half of the participants were tested on spread A then spread B, and the other half on B then A (to further reduce any influence of short-term memory, the reference page numbers for hooding and flushing game were different on each test spread). Details of participant questioning can be seen in Appendix B.

The variable this usability test investigates is the time taken to find a page reference. It was measured by the author recording the time from the spread being turned over to the participant saying the page number out loud. Once the task had been timed, each participant was asked to give their opinion on the spread design, with the author asking which one they preferred and why. Thinking out loud after the test, as opposed to during it, avoided distracting the participant during the timed task.
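The timing was recorded by the author by hand. Purely as an illustration, a measurement of this kind could be scripted as in the hypothetical sketch below (Python, using time.perf_counter; this was not part of the original study):

# Hypothetical sketch only: the study itself used manual timing by the author.
# Times from revealing the spread to the participant announcing a page number.
import time

input("Press Enter when the participant turns the spread over...")
start = time.perf_counter()
input("Press Enter when the participant says the page number aloud...")
elapsed = time.perf_counter() - start
print(f"Time to find page reference: {elapsed:.2f} seconds")

Illustrative timing sketch (Python; not used in the study)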

Participants

There were four combinations in which the tasks and spreads could be presented (see table below); a complete set therefore requires four participants, so testing 12 participants repeated every combination three times (a short sketch of this assignment scheme follows the table). Twelve participants were judged sufficient to give a large enough data set, and this counterbalancing ensured that the order of presentation was varied and balanced. The analysis focused on the time taken to find page references; results for hooding and flushing game were not compared, as task type was not the variable of interest in this test.

Participant   Test sequence

1             Hooding on spread A, then flushing game on spread B
2             Hooding on spread B, then flushing game on spread A
3             Flushing game on spread B, then hooding on spread A
4             Flushing game on spread A, then hooding on spread B

Counterbalanced test method
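The sketch below (illustrative Python; the original report does not state how the allocation was made, so a simple rotation is assumed here, though it does reproduce the sequence recorded in Appendix B) assigns each of the four orderings to three of the 12 participants:

# Illustrative sketch: assigns the four orderings from the table above
# to 12 participants in rotation, so each ordering is used three times.
from itertools import cycle

orderings = [
    ("Hooding on spread A", "Flushing game on spread B"),
    ("Hooding on spread B", "Flushing game on spread A"),
    ("Flushing game on spread B", "Hooding on spread A"),
    ("Flushing game on spread A", "Hooding on spread B"),
]

for participant, (first, second) in zip(range(1, 13), cycle(orderings)):
    print(f"Participant {participant}: {first}, then {second}")

Illustrative sketch of the counterbalanced assignment (Python)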



Participants were not from a design background and had not been involved in the design of the test artwork. None were practising falconers, and there was a mix of gender, age, and country of origin, but all could speak English fluently.

Statistics

Statistical analysis of the quantitative results was required to ascertain whether the spread design had a significant effect on page-referencing times. A t-test is a parametric test, which requires the data to be normally distributed. A normality test revealed that the results were normally distributed for spread A (Anderson-Darling normality test, p=0.61) but not for spread B (Anderson-Darling normality test, p<0.005). The results for both spreads were therefore normalised using a log (base 10) transformation, after which both sets passed the normality test (Anderson-Darling; spread A, p=0.27; spread B, p=0.53). Probability plots of the normality tests can be seen in Appendix C.
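For readers without access to Minitab, the checks above could be approximated as in the minimal sketch below (assuming NumPy and SciPy are available; note that SciPy's Anderson-Darling test reports a statistic to compare against critical values rather than the p-values quoted above, which come from Minitab):

# Minimal sketch of the normality checks, assuming NumPy and SciPy.
# Raw times (seconds) are taken from the results table later in this report.
import numpy as np
from scipy import stats

spread_a = np.array([7.41, 15.81, 2.85, 1.86, 10.62, 6.35,
                     1.92, 1.38, 5.00, 8.97, 7.23, 11.53])
spread_b = np.array([4.00, 39.41, 23.28, 11.60, 120.22, 3.37,
                     10.41, 6.79, 2.35, 3.05, 7.60, 40.35])

samples = {
    "Spread A (raw)": spread_a,
    "Spread B (raw)": spread_b,
    "Spread A (log10)": np.log10(spread_a),
    "Spread B (log10)": np.log10(spread_b),
}

for label, data in samples.items():
    result = stats.anderson(data, dist="norm")
    # The sample is consistent with normality (at the 5% level) when the
    # statistic falls below the 5% critical value.
    print(label, "A-D statistic:", round(result.statistic, 3),
          "5% critical value:", result.critical_values[2])

Sketch of the normality checks (Python/SciPy; the report's p-values come from Minitab v.15)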

Hypotheses

The test hypothesis for this investigation predicts that participants will find references more quickly when they are placed under the text than when they are placed in the photographs. The reasoning behind this hypothesis is the participants' limited understanding of falconry: a newcomer to the sport would be expected to look first at text references rather than explanatory photographs, as they would have little knowledge of which image relates to which falconry task. The null hypothesis is that finding references under the text will be no quicker than finding them in the photographs.
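Expressed formally (a formalisation assumed from the wording above, not stated in the original), with d_i the per-participant difference in log-transformed times used later in the analysis:

\[
d_i = \log_{10}(t_{A,i}) - \log_{10}(t_{B,i}), \qquad
H_0:\ \mu_d \ge 0 \quad \text{vs} \quad H_1:\ \mu_d < 0
\]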

Limitations

There are a number of constraints in usability testing, namely reliability and validity (Wenger and Spyridakis, 1989). To increase reliability this test would need more participants, to give a better understanding of the population and increase the chance of achieving the same results in repeated tests; this is made more difficult by the fact that humans are complex and subject to a large number of external and internal stimuli. The validity of the results could also be undermined, as the findings may stem from a variable other than the placement of the text references.



Results

Quantitative

The table below shows the times taken by the 12 participants to find page references in spreads A and B. Participant 5 may have contributed to the spread B results not having a normal distribution; however, the log (base 10) transformation addressed this, as described in the test method.

Participant                  Spread A – references    Spread B – references    Difference
                             under text (seconds)     in photos (seconds)      (seconds)

1                            7.41                     4.00                     +3.41
2                            15.81                    39.41                    -23.60
3                            2.85                     23.28                    -20.43
4                            1.86                     11.60                    -9.74
5                            10.62                    120.22                   -109.60
6                            6.35                     3.37                     +2.98
7                            1.92                     10.41                    -8.49
8                            1.38                     6.79                     -5.41
9                            5.00                     2.35                     +2.65
10                           8.97                     3.05                     +5.92
11                           7.23                     7.60                     -0.37
12                           11.53                    40.35                    -28.82
Mean                         6.74                     22.70                    -15.96
Standard deviation           4.47                     33.55                    31.65
Standard error of the mean   1.29                     9.68                     9.14

Time taken for participants to find page references in spreads A and B



Plotting the results in the diagram below illustrates that, on average, the sample used in this test was 15.96 seconds faster at finding page references in spread A, with no overlap of the standard error of the mean.

Plotting the standard error of the mean for spreads A and B
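An error-bar chart of this kind could be reproduced with a short script such as the sketch below (assuming matplotlib; the means and standard errors are taken from the results table above):

# Minimal sketch of the error-bar chart: mean search time per spread with
# the standard error of the mean, using values from the results table.
import matplotlib.pyplot as plt

means = [6.74, 22.70]   # mean time (s) for spread A and spread B
sems = [1.29, 9.68]     # standard error of the mean for each spread

fig, ax = plt.subplots()
ax.bar(["Spread A", "Spread B"], means, yerr=sems, capsize=6)
ax.set_ylabel("Time to find page reference (seconds)")
ax.set_title("Mean search time with standard error of the mean")
plt.show()

Sketch of the standard-error plot (Python/matplotlib)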

Once the data had been normalised it could be analysed using a parametric test. A matched-pairs, one-tailed t-test (one-tailed because the hypothesis is directional) was performed, producing the following output:

Paired T-Test and CI: logtime1, logtime2

Paired T for logtime1 - logtime2

              N    Mean   StDev  SE Mean
logtime1     12   0.718   0.350    0.101
logtime2     12   1.036   0.531    0.153
Difference   12  -0.318   0.549    0.158

95% upper bound for mean difference: -0.033
T-Test of mean difference = 0 (vs < 0): T-Value = -2.00  P-Value = 0.035

Results of a t-test from normalised data using Minitab v.15
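The same test can be reproduced outside Minitab; the sketch below assumes SciPy 1.6 or later (for the alternative argument) and uses the raw times from the results table:

# Minimal sketch reproducing the paired, one-tailed t-test on log10 times.
# Minitab v.15 reported t = -2.00, p = 0.035 for the same data.
import numpy as np
from scipy import stats

spread_a = np.array([7.41, 15.81, 2.85, 1.86, 10.62, 6.35,
                     1.92, 1.38, 5.00, 8.97, 7.23, 11.53])
spread_b = np.array([4.00, 39.41, 23.28, 11.60, 120.22, 3.37,
                     10.41, 6.79, 2.35, 3.05, 7.60, 40.35])

# alternative="less" tests whether spread A times are shorter than spread B.
t_stat, p_value = stats.ttest_rel(np.log10(spread_a), np.log10(spread_b),
                                  alternative="less")
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")

Sketch of the paired one-tailed t-test (Python/SciPy)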



Reviewing the t-test, it is possible to conclude that participants were significantly faster in spread A than in spread B (one-tailed matched-pairs t-test; t11 = -2.00, p = 0.035). On this basis, the null hypothesis can be rejected.

Qualitative

A full record of the participant opinions can be seen in Appendix B. Eleven of the 12 participants preferred spread A, with page references under the text. Three of the 11 who preferred spread A stated that spread B departed from the convention they were used to.

Influence on final design

The final spreads can be seen in Appendix D. Because the quantitative results of this test show that page references placed under the body text are significantly quicker to find, the falconry manual has been designed using this method. The final designs align the top of each process step across the spread, with page references kept in standard-sized, consistently coloured boxes. Consistency is a key element of the design and was mentioned in the qualitative participant feedback. It is presumed this positioning will enable the reader (with little understanding of falconry processes) to access the cross-referenced section quickly and efficiently.

Future designs will assume that inexperienced readers, that is to say readers with no knowledge of the subject matter, may find references placed next to the body text more quickly than those placed on photographic elements of the design.

Acknowledgements

The author sought advice on statistical testing from Patricia Cremona (MSc Wildlife Management and Conservation, University of Reading, 2010), and used her licensed copy of Minitab v.15.

References

Null hypothesis. URL: http://en.wikipedia.org/wiki/Null_hypothesis [accessed 13.03.2010]

t-Test for the Significance of the Difference between the Means of Two Correlated Samples. URL: http://faculty.vassar.edu/lowry/ch12pt1.html [accessed 10.03.2010]

Wenger, M.J., & Spyridakis, J.H. (1989). The relevance of reliability and validity to usability testing. IEEE Transactions on Professional Communication, 32(4), 265-271.

Appendices

Appendix A: Test spreads

Spread A: Page references placed under body text (upper is the left page of the
spread, lower is the right)

Spread B: Page references placed in photographs (upper is the left page of the
spread, lower is the right)

Appendix B: Participant results

Participant   Test      Participant comments

1             1A, 2B    Once I’ve learnt the style it’s a quick jump. I preferred the first one (spread A)
2             1B, 2A    I prefer spread A, and like the [reference] boxes being bigger
3             2B, 1A    Spread A is clearer
4             2A, 1B    I like spread A where the reference is under what I read
5             1A, 2B    Spread A is better as I’m not used to references on photos. Below the photo would be better
6             1B, 2A    Reading through you see spread A references better, but references in the pictures may be better for skimming
7             2B, 1A    A is easier, I look at the text first. Either way, once you know it’s in a tan box it’s quick
8             2A, 1B    They are very similar, hardly any difference, but I prefer references near the text
9             1A, 2B    I like the colour difference and went through the numbers first. B is against the convention so A is better
10            1B, 2A    I prefer A, the references stand out more than when they are placed in the pictures
11            2B, 1A    Prefer A; I’m used to referencing in the text
12            2A, 1B    Prefer the design of B as A seems too condensed. I like the flow of B

Participant testing and qualitative feedback (1 = hooding, 2 = flushing game)



Appendix C: Probability plots of normality tests

Spread A normality test (non-transformed)

Spread B normality test (non-transformed)



Spread A normality test (Log-transformed)

 
 
Spread B normality test (Log-transformed)

 
 

Appendix D: Final spread design (post usability testing)

 
 
Final spread design (upper is the left page of the spread, lower is the right)
 
