
This is the author's version of a work that was submitted/accepted for publication in the following source:

Mills, Chris, Fidge, Colin J., & Corney, Diane (2012) Tool-supported dataflow analysis of a security-critical embedded device. In Pieprzyk, Josef & Thomborson, Clark (Eds.) Proceedings of the 10th Australasian Information Security Conference (AISC 2012), Australian Computer Society, RMIT University, Melbourne, VIC, pp. 59-70.

This file was downloaded from: http://eprints.qut.edu.au/47261/

© Copyright 2012 Australian Computer Society

Copyright 2012, Australian Computer Society, Inc. This paper appeared at the Tenth Australasian Information Security Conference (AISC2012), Melbourne, Australia, 30th January - 2nd February 2012. Conferences in Research and Practice in Information Technology (CRPIT), Vol. 125, J. Pieprzyk and C. Thomborson, Ed. Reproduction for academic, not-for-profit purposes permitted provided this text is included.

Notice: Changes introduced as a result of publishing processes such as copy-editing and formatting may not be reflected in this document. For a definitive version of this work, please refer to the published source:
Tool-Supported Dataflow Analysis of a Security-Critical Embedded Device

Chris Mills, Colin J. Fidge, Diane Corney

Faculty of Science and Technology,
Queensland University of Technology, Brisbane

Abstract

Defence organisations perform information security evaluations to confirm that electronic communications devices are safe to use in security-critical situations. Such evaluations include tracing all possible dataflow paths through the device, but this process is tedious and error-prone, so automated reachability analysis tools are needed to make security evaluations faster and more accurate. Previous research has produced a tool, Sifa, for dataflow analysis of basic digital circuitry, but it cannot analyse dataflow through microprocessors embedded within the circuit since this depends on the software they run. We have developed a static analysis tool that produces Sifa-compatible dataflow graphs from embedded microcontroller programs written in C. In this paper we present a case study which shows how this new capability supports combined hardware and software dataflow analyses of a security-critical communications device.

Keywords: Information security evaluation; Dataflow analysis; Static analysis; Embedded devices

1 Introduction

Security-critical communications devices used to safeguard data confidentiality and integrity in government, military and industrial applications must be rigorously evaluated before they are deployed. Typical domain separation devices used to control the flow of information between classified and unclassified communications networks include data diodes (which enforce unidirectional information flow), encryption devices (which allow classified data to be sent over insecure networks), trusted filters (which constrict information flow) and keyboard-video-mouse switches (which allow a single workstation to access both high-security and low-security computers).

International standards, such as the Common Criteria for Information Technology Security Evaluation (ISO 2009), mandate information security, or infosec, evaluations of such devices. For instance, within Australia the Defence Signals Directorate follows such standards to produce a list of trustworthy devices, known as the Evaluated Products List [1].

A particularly challenging aspect of high-grade infosec evaluations is to trace all (potential) dataflow paths through the device. With respect to the device's electronic circuitry this process is notoriously tedious and error-prone, but it becomes virtually impossible when embedded microprocessors are encountered on the circuit board. The number of dataflow paths through embedded program code far outweighs the number of physical connections in the surrounding circuitry and, unlike a circuitry schematic diagram, potential dataflow paths through software are not self-evident from mere inspection of the source code.

To help alleviate this problem we recently completed a static analyser (Fidge & Corney 2009) which can extract dataflow graphs from Embedded C programs in a form compatible with an existing tool for reachability analyses of digital circuitry (McComb & Wildman 2005). The combination of these two tools thus promises to support seamless automated analyses of dataflow through both the electronic circuitry and embedded software of security-critical communications devices.

In this paper we present a detailed case study demonstrating for the first time how these tools can be used together to analyse an actual domain separation device, tracing dataflow through both its hardware architecture and embedded software. The device itself is a testbed specifically intended for experimentation with infosec evaluation processes. The analysis produced all of the known dataflow paths through this device, as well as revealing some that were not anticipated.

2 Previous and related work

Overall our concern is with automated tools that can help an information security evaluator understand the (potential) flow of data through an electronic device, including both its electronic circuitry and embedded software.

There are, of course, numerous electronic circuit simulators available as both educational and debugging aids. These include Spice [2], the Electric VLSI Design System [3] and NGSpice [4]. However, these are for modelling simple electronic components, semiconductors and logic gates, not microcontroller software.

This research was funded in part by the Defence Signals Directorate and the Australian Research Council via ARC Linkage-Projects Grant LP0776344.

[1] http://www.dsd.gov.au/infosec/epl/
[2] http://bwrc.eecs.berkeley.edu/classes/icbook/spice/
[3] http://www.staticfreesoft.com/
[4] http://ngspice.sourceforge.net/
There are also many multiprocessor simulators such as PTLSim [5] for the x86 microprocessor, CASPER [6] for the OpenSPARC T1, the SESC SuperESCalar Simulator [7], and the IBM Full Systems Simulator [8] for the PowerPC processor, but these tools generally focus on simulating a single processor at the level of individual instruction cycles.

Much closer to our needs are simulators for entire circuit boards, together with their embedded microprocessors, including commercial tools such as Wind River Simics [9] and OVPSim [10].

However, all of the above-cited tools are simulators for helping a developer debug a device by examining one functional behaviour at a time. An infosec evaluator is instead faced with the problem of analysing a given device which is presumed to be functionally correct and does not need debugging. Furthermore, a security evaluator needs to consider all possible behaviours of the device, not just a few. This requirement is best served not by a simulator but by a static analyser which can explore all of the device's behaviours at once. Finally, none of the tools cited above are designed specifically for security evaluations.

Much more useful for this purpose are tools that treat security-critical circuitry as a graph which can be analysed topologically. For instance, the Universal Virtual Laboratory includes a circuit analysis module which can determine whether or not two seemingly-different circuits are topologically equivalent (Mahalingam, Butz & Duarte 2005). More importantly, however, the Secure Information Flow Analyser, Sifa, performs topological analyses of circuitry schematics specifically to support information security evaluations (McComb & Wildman 2005).

We therefore used the Sifa tool as the starting point for our own research; its capabilities are described further in Section 3.1 below. In essence, the goal of our overall project is to extend Sifa with the ability to analyse embedded program code as well as circuitry.

3 Dataflow analysis tools used

Before presenting the case study, this section briefly describes both of the tools that were used, namely the Secure Information Flow Analyser (McComb & Wildman 2007) and our new C-to-Sifa Converter (Fidge & Corney 2009).

3.1 The Secure Information Flow Analyser

Sifa, the Secure Information Flow Analyser, is an open-source [11] software tool developed for the Defence Signals Directorate to assist with infosec evaluation of electronic circuits (McComb & Wildman 2005). It incorporates a simple graph editor to allow device models to be constructed manually, but can also import circuitry schematics expressed in the VHDL hardware design language.

Sifa represents electronic circuitry as a graph of ports, which form the basis for its reachability analyses (McComb & Wildman 2006). Typically ports denote physical pins and connections on a circuit board. Ports can be grouped to form components. These usually represent discrete electronic components on the board such as logic gates, integrated circuit chips, connectors, etc. Sifa allows components to be grouped hierarchically, thus providing a highly flexible modelling capability. Furthermore, Sifa treats all identically-named ports as denoting the same physical object. This allows circuitry diagrams to be split horizontally into different pages, with identically-named ports acting as off-page connectors, or vertically into layered models, allowing the same circuit to be described at different levels of abstraction simultaneously.

Sifa provides a variety of graph-theoretic functions for analysing models of security-critical circuitry (McComb & Wildman 2007). These include identifying all components between two points in the graph (which helps exclude components that have no security significance), finding cutsets between two points (which helps identify places in the circuit where infosec evaluations can be done most efficiently), and comparing two different graphs for overall equivalence (which allows an abstract model of expected data flow to be compared with the actual data flow in the concrete circuit).

However, Sifa's most important function is its ability to identify all dataflow paths between selected points in a graph, typically between a high-security data source and a low-security data sink. Since a circuitry graph is usually fully-connected (i.e., every electronic component is connected directly or indirectly to every other one), Sifa uses the concept of a device's operating modes to allow such graphs to be partitioned meaningfully. The user can define intra-component data flow with respect to particular modes. Modes are further divided into normal and fault behaviours, with a probability attached to the latter. (Sifa has no semantic understanding of modes, however, using them merely as a syntactic way of partitioning the search space.)

Sifa thus performs a mode-specific analysis of inter-component reachability and presents the user with a list of those paths through the circuit that connect selected data sources and sinks in particular modes. The infosec evaluator can then inspect each such path to determine whether or not it poses a security risk. While adequate for circuits composed of simple electronic components only, this process encounters difficulties coping with the complex behaviours of embedded microprocessors. The infosec evaluator is obliged to separately analyse the program code to determine how data may flow through these components.

3.2 The C-to-Sifa Converter

To solve this problem, we recently completed a C-to-Sifa Converter. This is a compiler-like program that converts Embedded C code to Sifa-compatible dataflow graphs capable of being integrated into hardware circuitry models. Its input consists of computer programs written in Custom Computer Services' C dialect for Programmable Integrated Circuit microcontrollers [12], and its output is an XML description of a dataflow graph that can be loaded directly into Sifa. A preliminary description of the principles underlying the tool can be found elsewhere (Fidge & Corney 2009), with a more detailed description of the final implementation to appear in a forthcoming paper.

[5] http://www.ptlsim.org/
[6] http://coe.uncc.edu/kdatta/casper/casper.php
[7] http://iacoma.cs.uiuc.edu/paulsack/sescdoc/
[8] http://www.research.ibm.com/systemssim/
[9] http://www.windriver.com/products/simics/
[10] http://www.ovpworld.org/technology_ovpsim.php
[11] http://sifa.sourceforge.net/
[12] http://www.ccsinfo.com/

To model (potential) data flow through program code the tool uses the Augmented Static Single
Assignment representation, originally developed for performing taint analyses of security-critical programs (Scholz, Zhang & Cifuentes 2008). In particular, this representation considers not just explicit data flow between program variables, but also the implicit information flow created by one variable's value exercising control over assignments to another variable (Sabelfeld & Myers 2003). For instance, given the C program fragment in Figure 1, the C-to-Sifa Converter will produce the Augmented SSA dataflow graph in Figure 2. As in traditional dataflow graphs (Cytron, Ferrante, Rosen, Wegman & Zadeck 1989) it uses a node to merge alternative dataflow paths through the if statement, in this case showing that variable t's final value (t2) may be derived either from the initial values of variables t and v or from variable w. In addition, however, the Augmented SSA graph also shows relevant control flows, in the same way as Gated Single Assignment representation (Ballance, Maccabe & Ottenstein 1990), in this case showing that variable u's value exercises control over the final value of variable t.

    if (u > 0) {
        t = t + v;
    } else {
        t = w;
    }

Figure 1: Example of a conditional statement.

[Figure 2 diagram not reproduced. Node labels: u, t, v, w; '> 0'; '+'; 'u > 0'; 'if' merge node; final value of t. Legend: port; component; data flow; control influence.]

Figure 2: Dataflow graph generated by the C-to-Sifa Converter for the code fragment in Figure 1.

Apart from implementing the basic conversion from imperative programming code to dataflow graphs, we also needed to extend the analysis to handle program constructs peculiar to embedded code. These included input and output statements that interact directly with the surrounding hardware, low-level, non-block structured control-flow statements such as breaks and continues, asynchronous control flow via hardware interrupts, and byte- and bit-level data operations.

The case study described below also highlighted some practical issues that needed to be solved within the C-to-Sifa Converter. For instance, since the device analysed contains two separate microcontrollers it was necessary to uniquely distinguish the dataflow nodes generated for each of the two embedded programs (because Sifa unifies all identically-named nodes) via a command-line option for prefixing the names of graph nodes with a microcontroller-specific identifier. Also, the large number of dataflow nodes generated for program code relative to its surrounding circuitry makes it difficult to interpret the long dataflow paths generated by Sifa, so the source code program's line number is included in the name of each dataflow graph node generated.

Most significantly, the user needs a way to link the microcontroller pins appearing in the circuitry diagram to corresponding input and output statements in the program code (Fidge & Corney 2009). A variety of potential solutions to this were contemplated, such as adding a bridging component to the Sifa model to explicitly link hardware and software features, or providing a configuration file to the C-to-Sifa Converter to tell it what names are used for the microcontroller's pins in the hardware schematic. For the purposes of this particular case study, however, it was found to be expedient to simply edit the pin names in the (hand-crafted) hardware model directly to match those in the (automatically-generated) software model, especially since only a handful of the many pins on the microprocessor chips were used to transfer data.

4 The case study

To test the combined capabilities of Sifa and the C-to-Sifa Converter we performed a small, but complete, case study to show how potential data flow can be traced through both the hardware and software of an embedded domain-separation device.

4.1 The data diode device

The subject of the trial was a data diode device produced by Australia's Defence Signals Directorate [13] as an unclassified and non-proprietary testbed for experimenting with infosec evaluation techniques (Mallen 2003). Our project team was given access to the device's design drawings, circuitry schematics and code listings, as well as a functional version of the device itself. A data diode device is typically used as part of a gateway between a high-security network and a low-security one, in order to ensure that there is no information leakage from the former to the latter.

[13] http://www.dsd.gov.au/

The particular data diode device analysed here (Mallen 2003) contains two circuit boards connected by a ribbon cable as shown in Figure 3. The red circuit board is connected to the high-security network via an RS232 serial cable and the black circuit board is similarly connected to the low-security network. (This split architecture is intended to aid security evaluation of the device; high-security data should be found on the red circuit board only and the ribbon cable forms a narrow, well-defined bottleneck between the two security domains.) Both circuit boards contain their own Programmable Integrated Circuit microcontroller, each running a different program written in Custom Computer Services' C dialect. Both microcontrollers directly control LEDs on the device's front panel to display its communication status ('ready for data' or 'waiting for acknowledgement'). Two switches on the front panel ('reset' and 'ack mode') are connected directly to the black microcontroller, and indirectly through the ribbon cable to
the red microcontroller, to allow the operator to control the device's operating mode.

[Figure 3 diagram not reproduced. Labels: Data diode device; Switches; LEDs; Black circuit board (black serial port, black microprocessor); Red circuit board (red serial port, red microprocessor); Black (low-security) network; Red (high-security) network.]

Figure 3: Block architecture of the data diode device.

The data diode device's primary function is to allow data bytes to flow from the low-security network to the high-security one (i.e., from left to right in Figure 3) but not vice versa. However, to support communication over unreliable networks, this particular device also allows acknowledgements to be returned from the high-security network to the low-security one (i.e., from right to left in Figure 3). Such a capability is, of course, clearly dangerous because it allows information to flow from the high-security domain to the low-security one.

To (partially) mitigate this threat, the acknowledgement function is directly controlled by the operator via a front panel switch. Furthermore, entire bytes returned by the high-security network are not directly forwarded to the low-security one. Instead, the red microprocessor compares the returned byte with the one just sent. Depending on whether or not they match it sets one of two binary signals sent to the black microprocessor. Finally, the black microprocessor converts these signals into one of two characters ('Y' or 'N') returned to the low-security network, thus constricting (but not entirely eliminating) the flow of information in the unsafe direction.

Overall, therefore, this data diode device offers an ideal testbed for infosec evaluation procedures since it has both a well-defined safe behaviour (the black-to-red data path) and a potentially unsafe behaviour (the red-to-black acknowledgement path).

4.2 Modelling and analysis process for the case study

To perform the analysis a model of the data diode device's hardware layout was first developed using Sifa's built-in editor, as shown in Figure 4. No VHDL representation of the circuitry was available, so the model was constructed manually, but this was not a major problem since this device's hardware is relatively simple; the black circuit board's model contained nine distinct components and the red board's model contained eight. (As is usual in these evaluations, power circuitry components, such as capacitors and resistors, were not modelled.) Appropriate mode-specific connectivity through each of these components, except for the microcontrollers, was defined for the two main operating modes of the data diode device, namely 'ack mode on' and 'ack mode off'. (Another advantage of the data diode device as a testbed is that its significant operating modes are obvious in its design.)

Next, the source code programs for the two microcontrollers were processed by the C-to-Sifa Converter. (Both programs are written in the Embedded C dialect for the PIC16F877 microcontrollers used in the data diode device.) Although the programs being analysed were quite small, the resulting dataflow graphs were still highly complex. The black program consisted of only 106 lines of commented, formatted C code, plus a 248 line header file, but resulted in a graph containing 195 Sifa ports grouped to form 87 dataflow graph components. Similarly, the red program's 109 lines, plus header file, generated 200 ports forming 89 dataflow components. Part of these graphs is shown in Figure 5. (Sifa does not have an in-built graph layout tool, and the C-to-Sifa Converter merely generates nodes in a simple grid, without giving consideration to layout issues such as minimising line cross-overs. Fortunately, the infosec evaluator will not normally be obliged to study these graphs, relying merely on the output from the analysis, unless an exceptionally-detailed understanding of a particular dataflow path is required.)

It was then possible to load both the hardware and software models into Sifa, select source and sink nodes, and automatically analyse the model to find dataflow paths of potential security significance. A variety of analyses were performed to ensure that all the dataflow pathways anticipated for this device were detected by the combined hardware-software model. Several of the paths returned by Sifa were then hand-checked in order to ensure that they conformed with our understanding of the way the device processes and forwards data. Doing this confirmed that the entire toolchain was working correctly and also helped us understand some unexpected, but logically correct, 'false-positive' paths produced. (Inevitably a static analysis such as that performed by Sifa will to some extent overapproximate the actual paths that occur dynamically. While we can seek to minimise such false-positives, their existence is a fundamental limitation of static analyses.)
Figure 4: Models of the data diode's black and red circuit boards in Sifa's editing window.
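
As a rough illustration of the mode-specific connectivity defined for the hardware model in Section 4.2, the C sketch below tags each intra-component flow with the operating modes in which it exists and restricts a reachability search to one mode. It is not Sifa's data model: the port names, the Flow structure and the reachable_in_mode function are assumptions made for this example, and the gate behaviour is simplified from the description in Section 5.2 of the AND gate that passes acknowledgement data only in acknowledgement mode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical operating modes, one bit each. */
enum { MODE_ACK_OFF = 1, MODE_ACK_ON = 2 };

/* Hypothetical ports on the acknowledgement route into the red board. */
enum { RED_SERIAL_IN, RS232_RX_OUT, AND_GATE_OUT, RED_PIN_A2, PORT_COUNT };

typedef struct {
    int from, to;      /* port indices */
    unsigned modes;    /* bitmask of modes in which this flow exists */
} Flow;

static const Flow flows[] = {
    { RED_SERIAL_IN, RS232_RX_OUT, MODE_ACK_OFF | MODE_ACK_ON },
    { RS232_RX_OUT,  AND_GATE_OUT, MODE_ACK_ON },  /* gate passes data only in ack mode */
    { AND_GATE_OUT,  RED_PIN_A2,   MODE_ACK_OFF | MODE_ACK_ON },
};

/* Depth-first reachability that ignores flows not enabled in the given mode. */
bool reachable_in_mode(int source, int sink, unsigned mode) {
    assert(source >= 0 && source < PORT_COUNT);
    assert(sink >= 0 && sink < PORT_COUNT);

    bool seen[PORT_COUNT] = { false };
    int stack[PORT_COUNT];                    /* each port is pushed at most once */
    int top = 0;

    stack[top++] = source;
    seen[source] = true;
    while (top > 0) {
        int at = stack[--top];
        if (at == sink)
            return true;
        for (size_t i = 0; i < sizeof flows / sizeof flows[0]; i++) {
            if (flows[i].from == at && (flows[i].modes & mode) != 0
                    && !seen[flows[i].to]) {
                seen[flows[i].to] = true;
                stack[top++] = flows[i].to;
            }
        }
    }
    return false;
}
```

In this toy model reachable_in_mode(RED_SERIAL_IN, RED_PIN_A2, MODE_ACK_ON) is true, whereas the same query with MODE_ACK_OFF is false because the gate's flow is filtered out. The mode is treated purely as a filter on edges, in keeping with the remark in Section 3.1 that modes are a syntactic way of partitioning the search space.
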

5 Dataflow analysis results

Having completed the hardware and software models a number of dataflow analyses were performed to test the combined capabilities of the existing Sifa tool and our new C-to-Sifa Converter.

5.1 An explicit dataflow path

As an initial test we selected the incoming line of the data diode device's black serial connector as the data source of interest and the outgoing line of the red serial connector as the data sink, in order to identify (safe) dataflow paths from the low-security network to the high-security network via the data diode. As expected, Sifa reported the existence of one such path. This path, which comprised 20 distinct steps, represents the normal flow of data bytes through the device.

Such paths are essentially just a list of ports, but to make them easier to interpret Sifa's interactive interface allows the user to single-step through the trace, automatically highlighting corresponding components in the graph and the operating modes in which they can be traversed. Doing this for the path found in this case allowed us to see how data can travel from the black network to the red one, via both hardware and software within the data diode device, and relate this path back to the original circuitry schematics and code listings (Figures 6 to 9).

In this case, starting from the black serial port (on the left of Figure 6), data bytes travel via the RS232 receiver component to pin A0 of the black microcontroller. The microcontroller's program (Figure 7) reads these bytes into a local variable, inputChar, and later sends them to pin A1. (The pins operated on by the getc and putc statements are determined by the preceding #use compiler directive.) This is an example of direct data flow between hardware pins and software variables via explicit assignments in the program code, and demonstrates the C-to-Sifa Converter's ability to model these relationships.

    66 if (getNextChar==TRUE) {
       ...
    70   #use rs232(baud=9600,Xmit=PIN_A1,
                    Rcv=PIN_A0,parity=n,bits=8)
    71   inputChar = getc();
    72   // Disable ready LED
    73   output_low(PIN_C7);
    75   if (input(PIN_c2)) {
    76     // Set waiting for ack LED
    77     output_high(PIN_C6);
    78     lastC0 = input(PIN_C0);
    79     lastC1 = input(PIN_C1);
    80   }
    81   putc(inputChar);
    82 }

Figure 7: Program code (lines 71 and 81) that transfers data from black microprocessor pin A0 to pin A1.

From black microcontroller pin A1 the bytes travel to the red circuit board via the ribbon cable (right-hand side of Figure 6). On the red circuit board (left of Figure 8) the bytes travel via an optocoupler (used to ensure unidirectional data flow along this circuit) and enter the red microcontroller via its A0 pin.

Similarly to the other embedded program, the red microcontroller's code (Figure 9) transfers data bytes between its hardware pins A0 and A1 via an intermediate software variable, rxChar. From pin A1 each byte is forwarded to the data diode device's red serial port via an RS232 driver (right-hand side of Figure 8).

Sifa's identification of this expected data path
helped confirm the correct functioning of the C-to-Sifa Converter and demonstrates the ability to find composite hardware-software dataflow paths created by explicit data assignments.

Figure 5: Part of the automatically-generated (and unformatted) model of data flow through the red microprocessor's software in Sifa's editing window.

    66 while (TRUE)
    67 {
    69   output_high(PIN_C7);
    70   #use rs232(baud=9600,Xmit=PIN_A1,
                    Rcv=PIN_A0,parity=n,bits=8)
    71   rxChar = getc();
    72   putc(rxChar);
    74   output_low(PIN_C7);
    75   if (input(PIN_C2)) {
    76     ...

Figure 9: Program code (lines 71 and 72) that transfers data from red microprocessor pin A0 to pin A1.

5.2 Some implicit dataflow paths

More importantly, we then analysed the model using the data diode's red serial connector as the data source and its black connector as the sink, in order to identify potentially unsafe data flows from the high-security domain to the low-security one. Given the obvious dangers associated with the data diode device's acknowledgement function it was no surprise that Sifa identified the existence of such a path, but it was interesting to note that ten distinct high-to-low paths were produced, the longest of which involved 46 steps from source to sink.

Upon investigation, it was discovered that this large number of dataflow paths in the reverse direction is due to the numerous conditional (if) statements in the part of the program code that processes acknowledgements. For instance, both microcontrollers have program code that is conditional on the position of the 'ack mode' switch on the data diode device's front panel. Also, the red microcontroller executes different code depending on whether or not the byte returned by the high-security network is the same as the last byte sent to it. Similarly, the black microcontroller's program tests the values of both the positive and negative acknowledgement signals generated by the red microprocessor and performs different actions accordingly. Putting all of these alternative behaviours together accounts for the many distinct dataflow paths found by Sifa. Furthermore, the computational complexity involved in traversing these paths accounts for their significant length.

For instance, one of the potentially dangerous dataflow paths from the high-security domain to the low-security one concerns negative acknowledgements, produced when the data diode sends a byte to the high-security network but a non-matching byte is returned. Sifa's analysis shows that this path starts from the red serial port (on the right of Figure 10)
and travels through the red processor board's RS232 receiver and an AND gate before reaching pin A2 on the red microcontroller. The AND gate (bottom centre of Figure 10) is connected to the 'ack mode' switch on the data diode device's front panel (via the black circuit board) and is used to ensure that acknowledgement data reaches the red microcontroller only when the data diode device is in acknowledgement mode.

Figure 6: Data path (left to right) through the black circuit board via the black microprocessor.

Figure 8: Data path (left to right) through the red circuit board via the red microprocessor.

Figure 10: Control path (right to left) for negative acknowledgements through the red circuit board via the red microprocessor.

Sifa's trace through the red microcontroller's program code for this particular path (Figure 11) shows that the byte is read from hardware pin A2 into software variable ackChar (line 88). Later this variable is compared to the last byte sent to the high-security domain (line 99), held in variable rxChar. If the values do not match then the binary signal produced by microcontroller pin C1 is toggled to indicate a negative acknowledgement (line 104).

     78 #use rs232(baud=9600,Xmit=PIN_A1,
                   Rcv=PIN_A2,parity=n,bits=8)
     87 if (kbhit()) {
     88   ackChar = getc();
     89 } else {
     90   timeout_error = TRUE;
     91 }
     93 output_low(PIN_C6);
     95 if (timeout_error) {
     96   noAck = !noAck;
     97   output_bit(PIN_C1, noAck);
     98 } else {
     99   if ( ackChar == rxChar) {
    100     yesAck = !yesAck;
    101     output_bit(PIN_C0, yesAck);
    102   } else {
    103     noAck = !noAck;
    104     output_bit(PIN_C1, noAck);
    105   }
    106 }

Figure 11: Data-flow path (lines 88, 99 and 104) through the red microprocessor's code that translates data received from pin A2 into a negative acknowledgement control signal sent via pin C1.

Notice in this code that there is no direct transfer of data from pin A2 to pin C1. The byte received via pin A2 influences the binary signal sent via pin C1, but no values from the byte are forwarded directly. This is, therefore, an example of implicit information flow between software variables and hardware pins, again confirming the C-to-Sifa Converter's ability to capture such system properties.

From red microcontroller pin C1 the signal then travels directly via the ribbon cable (left of Figure 10) to pin C1 of the black microcontroller (from the right in Figure 12).

The black microcontroller's program code repeatedly samples the signal on this pin to see if it has changed, in which case it sends a negative acknowledgement character 'N' to the low-security network. These multiple samples account for some of the different dataflow paths found by Sifa. For instance, a short path through the black microcontroller's code (Figure 13) occurs when the signal sampled from pin C1 (line 92) is used in a condition which directly controls whether or not the 'N' character is sent (line 95).

However, since the sampled signal is compared with a previous sample from the same pin, a longer
Figure 12: Control path (right to left) for negative acknowledgements through the black circuit board via the
black microprocessor.
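
The implicit flow described for Figure 11, where the byte from pin A2 influences pin C1 without ever being copied to it, can be illustrated with a small host-side sketch. This is not the device firmware or the C-to-Sifa Converter's model: the AckPins structure and the functions handle_ack and c1_toggles are hypothetical names introduced here, and the pins are simulated as plain booleans so that the fragment compiles with any standard C compiler.

```c
#include <stdbool.h>

/* Hypothetical model of the red microcontroller's two acknowledgement
   pins; on the real device these are driven with output_bit(). */
typedef struct {
    bool yes_ack;  /* level driven on pin C0 */
    bool no_ack;   /* level driven on pin C1 */
} AckPins;

/* Mirrors the inner comparison of Figure 11: ack_char is only tested
   in a condition and is never assigned to either pin. */
static void handle_ack(AckPins *pins, unsigned char ack_char,
                       unsigned char rx_char)
{
    if (ack_char == rx_char) {
        pins->yes_ack = !pins->yes_ack;  /* positive acknowledgement */
    } else {
        pins->no_ack = !pins->no_ack;    /* negative acknowledgement */
    }
}

/* Returns true if handling ack_char changes the level on pin C1,
   starting from both pins low. */
static bool c1_toggles(unsigned char ack_char, unsigned char rx_char)
{
    AckPins pins = { false, false };
    handle_ack(&pins, ack_char, rx_char);
    return pins.no_ack != false;
}
```

Under these assumptions, c1_toggles('B', 'A') is true and c1_toggles('A', 'A') is false: an observer of pin C1 learns whether the received byte matched, even though no bit of the byte is forwarded. A dataflow analysis must therefore follow control dependences as well as assignments to report this path.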

path that ends at the same output statement begins by sampling a
previous value from pin C1 (line 79) into a software variable, lastC1,
which is then compared with the current sample (line 92), thus also
influencing whether or not the negative acknowledgement character is
sent (line 95). Such alternative paths, due to conditional statements in
the program code, were found to account for the many different
high-to-low dataflow paths detected. Depending on the rigour of the
security evaluation, the infosec evaluator may care to study each such
path individually or may simply note the existence of potential data
flow between the relevant microprocessor pins, regardless of its
specific cause.

75    if (input(PIN_C2)) {
77      output_high(PIN_C6);
78      lastC0 = input(PIN_C0);
79      lastC1 = input(PIN_C1);
80    }
81    ...
84    if (input(PIN_C2)) {
85      if (input(PIN_C0) != lastC0) {
86        ...
91      }
92      if (input(PIN_C1) != lastC1) {
93        if (!getNextChar) {
94          #use rs232(baud=9600, Xmit=PIN_A2, Rcv=PIN_A0, parity=n, bits=8)
95          putc('N');
96          getNextChar = TRUE;
97          output_low(PIN_C6);
98        }
99        lastC1 = input(PIN_C1);
100     }
101   }

Figure 13: Two data-flow paths (lines 92 (left) and 95, and lines 79,
92 (right) and 95) through the black microprocessor's code that
translate a control signal received from pin C1 into a negative
acknowledgement data value sent via pin A2.

Finally, the acknowledgement character sent from the black
microcontroller via its pin A2 reaches the black network via another
AND gate (bottom left of Figure 12) and the black processor board's
RS232 driver.

Paths such as this one, plus the various others produced in this case,
again confirm the toolchain's ability to automatically identify complex
dataflow paths which may be worthy of close scrutiny.

5.3 Some less obvious paths

Apart from the crucial paths between the data diode device's red and
black serial ports, we also explored our toolchain's ability to identify
other paths both within and through the device. In particular, we
analysed the potential destinations of data emanating from the switches
on the device's front panel, and possible sources of signals driving the
front panel's LEDs. In practice the device itself would normally reside
physically within a high-security domain, so there is no serious danger
of an adversary receiving a coded message via the LEDs. However, the
position of the switches could conceivably be detectable by an observer
in the low-security domain, representing a more realistic threat.

For instance, Sifa's analysis, using our hardware model and the software
model generated by the C-to-Sifa Converter, revealed 14 distinct
dataflow paths from the 'ack mode' switch on the data diode device's
front panel to the low-security serial port. As was the case for the
acknowledgement bytes described above, this large number of paths proved
to be due to the numerous conditional statements in the
microcontrollers' programs that rely on the position of this switch. In
essence, of course, the existence of these
paths simply confirms the obvious fact that an observer in the
low-security domain can determine the position of the (high-security)
'ack' switch merely by noting the presence or absence of
acknowledgements coming from the data diode device in response to bytes
sent to it.

We also found numerous dataflow paths from the 'ack mode' switch to the
two 'waiting for ack' LEDs on the data diode device's front panel (one
LED is attached to each circuit board). This is to be expected because
the position of this switch determines whether or not signals are sent
to these LEDs. However, there were no paths from this switch to the
'ready to receive data' LEDs. Similarly, no paths were found leading
from the red serial port to the 'ready to receive data' LED on the red
circuit board since this port only receives acknowledgements, not data
bytes. These results conformed precisely with our understanding of the
data diode device's internal behaviour.

Some of the results were not so obvious, however. For instance, we were
surprised to discover that the combined hardware-software analysis
produced paths leading from the red serial port, which receives
acknowledgement bytes only, to the 'ready to receive data' LED attached
to the black circuit board, which displays the status of data bytes
travelling in the opposite direction! Inspection of the black
microcontroller's program code revealed that this interaction is due to
the acknowledgement signals received by the black microcontroller from
the red one controlling assignments to a software variable,
getNextChar, which in turn is used to control code that sends signals
to this LED (via black microcontroller pin C6). This was a good example
of the C-to-Sifa Converter identifying paths not expected by the
research team. (Furthermore, it was noted that the program code could be
restructured to eliminate this flow, although in practice it does not
represent a serious security threat given the assumption that the LEDs
are accessible in the high-security domain only.)

A particularly counterintuitive finding was that no dataflow paths were
produced from the black serial port to the black circuit board's 'ready
to receive data' LED, which flashes once for each data byte received.
The relevant part of the black microcontroller's program is shown in
Figure 14. The LED is first switched on (line 68), the data byte is read
(line 71), and the LED is then switched off (line 73). However, despite
the clearly-evident sequential relationship between execution of these
three statements, there is, in fact, no dataflow relationship between
them. The byte read from pin A0 into variable inputChar is not sent to
the LED connected to pin C7. Nor does the byte received control the
signals sent to the LED; the same signals are sent to the LED regardless
of the value of the byte received. Thus the C-to-Sifa Converter
correctly produced no dataflow connection between inputs on pin A0 and
outputs to pin C7 in this case. This is in accordance with the
well-established principle of noninterference as a fundamental way of
modelling information flow (Goguen & Meseguer 1982). There is, however,
a timing relationship between these actions because the low signal
cannot be sent to pin C7 until the byte from pin A0 has been read. A
timing channel thus exists between these pins, but our toolchain does
not (currently) attempt to perform timing analyses.

68    output_high(PIN_C7);
69    getNextChar = FALSE;
70    #use rs232(baud=9600, Xmit=PIN_A1, Rcv=PIN_A0, parity=n, bits=8)
71    inputChar = getc();
73    output_low(PIN_C7);

Figure 14: Part of the black microprocessor's code responsible for
reading data bytes (line 71) and flashing the 'ready to receive data'
LED (lines 68 and 73).

A similar case of an expected path not being found was from the red
serial port, which receives acknowledgement bytes from the high-security
domain, to the 'waiting for ack' LED attached to the red circuit board,
which flashes to indicate to the operator that an acknowledgement is
being processed. Again this finding by Sifa and the C-to-Sifa Converter
was vindicated by manual inspection of the red microcontroller's program
code, which showed that the same signals are always sent to flash this
LED regardless of the value of the acknowledgement byte received. In
fact, it is the position of the 'ack mode' switch that influences the
behaviour of this LED, not the acknowledgement bytes themselves. Again
there is a timing relationship between these actions, but no actual data
flow.

6 Conclusion

One of the key steps during information security evaluations of
communications devices is to trace all potential data flow through the
device's circuitry and embedded program code. We have created a
toolchain which automates this process by combining an existing
circuitry analysis tool, Sifa, with a new analysis tool for embedded
program code, the C-to-Sifa Converter. In this paper we have used a
small, but complete, case study to show how this toolchain allows the
flow of data to be traced seamlessly and accurately through a security
device's hardware and software. At the time of writing we are conducting
further case studies involving interrupt-driven microcontroller
programs.

References

Ballance, R. A., Maccabe, A. B. & Ottenstein, K. J. (1990), The program
dependence web: A representation supporting control-, data-, and
demand-driven interpretation of imperative languages, in Proceedings of
the ACM SIGPLAN Conference on Programming Language Design and
Implementation (PLDI'90), New York, USA, June 20-22, ACM, pp. 257-271.

Cytron, R., Ferrante, J., Rosen, B. K., Wegman, M. N. & Zadeck, F. K.
(1989), An efficient method of computing Static Single Assignment form,
in Proceedings of the 16th ACM SIGPLAN-SIGACT Symposium on Principles of
Programming Languages (POPL'89), Austin, USA, ACM, New York, USA,
pp. 25-35.

Fidge, C. J. & Corney, D. (2009), Integrating hardware and software
information flow analyses, in Proceedings of the ACM SIGPLAN/SIGBED 2009
Conference on Languages, Compilers, and Tools for Embedded Systems
(LCTES 2009), Dublin, June 19-20, ACM, pp. 157-166.

Goguen, J. & Meseguer, J. (1982), Security policies and security models,
in IEEE Symposium on Security and Privacy, IEEE Computer Society,
pp. 11-20.

ISO (2009), ISO/IEC Standard 15408-1:2009, Information Technology -
Security Techniques - Evaluation Criteria for IT Security - Part 1:
Introduction and General Model, 3.1 edn, International Organization for
Standardization, Geneva, Switzerland.

Mahalingam, A., Butz, B. P. & Duarte, M. (2005), An intelligent circuit
analysis module to analyze student queries in the Universal Virtual
Laboratory, in W. Oakes, D. Voltmer & C. Yokomoto, eds, Proceedings of
the 35th ASEE/IEEE Frontiers in Education Conference (FIE'05),
Indianapolis, USA, Institute of Electrical and Electronics Engineers,
New Jersey, USA, pp. F4E1-F4E6.

Mallen, S. (2003), Serial data diode device - Operation manual,
Technical report, Defence Signals Directorate.

McComb, T. & Wildman, L. P. (2005), SIFA: A tool for evaluation of
high-grade security devices, in C. Boyd & J. Nieto, eds, Proceedings of
the Tenth Australasian Conference on Information Security and Privacy
(ACISP 2005), Brisbane, Australia, Vol. 3574 of Lecture Notes in
Computer Science, Springer-Verlag, Berlin, pp. 230-241.

McComb, T. & Wildman, L. P. (2006), User guide for SIFA v.1.0,
Technical report. Available from http://sifa.sourceforge.net/.

McComb, T. & Wildman, L. P. (2007), A combined approach for information
flow analysis in fault tolerant hardware, in Proceedings of the Twelfth
IEEE International Conference on Engineering of Complex Computer
Systems (ICECCS 2007), IEEE Computer Society Press.

Sabelfeld, A. & Myers, A. C. (2003), Language-based information-flow
security, IEEE Journal on Selected Areas in Communications 21(1), 1-15.

Scholz, B., Zhang, C. & Cifuentes, C. (2008), User-input dependence
analysis via graph reachability, in Proceedings of the Eighth IEEE
International Working Conference on Source Code Analysis and
Manipulation (SCAM 2008), Beijing, September 28-29, IEEE, pp. 25-34.
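
As a supplementary illustration of the noninterference argument made for Figure 14 in Section 5.3, the following host-side sketch checks by brute force that a routine's pin C7 trace is identical for every possible input byte. It is a dynamic test written for this illustration only; it is not how Sifa or the C-to-Sifa Converter work, since they analyse the circuitry and code statically. The names flash_led, flash_led_leaky, LedRoutine and is_noninterfering are hypothetical, and the leaky variant is an invented counterexample that does not appear in the device's firmware.

```c
#include <stdbool.h>
#include <string.h>

#define TRACE_LEN 2

/* Hypothetical model of Figure 14: LED on, read byte, LED off.
   The trace records the levels written to pin C7. */
static void flash_led(unsigned char input_char, bool trace[TRACE_LEN])
{
    trace[0] = true;      /* output_high(PIN_C7) */
    (void)input_char;     /* inputChar = getc(): stored, not used for the LED */
    trace[1] = false;     /* output_low(PIN_C7) */
}

/* Invented leaky variant: the byte's low bit decides the final level. */
static void flash_led_leaky(unsigned char input_char, bool trace[TRACE_LEN])
{
    trace[0] = true;
    trace[1] = (input_char & 1u) != 0;
}

typedef void (*LedRoutine)(unsigned char input_char, bool trace[TRACE_LEN]);

/* Returns true if all 256 input bytes produce the same pin trace,
   i.e. the input cannot be distinguished by observing the LED levels. */
static bool is_noninterfering(LedRoutine routine)
{
    bool reference[TRACE_LEN];
    routine(0, reference);
    for (unsigned int byte = 1; byte <= 255; byte++) {
        bool trace[TRACE_LEN];
        routine((unsigned char)byte, trace);
        if (memcmp(reference, trace, sizeof reference) != 0) {
            return false;
        }
    }
    return true;
}
```

With these definitions, is_noninterfering(flash_led) holds, matching the absence of a dataflow path from pin A0 to pin C7, whereas is_noninterfering(flash_led_leaky) fails. The check compares signal values only, so the timing relationship noted in Section 5.3 is outside its scope, just as it is outside the toolchain's.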
