
IT TRENDS

EDITOR: Irena Bojanova, NIST, irena.bojanova@computer.org

How Much to Trust Artificial Intelligence?

George Hurlburt, STEMCorp

There has been a great deal of recent buzz about the rather dated notion of artificial intelligence (AI). AI surrounds us, involving numerous applications ranging from Google search, to Uber or Lyft ride-summoning, to airline pricing, to Alexa or Siri. To some, AI is a form of salvation, ultimately improving quality of life while infusing innovation across myriad established industries. Others, however, sound dire warnings that we will all soon be totally subjugated to superior machine intelligence. AI is typically, but no longer always, software dominant, and software is prone to vulnerabilities. Given this, how do we know that the AI itself is sufficiently reliable to do its job, or—put more succinctly—how much should we trust the outcomes generated by AI?

Risks of Misplaced Trust
Consider the case of self-driving cars. Elements of AI come into play in growing numbers of self-driving car autopilot regimes. This results in vehicles that obey the rules of the road, except when they do not. Such was the case when a motor vehicle in autonomous mode broadsided a turning truck in Florida, killing its “driver.” The accident was ultimately attributed to driver error, as the autonomous controls were deemed to be performing within their design envelope. The avoidance system design at the time required that the radar and visual systems agree before evasive action would be engaged. Evidence suggests, however, that the visual system encountered glare from the white truck turning against bright sunlight. This system neither perceived nor responded to the looming hazard. At impact, however, other evidence implicated the “driver,” who was watching a Harry Potter movie. The driver, evidently overconfident in the autopilot, did not actively monitor its behavior and failed to override it, despite an estimated seven-second visible risk of collision.1 The design assurance level was established, but the driver failed to appreciate that his autopilot still required his full, undivided attention. In this rare case, misplaced trust in an AI-based system turned deadly.
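
The agreement rule at the heart of that design is easy to express in code. What follows is a minimal, hypothetical sketch of such a dual-sensor voting gate in Python; the names, thresholds, and readings are invented for illustration and do not reflect any vendor's actual implementation.

# Hypothetical sketch of a dual-sensor agreement gate, loosely modeled on
# the design described above; names and thresholds are illustrative only.
from dataclasses import dataclass

@dataclass
class SensorReading:
    hazard_detected: bool   # did this sensor flag an obstacle?
    confidence: float       # sensor's own confidence, 0.0 to 1.0

def should_engage_evasive_action(radar: SensorReading,
                                 vision: SensorReading,
                                 threshold: float = 0.8) -> bool:
    """Engage braking only when BOTH sensors agree on a hazard.

    This mirrors the failure mode in the Florida crash: if glare
    blinds the vision system, its 'no hazard' vote suppresses the
    radar's 'hazard' vote, and no evasive action is taken.
    """
    radar_vote = radar.hazard_detected and radar.confidence >= threshold
    vision_vote = vision.hazard_detected and vision.confidence >= threshold
    return radar_vote and vision_vote

# The crash scenario: radar sees the truck, but glare defeats the camera.
radar = SensorReading(hazard_detected=True, confidence=0.95)
vision = SensorReading(hazard_detected=False, confidence=0.20)
print(should_engage_evasive_action(radar, vision))  # False -> no braking
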
Establishing a Bar for Trust
AI advancement is indeed impressive. DARPA, sponsor of early successful autonomous vehicle competitions, completed the Cyber Grand Challenge (CGC) competition in late 2016. The CGC established that machines, acting alone, could play an established live hacker’s game known as Capture the Flag. Here, a “flag” is hidden in code, and the hacker’s job is to exploit vulnerabilities to reach and compromise an opponent’s flag. The CGC offered a $2 million prize to the team that competed most successfully in the game. The final CGC round pitted seven machines against one another on a common closed network without any human intervention. The machines had to identify vulnerabilities in an opponent’s system, fix them on their own system, and exploit them in
opponents’ systems to capture the flag. Team Mayhem from Carnegie Mellon University was declared the winner.2
John Launchbury, director of DARPA’s Information Innovation Office, characterizes the type of AI associated with the CGC as handcrafted knowledge. Emerging from early expert systems, this technology remains vital to the advancement of modern AI. In handcrafted knowledge, systems reason against elaborate, manually defined rule sets. This type of AI has strength in reasoning but is limited in forms of perception. However, it possesses no ability to learn or perform abstraction.3
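
To make “reasoning against manually defined rule sets” concrete, consider a minimal forward-chaining sketch in Python. The facts and rules below are an invented toy rule base, not an excerpt from any real expert system; the point is only the mechanism: rules fire deterministically, and nothing is ever learned.

# Minimal forward-chaining over handcrafted rules; the rule base here is
# a toy stand-in for the elaborate, manually defined sets the text describes.
facts = {"packet_from_unknown_host", "port_scan_observed"}

# Each rule: (set of premises, conclusion to add when all premises hold).
rules = [
    ({"packet_from_unknown_host", "port_scan_observed"}, "possible_intrusion"),
    ({"possible_intrusion"}, "raise_alert"),
]

changed = True
while changed:                      # keep firing rules until no new facts appear
    changed = False
    for premises, conclusion in rules:
        if premises <= facts and conclusion not in facts:
            facts.add(conclusion)
            changed = True

print(facts)  # now includes 'possible_intrusion' and 'raise_alert'

The engine can only ever derive what its authors anticipated; that is both its strength in reasoning and its inability to learn.
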
While building confidence that future reasoning AI can indeed rapidly diagnose and repair software vulnerabilities, it is important to note that the CGC was intentionally limited in scope. The open source operating system extension was simplified for purposes of the competition,4 and known malware instances were implanted as watered-down versions of their real-life counterparts.5 This intentionally eased the development burden, permitted a uniform basis for competitive evaluation, and reduced the risk of releasing competitors’ software into the larger networked world without requiring significant modification.

The use of “dirty tricks” to defeat an opponent in the game adds yet another, darker dimension. The ability to re-engineer code to rapidly isolate and fix vulnerabilities is one thing; turning those vulnerabilities into opportunities that efficiently exploit other code is quite another. Some fear that if such a capability were to be unleashed and grow out of control, it could become a form of “supercode”—both exempt from common vulnerabilities and capable of harnessing the same vulnerabilities to assume control over others’ networks, including the growing and potentially vulnerable Internet of Things (IoT). This concern prompted the Electronic Frontier Foundation to call for a “moral code” among AI developers that would limit reasoning systems to trustworthy behavior.4
Machine Learning Ups the Trust Ante
Launchbury ascribes the term statistical learning to what he deems the second wave of AI. Here, perception and learning are strong, but the technology lacks any ability to perform reasoning and abstraction. While statistically impressive, machine learning periodically produces individually unreliable results, often manifesting as bizarre outliers. Machine learning can also be skewed over time by tainted training data.3 Given that not all AI learning yields predictable outcomes, and that AI systems can therefore go awry in unexpected ways, defining the level of trust in AI-based tools becomes a high hurdle.6
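
The tainted-data effect is easy to reproduce. A minimal sketch, assuming scikit-learn is installed: flip a growing fraction of training labels and watch the held-out accuracy slide.

# Sketch: label noise ("tainted" training data) degrading a classifier.
# Assumes scikit-learn; the dataset and model choices are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
for noise in (0.0, 0.1, 0.3):
    y_noisy = y_tr.copy()
    flip = rng.random(len(y_noisy)) < noise       # taint a fraction of labels
    y_noisy[flip] = 1 - y_noisy[flip]
    acc = LogisticRegression(max_iter=1000).fit(X_tr, y_noisy).score(X_te, y_te)
    print(f"label noise {noise:.0%}: held-out accuracy {acc:.3f}")
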
At its core, AI is a high-order construct. In practice, numerous loosely federated practices and algorithms appear to compose most AI instances—often crossing many topical domains. Indeed, AI extends well beyond computer science to include domains such as neuroscience, linguistics, mathematics, statistics, physics, psychology, physiology, network science, ethics, and many others. Figure 1 depicts a less than fully inclusive list of algorithms that underlie second-wave AI phenomena, often collectively known as machine learning.

Figure 1. Some prevalent AI machine learning algorithms.
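
A few of the prevalent algorithm families from Figure 1 can be exercised side by side in just a few lines. A minimal sketch, again assuming scikit-learn; the dataset and the candidate list are illustrative only.

# Sketch: several second-wave algorithms from Figure 1 on one dataset.
# Assumes scikit-learn; candidates and data are illustrative only.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)
candidates = {
    "logistic regression": LogisticRegression(max_iter=5000),
    "k-nearest neighbors": KNeighborsClassifier(),
    "random forest": RandomForestClassifier(random_state=0),
    "support vector machine": SVC(),
}
for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)   # 5-fold cross-validation
    print(f"{name:24s} mean accuracy {scores.mean():.3f}")
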

This myriad of potential underlying algorithms and methods available to achieve some state of machine learning raises significant trust issues, especially for those involved in software testing as an established means to assure trust. When the AI becomes associated with mission criticality, as is increasingly the case, the tester must establish the basis for multiple factors, such as programmatic consistency, repeatability, penetrability, applied path tracing, or identifiable systemic failure modes.

The nontrivial question of which AI algorithm is most appropriate goes as far back as 1976.3 The everyday AI practitioner faces perplexing issues regarding which is the right algorithm to use to suit the desired AI design. Given an intended outcome, which algorithm is the most accurate? Which is the most efficient? Which is the most straightforward to implement in the anticipated environment? Which one holds the greatest potential for the least corruption over time? Which ones are the most familiar and thus the most likely to be engaged? Is the design based on some form of centrality, distributed agents, or even swarming software agency? How is this all to be tested?
These questions suggest that necessary design tradeoffs exist between a wide range of alternative AI-related algorithms and techniques. The fact that such alternative approaches to AI exist at all suggests that most AI architectures are far from consistent or cohesive. Worse, a high degree of contextually-based customization is required for both reasoning and learning systems. This, of course, extends to AI testing, because each algorithm and its custom implementation brings its own unique deep testing challenges, even at the unit level.
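
Some of those factors can at least be pinned down mechanically. The following hypothetical pytest-style unit test, assuming scikit-learn, checks one of them, repeatability, by asserting that retraining under a fixed seed reproduces identical predictions.

# Hypothetical unit test for one trust factor named above: repeatability.
# Assumes scikit-learn; run with pytest or call the function directly.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

def test_training_is_repeatable():
    X, y = make_classification(n_samples=500, random_state=42)
    preds = []
    for _ in range(2):  # train twice from scratch with the same seed
        model = RandomForestClassifier(random_state=42).fit(X, y)
        preds.append(model.predict(X))
    # Identical seeds must yield identical models, hence identical output.
    assert np.array_equal(preds[0], preds[1])

test_training_is_repeatable()
print("repeatability check passed")
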
One high-level AI test assesses the ability to correctly recognize and classify an image. In some instances, this test has surpassed human capability to make such assessments. For example, the Labeled Faces in the Wild (LFW) dataset supports facial recognition with some 13,000 images to train and calibrate facial recognition machine learning tools using either neural nets or deep learning. The new automated AI image recognition tools can statistically outperform human facial recognition capability using this dataset.7 The task at hand, however, is fundamentally perceptual in nature. These tasks functionally discriminate through mathematically correlated geometric patterns but stop short of any form of higher-order cognitive reasoning. Moreover, while such a test compares selective recognition accuracy against human ability, other mission-critical aspects of the underlying code base remain unchecked.
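
Because LFW ships with scikit-learn, a small face classifier can be sketched directly. The PCA-plus-SVM pipeline below is one conventional recipe, not the neural net or deep learning approaches the text mentions; the first call downloads the image set.

# Sketch: a conventional face classifier on the Labeled Faces in the Wild
# data; PCA + SVM is illustrative, not the deep learning the text mentions.
from sklearn.datasets import fetch_lfw_people
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

faces = fetch_lfw_people(min_faces_per_person=70)   # downloads on first use
X_tr, X_te, y_tr, y_te = train_test_split(
    faces.data, faces.target, random_state=0)

model = make_pipeline(PCA(n_components=150, whiten=True, random_state=0),
                      SVC(kernel="rbf", class_weight="balanced"))
model.fit(X_tr, y_tr)
print(f"held-out accuracy: {model.score(X_te, y_te):.3f}")

Note what this score does and does not establish: it measures perceptual discrimination on one dataset, exactly the narrow claim the text describes.
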
Beyond the Code
Testing machine learning becomes further complicated because extensive datasets are required to “train” the AI in a learning environment. Not only should the AI code be shown to be flawless, but the data used in the training should theoretically bear the highest pedigree. In the real world, however, datasets tend to be unbalanced, sparse, inconsistent, and often inaccurate, if not totally corrupt. Figure 2 suggests that information often results from resolving ambiguity. Even under controlled conditions, significant differences result between the use of single or multiple well-validated datasets to train and test classifiers. Thus, even controlled testing for classifiers can become highly complicated and must be approached carefully.8

Figure 2. Information provenance can often be unclear.
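
Demsar’s prescription for exactly this situation, comparing classifiers across many datasets with a rank-based test rather than eyeballing mean scores, can be sketched briefly. This assumes scikit-learn and SciPy; the four bundled datasets stand in for a real benchmark suite, which would be far larger.

# Sketch of a Demsar-style comparison of two classifiers over multiple
# datasets using the Wilcoxon signed-rank test (see reference 8).
# Assumes scikit-learn and SciPy; the dataset list is a small stand-in.
from scipy.stats import wilcoxon
from sklearn.datasets import load_breast_cancer, load_digits, load_iris, load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

datasets = [load_iris(), load_wine(), load_breast_cancer(), load_digits()]
a_scores, b_scores = [], []
for ds in datasets:
    a_scores.append(cross_val_score(
        LogisticRegression(max_iter=5000), ds.data, ds.target, cv=10).mean())
    b_scores.append(cross_val_score(
        RandomForestClassifier(random_state=0), ds.data, ds.target, cv=10).mean())

# Paired, rank-based test over per-dataset scores; with only four
# datasets this is illustrative, as Demsar assumes a much larger suite.
stat, p = wilcoxon(a_scores, b_scores)
print(f"Wilcoxon statistic {stat:.1f}, p-value {p:.3f}")
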


Other trust-related factors extend well beyond code. Because coding is simultaneously a creative act and somewhat of a syntactic science, it is subject to some degree of interpretation. It is feasible that a coder can inject either intentional or unintentional cultural or personal bias into the resulting AI code. Consider the case of the coder who creates a highly accurate facial recognition routine but neglects to consider skin pigmentation as a deciding factor among the recognition criteria. This action could skew the results away from features otherwise reinforced by skin color. Conversely, the rates of recidivism among criminals skew some AI-based prison release decisions along racial lines. This means that some incarcerated individuals stand a better statistical chance of gaining early release than others—regardless of prevailing circumstances.9 Semantic inconsistency can further jeopardize the neutrality of AI code, especially if natural language processing or idiomatic speech recognition is involved.
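
One modest defense is to disaggregate test metrics across the very attributes a coder might neglect. A minimal sketch with wholly hypothetical data and group labels: report accuracy per demographic group rather than a single blended score, so that any gap becomes visible.

# Sketch: disaggregated evaluation to surface the kind of skew described
# above. The data, labels, and group field here are hypothetical.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=3000, random_state=1)
rng = np.random.default_rng(1)
group = rng.choice(["group_a", "group_b"], size=len(y))  # hypothetical attribute

X_tr, X_te, y_tr, y_te, g_tr, g_te = train_test_split(
    X, y, group, random_state=1)
model = RandomForestClassifier(random_state=1).fit(X_tr, y_tr)
preds = model.predict(X_te)

print(f"overall accuracy: {(preds == y_te).mean():.3f}")
for g in ("group_a", "group_b"):                 # one blended number can hide
    mask = g_te == g                             # a large per-group gap
    print(f"{g} accuracy: {(preds[mask] == y_te[mask]).mean():.3f}")
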
Some suggest that all IT careers are now cybersecurity careers.10 This, too, has huge implications for the field of AI development and its implementation. The question of “who knows what the machine knew and when it knew it” becomes significant from a cybersecurity standpoint. What a machine learns is often not readily observable; rather, it lies deeply encoded. This affects not only newly internalized data; in the IoT, these data can also trip decision triggers that enliven actuators, translating the “learning” into some sort of action. Lacking concrete stimulus identity and pedigree, the overall AI-sparked IoT stimulus-response mechanism becomes equally uncertain. Nonetheless, the resulting actions in mission-critical systems require rigorous validation.

The Third Wave
Launchbury foresees the need for a yet-to-be-perfected third wave of AI, which he names contextual adaptation. This technology, requiring much more work, brings together strengths in perception, learning, and reasoning, and supports a significantly heightened level of cross-domain abstraction.3
The 2017 Ontology Summit, aptly entitled “AI, Learning, Reasoning, and Ontologies,” concluded in May 2017. Reinforcing Launchbury’s observation, the draft summit communique concluded that, to date, most AI approaches, including machine learning tools, operate at a subsymbolic level using computational techniques that do not approximate human thought. Although great progress has been achieved in many forms of AI, the full treatment of knowledge representation at the symbolic level awaits maturity (bit.ly/2qMN0it). Correspondingly, the utility of ontology as a formal semantic organizing tool offers only limited advantages to AI and its ultimate test environment.

The semantic network involves graph representations of knowledge in the form of nodes and arcs. It provides a way to understand and visualize relationships between symbols, often represented by active words, which convey varying meanings when viewed in context.
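
Such a network is straightforward to represent as a labeled directed graph. A minimal sketch using the networkx library; the concepts and relations form an invented toy example.

# Sketch: a tiny semantic network as a labeled directed graph.
# Assumes networkx; the concepts and relations are an invented toy example.
import networkx as nx

g = nx.DiGraph()
# Arcs carry relation labels; meaning depends on context, as the text notes.
g.add_edge("autopilot", "vehicle", relation="controls")
g.add_edge("radar", "autopilot", relation="feeds")
g.add_edge("camera", "autopilot", relation="feeds")
g.add_edge("autopilot", "software", relation="is_a")
g.add_edge("software", "vulnerability", relation="prone_to")

# Walk outward from a symbol to see what it relates to, and how.
for src, dst, data in g.out_edges("autopilot", data=True):
    print(f"{src} --{data['relation']}--> {dst}")
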

AI, largely subsymbolic today, will need to deal with applied semantics in a far more formal sense to achieve third-wave status. Under such circumstances, AI becomes nonlinear: cause and effect are increasingly decoupled via multiple execution threads. This leads to the establishment of complex adaptive systems (CAS), which tend to adhere to and be influenced by nonlinear network behavior.

In a CAS, new behaviors emerge based on environmental circumstance over time. Here, there can be multiple self-organizing paths leading to success or failure, all triggered by highly diversified nodes and arcs that can come, grow, shrink, and go over time. Such networks defy traditional recursive unit testing when composed using embedded software, which is interrelated to data. This is because, in a CAS, the whole often becomes far more than merely the sum of the parts.11 Rather, new approaches, emerging from applied network science, offer a better means of assessing dynamic AI behavior that emerges over time. This becomes increasingly true as the temporal metrics associated with graph theory become better understood as a means of describing dynamic behaviors that fail to follow linear paths to achieve some desired effect.12
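
Temporal metrics of the kind Masuda and Lambiotte describe start from time-stamped edges.12 A minimal sketch, again using networkx plus a few lines of plain Python with invented events: a path counts only if its edge times increase, so a static graph can report reachability that never actually materializes, which is precisely the kind of dynamic behavior static unit tests miss.

# Sketch: time-respecting reachability on a temporal graph (see reference 12).
# Assumes networkx; the time-stamped contact list is invented.
import networkx as nx

contacts = [                        # (source, target, time of interaction)
    ("sensor", "gateway", 1),
    ("gateway", "controller", 3),
    ("controller", "actuator", 2),  # happens BEFORE the gateway handoff
]

def time_respecting_reach(contacts, start):
    """Nodes reachable via paths whose edge times strictly increase."""
    earliest = {start: 0}           # earliest arrival time per node
    for src, dst, t in sorted(contacts, key=lambda c: c[2]):
        if src in earliest and earliest[src] < t:
            earliest[dst] = min(earliest.get(dst, float("inf")), t)
    return earliest

print(time_respecting_reach(contacts, "sensor"))  # actuator never reached
# The static graph has a sensor->actuator path, but no time-respecting one:
static = nx.DiGraph([(s, d) for s, d, _ in contacts])
print(nx.has_path(static, "sensor", "actuator"))  # True, misleadingly
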
Until some reliable methodology is adopted for the assessment of assured trust within AI, the watchword must be caution. Any tendency to put blind faith in what in effect remains largely untrusted technology can lead to misleading and sometimes dangerous conclusions.

References
1. N.E. Boudette, “Tesla’s Self-Driving System Cleared in Deadly Crash,” New York Times, 19 Jan. 2017.
2. D. Coldewey, “Carnegie Mellon’s Mayhem AI Takes Home $2 Million from DARPA’s Cyber Grand Challenge,” TechCrunch, 5 Aug. 2016; tcrn.ch/2aM3iS7.
3. J. Launchbury, “A DARPA Perspective on Artificial Intelligence,” DARPAtv, 15 Feb. 2017; www.youtube.com/watch?v=-O01G3tSYpU.
4. N. Cardozo, P. Eckersley, and J. Gillula, “Does DARPA’s Cyber Grand Challenge Need a Safety Protocol?” Electronic Frontier Foundation, 4 Aug. 2016; bit.ly/2aPxRXc.
5. A. Nordrum, “Autonomous Security Bots Seek and Destroy Software Bugs in DARPA Cyber Grand Challenge,” IEEE Spectrum, Aug. 2016; bit.ly/2arLOcR.
6. S. Jontz, “Cyber Network, Heal Thyself,” Signal, 1 Apr. 2017; bit.ly/2o0ZCVe.
7. A. Jacob, “Forget the Turing Test—There Are Better Ways of Judging AI,” New Scientist, 21 Sept. 2015; bit.ly/1MoMUnF.
8. J. Demsar, “Statistical Comparisons of Classifiers over Multiple Data Sets,” J. Machine Learning Research, vol. 7, 2006, pp. 1–30.
9. H. Reese, “Bias in Machine Learning, and How to Stop It,” TechRepublic, 18 Nov. 2016; tek.io/2gcqFrI.
10. C. Mims, “All IT Jobs Are Cybersecurity Jobs Now,” Wall Street J., 17 May 2017; on.wsj.com/2qH5VP2.
11. P. Erdi, Complexity Explained, Springer-Verlag, 2008.
12. N. Masuda and R. Lambiotte, A Guide to Temporal Networks, World Scientific Publishing, 2016.

George Hurlburt is chief scientist at STEMCorp, a nonprofit that works to further economic development via adoption of network science and to advance autonomous technologies as useful tools for human use. He is engaged in dynamic graph-based Internet of Things architecture. Hurlburt is on the editorial board of IT Professional and is a member of the board of governors of the Southern Maryland Higher Education Center. Contact him at ghurlburt@change-index.com.
