Cell - 28 July 2016

Leading Edge
Select
Targeting Resistance
Wrestling with cancer can be frustrating. Despite the progress in developing therapies that can effectively control tumor
growth, the devil almost always strikes back with resistance.
Even for the recent excitement in using immunotherapy to
achieve unprecedented success in some cancer patients,
resistance has been seen in clinical settings and is under
active investigation (Restifo et al., 2016).
How do we tackle resistance to cancer therapy? One effective approach is to nail the culprit: pinpointing the cell population intrinsically insensitive to the treatment and targeting
its vulnerability. Recent work from Tessa L. Holyoake and
her team successfully applied this strategy to chronic
myeloid leukemia (CML; Abraham et al., 2016). CML is characterized by the aberrant activation of ABL1 tyrosine kinase
due to chromosome translocation, and tyrosine kinase inhibitors (TKIs) have been the standard treatment with clinical
efficacy. However, patients with CML eventually relapse
because the survival of leukemic stem cells (LSCs) does
not rely on the elevated kinase activity and therefore cannot
be eradicated by TKIs. Through integrated analyses, the
team exposed the essential role of p53 and c-MYC in the
CML network and an addictive dependency of LSCs on these
two signaling hubs. They further showed that a combinatory
treatment targeting p53 and c-MYC could effectively kill
LSCs, raising the hope of using this approach for treating
CML patients relapsing from TKIs (Abraham et al., 2016).
Leukemia is not the only type of cancer in which targeting
intrinsic resistance from a specific population is starting to
show promise. Indeed, by developing a mouse model with
knockin reporters, Tannishtha Reya and colleagues were
able to identify high expression of the stem cell determinant
Musashi (Msi) as a marker for populations in pancreatic cancer that have strong tumor-initiating capacity and confer drug
resistance. Inhibiting Msi significantly changed the trajectory
of disease progression and almost doubled the survival time
in mouse models. Moreover, simultaneously inhibiting two of
its potential direct targets was effective in killing tumor cells
resistant to gemcitabine, an FDA-approved chemotherapy
drug for treating pancreatic cancer. To further establish Msi
as a valuable target for drug development, the authors
went on to develop antisense oligonucleotides specifically
targeting MSI1, which successfully inhibited tumor growth
in patient-derived xenograft (PDX) models (Fox et al., 2016).
While drug resistance can be an intrinsic feature of a particular population within the tumor before the drug is even
applied, it can also arise as an adaptive response to the treatment itself. Aiming to better understand this adaptation and to
explore therapeutic opportunities, Scott Lowe and his team
employed a systematic approach to screen for factors that
could sensitize KRAS mutant lung cancer cells to trametinib,
an FDA-approved drug targeting the downstream effector
signaling of mutant KRAS. They found that activation of the
FGFR pathway underlies the resistance of tumor cells to trametinib. Consistently, inhibiting FGFR1 with either shRNAs or
ponatinib, which has FGFR1 as one of its targets, achieved synthetic lethality with trametinib in treating KRAS mutant lung
cancer cells. Despite the profound efficacy in tumor suppression, this combinatory approach showed minimal, if any,
toxicities in various in vivo models, including PDX, strongly
supporting its therapeutic potential (Manchado et al., 2016).
Strategies narrowing down specific components in the
signaling pathways conferring intrinsic or adaptive resistance
represent a major direction of effort for targeting resistance to
therapy. However, it's not the only way. As reported by Cerezo
et al., rather than targeting a particular oncogenic driver in the
context of melanoma resistant to BRAF inhibitors, using compounds to induce ER stress was proven effective in eliminating
resistant cancer cells by promoting apoptosis and autophagy.
Remarkably, this approach did not seem to affect normal melanocytes or fibroblasts, indicating a rather attractive therapeutic window for future development (Cerezo et al., 2016).
By definition, resistance is hard to eradicate. Nonetheless,
in realizing what we have achieved along the way in this
marathon of finding a cure for cancer, we have every reason
to believe that we are heading in the right direction.
REFERENCES
Abraham, S.A., Hopcroft, L.E., Carrick, E., Drotar, M.E., Dunn, K.,
Williamson, A.J., Korfi, K., Baquero, P., Park, L.E., Scott, M.T., et al. (2016).
Nature 534, 341–346.
Cerezo, M., Lehraiki, A., Millet, A., Rouaud, F., Plaisant, M., Jaune, E.,
Botton, T., Ronco, C., Abbe, P., Amdouni, H., et al. (2016). Cancer Cell 29,
805–819.
Fox, R.G., Lytle, N.K., Jaquish, D.V., Park, F.D., Ito, T., Bajaj, J.,
Koechlein, C.S., Zimdahl, B., Yano, M., Kopp, J.L., et al. (2016). Nature 534,
407–411.
Manchado, E., Weissmueller, S., Morris, J.P., Chen, C.C., Wullenkord, R.,
Lujambio, A., de Stanchina, E., Poirier, J.T., Gainor, J.F., Corcoran, R.B.,
et al. (2016). Nature 534, 647–651.
Restifo, N.P., Smyth, M.J., and Snyder, A. (2016). Nat. Rev. Cancer 16,
121–126.
Expression of the stem cell gene Musashi (red) in human pancreatic cancer (cancer cells: green). Image courtesy of Dawn Jaquish.
Jiaying Tan
Cell 166, July 28, 2016 2016 Published by Elsevier Inc. 523
Leading Edge
Conversations
Brain Exploration, Off the Beaten Path
Model organisms, such as rodents, monkeys, or Drosophila, have driven much of recent research
in neuroscience. However, studies in other, more unusual systems have broadened the types
of questions that are being asked and have revealed the diverse ways in which species tackle common problems. Cell editor Mirna Kvajo talked with Nachum Ulanovsky, Gilles Laurent, and Anthony
Leonardo about their research and how studying bats, reptiles, and dragonflies informs big questions in neuroscience. An annotated excerpt of the conversation appears below, and the full conversation is available with the article online.
Anthony Leonardo
Janelia Research Campus
Nachum Ulanovsky
Weizmann Institute of
Science
Mirna Kvajo: It seems that a lot, if not most, of the research in
neuroscience is being done in a couple of model organisms;
we hear about mice, we hear about rats, the Drosophila, and
then also monkeys. Most of these organisms are used to
address a broad spectrum of questions, and something that is
happening now is a raised awareness about how many of
these questions we can actually ask [in these traditional
systems]. And it seems that there's a surge of interest in
alternative model organisms.
The three of you are using something that people could call
alternative organisms, right? You're working on bats, the
dragonfly, reptiles. Just to start off, I wanted to understand
what are your reasons for picking these organisms? What
kind of questions are you asking, and are some of these
questions ones that can't be asked in other types of
organisms?
Gilles Laurent
MPI for Brain Research
Anthony Leonardo: I'll speak up first. We study prediction
in dragonflies and how they anticipate where prey is going
and use this to construct a flight path. That process requires
internal models of how the body works and how the prey
moves. And the reason we've been studying prediction there
is largely because the problem the animal solves is very clear
when you look at the behavior ... In the case of our system,
this sort of prey capture is not unlike reaching out your arm to
grab something, so it's really a ubiquitous behavior and you
could study it in any one of a number of systems. But because
of how these animals do the behavior (it's easy to elicit and
they do it with a certain amount of complexity, and they have
to solve it with certain accuracy), all those things conspire to
make the system much easier to understand than in other
places. That's been the reason for me: it's not that it's the only
place to study it, but we think it's the cleanest, clearest place,
and then we can take what we've learned there and apply it to
other systems.
Nachum Ulanovsky: Maybe I could continue on that. I think
this is the idea some call Krogh's Principle: the notion that for every
problem or every question in biology there are some organisms
that are particularly useful to address it. It could be because
their behavior is very precise. In our case of the bats, we are
studying place cells, grid cells, head-direction cells: the
spatial system. One reason is indeed that, on one hand, the
bat is a mammal, so the anatomy of its hippocampal system is
very similar to that of rodents, so we have that constraint. On the other
(L to R) Mirna Kvajo, Nachum Ulanovsky, Gilles Laurent, and Anthony Leonardo
hand, there are certain questions that are difficult to ask in
rodents, but they're more approachable in bats. For example,
the representation of 3D space or representation of very large
spaces because they fly long distances ...
There is another component to that, or another reason to
study non-standard animals, and this is the comparative
approach. Contrast and compare, so that what we find [in bats]
is that a lot of the things are very similar [to rats]; we find place
cells, grid cells, head-direction cells. But there are certain
things that are very different, so, for example, the theta
oscillations are very prominent in the [rat] hippocampal system,
and we don't see that in the bat. This means that those theories
of grid cells that rely in an obligatory manner on a perfect
oscillation, this argues against them. This comparative
approach is very powerful and used to be prevalent in
neuroscience, and unfortunately it disappeared. But I agree
with you, I feel it has a little bit of a comeback recently.
Gilles Laurent: I think that Nachum and Anthony have
summarized things really well. Forty, thirty years ago people
used what they called model systems, and it was a common
thing that you'd go to Neuroscience [Society for Neuroscience
meeting] and people would work on Tritonia and crabs and
bats and barn owls and so on, and little by little all this has
disappeared, and as you guys were saying now, you're trying
to force all these things back onto one species and for
sometimes good reasons, but not always. I think all of
us agree on the danger of this trend, which means that,
practically, a population of scientists able to tackle interesting
problems on a variety of species, to take into account the
diversity of the animal world, of evolution, of comparisons and
their value, it's going to disappear as a culture, and that's
really dangerous.
MK: I'm curious, I'm sure that all of you must have
challenges, and you must be envious of people who are using
well-standardized and well-understood models which have a
lot of tools, and especially now in this age of tool making, you
must feel like, "OK, I wish I could do this in my model." How do
you think about this? Do you think that, for your particular
models, or just in general, there's going to be an age of tools,
or are you adjusting your questions to what you can ask?
NU: I think there will be an age of tools for sure. I think that,
often, people driven to these exotic or unusual systems are
inherently tool builders to some extent. When you pick one of
these unusual organisms, you're picking it because it's a
question-driven enterprise and that leads very naturally to
saying, "What is the tool that I need to answer this question,
and can I develop it here?" And certainly in our work, we are
very inspired by our colleagues in genetic systems and we'd
like to try to develop versions of those tools that we can
apply even for our very localized problems, so they're not of
ubiquitous use but they solve our problems. So, I think that will
come as it's needed; there are no fundamental impediments.
It's a question of time, effort, and funding, but it can certainly
be done.
GL: I was going to say that the three of us don't work in the
purely grant-driven American system. I think that the funding
issue is a fundamental one now. Those, the few of us who work
on unusual systems, tend to work in systems that allow the
funding and provide the funding to do that, and it's becoming
less and less possible. And when you talk to your colleagues,
they say, "Well, the funding situation doesn't allow it." The
problem in the funding situation is us; it's driven by us, and this
has reached a point that to me is quite dramatic. We don't even
have the confidence in pushing for that diversity.
NU: On one hand I agree, and on the other hand I also have
colleagues in the U.S. who study unusual animals. And those of
them who ask good questions, it's clear that they can get
funding. I don't think it's as tough ... It's tough maybe, but it's
possible, for sure. But, to address your question about tools, so
yeah, when you study an unusual animal, you have to develop
tools almost by definition because nobody will do it for you.
Sometimes this is an unusual tool; like in our case for the bats,
we want to study them freely flying, so we have to develop
methods to record wirelessly from single neurons in flight, etc.
So, these are the kinds of tools we have to develop that don't
exist elsewhere in the world. But oftentimes when we're talking
about tools in neuroscience nowadays, it's molecular tools, and
we need genomes and all these things. I think ... with the advent
of genome editing, it might become a bit less of an issue.
AL: I started working on an unusual non-genetic system in an
era right when genetic systems were really exploding, and it
was clear that it was tactically not the wisest decision in terms
of certain aspects. And the thing that always sort of struck me is
that the genetic access to these sorts of weirdo systems is only
going to get easier over time, whereas the computations the
animals do and the behaviors they do are fixed. So it's not that
mice and flies are going to evolve new behaviors suddenly that
you're going to be able to study in them. So there is a real
reason and a utility in saying, "OK, this organism is solving this
computation, and this is a good place to study it, and we're
going to work on it at the level of tools we have now, and
gradually more tools will become available and we'll gain
deeper levels of understanding it." As opposed to forcing that
problem into a genetic system where it's very hard to study and
you make progress very slowly, even with the elegance of the
tools there.
GL: You could also turn this around by saying that many of
the tools that we use now actually come from the study of
unusual systems like bioluminescence in jellyfish (you get GFP)
and the channelrhodopsins and CRISPR/Cas9. That doesn't
come from directed research at the beginning; it's really
curiosity driven. If we lose this, we lose a lot of these
advantages.
AL: Yeah, I think on that same token, it is interesting to notice
that a lot of the problems being studied at a deep mechanistic
level in our genetic systems are problems that were described
at the level of algorithms and principles in other systems:
things that we used to study in hoverflies and locusts and all
these sort of exotic creatures that are being tapped. They
provided essentially the foundation on which these more
mechanistic studies can be built in other systems. And that
really arises from the ubiquity of evolution and the fact that
these principles do transcend the system, and almost by
definition you should be able to look at these things in different
places, and the breadth and depth can combine effectively.
NU: I want to add yet another component, another advantage of
maintaining this diversity and studying non-standard species:
the natural behaviors. I mean, you cannot study a laboratory rat
or laboratory mouse in the wild. You can study a wild rat or wild
mice, which in some of their behaviors are quite different than
the ones in the laboratory. Whereas these non-standard
organisms, they are literally wild animals; literally, we capture
them from the wild. You can study them also outdoors, so
we've been studying the bats, GPS tracking them outdoors,
looking at their navigation, etc. I think this really opens your
thinking to asking different questions. You see a behavior
outdoors and start asking, "How is that implemented?" And
you can eventually bring it to a laboratory setting and do a
controlled experiment, but even being able to study the animal
out in the wild is something that you typically cannot do. Or
even if you can, it's not done by most people on standard
laboratory animals.
Leading Edge
Voices
Big Questions in Evolution
Evolution of Cell Types and Tissues
Detlev Arendt, European Molecular Biology Laboratory
From Chemistry to Communities
Nicole King, University of California, Berkeley, HHMI
From Whether to How
Sean B. Carroll, University of Wisconsin-Madison, HHMI
Evolution refers to the historic unfolding
of organismal life on Earth, which already
intrigued Greek philosophers. It cannot be
studied directly, just as the plot of a
theatre play cannot be inferred from
watching the last second. Instead, we
study the fossil record; also, synthetic
biology has increasing power to mimic
key steps in evolution. The comparative
approach, however, remains most informative. It infers ancient conditions from
the comparison of extant species. Classically, evolutionary biologists compare
tissues and organs, during development
and in adults. More recently, comparative
genomics has allowed tracking the increase in
protein complexity in divergent lineages.
Now, a new level of comparison has
emerged, linking organ, tissue, and
protein evolution: the cell type. Cells are
the basic building blocks of life; once we
understand their evolutionary diversification into types, this will solve the secrets
of multicellular life. A combination of
whole-organism single-cell transcriptomics, proteomics, expression atlases, and
CRISPR-Cas9-based functional studies
in various organisms opens up exciting
new questions: What is the cell type
complement in one species as compared
to others? How is it specified and maintained? What are the cell type-specific
molecular machines, or modules, and
how did these diversify in evolution?
Answering such questions will allow us to
examine even more complex structures, tissues and organs, ultimately learning
about their evolution as an emergent
property of their constituting cell types.
Evolution is our family story. It holds the
keys to explaining our ancestry, our
connections with other life forms, and our
eventual, unavoidable extinction. One of
the most exciting frontiers in evolutionary
biology concerns how it all began: the
origin of life. From our modern vantage,
over 3 billion years after prebiotic
chemistry gave rise to life, how can we
reconstruct the first major evolutionary
transition? The most meaningful
advances have begun with an explicit
model that can be tested experimentally
using techniques from chemistry,
physics, and biology. These approaches
have uncovered plausible prebiotic
chemical pathways leading to ribonucleotides, lipids, and amino acids. The gap
between simple macromolecules and
cells is huge, making this one of the most
compelling intellectual playgrounds for
budding evolutionary biologists. Fast forward to today, as we slowly come to
terms with the outcomes of our 3+ billion
year evolutionary history. Far from existing in isolation, nearly all
archaea, bacteria, and eukaryotes have evolved to form
obligate associations with communities of
other organisms. Eukaryotic genomes
have evolved through rampant gene exchange with bacteria, our metabolic
pathways are interconnected with those
of our resident bacteria, and this is to say
nothing about pathogens. Understanding
our deep history will require new
approaches that take into account coevolution writ large, the dynamic,
complex, and evolvable interactions
among communities of organisms.
After the debut of Darwin's On the Origin
of Species (1859), the big question for
naturalists was whether the mechanism
he (and Alfred Wallace) proposed, natural selection, could explain the exquisite
adaptation of organisms to their environment, and the formation of new species.
More than 150 years later, evolutionary
biologists are still focused on these two
principal evolutionary phenomena. But
the questions have shifted from whether
to how: how do new adaptations arise,
and how does speciation occur?
Empowered by molecular genetics,
biochemistry, and developmental
biology, the focus has expanded from the
organismal scale to the molecular: How is
genetic variation generated and maintained? How do new protein activities,
protein complexes, or physical traits
evolve? Evolutionary biology is in the midst
of what Dean and Thornton have dubbed
"The Functional Synthesis" (Nature
Reviews Genetics 8, 675–688) that seeks
a functional understanding of the
mutational paths to new phenotypes.
Biologists are dissecting some of the very
same phenomena that gripped Darwin
and his contemporaries, including mimicry
in butterflies, the radiation of Galapagos
finches, the diversity of vertebrate limbs,
and yes, even the giant horns of dung
beetles, as well as experimenting on
model organisms such as bacteria, yeast,
and fruit flies. They are revealing the kinds
of mutations that are necessary and
permissible to make the endless forms
that have been and are being evolved.
Developmental Biases in Evolution
Patricia Wittkopp, University of Michigan
Evolution with Foresight
Eugene Koonin, National Institutes of Health
Predictive Evolutionary Genomics
Leonid Kruglyak, UCLA, HHMI
Contemporary evolutionary biology has
been built upon a rich foundation of
theoretical models providing hypotheses
for how and why biodiversity came to be.
When most of these models were developed, little was known about the mechanisms of inheritance that connect one
generation to the next nor about how this
hereditary material directs the development of diverse lifeforms. With the basic
mechanisms of genetics and development now known, evolutionary biologists
have been able to ask how genetic
changes have altered development to
produce diverse phenotypes. These
studies have shown that changes in gene
expression as well as gene function have
contributed to phenotypic evolution and
have also revealed genetic connections
among traits that can influence the evolution of disparate traits. These observations motivate two current big questions
in evolutionary biology: "How do existing
genetic and developmental systems
influence the origin of phenotypic
variation and thus shape evolutionary
change?" and "How can our knowledge
of genetic and molecular mechanisms be
integrated into evolutionary theories to
produce more complete models of
evolution?" Addressing these questions
will not only help uncover evolutionary
changes that occurred in the past but will
also improve our ability to predict evolutionary changes most likely to occur in the
future. The potential to predict paths of
evolutionary change is especially exciting,
as it could be used to help combat
cancer, control disease outbreaks, and
develop bioengineering solutions to the
diverse challenges we face in our
changing world.
For me, perhaps, the most pressing
question in evolutionary biology is: are
evolutionary mechanisms evolvable?
More precisely, is it possible to demonstrate that certain molecular mechanisms
have evolved under specific selective
pressure for increased evolvability or
simply increased rate of a particular type
of evolutionary change? Traditionally,
the existence of such dedicated mechanisms
and devices for evolution had been
anathema to theorists because "evolution
has no foresight." Yet, I believe that
several such mechanisms have already
been discovered inadvertently, not at all
as a result of focused efforts. One of these
has become famous for its utility in
genome engineering: the CRISPR-Cas
systems of bacterial and archaeal adaptive immunity. CRISPR-Cas is an elaborate molecular machine for directed
change of microbial genomes that makes
the organism immune to a specific
pathogen. Clearly, this is an evolved
mechanism of (quasi)Lamarckian microevolution. The CRISPR-Cas example
shows that evolution has both memory of
past events and some foresight as new
encounters with the same pathogen are
predicted. Another strong case in point
is the gene transfer agents, defective
bacteriophages that, instead of the phage
genome, package random segments of
microbial DNA and transfer them by infecting other microbes. I believe that the
discovery of these dedicated, evolved
mechanisms of genome evolution refutes
the simplistic "no foresight" view and
calls for an amended evolutionary theoretical framework.
The millions of species on Earth span a
stunning range of phenotypic diversity:
Darwin's "endless forms most beautiful."
We know, in broad outline, that this
diversity arises from differences in
genomes. With today's DNA sequencing
technologies, it is relatively straightforward to sequence genomes and identify
the differences between them. Indeed,
this has been accomplished for thousands of species, and the list is growing
rapidly. Lagging a long way behind is our
ability to pinpoint which specific genome
sequences underlie each species' unique
phenotypic features. To sharpen the
focus on this problem, it's worth posing
two related questions. First, given a
genome sequence, what organism will it
produce? This question is readily
answered by the developmental program
of each species (together with the initial
conditions of the starting cell), but we
cannot answer it in a very specific way.
What collection of data, coupled with
what predictive algorithms, would allow
us to tell whether a genome encodes a
mouse, an elephant, or a whale? Second,
given the features of an organism, can we
design a genome that encodes it? What
genome sequence would specify a
Tyrannosaurus rex, a Pterosaur, or a
creature that never existed but whose
existence isn't prohibited by any laws of
biology? Both of the questions appear
well-posed and answerable in principle,
and the difficulty of answering them in
practice highlights how far we still have to
go in our understanding of evolution.
Leading Edge
Previews
In Praise of Descriptive Science:
A Breath of Fresh AIRE
Mark M. Davis1,*
1Howard Hughes Medical Institute, Department of Microbiology and Immunology, Stanford University School of Medicine, Stanford,
CA 94305, USA
*Correspondence: mmdavis@stanford.edu
http://dx.doi.org/10.1016/j.cell.2016.07.018
Meyer et al. find that subjects lacking the AIRE gene, critical for self-tolerance in T lymphocytes,
show a broad range of autoantibody specificities, which can have extremely high affinities. The
data also suggest that some of these autoantibodies can, surprisingly, prevent some types of autoimmunity, particularly type I diabetes.
In molecular biology, there has been a
persistent bias against the value of
descriptive biology. This might go back
to a remark attributed to Ernest Rutherford that "There are two kinds of science;
physics and stamp collecting." Of course,
the real power in any scientific area
comes with a precise knowledge of
mechanisms, but it shouldn't be forgotten
that careful observation and discovering
new phenomena is the starting point of
every field. A case in point is the paper
in this issue of Cell by Meyer et al.
(2016), who take advantage of new technologies that are allowing us to get precise data about the human immune system to discover some remarkable and
unexpected properties of human beings
with a particular immune deficiency.
They start, innocently enough, with a
straightforward enquiry into the nature of
the autoantibodies in subjects that are
deficient in what is known as the AIRE
gene, which was originally identified in
APS1, an autoimmune syndrome characterized by autoantibodies, impaired
endocrine function, and chronic Candida
infections (Nagamine et al., 1997;
Finnish-German APECED Consortium,
1997). Later work in mice has shown
that Aire has a specific role in stimulating
the expression of tissue-specific genes
in the thymus that wouldn't normally be
expressed in that organ and that this
helps ensure T cell tolerance to self-antigens (Mathis and Benoist, 2009). There
is also evidence that Aire has a role in
ensuring T cell tolerance in peripheral immune organs such as the spleen and
lymph nodes (Gardner et al., 2013). Tolerance is induced at least in some cases by
clonal deletion of self-specific T cells (Anderson et al., 2005) but also might take the
form of inhibiting activation (Davis, 2015).
In some mouse strains, Aire deficiency
leads to severe autoimmunity and early
death, but in other strains and in human
beings, the effects are more subtle. In
this study, the authors gathered specimens and data on 81 patients. Then they
analyzed the autoantibodies in their
serum for specificity. Remarkably, they
found that each patient expresses on
average approximately 100 different
specificities, such that, all together, they
were able to identify over 3,700 antibody
specificities, showing that there was an
almost random pattern of targets. However, there were some specificities that
were shared, particularly anti-cytokine
antibodies to cytokines in the type I interferon group. Even more remarkable is
that, when they characterized some of
these autoantibodies with respect to
their affinity, they found that many were
astonishingly high, with one having a KD of
10⁻¹⁴ M and others in the picomolar range
(10⁻¹² M), much higher than the nanomolar affinities that one gets with a standard
immunization. They further note that patients with this syndrome seem resistant
to many autoimmune diseases (multiple
sclerosis, lupus, and others), but a fraction
do develop type I diabetes. While there is
a growing literature correlating anti-cytokine antibodies with a susceptibility to
particular infectious diseases (Kisand
et al., 2010), they postulated that such autoantibodies might, in some cases, have a
protective effect. In particular, α-interferons have been implicated in type I diabetes in mice, and so they looked to see
whether the expression of autoantibodies
to these cytokines correlated with this
type of autoimmunity in their cohort.
Indeed, they found that, while all the patients in their substudy had antibodies to
this family of cytokines, those expressed
by the 8 APS1 subjects with type I diabetes did not neutralize, whereas those of the
13 patients who did not have diabetes
did. While not a proof, this is a seriously
smoking gun suggesting a critical role
for these cytokines in this particular
(and major) autoimmune disease. It is
also interesting that, while α-interferon is
prominent in anti-viral responses and
has been used therapeutically, APS1 patients do not seem to be particularly prone
to viral infections, indicating that there is
enough redundancy in other parts of the
immune system to carry the load.
So how to summarize this study?
Figure 1 attempts to do this by showing
that the lack of AIRE activity in the thymus
and in the periphery in patients with
the APS1 syndrome leads to a failure of
T cell tolerance in those T cells that are
specific for the many self-antigens that
AIRE is responsible for expressing. Just
how many this represents is evident in
the many different autoantibody specificities in this cohort.
But why would these autoantibodies
have such high affinities? A key factor in
stimulating a given B cell to mutate its
immunoglobulin genes from micromolar
to nanomolar affinities in the course of
an immune response is help from follicular
helper T cells (TFH). The authors suggest
that the lack of T cell tolerance in general
and TFHs in particular could divert B cells
originally having other specificities onto
this self-reactive path. But an additional
wrinkle could come from recent work by
Goodnow and colleagues (Sabouri et al.,
2014), who have found that, in addition
to TFHs boosting the affinities of antibodies for foreign antigens, autoantibody-producing B cells can be selected
to reduce their affinity for self. If this
reversal of affinity was dependent on
input from T cells enforcing self-specific
tolerance (either regulatory T cells or
perhaps TFH cells that had escaped clonal
deletion in the thymus), and if those cells
were absent due to the lack of Aire, then
the ubiquitous presence of self-antigens
coupled with a selection for high affinity
antibodies in germinal centers might
result in the amazing affinities seen in
this system. In any event, these results
show how carefully analyzing particular
human mutations can uncover new phenomena worthy of further study. This not
only opens up new vistas regarding our
understanding of immune function and
dysfunction but also shows how work
on the human immune system can not
only inform translational work but add to
our understanding of basic principles as
well.
Figure 1. A Simplified View of T-B Lymphocyte Interactions in the Presence or Absence of the AIRE Gene's Influence on T Cell Tolerance
Normally, follicular helper CD4+ T cells emerge from the thymus with those expressing self-specific T cell
receptors either purged or suppressed (upper part of the figure). They are then able to stimulate B cells
specific for the same antigen to mutate their antibody genes to achieve higher affinities after a bolus of
immunizing antigen or an infection. In AIRE-deficient subjects, it is suggested that the absence of T cell
tolerance allows multiple T-B interactions of this sort due to the continuous presence of self-antigen,
resulting in the very high affinities seen in Meyer et al. (2016). Here, arrows denote increases in antibody
affinity.
REFERENCES
Anderson, M.S., Venanzi, E.S., Chen, Z., Berzins, S.P., Benoist, C., and Mathis, D. (2005). Immunity 23, 227–239.
Davis, M.M. (2015). Immunity 43, 833–835.
Finnish-German APECED Consortium. (1997). Nat. Genet. 17, 399–403.
Gardner, J.M., Metzger, T.C., McMahon, E.J., Au-Yeung, B.B., Krawisz, A.K., Lu, W., Price, J.D., Johannes, K.P., Satpathy, A.T., Murphy, K.M., et al. (2013). Immunity 39, 560–572.
Kisand, K., Bøe Wolff, A.S., Podkrajsek, K.T., Tserel, L., Link, M., Kisand, K.V., Ersvaer, E., Perheentupa, J., Erichsen, M.M., Bratanic, N., et al. (2010). J. Exp. Med. 207, 299–308.
Mathis, D., and Benoist, C. (2009). Annu. Rev. Immunol. 27, 287–312.
Meyer, S., Woodward, M., Hertel, C., Vlaicu, P., Haque, Y., Karner, J., Macagno, A., Onuoha, S.C., Fishman, D., Peterson, H., et al. (2016). Cell 166, this issue, 582–595.
Nagamine, K., Peterson, P., Scott, H.S., Kudoh, J., Minoshima, S., Heino, M., Krohn, K.J., Lalioti, M.D., Mullis, P.E., Antonarakis, S.E., et al. (1997). Nat. Genet. 17, 393–398.
Sabouri, Z., Schofield, P., Horikawa, K., Spierings, E., Kipling, D., Randall, K.L., Langley, D., Roome, B., Vazquez-Lombardi, R., Rouet, R., et al. (2014). Proc. Natl. Acad. Sci. USA 111, E2567–E2575.
Leading Edge
Previews
Broadening Horizons:
New Antibodies Against Influenza
Katherine J.L. Jackson1 and Scott D. Boyd1,*
1Department of Pathology, Stanford University, CA 94305, USA
*Correspondence: sboyd1@stanford.edu
Seasonal influenza vaccine formulation efforts struggle to keep up with viral antigenic variation.
Two studies now report engineered or naturally occurring human antibodies targeting the influenza
hemagglutinin (HA) stem, with exceptional neutralizing breadth (Joyce et al., 2016; Kallewaard et al.,
2016). Antibodies with similar structural features are elicited in multiple subjects, suggesting that
modified vaccine regimens could provide broad protection.
To paraphrase Jane Austen, it is a truth
universally acknowledged that an individual in possession of a good broadly
neutralizing antibody response to a virus
must be in want of characterization. The
human immune response to influenza
has been no exception. The tools of
monoclonal antibody (mAb) characterization from single B cells, and tracking of the
clonal evolution of B cell populations by
high-throughput antibody sequencing,
are providing an increasingly high-resolution map of human immunity to a range of
pathogens and vaccines (Andrews et al.,
2015; Liao et al., 2013). Influenza is a significant global health challenge with millions of infections and up to half a million
deaths globally each year, despite efforts
to increase vaccination in at-risk populations. Influenza virus is a challenging
target for antibodies because of antigenic
drift and shift, genetic processes that produce ongoing and occasionally abrupt
antigenic changes. Antibodies elicited
by seasonal influenza vaccination tend
to target the rapidly mutating globular
head of hemagglutinin (HA), facilitating
viral escape from responses raised to
other viral strains (Krammer and Palese,
2013). Broadly neutralizing antibodies
(bnAbs) directed against the structurally
conserved HA stem could provide more
enduring protection, and examples of
these have been isolated (Krammer and
Palese, 2013). These antibodies have revealed some stereotyped features, such
as IGHV1-69 usage in multiple subjects
(Corti et al., 2010). These responses in vivo
in humans, however, seem to arise from
low-frequency clones, are present at
lower titers than head-specific antibodies,
and have neutralization breadth usually

limited to homologous subtypes. bnAbs
with neutralizing breadth covering both
influenza A group 1 and group 2 subtypes
would be far more valuable, if they could
be reliably stimulated by new vaccine regimens. Such antibodies have only rarely
been observed in responses to vaccination or infection (Corti et al., 2011). In this
issue, Joyce et al. (2016) and Kallewaard
et al. (2016) apply distinct approaches to
obtain bnAbs with breadth against both
influenza A group 1 and group 2 subtypes.
Joyce et al. (2016) study stereotyped or
convergent antibody responses across
multiple human donors after H5 hemagglutinin primed and repeatedly boosted
influenza vaccination, and they define
new multi-donor bnAb classes (Figure 1).
Kallewaard et al. (2016) engineer and
improve a single-subject-derived mAb
to obtain unprecedented neutralization
breadth. The identification of convergent
classes of antibodies capable of neutralizing both influenza A group 1 and group
2 subtypes builds upon a growing body
of literature showing that the diverse antibody-mediated responses of different humans can converge on similar solutions to
complex antigen targeting (Jackson et al.,
2014; Truck et al., 2015). The bnAb developed by Kallewaard et al. (2016) shows
structural similarities to one of the bnAb
classes identified by Joyce et al. (2016).
Kallewaard's engineered MEDI8852
bnAb reacts to representatives of all influenza A subtypes from the last 80 years and
is derived from a naturally occurring antibody whose heavy chain uses IGHV6-1,
IGHD3-3, and IGHJ3 gene segments
with low levels of somatic hypermutation.
This bnAb interacts with a highly

conserved epitope in the hydrophobic
groove in the fusion domain plus a large
portion of the fusion peptide (Kallewaard
et al., 2016). Similarly, one of the convergent antibody classes isolated from three
subjects by Joyce et al. (2016) also utilizes
IGHV6-1 and IGHD3-3, binds the HA
stem but avoids the conserved HA1 glycans of both group 1 and group 2 subtypes, and uses a somatically mutated
CDR H3 residue to insert into the Trp21
pocket in the hydrophobic groove of
HA2. Two additional convergent bnAb
classes identified by Joyce et al. (2016),
both using IGHV1-18, bind the HA fusion
peptide-helix A region suggesting this region is a recurrent target for bnAbs.
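The notion of a convergent (multi-donor) antibody class can be illustrated with a small sketch that groups antibodies by heavy-chain gene usage and keeps the signatures seen in more than one donor. This is not the method of Joyce et al. (2016), whose classes also rest on structural and epitope data; the donor labels, the records, and every gene assignment other than the IGHV6-1/IGHD3-3 and IGHV1-18 usage named in the text are hypothetical.

```python
from collections import defaultdict

# Hypothetical records: (donor, heavy-chain V gene, heavy-chain D gene).
# Only IGHV6-1 + IGHD3-3 and IGHV1-18 usage come from the text; the rest is made up.
antibodies = [
    ("donor_A", "IGHV6-1", "IGHD3-3"),
    ("donor_B", "IGHV6-1", "IGHD3-3"),
    ("donor_C", "IGHV6-1", "IGHD3-3"),
    ("donor_A", "IGHV1-18", "IGHD3-9"),
    ("donor_B", "IGHV1-18", "IGHD3-9"),
    ("donor_C", "IGHV3-30", "IGHD2-2"),
]


def convergent_signatures(records, min_donors=2):
    """Return gene-usage signatures shared by at least min_donors distinct donors."""
    donors_by_signature = defaultdict(set)
    for donor, v_gene, d_gene in records:
        donors_by_signature[(v_gene, d_gene)].add(donor)
    return {
        signature: sorted(donors)
        for signature, donors in donors_by_signature.items()
        if len(donors) >= min_donors
    }


for signature, donors in convergent_signatures(antibodies).items():
    print(signature, "found in", donors)
```

Gene usage alone is a coarse proxy: a real analysis would also compare CDR H3 features, somatic mutations, and the recognized epitope before calling two antibodies members of the same class.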
These two studies contribute to
ongoing efforts to understand the three-dimensional structural features and the
underlying primary antibody sequences
used to recognize viruses that can alter
many, but not all, of their epitopes. They
add to a growing list of such structurally
conserved vulnerable targets for neutralization in influenza, akin to those for HIV
(Liao et al., 2013), and demonstrate that
shared antibody structural motifs for binding viral epitopes are formed in different
individuals, despite the vast diversity of
antibody repertoires. Analogous convergent antibodies contribute to human antibody responses to many, if not all, pathogens (Jackson et al., 2014; Truck et al.,
2015), but not all convergent antibodies
are neutralizing or otherwise protective.
Structurally constrained pathogen epitopes required for survival may be under
selection so that they are not highly antigenic and may only be accessible to
antibodies that are difficult for humans to
elicit, for example, because they are autoreactive or polyreactive and trigger
poorly defined human B cell tolerance
mechanisms (Liao et al., 2013).
It is less clear how much of any individual's actual serum antibody response to
influenza vaccination is composed of
such bnAbs or whether optimized vaccination regimens could elicit them at
high enough titers to be functionally protective. Ongoing antibody proteomic
research with improved mass spectrometry methods may help to address these
key questions, particularly if applied in
clinical trials in which protection from
infection can be measured (Wine et al.,
2013). What do these studies teach us
about immunogen design or selection?
The pragmatic goal of better vaccine efficacy against divergent viral strains could
be served equally well by eliciting rare antibodies with extreme breadth or a handful
of antibody lineages with complementary
partial breadth. The success of each scenario could depend on the titers reached
and the durability of the response. Evaluating the feasibility of these goals will
likely require testing vaccine regimens using heterologous antigens that are more
divergent than those in current vaccine
formulations and assessing antigenic
combinations, the ordering of antigen
administration, and the effects of adjuvants, while analyzing the kinds of antibodies that are stimulated in each case.
Of course, very potent bnAbs are potential passive immunoprotective or therapeutic drug candidates that could be
particularly helpful in populations with
compromised immune function, such as
the very elderly.
The combination of structural analysis
and extensive sequence characterization
of antibody repertoires, married with functional testing of mAbs, has greatly empowered the analysis of human B cell responses to real pathogens and vaccines.
These new studies underscore the value
of taking cues from the structural problem
solving of the immune system, when
designing new candidate therapeutics,
and they add to the evidence that there
are predictable features of human adaptive immune responses that can be accessed by properly designed vaccine regimens to achieve greater efficacy.
Figure 1. Convergent Broadly Neutralizing Antibodies for Influenza
After exposure to influenza vaccines or viral infections, each individual produces a mixture of different
antibodies specific for a variety of epitopes on the hemagglutinin protein (indicated by antibody color and
epitope color on the hemagglutinin trimer; epitopes are colored on only one of the HA trimers for clarity).
Antibodies that bind to the highly variable head region are most often strain-specific, while antibodies that
bind the more conserved stalk region have the potential to neutralize a wider variety of strains. Rare stalk-binding antibodies that neutralize influenza strains spanning the major antigenic group 1 and group 2
categories have been previously reported. Joyce et al. (2016) now identify several categories of such
broadly neutralizing antibodies (indicated with circles) that also show structural convergence between
different donors (i.e., very similar antibodies are elicited in different individuals, despite the huge diversity
of antibody repertoires). Kallewaard et al. (2016) describe an engineered antibody with particularly
extensive neutralizing breadth, which shares structural similarity with one of the antibody classes in Joyce
et al. (2016) and binds to a similar epitope.
REFERENCES
Andrews, S.F., Huang, Y., Kaur, K., Popova, L.I., Ho, I.Y., Pauli, N.T., Henry Dunand, C.J., Taylor, W.M., Lim, S., Huang, M., et al. (2015). Sci. Transl. Med. 7, 316ra192.
Corti, D., Suguitan, A.L., Jr., Pinna, D., Silacci, C., Fernandez-Rodriguez, B.M., Vanzetta, F., Santos, C., Luke, C.J., Torres-Velez, F.J., Temperton, N.J., et al. (2010). J. Clin. Invest. 120, 1663–1673.
Corti, D., Voss, J., Gamblin, S.J., Codoni, G., Macagno, A., Jarrossay, D., Vachieri, S.G., Pinna, D., Minola, A., Vanzetta, F., et al. (2011). Science 333, 850–856.
Jackson, K.J., Liu, Y., Roskin, K.M., Glanville, J., Hoh, R.A., Seo, K., Marshall, E.L., Gurley, T.C., Moody, M.A., Haynes, B.F., et al. (2014). Cell Host Microbe 16, 105–114.
Joyce, M.G., Wheatley, A.K., Thomas, P.V., Chuang, G., Soto, C., Bailer, R.T., Druz, A., Georgiev, I.S., Gillespie, R.A., Kanekiyo, M., et al. (2016). Cell 166, this issue, 609–623.
Kallewaard, N.L., Corti, D., Collins, P.J., Neu, U., McAuliffe, J.M., Benjamin, E., Wachter-Rosati, L., Palmer-Hill, F.J., Yuan, A.Q., Walker, P.A., et al. (2016). Cell 166, this issue, 596–608.
Krammer, F., and Palese, P. (2013). Curr. Opin. Virol. 3, 521–530.
Liao, H.X., Lynch, R., Zhou, T., Gao, F., Alam, S.M., Boyd, S.D., Fire, A.Z., Roskin, K.M., Schramm, C.A., Zhang, Z., et al.; NISC Comparative Sequencing Program (2013). Nature 496, 469–476.
Truck, J., Ramasamy, M.N., Galson, J.D., Rance, R., Parkhill, J., Lunter, G., Pollard, A.J., and Kelly, D.F. (2015). J. Immunol. 194, 252–261.
Wine, Y., Boutz, D.R., Lavinder, J.J., Miklos, A.E., Hughes, R.A., Hoi, K.H., Jung, S.T., Horton, A.P., Murrin, E.M., Ellington, A.D., et al. (2013). Proc. Natl. Acad. Sci. USA 110, 2993–2998.
Leading Edge
Previews
Baby Nuclear Pores Grow Up Faster All the Time
C. Patrick Lusk1,*
1Yale School of Medicine, New Haven, CT 06510, USA
*Correspondence: patrick.lusk@yale.edu
Annulate lamellae (AL) are stacked ER-derived membranes containing nuclear pore complex-like
structures whose fate and function have remained a mystery. During the short interphase of early
embryonic cells, AL are rapidly delivered into the nuclear envelope through fenestrations, highlighting the remarkable dynamics of the nuclear envelope.
The confinement of the genome within the
nucleus suggests that it, like most organelles, is a physically distinct cellular entity.
However, the two nuclear membranes
that comprise the nuclear envelope (NE)
are made up of one lipid bilayer that is
contiguous with the endoplasmic reticulum (ER) (Figure 1). The morphological
and biochemical identity of the NE is
classically defined by the presence of nuclear pore complexes (NPCs), which are
enormous (~100 MD) transport channels
composed of nucleoporin (nup) proteins.
Importantly, NPCs do not disrupt the continuity of the lipid bilayer between the NE
and ER but effectively seal NE pores
by controlling the passage of soluble
and membrane-bound macromolecules
into and out of the nucleus. Interestingly,
the morphological distinction between
NE and ER is blurred in some cell types,
as extra-nuclear NPC-like structures
occur in stacked ER cisternae (Cordes
et al., 1996). These annulate lamellae
(AL) were considered repositories of
excess nups, but this function was not
formally established. Moreover, it is challenging to contemplate how (or whether)
AL could be incorporated into the NE
without breaking the NE seal that would
also impose a topological barrier to such
an event. In this issue of Cell, Hampoelz
et al. (2016) show how rapidly dividing
cells in the Drosophila embryo overcome
this barrier by visualizing an elegant
mechanism of AL incorporation into the
NE. In addition to providing a solution to
the long-standing question of the function and fate of AL, they also introduce
a mechanism of membrane remodeling
that is challenging our conventional view
of a static interphase NE.
A common feature of early embryonic
divisions is their rapidity, necessitating
constant NE breakdown and reformation

cycles. While Drosophila cells do not fully
break down their NEs, NPC disassembly
and NE fenestration allow spindle microtubules access to the chromosomes. During
the ensuing 10 min interphase (consider
that a typical mammalian cell interphase
lasts several hours!), the NE almost triples
in surface area while maintaining a constant NPC density. These observations
suggest an intimate link between rapid
NE expansion and de novo NPC assembly
(D'Angelo et al., 2006), but what are the
sources of membrane and NPCs? As the
embryonic Drosophila cells are chock-full
of AL that disappear during interphase,
a likely possibility is that they provide
the raw materials for NE expansion.
Indeed, by taking advantage of sophisticated time-lapse fluorescence microscopy coupled with correlative light EM
and slice and view focused ion beam
scanning EM, Hampoelz et al. convincingly demonstrate that AL are incorporated into the NE through large fenestrations (in some cases spanning several
micrometers) in the NE that likely persist
from the preceding mitosis (Figure 1A).
The molecular mechanism of incorporation likely involves a growth in nuclear volume that might mechanically stretch the
NE, further dilating these fenestrations
and providing an entryway for the AL to
come into direct physical contact with
intranuclear factors like chromatin. Undoubtedly, membrane remodeling proteins that help shape the ER contribute
to these processes as well through mechanisms that will await future study.
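The demand that rapid NE expansion places on membrane and NPC supply can be made concrete with a short calculation. The sketch below assumes a spherical nucleus and an exactly threefold increase in surface area (the text says "almost triples"); the starting count of 1,000 NPCs is a hypothetical number used only to show the arithmetic.

```python
import math


def sphere_scaling(area_fold: float) -> dict:
    """Radius and volume fold-changes for a sphere whose surface area grows by area_fold."""
    radius_fold = math.sqrt(area_fold)  # surface area scales with r^2
    volume_fold = radius_fold ** 3      # volume scales with r^3
    return {"radius": radius_fold, "volume": volume_fold}


def npcs_needed(initial_npcs: int, area_fold: float) -> int:
    """New NPCs required to keep NPC density constant while the area grows by area_fold."""
    return round(initial_npcs * (area_fold - 1))


AREA_FOLD = 3.0       # assumption: surface area exactly triples
INITIAL_NPCS = 1000   # hypothetical starting NPC count

scaling = sphere_scaling(AREA_FOLD)
print(f"radius grows {scaling['radius']:.2f}-fold, volume grows {scaling['volume']:.2f}-fold")
print(f"NPCs to add at constant density: {npcs_needed(INITIAL_NPCS, AREA_FOLD)}")
```

Under these assumptions, two new NPCs must be supplied for every pre-existing one within a single short interphase, which makes a preassembled reservoir such as AL an attractive source.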
Interestingly, a key indicator of the
capacity of the embryonic nuclei to undergo the dramatic NE remodeling necessary for AL incorporation is the marked
mobility of NPCs, a bellwether of a
uniquely dynamic NE-ER system. Consistent with this idea, as the embryonic cells
progressively immobilize NPCs by the
increased expression of intranuclear scaffolds like the lamins and lamin-binding
proteins, they lose the capacity to incorporate AL into the NE. Thus, AL incorporation may be halted as cell fate and NE
composition become cemented. Alternatively, AL incorporation might be slowed
because of a requirement to remodel a
rigid intranuclear lamin scaffold, a likely
necessity for NE dynamics in highly differentiated cells with more established nuclear architecture (King and Lusk, 2016).
A model in which the AL incorporate
into the NE makes the prediction that the
NE and AL together comprise a compartment that is distinct from the rest of the
ER. Remarkably, and consistent with this
idea, the permeability barrier of the NE
to macromolecules remains intact despite
the presence of the NE fenestrations, suggesting that the AL effectively seal the
NE like a lid that would be part of a
larger, yet-to-be defined NE-AL system
(Figure 1). A major challenge for the future
is to understand how membranes competent for AL formation either remain connected with the NE during mitotic NE
breakdown or establish connections by
generating holes in a sealed NE. A likely
possibility for the former scenario is that,
during NE breakdown, some membranes
(perhaps those that associate with the
mitotic spindle) retain a biochemical or
morphological signature of the NE that
establishes competence for NE reformation and NPC assembly. The future
identification of this molecular signature
might represent the minimal fundamental
component that triggers NPC assembly
and provides the ultimate differentiator
of the NE and ER.
What can AL teach us about the
mechanism of NPC assembly? As the
core scaffold components of the NPC
assemble into AL within minutes, it is clear
that building an NPC skeleton is a surprisingly efficient and rapid process; studying
AL formation could thus provide an
experimental system to determine the
mechanisms underlying these events. An
advantage of such a system is that it
might harmonize reported differences in
the biochemical (Doucet et al., 2010;
Vollmer et al., 2015) and kinetic (Dultz
and Ellenberg, 2010; Dultz et al., 2008) requirements of NPC assembly observed
either at the end of mitosis or during interphase. For example, these differences
might simply reflect layers of regulation
present at these distinct phases of the
cell cycle (Wandke and Kutay, 2013) or
in unique cell types with distinct nuclear
organizations that would be absent from
a minimal AL assembly system.
To ultimately determine whether there
are fundamental mechanistic differences
in NPC assembly throughout the cell
cycle or into AL will require a deeper
understanding of the biochemical and
structural intermediates that define the
steps in each of these putative pathways. Perhaps most importantly, it needs
to be definitively established whether
these modes of assembly require a
membrane fusion step to insert NPCs
into double membranes (Wandke and
Kutay, 2013). Just as critically, as the
NPC skeletons in AL lack critical asymmetric components of the NPC and
several central channel nups, mechanisms must exist to trigger the completion of NPC assembly upon AL incorporation into the NE (Figure 1). This NPC
maturation must rely on positional
cues that could reflect binding to chromatin (Franz et al., 2007; Rasala et al.,
2006) or to components of the nuclear
basket known to be required for NPC assembly but missing from AL (Vollmer
et al., 2015). Thus, one exciting possibility is that the maturation mechanism
reflects conformational changes to the
NPC scaffold that expose otherwise
hidden anchor points for the remaining
nups. One wonders whether this process
might be reversed during NPC disassembly or dynamically controlled to alter
the transport properties of the NPC in
response to environmental (or other) inputs. In either case, understanding the
molecular basis behind these putative
conformational changes will likely be a
watershed in our understanding of NPC
assembly and function.
Figure 1. Insertion of Annulate Lamellae into the Nuclear Envelope
(A) Annulate lamellae (AL) contain NPC skeletons lacking the full complement of nucleoporins (nups) that
are embedded in stacked ER membranes. They are directly connected to the NE around large fenestrations. The NPC skeletons have nups capable of establishing a diffusion barrier to large macromolecules
and are unlikely to mediate active nuclear transport of nuclear transport receptor (NTR) cargo complexes.
(B) During rapid NE expansion in interphase, the AL are incorporated into the NE as NPC skeletons mature
to complete NPC assembly. INM and ONM, inner and outer nuclear membrane, respectively.
REFERENCES
Cordes, V.C., Reidenbach, S., and Franke, W.W. (1996). Cell Tissue Res. 284, 177–191.
D'Angelo, M.A., Anderson, D.J., Richard, E., and Hetzer, M.W. (2006). Science 312, 440–443.
Doucet, C.M., Talamas, J.A., and Hetzer, M.W. (2010). Cell 141, 1030–1041.
Dultz, E., and Ellenberg, J. (2010). J. Cell Biol. 191, 15–22.
Dultz, E., Zanin, E., Wurzenberger, C., Braun, M., Rabut, G., Sironi, L., and Ellenberg, J. (2008). J. Cell Biol. 180, 857–865.
Franz, C., Walczak, R., Yavuz, S., Santarella, R., Gentzel, M., Askjaer, P., Galy, V., Hetzer, M., Mattaj, I.W., and Antonin, W. (2007). EMBO Rep. 8, 165–172.
Hampoelz, B., Mackmull, M.-T., Machado, P., Ronchi, P., Bui, K.H., Schieber, N., Santarella-Mellwig, R., Necakov, A., Andres-Pons, A., Philippe, J.M., et al. (2016). Cell 166, this issue, 664–678.
King, M.C., and Lusk, C.P. (2016). Curr. Opin. Cell Biol. 41, 9–17.
Rasala, B.A., Orjalo, A.V., Shen, Z., Briggs, S., and Forbes, D.J. (2006). Proc. Natl. Acad. Sci. USA 103, 17801–17806.
Vollmer, B., Lorenz, M., Moreno-Andres, D., Bodenhofer, M., De Magistris, P., Astrinidis, S.A., Schooley, A., Flotenmeyer, M., Leptihn, S., and Antonin, W. (2015). Dev. Cell 33, 717–728.
Wandke, C., and Kutay, U. (2013). Cell 152, 1222–1225.
Leading Edge
Previews
A Biomarker Harvest
from One Thousand Cancer Cell Lines
Yu-Han Huang1 and Christopher R. Vakoc1,*
1Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
*Correspondence: vakoc@cshl.edu
Identifying molecular biomarkers that predict cancer drug efficacy is crucial for the advancement of
precision medicine. In this issue of Cell, Iorio et al. nominate hundreds of potential genetic and
epigenetic biomarkers through high-throughput drug screening in 1,000 molecularly annotated
cancer cell lines.
Developing personalized therapies that
exploit the unique molecular abnormalities in a patient's tumor is a central objective of modern cancer research. Genome
sequencing initiatives, such as The
Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium
(ICGC), have defined the complex genetic
landscapes of the most common human
malignancies; however, only a small
percentage of cancer mutations is
considered to be actionable with existing
therapies (Garraway and Lander, 2013;
Stratton et al., 2009). While new targets
and drugs are clearly needed, a major
obstacle in implementing precision therapies is our incomplete understanding of
the relationship between tumor genotype
and drug sensitivity. To address this
issue, many investigators have turned to
large-scale chemical screens in genetically annotated human cancer cell lines
as a means of nominating predictive biomarkers (Figure 1). While cancer cell lines
in culture are imperfect models of human
tumors, they tend to remain addicted to
the oncogenes that initiated tumor formation and hence are well-validated tools for
studying oncogene-targeted therapies
(Sharma et al., 2010). In this issue, Iorio
et al. present one of the largest attempts
to date at mining predictive correlates of
drug sensitivity using a panel of 1,000
annotated cancer cell lines treated with
265 compounds (Iorio et al., 2016).
The strategy taken by the authors is the
following: (1) perform a deep genetic and
epigenetic analysis of each cell line,
focusing on the features that match the
recurrent alterations found in human tumors, (2) measure the sensitivity of each
cell line to 265 different compounds/
drugs, which includes approved and

investigational agents, and (3) perform
computational analyses to search for genetic/epigenetic alterations that correlate
with resistance and sensitivity to each
drug. Several important observations
have been made during the implementation of this screening platform. First, the
large panel of cell lines used in this study
captures many of the gene mutations,
DNA copy number alterations, and epigenetic changes found in primary human tumors. Furthermore, the authors use machine learning to investigate which data
type (genomic alterations, DNA methylation, or gene expression) is the best
predictor of drug sensitivity. When performing a pan-cancer analysis, gene
expression is the superior predictor of
sensitivity, whereas within a specific cancer, drug sensitivity is best explained by
genomic analysis. Importantly, many of
the pharmacogenomic relationships identified in Iorio et al. could be validated when
evaluating prior analyses of the Cancer
Cell Line Encyclopedia (CCLE) (Barretina
et al., 2012; Seashore-Ludlow et al.,
2015), thus alleviating concerns about
the reproducibility of results from independent drug screening platforms
(Haibe-Kains et al., 2013). Taken together,
these findings provide significant insight
into our assessment of human cancer
cell lines and drug sensitivity profiling as
tools for therapeutic investigation.
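The third step of the strategy above, searching for alterations that correlate with drug response, can be sketched as a two-group comparison of drug sensitivity between mutant and wild-type cell lines. This is only an illustration under stated assumptions, not the statistical model used by Iorio et al.: the log10(IC50) values are synthetic, and Welch's t statistic and Cohen's d are stand-ins for whatever test and effect-size measure a real pipeline would apply.

```python
import math
from statistics import mean, stdev


def welch_t(group_a, group_b):
    """Welch's t statistic for two samples with possibly unequal variances."""
    var_a, var_b = stdev(group_a) ** 2, stdev(group_b) ** 2
    standard_error = math.sqrt(var_a / len(group_a) + var_b / len(group_b))
    return (mean(group_a) - mean(group_b)) / standard_error


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


# Synthetic log10(IC50) values for one hypothetical drug; lower means more sensitive.
mutant_lines = [-1.2, -0.9, -1.5, -1.1, -1.3]
wild_type_lines = [0.1, -0.2, 0.3, 0.0, -0.1, 0.2]

t_stat = welch_t(mutant_lines, wild_type_lines)
effect = cohens_d(mutant_lines, wild_type_lines)
print(f"Welch t = {t_stat:.2f}, Cohen's d = {effect:.2f}")
```

In a real screen this comparison would be repeated for every alteration-drug pair, so the resulting p values would need correction for multiple testing, and a threshold on effect size could be used to single out the strongest associations.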
The output of the analysis in Iorio et al.
is a stunning number of genotype-drug
sensitivity associations. In total, 688 statistically significant interactions have
been identified between individual genetic/epigenetic events and specific cancer drugs, with 262 of these associations
being classified as large effect, that is,

reaching a comparable strength of association as observed between clinically
validated kinase inhibitors and their
genetically altered kinase target (imatinib/BCR-ABL, vemurafenib/BRAF V600E).
We list here just a few of these novel associations: sensitivity to the anti-androgen
bicalutamide in squamous cell lung cancer lines has been found to be associated
with inactivating mutations of the chromatin modifier MLL2. In stomach cancer
cell lines, truncating mutations of the transcriptional corepressor BCOR are associated with sensitivity to LY317615, an inhibitor of protein kinase C β. Sensitivity
to the BRD2/BRD3/BRD4 bromodomain
inhibitor JQ1 in breast cancer lines is
linked to the mutational status of the
RNA polymerase II subunit POLR2B. The
functional relationship between these
non-oncogene drug targets and these
specific genetic alterations is presumably
indirect and can be contrasted with the
classical targeted therapy paradigm of
direct oncoprotein inhibition (e.g., imatinib/BCR-ABL) (Luo et al., 2009). Hence,
this study may have exposed a vast array
of synthetic-lethality genetic interactions
for future mechanistic characterization
(Kaelin, 2005).
Figure 1. The Major Cancer Cell Line Drug Screening Initiatives
The NCI-60 project was initiated in the 1980s and has to date tested >100,000 compounds in 59 cancer cell lines. More recently, the scale of cancer cell line drug
screens has expanded dramatically (e.g., Cancer Cell Line Encyclopedia [CCLE]/Cancer Therapeutics Response Portal [CTRP], and Genomics of Drug Sensitivity
in Cancer [GDSC]) and now includes a detailed genetic analysis of each cell line.
When Iorio et al. is considered together
with other large-scale compound
screening initiatives (Figure 1), it becomes
apparent that an explosion of new
biomarker-guided therapeutic opportunities is emerging, which now await validation in pre-clinical models and/or in
cancer patients. Unfortunately, the historical experience of pharma and academia
in developing drugs against non-oncogene drug targets, in particular those
discovered in cell lines, would lead us to

anticipate a low probability of success
during clinical translation. However, the
analysis presented in Iorio et al. may
have addressed the critical issue that
has undermined the prior pursuits of
non-oncogene targets: the lack of predictive genetic biomarkers. A key question
going forward will be whether the inherent
limitation of cancer cell lines as tumor
models will obfuscate the pharmacogenomic relationships identified in this
study, as validation experiments proceed
into more physiological tumor models.
Despite these concerns, the remarkable
scale of this cell line screening effort
places us in a strong position of having
hundreds of potential hypotheses to be
explored. Hence, even a low rate of validation could still amount to a major
advance in the development of targeted
cancer therapies. In this regard, a large-scale drug efficacy evaluation in genetically annotated patient-derived xenograft
(PDX) tumor models would be a justified
follow-up venture to this work.
While the clinical significance of the
findings in Iorio et al. remains to be determined, the utility of this cell line resource
for basic cancer research is unambiguous. NCI-60 was initiated in the 1980s as
the first compound screening initiative in
cancer lines and contributed fundamental
insights to our understanding of drug
mechanisms of action (Chabner, 2016).
The deepening genomic, epigenomic,
and, ultimately, metabolomic characterization of cell lines is allowing investigators to pinpoint cancer-sustaining molecular mechanisms with unprecedented
depth, rigor, and speed. As such efforts
have expanded in recent years, it is now
common practice in many research labs,
including our own, to use publicly available cancer cell line resources to guide
the experimental evaluation of any new
gene or small molecule for its relevance
to cancer biology.
ACKNOWLEDGMENTS
C.R.V. is supported by the Leukemia and Lymphoma Society, the Burroughs-Wellcome Fund,
the Pershing Square Sohn Cancer Research Alliance, the Starr Cancer Consortium, and the NIH/
NCI grant RO1 CA174793.
REFERENCES
Barretina, J., Caponigro, G., Stransky, N., Venkatesan, K., Margolin, A.A., Kim, S., Wilson, C.J., Lehar, J., Kryukov, G.V., Sonkin, D., et al. (2012). Nature 483, 603-607.
Chabner, B.A. (2016). J. Natl. Cancer Inst. 108, 108.
Garraway, L.A., and Lander, E.S. (2013). Cell 153, 17-37.
Haibe-Kains, B., El-Hachem, N., Birkbak, N.J., Jin, A.C., Beck, A.H., Aerts, H.J., and Quackenbush, J. (2013). Nature 504, 389-393.
Iorio, F., Knijnenburg, T.A., Vis, D.J., Bignell, G.R., Menden, M.P., Schubert, M., Aben, N., Goncalves, E., Barthorpe, S., Lightfoot, H., et al. (2016). Cell 166, this issue, 740-754.
Kaelin, W.G., Jr. (2005). Nat. Rev. Cancer 5, 689-698.
Luo, J., Solimini, N.L., and Elledge, S.J. (2009). Cell 136, 823-837.
Seashore-Ludlow, B., Rees, M.G., Cheah, J.H., Cokol, M., Price, E.V., Coletti, M.E., Jones, V., Bodycombe, N.E., Soule, C.K., Gould, J., et al. (2015). Cancer Discov. 5, 1210-1223.
Sharma, S.V., Haber, D.A., and Settleman, J. (2010). Nat. Rev. Cancer 10, 241-253.
Stratton, M.R., Campbell, P.J., and Futreal, P.A. (2009). Nature 458, 719-724.
Cell 166, July 28, 2016 537
Leading Edge
Review
The Genetics of Transcription Factor
DNA Binding Variation
Bart Deplancke,1,* Daniel Alpern,1 and Vincent Gardeux1
1Laboratory of Systems Biology and Genetics, Institute of Bioengineering, École Polytechnique Fédérale de Lausanne and Swiss Institute of Bioinformatics, 1015 Lausanne, Switzerland
*Correspondence: bart.deplancke@epfl.ch
Most complex trait-associated variants are located in non-coding regulatory regions of the
genome, where they have been shown to disrupt transcription factor (TF)-DNA binding motifs. Variable TF-DNA interactions are therefore increasingly considered as key drivers of phenotypic variation. However, recent genome-wide studies revealed that the majority of variable TF-DNA binding
events are not driven by sequence alterations in the motif of the studied TF. This observation implies
that the molecular mechanisms underlying TF-DNA binding variation and, by extrapolation, interindividual phenotypic variation are more complex than originally anticipated. Here, we summarize
the findings that led to this important paradigm shift and review proposed mechanisms for local,
proximal, or distal genetic variation-driven variable TF-DNA binding. In addition, we discuss the
biomedical implications of these findings for our ability to dissect the molecular role(s) of noncoding genetic variants in complex traits, including disease susceptibility.
Introduction
Analysis of genomic variation in humans (Auton et al., 2015) as
well as in model species such as the mouse (Keane et al.,
2011; Yalcin et al., 2011) and fruit fly (Huang et al., 2014; Massouras et al., 2012) is providing unprecedented opportunities to
understand the genetic basis of complex traits, including disease
susceptibility. An important insight that emerged from genome-wide association studies (GWAS) is that the vast majority of
significantly associated genetic variants is located in non-coding
regions and may thus impact gene regulation. For example, of
465 unique trait/disease-associated single nucleotide polymorphisms (SNPs) derived from 151 GWAS studies, only 12% are
located in protein-coding regions, while 40% fall within introns
and another 40% in intergenic regions (Hindorff et al., 2009). In
addition, genome-wide profiling of accessible chromatin regions
using DNase I hypersensitivity (DHS) mapping revealed that
almost 60% of non-coding GWAS SNPs and other variants are
located within DHS sites, with another 20% being in complete
linkage disequilibrium (LD) with variants that lie in a proximate
DHS site (Maurano et al., 2012). Since DHS sites reflect the occupancy of DNA binding proteins such as transcription factors
(TFs), these data indicate that GWAS loci may alter the binding
of TFs and, as such, induce variation in gene expression and
ultimately in complex organismal phenotypes. In this Review,
we summarize the findings that led to this increasingly accepted
notion of the importance of variation in TF-DNA binding in mediating phenotypic diversity. In addition, we strive to clarify why, for
the majority of studied traits or diseases, establishing a mechanistic link between regulatory and phenotypic variation is still
very challenging.
For this purpose, we explore the molecular mechanisms mediating TF-DNA binding variation and address a major question in
the field, namely, why the majority of variable TF-DNA binding
events appear to be driven by mechanisms other than nucleotide
variation in the cognate motifs. We thereby focus on human interindividual, molecular variation and restrict this Review to discussing mechanisms underlying variable TF-DNA binding. Consequently, we will only briefly cover the functional consequences
of this variation or other modes of regulatory variation, which
have been extensively detailed elsewhere both for humans and
model organisms (Albert and Kruglyak, 2015; Lehner, 2013;
Lowe and Reddy, 2015; Mackay et al., 2009; Pai et al., 2015).
A Brief Historical Perspective on Variable TF-DNA
Interactions as Key Drivers of Inter-individual,
Phenotypic Diversity
The discovery of regulatory sequences (or operators) in bacteria by Jacob and Monod initiated the debate of whether variation
in regulator-operator interactions could drive phenotypic diversity (Jacob and Monod, 1961). It was proposed that this variation could arise either through mutations in the regulator itself or
through mutations in the operator that would alter or abolish its
specific affinity for the repressor (i.e., regulator) (Jacob and
Monod, 1961). This fundamental prediction proved to be accurate across species, and multiple examples have since been
revealed that support both scenarios (Barrera et al., 2016; Hoekstra and Coyne, 2007; Lynch and Wagner, 2008; Wray, 2007).
The first concrete evidence supporting the importance of such
non-coding or regulatory variation for human traits or diseases
started to emerge in the early 1980s, when the molecular mechanisms underlying thalassemias were investigated. These heritable blood disorders, characterized by an abnormal form of
hemoglobin, made it intuitive to explore the globin gene locus
for disease-causing genetic variants. Numerous variants were
detected, including several polymorphisms in the β-globin gene
(HBB) promoter that correlated with reduced HBB expression
(Orkin et al., 1982; Poncz et al., 1982). For example, a single
nucleotide substitution (C to G) at position -87 relative to the HBB transcription start site (Orkin et al., 1982) was hypothesized to affect
the recruitment of a transcriptional activator. However, it was
only 11 years later, when the erythroid Krüppel-like factor
(KLF1) was cloned, that the affected site (CA(C/G)CC) was
matched with a TF (Miller and Bieker, 1993) (Table 1). This
groundbreaking example (as well as several others listed in
Table 1) supports the idea of TF-DNA interactions being key
drivers of phenotypic variation. However, the fact that the underlying molecular mechanisms were uncovered for these diseases
is still more the exception than the rule.
Assessing the Impact of Genetic Variation on TF-DNA
Binding: A Complex Affair
The ability to elucidate the molecular mechanisms underlying
thalassemia, haemophilia B, or malaria resistance was made
possible because several critical pieces of information were
available that are often missing in other genotype-phenotype
relationship studies: (1) knowledge of the affected gene, which
facilitated the identification of the causal mutation(s); (2) availability of DNA binding specificity data for the implicated TF;
and (3) relatively straightforward imputation of the effect of the
causal mutation(s) on TF-DNA binding. Below, we will discuss
each of these three items in more detail and explain why they
collectively complicate studies that investigate the impact of genetic variation on molecular or organismal variation.
Identification of Causal Mutation(s) and Affected
Gene(s)
In contrast to cases like the thalassemia mutations discussed
above, GWAS studies identify genetic variants linked to particular traits, but not necessarily those actually causing the disease
(Manolio, 2013). In addition, such studies generally have little
prior knowledge regarding which genes will be uncovered.
Therefore, by simply matching a GWAS SNP with a TF binding
site, one risks wrongly inferring that a TF must affect the expression of the gene that is most proximal to a particular binding site.
However, the actual culprit could be another genetically linked
but unprobed variant such as an indel or a rare SNP that impacts
a different binding site and thus a distinct TF and/or target gene.
This potential misidentification is why significant efforts are
currently undertaken to fine-map complex traits using statistical
arguments (Figure 1; see also the "Imputing DNA Binding Variation" section) and/or integrative genomic approaches to identify
causal variants and their target genes at nucleotide-level resolution. A striking recent example involves variants that have been
consistently associated with an elevated body mass index in
both children and adults (Dina et al., 2007; Frayling et al.,
2007). These variants are located in the first and second introns
of a large (>250 kb) gene named "fatso," or FTO, because its deletion causes a fused toes phenotype in mouse (Peters et al.,
1999). Given its association with obesity, it was subsequently renamed the "fat mass and obesity-associated" gene and was
widely mechanistically studied for its role in energy homeostasis
(Fischer et al., 2009). However, recent studies revealed that the
focal variants are, in fact, located in a regulatory element that
controls the expression of the TF-coding genes IRX3 and IRX5
more than 1 Mbp away (Claussnitzer et al., 2015; Smemo
et al., 2014). Thus, these variants appear to have little impact
on the FTO gene, even though they are positioned within its introns. Rather, one variant disrupts the binding site of ARID5B
(AA(T/C)ATT), resulting in elevated IRX3 and IRX5 expression.
This, in turn, increases the formation of white fat cells, possibly
leading to excessive fat accumulation (Claussnitzer et al.,
2015). Significant efforts involving a battery of advanced computational (Claussnitzer et al., 2014) and experimental approaches
were required to study the molecular function of the FTO
variants and their relationship with body mass index. This
complexity illustrates why the number of mechanistically well-studied relationships between regulatory and phenotypic variation is still relatively low. It also explains why the majority of
such studies focused on gene proximal variants (Table 1), especially prior to 2005, when the importance of chromosome conformation in gene regulation was still less established.
Incomplete TF Motif Catalog
In the early 1990s, DNA binding variation of the TFs C/EBPα and
HNF4A, as well as GATA1, was linked to, respectively, haemophilia B and malaria resistance (Table 1). The identification of
these TFs is intuitive since they were among the relatively few
TFs whose DNA binding properties had been described at the
time when these genetic studies were carried out (Faisst and
Meyer, 1992). Several other studies have since established regulatory variant-phenotype relationships on the basis of GATA1
(Table 1), illustrating how such studies tend to be restricted to
investigating phenotypes that involve well-characterized TFs.
While the development of several large-scale in vitro DNA
binding characterization technologies, such as protein-binding
microarrays (PBM) (Berger et al., 2006), bacterial one-hybrid
(B1H) screening (Meng et al., 2005), and high-throughput (HT)-SELEX (Jolma et al., 2010), has enabled a significant expansion
of the TF motif catalog, it is worth noting that at least one-third of
human TFs remains uncharacterized (Box 1). In other words,
several hundred TFs are still devoid of DNA binding specificity
models such as the most routinely used position weight matrix
(PWM) (Stormo and Zhao, 2010) (Figure 1). This lack of information severely limits the ability to analyze the effects of genetic
variation on TF-DNA binding, as a large fraction of human TFs
can simply not be taken into account in such studies.
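To make concrete what a PWM encodes, the following minimal Python sketch derives a log-odds matrix from a few aligned binding sites and scores candidate sequences against it. The site list, the pseudocount of 0.5, and the uniform background frequency of 0.25 are assumptions chosen for illustration; they are not taken from any of the studies cited here.

```python
import math

BASES = "ACGT"

def build_pwm(sites, pseudocount=0.5, background=0.25):
    """Return one {base: log2-odds} dict per motif position."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [s[pos] for s in sites]
        pwm.append({
            b: math.log2(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return pwm

def score(pwm, seq):
    """Sum of per-position log-odds; assumes positions contribute independently."""
    if len(seq) != len(pwm):
        raise ValueError("sequence and PWM must have the same length")
    return sum(col[base] for col, base in zip(pwm, seq))

# Hypothetical aligned binding sites (invented for this example).
SITES = ["CACCC", "CACCC", "CACCT", "CGCCC", "CACCC"]
pwm = build_pwm(SITES)

for candidate in ("CACCC", "CAGCC"):
    print(candidate, round(score(pwm, candidate), 2))
```

A TF without such a matrix (or an equivalent model) cannot be scored at all, which is the practical consequence of an incomplete motif catalog.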
This problem becomes even larger when considering that
many TFs do not bind DNA as single entities but, rather, in the
form of obligate heterodimers such as TFs containing bZIP,
bHLH, MADS box, or Rel DNA binding domains. Since the focus
of DNA binding specificity determination studies has largely
been on single protein-DNA interactions, DNA binding motifs
for such heterodimers are underrepresented in current regulatory lexicons. Moreover, many TFs also participate in facultative
heterodimers, since they can bind to DNA in either a monomeric or a
dimeric context. It is difficult to know how many such heterodimers routinely form in cells, but predictions range from 3,000
(Ravasi et al., 2010) to >25,000 (Jolma et al., 2015). Interestingly,
these cooperative TF pairs often show distinct binding site
preferences compared to the respective, individual TFs, as the
heterodimer core motif typically consists of closely packed individual motifs that overlap at their flanks. Consequently, individual
Table 1. Examples Linking Variable TF-DNA Binding to Phenotypic Variation Arranged by Date of Characterization
Phenotype | Affected Gene | Causal Variant Position Relative to TSS | Affected Binding Site | TFBS Outcome | Reference(s)
Hereditary persistence of fetal haemoglobin | HBG | 175 bp | GATA1; TAL1 | Gain | (Martin et al., 1989; Wienert et al., 2015)
Haemophilia B Leyden | F9 | 20 bp; 10 bp; 6 bp | HNF4α; C/EBPα; OC1/OC2 | Loss | (Reijnen et al., 1992; Crossley and Brownlee, 1990; Funnell et al., 2013)
Haemophilia B Brandenburg | F9 | 26 bp | AR | Loss | (Crossley et al., 1992)
Delta-thalassemia | HBD | 77 bp | GATA1 | Loss | (Matsuda et al., 1992)
Duffy blood antigen/chemokine receptor expression | DARC | 46 bp | GATA1 | Loss | (Tournamille et al., 1995)
Familial combined hyperlipidemia | LPL | 39 bp | OCT1 | Loss | (Yang et al., 1995)
Bernard-Soulier syndrome | GP1BB | 133 bp | GATA1 | Loss | (Ludlow et al., 1996)
Osteoporosis | COL1A1 | +2 kb | Sp1 | Gain | (Grant et al., 1996)
Maturity-onset diabetes of the young | HNF1A | 58 bp | HNF4A | Loss | (Gragnoli et al., 1997)
Asthma | IL10 | 509 bp | YY1 | Gain | (Hobbs et al., 1998)
Pyruvate kinase deficiency | PKLR | 72 bp | GATA1 | Loss | (Manco et al., 2000)
Congenital erythropoietic porphyria | UROS | 70 bp; 90 bp | GATA1; CP2 | Loss | (Solis et al., 2001)
Psoriasis | SLC9A3R1 | 237 bp | RUNX1 | Loss | (Helms et al., 2003)
Systemic lupus erythematosus | FASLG | 844 bp | CEBPB | Loss | (Wu et al., 2003)
Esophageal cancer | COX-2 | 1195 bp | c-MYB | Gain | (Zhang et al., 2005)
Treacher Collins syndrome | TCOF1 | 346 bp | YY1 | Loss | (Masotti et al., 2005)
Alpha-thalassemia | HBA | 13 bp | GATA1 | Gain | (De Gobbi et al., 2006)
Holoprosencephaly | SHH | 460 kb | SIX3 | Loss | (Jeong et al., 2008)
Various cancers | TERT | 187 bp | ETS2 | Loss | (Xu et al., 2008)
Nonsyndromic cleft lip | IRF6 | 14 kb | AP2 | Loss | (Rahimov et al., 2008)
Pierre Robin syndrome | SOX9 | 1.44 Mb | MSX1 | Loss | (Benko et al., 2009)
Prostate cancer | MYC | 200 kb | FOXA1 | Gain | (Jia et al., 2009)
Colorectal cancer | MYC | 300 kb | TCF7L2 | Gain | (Tuupanen et al., 2009)
Asthma and autoimmune diseases | ZPBP2; GSDMB; ORMDL3 | 5 kb; +44 kb; +54 kb | CTCF | Loss | (Verlaan et al., 2009)
Myocardial infarction | SORT1 | 44 kb | CEBPA | Loss | (Musunuru et al., 2010)
Beta-thalassemia | HBB | 71 bp | GATA1 | Loss | (Al Zadjali et al., 2011)
Coagulant factor VII deficiency | F7 | 60 bp | HNF4A | Loss | (Zheng et al., 2011)
Osteoarthritis | GDF5 | 41 bp | YY1 | Loss | (Dodd et al., 2013)
Breast cancer | CCND1 | 127 kb; 76 kb | ELK4; GATA3 | Loss; Gain | (French et al., 2013)
Melanoma, various cancers | TERT | +2 bp; 66 bp; 88 bp | ETS2 | Gain | (Horn et al., 2013; Huang et al., 2013)
Increased cancer susceptibility | KITLG | +20 kb | P53 | Loss | (Zeron-Medina et al., 2013)
Hirschsprung disease | SOX10 | 30 kb | AP2; SOX10 | Loss | (Lecerf et al., 2014)
Insulin resistance | PPARG2 | 6 kb | PRRX1 | Loss | (Claussnitzer et al., 2014)
Type 2 diabetes and proinsulin decrease | ARAP1 | +418 bp | PAX6/PAX4 | Loss | (Kulzer et al., 2014)
Melanoma | SDHD | 25 bp; 7 bp; 4 bp | EHF, ELF1 & ETS1 | Loss | (Weinhold et al., 2014)
Pancreatic agenesis | PTF1A | 25 kb | FOXA2, PDX1 | Loss | (Weedon et al., 2014)
Acute lymphoblastic leukemia | TAL1 | 7.5 kb | MYB | Gain | (Mansour et al., 2014)
Obesity and Type 2 diabetes | IRX3; IRX5 | 0.5 Mb; 1.2 Mb | ARID5B | Loss | (Claussnitzer et al., 2015)
Colorectal cancer | FASLG | 1377 bp; 670 bp | SP1; STAT1 | Loss | (Wang et al., 2016)
Figure 1. A Methodological Workflow for Identifying Regulatory Variants

Sequence-based, computational methodologies that evaluate the impact of potential regulatory variants on TF-DNA binding and downstream regulatory processes are schematically presented. For every putative variant (SNV, as in this example, or indel), a reference and alternate (containing the variant) sequence of
pre-defined length (illustrated by the distinct shades of gray) is extracted. The chosen length defines the sequence environment and varies according to the
type of model that is used. The middle yellow panel shows the common workflow, where both sequences are scored (S_ALT and S_REF) according to a specific model
representation to obtain a differential score (ΔS) that may indicate a change in DNA binding or more generally in chromatin state. As shown, ΔS supports a model
in which the variant impacts a gene regulatory process. The bottom part of the figure illustrates the two main strategies that are employed for modeling the
regulatory effect of a variant. The choice of the strategy depends on the posed question: does the variant impact (1) the binding of a TF (left) or (2) the local
chromatin landscape (right)? In the first scenario, computational methods are used that depend on the availability of a comprehensive catalog of TF binding
sequences or motifs (Box 1). The "de novo motif discovery" part schematizes the procedure that is required to obtain such a catalog, illustrating the use of
sequence over-representation strategies that are applied on both in vivo (ChIP-seq, DHS-seq, etc.) and in vitro (e.g., PBM, HT-SELEX, or B1H) derived datasets.
These strategies then produce TF motifs that can be represented either in regular expression format or using PWM- or HMM-based models. In this example, the
linear HMM model is a generic representation of the PWM motif, with each node (state) of the HMM representing the position of a base in the motif. Additionally, a
second HMM model is depicted, which inherently takes a variable space within the motif into account, for accurate representation of more complex binding
scenarios (e.g., TF dimers). To answer the second question (lower-right), computational methods mainly rely on machine learning models that are trained on a
wide variety of features such as a k-mer vocabulary built on regulatory versus background sequences or additional (epi)genomic datasets. These more elaborate
models can also be used to score the two input sequences. The pipeline then evaluates the regulatory nature of the variant by directly assessing the differential
score ΔS or by calculating a p value based on the distribution of the scores. Of note, this pipeline can be applied multiple times on different variants, after which
the results can be aggregated and compared to prioritize variants.
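The last step of the workflow, turning a differential score into a p value and ranking several variants, can be sketched as follows. This is only one plausible reading of "a p value based on the distribution of the scores": it assumes an empirical, two-sided comparison against a background set of differential scores, with the commonly used +1 correction. The background values and variant names are invented for the example.

```python
def empirical_p_value(observed, background):
    """Two-sided empirical p value: share of background differential scores
    at least as extreme as the observed one, with a +1 correction."""
    extreme = sum(abs(x) >= abs(observed) for x in background)
    return (extreme + 1) / (len(background) + 1)

# Hypothetical background of differential scores, e.g. from control variants.
BACKGROUND = [-0.4, 0.1, 0.3, -1.2, 0.8, -0.2, 0.05, 2.9, -0.6, 0.0,
              0.5, -0.9, 1.1, -0.1, 0.2, -3.6, 0.4, -0.3, 0.7]

# Hypothetical differential scores (alternate minus reference) per variant.
VARIANTS = {"var_a": -3.4, "var_b": 0.6, "var_c": 1.0}

p_values = {name: empirical_p_value(ds, BACKGROUND) for name, ds in VARIANTS.items()}
ranked = sorted(p_values, key=p_values.get)  # most extreme variant first

for name in ranked:
    print(name, VARIANTS[name], p_values[name])
```

How the background is chosen (for example, matched control variants versus shuffled sequences) strongly affects the resulting ranking and is left open here.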
Box 1. How Many Human TFs Have Assigned Motifs?

While seemingly straightforward, it turns out to be difficult to precisely enumerate the number of TFs with defined motifs. There is no consensus on
the number of TF-coding genes in the human genome. A comprehensive, manual curation primarily based on the presence of sequence-specific
DNA binding domains revealed 1,391 high-confidence genes, with another 216 listed as plausible (Vaquerizas et al., 2009). However, this list
may not be exhaustive as a protein microarray-based survey of DNA binding capacity revealed that hundreds of proteins among 3,000 that
were not annotated as TFs were able to bind to DNA in a site-specific manner (Hu et al., 2009). Thus, additional experimental efforts will be required
to derive a more precise estimate of the number of TFs that the human genome encodes. Therefore, we need to simplify the question to how many of
the roughly 1,400 high-confidence TFs have annotated motifs. A very recent, expansive analysis involving almost 1,000 high-quality ChIP-seq and 542 HT-SELEX datasets produced binding site models for 601 human TFs that are retrievable from the HOCOMOCO database (Kulakovskiy et al., 2016).
Why only about 40% of human TFs feature experimentally derived motifs despite the development of powerful DNA binding characterization technologies such as PBM (Berger et al., 2006), bacterial one-hybrid (B1H) (Meng et al., 2005), or HT-SELEX (Jolma et al., 2010) is unclear but may largely
be due to technical limitations, including loss of DNA binding properties or weak in vitro expression of full-length proteins. This is why most in vitro
DNA binding assays rely on analysis of the DNA binding domains (DBDs) of TFs, as these are easier to work with in terms of cloning and expression
while exhibiting DNA binding properties that appear largely comparable to the respective full-length protein versions (Jolma et al., 2013).
The largest TF families that still resist a comprehensive characterization are the high-mobility group (HMG) TFs and C2H2 zinc finger proteins (Jolma
et al., 2013), of which the human genome encodes more than 700 (Weirauch and Hughes, 2011). Progress is being made though, as illustrated by a
recent study that combined PBMs and B1H assays to probe thousands of individual C2H2 zinc finger domains with the aim of inferring a specific
DNA recognition code (Najafabadi et al., 2015). The resulting motifs proved to be highly diverse in terms of nucleotide composition and exhibited
extensive degeneracy, which means that these motifs can be represented by many different sequences and that small internal perturbations in these
motifs tend to have little impact on DNA binding. Consequently, there is still ample room for alternative approaches or technologies that will enable
the further expansion or fine-tuning of the current catalog of human TF PWMs, also named the human regulatory lexicon.
Nevertheless, it may not be necessary to gather experimental data for all TFs, given that many have nearly identical DNA binding properties because
their DBDs are highly similar. Indeed, TFs (independent of organism) whose DBDs share >87.5% of their amino acids were found to bind to motifs
that were almost indistinguishable from one another (Weirauch et al., 2014). Applying this principle to human TFs adds another 200 inferred motifs to
the current catalog, which can be found in the Cis-BP database (Weirauch et al., 2014). In sum, the DNA binding properties of a significant fraction of
human TFs remain uncharacterized without even taking into account heterodimer or higher-order complex formation (see main text).
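The motif-transfer rule described in this box can be illustrated with a short Python sketch that compares two DBD sequences and applies the >87.5% amino acid identity cutoff. The way identity is computed here (over pre-aligned sequences, ignoring columns that are gaps in both) is an assumption made for the example and is not necessarily the procedure used by Weirauch et al.; the peptide sequences are hypothetical.

```python
def percent_identity(a, b):
    """Fraction of identical residues between two pre-aligned sequences.
    Columns where both sequences carry a gap ('-') are ignored."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if not (x == "-" and y == "-")]
    if not pairs:
        return 0.0
    return sum(x == y and x != "-" for x, y in pairs) / len(pairs)

def can_transfer_motif(a, b, threshold=0.875):
    """True if identity strictly exceeds the threshold (>87.5% by default)."""
    return percent_identity(a, b) > threshold

# Hypothetical 16-residue DBD fragments (invented for illustration).
DBD_A = "RKRGRQTYTRYQTLEL"
DBD_B = "RKRGRQTYTRFQTLEL"  # one substitution relative to DBD_A
DBD_C = "RKKGRQTYTRFQTLEL"  # two substitutions relative to DBD_A

for other in (DBD_B, DBD_C):
    print(other, percent_identity(DBD_A, other), can_transfer_motif(DBD_A, other))
```

With 16 aligned residues, two substitutions give exactly 87.5% identity, which does not pass a strict "greater than" cutoff; whether the boundary case should pass is a design choice.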
TFs may still be able to bind to this core motif, albeit with much
lower affinity. This may, in part, explain the observed discrepancy between in vivo DNA occupancy levels and in-vitro-derived
DNA binding affinities (Biggin, 2011), since these in vivo binding
events may reflect binding by interacting TF pairs and not individual TFs. It is therefore clear that a large portion of motifs
remain to be characterized, emphasizing the need for new technologies or efforts to close this gap.
Imputing DNA Binding Variation
It has often proven difficult to infer whether a specific polymorphism will significantly change TF-DNA binding and act as a
regulatory variant, even if the PWM model of the TF is available.
This complication stems from difficulties in capturing the
DNA binding complexity of a TF in a robust binding model either
to confidently detect a genuine binding site within a given
sequence or to accurately infer the impact of a variant on detected motifs.
The Accuracy of Binding Models and Robustness of Motif
Detection. The majority of motif detection methodologies rely
on PWM representation since PWMs perform relatively well
with respect to capturing the overall binding affinity. This is
because PWMs can be modeled as a numerical matrix, which
enables the scoring of a given sequence according to its similarity to a motif (Figure 1). Nevertheless, it is important to acknowledge that this model also has several limitations, which may
impede the discovery of the correct binding patterns. For
example, PWM models assume that the nucleotide binding energies are independent (Stormo and Zhao, 2010), which proved
not to be generally valid (Bulyk et al., 2002; Jolma et al., 2013;
Maerkl and Quake, 2009; Nutiu et al., 2011), and are also suboptimal to represent the binding of TF dimers, since many of these
bind to two sequences that are separated by a spacer with variable length. These caveats have spurred the development of
different models for representing TF motifs, such as hidden Markov models (HMMs) (Gelfond et al., 2009; Zhao et al., 2005) and
more advanced machine learning models, stimulated by the
increasing availability of multiple layers of genomic, transcriptomic, and epigenomic information. Among these are support
vector machine (SVM) or neural network (NN) approaches that
are trained on datasets containing both known regulatory and
random sequences, with the goal of recognizing and scoring
new putative regulatory sequences (Gao and Ruan, 2015)
(Figure 1 and Table S1). Such representations have many advantages over conventional models because they are highly flexible.
In addition, they are not limited to the DNA sequence recognized
by the TF and can incorporate additional features that are also
important to model TF-DNA binding. These features include
the 3D structural conformation of DNA and its steric characteristics (Levo and Segal, 2014; Rohs et al., 2009; Zhou et al., 2015),
the chemical properties used to model TF amino acid-nucleotide
contacts at the atomic level (Bauer et al., 2010; Maienschein-Cline et al., 2012), protein concentration (Djordjevic et al.,
2003; Wang and Batmanov, 2015) that allows for a more accurate estimation of DNA occupancy and thus intrinsic DNA binding affinity (Biggin, 2011; Simicevic et al., 2013), and, finally,
the nucleotide composition of motif-neighboring sequences.
Indeed, recent work revealed that the sequence environment
of a genuine binding site tends to be distinct from that of unbound sequences. In particular, it was shown to exhibit specific
sequence features such as high GC content (White et al., 2013)
or a higher similarity to the core motif (Dror et al., 2015) that
may guide TFs to their cognate binding sites. These findings
have important consequences in terms of predicting DNA
binding events, since motif-scanning tools typically penalize for
local nucleotide composition biases. Instead, a better practice
may now involve rewarding motifs that are surrounded by established DNA binding-promoting features, such as a high GC fraction or lower-scoring and thus weaker homotypic (i.e., similar)
motifs. Together, these studies illustrate that the formulation of
DNA binding models and computational detection of genuine
binding sites is far from trivial and that further efforts aimed at
integrating a wide range of genomic datasets will be required
to increase the robustness of motif definition and mapping
approaches.
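As a rough illustration of taking the sequence environment into account, the minimal Python sketch below scans both strands of a sequence with a PWM and reports the GC fraction of the flanks of each hit, so that this value could be used to reward rather than penalize a candidate site. The toy match/mismatch matrix, the example sequence, the score threshold, and the 6-bp flank size are all assumptions made for the example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def scan(seq, pwm, threshold):
    """Yield (start, strand, score) for windows scoring at or above threshold."""
    w = len(pwm)
    for start in range(len(seq) - w + 1):
        window = seq[start:start + w]
        for strand, site in (("+", window), ("-", revcomp(window))):
            s = sum(col[b] for col, b in zip(pwm, site))
            if s >= threshold:
                yield start, strand, s

def flank_gc(seq, start, width, flank=10):
    """GC fraction of up to `flank` bases on each side of a hit (hit excluded)."""
    flanks = seq[max(0, start - flank):start] + seq[start + width:start + width + flank]
    return sum(b in "GC" for b in flanks) / len(flanks) if flanks else 0.0

# Toy PWM: +1 for the consensus base, -1 otherwise (hypothetical values).
MOTIF = "CACCC"
PWM = [{b: (1.0 if b == m else -1.0) for b in "ACGT"} for m in MOTIF]

# Hypothetical sequence: one site in a GC-rich context, one in an AT-rich context.
SEQ = "GCGCGGCACCCGGCGCATATATGGGTGATATAT"

for start, strand, s in scan(SEQ, PWM, threshold=5.0):
    print(start, strand, s, round(flank_gc(SEQ, start, len(PWM), flank=6), 2))
```

Both hits score identically against the matrix, yet their flanks differ sharply in GC content; a scanner that rewards such context would rank them differently.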
The Complexity of Correctly Inferring the Effect of Motif Variation on TF-DNA Binding. Genetic variants that change a TF motif
often affect the binding ability of a TF to that site because of an
altered DNA binding affinity (Table 1 and Figure 1). Initial efforts
to computationally predict relevant regulatory variants simply
revolved around the consideration of all SNPs that overlap with
TF binding sites (Ameur et al., 2009; Chorley et al., 2008; Ponomarenko et al., 2001). However, given the degenerate nature of
binding motifs (i.e., binding is not binary but is variable depending on different sequences), these kinds of analyses tend not to
provide good sensitivity. A more refined approach in this regard
is to analyze the difference in DNA binding affinity (for example,
scored using a PWM) between two alleles, i.e., the reference and
the alternate impacted by the variant (Figure 1 and Table S1). The
greater this difference, the greater the predicted impact of the
variant on binding of the respective TF and thus also the greater
the likelihood of it being causal.
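The allele-comparison approach can be sketched in a few lines of Python. For each allele, the example takes the best PWM score among all windows that overlap the variant and reports the difference (alternate minus reference). The matrix values, the reference sequence, and the variant are invented for illustration, and only the forward strand is scanned to keep the sketch short.

```python
# Hypothetical log-odds PWM for a 5-bp motif (values invented for illustration).
PWM = [
    {"A": -1.8, "C": 1.6, "G": -1.8, "T": -1.8},
    {"A": 1.4, "C": -1.8, "G": -0.6, "T": -1.8},
    {"A": -1.8, "C": 1.6, "G": -1.8, "T": -1.8},
    {"A": -1.8, "C": 1.6, "G": -1.8, "T": -1.8},
    {"A": -1.8, "C": 1.4, "G": -1.8, "T": -0.6},
]

def pwm_score(window):
    """Sum of per-position log-odds for one window of motif length."""
    return sum(col[b] for col, b in zip(PWM, window))

def best_score_over_variant(seq, var_pos):
    """Best PWM score among windows overlapping position var_pos (0-based).
    Assumes the sequence is at least as long as the motif."""
    w = len(PWM)
    starts = range(max(0, var_pos - w + 1), min(var_pos, len(seq) - w) + 1)
    return max(pwm_score(seq[s:s + w]) for s in starts)

def differential_score(ref_seq, var_pos, alt_base):
    """Differential score: S_ALT minus S_REF for a single-nucleotide variant."""
    alt_seq = ref_seq[:var_pos] + alt_base + ref_seq[var_pos + 1:]
    return best_score_over_variant(alt_seq, var_pos) - best_score_over_variant(ref_seq, var_pos)

REF = "GGTACACCCTGA"  # hypothetical reference; the motif occupies positions 4-8
delta = differential_score(REF, var_pos=6, alt_base="G")  # C to G inside the motif
print(round(delta, 2))
```

A strongly negative value predicts weakened binding and a strongly positive one predicts a gained or strengthened site; where to set the cutoff is a separate question that this sketch does not address.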
More recent machine learning methods no longer depend on
the use of a strict motif database and directly infer regulatory
effects from k-mer vocabularies trained on ChIP-seq or other
experimental data. These vocabularies consist of all possible
DNA sequences of length k that collectively capture specific
sequence properties of certain regulatory elements such as
cell-type-specific enhancers. This methodological development
stems from the general appreciation in the field that the motif
alone cannot accurately predict differential DNA binding and
thus should be complemented (or even replaced) with information on the sequence environment around the focal variant, as
well as on other DNA or chromatin features that enhance the
model's overall predictive power (as already covered in the previous section). Indeed, it is now well accepted that only a minority
of motif-disrupting variants effectively result in altered DNA binding of the respective TF (Heinz et al., 2013; Kilpinen et al., 2013;
Maurano et al., 2015; Spivakov et al., 2012). One possible explanation is based on the finding that, across the genome, TF motifs
appear to occur in clusters with some built-in redundancy (Gotea
et al., 2010), in line with the observation that the sequence
environment of relevant TF binding sites tends to have a certain
similarity to the core motif (Dror et al., 2015). These clustered
sites may buffer genetic perturbations that affect one of the
motifs. Indeed, the greater the number of such homotypic motifs,
the greater the buffering effect (Kilpinen et al., 2013). Given
the pervasive nature of this buffering phenomenon (Maurano
et al., 2015), the failure to take such neighboring homotypic
motifs into account may result in false TF-DNA binding event
predictions.
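A toy version of the k-mer view of a variant is sketched below in Python. It counts all k-mers in the reference and alternate sequences, lists the k-mers whose counts change, and combines those changes with per-k-mer weights. In published methods such weights are learned from experimental data; here they are hypothetical numbers, as are the sequences, and contiguous rather than gapped k-mers are used for simplicity.

```python
from collections import Counter

def kmer_counts(seq, k):
    """Count every overlapping k-mer in a sequence."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_delta(ref_seq, alt_seq, k):
    """k-mers whose counts differ between alleles (alternate minus reference)."""
    ref, alt = kmer_counts(ref_seq, k), kmer_counts(alt_seq, k)
    return {kmer: alt[kmer] - ref[kmer]
            for kmer in set(ref) | set(alt) if alt[kmer] != ref[kmer]}

def weighted_delta(delta, weights):
    """Sum of weight times count change; k-mers without a weight contribute 0."""
    return sum(weights.get(kmer, 0.0) * change for kmer, change in delta.items())

REF = "TTCACCCAG"  # hypothetical reference sequence
ALT = "TTCAGCCAG"  # same sequence with a C to G substitution at position 4

# Hypothetical weights standing in for a trained k-mer model.
WEIGHTS = {"CACC": 1.5, "ACCC": 1.2, "CAGC": -0.3}

delta = kmer_delta(REF, ALT, k=4)
print(sorted(delta.items()))
print(round(weighted_delta(delta, WEIGHTS), 2))
```

Because a single-nucleotide variant alters up to k overlapping k-mers, its predicted effect depends on the surrounding sequence and not only on a single motif match, which is the property that motivates these vocabularies.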
More generally, epigenomic properties such as nucleosome
location (Soufi et al., 2015) or density (Barozzi et al., 2014) or
DNA methylation (Domcke et al., 2015) may impact the ability
of a TF to bind to a certain DNA sequence. Upon screening
1,300 human TFs for their ability to bind to one of 150 distinct
CpG-containing motifs, 47 were found to bind to DNA, with
several exhibiting methylation-specific binding specificities (Hu
et al., 2013). This is consistent with an earlier study demonstrating that the TF Kaiso is capable of binding not only to an
unmethylated motif, TCCTGCNA, but also to a methylated,
clearly distinct palindromic motif, TCTmCGmCGAGA, with
even greater affinity (Raghav et al., 2012). How frequently such
methylation-dependent changes in DNA binding occur and the
extent to which other DNA modifications affect DNA binding
specificities is still a matter of debate. Nevertheless, it is clear
that it adds another complexity in linking DNA variation to variable TF binding.
Ongoing computational studies are attempting to take these
complexities into account by implementing "big data" analyses
that build extended machine learning models relying on
multilayered information of different types of genomic data,
including TF motifs, DNase hypersensitivity sites (DHS), chromatin marks, etc. (see, for example, Table S1). As such, they
can recognize regulatory regions based not only on pure
sequence information, but also on the chromatin state of the
DNA both at the variant locus as well as at neighboring regions.
Once correctly trained, these approaches can be very precise
and predict causal variants and their effects at distinct molecular
levels (Alipanahi et al., 2015; Zhou and Troyanskaya, 2015)
(Figure 1).
However, it is important to emphasize that their performance
depends not only on the diversity of input data, but also on the
correct selection of relevant features. For example, it has been
repeatedly shown that it is crucial to gather data that are specific
to the variant-linked trait or disease in terms of cell type, differentiation stage, tissue, or species since regulatory activity is variable and context dependent (Consortium, 2012; Maurano
et al., 2015). Another limitation of these extended representation
models that may dampen their widespread implementation is
their inherent black box nature. Indeed, most of the binding
patterns that were unveiled by these techniques are difficult to
interpret, especially when no visual representation is provided.
However, despite these caveats, advanced models have the
potential of uncovering completely novel and potentially unexpected cross-mechanisms that more standard methodologies
may fail to grasp.
TF-DNA Binding Is Itself a Complex, Molecular Trait
We are currently limited in our ability to predict TF binding as well
as in our understanding of how genetic variation impacts on this
process. Nevertheless, there is general consensus that differential, regulatory control by TFs is a major driver of phenotypic variation. A key aspect of this regulatory variation is variable TF-DNA
binding. It is in this regard intriguing that only a minority of variable TF binding events are driven by nucleotide changes in the
motifs of the studied TFs. For example, upon assessing binding
variation of the TF NFKB in ten distinct human lymphoblastoid
cell lines (LCLs), only 79 out of >1,100 variable TF-DNA binding
events had a SNP directly located in the NFKB motif and induced
a binding difference that was consistent with its perceived
impact on motif quality (i.e., reduced binding was linked to a
SNP that lowered the PWM binding score and vice versa)
(Kasowski et al., 2010). One of the possible reasons that were
listed (next to LD or putative epigenomic variation) involved
trans-effects. However, ChIP-seq analyses of >20 TFs revealed
extensive, allele-specific DNA binding (in a constant trans
environment), effectively refuting this hypothesis (Reddy et al.,
2012). Subsequent studies in human LCLs and in cells or
tissues derived from distinct mouse strains observed a similar
pattern (Heinz et al., 2013; Kilpinen et al., 2013; Soccio et al.,
2015; Stefflova et al., 2013), collectively emphasizing the importance of cis-regulatory variation. Importantly, only a minority
of differential allelic occupancy events involved nucleotide
changes in the respective motifs (Reddy et al., 2012). However,
this does not mean that variation in the motifs of other TFs
should also be dismissed as a possible molecular mechanism for these observations. Quite the contrary, in fact, as we
will clarify in greater detail in the next paragraphs (see also
Figure 2).
Figure 2. Distinct Modes of Genetic Variation-Mediated Changes in TF-DNA Binding
(A) Only a minority of variable TF-DNA binding events are caused by DNA variants disrupting the cognate TF recognition motif.
(B-D) The majority of variable binding events are motif variation independent, signifying that a variant located either proximally (<200 bp, B and C) or distally (D) to the
focal motif affects the binding of the respective TF. Proximal variants can affect local cooperative DNA binding (B), which involves physical protein-protein interactions that require overlapping or very closely located (a few bp) motifs, or collaborative DNA binding (C), which reflects TF interdependencies needed, for
example, to compete with nucleosomes and thus to access DNA. In contrast, distal variants (D) may alter chromatin state or conformation (e.g., DNA loops), which
could affect the stability of interactions with DNA and between TFs.
If a particular genetic variant does not affect the motif of the
studied TF, what then causes the respective TF to exhibit differential DNA binding? It appears that an important fraction (at least
7.5% according to our own estimate [Kilpinen et al., 2013]) of
variable TF-DNA binding events can be explained by alterations
of proximal motifs (Reddy et al., 2012). Thus, at some genomic
sites, TFs appear to be dependent on the proximal presence of
other TFs to bind to DNA. Qualitative motif analysis combined
with prior knowledge about the biological process in which the
focal TF is operational lends credibility to this notion. For
example, in mouse white adipose tissue, PPARγ binding sites
that vary between strains and do not harbor an altered PPARγ
motif were analyzed for enriched, polymorphic motifs. The top-scoring motifs corresponded to the TFs CEBPα and glucocorticoid receptor (Soccio et al., 2015), which exhibit extensive
co-localization with PPARγ in mature white fat cells (Siersbæk
et al., 2014). Similarly, differential PU.1 binding correlated with
alterations in the motifs for the TFs CEBP and AP-1, which
modulate macrophage activity (Heinz et al., 2013). However,
this correlation appears to differ according to macrophage subtype. Indeed, a follow-up study in mouse microglia revealed that
other TF motifs correlate better with variable PU.1 DNA binding,
emphasizing the importance of cellular context in determining
this type of TF interactions (Gosselin et al., 2014). Together,
these studies strongly support the notion of pervasive DNA binding whose occurrence is dependent on the presence of other
TFs. Since it is well appreciated that regulatory regions tend to
harbor binding sites for multiple TFs, this notion may not be
entirely surprising. Nevertheless, it is worthwhile in the current
context of genetic variation to briefly revisit this mode of DNA
binding, which is interchangeably called cooperative or collaborative DNA binding (Gosselin et al., 2014; Mirny, 2010; Slattery
et al., 2014; Waszak et al., 2015). We would thereby like to argue
that, for the sake of discussion and molecular understanding, it
might be valuable to differentiate between these two terms
(Figure 2).
Local, Cooperative TF-DNA Binding
In the context of protein-DNA interactions, cooperativity was
initially used in describing the assembly of E. coli lambda repressors on DNA (Ptashne et al., 1980). Binding of a lambda
dimer on a first operator site facilitates binding of another
lambda dimer on the second operator site, given that physical
interactions between the first and second dimer increase the
affinity of the latter for DNA, which explains why cooperative
DNA binding is evoked to define this process. Consequently,
the term cooperativity may be especially suited for DNA binding
processes that involve TFs whose physical interactions at the
protein level may increase the affinity of the entire complex to
specific sites in the genome. For example, binding of the winged
HTH DNA binding domain-containing TF IRF4 is cooperatively
enhanced by the TF PU.1 (Escalante et al., 2002). This is
because binding of the two TFs contorts the DNA in a peculiar
S shape, placing the TFs in an optimal position for electrostatic
and hydrophobic interactions and thus stabilizing the entire
complex (Escalante et al., 2002). Consequently, individual
nucleotide alterations in one of the two binding sites may alter
the extent of cooperativity between two heterodimerizing TFs,
as has recently been quantified for the PPARγ-RXRα heterodimer (Isakova et al., 2016). This, in turn, illuminates why the
disruption of either of the two TF motifs tends to affect binding
of the respective heterodimer.
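The logic of cooperative binding can be illustrated with a textbook two-site equilibrium model, sketched below in Python. The function name, the concentrations, the dissociation constants, and the cooperativity factor omega are hypothetical values chosen for illustration and are not taken from the cited studies.

```python
def site_occupancies(c1, c2, k1, k2, omega):
    """Equilibrium occupancy of two adjacent binding sites.

    c1, c2 : free concentrations of TF1 and TF2
    k1, k2 : dissociation constants of the two sites (same units as c1, c2)
    omega  : cooperativity factor; omega > 1 means that a protein-protein
             contact stabilizes the doubly bound state
    """
    w1 = c1 / k1              # weight of the state with only site 1 bound
    w2 = c2 / k2              # weight of the state with only site 2 bound
    w12 = omega * w1 * w2     # weight of the state with both sites bound
    z = 1.0 + w1 + w2 + w12   # partition function over the four states
    return (w1 + w12) / z, (w2 + w12) / z


# Hypothetical variant that weakens site 1 twenty-fold (k1: 1 -> 20).
_, occ2_ref = site_occupancies(1.0, 1.0, k1=1.0, k2=10.0, omega=50.0)
_, occ2_alt = site_occupancies(1.0, 1.0, k1=20.0, k2=10.0, omega=50.0)
print(round(occ2_ref, 3), round(occ2_alt, 3))
```

With omega set to 1, the occupancy of site 2 reduces to (c2/k2)/(1 + c2/k2) and no longer depends on site 1. Under this model, a variant in one motif therefore alters binding at the neighboring motif only when the two TFs interact.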
Proximal, Collaborative TF-DNA Binding
For TFs to physically interact on DNA and thus for cooperative
DNA binding to occur, one would intuitively expect that the
respective motifs would be located very close to one another
or would even overlap. However, DNA binding relationships exist
between TFs whose motifs are separated tens, hundreds, or
even thousands of base pairs from one another. For example,
upon examining which TFs (based on motif matches) associated
with NFKB binding enrichment (based on ChIP-seq data), EBF1
and STAT1 were among the most correlated TFs (Karczewski
et al., 2011). Interestingly, this covariation signal of EBF1 and
STAT1 motifs within variable binding regions of the TF NFKB remained significant up to 500 bp from the NFKB binding peak center, suggesting that DNA binding dependencies between these
TFs were maintained over a relatively long distance. It is now
increasingly appreciated that many such dependencies do not
require direct contacts but instead reflect a relatively well-understood phenomenon termed collaborative DNA binding in which
two or more TFs compete with a nucleosome to access DNA
(Biggin, 2011; Mirny, 2010; Spitz and Furlong, 2012) (Figure 2).
Given that the intrinsic affinity of a nucleosome for DNA is
much greater than that of a TF alone (Polach and Widom,
1996), it may often require two or more collaborating TFs to
displace the nucleosome. In this scenario, TFs would be mutually
dependent, and this is indeed what is observed. For example,
HNF4A and CEBPα functioning in mouse liver exhibit a mutual
dependency, given that loss of HNF4A affected CEBPα DNA
binding and vice versa, whereas the absence of HNF4A did not
impact on the DNA binding dependency between CEBPα
and FOXA1 (Stefflova et al., 2013). A similar DNA binding interdependency was found between PU.1 and CEBPα in primary
macrophages (Heinz et al., 2013). It is worth noting that such
nucleosome-mediated, collaborative DNA binding could still be
regarded as a form of indirect, cooperative DNA binding, as
modeling has revealed an analogy between this process and
the one involving cooperative binding of oxygen to hemoglobin
(Mirny, 2010). Nevertheless, to avoid confusion, it may be best
to continue to define this process as collaborative DNA binding.
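The hemoglobin analogy can be made tangible with a two-state model of the Monod-Wyman-Changeux type, sketched below in Python. The sketch assumes that a DNA segment is either nucleosomal or open, that TFs bind more tightly to the open state, and, for simplicity, that all TFs share one concentration and one affinity per state. The function name and every parameter value are hypothetical, and the sketch is not the published model itself.

```python
def tf_occupancy(conc, n_sites, k_open=1.0, k_nuc=100.0, l_nuc=1000.0):
    """Mean per-site TF occupancy in a two-state (open vs. nucleosomal) model.

    conc    : free TF concentration (same units as k_open and k_nuc)
    n_sites : number of TF motifs within the nucleosome footprint
    k_open  : dissociation constant for a site on open DNA
    k_nuc   : dissociation constant for a site on nucleosomal DNA (weaker)
    l_nuc   : nucleosomal-to-open equilibrium ratio in the absence of TFs
    """
    a = conc / k_open
    b = conc / k_nuc
    # Partition function: all binding configurations in each of the two states.
    z = (1 + a) ** n_sites + l_nuc * (1 + b) ** n_sites
    # Weight of configurations in which one given site is bound.
    bound = a * (1 + a) ** (n_sites - 1) + l_nuc * b * (1 + b) ** (n_sites - 1)
    return bound / z


for n in (1, 2, 3):
    print(n, round(tf_occupancy(10.0, n), 3))
```

With these illustrative parameters, the occupancy of each site rises from roughly 0.10 with one motif to 0.50 with three. Removing a single motif therefore lowers the occupancy of the remaining sites even though the TFs never touch one another.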
Interestingly, several of these collaborating TFs have previously been defined as pioneer TFs that are uniquely able to access and open silent or compacted chromatin (Iwafuchi-Doi
and Zaret, 2014). The fact that, at a wide range of loci, they are
nevertheless dependent on other TFs to access DNA constitutes
in this regard an intriguing paradox. For example, FOXA1 is
defined as an archetypical pioneer TF (Mancini and West,
2015), given its ability to open closed chromatin by binding to
DNA with its core DNA binding domain and to core histones
with a binding motif that is located in its C terminus (Cirillo
et al., 2002). However, its DNA binding interdependency with
CEBPα or potentially other TFs suggests that FOXA1's pioneering ability may often not be sufficient to allow DNA binding,
implying that FOXA1 requires the cumulative contribution of
other TFs to displace nucleosomes and successfully unlock
chromatin. This model may be consistent with the dispensability
of FOXA1 and the related TF FOXA2 in maintaining the chromatin
state in liver cells (Li et al., 2012). Similarly, PU.1 is also recognized as a pioneer factor for its ability to promote nucleosome
depletion (Barozzi et al., 2014; Heinz et al., 2010), yet it depends
on CEBP TFs to bind DNA at many genomic sites. What emerges
is that, at some loci, these TFs may act as true pioneer TFs,
whereas at others, they may require collaborations with other
TFs to open chromatin.
Consequently, it is of interest to better understand what distinguishes genomic sites with pioneer activity from collaborative
ones. An interesting observation in this regard is that regions
with high PU.1 occupancy in primary macrophages had, in general, similar motif scores to those with lower PU.1 binding, but
the two types of regions differed in nucleosome organization
(Barozzi et al., 2014). Specifically, the latter regions were surrounded by two nucleosomes (in contrast to sites with high
PU.1 binding) and showed enrichment for the NFKB motif, suggesting that, at those regions, PU.1 and NFKB need to collaborate to outcompete nucleosomes and thus to achieve high
DNA occupancy. As such, TFs may have locus-dependent TF interdependencies reflecting both nucleosome structure and the
presence of distinct TF motif clusters, consistent with the dependency of PU.1 on NFKB at some sites or either OCT2, BLIMP1,
or STAT2 at others in human LCLs (Kilpinen et al., 2013). This
view is also consistent with the flexible binding site grammar
that is typically observed in enhancers in that the position of individual TF motifs within enhancers tends to be of secondary
importance to their simple presence (Arnosti and Kulkarni,
2005). In other words, since collaborative DNA binding does
not require physical contacts, the spacing and orientation of motifs can be flexible with respect to preserving enhancer activity,
as long as the motifs are intact. This also implies that, at collaborative genomic sites, TFs should in principle bind to DNA in a
seemingly joint fashion since loss of one TF-DNA interaction
(either in cis [e.g., because of a DNA mutation] or in trans
[because of TF dysfunction]) would reduce the binding capacity
of all other TFs at this locus. Such collective DNA binding
behavior has indeed been observed for TFs mediating heart
development in Drosophila melanogaster (Junion et al., 2012),
and evidence for simultaneous, collaborative TF-DNA binding
is also available for mammals (Adam et al., 2015; Siersbæk
et al., 2014; Tijssen et al., 2011). A final illustration of this important notion is the dependency of the pioneer TF NRF1 (Sherwood
et al., 2014) on other TFs to keep a specific set of its target sites
from being methylated, which otherwise would block NRF1 DNA
binding to these sites (Domcke et al., 2015). This example again
illuminates how sequence context may affect the ability of a TF to
bind independently to DNA, even if this TF may normally act as a
pioneer factor. In sum, based on the currently available data,
care needs to be taken when classifying TFs into specific categories without considering sequence and chromatin context.
Variable Chromatin Modules Mediate Long-Range
TF-DNA Binding Interdependence
The previous sections highlighted that proximal variants can
affect DNA binding through cooperative or collaborative mechanisms. However, many of the variants that drive TF-DNA binding
variation are located beyond the sequence span that is required
for the formation of local TF-TF interactions or for competition
with local nucleosomes, respectively. One possibility is that proteins overcome this distance restraint by inducing DNA looping
through physical interactions. Even though this is an energetically costly process (Saiz and Vilar, 2006), both short- and
long-range looping have now been extensively documented
(de Wit et al., 2013; Gheldof et al., 2010; Lieberman-Aiden
et al., 2009; Rao et al., 2014; Saiz and Vilar, 2006) and may
play an important yet poorly understood role in mediating long-distance TF-DNA binding interdependencies.
It is therefore valuable to explore approaches to study the molecular origin of both short- and especially long-range TF-DNA
binding variation. One such approach is the identification of genetic polymorphisms that significantly correlate with changes in
DNA occupancy. Genomic regions in which such variants are
located are interchangeably termed TF or binding quantitative
trait loci (tfQTLs or bQTLs), as their detection suggests that a
polymorphism within this locus causally affects the ability of a
TF to bind to DNA. One study adopting this approach aimed to
identify variants that affect DNA binding of the insulator protein
CTCF by profiling its binding landscape in human LCLs using
ChIP-seq, after which tfQTLs were explored within a 50 kb region
centered around the CTCF binding region (Ding et al., 2014).
Only a minority of detected tfQTLs overlapped the CTCF motif,
even when the local LD structure was taken into account. A
similar picture emerged from a comparable study on PU.1
DNA binding variation also in human LCLs, since PU.1 tfQTLs exhibited a bimodal log-normal distribution in terms of their distance to the PU.1 binding region (Waszak et al., 2015). The first
mode represented tfQTLs that were located close to or at the
PU.1 binding site and, consistent with the CTCF study, encompassed only a minority of the significantly associated variants.
The second mode featured tfQTLs that were located distally to
the PU.1 binding region with a median distance between 20
and 30 kb. Together, these findings suggest that many variable
CTCF or PU.1 binding events are driven by long-distance mechanisms, which renders TF-DNA binding a complex molecular
trait by itself. Consequently, and even though the effect size of
distal tfQTLs tends to be smaller than that of proximal ones (Waszak
et al., 2015), it will be valuable to decipher how these distal variants affect TF-DNA binding, given that they constitute the majority of DNA binding QTLs.
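The basic idea of a tfQTL scan can be sketched in a few lines of Python: for every variant within a window centered on a binding region, TF occupancy across individuals is regressed on allele dosage. The toy genotypes, occupancy values, positions, and function names below are hypothetical. Real analyses additionally normalize the data, correct for covariates and population structure, and control for multiple testing, none of which is attempted here.

```python
import math


def genotype_association(dosages, occupancy):
    """Least-squares slope and Pearson r of occupancy on allele dosage (0/1/2)."""
    n = len(dosages)
    mean_g = sum(dosages) / n
    mean_y = sum(occupancy) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((y - mean_y) ** 2 for y in occupancy)
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, occupancy))
    if sxx == 0 or syy == 0:  # monomorphic variant or constant signal
        return 0.0, 0.0
    return sxy / sxx, sxy / math.sqrt(sxx * syy)


def scan_window(peak_center, variants, occupancy, window=50_000):
    """Test all variants within a window (in bp) centered on the binding region."""
    half = window // 2
    results = []
    for pos, dosages in variants:
        if abs(pos - peak_center) <= half:
            slope, r = genotype_association(dosages, occupancy)
            results.append((pos, pos - peak_center, slope, r))
    # Strongest absolute correlation first.
    return sorted(results, key=lambda item: -abs(item[3]))


# Toy data for six individuals (hypothetical values).
PEAK_CENTER = 100_000
OCCUPANCY = [1.0, 1.2, 2.1, 1.9, 3.0, 3.1]
VARIANTS = [
    (100_020, [0, 0, 1, 1, 2, 2]),  # proximal variant
    (122_000, [1, 0, 1, 0, 1, 0]),  # distal variant inside the window
    (200_000, [0, 0, 1, 1, 2, 2]),  # outside the window, not tested
]
for pos, distance, slope, r in scan_window(PEAK_CENTER, VARIANTS, OCCUPANCY):
    print(pos, distance, round(slope, 3), round(r, 3))
```

Collecting the distance of each significant variant to the binding region across many such scans is what yields distance distributions of the kind described above for PU.1.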
In the LCLs, PU.1 binding variation often correlated with variation in active chromatin marks such as H3K4me1 or H3K27ac
not only locally, but often over extended distances (Waszak
et al., 2015). That is, high PU.1 DNA occupancy coincided with
both high proximal and distal H3K4me1 and H3K27ac enrichment and vice versa. Such regions with a high level of molecular
coordination between TF and chromatin marks have recently
been termed variable chromatin modules (VCMs; Waszak
et al., 2015; Figure 3A). Each VCM is thus composed of molecular phenotypes (e.g., the level of DNA occupancy by a TF or
enrichment for a specific chromatin mark) that are highly coordinated, often over multiple kbp of DNA. More than 14,000 distinct
VCMs were discovered in human LCLs, covering about 5% of
the genome (Waszak et al., 2015). The majority of these were "totem" VCMs, so named because they were composed of
stacked or overlapping molecular phenotypes that did not correlate with other neighboring molecular phenotypes. Thus, a totem
VCM represents local chromatin state variation (Figure 3A). The
remaining "multi-VCMs" are more interesting since, while a
minority, they typically cover two or more distinct regulatory elements (hence the term "multi") and capture the majority of all
detected molecular phenotypes (Figure 3A). The origin of a
multi-VCM is less intuitive than that of a totem VCM. Its structure suggests, however, a higher-order chromatin organization
that is reminiscent of the modular genomic structure that has
been uncovered in the form of topologically associating domains
(TADs) (Dixon et al., 2012; Nora et al., 2012).
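One simple way to picture the distinction between totem and multi-VCMs is to treat molecular phenotypes as nodes in a correlation network, as in the Python sketch below. Phenotypes whose signal across individuals correlates above a threshold are linked, connected groups are reported as modules, and a module is labeled "totem" if all of its phenotypes sit on one regulatory element and "multi" otherwise. The threshold, the element labels, the toy values, and the function names are hypothetical, and the sketch is not the pipeline used to define VCMs in the cited study.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def find_modules(phenotypes, min_r=0.8):
    """Group correlated phenotypes; phenotypes maps name -> (element, values)."""
    parent = {name: name for name in phenotypes}

    def find(name):  # union-find root lookup with path compression
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for a, b in combinations(phenotypes, 2):
        if pearson(phenotypes[a][1], phenotypes[b][1]) >= min_r:
            parent[find(a)] = find(b)

    groups = {}
    for name in phenotypes:
        groups.setdefault(find(name), []).append(name)

    modules = []
    for members in groups.values():
        if len(members) < 2:  # uncorrelated phenotypes form no module
            continue
        elements = {phenotypes[m][0] for m in members}
        kind = "totem" if len(elements) == 1 else "multi"
        modules.append((kind, sorted(members)))
    return sorted(modules)


# Toy signal for five individuals at four regulatory elements (E1 to E4).
PHENOTYPES = {
    "PU1@E1": ("E1", [1.0, 2.0, 3.0, 4.0, 5.0]),
    "H3K27ac@E1": ("E1", [1.1, 2.0, 2.9, 4.2, 5.1]),
    "H3K4me1@E2": ("E2", [2.0, 4.0, 6.1, 8.0, 9.9]),
    "PU1@E3": ("E3", [5.0, 1.0, 4.0, 2.0, 3.0]),
    "H3K27ac@E3": ("E3", [5.2, 1.1, 3.9, 2.2, 3.0]),
    "H3K4me1@E4": ("E4", [3.0, 3.1, 1.0, 4.0, 2.0]),
}
for kind, members in find_modules(PHENOTYPES):
    print(kind, members)
```

In the toy data, the phenotypes at E1 and E2 vary together and form a multi-module, the two phenotypes at E3 form a totem module, and the phenotype at E4 remains unassigned.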
These TADs constitute distinct, three-dimensional genomic
structures in which sequences are more likely to interact with
one another than with those located outside the respective
TAD. However, VCMs and TADs constitute different molecular
entities because VCMs tend to be embedded within TADs and
thus tend to be smaller (Waszak et al., 2015) (Figure 3B). In addition, TADs are relatively stable across cell types and during
development and are even conserved across species (Dixon
et al., 2012; Vietri Rudan et al., 2015), whereas VCMs are by definition variable. As such, multi-VCMs correspond conceptually
better to sub-TADs, which are more fine-grained (sub-Mb),
genomic topologies that have been shown to be dynamic across
cellular differentiation (Dixon et al., 2015; Phillips-Cremins et al.,
2013) and to even differ between individual cells (Giorgetti et al.,
2014). In addition, sub-TADs have been suggested to define cis-regulatory networks (Berlivet et al., 2013), with their internal
conformational dynamics being directly related to embedded
transcriptional activity (Giorgetti et al., 2014; Tang et al., 2015).
In parallel, the vast majority of gene-associated multi-VCMs exhibited a molecular activity state that significantly correlated with
the transcriptional activity of the included gene(s) (Waszak et al.,
2015) (Figure 3B). Moreover, the more regulatory elements
encompassed in a VCM, the more likely it was to associate
with variable gene expression. Together, the conceptual similarities between sub-TADs and multi-VCMs suggest that the latter
also reflect fine-grained configurations of interacting regulatory
elements with one or a few target genes whose collective, molecular activity is highly coordinated. As such, VCMs may
provide substantial insights into the structural and thus modular
organization of the chromatin landscape, including TF-DNA
interactions.
Which mechanisms lie at the origin of multi-VCMs? Since the
long-range molecular coordination that typifies multi-VCMs has
been observed at the allelic level (Kasowski et al., 2013; Kilpinen
et al., 2013; McVicker et al., 2013) and since recent chromatin
interaction analysis by paired-end tag sequencing (ChIA-PET)
data has also provided evidence for allele-specific chromatin topologies (Tang et al., 2015), it is reasonable to assume that the
observed molecular variation is largely driven by genetic factors.
Moreover, most of the molecular variation within each VCM
could be captured by a single, quantitative phenotype (Waszak
et al., 2015), which suggests that the activity state of a VCM
can be attributed to relatively few but strong causal variants.
QTL mapping using the activity state of each VCM as input
yielded vcmQTLs that were highly enriched in TF-occupied regions (Waszak et al., 2015) (Figure 3A). Together with previous
observations that TF-DNA binding perturbations are initiating
drivers of downstream changes in chromatin state and gene
expression (Kasowski et al., 2013; Kilpinen et al., 2013; McVicker
et al., 2013) (Table 1), these findings support a model in which the
alteration of one or a few TF binding events affects all molecular
phenotypes in the respective VCM, including other embedded
TF-DNA interactions (Figure 3B). Thus, the ability of a TF to
bind to a VCM-associated genomic region appears to be a function of the respective VCM's activity state, which itself seems
determined by one or few key TFs. As such, VCMs provide a conceptual framework to rationalize how distal genetic variation can
affect TF-DNA binding.
Figure 3. Variable Chromatin Modules
(A) Correlated TF (e.g., PU.1 or RNA polymerase II [PolII]) binding and chromatin mark (e.g., H3K27ac, H3K4me1, H3K4me3) enrichment analyses across individuals allow the mapping of variable chromatin modules (VCMs) (shown in light green in the upper panel and in network format in the panel below). VCMs
thus embody variable regions with highly coordinated, molecular phenotypes.
(B) The majority of VCMs have a totem structure of stacked molecular phenotypes that do not correlate with other neighboring molecular phenotypes and, as
such, reflect local chromatin state variation. Multi-VCMs encompass sub-Mb regions involving distinct regulatory elements whose activity is highly coordinated
and driven by a single or a few highly penetrating variants (vcmQTL) with enrichment in TF-bound regions.
(C) VCMs constitute functional entities of higher-order chromatin organization embedded within topologically associating domains (TADs) and provide a molecular rationale as to how TF-DNA binding can be affected by distal genetic variation.
Defining these key TFs remains a work in progress, since only
few TFs reached significant enrichment in terms of their overlap
with vcmQTLs (Waszak et al., 2015). This suggests that each
VCM may have its own set of activity-determining TFs. Interestingly, distinct pairs of these same TFs were also enriched at pairs
of regulatory elements that belonged to the same VCM (Waszak
et al., 2015), suggesting that the functional interactions between
these TFs (or among themselves) may be instrumental for forming VCMs. Together, the presented findings support a scenario
in which the activity of each VCM is driven by a set of cell- and
chromatin-context-specific TFs. This would be consistent with
TF-DNA binding being highly dependent on proximal sequence
environment and chromatin organization, which may differ
from one VCM to the next. In addition, it would be compatible
with the "multiple enhancer variant" hypothesis (Corradin
et al., 2014), which posits that linked variants in distinct regulatory elements often jointly contribute to gene expression variation. The VCM landscape may, as such, also be compatible with
the LD structure of the genome.
What is the molecular nature of VCM-embedded TF-TF interactions? Based on the conceptual similarity between VCMs and
sub-TADs and on how canonical enhancer-promoter interactions are established (Ciabrelli and Cavalli, 2015), it is conceivable that they are mediated by either direct physical contacts
or by indirect protein-protein interactions involving more generic
factors such as mediator, CTCF, and cohesins (Dekker and
Mirny, 2016). The latter proteins may function to stabilize the
interactions both with DNA and between TFs such that
distal DNA binding interdependencies arise. However, other
interaction-independent mechanisms could also underwrite
such interdependencies, including long-range, transcription-,
or repression-coupled chromatin remodeling processes (Hathaway et al., 2012; Smolle and Workman, 2013). Further experimentation will be required to elucidate the involvement and
contributions of key individual TFs or TF pairs in VCM formation.
From Causal Variant to Complex Phenotype
While the identification and characterization of a trait- or disease-associated variant that causally disrupts TF-DNA binding
is difficult, elucidating how it impacts on other potentially downstream molecular and biological processes may be equally if not
more challenging (Edwards et al., 2013). One intuitive strategy to
expand on the relatively few cases so far in which a causal relationship between molecular and phenotypic variation was established (Table 1) is the integration of other genetic or molecular
data to infer the functional consequences of the focal variant.
For example, distinct QTL datasets can be used to determine
whether the variant impacts not only TF-DNA binding, but also
the chromatin landscape, gene expression, or even other molecular phenotypes (Pai et al., 2015). The most common molecular
QTL analysis, involving the identification of variants that associate with gene expression changes (i.e., eQTLs), is highly informative in this regard. For six distinct human populations, the
most significant eQTLs were consistently found to overlap TF
binding sites (Auton et al., 2015), thus providing direct insights
into the identity of genes whose expression may be affected
by variable TF-DNA binding.
However, other layers of molecular phenotypes, more associated with regulatory functions and therefore often defined as
"regulatory QTLs," can also be associated with genotypes. These
include: (1) DNase I sensitivity (ds)QTLs that are strongly enriched in predicted TF binding sites in addition to being major determinants of gene expression variation (Degner et al., 2012);
(2) chromatin (c)QTLs or histone marks (hm)QTLs that are largely
concordant with TF-DNA binding and transcription (Grubert
et al., 2015; Waszak et al., 2015), and (3) methylation (m)QTLs
that also often exhibit a functional link with the other regulatory
QTLs (Banovich et al., 2014; Domcke et al., 2015; Gutierrez-Arcelus et al., 2013; Heyn et al., 2013; McClay et al., 2015). Since
regulatory QTLs as well as eQTLs were found to be enriched in
complex trait or disease susceptibility variants (Albert and Kruglyak, 2015; Grubert et al., 2015; Nicolae et al., 2010; Waszak
et al., 2015), their joint analysis may reveal how specific perturbations triggered by causal genetic variants in a certain condition
or environment may first spread through transcriptional and
other molecular networks before affecting the cellular, tissue,
and finally organismal networks (Lehner, 2013; Mackay et al.,
2009).
An intriguing observation that emerged from such analyses is
that many regulatory QTLs do not overlap eQTLs, even if they
overlap other types of regulatory QTLs (Degner et al., 2012; Grubert et al., 2015; Waszak et al., 2015). This is consistent with the
well-established finding that many changes in TF-DNA binding
have no measurable effect on gene expression (Cusanovich
et al., 2014; Farnham, 2009). Thus, regulatory QTL analyses suffer from the same limitations as complex trait or disease susceptibility GWAS studies, i.e., difficulties in uncovering leading
causal variants among LD blocks or in reaching statistical significance without a high number of samples (Veyrieras et al., 2008).
Approaches that link chromatin organization to transcriptional
function such as ChIA-PET (Dowen et al., 2014; Tang et al.,
2015) or VCM mapping (Waszak et al., 2015) may in this regard
prove valuable, as they can provide a structural framework for interpreting regulatory variation. Indeed, as regulatory variants
tend to impact different layers of molecular phenotypes, it is
intrinsically valuable to know how these layers are coordinated
across distinct genomic domains. For example, many VCMs
were identified that consisted of active chromatin marks as
well as TF binding sites, even though an important portion of
such VCMs did not vary along with the expression of neighboring
genes. These VCMs, termed island VCMs (Waszak et al.,
2015), thus represent coordinated changes in TF binding and
chromatin state without measurable impact on gene expression.
Accordingly, QTLs for such island VCMs tend to overlap with
tfQTLs and cQTLs, but not with eQTLs. There are several complementary hypotheses that could explain the existence of such
island VCMs, including (1) futile regulatory activity without
transcriptional consequences (Cusanovich et al., 2014; Farnham, 2009; Wasserman and Sandelin, 2004); (2) regulatory
redundancy, which prevents a gene-specific regulatory network
from collapsing even if one node or edge is impacted (Pai et al.,
2015), consistent with the shadow enhancer concept (Hong
et al., 2008); (3) regulatory regions that are not transcriptionally
operational, at least in the studied condition/cellular environment, which implies that the activity of these regions is tissue
specific. Indeed, if, in a hypothetical study, a complex trait-associated regulatory variant would be linked to an island VCM, it
might indicate that an incorrect system or context is being
studied, as its disconnection with gene expression is unlikely
to yield a cellular or organismal phenotype. This reasoning is
consistent with the observation across several studies that
GWAS variants tend to be most enriched for eQTLs in tissues
that are relevant to the phenotype (Emilsson et al., 2008; Nica
et al., 2010; Torres et al., 2014). However, in most cases, the
causal variants are obviously unknown a priori. To identify
them, it may prove valuable to map VCMs, as has been done for eQTLs, in
as many distinct cell types/tissues as possible. The resulting
set of VCMs may then provide guidance to both variant identification and characterization. Indeed, the most interesting candidates among the set of associated GWAS variants would be
those that impact not only on the chromatin topology (e.g.,
vcmQTLs) and state of the respective locus (e.g., cQTLs or
tfQTLs), but also on expression of the embedded gene. Once
identified, it should be relatively straightforward to disentangle the
underlying molecular mechanisms since the coordinated, molecular phenotypes that make up the focal VCM should provide
clear insights into the flow of regulatory information, i.e., from
causal nucleotide over gene to ultimately cellular or organismal
phenotype.
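As a purely illustrative sketch of the prioritization logic described above, the following Python snippet keeps only those GWAS variants that are supported at all three layers: chromatin topology (vcmQTL), chromatin state or TF binding (cQTL or tfQTL), and expression (eQTL). The function name, the argument names, and the variant identifiers are hypothetical and are not taken from any study; real analyses would additionally need to handle linkage disequilibrium, effect sizes, and cell-type matching.

```python
def prioritize_variants(gwas_variants, vcm_qtls, c_qtls, tf_qtls, e_qtls):
    """Return GWAS variants supported at all three layers, sorted by ID.

    All arguments are sets of variant identifiers (hypothetical inputs).
    """
    topology = gwas_variants & vcm_qtls          # linked to a VCM (chromatin topology)
    state = gwas_variants & (c_qtls | tf_qtls)   # affects chromatin state or TF binding
    expression = gwas_variants & e_qtls          # affects expression of a gene
    return sorted(topology & state & expression)


# Made-up identifiers, for illustration only.
gwas = {"var1", "var2", "var3", "var4"}
vcm = {"var1", "var2", "var4"}
cq = {"var1"}
tf = {"var2", "var3"}
eq = {"var1", "var3", "var4"}

candidates = prioritize_variants(gwas, vcm, cq, tf, eq)
print(candidates)
```

In this toy input, a variant such as "var2" (vcmQTL and tfQTL, but not eQTL) behaves like a variant linked to an island VCM and is therefore excluded from the candidate list.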
Conclusions
The fundamental discovery that most complex trait-associated
variants are located in non-coding, putatively regulatory regions
of the genome has focused the spotlight on TF-DNA interactions
as important mediators of phenotypic variation. Yet, to date,
relatively few examples are available in which a clear mechanistic relationship between TF-DNA binding variation and phenotypic variation was established (Table 1). To clarify why this is
such a challenging task, we focused in this Review on elucidating
how the impact of genetic variation on TF-DNA binding can be
assessed and why, contrary to expectations, this is itself already
inherently complex. There are several current limitations that will
have to be addressed to improve our ability to identify and interpret regulatory variation, including the need for new experimental or computational approaches that will enable us to
expand the TF motif catalog, to better predict genuine TF binding
sites, and to evaluate how motif variation affects TF-DNA binding. Promising research avenues in this regard include the development of new technologies to characterize monomeric and
higher complex TF-DNA binding properties and the incorporation of additional DNA binding features such as the sequence
environment and the conformational and chemical nature of
DNA in machine learning approaches. In addition, it is increasingly appreciated that the chromatin context needs to be accounted for when searching for causal, regulatory variants and
that, in general, the use of cell types or systems that are most
relevant for the studied trait or disease will yield the best results.
It is also important to recognize that only a small fraction of all
variable TF-DNA binding events is actually driven by variation
within the motif of the studied TF. Thus, similar to gene expression, TF-DNA binding is a complex molecular trait by itself, which
has profound implications for our understanding of how regulatory variation arises.
Well-established concepts in the gene regulation field provide
an intuitive molecular foundation for local or proximal variant-driven DNA binding variation. Specifically, the former involves
cooperative DNA binding that is mediated by direct, physical
interactions between TFs, while the latter appears to be driven
by collaborative DNA binding that is likely reflective of sequence- or chromatin-context-conditioned TF interdependencies to
displace nucleosomes and open chromatin. However, the mechanisms that underlie distal variant-driven DNA binding changes
are much less well understood (Figure 2). The identification of
3C-, ChIA-PET-, or VCM-based chromatin entities that link
structural information to transcriptional function is important in
this regard since they offer a molecular rationale to explain these
prevalent, long-range DNA binding dependencies. Sustained
efforts will therefore be required to unravel the modular structure
of the variable (epi)genome across a wide range of cells or
tissues. Thus, although many challenges remain, exciting progress is being made in elucidating the genetic basis of TF-DNA
binding variation that will undoubtedly improve our ability to
achieve a nucleotide-level understanding of the molecular mechanisms underlying many complex traits, including disease
susceptibility.
SUPPLEMENTAL INFORMATION
Supplemental Information includes one table and is available with this article
online at http://dx.doi.org/10.1016/j.cell.2016.07.012.
ACKNOWLEDGMENTS
We thank Richard Benton (University of Lausanne), Sebastian Waszak (EMBL),
Alina Isakova, Antonio Meireles-Filho, Petra Schwalie, and other members of
the Deplancke Laboratory, as well as the anonymous reviewers for useful comments on the manuscript. We would also like to acknowledge scientific discussions with all members of the "Effect of sequence variation on chromatin
structure and transcription" Sinergia Consortium (i.e., the Reymond and Hernandez Laboratories [UNIL] and the Dermitzakis Laboratory [University of
Geneva]). This work was supported by the Swiss National Science Foundation
grant CRSI33_130326, by SystemsX.ch (AgingX, 51RTP0_151019), and by
institutional support from the Swiss Federal Institute of Technology in Lausanne (EPFL).
REFERENCES
Adam, R.C., Yang, H., Rockowitz, S., Larsen, S.B., Nikolova, M., Oristian, D.S.,
Polak, L., Kadaja, M., Asare, A., Zheng, D., and Fuchs, E. (2015). Pioneer factors govern super-enhancer dynamics in stem cell plasticity and lineage
choice. Nature 521, 366–370.
Albert, F.W., and Kruglyak, L. (2015). The role of regulatory variation in complex
traits and disease. Nat. Rev. Genet. 16, 197–212.
Cell 166, July 28, 2016 549
Alipanahi, B., Delong, A., Weirauch, M.T., and Frey, B.J. (2015). Predicting the
sequence specificities of DNA- and RNA-binding proteins by deep learning.
Nat. Biotechnol. 33, 831–838.
Al Zadjali, S., Wali, Y., Al Lawatiya, F., Gravell, D., Alkindi, S., Al Falahi, K.,
Krishnamoorthy, R., and Daar, S. (2011). The β-globin promoter -71 C>T mutation is a β+ thalassemic allele. Eur. J. Haematol. 87, 457–460.
Ameur, A., Rada-Iglesias, A., Komorowski, J., and Wadelius, C. (2009). Identification of candidate regulatory SNPs by combination of transcription-factor-binding site prediction, SNP genotyping and haploChIP. Nucleic Acids Res.
37, e85.
Arnosti, D.N., and Kulkarni, M.M. (2005). Transcriptional enhancers: Intelligent
enhanceosomes or flexible billboards? J. Cell. Biochem. 94, 890–898.
Auton, A., Brooks, L.D., Durbin, R.M., Garrison, E.P., Kang, H.M., Korbel, J.O.,
Marchini, J.L., McCarthy, S., McVean, G.A., and Abecasis, G.R.; 1000 Genomes Project Consortium (2015). A global reference for human genetic variation. Nature 526, 68–74.
Banovich, N.E., Lan, X., McVicker, G., van de Geijn, B., Degner, J.F., Blischak,
J.D., Roux, J., Pritchard, J.K., and Gilad, Y. (2014). Methylation QTLs are associated with coordinated changes in transcription factor binding, histone modifications, and gene expression levels. PLoS Genet. 10, e1004663.
Barozzi, I., Simonatto, M., Bonifacio, S., Yang, L., Rohs, R., Ghisletti, S., and
Natoli, G. (2014). Coregulation of transcription factor binding and nucleosome
occupancy through DNA features of mammalian enhancers. Mol. Cell 54,
844–857.
Barrera, L.A., Vedenko, A., Kurland, J.V., Rogers, J.M., Gisselbrecht, S.S.,
Rossin, E.J., Woodard, J., Mariani, L., Kock, K.H., Inukai, S., et al. (2016). Survey of variation in human transcription factors reveals prevalent DNA binding
changes. Science 351, 1450–1454.
Bauer, A.L., Hlavacek, W.S., Unkefer, P.J., and Mu, F. (2010). Using sequence-specific chemical and structural properties of DNA to predict transcription factor binding sites. PLoS Comput. Biol. 6, e1001007.
Benko, S., Fantes, J.A., Amiel, J., Kleinjan, D.-J., Thomas, S., Ramsay, J.,
Jamshidi, N., Essafi, A., Heaney, S., Gordon, C.T., et al. (2009). Highly
conserved non-coding elements on either side of SOX9 associated with Pierre
Robin sequence. Nat. Genet. 41, 359–364.
Berger, M.F., Philippakis, A.A., Qureshi, A.M., He, F.S., Estep, P.W., 3rd, and
Bulyk, M.L. (2006). Compact, universal DNA microarrays to comprehensively
determine transcription-factor binding site specificities. Nat. Biotechnol. 24,
1429–1435.
Berlivet, S., Paquette, D., Dumouchel, A., Langlais, D., Dostie, J., and Kmita,
M. (2013). Clustering of tissue-specific sub-TADs accompanies the regulation
of HoxA genes in developing limbs. PLoS Genet. 9, e1004018.
Biggin, M.D. (2011). Animal transcription networks as highly connected, quantitative continua. Dev. Cell 21, 611–626.
Bulyk, M.L., Johnson, P.L., and Church, G.M. (2002). Nucleotides of transcription factor binding sites exert interdependent effects on the binding affinities of
transcription factors. Nucleic Acids Res. 30, 1255–1261.
Chorley, B.N., Wang, X., Campbell, M.R., Pittman, G.S., Noureddine, M.A.,
and Bell, D.A. (2008). Discovery and verification of functional single nucleotide
polymorphisms in regulatory genomic regions: current and developing technologies. Mutat. Res. 659, 147–157.
Ciabrelli, F., and Cavalli, G. (2015). Chromatin-driven behavior of topologically
associating domains. J. Mol. Biol. 427, 608–625.
Cirillo, L.A., Lin, F.R., Cuesta, I., Friedman, D., Jarnik, M., and Zaret, K.S.
(2002). Opening of compacted chromatin by early developmental transcription
factors HNF3 (FoxA) and GATA-4. Mol. Cell 9, 279–289.
Claussnitzer, M., Dankel, S.N., Klocke, B., Grallert, H., Glunk, V., Berulava, T.,
Lee, H., Oskolkov, N., Fadista, J., Ehlers, K., et al.; DIAGRAM+ Consortium
(2014). Leveraging cross-species transcription factor binding site patterns:
from diabetes risk loci to disease mechanisms. Cell 156, 343–358.
Claussnitzer, M., Dankel, S.N., Kim, K.-H., Quon, G., Meuleman, W., Haugen,
C., Glunk, V., Sousa, I.S., Beaudry, J.L., Puviindran, V., et al. (2015). FTO
Obesity Variant Circuitry and Adipocyte Browning in Humans. N. Engl. J.
Med. 373, 895–907.
Consortium, T.E.; ENCODE Project Consortium (2012). An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74.
Corradin, O., Saiakhova, A., Akhtar-Zaidi, B., Myeroff, L., Willis, J., Cowper-Sal·lari,
R., Lupien, M., Markowitz, S., and Scacheri, P.C. (2014). Combinatorial
effects of multiple enhancer variants in linkage disequilibrium dictate levels
of gene expression to confer susceptibility to common traits. Genome Res.
24, 1–13.
Crossley, M., and Brownlee, G.G. (1990). Disruption of a C/EBP binding site in
the factor IX promoter is associated with haemophilia B. Nature 345, 444–446.
Crossley, M., Ludwig, M., Stowell, K.M., De Vos, P., Olek, K., and Brownlee,
G.G. (1992). Recovery from hemophilia B Leyden: an androgen-responsive
element in the factor IX promoter. Science 257, 377–379.
Cusanovich, D.A., Pavlovic, B., Pritchard, J.K., and Gilad, Y. (2014). The functional consequences of variation in transcription factor binding. PLoS Genet.
10, e1004226.
de Wit, E., Bouwman, B.A.M., Zhu, Y., Klous, P., Splinter, E., Verstegen,
M.J.A.M., Krijger, P.H.L., Festuccia, N., Nora, E.P., Welling, M., et al. (2013).
The pluripotent genome in three dimensions is shaped around pluripotency
factors. Nature 501, 227–231.
Degner, J.F., Pai, A.A., Pique-Regi, R., Veyrieras, J.B., Gaffney, D.J., Pickrell,
J.K., De Leon, S., Michelini, K., Lewellen, N., Crawford, G.E., et al. (2012).
DNase I sensitivity QTLs are a major determinant of human expression variation. Nature 482, 390–394.
De Gobbi, M., Viprakasit, V., Hughes, J.R., Fisher, C., Buckle, V.J., Ayyub, H.,
Gibbons, R.J., Vernimmen, D., Yoshinaga, Y., de Jong, P., et al. (2006). A regulatory SNP causes a human genetic disease by creating a new transcriptional
promoter. Science 312, 1215–1217.
Dekker, J., and Mirny, L. (2016). The 3D Genome as Moderator of Chromosomal Communication. Cell 164, 1110–1121.
Dina, C., Meyre, D., Gallina, S., Durand, E., Korner, A., Jacobson, P., Carlsson,
L.M., Kiess, W., Vatin, V., Lecoeur, C., et al. (2007). Variation in FTO contributes
to childhood obesity and severe adult obesity. Nat. Genet. 39, 724–726.
Ding, Z., Ni, Y., Timmer, S.W., Lee, B.-K., Battenhouse, A., Louzada, S., Yang,
F., Dunham, I., Crawford, G.E., Lieb, J.D., et al. (2014). Quantitative genetics of
CTCF binding reveal local sequence effects and different modes of X-chromosome association. PLoS Genet. 10, e1004798.
Dixon, J.R., Selvaraj, S., Yue, F., Kim, A., Li, Y., Shen, Y., Hu, M., Liu, J.S., and
Ren, B. (2012). Topological domains in mammalian genomes identified by
analysis of chromatin interactions. Nature 485, 376–380.
Dixon, J.R., Jung, I., Selvaraj, S., Shen, Y., Antosiewicz-Bourget, J.E., Lee,
A.Y., Ye, Z., Kim, A., Rajagopal, N., Xie, W., et al. (2015). Chromatin architecture reorganization during stem cell differentiation. Nature 518, 331–336.
Djordjevic, M., Sengupta, A.M., and Shraiman, B.I. (2003). A biophysical
approach to transcription factor binding site discovery. Genome Res. 13,
2381–2390.
Dodd, A.W., Syddall, C.M., and Loughlin, J. (2013). A rare variant in the osteoarthritis-associated locus GDF5 is functional and reveals a site that can be
manipulated to modulate GDF5 expression. Eur. J. Hum. Genet. 21, 517–521.
Domcke, S., Bardet, A.F., Adrian Ginno, P., Hartl, D., Burger, L., and Schübeler, D. (2015). Competition between DNA methylation and transcription factors determines binding of NRF1. Nature 528, 575–579.
Dowen, J.M., Fan, Z.P., Hnisz, D., Ren, G., Abraham, B.J., Zhang, L.N., Weintraub, A.S., Schuijers, J., Lee, T.I., Zhao, K., and Young, R.A. (2014). Control of
cell identity genes occurs in insulated neighborhoods in mammalian chromosomes. Cell 159, 374–387.
Dror, I., Golan, T., Levy, C., Rohs, R., and Mandel-Gutfreund, Y. (2015). A widespread role of the motif environment in transcription factor binding across
diverse protein families. Genome Res. 25, 1268–1280.
Edwards, S.L., Beesley, J., French, J.D., and Dunning, A.M. (2013). Beyond
GWASs: illuminating the dark road from association to function. Am. J. Hum.
Genet. 93, 779–797.
Emilsson, V., Thorleifsson, G., Zhang, B., Leonardson, A.S., Zink, F., Zhu, J.,
Carlson, S., Helgason, A., Walters, G.B., Gunnarsdottir, S., et al. (2008). Genetics of gene expression and its effect on disease. Nature 452, 423–428.
Hathaway, N.A., Bell, O., Hodges, C., Miller, E.L., Neel, D.S., and Crabtree,
G.R. (2012). Dynamics and memory of heterochromatin in living cells. Cell
149, 1447–1460.
Escalante, C.R., Brass, A.L., Pongubala, J.M.R., Shatova, E., Shen, L., Singh,
H., and Aggarwal, A.K. (2002). Crystal structure of PU.1/IRF-4/DNA ternary
complex. Mol. Cell 10, 1097–1105.
Heinz, S., Benner, C., Spann, N., Bertolino, E., Lin, Y.C., Laslo, P., Cheng, J.X.,
Murre, C., Singh, H., and Glass, C.K. (2010). Simple combinations of lineage-determining transcription factors prime cis-regulatory elements required for
macrophage and B cell identities. Mol. Cell 38, 576–589.
Faisst, S., and Meyer, S. (1992). Compilation of vertebrate-encoded transcription factors. Nucleic Acids Res. 20, 3–26.
Farnham, P.J. (2009). Insights from genomic profiling of transcription factors.
Nat. Rev. Genet. 10, 605–616.
Fischer, J., Koch, L., Emmerling, C., Vierkotten, J., Peters, T., Brüning, J.C.,
and Rüther, U. (2009). Inactivation of the Fto gene protects from obesity. Nature 458, 894–898.
Frayling, T.M., Timpson, N.J., Weedon, M.N., Zeggini, E., Freathy, R.M., Lindgren, C.M., Perry, J.R., Elliott, K.S., Lango, H., Rayner, N.W., et al. (2007).
A common variant in the FTO gene is associated with body mass index and
predisposes to childhood and adult obesity. Science 316, 889–894.
French, J.D., Ghoussaini, M., Edwards, S.L., Meyer, K.B., Michailidou, K.,
Ahmed, S., Khan, S., Maranian, M.J., O'Reilly, M., Hillman, K.M., et al.;
GENICA Network; kConFab Investigators (2013). Functional variants at the
11q13 risk locus for breast cancer regulate cyclin D1 expression through
long-range enhancers. Am. J. Hum. Genet. 92, 489–503.
Funnell, A.P., Wilson, M.D., Ballester, B., Mak, K.S., Burdach, J., Magan, N.,
Pearson, R.C., Lemaigre, F.P., Stowell, K.M., Odom, D.T., et al. (2013).
A CpG mutational hotspot in a ONECUT binding site accounts for the prevalent
variant of hemophilia B Leyden. Am. J. Hum. Genet. 92, 460–467.
Heinz, S., Romanoski, C.E., Benner, C., Allison, K.A., Kaikkonen, M.U., Orozco, L.D., and Glass, C.K. (2013). Effect of natural genetic variation on
enhancer selection and function. Nature 503, 487–492.
Helms, C., Cao, L., Krueger, J.G., Wijsman, E.M., Chamian, F., Gordon, D.,
Heffernan, M., Daw, J.A., Robarge, J., Ott, J., et al. (2003). A putative
RUNX1 binding site variant between SLC9A3R1 and NAT9 is associated
with susceptibility to psoriasis. Nat. Genet. 35, 349–356.
Heyn, H., Moran, S., Hernando-Herraez, I., Sayols, S., Gomez, A., Sandoval,
J., Monk, D., Hata, K., Marques-Bonet, T., Wang, L., and Esteller, M. (2013).
DNA methylation contributes to natural human variation. Genome Res. 23,
1363–1372.
Hindorff, L.A., Sethupathy, P., Junkins, H.A., Ramos, E.M., Mehta, J.P.,
Collins, F.S., and Manolio, T.A. (2009). Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc.
Hobbs, K., Negri, J., Klinnert, M., Rosenwasser, L.J., and Borish, L. (1998).
Interleukin-10 and transforming growth factor-beta promoter polymorphisms
in allergies and asthma. Am. J. Respir. Crit. Care Med. 158, 1958–1962.
Hoekstra, H.E., and Coyne, J.A. (2007). The locus of evolution: evo devo and
the genetics of adaptation. Evolution 61, 995–1016.
Gao, Z., and Ruan, J. (2015). A structure-based Multiple-Instance Learning
approach to predicting in vitro transcription factor-DNA interaction. BMC Genomics 16 (Suppl 4), S3.
Hong, J.-W., Hendrix, D.A., and Levine, M.S. (2008). Shadow enhancers as a
source of evolutionary novelty. Science 321, 1314.
Gelfond, J.A., Gupta, M., and Ibrahim, J.G. (2009). A Bayesian hidden Markov
model for motif discovery through joint modeling of genomic sequence and
ChIP-chip data. Biometrics 65, 1087–1095.
Horn, S., Figl, A., Rachakonda, P.S., Fischer, C., Sucker, A., Gast, A., Kadel,
S., Moll, I., Nagore, E., Hemminki, K., et al. (2013). TERT promoter mutations
in familial and sporadic melanoma. Science 339, 959–961.
Gheldof, N., Smith, E.M., Tabuchi, T.M., Koch, C.M., Dunham, I., Stamatoyannopoulos, J.A., and Dekker, J. (2010). Cell-type-specific long-range looping interactions identify distant regulatory elements of the CFTR gene. Nucleic Acids
Res. 38, 4325–4336.
Hu, S., Xie, Z., Onishi, A., Yu, X., Jiang, L., Lin, J., Rho, H.S., Woodard, C.,
Wang, H., Jeong, J.-S., et al. (2009). Profiling the human protein-DNA interactome reveals ERK2 as a transcriptional repressor of interferon signaling. Cell
139, 610–622.
Giorgetti, L., Galupa, R., Nora, E.P., Piolot, T., Lam, F., Dekker, J., Tiana, G.,
and Heard, E. (2014). Predictive polymer modeling reveals coupled fluctuations in chromosome conformation and transcription. Cell 157, 950–963.
Hu, S., Wan, J., Su, Y., Song, Q., Zeng, Y., Nguyen, H.N., Shin, J., Cox, E., Rho,
H.S., Woodard, C., et al. (2013). DNA methylation presents distinct binding
sites for human transcription factors. eLife 2, e00726.
Gosselin, D., Link, V.M., Romanoski, C.E., Fonseca, G.J., Eichenfield, D.Z.,
Spann, N.J., Stender, J.D., Chun, H.B., Garner, H., Geissmann, F., and Glass,
C.K. (2014). Environment drives selection and function of enhancers controlling tissue-specific macrophage identities. Cell 159, 1327–1340.
Huang, F.W., Hodis, E., Xu, M., Kryukov, G.V., Chin, L., and Garraway, L.A.
(2013). Highly recurrent TERT promoter mutations in human melanoma. Science 339, 957–959.
Gotea, V., Visel, A., Westlund, J.M., Nobrega, M.A., Pennacchio, L.A., and
Ovcharenko, I. (2010). Homotypic clusters of transcription factor binding sites
are a key component of human promoters and enhancers. Genome Res. 20,
565–577.
Gragnoli, C., Lindner, T., Cockburn, B.N., Kaisaki, P.J., Gragnoli, F., Marozzi,
G., and Bell, G.I. (1997). Maturity-onset diabetes of the young due to a mutation in the hepatocyte nuclear factor-4 alpha binding site in the promoter of the
hepatocyte nuclear factor-1 alpha gene. Diabetes 46, 1648–1651.
Grant, S.F., Reid, D.M., Blake, G., Herd, R., Fogelman, I., and Ralston, S.H.
(1996). Reduced bone density and osteoporosis associated with a polymorphic Sp1 binding site in the collagen type I alpha 1 gene. Nat. Genet. 14,
203–205.
Grubert, F., Zaugg, J.B., Kasowski, M., Ursu, O., Spacek, D.V., Martin, A.R.,
Greenside, P., Srivas, R., Phanstiel, D.H., Pekowska, A., et al. (2015). Genetic
Control of Chromatin States in Humans Involves Local and Distal Chromosomal Interactions. Cell 162, 1051–1065.
Gutierrez-Arcelus, M., Lappalainen, T., Montgomery, S.B., Buil, A., Ongen, H.,
Yurovsky, A., Bryois, J., Giger, T., Romano, L., Planchon, A., et al. (2013). Passive and active DNA methylation and the interplay with genetic variation in
gene regulation. eLife 2, e00523.
Huang, W., Massouras, A., Inoue, Y., Peiffer, J., Ràmia, M., Tarone, A.M., Turlapati, L., Zichner, T., Zhu, D., Lyman, R.F., et al. (2014). Natural variation in
genome architecture among 205 Drosophila melanogaster Genetic Reference
Panel lines. Genome Res. 24, 1193–1208.
Isakova, A., Berset, Y., Hatzimanikatis, V., and Deplancke, B. (2016). Quantification of Cooperativity in Heterodimer-DNA Binding Improves the Accuracy of
Binding Specificity Models. J. Biol. Chem. 291, 10293–10306.
Iwafuchi-Doi, M., and Zaret, K.S. (2014). Pioneer transcription factors in cell reprogramming. Genes Dev. 28, 2679–2692.
Jacob, F., and Monod, J. (1961). Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol. 3, 318–356.
Jeong, Y., Leskow, F.C., El-Jaick, K., Roessler, E., Muenke, M., Yocum, A.,
Dubourg, C., Li, X., Geng, X., Oliver, G., and Epstein, D.J. (2008). Regulation
of a remote Shh forebrain enhancer by the Six3 homeoprotein. Nat. Genet.
40, 1348–1353.
Jia, L., Landan, G., Pomerantz, M., Jaschek, R., Herman, P., Reich, D., Yan, C.,
Khalid, O., Kantoff, P., Oh, W., et al. (2009). Functional enhancers at the gene-poor 8q24 cancer-linked locus. PLoS Genet. 5, e1000597.
Jolma, A., Kivioja, T., Toivonen, J., Cheng, L., Wei, G., Enge, M., Taipale, M.,
Vaquerizas, J.M., Yan, J., Sillanpää, M.J., et al. (2010). Multiplexed massively
parallel SELEX for characterization of human transcription factor binding specificities. Genome Res. 20, 861–873.
Mackay, T.F., Stone, E.A., and Ayroles, J.F. (2009). The genetics of quantitative traits: challenges and prospects. Nat. Rev. Genet. 10, 565–577.
Jolma, A., Yan, J., Whitington, T., Toivonen, J., Nitta, K.R., Rastas, P., Morgunova, E., Enge, M., Taipale, M., Wei, G., et al. (2013). DNA-binding specificities
of human transcription factors. Cell 152, 327–339.
Maerkl, S.J., and Quake, S.R. (2009). Experimental determination of the evolvability of a transcription factor. Proc. Natl. Acad. Sci. USA 106, 18650–18655.
Jolma, A., Yin, Y., Nitta, K.R., Dave, K., Popov, A., Taipale, M., Enge, M., Kivioja, T., Morgunova, E., and Taipale, J. (2015). DNA-dependent formation of
transcription factor pairs alters their binding specificity. Nature 527, 384–388.
Junion, G., Spivakov, M., Girardot, C., Braun, M., Gustafson, E.H., Birney, E.,
and Furlong, E.E. (2012). A transcription factor collective defines cardiac cell
fate and reflects lineage history. Cell 148, 473–486.
Karczewski, K.J., Tatonetti, N.P., Landt, S.G., Yang, X., Slifer, T., Altman, R.B.,
and Snyder, M. (2011). Cooperative transcription factor associations discovered using regulatory variation. Proc. Natl. Acad. Sci. USA 108, 13353–13358.
Kasowski, M., Grubert, F., Heffelfinger, C., Hariharan, M., Asabere, A., Waszak, S.M., Habegger, L., Rozowsky, J., Shi, M., Urban, A.E., et al. (2010). Variation in transcription factor binding among humans. Science 328, 232–235.
Kasowski, M., Kyriazopoulou-Panagiotopoulou, S., Grubert, F., Zaugg, J.B.,
Kundaje, A., Liu, Y., Boyle, A.P., Zhang, Q.C., Zakharia, F., Spacek, D.V.,
et al. (2013). Extensive variation in chromatin states across humans. Science
342, 750–752.
Keane, T.M., Goodstadt, L., Danecek, P., White, M.A., Wong, K., Yalcin, B.,
Heger, A., Agam, A., Slater, G., Goodson, M., et al. (2011). Mouse genomic
variation and its effect on phenotypes and gene regulation. Nature 477,
289–294.
Kilpinen, H., Waszak, S.M., Gschwind, A.R., Raghav, S.K., Witwicki, R.M.,
Orioli, A., Migliavacca, E., Wiederkehr, M., Gutierrez-Arcelus, M., Panousis,
N.I., et al. (2013). Coordinated effects of sequence variation on DNA binding,
chromatin structure, and transcription. Science 342, 744–747.
Kulakovskiy, I.V., Vorontsov, I.E., Yevshin, I.S., Soboleva, A.V., Kasianov, A.S.,
Ashoor, H., Ba-Alawi, W., Bajic, V.B., Medvedeva, Y.A., Kolpakov, F.A., and
Makeev, V.J. (2016). HOCOMOCO: expansion and enhancement of the collection of transcription factor binding sites models. Nucleic Acids Res. 44 (D1),
D116–D125.
Kulzer, J.R., Stitzel, M.L., Morken, M.A., Huyghe, J.R., Fuchsberger, C., Kuusisto, J., Laakso, M., Boehnke, M., Collins, F.S., and Mohlke, K.L. (2014).
A common functional regulatory variant at a type 2 diabetes locus upregulates
ARAP1 expression in the pancreatic beta cell. Am. J. Hum. Genet. 94,
186–197.
Lecerf, L., Kavo, A., Ruiz-Ferrer, M., Baral, V., Watanabe, Y., Chaoui, A., Pingault, V., Borrego, S., and Bondurand, N. (2014). An impairment of long distance SOX10 regulatory elements underlies isolated Hirschsprung disease.
Hum. Mutat. 35, 303–307.
Lehner, B. (2013). Genotype to phenotype: lessons from model organisms for
human genetics. Nat. Rev. Genet. 14, 168–178.
Levo, M., and Segal, E. (2014). In pursuit of design principles of regulatory sequences. Nat. Rev. Genet. 15, 453–468.
Li, Z., Gadue, P., Chen, K., Jiao, Y., Tuteja, G., Schug, J., Li, W., and Kaestner,
K.H. (2012). Foxa2 and H2A.Z mediate nucleosome depletion during embryonic stem cell differentiation. Cell 151, 1608–1616.
Lieberman-Aiden, E., van Berkum, N.L., Williams, L., Imakaev, M., Ragoczy,
T., Telling, A., Amit, I., Lajoie, B.R., Sabo, P.J., Dorschner, M.O., et al.
(2009). Comprehensive mapping of long-range interactions reveals folding
principles of the human genome. Science 326, 289–293.
Lowe, W.L., Jr., and Reddy, T.E. (2015). Genomic approaches for understanding the genetics of complex disease. Genome Res. 25, 1432–1441.
Ludlow, L.B., Schick, B.P., Budarf, M.L., Driscoll, D.A., Zackai, E.H., Cohen,
A., and Konkle, B.A. (1996). Identification of a mutation in a GATA binding
site of the platelet glycoprotein Ibbeta promoter resulting in the Bernard-Soulier syndrome. J. Biol. Chem. 271, 22076–22080.
Lynch, V.J., and Wagner, G.P. (2008). Resurrecting the role of transcription
factor change in developmental evolution. Evolution 62, 2131–2154.
Maienschein-Cline, M., Dinner, A.R., Hlavacek, W.S., and Mu, F. (2012).
Improved predictions of transcription factor binding sites using physicochemical features of DNA. Nucleic Acids Res. 40, e175.
Mancini, E.J., and West, M.J. (2015). How to Be a Pioneer: A One-Sided View.
Trends Biochem. Sci. 40, 547–548.
Manco, L., Ribeiro, M.L., Maximo, V., Almeida, H., Costa, A., Freitas, O., Barbot, J., Abade, A., and Tamagnini, G. (2000). A new PKLR gene mutation in the
R-type promoter region affects the gene transcription causing pyruvate kinase
deficiency. Br. J. Haematol. 110, 993–997.
Manolio, T.A. (2013). Bringing genome-wide association findings into clinical
use. Nat. Rev. Genet. 14, 549–558.
Mansour, M.R., Abraham, B.J., Anders, L., Berezovskaya, A., Gutierrez, A.,
Durbin, A.D., Etchin, J., Lawton, L., Sallan, S.E., Silverman, L.B., et al.
(2014). Oncogene regulation. An oncogenic super-enhancer formed through
somatic mutation of a noncoding intergenic element. Science 346, 1373–1377.
Martin, D.I., Tsai, S.F., and Orkin, S.H. (1989). Increased gamma-globin
expression in a nondeletion HPFH mediated by an erythroid-specific DNA-binding factor. Nature 338, 435–438.
Masotti, C., Armelin-Correa, L.M., Splendore, A., Lin, C.J., Barbosa, A., Sogayar, M.C., and Passos-Bueno, M.R. (2005). A functional SNP in the promoter
region of TCOF1 is associated with reduced gene expression and YY1 DNA-protein interaction. Gene 359, 44–52.
Massouras, A., Waszak, S.M., Albarca-Aguilera, M., Hens, K., Holcombe, W.,
Ayroles, J.F., Dermitzakis, E.T., Stone, E.A., Jensen, J.D., Mackay, T.F., and
Deplancke, B. (2012). Genomic variation and its impact on gene expression
in Drosophila melanogaster. PLoS Genet. 8, e1003055.
Matsuda, M., Sakamoto, N., and Fukumaki, Y. (1992). Delta-thalassemia
caused by disruption of the site for an erythroid-specific transcription factor,
GATA-1, in the delta-globin gene promoter. Blood 80, 1347–1351.
Maurano, M.T., Humbert, R., Rynes, E., Thurman, R.E., Haugen, E., Wang, H.,
Reynolds, A.P., Sandstrom, R., Qu, H., Brody, J., et al. (2012). Systematic
localization of common disease-associated variation in regulatory DNA. Science 337, 1190–1195.
Maurano, M.T., Haugen, E., Sandstrom, R., Vierstra, J., Shafer, A., Kaul, R.,
and Stamatoyannopoulos, J.A. (2015). Large-scale identification of sequence
variants influencing human transcription factor occupancy in vivo. Nat. Genet.
47, 1393–1401.
McClay, J.L., Shabalin, A.A., Dozmorov, M.G., Adkins, D.E., Kumar, G., Nerella, S., Clark, S.L., Bergen, S.E., Hultman, C.M., Magnusson, P.K., et al.;
Swedish Schizophrenia Consortium (2015). High density methylation QTL
analysis in human blood via next-generation sequencing of the methylated
genomic DNA fraction. Genome Biol. 16, 291.
McVicker, G., van de Geijn, B., Degner, J.F., Cain, C.E., Banovich, N.E., Raj, A.,
Lewellen, N., Myrthil, M., Gilad, Y., and Pritchard, J.K. (2013). Identification of
genetic variants that affect histone modifications in human cells. Science 342,
747–749.
Meng, X., Brodsky, M.H., and Wolfe, S.A. (2005). A bacterial one-hybrid system for determining the DNA-binding specificity of transcription factors. Nat.
Biotechnol. 23, 988–994.
Miller, I.J., and Bieker, J.J. (1993). A novel, erythroid cell-specific murine transcription factor that binds to the CACCC element and is related to the Krüppel
family of nuclear proteins. Mol. Cell. Biol. 13, 2776–2786.
Mirny, L.A. (2010). Nucleosome-mediated cooperativity between transcription
factors. Proc. Natl. Acad. Sci. USA 107, 22534–22539.
Musunuru, K., Strong, A., Frank-Kamenetsky, M., Lee, N.E., Ahfeldt, T., Sachs,
K.V., Li, X., Li, H., Kuperwasser, N., Ruda, V.M., et al. (2010). From noncoding
variant to phenotype via SORT1 at the 1p13 cholesterol locus. Nature 466,
714–719.
Najafabadi, H.S., Mnaimneh, S., Schmitges, F.W., Garton, M., Lam, K.N.,
Yang, A., Albu, M., Weirauch, M.T., Radovani, E., Kim, P.M., et al. (2015).
C2H2 zinc finger proteins greatly expand the human regulatory lexicon. Nat.
Biotechnol. 33, 555–562.
Reddy, T.E., Gertz, J., Pauli, F., Kucera, K.S., Varley, K.E., Newberry, K.M.,
Marinov, G.K., Mortazavi, A., Williams, B.A., Song, L., et al. (2012). Effects of
sequence variation on differential allelic transcription factor occupancy and
gene expression. Genome Res. 22, 860–869.
Nica, A.C., Montgomery, S.B., Dimas, A.S., Stranger, B.E., Beazley, C., Barroso, I., and Dermitzakis, E.T. (2010). Candidate causal regulatory effects by
integration of expression QTLs with complex trait genetic associations.
PLoS Genet. 6, e1000895.
Reijnen, M.J., Sladek, F.M., Bertina, R.M., and Reitsma, P.H. (1992). Disruption of a binding site for hepatocyte nuclear factor 4 results in hemophilia B
Leyden. Proc. Natl. Acad. Sci. USA 89, 6300–6303.
Nicolae, D.L., Gamazon, E., Zhang, W., Duan, S., Dolan, M.E., and Cox, N.J.
(2010). Trait-associated SNPs are more likely to be eQTLs: annotation to
enhance discovery from GWAS. PLoS Genet. 6, e1000888.
Rohs, R., West, S.M., Sosinsky, A., Liu, P., Mann, R.S., and Honig, B. (2009).
The role of DNA shape in protein-DNA recognition. Nature 461, 1248–1253.
Saiz, L., and Vilar, J.M.G. (2006). DNA looping: the consequences and its control. Curr. Opin. Struct. Biol. 16, 344–350.
Nora, E.P., Lajoie, B.R., Schulz, E.G., Giorgetti, L., Okamoto, I., Servant, N.,
Piolot, T., van Berkum, N.L., Meisig, J., Sedat, J., et al. (2012). Spatial partitioning of the regulatory landscape of the X-inactivation centre. Nature 485,
381–385.
Sherwood, R.I., Hashimoto, T., O'Donnell, C.W., Lewis, S., Barkal, A.A., van
Hoff, J.P., Karun, V., Jaakkola, T., and Gifford, D.K. (2014). Discovery of directional and nondirectional pioneer transcription factors by modeling DNase profile magnitude and shape. Nat. Biotechnol. 32, 171–178.
Nutiu, R., Friedman, R.C., Luo, S., Khrebtukova, I., Silva, D., Li, R., Zhang, L.,
Schroth, G.P., and Burge, C.B. (2011). Direct measurement of DNA affinity
landscapes on a high-throughput sequencing instrument. Nat. Biotechnol.
29, 659–664.
Siersbæk, R., Rabiee, A., Nielsen, R., Sidoli, S., Traynor, S., Loft, A., La Cour
Poulsen, L., Rogowska-Wrzesinska, A., Jensen, O.N., and Mandrup, S.
(2014). Transcription factor cooperativity in early adipogenic hotspots and
super-enhancers. Cell Rep. 7, 1443–1455.
Orkin, S.H., Kazazian, H.H., Jr., Antonarakis, S.E., Goff, S.C., Boehm, C.D.,
Sexton, J.P., Waber, P.G., and Giardina, P.J. (1982). Linkage of beta-thalassaemia mutations and beta-globin gene polymorphisms with DNA polymorphisms in human beta-globin gene cluster. Nature 296, 627631.
Simicevic, J., Schmid, A.W., Gilardoni, P.A., Zoller, B., Raghav, S.K., Krier, I.,
Gubelmann, C., Lisacek, F., Naef, F., Moniatte, M., and Deplancke, B. (2013).
Absolute quantification of transcription factors during cellular differentiation
using multiplexed targeted proteomics. Nat. Methods 10, 570576.
Pai, A.A., Pritchard, J.K., and Gilad, Y. (2015). The genetic and mechanistic basis for variation in gene regulation. PLoS Genet. 11, e1004857.
Slattery, M., Zhou, T., Yang, L., Dantas Machado, A.C., Gordan, R., and Rohs,
R. (2014). Absence of a simple code: how transcription factors read the
genome. Trends Biochem. Sci. 39, 381399.
Peters, T., Ausmeier, K., and Ruther, U. (1999). Cloning of Fatso (Fto), a novel
gene deleted by the Fused toes (Ft) mouse mutation. Mamm. Genome 10,
983986.
Phillips-Cremins, J.E., Sauria, M.E., Sanyal, A., Gerasimova, T.I., Lajoie, B.R.,
Bell, J.S., Ong, C.T., Hookway, T.A., Guo, C., Sun, Y., et al. (2013). Architectural protein subclasses shape 3D organization of genomes during lineage
commitment. Cell 153, 12811295.
Polach, K.J., and Widom, J. (1996). A model for the cooperative binding of
eukaryotic regulatory proteins to nucleosomal target sites. J. Mol. Biol. 258,
800812.
Poncz, M., Ballantine, M., Solowiejczyk, D., Barak, I., Schwartz, E., and Surrey,
S. (1982). beta-Thalassemia in a Kurdish Jew. Single base changes in the T-AT-A box. J. Biol. Chem. 257, 59945996.
Ponomarenko, J.V., Merkulova, T.I., Vasiliev, G.V., Levashova, Z.B., Orlova,
G.V., Lavryushev, S.V., Fokin, O.N., Ponomarenko, M.P., Frolov, A.S., and
Sarai, A. (2001). rSNP_Guide, a database system for analysis of transcription
factor binding to target sequences: application to SNPs and site-directed mutations. Nucleic Acids Res. 29, 312316.
Ptashne, M., Jeffrey, A., Johnson, A.D., Maurer, R., Meyer, B.J., Pabo, C.O.,
Roberts, T.M., and Sauer, R.T. (1980). How the l repressor and cro work.
Cell 19, 111.
Raghav, S.K., Waszak, S.M., Krier, I., Gubelmann, C., Isakova, A., Mikkelsen,
T.S., and Deplancke, B. (2012). Integrative genomics identifies the corepressor
SMRT as a gatekeeper of adipogenesis through the transcription factors
C/EBPb and KAISO. Mol. Cell 46, 335350.
Rahimov, F., Marazita, M.L., Visel, A., Cooper, M.E., Hitchler, M.J., Rubini, M.,
Domann, F.E., Govil, M., Christensen, K., Bille, C., et al. (2008). Disruption of an
AP-2alpha binding site in an IRF6 enhancer is associated with cleft lip. Nat.
Genet. 40, 13411347.
Smemo, S., Tena, J.J., Kim, K.H., Gamazon, E.R., Sakabe, N.J., GomezMarn, C., Aneas, I., Credidio, F.L., Sobreira, D.R., Wasserman, N.F., et al.
(2014). Obesity-associated variants within FTO form long-range functional
connections with IRX3. Nature 507, 371375.
Smolle, M., and Workman, J.L. (2013). Transcription-associated histone modifications and cryptic transcription. Biochim. Biophys. Acta 1829, 8497.
Soccio, R.E., Chen, E.R., Rajapurkar, S.R., Safabakhsh, P., Marinis, J.M., Dispirito, J.R., Emmett, M.J., Briggs, E.R., Fang, B., Everett, L.J., et al. (2015). Genetic Variation Determines PPARg Function and Anti-diabetic Drug Response
In Vivo. Cell 162, 3344.
Solis, C., Aizencang, G.I., Astrin, K.H., Bishop, D.F., and Desnick, R.J. (2001).
Uroporphyrinogen III synthase erythroid promoter mutations in adjacent
GATA1 and CP2 elements cause congenital erythropoietic porphyria. J. Clin.
Invest. 107, 753762.
Soufi, A., Garcia, M.F., Jaroszewicz, A., Osman, N., Pellegrini, M., and Zaret,
K.S. (2015). Pioneer transcription factors target partial DNA motifs on nucleosomes to initiate reprogramming. Cell 161, 555568.
Spitz, F., and Furlong, E.E.M. (2012). Transcription factors: from enhancer
binding to developmental control. Nat. Rev. Genet. 13, 613626.
Spivakov, M., Akhtar, J., Kheradpour, P., Beal, K., Girardot, C., Koscielny, G.,
Herrero, J., Kellis, M., Furlong, E.E., and Birney, E. (2012). Analysis of variation
at transcription factor binding sites in Drosophila and humans. Genome Biol.
13, R49.
Stefflova, K., Thybert, D., Wilson, M.D., Streeter, I., Aleksic, J., Karagianni, P.,
Brazma, A., Adams, D.J., Talianidis, I., Marioni, J.C., et al. (2013). Cooperativity
and rapid evolution of cobound transcription factors in closely related mammals. Cell 154, 530540.
Stormo, G.D., and Zhao, Y. (2010). Determining the specificity of protein-DNA
interactions. Nat. Rev. Genet. 11, 751760.
Rao, S.S., Huntley, M.H., Durand, N.C., Stamenova, E.K., Bochkov, I.D., Robinson, J.T., Sanborn, A.L., Machol, I., Omer, A.D., Lander, E.S., and Aiden, E.L.
(2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell 159, 16651680.
Tang, Z., Luo, O.J., Li, X., Zheng, M., Zhu, J.J., Szalaj, P., Trzaskoma, P., Magalska, A., Wlodarczyk, J., Ruszczycki, B., et al. (2015). CTCF-Mediated Human
3D Genome Architecture Reveals Chromatin Topology for Transcription. Cell
163, 16111627.
Ravasi, T., Suzuki, H., Cannistraci, C.V., Katayama, S., Bajic, V.B., Tan, K.,
Akalin, A., Schmeier, S., Kanamori-Katayama, M., Bertin, N., et al. (2010). An
atlas of combinatorial transcriptional regulation in mouse and man. Cell 140,
744752.
Tijssen, M.R., Cvejic, A., Joshi, A., Hannah, R.L., Ferreira, R., Forrai, A., Bellissimo, D.C., Oram, S.H., Smethurst, P.A., Wilson, N.K., et al. (2011). Genomewide analysis of simultaneous GATA1/2, RUNX1, FLI1, and SCL binding in
megakaryocytes identifies hematopoietic regulators. Dev. Cell 20, 597609.
Cell 166, July 28, 2016 553
Torres, J.M., Gamazon, E.R., Parra, E.J., Below, J.E., Valladares-Salgado, A.,
Wacher, N., Cruz, M., Hanis, C.L., and Cox, N.J. (2014). Cross-tissue and tissue-specific eQTLs: partitioning the heritability of a complex trait. Am. J. Hum.
Genet. 95, 521534.
Weirauch, M.T., Yang, A., Albu, M., Cote, A.G., Montenegro-Montero, A.,
Drewe, P., Najafabadi, H.S., Lambert, S.A., Mann, I., Cook, K., et al. (2014).
Determination and inference of eukaryotic transcription factor sequence specificity. Cell 158, 14311443.
Tournamille, C., Colin, Y., Cartron, J.P., and Le Van Kim, C. (1995). Disruption
of a GATA motif in the Duffy gene promoter abolishes erythroid gene expression in Duffy-negative individuals. Nat. Genet. 10, 224228.
White, M.A., Myers, C.A., Corbo, J.C., and Cohen, B.A. (2013). Massively parallel in vivo enhancer assay reveals that highly local features determine the cisregulatory function of ChIP-seq peaks. Proc. Natl. Acad. Sci. USA 110, 11952
11957.
Tuupanen, S., Turunen, M., Lehtonen, R., Hallikas, O., Vanharanta, S., Kivioja,
T., Bjorklund, M., Wei, G., Yan, J., Niittymaki, I., et al. (2009). The common
colorectal cancer predisposition SNP rs6983267 at chromosome 8q24 confers potential to enhanced Wnt signaling. Nat. Genet. 41, 885890.
Vaquerizas, J.M., Kummerfeld, S.K., Teichmann, S.A., and Luscombe, N.M.
(2009). A census of human transcription factors: function, expression and evolution. Nat. Rev. Genet. 10, 252263.
Verlaan, D.J., Berlivet, S., Hunninghake, G.M., Madore, A.-M., Larivie`re, M.,
Moussette, S., Grundberg, E., Kwan, T., Ouimet, M., Ge, B., et al. (2009).
Allele-specific chromatin remodeling in the ZPBP2/GSDMB/ORMDL3 locus
associated with the risk of asthma and autoimmune disease. Am. J. Hum.
Genet. 85, 377393.
Veyrieras, J.-B., Kudaravalli, S., Kim, S.Y., Dermitzakis, E.T., Gilad, Y., Stephens, M., and Pritchard, J.K. (2008). High-resolution mapping of expression-QTLs yields insight into human gene regulation. PLoS Genet. 4,
e1000214.
Vietri Rudan, M., Barrington, C., Henderson, S., Ernst, C., Odom, D.T., Tanay,
A., and Hadjur, S. (2015). Comparative Hi-C reveals that CTCF underlies evolution of chromosomal domain architecture. Cell Rep. 10, 12971309.
Wang, J., and Batmanov, K. (2015). BayesPI-BAR: a new biophysical model
for characterization of regulatory sequence variations. Nucleic Acids Res.
43, e147.
Wang, S., Wu, S., Meng, Q., Li, X., Zhang, J., Chen, R., and Wang, M. (2016).
FAS rs2234767 and rs1800682 polymorphisms jointly contributed to risk of
colorectal cancer by affecting SP1/STAT1 complex recruitment to chromatin.
Sci. Rep. 6, 19229.
Wasserman, W.W., and Sandelin, A. (2004). Applied bioinformatics for the
identification of regulatory elements. Nat. Rev. Genet. 5, 276287.
Waszak, S.M., Delaneau, O., Gschwind, A.R., Kilpinen, H., Raghav, S.K., Witwicki, R.M., Orioli, A., Wiederkehr, M., Panousis, N.I., Yurovsky, A., et al.
(2015). Population Variation and Genetic Control of Modular Chromatin Architecture in Humans. Cell 162, 10391050.
Weedon, M.N., Cebola, I., Patch, A.M., Flanagan, S.E., De Franco, E., Caswell,
R., Rodrguez-Segu, S.A., Shaw-Smith, C., Cho, C.H., Lango Allen, H., et al.;
International Pancreatic Agenesis Consortium (2014). Recessive mutations in
a distal PTF1A enhancer cause isolated pancreatic agenesis. Nat. Genet. 46,
6164.
Weinhold, N., Jacobsen, A., Schultz, N., Sander, C., and Lee, W. (2014).
Genome-wide analysis of noncoding regulatory mutations in cancer. Nat.
Genet. 46, 11601165.
Weirauch, M., and Hughes, T.R. (2011). A Catalogue of Eukaryotic Transcription Factor Types, Their Evolutionary Origin, and Species Distribution. In A
Handbook of Transcription Factors, T.R. Hughes, ed. (Springer Netherlands),
pp. 2573.
554 Cell 166, July 28, 2016
Wienert, B., Funnell, A.P.W., Norton, L.J., Pearson, R.C.M., Wilkinson-White,

L.E., Lester, K., Vadolas, J., Porteus, M.H., Matthews, J.M., Quinlan, K.G.R.,
and Crossley, M. (2015). Editing the genome to introduce a beneficial naturally
occurring mutation associated with increased fetal globin. Nat. Commun. 6,
7085.
Wray, G.A. (2007). The evolutionary significance of cis-regulatory mutations.
Nat. Rev. Genet. 8, 206216.
Wu, J., Metz, C., Xu, X., Abe, R., Gibson, A.W., Edberg, J.C., Cooke, J., Xie, F.,
Cooper, G.S., and Kimberly, R.P. (2003). A novel polymorphic CAAT/
enhancer-binding protein b element in the FasL gene promoter alters Fas
ligand expression: a candidate background gene in African American systemic
lupus erythematosus patients. J. Immunol. 170, 132138.
Xu, D., Dwyer, J., Li, H., Duan, W., and Liu, J.-P. (2008). Ets2 maintains hTERT
gene expression and breast cancer cell proliferation by interacting with c-Myc.
J. Biol. Chem. 283, 2356723580.
Yalcin, B., Wong, K., Agam, A., Goodson, M., Keane, T.M., Gan, X., Nellaker,
C., Goodstadt, L., Nicod, J., Bhomra, A., et al. (2011). Sequence-based characterization of structural variation in the mouse genome. Nature 477, 326329.
Yang, W.S., Nevin, D.N., Peng, R., Brunzell, J.D., and Deeb, S.S. (1995). A mutation in the promoter of the lipoprotein lipase (LPL) gene in a patient with familial combined hyperlipidemia and low LPL activity. Proc. Natl. Acad. Sci.
USA 92, 44624466.
Zeron-Medina, J., Wang, X., Repapi, E., Campbell, M.R., Su, D., Castro-Giner,
F., Davies, B., Peterse, E.F., Sacilotto, N., Walker, G.J., et al. (2013). A polymorphic p53 response element in KIT ligand influences cancer risk and has undergone natural selection. Cell 155, 410422.
Zhang, X., Miao, X., Tan, W., Ning, B., Liu, Z., Hong, Y., Song, W., Guo, Y.,
Zhang, X., Shen, Y., et al. (2005). Identification of functional genetic variants
in cyclooxygenase-2 and their association with risk of esophageal cancer.
Gastroenterology 129, 565576.
Zhao, X., Huang, H., and Speed, T.P. (2005). Finding short DNA motifs using
permuted Markov models. J. Comput. Biol. 12, 894906.
Zheng, X.-W.W., Kudaravalli, R., Russell, T.T., DiMichele, D.M., Gibb, C., Russell, J.E., Margaritis, P., and Pollak, E.S. (2011). Mutation in the factor VII hepatocyte nuclear factor 4a-binding site contributes to factor VII deficiency.
Blood Coagul. Fibrinolysis 22, 624627.
Zhou, J., and Troyanskaya, O.G. (2015). Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931934.
Zhou, T., Shen, N., Yang, L., Abe, N., Horton, J., Mann, R.S., Bussemaker,
H.J., Gordan, R., and Rohs, R. (2015). Quantitative modeling of transcription
factor binding specificities using DNA shape. Proc. Natl. Acad. Sci. USA
112, 46544659.
Leading Edge
Review
Mitochondria and Cancer
Sejal Vyas,1 Elma Zaganjor,1 and Marcia C. Haigis1,*
1Department of Cell Biology, Ludwig Center at Harvard, Harvard Medical School, Boston, MA 02115, USA
*Correspondence: marcia_haigis@hms.harvard.edu
Mitochondria are bioenergetic, biosynthetic, and signaling organelles that are integral in stress
sensing to allow for cellular adaptation to the environment. Therefore, it is not surprising that mitochondria are important mediators of tumorigenesis, as this process requires flexibility to adapt to
cellular and environmental alterations in addition to cancer treatments. Multiple aspects of mitochondrial biology beyond bioenergetics support transformation, including mitochondrial biogenesis and turnover, fission and fusion dynamics, cell death susceptibility, oxidative stress regulation,
metabolism, and signaling. Thus, understanding mechanisms of mitochondrial function during
tumorigenesis will be critical for the next generation of cancer therapeutics.
Introduction
Historical Perspective
Louis Pasteur identified the importance of oxygen consumption
in 1861, finding that yeast divided more in the presence of oxygen and that oxygen inhibited fermentation, an observation
known as the Pasteur effect. The discovery of mitochondria
in the 1890s, described cytologically by both Richard Altmann
and Carl Benda, began to shed light on this observation, and in
1913, the biochemist Otto Warburg linked cellular respiration
to grana derived from guinea pig liver extracts (Ernster and
Schatz, 1981). Warburg stated that the granules functioned to
enhance the activity of iron-containing enzymes and involved a
transfer to oxygen (Ernster and Schatz, 1981). In the following
decades, many scientists elucidated the machinery that drives
mitochondrial respiration, including tricarboxylic acid (TCA)
cycle and fatty acid β-oxidation enzymes in the mitochondrial
matrix that generate electron donors to fuel respiration and electron transport chain (ETC) complexes and ATP synthase in the inner mitochondrial membrane (IMM) that carry out oxidative
phosphorylation (Ernster and Schatz, 1981). This biochemical
understanding of mitochondrial oxidative phosphorylation gave
mechanistic insight into the Pasteur effect, which could be reconstituted by adding purified, respiring liver mitochondria to
glycolytic tumor supernatants and observing inhibited fermentation (Aisenberg et al., 1957). The ability of mitochondria to inhibit
a glycolytic system suggested an active and direct role for mitochondria in regulating oxidative versus glycolytic metabolism
(Aisenberg et al., 1957).
Warburg's seminal discovery that cancer cells undergo aerobic glycolysis, which refers to the fermentation of glucose to
lactate in the presence of oxygen as opposed to the complete
oxidation of glucose to fuel mitochondrial respiration, brought
attention to the role of mitochondria in tumorigenesis (Warburg,
1956). While the Warburg effect is an undisputed feature of
many (but not all) cancer cells, Warburg's reasoning that it
stemmed from damaged mitochondrial respiration caused immediate controversy (Weinhouse, 1956). We now understand
that while damaged mitochondria drive the Warburg effect in
some cases, many cancer cells that display Warburg metabolism possess intact mitochondrial respiration, with some cancer subtypes actually depending on mitochondrial respiration.
Decades of studies on mitochondrial respiration in cancer have
set the framework for a new frontier focused on additional functions of mitochondria in cancer, which have identified pleiotropic
roles of mitochondria in tumorigenesis.
A major function of mitochondria is ATP production, hence their
nickname, "powerhouse of the cell." However, mitochondria
perform many roles beyond energy production, including the
generation of reactive oxygen species (ROS), redox molecules
and metabolites, regulation of cell signaling and cell death, and
biosynthetic metabolism. These multifaceted functions of mitochondria in normal physiology make them important cellular
stress sensors, and allow for cellular adaptation to the environment. Mitochondria similarly impart considerable flexibility for tumor cell growth and survival in otherwise harsh environments,
such as during nutrient depletion, hypoxia, and cancer treatments, and are therefore key players in tumorigenesis.
There is no simple canon for the role of mitochondria in cancer
development. Instead, mitochondrial functions in cancer vary
depending upon genetic, environmental, and tissue-of-origin
differences between tumors. It is clear that the biology of mitochondria in cancer is central to our understanding of cancer
biology, as many classical cancer hallmarks result in altered
mitochondrial function. This review will summarize aspects of
mitochondrial biology that contribute to tumorigenesis, which
include mitochondrial biogenesis and turnover, fission and
fusion dynamics, cell death, oxidative stress, metabolism and
bioenergetics, signaling, and mtDNA (Figures 1 and 2).
Mitochondrial Biogenesis and Turnover
Mitochondrial mass is dictated by two opposing pathways,
biogenesis and turnover, and has emerged as both a positive
and negative regulator of tumorigenesis. The role of mitochondrial biogenesis in cancer is regulated by many factors, including
metabolic state, tumor heterogeneity, tissue type, microenvironment, and tumor stage. Additionally, mitophagy, the selective
autophagic pathway for mitochondrial turnover, maintains a
healthy mitochondrial population. Importantly, regulation of
both mitochondrial biogenesis and mitophagy is central to
key oncogenic signaling pathways.
Cell 166, July 28, 2016 © 2016 Elsevier Inc.
Figure 1. Mitochondria and Cancer
The role of mitochondrial metabolism, bioenergetics, mtDNA, oxidative stress regulation, fission and fusion dynamics, cell death regulation, biogenesis, turnover,
and signaling in tumorigenesis.
Transcriptional and Signaling Networks Regulating Biogenesis
Mitochondrial biogenesis is regulated by transcriptional programs that coordinate induction of both mitochondrial- and
nuclear-localized genes that encode mitochondrial proteins. The
transcriptional coactivator peroxisome proliferator-activated receptor gamma coactivator-1 alpha (PGC-1α) is a central regulator of mitochondrial biogenesis through interactions with
multiple transcription factors (Tan et al., 2016). PGC-1α levels
often reveal tumor reliance on mitochondrial mass, with high
PGC-1α expression resulting in a dependence on mitochondrial
respiration (Tan et al., 2016). In contrast, PGC-1α acts as a tumor
suppressor in some cancer types, with overexpression resulting
in induction of apoptosis (Tan et al., 2016). Additionally, PGC-1α
is downregulated in hypoxia inducible factor-1 alpha (HIF-1α)-activated renal cell carcinomas, reinforcing a switch to glycolytic
metabolism in low oxygen conditions (LaGory et al., 2015; Zhang
et al., 2007). Therefore, it is important to identify factors that
contribute to the dichotomous effect of PGC-1α on tumor
viability, as this has the potential to identify specific susceptibilities for cancer subtypes.
Figure 2. Mitochondria and Stages of Tumorigenesis
Mitochondrial biology supports tumorigenesis at
multiple stages. Mutations in mitochondrial enzymes generate oncometabolites that result in
tumor initiation. Oxidative stress and mitochondrial signaling can also support tumor initiation.
Mitochondrial metabolic reprogramming, oxidative stress, and signaling can promote tumor
growth and survival. Mitochondria additionally
regulate redox homeostasis and susceptibility to
cell death via alterations in morphology to promote
cell survival. Alterations in mitochondrial mass via
regulation of biogenesis and mitophagy also
contribute to survival depending on cancer type.
Mitochondrial metabolic reprogramming, biogenesis, and redox homeostasis and dynamics also
contribute to metastatic potential of cancer cells.
PGC-1α-dependent mitochondrial biogenesis may also support anchorage-independent cancer cell growth, a key step in
metastasis. Proteomic analysis identified upregulation of mitochondrial proteins involved in metabolism and biogenesis upon
low-attachment culture conditions (Lamb et al., 2014). Additionally, increased mitochondrial mass co-enriched with tumor-initiating activity in patient-derived breast cancer lines, which could
be blocked by PGC-1α inhibition (De Luca et al., 2015). These
findings remain relevant in vivo, as circulating tumor cells
(CTCs) developed from primary orthotopic breast tumors show
increased mitochondrial biogenesis and respiration, with PGC-1α silencing decreasing CTCs and metastasis (LeBleu et al.,
2014). Thus, PGC-1α-dependent mitochondrial biogenesis
may contribute to tumor metastatic potential.
A key activator of mitochondrial biogenesis in cancer is c-Myc,
a transcription factor that globally regulates cell cycle, growth,
metabolism, and apoptosis. Over 400 mitochondrial genes are
identified as c-Myc targets, and initial studies demonstrated
that gain/loss of Myc increases/reduces mitochondrial mass,
respectively (Li et al., 2005). In normal physiology, c-Myc couples mitochondrial biogenesis with cell-cycle progression.
However, elevated mitochondrial biogenesis due to oncogenic
c-Myc increases cellular biosynthetic and respiratory capacity
by upregulating mitochondrial metabolism to support rapid proliferation,
complementing c-Myc's effects on stimulating cell-cycle progression and glycolytic metabolism to coordinate rapid cell
growth (Figure 3).
Another effector of mitochondrial
biogenesis is the mammalian target of
rapamycin (mTOR) signaling pathway,
which is critical for cellular growth and energy homeostasis
and is misregulated in many diseases including cancer. mTOR
regulates mitochondrial biogenesis both transcriptionally via
PGC-1α/Yin Yang 1 (YY1) activation, resulting in mitochondrial
gene expression, and translationally via repression of inhibitory
4E-binding proteins (4E-BPs) that downregulate translation of
nuclear-encoded mitochondrial proteins (Morita et al., 2015)
(Figure 3).
The transcriptional networks regulating biogenesis impact
therapeutic outcomes by providing cancer cells with metabolic
flexibility to adapt to targeted treatments and tumor microenvironments. In B-Raf or N-Ras mutant melanomas, resistance to
MEK inhibitors was partially due to a switch to oxidative metabolism mediated by PGC-1α upregulation and was overcome
by mTORC1/2 inhibition, which repressed PGC-1α expression
(Gopal et al., 2014; Haq et al., 2013). Likewise, in a mouse model
of K-Ras mutant pancreatic ductal adenocarcinoma, cells that
survive oncogene ablation have increased PGC-1α expression
and mitochondrial function, and the reliance on mitochondrial
respiration resulted in sensitivity to oxidative phosphorylation inhibitors (Viale et al., 2014). Cancer cells can adapt their mitochondrial function according to the specific stress. For example,
c-Myc upregulation and glycolytic gene expression enable
resistance to metformin, a complex I inhibitor, in pancreatic cancer cells, which actively utilize mitochondrial respiration due to
PGC-1α expression (Sancho et al., 2015). Similarly, c-Myc-dependent mitochondrial biogenesis is normally opposed by
the HIF-1α signaling pathway, but this balance is altered during
oncogenic c-Myc-driven transformation (Dang et al., 2008).
Therefore, an important consideration in cancer therapeutics
will be addressing routes of bioenergetic plasticity provided by
mitochondria.
Figure 3. Effects of Classical Oncogenic and Tumor Suppressive Pathways on Mitochondrial Biology
Key mechanisms of mitochondrial regulation by c-MYC, K-RAS, PI3K, and p53 signaling pathways. Through transcriptional regulation, c-Myc induces mitochondrial biogenesis and metabolism in addition to its stimulation of cell-cycle progression and glycolysis. c-Myc promotes mitochondrial fusion and respiration,
which can result in increased ROS production and oxidative signaling. Hyperactive PI3K signaling through either PI3K mutation or loss/mutation of the PTEN
tumor suppressor results in mTOR activation, which is additionally regulated by nutrient availability, to regulate cell growth. mTOR promotes mitochondrial
biogenesis both transcriptionally and translationally. Low nutrient conditions that result in a high AMP/ATP ratio activate AMPK, which opposes the mTOR
pathway. During chronic nutrient deprivation, AMPK can also promote mitochondrial biogenesis to allow for metabolic flexibility. Loss of p53 promotes survival
not only via transcriptional regulation of cell death programs, but also through direct interactions with Bcl-2 proteins at the mitochondria. p53 can also induce
mitochondrial respiration to promote tumorigenesis by allowing for metabolic flexibility. Oncogenic K-Ras mutations result in a coordinated program of mitochondrial regulation, reprogramming mitochondrial metabolism through multiple mechanisms as well as promoting mitochondrial fission and mitophagy.
Mitophagy
Clearance of damaged mitochondria via mitophagy is critical for
cellular fitness since dysfunctional mitochondria can impair ETC
function and increase oxidative stress. A major trigger for
mitophagy is via the PTEN-induced putative kinase 1 (PINK1)/
Parkin pathway. This pathway is activated upon mitochondrial
membrane depolarization, a signal of mitochondrial dysfunction
that results from multiple causes including lack of reducing
equivalents, hypoxia, and impaired electron transport. An alternate pathway for mitophagy induction is through the HIF-1α
target genes Bcl-2 and adenovirus E1B 19 kDa-interacting protein 3 (BNIP3) and BNIP3-like (BNIP3L/NIX), which inhibit mitochondrial respiration during hypoxic conditions that could result
in excessive ROS.
Is mitophagy beneficial or harmful to cancers? Similar to autophagy, which is shown to be both pro- and anti-tumorigenic
based on context, the function of mitophagy in transformation
likely depends on tumor stage (Mancias and Kimmelman,
2016). Mitophagy-deficient Parkin null mice develop spontaneous hepatic tumors, and Parkin loss increases tumorigenesis
in multiple cancer models (Matsuda et al., 2015). Additionally,
BNIP3 and NIX are identified as tumor suppressors in multiple
cancer models (Chourasia et al., 2015). Thus, in certain stages
of tumorigenesis, decreased mitophagy may allow for a permissive threshold of dysfunctional mitochondria to persist, generating increased tumor-promoting ROS or other tumorigenic
mitochondrial signals. In contrast, established tumors may
require mitophagy for stress adaptation and survival. Supporting
this concept, BNIP3 is induced in patient glioblastoma samples
in response to hypoxia caused by anti-angiogenic therapy, and
combinatorial angiogenesis and autophagy inhibition had a
potent anti-tumor effect in xenograft glioma models (Hu et al.,
2012). Additionally, oncogenic K-Ras-driven transformation upregulates mitophagy for the clearance of dysfunctional mitochondria, and the accumulation of dysfunctional mitochondria
switches adenoma tumor fate to benign oncocytomas instead
of carcinomas (Guo et al., 2013).
Fission and Fusion Dynamics
Mitochondria are extremely dynamic, and the balance of fission
and fusion dictates their morphology. A critical step in mitochondrial membrane fission is dynamin-related protein-1
(Drp1) recruitment to mitochondria and interaction with its outer
mitochondrial membrane (OMM) receptors, where it causes
membrane constriction fueled by GTPase activity. Drp1 mitochondrial translocation and activity are regulated by phosphorylation mediated by multiple kinases that respond to distinct
cell-cycle and stress conditions (Mishra and Chan, 2016). The
mitofusins, Mfn1 and Mfn2, along with optic atrophy-1 (Opa1)
mediate mitochondrial fusion. Mitochondria exist as either
fused, tubular networks or as fragmented granules depending
on cellular state, with mitochondrial metabolism, respiration,
and oxidative stress regulating fission/fusion machinery (Mishra
and Chan, 2016). Mitochondrial morphology also affects susceptibility to mitophagy and apoptosis (Kasahara and Scorrano,
2014).
Multiple studies have demonstrated an imbalance of fission
and fusion activities in cancer, with elevated fission activity
and/or decreased fusion resulting in a fragmented mitochondrial
network (Senft and Ronai, 2016). Importantly, restoration of
fused mitochondrial networks in these studies, through either
Drp1 knockdown/inhibition or Mfn2 overexpression, impaired
cancer cell growth, suggesting that mitochondrial network remodeling is important in tumorigenesis. Increased Drp1 expression is associated with a migratory phenotype in multiple cancer
types, further highlighting the role of mitochondrial dynamics in
metastasis (Senft and Ronai, 2016).
Altered mitochondrial dynamics are a key feature of K-Ras-dependent cellular transformation, with oncogenic K-Ras
stimulating mitochondrial fragmentation via ERK1/2-mediated
phosphorylation of Drp1 (Kashatus et al., 2015; Serasinghe
et al., 2015). Knockdown or inhibition of Drp1 renders cells
resistant to oncogenic K-Ras-mediated transformation and impairs tumor growth (Kashatus et al., 2015). Additionally, remodeling of the mitochondrial network upon oncogenic K-Ras
expression affects mitochondrial function, decreasing membrane potential and increasing ROS generation (Serasinghe
et al., 2015). Thus, K-Ras-mediated mitochondrial network remodeling creates a state of upregulated tumorigenic stimuli
to support cellular transformation. c-Myc also affects mitochondrial dynamics by altering the expression of multiple
fission and fusion proteins (Graves et al., 2012). However, the
net effect causes mitochondrial fusion (von Eyss et al., 2015),
and further studies are needed to understand the differential
effects of oncogenic signaling pathways on mitochondrial
dynamics.
Cell Death
A hallmark of cancers is their ability to evade cell death, a phenomenon tightly linked to mitochondria. The pro-apoptotic
Bcl-2 family members Bax and Bak are recruited to the OMM
and oligomerize to mediate mitochondrial outer membrane permeabilization (MOMP), resulting in pore formation and cytochrome c release from mitochondria into the cytosol to activate
caspases, the executors of programmed cell death. During
normal physiology, anti-apoptotic family members such as
Bcl-2 and Bcl-XL bind and inhibit Bax/Bak. Tumor cells escape
apoptosis by downregulating pro-apoptotic Bcl-2 genes and/or
upregulating anti-apoptotic Bcl-2 genes, achieved through multiple mechanisms reviewed elsewhere (Lopez and Tait, 2015).
The balance of pro- and anti-apoptotic proteins affects a cancer
cell's susceptibility to apoptotic stimuli and may predict how a
tumor will respond to chemotherapy (Sarosiek et al., 2013).
Mitochondrial shape also dictates apoptotic susceptibility, as
Drp1 loss delays cytochrome c release and apoptotic induction,
although follow-up work indicated that fission was not required
for Bax/Bak-mediated apoptosis (Martinou and Youle, 2011).
Instead, a GTPase-independent function of Drp1 in membrane
remodeling and hemifusion results in Bax oligomerization and
subsequent MOMP, indicating that Drp1 can promote apoptosis
independent of fission (Martinou and Youle, 2011). The
importance of mitochondrial shape in apoptosis is further demonstrated by the mitochondrial hyperfragmentation induced by Mfn1 loss, which causes resistance to apoptotic stimuli due to the loss of
Bax interaction with mitochondrial membranes. In this study,
Drp1 inhibition rescued sensitivity to apoptotic stimuli by
restoring a balanced mitochondrial network (Renault et al.,
2015). Additionally, Mfn1 is a target of the MEK/ERK signaling
pathway: phosphorylated Mfn1 inhibits mitochondrial fusion
and interacts with Bak to stimulate its oligomerization and
subsequent MOMP (Pyakurel et al., 2015). Therefore, while fission
and fusion do not necessarily regulate apoptosis per se, a balance
of these activities appears to generate a mitochondrial shape that
supports interactions with pro-apoptotic Bcl-2 proteins.
Oxidative Stress
ROS, in the form of superoxide and hydroxyl free radicals, and
hydrogen peroxide, are produced from physiological metabolic
reactions. Mitochondria are major contributors to cellular ROS
and have multiple antioxidant pathways to neutralize ROS,
including superoxide dismutase (SOD2), glutathione, thioredoxin, and peroxiredoxins. The early observation that cancer
cells have high ROS levels led to an overly simple hypothesis
that inhibiting ROS could be a successful therapeutic strategy.
However, a more complex picture is emerging, in which ROS
stimulates signaling and proliferation, and the concomitant
upregulation of antioxidant pathways prevents ROS-mediated
cytotoxicity and may even enhance tumor survival (Shadel and
Horvath, 2015; Sullivan and Chandel, 2014).
Multiple physiological reactions, including electron transport
by the ETC and NAD(P)H oxidases, result in ROS production,
and these are often exacerbated during tumorigenesis by
oncogenic signaling, ETC mutations, and hypoxic microenvironments. High levels of ROS contribute to the oxidation of macromolecules, such as lipids, proteins, and DNA, and can contribute
to genomic instability to promote transformation. However,
modest elevations of ROS observed in many tumors can regulate
cell signaling via cysteine oxidation. Indeed, H2O2 inactivates
the tumor suppressor PTEN by oxidizing active site cysteine
residues, causing the formation of a disulfide bond, which
prevents PTEN from inactivating the PI3K pathway (Sullivan
and Chandel, 2014). Since ROS can inactivate protein tyrosine
phosphatases through oxidation of cysteine residues, ROS
may have many yet-to-be-discovered effects on diverse,
mitogen-activated pathways that are normally inhibited by phosphatases (Sullivan and Chandel, 2014). ROS-mediated regulation of oncogenic signaling also affects metastasis: oxidation
of cysteines in Src increased its oncogenic ability, promoting tumor cell migration and metastasis in multiple tumor types, and
these phenotypes were blocked by a ROS scavenger (Porporato
et al., 2014).
In response to elevated ROS, many tumors upregulate protective antioxidant pathways. For example, oncogenic K-Ras, B-Raf,
and c-Myc actively inhibit ROS through regulation of nuclear factor (erythroid-derived 2)-like 2 (NRF2), a transcriptional regulator
of the antioxidant response, to promote tumorigenesis (DeNicola
et al., 2011). Similarly, a study in melanoma found that circulating
tumor cells had higher levels of NADPH than primary tumor sites,
presumably to combat the increased ROS caused by the stress
of metastasis (Piskounova et al., 2015). In this system, antioxidants promoted distant metastasis, while folate pathway inhibition prevented metastasis due to decreased NADPH production
but had no effect on the primary tumor. Similarly, antioxidant
treatment increased the number of metastases in a mouse model
of malignant melanoma, causing increased invasiveness dependent on glutathione synthesis (Le Gal et al., 2015). Thus,
successful tumors maintain ROS levels within a window that
stimulates proliferation without causing cytotoxicity. The balance of ROS production and antioxidant expression is critical
for maintaining this tumor-promoting ROS level.
The requirement for upregulated antioxidant pathways may be
an Achilles' heel for tumor cells: combination therapy using glutathione and thioredoxin pathway inhibitors has shown promising results
in vitro and in vivo in breast cancer models (Harris et al., 2015).
Targeting other aspects of mitochondrial metabolism that
contribute to redox regulation has also proven to be a
successful anti-cancer strategy. For example, inhibition of glutamate dehydrogenase 1 (GDH1) increased ROS by reducing
levels of fumarate, an activator of the antioxidant enzyme glutathione peroxidase 1 (GPx1), to slow cancer growth (Jin et al., 2015).
Metabolism
One hallmark of tumors is metabolic reprogramming, which supports macromolecule synthesis, bioenergetic demand, and
cellular survival (Pavlova and Thompson, 2016). Mitochondria
are hubs for metabolic reactions and drive this reprogramming
through multiple mechanisms.
Alterations in Glucose Utilization
Many tumors divert glycolytic intermediates into the pentose
phosphate pathway, serine biosynthesis, and lipid biosynthesis,
as opposed to complete oxidation by mitochondrial respiration.
In some tumors, this is achieved by limiting pyruvate utilization
by mitochondria. The availability of pyruvate for mitochondrial
oxidation is regulated by pyruvate kinase (PKM), which catalyzes
the final step of glycolysis to generate pyruvate. Cancers specifically upregulate the PKM2 isoform, which has low activity, allowing upstream glycolytic intermediates to accumulate and
be used for anabolic processes (Christofk et al., 2008). Additionally, the mitochondrial pyruvate carrier (MPC1 and MPC2) is
either lost or downregulated in a number of cancers (Schell
et al., 2014). MPC re-expression had a profound effect on
reducing tumor growth in xenograft models, suggesting that
the expression of this fuel gatekeeper is an important determinant of growth (Yang et al., 2014). Furthermore, MPC loss stimulates compensatory pathways that maintain fuel oxidation by
the TCA cycle, including upregulation of glutaminolysis and the
use of fatty acids and branched chain amino acids, demonstrating mitochondrial metabolic flexibility (Schell et al., 2014;
Vacanti et al., 2014; Yang et al., 2014). Thus, mitochondria
remain functional during aerobic glycolysis, and mitochondrial-dependent metabolic reprogramming can support bioenergetic
homeostasis during Warburg metabolism.
Reprogramming of Amino Acid Metabolism
Glutamine can be a substrate for TCA cycle oxidation and a
starting material for macromolecule synthesis (DeBerardinis
et al., 2007). The amide nitrogen on glutamine is used in nucleotide and amino acid synthesis, and glutamine-derived carbons
are used in glutathione, amino acid, and lipid synthesis. Catabolism of glutamine, termed glutaminolysis, is elevated in many
glutamine-addicted tumors and is often driven by c-Myc upregulation of glutaminase (GLS), which converts glutamine to glutamate and ammonia (Stine et al., 2015). Glutamate is oxidized
to α-ketoglutarate (α-KG) by GDH, providing an entry point into
the TCA cycle. This process is inhibited by the mitochondrial-localized sirtuin, SIRT4, a tumor suppressor in multiple cancer
models. SIRT4 expression in B cell lymphoma cells downregulates glutamine uptake and inhibits growth, whereas SIRT4
loss in an Eμ-myc B cell lymphoma model increases glutamine
consumption and accelerates tumorigenesis (Jeong et al.,
2014). In addition, transaminases utilize glutamate nitrogen to
couple α-KG production to synthesis of non-essential amino
acids, and tumor cells can utilize this pathway to support biosynthesis and redox homeostasis. For example, oncogenic K-Ras
reprograms glutamine metabolism by transcriptional downregulation of GDH1 and upregulation of GOT1, the aspartate
transaminase, to produce cytosolic oxaloacetate, which can ultimately lead to an increase in NADPH/NADP+ ratio through conversion to pyruvate (Son et al., 2013). Similarly, transaminases,
but not GDH1, are upregulated in 3D cultures of proliferating
mammary epithelial cells compared to quiescent cells, suggesting that this pathway is important during cancer cell proliferation
to support biosynthesis (Coloff et al., 2016).
In tumor cells with dysfunctional mitochondria due to ETC or
TCA cycle enzyme mutations, a fraction of glutamine-derived
α-KG undergoes reductive carboxylation to support biosynthesis and redox homeostasis (Mullen et al., 2011). This pathway
is dependent on the NADP+/NADPH-utilizing isocitrate dehydrogenase isoforms IDH1 (cytosolic) and IDH2 (mitochondrial),
which catalyze the reverse reaction, producing isocitrate from
α-KG. Reductive carboxylation can also support anchorage-independent tumor growth in spheroid cultures by mitigating
mitochondrial ROS in a coordinated cycle in which cytosolic
reductive carboxylation by IDH1 supports mitochondrial oxidative metabolism and NADPH production by IDH2 (Jiang et al.,
2016).
As nutrients are oxidized to produce biosynthetic precursors,
electrons are removed from carbon. Therefore, electron acceptors can quickly become limiting in highly proliferating cells.
This observation was highlighted in a series of studies demonstrating that beyond ATP production, mitochondrial respiration
is required to replenish electron-accepting cofactors NAD+ and
FAD (Birsoy et al., 2015; Sullivan et al., 2015). Interestingly,
when mitochondrial respiration is impaired, rather than ATP,
the electron acceptors are most limiting for de novo synthesis
of aspartate, a key amino acid required for protein and nucleotide synthesis.
Aside from coordinating fuel oxidation, mitochondria contribute to tumor progression through nucleotide synthesis via
one-carbon (1C) metabolism. The mitochondrial folate synthesis pathway consists of serine hydroxymethyltransferase 2
(SHMT2) and methylenetetrahydrofolate dehydrogenase 2
(MTHFD2). A meta-analysis of gene expression profiles identified MTHFD2 as overexpressed in many human tumors, and
further studies revealed its importance in survival of cancer
cells (Nilsson et al., 2014). Unlike the cytosolic arm of folate
metabolism that primarily uses serine, the mitochondrial arm
also uses glycine as a carbon source, a potential vulnerability
for cancers that upregulate this pathway. Metabolic profiling
of NCI-60 lines revealed high correlation of proliferation with
glycine consumption along with the increase in SHMT2,
MTHFD2, and MTHFD1L (Jain et al., 2012). While the cytosolic
pathway can compensate for loss of mitochondrial 1C metabolism, cells become dependent on extracellular serine and
glycine for growth and are thus susceptible to inhibition of serine
catabolism, highlighting the importance of mitochondrial 1C
metabolism in supporting tumorigenesis during nutrient deprivation (Ducker et al., 2016). For example, SHMT2 is expressed in
ischemic tumor zones, providing proliferative advantage under
hypoxic conditions (Kim et al., 2015). Additionally, SHMT2 regulation of serine metabolism contributes to NADPH production and detoxification of ROS under hypoxia, a function important for survival of Myc-driven cancers (Ye et al., 2014).
Lipid Metabolism
Unlike other fuels, lipid utilization in cancer is less defined at the
molecular level. Cancer-specific alterations of lipid metabolism
seem to be unique to tumor type, allowing for some cancers to
upregulate fatty acid oxidation (FAO) while others are more
dependent on lipid synthesis. Upregulation of lipogenesis is
postulated to be a common feature across most tumors, in
part to produce membranes for proliferation (Currie et al.,
2013). Inhibition of ATP-citrate lyase (ACLY), which converts
mitochondrial-derived citrate to acetyl-CoA in the cytoplasm to
support lipogenesis, impairs tumorigenesis in multiple models
(Currie et al., 2013). In contrast, certain cancer types including
lymphomas and leukemias rely primarily on FAO for ATP production (Carracedo et al., 2013). Additionally, FAO may be a
preferred fuel choice for cancers undergoing stress as it is a
crucial survival mechanism for breast cancer cells undergoing
loss of attachment to the extracellular matrix (Carracedo et al.,
2013). However, mechanisms that upregulate FAO in cancers
remain poorly understood. In one example, tumor cell upregulation of the brain-specific isoform of carnitine palmitoyltransferase (Cpt-1c), required for mitochondrial FA import, resulted in
increased FAO and ATP production and resistance to metabolic
stress (Carracedo et al., 2013). Moreover, increased FAO may
confer benefits beyond ATP generation such as maintaining
redox homeostasis (Carracedo et al., 2013). Finally, production
of acetyl-CoA from oxidized fatty acids could be used for epigenetic remodeling of chromatin, subsequently causing lasting
changes in metabolism.
Studying Cancer Metabolism In Vivo
Recent work has highlighted the importance of studying cancer
metabolism in models comparable to the in vivo disease. For
example, while glutamine fuels TCA cycle anaplerosis in vitro,
this is not necessarily true of all tumors in vivo. Studies
comparing the fate of labeled glucose and glutamine in mouse
models of K-Ras-driven non-small-cell lung cancer showed minimal contribution of glutamine to TCA cycle intermediates (Davidson et al., 2016). Additionally, studies in glioblastoma cells
showed that glutamine-dependent anaplerosis was not required
for growth, with cells secreting glutamate even under glutamine
starvation conditions (Tardito et al., 2015). In this study, glutamine synthetase (GS) expression sustained growth and purine
nucleotide biosynthesis during glutamine starvation. Furthermore, primary patient-derived glioma stem-like cells grew independently of glutamine supplementation. These studies highlight
the importance of understanding in vivo metabolic requirements
of tumor cells when designing therapeutic strategies.
Mitochondrial Signaling
Mitochondrial biology and tumorigenic signaling intersect at
multiple levels. First, classical oncogenic signaling pathways
alter mitochondrial functions to support tumorigenesis. Second,
direct signals from mitochondria affect cellular physiology and
tumorigenesis. Finally, mutations in mitochondrial enzymes can
result in oncometabolite production, a novel set of mitochondrial
signaling molecules that function in tumor initiation.
Classical Oncogenic and Tumor Suppressive Pathways
Regulate Mitochondrial Biology
The resurgence of mitochondrial research has led to the discovery that established tumor suppressors and oncogenes directly
regulate mitochondrial biology. Several hallmark cancer signaling pathways that alter mitochondrial biology to promote
transformation are described herein (Figure 3).
In addition to promoting mitochondrial biogenesis, numerous
studies have linked c-Myc with mitochondrial metabolism in cancer. The importance of mitochondrial metabolism in c-Myc-driven
growth was demonstrated in a functional screen of
Myc-responsive cDNAs for the ability to rescue growth of c-Myc-null cells.
The screen identified SHMT2, which catalyzes the first reaction in mitochondrial
1C metabolism, as the only target that could partially rescue
growth (Nikiforov et al., 2002). While the tumorigenic contribution
of increased mitochondrial biogenesis and metabolism in oncogenic c-Myc-driven cancers is difficult to separate from its global
upregulation of transcription, suppression of glutaminolysis can
inhibit proliferation of c-Myc-driven lymphoma cells (Jeong et al.,
2014; Le et al., 2012).
An important signaling pathway in hypoxic tumor microenvironments is mediated by HIF-1α, which upregulates glycolytic
metabolism in low oxygen conditions and inhibits mitochondrial
respiration (Mucaj et al., 2012). Mitochondrial-derived ROS also
regulate the HIF-1α pathway via inhibition of prolyl hydroxylases (PHDs), negative regulators of HIF signaling. SIRT3, a
mitochondrial deacetylase, is an important regulator of this
pathway by maintaining redox homeostasis via deacetylation
and activation of mitochondrial SOD2 and IDH2 and indirectly
through transcriptional upregulation of antioxidant pathways
(Bause and Haigis, 2013). SIRT3-dependent reduction in mitochondrial ROS results in HIF-1α degradation, limiting glycolysis
and the Warburg effect in tumors (Bell et al., 2011; Finley et al.,
2011).
In addition to the pleiotropic effects of oncogenic K-Ras
signaling on proliferation, apoptosis, and metabolism, oncogenic K-Ras results in a coordinated program of mitochondrial
regulation that supports transformation (Pylayeva-Gupta et al.,
2011). Multiple K-Ras-dependent mechanisms can downregulate mitochondrial respiration including upregulation of mitochondrial fission (Kashatus et al., 2015; Serasinghe et al.,
2015), transcriptional downregulation of complex I (Wang et al.,
2015), and ERK-phosphorylation-dependent mitochondrial
translocation of phosphoglycerate kinase I (PGK1) (Li et al.,
2016). Oncogenic K-Ras also promotes upregulation of mitophagy to preserve mitochondrial function under starvation conditions. Autophagy inhibition in cancers with active K-Ras results
in a decline in mitochondrial respiration, TCA metabolite, and
energy levels during starvation; thus, this pathway may be
important for tumor cell survival in nutrient-depleted microenvironments (Guo et al., 2011).
The PI3K/Akt signaling pathway stimulates cell growth and is
often activated in cancer either through oncogenic mutations
in signaling kinases or loss/mutation of the PTEN tumor suppressor, a key phosphatase that shuts off this pathway (Papa et al.,
2014). Although PI3K signaling induces cell growth and upregulates glycolysis, metabolic adaptation via a switch to mitochondrial oxidative phosphorylation can mediate resistance to PI3K
inhibitors, undermining the effectiveness of PI3K-specific targeted therapy (Ghosh et al., 2015). A major downstream effector
of active PI3K/Akt signaling is mTOR, which participates in
mTORC1 and mTORC2 signaling complexes to couple nutrient
and growth-factor sensing to cellular growth through regulation
of translation, anabolic metabolism, and autophagy (Dibble
and Cantley, 2015). In addition to regulating mitochondrial
biogenesis, mTORC1 stimulates multiple mitochondrial metabolic pathways. The transcriptional repression of SIRT4 downstream of mTORC1 activity results in GDH activation to
upregulate glutaminolysis (Csibi et al., 2013). mTORC1 also induces the mitochondrial folate pathway to promote de novo
purine synthesis via activation of the transcription factor ATF4
to result in upregulation of MTHFD2 expression (Ben-Sahra
et al., 2016).
The AMP-activated protein kinase (AMPK) signaling network is activated during low energy conditions, directly inhibiting multiple
targets including mTORC1 to restore energy homeostasis.
AMPK is a critical downstream target of the liver kinase B1
(LKB1) tumor suppressor, which is mutated in the inherited
cancer disorder Peutz-Jeghers syndrome and mediates many
LKB1 tumor suppressive functions (Faubert et al., 2015). However, AMPK loss does not fully recapitulate LKB1 loss and
AMPK has both pro- and anti-tumorigenic effects, which
appear dependent on the presence of other oncogenic drivers
as well as tumor stage (Faubert et al., 2015). While AMPK
loss can uncouple proliferation from energy sensing to allow
for unhindered proliferation with oncogenic growth signaling,
AMPK functions in metabolic adaptation and mitochondrial
homeostasis can be beneficial in established tumors. For
example, AMPK promotes mitophagy through phosphorylation
of ULK kinases and is required for cell survival during starvation (Faubert et al., 2015). Additionally, AMPK activation in
response to ETC dysfunction results in mitochondrial fragmentation through direct phosphorylation of mitochondrial
fission factor, an OMM receptor for Drp1 (Toyama et al.,
2016). Finally, sustained energy deprivation can result in
AMPK-mediated upregulation of mitochondrial biogenesis via
PGC-1α, allowing the cell further metabolic plasticity (Faubert
et al., 2015).
p53 is a commonly mutated tumor suppressor and has been
extensively studied due to its transcriptional regulation of cell-cycle and apoptotic genes. It is now appreciated that p53 also
has functions in the regulation of cellular metabolism via transcriptional activation of metabolic genes (Berkers et al., 2013).
p53 limits glycolysis and drives transcription of genes required
for ETC assembly and maintenance (Berkers et al., 2013). However, more recent work has suggested an alternate side to p53's
role in tumorigenesis, with its ability to allow for adaptation to
metabolic stress resulting in pro-survival effects in tumor cells.
These pro-survival effects are partially accomplished through
upregulation of mitochondrial FAO and respiration, allowing cancer cells to adapt to starvation conditions (Jiang et al., 2015). In
addition to transcriptional regulation of mitochondrial activity,
p53 also directly functions at the mitochondria to induce
apoptosis in response to stress via interactions with Bcl-2 family
members (Vaseva and Moll, 2009). Tumor-derived p53 mutants no longer interact with Bcl-2 and do not trigger mitochondrial outer membrane permeabilization (Vaseva and Moll,
2009). Thus, in addition to effects on transcriptional activity,
p53 mutations can also promote cancer survival through direct
mitochondrial functions.
Mitochondrial Retrograde Signals
Mitochondria are important stress sensors, and retrograde
signaling from the mitochondria allows the cell to adapt to its
environment. Metabolites generated by mitochondrial metabolic
pathways, including the TCA cycle, β-oxidation, and the ETC,
affect both nuclear gene transcription via chromatin modification
as well as cytosolic signaling pathways. For example, the
TCA cycle intermediate α-KG is a cosubstrate for many enzymes
in the cytoplasm and nucleus including the PHD family
and the 10-11-translocation methylcytosine dioxygenase (TET)
and Jumonji-C histone demethylase (JHDM) families of chromatin-modifying enzymes. In the case of chromatin regulation, glutamine-derived α-KG contributes to TET-dependent
demethylation reactions (Carey et al., 2015). Additional mitochondrial regulation of chromatin occurs through histone
acetylation. ACLY-dependent production of acetyl-CoA from
mitochondrial-derived citrate is used by histone acetyl transferases (HATs), and oncogenic signaling pathways modify histone
acetylation patterns in an ACLY-dependent manner (Lee et al.,
2014; Wellen et al., 2009). In addition to chromatin modification,
acetyl-CoA generated from mitochondrial-derived citrate is used
for the acetylation of many cytosolic and mitochondrial proteins
to modulate protein activity. Thus, mitochondrial-derived metabolites can affect signaling pathways, nuclear transcription, and
chromatin modification.
In addition to signaling molecules, readouts of mitochondrial
integrity including Δψm and MOMP also function as important
signals, enabling the cell to respond to unhealthy/dysfunctional
mitochondria. Since the membrane potential generated by
healthy mitochondria is required for protein import into the mitochondrial matrix and intermembrane space via the TIM22 and
TIM23 translocator complexes, loss of membrane potential impairs import. If the defect in protein import is severe, the cell
can initiate mitophagy to clear these unhealthy mitochondria
as discussed above. Additionally, ATP generation by the ETC
is an important signaling output with diminished ETC activity
increasing the AMP/ATP ratio to activate AMPK signaling. ETC
dysfunction can also result in decreased NAD+ levels, a co-substrate for both the sirtuin and poly(ADP-ribose) protein families,
which have many functions in tumorigenesis (German and Haigis, 2015; Vyas and Chang, 2014). Finally, ROS regulates cytosolic signaling networks to promote tumorigenesis (as discussed
above).
Mitochondrial Oncometabolites
Dominant mutations in mitochondrial enzymes led to the exciting
identification of mitochondrial-derived signaling molecules,
termed oncometabolites. Mutant versions of cytoplasmic and
mitochondrial IDH isoforms, found in a striking 20% of acute
myeloid leukemias and 70% of glioblastomas, reduce α-KG to
generate the oncometabolite (R)-2-hydroxyglutarate ([R]-2-HG)
(Dang et al., 2009; Ward et al., 2010). In addition, loss of function
of TCA cycle enzymes succinate dehydrogenase (SDH) and
fumarate hydratase (FH), underlying the inherited cancer predispositions hereditary paraganglioma syndrome and hereditary
leiomyomatosis and renal-cell cancer syndrome, respectively,
result in the accumulation of metabolic intermediates succinate
and fumarate, which function as oncometabolites when in
excess.
A major mode of action of these oncometabolites is owed to
their structural similarity to α-KG, allowing them to act as
competitive inhibitors of α-KG-dependent enzymes including
the TET and JHDM families of chromatin-modifying enzymes
and the PHD family (Nowicki and Gottlieb, 2015). Inhibition of
TET activity leads to hypermethylation of CpG islands, found
near gene promoters, which results in gene silencing (Nowicki
and Gottlieb, 2015). Additionally, repressive histone methylation
marks on H3K9 and H3K27 are observed in IDH1 and IDH2
mutant gliomas due to JHDM inhibition (Lu et al., 2012). Therefore, through the production of oncometabolites, mitochondria
exert strong influence on chromatin structure to promote
tumor initiation. Both succinate and fumarate accumulation stabilize HIF-1α via PHD inhibition, reinforcing the Warburg effect
(MacKenzie et al., 2007; Nowicki and Gottlieb, 2015). In contrast,
(R)-2-HG activates PHD enzymes, diminishing HIF-1α levels,
which enhanced the proliferation of astrocytes
(Koivunen et al., 2012). (R)-2-HG alone reversibly recapitulates
the effects of IDH mutation on leukemogenesis while its enantiomer (S)-2-HG had no effect even though it more potently
inhibited TET2 and PHDs, suggesting that differential requirements for HIF-1α depending on cell type can influence neoplasia
(Losman et al., 2013).
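As a rough illustration of the competitive inhibition described above, the sketch below applies the textbook Michaelis-Menten rate law with a competitive inhibitor, in which the inhibitor raises the apparent Km by a factor of (1 + [I]/Ki) and leaves Vmax unchanged. The function name, the parameter names, and all numerical values are illustrative assumptions; none are taken from the cited studies, and real α-KG-dependent enzymes may deviate from this simple model.

```python
def competitive_rate(vmax, km, substrate, inhibitor, ki):
    """Textbook Michaelis-Menten rate with a competitive inhibitor.

    All arguments are hypothetical values in arbitrary but consistent
    units. Here substrate stands in for alpha-KG and inhibitor for an
    accumulated oncometabolite such as succinate or fumarate.
    """
    # A competitive inhibitor leaves vmax unchanged and scales Km upward.
    apparent_km = km * (1.0 + inhibitor / ki)
    return vmax * substrate / (apparent_km + substrate)


if __name__ == "__main__":
    # Illustrative numbers only: substrate held fixed while inhibitor rises.
    for inhibitor in (0.0, 1.0, 10.0):
        rate = competitive_rate(vmax=1.0, km=1.0, substrate=1.0,
                                inhibitor=inhibitor, ki=1.0)
        print(f"[I] = {inhibitor:5.1f}: relative rate = {rate:.3f}")
```

In this simplified model, raising the substrate concentration relative to the inhibitor restores the rate, so the ratio of oncometabolite to α-KG, rather than the oncometabolite level alone, sets the degree of inhibition.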
FH deficiency also supports tumorigenesis independently of
α-KG/HIF-1α. The high level of fumarate accumulation in FH-deficient tumors/cells results in increased protein succination
through the covalent modification of cysteines by fumarate.
Cysteine succination inhibits Kelch-like ECH-associated protein 1 (KEAP1), a negative regulator of Nrf2, to result in upregulation of antioxidant pathways (Adam et al., 2011). Additionally,
accumulated fumarate can bind to glutathione to generate succinated glutathione, an alternate glutathione reductase substrate that decreases NADPH and increases ROS levels (Sullivan
et al., 2013). Thus, FH deficiency can alter redox homeostasis to
promote tumorigenesis.
mtDNA Mutations
The presence of a separate mitochondrial genome adds to the
unique and complex biology of this organelle, as mutations in
mtDNA impact tumorigenesis. Mitochondria contain multiple
copies of a circular 16 kb genome that encodes 13 ETC subunits, mitochondrial rRNAs, and tRNAs. In addition to distinct
mtDNA haplotypes that exist among different human populations, many germline and somatic mtDNA mutations associated
with cancer risk have been identified (van Gisbergen et al.,
2015). Although the functional consequence of many of these
polymorphisms/mutations is not well understood, some mutations occur in ETC genes and can result in increased oxidative
stress due to ETC dysfunction to promote tumorigenesis. Differences in mtDNA copy number are implicated in tumorigenesis, although both low and high copy numbers have been
associated with various cancers, similar to the varying associations between mitochondrial biogenesis and tumorigenesis
(Reznik et al., 2016). Since mitochondria contain multiple copies
of the mtDNA genome, cells are either homoplasmic or heteroplasmic regarding their mtDNA composition, with mutant
copies of the genome spreading through the mitochondrial
network through fission and fusion cycles. In this way, dominant mtDNA mutations become established in a clonal cell
population. mtDNA mutations and haplotypes associated with
various cancer types are reviewed elsewhere (van Gisbergen
et al., 2015).
Concluding Remarks
Mitochondria are complex organelles that influence cancer initiation, growth, survival, and metastasis, and many facets of mitochondrial biology beyond energy production actively contribute
to tumorigenesis. These include mitochondrial mass, dynamics,
cell death regulation, redox homeostasis, metabolic regulation,
and signaling. The interplay between these aspects of mitochondrial biology results in coordinated programs of mitochondrial
regulation of cellular physiology and highlights the pleiotropic
functions of mitochondria in cancer. Additionally, similar to
the transforming discoveries of oncogenic mutations in growth
factor signaling pathways, mutations in mitochondrial metabolic
enzymes are an exciting new frontier in cancer biology.
The flexibility that mitochondria bestow on tumor cells, including
alterations in fuel utilization, bioenergetics, cell death susceptibility, and oxidative stress, allows for survival in the face of
adverse environmental conditions such as starvation and during
chemotherapeutic and targeted cancer treatments. Therefore,
in order to effectively treat cancer, the escape routes to therapeutic interventions provided by mitochondria must also be
considered; future studies into combination therapies that
remove this flexibility will be important to advance cancer
treatments.
ACKNOWLEDGMENTS
We apologize for all the primary literature and work we could not cite due to
space limitations. This review is not meant to be a comprehensive summary
of all the work done in the field of mitochondrial functions in cancer. We thank
Jonathan Coloff, Lydia Finley, Karina Gonzalez, Jessica Spinelli, and Alison
Ringel for discussion on the manuscript. S.V. is supported by a postdoctoral
fellowship from the American Cancer Society (127097-PF-14-255-01-TBE).
E.Z. is supported by a postdoctoral fellowship from the American Heart Association (15POST25560077). M.C.H. is supported by the Ludwig Center at Harvard, the Paul F. Glenn Foundation, and the National Institute of Diabetes and
Digestive and Kidney Diseases (1R01DK103295-01A1).
REFERENCES
Adam, J., Hatipoglu, E., O'Flaherty, L., Ternette, N., Sahgal, N., Lockstone, H.,
Baban, D., Nye, E., Stamp, G.W., Wolhuter, K., et al. (2011). Renal cyst formation in Fh1-deficient mice is independent of the Hif/Phd pathway: roles for
fumarate in KEAP1 succination and Nrf2 signaling. Cancer Cell 20, 524–537.
Csibi, A., Fendt, S.M., Li, C., Poulogiannis, G., Choo, A.Y., Chapski, D.J.,
Jeong, S.M., Dempsey, J.M., Parkhitko, A., Morrison, T., et al. (2013). The
mTORC1 pathway stimulates glutamine metabolism and cell proliferation by
repressing SIRT4. Cell 153, 840–854.
Currie, E., Schulze, A., Zechner, R., Walther, T.C., and Farese, R.V., Jr. (2013).
Cellular fatty acid metabolism and cancer. Cell Metab. 18, 153–161.
Dang, C.V., Kim, J.W., Gao, P., and Yustein, J. (2008). The interplay between
MYC and HIF in cancer. Nat. Rev. Cancer 8, 51–56.
Dang, L., White, D.W., Gross, S., Bennett, B.D., Bittinger, M.A., Driggers, E.M.,
Fantin, V.R., Jang, H.G., Jin, S., Keenan, M.C., et al. (2009). Cancer-associated
IDH1 mutations produce 2-hydroxyglutarate. Nature 462, 739–744.
Davidson, S.M., Papagiannakopoulos, T., Olenchock, B.A., Heyman, J.E., Keibler, M.A., Luengo, A., Bauer, M.R., Jha, A.K., O'Brien, J.P., Pierce, K.A., et al.
(2016). Environment Impacts the Metabolic Dependencies of Ras-Driven Non-Small Cell Lung Cancer. Cell Metab. 23, 517–528.
De Luca, A., Fiorillo, M., Peiris-Pagès, M., Ozsvari, B., Smith, D.L., Sanchez-Alvarez, R., Martinez-Outschoorn, U.E., Cappello, A.R., Pezzi, V., Lisanti,
M.P., and Sotgia, F. (2015). Mitochondrial biogenesis is required for the
anchorage-independent survival and propagation of stem-like cancer cells.
Oncotarget 6, 14777–14795.
DeBerardinis, R.J., Mancuso, A., Daikhin, E., Nissim, I., Yudkoff, M., Wehrli, S.,
and Thompson, C.B. (2007). Beyond aerobic glycolysis: transformed cells can
engage in glutamine metabolism that exceeds the requirement for protein and
nucleotide synthesis. Proc. Natl. Acad. Sci. USA 104, 19345–19350.
DeNicola, G.M., Karreth, F.A., Humpton, T.J., Gopinathan, A., Wei, C., Frese,
K., Mangal, D., Yu, K.H., Yeo, C.J., Calhoun, E.S., et al. (2011). Oncogene-induced Nrf2 transcription promotes ROS detoxification and tumorigenesis.
Nature 475, 106–109.
Dibble, C.C., and Cantley, L.C. (2015). Regulation of mTORC1 by PI3K
signaling. Trends Cell Biol. 25, 545–555.
Aisenberg, A.C., Reinafarje, B., and Potter, V.R. (1957). Studies on the Pasteur
effect. I. General observations. J. Biol. Chem. 224, 1099–1113.
Ducker, G.S., Chen, L., Morscher, R.J., Ghergurovich, J.M., Esposito, M.,
Teng, X., Kang, Y., and Rabinowitz, J.D. (2016). Reversal of Cytosolic One-Carbon Flux Compensates for Loss of the Mitochondrial Folate Pathway.
Cell Metab. 23, 1140–1153.
Bause, A.S., and Haigis, M.C. (2013). SIRT3 regulation of mitochondrial oxidative stress. Exp. Gerontol. 48, 634639.
Ernster, L., and Schatz, G. (1981). Mitochondria: a historical review. J. Cell Biol.
91, 227s255s.
Bell, E.L., Emerling, B.M., Ricoult, S.J., and Guarente, L. (2011). SirT3 suppresses hypoxia inducible factor 1a and tumor growth by inhibiting mitochondrial ROS production. Oncogene 30, 29862996.
Faubert, B., Vincent, E.E., Poffenberger, M.C., and Jones, R.G. (2015). The
AMP-activated protein kinase (AMPK) and cancer: many faces of a metabolic
regulator. Cancer Lett. 356 (2 Pt A), 165170.
Ben-Sahra, I., Hoxhaj, G., Ricoult, S.J., Asara, J.M., and Manning, B.D. (2016).
mTORC1 induces purine synthesis through control of the mitochondrial tetrahydrofolate cycle. Science 351, 728733.
Finley, L.W., Carracedo, A., Lee, J., Souza, A., Egia, A., Zhang, J., Teruya-Feldstein, J., Moreira, P.I., Cardoso, S.M., Clish, C.B., et al. (2011). SIRT3 opposes
reprogramming of cancer cell metabolism through HIF1a destabilization. Cancer Cell 19, 416428.
Berkers, C.R., Maddocks, O.D., Cheung, E.C., Mor, I., and Vousden, K.H.
(2013). Metabolic regulation by p53 family members. Cell Metab. 18, 617633.
Birsoy, K., Wang, T., Chen, W.W., Freinkman, E., Abu-Remaileh, M., and Sabatini, D.M. (2015). An Essential Role of the Mitochondrial Electron Transport
Chain in Cell Proliferation Is to Enable Aspartate Synthesis. Cell 162, 540551.
Carey, B.W., Finley, L.W., Cross, J.R., Allis, C.D., and Thompson, C.B. (2015).
Intracellular a-ketoglutarate maintains the pluripotency of embryonic stem
cells. Nature 518, 413416.
German, N.J., and Haigis, M.C. (2015). Sirtuins and the Metabolic Hurdles in
Cancer. Curr. Biol. 25, R569R583.
Ghosh, J.C., Siegelin, M.D., Vaira, V., Faversani, A., Tavecchio, M., Chae, Y.C.,
Lisanti, S., Rampini, P., Giroda, M., Caino, M.C., et al. (2015). Adaptive mitochondrial reprogramming and resistance to PI3K therapy. J. Natl. Cancer
Inst. 107, dju502.
Chourasia, A.H., Boland, M.L., and Macleod, K.F. (2015). Mitophagy and cancer. Cancer Metab. 3, 4.
Gopal, Y.N., Rizos, H., Chen, G., Deng, W., Frederick, D.T., Cooper, Z.A.,
Scolyer, R.A., Pupo, G., Komurov, K., Sehgal, V., et al. (2014). Inhibition of
mTORC1/2 overcomes resistance to MAPK pathway inhibitors mediated by
PGC1a and oxidative phosphorylation in melanoma. Cancer Res. 74, 7037
7047.
Christofk, H.R., Vander Heiden, M.G., Harris, M.H., Ramanathan, A., Gerszten,
R.E., Wei, R., Fleming, M.D., Schreiber, S.L., and Cantley, L.C. (2008). The M2
splice isoform of pyruvate kinase is important for cancer metabolism and
tumour growth. Nature 452, 230233.
Graves, J.A., Wang, Y., Sims-Lucas, S., Cherok, E., Rothermund, K., Branca,
M.F., Elster, J., Beer-Stolz, D., Van Houten, B., Vockley, J., and Prochownik,
E.V. (2012). Mitochondrial structure, function and dynamics are temporally
controlled by c-Myc. PLoS ONE 7, e37699.
Coloff, J.L., Murphy, J.P., Braun, C.R., Harris, I.S., Shelton, L.M., Kami, K.,
Gygi, S.P., Selfors, L.M., and Brugge, J.S. (2016). Differential Glutamate Metabolism in Proliferating and Quiescent Mammary Epithelial Cells. Cell Metab.
23, 867880.
Guo, J.Y., Chen, H.Y., Mathew, R., Fan, J., Strohecker, A.M., Karsli-Uzunbas,
G., Kamphorst, J.J., Chen, G., Lemons, J.M., Karantza, V., et al. (2011). Activated Ras requires autophagy to maintain oxidative metabolism and tumorigenesis. Genes Dev. 25, 460470.
Carracedo, A., Cantley, L.C., and Pandolfi, P.P. (2013). Cancer metabolism:
fatty acid oxidation in the limelight. Nat. Rev. Cancer 13, 227232.
564 Cell 166, July 28, 2016
Guo, J.Y., Karsli-Uzunbas, G., Mathew, R., Aisner, S.C., Kamphorst, J.J., Strohecker, A.M., Chen, G., Price, S., Lu, W., Teng, X., et al. (2013). Autophagy
suppresses progression of K-ras-induced lung tumors to oncocytomas and
maintains lipid homeostasis. Genes Dev. 27, 14471461.
Haq, R., Shoag, J., Andreu-Perez, P., Yokoyama, S., Edelman, H., Rowe, G.C.,
Frederick, D.T., Hurley, A.D., Nellore, A., Kung, A.L., et al. (2013). Oncogenic
BRAF regulates oxidative metabolism via PGC1a and MITF. Cancer Cell 23,
302315.
Harris, I.S., Treloar, A.E., Inoue, S., Sasaki, M., Gorrini, C., Lee, K.C., Yung,
K.Y., Brenner, D., Knobbe-Thomsen, C.B., Cox, M.A., et al. (2015). Glutathione
and thioredoxin antioxidant pathways synergize to drive cancer initiation and
progression. Cancer Cell 27, 211222.
Hu, Y.L., DeLay, M., Jahangiri, A., Molinaro, A.M., Rose, S.D., Carbonell, W.S.,
and Aghi, M.K. (2012). Hypoxia-induced autophagy promotes tumor cell survival and adaptation to antiangiogenic treatment in glioblastoma. Cancer
Res. 72, 17731783.
Jain, M., Nilsson, R., Sharma, S., Madhusudhan, N., Kitami, T., Souza, A.L.,
Kafri, R., Kirschner, M.W., Clish, C.B., and Mootha, V.K. (2012). Metabolite
profiling identifies a key role for glycine in rapid cancer cell proliferation. Science 336, 10401044.
Jeong, S.M., Lee, A., Lee, J., and Haigis, M.C. (2014). SIRT4 protein suppresses tumor formation in genetic models of Myc-induced B cell lymphoma.
J. Biol. Chem. 289, 41354144.
Jiang, D., LaGory, E.L., Kenzelmann Broz, D., Bieging, K.T., Brady, C.A., Link,
N., Abrams, J.M., Giaccia, A.J., and Attardi, L.D. (2015). Analysis of p53 transactivation domain mutants reveals Acad11 as a metabolic target important for
p53 pro-survival function. Cell Rep. 10, 10961109.
Jiang, L., Shestov, A.A., Swain, P., Yang, C., Parker, S.J., Wang, Q.A., Terada,
L.S., Adams, N.D., McCabe, M.T., Pietrak, B., et al. (2016). Reductive carboxylation supports redox homeostasis during anchorage-independent growth.
Nature 532, 255258.
Jin, L., Li, D., Alesi, G.N., Fan, J., Kang, H.B., Lu, Z., Boggon, T.J., Jin, P., Yi, H.,
Wright, E.R., et al. (2015). Glutamate dehydrogenase 1 signals through antioxidant glutathione peroxidase 1 to regulate redox homeostasis and tumor
growth. Cancer Cell 27, 257270.
Kasahara, A., and Scorrano, L. (2014). Mitochondria: from cell death executioners to regulators of cell differentiation. Trends Cell Biol. 24, 761770.
Kashatus, J.A., Nascimento, A., Myers, L.J., Sher, A., Byrne, F.L., Hoehn, K.L.,
Counter, C.M., and Kashatus, D.F. (2015). Erk2 phosphorylation of Drp1 promotes mitochondrial fission and MAPK-driven tumor growth. Mol. Cell 57,
537551.
Kim, D., Fiske, B.P., Birsoy, K., Freinkman, E., Kami, K., Possemato, R.L.,
Chudnovsky, Y., Pacold, M.E., Chen, W.W., Cantor, J.R., et al. (2015).
SHMT2 drives glioma cell survival in ischaemia but imposes a dependence
on glycine clearance. Nature 520, 363367.
Le Gal, K., Ibrahim, M.X., Wiel, C., Sayin, V.I., Akula, M.K., Karlsson, C., Dalin,
M.G., Akyurek, L.M., Lindahl, P., Nilsson, J., and Bergo, M.O. (2015). Antioxidants can increase melanoma metastasis in mice. Sci. Transl. Med. 7, 308re8.
LeBleu, V.S., OConnell, J.T., Gonzalez Herrera, K.N., Wikman, H., Pantel, K.,
Haigis, M.C., de Carvalho, F.M., Damascena, A., Domingos Chinen, L.T., Rocha, R.M., Asara, J.M., and Kalluri, R. (2014). Pgc-1alpha Mediates Mitochondrial Biogenesis and Oxidative Phosphorylation in Cancer Cells to Promote
Metastasis. Nat. Cell. Biol. 16, 9921003, 10011015.
Lee, J.V., Carrer, A., Shah, S., Snyder, N.W., Wei, S., Venneti, S., Worth, A.J.,
Yuan, Z.F., Lim, H.W., Liu, S., et al. (2014). Akt-dependent metabolic reprogramming regulates tumor cell histone acetylation. Cell Metab. 20, 306319.
Li, F., Wang, Y., Zeller, K.I., Potter, J.J., Wonsey, D.R., ODonnell, K.A., Kim,
J.W., Yustein, J.T., Lee, L.A., and Dang, C.V. (2005). Myc stimulates nuclearly
encoded mitochondrial genes and mitochondrial biogenesis. Mol. Cell. Biol.
25, 62256234.
Li, X., Jiang, Y., Meisenhelder, J., Yang, W., Hawke, D.H., Zheng, Y., Xia, Y.,
Aldape, K., He, J., Hunter, T., et al. (2016). Mitochondria-Translocated PGK1
Functions as a Protein Kinase to Coordinate Glycolysis and the TCA Cycle in
Tumorigenesis. Mol. Cell 61, 705719.
Lopez, J., and Tait, S.W. (2015). Mitochondrial apoptosis: killing cancer using
the enemy within. Br. J. Cancer 112, 957962.
Losman, J.A., Looper, R.E., Koivunen, P., Lee, S., Schneider, R.K., McMahon,
C., Cowley, G.S., Root, D.E., Ebert, B.L., and Kaelin, W.G., Jr. (2013). (R)-2-hydroxyglutarate is sufficient to promote leukemogenesis and its effects are
reversible. Science 339, 16211625.
Lu, C., Ward, P.S., Kapoor, G.S., Rohle, D., Turcan, S., Abdel-Wahab, O.,
Edwards, C.R., Khanin, R., Figueroa, M.E., Melnick, A., et al. (2012). IDH mutation impairs histone demethylation and results in a block to cell differentiation. Nature 483, 474478.
MacKenzie, E.D., Selak, M.A., Tennant, D.A., Payne, L.J., Crosby, S., Frederiksen, C.M., Watson, D.G., and Gottlieb, E. (2007). Cell-permeating alpha-ketoglutarate derivatives alleviate pseudohypoxia in succinate dehydrogenasedeficient cells. Mol. Cell. Biol. 27, 32823289.
Mancias, J.D., and Kimmelman, A.C. (2016). Mechanisms of Selective Autophagy in Normal Physiology and Cancer. J. Mol. Biol. 428 (9 Pt A), 16591680.
Martinou, J.C., and Youle, R.J. (2011). Mitochondria in apoptosis: Bcl-2 family
members and mitochondrial dynamics. Dev. Cell 21, 92101.
Matsuda, S., Nakanishi, A., Minami, A., Wada, Y., and Kitagishi, Y. (2015).
Functions and characteristics of PINK1 and Parkin in cancer. Front. Biosci.
(Landmark Ed.) 20, 491501.
Mishra, P., and Chan, D.C. (2016). Metabolic regulation of mitochondrial dynamics. J. Cell Biol. 212, 379387.
Morita, M., Gravel, S.P., Hulea, L., Larsson, O., Pollak, M., St-Pierre, J., and
Topisirovic, I. (2015). mTOR coordinates protein synthesis, mitochondrial activity and proliferation. Cell Cycle 14, 473480.
Mucaj, V., Shay, J.E., and Simon, M.C. (2012). Effects of hypoxia and HIFs on
cancer metabolism. Int. J. Hematol. 95, 464470.
Koivunen, P., Lee, S., Duncan, C.G., Lopez, G., Lu, G., Ramkissoon, S., Losman, J.A., Joensuu, P., Bergmann, U., Gross, S., et al. (2012). Transformation
by the (R)-enantiomer of 2-hydroxyglutarate linked to EGLN activation. Nature
483, 484488.
Mullen, A.R., Wheaton, W.W., Jin, E.S., Chen, P.H., Sullivan, L.B., Cheng, T.,
Yang, Y., Linehan, W.M., Chandel, N.S., and DeBerardinis, R.J. (2011). Reductive carboxylation supports growth in tumour cells with defective mitochondria. Nature 481, 385388.
LaGory, E.L., Wu, C., Taniguchi, C.M., Ding, C.K., Chi, J.T., von Eyben, R.,
Scott, D.A., Richardson, A.D., and Giaccia, A.J. (2015). Suppression of
PGC-1a Is Critical for Reprogramming Oxidative Metabolism in Renal Cell Carcinoma. Cell Rep. 12, 116127.
Nikiforov, M.A., Chandriani, S., OConnell, B., Petrenko, O., Kotenko, I., Beavis, A., Sedivy, J.M., and Cole, M.D. (2002). A functional screen for Mycresponsive genes reveals serine hydroxymethyltransferase, a major source
of the one-carbon unit for cell metabolism. Mol. Cell. Biol. 22, 57935800.
Lamb, R., Harrison, H., Hulit, J., Smith, D.L., Lisanti, M.P., and Sotgia, F.
(2014). Mitochondria as new therapeutic targets for eradicating cancer stem
cells: Quantitative proteomics and functional validation via MCT1/2 inhibition.
Nilsson, R., Jain, M., Madhusudhan, N., Sheppard, N.G., Strittmatter, L.,
Kampf, C., Huang, J., Asplund, A., and Mootha, V.K. (2014). Metabolic enzyme
expression highlights a key role for MTHFD2 and the mitochondrial folate
pathway in cancer. Nat. Commun. 5, 3128.
Le, A., Lane, A.N., Hamaker, M., Bose, S., Gouw, A., Barbi, J., Tsukamoto, T.,
Rojas, C.J., Slusher, B.S., Zhang, H., et al. (2012). Glucose-independent glutamine metabolism via TCA cycling for proliferation and survival in B cells. Cell
Metab. 15, 110121.
Nowicki, S., and Gottlieb, E. (2015). Oncometabolites: tailoring our genes.

FEBS J. 282, 27962805.
Papa, A., Wan, L., Bonora, M., Salmena, L., Song, M.S., Hobbs, R.M., Lunardi,
A., Webster, K., Ng, C., Newton, R.H., et al. (2014). Cancer-associated PTEN
Cell 166, July 28, 2016 565
mutants act in a dominant-negative manner to suppress PTEN protein function. Cell 157, 595610.
Pavlova, N.N., and Thompson, C.B. (2016). The Emerging Hallmarks of Cancer
Metabolism. Cell Metab. 23, 2747.
Piskounova, E., Agathocleous, M., Murphy, M.M., Hu, Z., Huddlestun, S.E.,
Zhao, Z., Leitch, A.M., Johnson, T.M., DeBerardinis, R.J., and Morrison, S.J.
(2015). Oxidative stress inhibits distant metastasis by human melanoma cells.
Nature 527, 186191.
Porporato, P.E., Payen, V.L., Perez-Escuredo, J., De Saedeleer, C.J., Danhier,
P., Copetti, T., Dhup, S., Tardy, M., Vazeille, T., Bouzin, C., et al. (2014). A mitochondrial switch promotes tumor metastasis. Cell Rep. 8, 754766.
Pyakurel, A., Savoia, C., Hess, D., and Scorrano, L. (2015). Extracellular regulated kinase phosphorylates mitofusin 1 to control mitochondrial morphology
and apoptosis. Mol. Cell 58, 244254.
Pylayeva-Gupta, Y., Grabocka, E., and Bar-Sagi, D. (2011). RAS oncogenes:
weaving a tumorigenic web. Nat. Rev. Cancer 11, 761774.
Renault, T.T., Floros, K.V., Elkholi, R., Corrigan, K.A., Kushnareva, Y., Wieder,
S.Y., Lindtner, C., Serasinghe, M.N., Asciolla, J.J., Buettner, C., et al. (2015).
Mitochondrial shape governs BAX-induced membrane permeabilization and
apoptosis. Mol. Cell 57, 6982.
Reznik, E., Miller, M.L., Senbabaoglu, Y., Riaz, N., Sarungbam, J., Tickoo,
S.K., Al-Ahmadie, H.A., Lee, W., Seshan, V.E., Hakimi, A.A., and Sander, C.
(2016). Mitochondrial DNA Copy Number Variation across Human Cancers.
Elife 5, e10769.
Sancho, P., Burgos-Ramos, E., Tavera, A., Bou Kheir, T., Jagust, P.,
Schoenhals, M., Barneda, D., Sellers, K., Campos-Olivas, R., Grana, O.,
et al. (2015). MYC/PGC-1a Balance Determines the Metabolic Phenotype
and Plasticity of Pancreatic Cancer Stem Cells. Cell Metab. 22, 590605.
Sarosiek, K.A., Chi, X., Bachman, J.A., Sims, J.J., Montero, J., Patel, L., Flanagan, A., Andrews, D.W., Sorger, P., and Letai, A. (2013). BID preferentially activates BAK while BIM preferentially activates BAX, affecting chemotherapy
response. Mol. Cell 51, 751765.
Schell, J.C., Olson, K.A., Jiang, L., Hawkins, A.J., Van Vranken, J.G., Xie, J.,
Egnatchik, R.A., Earl, E.G., DeBerardinis, R.J., and Rutter, J. (2014). A role
for the mitochondrial pyruvate carrier as a repressor of the Warburg effect
and colon cancer cell growth. Mol. Cell 56, 400413.
Senft, D., and Ronai, Z.A. (2016). Regulators of mitochondrial dynamics in cancer. Curr. Opin. Cell Biol. 39, 4352.
Serasinghe, M.N., Wieder, S.Y., Renault, T.T., Elkholi, R., Asciolla, J.J., Yao,
J.L., Jabado, O., Hoehn, K., Kageyama, Y., Sesaki, H., and Chipuk, J.E.
(2015). Mitochondrial division is requisite to RAS-induced transformation
and targeted by oncogenic MAPK pathway inhibitors. Mol. Cell 57, 521536.
Shadel, G.S., and Horvath, T.L. (2015). Mitochondrial ROS signaling in organismal homeostasis. Cell 163, 560569.
Son, J., Lyssiotis, C.A., Ying, H., Wang, X., Hua, S., Ligorio, M., Perera, R.M.,
Ferrone, C.R., Mullarky, E., Shyh-Chang, N., Kang, Y., Fleming, J.B.,
Bardeesy, N., Asara, J.M., Haigis, M.C., DePinho, R.A., Cantley, L.C., and Kimmelman, A.C. (2013). Glutamine Supports Pancreatic Cancer Growth through
a Kras-Regulated Metabolic Pathway. Nature 496, 101105.
Stine, Z.E., Walton, Z.E., Altman, B.J., Hsieh, A.L., and Dang, C.V. (2015).
MYC, Metabolism, and Cancer. Cancer Discov. 5, 10241039.
Sullivan, L.B., and Chandel, N.S. (2014). Mitochondrial reactive oxygen species and cancer. Cancer Metab. 2, 17.
Sullivan, L.B., Martinez-Garcia, E., Nguyen, H., Mullen, A.R., Dufour, E., Sudarshan, S., Licht, J.D., Deberardinis, R.J., and Chandel, N.S. (2013). The protooncometabolite fumarate binds glutathione to amplify ROS-dependent
signaling. Mol. Cell 51, 236248.
Sullivan, L.B., Gui, D.Y., Hosios, A.M., Bush, L.N., Freinkman, E., and Vander
Heiden, M.G. (2015). Supporting Aspartate Biosynthesis Is an Essential Function of Respiration in Proliferating Cells. Cell 162, 552563.
566 Cell 166, July 28, 2016
Tan, Z., Luo, X., Xiao, L., Tang, M., Bode, A.M., Dong, Z., and Cao, Y. (2016).
The Role of PGC1a in Cancer Metabolism and its Therapeutic Implications.
Mol. Cancer Ther. 15, 774782.
Tardito, S., Oudin, A., Ahmed, S.U., Fack, F., Keunen, O., Zheng, L., Miletic, H.,
Sakariassen, P.O., Weinstock, A., Wagner, A., et al. (2015). Glutamine synthetase activity fuels nucleotide biosynthesis and supports growth of glutaminerestricted glioblastoma. Nat. Cell Biol. 17, 15561568.
Toyama, E.Q., Herzig, S., Courchet, J., Lewis, T.L., Jr., Loson, O.C., Hellberg,
K., Young, N.P., Chen, H., Polleux, F., Chan, D.C., and Shaw, R.J. (2016).
Metabolism. AMP-activated protein kinase mediates mitochondrial fission in
response to energy stress. Science 351, 275281.
Vacanti, N.M., Divakaruni, A.S., Green, C.R., Parker, S.J., Henry, R.R., Ciaraldi,
T.P., Murphy, A.N., and Metallo, C.M. (2014). Regulation of substrate utilization
by the mitochondrial pyruvate carrier. Mol. Cell 56, 425435.
van Gisbergen, M.W., Voets, A.M., Starmans, M.H., de Coo, I.F., Yadak, R.,
Hoffmann, R.F., Boutros, P.C., Smeets, H.J., Dubois, L., and Lambin, P.
(2015). How do changes in the mtDNA and mitochondrial dysfunction influence
cancer and cancer therapy? Challenges, opportunities and models. Mutat.
Res. Rev. Mutat. Res. 764, 1630.
Vaseva, A.V., and Moll, U.M. (2009). The mitochondrial p53 pathway. Biochim.
Biophys. Acta 1787, 414420.
Viale, A., Pettazzoni, P., Lyssiotis, C.A., Ying, H., Sanchez, N., Marchesini, M.,
Carugo, A., Green, T., Seth, S., Giuliani, V., et al. (2014). Oncogene ablationresistant pancreatic cancer cells depend on mitochondrial function. Nature
514, 628632.
von Eyss, B., Jaenicke, L.A., Kortlever, R.M., Royla, N., Wiese, K.E., Letschert,
S., McDuffus, L.A., Sauer, M., Rosenwald, A., Evan, G.I., et al. (2015). A MYCDriven Change in Mitochondrial Dynamics Limits YAP/TAZ Function in Mammary Epithelial Cells and Breast Cancer. Cancer Cell 28, 743757.
Vyas, S., and Chang, P. (2014). New PARP targets for cancer therapy. Nat.
Rev. Cancer 14, 502509.
Wang, P., Song, M., Zeng, Z.L., Zhu, C.F., Lu, W.H., Yang, J., Ma, M.Z., Huang,
A.M., Hu, Y., and Huang, P. (2015). Identification of NDUFAF1 in mediating
K-Ras induced mitochondrial dysfunction by a proteomic screening approach.
Warburg, O. (1956). On the origin of cancer cells. Science 123, 309314.
Ward, P.S., Patel, J., Wise, D.R., Abdel-Wahab, O., Bennett, B.D., Coller, H.A.,
Cross, J.R., Fantin, V.R., Hedvat, C.V., Perl, A.E., et al. (2010). The common
feature of leukemia-associated IDH1 and IDH2 mutations is a neomorphic
enzyme activity converting alpha-ketoglutarate to 2-hydroxyglutarate. Cancer
Cell 17, 225234.
Weinhouse, S. (1956). On respiratory impairment in cancer cells. Science 124,
267269.
Wellen, K.E., Hatzivassiliou, G., Sachdeva, U.M., Bui, T.V., Cross, J.R., and
Thompson, C.B. (2009). ATP-citrate lyase links cellular metabolism to histone
acetylation. Science 324, 10761080.
Yang, C., Ko, B., Hensley, C.T., Jiang, L., Wasti, A.T., Kim, J., Sudderth, J.,
Calvaruso, M.A., Lumata, L., Mitsche, M., et al. (2014). Glutamine oxidation
maintains the TCA cycle and cell survival during impaired mitochondrial pyruvate transport. Mol. Cell 56, 414424.
Ye, J., Fan, J., Venneti, S., Wan, Y.W., Pawel, B.R., Zhang, J., Finley, L.W., Lu,
C., Lindsten, T., Cross, J.R., et al. (2014). Serine catabolism regulates mitochondrial redox control during hypoxia. Cancer Discov. 4, 14061417.
Zhang, H., Gao, P., Fukuda, R., Kumar, G., Krishnamachary, B., Zeller, K.I.,
Dang, C.V., and Semenza, G.L. (2007). HIF-1 inhibits mitochondrial biogenesis
and cellular respiration in VHL-deficient renal cell carcinoma by repression of
C-MYC activity. Cancer Cell 11, 407420.
Article
Mitotic Checkpoint Regulators Control Insulin Signaling and Metabolic Homeostasis
Authors
Eunhee Choi, Xiangli Zhang, Chao Xing,
Hongtao Yu
Correspondence
hongtao.yu@utsouthwestern.edu
In Brief
Mitotic checkpoint proteins that prevent
chromosomes from incorrectly
segregating have another unexpected
job: they regulate insulin signaling. This
role draws a link between chromosome
stability and nutrient metabolism.
Highlights
• p31−/− mice show glucose intolerance and insulin resistance
• Low-level aneuploidy does not underlie metabolic defects in p31−/− mice
• The insulin receptor (IR) directly binds to MAD2 through a canonical MIM
• p31comet blocks premature clathrin-mediated endocytosis of unstimulated IR
Choi et al., 2016, Cell 166, 567–581
July 28, 2016 © 2016 Elsevier Inc.
Eunhee Choi,1 Xiangli Zhang,2 Chao Xing,2 and Hongtao Yu1,*
1Howard Hughes Medical Institute, Department of Pharmacology, University of Texas Southwestern Medical Center, 6001 Forest Park Road,
Dallas, TX 75390, USA
2Bioinformatics Core, Eugene McDermott Center for Human Growth and Development, University of Texas Southwestern Medical Center,
6001 Forest Park Road, Dallas, TX 75390, USA
*Correspondence: hongtao.yu@utsouthwestern.edu
SUMMARY
Insulin signaling regulates many facets of animal
physiology. Its dysregulation causes diabetes and
other metabolic disorders. The spindle checkpoint
proteins MAD2 and BUBR1 prevent precocious chromosome segregation and suppress aneuploidy. The
MAD2 inhibitory protein p31comet promotes checkpoint inactivation and timely chromosome segregation. Here, we show that whole-body p31comet
knockout mice die soon after birth and have reduced
hepatic glycogen. Liver-specific ablation of p31comet
causes insulin resistance, hyperinsulinemia, glucose
intolerance, and hyperglycemia and diminishes the
plasma membrane localization of the insulin receptor (IR) in hepatocytes. MAD2 directly binds to IR
and facilitates BUBR1-dependent recruitment of
the clathrin adaptor AP2 to IR. p31comet blocks the
MAD2-BUBR1 interaction and prevents spontaneous clathrin-mediated IR endocytosis. BUBR1
deficiency enhances insulin sensitivity in mice.
BUBR1 depletion in hepatocytes or the expression
of MAD2-binding-deficient IR suppresses the metabolic phenotypes of p31comet ablation. Our findings
establish a major IR regulatory mechanism and
link guardians of chromosome stability to nutrient
metabolism.
INTRODUCTION
Insulin regulates systemic metabolic homeostasis in animals, as
well as cell survival, growth, and proliferation in multiple tissues
and organs. The main framework of insulin signaling has been
established for decades (White, 2003). At the plasma membrane
(PM), insulin binds and activates the insulin receptor (IR), a receptor tyrosine kinase that consists of a dimer of two disulfide-linked chains, IRα and IRβ. The activated IR phosphorylates itself
and insulin receptor substrate 1/2 (IRS1/2), creating phosphotyrosine (pY)-containing docking motifs for downstream effectors and adaptors. These proteins activate the PI3K-AKT and
MAP kinase pathways to promote cellular glucose uptake,
glycogen and protein synthesis, and cell growth and survival
(Boucher et al., 2014). The activated IR can be internalized
through clathrin-mediated endocytosis (Goh and Sorkin, 2013),
thus terminating signaling. Dysregulation of insulin signaling
has been linked to human diseases, including diabetes and cancer (Boucher et al., 2014; Pollak, 2012).
In response to kinetochores not properly attached to spindle
microtubules, the spindle checkpoint inhibits the anaphase-promoting complex/cyclosome (APC/C) and delays anaphase
onset, thereby suppressing chromosome segregation errors
(Jia et al., 2013; London and Biggins, 2014; Musacchio, 2015).
The checkpoint proteins MAD2 and BUBR1 can simultaneously
bind to CDC20, converting it from an APC/C activator to a
subunit of an APC/C-inhibitory complex called the mitotic checkpoint complex (MCC) (Izawa and Pines, 2015). MAD2 is an unusual protein with multiple folded conformers, including the
latent open MAD2 (O-MAD2) and the active closed MAD2
(C-MAD2) (Luo and Yu, 2008; Mapelli and Musacchio, 2007).
On checkpoint activation, unattached kinetochores convert
O-MAD2 to CDC20-bound C-MAD2. C-MAD2-CDC20 then associates with BUBR1 to form MCC (Kulukian et al., 2009), in
which BUBR1 contacts both C-MAD2 and CDC20 (Chao et al.,
2012). During checkpoint inactivation, p31comet (also known as
MAD2L1BP) specifically binds to C-MAD2 and disrupts the
C-MAD2-BUBR1 interaction, thus promoting MCC disassembly
and timely anaphase onset (Hagan et al., 2011; Jia et al., 2011;
Xia et al., 2004; Yang et al., 2007). The intricate interactions
among p31comet, MAD2, and BUBR1 are crucial for proper chromosome segregation and genomic stability.
In this study, we discover the role of the p31comet-MAD2-BUBR1 module of mitotic regulators in insulin signaling, thus
linking a critical chromosome segregation network to a major
pathway regulating metabolic homeostasis.
RESULTS
p31comet Ablation in Mice Causes Neonatal Lethality and
Liver Glycogen Shortage
Mad2 overexpression in mice promotes aneuploidy and tumorigenesis through hyperactivation of the spindle checkpoint (Sotillo et al., 2007). To test whether inactivation of p31comet might
similarly cause hyperactivation of the checkpoint and promote
tumorigenesis, we generated mice with floxed alleles of p31comet
(p31F/F) and crossed them with CAG-Cre transgenic mice to
delete p31comet in the whole body (Figures S1A–S1C). Homozygous p31comet knockout (p31−/−) mice were born in the expected
Mendelian ratio. Heterozygous p31+/− mice did not show
discernible differences from wild-type
(WT) littermates (Figures 1A and 1B).
p31−/− newborns and embryos at E18.5,
however, showed mild growth retardation
and died within 5 hr after birth, in spite of
suckling activity. Thus, instead of being
susceptible to tumorigenesis, p31−/−
mice exhibited neonatal lethality.
Figure 1. Whole-Body and Liver-Specific Ablation of p31comet Reveals Its Role in Metabolism
(A) Wild-type (WT), p31+/−, and p31−/− littermates at E18.5 and birth. Arrows indicate milk spots.
(B) Survival curves of WT, p31+/−, and p31−/− neonates.
(C) H&E staining and periodic acid-Schiff (PAS) staining (magenta) of livers from WT and p31−/− newborns. Scale bar, 100 μm.
(D) Liver glycogen levels of WT and p31−/− newborns.
(E) Fed blood glucose levels of WT, liver-p31−/−, and liver-Insr−/− mice. Mean ± SEM. *p < 0.05, **p < 0.01, ***p < 0.0001 versus WT; mean ± SEM (E–I).
(F) Serum insulin concentrations of WT, liver-p31−/−, and liver-Insr−/− mice.
(G) Liver glycogen levels of WT, liver-p31−/−, and liver-Insr−/− mice.
(H) Glucose tolerance test in 2-month-old male mice. WT, n = 21; liver-p31−/−, n = 16; liver-Insr−/−, n = 9.
(I) Insulin tolerance test in 2-month-old male mice. WT, n = 16; liver-p31−/−, n = 9; liver-Insr−/−, n = 12.
See also Figures S1 and S2 and Table S1.
As expected, the p31comet protein was
absent in p31−/− mouse embryonic fibroblasts (MEFs) (Figure S1D). The p31comet
protein level in p31+/− MEFs was about
65% of that in WT, and these p31+/− cells
did not show discernible abnormalities.
The p31−/− MEFs proliferated more
slowly than did WT MEFs (Figure S1E)
and had elevated apoptosis, a decreased
S-phase population, and an increased
G2/M population (Figure S1F). Consistent
with the established mitotic roles of p31comet,
p31−/− MEFs exhibited a mitotic delay
and abnormal mitotic features, including
lagging chromosomes (Figures S1G–S1I).
Consequently, p31−/− MEFs displayed increased polyploidy and aneuploidy (Figures S1J and S1K). These
results validate the functions of p31comet
in mitotic regulation and in suppressing
chromosome instability.
The mitotic and aneuploidy phenotypes of p31−/− MEFs were
similar to those reported for MEFs with a hypomorphic allele of
Cdc20 (Malureanu et al., 2010). Mice with the same hypomorphic
Cdc20 allele were, however, viable (Malureanu et al., 2010).
Similarly, partial inactivation of other mitotic regulators, including
BUBR1, induces aneuploidy in mice without compromising
their viability (Baker et al., 2004). Therefore, mitotic defects and
aneuploidy alone in mice do not necessarily lead to neonatal
lethality.
We investigated the defects underlying the neonatal lethality
of p31−/− mice. The p31+/− mice did not show differences
discernible from WT littermates. Despite mild growth retardation, p31−/− embryos and newborns did not show gross
morphological and developmental defects. The lung and respiratory muscle in p31−/− animals were normal, and there was
evidence of lung inflation. p31−/− mice also showed normal
liver development (Figure S2A). However, unlike hepatocytes
from WT and p31+/− mice, hepatocytes from p31−/− E18.5 embryos and newborns had reduced cytoplasmic vacuolation
(Figures 1C and S2A), suggestive of defects in hepatic
glycogen storage.
Indeed, liver sections of p31−/− newborn mice showed
dramatically reduced periodic acid-Schiff (PAS) staining, which
detects polysaccharides, including glycogen (Figure 1C). As a
control, the skeletal muscle in the hind limbs of p31−/− animals
exhibited normal PAS staining (Figure S2B). Direct biochemical
measurement confirmed that the p31−/− liver had significantly
lower glycogen content (Figure 1D). Glycogen storage in the
liver is crucial for energy homeostasis of newborn mice, and
hypoglycemia is a major cause of neonatal lethality (Girard and
Pegorier, 1998). We suspect that the insufficient energy to
breathe and inability to transition from placenta to nursing, as
consequences of decreased glycogen stores in the liver, are
causal factors of neonatal lethality in p31−/− animals. These phenotypes of p31−/− mice closely resemble those of insulin receptor (IR) knockout (Insr−/−) mice (Joshi et al., 1996), although we
did not observe muscle hypotrophy in p31−/− mice, as seen
with Insr−/− mice.
Liver-Specific p31comet Ablation Causes Metabolic
Disorders in Mice
Liver-specific insulin receptor knockout (liver-Insr−/−) mice survive to adulthood and develop severe hepatic insulin resistance
and metabolic syndromes (Biddinger et al., 2008; Michael et al.,
2000), indicating that insulin resistance in the liver is a major
cause of metabolic disorders. We generated liver-specific
p31comet knockout mice (liver-p31−/−) by crossing p31F/F mice
with Albumin-Cre mice. Liver-p31−/− mice were born in the expected Mendelian ratio, were indistinguishable from WT littermates, and survived to adulthood. Histological analysis did not
reveal hepatic developmental defects in liver-p31−/− embryos
and mice (Figures S2C and S2D). The hepatic glycogen level of
liver-p31−/− E18.5 embryos was reduced, but to a much lesser
extent than in p31−/− embryos, allowing liver-p31−/− mice to survive. Clearly, p31comet has functions in non-liver tissues that
help to maintain hepatic glycogen levels. Liver-Insr−/− mice
showed normal hepatic architecture, but had scattered focal
dysplasia in the periportal area (Figure S2D), as reported previously (Michael et al., 2000).
Two-month-old liver-p31−/− mice showed hyperglycemia and
hyperinsulinemia in the fed state, although their hyperinsulinemia
was less severe than that of liver-Insr−/− mice (Figures 1E and
1F; Table S1). Despite having elevated serum insulin levels, liver-p31−/− mice had a decreased level of hepatic glycogen (Figures
1G and S2D). Again, the hepatic glycogen shortage in liver-p31−/− mice was less severe than that in liver-Insr−/− mice. The
hepatic triglyceride levels of liver-p31−/− mice were decreased
by about 50%, whereas their serum triglyceride levels were
moderately increased (Table S1). These results indicate that specific ablation of p31comet in the liver can promote systemic
changes in glucose and lipid metabolism.
Liver-Insr−/− mice showed severe glucose and insulin intolerance (Michael et al., 2000) (Figures 1H and 1I). Two-month-old male
liver-p31−/− mice also developed glucose and insulin intolerance, albeit to lesser degrees. These data indicate that loss
of p31comet in the liver suffices to produce defects in insulin
modulation of glucose metabolism. The whole-body p31+/−
mice had normal glucose metabolism (Figure S2E), indicating
that p31comet is not haploinsufficient.
Liver-Insr−/− mice progressively developed liver dysfunction,
but surprisingly did not show severe glucose/insulin intolerance
at an older age (Michael et al., 2000). Indeed, the blood glucose
levels of 6-month-old liver-Insr−/− and liver-p31−/− mice became
normal (Figure S2F). Similar phenotypes of whole-body and liver-specific p31comet and Insr knockout mice implicate p31comet in
insulin signaling.
p31comet Regulates Insulin Signaling
In the liver, an important downstream event of insulin signaling is
the stimulation of glycogen synthesis by activating glycogen
synthase (GS) (Boucher et al., 2014). As expected, insulin treatment stimulated the GS activity in WT hepatocytes, and this
stimulation was absent in liver-Insr−/− hepatocytes (Figure 2A).
Strikingly, insulin-dependent GS activation was abolished in
liver-p31−/− hepatocytes, consistent with the hepatic glycogen
shortage in p31−/− mice. Thus, p31comet is required for a crucial
branch of insulin signaling in hepatocytes.
To determine at which step p31comet regulated IR signaling,
we monitored the status of IR autophosphorylation, activating
phosphorylation of AKT (pT308), and inhibitory phosphorylation
of GSK3β (pS9) in whole-liver lysates (Figure 2B). WT, liver-p31−/−, and liver-Insr−/− mice were fasted overnight and injected with insulin via the inferior vena cava. Liver lysates were
prepared from these mice and subjected to quantitative immunoblotting. The p31comet protein was reduced by about 80% in
liver-p31−/− whole-liver lysates (Figure 2B). As hepatocytes
make up about 85% of total cells in the liver, the residual
p31comet in liver-p31−/− animals was likely from non-parenchymal cells. IR autophosphorylation and AKT pT308 were
greatly reduced in both liver-Insr−/− and liver-p31−/− animals.
GSK3β pS9 was also reduced in both groups, but to a lesser
degree, implying the existence of IR-independent mechanisms for this phosphorylation. Consistent with the in vivo findings, freshly isolated liver-p31−/− primary hepatocytes showed
weakened and delayed IR autophosphorylation and AKT pT308
at multiple time points and over a wide range of insulin concentrations (Figures 2C, 2D, S3A, and S3B). Therefore, p31comet is
required for insulin signaling, at or upstream of the step of IR
autophosphorylation.
Cell 166, 567–581, July 28, 2016 569
Figure 2. p31comet Ablation Causes Insulin Resistance, whereas Bub1b Insufficiency Enhances Insulin Sensitivity
(A) Glycogen synthase (GS) activity of WT, liver-p31−/−, and liver-Insr−/− hepatocytes treated without (−) or with (+) insulin (Ins). Mean ± SD (n = 4 independent
experiments).
(B) Immunoblots of whole-liver lysates of WT, liver-p31−/−, and liver-Insr−/− mice treated without (−) or with (+) insulin (Ins). Each lane contains lysate from an
individual mouse. The relative band intensities are quantified and shown below.
(C and D) Insulin signaling in primary WT and liver-p31−/− hepatocytes treated with 10 nM insulin for the indicated times (C) or increasing concentrations of insulin
for 5 min (D). Cell lysates were blotted with the indicated antibodies.
(legend continued on next page)
Aneuploidy Alone Is Insufficient to Promote Insulin
Resistance
The status of aneuploidy in normal hepatocytes is still controversial, and it is formally possible that the insulin signaling
defects of liver-p31−/− mice are a secondary consequence of
severe aneuploidy in hepatocytes. To assess the extent of hepatic aneuploidy in these mice, we isolated live hepatocytes
from WT and liver-p31−/− mice, sorted tetraploid cells, amplified and sequenced genomic DNA from single cells, and
analyzed the whole-genome copy number variation. We
analyzed hepatocytes from mice harboring hypomorphic alleles
of BUBR1 (Bub1bH/H) as a control (Baker et al., 2004). Consistent with a previous study (Knouse et al., 2014), none of the
15 WT hepatocytes were aneuploid, whereas about 20% of
Bub1bH/H hepatocytes (2 out of 10) were aneuploid (Figures 2E and S3C).
Only 1 out of 10 p31−/− hepatocytes was aneuploid. Thus,
the aneuploidy incidence of p31−/− hepatocytes was surprisingly low.
We tested whether p31comet restoration rescued the metabolic and insulin signaling defects of liver-p31−/− mice. Adeno-associated virus (AAV)-mediated expression of p31comet,
but not of GFP, in the liver rescued the hyperglycemia and
glucose/insulin intolerance phenotypes and insulin signaling
defects in liver-p31−/− mice (Figures 2F–2H and S3D). We performed single-cell sequencing to assess the extent of aneuploidy in hepatocytes from liver-p31−/− mice injected with
AAV-GFP or AAV-p31. Only 1 of 40 hepatocytes in the AAV-GFP group was aneuploid, and 2 of 39 hepatocytes in the
AAV-p31 group were aneuploid (Figures 2I and S3E). Thus,
the extent of aneuploidy in liver-p31−/− hepatocytes is very
low (about 5%). p31comet restoration rescues the metabolic defects of p31comet ablation without eliminating aneuploidy,
arguing against aneuploidy as the underlying reason for the
observed defects.
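The single-cell counts reported above lend themselves to a quick arithmetic check. The following Python sketch is not part of the original analysis: it computes the aneuploid fraction in each AAV group and a two-sided Fisher's exact test on the 2 × 2 table of aneuploid versus euploid cells. The function names are illustrative, and the choice of Fisher's exact test is an assumption made here because the counts are small.

```python
from math import comb


def aneuploid_fraction(aneuploid, total):
    """Fraction of sequenced cells scored as aneuploid."""
    return aneuploid / total


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Rows are groups; columns are (aneuploid, euploid) counts.
    Sums the probabilities of all tables, with the same margins,
    that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x aneuploid cells in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))


if __name__ == "__main__":
    # Counts taken from the text: 1 of 40 (AAV-GFP) and 2 of 39 (AAV-p31).
    print("AAV-GFP fraction:", aneuploid_fraction(1, 40))
    print("AAV-p31 fraction:", aneuploid_fraction(2, 39))
    print("pooled fraction:", aneuploid_fraction(1 + 2, 40 + 39))
    print("Fisher p:", fisher_exact_two_sided(1, 39, 2, 37))
```

With counts this small, an exact test avoids the large-sample approximation behind a chi-square test; a p value well above 0.05 would be consistent with the statement that p31comet restoration does not change the low aneuploidy rate.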
Next, we analyzed glucose metabolism and insulin signaling in
Bub1bH/H mice. The aneuploidy incidence in Bub1bH/H hepatocytes was higher than that in liver-p31−/− hepatocytes (Figures
2E, S3C, and S3E) (Knouse et al., 2014). Strikingly, Bub1bH/H
mice exhibited a reduced blood glucose level in the fed state
(Figure 2J) and increased glucose tolerance (Figure 2K) and insulin sensitivity (Figure 2L). In contrast to liver-p31−/− hepatocytes,
IR autophosphorylation and AKT phosphorylation in response to
insulin were more robust in Bub1bH/H hepatocytes (Figures S3F–S3I).
Thus, another aneuploid mouse model (Bub1bH/H) exhibited opposite insulin signaling phenotypes. These findings
argue against aneuploidy as the underlying cause of defective insulin signaling in p31−/− animals, and are more consistent with a
karyotype-independent regulatory function of p31comet in insulin
signaling.
p31comet Prevents Unscheduled Clathrin-Mediated
Endocytosis of IR
The requirement for p31comet in IR autophosphorylation indicates
that p31comet acts upstream in the pathway, possibly at the level
of IR itself. On insulin stimulation, the activated IR at the PM can
be internalized by clathrin-dependent or clathrin-independent
pathways (Goh and Sorkin, 2013). We generated human
HepG2 hepatocellular carcinoma cell lines stably expressing
the C-terminal GFP-tagged IR and examined the localization of
IR-GFP after transfection with control or p31comet small interfering RNAs (siRNAs) (Figure 3A). In control cells, IR localized
to the PM in the absence of insulin and was internalized on insulin
treatment and co-localized with RAB7, a late endosome marker.
In contrast, IR-GFP in p31comet-depleted cells had much weaker
PM localization and was enriched in intracellular compartments
(ICs), even without insulin treatment. A large fraction of these IR-positive ICs were positive for RAB7. The GTPase dynamin is
essential for clathrin-mediated endocytosis (McMahon and Boucrot, 2011). Addition of dynasore (Macia et al., 2006), a chemical
inhibitor of dynamin, blocked the aberrant IR internalization in
p31comet-depleted HepG2 cells (Figures 3B and 3C). Co-depletion of the clathrin heavy chain (CLTC) also restored IR at the PM
in p31comet-depleted cells. Thus, p31comet suppresses unscheduled clathrin-dependent IR internalization in the basal state in
HepG2 cells.
Next, we analyzed the localization of the endogenous IR
and regulation of IR by insulin in the liver of WT, liver-p31−/−,
and liver-Insr−/− mice in vivo (Figure 3D). IR localized to
the PM of hepatocytes in WT liver and underwent insulin-triggered internalization. The PM staining of IR was absent in
liver-Insr−/− hepatocytes, although we could detect residual
IR signals in non-parenchymal cells. Even without insulin
injection, the PM IR signal in the liver of liver-p31−/− animals
was already weak, whereas the IR signal in RAB7-positive
ICs was strong. AAV-mediated p31comet expression in liver-p31−/− mice restored the PM IR signal in the liver (Figure 3E).
These results establish a role of p31comet in suppressing IR
internalization in vivo.
We monitored the kinetics of insulin endocytosis in WT and
liver-p31−/− hepatocytes. At several time points, the intensities
of internalized, fluorescently labeled insulin in liver-p31−/− hepatocytes were much weaker than those in WT hepatocytes (Figures 3F and 3G). In contrast, the internalization of transferrin,
another client of clathrin-mediated endocytosis, was normal in
liver-p31−/− hepatocytes (Figures 3F and 3H). These results indicate that p31comet specifically regulates insulin endocytosis, but
not all forms of clathrin-mediated endocytosis. Both defective insulin endocytosis and insulin-triggered IR autophosphorylation
in p31comet-deficient cells can be attributed to the decreased
level of functional IR at the PM, due to the premature internalization of IR prior to insulin stimulation.
(E) Segmentation plots of euploid (WT) and aneuploid hepatocytes from liver-p31−/− and Bub1bH/H mice.
(F–H) Fed blood glucose levels (F), glucose tolerance test (G), and insulin tolerance test (H) of WT and liver-p31−/− mice injected with AAV-GFP or AAV-p31. WT
(AAV-GFP), n = 10; liver-p31−/− (AAV-GFP), n = 8; liver-p31−/− (AAV-p31), n = 7; mean ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001 versus liver-p31−/− (AAV-GFP);
†p < 0.05, ††p < 0.01 versus WT (AAV-GFP).
(I) Segmentation plots of aneuploid cells from liver-p31−/− mice injected with AAV-GFP or AAV-p31.
(J–L) Fed blood glucose levels (J), glucose tolerance test (K), and insulin tolerance test (L) of WT and Bub1bH/H mice. For GTT and ITT, at least 12 mice in each
group were analyzed. Mean ± SEM. *p < 0.05, **p < 0.01.
See also Figure S3.
Figure 3. p31comet Suppresses Spontaneous IR Endocytosis in the Absence of Insulin
(A) HepG2 cells stably expressing IR-WT-GFP were transfected with the indicated siRNAs, serum starved, and stained with anti-GFP (IR; green) and anti-RAB7
(red) antibodies and DAPI (blue). The boxed region was magnified and shown on the right. Scale bars, 10 μm.
MAD2 Directly Binds to a Canonical MIM Motif in IR
We wondered whether p31comet function in regulating IR internalization involved MAD2. MAD2 had been reported as an IR-binding protein in yeast two-hybrid and proteomic screens (Hutchins
et al., 2010; O'Neill et al., 1997). MAD2 binds to MAD1 and
CDC20 through the MAD2-interacting motif (MIM) with the
consensus of (K/R)ccXcX3-4P (c, a hydrophobic residue;
X, any residue; Figure 4A). The C-terminal cytoplasmic tail of IR
contained a putative MIM (Figure 4A). A peptide containing this
motif (IRMIM-WT) efficiently bound to purified recombinant
MAD2 WT and a monomeric mutant R133A (Figure 4B), but did
not interact with MAD2 ΔC, a truncation mutant that could not
form C-MAD2. A mutant IR peptide (IRMIM-4A) did not interact
with MAD2. Immunoprecipitation with a C-MAD2-specific antibody (Fava et al., 2011) confirmed that IR-bound MAD2 formed
C-MAD2 (Figure 4C). IRMIM-WT bound to C-MAD2R133A with a
dissociation constant of 380 nM (Figure 4D). Thus, IR contains
a functional MIM.
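The MIM consensus described above (a basic K/R residue, two hydrophobic residues, any residue, one hydrophobic residue, three or four arbitrary residues, then a proline) can be written as a regular expression for scanning a protein sequence. The Python sketch below is only an illustration: the set of residues treated as hydrophobic, the function name, and the test peptides are assumptions introduced here, not sequences or definitions taken from the paper.

```python
import re

# Assumption: which residues count as "hydrophobic" is an illustrative
# choice for this sketch; the paper does not enumerate them.
HYDROPHOBIC = "AVILMFWY"

# Consensus: (K/R) - hydrophobic x2 - any - hydrophobic - any x3..4 - P.
# A lookahead is used so that overlapping candidates are all reported.
MIM_PATTERN = re.compile(
    rf"(?=([KR][{HYDROPHOBIC}]{{2}}.[{HYDROPHOBIC}].{{3,4}}P))"
)


def find_mim_candidates(sequence):
    """Return (start_index, matched_peptide) for each consensus match.

    Indices are 0-based positions in the input sequence.
    """
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in MIM_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical peptides, made up for demonstration only.
    for peptide in ("AAKVLAIGGGPAA", "RILGVAAAAP", "KGGAIGGGP"):
        print(peptide, find_mim_candidates(peptide))
```

A sequence match of this kind only nominates candidate motifs. As the binding assays above show for the IR peptide and its 4A mutant, function has to be confirmed biochemically.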
MAD2 WT, but not ΔC, interacted with the cytoplasmic
domain of IRβ (IRβ-C) (Figure 4E), indicating that the MIM is
not masked in the intact cytoplasmic domain of IR and is available for MAD2 binding. MAD2 binding did not affect the kinase
activity of IR, and vice versa. Furthermore, IR-WT-MYC interacted with endogenous MAD2 in 293FT cells, whereas IR-4A-MYC did not (Figure 4F). Depletion of p31comet enhanced the
IR-MAD2 interaction, suggesting that p31comet might promote
IR-C-MAD2 disassembly. Endogenous MAD2 and IR interacted
with each other in HepG2 cells (Figure 4G) and in whole-liver lysates of WT and liver-p31−/− mice (Figure 4H). Unlike in 293FT
cells depleted of p31comet, in liver lysates of liver-p31−/− mice
we did not observe a significant increase of the IR-MAD2 interaction, suggesting that p31comet might not actively promote IR-MAD2 disassembly in hepatocytes.
A previous report showed that insulin-stimulated IR activation
reduced the IR-MAD2 interaction in IR-overexpressing CHO
cells (O'Neill et al., 1997). In contrast, we found that the IR-MAD2 interaction in vitro and in human cells was not regulated
by insulin or IR autophosphorylation (Figures 4E–4G). Thus, IR
binds to MAD2 through a canonical MIM in vitro and in vivo.
This interaction is constitutive and is not regulated by insulin.
p31comet Regulates IR Endocytosis by
Counteracting MAD2
We generated HepG2 cells stably expressing IR-4A-GFP and
examined the subcellular localization of this MAD2-binding-deficient mutant. IR-WT localized to the PM in control-depleted
cells, but localized to intracellular compartments in p31comet-depleted cells (Figures 5A and 5B). In contrast, IR-4A retained
its PM localization, even in p31comet-depleted cells. Similar results were obtained in p31−/− MEFs (Figures 5C and 5D). Expression of p31comet WT, but not of the MAD2-binding-deficient
Q83A/F191A (QF) mutant (Yang et al., 2007), restored IR at the
PM of p31comet-depleted cells (Figures 5E and 5F). Adenovirus-mediated expression of IR-4A, but not of IR-WT or GFP,
in the liver of liver-p31−/− mice restored IR at the PM in hepatocytes (Figure 5G). These results establish the importance of IR-MAD2 and MAD2-p31comet interactions in IR endocytosis and
indicate that p31comet regulates insulin signaling through suppressing MAD2.
p31comet Blocks MAD2-BUBR1-Dependent AP2
Recruitment
Adaptor protein 2 (AP2) is crucial for clathrin-mediated endocytosis and interacts with both clathrin and the cargo (McMahon
and Boucrot, 2011). AP2 is a heterotetramer of the α, β2, μ2,
and σ2 subunits. BUBR1 has been reported to interact with
AP2-β2 (AP2B1) (Cayrol et al., 2002). Through its αC dimerization
helix, C-MAD2 can directly bind to BUBR1 (Chao et al., 2012).
MAD2 might bridge an interaction between IR and BUBR1-AP2, thus promoting clathrin-mediated endocytosis of IR.
Consistent with this hypothesis, the aberrant IR endocytosis in
p31comet-depleted HepG2 cells required clathrin, AP2, MAD2,
and BUBR1 (Figures 6A, 6B, and S4A–S4C). Depletion of
MAD2 or BUBR1 partially blocked IR endocytosis induced by insulin. Thus, MAD2 and BUBR1 are required for untimely IR internalization in cells with a compromised p31comet function and for
insulin-triggered IR endocytosis.
The recombinant BUBR1 N-terminal domain (BUBR1N) associated with IR-bound MAD2 WT (Figure 6C). It bound much less efficiently to MAD2 R133E/Q134A (RQ), which was monomeric and
contained mutations of two critical residues in αC. p31comet diminished the BUBR1N-MAD2 interaction (Figure 6C), without substantially affecting the IR-MAD2 interaction. p31comet bound
weakly to MAD2 RQ and did not further reduce the residual binding between BUBR1N and MAD2 RQ. Thus, p31comet inhibits the
interaction between BUBR1 and IR-bound C-MAD2 in vitro. In
human cells, MYC-BUBR1 interacted with IR and AP2B1 at a
basal level (Figure 6D). These interactions were enhanced when
clathrin-mediated endocytosis of IR was blocked with dynasore.
Depletion of p31comet further enhanced these interactions,
consistent with the role of p31comet in suppressing IR endocytosis.
(B) HepG2 cells stably expressing IR-WT-GFP were transfected with the indicated siRNAs, serum starved, treated without or with 80 μM dynasore (Dyn), and
stained with indicated antibodies. Scale bars, 10 μm.
(C) Quantification of the ratios of PM and IC IR-GFP signals of cells in (B) (mean ± SD; *p < 0.0001).
(D) Liver sections of WT, liver-p31−/−, and liver-Insr−/− mice injected with PBS or insulin (+Ins) were stained with anti-IR (red) and anti-RAB7 (green) antibodies and
DAPI (blue). Scale bars, 10 μm. Asterisks indicate sinusoids.
(E) Liver sections of WT and liver-p31−/− mice injected with AAV-GFP or AAV-p31 were stained with anti-IR (red) antibodies and DAPI (blue). Scale bars, 10 μm.
(F) Representative images of endocytosis assays with Cy3-insulin and Alexa 568-transferrin in WT and liver-p31−/− hepatocytes at 20 min. The insulin and
transferrin signals are shown in red. The DAPI signals are shown in blue. Scale bar, 10 μm.
(G and H) Endocytosis of insulin (G) and transferrin (H) in WT and liver-p31−/− hepatocytes. The intensities of internalized fluorescent signals at the indicated times
were quantified (mean ± SD; n = 3 independent experiments with >70 cells analyzed at each time point).
Figure 4. MAD2 Binds to a Canonical MIM in the C-Terminal Tail of IR
(A) Sequence alignment of the C-terminal tail of IR proteins and human CDC20 and MAD1, with the conserved residues in the MAD2-interacting motif (MIM)
boxed. The MIM consensus is shown on top. Sequences of IRMIM-WT and IRMIM-4A peptides are also shown.
(B) Beads coupled to IRMIM-WT or IRMIM-4A were incubated with the indicated MAD2 proteins. Input and proteins bound to beads were analyzed by SDS-PAGE
and stained with Coomassie (CBB).
(C) Beads coupled to a C-MAD2-specific antibody were incubated with the indicated MAD2 proteins in the presence or absence of IRMIM-WT. Input and bead-bound proteins were detected with the anti-MAD2 antibody.
(D) ITC analysis of binding between MAD2 and IRMIM-WT, with the Kd and binding stoichiometry (N) indicated.
In keeping with these protein-binding data, BUBR1, p31comet,
and MAD2 could be detected on the PM in HepG2 cells with total
internal reflection fluorescence (TIRF) microscopy (Figure S4D),
and the PM co-localization of IR and BUBR1 was increased by
dynasore treatment with or without insulin stimulation (Figures
S4E and S4F). We then measured the co-localization efficiency
of IR and AP2B1 on the cell surface in control and p31comet-depleted cells (Figures 6E–6G). In control cells, IR had a diffuse
PM staining, with no clear enrichment in the AP2B1 puncta (Figure 6E). The punctate staining pattern of AP2B1 became more
pronounced in p31comet-depleted cells, and IR efficiently co-localized with AP2B1 in these surface puncta (Figures 6E and
6F). Co-depletion of BUBR1 or MAD2 in p31comet-depleted cells
reduced the co-localization of IR and AP2B1 (Figures 6F and 6G).
Therefore, p31comet attenuates clathrin-mediated endocytosis of
IR through suppressing MAD2-BUBR1-dependent recruitment
of AP2.
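The co-localization of IR and AP2B1 is quantified in Figure 6F with Manders coefficients. As a minimal sketch of what such a coefficient measures, the Python code below computes the two Manders coefficients from two intensity channels given as flat lists of pixel values. The function name, the threshold parameters, and the toy pixel values are assumptions made for illustration; the text does not state which software or thresholds were used.

```python
def manders_coefficients(ch1, ch2, thr1=0.0, thr2=0.0):
    """Manders co-localization coefficients for two channels.

    ch1, ch2: equal-length sequences of non-negative pixel intensities
              (for example, IR and AP2B1 signals at the same pixels).
    thr1, thr2: intensities at or below these values count as background.

    M1 is the fraction of total ch1 intensity found in pixels where
    ch2 is above background; M2 is the converse for ch2.
    """
    if len(ch1) != len(ch2):
        raise ValueError("channels must have the same number of pixels")
    total1, total2 = sum(ch1), sum(ch2)
    if total1 == 0 or total2 == 0:
        raise ValueError("each channel needs some signal")
    m1 = sum(a for a, b in zip(ch1, ch2) if b > thr2) / total1
    m2 = sum(b for a, b in zip(ch1, ch2) if a > thr1) / total2
    return m1, m2


if __name__ == "__main__":
    # Toy four-pixel image, made up for demonstration only.
    ir_signal = [10, 0, 5, 5]
    ap2_signal = [0, 8, 2, 0]
    print(manders_coefficients(ir_signal, ap2_signal))
```

Both coefficients range from 0 (no overlap) to 1 (all signal of one channel lies in pixels occupied by the other), so an increase on p31comet depletion corresponds to more IR signal sitting in AP2B1-positive puncta.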
The p31comet-MAD2-BUBR1 Module Regulates Insulin
Signaling
We tested whether the p31comet-MAD2-BUBR1 module indeed
regulated insulin signaling. Insulin-triggered IR and AKT phosphorylation was markedly reduced in liver-p31−/− hepatocytes,
indicative of defective insulin signaling (Figures 7A and S5A).
Depletion of BUBR1 or MAD2 from liver-p31−/− hepatocytes
significantly restored insulin signaling. Co-depletion of MAD2,
BUBR1, AP2B1, or CLTC similarly restored insulin signaling in
p31comet-depleted HepG2 cells (Figures S5B and S5C). These
results indicate that p31comet-MAD2-BUBR1 controls insulin
signaling through regulating IR endocytosis.
Next, we generated liver-specific p31comet and BUBR1 double-knockout (liver-p31−/−;Bub1b−/−) mice and tested if BUBR1 ablation rescued the metabolic defects of liver-p31−/− mice. Similar to
Bub1bH/H mice, liver-p31−/−;Bub1b−/− mice were normal at birth,
but only attained 40% of normal weight at 3 weeks. Strikingly,
liver-p31−/−;Bub1b−/− mice showed hypoglycemia in the fed state
(Figure 7B), increased glucose tolerance (Figure 7C), and insulin
hypersensitivity (Figure 7D), phenotypes similar to those in
Bub1bH/H mice and opposite of those in liver-p31−/− mice.
The cell and nuclear sizes of hepatocytes in liver-p31−/−;
Bub1b−/− mice were larger than those of hepatocytes in the
single-knockout animals (Figures S5D and S5E). The percentage
of hepatocytes with 8N or greater DNA content was higher in
liver-p31−/−;Bub1b−/− mice (Figure S5F). This increased polyploidization is likely due to the complete loss of Bub1b in the
liver. Despite this striking ploidy change, liver-p31−/−;Bub1b−/−
mice retained the insulin and glucose hypersensitivity seen in
Bub1bH/H mice. These results do not support a causal link between polyploidization and insulin resistance. We could not
perform single-cell sequencing of p31−/−;Bub1b−/− hepatocytes
to ascertain their aneuploidy status, as live-cell sorting produced
no intact cells.
Because IR-4A does not undergo unscheduled endocytosis
caused by p31comet inactivation (Figure 5), we hypothesized
that the expression of IR-4A might suppress the metabolic phenotypes of liver-p31−/− mice. Expression of IR-4A-GFP, but not of
IR-WT-GFP, rescued the insulin signaling defects in HepG2 cells
depleted of p31comet (Figures S6A and S6B) and in primary hepatocytes isolated from liver-p31−/− mice (Figures S6C and S6D).
Adenovirus-mediated expression of IR-4A, but not of IR-WT or
GFP, restored insulin-stimulated IR autophosphorylation and
AKT phosphorylation in the liver of liver-p31−/− mice (Figure 7E).
Expression of IR-4A also significantly decreased fed blood
glucose levels (Figure 7F) and restored glucose tolerance (Figure 7G) and insulin sensitivity (Figure 7H) in liver-p31−/− mice.
These results indicate that p31comet regulates insulin signaling
through suppressing the functions of the IR-MAD2 interaction
in vivo.
DISCUSSION
Combining mouse genetics, cell biological and biochemical
methods, and single-cell genomics, we have discovered a critical role for the p31comet-MAD2-BUBR1 module of mitotic regulators in insulin signaling through regulating IR endocytosis
(Figure 7I).
In the mouse, p31comet deficiency diminishes IR at the PM
prior to insulin stimulation and causes defective insulin signaling
and metabolic syndrome. The insulin signaling defects caused
by p31comet ablation are not limited to the liver, as whole-body
p31−/− mice exhibit neonatal lethality, whereas liver-specific
p31−/− mice are viable. Indeed, p31−/− MEFs exhibited defective
insulin-induced adipogenesis (Figures S7A and S7B). Insulin
signaling, GLUT4 translocation, and insulin-stimulated glucose
uptake were defective in the poorly differentiated p31−/− adipocytes (Figures S7C–S7E).
While we cannot rule out mitotic defects and the low extent of
aneuploidy as factors contributing to the metabolic defects in
p31comet-deficient mice, two complementary lines of evidence
argue against aneuploidy as the determining factor. The first
line of evidence involves the use of BUBR1-deficient (Bub1bH/H)
mice. These mice harbor aneuploidy in hepatocytes, but are
more sensitive to insulin, a phenotype opposite of that of liver-p31−/− mice. Furthermore, liver-specific ablation of BUBR1
rescues the metabolic defects of liver-p31−/− mice, despite
increasing polyploidization in hepatocytes. These findings support our conclusion that BUBR1 facilitates IR endocytosis and
attenuates insulin signaling. They also indicate that aneuploidy or
polyploidy alone is insufficient to produce the insulin signaling defects caused by p31comet ablation.
(E) GST pull-down assays with recombinant GST-IRβ-C and MAD2. Input and bead-bound proteins were blotted with the indicated antibodies. The relative
intensities of pY and MAD2 (mean ± SEM; n = 3 independent experiments) are shown below.
(F) 293FT cells were co-transfected with IR-WT-MYC or IR-4A-MYC constructs and the indicated siRNAs, serum starved, and treated without (−) or with (+)
100 nM insulin (Ins) for 20 min. The total cell lysate (TCL) and anti-MYC immunoprecipitate (IP) were blotted with the indicated antibodies.
(G) HepG2 cells were transfected with siLuc or siMAD2, serum starved, and treated without (−) or with (+) 100 nM insulin (Ins) for 20 min. TCL, anti-MAD2 IP, and
IgG IP were blotted with the indicated antibodies.
(H) Total liver lysates from WT, liver-p31−/− and liver-Insr−/− mice, and anti-MAD2 and IgG IP from these lysates were blotted with the indicated antibodies.
Figure 5. p31comet Suppresses Spontaneous IR Endocytosis through Counteracting the IR-MAD2 Interaction
(A) HepG2 cells stably expressing IR-WT-GFP or IR-4A-GFP were transfected with siLuc or si-p31, serum starved, and stained with anti-GFP (IR; green) antibody
and DAPI (blue). Scale bars, 10 μm.
(B) Quantification of the ratios of PM and IC IR-GFP signal intensities in (A) (mean ± SD; ****p < 0.0001).
(C) WT and p31−/− MEFs were infected with IR-WT-GFP or IR-4A-GFP retroviruses, serum starved, treated without or with dynasore (Dyn), and stained with anti-GFP (IR; green) antibody and DAPI (blue). Scale bar, 10 μm.
(D) Quantification of the ratios of PM and IC IR-GFP signal intensities in (C) (mean ± SD; ****p < 0.0001).
(E) IR-GFP-expressing HepG2 cells were co-transfected with the indicated siRNAs and plasmids and stained with anti-GFP (IR; green) and anti-MYC (p31comet;
red) antibodies and DAPI (blue). Scale bar, 10 μm.
(F) Quantification of the ratios of PM and IC IR-GFP signal intensities in (E) (mean ± SD; ****p < 0.0001).
(G) Liver sections of WT and liver-p31−/− mice injected with Ad-GFP, Ad-IR-WT, or Ad-IR-4A were stained with anti-IR (red) antibody and DAPI (blue). Scale
bar, 10 μm.
Figure 6. p31comet Prevents MAD2- and BUBR1-Dependent IR Endocytosis through Blocking AP2 Recruitment
(A) IR-GFP-expressing HepG2 cells were transfected with the indicated siRNAs, treated without or with 100 nM insulin (Ins) for 5 min, and stained with DAPI (blue)
and anti-GFP (IR; green) and anti-RAB7 (red) antibodies. Scale bars, 10 μm.
The second line of evidence comes from genetic suppression experiments with transgene expression in the liver. The
aneuploidy incidence in p31−/− hepatocytes is surprisingly low.
Restoring p31comet expression using viral vectors does not eliminate this low-level aneuploidy, but rescues metabolic and insulin
signaling defects in liver-p31−/− mice and hepatocytes. Expression of IR-4A similarly rescues the phenotypes of liver-p31−/−
mice and hepatocytes. These results indicate that the insulin
signaling defects caused by p31comet ablation are not simply
an indirect consequence of aneuploidy. Rather, they strongly
support our conclusion that p31comet and MAD2 directly regulate
insulin signaling through a physical interaction with IR.
We propose that O-MAD2 can bind to IR without the need
for MAD1-mediated conformational activation (Figure 7I). IR-bound MAD2 then recruits AP2 to IR through BUBR1 and promotes IR endocytosis. p31comet prevents IR-bound MAD2
from binding BUBR1 and blocks AP2 recruitment to IR, thereby
inhibiting clathrin-mediated endocytosis of IR. Insulin-triggered IR autophosphorylation events have been shown to be
a requirement for insulin-stimulated IR endocytosis (Goh
and Sorkin, 2013). MAD2 and BUBR1 are also required for
insulin-stimulated IR endocytosis in cells without genetic
inactivation of p31comet. Thus, insulin stimulation might suppress p31comet-mediated blockade of BUBR1-AP2 association
with IR. Future experiments are needed to test this intriguing
possibility.
The p31comet-MAD2-BUBR1 module specifically regulates the
endocytosis of cell-surface receptors that contain the MIM. The
MIM is highly conserved in all vertebrate IR proteins (Figure 4A),
suggesting that the metabolic function of the p31comet-MAD2-BUBR1 module might have been acquired in vertebrates. The insulin-like growth factor 1 receptor (IGF1R) does not contain the
MIM and does not bind to MAD2 (O'Neill et al., 1997). On the
other hand, IR might not be the only client cell-surface protein
regulated by MAD2. For example, MAD2 can bind to another
membrane protein, ADAM17, a metalloprotease with myriad
functions in cancer biology (Nelson et al., 1999). We have identified a functional MIM in the cytoplasmic tail of ADAM17 (Figures
S7F–S7H). Additional studies are required to establish the functional significance of this interaction and to systematically identify cell-surface proteins that interact with MAD2.
BUBR1 insufficiency in mice causes aging-related disorders
(Baker et al., 2004), and, conversely, BUBR1 overexpression extends lifespan and delays aging in mice (Baker et al., 2013). An
evolutionarily conserved function of IR/IGF1R signaling in
longevity and aging has been widely documented (Bluher
et al., 2003; Kimura et al., 1997). We have now linked BUBR1
to insulin signaling. Future experiments are needed to test
whether changes in insulin signaling contribute to the aging phenotypes of mice with altered BUBR1 expression.
Liver-specific ablation of p31comet in mice produces metabolic
disorders reminiscent of type 2 diabetes, including insulin resistance. The underlying mechanisms of insulin resistance in type 2
diabetes are complex and not fully understood (Samuel and
Shulman, 2012). Our findings presented herein implicate premature IR internalization prior to insulin binding as a potential mechanism underlying insulin resistance. It will be interesting to
examine IR levels or localization in liver biopsies of human
type 2 diabetes patients.
Our results establish the role of mitotic checkpoint regulators
in insulin signaling and metabolic homeostasis. Our study provides a striking example of how an entire branch of key regulators in one cellular process can be repurposed to control
another. By virtue of its ability to link signaling pathways originating from kinetochores and the plasma membrane, the
p31comet-MAD2-BUBR1 module may offer a potential conduit
for extracellular hormones to regulate chromosome segregation
and karyotypes.
EXPERIMENTAL PROCEDURES
Generation and Phenotypic Analyses of p31−/− and Liver-p31−/− Mice
All animal experiments were performed in accordance with institutional guidelines and with approval from the Institutional Animal Care and Use Committee
of UT Southwestern Medical Center.
The strategy of targeting p31comet (Mad2l1bp) is described in Figure S1. The
p31F/F mice were crossed with transgenic mice expressing Cre recombinase
under the Actin promoter to generate the p31−/− mice and were mated with
transgenic mice expressing Cre recombinase under the Albumin promoter
to generate liver-p31−/− mice. See the Supplemental Experimental Procedures
for information about the mouse crosses and husbandry and generation of
liver-p31−/−;Bub1b−/− mice.
Tissue histology and immunohistochemistry were performed by an on-campus core facility. Hepatic glycogen content was measured with the
Glycogen Assay Kit (Sigma). Glucose and insulin tolerance tests, metabolic
profiling, glycogen synthase activity assay, and in vivo insulin signaling assay
were performed with established protocols. Prism was used for the generation
of all curves and graphs and for statistical analyses. See the Supplemental
Experimental Procedures for details.
Cell Culture, Transfection, and Viral Infection
Mouse embryonic fibroblasts (MEFs) and primary hepatocytes were isolated
and cultured following standard protocols. 293FT and HepG2 cells were
cultured in DMEM, supplemented with fetal bovine serum. Plasmid
transfections into 293FT and HepG2 cells were performed with polyethylenimine (PEI; Sigma) and Lipofectamine 2000 (Invitrogen), respectively. siRNA
transfections were performed with Lipofectamine RNAiMAX (Invitrogen).
Recombinant adenoviruses and adeno-associated viruses were generated at Agilent Technologies and Vector Biolabs, respectively, and were introduced to
mice through tail-vein injection. See the Supplemental Experimental Procedures for adipocyte differentiation and glucose uptake assay, generation of
stable cell lines, list of siRNAs, and other details.
(B) Quantification of the ratios of PM and IC IR-GFP signal intensities in (A) (mean ± SD; ****p < 0.0001).
(C) The indicated proteins were incubated for 1 hr and added to beads coupled to the IRMIM-WT peptide. Proteins bound to beads were blotted with the indicated
antibodies. The relative BUBR1 intensities (mean ± SEM; n = 3 independent experiments) are shown below.
(D) 293T cells were transfected with plasmids encoding MYC-BUBR1, AP2B1, and IR and treated with or without dynasore (Dyn). The total cell lysates (TCL), anti-MYC IP, and IgG IP were blotted with the indicated antibodies. The relative intensities of IRβ and AP2B1 (mean ± SEM; n = 3 independent experiments) are shown
in the figure.
(E) HepG2 cells stably expressing IR-WT-GFP were transfected with the indicated siRNAs and stained with anti-GFP (IR; green) and anti-AP2B1 (red) antibodies.
Two boxed regions (1 and 2) were magnified and shown. Scale bar, 10 μm.
(F) Quantification of the Manders coefficients of IR and AP2B1 co-localization in (E) (mean ± SEM; siLuc, n = 16; si-p31, n = 26; si-p31/siBUBR1, n = 12; si-p31/siMAD2, n = 11; **p < 0.005).
(G) Western blot analysis of cell lysates in (E) and (F).
See also Figure S4.
Figure 7. The p31comet-MAD2-BUBR1 Module Controls Insulin Signaling
(A) Hepatocytes from WT and liver-p31−/− mice were transfected with siRNAs, serum starved, and then treated with 10 nM insulin (Ins) for 5 min. Cell lysates were
blotted with the indicated antibodies.
(B–D) Fed blood glucose levels (B), glucose tolerance test (C), and insulin tolerance test (D) of WT, liver-p31−/−, Bub1bH/H, and liver-p31−/−;Bub1b/ mice. At least
eight mice in each group were analyzed. Mean ± SEM. *p < 0.05, **p < 0.01, ***p < 0.001, ****p < 0.0001 versus WT; †p < 0.05, ††p < 0.01, †††p < 0.001,
††††p < 0.0001 versus liver-p31−/−;Bub1b/.
Single-Cell Whole-Genome Sequencing

Hepatocytes were isolated from WT, liver-p31−/−, and Bub1bH/H mice and
sorted into single cells with flow cytometry. GenomePlex Single Cell Whole
Genome Amplification (WGA4; Sigma) was performed. Library generation
and sequencing were performed by an on-campus facility. Sequencing reads
were aligned against the reference genome. Copy numbers were estimated
with a bin size of 500 kb. The average gene copy number of each chromosome
was calculated, and log2 (copy number/average autosome copy number) was
used to classify aneuploidy, with cutoff set at 0.15. See the Supplemental
Experimental Procedures for details.
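The aneuploidy call described above can be illustrated with a minimal sketch. It assumes that the per-chromosome average copy numbers have already been derived from the 500-kb bins, and that the 0.15 cutoff is applied symmetrically to gains and losses; the text states only the cutoff value, so the symmetric use, the function name, and the labels are assumptions made for illustration.

```python
import math
from statistics import mean

CUTOFF = 0.15  # log2-ratio cutoff stated in the text


def classify_chromosomes(copy_number_by_chrom, cutoff=CUTOFF):
    """Label each autosome as gain, loss, or euploid.

    copy_number_by_chrom maps an autosome name to its average copy number
    in one cell. Symmetric use of the cutoff is an assumption.
    """
    autosome_mean = mean(copy_number_by_chrom.values())
    calls = {}
    for chrom, copy_number in copy_number_by_chrom.items():
        # log2 (copy number / average autosome copy number)
        ratio = math.log2(copy_number / autosome_mean)
        if ratio > cutoff:
            calls[chrom] = "gain"
        elif ratio < -cutoff:
            calls[chrom] = "loss"
        else:
            calls[chrom] = "euploid"
    return calls


# Hypothetical diploid mouse cell (19 autosomes) with one extra copy of chr5.
cell = {f"chr{i}": 2.0 for i in range(1, 20)}
cell["chr5"] = 3.0
print(classify_chromosomes(cell)["chr5"])
```

Because the ratio is taken against the mean of all autosomes, a single gained chromosome shifts the remaining chromosomes only slightly below zero, well inside the cutoff.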
Immunoprecipitation and Immunoblots
Cleared cell lysates were incubated with antibody-conjugated beads. The proteins bound to beads were eluted with SDS sample buffer and analyzed by
SDS-PAGE and quantitative western blotting. Membranes were scanned
with the Odyssey Infrared Imaging System (LI-COR). See the Supplemental
Experimental Procedures for details and list of antibodies.
Immunofluorescence, Live-Cell Imaging, and Metaphase Spreads
Immunofluorescence microscopy was performed on cells grown on coverslips
or on liver sections following standard fixation and staining procedures.
Cells were imaged on various microscopes with the appropriate objectives.
Live-cell imaging was performed on MEFs expressing H2B-mRFP. Metaphase
spreads were prepared for MEFs, and images were captured with a Zeiss
Axioscope upright microscope. Identical exposure times and magnifications
were used for all comparative analyses. Images were analyzed with ImageJ.
See the Supplemental Experimental Procedures for details and for the endocytosis assay.
Flow Cytometry
MEFs pulsed with bromodeoxyuridine (BrdU) were fixed and stained with fluorescein isothiocyanate (FITC)-anti-BrdU antibody (BD Bioscience) and propidium iodide. Hepatocytes were fixed and stained with propidium iodide. Cells
were analyzed with FACSCalibur or FACS Aria II SORP (BD Bioscience). Data
were processed with FlowJo. See the Supplemental Experimental Procedures
for details.
Protein Binding Assays
Recombinant MAD2 proteins were purified as previously described (Luo et al.,
2004). BUBR1N (residues 1–370) and mouse p31comet were purified with
affinity and conventional chromatography. IR peptides were chemically
synthesized. Glutathione S-transferase (GST)-IR (residues 1011–1382) was
purchased from Promega. The isothermal titration calorimetry (ITC) assay of
IR-WT binding to MAD2 was performed as previously described (Xia et al.,
2004). Peptide- or protein-bound beads were used to pull down prey proteins
in line with standard procedures. See the Supplemental Experimental Procedures for details.
SUPPLEMENTAL INFORMATION
Supplemental Information includes Supplemental Experimental Procedures,
seven figures, and one table and can be found with this article online at
http://dx.doi.org/10.1016/j.cell.2016.05.074.
AUTHOR CONTRIBUTIONS
E.C. designed and performed all the experiments, analyzed the data, and
wrote the paper. X.Z. and C.X. analyzed copy number variation in hepatocytes.
H.Y. supervised the project, provided suggestions, and edited the paper.
ACKNOWLEDGMENTS
We thank Xuemin Zhang for assistance with generating the p31F/F mice, Jan
van Deursen for providing the Bub1bH/H mice and other reagents, Mayuko
Hara for performing ITC, Bing Li for preparing BUBR1N, Min Kim for advice
on adenovirus production, Ralph DeBerardinis for providing 293FT and
HepG2 cells, Vanessa Schmid and Rachel Bruce for single-cell sequencing,
John Shelton and James Richardson for advice on histological analysis, and
Xuelian Luo for advice on protein purification. We are grateful to anonymous
reviewers for suggesting the genetic suppression experiments. This work is
supported, in part, by grants from the Clayton Foundation (to H.Y.) and the
NIH (UL1TR001105 to C.X.). H.Y. is an HHMI Investigator.
Received: March 23, 2015
Revised: December 9, 2015
Accepted: May 24, 2016
Published: June 30, 2016
ACCESSION NUMBERS
The accession number for the single-cell sequencing data reported in this paper is SRA: SRP074854.
REFERENCES
Baker, D.J., Jeganathan, K.B., Cameron, J.D., Thompson, M., Juneja, S., Kopecka, A., Kumar, R., Jenkins, R.B., de Groen, P.C., Roche, P., and van Deursen, J.M. (2004). BubR1 insufficiency causes early onset of aging-associated phenotypes and infertility in mice. Nat. Genet. 36, 744–749.
Baker, D.J., Dawlaty, M.M., Wijshake, T., Jeganathan, K.B., Malureanu, L., van Ree, J.H., Crespo-Diaz, R., Reyes, S., Seaburg, L., Shapiro, V., et al. (2013). Increased expression of BubR1 protects against aneuploidy and cancer and extends healthy lifespan. Nat. Cell Biol. 15, 96–102.
Biddinger, S.B., Hernandez-Ono, A., Rask-Madsen, C., Haas, J.T., Aleman, J.O., Suzuki, R., Scapa, E.F., Agarwal, C., Carey, M.C., Stephanopoulos, G., et al. (2008). Hepatic insulin resistance is sufficient to produce dyslipidemia and susceptibility to atherosclerosis. Cell Metab. 7, 125–134.
Bluher, M., Kahn, B.B., and Kahn, C.R. (2003). Extended longevity in mice lacking the insulin receptor in adipose tissue. Science 299, 572–574.
Boucher, J., Kleinridders, A., and Kahn, C.R. (2014). Insulin receptor signaling in normal and insulin-resistant states. Cold Spring Harb. Perspect. Biol. 6, a009191.
Cayrol, C., Cougoule, C., and Wright, M. (2002). The beta2-adaptin clathrin adaptor interacts with the mitotic checkpoint kinase BubR1. Biochem. Biophys. Res. Commun. 298, 720–730.
Chao, W.C., Kulkarni, K., Zhang, Z., Kong, E.H., and Barford, D. (2012). Structure of the mitotic checkpoint complex. Nature 484, 208–213.
Fava, L.L., Kaulich, M., Nigg, E.A., and Santamaria, A. (2011). Probing the in vivo function of Mad1:C-Mad2 in the spindle assembly checkpoint. EMBO J. 30, 3322–3336.
(E) Immunoblots of whole-liver lysates of WT and liver-p31−/− mice injected with Ad-GFP, Ad-IR-WT, or Ad-IR-4A and treated without or with insulin (Ins). Each lane
contains lysate from an individual mouse. The relative band intensities are quantified in the figure.
(F–H) Fed blood glucose levels (F), glucose tolerance test (G), and insulin tolerance test (H) of WT and liver-p31−/− mice injected with Ad-GFP, Ad-IR-WT, or
Ad-IR-4A. At least eight mice in each group were analyzed. Mean ± SEM. *p < 0.05, **p < 0.01 versus liver-p31−/− (Ad-GFP).
(I) Left and right panels depict the roles of the p31comet-MAD2-BUBR1 module in mitosis and insulin signaling, respectively. The red lines in MAD1, CDC20, and IR
indicate the MAD2-interacting motif (MIM).
See also Figures S5, S6, and S7.
Girard, J., and Pegorier, J.P. (1998). An overview of early post-partum nutrition and metabolism. Biochem. Soc. Trans. 26, 69–74.
Goh, L.K., and Sorkin, A. (2013). Endocytosis of receptor tyrosine kinases. Cold Spring Harb. Perspect. Biol. 5, a017459.
Hagan, R.S., Manak, M.S., Buch, H.K., Meier, M.G., Meraldi, P., Shah, J.V., and Sorger, P.K. (2011). p31comet acts to ensure timely spindle checkpoint silencing subsequent to kinetochore attachment. Mol. Biol. Cell 22, 4236–4246.
Hutchins, J.R., Toyoda, Y., Hegemann, B., Poser, I., Heriche, J.K., Sykora, M.M., Augsburg, M., Hudecz, O., Buschhorn, B.A., Bulkescher, J., et al. (2010). Systematic analysis of human protein complexes identifies chromosome segregation proteins. Science 328, 593–599.
Izawa, D., and Pines, J. (2015). The mitotic checkpoint complex binds a second CDC20 to inhibit active APC/C. Nature 517, 631–634.
Jia, L., Li, B., Warrington, R.T., Hao, X., Wang, S., and Yu, H. (2011). Defining pathways of spindle checkpoint silencing: functional redundancy between Cdc20 ubiquitination and p31comet. Mol. Biol. Cell 22, 4227–4235.
Jia, L., Kim, S., and Yu, H. (2013). Tracking spindle checkpoint signals from kinetochores to APC/C. Trends Biochem. Sci. 38, 302–311.
Joshi, R.L., Lamothe, B., Cordonnier, N., Mesbah, K., Monthioux, E., Jami, J., and Bucchini, D. (1996). Targeted disruption of the insulin receptor gene in the mouse results in neonatal lethality. EMBO J. 15, 1542–1547.
Kimura, K.D., Tissenbaum, H.A., Liu, Y., and Ruvkun, G. (1997). daf-2, an insulin receptor-like gene that regulates longevity and diapause in Caenorhabditis elegans. Science 277, 942–946.
Knouse, K.A., Wu, J., Whittaker, C.A., and Amon, A. (2014). Single cell sequencing reveals low levels of aneuploidy across mammalian tissues. Proc. Natl. Acad. Sci. USA 111, 13409–13414.
Kulukian, A., Han, J.S., and Cleveland, D.W. (2009). Unattached kinetochores catalyze production of an anaphase inhibitor that requires a Mad2 template to prime Cdc20 for BubR1 binding. Dev. Cell 16, 105–117.
London, N., and Biggins, S. (2014). Signalling dynamics in the spindle checkpoint response. Nat. Rev. Mol. Cell Biol. 15, 736–747.
Luo, X., and Yu, H. (2008). Protein metamorphosis: the two-state behavior of Mad2. Structure 16, 1616–1625.
Luo, X., Tang, Z., Xia, G., Wassmann, K., Matsumoto, T., Rizo, J., and Yu, H. (2004). The Mad2 spindle checkpoint protein has two distinct natively folded states. Nat. Struct. Mol. Biol. 11, 338–345.
Macia, E., Ehrlich, M., Massol, R., Boucrot, E., Brunner, C., and Kirchhausen, T. (2006). Dynasore, a cell-permeable inhibitor of dynamin. Dev. Cell 10, 839–850.
Malureanu, L., Jeganathan, K.B., Jin, F., Baker, D.J., van Ree, J.H., Gullon, O., Chen, Z., Henley, J.R., and van Deursen, J.M. (2010). Cdc20 hypomorphic mice fail to counteract de novo synthesis of cyclin B1 in mitosis. J. Cell Biol. 191, 313–329.
Mapelli, M., and Musacchio, A. (2007). MAD contortions: conformational dimerization boosts spindle checkpoint signaling. Curr. Opin. Struct. Biol. 17, 716–725.
McMahon, H.T., and Boucrot, E. (2011). Molecular mechanism and physiological functions of clathrin-mediated endocytosis. Nat. Rev. Mol. Cell Biol. 12, 517–533.
Michael, M.D., Kulkarni, R.N., Postic, C., Previs, S.F., Shulman, G.I., Magnuson, M.A., and Kahn, C.R. (2000). Loss of insulin signaling in hepatocytes leads to severe insulin resistance and progressive hepatic dysfunction. Mol. Cell 6, 87–97.
Musacchio, A. (2015). The molecular biology of spindle assembly checkpoint signaling dynamics. Curr. Biol. 25, R1002–R1018.
Nelson, K.K., Schlondorff, J., and Blobel, C.P. (1999). Evidence for an interaction of the metalloprotease-disintegrin tumour necrosis factor alpha convertase (TACE) with mitotic arrest deficient 2 (MAD2), and of the metalloprotease-disintegrin MDC9 with a novel MAD2-related protein, MAD2beta. Biochem. J. 343, 673–680.
O'Neill, T.J., Zhu, Y., and Gustafson, T.A. (1997). Interaction of MAD2 with the carboxyl terminus of the insulin receptor but not with the IGFIR. Evidence for release from the insulin receptor after activation. J. Biol. Chem. 272, 10035–10040.
Pollak, M. (2012). The insulin and insulin-like growth factor receptor family in neoplasia: an update. Nat. Rev. Cancer 12, 159–169.
Samuel, V.T., and Shulman, G.I. (2012). Mechanisms for insulin resistance: common threads and missing links. Cell 148, 852–871.
Sotillo, R., Hernando, E., Díaz-Rodríguez, E., Teruya-Feldstein, J., Cordon-Cardo, C., Lowe, S.W., and Benezra, R. (2007). Mad2 overexpression promotes aneuploidy and tumorigenesis in mice. Cancer Cell 11, 9–23.
White, M.F. (2003). Insulin signaling in health and disease. Science 302, 1710–1711.
Xia, G., Luo, X., Habu, T., Rizo, J., Matsumoto, T., and Yu, H. (2004). Conformation-specific binding of p31comet antagonizes the function of Mad2 in the spindle checkpoint. EMBO J. 23, 3133–3143.
Yang, M., Li, B., Tomchick, D.R., Machius, M., Rizo, J., Yu, H., and Luo, X. (2007). p31comet blocks Mad2 activation through structural mimicry. Cell 131, 744–755.
Article
AIRE-Deficient Patients Harbor Unique High-Affinity Disease-Ameliorating Autoantibodies
Authors
Steffen Meyer, Martin Woodward,
Christina Hertel, ..., Pärt Peterson,
Kai Kisand, Adrian Hayday
Correspondence
kai.kisand@ut.ee (K.K.),
adrian.hayday@kcl.ac.uk (A.H.)
In Brief
Self-reactive antibodies specific for type I
interferons are associated with protection
against type I diabetes in patients with an
autoimmune syndrome caused by
mutations in AIRE.
Highlights
• Each AIRE-deficient patient has a private repertoire of autoantibody reactivities
• Loss of B cell tolerance occurs during T cell-dependent somatic hypermutation
• Patient autoantibodies have unprecedented affinities for conformational epitopes
• Patient autoantibodies can display disease-ameliorating properties in vivo
Meyer et al., 2016, Cell 166, 582–595
July 28, 2016 © 2016 The Authors. Published by Elsevier Inc.
Article
AIRE-Deficient Patients Harbor Unique
High-Affinity Disease-Ameliorating Autoantibodies
Steffen Meyer,1,11 Martin Woodward,2,11 Christina Hertel,1,11 Philip Vlaicu,1,11 Yasmin Haque,2,11 Jaanika Karner,3,11
Annalisa Macagno,4 Shimobi C. Onuoha,4 Dmytro Fishman,5,6 Hedi Peterson,5,6 Kaja Metskula,7 Raivo Uibo,7 Kirsi Jantti,8
Kati Hokynar,8 Anette S.B. Wolff,9 APECED patient collaborative, Kai Krohn,8 Annamari Ranki,10 Pärt Peterson,3
Kai Kisand,3,* and Adrian Hayday2,*
1ImmunoQure AG, Königsallee 90, 2012 Düsseldorf, Germany
2Peter Gorer Department of Immunobiology, King's College, London SE19RT, UK
3Molecular Pathology, Institute of Biomedicine and Translational Medicine, University of Tartu, Ravila 19, Tartu 50411, Estonia
4ImmunoQure Research AG, Wagistrasse 14, 8952 Schlieren, Switzerland
5Institute of Computer Science, University of Tartu, Liivi 2, Tartu 50409, Estonia
6Quretec Ltd., Ülikooli 6A, Tartu 51003, Estonia
7Department of Immunology, Institute of Biomedicine and Translational Medicine, University of Tartu, Ravila 19, Tartu 50411, Estonia
8Clinical Research Institute HUCH Ltd., Haartmaninkatu 8, 00290 Helsinki, Finland
9Department of Clinical Science, University of Bergen, Laboratory Building, 8th floor, 5021 Bergen, Norway
10Department of Dermatology, Allergology and Venereology, Institute of Clinical Medicine, University of Helsinki, Skin and Allergy Hospital,
Helsinki University Central Hospital, Meilahdentie 2, 00250 Helsinki, Finland
11Co-first author
*Correspondence: kai.kisand@ut.ee (K.K.), adrian.hayday@kcl.ac.uk (A.H.)
SUMMARY
APS1/APECED patients are defined by defects in the autoimmune regulator (AIRE) that mediates central T cell tolerance to many self-antigens. AIRE deficiency also affects B cell tolerance, but this is incompletely understood. Here we show that most APS1/APECED patients displayed B cell autoreactivity toward unique sets of approximately 100 self-proteins. Thereby, autoantibodies from 81 patients collectively detected many thousands of human proteins. The loss of B cell tolerance seemingly occurred during antibody affinity maturation, an obligatorily T cell-dependent step. Consistent with this, many APS1/APECED patients harbored extremely high-affinity, neutralizing autoantibodies, particularly against specific cytokines. Such antibodies were biologically active in vitro and in vivo, and those neutralizing type I interferons (IFNs) showed a striking inverse correlation with type I diabetes, not shown by other anti-cytokine antibodies. Thus, naturally occurring human autoantibodies may actively limit disease and be of therapeutic utility.
INTRODUCTION
T lymphocyte tolerance is essential for limiting autoimmune disease. Tolerance occurs centrally when developing thymocytes
with strongly self-reactive T cell receptors (TCRs) are deleted
following engagement of self-antigen-derived peptides presented by major histocompatibility complex (MHC) antigens.
The expression of thousands of tissue-specific self-antigens
(TSAs) by medullary thymic epithelial cells (mTEC) is directly promoted by AIRE, a poorly understood transcriptional regulator
(Mathis and Benoist, 2009; Klein et al., 2014). Reflecting its
importance, AIRE deficiency is defined by the APS1/APECED
syndrome for which autoimmune polyendocrinopathy and
chronic mucocutaneous candidiasis are pathognomonic (Nagamine et al., 1997).
There are also several mechanisms of peripheral T cell tolerance, including requirements for co-stimulatory signals for the
activation of naive T cells; the expression of molecular brakes
(e.g., CTLA-4, PD-1) by activated T cells; and the suppression of
effector T cells in trans by FOXP3-expressing T-regulatory
(T-reg) cells. Reflecting its importance, FOXP3 deficiency is
defined by early-onset, life-threatening autoimmunity (Bennett
et al., 2001; Wildin et al., 2001).
Central and peripheral tolerance mechanisms have likewise
been hypothesized to shape the B cell compartment. Thus,
self-reactive B cells developing in the bone marrow may be
censored by clonal deletion, clonal anergy, or B cell receptor
(BCR) editing in which secondary gene rearrangements replace
the initial BCR with a new specificity (Goodnow et al., 2010; Pillai
et al., 2011; Übelhart and Jumaa, 2015). Peripheral B cell tolerance is less well characterized, although some checkpoints
have been inferred. For example, immature transitional B cells
recently emigrated from the bone marrow contain many autoreactive and polyreactive cells, whereas there are relatively few
among mature naive B cells, strongly suggesting that tolerance
is imposed as transitional B cells differentiate into naive B cells
(Wardemann et al., 2003).
Interestingly, this B cell checkpoint is T cell dependent, as reflected by its impairment in patients with T-reg deficiencies
(Kinnunen et al., 2013). Likewise, CD40L and MHC class II
deficiencies that each impair T-B interactions also display
more autoreactive B cells (Meffre and Wardemann, 2008). These considerations raise the possibility that B cell tolerance is largely governed by the state of T cell tolerance.
This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
Certainly, any autoreactive B cell that might progress through
to the naive B cell compartment of a healthy individual should
lack cognate autoreactive T cells to help it mature. Likewise,
T cell help is required in the germinal center (GC) reaction in
which B cells undergo somatic hyper-mutation (SHM) of the
immunoglobulin (Ig) variable (V) region genes, thereby driving
T cell-dependent selective expansion of clones with increased
antigen affinity (Brink, 2014). The question that then arises is
whether major defects in central T cell tolerance provoke wide-ranging losses of B cell tolerance at either or both of these
stages.
An approach to assessing this is to examine B cell reactivities
in AIRE-deficient APS1/APECED patients whose under-expression of TSAs in the thymus is predicted to lead to increased
numbers of peripheral autoreactive T cells. Thus, there are reports of APS1/APECED patients carrying autoantibodies against
twenty-five TSAs, with prevalence ranging from 6% to 69%
(Kisand and Peterson, 2015). Their specificities include steroidogenic enzymes, consistent with the patients' polyendocrinopathies (Krohn et al., 1992; Uibo et al., 1994; Winqvist et al.,
1993). In addition, most patients display autoreactivities toward
type I IFNs and T helper (Th)-17-related cytokines, antibodies to
which limit resistance to Candida infection (Kisand et al., 2010;
Meager et al., 2006; Puel et al., 2010).
These findings notwithstanding, there has been no large-scale
analysis of the scope and nature of autoantibodies in APS1/
APECED patients, thereby resolving how T cell tolerance impacts upon B cell tolerance in humans. By analyzing 81 APS1/
APECED patients, we found that each was much more likely
than a healthy relative or an unrelated control to harbor strong
serum reactivities toward ~100 human proteins. About 10 of
those, including type I IFNs and interleukin-22 (IL22), were
recognized by almost all patients, whereas others were mostly
private specificities. Hence, 81 patients collectively harbored
antibodies toward >3,700 human proteins.
Focusing on antibodies to type I IFNs, IL22, and IL17, we
found unexpectedly that most were reactive to conformational
determinants and included highly mutated antibodies of sub-picomolar affinity. Because their germline counterparts were
not self-reactive, B cell autoreactivity was most probably
driven by self-reactive T cells in the GC reaction. The autoantibodies commonly neutralized their targets in vivo, and APS1/APECED patients with signature type 1 diabetes (T1D)-associated antibodies (e.g., anti-GAD65) commonly failed to develop
T1D so long as they harbored powerfully neutralizing IFNα-specific antibodies. Thus, autoantibodies naturally arising in subjects with defective central T cell tolerance may be disease
ameliorating.
RESULTS
High-Titer Autoreactivities in APS1/APECED
Sera from 81 APS1/APECED patients from discrete Finnish, Norwegian, Slovenian, and Sardinian cohorts were directed against
a ProtoArray displaying 9000 immobilized recombinant human
proteins or protein fragments. Because some patients were
sampled longitudinally, 97 sera were assayed in total. Control
sera were from healthy first-degree relatives (n = 9) and healthy
unrelated volunteers (n = 12) across the same age range. Data
readouts for the binding of individual sera were normalized by
applying robust linear modeling (Sboner et al., 2009), whereafter
each signal was assigned a Z score denoting the number of standard deviations (SD) above or below the mean of the combined
healthy relatives and controls.
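As a minimal illustration of the Z score definition above, the calculation for one protein could be sketched as follows. The sketch assumes the array signals have already been normalized (the robust linear modeling step is not reproduced here) and that the sample standard deviation of the combined control sera is intended; the function and variable names are hypothetical.

```python
from statistics import mean, stdev


def z_scores(patient_signals, control_signals):
    """Z score of each patient signal for one protein.

    The reference mean and SD come from the combined control sera
    (healthy relatives plus unrelated volunteers). Use of the sample
    SD (n - 1 denominator) is an assumption.
    """
    mu = mean(control_signals)
    sd = stdev(control_signals)
    return [(signal - mu) / sd for signal in patient_signals]


# Hypothetical normalized signals for a single array feature.
controls = [1.0, 2.0, 3.0]
patients = [5.0, 2.0]
print(z_scores(patients, controls))
```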
Most patients and the combined controls displayed Z scores
of 1–2 for 200 proteins (Figure 1A). However, when the convention was employed of defining Z ≥ 3 as bona fide positives, the
patients segregated from the two control cohorts, considered
either jointly or separately. Thus, each control serum displayed
reactivities of Z ≥ 3 toward an average of ≤20 proteins, with
most recognizing <10 (Figures 1A, 1B, and S1A). Given that
there was inter-individual variation, the 21 control sera collectively displayed Z ≥ 3 reactivities toward 406 distinct proteins,
i.e., 5% of those displayed on the array (Figure 1B). For only
2 proteins was Z ≥ 4, and for none was Z ≥ 5 (Figures 1A and
1B). Hence, as expected, the control cohorts largely lacked
high-titer serum autoreactivities.
Conversely, most patients at any one time displayed Z ≥ 3
autoreactivities toward ≥80 proteins (Figures 1A, 1B, and
S1A). These data were re-analyzed with stringent procedures
to minimize false-positives, including exclusion of any signals
that might have arisen from cross-sample print contamination. With this achieved, the patients' private autoantibody
repertoires collectively detected 3,731 distinct targets (Figure 1B). Furthermore, almost all patients displayed Z ≥ 4
scores for at least 10 proteins (mean of 30), collectively
recognizing >1,500 proteins, and >50% of patients displayed
Z ≥ 5 scores for ≥10 proteins (mean of >12), collectively
recognizing 636 proteins (Figures 1A and 1B). Hence, high-level
reactivity toward multiple self-proteins was a disease-defining
property. This was further illustrated by the qualitative difference in Z score distribution curves for patients versus controls,
which cannot simply be explained by there being 5-fold more
patient sera (Figure 1C). Thus, whereas sampling greater
numbers would likely have increased the protein species detected by control cohorts at Z ≥ 4, it would not bridge the
1,000-fold gap between two proteins detected by 21 control
sera versus >1,500 proteins detected by 97 patient sera
(Figure 1B).
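The hit counts discussed above, together with the complexity factor defined in the legend to Figure 1 (distinct proteins divided by the average number of hits per serum), can be sketched in a few lines. This is only an illustration under stated assumptions: the input layout (a mapping from serum to per-protein Z scores), the function name, and the example values are hypothetical and are not taken from the study data.

```python
def summarize_hits(z_by_serum, threshold=3.0):
    """Summarize array hits at Z >= threshold across a group of sera.

    z_by_serum maps a serum ID to a dict of protein -> Z score.
    Assumes at least one serum is supplied.
    """
    # Set of proteins scored as hits for each serum.
    hits = {
        serum: {protein for protein, z in scores.items() if z >= threshold}
        for serum, scores in z_by_serum.items()
    }
    # Union of hits across all sera gives the distinct proteins targeted.
    distinct = set().union(*hits.values())
    mean_hits = sum(len(h) for h in hits.values()) / len(hits)
    # Complexity factor: distinct proteins / average hits per serum.
    complexity = len(distinct) / mean_hits if mean_hits else float("nan")
    return {
        "hits_per_serum": {serum: len(h) for serum, h in hits.items()},
        "distinct_proteins": len(distinct),
        "mean_hits": mean_hits,
        "complexity_factor": complexity,
    }


# Two hypothetical sera scored against three hypothetical proteins.
example = {
    "serum_A": {"P1": 3.5, "P2": 1.0, "P3": 4.2},
    "serum_B": {"P1": 3.1, "P2": 5.0, "P3": 0.2},
}
print(summarize_hits(example))
```

A complexity factor near 1 would indicate that every serum hits the same proteins, whereas larger values indicate largely private repertoires.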
Figure 1. Immune Response Profiling of APS1/APECED
(A) Distributions of hits between patients and controls at different Z scores.
(B) Z scores for all samples against all protein features and mean hits for each group calculated for Z ≥ 3, Z ≥ 4, and Z ≥ 5. The number of distinct proteins
targeted in each group (P, n = 97; C, n = 21) at Z scores denoted. The complexity factor was calculated by dividing the number of distinct proteins by average
number of hits per patient.
(C) The max Z score distribution of all proteins in patient and control groups.
(D) Fraction of patients recognizing each of 3,731 proteins at Z ≥ 3. Red dots depict 126 proteins shared between patients and controls.
In sum, 81 different patients collectively displayed strong reactivities to >40% of human proteins arrayed. For most proteins
(blue dots 13–3731, Figure 1D), reactivities were spread across
the cohort, reflecting high inter-patient variation, whereas 12
proteins (blue dots 1–12, Figure 1D), including several type I
IFNs, were recognized by > 60% of patients, as reported
(Meager et al., 2006). However, the public specificities were
not enriched among the 126 reactivities shared between patients
and controls at z > 3 (red dots, Figure 1D), emphasizing that their
common autoantigenicity is unique to the patients. Patient autoreactivity frequencies were largely comparable across geographical locations, albeit somewhat less in Norway and Slovenia, and
age ranges (Figures S1B and S1C). Indeed, most anti-IFN autoantibodies of APECED patients were reported to increase early in
life and remain stable thereafter (Meager et al., 2006; Wolff et al.,
2013).
The collective targets of patient antibodies included intracellular, trans-membrane, and secreted proteins. Because many
proteins displayed on the ProtoArray may be denatured, there
may be false-negatives that underestimate patient reactivities
to conformational determinants. Although a detailed analysis of
the types of proteins targeted will be presented, it is evident
that the proteins most commonly detected by patient sera
included numerous cytokines, particularly type I IFNs, for which
reason this study focuses on the nature of those autoreactivities.
Strong, Selective Anti-Cytokine Reactivities
Human type I IFN genes include 13 IFNα genes, 1 IFNβ gene, and
1 IFNω gene. There is also a type II IFNγ gene and three type III
IFNλ genes. IFNγ is largely limited to lymphocytes, whereas
type I and type III IFNs are broadly expressed, with their functional uniqueness and/or redundancy unresolved (Ivashkiv and
Donlin, 2014). As assessed by ProtoArray, patient sera showed
significantly stronger reactivities than controls toward all IFNα
subtypes, albeit the reactivities to some (e.g., α1/13, α5, and
α14) were higher than those to others (e.g., α2, α16, and α21)
(Figure 2A). The differential between patients versus controls
was emphasized by luciferase-based immunoprecipitation
(LIPS) in which many target proteins were recognized in their
native conformations (Figure 2B). Many patients showed strong
reactivities to IFNω but rarely toward IFNβ (Figure 2B) and never
toward IFNκ and IFNε, two phylogenetically distant type I IFNs
(data not shown). By contrast, patient sera harbored reactivities
significantly above controls toward IL1α, IL5, IL6, IL17A, IL17F,
IL20, IL22, IL28A (IFNλ2), IL28B (IFNλ3), and IL29 (IFNλ1) (Figure 2C). Whereas reactivities toward some targets (e.g., IL17F,
IL22) were common to most patients, reactivities toward others
(e.g., IL20, IL28, IL6) were not (Table S1), and with the exception
of IL5, patient sera mostly did not detect either Th2 cytokines
(e.g., IL4 and IL13) or IL21, a Tfh (T follicular helper) cell cytokine
that drives high-affinity antibody maturation. There were also no
reactivities toward G-CSF and GM-CSF (Table S1), which drive
the development of myeloid cells associated with the patients'
inflammatory endocrinopathies.
Cytokine reactivities were largely validated by ELISA, which
confirmed that IFNγ was only rarely and weakly recognized by
patient sera (Figure 2D; Table S1) and that there was no reactivity
toward TNFα (data not shown). By contrast, ELISA revealed autoantibodies toward IL32α and IL32γ, two poorly characterized
proinflammatory cytokines (Figure 2D; Table S1). In sum, 81
APS1/APECED patient sera collectively displayed strong reactivities to a very selective subset of human cytokines.
Very High-Affinity Human Antibodies
To understand the nature of patient serum reactivities, nine
IFNα-specific monoclonal antibodies (mAbs) were derived by
limit-dilution cloning from memory B cells of four patients. Two
were characterized in detail (26B9 and 19D11), whereas a
more limited analysis of the others strongly argued that the properties of 26B9 and 19D11 were generally representative of
patients' cytokine-specific antibodies. First, their VH and Vκ
sequences were highly mutated relative to their germline
counterparts, with non-conservative replacements enriched
in complementarity-determining regions (CDRs), as expected
(white; Figure 3A). The antibodies bore no obvious resemblance
to each other in V-gene segment or CDR3 usage. Conversely, a
third anti-IFNα antibody, 50E11, shared with 19D11 the same VH
(IGHV1-69) and junctional (IGHJ4) segments and a very similar
light chain (IGKV3-11 versus V3-20) (Figure S2A). However, there
were very different template-independent nucleotide insertions
in the VH CDR3s of 19D11 and 50E11, and the somatic mutation
patterns were different: whereas 19D11 and 26B9 showed high
mutation frequencies in VH CDR2 and Vκ CDR1, 50E11 did not
(Figures 3A and S2A).
The recombinant antibodies 26B9 and 19D11 harvested from
transfected CHO cells were immobilized on surface plasmon
resonance (SPR) chips over which were run recombinant human
IFNα2b, IFNα4, IFNα14, and IFNω, the latter being recognized by
26B9 but not by 19D11 (Figure 3B). These experiments revealed
very slow off-rates reflecting extremely high affinities of the antibodies for their targets, ranging from KD = 3.28e−14 M for 26B9
toward IFNα14 to KD = 2.09e−11 M for 26B9 toward IFNα2b (Figures 3B and 3C). Sub-picomolar/near-femtomolar dissociation
constants were likewise shown by 19D11 (Figures 3B and 3C).
Thus, APS1/APECED patients harbor some of the highest-affinity
antibodies described.
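The link between slow off-rates and small dissociation constants can be made concrete with a short Python sketch. It assumes the standard 1:1 binding relation KD = k_off / k_on, which is not spelled out in the text; the function names and the rate constants in the first example are hypothetical and chosen only for illustration, whereas the two KD values are those quoted above for 26B9.

```python
def dissociation_constant(k_off, k_on):
    """Return KD in mol/L from k_off (1/s) and k_on (1/(M*s)), assuming 1:1 binding."""
    if k_on <= 0:
        raise ValueError("k_on must be positive")
    return k_off / k_on


def molar_to(kd_molar, unit):
    """Convert a KD given in mol/L to nM, pM, or fM."""
    factors = {"nM": 1e9, "pM": 1e12, "fM": 1e15}
    return kd_molar * factors[unit]


# Hypothetical rate constants (NOT from the paper), chosen only to show how a
# very slow off-rate yields a very small KD.
hypothetical_kd = dissociation_constant(k_off=1e-6, k_on=1e6)

# KD values quoted in the text for 26B9.
kd_ifna14 = 3.28e-14  # mol/L
kd_ifna2b = 2.09e-11  # mol/L

print(f"hypothetical KD: {molar_to(hypothetical_kd, 'pM'):.2f} pM")
print(f"26B9 vs IFNa14: {molar_to(kd_ifna14, 'fM'):.1f} fM")
print(f"26B9 vs IFNa2b: {molar_to(kd_ifna2b, 'pM'):.1f} pM")
```

Expressed this way, the quoted range for 26B9 runs from roughly 33 fM (IFNa14) to roughly 21 pM (IFNa2b).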
18-mer peptides spanning IFNa2b and IFNu were used to
map linear epitopes recognized by 26B9 and 19D11. However,
no specific reactivities were detected (data not shown), consistent with the antibodies binding conformational determinants
shared by several type I IFNs. Also, the antibodies reacted poorly
or not at all to mouse IFNs (Table S2).
To investigate the origins of the high-affinity, conformation-specific antibodies, germline counterparts of 19D11, 26B9,
and 50E11 that retained the original CDR3-VDJ sequences were expressed and tested by LIPS against recombinant human IFNa2b,
IFNa8, and IFNa14. There was no measurable interaction with
any target (Figure 3D), although the antibodies' quality was
evident from their comparable detection by anti-human IgG (Figure S2B). These data argue that the strong autoreactivity toward
IFNs developed de novo during affinity maturation, rather than
being an intrinsic property of the germline repertoire that is
enhanced by affinity maturation.
Figure 2. Serology of APS1/APECED to IFNs and Other Cytokines
Seroreactivity of APS1/APECED patients (blue) and controls (red) toward selected interferons and cytokines as measured in ProtoArray (A), LIPS (B and C), and ELISA (D).
The high affinities of 26B9 and 19D11 were not unique. Thus, a
patient-derived IgGk mAb (20A10) specific for IL20 (which is
not a target detected by most patients; Figure 2C; Table S1)
displayed a KD of 9.1e-14 M (Table S3). Likewise one IgGk mAb
(17E3) and one IgGl mAb (24D3), each specific for IL17F, displayed dissociation constants of <10 pM, and one IgGk antibody
(30G1) and one IgGl antibody (35G11) specific for IL22 displayed dissociation constants of 37 pM and 39 pM, respectively.
As a comparison, a CHO cell-expressed form of fezakinumab, a
humanized anti-IL22 mAb tested in the clinic, displayed a KD of
54 pM (Table S3). The only exception to this pattern was 2C2,
an IgGl mAb specific for IL32g (for which no human antibody
has been reported), which displayed nanomolar dissociation
(Table S3).
Similar to IFNa antibodies, most cytokine-specific antibodies
did not detect linear peptides from relevant target proteins,
strongly suggestive of complex conformational determinants
(data not shown). The one exception was 20A10, which bound
to an IL20 peptide and within which key amino acids were identified by mutagenesis (Figures S2C and S2D).
The antibody sequences of IL17F-reactive 17E3 and 9A2 and
of IL22-reactive 30G1 and 35G11 displayed myriad non-conservative mutations enriched in the CDRs. Again their germline
counterparts did not detect the respective targets (Figures 3D,
S2E, and S2F). Moreover, neither patient-derived antibodies
nor their germline counterparts showed any general autoreactivity (judged by immunofluorescent staining of tissue sections or
HEp-2 cells) or reactivity to Candida albicans, thus arguing
against candida infection being the trigger for autoantibody generation (data not shown).
The highly mutated CDRs of all studied antibodies suggested
that they derived from GC reactions that partially rely on Tfh cells.
Aberrant generation and/or activation of Tfh cells has been
described in several autoimmune diseases (Ueno et al., 2015),
but when four pediatric and four adult APS1/APECED patients were
compared to controls, we found no differences in the percentages of circulating CXCR5+ Tfh cells, or their activation state,
as judged by ICOS (inducible costimulator) and CCR7 levels
(Figure S3).
Biologically Active Human Antibodies
To test the biological activities of 19D11 and 26B9, HEK293 cells
transfected with type I IFN-stimulated response elements (ISRE)
fused to firefly luciferase were treated with recombinant forms
of each of 12 IFNa subtypes and IFNu in the presence or
absence of increasing concentrations of 19D11 or 26B9.
Following treatment, firefly luciferase values were normalized
to those of co-transfected Renilla luciferase, so as to control
for variations in transfection efficiency. Both antibodies strongly
inhibited the IFN-dependent response, with median IC50 values
of 2.83 ng/ml for 26B9 and 0.9 ng/ml for 19D11 (Figure 4A; Table
S4). By comparison, median IC50 values of 76.24 ng/ml and
10.86 ng/ml, respectively, were displayed by in-house-generated recombinant sifalimumab and rontalizumab, two anti-IFN
mAbs used in clinical trials for systemic lupus erythematosus patients (Table S4).
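The reporter readout described above can be sketched in Python: firefly values are divided by the co-transfected Renilla values, expressed relative to the no-antibody control, and an IC50 is read off the resulting dose-response series. This is a simplified illustration, not the authors' analysis; the interpolation on a log-concentration axis stands in for whatever curve-fitting procedure was actually used, and all function names and readings are hypothetical.

```python
import math


def normalize(firefly, renilla):
    """Divide each firefly reading by its co-transfected Renilla reading."""
    return [f / r for f, r in zip(firefly, renilla)]


def percent_of_control(ratios, control_ratio):
    """Express normalized ratios as a percentage of the no-antibody control."""
    return [100.0 * value / control_ratio for value in ratios]


def ic50_by_interpolation(concentrations, responses):
    """Estimate the concentration giving a 50% response.

    Assumes ascending concentrations and a decreasing response, and
    interpolates linearly on log10(concentration) between the two points
    that bracket 50%. Returns None if 50% is never crossed.
    """
    points = list(zip(concentrations, responses))
    for (c_lo, r_lo), (c_hi, r_hi) in zip(points, points[1:]):
        if r_lo >= 50.0 >= r_hi and r_lo != r_hi:
            fraction = (r_lo - 50.0) / (r_lo - r_hi)
            log_c = math.log10(c_lo) + fraction * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_c
    return None


# Hypothetical readings (NOT data from the paper).
antibody_ng_per_ml = [0.1, 1.0, 10.0, 100.0]
firefly = [1000.0, 750.0, 250.0, 10.0]
renilla = [10.0, 10.0, 10.0, 10.0]
control_ratio = 100.0  # firefly/Renilla ratio with IFN but without antibody

ratios = normalize(firefly, renilla)
responses = percent_of_control(ratios, control_ratio)
ic50 = ic50_by_interpolation(antibody_ng_per_ml, responses)
print(f"estimated IC50: {ic50:.2f} ng/ml")
```

Normalizing before computing the percentage matters because a well with fewer transfected cells would otherwise look like stronger inhibition.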
Predictably, the antibodies varied in their inhibition of IFN-stimulated responses. Thus, 26B9 neutralized IFNu, but not
IFNa16, and only poorly inhibited IFNa8 (Figure 4A; Table S4).
Likewise, in the same assay, rontalizumab failed to efficiently
neutralize IFNa6, IFNa7 and IFNa10, whereas sifalimumab
neutralized several IFNa subtypes only weakly. By contrast,
19D11 neutralized all 12 IFNa subtypes tested (Table S4).
Patient-derived IFN-specific mAbs were also assessed for
their capacity to inhibit STAT1 phosphorylation in cells treated
with each of 12 IFNa subtypes, IFNu, IFNb, or IFNg (Figures
4B, 4C, and 4D). As predicted from the luciferase assay,
19D11 inhibited STAT1 phosphorylation levels (normalized to
total STAT1 or tubulin) driven by all IFNa subtypes but did not
affect responses to IFNu, IFNb, or IFNg. By contrast, 25C3, an
additional patient-derived mAb (Table S2), was highly selective
for discrete IFNa subtypes, whereas other antibodies tested,
including 26B9, showed neutralization profiles between those
of 19D11 and 25C3 (Figures 4B–4D). Only 26B9 and 31B4
neutralized IFNu, and none neutralized IFNb or IFNg. By comparison, sifalimumab, rontalizumab, and AGS-009 (another
IFNa-targeting mAb in clinical development) showed variable
and less uniform inhibition of STAT1 phosphorylation induced
by different IFNa subtypes (Figure S4A).
The striking biological activities of patient mAbs were not
limited to those specific for IFNs: potent functional target
neutralization was also shown by mAbs targeting IL17F, IL22,
IL32g, and IL20 (Figure S4B).
Biologically Active Human Antibodies In Vivo
We next asked whether patient autoantibodies could functionally
neutralize targets in vivo. To test this, mice were treated intraperitoneally (i.p.) with a single aliquot of antibodies 26B9, 19D11, or
sifalimumab, and their ears inoculated intradermally (i.d.) on
days 1, 3, 6, and 8 with recombinant human IFNa5 or IFNa14
(Figure 5A) and IFNu (data not shown). Relative to repeated inoculation with vehicle/PBS, the cytokines induced ear swelling, reflecting an inflammatory response that includes rapid TNFa and
IFNg induction (Figures S5A and S5B). This ear swelling was
significantly inhibited by single injections of antibodies (Figure 5B). Again, neutralization varied toward the effector IFNa
subtype: 26B9 and 19D11, but not sifalimumab, largely ablated
the IFNa5 response, whereas all three partially yet significantly
limited swelling induced by IFNa14 (Figure 5B).
Figure 3. Affinity of Patient-Derived mAbs
(A) Amino acid sequences of 26B9 and 19D11 anti-IFN antibodies aligned with closest corresponding germline IgVH, DH, JH, VL, and JL sequences. Identities
highlighted in blue; conservative mutations in yellow; non-conservative in white; CDRs underlined in red.
(B) Plasmon resonance data: antibodies 19D11 and 26B9 were immobilized on Biacore chips; different concentrations of recombinant human IFNa2b, IFNa4,
IFNa14, and IFNu were passed over; response units were recorded; and dissociation constants (KD) calculated.
(C) Scatter chart of KD values derived from (B).
(D) Binding determined by LIPS of APS1/APECED-derived mAbs and of germline counterparts to IFNa2, IFNa8, IFNa14 (19D11, 50E11, and 26B9), IL22 (35G11
and 30G1), and IL20 (20A10 and 2A11). Binding to immobilized IL17F (17E3 and 9A2) was determined by ELISA.
Figure 4. In Vitro Neutralization
(A) IC50 analysis of APS1/APECED-derived anti-IFN mAbs 19D11 and 26B9 in HEK293T MSR cells
transfected with ISRE dual-luciferase reporter
constructs and treated with IFNa subtypes shown.
Error bars correspond to SEM of multiple measurements.
(B–D) IFN-induced STAT1 tyrosine phosphorylation
detected by western blot and normalized to total
STAT1 or to tubulin levels as loading controls.
Vertical lines in (B) and (C) denote cropped lanes.
Specific, antibody-mediated neutralization in vivo was likewise seen when the same assay was applied to human IL17F
or IL32g (Figures 6A and 6B). For IL17 neutralization, the data are
clearly consistent with the known capacity of APS1/APECED
patients' antibodies to neutralize Th17-family cytokines (Kisand
et al., 2010; Puel et al., 2010), thereby predisposing to Candida
infection.
Additionally, the detection of mouse IL22 by antibody 30G1
offered an opportunity to measure its bio-activity toward endogenous IL22, a primary effector of imiquimod (IMQ)-induced
dermatitis used to model psoriasis (van der Fits et al., 2009).
IMQ-induced pathology measured by modified PASI scoring
was significantly ameliorated by 30G1 relative to IgG control,
particularly following an initial inflammatory response (Figures
6C and S6). Again, 30G1 was at least as effective as an in-house-expressed anti-IL22 antibody, fezakinumab (see above)
(Figure 6C). Collectively these data establish the capacity of
patient anti-cytokine antibodies to limit pathologies induced by
their targets in vivo.
Clinical Correlates of Neutralizing Antibodies
Given the results from animal models, it
was appropriate to consider the potential
impact of APS1/APECED antibodies in
the patients themselves. Because circulating IFNa levels are extremely low in
human peripheral blood, even following
vaccination (Sobolev et al., 2016), circulating IFN levels do not offer robust biomarkers of anti-IFNa antibodies. Neither
does measurement of interferon-stimulated genes (ISGs) because many, e.g.,
CXCL10, can be upregulated by type II
IFNs (Welcher et al., 2015). By contrast,
antibody activities may be reliably reflected
in discrete pathologies, as in the correlation of anti-IL22 with candidiasis.
In this regard, many datasets, particularly in mouse models, suggest that type I
IFN contributes to type 1 diabetes (T1D)
(Carrero et al., 2013; Downes et al., 2010;
Foulis et al., 1987; Huang et al., 1995; Li
et al., 2008). Although APECED/APS1
patients by definition suffer from polyendocrinopathy, T1D affects only 10%–20%
of patients and manifests primarily
in adulthood (Husebye et al., 2009;
Kisand and Peterson, 2015). This is despite the fact that
radioimmunoassays have revealed that many APS1/APECED
patients carry GAD65-reactive autoantibodies, a clinically
applied biomarker for likely onset of T1D (Ziegler et al., 2013).
Consistent with this, ProtoArray and LIPS data showed that
many patients carried GAD65- and/or GAD67-reactive antibodies, but among them relatively few presented with T1D (red
dots, Figures 7A and 7B). Collectively, these many observations
suggest that patients at risk of T1D, as judged by anti-GAD65/
GAD67, might fail to develop T1D if they harbored powerfully
neutralizing anti-IFNa antibodies. Indeed, we reported a seemingly exceptional APS1/APECED patient, completely lacking
IFNa-neutralizing antibodies and presenting with T1D (Kisand
et al., 2008).
Figure 5. Biological Activity of IFN mAbs
(A) Experimental timeline: mAb administered i.p. at
day 0; human IFNa administered i.d. on days 1, 3,
6, and 8. Ear thickness measured on all days (prior
to cytokine injection) except for day 5.
(B) I.p.-administered IFN mAbs reduced IFNa-induced ear inflammation.
Significance calculated by two-way ANOVA, with
*p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001, and ****p ≤
0.0001. Error bars denote SEM.
To investigate this, the 8 patients presenting with T1D
(red dots, Figure 7B; mean age ± SD, 48 ± 11 years) were
compared with an available cohort of 13 patients without
T1D but with strong GAD65 reactivity (relative luciferase
units > 5) (blue dots, Figure 7B; mean age ± SD, 31 ± 12
years). Consistent with T1D developing in adult APS1/APECED
patients, GAD65 reactivities mostly arose post-adolescence,
and hence the patient cohorts comprised 20 adults and one
8-year-old.
As expected, all 21 patients harbored antibodies to IFNa
and IFNu (see Figure 2B), but when tested for IFNa and IFNu
neutralization, the antibodies showed a striking segregation
with clinical status (Figures 7C, 7D, and S7A): patients without
T1D collectively neutralized all IFNa subtypes, whereas those
with T1D showed only low or negligible neutralization. Particularly strong differences were seen vis-a-vis IFNa1, IFNa2,
IFNa5, IFNa8, IFNa14, and IFNa17 neutralization (Figure 7C).
The two subgroups of the 21 patients also showed statistically
significant differences in neutralizing IFNu, but the difference
was weaker than for IFNa (Figure S7A). Interestingly, the two
GAD65-reactive non-diabetics who displayed relatively low
IFNa neutralization were young adults who may be en route to
developing T1D.
In a small subcohort of GAD65/67-reactive patients for whom
longitudinal samples were available, the three T1D patients
(red bars) again showed lower IFNa neutralization relative to
the two patients without T1D. Moreover, one patient was able
to neutralize IFNa4 in 1978 but by 2012 could no longer do so
and presented with T1D (Figure S7B).
Such striking correlations with T1D (Figure 7D) were not
evident for any other naturally arising anti-cytokine antibodies,
supporting the view that IFNa may contribute critically to the natural progression of T1D. Moreover, although the data do not
prove that active anti-IFN antibodies underpin selective protection from T1D, they provide a firm foundation for exploring the
potential of APS1/APECED-derived autoantibodies to ameliorate other major diseases that are rarely if ever present in
APS1/APECED patients.
DISCUSSION
This analysis of the impact of AIRE deficiency on human B cells has revealed
a signature pattern of humoral autoreactivity with general implications for our
understanding of autoimmunity. First,
the autoantibodies studied were mostly
extremely high affinity and specific for
native conformational epitopes. These
properties were shared by antibodies
specific for cytokines targeted by most patients (e.g., IFNa,
IL17, IL22) and by antibodies specific for IL20 to which few
patients displayed reactivity. Because such properties are
very rare among antibodies raised by immunization, when B
cells are primed de novo to antigen for short periods of time,
it seems inappropriate to continue to model one type of mAb
on the other.
Second, essentially all 81 APS1/APECED patients studied
showed strong reactivities toward a common set of 10–15 proteins, coupled with patient-specific reactivity profiles toward
80–90 additional proteins. This limited frequency (< 1% of proteins displayed on the array) is consistent with a recent report
that B cell tolerance was not globally disrupted in 51 APS1/
APECED patients sampled (Landegren et al., 2016). Nonetheless, the patient-to-patient variation in reactivity profiles meant
that the 97 sera analyzed in our study collectively harbored antibodies toward over 3,500 proteins.
The patient-to-patient variation argues that B cell autoimmunity resulting from AIRE deficiency is not simply an amplification
of sporadic, low-level autoreactivities seen in healthy controls
but has distinct origins. By this perspective, defects in central
T cell tolerance may underpin other autoimmune and autoinflammatory pathologies attributed to high-affinity autoantibodies.
Whereas this contrasts with the widely held view that autoimmune diseases mostly reflect peripheral tolerance defects, it
aligns with data that central tolerance defects contribute to the
NOD mouse model of T1D (Geng et al., 1998; Zucchelli et al.,
2005). Moreover, wherever autoantibodies reflect central T cell
tolerance defects, donor-to-donor variation is to be expected,
as individuals will generate distinct TCR repertoires via quasi-random gene rearrangements, will be exposed to different physiologic and environmental triggers that promote the selective
outgrowth of autoreactive T cell clones, and will differ in immune
response modifier genes (e.g., HLA) that regulate the magnitude
of antigen-specific responses.
Autoantibodies to some non-tissue-restricted antigens, including multiple type I IFNa subtypes, are displayed by almost
all patients, sometimes early post-partum (Wolff et al., 2013).
Most likely, the immunogenicity of these proteins arises by
mechanisms distinct from those shaping patient-specific autoantibody repertoires. Possibly the public autoantibodies arise
from a direct impact of AIRE deficiency on B cell tolerance,
for example, via the dysregulation of AIRE-expressing thymic
B cells that resemble GC B cells by several criteria (Yamano
et al., 2015). Arguing against this, however, autoantibodies to
type I IFNs, Th17 cytokines, and additional self-proteins are
found in thymoma patients with AIRE-sufficient B cells (Kisand
et al., 2011; Meager et al., 1997; Wolff et al., 2014). This likewise
argues against autoantibodies to type I IFNs and Th17 cytokines
originating from defects in lymph node AIRE+ cells termed
eTACs (Gardner et al., 2008). Although studies in mice have
suggested tolerizing roles of eTACs, the functions of their rare
human counterparts are unknown (Poliani et al., 2010).
AIRE deficiency may, however, act indirectly on thymic B cells,
for example by hyperactivity of functionally competent thymic gd
cells (Ribot et al., 2009) that may likewise be dysregulated in thymoma. Such cells may create an intra-thymic milieu favoring
priming rather than tolerance of thymic B cells toward proteins
highly expressed in the thymus (Dudakov et al., 2012; Meager
et al., 2006).
Figure 6. In Vivo Activity of Cytokine-Reactive mAbs
(A) mAb administered i.p. at day 0, and human
IL17F administered i.d. on days 1, 3, 6, and 8. Ear
thickness measured on all days (prior to cytokine
injection) except day 5.
(B) As in (A), but with human IL32g administered i.d.
(C) Anti-IL22-specific mAb injected i.p. into 9-week-old
mice prior to and during IMQ treatment. Efficacy
measured by Psoriasis Area and Severity Index
(PASI).
Significance calculated by two-way ANOVA, with
*p ≤ 0.05, **p ≤ 0.01, ***p ≤ 0.001, and ****p ≤
0.0001. Error bars denote SEM.
Notwithstanding this possibility, our findings suggest that
high-affinity autoantibodies in APS1/APECED patients probably reflect dysregulated GC reactions,
wherein autoreactive T cells, e.g., Tfh
cells, that were not tolerized in
the thymus promote the competitive
outgrowth and affinity maturation of
B cells that were initially primed to exogenous antigen(s) but whose mutated
IgGs bind to self-proteins. Consistent
with this, autoantibodies targeting thyroid-stimulating hormone receptor in
Graves' disease cross-react with Yersinia
enterocolitica antigens (Brink, 2014;
Hargreaves et al., 2013), and activated
peripheral blood Tfh cells correlate positively with serum autoantibodies and disease activity/severity in
multiple autoimmune diseases (Ueno et al., 2015). Although our
analysis of four adult and four pediatric APS1/APECED patients
revealed no alterations in Tfh cell numbers relative to age-matched healthy controls, this did not exclude Tfh cells being
enriched in autoreactive specificities. Moreover, no patients displayed neutralizing autoantibodies to IL21, a major mediator of
Tfh cells in the GC.
This etiology of APS1/APECED B cell autoimmunity is strikingly similar to proposed origins of highly mutated anti-desmoglein-3 antibodies in autoimmune pemphigus (Di Zenzo et al.,
2012) and of anti-GM-CSF antibodies pathognomonic in pulmonary alveolar proteinosis (Piccoli et al., 2015). In those
studies, as in this, the closest germline counterparts (unmutated common ancestors [UCAs]) showed no reactivity toward
the targets of the affinity-matured autoantibodies. By contrast,
germline versions of antiviral antibodies showed only slightly
reduced binding to target viral antigens (Corti et al., 2011,
2013). Moreover, it is not the case that UCAs intrinsically lack
autoreactivity, as germline counterparts of some autoantibodies with few replacement mutations showed autoantigen
reactivity in pemphigus patients (Cho et al., 2014). The underlying defect(s) in T cell tolerance that dysregulate affinity maturation in pemphigus, pulmonary alveolar proteinosis, and other
organ-specific autoimmune diseases may be limited to few
antigens, by contrast to broad-spectrum defects in APS1/
APECED.
That almost all APS1/APECED-derived mAbs were biologically active in vivo against a range of cytokine targets has
profound implications for patients. Clearly, immune-effector responses may be reduced, as in the association of anti-IL22
with susceptibility to candidiasis (Kisand et al., 2010). Likewise,
gut barrier integrity may be compromised, leading to increased
levels of anti-commensal antibodies (Hetemaki et al., 2016).
Conversely, despite the common neutralization of IFNa and
IFNu, APS1/APECED patients do not show severe viral infections, as were recently reported for a child genetically impaired
in type I IFN (Ciancanelli et al., 2015). Possibly preserved
IFNb function mediates anti-viral protection in APS1/APECED
patients.
Figure 7. Clinical Correlation of T1D and IFN Neutralization
(A and B) Seroreactivity to GAD67 and GAD65
measured by ProtoArray and LIPS in APS1/
APECED patients with (red) or without (blue) T1D.
(C) IFNa-neutralizing titers in patients with T1D
(n = 8) and anti-GAD65 seropositive patients
without T1D (n = 13). y axis shows inhibitory concentration IC50 reflecting serum dilutions at which
IFN activity was reduced 50%.
(D) Heatmap of seroreactivity toward GAD67,
GAD65, and IFNa analyzed by ProtoArray and
LIPS combined with neutralization capacity in
patients with and without T1D.
Significance calculated by Mann-Whitney test using
GraphPad Prism v.6, with *p < 0.05, **p < 0.01,
***p < 0.001, ****p < 0.0001. Error bars
denote SEM. Significance values in (D) compare
T1D+ and T1D− groups for each parameter.
On the other hand, some autoantibodies may target key
mediators of immunopathologies, thereby ameliorating disease.
Thus, a unique correlation was observed between antibody-mediated neutralization of IFNa and failure to develop T1D,
providing a novel strand of support for animal studies arguing
that targeting type I IFNs could be effective in T1D. The concept
that naturally arising autoantibodies may be beneficial is not
widely considered, despite its underpinning the widespread
use of therapeutic mAbs. In this regard,
it is striking that despite their severe flaws
in central T cell tolerance, APS1/APECED
patients do not present with systemic
sclerosis, Sjögren's syndrome, MS, or
SLE. These pathologies are considered
to involve interplays of IL17/Th17 and
type I IFNs, two main targets of APS1/
APECED autoantibodies (Ambrosi et al.,
2012). Likewise, Th17-driven psoriasis
was diagnosed in only two of our patients,
each of whom lacked autoantibodies
to IL17A, IL17F, and IL22 (our unpublished data). Furthermore, atopy/allergy
is seemingly rare among APS1/APECED
patients, although whether anti-IL5 antibodies underpin this requires more study.
For now, the data presented by this study strongly suggest that
antibodies recovered from APS1/APECED patients include ones
with profound therapeutic and diagnostic potential.
EXPERIMENTAL PROCEDURES
More details are available in the Supplemental Experimental Procedures.
Human Samples
Eighty-one APS1/APECED patients were diagnosed by mutational analysis of
AIRE and by autoantibodies to type I IFNs. All provided informed consent, and
many were analyzed previously (Kisand et al., 2011; Kluger et al., 2015; Meloni
et al., 2012; Wolff et al., 2007). Approvals by local ethics committees are
described in the Supplemental Experimental Procedures. Ages at serum sampling were 4–73 years; mean = 31.9. For ProtoArray there were 12 age-matched controls and 9 healthy first-degree relatives, and there were
additional healthy controls for LIPS and ELISA.
Immune Response Profiling by ProtoArray
Sera of patients, healthy relatives, and controls were tested against > 9,000 human proteins displayed on the Human Protein Microarray v5.1 (ThermoFisher
Scientific). Preprocessing methods were applied to account for technical variability. First, corresponding local background intensity was subtracted,
whereafter values were log-transformed and subjected to robust linear normalization (Sboner et al., 2009). Z scores were calculated as the number of standard deviations of the signal from the mean of the corresponding controls
and healthy relatives; Z ≥ 3 was considered positive. After scoring, stringent
quality assessment was undertaken, including high correlation coefficients of
duplicate spots of printed proteins (average r = 0.92), reactivity toward known
autoantibody targets, and perfect correlation of signals for proteins spotted in
different locations. Printing contaminants were identified as proteins showing
high correlation coefficients with known APECED antibody targets and were
further verified by cross-reference to another ProtoArray (5.0) used for 23 patients and 7 controls. Thus, 31 suspect false-positives were identified and
excluded from further consideration.
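The scoring steps above (background subtraction, log transformation, Z score against the control group, and the Z ≥ 3 cutoff) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the pipeline used in the study: the robust linear normalization step is omitted, the log base (2) and the use of the sample standard deviation are assumptions, and all function names and intensities are hypothetical.

```python
import math
from statistics import mean, stdev


def preprocess(signal, background):
    """Subtract local background, then log2-transform.

    The difference is floored at 1 so the logarithm stays defined
    (an assumption; the text does not say how such spots were handled).
    """
    return math.log2(max(signal - background, 1.0))


def z_score(patient_value, reference_values):
    """Number of standard deviations of a patient signal from the reference mean.

    reference_values holds the preprocessed signals of controls and healthy
    relatives for the same protein.
    """
    spread = stdev(reference_values)
    if spread == 0:
        raise ValueError("reference signals have zero variance")
    return (patient_value - mean(reference_values)) / spread


def is_positive(z, threshold=3.0):
    """Apply the Z >= 3 positivity cutoff stated in the text."""
    return z >= threshold


# Hypothetical (signal, background) pairs for one protein; NOT data from the paper.
reference = [preprocess(s, b) for s, b in [(1124, 100), (612, 100), (2148, 100)]]
patient = preprocess(16484, 100)

z = z_score(patient, reference)
print(f"Z = {z:.2f}, positive: {is_positive(z)}")
```

Scoring each protein against its own reference distribution means a protein with naturally high background binding in controls does not register as a hit merely because its raw intensity is high.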
Antibody Isolation and Cloning
Cloning, production, and purification of human mAbs were performed as
described (patent application WO2013/098419). In brief, memory B cells
(CD22+, IgD−, IgM−, CD3−, CD8−, and CD54−) were flow-sorted (MoFlo)
from patient PBMC, incubated transiently with EBV-containing B95-8 supernatant (SN) for 3.5 hr at 37°C, and then incubated in transferrin- and CpG-supplemented IMDM at 37°C, 5% CO2, at 10 cells/well in 96-well plates
coated with irradiated PBMC feeders. Short-term, oligoclonal B cell culture
SN were analyzed for IgG and antigen-specific antibodies detected by
ELISA and/or LIPS. Positive wells were harvested, cells single-cell-sorted
into reverse transcriptase (RT) buffer (Life Technologies), and RT-PCR
performed using Superscript III (Life Technologies) and random hexamers.
IgG VH, Vl, and Vk regions were amplified from cDNA by two-step nested
PCR reaction using Advantage 2 cDNA polymerase (Clontech) and primer
mixes specific for germline families (VBASE database). Nested primers
attached restriction sites for V-region cloning into expression vectors
providing IgG1, Ig-k, or Ig-l constant regions. Recombinant antibodies were
produced in HEK293T cells and antigen specificity analyzed by ELISA. Corresponding closest germline region sequences were identified using the
VBASE2 database (Retter et al., 2005). CDRs were identified by IMGT definitions (Lefranc, 2003).
Complete Ig-VH and VL regions described in US7741449 (Sifalimumab),
US7087726 B2 (Rontalizumab), US8361463 (ACO-1), and US20070258982
A1 (Fezakinumab) were ordered as CHO-codon-optimized synthetic constructs (GenScript) and expressed as above.
mAb Characterization In Vitro
EC50 binding of mAbs was determined by ELISA. Neutralizing capacities of
type I IFN-specific mAbs were studied using phospho-STAT1 quantification
in immunoblot and ISRE-luciferase reporter assay. IL17F, IL22, IL20, and
IL32 neutralization assays were performed on respective responsive cell lines.
mAb affinities were measured with a Biacore T200 (GE Healthcare). Epitope
mapping used overlapping 18-mer peptides.
mAb Characterization In Vivo
C57BL/6J (WT; from Charles River) mice were administered i.p. with mAbs
(day 0) and inoculated i.d. on days 1, 3, 6, and 8 with cognate human cytokines,
IFNa2a, IFNa2b, IFNa4, IFNa14, IL17F, and IL32g, and their ear thicknesses
measured with a micrometer. For IL22 mAbs cross-reactive to mouse, bioactivity was assessed in imiquimod-treated mice.
AUTHOR CONTRIBUTIONS
S.M., A.M., and S.C.O. cloned monoclonal antibodies from patient samples,
and K.J. and K.H. assisted. S.M., P.V., and A.M. characterized antibodies
in vitro; M.W. and Y.H. did so in vivo. C.H. analyzed ProtoArray data and wrote
and edited the paper. J.K. assayed neutralization by sera and Tfh subsets and
performed LIPS. D.F. and H.P. analyzed ProtoArray data. K.M. and R.U.
screened sera for T1D autoantibodies and tested germline antibody specificities. K. Krohn and A.R. developed the clinical database, sampled Finnish
patients, and employed ELISA. A.S.B.W. sampled Norwegian patients,
contributed to the clinical database, and assayed antibodies. The APECED patient
collaborative contributed to the clinical database and sampled respective patients. P.P., K. Kisand, and A.H. supervised research, reviewed data, and
wrote and edited the paper.
ACKNOWLEDGMENTS
We are indebted to patients; to the Finnish APECED and Addison patients
association; and to attending physicians and carers. We thank M. Rothe,
P. Adler, A. Remm, M. Pihlap, M. Karlsberg, A. Tallqvist, M. Tuukkanen,
L. Prassmayer, M. Wordehoff, A. Peters, R. Repke, B. Mathis, and particularly
E. Stuart and K. Henco for critical insight and support. We thank staff of the
Biological Services Unit at King's College London. Funding was provided by
ImmunoQure AG, the Wellcome Trust, and CRUK (to A.H.) and
Estonian Research Council grant IUT2-2 and European Union Project 2014-2020.4.01.15-0012 (to J.K., P.P., and K. Kisand). P.P., K. Krohn, K. Kisand,
A.R., and A.H. are cofounders and shareholders of ImmunoQure AG, and
A.M., C.H., P.V., S.C.O., and S.M. were/are employees of ImmunoQure AG.
Received: January 27, 2016
Revised: April 24, 2016
Accepted: June 10, 2016
Published: July 14, 2016
seven figures, and four tables and can be found with this article online at
CONSORTIA
The members of the APECED patient collaborative are Antonella Meloni, Nicolas Kluger, Eystein S. Husebye, Katarina Trebusak Podkrajsek, Tadej Battelino, Nina Bratanic, and Aleksandr Peet.
REFERENCES
Ambrosi, A., Espinosa, A., and Wahren-Herlenius, M. (2012). IL-17: a new actor
in IFN-driven systemic autoimmune diseases. Eur. J. Immunol. 42, 2274–2284.
Bennett, C.L., Christie, J., Ramsdell, F., Brunkow, M.E., Ferguson, P.J., Whitesell, L., Kelly, T.E., Saulsbury, F.T., Chance, P.F., and Ochs, H.D. (2001). The
immune dysregulation, polyendocrinopathy, enteropathy, X-linked syndrome
(IPEX) is caused by mutations of FOXP3. Nat. Genet. 27, 20–21.
Brink, R. (2014). The imperfect control of self-reactive germinal center B cells.
Curr. Opin. Immunol. 28, 97–101.
Carrero, J.A., Calderon, B., Towfic, F., Artyomov, M.N., and Unanue, E.R.
(2013). Defining the transcriptional and cellular landscape of type 1 diabetes
in the NOD mouse. PLoS ONE 8, e59701.
Cho, M.J., Lo, A.S., Mao, X., Nagler, A.R., Ellebrecht, C.T., Mukherjee, E.M.,
Hammers, C.M., Choi, E.J., Sharma, P.M., Uduman, M., et al. (2014). Shared
VH1-46 gene usage by pemphigus vulgaris autoantibodies indicates common
humoral immune responses among patients. Nat. Commun. 5, 4167.
Ciancanelli, M.J., Huang, S.X., Luthra, P., Garner, H., Itan, Y., Volpi, S., Lafaille,
F.G., Trouillet, C., Schmolke, M., Albrecht, R.A., et al. (2015). Infectious disease. Life-threatening influenza and impaired interferon amplification in human
IRF7 deficiency. Science 348, 448–453.
Corti, D., Voss, J., Gamblin, S.J., Codoni, G., Macagno, A., Jarrossay, D., Vachieri, S.G., Pinna, D., Minola, A., Vanzetta, F., et al. (2011). A neutralizing antibody selected from plasma cells that binds to group 1 and group 2 influenza A
hemagglutinins. Science 333, 850–856.
Corti, D., Bianchi, S., Vanzetta, F., Minola, A., Perez, L., Agatic, G., Guarino, B.,
Silacci, C., Marcandalli, J., Marsland, B.J., et al. (2013). Cross-neutralization of
four paramyxoviruses by a human monoclonal antibody. Nature 501, 439–443.
Di Zenzo, G., Di Lullo, G., Corti, D., Calabresi, V., Sinistro, A., Vanzetta, F.,
Didona, B., Cianchini, G., Hertl, M., Eming, R., et al. (2012). Pemphigus
autoantibodies generated through somatic mutations target the desmoglein-3
cis-interface. J. Clin. Invest. 122, 3781–3790.
Downes, K., Pekalski, M., Angus, K.L., Hardy, M., Nutland, S., Smyth, D.J.,
Walker, N.M., Wallace, C., and Todd, J.A. (2010). Reduced expression of
IFIH1 is protective for type 1 diabetes. PLoS ONE 5, e12646.
Kluger, N., Jokinen, M., Lintulahti, A., Krohn, K., and Ranki, A. (2015). Gastrointestinal immunity against tryptophan hydroxylase-1, aromatic L-amino-acid
decarboxylase, AIE-75, villin and Paneth cells in APECED. Clin. Immunol. 158,
212220.
Dudakov, J.A., Hanash, A.M., Jenq, R.R., Young, L.F., Ghosh, A., Singer, N.V.,
West, M.L., Smith, O.M., Holland, A.M., Tsai, J.J., et al. (2012). Interleukin-22
drives endogenous thymic regeneration in mice. Science 336, 9195.
Krohn, K., Uibo, R., Aavik, E., Peterson, P., and Savilahti, K. (1992). Identification by molecular cloning of an autoantigen associated with Addisons disease
as steroid 17 alpha-hydroxylase. Lancet 339, 770773.
Landegren, N., Sharon, D., Freyhult, E., Hallgren, A., Eriksson, D., Edqvist,
Foulis, A.K., Farquharson, M.A., and Meager, A. (1987). Immunoreactive

alpha-interferon in insulin-secreting beta cells in type 1 diabetes mellitus. Lancet 2, 14231427.
P.-H., Bensing, S., Wahlberg, J., Nelson, L.M., Gustafsson, J., et al. (2016).
Proteome-wide survey of the autoimmune target repertoire in autoimmune
polyendocrine syndrome type 1. Sci. Rep. 6, 20104.
Gardner, J.M., Devoss, J.J., Friedman, R.S., Wong, D.J., Tan, Y.X., Zhou, X.,
Johannes, K.P., Su, M.A., Chang, H.Y., Krummel, M.F., and Anderson, M.S.
(2008). Deletional tolerance mediated by extrathymic Aire-expressing cells.
Science 321, 843847.
Lefranc, M.P. (2003). IMGT, the international ImMunoGeneTics database. Nucleic Acids Res. 31, 307310.
Geng, L., Solimena, M., Flavell, R.A., Sherwin, R.S., and Hayday, A.C. (1998).
Widespread expression of an autoantigen-GAD65 transgene does not tolerize
non-obese diabetic mice and can exacerbate disease. Proc. Natl. Acad. Sci.
USA 95, 1005510060.
Goodnow, C.C., Vinuesa, C.G., Randall, K.L., Mackay, F., and Brink, R. (2010).
Control systems and decision making for antibody production. Nat. Immunol.
11, 681688.
Hargreaves, C.E., Grasso, M., Hampe, C.S., Stenkova, A., Atkinson, S.,
Joshua, G.W., Wren, B.W., Buckle, A.M., Dunn-Walters, D., and Banga, J.P.
(2013). Yersinia enterocolitica provides the link between thyroid-stimulating
antibodies and their germline counterparts in Graves disease. J. Immunol.
190, 53735381.
Hetemaki, I., Jarva, H., Kluger, N., Baldauf, H.M., Laakso, S., Bratland, E., Husebye, E.S., Kisand, K., Ranki, A., Peterson, P., and Arstila, T.P. (2016).
Anticommensal responses are associated with regulatory T cell defect in autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy patients.
J. Immunol. 196, 29552964.
Huang, X., Yuang, J., Goddard, A., Foulis, A., James, R.F., Lernmark, A., PujolBorrell, R., Rabinovitch, A., Somoza, N., and Stewart, T.A. (1995). Interferon
expression in the pancreases of patients with type I diabetes. Diabetes 44,
658664.
Husebye, E.S., Perheentupa, J., Rautemaa, R., and Kampe, O. (2009). Clinical
manifestations and management of patients with autoimmune polyendocrine
syndrome type I. J. Intern. Med. 265, 514529.
Ivashkiv, L.B., and Donlin, L.T. (2014). Regulation of type I interferon responses. Nat. Rev. Immunol. 14, 3649.
Kinnunen, T., Chamberlain, N., Morbach, H., Choi, J., Kim, S., Craft, J., Mayer,
L., Cancrini, C., Passerini, L., Bacchetta, R., et al. (2013). Accumulation of peripheral autoreactive B cells in the absence of functional human regulatory
T cells. Blood 121, 15951603.
Kisand, K., and Peterson, P. (2015). Autoimmune polyendocrinopathy candidiasis ectodermal dystrophy. J. Clin. Immunol. 35, 463478.
Kisand, K., Link, M., Wolff, A.S., Meager, A., Tserel, L., Org, T., Murumagi, A.,
Uibo, R., Willcox, N., Trebusak Podkrajsek, K., et al. (2008). Interferon autoantibodies associated with AIRE deficiency decrease the expression of IFN-stimulated genes. Blood 112, 26572666.
Kisand, K., Be Wolff, A.S., Podkrajsek, K.T., Tserel, L., Link, M., Kisand, K.V.,
Ersvaer, E., Perheentupa, J., Erichsen, M.M., Bratanic, N., et al. (2010).
Chronic mucocutaneous candidiasis in APECED or thymoma patients correlates with autoimmunity to Th17-associated cytokines. J. Exp. Med. 207,
299308.
Kisand, K., Lilic, D., Casanova, J.L., Peterson, P., Meager, A., and Willcox, N.
(2011). Mucocutaneous candidiasis and autoimmunity against cytokines in
APECED and thymoma patients: clinical and pathogenetic implications. Eur.
J. Immunol. 41, 15171527.
Klein, L., Kyewski, B., Allen, P.M., and Hogquist, K.A. (2014). Positive and
negative selection of the T cell repertoire: what thymocytes see (and dont
see). Nat. Rev. Immunol. 14, 377391.
594 Cell 166, 582595, July 28, 2016
Li, Q., Xu, B., Michie, S.A., Rubins, K.H., Schreriber, R.D., and McDevitt, H.O.
(2008). Interferon-alpha initiates type 1 diabetes in nonobese diabetic mice.
Mathis, D., and Benoist, C. (2009). Aire. Annu. Rev. Immunol. 27, 287312.
Meager, A., Vincent, A., Newsom-Davis, J., and Willcox, N. (1997). Spontaneous neutralising antibodies to interferonalpha and interleukin-12 in thymoma-associated autoimmune disease. Lancet 350, 15961597.
Meager, A., Visvalingam, K., Peterson, P., Moll, K., Murumagi, A., Krohn, K.,
Eskelin, P., Perheentupa, J., Husebye, E., Kadota, Y., and Willcox, N. (2006).
Anti-interferon autoantibodies in autoimmune polyendocrinopathy syndrome
type 1. PLoS Med. 3, e289.
Meffre, E., and Wardemann, H. (2008). B-cell tolerance checkpoints in health
and autoimmunity. Curr. Opin. Immunol. 20, 632638.
Meloni, A., Willcox, N., Meager, A., Atzeni, M., Wolff, A.S., Husebye, E.S., Furcas, M., Rosatelli, M.C., Cao, A., and Congia, M. (2012). Autoimmune polyendocrine syndrome type 1: an extensive longitudinal study in Sardinian patients.
J. Clin. Endocrinol. Metab. 97, 11141124.
Nagamine, K., Peterson, P., Scott, H.S., Kudoh, J., Minoshima, S., Heino, M.,
Krohn, K.J., Lalioti, M.D., Mullis, P.E., Antonarakis, S.E., et al. (1997). Positional
cloning of the APECED gene. Nat. Genet. 17, 393398.
Piccoli, L., Campo, I., Fregni, C.S., Rodriguez, B.M., Minola, A., Sallusto, F.,
Luisetti, M., Corti, D., and Lanzavecchia, A. (2015). Neutralization and clearance of GM-CSF by autoantibodies in pulmonary alveolar proteinosis. Nat.
Commun. 6, 7375.
Pillai, S., Mattoo, H., and Cariappa, A. (2011). B cells and autoimmunity. Curr.
Opin. Immunol. 23, 721731.
Poliani, P.L., Kisand, K., Marrella, V., Ravanini, M., Notarangelo, L.D., Villa, A.,
Peterson, P., and Facchetti, F. (2010). Human peripheral lymphoid tissues
contain autoimmune regulator-expressing dendritic cells. Am. J. Pathol. 176,
11041112.
Puel, A., Doffinger, R., Natividad, A., Chrabieh, M., Barcenas-Morales, G., Picard, C., Cobat, A., Ouachee-Chardin, M., Toulon, A., Bustamante, J., et al.
(2010). Autoantibodies against IL-17A, IL-17F, and IL-22 in patients with
chronic mucocutaneous candidiasis and autoimmune polyendocrine syndrome type I. J. Exp. Med. 207, 291297.
Retter, I., Althaus, H.H., Munch, R., and Muller, W. (2005). VBASE2, an integrative V gene database. Nucleic Acids Res. 33, D671D674.
Ribot, J.C., deBarros, A., Pang, D.J., Neves, J.F., Peperzak, V., Roberts, S.J.,
Girardi, M., Borst, J., Hayday, A.C., Pennington, D.J., and Silva-Santos, B.
(2009). CD27 is a thymic determinant of the balance between interferongamma- and interleukin 17-producing gammadelta T cell subsets. Nat. Immunol. 10, 427436.
Sboner, A., Karpikov, A., Chen, G., Smith, M., Mattoon, D., Freeman-Cook, L.,
Schweitzer, B., and Gerstein, M.B. (2009). Robust-linear-model normalization
to reduce technical variability in functional protein microarrays. J. Proteome
Res. 8, 54515464.
Sobolev, O., Binda, E., OFarrell, S., Lorenc, A., Pradines, J., Huang, Y., Duffner, J., Schulz, R., Cason, J., Zambon, M., et al. (2016). Adjuvanted influenzaH1N1 vaccination reveals lymphoid signatures of age-dependent early
responses and of clinical adverse events. Nat. Immunol. 17, 204213.
Ubelhart, R., and Jumaa, H. (2015). Autoreactivity and the positive selection of
B cells. Eur. J. Immunol. 45, 29712977.
autoimmune polyendocrine syndrome type I and Addisons disease. J. Clin.

Invest. 92, 23772385.
Ueno, H., Banchereau, J., and Vinuesa, C.G. (2015). Pathophysiology of T

follicular helper cells in humans and mice. Nat. Immunol. 16, 142152.
Wolff, A.S., Erichsen, M.M., Meager, A., Magitta, N.F., Myhre, A.G., Bollerslev,
J., Fougner, K.J., Lima, K., Knappskog, P.M., and Husebye, E.S. (2007). Autoimmune polyendocrine syndrome type 1 in Norway: phenotypic variation, autoantibodies, and novel mutations in the autoimmune regulator gene. J. Clin.
Endocrinol. Metab. 92, 595603.
Uibo, R., Aavik, E., Peterson, P., Perheentupa, J., Aranko, S., Pelkonen, R.,
and Krohn, K.J. (1994). Autoantibodies to cytochrome P450 enzymes
P450scc, P450c17, and P450c21 in autoimmune polyglandular disease
types I and II and in isolated Addisons disease. J. Clin. Endocrinol. Metab.
78, 323328.
van der Fits, L., Mourits, S., Voerman, J.S., Kant, M., Boon, L., Laman, J.D.,
Cornelissen, F., Mus, A.M., Florencia, E., Prens, E.P., and Lubberts, E.
(2009). Imiquimod-induced psoriasis-like skin inflammation in mice is mediated via the IL-23/IL-17 axis. J. Immunol. 182, 58365845.
Wardemann, H., Yurasov, S., Schaefer, A., Young, J.W., Meffre, E., and Nussenzweig, M.C. (2003). Predominant autoantibody production by early human
B cell precursors. Science 301, 13741377.
Welcher, A.A., Boedigheimer, M., Kivitz, A.J., Amoura, Z., Buyon, J., Rudinskaya, A., Latinis, K., Chiu, K., Oliner, K.S., Damore, M.A., et al. (2015).
Blockade of interferon-gamma normalizes interferon-regulated gene expression and serum CXCL10 levels in patients with systemic lupus erythematosus.
Arthritis Rheumatol. 67, 27132722.
Wildin, R.S., Ramsdell, F., Peake, J., Faravelli, F., Casanova, J.L., Buist, N.,
Levy-Lahad, E., Mazzella, M., Goulet, O., Perroni, L., et al. (2001). X-linked
neonatal diabetes mellitus, enteropathy and endocrinopathy syndrome is the
human equivalent of mouse scurfy. Nat. Genet. 27, 1820.
Winqvist, O., Gustafsson, J., Rorsman, F., Karlsson, F.A., and Kampe, O.
(1993). Two different cytochrome P450 enzymes are the adrenal antigens in
Wolff, A.S., Sarkadi, A.K., Marodi, L., Karner, J., Orlova, E., Oftedal, B.E.,
Kisand, K., Olah, E., Meloni, A., Myhre, A.G., et al. (2013). Anti-cytokine autoantibodies preceding onset of autoimmune polyendocrine syndrome type I
features in early childhood. J. Clin. Immunol. 33, 13411348.
Wolff, A.S., Karner, J., Owe, J.F., Oftedal, B.E., Gilhus, N.E., Erichsen, M.M.,
Kampe, O., Meager, A., Peterson, P., Kisand, K., et al. (2014). Clinical and
serologic parallels to APS-I in patients with thymomas and autoantigen transcripts in their tumors. J. Immunol. 193, 38803890.
Yamano, T., Nedjic, J., Hinterberger, M., Steinert, M., Koser, S., Pinto, S.,
Gerdes, N., Lutgens, E., Ishimaru, N., Busslinger, M., et al. (2015). Thymic
B cells are licensed to present self antigens for central T cell tolerance induction. Immunity 42, 10481061.
Ziegler, A.G., Rewers, M., Simell, O., Simell, T., Lempainen, J., Steck, A., Winkler, C., Ilonen, J., Veijola, R., Knip, M., et al. (2013). Seroconversion to multiple
islet autoantibodies and risk of progression to diabetes in children. JAMA 309,
24732479.
Zucchelli, S., Holler, P., Yamagata, T., Roy, M., Benoist, C., and Mathis, D.
(2005). Defective central tolerance induction in NOD mice: genomics and genetics. Immunity 22, 385396.
Cell 166, 582595, July 28, 2016 595
Article
Structure and Function Analysis of an Antibody
Recognizing All Influenza A Subtypes
Graphical Abstract
Authors
Nicole L. Kallewaard, Davide Corti,
Patrick J. Collins, ..., Qing Zhu,
Steven J. Gamblin, John J. Skehel
Correspondence
zhuq@medimmune.com (Q.Z.),
john.skehel@crick.ac.uk (J.J.S.)
In Brief
Identification of a human monoclonal
antibody that reacts effectively with all
influenza A hemagglutinin subtypes
paves the way for developing
immunotherapy for people infected with
the flu virus.
Highlights
• Binding to all influenza A subtypes neutralizing seasonal and pandemic strains
• Utilizes a rare VH (VH6-1) and carries a low level of somatic mutations
• Highly conserved epitope encompassing fusion peptide and hydrophobic groove
• Superior therapeutic window compared to oseltamivir in animals
Kallewaard et al., 2016, Cell 166, 596–608
July 28, 2016 © 2016 The Authors. Published by Elsevier Inc.
Accession Numbers
5JW5
5JW4
5JW3
KX398429
KX398468
Article
Structure and Function Analysis of an Antibody
Recognizing All Influenza A Subtypes
Nicole L. Kallewaard,1,8 Davide Corti,2,8 Patrick J. Collins,3,8 Ursula Neu,3,8 Josephine M. McAuliffe,1 Ebony Benjamin,1
Leslie Wachter-Rosati,1 Frances J. Palmer-Hill,1 Andy Q. Yuan,4 Philip A. Walker,5 Matthias K. Vorlaender,3 Siro Bianchi,2
Barbara Guarino,2 Anna De Marco,2 Fabrizia Vanzetta,2 Gloria Agatic,2 Mathilde Foglierini,6 Debora Pinna,6
Blanca Fernandez-Rodriguez,6 Alexander Fruehwirth,6 Chiara Silacci,6 Roksana W. Ogrodowicz,5 Stephen R. Martin,5
Federica Sallusto,6 JoAnn A. Suzich,1 Antonio Lanzavecchia,6,7,9 Qing Zhu,1,9,* Steven J. Gamblin,3,9
and John J. Skehel3,9,*
1Department of Infectious Disease and Vaccines, MedImmune LLC, One MedImmune Way, Gaithersburg, MD 20878, USA
2Humabs BioMed SA, Via Mirasole 1, 6500 Bellinzona, Switzerland
3Mill Hill Laboratory, The Francis Crick Institute, London NW7 1AA, UK
4Department of Antibody Discovery and Protein Engineering, MedImmune LLC, One MedImmune Way, Gaithersburg, MD 20878, USA
5Structural Biology Science Technology Platform, Mill Hill Laboratory, Francis Crick Institute, London NW7 1AA, UK
6Institute for Research in Biomedicine, Università della Svizzera italiana, 6500 Bellinzona, Switzerland
7Institute for Microbiology, ETH Zurich, Wolfgang-Pauli-Strasse 10, 8093 Zurich, Switzerland
8Co-first author
9Co-senior author
*Correspondence: zhuq@medimmune.com (Q.Z.), john.skehel@crick.ac.uk (J.J.S.)
SUMMARY
Influenza virus remains a threat because of its ability
to evade vaccine-induced immune responses due
to antigenic drift. Here, we describe the isolation,
evolution, and structure of a broad-spectrum human monoclonal antibody (mAb), MEDI8852, effectively reacting with all influenza A hemagglutinin
(HA) subtypes. MEDI8852 uses the heavy-chain
VH6-1 gene and has higher potency and breadth
when compared to other anti-stem antibodies.
MEDI8852 is effective in mice and ferrets with a
therapeutic window superior to that of oseltamivir.
Crystallographic analysis of Fab alone or in complex
with H5 or H7 HA proteins reveals that MEDI8852
binds through a coordinated movement of CDRs
to a highly conserved epitope encompassing a hydrophobic groove in the fusion domain and a large
portion of the fusion peptide, distinguishing it from
other structurally characterized cross-reactive antibodies. The unprecedented breadth and potency
of neutralization by MEDI8852 support its development as immunotherapy for influenza virus-infected
humans.
INTRODUCTION
Influenza virus infection remains a serious threat to global
health and the world economy. Annual epidemics result in a
high number of hospitalizations, with an estimated 3–5 million
cases of severe disease and 250,000–500,000 deaths globally,
and higher mortality rates are possible during pandemics
(Wright et al., 2007). Given the emergence of anti-viral drug
resistance, short treatment windows for antivirals, and the
lack of cross-protective vaccines, there is an unmet medical
need for new therapeutic options that can effectively treat influenza infection.
Three types of influenza virus (A, B, and C) cause disease in humans, and influenza A and B are responsible for frequent seasonal epidemics. However, influenza A
infections account for the majority of hospitalizations and
are the only type to cause pandemics (Wright et al., 2007). Influenza A is subtyped by its two major surface proteins, hemagglutinin (HA) and neuraminidase (NA). HA is the main target of
neutralizing antibodies that are induced by infection or vaccination. The globular HA head domain mediates binding to the
sialic acid receptor, while the HA stem mediates the subsequent fusion between the viral and cellular membranes that is
triggered in endosomes by the low pH (Skehel and Wiley,
2000). Genetically, there are 16 influenza A subtypes of HA,
which form two structurally and antigenically distinct groups
(Nobusawa et al., 1991; Russell et al., 2004). In addition, two
new HA analogs recovered from bats, H17 and H18, have
been included in this classification (Tong et al., 2012, 2013).
Currently, H1 and H3 HA subtypes are associated with human
disease and viruses containing H5, H7, H9, and H10 HAs are
associated with sporadic human infections due to direct transmission from avian species.
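The grouping of HA subtypes described above can be written down as a small lookup. This is only a sketch: the group membership shown is the standard phylogenetic assignment, which this section does not list, and the function name `ha_group` is a hypothetical helper. H17 and H18 are the bat-derived HA analogs mentioned in the text.

```python
import re

# Standard phylogenetic grouping of influenza A HA subtypes (assumption:
# this membership is not spelled out in the section itself).
GROUP_1 = {1, 2, 5, 6, 8, 9, 11, 12, 13, 16, 17, 18}
GROUP_2 = {3, 4, 7, 10, 14, 15}


def ha_group(strain: str) -> int:
    """Return 1 or 2 for a label such as 'H3N2 HK/68' or 'H7'."""
    match = re.match(r"H(\d+)", strain.strip())
    if match is None:
        raise ValueError(f"no HA subtype found in {strain!r}")
    subtype = int(match.group(1))
    if subtype in GROUP_1:
        return 1
    if subtype in GROUP_2:
        return 2
    raise ValueError(f"unknown HA subtype H{subtype}")


# Subtypes named in the text as associated with human disease or sporadic infection.
for label in ["H1", "H3", "H5", "H7", "H9", "H10"]:
    print(label, "-> group", ha_group(label))
```

With this lookup, the subtypes currently associated with human disease (H1 and H3) fall into different groups, which is why coverage of both groups matters for a therapeutic antibody.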
The majority of influenza virus neutralizing antibodies elicited
by vaccination or infection bind to the globular head of HA and
recognize homologous strains within a given subtype (Russell
et al., 2008). These antibodies neutralize virus infectivity by
blocking sialic acid receptor binding either directly (Knossow
and Skehel, 2006; Schmidt et al., 2013) by interacting with the
receptor binding site at the tip of the molecule or indirectly, by
projecting over the binding site thereby rendering it inaccessible
(Fleury et al., 1999; Xiong et al., 2015). These antibodies are
involved in the selection of viruses with variant HAs in the
process of antigenic drift, necessitating the annual re-development of influenza vaccines.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
In the past 8 years, several laboratories have described a new
class of influenza-neutralizing antibodies that target conserved
sites in the HA stem that showed different levels of cross-reactivity toward group 1 (Corti et al., 2010; Sui et al., 2009; Throsby
et al., 2008; Wrammert et al., 2011), group 2 (Dunand et al., 2015;
Ekiert et al., 2011; Friesen et al., 2014; Tan et al., 2014) and
groups 1 and 2 viruses (Corti et al., 2011; Dreyfus et al., 2012; Nakamura et al., 2013; Wu et al., 2015). Anti-stem antibodies are
less potent at direct viral neutralization than anti-head antibodies but were shown to induce potent antibody-dependent cellular cytotoxicity (ADCC) of infected cells in vitro
and in vivo (Corti et al., 2011; DiLillo et al., 2016; DiLillo et al.,
2014), while anti-head antibodies were ineffective or less effective at
mediating ADCC. In general, the human antibody response
to the HA stem region is more frequent against group 1 as
compared to group 2 HAs and is dominated by VH1-69 antibodies (Pappas et al., 2014; Sui et al., 2009; Wrammert et al.,
2011). Although subdominant, the group 1 stem response was
shown to be recalled after heterologous boosts by the new
pandemic H1N1 virus in 2009 (Corti et al., 2011; Wrammert
et al., 2011). The antibody response to the HA stem region of
group 2 HAs is less frequent, possibly due to the presence of a
conserved glycan bound to N38 in HA1 that may shield the access to the most conserved sites in the HA stem and to the
lack of exposure to heterologous group 2 viruses (i.e., H7) or to
new pandemic H3N2 viruses. Finally, antibodies capable of reacting with the HA stem region of both group 1 and 2 subtypes
are extremely rare and usually do not show complete coverage
of all subtypes. It has been hypothesized that such broadly
cross-reactive antibodies might have potential as therapeutic
agents and studies on their mechanism of action, epitope specificity, and ontogeny could also inform the design of cross-protective influenza virus vaccines (Corti and Lanzavecchia, 2013;
Yewdell, 2013).
A problem related to the development of anti-stem antibodies as immunotherapeutics is their variable neutralizing
potency against viruses belonging to different subtypes and
the existence of natural escape mutants. In view of the limitations of group 1 and group 2 antibodies isolated so far, we
searched for an antibody capable of potently neutralizing
group 1 and 2 influenza A viruses within a narrow range of
antibody concentrations. In this study, we isolated and optimized an antibody, named MEDI8852, that exhibited unprecedented breadth and potency, being able to neutralize a diverse
panel of representative viruses spanning >80 years of antigenic evolution. Unlike other broadly neutralizing stem-reactive antibodies, MEDI8852 is unique in that it uses a rare
VH (VH6-1) and carries a low level of somatic mutations. Crystallographic analysis of the Fab alone or in complex with H5
and H7 HA proteins reveals that MEDI8852 binds a highly
conserved epitope on H5 and H7 that is markedly different
from other structurally characterized stem-reactive neutralizing antibodies. The characterization of this unique epitope
and the breadth and potency of neutralization exhibited by
MEDI8852 support its development for immunotherapy in
influenza virus-infected humans.
RESULTS
Isolation, Genetic Description, and Optimization of
MEDI8852
Four broadly reactive antibodies were isolated from the memory
B cells of a selected donor based on influenza A HA protein
cross-reactivity as previously reported (Traggiai et al., 2004;
Pappas et al., 2014). These antibodies (FY1, FY5, FY6, and
FY18) belong to the same lineage carrying VH6-1*01/D3-3*01/
JH3*02 and VK1-39*01/JK1*01 gene segments (Figure 1A). We
reconstructed the genealogical trees of this lineage and produced the unmutated common ancestor (UCA), the four clonally
derived antibodies, and three antibodies representing the evolutionary branching points (BP) of the lineage (Figure 1B). Purified
antibodies were tested for neutralizing activity against multiple
viruses of different group 1 and 2 subtypes (Figure 1C). Interestingly, the UCA antibody exhibited neutralizing activity against
group 1 viruses, but not group 2 viruses, albeit with lower
potency as compared to some of the mutated antibodies. Of
note, the first BP (BP1) gained neutralization activity toward early
group 2 H3N2 viruses (HK/68 and VC/75). Two antibodies (i.e.,
FY1 and FY5) of this lineage acquired neutralization activity
against group 2 viruses through two independent pathways of
somatic mutations. The same analysis was extended to the lineage of FI6, a previously described antibody cross-neutralizing
group 1 and group 2 viruses (Corti et al., 2011). Isolation of five
additional antibodies from this lineage allowed the reconstruction of a complex genealogy tree (Figure S1). Similarly to what
was observed for the FY1 lineage, the FI6-UCA antibody exhibited neutralizing activity against group 1 viruses only and
evolved through two independent pathways of somatic mutations that led to the group 1-specific FI370 and FI6038 antibodies
and to the group 1 and 2 cross-reactive antibodies FI6, FI2013
and FI4013. Taken together, these findings suggest that in
both lineages, the UCA was initially selected by a group 1 virus
and developed to a branching point characterized by cross-reactivity toward a limited number of group 2 viruses. From
this point, the final antibody may have been selected further for
binding to group 1 only (e.g., FY6 and FI370) or group 2 HAs
(e.g., FY1 and FI6). These results are consistent with a model
in which the development of cross-reactive group 1 and 2 antibodies is started by group 1 HAs and then further selected
through boosts by group 2 HAs.
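Comparing lineage members with their UCA comes down to listing substitutions relative to a reference sequence. The sketch below assumes a dot-notation alignment row, in which a dot marks a residue identical to the reference. The sequences are hypothetical toy strings, not taken from the paper, and positions are plain 1-based indices rather than Kabat numbers (which include insertion positions such as 35A).

```python
def expand_row(reference: str, row: str) -> str:
    """Replace each '.' in an alignment row with the reference residue."""
    if len(reference) != len(row):
        raise ValueError("row and reference must have equal length")
    return "".join(ref if res == "." else res for ref, res in zip(reference, row))


def substitutions(reference: str, row: str) -> list:
    """List substitutions as 'G6Y': reference residue, 1-based index, new residue."""
    if len(reference) != len(row):
        raise ValueError("row and reference must have equal length")
    return [
        f"{ref}{index}{res}"
        for index, (ref, res) in enumerate(zip(reference, row), start=1)
        if res != "." and res != ref
    ]


# Hypothetical toy data for illustration only.
toy_reference = "ACDEFGHK"
toy_row = ".....Y.R"

print(expand_row(toy_reference, toy_row))     # ACDEFYHR
print(substitutions(toy_reference, toy_row))  # ['G6Y', 'K8R']
```

Counting the entries returned by `substitutions` for each antibody gives the per-sequence amino acid substitution load relative to the chosen reference.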
Based on its potency, breadth, and low level of somatic mutation, the FY1 antibody was chosen as the lead for further in vitro
optimization through parsimonious mutagenesis of the complementarity determining regions (CDRs) combined with reversion
of unnecessary somatic mutations in the frameworks. Optimization for binding affinity improved Fab affinity for H3 HA and
H1 HA proteins by 14-fold and 5-fold, respectively, as determined by surface plasmon resonance (Table S2).
The resulting antibody was named MEDI8852 (VH and VL sequences shown in Figure 1A) and was compared side by side
with the parental FY1 antibody for binding and neutralization of
a large panel of influenza A viruses. MEDI8852 showed higher
binding activity as compared to FY1 against the group 1 HA proteins of H1, H2, H5, H6, and H9 subtypes and group 2 HA proteins of H3 and H7 subtypes, with a mean half-maximal effective
concentration (EC50) of 0.064 μg/ml versus 0.124 μg/ml for
MEDI8852 and FY1, respectively (Figure 2B; Table S2). In addition, we investigated the binding of FY1 and MEDI8852 to the
remaining HAs including the 1918 H1N1 pandemic strain and
two recently identified HA analogs recovered from bats (H17
and H18) (Tong et al., 2012, 2013), by flow cytometric analysis
of cell-surface expressed HAs (Figure 2C). Of note, MEDI8852
bound to all HAs and gained reactivity against H12 HA over the
parental FY1 antibody.
[Figure 1 graphic omitted. Panel A: VH and VL alignment rows for UCA, BP1, BP2, BP3, FY18, FY5, FY6, FY1, and MEDI8852. Panel B: VH and VL genealogy trees. Panel C: IC50 (μg/ml) for group 1 viruses (H1N1 WSN/33, PR/34, FM/47, BJ/95, SZ/95, NC/99, SI/2006, SD/2007, CA/2009, BR/2010; H2N2 JP/57; H5N1 VT/2004; H6N2 AB/85; H9N2 HK/97) and group 2 viruses (H3N2 HK/68, VC/75, SG/93, WH/95, SY/97, PA/99, CA/2004, WI/2005, PT/2009, VC/2011; H7N3 BC/2004).]
Figure 1. Developmental Pathway of the MEDI8852 Lineage
(A) Alignment of VH and VL amino acid sequences of four mutated antibodies with their UCA and branchpoint (BP) configurations and MEDI8852. Amino acid
substitutions are highlighted in red. Residue positions are according to Kabat numbering. Dots indicate identical residues. Boxes indicate CDR borders according
to IMGT (solid line) and Kabat (dashed line).
(B) Genealogy trees of VH (left) and VL (right) nucleotide sequences generated using dnaml. The number of mutations is indicated on the branches with amino acid
substitutions in parentheses.
(C) Neutralization of influenza A viruses. IC50 values were determined against a panel of 25 influenza A isolates. Values above 50 μg/ml were scored as negative
(dashed line). Average IC50 values were obtained from at least two independent experiments. Full viral strain designations are listed in Table S1.
See also Figure S1.
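The scoring rule given in the Figure 1C legend (average IC50 from at least two independent experiments, with values above 50 μg/ml scored as negative) can be sketched as follows. The legend does not say which kind of average was used, so the arithmetic mean here is an assumption, and the function name and replicate values are hypothetical.

```python
from statistics import mean

NEGATIVE_CUTOFF_UG_PER_ML = 50.0  # cutoff stated in the Figure 1C legend


def score_ic50(replicates):
    """Average replicate IC50 values (ug/ml); return None when scored negative.

    Assumptions: arithmetic mean across experiments, and the cutoff is
    applied to the averaged value rather than to each replicate.
    """
    if len(replicates) < 2:
        raise ValueError("need at least two independent experiments")
    average = mean(replicates)
    return None if average > NEGATIVE_CUTOFF_UG_PER_ML else average


# Hypothetical replicate values for illustration only.
print(score_ic50([1.0, 3.0]))    # 2.0
print(score_ic50([60.0, 80.0]))  # None (scored as negative)
```

A geometric mean would be a reasonable alternative for concentrations spanning several orders of magnitude; the choice changes the averaged value but not the logic of the cutoff.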
To examine if the higher potency and breadth of binding
activity of MEDI8852 as compared to FY1 translated into potent
and broad antiviral activity, we measured neutralizing activity
of both antibody variants in MDCK cells against a diverse
panel of seasonal H1N1 and H3N2 viruses and emerging,
potentially epidemic viruses, isolated over a period spanning
>80 years (1933–2014). All seasonal influenza viruses tested
were neutralized by FY1 and MEDI8852 with median IC50 values
of 1.33 μg/ml and 0.51 μg/ml, respectively, resulting in nearly
a 3-fold increase in overall potency (Figure 2D). However,
both antibodies exhibited comparable activity in neutralizing
group 1 and group 2 viruses with similar IC50 values (1.03 and
2.02 μg/ml for FY1 and 0.34 and 0.61 μg/ml for MEDI8852
against 18 H1N1 and 18 H3N2 viruses, respectively) (Figure 2D).
The increase in overall activity of MEDI8852 compared to FY1
was also apparent when tested against 13 non-seasonal influenza viruses including H5 and H7 viruses isolated from recent
human infections. MEDI8852 neutralized these viruses with
an overall median IC50 of 1.21 μg/ml (range 4.05–0.41 μg/ml)
versus FY1 with a median IC50 of 3.59 μg/ml (range 11.05–0.76 μg/ml) (Figure 2E). These results indicate that the optimization of MEDI8852 resulted in a 3-fold increase in the potency of
neutralization and the ability to bind to all HA subtypes of influenza A viruses.
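As a quick arithmetic check on the fold changes quoted above, the ratio of the reported median IC50 values can be computed directly. The following is a minimal sketch: the function name `fold_increase` and the dictionary layout are assumptions made for illustration, and the only inputs are the medians stated in the text.

```python
def fold_increase(parental_ic50: float, optimized_ic50: float) -> float:
    """Return how many times lower the optimized IC50 is than the parental IC50."""
    if parental_ic50 <= 0 or optimized_ic50 <= 0:
        raise ValueError("IC50 values must be positive")
    return parental_ic50 / optimized_ic50


# Median IC50 values as reported in the text, given as (FY1, MEDI8852).
# Both values of a pair share the same concentration unit, so the ratio is unitless.
reported_medians = {
    "seasonal": (1.33, 0.51),
    "non-seasonal": (3.59, 1.21),
}

fold_by_panel = {
    panel: fold_increase(fy1, medi8852)
    for panel, (fy1, medi8852) in reported_medians.items()
}

for panel, fold in fold_by_panel.items():
    print(f"{panel}: {fold:.2f}-fold lower median IC50 for MEDI8852")
```

The ratios come out at about 2.6 for the seasonal panel and about 3.0 for the non-seasonal panel, which agrees with the "nearly 3-fold" and "3-fold" descriptions.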
To extend the evaluation, we directly compared the in vitro
neutralization activity and breadth of MEDI8852 with the previously published cross-group neutralizing mAbs FI6v3, CR9114,
and 39.29 (Corti et al., 2011; Dreyfus et al., 2012; Nakamura
et al., 2013), using a diverse panel of seasonal and non-seasonal influenza strains from group 1 and group 2 (Figure 2F).
As reported previously, these antibodies neutralized group 1
and group 2 viruses although they exhibited distinct differences
in both potency and breadth. Among all the antibodies tested,
MEDI8852 is the only one that demonstrated neutralizing activity against all the viruses tested, with a median IC50 of 0.99 μg/ml
(range 8.75-0.09 μg/ml). CR9114 failed to neutralize the human
H2N2 A/Japan/57 virus. Both FI6v3 and 39.29 were unable to
[Figure 2 plots, panels A-F: HA phylogenetic tree (group 1 and group 2 subtypes); ELISA EC50 (μg/ml) and flow cytometry MFI for FY1 and MEDI8852; neutralization IC50 (μg/ml) for FY1, MEDI8852, CR9114, 39.29, and FI6v3 against seasonal H1N1 and H3N2 strains and non-seasonal strains.]
Figure 2. MEDI8852 Binds to All Influenza A HA Subtypes and Exhibits Neutralization
of Influenza A Seasonal and Non-seasonal Viral Strains
(A) Phylogenetic tree of influenza A HAs. Group 1
and group 2, colored in red and blue, are further
subdivided into 3 clades (H8, H9, and H12;
H1, H2, H5, and H6; H11, H13, and H16) and 2
clades (H3, H4, and H14; H7, H10, and H15),
respectively.
(B) ELISA binding average EC50 values of FY1 and
MEDI8852 to purified recombinant HA proteins.
(C) Binding of FY1 and MEDI8852 to surface-expressed HA proteins as determined by flow
cytometry. Shown are MFI values.
(D and E) FY1 and MEDI8852 neutralization IC50
values were determined against a panel of 36
seasonal influenza A isolates (D) and 13 non-seasonal influenza viruses (E).
(F) Neutralization average IC50 values of MEDI8852,
39.29, FI6v3, and CR9114 were determined from at
least two independent experiments using a panel of
24 seasonal and non-seasonal influenza viruses
and plotted as a single symbol. Full viral strain
designations are listed in Tables S1 and S2.
neutralize the contemporary human isolate H7N9 A/Anhui/2013, and 39.29 was incapable of neutralizing the A/Netherlands/2003 H7N7 virus, both H9N2 viruses, and a
contemporary H3N2 virus, A/Palau/2014, at the highest concentration tested (50 μg/ml). In addition to better overall
breadth of neutralization, MEDI8852 exhibits equal or greater
neutralization potency than the other cross-reactive monoclonal antibodies, with a median IC50 of 0.99 μg/ml, compared
to 2.13, 7.57, and 1.76 μg/ml for CR9114, 39.29, and FI6v3,
respectively, when the non-neutralized viruses are excluded
from the analysis.
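The comparison above takes a median over neutralized viruses only. A minimal sketch of that bookkeeping follows, assuming that a virus counts as non-neutralized when no IC50 could be determined or when the value exceeds the highest concentration tested (50, per the text). The function name and the example panel are hypothetical and are not data from this study.

```python
from statistics import median

ASSAY_CEILING = 50.0  # highest antibody concentration tested, as stated in the text


def median_ic50_neutralized(ic50_values, ceiling=ASSAY_CEILING):
    """Return (median IC50 over neutralized viruses, number of non-neutralized viruses).

    A value of None, or any value above `ceiling`, is treated as "not neutralized"
    and is excluded from the median.
    """
    neutralized = [v for v in ic50_values if v is not None and v <= ceiling]
    n_not_neutralized = len(ic50_values) - len(neutralized)
    if not neutralized:
        raise ValueError("no neutralized viruses in the panel")
    return median(neutralized), n_not_neutralized


# Hypothetical six-virus panel: one virus without a measurable IC50, one above the ceiling.
example_panel = [0.2, 1.0, None, 4.0, 60.0, 0.5]
panel_median, panel_failures = median_ic50_neutralized(example_panel)
print(panel_median, panel_failures)
```

Reporting the count of excluded viruses alongside the median keeps breadth and potency visible together, since an antibody that fails on more viruses has its median taken over a smaller, more favorable subset.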
MEDI8852 Mechanisms of Antiviral Activity
The cross-subtype neutralizing antibodies reported to date
inhibit HA-mediated membrane fusion activity in vitro (Corti
et al., 2011; Dreyfus et al., 2012; Nakamura et al., 2013). Activation of fusion requires cleavage of the precursor, HA0, and exposure of the cleaved HA to the low pH of endosomes. In assays
of these two processes, we have shown that MEDI8852 inhibits
host cell protease cleavage of both H1 (group 1) and H3
(group 2) HA0, which would prevent membrane fusion (Figure 3A),
and that MEDI8852 binding to cleaved HA stabilizes the pre-fusion
conformation, thereby preventing the low pH-induced conformational change that is required for membrane fusion (Figures
3B and 3C).
In addition, we have shown that MEDI8852 mediates the lysis of infected cells by human primary NK cells
(ADCC), the antibody-dependent cellular
phagocytosis (ADCP) of MDCK cells
expressing H1 or H3 HAs by human
monocyte-derived macrophages, and
the complement-dependent cytotoxicity
(CDC) of influenza-infected MDCK cells in the presence of complement (Figures 3D, 3E, 3F, and S2).
Prophylactic and Therapeutic Efficacy of MEDI8852 in
Mice and Ferrets
We evaluated the antiviral activity of MEDI8852 in mice challenged with a lethal dose of three different influenza A viruses,
A/California/7/2009 H1N1 (CA/2009 H1), A/Wilson Smith N/33
H1N1 (WSN/33 H1), and a reassortant A/Hong Kong/8/68
H3N1 (rHK/68 H3). A dose-ranging study was conducted in
which MEDI8852 was administered at the time of virus challenge. All mice (100%) receiving MEDI8852 at 3 or 1 mg/kg survived
challenge with CA/2009 H1, while 60% and 20% of mice that received 0.3 and 0.1 mg/kg of MEDI8852, respectively, survived
(Figure 4A). Consistent with the survival data, lung viral titers
were significantly reduced compared to control antibody-treated
animals when MEDI8852 was administered at 3 mg/kg (Figure 4B). In addition, we observed that the two highest doses
of MEDI8852 significantly protected lung function in mice
compared to the control antibody as measured by pulse oximetry (Figure S3A).
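Survival in these challenge studies is summarized as Kaplan-Meier curves (Figure 4). For readers unfamiliar with the estimator, the sketch below computes the product-limit survival probability at each day on which a death occurs. The function name, the group size, and the death days are hypothetical and chosen only so that the curve ends at 60% survival; they are not data from this study.

```python
def kaplan_meier(times, events):
    """Product-limit (Kaplan-Meier) survival estimate.

    times:  day of death, or day of last observation, for each animal
    events: True if the animal died on that day, False if it was censored
            (still alive when observation ended)
    Returns a list of (day, survival_probability) for each day with a death.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    at_risk = len(times)
    survival = 1.0
    curve = []
    for day in sorted(set(times)):
        deaths = sum(1 for t, died in zip(times, events) if t == day and died)
        leaving = sum(1 for t in times if t == day)  # deaths plus censored on this day
        if deaths:
            survival *= (at_risk - deaths) / at_risk
            curve.append((day, survival))
        at_risk -= leaving
    return curve


# Hypothetical group of five mice followed for 14 days:
# deaths on days 6 and 8, three animals alive at the end of the study.
days = [6, 8, 14, 14, 14]
died = [True, True, False, False, False]
curve = kaplan_meier(days, died)
for day, prob in curve:
    print(f"day {day}: {prob:.0%} surviving")
```

Animals alive at the end of the study enter as censored observations, which is why they reduce the number at risk without lowering the survival estimate.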
To evaluate the therapeutic utility of MEDI8852, we administered MEDI8852 at different time points following infection with
WSN/33 H1 or rHK/68 H3 virus. At a dose of 10 mg/kg, survival
rates of 90%-100% were achieved even when treatment was
[Figure 3 plots, panels A-F: HA0 cleavage time course (0, 5, 10, 20, 40 min) with isotype control or MEDI8852; HA conformational change over decreasing pH (7.38 down to about 5.5) with and without MEDI8852; fusion inhibition (%), ADCC (%), ADCP (%), and CDC (%) versus antibody concentration (ng/ml) for MEDI8852 and control antibody.]
Figure 3. MEDI8852's Antiviral Mechanisms of Action
(A) HA cleavage inhibition assay of uncleaved HA0 recombinant proteins of A/New Caledonia/20/99 (H1N1) or A/Hong Kong/8/68 (H3N2), pre-treated with MEDI8852 or a non-relevant isotype control antibody, MPE8v3, after digestion with TPCK-trypsin for 0, 5, 10, 20, or 40 min.
(B) Inhibition of low pH-activated conformational change in HA showing SDS PAGE gels of H5 HA with and without MEDI8852, incubated at decreasing pH values and neutralized after digestion with TPCK-trypsin.
(C) Fusion inhibition assay using MEDI8852 (solid) or MPE8v3 (open) incubated with A/Puerto Rico/8/34 (H1N1) virus (red) or A/Aichi/2/68 (H3N2) virus (blue) and human red blood cells and exposed to low pH to induce viral fusion. Percent fusion inhibition was calculated based on the amount of hemoglobin present in the supernatant.
(D) ADCC activity on A549 cells infected with A/Puerto Rico/8/34 (H1N1) and incubated with MEDI8852 (red) or MPE8v3 (black) antibody in the presence of human NK cells. Antibody-dependent killing was measured in quadruplicate by LDH release.
(E) ADCP activity on MDCK cells expressing H1 HA from A/South Dakota/06/2007 that were labeled with CFSE and incubated with MEDI8852 (red), or an irrelevant control, R347 (black) antibody in the presence of violet-labeled human macrophages in duplicate. Percent phagocytosis was determined by the amount of total macrophages that were labeled with violet and CFSE.
(F) CDC activity on MDCK cells infected with A/Puerto Rico/8/34 (H1N1) and incubated with a serial dilution of MEDI8852 (red) or MPE8v3 (black) antibody in the presence of rabbit complement. Antibody-dependent killing was measured in triplicate by LDH release. Error bars represent two times the SD at each antibody concentration.
See also Figure S2.
delayed until day 4 post infection with WSN/33 H1, or day 3 post
infection with rHK/68 H3 (Figures 4C and 4E). Significant survival
benefits were also seen with 1 and 3 mg/kg doses when administered on days 1, 2, or 3 post infection, albeit with lower survival rates
than with the 10 mg/kg dose (Table S3). MEDI8852 treatment at 10 mg/kg
at all time points post infection resulted in significantly decreased viral
titers compared to control antibody-treated and untreated
animals, with a clear trend for greater reductions with earlier
treatment (Figures 4D and 4F).
To further investigate MEDI8852's therapeutic potential, we
determined the therapeutic window for treating ferrets infected
with the highly pathogenic avian influenza virus A/Vietnam/1203/2004 H5N1 (VT/2004 H5N1). In these studies, ferrets
were infected intranasally with 1 LD90 of VT/2004 and then
treated with a single intravenous (i.v.) dose of 25 mg/kg of
MEDI8852 at 1, 2, or 3 days post infection. As a
comparator, we also used the anti-influenza drug oseltamivir at 25 mg/kg
twice a day (b.i.d.) (Figure 4G). As expected, all control animals
showed signs of infection, including fever peaking from days 1 to
3 post infection and 100% mortality by day 7 post infection (Figures 4G and 4H). In comparison, ferrets treated with MEDI8852
or oseltamivir on day 1 post infection were completely protected.
When treatment was delayed until 2 or 3 days post infection,
MEDI8852 provided complete protection with 100% survival
while oseltamivir only partially protected animals with survival
rates of 71% and 29%, respectively. In addition, MEDI8852
treatment resulted in a period of fever reduction following
administration, which was not observed in oseltamivir or control
antibody-treated animals (Figure 4H). Similar efficacy and therapeutic window results were seen when MEDI8852 and oseltamivir treatments were compared in a lethal murine model (Figures
S3B and S3C).
The Structures of Complexes Formed between
MEDI8852 Fab and H5, Group 1, and H7, Group 2, HAs
To provide insight into the structural basis for MEDI8852 breadth
and potency, we have determined the structures of the
MEDI8852 Fab fragment at 1.9 Å and of its complexes with H5
and H7 HA proteins at 3.7 Å and 3.75 Å resolution, respectively
(Figures 5A and S4A). The structures of the HA proteins in both
complexes are similar to those of the apo structures determined
before (Russell et al., 2004; Xiong et al., 2015). MEDI8852 makes
similar contacts with both H5 and H7 HAs, by binding in a very
similar orientation to both HAs (Figures 5B-5D), each Fab interacting with just one protomer of the HA trimer. Overall, the interactions bury 1,750 and 1,646 Å² from solvent for the H5 and H7
complexes, respectively, consistent with their high binding affinity (Table S2). MEDI8852 contacts the fusion domain of HA and
interacts with three regions of HA2, a central hydrophobic
groove, the fusion peptide and helix A, and with specific residues
in the HA1 component of the fusion domain (Figure 6). The location of MEDI8852 in the complex with the HA proteins is consistent with the in vitro assays showing that MEDI8852 stabilized
the pre-fusion conformation inhibiting fusion as well as blocking
the proteolytic cleavage of the HA precursor on the neighboring
subunit (Figure 3).
Although the structures of the complexes were determined at
intermediate resolution, the interfaces between the HAs and
MEDI8852 Fab are well-ordered and the electron density maps
in these areas are among the clearest of the overall complexes
[Figure 4 plots, panels A-H: survival (%) versus days post infection for CA/2009 H1, WSN/33 H1, rHK/68 H3, and A/VT/2004 H5N1; lung viral titers (log TCID50/g); ferret temperature (°C) versus days post infection. Groups: MEDI8852 at 3, 1, 0.3, 0.1, and 0.03 mg/kg (day 0) or 10 mg/kg (days 1-4); MEDI8852 at 25 mg/kg or oseltamivir at 25 mg/kg twice daily (days 1-3); control mAb; naive.]
Figure 4. MEDI8852 Provides Dose-Dependent Protection from Lethal Influenza Infection in Mice and Ferrets Even When Treatment Was Delayed
(unbiased omit electron density maps are shown in Figures S4B
and S4C). We can, therefore, have confidence in our description
of the inter-molecular contacts. However, given the limitations of defining the exact geometry of these interactions, hydrogen bonds described in
the text should be regarded as potential interactions. The principal contact areas involve three CDRs,
CDRH3, CDRH2, and CDRL1, with minor interactions with
CDRH1 and CDRL3 (Figure 6A). CDRH3 makes extensive contacts with the bottom of a hydrophobic groove between helix A
of HA2 and the fusion domain component of HA1 (Figures 6B
and 6C). Phe100A(CDRH3) inserts into this groove made by HA2
residues Ile45, Val48, Thr49, and Val52 of helix A, Trp21 of the
fusion peptide, and Thr309 of HA1. Val100C(CDRH3) binds in a
lower position in the same groove and interacts with the main
chain of HA2 residues 19-21 of the fusion peptide as well as
with the side chains of Trp21 and of HA1 His8 (Figure 6B).
Asn100D(CDRH3) also makes hydrophobic contact with Val18 in
(A) Kaplan-Meier survival curves.
(B) Lung viral titers on day 5 post infection determined by TCID50 assay after mice were treated with
MEDI8852 at 3, 1, 0.3, 0.1, and 0.03 mg/kg (single
i.p. dose) and then infected with CA/2009 H1.
(C) Kaplan-Meier survival curves.
(D) Lung viral titers on day 5 post infection in mice
infected with WSN/33 H1 virus, on study day 0,
then treated with MEDI8852 or irrelevant control
antibody, R347 (single i.p. dose) at 10 mg/kg at
various days post infection.
(E) Kaplan-Meier survival curves.
(F) Lung viral titers on day 5 post infection in mice
infected with rHK/68 H3 virus, on study day 0, then
treated with MEDI8852 at 10 mg/kg or R347 (single
i.p. dose) at various days post infection.
(G) Kaplan-Meier survival curves of ferrets infected with 1 LD90 of A/Vietnam/1203/2004 H5N1
virus on study day 0. Treatment with MEDI8852 at
25 mg/kg (closed symbols, solid line), oseltamivir
at 25 mg/kg (open symbols, dashed line), or R347
(open symbols, solid line) was initiated at the indicated day post infection.
(H) Temperature of ferrets treated with MEDI8852
or oseltamivir at various days post infection. Dotted
line designates the average normal temperature of a
ferret at 38.5°C. Error bars represent the SE of the
mean for each determination. *For murine studies,
significance was determined compared to control
antibody treatment with p < 0.005 for survival (log-rank test) and p < 0.05 for lung viral titers (Student's
t test); for ferret survival studies, significance was
determined by comparing to oseltamivir on the
indicated initiation day with p < 0.05 for survival (log-rank test).
See also Figure S3 and Table S3.
the fusion peptide (Figure 6C). The CDRH2 loop interacts, through VH germline-encoded residues, with HA2 residues
15-19 of the fusion peptide (Figure 6C). In particular, HA2 Val18
is almost completely protected from solvent by Tyr56(CDRH2) and
Arg52B(CDRH2). There is also a van der Waals interaction between
the Cα of HA2 Gly16 and Tyr52(CDRH2). Tyr52(CDRH2) is positioned
within hydrogen bonding distance of Gly16. Arg50(CDRH2) forms a
salt bridge with HA2 Asp19. Arg50(CDRH2) also contributes to a
polar patch on the antibody surface that includes Arg96(CDRL3)
and Asp58(CDRH2) and Asp100F(CDRH3). Finally, CDRL1 interacts
with the N-terminal region of helix A of HA2 (Figure 6D).
Ser30(CDRL1) and Ser31(CDRL1) are in hydrogen bonding distance
of HA2 Gln42, while Tyr32(CDRL1) and Leu29(CDRL1) make hydrophobic contacts with the aliphatic moiety of HA2 Lys38 of
helix A. The side chain of Tyr32(CDRL1) stacks against HA2
Gln42 of helix A and its hydroxyl group is in hydrogen bonding
distance of the main chain of HA2 Asp19 of the fusion peptide.
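The antibody residue labels used in this description, such as Phe100A(CDRH3) and Arg52B(CDRH2), carry an insertion letter after the residue number. The sketch below shows one way to parse such labels and order them along the chain, assuming a Kabat-style convention in which position 52 precedes 52B and 100A precedes 100C. The regular expression, class name, and helper functions are illustrative assumptions; only the labels themselves are taken from the text.

```python
import re
from typing import NamedTuple


class AntibodyResidue(NamedTuple):
    amino_acid: str   # three-letter code, e.g. "Phe"
    number: int       # residue number, e.g. 100
    insertion: str    # insertion letter, "" if absent, e.g. "A"
    region: str       # CDR label, e.g. "CDRH3"


# Pattern for labels of the form Phe100A(CDRH3); this format is inferred from the text.
_LABEL = re.compile(r"^([A-Z][a-z]{2})(\d+)([A-Z]?)\((CDR[HL][123])\)$")


def parse_label(label: str) -> AntibodyResidue:
    """Split a label such as 'Phe100A(CDRH3)' into its components."""
    match = _LABEL.match(label)
    if match is None:
        raise ValueError(f"unrecognized residue label: {label!r}")
    amino_acid, number, insertion, region = match.groups()
    return AntibodyResidue(amino_acid, int(number), insertion, region)


def chain_order(residue: AntibodyResidue):
    """Sort key: by number first, then insertion letter ('' sorts before 'A')."""
    return (residue.number, residue.insertion)


# Heavy-chain contact residues named in the text, in arbitrary order.
labels = [
    "Val100C(CDRH3)",
    "Tyr52(CDRH2)",
    "Phe100A(CDRH3)",
    "Arg52B(CDRH2)",
    "Asn100D(CDRH3)",
]
ordered = sorted((parse_label(label) for label in labels), key=chain_order)
ordered_names = [f"{r.amino_acid}{r.number}{r.insertion}" for r in ordered]
print(ordered_names)
```

Keeping the number and the insertion letter as separate fields avoids the misordering that plain string sorting would produce, for example placing "100A" before "52".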
The epitope recognized by MEDI8852 is highly conserved between both HA proteins, consistent with the antibody's broad
Figure 5. MEDI8852 Binds to a Unique Site within the H5 and H7 HA Proteins
(A) Overview of MEDI8852 in complex with H5
hemagglutinin. One HA protomer and the cognate
MEDI8852 Fab fragment are highlighted in color,
the other two copies in the trimer are colored gray.
The HA1 polypeptide is colored blue, the HA2
polypeptide is colored red, with the fusion peptide
at the N terminus of HA2 highlighted in yellow. The
heavy chain of the MEDI8852 Fab is colored orange, the light chain is colored green.
(B) Overlay of MEDI8852 bound to group 1 (H5) and
group 2 (H7) HAs. The antibodies are shown in
cartoon representation together with Helices A
and B of the HA. The components are colored
according to (A) and the view orientation is
approximately that shown by the black arrow in (A).
(C and D) H5 (C) and H7 (D) HAs are shown in
surface representation. Only the HA residues in the
MEDI8852 binding epitope that differ between H5
and H7 are labeled. The CDR loops of MEDI8852
that are in contact with HA are shown in cartoon
and stick representation and colored by element.
See also Figures S4, S7, and Table S4.
activity against group 1 and group 2 influenza viruses (Figures
5C, 5D, and S5). However, the membrane proximal fusion domains of both group 1 and group 2 HAs have a few distinct structural differences such as glycosylation status of HA1 position 38,
which could potentially affect the binding of some antibodies
(Corti et al., 2011; Ekiert et al., 2009; Sui et al., 2009). In the H7
complex with MEDI8852, the bulky carbohydrate chain attached
to HA1 Asn38 changes its orientation to allow antibody binding,
as observed before in the FI6 Fab-H3 HA complex (Corti et al.,
2011). Another notable feature of the MEDI8852 complex with
H7 HA is the involvement of HA2 Tyr38 in an aromatic stacking
interaction with Tyr32(CDRL1). Interestingly, the tyrosine at position 38 is not conserved across all HA subtypes; it is found only in H7,
H10, and H15. The arginine in H6, H9, H11, and H12 could
engage in a similar stacking interaction, although the leucine,
lysine, and glutamine found in the remaining HA proteins could
not interact in the same way. Thus, the high and similar affinities
with which MEDI8852 reacts with these HAs suggest that the
differences in glycosylation at HA1 38 and the side chain at
HA2 38 make a minor contribution to the overall energetics of
binding (Figure 2B; Table S2).
Conformational Rearrangements in MEDI8852 on
Complex Formation
The availability of a high-resolution structure of MEDI8852 Fab
and of well-ordered interfaces in the structures of the complexes
formed with H5 and H7 HAs enables us to analyze conformational changes in the Fab upon HA binding, particularly in the
CDRH3 and CDRL1 loops. The loop formed by residues 97-100F of CDRH3
undergoes a largely rigid-body rotation, pivoted around Gly96(CDRH3) and
Ala100G(CDRH3) (Figure 7A) to facilitate
interactions with HA. As a consequence,
the side chain of Phe100A(CDRH3) moves by 5 Å, to insert
into the hydrophobic groove of the epitope, near HA2 48. In addition, residues 27-32(CDRL1) are restructured, with an average
displacement of 10 Å between apo and bound forms. The reorientation of the side chains of Tyr32(CDRL1) and Leu29(CDRL1) enables them to interact with HA2 Tyr38 in the H7 complex.
Ser30(CDRL1) and Ser31(CDRL1) in the complex form a helical
structure that places them in hydrogen bonding distance with
Gln42 of HA2.
There are five mutations found in the in vitro optimization of
FY1 to MEDI8852 that are not in direct contact with HA but
are contained within CDR loops (Figures 6E and 6F). The
mutated residues are seen to stabilize the conformations that
the loop regions adopt in complex with HA while the parental residues (FY1) do not appear to be able to make similar stabilizing
interactions (Figure S6). Thus, the optimization process of FY1
to MEDI8852 results in the selection of amino acid substitutions
that stabilize the induced fit conformation that the CDR loops
adopt in complex with HA.
Comparison of the Epitope of MEDI8852 with Those of
Other Broadly Neutralizing Antibodies
We have compared the mode of interaction of MEDI8852 with
other broadly neutralizing antibodies that recognize the membrane proximal fusion domain of HA. The cross-group neutralizing antibodies 39.29, FI6v3, and CR9114 (Corti et al., 2011;
Dreyfus et al., 2012; Nakamura et al., 2013), as well as the group
1-neutralizing antibodies F10 and CR6261 (Ekiert et al., 2009; Sui
Figure 6. Binding Epitope of MEDI8852 on H5 HA

(A) HA is shown in surface representation and residues that are contacted by MEDI8852 are highlighted in color (blue for HA1, red for HA2 and yellow for fusion
peptide residues). Secondary structure elements of HA are shown in cartoon representation. The hydrophobic groove on HA is outlined in gray. The CDR loops of
MEDI8852 that are in contact with HA are shown in cartoon representation and colored orange and green for the heavy and light chains, respectively. The colored
boxes indicate the three parts of the binding epitope that are shown in more detail in (B), (C), and (D).
(B) Interactions of MEDI8852 with the hydrophobic groove of H5 HA. HA is drawn in surface representation, with the main chain shown in cartoon representation
and amino acids that are in contact with MEDI8852 shown in stick representation. W21, which adopts different rotamers in group 1 and group 2 influenza viruses,
is colored magenta. MEDI8852 is also shown in cartoon representation, with contact residues shown in stick representation. Hydrogen bonds and salt bridges are
indicated by dashed lines.
(C) Interactions of MEDI8852 with the fusion peptide of H5 HA. Shown in the same style as in (B).
(D) Interactions of MEDI8852 with the base of helix A of H5 HA. Shown in the same style as in (B).
et al., 2009), all recognize helix A of HA2 and the adjacent hydrophobic groove (Figure 7B, blue box). In contrast, the group
2-neutralizing antibodies CR8020 and CR8043 (Ekiert et al.,
2011; Friesen et al., 2014) recognize a different region of the
fusion peptide and a small β sheet below it in the fusion domain
(Figure 7B, red box). The epitope of MEDI8852, uniquely among
the cross-group neutralizing antibodies reported to date, represents a combination of both regions. A structural comparison of
HA-bound MEDI8852 overlapped with the antibodies CR8020
and CR9114 is shown in Figures S7A and S7B, which reveals
that the overall orientation of MEDI8852 is such that it sits slightly
higher on the HA than the CR8020 antibody and lower than
CR9114. The nearest paratope residue of MEDI8852 is 20 Å
from the membrane proximal end of the HA.
MEDI8852 and 39.29 antibodies bind similarly to residues in
the hydrophobic groove and adjacent helix A in the fusion
domain. Both antibodies contain the four amino acid sequence
Val-Phe-Gly-Val/Ile in their otherwise dissimilar CDRH3 loops.
These tetrapeptides superpose in the complexes with an all-atom root-mean-square deviation (RMSD) of 0.7 Å, indicating
that they interact with their cognate HAs in a similar way (Figure 7C). However, other contacts made by these antibodies
with HA are quite different between MEDI8852 and 39.29, reflecting the fact that the two antibodies are not particularly similar
in sequence and are derived from different germline sequences
(VH6-1*01 and VK1-39*01 for MEDI8852 versus VH3-30*01
and VK3-15*01 for 39.29). MEDI8852 contacts the base of helix
A and the fusion peptide with its CDRL1 and CDRH2 loops,
respectively (Figure 6). By contrast, contacts made by the heavy
chain of 39.29 Fab mainly involve CDRH3. 39.29 appears to bury
helix A using all three light chain CDR loops. As a consequence,
the 39.29-HA interaction buries a total of 2,287 Å², while
MEDI8852 achieves similar affinity with a smaller buried surface
of 1,646 Å².
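The all-atom RMSD quoted for the shared tetrapeptides is the square root of the mean squared distance between equivalent atoms. A minimal sketch of that calculation is given below, assuming the two coordinate sets list the same atoms in the same order and have already been superposed; the superposition step itself is not shown. The function name and the toy coordinates are hypothetical and are not taken from the deposited structures.

```python
import math


def all_atom_rmsd(coords_a, coords_b):
    """RMSD (in the units of the input coordinates) between two matched atom lists.

    Each argument is a sequence of (x, y, z) tuples. Atoms must be listed in the
    same order in both sets, and the sets are assumed to be superposed already.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate sets must contain the same number of atoms")
    if not coords_a:
        raise ValueError("coordinate sets must not be empty")
    squared_sum = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared_sum += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared_sum / len(coords_a))


# Toy example: every atom of the second set is displaced by (0.3, 0.4, 0.0),
# that is, by a distance of 0.5 in coordinate units.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
displaced = [(x + 0.3, y + 0.4, z) for x, y, z in reference]
toy_rmsd = all_atom_rmsd(reference, displaced)
print(f"{toy_rmsd:.2f}")
```

Because every atom in the toy example moves by the same distance, the RMSD equals that distance; with real structures the value is dominated by the atoms that deviate most.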
MEDI8852 is the second cross-group neutralizing antibody for
which structures of complexes with both group 1 and group 2
HAs have been reported. The first was FI6v3, which, although
it recognizes both group 1 and group 2 HAs, has higher in vivo
neutralizing activity against group 1 viruses (Corti et al., 2011).
The cross-group binding of FI6v3 has been attributed to its
long and flexible HCDR3, which can accommodate the differences in conformation and environment of HA2 Trp21 observed
between group 1 and group 2 viruses. There is a much more significant rearrangement between FI6v3 bound to H1 HA (group 1)
versus H3 HA (group 2) than there is between MEDI8852 in its
complexes with H5 HA (group 1) and H7 HA (group 2) which overlap very closely (Figures 5B and S7C). MEDI8852, therefore,
binds in a very similar way to HAs of both groups. This greater
structural conservation of the binding interface is likely responsible for its broader neutralizing ability.
DISCUSSION
There is an unmet medical need for effective treatments against
severe influenza. The potential of broadly infectivity-neutralizing
antibodies used therapeutically to address this need has provided a stimulus for their isolation and characterization. Among
the antibodies considered to date, the anti-HA human monoclonal antibody MEDI8852 has demonstrated significant breadth
of its infectivity-neutralizing capacity. MEDI8852 reacts with HAs
of all influenza antigenic subtypes, potently neutralizes diverse
virus strains with numerous HA subtypes, and can block infection and lethality caused by influenza viruses when administered
up to 4 days after challenge with the virus in mice and up to
3 days post challenge in ferrets with the highly pathogenic
H5N1 virus. This potential ability to overcome the unpredictable
characteristics of influenza, namely antigenic shift, which results in disease during pandemic periods, and antigenic drift,
which occurs with the emergence of antigenically novel viruses, is
a major advantage for a candidate anti-influenza therapeutic
antibody.
The mechanisms of MEDI8852-mediated neutralization of
infection involve processes at the beginning and the end of the
infection cycle. Binding of the antibody to HAs on the infecting
virus inhibits HA-mediated membrane fusion that is required
for the initiation of infection. At the end of infection, antibody
binding to precursor HA0 can block its cleavage and prevent
the formation and spread of newly made infectious virus. Additionally, binding of MEDI8852 to HAs displayed on the surfaces
of infected cells results in their recognition and lysis by other
components of the immune system: NK cells, macrophages,
and complement. These multiple mechanisms exhibited by
MEDI8852 presumably combine to ensure the observed effectiveness of antibody treatments in infected mice and ferrets.
The epitope recognized by MEDI8852 is consistent with the
ways it blocks HA function in membrane fusion and with the
locations of epitopes that have been described previously for
influenza group-specific cross-reactive antibodies and for
more broadly reactive antibodies. However, the regions of HA
that interact with MEDI8852 are a combination of those previously assigned to group 1 specificity (primarily a hydrophobic
groove and the adjacent helix A of HA) (Ekiert et al., 2009; Sui
et al., 2009) or group 2 specificity (a separate part of the fusion
peptide, near its N terminus) (Ekiert et al., 2011; Friesen et al.,
2014). The structural characterization of the MEDI8852 bound and
unbound structures also highlights the coordinated movement
of the CDRH3 and the CDRL1 to insert into the hydrophobic
groove of the HA, as well as the rearrangement of the orientation
of the glycan attached to Asn38 of the H7 virus to allow antibody
binding. Importantly, structures of the complexes formed by
MEDI8852 with H5 and H7 HAs indicate that the locations and
(E) Location of mutations found during affinity maturation of FY1 to MEDI8852. The variable domains of MEDI8852 are shown in cartoon representation, viewed
from the direction of HA. Regions of the heavy and light chains in contact with HA are colored orange and green, respectively. Interacting sidechains are shown in
stick representation. Residues that differ between the parental and affinity-matured antibody are shown in sphere representation.
(F) Sequences of MEDI8852 variable region framework and CDR residues. CDRs (according to Kabat) are highlighted in orange and green for the heavy and light
chains respectively. Residues in contact with HA are colored red and residues changed during affinity maturation from FY1 are colored cyan, with corresponding
residues of FY1 indicated.
See also Figure S6.
Figure 7. MEDI8852 Binds to a Unique Site within the H5 and H7 HA Proteins through CDR-H3 and CDR-L1 Conformational Rearrangements
upon Complex Formation
(A) Conformational rearrangements in MEDI8852 on complex formation. Conformational change of the CDRH3 and CDR-L1 loops upon HA engagement. The apo
structure of MEDI8852 is shown in blue, the bound structure is shown in orange and green for the heavy and light chains, respectively. The beginning and end of
the moving regions are indicated with black ovals. HA (H7) is shown as a gray surface. The apo structure does not make interactions with HA and does not fit into
its surface features; the conformational change is necessary for productive HA engagement.
(B) Epitopes of different broadly neutralizing antibodies on the HA surface. Residues of HA that are in contact with the heavy chain are colored orange, residues
that are in contact with the light chain are colored green, and residues that are in contact with both chains are colored yellow. The blue box encases the part of the
MEDI8852 epitope (helix A and hydrophobic groove) that can also be found in other broadly neutralizing antibodies as well as group 1 specific ones. The red box
encases the part of the MEDI8852 epitope that can also be found in group 2 specific antibodies (middle of fusion peptide).
(C) Comparison of the structures of the conserved CDRH3 tetra-peptide in the complexes between MEDI8852 and H7 HA (left panel) and 39.29 and H3 HA (right
panel). In both cases the tetra-peptide is shown in stick representation with other loops of the antibody shown as coil, colored as in panel A. The HAs are shown in
surface representation.
See also Figure S7.
orientations of the bound antibodies are very similar, and this
structurally conserved ability to interact with both regions of
HA presumably results in effective cross-reactivity.
The comparison of the structures of the complexes formed by
MEDI8852 with H5 and H7 HAs with the previously reported
complex formed between the cross-reactive monoclonal antibody 39.29 and H3 HA (Nakamura et al., 2013) indicated that
the two antibodies had the amino acid sequences V-F-G-V (MEDI8852) and V-F-G-I (39.29) in their HCDR3 loops, occupying equivalent positions in the complexes. Conceivably, the
structure of this shared component of the antibodies might be
used in the preparation of immunogens or to select candidate
molecules on the basis of their affinity for the tetra-peptides.
Of note, a recent paper described the development of a computationally designed protein binding to Helix A in the stem region
of group 1 HAs that showed in vitro and in vivo antiviral efficacy
(Koday et al., 2016).
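The tetra-peptide comparison described above can be illustrated with a minimal Python sketch that compares the two HCDR3 motifs position by position. Only the two sequences come from the text; the function name and the one-letter string encoding are assumptions made for illustration, and the sketch says nothing about structural equivalence, which was established crystallographically.

```python
def compare_peptides(a: str, b: str):
    """Return (identity fraction, mismatches) for two equal-length peptides.

    Each mismatch is reported as (1-based position, residue in a, residue in b).
    """
    if len(a) != len(b):
        raise ValueError("peptides must have equal length for positional comparison")
    mismatches = [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]
    identity = 1 - len(mismatches) / len(a)
    return identity, mismatches


# HCDR3 tetra-peptides quoted in the text (one-letter amino acid codes).
medi8852_motif = "VFGV"
ab_39_29_motif = "VFGI"

identity, mismatches = compare_peptides(medi8852_motif, ab_39_29_motif)
print(f"identity: {identity:.2f}, mismatches: {mismatches}")
```

The two motifs differ only at the fourth position (Val versus Ile), which is consistent with the suggestion that the shared component might serve as a selection handle.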
The reconstruction of the developmental pathway of MEDI8852,
as well as of FI6, suggests that the generation of such broadly
reactive antibodies may require a stepwise stimulation by group
1 HAs, followed by the selection of mutated variants by group 2
HAs. Indeed, the MEDI8852 donor was born in the 1950s
and it is possible that this lineage was primed by H2N2 and
further matured through multiple H3N2 exposures. This hypothesis is further strengthened by the observation that the UCA
mAb neutralized with high potency the strain H2N2 JP/57 (i.e.,
IC50 = 0.6 µg/ml). The UCA and mutated antibodies of the
MEDI8852 and FI6 lineages represent useful tools to design
stem-based immunogens that can be used in a heterologous
prime-boost mode to prime the group 1 reactive naive B
cells and selectively expand those that also cross-react with
group 2 HAs.
Based on the results reported, MEDI8852 is currently being
evaluated for safety and efficacy in adults with uncomplicated
influenza infection in an outpatient setting (https://clinicaltrials.gov: NCT02603952) prior to conducting studies in patients hospitalized with influenza caused by type A strains.
In Vitro Fusion and HA Cleavage Assays

Antibody-mediated fusion inhibition was tested using a low pH-induced red
blood cell fusion model adapted from the protocol described in Wang et al.
(2010). The ability of MEDI8852 to inhibit the low pH-activated conformational
change in trypsin-digested H5 HA was analyzed by SDS-PAGE. The ability of
the antibody to block HA0 cleavage by TPCK-treated trypsin was measured by
western blot analysis (Supplemental Experimental Procedures).
Measurement of Fc-Effector Function
ADCC activity was measured with the LDH release assay using primary human
NK cells as effector cells and H1N1 or H3N2 influenza virus infected A549 cells
as a target. ADCP activity was measured by flow cytometry using fluorescently
labeled monocyte-derived macrophages and H1 or H3 HA-expressing MDCK
as target cells. CDC activity was measured with the LDH release assay using
rabbit complement on influenza H1N1-infected MDCK cells (Supplemental
Experimental Procedures).
Therapeutic Efficacy Studies in Mice and Ferrets
All animal studies were approved and conducted in accordance with
MedImmune's Institutional Animal Care and Use Committee (murine studies)
and Southern Research Institute's Institutional Animal Care and Use Committee (ferret studies) and performed in Association for the Assessment and
Accreditation of Laboratory Animal Care (AAALAC)-certified facilities.
MEDI8852 or R347 control mAb was administered as a single intraperitoneal
(i.p.) dose at various days post infection, depending on the virus strain. For
oseltamivir comparison studies, mice were administered 25 mg/kg oseltamivir
by mouth (PO) b.i.d. for 5 days, or a single 10 mg/kg dose i.v. of MEDI8852 with
vehicle PO b.i.d. for 5 days. Viral loads in the lungs were measured by TCID50
assay on day 5 post infection. Five- to six-month-old ferrets were challenged
intranasally with A/Vietnam/1203/04 (H5N1) virus and treated with a single
25 mg/kg i.v. dose of MEDI8852 (or R347 control) or oseltamivir at 25 mg/kg
b.i.d. for 5 days, initiated at different days post infection. A Bio-metric data systems chip was used for temperature monitoring (Supplemental Experimental
Procedures).
HA-MEDI8852 Complex Preparation, Crystallization, and Structure
Determination
H5 and H7 HAs were purified from the virus membrane and mixed with purified
MEDI8852 antibody Fab fragments and incubated overnight at 4°C for complex formation. Complexes were further purified by size-exclusion chromatography and concentrated for crystallization. Crystals were frozen by direct
immersion in liquid nitrogen and diffraction datasets were collected at 100 K
at the I02 and I04 beamlines at the Diamond Light Source (Harwell). Structures
were solved by molecular replacement and refined using standard protocols. Macromolecular structures have been deposited under the accession
numbers PDB: 5JW5 (apo MEDI8852), PDB: 5JW4 (H5 complex), and PDB:
5JW3 (H7 complex). Crystallographic statistics are summarized in Table S4
(Supplemental Experimental Procedures).
Monoclonal Antibody Isolation and Ex Vivo Affinity Maturation

Monoclonal antibodies were isolated from memory B cells, as previously
described (Pappas et al., 2014; Traggiai et al., 2004) from blood donors who
had given written informed consent, following approval by the Cantonal Ethical
Committee of Cantone Ticino, Switzerland. FY1 antibody was further modified
to revert the non-germline framework amino acid changes and its affinity was
improved through parsimonious mutagenesis of CDRs (Supplemental Experimental Procedures).
Recombinant HA Protein and Binding Assays
Recombinant HA proteins were expressed and purified as previously
described (Benjamin et al., 2014). The binding of antibodies to HAs was
measured by ELISA or by staining HA transfected cells using flow cytometry
(Supplemental Experimental Procedures).
Viruses and Microneutralization Assay
Wild-type influenza strains and cold-adapted (ca) live-attenuated influenza
vaccine viruses (complete viral strain designations shown in Table S1) were
propagated in embryonated chicken eggs, titered, and used to infect MDCK
cells to determine neutralizing activity as described in the Supplemental
Experimental Procedures.
ACCESSION NUMBERS
The accession numbers for the coordinates and structure factors reported in
this paper are PDB: 5JW5 (apo MEDI8852), 5JW4 (H5 complex), and 5JW3
(H7 complex). The accession numbers for the sequences of all of the antibodies reported in this paper are GenBank: KX398429–KX398468.
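For readers who want to retrieve the deposited antibody sequences individually, a minimal Python sketch is given below that expands an accession range into its member identifiers. The helper name `expand_accession_range` is hypothetical, and the sketch assumes the range is contiguous with a shared alphabetic prefix and fixed-width numeric suffix; it does not query GenBank.

```python
import re


def expand_accession_range(first: str, last: str) -> list[str]:
    """Expand a contiguous accession range such as KX398429..KX398468."""
    pattern = r"([A-Z]+)(\d+)"
    m_first = re.fullmatch(pattern, first)
    m_last = re.fullmatch(pattern, last)
    if not m_first or not m_last:
        raise ValueError("accessions must be letters followed by digits")
    if m_first.group(1) != m_last.group(1):
        raise ValueError("accessions must share the same prefix")
    prefix = m_first.group(1)
    width = len(m_first.group(2))  # preserve zero padding, if any
    start, end = int(m_first.group(2)), int(m_last.group(2))
    if end < start:
        raise ValueError("range end precedes range start")
    return [f"{prefix}{n:0{width}d}" for n in range(start, end + 1)]


accessions = expand_accession_range("KX398429", "KX398468")
print(len(accessions), accessions[0], accessions[-1])
```

Under the contiguity assumption, the range quoted above covers 40 identifiers.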

AUTHOR CONTRIBUTIONS
B.F.-R. carried out PCR of immunoglobulin sequences from B cells. G.A.
produced and purified antibodies. M.F. analyzed immunoglobulin genetic
elements, produced figures, and carried out bioinformatic analysis. D.P. and
C.S. carried out donor selection and screenings for the identification of
cross-reactive antibodies. F.V. and A.F. cloned HAs and tested
antibodies for binding by cytofluorimetry. S.B. performed biochemical and
cellular assays to test the fusion and HA maturation inhibiting activities of isolated antibodies. B.G. and A.D.M. carried out ADCC and CDC studies. F.S.
wrote the paper. A.L. and D.C. directed the B cell isolation studies, analyzed
data, and wrote the paper. J.M.M. and E.B. carried out in vivo studies. L.W.-R.
carried out ADCP studies. F.J.P.-H. carried out the ELISA binding studies.
N.L.K., Q.Z., L.W.-R., and F.J.P.-H. carried out the neutralization studies.
A.Q.Y. led the antibody optimization. J.A.S. edited the paper and provided
supervision. N.L.K. and Q.Z. directed antibody optimization, in vitro and in vivo
characterization, analyzed the data, and wrote the paper. P.J.C., U.N.,
P.A.W., M.K.V., R.W.O., S.R.M., S.J.G., and J.J.S. designed and performed
structural research, contributed new reagents and analytical tools, analyzed
data, and wrote the paper.
CONFLICTS OF INTEREST
A.L. is the scientific founder of Humabs BioMed SA. A.L. holds shares in Humabs BioMed. B.G., A.D.M., G.A., F.V., and D.C. are employees of Humabs
BioMed. This work was funded by MedImmune, LLC, a wholly owned subsidiary of AstraZeneca Pharmaceuticals. N.L.K., J.M.M., E.B., L.W.-R., F.J.P.-H.,
A.Q.Y., J.A.S., and Q.Z. were employed by MedImmune, LLC when the work was
executed and may currently hold AstraZeneca stock or stock options.
ACKNOWLEDGMENTS
We thank Jose Martinez and the MedImmune Laboratory Animal Resource staff for in vivo study assistance; Robert Woods for additional affinity characterization; and Sandrina Phipps, Arnita Barnes, and Kannaki Senthil for generating HA reagents. This work was supported by the European Research
Council (grant 670955 BROADimmune) and the Swiss National Science Foundation (grant 160279). A.L. is supported by the Helmut Horten Foundation. We
thank the staff at the Diamond Light Source Synchrotron for assistance and
beam-line access under Diamond Light Source Proposal 9826. This work
was funded by the Francis Crick Institute, London. U.N. was also funded by
a Marie Curie Actions Intra-European Fellowship (grant 629829).
Received: February 22, 2016
Published: July 21, 2016
REFERENCES
Benjamin, E., Wang, W., McAuliffe, J.M., Palmer-Hill, F.J., Kallewaard, N.L., Chen, Z., Suzich, J.A., Blair, W.S., Jin, H., and Zhu, Q. (2014). A broadly neutralizing human monoclonal antibody directed against a novel conserved epitope on the influenza virus H3 hemagglutinin globular head. J. Virol. 88, 6743–6750.
Corti, D., and Lanzavecchia, A. (2013). Broadly neutralizing antiviral antibodies. Annu. Rev. Immunol. 31, 705–742.
Corti, D., Suguitan, A.L., Jr., Pinna, D., Silacci, C., Fernandez-Rodriguez, B.M., Vanzetta, F., Santos, C., Luke, C.J., Torres-Velez, F.J., Temperton, N.J., et al. (2010). Heterosubtypic neutralizing antibodies are produced by individuals immunized with a seasonal influenza vaccine. J. Clin. Invest. 120, 1663–1673.
DiLillo, D.J., Tan, G.S., Palese, P., and Ravetch, J.V. (2014). Broadly neutralizing hemagglutinin stalk-specific antibodies require FcγR interactions for protection against influenza virus in vivo. Nat. Med. 20, 143–151.
DiLillo, D.J., Palese, P., Wilson, P.C., and Ravetch, J.V. (2016). Broadly neutralizing anti-influenza antibodies require Fc receptor engagement for in vivo protection. J. Clin. Invest. 126, 605–610.
Dreyfus, C., Laursen, N.S., Kwaks, T., Zuijdgeest, D., Khayat, R., Ekiert, D.C., Lee, J.H., Metlagel, Z., Bujny, M.V., Jongeneelen, M., et al. (2012). Highly conserved protective epitopes on influenza B viruses. Science 337, 1343–1348.
Dunand, C.J.H., Leon, P.E., Kaur, K., Tan, G.S., Zheng, N.-Y., Andrews, S., Huang, M., Qu, X., Huang, Y., Salgado-Ferrer, M., et al. (2015). Preexisting human antibodies neutralize recently emerged H7N9 influenza strains. J. Clin. Invest. 125, 1255–1268.
Ekiert, D.C., Bhabha, G., Elsliger, M.-A., Friesen, R.H.E., Jongeneelen, M., Throsby, M., Goudsmit, J., and Wilson, I.A. (2009). Antibody recognition of a highly conserved influenza virus epitope. Science 324, 246–251.
Ekiert, D.C., Friesen, R.H.E., Bhabha, G., Kwaks, T., Jongeneelen, M., Yu, W., Ophorst, C., Cox, F., Korse, H.J.W.M., Brandenburg, B., et al. (2011). A highly conserved neutralizing epitope on group 2 influenza A viruses. Science 333, 843–850.
Fleury, D., Barrère, B., Bizebard, T., Daniels, R.S., Skehel, J.J., and Knossow, M. (1999). A complex of influenza hemagglutinin with a neutralizing antibody that binds outside the virus receptor binding site. Nat. Struct. Biol. 6, 530–534.
Friesen, R.H.E., Lee, P.S., Stoop, E.J.M., Hoffman, R.M.B., Ekiert, D.C., Bhabha, G., Yu, W., Juraszek, J., Koudstaal, W., Jongeneelen, M., et al. (2014). A common solution to group 2 influenza virus neutralization. Proc.
Knossow, M., and Skehel, J.J. (2006). Variation and infectivity neutralization in influenza. Immunology 119, 1–7.
Koday, M.T., Nelson, J., Chevalier, A., Koday, M., Kalinoski, H., Stewart, L., Carter, L., Nieusma, T., Lee, P.S., Ward, A.B., et al. (2016). A Computationally Designed Hemagglutinin Stem-Binding Protein Provides In Vivo Protection from Influenza Independent of a Host Immune Response. PLoS Pathog. 12, e1005409.
Nakamura, G., Chai, N., Park, S., Chiang, N., Lin, Z., Chiu, H., Fong, R., Yan, D., Kim, J., Zhang, J., et al. (2013). An in vivo human-plasmablast enrichment technique allows rapid identification of therapeutic influenza A antibodies. Cell Host Microbe 14, 93–103.
Nobusawa, E., Aoyama, T., Kato, H., Suzuki, Y., Tateno, Y., and Nakajima, K. (1991). Comparison of complete amino acid sequences and receptor-binding properties among 13 serotypes of hemagglutinins of influenza A viruses. Virology 182, 475–485.
Pappas, L., Foglierini, M., Piccoli, L., Kallewaard, N.L., Turrini, F., Silacci, C., Fernandez-Rodriguez, B., Agatic, G., Giacchetto-Sasselli, I., Pellicciotta, G., et al. (2014). Rapid development of broadly influenza neutralizing antibodies through redundant mutations. Nature 516, 418–422.
Russell, R.J., Gamblin, S.J., Haire, L.F., Stevens, D.J., Xiao, B., Ha, Y., and Skehel, J.J. (2004). H1 and H7 influenza haemagglutinin structures extend a structural classification of haemagglutinin subtypes. Virology 325, 287–296.
Russell, C.A., Jones, T.C., Barr, I.G., Cox, N.J., Garten, R.J., Gregory, V., Gust, I.D., Hampson, A.W., Hay, A.J., Hurt, A.C., et al. (2008). The global circulation of seasonal influenza A (H3N2) viruses. Science 320, 340–346.
Schmidt, A.G., Xu, H., Khan, A.R., O'Donnell, T., Khurana, S., King, L.R., Manischewitz, J., Golding, H., Suphaphiphat, P., Carfi, A., et al. (2013). Preconfiguration of the antigen-binding site during affinity maturation of a broadly neutralizing influenza virus antibody. Proc. Natl. Acad. Sci. USA 110, 264–269.
Skehel, J.J., and Wiley, D.C. (2000). Receptor binding and membrane fusion in virus entry: the influenza hemagglutinin. Annu. Rev. Biochem. 69, 531–569.
Sui, J., Hwang, W.C., Perez, S., Wei, G., Aird, D., Chen, L.-M., Santelli, E., Stec, B., Cadwell, G., Ali, M., et al. (2009). Structural and functional bases for broad-spectrum neutralization of avian and human influenza A viruses. Nat. Struct. Mol. Biol. 16, 265–273.
Tan, G.S., Lee, P.S., Hoffman, R.M.B., Mazel-Sanchez, B., Krammer, F., Leon, P.E., Ward, A.B., Wilson, I.A., and Palese, P. (2014). Characterization of a broadly neutralizing monoclonal antibody that targets the fusion domain of group 2 influenza A virus hemagglutinin. J. Virol. 88, 13580–13592.
Throsby, M., van den Brink, E., Jongeneelen, M., Poon, L.L.M., Alard, P., Cornelissen, L., Bakker, A., Cox, F., van Deventer, E., Guan, Y., et al. (2008). Heterosubtypic neutralizing monoclonal antibodies cross-protective against H5N1 and H1N1 recovered from human IgM+ memory B cells. PLoS ONE 3, e3942.
Tong, S., Li, Y., Rivailler, P., Conrardy, C., Castillo, D.A.A., Chen, L.-M., Recuenco, S., Ellison, J.A., Davis, C.T., York, I.A., et al. (2012). A distinct lineage of influenza A virus from bats. Proc. Natl. Acad. Sci. USA 109, 4269–4274.
Tong, S., Zhu, X., Li, Y., Shi, M., Zhang, J., Bourgeois, M., Yang, H., Chen, X., Recuenco, S., Gomez, J., et al. (2013). New world bats harbor diverse influenza A viruses. PLoS Pathog. 9, e1003657.
Traggiai, E., Becker, S., Subbarao, K., Kolesnikova, L., Uematsu, Y., Gismondo, M.R., Murphy, B.R., Rappuoli, R., and Lanzavecchia, A. (2004). An efficient method to make human monoclonal antibodies from memory B cells: potent neutralization of SARS coronavirus. Nat. Med. 10, 871–875.
Wang, T.T., Tan, G.S., Hai, R., Pica, N., Petersen, E., Moran, T.M., and Palese, P. (2010). Broadly protective monoclonal antibodies against H3 influenza viruses following sequential immunization with different hemagglutinins. PLoS Pathog. 6, e1000796.
Wrammert, J., Koutsonanos, D., Li, G.-M., Edupuganti, S., Sui, J., Morrissey, M., McCausland, M., Skountzou, I., Hornig, M., Lipkin, W.I., et al. (2011). Broadly cross-reactive antibodies dominate the human B cell response against 2009 pandemic H1N1 influenza virus infection. J. Exp. Med. 208, 181–193.
Wright, P., Neumann, G., and Kawaoka, Y. (2007). Orthomyxoviruses (Fields Virology).
Wu, Y., Cho, M., Shore, D., Song, M., Choi, J., Jiang, T., Deng, Y.-Q., Bourgeois, M., Almli, L., Yang, H., et al. (2015). A potent broad-spectrum protective human monoclonal antibody crosslinking two haemagglutinin monomers of influenza A virus. Nat. Commun. 6, 7708.
Xiong, X., Corti, D., Liu, J., Pinna, D., Foglierini, M., Calder, L.J., Martin, S.R., Lin, Y.P., Walker, P.A., Collins, P.J., et al. (2015). Structures of complexes formed by H5 influenza hemagglutinin with a potent broadly neutralizing human monoclonal antibody. Proc. Natl. Acad. Sci. USA 112, 9430–9435.
Yewdell, J.W. (2013). To dream the impossible dream: universal influenza vaccination. Curr. Opin. Virol. 3, 316–321.
Article
Vaccine-Induced Antibodies that Neutralize Group 1
and Group 2 Influenza A Viruses
Authors
M. Gordon Joyce, Adam K. Wheatley,
Paul V. Thomas, ..., Peter D. Kwong,
John R. Mascola, Adrian B. McDermott
Correspondence
pdkwong@nih.gov (P.D.K.),
jmascola@nih.gov (J.R.M.),
adrian.mcdermott@nih.gov (A.B.M.)
In Brief
Quantifying B cells capable of producing
broadly neutralizing antibodies against
influenza serves as a metric to guide the
development of a universal influenza
vaccine.
Highlights
• Isolation of group 1 and group 2 influenza A-neutralizing antibodies from H5N1 vaccinees
• Discovery of three classes of broadly neutralizing antibodies directed to the HA stem
• Delineation of sequence signatures specific for broadly neutralizing antibodies
• Antibody quantification by NGS to guide the development of a universal vaccine
Joyce et al., 2016, Cell 166, 609–623
July 28, 2016 Published by Elsevier Inc.
Accession Numbers
5K9J
5K9K
5K9O
5K9Q
5KAN
5KAQ
KX386124–KX387227
Article
Vaccine-Induced Antibodies that Neutralize
Group 1 and Group 2 Influenza A Viruses
M. Gordon Joyce,1,8 Adam K. Wheatley,1,8 Paul V. Thomas,1,8 Gwo-Yu Chuang,1,8 Cinque Soto,1,8 Robert T. Bailer,1
Aliaksandr Druz,1 Ivelin S. Georgiev,1,2,3 Rebecca A. Gillespie,1 Masaru Kanekiyo,1 Wing-Pui Kong,1 Kwanyee Leung,1
Sandeep N. Narpala,1 Madhu S. Prabhakaran,1 Eun Sung Yang,1 Baoshan Zhang,1 Yi Zhang,1 Mangaiarkarasi Asokan,1
Jeffrey C. Boyington,1 Tatsiana Bylund,1 Sam Darko,1 Christopher R. Lees,1 Amy Ransier,1 Chen-Hsiang Shen,1
Lingshu Wang,1 James R. Whittle,1 Xueling Wu,1 Hadi M. Yassine,1 Celia Santos,1,4 Yumiko Matsuoka,4
Yaroslav Tsybovsky,5 Ulrich Baxa,5 NISC Comparative Sequencing Program,6 James C. Mullikin,6 Kanta Subbarao,4
Daniel C. Douek,1 Barney S. Graham,1 Richard A. Koup,1 Julie E. Ledgerwood,1 Mario Roederer,1 Lawrence Shapiro,1,7
Peter D. Kwong,1,* John R. Mascola,1,* and Adrian B. McDermott1,*
1Vaccine Research Center, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
2Department of Pathology, Microbiology, and Immunology and Vanderbilt Vaccine Center, Vanderbilt University Medical Center, Nashville, TN 37232, USA
3Department of Electrical Engineering and Computer Science, Vanderbilt University, Nashville, TN 37232, USA
4Laboratory of Infectious Diseases, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, MD 20892, USA
5Electron Microscopy Laboratory, Cancer Research Technology Program, Leidos Biomedical Research, Frederick National Laboratory for Cancer Research, Frederick, MD 21702, USA
6NIH Intramural Sequencing Center (NISC), National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20892, USA
7Department of Biochemistry & Molecular Biophysics and Department of Systems Biology, Columbia University, New York, NY 10027, USA
8Co-first author
*Correspondence: pdkwong@nih.gov (P.D.K.), jmascola@nih.gov (J.R.M.), adrian.mcdermott@nih.gov (A.B.M.)
SUMMARY
Antibodies capable of neutralizing divergent influenza A viruses could form the basis of a universal vaccine. Here, from subjects enrolled in an H5N1 DNA/
MIV-prime-boost influenza vaccine trial, we sorted
hemagglutinin cross-reactive memory B cells and
identified three antibody classes, each capable of
neutralizing diverse subtypes of group 1 and group
2 influenza A viruses. Co-crystal structures with hemagglutinin revealed that each class utilized characteristic germline genes and convergent sequence motifs
to recognize overlapping epitopes in the hemagglutinin stem. All six analyzed subjects had sequences
from at least one multidonor class, and, in half the
subjects, multidonor-class sequences were recovered from >40% of cross-reactive B cells. By contrast,
these multidonor-class sequences were rare in published antibody datasets. Vaccination with a divergent hemagglutinin can thus increase the frequency
of B cells encoding broad influenza A-neutralizing antibodies. We propose the sequence signature-quantified prevalence of these B cells as a metric to guide
universal influenza A immunization strategies.
INTRODUCTION
Influenza A viruses can be categorized into two phylogenetic
groups (group 1 and group 2), each containing diverse subtypes
(Figure 1A). Currently, group 1 influenza viruses from the H1 subtype (1918 and 2009 H1N1 pandemics), and the group 2 H3 subtype (1968 H3N2 pandemic), co-circulate and cause seasonal infections in over 10% of the human population each year. Other
subtypes have emerged or threaten to re-emerge including the
group 1 H2 subtype, endemic in humans from 1957–1968, the
group 1 H5 subtype, which includes lethal avian strains (Subbarao et al., 1998), and the group 1 H6 and H9 and the group 2
H7 and H10 subtypes, which have been associated with human
infections and fatalities in recent years (Butt et al., 2005; Morens
et al., 2013). Frequent zoonotic cross-overs that may cause pandemics of unpredictable frequency and severity highlight the
need for a universal influenza vaccine that is capable of eliciting
protection against divergent influenza A viruses.
Potential approaches to a universal influenza vaccine involve
the elicitation of neutralizing antibodies that recognize the
influenza hemagglutinin (HA) from multiple subtypes. One means
to accomplish this involves ontogeny-based strategies, which
seek to identify antibodies of reproducible classes and to induce
similar antibodies by vaccination (Jardine et al., 2015; Lingwood
et al., 2012; Pappas et al., 2014). We consider antibodies to be of
the same class when they recognize the same region, employ the
same structural mode of recognition, and develop through
similar recombination and maturation pathways (Kwong and
Mascola, 2012). Reproducible classes, which are observed in
multiple individuals, represent immunological solutions to the
challenge of broad influenza A neutralization that might be available to the general human population.
The influenza A-neutralizing stem-directed antibodies that utilize the HV1-69 germline gene are one such multidonor class
(Ekiert et al., 2009; Kashyap et al., 2010; Sui et al., 2009; Throsby
[Figure 1, panels A–E. (A) Phylogenetic tree of influenza A HA subtypes H1–H18, divided into group 1 and group 2. (B) Serum neutralization (reciprocal ID50 titer) against H5N1, H3N2, and H7N7, highlighting the subjects selected for displaying varying levels of neutralization against group 1 and group 2 strains. (C) FACS plots of H5-A/Indonesia/5/2005 versus H3-A/Perth/16/2009 probe staining for subjects 01, 16, 31, 36, 54, and 56. (D) HV repertoire pie charts for subjects 01, 16, 31, 36, 54, and 56. (E) Table of selected antibodies with the columns: antibody name (subject.lineage.clone); number of genetic similarities; frequentist probability; number of clones per subject per genetic cluster; HV gene; HD gene; HJ gene; CDR H3 length*; HV maturation; LV gene; LJ gene; CDR L3 length*; binding competition; neutralization breadth (group 1, group 2).]
*IMGT CDR 3 lengths used; n.d.: not determined; : crystal structure in complex with HA determined; n.b.: no binding to HA; n.r.: not recovered; None: no pseudovirus neutralization observed; n.a.: not applicable
Figure 1. H5N1 Vaccine Recipients Have Cross-Reactive B Cells that Utilize the Same Genetic Elements and Neutralize Group 1 and Group 2
Influenza A Viruses
(A) Phylogenetic tree depicting influenza A subtypes, generated using HA sequences (one per subtype except H1, H3, H5, and H7) with program MEGA6. Scale
bar indicates distance per fractional nucleotide change.
(B) Neutralization by serum from 63 vaccinees, sampled 2 weeks after final H5N1 immunization and assessed against vaccine strain (A/Indonesia/5/2005) and
heterologous group 2 strains (H3N2: A/Hong Kong/1-4-MA21-1/1968; H7N7: A/Netherlands/219/2003). Ten subjects were selected for flow cytometric (FACS)
characterization, as highlighted in key. Dotted line indicates the limit of detection.
(C) FACS analysis of PBMC samples isolated from H5N1-vaccine recipients 2 weeks after final vaccination and co-stained with HA probes H5 (A/Indonesia/5/2005) and H3 (A/Perth/16/2009). Sizable populations of H5-H3 cross-reactive memory B cells were observed in six of ten subjects (Figure S2).
et al., 2008). In terms of reproducibility, the HV1-69-derived antibodies have the additional advantage of utilizing heavy chain-only recognition, and prior studies have shown their vaccine-induced elicitation (Khurana et al., 2013; Ledgerwood et al.,
2011, 2013; Sui et al., 2009; Wheatley et al., 2015; Whittle
et al., 2014). However, HV1-69-derived antibodies generally do
not neutralize both group 1 and group 2 strains of influenza A,
and only a single HV1-69-derived antibody (CR9114) has been identified
that is capable of neutralizing both group 1 and group 2
strains of influenza A (Dreyfus et al., 2012). Other broadly neutralizing antibodies have been identified, such as FI6v3 and 39.29,
both of which derive from the HV3-30 germline gene; however,
co-crystal structures with HA reveal different modes of recognition (Corti et al., 2011; Nakamura et al., 2013), and FI6v3 and
39.29 are thus not members of the same class. Indeed, a reproducible antibody class capable of neutralizing both group 1 and
group 2 influenza A viruses has not been observed in multiple
donors.
We previously showed that subjects enrolled in the phase I
clinical trial VRC 310, who received an A/Indonesia/05/2005
monovalent inactivated virus (MIV) vaccine primed by an H5
DNA plasmid vaccine (Ledgerwood et al., 2011, 2013) (Table
S1A), showed transient expansion of H1- and H5-cross-reactive memory B cells specific to the HA stem (Wheatley et al.,
2015; Whittle et al., 2014). To determine whether these memory
B cells might encode multidonor class antibodies capable of
neutralizing group 1 and group 2 influenza A virus, we sorted
memory B cells that reacted with both H5 (group 1) and H3
(group 2) HAs. Immunoglobulin transcripts from post-vaccination cross-reactive memory B cells were sequenced, and the
encoded antibodies were synthesized and characterized. Specifically, we assessed breadth and potency of influenza A virus
neutralization, determined representative crystal structures in
complex with HA, analyzed sequence convergence based on
V(D)J-gene recombination and somatic hypermutation (SHM),
and tested sequence signatures for their ability to identify group 1
and group 2 neutralizing antibodies. Our findings reveal reproducible immunological pathways to achieve broadly reactive
antibodies and support a B cell ontogeny-based approach to obtaining a universal influenza A vaccine.
RESULTS
Identification of Memory B Cells Cross-Reactive with
Group 1 and Group 2 Influenza A HAs
We studied ten subjects from the VRC 310 H5N1 vaccine trial
who displayed a range of vaccine-elicited serum H5N1 neutralization activity, as well as varied but detectable responses against
group 2 strains A/Hong Kong/1-4-MA21-1/1968 (H3N2) or
A/Netherlands/219/2003 (H7N7) (Figures 1B and S1; Tables S1B
and S2). We used recombinant group 1-specific (H5) and group
2-specific (H3) HA probes, modified to prevent sialic acid binding
(HAΔSA) (Wheatley et al., 2015; Whittle et al., 2014), to co-stain
and sort peripheral blood mononuclear cells (PBMCs) isolated
2 weeks post H5N1 MIV boost (Figures 1C and S2). We recovered
sequences of memory B cell immunoglobulin gene transcripts
from six of the ten studied subjects (Figure 1D).
The sequence repertoire of each subject was generally dominated by clonally related transcripts comprising a small number
of clonal expansions. Transcripts derived from diverse HV genes
including HV1-18, HV3-23, HV3-64D, HV4-30, HV4-34, and
HV6-1 (Figures 1D and 1E). Notably, transcripts from HV1-69,
which often predominate group 1-specific stem-reactive antibodies (Lingwood et al., 2012; Pappas et al., 2014; Wheatley
et al., 2015), comprised only 2.5% of this set of group 1 (H5+)–group 2 (H3+) double-positive memory B cells.
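The kind of per-gene tally behind a statement such as "HV1-69 comprised only 2.5%" can be sketched in a few lines of Python. This is a minimal illustration only: the function name is an assumption, and the list of HV calls below is invented toy data, not the sequences recovered in this study.

```python
from collections import Counter


def hv_usage(hv_calls):
    """Return the fraction of transcripts assigned to each HV germline gene."""
    counts = Counter(hv_calls)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}


# Invented toy input: one inferred HV gene per recovered heavy chain transcript.
toy_calls = ["HV6-1"] * 30 + ["HV1-18"] * 8 + ["HV3-23"] * 1 + ["HV1-69"] * 1

usage = hv_usage(toy_calls)
for gene, fraction in sorted(usage.items(), key=lambda kv: -kv[1]):
    print(f"{gene}: {fraction:.1%}")
```

In practice the germline assignment itself would come from an annotation tool; the sketch covers only the counting step.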
Group 1 and Group 2 Neutralizing Antibodies from
Different Vaccinees Are Genetically Similar
Immunoglobulin sequences recovered from H5+ and H3+ cross-reactive memory B cells (Table S3) showed surprising similarity.
Notably, many immunoglobulin sequences from different donors
derived from the same genetic elements (Figure 1E). To analyze
commonalities of immunoglobulin transcripts between subjects,
we considered the following seven genetic elements: inferred
heavy variable (HV), heavy diversity (HD), heavy joining (HJ)
genes, third-heavy chain complementarity-determining region
(CDR H3) length for heavy chain-gene transcript, inferred light
variable (LV) and light joining (LJ) genes, and CDR L3 length for
corresponding light chain transcripts (Figure 1E). Frequentist
analysis indicated the presence of four or more of the same genetic elements in separate lineages to be statistically significant
(p ≤ 0.001) (Figure 1E), and representative antibodies were
cloned and expressed from such lineages. Of note, our antibody
nomenclature specifies donor, lineage, and clone; e.g., antibody
56.a.09 (Figure 1E, third row) is named for subject (56), lineage
within this subject (a), and clone within this lineage (09). Most
of the expressed antibodies bound HA, and all antibodies that
bound HA competed with the antigen-binding fragment (Fab)
of the stem-directed antibodies CR9114 (Dreyfus et al., 2012)
or F10 (Sui et al., 2009) (Figure 1E), and negative stain-electron
microscopy (EM) indicated binding to the HA stem (Figure S3).
Most of these antibodies neutralized viruses from both group 1
and group 2, including subtypes H1, H3, H5, and H7, with select
antibodies also demonstrating neutralization of viruses from
subtypes H2, H9, and H10 (Figures 1E and S4; Table S4).
Notably, in both pseudovirus and microneutralization assays
of influenza A viruses, the breadth and potency for several of
the newly identified antibodies were comparable to those of
antibody CR9114 (Dreyfus et al., 2012).
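The seven-element comparison described above can be illustrated with a short Python sketch that counts how many genetic elements two lineages share. The `Lineage` type, its field names, and the example values are assumptions made for illustration (they are not taken from Table S3); the threshold of four shared elements comes from the text, and the statistical test behind the reported p value is not reproduced here.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class Lineage:
    """Seven genetic elements used to compare lineages (hypothetical field names)."""
    hv: str          # inferred heavy variable gene
    hd: str          # inferred heavy diversity gene
    hj: str          # inferred heavy joining gene
    cdr_h3_len: int  # CDR H3 length in amino acids
    lv: str          # inferred light variable gene
    lj: str          # inferred light joining gene
    cdr_l3_len: int  # CDR L3 length in amino acids


def shared_elements(a: Lineage, b: Lineage) -> list:
    """Return the names of the elements that are identical in both lineages."""
    return [f.name for f in fields(Lineage) if getattr(a, f.name) == getattr(b, f.name)]


# Threshold taken from the text: four or more shared elements.
MIN_SHARED = 4

# Illustrative values only.
lineage_a = Lineage("HV6-1", "HD3-3", "HJ4", 16, "KV3-20", "KJ2", 9)
lineage_b = Lineage("HV6-1", "HD3-3", "HJ5", 16, "KV3-20", "KJ3", 9)

common = shared_elements(lineage_a, lineage_b)
print(common, len(common) >= MIN_SHARED)
```

In this example the two records differ only in their joining genes, so five of the seven elements are shared and the pair passes the four-element threshold.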
To understand the immunological basis of these multidonor
humoral responses against influenza A, we analyzed the recombination, SHM, and structural constraints, which drove the generation and development of these antibodies.
(D) Clonal diversity of H5-H3 cross-reactive B cells. The HV repertoire from each subject is shown as a pie chart, with each slice representing a unique HV clone or
clonally related family. Total number of HV sequences recovered per subject is indicated by the number at the center of each pie chart.
(E) Genetic and functional characteristics of selected antibodies recovered from H5-H3 cross-reactive B cells. Structurally characterized antibodies
indicated by .
See also Figures S1, S2, S3, and S4 and Tables S1, S2, S3, and S4.
Cell 166, 609623, July 28, 2016 611
A Multidonor Class of Broadly Neutralizing Antibodies
with HV6-1+HD3-3 Germline Genes
Three memory B cell lineages, from subjects 31, 54, and 56,
shared heavy chain sequences derived from recombination of
HV6-1, HD3-3, and HJ4 or HJ5 to yield highly similar amino
acid sequences in the CDR H3 (Figure 1E, top row; Table S3).
In each case, the heavy chain was paired with a light chain
sequence from KV3-20, KJ2, or KJ3 resulting in a CDR L3 of
nine amino acids. Similar affinity maturation patterns were
observed: a Val100bIleHC alteration of an HD-gene-encoded
section of the CDR H3 was completely conserved in all three lineages (Figure 2A) (for clarity, each residue number is followed by
a subscript denoting parent molecule: HC, heavy chain; LC, light
chain; HA1, HA2, or HA, either HA subunit or HA in general. Kabat
numbering is used for antibodies; thus, residue 100b refers to the
second insertion (b) after residue 100 [Kabat et al., 1991]).
Notably, these antibodies displayed neutralization breadth and
potency that rivaled that of CR9114 and exceeded that of the
group-specific CR6261 or CR8020 or of the head-directed antibody CH65 (Ekiert et al., 2009, 2011; Whittle et al., 2011) (Figures
2B and S4; Table S4).
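Because Kabat labels mix a residue number with an optional insertion letter, plain string sorting puts them in the wrong order ("100b" would sort before "98"). The sketch below shows one way to parse and order such labels in Python; the function names are hypothetical, and only the numbering convention itself comes from the text.

```python
import re

_KABAT = re.compile(r"^(\d+)([a-z]?)$")


def parse_kabat(label):
    """Split a Kabat label such as '100b' into (100, 'b'); plain '98' gives (98, '')."""
    m = _KABAT.match(label)
    if m is None:
        raise ValueError(f"not a Kabat residue label: {label!r}")
    return int(m.group(1)), m.group(2)


def kabat_sorted(labels):
    """Sort labels in chain order, so that 100 < 100a < 100b < 101."""
    return sorted(labels, key=parse_kabat)


print(kabat_sorted(["101", "100b", "98", "100", "100a", "99"]))
```

The empty insertion code sorts before any letter, so the un-lettered residue always precedes its insertions.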
To provide insight into the structural basis for the similarity
between these HV6-1+HD3-3 antibodies, we determined the
crystal structure of the antigen-binding fragment (Fab) for antibody 56.a.09, alone and in complex with A/Hong Kong/1-4-MA21-1/1968 (H3N2) HA at 3.3 Å resolution (Figures 2C and
S5; Tables S5 and S6). Unexpectedly, the crystallized HA was
not trimeric, with the asymmetric unit for the crystallized Fab-H3 complex comprising an HA head of one protomer interacting
with the HA stem of an adjacent protomer in a head-to-stem
dimeric arrangement (Figures S5B and S5C). Despite this nontrimeric arrangement, the Cα-root-mean-square deviation
(RMSD) between the 56.a.09-bound HA and the ligand-free HA
was <1 Å, and, for clarity, we thus depict the 56.a.09-bound complex as a typical HA trimer (Figure 2C).
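For readers unfamiliar with the metric, the Cα-RMSD quoted above is the square root of the mean squared distance between paired Cα atoms. The following Python sketch computes it for two coordinate lists that are assumed to be already superimposed and paired by index; the superposition step is omitted, the coordinates are invented for illustration, and this is not the software used in the study.

```python
import math


def ca_rmsd(coords_a, coords_b):
    """RMSD between two equal-length lists of (x, y, z) C-alpha coordinates.

    Assumes the structures are already superimposed and that atoms are
    paired by index; no fitting is performed here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))


# Hypothetical coordinates in angstroms, for illustration only.
bound = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
free = [(0.0, 0.6, 0.0), (3.8, 0.0, 0.8), (7.6, 0.0, 0.0)]
print(round(ca_rmsd(bound, free), 3))
```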
Antibody 56.a.09 recognized a conserved region on the HA
stem in a manner that avoided glycans at residues Asn21HA1
(conserved on group 1 viruses) and Asn38HA1 (conserved on
group 2 viruses), providing a structural explanation for its
extraordinary group 1 and group 2 neutralization breadth. Antibody 56.a.09 bound primarily with its heavy chain (934 Å² buried
surface area [BSA] versus 386 Å² BSA for the light chain). Heavy
chain binding involved the HD3-3-encoded CDR H3 (Figure 2C)
with Phe100HC and Gly100aHC contributing 240 Å² of BSA
and the SHM-altered Val100bIleHC inserting directly into the
Trp21HA2 pocket (contributing over 100 Å² of interactive surface).
In addition, Met98HC interacted with a conserved aromatic residue present on all light chains, helping to orient the CDR H3
(Figure 2D). Heavy chain binding also involved the germline-encoded CDR H2 of HV6-1, the only HV gene that encodes a nine-amino
acid CDR H2; this loop contributed 182 Å² of BSA, was unmutated in contact residues in all three subjects (Figure S6), and interacted with
the conserved fusion peptide (Figure 2E). With respect to light
chain, the largely unmutated KV3-20-derived V genes (4%-6%
SHM) observed in lineages from these three subjects interacted
with the HA stem through both CDR L1 and CDR L3 (Figure 2F;
Table S6), with Tyr33LC contributing the largest BSA among all
light chain residues (73.1 Å²).
HV6-1+HD3-3-derived antibodies were thus found in three independent donors, shared genetic elements in both the heavy
and light chain, displayed convergent affinity maturation, and
appeared to share the same mode of recognition (structure-function analysis of recognition interface and SHM are provided
in Figure S7). Furthermore, we tested for functional complementation: swapping of heavy and light chains of the three HV6-1+HD3-3-derived antibodies resulted in six functional antibodies
from nine possible pairings (all three pairings with the heavy
chain of antibody 31.g.01 failed to express) (Table S7). Overall,
these results indicated the HV6-1+HD3-3-derived antibodies
form a multidonor class. Structural analysis indicated numerous
light chains to be compatible with binding, and 99% of the human population (Abecasis et al., 2012) possess alleles of the
HV6-1 and HD3-3 genes compatible with the class elicitation
and recognition described here (Figures 2D-2G).
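The counting behind the functional complementation experiment can be sketched in a few lines of Python: every heavy chain is paired with every light chain, and pairings whose heavy chain fails to express are dropped. The function names and two of the three antibody labels are placeholders introduced for illustration; only antibody 31.g.01 and the counts of nine pairings and six functional antibodies come from the text.

```python
from itertools import product


def chain_swap_pairings(antibodies):
    """All heavy/light combinations, including each antibody's native pair."""
    return [(heavy, light) for heavy, light in product(antibodies, repeat=2)]


def functional_pairings(antibodies, failed_heavy=()):
    """Pairings left after dropping every pairing whose heavy chain failed to express."""
    return [(h, l) for h, l in chain_swap_pairings(antibodies) if h not in failed_heavy]


# Placeholder labels for two of the class members; only 31.g.01 is named in the text.
members = ["31.g.01", "member_54", "member_56"]
all_pairs = chain_swap_pairings(members)
working = functional_pairings(members, failed_heavy={"31.g.01"})
print(len(all_pairs), len(working))
```

The same enumeration gives 4 pairings for two antibodies and 25 for five, which matches the pairing counts reported for the other two classes.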
A Second Multidonor Class of Broadly Neutralizing
Antibodies with HV1-18+HD3-9 Germline Genes
Two memory B cell lineages from subjects 1 and 31 shared
immunoglobulin heavy chain sequence derived from recombination of HV1-18 with HD3-9 and HJ4 to yield highly similar amino
acid sequences in a CDR H3 of 15 amino acids (Figures 1E and
S6). Notably an Arg96HC residue was encoded by N-nucleotide
addition in both cases (Figure 3A). In each donor, the heavy chain
was paired with a light chain derived from KV2-30. Encoded immunoglobulins were expressed and shown to neutralize primarily group 1 strains of influenza A, although a few group 2 strains
were neutralized (Figure S4; Table S4). Overall neutralization
from these HV1-18+HD3-9 antibodies appeared more similar
to the group 1-specific antibody CR6261 than to the very broad
CR9114 or HV6-1-derived antibodies; nevertheless, neutralization breadth encompassed 50% of influenza A subtypes that
commonly infect humans (Figure 3B).
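A breadth-potency curve of the kind shown in the figures plots, for each IC50 cutoff, the percentage of panel viruses neutralized at or below that cutoff. The Python sketch below illustrates the calculation under stated assumptions: the function names, the six-virus panel, and all IC50 values are hypothetical, and the treatment of non-neutralized viruses (assigned the top tested concentration) is a simplification.

```python
def breadth_at_cutoff(ic50_by_virus, cutoff):
    """Percentage of viruses neutralized with an IC50 at or below the cutoff."""
    if not ic50_by_virus:
        raise ValueError("empty virus panel")
    hits = sum(1 for ic50 in ic50_by_virus.values() if ic50 <= cutoff)
    return 100.0 * hits / len(ic50_by_virus)


def breadth_potency_curve(ic50_by_virus, cutoffs):
    """(cutoff, breadth %) points, ordered by increasing cutoff."""
    return [(c, breadth_at_cutoff(ic50_by_virus, c)) for c in sorted(cutoffs)]


# Hypothetical IC50 values; 50.0 stands for "not neutralized at the top concentration".
ic50 = {"H1N1": 0.05, "H2N2": 50.0, "H3N2": 0.3, "H5N1": 0.02, "H7N9": 4.0, "H9N2": 50.0}
for cutoff, breadth in breadth_potency_curve(ic50, [0.1, 1.0, 10.0]):
    print(cutoff, round(breadth, 1))
```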
We determined the crystal structure of Fab 31.b.09 in complex
with A/California/04/2009 (pH1N1) HA (Figure 3C; Tables S5 and
S6). Similar to the 56.a.09-H3 complex structure, the crystallized
hemagglutinin in the Fab 31.b.09 complex was not a trimer, but a
molecular dimer (Figure S5). Despite this unexpected nontrimeric arrangement, the Cα-RMSD between the 31.b.09-bound HA and the ligand-free HA in the stem region was 0.6 Å,
and for clarity, we depict the 31.b.09-bound complex in a more
typical trimeric arrangement (Figure 3C). The HV1-18+HD3-9-derived antibody 31.b.09 bound an epitope that overlapped
the HV6-1+HD3-3 class epitope, but with antibody rotated
105° (mostly involving a rotation perpendicular to the trimer
axis) (Figure 3C). Antibody 31.b.09 bound with both heavy and
light chains (343 Å² BSA for heavy chain and 540 Å² BSA for
the light chain). Heavy chain interactions were generated
through CDR H2 and H3 loops. In the CDR H2 (127 Å² BSA),
the HV1-18 germline-encoded Tyr53HC recognized the fusion
peptide of HA2 and Asn56HC recognized helix A of HA2, while
the CDR H3 (216 Å² BSA) was positioned over the fusion peptide-helix A interface with Ile99HC and Leu100HC inserting into
the hydrophobic groove between these two conserved elements
(Figures 3C-3E). Light chain interactions involved CDR L1 and
L3, which recognized helix A (Figure 3F). We tested functional
complementation: swapping of heavy and light chains between
[Figure 2, panels A-F: heavy chain sequence alignment for the HV6-1+HD3-3 class (A) and structural views of Fab 56.a.09 bound to A/Hong Kong/1-4MA21-1/1968 H3 HA0 (C-F). Labeled elements: 56.a.09 heavy and light chains, antibody footprint, CDR H2, CDR H3, CDR L1, CDR L3, helix A, fusion peptide, Trp21HA2, and antibody residues Met98, Phe100, Ile100b, Ile100d, Arg50, Arg52b, Ser53, Lys55, Tyr56, Thr57, Tyr33, Tyr50LC, Asp92LC, and Gln95LC.]
Figure 2G panel text (gene compatibility):
Longer HD motif: 100 x X. Compatible HD genes: HD3-3, with compatible alleles HD3-3*01 and HD3-3*02.
Shorter HD motif: x. Compatible HD genes: HD1-1, HD1-14, HD1-20, HD2-2, HD2-8, HD2-15, HD3-3, HD3-9, HD3-10, and HD6-13.
HV motif: 9-residue CDR H2. Compatible HV genes: HV6-1, with compatible alleles HV6-1*01 and HV6-1*02.
LV motif: Y at residue 33. Compatible LV genes: 30 genes compatible.
Key: = F, Y, or W; x = G, A, or S; = I, V, L, or M; X = any residue.
Figure 2. A Multidonor HV6-1+HD3-3 Class of Broadly Neutralizing Antibodies

(A) Immunoglobulin heavy chains utilizing germline genes HV6-1, HD3-3, and HJ4 or HJ5. Germline HV, HD, and HJ gene-encoded nucleotide and amino acid
residues are shown in black, with junction-encoded residues in light blue and residues that have undergone SHM in red. Nucleotides removed by exonuclease
trimming indicated with a line through the letters. Conserved HD3-3-encoded residues (IFG) highlighted by a black box; recurrent SHM-derived Ile100bHC
highlighted by a red box. HA contacts indicated with open circles (○) denoting antibody main-chain-only contacts, open circles with rays denoting antibody
side chain-only contacts, and filled circles (●) denoting both main-chain and side-chain contacts.
(B) Neutralization breadth-potency curve for HV6-1+HD3-3 antibodies, with breadth shown as percentage of pseudoviruses neutralized at each IC50 cutoff, and
virus panel comprising 15 strains that includes influenza A subtypes known to infect humans (H1, H2, H3, H5, H7, H9).
(C) Co-crystal structure of Fab 56.a.09 in complex with an H3 HA monomer (A/Hong Kong/1-4-MA21-1/1968). Fab heavy and light chains colored purple and
cyan, respectively, and depicted in surface representation, while the H3 HA is depicted in ribbon and shown as a trimer (Figure S5 shows HA crystal packing).
Inset: interacting CDR loops of the 56.a.09 Fab are shown in ribbon and sticks and colored as in (A) with the antibody footprint outlined. Note that labels for heavy-chain residues do not explicitly show HC.
(D) A five-amino acid motif within the CDR H3 inserts into the conserved Trp21 pocket of H3.
(E) HV6-1-encoded CDR H2 depicted with HA fusion peptide; HV6-1 is the only HV gene that encodes a nine-residue CDR H2.
(F) Light chain interactions that contribute to the antibody-binding surface; these are not specific to KV3-20.
(G) Analysis of antibody gene compatibility, highlighting additional CDR H3 residues that may be compatible with HA binding, e.g., 100HC could be a F, Y, or W
residue (U), 101HC could be a G, A, or S residue (x).
See also Figures S3, S4, S5, S6, and S7 and Tables S3, S4, S5, S6, and S7.
[Figure 3, panels A-F: heavy chain sequence alignment for the HV1-18+HD3-9 class (A) and structural views of Fab 31.b.09 bound to A/California/04/2009 H1 HA0 (C-F). Labeled elements: 31.b.09 heavy and light chains, antibody footprint, CDR H2, CDR H3, CDR L1, CDR L3, helix A, fusion peptide, Trp21HA2, and antibody residues Tyr53, Asn56, His93, Trp94, His98, Leu100, and Ile27e.]
Figure 3G panel text (gene compatibility):
HD motif: Leu 100. All 34 D genes compatible.
CDR H2 motif: Trp at residue 50, Tyr at residue 53, Asn at 56. HV genes: HV1-18, HV1-68, HV7-34-1, and HV7-81.
LV motif: CDR L1 = 16 aa. Compatible LV genes: KV2-4, KV2D-18, KV2-18, KV2D-24, KV2-24, KV2D-26, KV2-28, KV2D-28, KV2-29, KV2D-29, KV2-30, and KV2D-30.
Figure 3. A Multidonor HV1-18+HD3-9 Class of Broadly Neutralizing Antibodies

(A) Immunoglobulin heavy chains utilizing germline genes HV1-18, HD3-9, and HJ4, with sequences annotated as described in Figure 2A.
(B) Neutralization breadth-potency curve for HV1-18+HD3-9 antibodies on a panel of influenza A viruses that includes subtypes known to infect humans.
(C) Co-crystal structure of Fab 31.b.09 in complex with an H1 trimer (A/California/04/2009). Fab heavy and light chains colored dark green and light green, respectively,
and depicted in surface representation, while the H1 HA is depicted in ribbon and colored blue, green, and white. Inset: interacting CDR loops of the 31.b.09 Fab are
shown in ribbon and sticks and colored as in (A) with the antibody footprint outlined. Note that labels for heavy-chain residues do not explicitly show HC.
(D) A conserved motif within the CDR H3 inserts into the highly conserved Trp21 pocket of HA while also interacting with the fusion peptide.
(E) The HV1-18-encoded CDR H2 also interacts with the opposing side of the fusion peptide.
(F) Light chain interactions from CDR L3 and CDR L1 also contribute to the antibody binding surface area.
(G) Analysis of antibody gene compatibility.
two antibodies of the putative class resulted in functional antibodies for all four of the possible pairings (Table S7). Overall,
the results indicated the HV1-18+HD3-9-derived antibodies
form a multidonor class. The light chain had a 16-amino acid
CDR L1, which could be encoded by 12 other LV genes, and residue Ile27eLC, which forms a critical contact, was a product of
SHM. The multiple alternative D gene alleles that could be used
to generate CDR H3s compatible with recognition by this multidonor class, in combination with a large number of compatible
light chains, led to a calculated distribution of potential HV1-18
combinations in the human population of close to 100% (Abecasis et al., 2012) (Figure 3G).
A Third Multidonor Class of Broadly Neutralizing
Antibodies with HV1-18 Germline Gene and Q-x-x-V
Motif
Multiple B cell lineages in two subjects (16 and 54) produced
distinctive HV1-18-derived immunoglobulins sharing five genetic
elements and having a CDR H3 of 21 amino acids derived from
recombination with either HD2-2 or HD2-15 genes (Figure 4A;
Table S3). This set of immunoglobulins all shared an SHM-derived Thr54HC, a Gln98HC encoded by P-nucleotide addition,
and a germline HD-encoded aliphatic residue at position 100aHC
(Q-x-x-V motif). Neutralization breadth and potency for this set of
immunoglobulins were similar to that of antibody CR9114 (Figure S4; Table S4), neutralizing all common human-infecting subtypes of influenza A except H2 and exceeding substantially the
breadth of group 1-specific or group 2-specific stem antibodies
or the head-directed antibody CH65 (Figure 4B). To understand
the basis of their recognition, we crystallized representative
antibodies with HA. Co-crystal structures of representative
HV1-18+HD2-2 (16.a.26) and HV1-18+HD2-15 (16.g.07) Fabs
in complex with the A/Hong Kong/1-4-MA21-1/1968 (H3N2)
HA revealed highly similar epitopes (Figures 4C-4H; Tables S5
and S6). Antibody binding occurred primarily through heavy
chain recognition of the conserved HA stem. With 16.a.26, the
heavy chain contributed 514 Å² of BSA while the light chain
contributed 283 Å² of BSA, and with 16.g.07, the heavy chain
contributed 646 Å² of BSA while the light chain contributed
329 Å² of BSA.
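To compare how the structurally characterized antibodies divide their binding surface between chains, the heavy and light chain BSA values quoted in the text can be turned into heavy-chain shares. The Python sketch below does only this arithmetic; the dictionary layout and function name are illustrative choices, and the BSA values (in Å²) are those reported above for 56.a.09, 31.b.09, 16.a.26, and 16.g.07.

```python
# Buried surface area in square angstroms (heavy chain, light chain), as quoted in the text.
BSA = {
    "56.a.09": (934, 386),
    "31.b.09": (343, 540),
    "16.a.26": (514, 283),
    "16.g.07": (646, 329),
}


def heavy_chain_share(heavy, light):
    """Fraction of the total antibody BSA contributed by the heavy chain."""
    return heavy / (heavy + light)


for name, (heavy, light) in BSA.items():
    share = 100 * heavy_chain_share(heavy, light)
    print(f"{name}: heavy chain {share:.0f}% of {heavy + light} A^2 total")
```

Only 31.b.09 buries more surface with its light chain than with its heavy chain; for 16.a.26 and 16.g.07 the light chain share is roughly one-third.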
Heavy chain interactions were generated primarily through
CDR H2 (150 Å² BSA for 16.a.26 and 200 Å² BSA for
16.g.07) and CDR H3 (300 Å² BSA for 16.a.26 and 450 Å²
BSA for 16.g.07) with the HV1-18 germline-encoded Tyr53HC
and the SHM-derived Thr54HC recognizing the N-terminal region
of HA1 and the hydrophobic groove between helix A and the
fusion peptide of HA2 (Figures 4D and 4G; Table S6). The CDR
H3s of these antibodies also bound the conserved hydrophobic
groove next to helix A and adjacent to Trp21HA2. The conserved
Val100aHC inserted into a pocket present on both group 1 and
group 2 HAs, just above Trp21HA2 and proximal to Ile48HA2.
Despite differences between residues encoded by HD2-2
(16.a.26) and HD2-15 (16.g.07), the CDR H3s from both antibodies were oriented perpendicular to the Fab axis and interacted similarly with HA. The conserved Gln98HC interacted
with Gln42HA2 by occupying a germline-encoded pocket unique
to HV1-18-encoded antibodies, which was formed by framework
residues Gly33HC and Ser52HC (Figures 4E and 4H). The location
of Gln98HC within the framework pocket likely stabilized the
perpendicular orientation of the CDR H3 relative to the antibody-framework regions and may allow for CDR H3-motifs
derived from diverse D genes to bind to this conserved hydrophobic groove. In this regard, we note that sequences from subject 01 with an HV1-18 germline gene and a Gln98-x-x-Val100a
motif in a 17-amino acid CDR H3 were able to neutralize H3
and H5 strains of influenza. Although light chains of this class
contribute approximately one-third of the total buried surface
area, analysis of antibodies of this class revealed light chain sequences to derive from diverse LV genes, with only KV3-11 appearing twice. The recently reported HV1-18-derived group 1
and group 2 neutralizing antibody CT149 uses a Gln98HC-x-x-Val100aHC motif with a 19-amino acid CDR H3 to bind the HA
stem (Wu et al., 2015) and recognized HA in a manner highly
similar to both 16.a.26 and 16.g.07. We tested the functional
complementation for antibodies 16.a.26, 16.g.07, 54.a.39,
54.a.84, and CT149 from donors 16, 54, and SH-K1 (the source
of antibody CT149). Swapping of heavy and light chains between
these five antibodies resulted in ten functional antibodies from
the 25 possible pairings (Table S7). Functionality correlated
strongly with HV gene identity and CDR H3 length (Table S7),
suggesting a requirement for specific heavy-light chain pairings.
Despite this requirement, the HV1-18 antibodies with Thr54HC
and CDR H3 98Q-x-x-V motif appeared to form a multidonor
class. Thus, the recombination of HV1-18 with many alternative
D gene segments yields a Q-x-x-V motif in the CDR H3, which
may comprise a common solution for neutralization of group 1
and group 2 influenza A viruses. The many alternative HD gene
alleles that can be used to generate CDR H3s for this multidonor
class in combination with no obvious light chain bias led to a
calculated distribution of potential HV1-18 combinations in the
human population of close to 100% (Abecasis et al., 2012) (Figures 4I and 4J).
Multidonor and Lineage-Unique Antibodies that
Neutralize Group 1 and Group 2 Viruses Recognize
Similar Epitopes
For the six subjects analyzed, the most highly expanded lineage
was lineage a from subject 31, from which we sequenced 104
clones (Figure 1E; Table S2). Clone 83 of this lineage (antibody
31.a.83) derived from HV3-23, HD3-9, and HJ6 and displayed
the highest neutralization breadth of the antibodies we
sequenced and expressed, neutralizing all of the common influenza A subtypes that infect humans (Figure 5A). Structural characterization of antibody 31.a.83 in complex with A/Hong Kong/1-4-MA21-1/1968 (H3N2) HA (Table S5) revealed binding to the
conserved HA stem, primarily through CDR H3 residues, many
of which were somatically hypermutated (Figures 5B-5D; Table
S6). Antibody 31.a.83 utilized hydrophobic residues to contact
an epitope adjacent to helix A and involving the Trp21HA2 pocket
with its CDR H3 parallel to helix A in a manner that avoided the
conserved N-glycan at Asn38HA1 present in group 2 HAs.
A second lineage, h from subject 56 (lineage 56.h), also
derived from the same germline genes as antibody 31.a.83 (Figures 5B and S6). However, critical CDR H3 contact residues of
antibody 31.a.83 were not conserved and a four-amino acid shift
relative to the V-gene was observed (Figure 5B). Moreover, the
representative antibody we expressed and analyzed from this
second lineage, antibody 56.h.01, neutralized H1, H2, and H9
subtypes, but not H3, H5, or H7 subtypes (Figures 1E and S4;
Table S4). Together, these observations indicated lineages
31.a and 56.h to have different modes of recognition, providing
an example of influenza A-targeting antibodies that derived
[Figure 4, panels A-H: heavy chain sequence alignments for the HV1-18 (Q-x-x-V) class (A) and structural views of Fabs 16.a.26 and 16.g.07 bound to HA (C-H). Labeled elements: heavy and light chains, antibody footprints, CDR H3, helix A, Trp21HA2, Gln42HA2, NAG 38HA1, and antibody residues Gly33, Ser52, Tyr53, Thr54, Gln98, Val100a, and Leu100c (16.a.26) or Pro100c (16.g.07).]
Figure 4I panel text (D-gene compatibility): HD motif: Val 100a. All 34 D genes compatible.
Figure 4J panel text (V-gene compatibility): HV contact residue motif: Tyr 53 (HV1-18, HV1-45, HV1-68, HV1-NL1, HV7-34-1, and HV7-81). HV structural motif: Gly33 and Ser52/Thr52 (HV1-18 with compatible alleles HV1-18*01, HV1-18*02, HV1-18*03, and HV1-18*04).
Figure 4. A Multidonor HV1-18 Class of Broadly Neutralizing Antibodies with Q-x-x-V Motif
(A) Immunoglobulin heavy chains utilizing germline genes HV1-18, HD2-2 or HD2-15, and HJ5 or HJ2, with sequences annotated as described in Figure 2A.
(B) Neutralization breadth-potency curve for HV1-18 (Q-x-x-V) class antibodies on a panel of influenza A viruses that includes subtypes known to infect
humans.
(C) Co-crystal structure of Fab 16.a.26 (HV1-18, HD2-2, HJ5) in complex with H3-HK68. Fab heavy and light chains colored orange and lavender respectively and
depicted in surface representation while HA is depicted in ribbon and shown as a trimer. Note that labels for heavy chain residues do not explicitly show HC.
(D) The 16.a.26 CDR H3 is depicted with junction-encoded and mutated residues colored as in (A) and germline-encoded residues in orange with the antibody
footprint on the HA outlined in black.
(E) Antibody heavy chain depicted in surface representation with CDR H3 loop in ribbon. 16.a.26 Gln98HC occupies a turn in the CDR H3, which interacts with the
conserved residue Gln42HA2.
(F) Co-crystal structure of Fab 16.g.07 (HV1-18, HD2-15, and HJ2) in complex with A/Hong Kong/1-4-MA21-1/1968 (H3N2) HA depicted as in (C).
(G) The 16.g.07 CDR H3 depicted as in (D) with the antibody footprint on the HA outlined in black.
[Figure 5, panels A-D: neutralization breadth-potency curve (A), heavy chain sequence alignment (B), and structural views of Fab 31.a.83 bound to A/Hong Kong/1/1968 H3 HA0 (C and D). Labeled elements: 31.a.83 heavy and light chains, antibody footprint, CDR H2, fusion peptide, Trp21HA2, and antibody residues Ile100, Leu100c, Met100d, Arg31LC, and Tyr91LC.]
Figure 5E and 5F panel text:
Lineage-unique group 1-2 neutralizing antibody epitopes:
FI6v3: HV3-30, 15% HV SHM, 22-aa CDR H3 (Corti et al., 2011)
39.29: HV3-30, 10% HV SHM, 18-aa CDR H3 (Nakamura et al., 2013)
31.a.83: HV3-23, 8% HV SHM, 24-aa CDR H3
Multidonor group 1-2 neutralizing antibody epitopes:
56.a.09: HV6-1, 4% HV SHM, 16-aa CDR H3
31.b.09: HV1-18, 4% HV SHM, 15-aa CDR H3
16.a.26: HV1-18, 8% HV SHM, 21-aa CDR H3
16.g.07: HV1-18, 11% HV SHM, 21-aa CDR H3
CT149: HV1-18, 12% HV SHM, 19-aa CDR H3 (Wu et al., 2015)
CR9114: HV1-69, 14% HV SHM, 14-aa CDR H3 (Dreyfus et al., 2012)
Figure 5. A Conserved Site of Group 1 and Group 2 Influenza A Virus Vulnerability Targeted by Multidonor and Lineage-Unique Antibodies
(A) Neutralization breadth-potency curve for HV3-23-derived antibodies on a panel of influenza A viruses that includes subtypes known to infect humans.
(B) Immunoglobulin heavy chains utilizing germline genes HV3-23, HD3-9, and HJ6, with sequences annotated as described in Figure 2A.
(C) Co-crystal structure of Fab 31.a.83 in complex with A/Hong Kong/1-4-MA21-1/1968 (H3N2) HA.
(D) Epitope for antibody 31.a.83 (outlined in black) with heavy chain depicted in yellow and junction-encoded residues (highlighted in blue), mutated residues (in
red), and germline-encoded residues (in gold) with light chain depicted in tan, HA protomer1 in gray, and HA protomer2 in dark gray. Note that labels for heavy-chain residues do not explicitly show HC.
(E and F) Antibodies capable of neutralizing group 1 and group 2 influenza A viruses recognize overlapping epitopes within the HA stem (colored red). The HA
surface is colored blue for structures determined previously (FI6v3 PDB: 3ZTN; 39.29 PDB: 4KVN; CT149 PDB: 4R8W; CR9114 PDB: 4FQI), and gray for
structures determined in the current study.
See also Figures S3, S4, S5, and S6 and Tables S3, S4, S5, S6, and S7.
from the same heavy chain-VDJ genes, but did not use the same
mode of recognition nor share convergent development.
Thus, antibody lineages may be multidonor (common or public), meaning that they are observed in different individuals and
share the same genetic elements and mode of recognition (Henry Dunand and Wilson, 2015; Zhou et al., 2013), or unique (uncommon or private), meaning that they have only been observed
a single time. We observed no substantial difference between
epitopes of multidonor and unique antibodies capable of neutralizing group 1 and group 2 influenza A viruses (Figures 5E and 5F),
nor did we observe segregation in antibody approach to HA used
by multidonor or unique antibodies (Figures S3E and S3F). Antibodies from multidonor lineages did, however, have lower SHM
(averaging 8% for multidonor versus 11% for the unique). The
lower SHM suggested that multidonor antibodies may undergo
more parsimonious development. We note that unique lineages
Figure 4 (continued). (H) The 16.g.07 CDR H3 is depicted as in (E) with Gln98HC contacting Gln42HA2.
(I) Analysis of D-gene compatibility.
(J) Analysis of V-gene compatibility.
[Figure 6, panels A-E. Recoverable panel text follows.]

Figure 6A. Prevalence of multidonor antibodies against influenza A. Each entry gives the number of multidonor class sequences (sequence frequency), followed by the number of unique lineages.

Datasets (sequence type; number of sequences):
Post-VRC 310 (heavy-light; 515,594#)
DeKosky et al. (heavy-light, partial sequences; 3,019,679)
Pre-TIV, Jiang et al. (heavy-only; 759,337)
Post-TIV, Jiang et al. (heavy-only; 3,045,513)
Healthy normal donors (heavy-only; 1,239,173)

HV6-1+HD3-3 class. Sequence signature*: VH6-1 + D3-3; 98MIFGI; CDR H3 = 16.
Post-VRC 310: 21 (0.004%); 3
DeKosky et al.: 0 (0%); 0
Pre-TIV: 0 (0%); 0
Post-TIV: 13 (0.0004%); 1
Healthy normal donors: 0 (0%); 0

HV1-18+HD3-9 class. Sequence signature*: VH1-18; 96RxxILTG; CDR H3 = 15.
Post-VRC 310: 16 (0.003%); 3
DeKosky et al.: 17 (0.0006%); 1
Pre-TIV: 0 (0%); 0
Post-TIV: 123 (0.004%); 3 (2)
Healthy normal donors: 64 (0.0052%); 2 (1)

HV1-18 (Q-x-x-V) class. Sequence signature*: VH1-18; 53Y54T 98QxxV; CDR H3 = 17-21.
Post-VRC 310: 309 (0.06%); 14
DeKosky et al.: 0 (0%); 0
Pre-TIV: 0 (0%); 0 (1)
Post-TIV: 242 (0.008%); 3
Healthy normal donors: 2 (0.00016%); 1

*Class signature used for transcript identification.
#515,594 B cells were sorted, of which 645 yielded sequences used in frequency calculations.

Figure 6B. HV-HD repertoire of cross-reactive antibodies by subject (subjects 1, 16, 31, 36, 54, and 56). Axes: HV gene family and HD gene family. Circle size key: > 30 sequences, 10-30 sequences, < 10 sequences. Classes: HV6-1+HD3-3, HV1-18+HD3-9, and HV1-18 (Q-x-x-V).

Figure 6C. HA-stem binding of sequence signature-identified antibodies, each tested alone and with F10 competition: HV6-1+HD3-3.1.SRP015957H+56.a.09 L; HV6-1+HD3-3.2.SRP015957H+54.f.01 L; HV6-1+HD3-3.2.SRP015957H+56.a.09 L; HV1-18+HD3-9.1.SRP047462H+31.b.09 L; HV1-18+HD3-9.1.SRP047462; HV1-18+QxxV.1.SRP047462; HV1-18+QxxV.2.SRP047462; HV1-18+QxxV.3.SRP047462.

Figure 6D. Neutralization of sequence signature-identified antibodies. Axis: neutralization (reciprocal IC50 titer, µg/ml), with median IC50 marked for each antibody and a limit of detection of >50 µg/ml. Antibody groups: HV6-1+HD3-3, HV1-18+HD3-9, and control antibodies (CR9114, FI6v3, CR6261, CR8020, and CH65).

Figure 6E. Clustering of VRC 310 and sequence signature-identified antibodies based on neutralization data. Virus panel:
Group 1 strains: H1N1-A/South Carolina/1/1918, H1N1-A/Puerto Rico/8/1934, H1N1-A/New Caledonia/20/1999, H1N1-A/California/04/2009, H2N2-A/Singapore/1/1957, H2N2-A/Canada/720/2005, H5N1-A/Vietnam/1203/2004, H5N1-A/Indonesia/05/2005, H9N2-A/Hong Kong/1073/1999.
Group 2 strains: H3N2-A/Hong Kong/1/1968, H3N2-A/Beijing/353/1989, H3N2-A/Perth/16/2009, H7N7-A/Netherlands/219/2003, H7N9-A/AnHui/1/2013.
Influenza B: INF B-Brisbane/60/2008.
Figure 6. Sequence Signatures of Multidonor Antibody Lineages

(A) Multidonor class antibodies capable of neutralizing group 1 and group 2 influenza A viruses: sequence signature, number, and frequency of signature-identified class members. The number of singleton transcripts from 454-derived NGS data is reported in italicized font; these were omitted from transcript and
lineage quantifications as described in the Supplemental Experimental Procedures. Additional sequences MEDI8852 (Kallewaard et al., 2016) and SFV005-2G02
(Li et al., 2012) have been identified, which correspond to the HV6-1+HD3-3 and HV1-18+HD3-9 classes, respectively.
(B) HV-HD germline-gene origin plots. One box shown per subject with antibody frequencies plotted as circles at their HV (horizontal), HD (vertical) coordinates.
The size of the plotted circle corresponds to the number of antibody sequences as shown in key at left. Multidonor antibodies in different subjects connected by
lines colored according to each multidonor class.
(C) Binding and competition MSD-ECLIA for antibodies identified in (DeKosky et al., 2015; Jiang et al., 2013) databases; competition with stem antibody F10
highlighted in red.
(D) Neutralization of influenza pseudoviruses. Median IC50 for each antibody is indicated by a horizontal line with value (µg/ml) shown at the base of the graph.
(E) Clustering of antibodies based on their neutralization fingerprints on a 15-virus panel.
See also Figure S6 and Tables S3, S4, and S7.
accounted for the majority of cross-reactive B cells in four of the
six analyzed VRC 310 donors (1, 31, 36, and 56); in light of the
positive functional characteristics of several of the unique lineages (e.g., 31.a.83), it seems likely that these antibodies would
contribute to a protective response. As the number of donors
with sequenced cross-reactive memory B cells increases, we
would expect some of the antibodies described here as
unique to be observed in other donors.
Sequence Signatures for Multidonor Antibody Classes
We analyzed the multidonor antibodies identified here for
class-specific sequence signatures (Figure 6A) as a means to
quantify class transcripts and to identify potential class members in other subjects. Notably, despite a sequence signature
requiring residue 54HC to be altered by affinity maturation, the
HV1-18 (Q-x-x-V) class accounted for over half of the H5+ and
H3+ cross-reactive B cells we sequenced (Figure 6A) and could
be found in half the analyzed vaccinees (Figure 6B).
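As a rough illustration of how a class-specific sequence signature might be applied to annotated transcripts, the Python sketch below checks the HV gene, the CDR H3 length, and positional motifs given in Kabat numbering, using the HV1-18 (Q-x-x-V) signature listed in Figure 6A (53Y54T, 98QxxV, CDR H3 of 17-21 residues). The record layout, function names, and the residues at positions 99 and 100 are assumptions for illustration; this is not the search procedure used in the study, and it presumes that each record lists every position within a motif window.

```python
import re


def kabat_key(label):
    """Sort key for Kabat labels, so that '100' < '100a' < '100b' < '101'."""
    m = re.fullmatch(r"(\d+)([a-z]?)", label)
    if m is None:
        raise ValueError(f"bad Kabat label: {label!r}")
    return int(m.group(1)), m.group(2)


def matches_motif(residues, start, motif):
    """True if the residues from Kabat position `start` onward spell `motif`.

    `residues` maps Kabat labels to one-letter amino acids; a lowercase 'x'
    in the motif matches any residue.
    """
    if start not in residues:
        return False
    ordered = sorted(residues, key=kabat_key)
    i = ordered.index(start)
    window = ordered[i:i + len(motif)]
    return len(window) == len(motif) and all(
        want == "x" or residues[pos] == want for pos, want in zip(window, motif)
    )


def matches_signature(record, hv, motifs, h3_lengths):
    """Check HV gene, CDR H3 length, and every positional motif together."""
    return (
        record["hv"] == hv
        and record["cdr_h3_len"] in h3_lengths
        and all(matches_motif(record["residues"], s, m) for s, m in motifs)
    )


QXXV = {"hv": "HV1-18", "motifs": [("53", "YT"), ("98", "QxxV")], "h3_lengths": range(17, 22)}

# Hypothetical record; residues at 99 and 100 are placeholders.
record = {
    "hv": "HV1-18",
    "cdr_h3_len": 21,
    "residues": {"53": "Y", "54": "T", "98": "Q", "99": "A", "100": "G", "100a": "V"},
}
print(matches_signature(record, **QXXV))
```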
For other subjects, we searched published human antibody
datasets, both those with paired heavy-light sequences
(DeKosky et al., 2015), as well as those with heavy chain-only sequences (Figure 6A; Table S1C). Searches with the HV6-1+HD3-3 signature did not yield sequence matches in paired
heavy-light sequences, but in the published heavy chain-only
datasets, we found 13 matches, which appeared to derive
from a single lineage (Figure S6). We chose both consensus as
well as the sequence closest to consensus to synthesize and
reconstitute with light chains of HV6-1+HD3-3 class antibodies
from subjects 54 and 56. One of these reconstituted antibodies
did not express, but the other three did and bound H1 HA in a
manner that could be competed with antibody F10 (Figure 6C).
We tested two of these antibodies and both neutralized group
1 and group 2 influenza A strains (Figure 6D; Table S3), and
the neutralization signatures (Georgiev et al., 2013) of the synthesized antibodies clustered in a dendrogram with known HV6-1+HD3-3 class antibodies (Figure 6E).
With the HV1-18+HD3-9 signature, we found 17 matches in
published paired heavy-light chain sequences, which appeared
to derive from a single lineage (Figure 6A). We synthesized
consensus sequences and reconstituted published heavy and
light chains as well as the published heavy chain and light chain
of this class from subject 31. Both of these reconstituted antibodies bound H1 hemagglutinin in a manner that could be
competed with antibody F10 (Figure 6C), neutralized group 1
influenza A strains (Figure 6D; Table S4), and clustered in a
neutralization dendrogram with known HV1-18+HD3-9 class
antibodies (Figure 6E). The neutralization breadth of these antibodies was lower than that of antibodies isolated from VRC 310 subjects,
likely due to the use of germline sequences to complete the
CDR L1 and CDR L2 regions of this antibody (the somatic mutation of 27ELC to Ile is required for optimal recognition)
(Figure 3F).
With the HV1-18 (Q-x-x-V) signature, searches did not yield
sequence matches in paired heavy-light sequences, but in
published heavy chain-only datasets (Table S1C), we found
244 matches, which appeared to derive from 4 lineages (Figures 6A and S6). We synthesized four sequences (consensus
or closest NGS read) and reconstituted with the five light
chains used previously in swapping experiments (Table S7).
None of these reconstituted antibodies bound a set of HAs
(Figure 6C) or neutralized any of the 15 viruses in our neutralization panel. Analysis of the tested heavy chains indicated that
their CDR H3 length matched that of CT149 in three of four
cases, but their sequence identity was below the 78% threshold that correlated with function in heavy-light complementation of this class
(Table S7).
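The signature-based search described above can be illustrated with a minimal sketch. It assumes each transcript has already been annotated with a V-gene name and a CDR H3 amino acid sequence. The function names, the regular expression, and the way identity is computed are hypothetical and are not taken from the Antibodyomics pipeline. In particular, the motif check looks for Gln-x-x-Val anywhere in CDR H3 rather than at Kabat positions 98 to 100a, and the identity function expects two pre-aligned sequences of equal length.

```python
import re

# Gln, any two residues, then Val (the "Q-x-x-V" motif named in the text).
# Assumption: the motif is searched anywhere in CDR H3; positional
# numbering (Kabat 98-100a) is not modeled in this sketch.
QXXV = re.compile(r"Q..V")


def matches_qxxv_signature(v_gene: str, cdr_h3: str) -> bool:
    """Return True if the transcript uses IGHV1-18 and carries Q-x-x-V."""
    return v_gene.startswith("IGHV1-18") and QXXV.search(cdr_h3) is not None


def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences."""
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    same = sum(x == y for x, y in zip(a, b))
    return 100.0 * same / len(a)


def passes_identity_threshold(a: str, b: str, threshold: float = 78.0) -> bool:
    """Compare identity with the 78% threshold mentioned in the text."""
    return percent_identity(a, b) >= threshold
```

A signature match alone does not establish function, which is consistent with the observation that none of the four reconstituted HV1-18 (Q-x-x-V) heavy chains bound HA; an additional identity filter of this kind is one way such a search could be narrowed.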
Altogether, the results indicate that sequence signatures with sufficient specificity to identify other functional class members by
sequence alone could be obtained for two multidonor classes:
HV6-1+HD3-3 and HV1-18+HD3-9. The sequence signature
for the third class, HV1-18 (Q-x-x-V), was complicated by incompatibility of some heavy-light pairs from this class; nonetheless,
sequence searches for this third class did place an upper limit on
the prevalence of this multidonor antibody class in the searched
databases.
Vaccine Induction of Multidonor Broadly Neutralizing
Antibodies
In the VRC 310 trial, we observed a significant expansion
(p = 0.0284) of H5+H3+ memory B cells following H5 DNA
prime-MIV boost, ranging from an increase of 1.2- to 10.6-fold
(Figure 7A). Notably, subjects with the largest increases and
the highest frequencies of multidonor antibodies had the largest
percentages of antibodies belonging to the three multidonor
classes identified here (Figure 7B). Our initial observation of a
high number of transcripts with multiple genetic commonalities
may be explained by the preferential expansion of multidonor
class transcripts; indeed, the fold-increase in cross-reactive B
cells by subject correlated with the percentage of antibody sequences with inter-subject genetic commonalities (Figure 7C).
Importantly, the frequency of cross-reactive memory B cells
post-VRC 310 vaccination correlated with the sequence signature-identified prevalence of multidonor class antibodies
(p = 0.0045) (Figure S6D). Moreover, while we did not see a significant correlation between the fold increase in overall sera titer and
the increase in cross-reactive memory B cells, we did observe a
significant correlation in titers with the H1N1 virus, A/Singapore/
8/1986, which we previously found to be especially sensitive to
neutralization by stem-directed antibodies (Lingwood et al.,
2012) (Figure 7D).
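The per-subject comparisons above rest on two simple quantities: a fold increase (post-vaccination value divided by pre-vaccination value) and a Pearson correlation between two per-subject series. The following is a minimal sketch of both calculations. The function names are hypothetical, no study data are included, and the significance tests behind the reported p values are not reproduced here.

```python
import math
from typing import Sequence


def fold_increase(pre: float, post: float) -> float:
    """Fold increase of a frequency or titer after vaccination."""
    if pre <= 0:
        raise ValueError("pre-vaccination value must be positive")
    return post / pre


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length (at least 2)")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return sxy / math.sqrt(sxx * syy)
```

In this framing, one series would hold the fold increase in cross-reactive B cells for each subject and the other the percentage of sequences with inter-subject genetic commonalities, or the fold increase in neutralization titer.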
To quantify the frequency of multidonor class antibodies in
unvaccinated subjects, we examined NGS-determined memory
B cell transcripts from healthy normal donors (Figure 6A). To
compare frequencies from VRC 310-vaccinated versus unvaccinated subjects, we equated the total number of sorted memory
B cells with the total number of transcripts and observed a substantially higher transcript frequency (and to a lesser degree,
a higher lineage frequency) for multidonor class sequences on
H5N1 vaccination in the VRC 310 trial (Figure 7E).
We also examined published antibody sequences from subjects immunized with the 2009 or 2010 seasonal influenza
vaccine (Jiang et al., 2013) (Figures 6A, 7F, and S6). Overall, transcripts matching the HV1-18+HD3-9 signature were 10³-fold more
prevalent prior to vaccination than the other multidonor transcripts; however, this class appeared not to expand on either
seasonal or VRC 310 vaccination. By contrast, transcripts
matching HV6-1+HD3-3 and HV1-18 (Q-x-x-V) signatures appeared to be present at low frequency prior to vaccination, to increase on seasonal vaccination, and to increase up to 1,000-fold
on VRC 310 vaccination (Figure 7F).
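The frequency comparison across datasets can be sketched as follows. This is an illustration under stated assumptions rather than the authors' procedure: a transcript frequency is taken as signature matches divided by total transcripts, and when a search returns no matches the value 1/total is reported as an upper bound (an assumption made here to mirror the upper-bound estimates marked in Figure 7F). All names are hypothetical.

```python
import math
from typing import Tuple


def transcript_frequency(matches: int, total: int) -> Tuple[float, bool]:
    """Return (frequency, is_upper_bound) for a signature in one dataset.

    Assumption: with zero matches, 1/total is returned as an upper bound.
    """
    if total <= 0 or matches < 0 or matches > total:
        raise ValueError("require 0 <= matches <= total and total > 0")
    if matches == 0:
        return 1.0 / total, True
    return matches / total, False


def log10_fold_change(freq_before: float, freq_after: float) -> float:
    """Log10 of the ratio between two transcript frequencies."""
    if freq_before <= 0 or freq_after <= 0:
        raise ValueError("frequencies must be positive")
    return math.log10(freq_after / freq_before)
```

On a log10 scale, the 1,000-fold increase described above corresponds to a value of 3, and a fold change computed from an upper-bound baseline should itself be read as a lower bound.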
DISCUSSION
The vaccine induction of broadly neutralizing antibodies against
influenza A virus has been a long-standing immunological goal.
While the human immune system can generate antibodies
capable of neutralizing group 1 and group 2 strains of influenza
(Corti et al., 2011; Dreyfus et al., 2012; Nakamura et al., 2013;
Figure 7. Vaccine Induction of Antibodies Capable of Neutralizing Group 1 and Group 2 Influenza A Viruses
(A) Frequencies of H5-H3 cross-reactive memory B cells pre- and post-H5N1 vaccination. Subjects for whom pre-immunization samples were no longer available
are indicated with open symbols; subject name and fold increase shown for others.
(B) Frequency of multidonor class sequences by donor and multidonor class.
(C) Fold increase in cross-reactive B cells relative to prevalence of heavy chain sequences with three (out of a possible four) of the same heavy chain genetic
elements in at least one sequence in any of the six analyzed subjects (Pearson correlation with the total number of sequences provided in Figure 1D).
(D) Fold increase in cross-reactive B cells relative to the fold increase in sera neutralization titer for all tested influenza A strains (shown in black) (see Figures 1 and
S1), or the single H1N1-A/Singapore/8/1986 strain (shown in red).
(E) Bar graphs of transcript frequencies (left) and lineage frequencies (right). Left: frequency of multidonor-class transcripts by dataset. Right: frequency of
multidonor class lineages for each dataset.
(F) Transcript frequency versus dataset and goal. Stars depict upper-bound estimates, and circles depict frequencies confirmed by neutralization.
(G and H) Multidonor antibodies displayed in ribbon with class-conserved contact residues shown in stick. Antibody epitopes shown in purple (HV6-1+HD3-3),
green (HV1-18+HD3-9), and orange (HV1-18 with Q-x-x-V) with black outlines. Glycans shown in surface representation and colored by conservation: conserved
(light green) or variable (dark green).
See also Figures S3 and S6 and Tables S3 and S4.
Wu et al., 2015), clear evidence of their induction by vaccination
had not been reported. In this study, we used a sequence-based
structural approach to identify three multidonor classes of antibodies capable of neutralizing group 1 and group 2 influenza A
viruses in VRC 310-vaccinated subjects. One multidonor class
utilized HV6-1+HD3-3 germline genes (Figure 2); a second class
utilized HV1-18+HD3-9 germline genes and a 15-amino acid
CDR H3 (Figure 3); and a third class utilized HV1-18 germline
gene and a Gln98HC-x-x-Val100aHC motif (Figure 4). For each
class, we delineated sequence signatures (Figure 6). Despite
the lack of serological indicators of vaccine-induced improvement in cross neutralization with VRC 310-vaccinated subjects
(Figure S1), we observed a significant vaccine-induced expansion of cross-reactive memory B cells (Figure 7A) and a clear increase in the frequency of transcripts for two of the multidonor
antibody classes (Figure 7F). These findings reveal the vaccine
induction of broadly neutralizing influenza A antibodies.
Stereotypic antibody signatures have been reported for some
bacterial polysaccharide antigens (Adderson et al., 1993), CD4-induced, V1V2-directed, and VRC01 classes of HIV-1-neutralizing antibodies (Dosenovic et al., 2015; Gorman et al., 2016;
Huang et al., 2004; Jardine et al., 2015; Zhou et al., 2013), and
both stem- and head-directed influenza neutralizing antibodies
(Pappas et al., 2014; Schmidt et al., 2015). Thus, despite the potential repertoire of human immunoglobulins being large (Boyd
et al., 2009) and SHM further increasing diversity to the point
where highly similar modes of antigen recognition might be expected very infrequently, our findings reveal that multidonor classes can be induced by vaccination in humans (Figures 7 and S7).
The coexistence of multiple vaccine-induced pathways to
generate influenza group 1 and group 2 neutralizing antibodies
is encouraging for efforts aimed at achieving analogous responses in genetically diverse human populations. The stem-directed antibodies induced here potently neutralize in pseudotype assays (Figure S4), but less potently in live influenza A virus
assays (Table S4). While determination of the in vivo concentrations of stem antibodies required for protective efficacy may
require passive infusion trials in humans, it seems likely that a
protective response will require higher titers of group 1 and
group 2 neutralizing antibodies than achieved in the VRC 310
trial. Our NGS-based measurements indicated that VRC 310 vaccination boosted the frequency of transcripts from two multidonor
classes substantially toward the target goal, and we even
observed increases after seasonal vaccination (Figure 7F). In
this regard, sequence signature-based quantification may provide a suitably sensitive technology to detect transcript frequencies of group 1 and group 2 neutralizing antibodies, to
measure their expression in appropriate memory B cell subsets
and long-lived plasma cells, and to assess their durability.
Appropriate SHM is an additional aspect, which we investigated
by analyzing the recognition of germline-reverted versions of
each of the three multidonor classes as well as by
mutational analysis of critical contacts (Figure S7). Further
studies aimed at increasing the prevalence of group 1 and group
2 neutralizing influenza antibody lineages, guided by the
sequence signatures identified here, may provide a means to
achieve the protective efficacy required of a universal influenza
A vaccine. In this regard, it is helpful to know that 100- to
1,000-fold increases in the transcript frequencies for two multidonor classes of influenza A neutralizing antibodies could be
achieved through immunization with a divergent influenza (Figures 6A and 7F), likely by enhancing immune focus to the HA
stem (Figures 7G and 7H), an approach utilized by stem-only
or headless immunogens (Impagliazzo et al., 2015; Yassine
et al., 2015) and by chimeric HA immunogens (Krammer et al.,
2015). It will be fascinating to evaluate how these immunogens,
in various vectored, subunit, and prime-boost combinations, will
fare at further inducing, maintaining, or expanding the multidonor
broadly neutralizing antibodies identified here.
Ethics Statement and VRC 310 Study Design
The VRC 310 study protocol and associated procedures were approved by the
National Institute of Allergy and Infectious Diseases (NIAID) Institutional Review Board. All participants provided written informed consent in accordance
with the Declaration of Helsinki. The VRC 310 study (ClinicalTrials.gov identifier
NCT01086657) (Ledgerwood et al., 2011, 2013) was conducted by the Vaccine
Research Center, NIH. See Table S1 and the Supplemental Experimental Procedures for details.
Expression of HA Probes, Flow Cytometry, Cell Sorting, and
Sequencing
HA constructs consisting of the extracellular domain of HA modified to ablate
sialic acid binding and C-terminally fused to (1) a T4-fibritin trimerization motif,
(2) a biotinylatable AviTag sequence, and (3) a hexahistidine affinity tag were
synthesized (Genscript), cloned into a CMV-expression plasmid, and expressed as previously described (Whittle et al., 2014). Cryopreserved PBMC
samples were stained and sorted on a fluorescence-activated cell sorting
(FACS) Aria II using fluorescently labeled recombinant H1 (A/New Caledonia/
20/1999), H5 (A/Indonesia/05/2005), or H3 (A/Perth/16/2009) probes; single
memory B cells binding to both H5 and H3 probes were sorted, and
sequencing of immunoglobulin genes by multiplex PCR was performed as previously described (Whittle et al., 2014). See the Supplemental Experimental
Procedures for details.
Production of Pseudotyped Lentiviral Vectors and Influenza A
Viruses and Measurement of Antibody Neutralizing Activity
Influenza HA pseudotyped lentiviral vectors expressing a luciferase reporter
gene were produced and used to infect 293A cells. All influenza viruses
used in the microneutralization assays were expanded in Madin-Darby canine
kidney epithelial (MDCK) cells in the presence of Tosyl phenylalanyl chloromethyl ketone (TPCK)-treated trypsin (Sigma) and titrated in MDCK cells.
See the Supplemental Experimental Procedures for details.
Structural Analysis and Sequence Bioinformatics
Both negative stain-EM and X-ray crystallography were used to characterize
antibodies from the VRC 310 trial and their complexes with HA. An Antibodyomics1.0 pipeline, modified to analyze both 454 and Illumina output,
was used to analyze B cell transcripts for the presence of sequence signatures specific to multidonor antibodies. Frequentist probabilities were used
to determine likelihoods of sequence convergence and germline prevalence
in the human population. See the Supplemental Experimental Procedures for
details.
ACCESSION NUMBERS
The accession numbers for the coordinates and structure factors reported in
this paper are Protein Data Bank (PDB): 5K9J, 5K9K, 5K9O, 5K9Q, 5KAN,
and 5KAQ. The accession numbers for the sequences for all of the antibodies
reported in this paper are GenBank: KX386124–KX387227. The accession
numbers for the NGS data reported in this paper are Short Reads Archive:
SRP026397 and SRP073039.
seven figures, and seven tables and can be found with this article online at
M.G.J., A.K.W., P.V.T., G.-Y.C., C.S., P.D.K., J.R.M., and A.B.M. conceived,
designed, and coordinated the study. M.G.J., A.K.W., P.V.T., G.-Y.C., C.S.,
L.S., P.D.K., J.R.M., and A.B.M. wrote and revised the manuscript and figures.
B.S.G. and J.E.L. carried out VRC 310 trial and provided subject samples.
R.A.K. and M.R. provided support for B cell analysis and sorting. J.C.B.,
M.K., and J.R.W. designed HA constructs. M.G.J., A.K.W., P.V.T., R.T.B.,
A.D., R.A.G., M.K., W.-P.K., K.L., S.N.N., M.S.P., E.S.Y., B.Z., Y.Z., M.A.,
S.D., C.R.L., A.R., L.W., X.W., H.M.Y., C.S., Y.M., Y.T., U.B., and NISC CSP
performed experiments. M.G.J., G.-Y.C., C.S., C.-H.S., and P.D.K. carried
out bioinformatics analysis. M.G.J., A.K.W., P.V.T., G.-Y.C., C.S., R.T.B.,
I.S.G., M.K., W.-P.K., K.L., T.B., S.D., Y.T., U.B., J.C.M., K.S., D.C.D., L.S.,
P.D.K., J.R.M., and A.B.M. analyzed data. All authors read and approved the
manuscript. Detailed author contributions are provided in Supplemental
Information.
ACKNOWLEDGMENTS
We thank D. Ambrosak and R. Nguyen for assistance with flow cytometry;
J. Chrzas, J. Gonczy, U. Chinte, and staff at Southeast Regional Collaborative Access Team (SER-CAT) for help with X-ray diffraction data collection;
G. Georgiou and S.R. Quake for human antibody sequences; J. Stuckey
for assistance with figures; and members of the Structural Biology Section,
Structural Bioinformatics Core Section and Virology Laboratory of the Vaccine Research Center for helpful comments. Support for this work was provided by the Intramural Research Program of the Vaccine Research Center
and the Division of Intramural Research, National Institute of Allergy and Infectious Diseases, NIH. This work was supported in part with federal funds
from the Frederick National Laboratory for Cancer Research, NIH, under
contract HHSN261200800001E. Use of insertion device 22 (SER-CAT) at
the Advanced Photon Source was supported by the U.S. Department of
Energy, Basic Energy Sciences, Office of Science, under contract W-31-109-Eng-38.
REFERENCES
Abecasis, G.R., Auton, A., Brooks, L.D., DePristo, M.A., Durbin, R.M., Handsaker, R.E., Kang, H.M., Marth, G.T., and McVean, G.A.; 1000 Genomes
Project Consortium (2012). An integrated map of genetic variation from
1,092 human genomes. Nature 491, 56–65.
Adderson, E.E., Shackelford, P.G., Quinn, A., Wilson, P.M., Cunningham,
M.W., Insel, R.A., and Carroll, W.L. (1993). Restricted immunoglobulin VH usage and VDJ combinations in the human response to Haemophilus influenzae
type b capsular polysaccharide. Nucleotide sequences of monospecific anti-Haemophilus antibodies and polyspecific antibodies cross-reacting with self
antigens. J. Clin. Invest. 91, 2734–2743.
DeKosky, B.J., Kojima, T., Rodin, A., Charab, W., Ippolito, G.C., Ellington,
A.D., and Georgiou, G. (2015). In-depth determination and analysis of the human paired heavy- and light-chain antibody repertoire. Nat. Med. 21, 86–91.
Dosenovic, P., von Boehmer, L., Escolano, A., Jardine, J., Freund, N.T., Gitlin,
A.D., McGuire, A.T., Kulp, D.W., Oliveira, T., Scharf, L., et al. (2015). Immunization for HIV-1 Broadly Neutralizing Antibodies in Human Ig Knockin Mice.
Cell 161, 1505–1515.
Dreyfus, C., Laursen, N.S., Kwaks, T., Zuijdgeest, D., Khayat, R., Ekiert, D.C.,
Lee, J.H., Metlagel, Z., Bujny, M.V., Jongeneelen, M., et al. (2012). Highly
conserved protective epitopes on influenza B viruses. Science 337, 1343–1348.
Ekiert, D.C., Bhabha, G., Elsliger, M.A., Friesen, R.H., Jongeneelen, M.,
Throsby, M., Goudsmit, J., and Wilson, I.A. (2009). Antibody recognition of a
highly conserved influenza virus epitope. Science 324, 246–251.
Ekiert, D.C., Friesen, R.H., Bhabha, G., Kwaks, T., Jongeneelen, M., Yu, W.,
Ophorst, C., Cox, F., Korse, H.J., Brandenburg, B., et al. (2011). A highly
conserved neutralizing epitope on group 2 influenza A viruses. Science 333,
843–850.
Georgiev, I.S., Doria-Rose, N.A., Zhou, T., Kwon, Y.D., Staupe, R.P., Moquin,
S., Chuang, G.Y., Louder, M.K., Schmidt, S.D., Altae-Tran, H.R., et al. (2013).
Delineating antibody recognition in polyclonal sera from patterns of HIV-1
isolate neutralization. Science 340, 751–756.
Gorman, J., Soto, C., Yang, M.M., Davenport, T.M., Guttman, M., Bailer, R.T.,
Chambers, M., Chuang, G.Y., DeKosky, B.J., Doria-Rose, N.A., et al.; NISC
Comparative Sequencing Program (2016). Structures of HIV-1 Env V1V2
with broadly neutralizing antibodies reveal commonalities that enable vaccine
design. Nat. Struct. Mol. Biol. 23, 81–90.
Henry Dunand, C.J., and Wilson, P.C. (2015). Restricted, canonical, stereotyped and convergent immunoglobulin responses. Philos. Trans. R. Soc.
Lond. B Biol. Sci. 370 http://dx.doi.org/10.1098/rstb.2014.0238.
Huang, C.C., Venturi, M., Majeed, S., Moore, M.J., Phogat, S., Zhang, M.Y.,
Dimitrov, D.S., Hendrickson, W.A., Robinson, J., Sodroski, J., et al. (2004).
Structural basis of tyrosine sulfation and VH-gene usage in antibodies that
recognize the HIV type 1 coreceptor-binding site on gp120. Proc. Natl.
Acad. Sci. USA 101, 2706–2711.
Impagliazzo, A., Milder, F., Kuipers, H., Wagner, M.V., Zhu, X., Hoffman, R.M.,
van Meersbergen, R., Huizingh, J., Wanningen, P., Verspuij, J., et al. (2015).
A stable trimeric influenza hemagglutinin stem as a broadly protective immunogen. Science 349, 1301–1306.
Jardine, J.G., Ota, T., Sok, D., Pauthner, M., Kulp, D.W., Kalyuzhniy, O., Skog,
P.D., Thinnes, T.C., Bhullar, D., Briney, B., et al. (2015). HIV-1 VACCINES.
Priming a broadly neutralizing antibody response to HIV-1 using a germline-targeting immunogen. Science 349, 156–161.
Jiang, N., He, J., Weinstein, J.A., Penland, L., Sasaki, S., He, X.S., Dekker,
C.L., Zheng, N.Y., Huang, M., Sullivan, M., et al. (2013). Lineage structure of
the human antibody repertoire in response to influenza vaccination. Sci.
Transl. Med. 5, 171ra19.
Kabat, E.A., Wu, T.T., Perry, H., Gottesman, K., and Foeller, C. (1991). Sequences of Proteins of Immunological Interest, Fifth Edition (NIH Publication
No. 91-3242).
Boyd, S.D., Marshall, E.L., Merker, J.D., Maniar, J.M., Zhang, L.N., Sahaf, B.,
Jones, C.D., Simen, B.B., Hanczaruk, B., Nguyen, K.D., et al. (2009). Measurement and clinical monitoring of human lymphocyte clonality by massively parallel VDJ pyrosequencing. Sci. Transl. Med. 1, 12ra23.
Kallewaard, N.L., Corti, D., Collins, P.J., Neu, J., McAuliffe, J.M., Benjamin, E.,
Wachter-Rosati, L., Palmer-Hill, F.J., Yuan, A.Q., Walker, P.A., et al. (2016).
Structure and function analysis of an antibody recognizing all influenza A subtypes. Cell 166, this issue, 596–608.
Butt, K.M., Smith, G.J., Chen, H., Zhang, L.J., Leung, Y.H., Xu, K.M., Lim, W.,
Webster, R.G., Yuen, K.Y., Peiris, J.S., and Guan, Y. (2005). Human infection
with an avian H9N2 influenza A virus in Hong Kong in 2003. J. Clin. Microbiol.
43, 5760–5767.
Kashyap, A.K., Steel, J., Rubrum, A., Estelles, A., Briante, R., Ilyushina, N.A.,
Xu, L., Swale, R.E., Faynboym, A.M., Foreman, P.K., et al. (2010). Protection
from the 2009 H1N1 pandemic influenza by an antibody from combinatorial
survivor-based libraries. PLoS Pathog. 6, e1000990.
Khurana, S., Wu, J., Dimitrova, M., King, L.R., Manischewitz, J., Graham, B.S.,
Ledgerwood, J.E., and Golding, H. (2013). DNA priming prior to inactivated
influenza A(H5N1) vaccination expands the antibody epitope repertoire and increases affinity maturation in a boost-interval-dependent manner in adults.
J. Infect. Dis. 208, 413–417.
Krammer, F., Palese, P., and Steel, J. (2015). Advances in universal influenza
virus vaccine design and antibody mediated therapies based on conserved regions of the hemagglutinin. Curr. Top. Microbiol. Immunol. 386, 301–321.
Kwong, P.D., and Mascola, J.R. (2012). Human antibodies that neutralize
HIV-1: identification, structures, and B cell ontogenies. Immunity 37, 412–425.
Ledgerwood, J.E., Wei, C.J., Hu, Z., Gordon, I.J., Enama, M.E., Hendel, C.S.,
McTamney, P.M., Pearce, M.B., Yassine, H.M., Boyington, J.C., et al.; VRC
306 Study Team (2011). DNA priming and influenza vaccine immunogenicity:
two phase 1 open label randomised clinical trials. Lancet Infect. Dis. 11,
916–924.
Ledgerwood, J.E., Zephir, K., Hu, Z., Wei, C.J., Chang, L., Enama, M.E., Hendel, C.S., Sitar, S., Bailer, R.T., Koup, R.A., et al.; VRC 310 Study Team (2013).
Prime-boost interval matters: a randomized phase 1 study to identify the minimum interval necessary to observe the H5 DNA influenza vaccine priming effect. J. Infect. Dis. 208, 418–422.
Li, G.M., Chiu, C., Wrammert, J., McCausland, M., Andrews, S.F., Zheng, N.Y.,
Lee, J.H., Huang, M., Qu, X., Edupuganti, S., et al. (2012). Pandemic H1N1
influenza vaccine induces a recall response in humans that favors broadly
cross-reactive memory B cells. Proc. Natl. Acad. Sci. USA 109, 9047–9052.
Lingwood, D., McTamney, P.M., Yassine, H.M., Whittle, J.R., Guo, X., Boyington, J.C., Wei, C.J., and Nabel, G.J. (2012). Structural and genetic basis for
development of broadly neutralizing influenza antibodies. Nature 489,
566–570.
Morens, D.M., Taubenberger, J.K., and Fauci, A.S. (2013). H7N9 avian influenza A virus and the perpetual challenge of potential human pandemicity.
MBio 4. http://dx.doi.org/10.1128/mBio.00445-13.
Nakamura, G., Chai, N., Park, S., Chiang, N., Lin, Z., Chiu, H., Fong, R., Yan, D.,
Kim, J., Zhang, J., et al. (2013). An in vivo human-plasmablast enrichment
technique allows rapid identification of therapeutic influenza A antibodies.
Cell Host Microbe 14, 93–103.
Pappas, L., Foglierini, M., Piccoli, L., Kallewaard, N.L., Turrini, F., Silacci, C.,
Fernandez-Rodriguez, B., Agatic, G., Giacchetto-Sasselli, I., Pellicciotta, G.,
et al. (2014). Rapid development of broadly influenza neutralizing antibodies
through redundant mutations. Nature 516, 418–422.
Schmidt, A.G., Therkelsen, M.D., Stewart, S., Kepler, T.B., Liao, H.X., Moody,
M.A., Haynes, B.F., and Harrison, S.C. (2015). Viral receptor-binding site antibodies with diverse germline origins. Cell 161, 1026–1034.
Subbarao, K., Klimov, A., Katz, J., Regnery, H., Lim, W., Hall, H., Perdue, M.,
Swayne, D., Bender, C., Huang, J., et al. (1998). Characterization of an avian
influenza A (H5N1) virus isolated from a child with a fatal respiratory illness. Science 279, 393–396.
Sui, J., Hwang, W.C., Perez, S., Wei, G., Aird, D., Chen, L.M., Santelli, E., Stec,
B., Cadwell, G., Ali, M., et al. (2009). Structural and functional bases for broad-spectrum neutralization of avian and human influenza A viruses. Nat. Struct.
Mol. Biol. 16, 265–273.
Throsby, M., van den Brink, E., Jongeneelen, M., Poon, L.L., Alard, P., Cornelissen, L., Bakker, A., Cox, F., van Deventer, E., Guan, Y., et al. (2008). Heterosubtypic neutralizing monoclonal antibodies cross-protective against H5N1
and H1N1 recovered from human IgM+ memory B cells. PLoS One 3, e3942.
Wheatley, A.K., Whittle, J.R., Lingwood, D., Kanekiyo, M., Yassine, H.M., Ma,
S.S., Narpala, S.R., Prabhakaran, M.S., Matus-Nicodemos, R.A., Bailer, R.T.,
et al. (2015). H5N1 vaccine-elicited memory B Cells are genetically constrained by the IGHV locus in the recognition of a neutralizing epitope in the
hemagglutinin stem. J. Immunol. 195, 602–610.
Whittle, J.R., Zhang, R., Khurana, S., King, L.R., Manischewitz, J., Golding, H.,
Dormitzer, P.R., Haynes, B.F., Walter, E.B., Moody, M.A., et al. (2011). Broadly
neutralizing human antibody that recognizes the receptor-binding pocket of
influenza virus hemagglutinin. Proc. Natl. Acad. Sci. USA 108, 14216–14221.
Whittle, J.R., Wheatley, A.K., Wu, L., Lingwood, D., Kanekiyo, M., Ma, S.S.,
Narpala, S.R., Yassine, H.M., Frank, G.M., Yewdell, J.W., et al. (2014). Flow cytometry reveals that H5N1 vaccination elicits cross-reactive stem-directed antibodies from multiple Ig heavy-chain lineages. J. Virol. 88, 4047–4057.
Wu, Y., Cho, M., Shore, D., Song, M., Choi, J., Jiang, T., Deng, Y.Q., Bourgeois, M., Almli, L., Yang, H., et al. (2015). A potent broad-spectrum protective
human monoclonal antibody crosslinking two haemagglutinin monomers of
influenza A virus. Nat. Commun. 6, 7708.
Yassine, H.M., Boyington, J.C., McTamney, P.M., Wei, C.J., Kanekiyo, M.,
Kong, W.P., Gallagher, J.R., Wang, L., Zhang, Y., Joyce, M.G., et al. (2015).
Hemagglutinin-stem nanoparticles generate heterosubtypic influenza protection. Nat. Med. 21, 1065–1070.
Zhou, T., Zhu, J., Wu, X., Moquin, S., Zhang, B., Acharya, P., Georgiev, I.S.,
Altae-Tran, H.R., Chuang, G.Y., Joyce, M.G., et al.; NISC Comparative
Sequencing Program (2013). Multidonor analysis reveals structural elements,
genetic determinants, and maturation pathway for HIV-1 neutralization by
VRC01-class antibodies. Immunity 39, 245–258.
Article
Hexokinase Is an Innate Immune Receptor for the
Detection of Bacterial Peptidoglycan
Graphical Abstract
Authors
Andrea J. Wolf, Christopher N. Reyes,
Wenbin Liang, ..., K. Mark Coggeshall,
Moshe Arditi, David M. Underhill
Correspondence
david.underhill@csmc.edu
In Brief
The metabolic enzyme hexokinase
unexpectedly acts as a pattern
recognition receptor that recognizes
bacterial peptidoglycan and triggers
activation of inflammasomes.
Highlights
• Peptidoglycan-derived N-acetylglucosamine activates the NLRP3 inflammasome
• N-acetylglucosamine in the cytosol is detected by hexokinase
• Hexokinase release from mitochondrial outer membranes triggers NLRP3 activation
• Metabolic conditions affecting hexokinase activity trigger inflammasome formation
Wolf et al., 2016, Cell 166, 624–636
July 28, 2016 © 2016 Published by Elsevier Inc.
Andrea J. Wolf,1,2 Christopher N. Reyes,1 Wenbin Liang,3,6 Courtney Becker,1 Kenichi Shimada,2,4 Matthew L. Wheeler,1,2
Hee Cheol Cho,3,7 Narcis I. Popescu,5 K. Mark Coggeshall,5 Moshe Arditi,2,4 and David M. Underhill1,2,*
1F. Widjaja Foundation Inflammatory Bowel and Immunobiology Research Institute, Cedars-Sinai Medical Center, Los Angeles,
CA 90048, USA
2Division of Immunology, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
3Cedars-Sinai Heart Institute, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
4Division of Pediatric Infectious Diseases, Cedars-Sinai Medical Center, Los Angeles, CA 90048, USA
5Immunobiology and Cancer Program, Oklahoma Medical Research Foundation, Oklahoma City, OK 73104, USA
6Present address: University of Ottawa Heart Institute and Department of Cellular and Molecular Medicine, University of Ottawa, Ottawa,
ON K1Y 4W7, Canada
7Present address: Departments of Biomedical Engineering and Pediatrics, Emory University, Atlanta, GA 30322, USA
*Correspondence: david.underhill@csmc.edu
SUMMARY
Degradation of Gram-positive bacterial cell wall
peptidoglycan in macrophage and dendritic cell
phagosomes leads to activation of the NLRP3 inflammasome, a cytosolic complex that regulates
processing and secretion of interleukin (IL)-1β and
IL-18. While many inflammatory responses to peptidoglycan are mediated by detection of its muramyl
dipeptide component in the cytosol by NOD2, we
report here that NLRP3 inflammasome activation is
caused by release of N-acetylglucosamine that is
detected in the cytosol by the glycolytic enzyme
hexokinase. Inhibition of hexokinase by N-acetylglucosamine causes its dissociation from mitochondrial
outer membranes, and we found that this is sufficient
to activate the NLRP3 inflammasome. In addition,
we observed that glycolytic inhibitors and metabolic conditions affecting hexokinase function and
localization induce inflammasome activation. While
previous studies have demonstrated that signaling
by pattern recognition receptors can regulate metabolic processes, this study shows that a metabolic
enzyme can act as a pattern recognition receptor.
INTRODUCTION
Macrophages and dendritic cells play essential roles in initiating inflammation by releasing cytokines and chemokines in
response to pathogen-associated molecular patterns (PAMPs)
detected by innate immune receptors. Surface receptors, such
as surface Toll-like receptors (TLRs) and C-type lectin receptors,
detect extracellular PAMPs (Kumar et al., 2011). In addition, microbes internalized by phagocytes are enzymatically degraded,
releasing small molecules that are screened for potential danger
by a panel of intracellular innate immune receptors, such as
intracellular TLRs and Nod-like receptors. We and others have
found that degradation of Staphylococcus aureus in phagosomes is a key factor in determining the types and amounts
of inflammatory cytokines produced following phagocytosis (Ip
et al., 2010; Wolf et al., 2011; Muller et al., 2015). In particular,
we noted that production of interleukin (IL)-1β and IL-18 required
the degradation of S. aureus cell wall peptidoglycan (PGN) and
that this response is suppressed when the organism modifies
its PGN to become resistant to degradation (Shimada et al.,
2010).
IL-1β and IL-18 play essential roles in controlling bacterial infections, in part, by recruiting neutrophils to sites of infection
and polarizing T cell responses. Unlike many other cytokines,
IL-1β and IL-18 are translated as pro-cytokines in the cytosol.
Signaling to multiprotein complexes known as inflammasomes
activates caspase-1 to process and secrete the cytokines (Lamkanfi and Dixit, 2014; Martinon et al., 2004). While there are
several varieties of inflammasomes, the one responsible for responding to PGN is defined by the presence of NOD-like receptor family, pyrin domain-containing 3 (NLRP3). The mechanism
by which NLRP3 is activated by PGN is not known. It is generally
thought that all of the immunomodulatory activity of the S. aureus
PGN comes from the degradative release of muramyl dipeptide
(MDP), which is detected by the cytosolic NOD2 receptor. However,
we observed that NLRP3 inflammasome activation in response
to S. aureus PGN was not affected by the loss of NOD2 (Shimada
et al., 2010). Thus, the fragment of PGN that must be generated
through degradation to activate the inflammasome and how it is
sensed have not been established.
Diverse particulate stimuli that activate the NLRP3 inflammasome have been identified, including crystals such as silica,
alum, asbestos, uric acid, and cholesterol (Dostert et al., 2008;
Duewell et al., 2010; Hornung et al., 2008; Martinon et al.,
2006). Like PGN particles, phagocytosis of these crystals is a
necessary step in the process leading to inflammasome activation. For crystalline particles, which are non-degradable and
non-microbial, it has been suggested that disruption of the
phagosomal compartment leads to NLRP3 inflammasome activation (Hornung et al., 2008). However, we have previously
observed that phagosomes containing PGN remain intact, and
Cell 166, 624–636, July 28, 2016. © 2016 Published by Elsevier Inc.
[Figure 1 graphic omitted. Panel axes: IL-1b (ng/ml) in WT and Nlrp3-/- BMDMs; IL-1b (ng/ml) versus extracellular K+ (mM); % LDH release versus time (h). Immunoblot rows: IL-1b p17 (Sup), IL-1b p45 (Lys), tubulin.]
Figure 1. PGN Activates the NLRP3 Inflammasome Independent of Potassium Efflux and Cell Death
(A) LPS-primed BMDMs from wild-type and Nlrp3-/- mice were untreated (UT) or stimulated with 5 mM ATP for 2 hr or pdA:dT or PGN from ΔoatA S. aureus (SA),
Streptomyces (Strep), or B. subtilis (BS) at 20–40 mg/ml, and IL-1b was assayed in the supernatant after 6 hr.
(B) Immunoblot of mature IL-1b in supernatants (Sup) or pro-IL-1b in cell lysates (Lys) of wild-type macrophages stimulated as in (A).
(C and D) LPS-primed BMDMs were stimulated in the presence of increasing concentrations of extracellular KCl with (C) 20 mg/ml PGN 6 hr, 5 mM ATP 2 hr,
10 mg/ml nigericin (Nig) 2 hr, or (D) S. aureus (ΔoatA) 6 hr.
(E) LPS-primed BMDMs were stimulated with PGN (20 mg/ml) or ATP (5 mM), and release of lactate dehydrogenase (LDH) into supernatants was measured at
indicated times and shown as percentage of maximum at each time point.
Error bars indicate SD. ***p < 0.001.
See also Figure S1.
this, together with the observation that lysosomal degradation is
necessary, suggests the existence of an alternative mechanism
for specifically sensing PGN degradation products.
In this study, we have identified N-acetylglucosamine (NAG), a
sugar subunit of the backbone of PGN, as an activator of the
NLRP3 inflammasome. Anthrax bacteria specifically de-acetylate NAG in PGN, and we show that this PGN becomes a poor
activator of IL-1b secretion in vitro and in vivo. Mechanistically,
we observed that purified NAG and NAG released upon degradation of PGN in phagosomes are detected via inhibition of the
glycolytic enzyme hexokinase, resulting in its dissociation from
the mitochondrial outer membrane. Using a peptide that competes with hexokinase for binding to mitochondria and induces
its dissociation from the outer membrane, we observed that
hexokinase dissociation alone is sufficient to induce NLRP3
inflammasome activation. These conclusions are further supported
by the observation that specific metabolic perturbations
that affect hexokinase function also induce inflammasome activation. Together, the data suggest a model in which hexokinase
effectively acts as a pattern recognition receptor, alerting the
cell to degradation of bacterial PGN in phagosomes and activating an inflammatory response via disruption of the glycolytic
pathway and mitochondrial function.
RESULTS
PGN-Induced NLRP3 Inflammasome Activation Is
Independent of Potassium Efflux and Pyroptosis
We and others have shown that phagocytosis of Gram-positive
PGN by bone marrow-derived macrophages (BMDMs) stimulates secretion of IL-1b via the NLRP3 inflammasome (Figures
1A and 1B) (Martinon et al., 2004; Shimada et al., 2010), a
process that, as we have previously shown, requires degradation (Shimada et al., 2010). The diminished IL-1b production by
Nlrp3-/- BMDMs in response to PGN is not a consequence
of differential phagocytosis or lysosomal enzyme activity, which
are equivalent in wild-type and Nlrp3-/- BMDMs (Figures S1A–S1C). In the process of evaluating inflammasome activation in
response to PGN, we observed some behaviors inconsistent
with current models of mechanisms of activation. First, it has
been suggested that efflux of cytosolic potassium is essential
for NLRP3 inflammasome activation (Muñoz-Planillo et al.,
2013). While we confirmed that IL-1b secretion triggered by
ATP or nigericin in lipopolysaccharide (LPS)-primed macrophages is strongly inhibited by extracellular potassium, we found
that PGN-induced IL-1b secretion is not affected by extracellular
potassium (Figure 1C). We also observed no effect of extracellular potassium on IL-1b secretion in response to whole ΔoatA
S. aureus (Figure 1D), a strain that makes a PGN that is
highly sensitive to phagosomal degradation and that we have
previously shown to be a strong activator of the NLRP3 inflammasome (Shimada et al., 2010). Second, unlike many inflammasome activators, PGN-induced caspase-1 activation does not
result in pyroptosis, as measured by the release of lactate dehydrogenase (LDH) (Figure 1E), annexin V staining (Figure S1D),
or propidium iodide uptake (Figure S1E). We only observe
background levels of cell death over 3 days in macrophages
stimulated with PGN (Figure 1E). Compared to classic NLRP3
activators like ATP and nigericin, PGN-induced inflammasome
activation occurs over a longer time period; for example, in Figure 1A, PGN induces much less IL-1b over 6 hr than ATP triggers
in 2 hr. Given these unique features of PGN-induced activation,
we set out to determine how PGN is sensed by macrophages.
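As an illustration of the LDH readout discussed above, the following minimal Python sketch shows one common way to express LDH release as a percentage of a maximum-lysis control. The function name, the background-subtraction step, and the absorbance values are assumptions made for illustration only; they are not taken from the methods or data of this study.

```python
def percent_ldh_release(sample_od, background_od, max_lysis_od):
    """Return LDH release as a percentage of a maximum-lysis control.

    All three arguments are hypothetical absorbance readings taken at
    the same time point. background_od is medium from untreated cells;
    max_lysis_od is medium from fully lysed cells.
    """
    span = max_lysis_od - background_od
    if span <= 0:
        raise ValueError("maximum-lysis reading must exceed background")
    pct = 100.0 * (sample_od - background_od) / span
    # Clamp to the 0-100 range to absorb small measurement noise.
    return max(0.0, min(100.0, pct))


# Illustrative readings only (not data from the study).
readings = {"PGN": 0.14, "ATP": 0.52}
for stimulus, od in readings.items():
    print(stimulus, round(percent_ldh_release(od, 0.10, 1.10), 1))
```

Normalizing each time point to its own maximum-lysis control, as in this sketch, keeps long time courses comparable even if the total LDH pool changes as cells proliferate or die.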
NAG Is the Minimal Inflammasome-Activating
Component of PGN
PGN, which makes up as much as 80% of the dry weight of typical
Gram-positive bacteria, is a polysaccharide of repeating units of
N-acetylmuramic acid (NAM; MurNAc) and NAG (GlcNAc) crosslinked by short amino acid side chains (Figure 2A).
MDP, the NOD2-activating fragment of PGN, has been
suggested to stimulate IL-1b release under certain conditions
(Faustin et al., 2007; Ferwerda et al., 2008; Hsu et al., 2008;
Marina-García et al., 2008; Martinon et al., 2004; Pan et al.,
2007). However, when we treated macrophages with soluble
MDP or lipofectamine-complexed MDP (to deliver it to the
cytosol), neither treatment reproduced the response to PGN
(Figure 2B). Furthermore, as noted earlier, we have previously
shown that PGN-induced IL-1b secretion is not blocked in macrophages lacking NOD2 (Shimada et al., 2010). Together, the
data suggest that some other lysosomal degradation product
of PGN must be detected.
Therefore, we examined the inflammasome-activating capacity of other potential PGN degradation products. We observed
that the NAG sugar subunit from the backbone of PGN becomes
a potent activator of IL-1b processing and secretion in LPS-primed macrophages when it is complexed with lipofectamine
to deliver it to the cytosol (Figure 2C). We found that delivery
of NAG to the cytosol is important to its function, since soluble NAG only induced IL-1b when added to culture medium
at high concentrations (Figure S2A). The IL-1b detected in
response to lipofectamine-complexed NAG was confirmed to
be cleaved IL-1b p17 by immunoblot (Figure 2D), and inflammasome assembly was detected by NLRP3 and caspase-1 p10
proximity ligation (Figure 2E). In contrast, when we exposed
LPS-primed cells to lipofectamine-complexed NAM (the other
sugar subunit of the PGN backbone) or other sugars like glucosamine (GAM), glucose, or sucrose, they triggered little or no IL-1b
secretion (Figure 2F). Lipofectamine-complexed NAG alone was
unable to induce tumor necrosis factor a (TNF-a) in un-primed
cells (Figure 2G), indicating that NAG does not reproduce the
priming activity of PGN. In addition to mouse macrophages,
PGN and NAG activate IL-1b secretion in mouse dendritic cells,
as well as human macrophages and dendritic cells primed
with LPS (Figures S2B–S2D). Like PGN-induced activation, NAG-induced inflammasome
activation is dependent on NLRP3 (Figure 2H),
independent of potassium efflux (Figure 2I), and does not induce
pyroptosis (Figure 2J).
Acetylation of PGN Is Necessary for Its Inflammasome-Activating Potential
We reasoned that, if NAG is the critical component of PGN
involved in NLRP3 inflammasome activation, PGN without
NAG should not stimulate the inflammasome. PGN produced
by Bacillus anthracis (anthrax) is unusual in that as much as
88% of its NAG is de-acetylated to GAM due to the expression
of PGN NAG deacetylase activity (Zipperle et al., 1984). Thus,
anthrax PGN contains little NAG, although it can be chemically
re-acetylated in vitro (Zipperle et al., 1984). Native NAG-deficient
anthrax PGN induces little or no inflammasome activation, while
re-acetylated anthrax PGN strongly induces activation in
LPS-primed (Figure 3A) or PAM3CSK4-primed (Figure S3A) macrophages. Native anthrax PGN is internalized by macrophages
slightly less efficiently than the re-acetylated PGN in vitro (Figures 3B and 3C), although this difference cannot account for
the profound lack of IL-1b secretion. The re-acetylated anthrax
PGN-induced IL-1b was completely NLRP3 dependent (Figure 3D). The level of IL-1b produced by the re-acetylated
PGN-stimulated macrophages is sufficient to induce potent
inflammatory immune responses in vivo, since we observed
that neutrophil recruitment to the peritoneum of mice injected
with these macrophages was largely, although not entirely,
NLRP3 dependent (Figure S3B).
Re-acetylated anthrax PGN was significantly more inflammatory upon direct intraperitoneal injection into mice than native
anthrax PGN, as measured by neutrophil infiltration (Figure 3E).
PGN induces inflammation via a wide variety of mediators
(e.g., cytokines, chemokines, and complement). We observed
that the IL-1b receptor antagonist anakinra partially inhibited
re-acetylated anthrax PGN-induced neutrophil infiltration (Figure 3F), indicating that IL-1b production plays an important role
in the overall response.
PGN and NAG Inhibit Hexokinase and Induce Its
Dissociation from Mitochondria
In order to determine how NAG activates the NLRP3 inflammasome, we began by investigating how PGN and NAG relate to
mechanisms previously implicated in NLRP3-induced IL-1b
[Figure 2 graphic omitted. Panel A schematic: PGN backbone of alternating NAG and NAM sugars with peptide stems and a (Gly)5 cross-bridge. Panel axes: IL-1b (ng/ml), IL-1b (pg/ml), TNF (ng/ml), K+ (mM), LDH % of max. Immunoblot rows: IL-1b p17 (Sup), IL-1b p45 (Lys), tubulin. Microscopy key: blue = DAPI, red = NLRP3/p10 PLA+.]
Figure 2. NAG Is the NLRP3 Inflammasome-Activating Component of PGN

(A) Schematic diagram of PGN structure.
(B) LPS-primed BMDMs were stimulated with lipofectamine-complexed MDP (L-MDP), soluble MDP (sMDP, 10 mg/ml), pdA:dT, or lipofectamine alone (Lipo) for
6 hr. UT, untreated.
(C) Cells were stimulated for 6 hr with lipofectamine complexes containing increasing amounts of NAG.
(D) IL-1b processing was assessed by immunoblot as in (C). Sup, supernatant; Lys, lysate.
(E) Proximity ligation assay (for association of NLRP3 with caspase-1) of LPS-primed BMDMs treated for 3 hr with lipofectamine alone (Lipo) or complexed with
NAG (L-NAG) (experiment was performed 23).
(F) Cells were stimulated for 6 hr with different sugars complexed with lipofectamine (NAM, N-acetylmuramic acid; GAM, glucosamine; Gluc, glucose; and Suc,
Sucrose).
secretion. As noted earlier, they do not require potassium efflux
or induce pyroptosis. However, PGN (Figure 4A) and NAG (Figure S4A) both trigger the appearance of mtDNA in the cytosol.
DNA release from the mitochondria has been associated with
NLRP3 activation in response to other stimuli (Shimada et al.,
2012; Zhong et al., 2016; Zhou et al., 2011), and while its exact
relationship to PGN-induced NLRP3 inflammasome regulation
is not clear, the observations prompted us to investigate further
how PGN-derived NAG could affect mitochondria.
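For readers unfamiliar with how a PCR-based readout such as cytosolic mtDNA is typically reported as fold induction, the sketch below shows the widely used comparative Ct calculation. This is an assumption about a generic workflow, not a description of the quantification used in this study: the function name, the use of a reference amplicon, the Ct values, and the implied 100% amplification efficiency are all hypothetical.

```python
def fold_induction(ct_target_treated, ct_ref_treated,
                   ct_target_control, ct_ref_control):
    """Comparative Ct estimate of fold change, treated versus control.

    Assumes (hypothetically) that target and reference amplicons both
    double every cycle, so that one Ct unit equals a 2-fold difference.
    """
    # Normalize the target to the reference within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # A lower normalized Ct in treated cells means more template.
    return 2.0 ** -(delta_treated - delta_control)


# Illustrative Ct values only (not data from the study).
print(fold_induction(24.0, 18.0, 26.0, 18.0))
```

The normalization to a reference amplicon is what makes values from different cytosolic preparations comparable; without it, differences in extraction yield would be mistaken for differences in mtDNA release.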
Early biochemical studies attempting to identify the cytosolic
enzyme responsible for phosphorylation of glucose, the first
step in glycolysis, noted that NAG could competitively inhibit
the process (Spiro, 1958). The enzyme inhibited by NAG was
later identified as hexokinase (Wilson et al., 2011). Using purified
mouse macrophage mitochondria and recombinant human
hexokinase, we confirmed that NAG is a dose-dependent inhibitor of hexokinase enzymatic activity (Figures 4B and 4D), while
other breakdown products of PGN, including NAM and MDP,
do not inhibit hexokinase (Figures 4B and 4C).
As a competitive inhibitor, NAG, by definition, competes with
glucose for binding to the active site of the enzyme. Binding of
glucose to hexokinase has been directly characterized by solving
the crystal structure of hexokinase bound to glucose, and NAG
binding to the active site has been modeled (Aleshin et al.,
1998a, 1998b; Madej et al., 2014) (MMDB: 43169). Consistent
with these structural studies and with previous enzymatic studies
and our current data (Figure 4B), we have further observed NAG
directly binding to hexokinase by protein thermal shift assay (Figure S4B). Even though NAG binds to hexokinase, the enzyme
cannot phosphorylate NAG (Figure S4C), which is consistent
with its role as a competitive inhibitor. Acetylation of NAG is
important for its role as a hexokinase inhibitor, because de-acetylated NAG, i.e., GAM, is phosphorylated by hexokinase (Figure S4C). Because GAM is phosphorylated by hexokinase,
though less efficiently, it does not act as an inhibitor, which may explain why GAM does not activate the NLRP3 inflammasome (Figure 2F) and why anthrax PGN, which is naturally de-acetylated, is
less inflammatory than re-acetylated anthrax PGN (Figure 3). To
further examine the role of acetylation, we solubilized native
and re-acetylated anthrax PGN, as well as S. aureus PGN, with
macrophage lysosomal extracts and observed that degradation
products derived from PGNs rich in NAG (re-acetylated anthrax
and S. aureus PGN) partially inhibited hexokinase activity (Figure S4D), while those derived from native anthrax PGN, with its low NAG content, did not.
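The competitive-inhibition argument above can be made concrete with the standard Michaelis-Menten rate law for a competitive inhibitor, v = Vmax[S] / (Km(1 + [I]/Ki) + [S]), in which the inhibitor raises the apparent Km without changing Vmax. The Python sketch below is illustrative only: the kinetic constants are hypothetical placeholders, not measured values for hexokinase, glucose, or NAG.

```python
def competitive_rate(s, i, vmax, km, ki):
    """Reaction rate under competitive inhibition.

    s: substrate concentration (here, glucose)
    i: inhibitor concentration (here, NAG)
    vmax, km, ki: kinetic constants in consistent units
    """
    if s < 0 or i < 0:
        raise ValueError("concentrations must be non-negative")
    if vmax <= 0 or km <= 0 or ki <= 0:
        raise ValueError("kinetic constants must be positive")
    apparent_km = km * (1.0 + i / ki)  # inhibitor raises apparent Km only
    return vmax * s / (apparent_km + s)


# Hypothetical constants chosen only to show the shape of the effect.
VMAX, KM, KI = 100.0, 0.1, 1.0
for nag in (0.0, 1.0, 5.0, 25.0):
    low = competitive_rate(0.5, nag, VMAX, KM, KI)
    high = competitive_rate(50.0, nag, VMAX, KM, KI)
    print(f"NAG {nag:5.1f} mM: v = {low:5.1f} (0.5 mM glucose), "
          f"{high:5.1f} (50 mM glucose)")
```

Under this rate law, inhibition deepens as the inhibitor concentration rises and is relieved by excess substrate, which is the defining signature of competition for a shared active site.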
NAG inhibition of hexokinase was of particular interest,
because the enzyme (hexokinase I and/or II, depending on the
cell type) associates with the mitochondrial outer membrane
through an interaction with the voltage-dependent anion channel
(VDAC). This interaction with the VDAC is involved in regulation of glycolysis, mitochondrial stability, ROS production, and
permeability transition pore formation (Pastorino and Hoek,
2008). Previous investigators had noted that the knockdown of
the VDAC somehow blocks NLRP3 inflammasome activation,
but how this related to microbial sensing or whether this has
any relationship to cellular metabolism has not been clear
(Zhou et al., 2011). Previous studies on cellular metabolism
and regulation of apoptosis have observed that the association
of hexokinase with the VDAC is closely regulated by signaling
and feedback inhibition (Pastorino and Hoek, 2008). We hypothesized that the release of NAG after phagocytosis and degradation of PGN could affect mitochondria by influencing hexokinase
association with mitochondrial VDAC.
To test this hypothesis, we measured the ability of PGN
and NAG to induce release of hexokinase from mitochondria.
We observed that macrophages stimulated with PGN or NAG
showed elevated cytosolic levels of hexokinase by immunoblot
(Figures 4E and 4F). The increased hexokinase in the cytosol is
not a result of generalized loss of mitochondrial integrity; we
did not detect increased cytosolic levels of other mitochondrial
proteins, including Tom20 or cytochrome c (Figures 4E and
4F), and mitochondrial membrane potential was not affected
(Figures S4E and S4F). Thus, while acute activation of the
NLRP3 inflammasome by some stimuli such as ATP may cause
severe cell damage and mitochondrial disruption, PGN and NAG
appear to induce a physiologically tolerable level of hexokinase
release that does not involve degradation of mitochondria or
collapse of total cellular mitochondrial function.
To more quantitatively measure hexokinase dissociation from
mitochondria, we measured hexokinase release into the cytosol
by ELISA (Figure 4G) and enzyme activity (Figure 4H) and
observed increases in response to PGN and NAG. We observed
hexokinase dissociation from mitochondria after phagocytosis
of a variety of Gram-positive bacteria (Figure S4G). Overall
expression of hexokinase was not affected (Figure S4H). Hexokinase release in response to PGN is upstream of NLRP3, since
we observed normal hexokinase release into the cytosol in
NLRP3-deficient macrophages (Figure S4I). To more directly
evaluate the effect of NAG on hexokinase, we microinjected
sugars directly into the cytosol of primary BMDMs expressing
GFP-tagged hexokinase, which localizes to mitochondria (Figure 4I). While injection of NAM had no effect on hexokinase localization, injection of NAG induced dissociation of hexokinase
from mitochondria within minutes, confirming that free NAG in
the cytosol is sufficient to trigger hexokinase release. Mitochondria remain intact, as observed by the expression of DsRed
tagged with a mitochondrial localization sequence.
Hexokinase Dissociation from Mitochondria Is Sufficient
to Trigger NLRP3 Inflammasome Activation and IL-1b
Production
We predicted that, if hexokinase inhibition and release from
the mitochondria constitute an initiating step in NLRP3
(G) Unprimed BMDMs were treated with lipofectamine-complexed sugars as in (F), and TNF-a was measured in the supernatant after 6 hr.
(H) LPS-primed BMDMs from wild-type (WT) and Nlrp3-/- mice were stimulated as described above.
(I) LPS-primed BMDMs were stimulated with lipofectamine-complexed NAG in the presence of increasing concentrations of extracellular KCl for 6 hr.
(J) LPS-primed BMDMs, stimulated as described above, were assessed for LDH release at indicated times.
Error bars indicate SD. ***p < 0.001, Student's t test.
See also Figure S2.
Figure 3. Acetylation of NAG Is Necessary for Inflammasome Activation by PGN

(A) LPS-primed BMDMs were treated with increasing doses of native (AxPGN) or re-acetylated (Ac-AxPGN) anthrax PGN (20–160 mg/ml) for 6 hr, and IL-1b in
the supernatant was measured by ELISA. UT, untreated.
(B) TRITC-labeled native or re-acetylated anthrax PGN (40 mg/ml) was incubated with LPS-primed BMDMs for 1 hr, and internalization was confirmed by
fluorescence microscopy.
(C) TRITC-labeled native or re-acetylated anthrax PGN internalization by BMDMs was measured by flow cytometry after 6 hr.
(D) LPS-primed BMDMs from wild-type (WT) and Nlrp3-/- mice were stimulated with 80 mg/ml AxPGN for 6 hr, 80 mg/ml Ac-AxPGN for 6 hr, 5 mM ATP for 2 hr, or
pdA:dT for 6 hr, and IL-1b was measured by ELISA.
(E) Mice were injected i.p. with PBS (n = 8) or 10 mg of AxPGN (n = 7) or Ac-AxPGN (n = 7). After 4 hr, cells in the peritoneal lavage were harvested and analyzed by
flow cytometry. Total cells (left panel) and neutrophils (middle and right panels) were increased in response to re-acetylation of AxPGN. ns, not significant.
(F) Mice were injected i.p. with 10 mg Ac-AxPGN without (Ctl) (n = 5) or with 25 mg/kg anakinra (n = 5) and assayed as in (E) (experiment was performed 13).
Error bars indicate SD. **p ≤ 0.01; ***p < 0.001, one-way ANOVA and Newman-Keuls multiple comparison test (E) and unpaired Student's t test (F).
See also Figure S3.
[Figure 4 graphic omitted. Panel axes: cytosolic mtDNA (fold induction); HK activity (U/ml) for mouse and human enzyme versus sugar concentration (mM); % activity; cytosolic HK2 (ng/ml); cytosolic HK activity (% of total cell activity). Immunoblot rows (cytosol): HK2, Tom20, CytoC, tubulin, with an antibody specificity control lane. Microscopy: HK2-GFP and Mito-DsRed before and after injection of NAG or NAM.]
Figure 4. Hexokinase Is the Receptor that Detects NAG for Inflammasome Activation
(A) LPS-primed BMDMs were stimulated with ATP (5 mM) for 2 hr or PGN (40 mg/ml) as indicated, and the presence of mtDNA in the cytosolic fraction was
measured by RT-PCR.
(B and C) Hexokinase (HK) activity in purified mouse macrophage mitochondria was assessed in the presence of increasing concentrations of (B) NAG, NAM,
sucrose, or (C) MDP (25 mM). UT, untreated.
(D) NAG inhibits the activity of purified human hexokinase.
(E and F) Hexokinase in the cytosol fraction of LPS-primed BMDMs stimulated for the indicated times with PGN (40 mg/ml, from S. aureus) or lipofectamine-complexed NAG was detected by immunoblot. Clotrimazole (CTM) treatment was used as a positive control for hexokinase release from mitochondria, and
mitochondrial markers Tom20 and cytochrome c (CytoC) were included to control for mitochondrial integrity. Control lysates were included (Antibody Specificity
Ctl) to confirm antibody staining for each marker (non-continuous lane from the same gel). Lipo, lipofectamine.
(G) LPS-primed BMDMs were stimulated as indicated (PGN, 40 mg/ml), and hexokinase 2 in the cytosol was determined by ELISA.
(H) LPS-primed BMDMs were stimulated as indicated, and cytosolic hexokinase enzyme activity was measured.
(I) LPS-primed BMDMs expressing hexokinase 2 fused to GFP (HK2-GFP) and DsRed targeted to mitochondria (Mito-DsRed) were microinjected with NAG or
NAM as indicated. Association of hexokinase with mitochondria was visualized before and 1 min after injection (n = 10 cells assessed for each sugar).
Error bars indicate SD. *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001, Student's t test.
See also Figure S4.
Figure 5. Hexokinase Dissociation from Mitochondria Is Sufficient to Activate the NLRP3 Inflammasome
(A) LPS-primed BMDMs were treated with cell-permeable hexokinase dissociation peptide (HKVBD) or scrambled control peptides (Ctl) fused to TAT peptide
(20 mM) for the indicated times, and the amount of hexokinase in the cytosolic fraction was determined by immunoblot. Control lysate was included on each gel
(Antibody Specificity Ctl) to confirm antibody staining for each marker (non-continuous lane from the same gel). UT, untreated.
(B) LPS-primed BMDMs expressing HK2-GFP and mitochondria-localized DsRed (Mito-DsRed) were imaged following treatment with HKVBD and Ctl peptides
fused to TAT (20 mM) to assess hexokinase redistribution.
(C and D) LPS-primed BMDMs were treated with HKVBD or control peptides fused to cell-permeable antennapedia peptide; IL-1b (C) and IL-18 (D) were
measured by ELISA after 2 hr.
(E) LPS-primed BMDMs from wild-type and Nlrp3-/- mice were stimulated with 5 mM ATP, pdA:dT, HKVBD or control peptides fused to antennapedia peptide,
and IL-1b was measured in the supernatant after 2 hr.
(F and G) LPS-primed BMDMs were treated with HKVBD or control peptides fused to TAT peptide; (F) cleaved IL-1b and caspase-1 were detected by immunoblot at
2 hr, and (G) inflammasome assembly was observed by NLRP3 and caspase-1 p10 proximity ligation (PLA, red). Nuclei were stained with DAPI (blue). Sup, supernatant.
inflammasome activation by PGN, then forcing hexokinase to
dissociate from the VDAC would be sufficient to trigger IL-1b
secretion. Previous metabolic studies have characterized the
binding site between hexokinase II and the VDAC and have
shown that a peptide derived from hexokinase II (HKVBD) can
be used to block binding and cause dissociation of the enzyme
from the VDAC (Chiara et al., 2008; Majewski et al., 2004; Pastorino et al., 2002). When we treated LPS-primed macrophages
with a cell-permeable version of this peptide, hexokinase rapidly
dissociated from mitochondria, as measured by immunoblot
of the cytosol fraction (Figure 5A). We also directly observed
HKVBD-peptide-induced dissociation of GFP-tagged hexokinase 2 from macrophage mitochondria by microscopy within minutes of exposure to the peptide (Figure 5B). When we treated
LPS-primed BMDMs with the hexokinase dissociation peptide,
we observed dose-dependent release of mature IL-1b by ELISA
(Figure 5C; Figure S5A), as well as IL-18 (Figure 5D). IL-1b processing and secretion in response to the HKVBD peptide were
faster than PGN (Figures S5B and S5C) and NLRP3 dependent
(Figure 5E). Inflammasome activation was confirmed by observation of cleaved IL-1b p17 and caspase-1 p10 in the supernatant of treated BMDMs (Figure 5F). Proximity ligation assay
revealed direct association of NLRP3 and caspase-1, evidence
of initial inflammasome assembly, within minutes of exposure
to the peptide (Figure 5G). To determine whether hexokinase
dissociation from the VDAC on mitochondrial membranes is sufficient to activate inflammatory responses in vivo, we injected
mice intraperitoneally with cell-permeable control or HKVBD
peptides. The HKVBD peptide was sufficient to induce inflammation and recruitment of inflammatory cells (Figures 5H and
S5D), and this response was reduced in mice deficient in caspase-1 and -11 (Figure 5I).
While NAG inhibits hexokinase activity and, therefore, triggers
hexokinase dissociation from mitochondria, HKVBD peptide induces hexokinase dissociation but does not inhibit hexokinase
activity (Figure S5E). Therefore, we conclude that hexokinase
dissociation, rather than inhibition, is the important upstream
step in inflammasome activation. HKVBD peptide can be toxic
to the cells during extended exposure, but during the short exposure in which IL-1b is induced, we observed only a small amount
of increased cell death (Figure S5F). Consistent with PGN
and NAG, we observed an increase in cytosolic mtDNA when
we stimulated cells with HKVBD peptide (Figure S5G). Since
HKVBD-peptide-induced hexokinase release is sufficient to activate the NLRP3 inflammasome, and since PGN degradation
leads to hexokinase release, we conclude that this mechanism is sufficient to explain how PGN activates the NLRP3
inflammasome.
Metabolic Conditions that Result in Hexokinase
Inhibition Lead to Inflammasome Activation
Glycolysis is regulated, in part, by feedback inhibition of hexokinase by its enzymatic product glucose-6-phosphate (G6P).
High levels of G6P trigger the release of hexokinase from mitochondria and, thus, reduce the rate of further G6P production
(Gerber et al., 1974; Pastorino et al., 2002). Therefore, we predicted that excess G6P would activate the NLRP3 inflammasome. Indeed, when we treated primed BMDMs and bone
marrow-derived dendritic cells (BMDCs) (data not shown) with
lipofectamine-complexed G6P, we observed IL-1b secretion
by ELISA (Figure 6A) and production of cleaved IL-1b p17 and
caspase-1 p10 in the supernatant (Figure 6B). This activation
was NLRP3 dependent (Figure S6A). Consistent with its role as
a hexokinase inhibitor, we observed increased hexokinase in
the cytosol following G6P treatment (Figure 6C). 2-deoxyglucose
(2-DG) is a glycolytic inhibitor that is commonly used in studies
of cell metabolism. It competes with glucose in the glycolytic
pathway. When we treated primed BMDMs with 2-DG, we
observed dose-dependent induction of IL-1b by ELISA (Figure 6D), as well as cleaved IL-1b p17 and caspase-1 p10 in the
supernatant (Figure 6E), as has recently been reported by others
(Nomura et al., 2015). However, 2-DG does not function as an
inhibitor of hexokinase like NAG or G6P. Instead, 2-DG is metabolized by hexokinase to 2-deoxyglucose-6-phosphate (2-DG6P)
(Figure S4C), which cannot be utilized by downstream glycolytic
enzymes (Wick et al., 1957). The result is a buildup of 2-DG6P
that inhibits hexokinase like G6P but is sensitive to the presence
of glucose, hexokinase's preferred substrate. Thus, while 2-DG
can trigger inflammasome activation in primed macrophages in
media containing glucose, it is more effective in the absence of
glucose (Figure 6F) and is completely dependent on NLRP3 (Figure S6B). Consistent with our previous observations, 2-DG treatment leads to an increase in hexokinase in the cytosol of BMDMs
(Figure 6G). Lastly, we treated primed cells with citrate, a natural
intermediate in the tricarboxylic acid (TCA) pathway that inhibits
phosphofructokinase when it accumulates. Buildup of citrate
thus backs up the glycolytic pathway and naturally elevates
cytosolic G6P levels (Berg et al., 2002). As expected, treating
primed cells with citrate triggered NLRP3-dependent IL-1b
production and caspase-1 cleavage (Figures 6H and S6C). While
these metabolic stresses can cause cell death under certain
conditions, we observed IL-1b production under conditions
that do not cause substantial cell death (Figure S6D). These
data suggest an intriguing relationship between cellular metabolism and inflammatory signaling.
DISCUSSION
While the original evolutionary role of phagocytosis was to eat
and degrade other microbes to obtain nutrients, this study
suggests that mammalian phagocytes have adapted the cell's
metabolic machinery for utilizing these nutrients to detect the
presence of microbial-derived sugars and metabolic perturbation as danger signals. In this study, we have shown that the
NAG subunit of the sugar backbone of bacterial PGN induces
inflammasome activation by inhibiting hexokinase, the first
(H and I) Indicated mice were injected i.p. with 500 ml of PBS (n = 6), 240 mM HKVBD (n = 7), or control peptide fused to TAT peptide (n = 6); peritoneal cavities were
lavaged after 4 hr; total cells were counted; and neutrophil content was determined by flow cytometry (both experiments were each done 13). ns, not significant.
Error bars indicate SD. *p ≤ 0.05; **p ≤ 0.01; ***p ≤ 0.001, Student's t test (C–E and H) and one-way ANOVA and Newman-Keuls multiple comparison test (G).
See also Figure S5.
Figure 6. Metabolic Perturbations Affecting Hexokinase Activate the NLRP3 Inflammasome

(A and B) LPS-primed BMDMs were treated with increasing concentrations of lipofectamine-complexed glucose-6-phosphate (G6P) for 6 hr. IL-1b was measured
in the supernatant by ELISA (A), and cleaved IL-1b and caspase-1 were detected by immunoblot (B). UT, untreated; Lipo, lipofectamine.
(C) Hexokinase was detected in the cytosolic fraction following treatment with lipofectamine-complexed G6P for the indicated times. Control lysate was included
on each gel (Antibody Specificity Ctl) to confirm antibody staining for each marker (non-continuous lane from the same gel). CTM, clotrimazole.
(D and E) LPS-primed BMDMs were treated with increasing concentrations of 2-deoxyglucose (2-DG) for 6 hr. IL-1b was measured in the supernatant (Sup) by
ELISA (D), and cleaved IL-1b and caspase-1 were detected by immunoblot (E).
(F) LPS-primed BMDMs were treated with 2-DG in the presence or absence of glucose for 6 hr, and IL-1b was measured in the supernatant by ELISA.
(G) Hexokinase was measured in the cytosolic fraction following treatment with 2-DG in the absence of glucose.
(H) LPS-primed wild-type or Nlrp3-/- BMDMs were treated with increasing concentrations of sodium citrate for 6 hr, and IL-1b was measured in the supernatant
by ELISA.
Error bars indicate SD. *p % 0.05; **p % 0.01; ***p % 0.001, Students t test.
See also Figure S6.
step in glycolysis. The inhibition of hexokinase results in hexokinase dissociation from the mitochondria, which, as we have
observed, is sufficient to initiate an NLRP3 inflammasome-activating cascade in the cell. This model is supported by the
observation that several metabolic perturbations that inhibit
hexokinase function, such as treatment with glucose-6-phosphate, 2-deoxyglucose, or citrate, all lead to inflammasome
activation. How exactly hexokinase release from the mitochondrial outer membrane promotes NLRP3 inflammasome activation remains to be understood.
Mitochondrial dynamics have been broadly implicated in regulation of the NLRP3 inflammasome, including studies demonstrating a role for mitochondrial movement along microtubules
(Misawa et al., 2013), regulation of mitochondrial fission and
growth (Park et al., 2015), mitophagy (Zhong et al., 2016), and
release of mtDNA into the cytosol (Shimada et al., 2012; Zhong
et al., 2016) in modulating inflammasome activation. NAG inhibition of hexokinase was particularly interesting, because hexokinases I and II, the primary isoforms that regulate glycolysis, are
known to associate with the VDAC in the mitochondrial outer
membrane (John et al., 2011; Pastorino and Hoek, 2008; Pastorino et al., 2002; Rasola et al., 2010). The VDAC is known to regulate mitochondrial ROS (reactive oxygen species) production
(da-Silva et al., 2004), is a suggested component of the mitochondrial permeability transition pore that can release large molecules (including mtDNA) into the cytosol (Rasola et al., 2010;
Tomasello et al., 2009), and is localized to regions enriched for
cardiolipin (Sun et al., 2012), an NLRP3 activator. The interaction
of hexokinase with the VDAC on the outer membrane of mitochondria provides hexokinase with preferential access to newly
produced ATP transported from the matrix by the VDAC (Pastorino and Hoek, 2008). Hexokinase inhibition and dissociation from
the mitochondria constitute an essential step in regulation of the
rate of glycolysis (da-Silva et al., 2004; Pastorino and Hoek,
2008). Excess glucose-6-phosphate generated by hexokinase
leads to feedback inhibition of hexokinase and its dissociation
from mitochondria, slowing glycolysis. In addition, the interaction of hexokinase with the VDAC protects cells from mitochondrial ROS production (da-Silva et al., 2004) and suppresses
pro-apoptotic interactions between the VDAC and Bcl-family
members (Bax, Bid, etc.), which promote sustained opening
of the mitochondrial permeability transition pore (Chiara et al.,
2008; Majewski et al., 2004; Pastorino and Hoek, 2008; Pastorino et al., 2002; Rasola et al., 2010). Each of these processes
has been previously implicated in NLRP3 inflammasome regulation, but how they relate to microbial sensing has not been
understood (Shimada et al., 2012; Zhou et al., 2011).
At first thought, NAG would seem to be a poor candidate to be
a PAMP detected by the innate immune system, since it is not
unique to bacteria. However, free NAG is not generally found in
the cytosol of mammalian cells and is primarily generated only
in small amounts following degradation of glycosylated proteins.
In biosynthetic pathways, uridine diphosphate (UDP)-NAG is
synthesized directly from glycolytic intermediates and utilized
in glycosylation processes without existing as free NAG. In
contrast, during degradation of particulate PGN in phagosomes,
unusually large amounts of NAG can be expected to become
available. The presence of a transporter that moves NAG from
lysosomes into the cytosol has been biochemically documented
(Jonas and Jobe, 1990; Jonas et al., 1989), although the molecular identity of the transporter is not yet known. We interpret the
data to suggest that macrophages have adapted to use their
ancient and essential metabolic glycolysis pathway to directly
sense unusually high levels of bacteria-derived NAG, generated
only upon phagocytosis of bacteria. Interestingly, unlike many
strong NLRP3 inflammasome activators, PGN does not stimulate pyroptosis. It is possible that this resistance to cell damage
is a result of the lower level of prolonged inflammasome activation stimulated by PGN, as compared to strong acute NLRP3
activators such as ATP. This is consistent with the idea that,
from the perspective of mounting an effective host defense,
macrophage cell death would seem to be an inappropriate
response to detection of PGN.
Though many studies over the years have suggested that
innate immune signaling has important effects on metabolism,
these studies have not implicated metabolic enzymes themselves as sensors of non-self (Haneklaus and O'Neill, 2015;
Wen et al., 2012). Our findings suggest a novel area for crosstalk
between metabolism and innate immune signaling. The observation that modulation of a metabolic process can directly induce
inflammation may have profound implications for diseases as
wide ranging as diabetes, obesity, atherosclerosis, or inflammatory bowel disease, which have been linked to inflammasome
activation (Wen et al., 2012).
Mice
C57BL/6 mice were purchased from Jackson Laboratory. Nlrp3−/− mice
(Mariathasan et al., 2006) were obtained from Dr. K. Fitzgerald (University of
Massachusetts), and Casp1−/− mice also deficient in caspase-11 (Kuida
et al., 1995) were obtained from R.A. Flavell (Yale University). Mice were
housed in specific pathogen-free conditions in the Cedars-Sinai animal facility,
and all animal experiments were conducted according to Cedars-Sinai Medical Center Institutional Animal Care and Use Committee guidelines.
Cell Preparation and Stimulation/Lipofectamine Complexing
BMDMs and dendritic cells were grown as described previously (Wolf et al.,
2011), using 10% L-cell conditioned media or 50 ng/ml recombinant human
macrophage colony-stimulating factor (M-CSF) or murine granulocyte-macrophage (GM)-CSF, respectively. Cells were plated at 1 × 10^5 in a 96-well plate
and primed with LPS (50–100 ng/ml) for 3–4 hr, followed by treatment with
PGN (20–40 µg/ml) for 6 hr or ATP (5 mM) or nigericin (10 µM) for 2 hr. Lipofectamine-complexed stimuli were prepared by mixing pdA:dT (1 µg/ml) or sugars
(1 M) prepared in Opti-MEM with 2–4 µl lipofectamine (Invitrogen) per 100 µl for
30 min at room temperature. Cells were stimulated with 10 µl of the mix per
well. All sugar solutions were adjusted to pH 7.4 prior to mixing with lipofectamine. Infection of cells with ΔoatA S. aureus and other bacteria was
done as described previously (Shimada et al., 2010; Wolf et al., 2011). Cell supernatants were analyzed by ELISA for IL-1β and TNFα (BioLegend) and IL-18
(MBL International).
Anthrax PGN
PGN from Bacillus anthracis Sterne strain was purified at the University of Oklahoma Health Sciences Center, as previously described (Iyer and Coggeshall,
2011; Langer et al., 2008). Anthrax PGN was re-acetylated according to previously published methods (Vollmer and Tomasz, 2000). AxPGN and Ac-AxPGN
were labeled with TRITC (Biotium) or Alexa Fluor 647 in PBS for 1 hr at 37°C
and washed to remove unreacted fluorophore. Labeled PGN was added to
BMDMs and spun down briefly, and cells were allowed to internalize PGN
for 1 hr and washed. Cells were either imaged or treated with PBS + proteinase
K (1 U/ml) to remove unbound particles and analyzed by flow cytometry to
determine the degree of phagocytosis.
support came from the Janis and William Wetsman Family Chair in Inflammatory Bowel Disease at Cedars-Sinai Medical Center (D.M.U.). Thanks to the
laboratory of Dr. Robin Shaw for use of their FemtoJet microinjection system.
Intraperitoneal Injections of PGN and HKVBD

C57BL/6 mice (n = 7 per group) were injected intraperitoneally (i.p.) with 500 µl
sterile PBS alone or containing 10 µg/ml AxPGN or Ac-AxPGN or 240 µM TAT-fused HKVBD or scramble peptide. For adoptive transfer experiments, wild-type and Nlrp3−/− BMDMs were primed with 100 ng/ml PAM3CSK4 for 4 hr
and then given 80 µg/ml Ac-AxPGN for 1 hr. Cells were washed 3× with
PBS and counted, and 1 × 10^5 cells in 500 µl sterile PBS were injected i.p.
Mice were rested for 4 hr before being euthanized, and the peritoneal cavity
was lavaged 2× with 5 ml of cold sterile PBS + 2 mM EDTA. Cells were
counted; stained to assess cell types using antibodies against CD11b,
CD11c, Gr-1, CD3, and CD19 (BioLegend); and analyzed by flow cytometry
using FlowJo software.
Received: October 23, 2015

Revised: March 11, 2016
Cell Fractionation and Hexokinase Detection

Cells were grown in six-well plates and stimulated as described earlier.
Mitochondria and cytosol were separated using protocols previously described
(Pastorino et al., 2002), with some modification. Cells were lifted in 600 µl of
cell disruption buffer (20 mM HEPES, 10 mM KCl, 1.5 mM MgCl2, 1 mM EDTA,
250 mM sucrose, Roche cOmplete Protease Inhibitor) and incubated on ice for
5 min. The cells were disrupted by passing 30 times through a 22-gauge needle.
In some cases, cytosol was isolated by adding 50 µg/ml digitonin to the disruption buffer and incubating with rocking for 10 min. Lysates were then centrifuged
at 10,000 × g for 10 min, and the supernatant was designated cytosol. For hexokinase assays, the pellet containing mitochondria was washed and resuspended
in mitochondria suspension media (20 mM HEPES, 1.5 mM MgCl2, 250 mM
sucrose). Cytosolic hexokinase was measured by either immunoblot or mouse
hexokinase 2 ELISA (Novateinbio) on equivalent amounts of protein lysate.
Proximity Ligation Assay
BMDMs were plated on glass coverslips and primed with LPS for 4 hr. Cells
were stimulated for designated times with the indicated stimuli, fixed with
4% paraformaldehyde, permeabilized using 0.1% Triton X-100, and stained
with anti-NLRP3 (Cryo-2) (AdipoGen) and anti-caspase-1 p10 (Santa Cruz
Biotechnology). The Duolink In Situ PLA Kit was used according to the manufacturer's instructions (Olink Biosciences).
Statistical Analysis
All experiments were conducted with triplicate measurements a minimum of
three times unless otherwise stated in the figure legends. Experiments were
analyzed using GraphPad Prism software or Microsoft Excel. The Grubbs'
test was used to identify and exclude a single outlier (Figure 3E).
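The single-outlier screen named above can be illustrated with a minimal sketch. This is not the analysis code used in the study, which relied on GraphPad Prism or Microsoft Excel. The function names `grubbs_statistic` and `exclude_single_outlier` are hypothetical. The critical value is left as a caller-supplied argument because the significance level is not stated in the text, and the example data are invented for illustration only.

```python
from statistics import mean, stdev


def grubbs_statistic(values):
    """Return (G, index) for the value farthest from the sample mean.

    G = max|x_i - mean| / s, where s is the sample standard deviation.
    """
    if len(values) < 3:
        raise ValueError("Grubbs' test needs at least three values")
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("all values are identical; nothing to test")
    index = max(range(len(values)), key=lambda i: abs(values[i] - m))
    return abs(values[index] - m) / s, index


def exclude_single_outlier(values, critical_value):
    """Drop at most one value, and only if G exceeds the critical value.

    critical_value is assumed to come from a published Grubbs table for
    the chosen significance level and sample size (not given in the text).
    Returns (kept_values, removed_value_or_None).
    """
    g, index = grubbs_statistic(values)
    if g > critical_value:
        return values[:index] + values[index + 1:], values[index]
    return list(values), None


# Hypothetical replicate measurements, for illustration only.
replicates = [10.0, 10.0, 10.0, 10.0, 20.0]
g_value, suspect = grubbs_statistic(replicates)
```

The test is applied once, so at most one value is removed, which matches the "single outlier" wording in the text.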
Supplemental Information includes Supplemental Experimental Procedures
and six figures and can be found with this article online at http://dx.doi.org/10.1016/j.cell.2016.05.076.
An audio PaperClip is available at http://dx.doi.org/10.1016/j.cell.2016.05.076#mmc3.
Studies were designed by A.J.W. and D.M.U. and performed by A.J.W.,
C.N.R., C.B., M.L.W., and K.S. Microinjection experiments were performed
by A.J.W. and W.L. Anthrax PGN was prepared by K.M.C. and N.I.P. Further
advice and conceptual development were provided by H.C.C. and M.A. The
manuscript was prepared by A.J.W. and D.M.U.
ACKNOWLEDGMENTS
This study was funded by grants from the NIH (GM085796 to D.M.U.,
T32AI089553 to A.J.W., AI067995 to M.A., and AI062629 to K.M.C.). Further
REFERENCES
Aleshin, A.E., Zeng, C., Bartunik, H.D., Fromm, H.J., and Honzatko, R.B.
(1998a). Regulation of hexokinase I: crystal structure of recombinant human
brain hexokinase complexed with glucose and phosphate. J. Mol. Biol. 282,
345–357.
Aleshin, A.E., Zeng, C., Bourenkov, G.P., Bartunik, H.D., Fromm, H.J., and
Honzatko, R.B. (1998b). The mechanism of regulation of hexokinase: new insights from the crystal structure of recombinant human brain hexokinase complexed with glucose and glucose-6-phosphate. Structure 6, 39–50.
J.M. Berg, J.L. Tymoczko and L. Stryer, eds. (2002). The glycolytic pathway is
tightly controlled. In Biochemistry, Fifth Edition (New York: W. H. Freeman),
Section 16.2. http://www.ncbi.nlm.nih.gov/books/NBK21154/.
Chiara, F., Castellaro, D., Marin, O., Petronilli, V., Brusilow, W.S., Juhaszova,
M., Sollott, S.J., Forte, M., Bernardi, P., and Rasola, A. (2008). Hexokinase II
detachment from mitochondria triggers apoptosis through the permeability
transition pore independent of voltage-dependent anion channels. PLoS
ONE 3, e1852.
da-Silva, W.S., Gomez-Puyou, A., de Gomez-Puyou, M.T., Moreno-Sanchez,
R., De Felice, F.G., de Meis, L., Oliveira, M.F., and Galina, A. (2004). Mitochondrial bound hexokinase activity as a preventive antioxidant defense: steady-state ADP formation as a regulatory mechanism of membrane potential
and reactive oxygen species generation in mitochondria. J. Biol. Chem. 279,
39846–39855.
Dostert, C., Petrilli, V., Van Bruggen, R., Steele, C., Mossman, B.T., and
Tschopp, J. (2008). Innate immune activation through Nalp3 inflammasome
sensing of asbestos and silica. Science 320, 674–677.
Duewell, P., Kono, H., Rayner, K.J., Sirois, C.M., Vladimer, G., Bauernfeind,
F.G., Abela, G.S., Franchi, L., Nunez, G., Schnurr, M., et al. (2010). NLRP3 inflammasomes are required for atherogenesis and activated by cholesterol
crystals. Nature 464, 1357–1361.
Faustin, B., Lartigue, L., Bruey, J.M., Luciano, F., Sergienko, E., Bailly-Maitre,
B., Volkmann, N., Hanein, D., Rouiller, I., and Reed, J.C. (2007). Reconstituted
NALP1 inflammasome reveals two-step mechanism of caspase-1 activation.
Mol. Cell 25, 713–724.
Ferwerda, G., Kramer, M., de Jong, D., Piccini, A., Joosten, L.A., Devesa-Giner,
I., Girardin, S.E., Adema, G.J., van der Meer, J.W., Kullberg, B.J., et al. (2008).
Engagement of NOD2 has a dual effect on proIL-1beta mRNA transcription
and secretion of bioactive IL-1beta. Eur. J. Immunol. 38, 184–191.
Gerber, G., Preissler, H., Heinrich, R., and Rapoport, S.M. (1974). Hexokinase
of human erythrocytes. Purification, kinetic model and its application to the
conditions in the cell. Eur. J. Biochem. 45, 39–52.
Haneklaus, M., and O'Neill, L.A. (2015). NLRP3 at the interface of metabolism
and inflammation. Immunol. Rev. 265, 53–62.
Hornung, V., Bauernfeind, F., Halle, A., Samstad, E.O., Kono, H., Rock, K.L.,
Fitzgerald, K.A., and Latz, E. (2008). Silica crystals and aluminum salts activate
the NALP3 inflammasome through phagosomal destabilization. Nat. Immunol.
9, 847–856.
Hsu, L.C., Ali, S.R., McGillivray, S., Tseng, P.H., Mariathasan, S., Humke, E.W.,
Eckmann, L., Powell, J.J., Nizet, V., Dixit, V.M., and Karin, M. (2008). A NOD2-NALP1 complex mediates caspase-1-dependent IL-1beta secretion in
response to Bacillus anthracis infection and muramyl dipeptide. Proc. Natl.
Acad. Sci. USA 105, 7803–7808.
Ip, W.K., Sokolovska, A., Charriere, G.M., Boyer, L., Dejardin, S., Cappillino,
M.P., Yantosca, L.M., Takahashi, K., Moore, K.J., Lacy-Hulbert, A., and Stuart,
L.M. (2010). Phagocytosis and phagosome acidification are required for pathogen processing and MyD88-dependent responses to Staphylococcus
aureus. J. Immunol. 184, 7071–7081.
Iyer, J.K., and Coggeshall, K.M. (2011). Cutting edge: primary innate immune
cells respond efficiently to polymeric peptidoglycan, but not to peptidoglycan
monomers. J. Immunol. 186, 3841–3845.
John, S., Weiss, J.N., and Ribalet, B. (2011). Subcellular localization of hexokinases I and II directs the metabolic fate of glucose. PLoS ONE 6, e17674.
Jonas, A.J., and Jobe, H. (1990). N-acetyl-D-glucosamine countertransport in
lysosomal membrane vesicles. Biochem. J. 268, 41–45.
Jonas, A.J., Speller, R.J., Conrad, P.B., and Dubinsky, W.P. (1989). Transport
of N-acetyl-D-glucosamine and N-acetyl-D-galactosamine by rat liver lysosomes. J. Biol. Chem. 264, 4953–4956.
Kuida, K., Lippke, J.A., Ku, G., Harding, M.W., Livingston, D.J., Su, M.S., and
Flavell, R.A. (1995). Altered cytokine export and apoptosis in mice deficient in
interleukin-1 beta converting enzyme. Science 267, 2000–2003.
Kumar, H., Kawai, T., and Akira, S. (2011). Pathogen recognition by the innate
immune system. Int. Rev. Immunol. 30, 16–34.
Lamkanfi, M., and Dixit, V.M. (2014). Mechanisms and functions of inflammasomes. Cell 157, 1013–1022.
Langer, M., Malykhin, A., Maeda, K., Chakrabarty, K., Williamson, K.S., Feasley, C.L., West, C.M., Metcalf, J.P., and Coggeshall, K.M. (2008). Bacillus
anthracis peptidoglycan stimulates an inflammatory response in monocytes
through the p38 mitogen-activated protein kinase pathway. PLoS ONE 3,
e3706.
Madej, T., Lanczycki, C.J., Zhang, D., Thiessen, P.A., Geer, R.C., Marchler-Bauer, A., and Bryant, S.H. (2014). MMDB and VAST+: tracking structural similarities between macromolecular complexes. Nucleic Acids Res. 42, D297–D303.
Majewski, N., Nogueira, V., Bhaskar, P., Coy, P.E., Skeen, J.E., Gottlob, K.,
Chandel, N.S., Thompson, C.B., Robey, R.B., and Hay, N. (2004). Hexokinase-mitochondria interaction mediated by Akt is required to inhibit apoptosis
in the presence or absence of Bax and Bak. Mol. Cell 16, 819–830.
Mariathasan, S., Weiss, D.S., Newton, K., McBride, J., O'Rourke, K., Roose-Girma, M., Lee, W.P., Weinrauch, Y., Monack, D.M., and Dixit, V.M. (2006).
Cryopyrin activates the inflammasome in response to toxins and ATP. Nature
440, 228–232.
Marina-Garcia, N., Franchi, L., Kim, Y.G., Miller, D., McDonald, C., Boons,
G.J., and Nunez, G. (2008). Pannexin-1-mediated intracellular delivery of muramyl dipeptide induces caspase-1 activation via cryopyrin/NLRP3 independently of Nod2. J. Immunol. 180, 4050–4057.
Martinon, F., Agostini, L., Meylan, E., and Tschopp, J. (2004). Identification of
bacterial muramyl dipeptide as activator of the NALP3/cryopyrin inflammasome. Curr. Biol. 14, 1929–1934.
Martinon, F., Petrilli, V., Mayor, A., Tardivel, A., and Tschopp, J. (2006). Gout-associated uric acid crystals activate the NALP3 inflammasome. Nature 440,
237–241.
Misawa, T., Takahama, M., Kozaki, T., Lee, H., Zou, J., Saitoh, T., and Akira, S.
(2013). Microtubule-driven spatial arrangement of mitochondria promotes
activation of the NLRP3 inflammasome. Nat. Immunol. 14, 454–460.
Muller, S., Wolf, A.J., Iliev, I.D., Berg, B.L., Underhill, D.M., and Liu, G.Y. (2015).
Poorly cross-linked peptidoglycan in MRSA due to mecA induction activates
the inflammasome and exacerbates immunopathology. Cell Host Microbe
18, 604–612.
Pan, Q., Mathison, J., Fearns, C., Kravchenko, V.V., Da Silva Correia, J., Hoffman, H.M., Kobayashi, K.S., Bertin, J., Grant, E.P., Coyle, A.J., et al. (2007).
MDP-induced interleukin-1beta processing requires Nod2 and CIAS1/
NALP3. J. Leukoc. Biol. 82, 177–183.
Park, S., Won, J.H., Hwang, I., Hong, S., Lee, H.K., and Yu, J.W. (2015). Defective mitochondrial fission augments NLRP3 inflammasome activation. Sci.
Rep. 5, 15489.
Pastorino, J.G., and Hoek, J.B. (2008). Regulation of hexokinase binding to
VDAC. J. Bioenerg. Biomembr. 40, 171–182.
Pastorino, J.G., Shulga, N., and Hoek, J.B. (2002). Mitochondrial binding
of hexokinase II inhibits Bax-induced cytochrome c release and apoptosis.
J. Biol. Chem. 277, 7610–7618.
Rasola, A., Sciacovelli, M., Pantic, B., and Bernardi, P. (2010). Signal transduction to the permeability transition pore. FEBS Lett. 584, 1989–1996.
Shimada, T., Park, B.G., Wolf, A.J., Brikos, C., Goodridge, H.S., Becker, C.A.,
Reyes, C.N., Miao, E.A., Aderem, A., Gotz, F., et al. (2010). Staphylococcus
aureus evades lysozyme-based peptidoglycan digestion that links phagocytosis, inflammasome activation, and IL-1beta secretion. Cell Host Microbe 7,
38–49.
Shimada, K., Crother, T.R., Karlin, J., Dagvadorj, J., Chiba, N., Chen, S., Ramanujan, V.K., Wolf, A.J., Vergnes, L., Ojcius, D.M., et al. (2012). Oxidized
mitochondrial DNA activates the NLRP3 inflammasome during apoptosis.
Immunity 36, 401–414.
Spiro, R.G. (1958). The effect of N-acetylglucosamine and glucosamine on carbohydrate metabolism in rat liver slices. J. Biol. Chem. 233, 546–550.
Sun, Y., Vashisht, A.A., Tchieu, J., Wohlschlegel, J.A., and Dreier, L. (2012).
Voltage-dependent anion channels (VDACs) recruit Parkin to defective mitochondria to promote mitochondrial autophagy. J. Biol. Chem. 287, 40652–40660.
Tomasello, F., Messina, A., Lartigue, L., Schembri, L., Medina, C., Reina, S.,
Thoraval, D., Crouzet, M., Ichas, F., De Pinto, V., and De Giorgi, F. (2009). Outer
membrane VDAC1 controls permeability transition of the inner mitochondrial
membrane in cellulo during stress-induced apoptosis. Cell Res. 19, 1363–1376.
Vollmer, W., and Tomasz, A. (2000). The pgdA gene encodes for a peptidoglycan N-acetylglucosamine deacetylase in Streptococcus pneumoniae.
J. Biol. Chem. 275, 20496–20501.
Wen, H., Ting, J.P., and O'Neill, L.A. (2012). A role for the NLRP3 inflammasome in metabolic diseases – did Warburg miss inflammation? Nat. Immunol.
13, 352–357.
Wick, A.N., Drury, D.R., Nakada, H.I., and Wolfe, J.B. (1957). Localization
of the primary metabolic block produced by 2-deoxyglucose. J. Biol. Chem.
224, 963–969.
Wilson, M.H., Edsell, M.E., Davagnanam, I., Hirani, S.P., Martin, D.S., Levett,
D.Z., Thornton, J.S., Golay, X., Strycharczuk, L., Newman, S.P., et al.;
Caudwell Xtreme Everest Research Group (2011). Cerebral artery dilatation
maintains cerebral oxygenation at extreme altitude and in acute hypoxia – an
ultrasound and MRI study. J. Cereb. Blood Flow Metab. 31, 2019–2029.
Wolf, A.J., Arruda, A., Reyes, C.N., Kaplan, A.T., Shimada, T., Shimada, K., Arditi, M., Liu, G., and Underhill, D.M. (2011). Phagosomal degradation increases
TLR access to bacterial ligands and enhances macrophage sensitivity to bacteria. J. Immunol. 187, 6002–6010.
Zhong, Z., Umemura, A., Sanchez-Lopez, E., Liang, S., Shalapour, S., Wong,
J., He, F., Boassa, D., Perkins, G., Ali, S.R., et al. (2016). NF-κB restricts inflammasome activation via elimination of damaged mitochondria. Cell 164,
896–910.
Munoz-Planillo, R., Kuffa, P., Martinez-Colon, G., Smith, B.L., Rajendiran,
T.M., and Nunez, G. (2013). K+ efflux is the common trigger of NLRP3 inflammasome activation by bacterial toxins and particulate matter. Immunity 38,
1142–1153.
Zhou, R., Yazdi, A.S., Menu, P., and Tschopp, J. (2011). A role for mitochondria
in NLRP3 inflammasome activation. Nature 469, 221–225.
Nomura, J., So, A., Tamura, M., and Busso, N. (2015). Intracellular ATP
decrease mediates NLRP3 inflammasome activation upon nigericin and crystal stimulation. J. Immunol. 195, 5718–5724.
Zipperle, G.F., Jr., Ezzell, J.W., Jr., and Doyle, R.J. (1984). Glucosamine substitution and muramidase susceptibility in Bacillus anthracis. Can. J. Microbiol.
30, 553–559.
Article
Amyloid-like Self-Assembly of a Cellular Compartment
Authors
Elvan Boke, Martine Ruer, Martin Wuhr, ...,
David Drechsel, Anthony A. Hyman,
Timothy J. Mitchison
Correspondence
elvan_boke@hms.harvard.edu
In Brief
Amyloid-like self-assembly of a specific
protein drives formation of a cellular
compartment in oocytes.
Highlights
• The organelle content of the Balbiani body is held together by an Xvelo matrix
• Xvelo forms amyloid-like networks in vitro, which can recruit RNA and mitochondria
• Prion-like domain of Xvelo dictates specificity in amyloid assembly
• Amyloid-like polymerization is conserved among vertebrate Balbiani body organizers
Boke et al., 2016, Cell 166, 637–650

Article
Amyloid-like Self-Assembly
of a Cellular Compartment
Elvan Boke,1,* Martine Ruer,2 Martin Wuhr,1,3 Margaret Coughlin,1 Regis Lemaitre,2 Steven P. Gygi,3 Simon Alberti,2
David Drechsel,2 Anthony A. Hyman,2 and Timothy J. Mitchison1
1Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
2Max Planck Institute of Molecular Cell Biology and Genetics, 01307 Dresden, Germany
3Department of Cell Biology, Harvard Medical School, Boston, MA 02115, USA
*Correspondence: elvan_boke@hms.harvard.edu
SUMMARY
Most vertebrate oocytes contain a Balbiani body, a
large, non-membrane-bound compartment packed
with RNA, mitochondria, and other organelles. Little
is known about this compartment, though it specifies
germline identity in many non-mammalian vertebrates. We show Xvelo, a disordered protein with
an N-terminal prion-like domain, is an abundant constituent of Xenopus Balbiani bodies. Disruption of the
prion-like domain of Xvelo, or substitution with a
prion-like domain from an unrelated protein, interferes with its incorporation into Balbiani bodies
in vivo. Recombinant Xvelo forms amyloid-like networks in vitro. Amyloid-like assemblies of Xvelo recruit both RNA and mitochondria in binding assays.
We propose that Xenopus Balbiani bodies form by
amyloid-like assembly of Xvelo, accompanied by
co-recruitment of mitochondria and RNA. Prion-like
domains are found in germ plasm organizing proteins in other species, suggesting that Balbiani
body formation by amyloid-like assembly could be
a conserved mechanism that helps oocytes function
as long-lived germ cells.
INTRODUCTION
Female germ cells, oocytes, are highly specialized cells. In many
species, oocytes are long-lived and lie dormant for months or
years before they are activated prior to fertilization (Li and Albertini, 2013). They ensure the continuity of the species by providing
the female genome and mitochondria, along with most of the
housekeeping machinery and nutrients the early embryo will
need after fertilization. Oocytes have a unique subcellular organization, with a large nucleus, called the germinal vesicle, and
a large cytoplasm. In many species, the cytoplasm of the early
oocytes contains a highly specialized compartment called the
Balbiani body. It is non-membrane bound and densely packed
with mitochondria, RNA, ER, and Golgi. The Balbiani body assembles early in oocyte formation (Lei and Spradling, 2016), disappears in late-stage oocytes in mammals (Pepling et al., 2007),
and forms dispersed isles at the vegetal pole in late-stage oocytes
of Xenopus and zebrafish (Bontems et al., 2009; Kloc et al., 2004).
How, or why, the Balbiani body forms is largely mysterious
(Gupta et al., 2010; Heim et al., 2014; Kloc and Etkin, 1995;
Kloc et al., 1993; Marlow and Mullins, 2008). In some species,
including frogs and fish, but not newts and mammals, one function of the Balbiani body is to serve as an embryonic determinant
that specifies germline identity by forming germ plasm (Lesch
and Page, 2012; Richardson and Lehmann, 2010). Germ plasm,
a special part of oocyte cytoplasm, protects specific maternal
RNAs from degradation and is believed to host healthy mitochondria during development to pass on to future generations
(Houston and King, 2000; Kogo et al., 2011). Germline specification is the only known function of the Balbiani body, but it is
notable that oocytes from species that use inductive processes
to specify their germline, including mammals, still contain a Balbiani body, whose ultrastructure is similar to those of organisms
that use germ plasm (Hertig, 1968). We do not know what functions the Balbiani body serves in all vertebrates. We speculate
their most conserved function is to protect the quality of mitochondria and other organelles during long periods of oocyte
dormancy, which can extend for decades in humans.
An elegant maternal effect screen in zebrafish identified only
one gene, bucky ball, required for Balbiani body formation
(Dosch et al., 2004). In buc mutants, the Balbiani body did not
form, and oocyte polarity was disrupted. Overexpression of
Bucky ball protein stimulated numerous Balbiani bodies and
ectopic germ cell formation (Bontems et al., 2009; Heim et al.,
2014; Marlow and Mullins, 2008). Bucky ball is proposed to be
a structural organizer of zebrafish Balbiani bodies, but the
biochemical basis of this proposed function has been unclear.
Xenopus laevis provides complementary advantages for analysis of Balbiani bodies because oocytes are abundant and
accessible, easy to manipulate due to their large size, and
amenable to live imaging. Xenopus oocytes are classified in six
different stages according to Dumont (1972). Stage I oocytes
contain a clearly visible Balbiani body (Figure 1A). The Balbiani
body gradually disappears during oocyte development to give
rise to dispersed germ plasm islands at the vegetal pole of the
stage VI oocytes during oocyte maturation (Kloc et al., 2004).
Here, we report that the Balbiani body forms via amyloid-like
self-assembly of Xvelo, the Xenopus homolog of Bucky ball. Amyloids and amyloid-like proteins have largely been studied in the
context of neurodegenerative diseases, such as Alzheimer's
Figure 1. A Balbiani Body Is a Non-Membrane-Bound Compartment Packed with Membranous Organelles

(A) Phase contrast image of a stage I Xenopus laevis oocyte. Bb, Balbiani body; N, nucleus or germinal vesicle.
(B) Balbiani body immobilized in perfusion chambers. 2 M NaCl (first panel) or 95°C 50 mM HEPES, 100 mM KCl (pH 7.6) buffer (second panel) was perfused into
the chambers.
(C) Thin-section electron microscope (EM) images of isolated Balbiani bodies from stage I Xenopus oocytes. Mitochondria (dark spots), RNP particles (green
arrow head), and Golgi stacks (yellow arrow head) are clearly visible.
(D) Stage I oocytes were incubated in 10 µM Thioflavin T in 1× MMR for 10 min and washed twice with 1× MMR.
See also Figure S1, Table S1, and Movie S1.
disease, amyotrophic lateral sclerosis, or prion diseases (Haass
and Selkoe, 2007; Koo et al., 1999; Polymenidou and Cleveland,
2011). Several examples of amyloid-like aggregation mechanisms with normal biological functions have recently emerged
(Berchowitz et al., 2015; Fowler et al., 2006; Hou et al., 2011;
Li et al., 2012). Both pathological and physiological amyloid-forming proteins contain low-complexity regions, which are
intrinsically disordered but can undergo conformational conversions into an amyloid-like fibrillar state (Kato et al., 2012). These
regions are also found in yeast prion proteins, a class of infectious proteins that generate heritable phenotypic diversity in
clonal population of yeast cells (Alberti et al., 2009; Halfmann
et al., 2012; Wickner et al., 2013). Protein regions with sequence
compositions similar to yeast prions are called prion-like domains (PLDs) (Alberti et al., 2009; Si et al., 2003).
RESULTS
A Balbiani Body Has a Rigid Structure that Stains with
Thioflavin T
We began our studies on Balbiani bodies by manually dissecting
them from Xenopus laevis stage I oocytes, which are transparent
and 50–300 µm in diameter (Figure 1A). Stage I oocytes were
placed in a glass-bottom dish together with a cytoskeleton stabilizing buffer and mechanically disrupted using tweezers, leaving the Balbiani body intact. The Balbiani body preserved its circular shape and did not disintegrate under the considerable
shear forces applied during isolation (Movie S1). Balbiani bodies
also retained their structure under harsh conditions, including
638 Cell 166, 637–650, July 28, 2016
high salt (2 M NaCl) (Figure 1B, first panel) and high temperature
(up to 95°C) (Figure 1B, second panel). Confirming previous work
on intact oocytes, thin-section electron microscopy of the isolated Balbiani bodies revealed densely packed mitochondria,
ER, and Golgi stacks as well as compact fibrillar elements that
others have shown to be made of ribonucleoprotein (RNP) (Figure 1C) (al-Mukhtar and Webb, 1971; Balinsky and Devis,
1963; Billett and Adam, 1976).
To probe the molecular properties of the Balbiani body, we
introduced a number of small molecules into the oocytes. Thioflavin T (ThT), a dye that stains the β sheet-rich structures of amyloid (Alberti et al., 2010; Nilsson, 2004), strongly accumulated in
the Balbiani body (Figure 1D), suggesting it is rich in β sheet
structures, which is a hallmark of amyloids.
Xvelo Is Highly Concentrated in Balbiani Bodies
To analyze the composition of Balbiani bodies, we used quantitative mass spectrometry (McAlister et al., 2014; Wuhr et al.,
2014). This revealed that the most enriched Balbiani body proteins that are not part of organelles are Velo1 (commonly known
as Xvelo) and fetal hemoglobin subunits, Hba1 and Hbg1 (Figures S1A–S1C). We focused on Xvelo because it is homologous
to zebrafish Bucky ball, which plays a crucial role in Balbiani
body organization (Bontems et al., 2009). Computational analysis of the Xvelo sequence predicted an intrinsically disordered protein with no known domains, apart from a C-terminal positively
charged region that could bind RNA.
Quantitative western blotting using a peptide antibody against
the C terminus of Xvelo (Figure S2A) provided an estimate of
[Figure 2 panels: Xvelo, Mitochondria, and Overlay images (scale bars 50 µm, 2 µm, and 20 µm); Xvelo-GFP half-bleach series (prebleach, 0.25 s, 30 min, 60 min); plot of Intensity (A.U.) versus Time (min) for Xvelo-WT.]
Figure 2. Xvelo Forms a Stable Matrix

(A) mRNA encoding for Xvelo-GFP was microinjected into stage I oocytes. MitoTracker Deep Red was used to label mitochondria. Oocytes were imaged live with
a laser scanning confocal microscope with a 40× water-immersion objective.
(B) Magnification of the Balbiani body in (A).
(C) Internal rearrangement of fluorescent Xvelo-GFP particles after half bleach over time.
(D) The fluorescent recovery of the half-bleached Xvelo-GFP in the Balbiani body in (C) and two other biological replicates is shown by quantification of fluorescence in bleached region over time. Fluorescent intensity changes in the bleached region per pixel over time were plotted after it was normalized for photobleaching by using an unbleached neighboring area and background subtraction.
See also Figure S2.
Xvelo concentration in oocytes of 8 µM (Figures S2B and S2C).
If all the Xvelo were concentrated in the Balbiani body, and
assuming an average late stage I oocyte diameter of 250 µm
and a Balbiani body diameter of 60 µm, this corresponds to
560 µM in the Balbiani body. Compared with the published
concentrations of proteins in a frog egg, where the most concentrated proteins were alpha-1-antitrypsin (Serpina1) and actin
(Actg1) at 17.6 and 14.3 µM, respectively (Wuhr et al., 2014),
the concentration of Xvelo in Balbiani bodies is exceptionally
high. This high local concentration of Xvelo,
together with its homology to the zebrafish protein Bucky ball,
makes it a likely candidate for a structural organizer of the Balbiani body.
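The local-concentration estimate above follows from a volume ratio. The sketch below is an illustration, not the authors' calculation: it assumes the oocyte and the Balbiani body are both perfect spheres and that all Xvelo sits in the Balbiani body. The function name is hypothetical, and the concentration is returned in whatever unit is passed in.

```python
def concentration_in_inner_sphere(c_whole, d_whole, d_inner):
    """Concentration reached if everything dissolved in a sphere of
    diameter d_whole were packed into a sphere of diameter d_inner.

    Assumes both compartments are perfect spheres (an idealization).
    The result has the same unit as c_whole; the diameters only need
    to share a unit with each other.
    """
    # Sphere volume scales with the cube of the diameter, so the
    # constant factor (pi / 6) cancels in the ratio.
    volume_ratio = (d_whole / d_inner) ** 3
    return c_whole * volume_ratio


# Values quoted in the text: whole-oocyte estimate of 8, oocyte diameter
# 250, Balbiani body diameter 60.
c_balbiani = concentration_in_inner_sphere(8.0, 250.0, 60.0)
fold_enrichment = c_balbiani / 8.0
print(f"local concentration: {c_balbiani:.0f}, enrichment: {fold_enrichment:.0f}-fold")
```

With these rounded inputs the idealized calculation gives a roughly 72-fold enrichment and a value close to the 560 reported in the text; the small difference presumably reflects rounding of the inputs.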
Xvelo Forms a Stable Matrix in Balbiani Bodies
To investigate Xvelo function in Xenopus, we began by injecting
mRNAs encoding for Xvelo tagged with GFP (Xvelo-GFP) into the
oocyte cytoplasm. Xvelo-GFP mainly localized to the Balbiani
body with little soluble protein present in the cytoplasm (Figure 2A). More detailed analysis at higher magnification showed
that Xvelo-GFP filled the gaps between the mitochondria in the
Balbiani body (Figure 2B). These results suggest that Xvelo
may form a matrix in which mitochondria and other organelles
are embedded.
To test whether Xvelo was part of a stable matrix, we used
FRAP (fluorescence recovery after photobleaching) to probe its
dynamics. We again injected mRNA encoding for Xvelo-GFP,
and after overnight incubation, we imaged the oocytes. After
photobleaching Xvelo-GFP, a maximum recovery of 20% was
seen after 1 hr (Figures 2C and 2D). We conclude that the Xvelo matrix is highly stable, with slow turnover, consistent with a role as a
structural matrix.
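The FRAP readout described above depends on how the bleached region is normalized. The following sketch shows one common scheme (background subtraction followed by division by an unbleached reference region); the exact formula the authors used is not given here, so the function name, argument layout, and the synthetic numbers are all assumptions for illustration.

```python
def normalize_frap(bleached, reference, background):
    """Return recovery as a percentage of the prebleach intensity.

    Each argument is a list of mean intensities per frame; index 0 is the
    prebleach frame. Background is subtracted from both regions, and the
    bleached region is divided by the unbleached reference to correct for
    photobleaching during acquisition (an assumed normalization scheme).
    """
    b0 = bleached[0] - background[0]
    r0 = reference[0] - background[0]
    curve = []
    for b, r, bg in zip(bleached, reference, background):
        roi_fraction = (b - bg) / b0   # bleached region relative to prebleach
        ref_fraction = (r - bg) / r0   # acquisition bleaching of the reference
        curve.append(100.0 * roi_fraction / ref_fraction)
    return curve


# Synthetic example (made-up numbers, not data from the paper).
bleached = [110.0, 15.0, 20.0, 28.0]
reference = [110.0, 105.0, 100.0, 95.0]
background = [10.0, 10.0, 10.0, 10.0]
recovery = normalize_frap(bleached, reference, background)
print([round(v, 1) for v in recovery])
```

A curve that plateaus near 20% of the prebleach value, as reported for Xvelo-GFP, indicates that most of the protein in the bleached region does not exchange on the timescale of the experiment.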
Xvelo's Association with the Balbiani Body Requires Its
Prion-like Domain
We next investigated how the Xvelo matrix is formed and held
together. Despite the lack of any conventional domains, Xvelo
has a PLD at its N terminus, which is detected by several prion
detection algorithms (Figure 3A; Figure S3A) (Alberti et al.,
2009; Lancaster et al., 2014; Toombs et al., 2012). It also has a
lysine/arginine-rich region at its C terminus that might act as
an RNA binding domain (Figure 3A). To assess the behavior of
these different parts of Xvelo in vivo, we performed a structure-function analysis by dividing the protein into four fragments
[Figure 3 panels: (A) schematic of Xvelo (779 aa, NH2 to COOH) showing fragments F1 to F4, the prion-like domain, and the K/R rich region (...KIKEQDKPPKKKGALK SSKRKQTRT...). PLD motif sequences: WT, NQPRPYFYAQP...GNPYFPYYSVAL; 4D, NQPRPYFYAQP...GNPDDPDDSVAL; 7D, NQPRPDDDAQP...GNPDDPDDSVAL. Image panels: GFP-tagged Fragments 1 to 4, Xvelo-woF1, Xvelo-WT, Xvelo-4D, Xvelo-7D, F1-WT, F1-4D, and F1-7D (scale bars 50 µm); FRAP series (prebleach, t = 0.1 s, t = 10 min). Plots: Ccy/CBb ratio; Fluorescence Recovery (% of initial) versus Time (s) for WT, F1-4D, and F1-7D.]
Figure 3. Xvelo Self-Assembly Is Dependent on Its Prion-like Domain

(A) Diagram of the known structural elements of Xvelo. Prion-like domain, mutants (4D and 7D) and the fragments of Xvelo (F1–F4) are marked in the figure.
(B) mRNAs encoding for the Xvelo fragments shown in (A) and Xvelo without fragment 1 (Xvelo-woF1) were synthesized in vitro and microinjected into stage I oocytes.
Oocytes were imaged after overnight incubation in oocyte culture medium (OCM).
(see Supplemental Experimental Procedures for details; Data S1A).
We synthesized, in vitro, mRNAs encoding for the four Xvelo
fragments tagged with GFP and injected them into oocytes at
equal concentrations. Each fragment of Xvelo localized differently (Figure 3B). The C-terminal F4 fragment, which carries
the positively charged region, localized to nucleoli. Nucleoli are
RNA-rich; thus, this localization may reflect the predicted RNA binding activity of the F4 fragment (Figure 3B). The N-terminal F1
fragment, which carries the PLD, localized to the Balbiani
body. Its localization pattern was indistinguishable from the
full-length protein (Figures 2A and 3B). An additional fragment
lacking the F1 fragment but retaining the rest of the Xvelo protein
(Xvelo-woF1) did not localize to the Balbiani body (Figure 3B).
Thus, we conclude that the N terminus, which contains the
PLD, is the key region that targets Xvelo to the Balbiani body.
We next designed mutants that would disrupt the propensity
of Xvelo for amyloid-like self-assembly. Charged amino acids
are strongly disfavored in PLDs, as they interfere with the formation of a hydrophobic nucleus (Alberti et al., 2009; Lopez de la
Paz and Serrano, 2004; Serio et al., 2000). Thus, to create defective PLD mutants of Xvelo, we mutated either four or seven aromatic residues in its PLD to a negatively charged amino acid
(aspartate) (Figure 3A). We call these 4D and 7D mutants,
respectively (Figure 3A). The mutants no longer scored positive
in prion detection algorithms (Figures S3B and S3C). We injected
mRNAs encoding for wild-type or mutant versions of Xvelo
tagged with GFP into the oocytes and imaged after overnight incubation. We used mRNAs encoding for both fragment 1 (the
PLD) and full-length Xvelo to observe the effects of the mutants
in the oocytes in case the full-length protein interferes with the
assembly pattern of the mutants. Fragment 1 proteins carrying
the 4D and 7D mutations still exhibited a partial localization to
the Balbiani body, but they also showed a diffuse signal in the
cytoplasm, which was not observed for the wild-type protein
(Figures 3C and 3D). The effect was much stronger for the 7D
mutant, which was barely enriched in the Balbiani body (Figures
3C and 3D). Full-length proteins behaved similarly to their F1 counterparts (Figures 3D and S3D). Thus, we conclude that the PLD of
Xvelo is essential and sufficient for Balbiani body localization,
and the two conserved motifs enriched for aromatic residues
provide an essential structural role in Xvelo targeting.
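The design of the 4D and 7D mutants can be reproduced from the two motif sequences shown in Figure 3A. The sketch below is illustrative only: it treats F, W, and Y as the aromatic residues and uses hypothetical helper names. It replaces every aromatic residue in a motif with aspartate and counts what remains.

```python
AROMATIC = set("FWY")  # residues treated as aromatic in this sketch

# Wild-type motif sequences as printed in Figure 3A.
MOTIF_1 = "NQPRPYFYAQP"
MOTIF_2 = "GNPYFPYYSVAL"


def to_aspartate(motif):
    """Replace every aromatic residue in the motif with aspartate (D)."""
    return "".join("D" if aa in AROMATIC else aa for aa in motif)


def count_aromatic(motifs):
    """Count aromatic residues across a collection of motif strings."""
    return sum(aa in AROMATIC for motif in motifs for aa in motif)


wild_type = (MOTIF_1, MOTIF_2)
mutant_4d = (MOTIF_1, to_aspartate(MOTIF_2))                 # second motif mutated
mutant_7d = (to_aspartate(MOTIF_1), to_aspartate(MOTIF_2))   # both motifs mutated

for name, motifs in [("WT", wild_type), ("4D", mutant_4d), ("7D", mutant_7d)]:
    print(name, "...".join(motifs), "aromatic residues:", count_aromatic(motifs))
```

The output matches the figure: the 4D mutant keeps the three aromatic residues of the first motif, which is consistent with the later observation that it retains a hydrophobic motif, whereas the 7D mutant has none.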
To investigate whether the mutations in the Xvelo PLD change
the association dynamics of the GFP-tagged constructs with the
endogenous Xvelo matrix, we performed FRAP on wild-type and
mutants. Fragment 1 had a slow turnover rate, similar to the full-length protein (Figures 2C, 2D, 3E, 3F, S3E, and S3F). However,
the mutants recovered from photobleaching significantly faster
than the wild-type; full recovery took 10 min for the 4D mutant and
3 min for the 7D mutant (Figures 3E and 3F).
(This should be compared to the wild-type, which recovers to
20% after 1 hr in Figures 2C and 2D.) The recovery times of
the full-length mutants were similar to the F1 fragments (Figure S3F). These results suggest that the PLD drives the association of Xvelo with the pre-assembled Xvelo in the matrix.
If the mutants change the association dynamics of Xvelo with
the matrix, they might impose a dominant-negative effect (i.e.,
inhibit the assembly of wild-type Xvelo into the matrix). To test
this, we co-injected mRNAs encoding for full-length Xvelo-WT-mCherry and GFP-tagged F1 wild-type and mutants at equal
concentrations and imaged by live confocal microscopy the
next day. Wild-type Xvelo co-localized with the F1 fragment in
the Balbiani body, and there was little signal in the cytoplasm
(Figure 3G). However, in the presence of the 4D mutant, the
wild-type Xvelo also formed discrete small aggregates in the
cytoplasm (Figure 3G). The 7D mutant also increased the amount
of diffusely localized wild-type Xvelo, but it did not cause the
punctate localization pattern seen with the 4D mutant (Figure 3G). The difference between the mutants may be due to
the fact that the 4D mutant binds more strongly to wild-type
Xvelo because of its remaining hydrophobic motif. The same
patterns were observed when the experiments were repeated
with full-length mutants (Figure S3G). Thus, we conclude that
mutant PLDs can partially inhibit the self-assembly of wild-type
Xvelo, most likely by binding to it and blocking growth into larger
assemblies.
Recombinant Xvelo Forms Micron-Scale Networks
In Vitro
To test whether Xvelo can form amyloid-like fibers on its own, we
analyzed recombinant Xvelo-GFP in vitro. Xvelo-GFP did not express at all in bacteria (data not shown) but expressed well in baculovirus-infected insect Sf9 cells. Purification of Xvelo-GFP
from insect cells was challenging due to its strong tendency to
aggregate, but we found that addition of 300 mM arginine to
Xvelo-GFP solubilized it in an otherwise physiological buffer (Figure S4A). The guanidino fragment of arginine may function
similarly to the guanidinium ion as a solubilizing agent (England and
Haran, 2011; Tsumoto et al., 2004).
We purified soluble Xvelo-GFP in 300 mM arginine and then
diluted it into 30 mM arginine, which induced self-assembly (Figure 4A, first panel). Xvelo-GFP self-assembled first into small,
then large micron-scale networks over time (Figure 4A, first
panel). To test if network assembly required the PLD of Xvelo,
(C) mRNAs encoding for wild-type and PLD mutants of fragment 1-GFP were microinjected into oocytes. Oocytes were incubated overnight and imaged.
(D) Ratio of GFP concentration in the oocyte cytoplasm (Ccy) to the Balbiani body (CBb) in oocytes injected with mRNAs encoding for indicated proteins. Relative
concentrations were calculated by using oocyte or the Balbiani body volume from z stacks and the fluorescent intensity of GFP. Mean values and SEs of 10
oocytes are plotted.
(E) Internal rearrangement of fluorescent wild-type or mutant F1-GFP particles after photobleaching over time.
(F) The fluorescent recovery of photobleached wild-type or mutant F1-GFP in Balbiani bodies in (E) and two other biological replicates for each are shown by
quantification of fluorescence in bleached region over time normalized by an unbleached neighboring region.
(G) mRNAs encoding for full-length Xvelo-mCherry wild-type and GFP-tagged fragments (F1-WT-GFP, F1-4D-GFP, and F1-7D-GFP) were in vitro synthesized.
F1-WT-GFP or mutants were mixed with equal amounts of full-length Xvelo-mCherry mRNA and microinjected into the oocytes. After overnight incubation, the
oocytes were imaged by scanning confocal microscopy.
See also Figure S3.
[Figure 4 panels: (A) images of Xvelo-WT, F1-WT, Xvelo-4D, and Xvelo-7D at 0, 1, 2, 3, and 24 hours (scale bar 40 µm); (B) gel of Xvelo-WT-GFP, Xvelo-4D-GFP, Xvelo-7D-GFP, and F1-WT-GFP with markers at 198, 98, 62, 49, 38, and 28 kDa; (C) plot of Total Network Mass (A.U.) versus Time (h) for Xvelo-WT, F1-WT, Xvelo-4D, and Xvelo-7D.]
Figure 4. Xvelo Forms Micron-Scale Networks In Vitro

(A) 15 µM of recombinant Xvelo-GFP, F1-WT-GFP, and full-length mutants (Xvelo-4D-GFP and Xvelo-7D-GFP) were diluted into a low-arginine buffer (30 mM) to
promote their self-assembly. The reaction mixtures were incubated at 25°C for the indicated time intervals and squashed under a coverslip to be imaged by
spinning-disc confocal microscopy.
(B) Coomassie-stained gels depicting recombinant Xvelo-GFP, Xvelo-4D-GFP, Xvelo-7D-GFP, and F1-WT-GFP.
(C) Quantification of networks in (A). 20 images were taken and stitched together, a threshold was applied, and the network intensities were measured. The
integrated intensity of networks per sample at each time point (total network mass) is plotted. Means and SEs of three biological replicates are shown.
See also Figure S4.
we expressed and purified the F1 fragment containing the PLD.
F1-WT-GFP also formed large networks over time, with striking
similarity to networks formed by full-length Xvelo, showing that
network assembly is driven by the PLD. We confirmed the critical
role of the PLD by showing that two full-length PLD mutants,
Xvelo-4D-GFP and Xvelo-7D-GFP, did not form networks (Figure 4A, lower panels) (Figure 4B).
To quantify Xvelo self-assembly in vitro, we tested Xvelo concentrations similar to its physiological concentration (Figure S2C). To balance between the aggregation propensity of
Xvelo at high concentrations and the measured physiological
concentration of 8 µM in oocytes if it was uniformly distributed,
we chose a stock concentration of 15 µM for each recombinant
protein so that after 10-fold dilution, the Xvelo concentration in
the mixture was 1.5 µM. We quantified total network mass by
summing the fluorescence intensity of networks in an area corresponding to a total of 20 images (Figures 4B and S4B). This analysis further confirmed that the F1 and full-length Xvelo-GFP have
indistinguishable kinetics of network formation, whereas the PLD
mutants could not form any networks (Figure 4C). Xvelo-4D-GFP formed small precipitates after dilution out of high arginine,
but these precipitates remained in solution without forming networks after overnight incubation (Figure 4A, third panel). Xvelo-7D-GFP was completely soluble with no precipitates upon
[Figure 5 panels: (A) Xvelo-WT and F1-WT networks, control versus 1% SDS (scale bar 40 µm); (B) ThT fluorescence (A.F.U.) at 0.1 h, 12 h, and 24 h for Xvelo, F1-WT, Xvelo-4D, Xvelo-7D, Mot3, and blank; (C) dot blot of Xvelo-WT, F1-WT, Xvelo-4D, Xvelo-7D, and EB1 (α-OC, mCherry, Ponceau); (D) electron micrographs of Xvelo-WT and Xvelo-4D (scale bars 100 nm); (E) Xvelo, ThT, and Overlay images (scale bars 20 µm and 10 µm) with line scans, Arbitrary units (A.U.) versus Distance (µm); (F) SDS-PAGE and SDD-AGE of stage I oocyte and egg extracts, with and without boiling (α-Xvelo, Ponceau).]
Figure 5. Xvelo Shows Amyloid-like Features In Vivo and In Vitro

(A) SDS was added to Xvelo and F1-WT-GFP networks to a final 1% concentration, and the reactions were incubated at room temperature for 15 min. The
resulting mixtures were squashed under a coverslip and imaged by a spinning-disc confocal microscope.
(B) A final concentration of 5 µM of Thioflavin T was added to the wild-type, F1, and mutant network reactions at the indicated time points. Yeast prion Mot3 was
used as a positive control, whereas blank was only buffer and ThT. ThT fluorescence was measured (a.u.) by a fluorescence plate reader.
(C) 1 µg of RFP-tagged wild-type, F1, and mutant recombinant proteins was dot-blotted on a nitrocellulose membrane and assayed for reactivity with the anti-amyloid
fibril antibody OC. EB1-RFP was used as a negative control.
(D) Negative stain electron microscopy images of the untagged Xvelo-WT and Xvelo-4D self-assembly reactions (scale bars, 100 nm).
dilution, indicating its complete inability to self-assemble. We
also found that the assembly kinetics of Xvelo networks were
dependent on Xvelo concentration (Figure S4C). We conclude
that Xvelo can form networks on its own in vitro and that the
Xvelo PLD is essential and sufficient for the formation of these
networks.
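The "total network mass" measure used above (stitch the images, apply a threshold, sum the network intensity) can be sketched in a few lines. This is a minimal illustration under stated assumptions: the threshold value, the rule that pixels strictly above the threshold count as network, and the function names are hypothetical, and the tiny arrays stand in for real micrographs.

```python
def network_mass(image, threshold):
    """Sum the intensity of all pixels brighter than the threshold.

    `image` is a list of rows of pixel intensities. Pixels at or below
    the threshold are treated as background (an assumed rule).
    """
    return sum(px for row in image for px in row if px > threshold)


def total_network_mass(tiles, threshold):
    """Integrated network intensity over a set of image tiles.

    Summing per tile gives the same result as thresholding one stitched
    image, because the threshold is applied pixel by pixel.
    """
    return sum(network_mass(tile, threshold) for tile in tiles)


# Two toy 2 x 3 tiles with made-up intensities.
tile_a = [[5, 6, 120], [4, 150, 160]]
tile_b = [[3, 4, 5], [200, 6, 5]]
mass = total_network_mass([tile_a, tile_b], threshold=50)
print(mass)
```

A sample without assemblies, such as a fully soluble mutant, would contribute no pixels above the threshold and therefore a mass of zero under this scheme.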
Xvelo Networks Exhibit Amyloid-like Properties In Vitro
and In Vivo
We next tested whether the Xvelo networks we see in vitro have
amyloid-like properties, using a number of
different criteria. First, we showed that both Xvelo-WT and F1-GFP networks were resistant to SDS treatment (Figure 5A). We
noticed a partial solubilization of the Xvelo network by SDS (Figures 5A and S5A), suggesting the presence of a more stable fiber
backbone, which is decorated with more loosely associated
Xvelo that can be released by treatment with SDS. Next, we
monitored acquisition of Thioflavin T (ThT) fluorescence, using
RFP-tagged Xvelo, fragment 1, and the mutants Xvelo-4D and
Xvelo-7D, and the yeast Mot3 prion as a rapidly aggregating positive control (Alberti et al., 2009). F1 and full-length Xvelo networks quickly acquired ThT fluorescence in a PLD-dependent
manner (Figure 5B). We supported these findings by showing
that full-length and F1 networks were recognized by an anti-amyloid fibril antibody, whereas the mutants or an unrelated protein
(EB1-RFP) were not (Figure 5C). Finally, negative-stain transmission electron microscopy (TEM) revealed that Xvelo-WT forms fibers reminiscent of amyloids in vitro in a PLD-dependent manner
(Figure 5D).
To confirm that the Xvelo matrix behaves like an amyloid
in vivo, we examined whether Xvelo staining coincides with the
strong staining of the Balbiani body with ThT. For this purpose,
we injected oocytes with mRNA encoding for Xvelo-mCherry,
incubated the oocytes with ThT, and imaged them by live
confocal microscopy. Xvelo and ThT signals overlapped strongly
(Figure 5E). Considering that Xvelo is the only enriched protein in the
Balbiani body with a PLD and that the endogenous Xvelo concentration exceeds 500 µM in the Balbiani body, the majority
of the ThT signal likely comes from endogenous Xvelo.
Next, we used semi-denaturing detergent agarose gel electrophoresis (SDD-AGE) to check the solubility of Xvelo. SDD-AGE
allows the resolution of a wide size range of SDS-resistant aggregates (Alberti et al., 2010; Bagriantsev et al., 2006). We collected
stage I oocytes and used egg extracts as a comparison. Indeed,
Xvelo formed SDS-resistant aggregates in vivo that were detected by
SDD-AGE. Moreover, it did not form any detectable SDS-resistant aggregates in the mature egg (Figure 5F).
Therefore, because Xvelo forms SDS-resistant, filamentous
assemblies that bind ThT in a manner that depends on the presence of its prion-like domain, we conclude that Xvelo forms amyloid-like networks in stage I oocytes and in vitro. Although Xvelo
is present in the eggs at a detectable, albeit much lower, concentration of 80 nM (Figures 5F and S2D), it does not form
SDS-resistant assemblies in the eggs (Figure 5F). Thus, the amyloid-like characteristics of Xvelo are transient and regulated
during development.
Prion-like Domain Specificity for Targeting to the
Balbiani Body
To examine whether other proteins with prion-like domains
might also be involved in organizing the Balbiani body, we first
looked in our mass spectrometry list for other proteins with a
PLD. However, apart from the common contaminant yolk protein, vitellogenin, none of the other enriched proteins in the Balbiani body were predicted to contain a prion-like domain.
To investigate whether Xvelo is unique in its ability to form a
stable matrix in the Balbiani body, we selected five RNA binding
proteins with prion-like domains, namely CPEB3, Dazap1, FUS,
hnRNPA1, and Tia1, as well as the aggregation-prone mutant of
FUS, FUS-156E, and injected mRNAs encoding for GFP-tagged
versions of these proteins into the oocytes. Apart from CPEB3,
all these proteins are present naturally in Xenopus eggs and oocytes (Wuhr et al., 2014). Among these proteins, hnRNPA1,
CPEB3, FUS, and FUS-156E did not localize to the Balbiani
body (Figure 6A). Tia1 and Dazap1 localized to the Balbiani
body (as well as the cytoplasm and nucleus, respectively). However, fast turnover rates of both Tia1 and Dazap1 after photobleaching strongly suggest that they do not incorporate into a
stable matrix (Figures S6A and S6B). Both Tia1 and Dazap1
are implicated in translational repression by binding to 3′ UTRs
of mRNAs (Dixon et al., 2003; Steger, 2001), and thus, their localization pattern can be explained by their binding to the repressed
mRNAs in the Balbiani bodies upon overexpression.
Our experiments suggest that the PLD of Xvelo has specific
features to target to and form a stable matrix in the Balbiani
body. Part of the evidence that Xvelo structurally organizes Balbiani bodies is its sequence homology to zebrafish Bucky ball,
whose genetics pointed to such a function (Bontems et al.,
2009). To test if the two proteins exhibit similar sub-cellular dynamics, we expressed Bucky ball in Xenopus oocytes as a GFP
fusion and looked at the characteristics of incorporation of
Bucky ball into Xenopus Balbiani bodies. Bucky ball targeted
to the Balbiani body, co-localizing with Xvelo (Figure 6B). Bucky
ball turnover after photobleaching was slightly faster
than that of Xvelo, but of a similar order of magnitude (Figures 6C
and 6D). Thus, we conclude that sequence features conserved
in the PLDs of Bucky ball and Xvelo are required for targeting to
the Balbiani bodies in oocytes. This experiment also provides
strong evidence that Bucky ball and Xvelo are functional
homologs.
(E) Stage I oocytes were injected with mRNA coding for Xvelo-mCherry and incubated overnight. The oocytes were incubated in 10 µM ThT, washed twice, and
imaged by confocal microscopy. Bottom: zoomed in images. Line scans showing the co-localization of Xvelo-mCherry and ThT stain from five Balbiani bodies
were plotted. Each color represents the line scan of a different Balbiani body. We speculate that the outer rim Xvelo-mCherry signal belongs to the newly
translated Xvelo-mCherry protein that has just started to form a new, immature matrix and does not yet stain with ThT.
(F) SDD-AGE detects SDS-resistant Xvelo aggregates in vivo. Equal amounts of cytoplasmic extracts of stage I oocytes and mature eggs were loaded onto SDS-PAGE. Five times more egg extract was loaded for SDD-AGE gels to make Xvelo concentrations comparable between the oocyte and egg extract
lanes. Xvelo was detected by an anti-Xvelo antibody.
See also Figure S5.
[Figure 6 panels: (A) DIC and GFP images of hnRNPA1, CPEB3, Tia1, Dazap1, FUS, and FUS-156E (scale bar 50 µm); (B) GFP, Xvelo-mCherry, and Merge images for Buc, FUS, Buc(PLD)Xvelo, and FUS(PLD)Xvelo (scale bar 50 µm); (C) FRAP series (prebleach, t = 0.1 s, t = 15 min); (D) plot of Fluorescence Recovery (% of initial) versus Time (s) for Buc, Buc(PLD)Xvelo, FUS(PLD)Xvelo, and Xvelo-WT.]
Figure 6. Xvelo Has Unique Properties for Forming a Stable Matrix

(A) mRNAs encoding for GFP-tagged hnRNPA1, CPEB3, Tia1, Dazap1, FUS, and FUS-156E were in vitro synthesized and microinjected into the oocytes. After
overnight incubation, the oocytes were imaged by laser scanning confocal microscopy.
To further investigate the specificity of the Xvelo PLD for targeting to the Balbiani body, we swapped the PLD of Xvelo with
the PLD of FUS, an unrelated prion-like RNA binding protein,
and with the PLD of Bucky ball. The resulting chimeric proteins
were named FUS(PLD)Xvelo and Buc(PLD)Xvelo, respectively
(Figure S6C). Buc(PLD)Xvelo localized to the Balbiani body,
with a FRAP time in between Xvelo-WT and Bucky ball-WT (Figures 6C and 6D). FUS(PLD)Xvelo localized to the cytoplasm and
weakly to the Balbiani body (Figure 6B). The Balbiani body-localized protein recovered quickly after photobleaching, with a half-life of 1 min, much faster than that observed for Buc(PLD)Xvelo
(2 hr) (Figures 6C and 6D). Taken together, these data provide
strong evidence that the prion-like domains of functional homologs Xvelo and Bucky ball have unique features that target and
form a stable matrix in the Balbiani body.
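To make the gap between a 1 min and a 2 hr recovery half-life concrete, the sketch below evaluates a single-exponential recovery model at the 15 min time point used in the FRAP series. The model is an assumption made for illustration: it treats the whole bleached pool as mobile and ignores any immobile fraction, which the source does not specify for these constructs.

```python
def fraction_recovered(t_min, half_life_min):
    """Fraction of fluorescence recovered after t_min minutes.

    Single-exponential model with a fully mobile pool (an assumption):
    f(t) = 1 - 0.5 ** (t / half_life).
    """
    return 1.0 - 0.5 ** (t_min / half_life_min)


# Half-lives quoted in the text: 1 min for FUS(PLD)Xvelo and
# 2 hr (120 min) for Buc(PLD)Xvelo.
fast = fraction_recovered(15.0, 1.0)
slow = fraction_recovered(15.0, 120.0)
print(f"after 15 min: fast = {fast:.4f}, slow = {slow:.4f}")
```

Under this simplified model, a construct with a 1 min half-life has essentially fully recovered by 15 min, whereas one with a 2 hr half-life has recovered less than a tenth of its signal.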
We next examined the specificity of PLD interactions in vitro
with an aggregation assay that compared the assembly properties of Xvelo and FUS. We reproduced previous reports showing
that FUS-WT forms liquid droplets in vitro, whereas the aggregation-prone mutant FUS-156E forms aggregates (Patel et al.,
2015). We mixed Xvelo-RFP with either FUS-GFP or FUS-156E-GFP in vitro in a high-arginine, high-salt buffer and then
diluted out the arginine and salt to initiate aggregation. Xvelo-RFP networks and FUS-WT-GFP droplets or FUS-156E-GFP aggregates formed in the vicinity of each other but clearly were
separate and did not interact with one another (Figure S6D).
Thus, we conclude that PLDs do not aggregate with one another
randomly, even when they are in close proximity at high
concentration.
Xvelo Binds RNA and Clusters Mitochondria
A key biological function of Xvelo during Balbiani body formation
is likely to be the binding and concentration of organelles and
RNA through its amyloid-like self-assembly. We looked for direct
evidence that Xvelo can form a network that is sufficient to bind
and concentrate mitochondria and RNA. To test whether Xvelo-GFP networks can bind RNA, we used a non-specific control
mRNA coding for mCherry and an RNA that is enriched in Balbiani bodies, the Xenopus homolog of nanos, xcat-2 (Zhou and
King, 1996). Both of the RNAs bound to the networks (only
mCherry-RNA results are shown). In contrast, the F1 fragment
that lacks the C-terminal motif did not bind to either of the
RNAs. Thus, Xvelo networks can sequester RNA in a manner
that requires a putative RNA-binding domain at the C terminus
of Xvelo (Figure 7A), but apparently without any strong RNA
sequence preference in vitro.
Germ cells receive a pool of organelles from cyst cells to form
Balbiani bodies and become oocytes in mice (Lei and Spradling,
2016). In early-stage oocyte development, these organelles are
clustered into Balbiani bodies by an unknown mechanism. We
asked whether Xvelo can cluster organelles on its own in a
cell-free system and thus stimulate aspects of Balbiani body
reconstruction in vitro. For this, we used an established cell-free system, cytoplasmic extracts from Xenopus eggs with intact
actin, which contain abundant organelles and RNA, like the germ
cell environment prior to Balbiani body formation (Field et al.,
2014; Lei and Spradling, 2016). We labeled the mitochondria
with MitoTracker and added Xvelo-GFP to the Xenopus egg extracts to 1.5 µM final concentration. Xvelo-GFP self-assembled
as expected, and induced co-clustering of mitochondria (Figure 7B). Mitochondrial clustering activity depended on the PLD
of Xvelo (Figure 7B, second panel). When we treated these clustered mitochondria with 2 M KCl, mitochondria were still stably
bound to, and entrapped by, the Xvelo network (Figure 7B, third
panel). We used line-scans to quantify co-clustering of Xvelo assemblies and mitochondria over five frames, in an area extending
2 mm² (Figure 7C). Note the similar line-scans for Xvelo-WT and
mitochondria, demonstrating co-clustering, and the lack of mitochondrial clustering with the 4D mutant, which does not self-assemble. Although fragment 1 can still form networks similar
to full-length networks in egg extracts, these networks cannot
recruit mitochondria (Figure S7A). This allows us to speculate
that entrapment of mitochondria is not based on the geometrical
properties of Xvelo networks.
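The line-scan comparison above is visual. One way to put a number on how closely two intensity profiles track each other is the Pearson correlation coefficient; this is an illustrative addition rather than an analysis reported in the source, and the profiles below are made-up values, not measurements.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Hypothetical line scans sampled at the same positions along one line.
gfp_profile = [100, 120, 400, 450, 130, 110]
mito_coclustered = [90, 100, 300, 350, 110, 95]     # peaks where GFP peaks
mito_unclustered = [200, 210, 195, 205, 200, 190]   # flat, no shared peaks

r_clustered = pearson(gfp_profile, mito_coclustered)
r_unclustered = pearson(gfp_profile, mito_unclustered)
print(f"co-clustered r = {r_clustered:.2f}, unclustered r = {r_unclustered:.2f}")
```

Profiles that rise and fall together give a coefficient near 1, whereas a flat mitochondrial profile next to a peaked GFP profile gives a coefficient near 0.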
As a control, we introduced FUS-WT and FUS-156E to egg extracts. Both FUS-WT and FUS-156E behaved as expected in the
egg extracts; WT formed droplets, and 156E formed aggregates.
Strikingly, FUS structures were avoided by mitochondria in the
egg extracts, as opposed to the clustering we see by Xvelo (Figure 7D). Thus, we conclude that mitochondrial clustering in egg
extracts is a specific property of Xvelo that requires its amyloid-like self-assembly.
DISCUSSION
Here, we show that Xvelo, a highly enriched protein in Xenopus
Balbiani bodies, forms a mitochondria-embedding, SDS-resistant matrix in vivo pervading the entire volume of the Balbiani
body in early-stage oocytes. Xvelo has a prion-like domain in
its N terminus, which is necessary and sufficient for targeting to and
incorporation into the Balbiani body. Pure protein experiments
further support our in vivo data that Xvelo has amyloid-like properties. Our key functional experiment, namely clustering of mitochondria by Xvelo in egg extracts, is a reconstruction of aspects
of Balbiani body formation. The fact that Xvelo can cluster mitochondria but the prion-like domain mutant does not strongly suggests that this clustering is dependent on its prion-like domain.
Thus, these data suggest that the organelle-rich Balbiani body
(B) mRNAs encoding for Xvelo-mCherry and GFP-tagged Bucky ball (Buc), FUS, and the PLD-swap versions of Xvelo, in which the PLD of Xvelo was replaced
either by the PLD of Bucky ball (Buc(PLD)Xvelo) or the PLD of FUS (FUS(PLD)Xvelo), were injected into the oocytes at equal concentrations and imaged after
overnight incubation.
(C) Internal rearrangement of fluorescent Bucky ball-GFP (Buc), and PLD-swap versions of Xvelo, Buc(PLD)Xvelo-GFP and FUS(PLD)Xvelo-GFP, after photobleaching over time.
(D) The fluorescent recovery of photobleached constructs in Balbiani bodies in (C), as well as Xvelo-WT and two other biological replicates for each, is shown by
quantification of fluorescence in a bleached region over time normalized by an unbleached neighboring region.
See also Figure S6.
[Figure 7 panels: (A) RNA (546-14-UTP), GFP, and Zoom/Merge images for Control, Xvelo-WT, and F1-WT (scale bar 10 µm); (B) GFP, Mitochondria, and Merge images of egg extract with Xvelo-WT, Xvelo-4D, and Xvelo-WT + 2 M KCl (scale bars 20 µm and 5 µm); (C) line scans of GFP and Mitochondria intensity versus Distance (µm) for Xvelo-WT, Xvelo-WT + 2 M KCl, and Xvelo-4D; (D) GFP, Mitochondria, and Merge images for FUS-WT and FUS-156E (scale bar 20 µm).]
Figure 7. Xvelo-GFP Aggregates Bind to RNA and Cluster Mitochondria

(A) Labeled RNAs, Xenopus nanos homolog xcat-2, and an mRNA encoding for mCherry protein were prepared using the MEGAscript SP6 kit with ChromaTide
Alexa Fluor 546-14-UTP. RNAs were added to pre-assembled networks and imaged with a spinning-disc confocal microscope.
(B) Recombinant Xvelo-WT-GFP or the prion-like domain mutant, Xvelo-4D-GFP, was added to Xenopus egg extracts with intact actin. Xvelo-GFP fills the gaps
between mitochondria (arrows, compare to Figure 2B). Mitochondria were labeled with MitoTracker Deep Red. Images were taken with a spinning-disc confocal
microscope. Bottom: the extracts were diluted with KCl so that the final KCl concentration was 2 M.
is organized by a functional amyloid into a dense matrix that sequesters mitochondria and other organelles.
The Balbiani body changes its properties during development:
it is a stable structure in the early oocyte, and it either disappears
in mammals (Hertig and Adams, 1967; Pepling et al., 2007) or
dissociates into small dispersed isles, called germ plasm, in
the mature oocytes of germ-plasm-containing species. This suggests that its formation and dispersal are regulated. We did not
detect any SDS-resistant Xvelo aggregates in egg extracts (Figure 5F), suggesting Xvelo does not form amyloid-like structures
in the egg. How this transformation occurs remains unclear. Our
preliminary data suggest that Xvelo is extensively phosphorylated in the egg, and most of these sites are not phosphorylated
in the oocyte. Regulation by phosphorylation could be a mechanism determining the dispersal of the Balbiani body, perhaps by
kinases or phosphatases that are activated at fertilization. Understanding the regulation of the physical state of Xvelo as the
oocyte matures will be important to elucidate the fate of the organelles residing in the Balbiani body.
Balbiani bodies are found in most vertebrates; mouse Balbiani
bodies were identified only recently, but Balbiani bodies in humans were observed decades ago, although they are almost untouched in the literature (Hertig and Adams, 1967; Pepling et al.,
2007). Could proteins similar to Xvelo be required for the formation of Balbiani-like structures in other organisms? In zebrafish,
bucky ball was identified in a maternal effect mutant screen as
the only gene that is essential for Balbiani body formation (Dosch
et al., 2004). Although the sequence similarity between Xvelo and
Bucky ball proteins is poor (Data S1B), these proteins have long
patches of intrinsically disordered regions, and score positive in
prion detection programs (Data S1C). Oskar is a key protein
required to organize Drosophila pole plasm (Ephrussi et al.,
1991), but no homologs of Oskar have been identified in vertebrates. We found that Oskar also is a disordered protein with a
predicted PLD (Data S1D). This suggests that amyloid-like self-assembly of a disordered protein could be an evolutionarily
conserved mechanism for Balbiani body formation. Strikingly,
a recent paper has also linked the formation of large amyloid-like aggregates to gametogenesis in yeast, suggesting amyloid-like mechanisms may be involved in germline specification
across kingdoms (Berchowitz et al., 2015).
Despite the low sequence conservation of Xvelo and Bucky
ball (Data S1B), key residues in their PLDs are conserved, suggesting that these residues are structurally important and underlie the observed specificity of assembly. Xvelo does not interact
with other proteins with PLDs, such as FUS, upon self-assembly
in vitro. This is in contrast to the promiscuous behavior of many
disease-causing amyloids, which often show cross-seeding interactions and promote mutual nucleation events. We attribute
this to the fact that Xvelo self-assembly does not involve an intermediate liquid-like state. We speculate that fast assembly into an
inert amyloid-like structure prevents aberrant interactions with
other prion-like proteins, thus reducing the likelihood of a disease condition. We propose that rapid assembly kinetics and
high specificity are important driving forces underlying the evolution of functional amyloids.
What are the potential advantages of using an amyloid-like
mechanism to form the Balbiani body? One could imagine packing away germline components in amyloid-like structures is protective. The tightly packed structures could prevent other proteins from diffusing into them, such as regulators, thus keeping
the organelles in a dormant state. It could act to slow down the
diffusion of small toxic molecules generated by mitochondria,
which could be damaging. It also provides a novel way to organize the cytoplasm, forming a rigid, giant body, in which the organelles are clustered together into one place and kept there for
many years. Future studies are likely to provide mechanistic
insight into the central question of how the germline of an organism provides young cytoplasm with its complement of organelles
in every generation while the somatic cells age and die.
Detailed methods are available in Supplemental Experimental Procedures.
Oocyte Handling and Injections
All experiments using Xenopus and zebrafish were done with approval of the
Harvard Medical School (HMS) animal care review board. Ovaries were surgically removed from adult female Xenopus laevis frogs and treated with 2 mg/ml
collagenase 1A (Sigma) in 1× MMR by gentle rocking, until most of the oocytes
were clearly dissociated. Oocytes were later injected with mRNAs encoding
for indicated proteins by using a FemtoJet express microinjector (Eppendorf).
Xvelo Protein Purification from Insect Cells
Recombinant versions of MBP-Xvelo-GFP, -RFP, and no-fluorescent tag (for
negative-stain electron microscopy studies) were expressed in Sf9 insect cells
using the baculovirus expression system. Insect cells were harvested in lysis
buffer (50 mM HEPES [pH 7.6], 100 mM KCl, and 1 M arginine). The MBP
(maltose binding protein) tag was captured using dextrin Sepharose resin
and cleaved off using HRV 3C protease (MPI-CBG, in-house) by incubation
overnight on ice.
Microscopy
Differential interference contrast (DIC) and phase contrast microscopy for microinjections, perfusion chambers, and Balbiani body isolations were performed using a standard wide-field epifluorescence Nikon inverted microscope equipped with a Hamamatsu Orca CCD camera and 4×, 10×, and
20× dry objectives. Live confocal microscopy with oocytes was performed using a Nikon A1R laser scanning confocal microscope equipped with 10× dry and 40× water-immersion objectives. Spinning disc confocal images were taken with a Nikon Ti inverted microscope with a Yokogawa CSU-X1 spinning disk confocal
with Spectral Applied Research Aurora Borealis modification, equipped with
20× dry, 40×, and 60× oil-immersion objectives.
Semi-denaturing Detergent-Agarose Gel Electrophoresis
Stage I oocyte and egg extracts were prepared according to Field et al. (2014)
with intact actin. SDD-AGE was adapted from Alberti et al. (2009). The protein
concentrations of the lysates were adjusted and protein samples were mixed
with 4× sample buffer (80 mM Tris, 40 mM acetic acid, 2 mM EDTA, 20% [v/v]
glycerol, 3% [w/v] SDS, and bromophenol blue) and incubated at room temperature for 15 min before loading onto a 1.8% agarose gel containing 1×
TAE and 0.1% SDS. The gel was run in running buffer (1× TAE, 0.1% SDS)
at 90 V, followed by wet transfer to nitrocellulose membranes (Amersham Biosciences). Xvelo protein was detected by an anti-Xvelo antibody.
(C) Line scans of Xvelo-WT-GFP and Xvelo-4D-GFP and mitochondria in Xenopus egg extracts. Five images were stitched together to have an area spanning
larger than 2 mm². Each color represents a different field.
(D) Recombinant FUS-WT-GFP or the aggregation-prone mutant, FUS-G156E-GFP, was added to Xenopus egg extracts with intact actin. Arrows point to the
exclusion zones of mitochondria in the presence of FUS structures.
See also Figure S7.
Bontems, F., Stein, A., Marlow, F., Lyautey, J., Gupta, T., Mullins, M.C., and
Dosch, R. (2009). Bucky ball organizes germ plasm assembly in zebrafish.
Curr. Biol. 19, 414–422.
Dosch, R., Wagner, D.S., Mintzer, K.A., Runke, G., Wiemelt, A.P., and Mullins,
M.C. (2004). Maternal control of vertebrate development before the midblastula transition: mutants from the zebrafish I. Dev. Cell 6, 771–780.

seven figures, one table, one movie, and one data set and can be found with
this article online at http://dx.doi.org/10.1016/j.cell.2016.06.051.
E.B. and T.J.M. conceived the project together. E.B. designed the experiments
with A.A.H., S.A., and T.J.M. All experiments were performed by E.B., except
for mass spectrometry analysis (M.W. and S.P.G.) and electron microscopy
(M.C. and M.R.). Protein expression and purification were performed by
M.R., D.D., and R.L. The manuscript was written by E.B. with input from
A.A.H., S.A., and T.J.M.
ACKNOWLEDGMENTS
We thank members of the T.J.M. and A.A.H. labs, especially Avinash Patel for
helpful discussions, Andrei Pozniakovski for cloning, and Christine Field for
help in making extracts. We would like to thank Doris Richter and Sonja
Kroschwald for technical assistance. We are grateful to the Protein Expression, Electron Microscopy and Image Processing facilities of the MPI-CBG
for their support. We would also like to thank the Nikon Imaging Center at
Harvard Medical School for microscopy support. Proteomic analysis was supported by NIH grant R01-GM103785 (PI Marc W. Kirschner). M.W. was supported by the Charles A. King Trust Postdoctoral Fellowship. This work was
supported by the MaxSynBio consortium, which is jointly funded by the
Federal Ministry of Education and Research of Germany and the Max Planck
Society (to A.A.H.) and NIH grant GM39565 (to T.J.M.).
Revised: May 6, 2016
REFERENCES
al-Mukhtar, K.A., and Webb, A.C. (1971). An ultrastructural study of primordial germ cells, oogonia and early oocytes in Xenopus laevis. J. Embryol. Exp. Morphol. 26, 195–217.
Alberti, S., Halfmann, R., King, O., Kapila, A., and Lindquist, S. (2009). A systematic survey identifies prions and illuminates sequence features of prionogenic proteins. Cell 137, 146–158.
Alberti, S., Halfmann, R., and Lindquist, S. (2010). Biochemical, cell biological, and genetic assays to analyze amyloid and prion aggregation in yeast. Methods Enzymol. 470, 709–734.
Bagriantsev, S.N., Kushnirov, V.V., and Liebman, S.W. (2006). Analysis of amyloid aggregates using agarose gel electrophoresis. Methods Enzymol. 412, 33–48.
Balinsky, B., and Devis, R.J. (1963). Origin and differentiation of cytoplasmic structures in the oocytes of Xenopus laevis. Acta Embryol. Morphol. Exp. 6, 55–108.
Berchowitz, L.E., Kabachinski, G., Walker, M.R., Carlile, T.M., Gilbert, W.V., Schwartz, T.U., and Amon, A. (2015). Regulated formation of an amyloid-like translational repressor governs gametogenesis. Cell 163, 406–418.
Billett, F.S., and Adam, E. (1976). The structure of the mitochondrial cloud of Xenopus laevis oocytes. J. Embryol. Exp. Morphol. 36, 697–710.
Dixon, D.A., Balch, G.C., Kedersha, N., Anderson, P., Zimmerman, G.A., Beauchamp, R.D., and Prescott, S.M. (2003). Regulation of cyclooxygenase-2 expression by the translational silencer TIA-1. J. Exp. Med. 198, 475–481.
Dumont, J.N. (1972). Oogenesis in Xenopus laevis (Daudin). I. Stages of oocyte development in laboratory maintained animals. J. Morphol. 136, 153–179.
England, J.L., and Haran, G. (2011). Role of solvation effects in protein denaturation: from thermodynamics to single molecules and back. Annu. Rev. Phys. Chem. 62, 257–277.
Ephrussi, A., Dickinson, L.K., and Lehmann, R. (1991). Oskar organizes the germ plasm and directs localization of the posterior determinant nanos. Cell 66, 37–50.
Field, C.M., Nguyen, P.A., Ishihara, K., Groen, A.C., and Mitchison, T.J. (2014). Xenopus egg cytoplasm with intact actin. Methods Enzymol. 540, 399–415.
Fowler, D.M., Koulov, A.V., Alory-Jost, C., Marks, M.S., Balch, W.E., and Kelly, J.W. (2006). Functional amyloid formation within mammalian tissue. PLoS Biol. 4, e6.
Gupta, T., Marlow, F.L., Ferriola, D., Mackiewicz, K., Dapprich, J., Monos, D., and Mullins, M.C. (2010). Microtubule actin crosslinking factor 1 regulates the Balbiani body and animal-vegetal polarity of the zebrafish oocyte. PLoS Genet. 6, e1001073.
Haass, C., and Selkoe, D.J. (2007). Soluble protein oligomers in neurodegeneration: lessons from the Alzheimer's amyloid beta-peptide. Nat. Rev. Mol. Cell Biol. 8, 101–112.
Halfmann, R., Jarosz, D.F., Jones, S.K., Chang, A., Lancaster, A.K., and Lindquist, S. (2012). Prions are a common mechanism for phenotypic inheritance in wild yeasts. Nature 482, 363–368.
Heim, A.E., Hartung, O., Rothhamel, S., Ferreira, E., Jenny, A., and Marlow, F.L. (2014). Oocyte polarity requires a Bucky ball-dependent feedback amplification loop. Development 141, 842–854.
Hertig, A.T. (1968). The primary human oocyte: some observations on the fine structure of Balbiani's vitelline body and the origin of the annulate lamellae. Am. J. Anat. 122, 107–137.
Hertig, A.T., and Adams, E.C. (1967). Studies on the human oocyte and its follicle. I. Ultrastructural and histochemical observations on the primordial follicle stage. J. Cell Biol. 34, 647–675.
Hou, F., Sun, L., Zheng, H., Skaug, B., Jiang, Q.-X., and Chen, Z.J. (2011). MAVS forms functional prion-like aggregates to activate and propagate antiviral innate immune response. Cell 146, 448–461.
Houston, D.W., and King, M.L. (2000). Germ plasm and molecular determinants of germ cell fate. Curr. Top. Dev. Biol. 50, 155–181.
Kato, M., Han, T.W., Xie, S., Shi, K., Du, X., Wu, L.C., Mirzaei, H., Goldsmith, E.J., Longgood, J., Pei, J., et al. (2012). Cell-free formation of RNA granules: low complexity sequence domains form dynamic fibers within hydrogels. Cell 149, 753–767.
Kloc, M., and Etkin, L.D. (1995). Two distinct pathways for the localization of RNAs at the vegetal cortex in Xenopus oocytes. Development 121, 287–297.
Kloc, M., Spohr, G., and Etkin, L.D. (1993). Translocation of repetitive RNA sequences with the germ plasm in Xenopus oocytes. Science 262, 1712–1714.
Kloc, M., Bilinski, S., and Etkin, L.D. (2004). The Balbiani body and germ cell determinants: 150 years later. Curr. Top. Dev. Biol. 59, 1–36.
Kogo, N., Tazaki, A., Kashino, Y., Morichika, K., Orii, H., Mochii, M., and Watanabe, K. (2011). Germ-line mitochondria exhibit suppressed respiratory activity to support their accurate transmission to the next generation. Dev. Biol. 349, 462–469.
Koo, E.H., Lansbury, P.T., Jr., and Kelly, J.W. (1999). Amyloid diseases: abnormal protein aggregation in neurodegeneration. Proc. Natl. Acad. Sci. USA 96, 9989–9990.
Lancaster, A.K., Nutter-Upham, A., Lindquist, S., and King, O.D. (2014). PLAAC: a web and command-line application to identify proteins with prion-like amino acid composition. Bioinformatics 30, 2501–2502.
Lei, L., and Spradling, A.C. (2016). Mouse oocytes differentiate through organelle enrichment from sister cyst germ cells. Science 352, 95–99.
Lesch, B.J., and Page, D.C. (2012). Genetics of germ cell development. Nat. Rev. Genet. 13, 781–794.
Li, R., and Albertini, D.F. (2013). The road to maturation: somatic cell interaction and self-organization of the mammalian oocyte. Nat. Rev. Mol. Cell Biol. 14, 141–152.
Li, J., McQuade, T., Siemer, A.B., Napetschnig, J., Moriwaki, K., Hsiao, Y.-S., Damko, E., Moquin, D., Walz, T., McDermott, A., et al. (2012). The RIP1/RIP3 necrosome forms a functional amyloid signaling complex required for programmed necrosis. Cell 150, 339–350.
Lopez de la Paz, M., and Serrano, L. (2004). Sequence determinants of amyloid fibril formation. Proc. Natl. Acad. Sci. USA 101, 87–92.
Marlow, F.L., and Mullins, M.C. (2008). Bucky ball functions in Balbiani body assembly and animal-vegetal polarity in the oocyte and follicle cell layer in zebrafish. Dev. Biol. 321, 40–50.
McAlister, G.C., Nusinow, D.P., Jedrychowski, M.P., Wühr, M., Huttlin, E.L., Erickson, B.K., Rad, R., Haas, W., and Gygi, S.P. (2014). MultiNotch MS3 enables accurate, sensitive, and multiplexed detection of differential expression across cancer cell line proteomes. Anal. Chem. 86, 7150–7158.
Nilsson, M.R. (2004). Techniques to study amyloid fibril formation in vitro. Methods 34, 151–160.
Patel, A., Lee, H.O., Jawerth, L., Maharana, S., Jahnel, M., Hein, M.Y., Stoynov, S., Mahamid, J., Saha, S., and Franzmann, T.M. (2015). A liquid-to-solid phase transition of the ALS protein FUS accelerated by disease mutation. Cell 162, 1066–1077.
Pepling, M.E., Wilhelm, J.E., O'Hara, A.L., Gephardt, G.W., and Spradling, A.C. (2007). Mouse oocytes within germ cell cysts and primordial follicles contain a Balbiani body. Proc. Natl. Acad. Sci. USA 104, 187–192.
Polymenidou, M., and Cleveland, D.W. (2011). The seeds of neurodegeneration: prion-like spreading in ALS. Cell 147, 498–508.
Richardson, B.E., and Lehmann, R. (2010). Mechanisms guiding primordial germ cell migration: strategies from different organisms. Nat. Rev. Mol. Cell Biol. 11, 37–49.
Serio, T.R., Cashikar, A.G., Kowal, A.S., Sawicki, G.J., Moslehi, J.J., Serpell, L., Arnsdorf, M.F., and Lindquist, S.L. (2000). Nucleated conformational conversion and the replication of conformational information by a prion determinant. Science 289, 1317–1321.
Si, K., Lindquist, S., and Kandel, E.R. (2003). A neuronal isoform of the Aplysia CPEB has prion-like properties. Cell 115, 879–891.
Steger, K. (2001). Haploid spermatids exhibit translationally repressed mRNAs. Anat. Embryol. (Berl.) 203, 323–334.
Toombs, J.A., Petri, M., Paul, K.R., Kan, G.Y., Ben-Hur, A., and Ross, E.D. (2012). De novo design of synthetic prion domains. Proc. Natl. Acad. Sci. USA 109, 6519–6524.
Tsumoto, K., Umetsu, M., Kumagai, I., Ejima, D., Philo, J.S., and Arakawa, T. (2004). Role of arginine in protein refolding, solubilization, and purification. Biotechnol. Prog. 20, 1301–1308.
Wickner, R.B., Edskes, H.K., Bateman, D.A., Kelly, A.C., Gorkovskiy, A., Dayani, Y., and Zhou, A. (2013). Amyloids and yeast prion biology. Biochemistry 52, 1514–1527.
Wühr, M., Freeman, R.M., Jr., Presler, M., Horb, M.E., Peshkin, L., Gygi, S.P., and Kirschner, M.W. (2014). Deep proteomics of the Xenopus laevis egg using an mRNA-derived reference database. Curr. Biol. 24, 1467–1475.
Zhou, Y., and King, M.L. (1996). Localization of Xcat-2 RNA, a putative germ plasm component, to the mitochondrial cloud in Xenopus stage I oocytes. Development 122, 2947–2953.
Article
Compositional Control of Phase-Separated Cellular Bodies
Authors
Salman F. Banani, Allyson M. Rice,
William B. Peeples, Yuan Lin,
Saumya Jain, Roy Parker,
Michael K. Rosen
Correspondence
michael.rosen@utsouthwestern.edu
In Brief
What are the general principles that
define the composition of phase-separated cellular bodies?
Highlights
• Cellular bodies are organized by scaffolds and recruit clients
• Clients bind to free sites in the scaffolds, and binding scales with client valency
• Relative scaffold stoichiometries control client recruitment in switch-like fashion
• Cells can control these parameters and thus regulate cellular body composition
Banani et al., 2016, Cell 166, 651–663

Article
Compositional Control of Phase-Separated Cellular Bodies
Salman F. Banani,1 Allyson M. Rice,1 William B. Peeples,1 Yuan Lin,1 Saumya Jain,2 Roy Parker,2 and Michael K. Rosen1,*
1Department of Biophysics and Howard Hughes Medical Institute, UT Southwestern Medical Center, Dallas, TX 75390, USA
2Department of Chemistry and Biochemistry, Howard Hughes Medical Institute, University of Colorado, Boulder, CO 80309, USA
*Correspondence: michael.rosen@utsouthwestern.edu
SUMMARY
Cellular bodies such as P bodies and PML nuclear
bodies (PML NBs) appear to be phase-separated liquids organized by multivalent interactions among
proteins and RNA molecules. Although many components of various cellular bodies are known, general
principles that define body composition are lacking.
We modeled cellular bodies using several engineered multivalent proteins and RNA. In vitro and in
cells, these scaffold molecules form phase-separated liquids that concentrate low valency client proteins. Clients partition differently depending on the
ratio of scaffolds, with a sharp switch across the
phase diagram diagonal. Composition can switch
rapidly through changes in scaffold concentration
or valency. Natural PML NBs and P bodies show
analogous partitioning behavior, suggesting how
their compositions could be controlled by levels of
PML SUMOylation or cellular mRNA concentration,
respectively. The data suggest a conceptual framework for considering the composition and control
thereof of cellular bodies assembled through heterotypic multivalent interactions.
INTRODUCTION
Eukaryotic cells compartmentalize biological processes to
achieve spatial and temporal control over biochemical reactions.
Compartmentalization has long been studied in the context of
membrane-bound organelles, where mechanisms of biogenesis
and transport of molecules into and out of the organelle are well
understood. Cells also contain numerous membrane-less organelles, collectively referred to as cellular bodies (Mao et al., 2011).
These structures, which include P granules, P bodies, nucleoli,
and promyelocytic leukemia nuclear bodies (PML NBs), are
micron-sized assemblies of proteins and often RNA found in
the cytoplasm and nucleoplasm of eukaryotic cells. They appear
to be functionally important, as inferred from their conservation
among evolutionarily distant species (Handwerger et al., 2005)
and their tendency to concentrate functionally related groups
of molecules (Mao et al., 2011; Mohamad and Boden, 2010).
Ultrastructural analysis of cellular bodies suggests that they
are porous structures with densities comparable to those of
the nucleo- or cytoplasm (Handwerger et al., 2005). Analysis in
live cells has revealed that, macroscopically, the bodies persist
for hours to days. Yet, they are highly dynamic at the molecular
level, turning over their contents within seconds to minutes
(Dundr et al., 2004; Weidtkamp-Peters et al., 2008). Recent
work has demonstrated that bodies exhibit liquid-like properties
(Brangwynne et al., 2009; Brangwynne et al., 2011; Chen et al.,
2008; Kroschwald et al., 2015; Patel et al., 2015; Wang et al.,
2014). These and other behaviors suggest that cellular bodies
are condensed phases that form through liquid-liquid phase
separation of the nucleo- or cytoplasm (Hyman et al., 2014).
Cellular bodies are often enriched in multivalent molecules (Li
et al., 2012): proteins that harbor multiple modular domains or
stretches of low-complexity amino acid sequence with repeated
interaction motifs (Decker et al., 2007; Han et al., 2012; Kato
et al., 2012; Reijns et al., 2008) or charged elements (Elbaum-Garfinkle et al., 2015; Nott et al., 2015); RNA species that contain
multiple protein-binding elements; or combinations thereof.
Interactions between multivalent macromolecules can drive
polymerization-driven phase separation (Banjade et al., 2015;
Fromm et al., 2014; Li et al., 2012; Mitrea et al., 2016; Nott
et al., 2015), resulting in the formation of a condensed, droplet
phase suspended in the bulk solution phase. It has been suggested that this fundamental macromolecular behavior may be
an important principle governing the organization of cellular
bodies (Fromm et al., 2014; Li et al., 2012; Mitrea et al., 2016;
Nott et al., 2015). Indeed, expressing engineered multivalent proteins or ectopically tethering high copy numbers of body components (high local valency) in cells is sufficient to form dynamic,
membrane-less puncta that resemble bona fide cellular bodies
(Chung et al., 2011; Kaiser et al., 2008; Li et al., 2012).
Cellular bodies typically contain tens to hundreds of types of
molecules (Buchan and Parker, 2009; Fong et al., 2013). Where
characterized in detail, only a small number of these components
appear to be essential for the structural integrity of the body
(Clemson et al., 2009; Hanazawa et al., 2011; Ishov et al.,
1999). We refer here to such molecules as scaffolds. In contrast,
the remaining majority of components are dispensable for body
assembly and often reside in the bodies only under certain conditions (Dellaire et al., 2006; Grousl et al., 2009). These molecules, which we refer to here as clients, often contain elements
that specifically bind to elements in the scaffolds, often via low
valency interacting elements of the same class as those in the
scaffolds (e.g., Chalupníková et al., 2008; Lin et al., 2006). For
example, P bodies assemble in part via scaffolding interactions
between RNA binding proteins and RNA but also recruit several
RNA binding proteins that are not important for P body assembly (Buchan and Parker, 2009). Within cellular bodies, clients
diffuse much more rapidly than scaffolds (Dundr et al., 2004;
Weidtkamp-Peters et al., 2008), suggesting that client-scaffold
interactions are more transient than the interactions among
scaffolds.
Compositional regulation is a general property of many cellular
bodies and may be crucial to their function. Cellular body compositions change during the phases of the cell cycle or in
response to stresses (Dellaire et al., 2006; Grousl et al., 2009).
Despite their importance, the fundamental principles governing
cellular body composition have been experimentally difficult to
elucidate, owing to the complex nature of both scaffolds and
clients and the diversity of species that reside within bodies.
However, simplified model systems composed of few types
of molecules, each with well-defined interaction elements, can
help isolate key molecular parameters and thus have the potential to reveal generalizable concepts.
Here, we describe the biochemical and cellular behavior of
three different sets of engineered molecules as simplified but
representative multivalent scaffolds and low valency clients,
which form model cellular bodies. Clients were differentially recruited into the bodies based on the relative stoichiometries of
the scaffolds. Changes in client recruitment occurred sharply
and on cellular timescales as the scaffold stoichiometries or valencies changed. Client partitioning also depended on client
valency. These findings lead to a simple mass action model
that predicts many features of the observed client partitioning
behavior and suggests how cellular body compositions could
be regulated in cells. Behaviors analogous to those of the model
systems were observed in PML NBs in mammalian nuclei and P
bodies in yeast cytoplasm. Thus, although natural cellular bodies
are complex, their compositions may be governed by simple underlying rules and could be altered based on parameters that are
easily tunable through cellular and evolutionary mechanisms.
RESULTS
Scaffold Stoichiometries Dictate Client Recruitment
We began by studying three independent pairs of interacting
multivalent scaffolds in vitro. These systems consisted of (1) a
protein with ten repeats of human SUMO3 (polySUMO) and a
protein with ten repeats of the SUMO Interaction Motif (SIM)
from PIASx (polySIM); (2) a protein with four repeats of the second SH3 domain from Nck (polySH3) and a protein containing
four repeats of a Proline-Rich Motif (PRM) from Abl1 (polyPRM)
(Li et al., 2012); and (3) the PTB protein (contains four RNA recognition motifs [RRMs]) and an RNA with five repeats of the RRM
recognition element UCUCU (polyUCUCU) (Li et al., 2012).
Each of these pairs phase separated when mixed together, but
not when individual components were alone in solution (Li
et al., 2012; Figure S1A; and data not shown).
To model client recruitment into the bodies, we engineered a
series of fluorescently labeled, monovalent clients (containing
a single element that binds the scaffold) and characterized their
partitioning into droplets generated by their cognate scaffolds.
We mixed (1) GFP-SUMO and RFP-SIM (or GFP-SIM) with
polySUMO/polySIM (Figure 1A); (2) GFP-PRM and RFP-SH3
with polySH3/polyPRM (Figure 1B); and (3) GFP-RRM and
UCUCU-AlexaFluor647 (AF647) with PTB/polyUCUCU (Figure 1C). Partition coefficients (PCs) for the clients, defined as
the ratio of concentrations in the droplet to the bulk phases,
ranged from 1 to 10 across the phase diagram (Figures S1B
and S1D). Client recruitment in all three systems was qualitatively
similar. Clients partitioned asymmetrically about the diagonal of
the phase diagram (the line of equal scaffold stoichiometry) or
near to it; each client was enriched only on the side where its
cognate scaffold was in stoichiometric excess in the solution.
For example, when polySIM was in excess (above the diagonal),
GFP-SUMO was enriched in the droplets (PC ~3), but when
polySUMO was in excess (below the diagonal), GFP-SUMO
concentrated nearly equally in both phases (PC ~1) (Figure S1B).
GFP-SIM showed an opposite pattern of enrichment (PC ~3
when polySUMO was in excess; PC ~1 when polySIM was in
excess). Recruitment preference transitioned sharply, in
switch-like fashion, as the diagonal was crossed. For the polySUMO/polySIM system, neither GFP alone nor clients mutated
at their binding sites were enriched in droplets on either side of
the diagonal (Figures S1B and S1C). Thus, binding to the scaffold
proteins is necessary and sufficient for enrichment into the
droplets (PC > 1).
Together, these data show that, regardless of the molecular
system, low valency clients partition asymmetrically into droplets formed by heterotypic scaffold interactions, with a sharp
switch in client recruitment preference across the diagonal.
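The partition coefficient defined above is a simple ratio, and the side of the phase diagram is set by which scaffold has more modules. A minimal sketch of both quantities follows; the function names and the numerical inputs are hypothetical illustrations, not values or code from the study.

```python
def partition_coefficient(c_droplet, c_bulk):
    """PC = client concentration in the droplet phase / concentration in the bulk phase."""
    if c_bulk <= 0:
        raise ValueError("bulk concentration must be positive")
    return c_droplet / c_bulk


def scaffold_in_excess(sumo_modules, sim_modules):
    """Name the scaffold in stoichiometric excess, comparing module concentrations."""
    if sim_modules > sumo_modules:
        return "polySIM"      # above the diagonal
    if sumo_modules > sim_modules:
        return "polySUMO"     # below the diagonal
    return "neither"          # on the diagonal


# Hypothetical client signals (arbitrary units) in the two phases.
pc = partition_coefficient(300.0, 100.0)
side = scaffold_in_excess(sumo_modules=60.0, sim_modules=80.0)
print(pc, side)
```

A PC near 1 means the client is not enriched in the droplets, and a PC above 1 means it is.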
Valency of Client Affects Client Recruitment
Since the clients of a given cellular body can differ in their valencies, we examined how client valency affected partitioning. We
fused to GFP 2 or 3 tandem repeats of SUMO or SIM and
measured the PC for these clients across the polySUMO/
polySIM phase diagram (Figure 2). Like their monovalent counterparts, the di- and trivalent clients partitioned into the droplets
predominantly on one side of the phase diagram, transitioning
sharply in their PCs across the diagonal. However, both the
di- and trivalent clients had larger magnitudes of maximum
partitioning than their monovalent counterparts, a feature that
increased with valency: max PC was 19 and 37 for GFP-(SUMO)2 and GFP-(SUMO)3, respectively, and 21 and 61 for
GFP-(SIM)2 and GFP-(SIM)3, respectively. In all cases, maximum
partitioning occurred just past the diagonal, substantially
enhancing the sharpness of the switch between client preferences. The increased partitioning was likely due to higher
apparent affinity of the di- and trivalent clients for the scaffold.
Indeed, isothermal titration calorimetry (ITC) experiments verified that apparent affinity of the clients to cognate sites increases
with increasing valency (Figure S2 and Table S1).
These data demonstrate that, in addition to position on the
phase diagram, client valency can strongly influence client partitioning and thus droplet composition.
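One way to see why a higher apparent affinity should raise partitioning is a minimal equilibrium-binding sketch. It rests on assumptions that are ours rather than the paper's: the client binds free scaffold sites 1:1 with dissociation constant `kd`, free sites greatly outnumber client, and unbound client is at the same concentration in both phases. Total client in a phase is then proportional to 1 + [free sites]/Kd. The function name and all numerical values are hypothetical.

```python
def client_pc(free_sites_droplet, free_sites_bulk, kd):
    """Partition coefficient of a client under simple 1:1 binding.

    Total client in a phase = free client * (1 + free_sites / kd), so the
    ratio of totals is the PC when free client is equal in both phases.
    All arguments must share one concentration unit (nM here).
    """
    if kd <= 0:
        raise ValueError("kd must be positive")
    return (1.0 + free_sites_droplet / kd) / (1.0 + free_sites_bulk / kd)


# Hypothetical free-site concentrations (nM) for the scaffold in excess.
droplet_sites, bulk_sites = 1.0e5, 1.0e3

# Tighter apparent binding (smaller kd), as for higher client valency, raises PC.
for kd in (10000.0, 1000.0, 70.0):
    print(kd, client_pc(droplet_sites, bulk_sites, kd))

# With no free cognate sites in either phase, the client is not enriched.
print(client_pc(0.0, 0.0, 70.0))
```

Under these assumptions the PC rises as Kd falls and is bounded by the ratio of free sites in the two phases. A client whose cognate scaffold has no free sites stays at PC = 1.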
Figure 1. Phase Diagram Position Dictates Client Recruitment
Solutions of multivalent scaffolds plus the indicated clients were imaged for client fluorescence.
AF, Alexa fluorophore.
(A) GFP-SUMO (green) and RFP-SIM (magenta) (100 nM each) were mixed with the indicated
module concentrations of polySUMO and polySIM.
(B) GFP-PRM (green) and RFP-SH3 (magenta) (200 nM each) were mixed with the indicated
module concentrations of polyPRM and polySH3.
(C) UCUCU-AF647 (green) and RFP-RRM (magenta) (200 nM each) were mixed with the
indicated module concentrations of polyUCUCU and PTB.
Panel axes give module concentrations in µM (SIM versus SUMO, SH3 versus PRM, and RRM versus UCUCU); scale bars, 100 µm.
See also Figure S1.
Mass Action Explains Switch-like Partitioning of Low
Valency Clients
We sought to understand the origin of the switch-like nature of
client partitioning. Our data suggest that partitioning depends
strictly on SUMO-SIM interactions between clients and scaffolds
(Figures S1B and S1C). We reasoned that client partitioning
should be governed by the relative concentrations of available
scaffold binding sites in droplets versus the bulk.
The apparent dissociation constant for polySUMO/polySIM
(Kd ≤ 1 nM, based on ITC measurements with (SUMO)5/(SIM)5)
is much less than the scaffold concentrations in either
phase (Figure 3C), suggesting that most scaffold sites are occupied.
Moreover, the apparent client-scaffold dissociation constants
(Kd = 70–10,000 nM, estimated from ITC measurements
with (SUMO)m + (SIM)m, for m = 1, 2 or 3) (Figure S2 and Table
S1) are much weaker than the apparent polySUMO/polySIM affinity.
Thus, clients should be poor competitors of scaffold-scaffold
interactions. This analysis suggests that, in either phase,
only the scaffold that is in stoichiometric excess will have free
sites accessible to its cognate client.
Conversely, the scaffold that is stoichiometrically deficient will effectively be
saturated by scaffold-scaffold interactions and be invisible to its cognate client
in either phase.
The scaffold composition of the droplet
and bulk phases varied in a smooth,
graded fashion across the phase diagram, with PCs ranging from 30–125
(Figures 3A and 3B). For most of the
phase diagram, the PC of the two scaffolds was similar (within an approximately 2-fold
range), such that the droplets were
essentially concentrated counterparts of
the bulk solution (Figure 3C). Thus, at
each point in the phase diagram, the
scaffold in excess has a higher concentration of free sites in the droplet than in
the bulk (Figure 3D) and consequently
concentrates its cognate client into the
droplets. The scaffold that is stoichiometrically deficient has few free sites in either
phase, and its cognate client remains
uniformly distributed. Since the stoichiometric relationship between the two
scaffolds switches abruptly in both
phases near the diagonal, the capacity of the droplets to recruit
one client over the other also switches abruptly.
We modeled client partitioning by mass action (Figure 3E). We
allowed clients to equilibrate between two simulated phases
while binding to free sites at concentrations computed from
our experiments (Figure 3C, Table S1, and Supplemental Information).
This simple mass action model suffices to recapitulate
the key qualitative features of observed client partitioning (Figure 3F):
(1) selective partitioning of clients, restricted to only
one side of the diagonal; (2) a sharp change in partitioning as
the diagonal is crossed; and (3) the dependence of partitioning
on the apparent client-scaffold affinity. As described in the Supplemental
Information, the model also predicts less intuitive features
of the data, including non-monotonic partitioning, as well
Cell 166, 651–663, July 28, 2016 653
[Figure 2 panels: GFP-(SUMO)1, GFP-(SUMO)2, GFP-(SUMO)3, GFP-(SIM)1, GFP-(SIM)2, and GFP-(SIM)3 clients in polySUMO + polySIM droplets. Axes: SUMO module concentration (µM), SIM module concentration (µM), and partition coefficient.]
Figure 2. Client Valency Affects Partitioning

PCs (means of duplicate samples) of the indicated clients (100 nM) into droplets formed by the indicated module concentrations of polySUMO and polySIM.
as dramatically high partitioning of one client and attenuated
partitioning of the other near the diagonal (Figure 3F; Figure 2,
trivalent clients; Figures S3 and S4).
Collectively, this analysis suggests that switch-like changes in
client partitioning fundamentally arise from the sharp inversion of
scaffold excess across the diagonal of the phase diagram.
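The inversion of scaffold excess can be illustrated with a minimal sketch. It treats SUMO and SIM modules within one phase as independent 1:1 binders, which ignores multivalency and is an assumption made only for illustration. The function name `free_sites`, the module totals, and the Kd of 1 nM (the order of magnitude quoted above) are likewise illustrative choices, not values or code from the study.

```python
import math


def free_sites(total_a: float, total_b: float, kd: float) -> tuple[float, float]:
    """Return (free A, free B) for 1:1 binding A + B <-> AB at equilibrium.

    All arguments share one concentration unit (µM here).
    """
    s = total_a + total_b + kd
    # Physical (smaller) root of: bound**2 - s*bound + total_a*total_b = 0
    bound = (s - math.sqrt(s * s - 4.0 * total_a * total_b)) / 2.0
    return total_a - bound, total_b - bound


KD_UM = 0.001  # assumed scaffold-scaffold Kd of 1 nM, written in µM

# Hypothetical module totals (µM) stepping across the diagonal (SUMO = SIM).
for sumo, sim in [(60.0, 80.0), (69.0, 71.0), (71.0, 69.0), (80.0, 60.0)]:
    free_sumo, free_sim = free_sites(sumo, sim, KD_UM)
    print(f"SUMO {sumo:4.0f}  SIM {sim:4.0f}  "
          f"free SUMO {free_sumo:7.3f}  free SIM {free_sim:7.3f}")
```

Because the assumed Kd is far below the module concentrations, nearly all of the deficient module is bound, and the free sites belong almost entirely to whichever module is in excess. A small step across the diagonal therefore swaps which class of free site is available.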
Compositional States Interchange on Cellular Timescales
Our data and analyses suggest how compositional states could
be controlled by mass action. We wondered whether transitions
between two compositional states were kinetically feasible on
cellular timescales. We equilibrated polySUMO/polySIM droplets at a point on the phase diagram where only one of the clients,
either GFP-SUMO or RFP-SIM, was preferentially enriched in the
droplets (but both were present in solution). We then abruptly
changed the concentration of the scaffold components to
move the system to a point across the diagonal where the reciprocal recruitment preference was expected (Figure 4A). The
droplets remained intact, spherical, and of relatively consistent
sizes throughout the experiment. Within 6 hr, all droplets
expelled the initially enriched client in exchange for the other
client (Figure 4B). Recruitment of the latter started at the outer
edges of the droplets and moved inward, and smaller droplets
exchanged clients more rapidly than larger droplets.
In fluorescence recovery after photobleaching (FRAP) experiments, clients diffused much more rapidly within droplets
than did scaffolds (i.e., for droplets 20 µm in diameter, RFP-SIM and polySUMO had exponential recovery constants, τ, of
1.3 min and 38 min, respectively; Figure S5 and Table S3).
Thus, scaffold rearrangements likely limit the rate of transitions
between compositional states. Scaling recovery times to droplets of 1 µm diameter, as often observed in cells, indicates that
compositions should interchange on a timescale of 6 s in
natural systems (Table S3).
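The scaling step can be checked with a short calculation. The sketch below assumes that recovery is diffusion limited, so that the recovery time grows with the square of the droplet diameter. That scaling law is an assumption here (the text defers the details to Table S3), and the function name is hypothetical.

```python
def scale_recovery_time(tau_s: float, d_measured_um: float, d_target_um: float) -> float:
    """Rescale a FRAP recovery time, assuming tau is proportional to diameter squared."""
    return tau_s * (d_target_um / d_measured_um) ** 2


# Recovery constants quoted in the text for droplets 20 µm in diameter.
tau_scaffold_s = 38 * 60   # polySUMO scaffold, 38 min
tau_client_s = 1.3 * 60    # RFP-SIM client, 1.3 min

# Rescale both to a 1 µm droplet, a size often observed in cells.
print("scaffold:", scale_recovery_time(tau_scaffold_s, 20.0, 1.0), "s")
print("client:  ", scale_recovery_time(tau_client_s, 20.0, 1.0), "s")
```

Under this assumption the scaffold value comes out at about 5.7 s, in line with the roughly 6 s timescale stated above, while the client would exchange in a fraction of a second.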
We previously demonstrated how covalent modifications of
scaffolds could regulate the formation and dissolution of droplet
phases (Li et al., 2012). We likewise wondered whether covalent
modifications could also regulate droplet compositions. In cells,
SUMO modifications are dynamically added by the SUMO ligase
cascade and removed by SUMO proteases. We generated a
single component, fused (SUMO)9-(SIM)8 scaffold that could
be selectively cleaved by Ulp1, the yeast SUMO protease, to
produce (SUMO)7-(SIM)8, mimicking natural deSUMOylation
(see Experimental Procedures). Such (SUMO)m-(SIM)n (m ≠ n)
fusions are essentially fixed on one side of the phase diagram
[Figure 3 panels: (A) polySUMO, polySIM, and merged images versus SUMO and SIM module concentrations (µM); scale bar, 100 µm. (B) polySUMO and polySIM partition coefficients. (C) Module concentration in phase (µM) for droplet and bulk. (D) Droplet/bulk ratio of free sites. (E) Schematic with droplet, bulk, free sites, and clients. (F) Predicted partitioning of SUMO and SIM clients versus Kd, client-to-scaffold affinity (µM module).]
Figure 3. A Mass Action Model Predicts Client Partitioning Behavior

(A) Imaging of polySUMO and polySIM (1% labeled with AF488 [green] and AF647 [magenta], respectively) fluorescence.
(B) PCs (means of duplicate samples) of polySUMO (left) and polySIM (right) calculated from imaging (see Experimental Procedures).
(C) Scaffold module concentrations (blue and yellow dots) in the droplet (top) and bulk (bottom) phases at the anti-diagonal data points from panel (B). To model
client partitioning, values of concentrations were smoothed and interpolated with a cubic spline to yield continuous curves from discrete data. The continuous,
interpolated values were used for subsequent calculations. Error bars represent SEM. Dotted line, phase diagram diagonal.
(D) Blue curve shows the ratio of free SUMO sites in the droplet phase to free SUMO sites in the bulk phase. Yellow curve shows the analogous ratio for free SIM
sites.
(E) Mass action model for the partitioning of a low valency client, L, that binds to free scaffold sites R1 and R2 in the droplet and bulk, respectively (see
(F) Predicted PC of clients as a function of affinity for scaffolds. Free site concentrations computed in (E) were used to parameterize the model (C) and predict
partitioning of client as a function of their apparent affinity (ranging from 10^-2 to 10^2 µM module) for the scaffolds (see Experimental Procedures).
diagonal and have client recruitment preferences analogous
to the in trans systems (see Figure 5A). When mixed with
GFP-(SIM)2 and RFP-(SUMO)2, (SUMO)9-(SIM)8 droplets recruited the former but not the latter client (Figure 4C). Ulp1 cleavage, which shifted the scaffold to the other side of the phase
diagram diagonal, caused the droplets to expel GFP-(SIM)2
and recruit RFP-(SUMO)2. These data suggest that enzymatic
modifications of cellular body scaffolds, such as SUMOylation
and deSUMOylation, could robustly regulate body composition.
We conclude that droplets can transition, without compromising structural integrity, between substantially different compositional states on timescales accessible to cells. This can occur
with only subtle changes in the concentration or covalent modifications of their polymer scaffolds.
Figure 4. Droplets Interchange Composition on Cellular Timescales without Compromising Structural Integrity
(A) Schematic of experiment. After equilibration of 100 nM GFP-SUMO and 100 nM RFP-SIM with polySUMO and polySIM at module concentrations of 60 µM and
80 µM, respectively, concentrations of the polySUMO and polySIM were abruptly shifted to 80 µM and 60 µM, respectively, for trajectory 1 and vice versa for
trajectory 2.
(B) Time lapse imaging of droplets starting immediately after the abrupt change in concentrations of polySUMO and polySIM, showing merged, pseudocolored
fluorescence signals from GFP-SUMO (green) and RFP-SIM (magenta). Note that small droplets (white arrows, top) interconvert more quickly than larger droplets
(bottom).
(C) 6 µM of a (SUMO)9-(SIM)8 scaffold containing Ulp1 cleavage sites after only the two N-terminal SUMOs was equilibrated with 50 nM of GFP-(SIM)2 (green) and
RFP-(SUMO)2 (magenta). Time lapse imaging was started immediately after addition of 10 nM of Ulp1. Pseudocolored images showing merged fluorescent
signals from the two clients are shown.
Engineered Cellular Puncta Selectively Concentrate Low Valency Clients
We next asked whether the partitioning behavior observed
in vitro could also be observed in cells. For these experiments,
we used in cis [(SUMO)m-(SIM)n] scaffolds, which afforded tight
experimental control of the relative module concentrations independent of absolute concentrations. Both (SUMO)10-(SIM)5
and (SUMO)5-(SIM)10 phase separated in vitro at micromolar
concentrations. (SUMO)10-(SIM)5 droplets enriched GFP-SIM
(PC = 4.7), but not GFP-SUMO (PC = 1.3), and (SUMO)5-(SIM)10 showed the reverse (PC = 2.8 for GFP-SUMO;
PC = 1.2 for GFP-SIM) (Figures 5A and 5B).
We then individually expressed RFP-(SUMO)10-(SIM)6 or RFP-(SUMO)6-(SIM)10 in HeLa cells, where they each formed spherical, micron-sized puncta in the cytoplasm. In live cells, the
puncta occasionally contacted each other and coalesced into
larger structures, suggesting that they are phase-separated liquids (data not shown). When co-transfected with individual
YFP-tagged clients, RFP-(SUMO)10-(SIM)6 puncta only concentrated YFP-SIM. Reciprocally, RFP-(SUMO)6-(SIM)10 puncta
only concentrated YFP-SUMO. In both cases, neither YFP alone
nor clients with mutations at their binding sites were enriched
in cellular puncta (Figure S6A). We also obtained qualitatively
analogous data using co-expression of in trans polySUMO and
polySIM scaffolds, along with YFP-tagged clients (Figure S6B).
However, experimental uncertainties in the relative concentrations of the scaffold components made it difficult to assign cells
confidently to one side of the diagonal.
Taken together, our data suggest that mass action-based
compositional control can be achieved as robustly in cells as
in vitro.
Scaffold Stoichiometries Control Client Recruitment into Natural Cellular Bodies
We sought to determine whether natural cellular bodies could
exhibit compositional control analogous to our model systems.
We focused on two natural cellular bodies, PML NBs in mammalian
nuclei and P bodies in the yeast cytoplasm, systems in which the
interactions governing client recruitment were well characterized
and where their stoichiometries were experimentally perturbable.
PML NBs are micron-sized, membrane-less organelles in
mammalian nuclei that are involved in processes including
[Figure 5 panels: (A) In vitro images of (SUMO)10-(SIM)5 and (SUMO)5-(SIM)10 scaffolds with GFP-SUMO or GFP-SIM clients. (B) Scaffold and client partition coefficients. (C) Cell images of RFP-(SUMO)10-(SIM)6 and RFP-(SUMO)6-(SIM)10 scaffolds with YFP-SUMO or YFP-SIM clients. (D) Client partition coefficient versus scaffold partition coefficient.]
Figure 5. Cellular PolySUMO-PolySIM Puncta Selectively Recruit Low Valency Clients

(A) 60 nM of GFP-SUMO or GFP-SIM (green) was mixed with 12 µM of (SUMO)10-(SIM)5 (left) or (SUMO)5-(SIM)10 (right) (1% RFP-tagged; magenta), and the
resulting droplets were imaged for scaffold and client fluorescence.
(B) PCs for both scaffold (black bars) and clients (white bars) from experiment in (A). Graphs show averages from triplicate experiments. Error bars represent SEM.
Dotted line, PC = 1.
(C) Live cell fluorescence images of YFP-SUMO or YFP-SIM (green) co-transfected with RFP-(SUMO)10-(SIM)6 (left) or RFP-(SUMO)6-(SIM)10 (right) (magenta) into
HeLa cells.
(D) PCs of scaffolds and client components calculated from cells in the experiment. Each symbol represents the average PC into all puncta (typically 1-3) in a given
cell (12-35 cells per sample) when the indicated scaffold was co-transfected with YFP-SUMO (black circles) or YFP-SIM (white circles). Dotted line, PC = 1.
Red + sign, median PC.
See also Figure S6.
DNA damage repair, apoptosis, and anti-viral responses (Lallemand-Breitenbach and de Thé, 2010). The PML protein appears
to be the primary scaffold for these bodies (Ishov et al., 1999).
PML can self-assemble via elements within its Tripartite Motif
(TRIM) (Antolini et al., 2003; Huang et al., 2014) and also via binding of its conserved SIM element to SUMOs conjugated at up to
eight sites in the protein (Nisole et al., 2013; Shen et al., 2006).
Though not strictly required for body assembly (Brand et al.,
2010; Sahin et al., 2014), SUMO-SIM interactions likely
contribute substantially to body architecture, as deletion of the
SIM motif or perturbations to PML SUMOylation via mutagenesis, viral infection, or knockdown/overexpression of SUMO
ligases/proteases can cause changes in the size, number,
morphology, or dynamics of PML NBs (Best et al., 2002; Hattersley et al., 2011; He et al., 2015; Muller and Dejean, 1999; Shen
et al., 2006; Weidtkamp-Peters et al., 2008). SUMO-SIM interactions also appear to be critical for the recruitment of many PML
NB clients (e.g., Daxx and Sp100) (Lin et al., 2006; Van Damme
et al., 2010; Zhong et al., 2000).
We initially examined partitioning of GFP-tagged SUMO/SIM
clients into endogenous PML NBs in U2OS cells (Figure S7A).
Immunofluorescence imaging using an antibody against PML revealed numerous PML NBs in nearly all cell nuclei. For each
client, we measured the ratio of GFP fluorescence intensity
within the PML NBs to that in the bulk nucleoplasm (Intensity Ratio, IR = Intensity_PML_NB / Intensity_nucleoplasm, see Supplemental
Information). GFP-SIM was enriched in these bodies with a median IR of 2.9. In contrast, as previously reported for monovalent
SUMO clients (Ayaydin and Dasso, 2004), GFP-SUMO had little
enrichment in the PML NBs (median IR = 1.3). Increasing valency
increased the enrichment of the preferred client into PML NBs
(median IR = 8.1 for GFP-(SIM)3) but had no effect on the impartial client (median IR = 1.4 for GFP-(SUMO)3). Neither GFP alone
nor a client with mutated SIM sites were enriched (Figure S7B).
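The intensity ratio defined above can be sketched in a few lines. The pixel values below are invented for illustration, the function name is hypothetical, and details such as background subtraction or how body pixels are segmented are not specified here (they are deferred to the Supplemental Information), so this is only an outline of the arithmetic.

```python
from statistics import mean, median


def intensity_ratio(body_pixels, nucleoplasm_pixels):
    """IR = mean intensity inside the bodies / mean intensity in the nucleoplasm."""
    return mean(body_pixels) / mean(nucleoplasm_pixels)


# Hypothetical pixel intensities for three cells: (inside bodies, nucleoplasm).
cells = [
    ([310, 290, 305], [100, 104, 98]),
    ([250, 262], [101, 95, 99]),
    ([420, 390, 405, 415], [120, 118]),
]

# One IR per cell, then the median across cells, as reported in the text.
per_cell_ir = [intensity_ratio(body, nuc) for body, nuc in cells]
print("median IR:", round(median(per_cell_ir), 2))
```

An IR near 1 means the client is not enriched in the bodies relative to the surrounding nucleoplasm, which is why values such as 1.3 and 1.4 are read as little or no recruitment.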
The selective, valency-dependent partitioning into PML NBs is
analogous to the behaviors of our polySUMO/polySIM model
system on the polySUMO-enriched side of the phase diagram diagonal (i.e., to that of (SUMO)10-(SIM)5, above the diagonal). Our
model predicts that PML NBs on the opposite side of the phase
diagram diagonal should exhibit inverted partitioning behavior
with respect to SUMO versus SIM clients. To create such
structures, we expressed either wild-type (WT) GFP-PML or a
PML mutant (GFP-PML(ΔSUMO)) lacking some of the known
SUMOylation sites in PML−/− mouse embryonic fibroblasts
(MEFs) (Figure 6A). The mutant protein is SUMOylated in cells,
but to a lesser degree than the wild-type protein (Figure S7F).
[Figure 6 panels: (A) GFP-PML or GFP-PML(ΔSUMO) scaffolds with RFP-(SUMO)3 or RFP-(SIM)3 clients, with merged images and client intensity ratios. (B) Xrn1-GFP in WT, lsm1Δ, and dcp2Δ strains, with client intensity ratios (medians 1.88, 3.71, and 4.08). Scale bars, 2 µm and 10 µm.]
Figure 6. Client Recruitment into Natural Cellular Bodies Is Affected by Scaffold Stoichiometries
(A) Images of RFP-SUMO or RFP-SIM (red) co-transfected with GFP-PML or GFP-PML(ΔSUMO) (green) into PML−/− MEFs (top); nuclear staining with Hoechst
33342 (blue). Plots (bottom) show IRs from individual cells (black dots) and median values (red horizontal lines). Each symbol represents the average IR (see
Experimental Procedures) for all puncta in a given cell. 32–44 cells were analyzed per sample, each on average containing 16 or 5 puncta per cell with GFP-PML or
GFP-PML(ΔSUMO), respectively. Distributions were statistically compared using the Wilcoxon rank sum test followed by the Bonferroni correction for multiple
comparisons to determine significance. ***p < 0.001. Dotted line, IR = 1.
(B) Representative images of WT, lsm1Δ, or dcp2Δ yeast strains carrying Xrn1-GFP (green) in their genomes (top). Distributions of Xrn1-GFP IRs (bottom), where
each symbol represents IR corresponding to an individual P body. 1–3 P bodies per cell were analyzed from a set of 410 cells per sample. Analysis for
significance was performed as in (A). **p < 0.01.
See also Figure S7.
Both the WT and mutant scaffolds formed micron-sized puncta
in nuclei. Like natural PML NBs, GFP-PML puncta substantially
recruited RFP-(SIM)3 (median IR = 2.8), but not RFP-(SUMO)3
(median IR = 1.2). In reciprocal fashion, GFP-PML(ΔSUMO) puncta
efficiently recruited RFP-(SUMO)3 (median IR = 2.1) but recruited
RFP-(SIM)3 poorly (median IR = 1.2). Neither scaffold could
recruit RFP alone nor clients with mutations at the SUMO- or SIM-binding site (Figures S7C and S7D). Moreover, these results were independent of the SUMO paralog (SUMO1 versus
SUMO3) used to construct the client (Figure S7E).
These data suggest that decreasing the degree of PML
SUMOylation can shift the bodies to a region analogous to the
opposite side of the SUMO/SIM diagonal, where recruitment of
SUMO-containing clients is favored.
We next explored analogous compositional control in P
bodies, protein- and mRNA-rich cellular bodies in the cytoplasm
of eukaryotes that promote mRNA decay (Parker, 2012).
P bodies assemble through multivalent interactions of RNA
binding proteins composed of modular RNA binding domains
and self-associating disordered regions and mRNA molecules
(Decker and Parker, 2012). mRNAs that have exited translation
act as important P body scaffolds (Teixeira et al., 2005). We
thus asked whether modulating the levels of mRNA, thereby
affecting the relative stoichiometry of an important scaffold
component, could affect recruitment of clients into P bodies.
We used the lsm1Δ and dcp2Δ yeast strains, which are deficient
in mRNA decapping and therefore accumulate deadenylated
mRNAs that would otherwise be targets for degradation (Parker,
2012). We then measured the IR of the P body client Xrn1 (Jain
and Parker, 2013) fused to GFP (Xrn1-GFP), in the WT, lsm1Δ,
or dcp2Δ strains. Xrn1 is predicted to be recruited, at least in
part, by interactions with RNA. Compared to its recruitment
into P bodies in WT cells (median IR = 1.88), recruitment in the
lsm1Δ and dcp2Δ strains increased roughly 2-fold (median IR = 3.71
and 4.08, respectively) (Figure 6B). This behavior was
qualitatively consistent with the behaviors of our engineered clients (Figure 1). The recruitment of the P body scaffold Edc3 (Jain
and Parker, 2013) also increased in the two deletion strains
concomitant with the increase in deadenylated mRNAs, consistent with the increased partitioning of scaffolds observed in
certain regimes of the phase diagram when the concentration
of the cognate scaffold increased (Figure 3B). Thus, these data
suggest that, like PML NBs, compositional control can be
achieved in P bodies by modulation of scaffold stoichiometries.
Figure 7. A Model for Compositional Control of Cellular Bodies
Multivalent scaffold molecules (high valency blue and yellow molecules) assemble and phase separate to form the body (A). Many client molecules (low valency blue and yellow molecules, with additional domains) are enriched in the body through binding to free cognate sites in the scaffold that is in stoichiometric excess (B). Client modules have a hatched pattern to distinguish them from scaffold modules. Stoichiometric excess of the scaffold modules can be changed either through changes in the scaffold concentrations (C) or through changes in the scaffold valency (not shown). Since stoichiometric excess of the scaffolds in droplet (A) and bulk (not shown) changes sharply across the phase diagram diagonal, the nature of the clients also switches sharply across the diagonal. Higher valency promotes stronger recruitment of the clients (D). Molecules that bind to other regions of the scaffolds (E, light blue trianguloids) will be recruited independently of the scaffold stoichiometry. Natural bodies are composed of more complicated molecules, with multiple types of interaction elements, but should follow this same logic.
Collectively, these data indicate that the compositions of both
of these natural cellular bodies can be controlled by modulation
of stoichiometries of scaffold elements, analogous to the behaviors observed in our model systems. This behavior suggests
simple cellular mechanisms to rapidly and dramatically alter
the composition, and thus function, of cellular bodies in
response to stimuli.
DISCUSSION
Hierarchical Organization of Cellular Bodies
We propose a hierarchical model for the composition of cellular
bodies (Figure 7). The model has several key features. First, scaffolds self-associate by multivalent heterotypic interactions and
undergo assembly-driven phase separation (Li et al., 2012),
forming a condensed phase (Figure 7A), i.e., the cellular body.
Second, clients partition into bodies by interacting with scaffolds, often utilizing the same types of interacting elements as
those between scaffolds (Figure 7B). The typically lower valency
(and therefore lower apparent affinity) of clients minimizes their
competition with the higher affinity scaffold-scaffold interactions. As a result, clients are recruited by binding only to excess,
or free, scaffold sites. Thus, their recruitment will be governed by
the stoichiometric ratios of the scaffolds (Figure 7C). Third, since
the enrichment of excess sites switches sharply across the
phase diagram diagonal from one class of sites to the other,
bodies can change compositions in a switch-like manner as a
function of phase diagram position. Fourth, since scaffold and
client valencies can affect position on the phase diagram and
the degree of client partitioning, respectively, covalent modifications that change valency can be used to rapidly switch between
compositional states (Figure 7D).
We note that clients that bind regions of the scaffold that are
not involved in heterotypic assembly will be recruited but remain
relatively insensitive to changes in scaffold stoichiometries (Figure 7E). Moreover, molecules with otherwise appropriate physicochemical properties (e.g., complementary charge) may also
partition into droplets due to non-specific interactions (Li et al.,
2012). Clients containing multiple types of interaction elements,
some mirroring scaffold-scaffold interactions and others not,
could show complex behaviors that are essentially superpositions of these individual effects. This reasoning may explain
the recruitment of Xrn1-GFP into P bodies without perturbation
of mRNA content (i.e., on what may be the non-cognate side
of the phase diagram diagonal), as well as the enhanced
recruitment when cellular mRNA is increased, as observed in
Figure 6B.
Complexities of Natural Cellular Bodies
Although natural cellular bodies are appreciably more complicated than our engineered model systems, their compositions
may still be understood with simple extensions to the framework
we present here. First, cellular bodies may have multiple scaffolds held together by different types of multivalent interactions.
For example, RNA granules likely have multiple scaffolds with
contributions from both low-complexity sequence elements,
as well as RNA and RNA-binding domains. PML NBs likely
assemble by a combination of TRIM and SUMO-SIM interactions. Multiple types of scaffolds and scaffold interactions
may cooperate to synergistically promote polymerization and
phase separation, as suggested previously (Lin et al., 2015).
Moreover, clients may also possess multiple classes of low
valency elements that can each interact with scaffolds. Nevertheless, in the absence of cooperativity, one can think of the
different interaction motifs independently. For any given class,
the corresponding free sites in a scaffold will dictate partitioning
of clients that can bind to those sites. Indeed, perturbing one
type of interaction motif within PML NBs or P bodies had strong
effects on the recruitment of clients that bind to that motif
(Figure 6).
Second, for some systems, the distinction between scaffolds
and clients may be less stark than in our engineered systems
(see also Supplemental Information). As client valency approaches that of the scaffolds, this distinction breaks down,
and the client begins to compete with scaffold-scaffold interactions. For such clients, the change in partitioning across the
diagonal is likely to be less sharp, as we observe for scaffolds
(Figures 3A and 3B). Further investigations are needed to understand the distribution of scaffolds, clients, and such intermediate
molecules in the various known natural bodies.
Finally, several cellular bodies contain subcompartments
(condensed phases within the primary condensed phase) and
thus are not simple droplet/bulk systems (Brangwynne et al.,
2011; Feric et al., 2016; Jain et al., 2016; Wang et al., 2014).
Each subcompartment can have a unique composition organized by a distinct set of scaffold molecules. A client may bind
to free sites in any or all of the body's subcompartments. Partition coefficients between any two subcompartments or between a subcompartment and the surrounding bulk will still
result from mass action driven by the corresponding free site
ratios.
Thus, despite the inherent complexities of natural cellular
bodies, they may still be understood through the lens of our simple model.
Biological Mechanisms of Regulating Body Composition
Biological processes could regulate the composition of
cellular bodies by acting on either scaffolds or clients on timescales ranging from physiologic to evolutionary. On the most
rapid timescales (seconds to minutes), covalent modifications
could change valencies of the scaffold components, shifting
the position of the system within the phase diagram. They
could also change valencies and affinities of the clients, influencing their degree of partitioning, as suggested here (Figure 2), and previously (Han et al., 2012; Kwon et al., 2013).
On slower timescales (hours to days), the scaffold concentrations could change via regulation of expression levels, or their
valencies could be changed by alternative splicing. On evolutionary timescales, changes in gene sequences could change
the affinity of scaffold components for each other or for clients, shifting composition and function in a more permanent
sense.
Some of these processes can be observed in PML NBs. For
example, the SUMOylation of PML is substantially decreased
during mitosis concomitant with loss of some SIM-containing
clients (Dellaire et al., 2006; Everett et al., 1999), and phosphorylation of the SIM in PML increases its affinity for SUMO (Cappadocia et al., 2015). Similarly, phosphorylation of the SIMs of PML
NB clients, including Daxx (Chang et al., 2011), increases their
interactions with the bodies.
Changes in Body Composition May Dictate Changes in Function
Unlike macromolecular machines, cellular bodies continuously
rearrange the bonding interactions and organization of their constituent parts and thus are not stereochemically defined across
their lengths. Their functions, therefore, cannot be controlled
by allosteric transitions between conformational states, as
often occurs with macromolecules. Instead, transitions between
compositional states are likely to be key determinants of body
function. The differential partitioning of molecules in different regions of the phase diagram implies that it may be most appropriate to consider a given type of cellular body as a distribution
of entities (likely defined by a limited number of scaffolds) that
lie on a continuum of compositions subject to cellular control.
This idea was suggested previously for RNA-based bodies
based on the related compositions of P bodies, stress granules,
and RNA transport granules (Buchan and Parker, 2009). Similarly, in the case of PML NBs, a variety of structures in different
cell types and cell states have been characterized, unified by
their enrichment of the PML protein but varying in their composition of other components (Dellaire et al., 2006; Luciani et al.,
2006). Our data suggest that this behavior may be generally
applicable to many cellular bodies.
Since function is dictated by composition, this reasoning implies that cellular bodies may exhibit a continuum of functions,
rather than a limited set of discrete functions as seen for macromolecular machines in different conformations. Even though
cellular body function may be more continuous than discrete,
our data suggest that mechanisms could exist, as they do in canonical macromolecular machines, to mediate sharp switches
between different functional regimes. Moreover, we and others
(Molliex et al., 2015; Patel et al., 2015; Ramaswami et al.,
2013; Weber and Brangwynne, 2012) speculate that pathological states of cellular bodies may also lie on the same compositional and functional continuum. As such, manipulation or
depletion of certain scaffolds may be a promising approach to
mitigate the toxicities associated with these pathological granules. Indeed, in models of ALS, toxicities due to TDP-43 aggregation can be alleviated by removal of the Ataxin-2 scaffold
(Elden et al., 2010).
Implications for Cellular Body Function
Precise control of client partitioning could mediate colocalization
of reaction partners to accelerate reaction rates and increase reaction specificity. For example, polySUMOylation of cellular substrates was recently demonstrated to activate the ubiquitin E3
ligase RNF4, a process that could be driven or enhanced by
such compartmentalization (Rojas-Fernandez et al., 2014). Similarly, metabolic flux could be controlled by colocalization-mediated substrate channeling (Srere, 1987) or the colocalization of a
branch point enzyme and downstream molecules in a pathway
(Castellana et al., 2014). Compositional control may also help
regulate en masse reactions such as SUMOylation of many
cellular factors at PML NBs, which, analogous to DNA repair
foci (Psakhye and Jentsch, 2012), colocalize not only enzymes
of the SUMOylation cascade but also several SUMOylation
substrates (Van Damme et al., 2010). Partitioning into a cellular
body could also serve to sequester components away from their
cellular targets, as has been proposed in the regulation of Daxx
(Lallemand-Breitenbach and de Thé, 2010) and the priming of
RNA Polymerase II prior to transcription initiation (Kwon et al.,
2013). Indeed, strong depletion of clients from bulk solution
through dramatically high PC is consistent with behaviors we
observe in our mass action model (Figures 3F, S3E, and S3F
and Supplemental Information).
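The scale of this depletion can be illustrated with a simple volume-weighted balance. This is a sketch under assumptions not stated in the text: a single body phase occupying a volume fraction \phi of the compartment, a client at total concentration C_tot, and PC taken as the ratio of client concentration inside the body to that in the bulk.

```latex
\phi\, C_{\mathrm{body}} + (1-\phi)\, C_{\mathrm{bulk}} = C_{\mathrm{tot}},
\qquad
\mathrm{PC} = \frac{C_{\mathrm{body}}}{C_{\mathrm{bulk}}}
\quad\Longrightarrow\quad
\frac{C_{\mathrm{bulk}}}{C_{\mathrm{tot}}} = \frac{1}{1 + \phi\,(\mathrm{PC} - 1)}
```

With purely hypothetical values \phi = 0.01 and PC = 1000, the bulk concentration falls to 1/10.99, or about 9% of the total, whereas PC = 10 at the same volume fraction leaves about 92% in the bulk.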
Conclusion
We demonstrate how cellular body assembly, when driven by
heterotypic polymerization and concomitant phase separation,
naturally leads to a simple and predictive model for compositional control of these structures. Our model suggests how
bodies could be switched sharply between distinct compositional (and thus functional) states on a range of biological timescales. Moreover, it suggests that superficially similar cellular
bodies composed of a given set of scaffolds may be markedly
different in their composition and function, depending on the
relative scaffold stoichiometries. Thus, a complete understanding of cellular bodies may require knowing relative scaffold
amounts in addition to scaffold identities.
Our studies thus provide a mechanistic framework for studying the biochemical and regulatory function of cellular bodies
owing to properties not attributable to any individual molecule
but rather to those intrinsic to the macroscopic structure itself.
Mass Action Model for Client Partitioning

Measured concentrations (from imaging) and affinities (from ITC; see Supplemental Information) of polySUMO and polySIM were used to calculate the
free-site concentrations in the droplet and bulk phases. An equilibrium mass action
model was created to describe our systems as two compartments with unequal concentrations of receptors (free scaffold sites) and a permeable ligand
(client). The model was solved numerically using MATLAB, and predicted PCs
were calculated as the ratio of the total ligand concentration between the two
compartments (see Supplemental Information for details).
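A two-compartment model of this kind can be sketched in a few lines. The following Python example is illustrative only and is not the MATLAB implementation used in this study. It assumes 1:1 binding of client to free scaffold sites with a single dissociation constant, an equal free-client concentration in both compartments at equilibrium, and a known droplet volume fraction. All function and parameter names are hypothetical.

```python
def bound_sites(sites, free_ligand, kd):
    """Occupied sites from a 1:1 binding isotherm (assumed binding model)."""
    return sites * free_ligand / (kd + free_ligand)


def predicted_pc(sites_droplet, sites_bulk, kd, ligand_total, droplet_fraction):
    """Predicted partition coefficient: total client in droplet / total in bulk.

    sites_droplet, sites_bulk: free scaffold site concentrations (hypothetical inputs)
    kd: client-site dissociation constant, same units as the concentrations
    ligand_total: client concentration averaged over both compartments
    droplet_fraction: volume fraction occupied by the droplet phase (assumption)
    """
    if ligand_total <= 0:
        raise ValueError("ligand_total must be positive")

    def total_ligand(free):
        # Volume-weighted sum of free plus bound client in each compartment.
        in_droplet = free + bound_sites(sites_droplet, free, kd)
        in_bulk = free + bound_sites(sites_bulk, free, kd)
        return droplet_fraction * in_droplet + (1 - droplet_fraction) * in_bulk

    # total_ligand rises monotonically with the free concentration, so the
    # free concentration that conserves mass can be found by bisection.
    lo, hi = 0.0, ligand_total
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if total_ligand(mid) < ligand_total:
            lo = mid
        else:
            hi = mid
    free = 0.5 * (lo + hi)

    in_droplet = free + bound_sites(sites_droplet, free, kd)
    in_bulk = free + bound_sites(sites_bulk, free, kd)
    return in_droplet / in_bulk
```

In this sketch, a trace amount of client gives a PC close to (1 + sites_droplet/kd) / (1 + sites_bulk/kd), and the PC falls toward 1 as the client saturates the available sites.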
Cellular Experiments
For engineered polySUMO/polySIM and PML NB experiments, mammalian
cells (HeLa, U2OS, and PML−/− MEFs) were cultured in DMEM supplemented
with 10% fetal bovine serum, 1% Penicillin-Streptomycin, and 1% GlutaMAX
in 5% CO2 at 37°C. Cells were transfected using Lipofectamine 2000 (Life
Technologies) and imaged 18–24 hr after transfection. For P body experiments, WT and mutant yeast strains carrying Xrn1-GFP in the genome
were grown to log phase at 30°C in yeast minimal media supplemented with
a complete set of amino acids and 2% dextrose.
Genes, RNA, and Plasmids

polyPRM, polySH3, PTB, and the polyUCUCU RNA were described previously (Li et al., 2012). The RNA client UCUCU-AF647 (5′-UCUCUAAAAA-3′;
3′-labeled with AF647), as well as (SUMO)5 and (SIM)5 as synthetic genes,
were purchased from Integrated DNA Technologies. Decavalent, fused, and
low valency SUMO/SIM constructs were constructed from (SUMO)5 and
(SIM)5 by PCR. To prevent conjugation and proteolysis, we mutated the C-terminal di-glycine motif in all SUMO proteins (see Supplemental Information).
The RRM client was constructed from the first RRM domain of PTB. The
mCherry, mEGFP, mVenus, and mCerulean (referred to as RFP, GFP, YFP,
and CFP, respectively) fusion proteins were produced by cloning into corresponding vectors (Clontech). Sequences of molecules used in this study are
listed in Table S4.

Protein Expression, Purification, and Labeling

All purified proteins were expressed and purified similarly (see Supplemental
Information). Proteins were expressed in E. coli strain BL21(DE3)T1R by induction with 1 mM IPTG. Proteins were purified with Ni-NTA Agarose Resin
(QIAGEN) or Amylose Resin (NEB), followed by ion exchange (Source 15Q
and/or Source 15S [GE Healthcare]) and size exclusion chromatography using Superdex 200 or Superdex 75 gel filtration columns (GE Healthcare). Proteins were labeled using maleimide-conjugated Alexa dyes (Life Technologies)
following the manufacturer's protocol.
In Vitro Partitioning Assays
Scaffold molecules (1% Alexa-labeled) were mixed with GFP- or RFP-tagged or Alexa-labeled clients in wells of chambered cover glass
(GraceBiolabs) or 384-well plates (Sigma) passivated with 30 mg/mL BSA
(Sigma). Mixtures were incubated for 2 to 4 hr for SH3/PRM and PTB/
RNA experiments and 20–26 hr for SUMO/SIM experiments and imaged
at 20× magnification.
Image Acquisition and Analysis
Yeast cells were imaged using a DeltaVision Elite microscope at 100×
magnification using an sCMOS camera. In all other experiments, imaging
was performed using spinning disk confocal microscopes equipped with
EMCCD cameras at 20× or 100× magnification for in vitro or cellular experiments, respectively. Images were analyzed using ImageJ or MATLAB
(Mathworks) (see Supplemental Information). Fluorescence intensities
were calibrated to concentrations using standard solutions of purified client
molecules or corresponding fluorescent proteins, whose concentrations
were independently determined. When possible, care was taken to circumvent effects of the PSF in concentration determination (see Supplemental
Information).
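The calibration step can be sketched as a straight-line fit of intensity against the known concentrations of the standards, followed by inversion of that line for each measured intensity. This Python example is an illustration only, not the analysis code used in this study. It assumes a linear detector response over the working range, and the function names are hypothetical.

```python
def fit_calibration(concentrations, intensities):
    """Least-squares line: intensity = slope * concentration + intercept."""
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_i = sum(intensities) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (i - mean_i)
              for c, i in zip(concentrations, intensities))
    slope = sxy / sxx
    intercept = mean_i - slope * mean_c  # background signal at zero concentration
    return slope, intercept


def intensity_to_concentration(intensity, slope, intercept):
    """Invert the calibration line to estimate a concentration."""
    return (intensity - intercept) / slope
```

A fit of this kind is only meaningful within the concentration range spanned by the standards; values outside that range would be extrapolations.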
AUTHOR CONTRIBUTIONS
Conceptualization, S.F.B. and M.K.R.; methodology, S.F.B. and M.K.R.; investigation, S.F.B., A.R., W.P., Y.L., S.J.; writing, S.F.B., A.R., R.P., M.K.R.; supervision, R.P. and M.K.R.
ACKNOWLEDGMENTS
We thank Louie Kerr, Kate Luby-Phelps, and Abhijit Bugde for assistance
with imaging; Chad Brautigam and Thomas Scheuermann for assistance
with ITC; Mark Kittisopikul for helpful discussions regarding image analysis,
numerical fitting, and statistical testing; Pier Paolo Scaglioni for providing
PML−/− cells; Rama Ranganathan for critical reading of the manuscript;
and members of the Rosen lab for helpful discussions. This work was
supported by the Howard Hughes Medical Institute, the HCIA program
of HHMI, grants from the NIH (R01-GM56322) and Welch Foundation
(I-1544), a Sara and Frank McKnight Graduate Fellowship (to S.F.B.), and
the NSF (1000196079; to A.M.R.).
Received: September 8, 2015
REFERENCES
Antolini, F., Lo Bello, M., and Sette, M. (2003). Purified promyelocytic leukemia
coiled-coil aggregates as a tetramer displaying low α-helical content. Protein
Expr. Purif. 29, 94–102.
Ayaydin, F., and Dasso, M. (2004). Distinct in vivo dynamics of vertebrate
SUMO paralogues. Mol. Biol. Cell 15, 5208–5218.
Banjade, S., Wu, Q., Mittal, A., Peeples, W.B., Pappu, R.V., and Rosen, M.K.
(2015). Conserved interdomain linker promotes phase separation of the multivalent adaptor protein Nck. Proc. Natl. Acad. Sci. USA 112, E6426–E6435.
Best, J.L., Ganiatsas, S., Agarwal, S., Changou, A., Salomoni, P., Shirihai, O.,
Meluh, P.B., Pandolfi, P.P., and Zon, L.I. (2002). SUMO-1 protease-1 regulates
gene transcription through PML. Mol. Cell 10, 843–855.
Brand, P., Lenser, T., and Hemmerich, P. (2010). Assembly dynamics of PML
nuclear bodies in living cells. PMC Biophys. 3, 3.
Brangwynne, C.P., Eckmann, C.R., Courson, D.S., Rybarska, A., Hoege, C.,
Gharakhani, J., Jülicher, F., and Hyman, A.A. (2009). Germline P granules
are liquid droplets that localize by controlled dissolution/condensation. Science 324, 1729–1732.
Brangwynne, C.P., Mitchison, T.J., and Hyman, A.A. (2011). Active liquid-like
behavior of nucleoli determines their size and shape in Xenopus laevis oocytes. Proc. Natl. Acad. Sci. USA 108, 4334–4339.
Buchan, J.R., and Parker, R. (2009). Eukaryotic stress granules: the ins and
outs of translation. Mol. Cell 36, 932–941.
Cappadocia, L., Mascle, X.H., Bourdeau, V., Tremblay-Belzile, S., Chaker-Margot, M., Lussier-Price, M., Wada, J., Sakaguchi, K., Aubry, M., Ferbeyre,
G., and Omichinski, J.G. (2015). Structural and functional characterization of
the phosphorylation-dependent interaction between PML and SUMO1. Structure 23, 126–138.
Castellana, M., Wilson, M.Z., Xu, Y., Joshi, P., Cristea, I.M., Rabinowitz, J.D.,
Gitai, Z., and Wingreen, N.S. (2014). Enzyme clustering accelerates processing of intermediates through metabolic channeling. Nat. Biotechnol. 32,
1011–1018.
Chalupníková, K., Lattmann, S., Selak, N., Iwamoto, F., Fujiki, Y., and Nagamine, Y. (2008). Recruitment of the RNA helicase RHAU to stress granules
via a unique RNA-binding domain. J. Biol. Chem. 283, 35186–35198.
Chang, C.-C., Naik, M.T., Huang, Y.-S., Jeng, J.-C., Liao, P.-H., Kuo, H.-Y.,
Ho, C.-C., Hsieh, Y.-L., Lin, C.-H., Huang, N.-J., et al. (2011). Structural and
functional roles of Daxx SIM phosphorylation in SUMO paralog-selective binding and apoptosis modulation. Mol. Cell 42, 62–74.
Chen, Y.C., Kappel, C., Beaudouin, J., Eils, R., and Spector, D.L. (2008). Live
cell dynamics of promyelocytic leukemia nuclear bodies upon entry into and
exit from mitosis. Mol. Biol. Cell 19, 3147–3162.
Chung, I., Leonhardt, H., and Rippe, K. (2011). De novo assembly of a PML
nuclear subcompartment occurs through multiple pathways and induces
telomere elongation. J. Cell Sci. 124, 3603–3618.
Clemson, C.M., Hutchinson, J.N., Sara, S.A., Ensminger, A.W., Fox, A.H.,
Chess, A., and Lawrence, J.B. (2009). An architectural role for a nuclear noncoding RNA: NEAT1 RNA is essential for the structure of paraspeckles. Mol.
Cell 33, 717–726.
Decker, C.J., and Parker, R. (2012). P-bodies and stress granules: possible
roles in the control of translation and mRNA degradation. Cold Spring Harb.
Perspect. Biol. 4, a012286.
Decker, C.J., Teixeira, D., and Parker, R. (2007). Edc3p and a glutamine/asparagine-rich domain of Lsm4p function in processing body assembly in Saccharomyces cerevisiae. J. Cell Biol. 179, 437–449.
Dellaire, G., Eskiw, C.H., Dehghani, H., Ching, R.W., and Bazett-Jones, D.P.
(2006). Mitotic accumulations of PML protein contribute to the re-establishment of PML nuclear bodies in G1. J. Cell Sci. 119, 1034–1042.
Dundr, M., Hebert, M.D., Karpova, T.S., Stanek, D., Xu, H., Shpargel, K.B., Meier, U.T., Neugebauer, K.M., Matera, A.G., and Misteli, T. (2004). In vivo kinetics
of Cajal body components. J. Cell Biol. 164, 831–842.
Elbaum-Garfinkle, S., Kim, Y., Szczepaniak, K., Chen, C.C.H., Eckmann, C.R.,
Myong, S., and Brangwynne, C.P. (2015). The disordered P granule protein
LAF-1 drives phase separation into droplets with tunable viscosity and dynamics. Proc. Natl. Acad. Sci. USA 112, 7189–7194.
Elden, A.C., Kim, H.J., Hart, M.P., Chen-Plotkin, A.S., Johnson, B.S., Fang, X.,
Armakola, M., Geser, F., Greene, R., Lu, M.M., et al. (2010). Ataxin-2 intermediate-length polyglutamine expansions are associated with increased risk for
ALS. Nature 466, 1069–1075.
Everett, R.D., Lomonte, P., Sternsdorf, T., van Driel, R., and Orr, A. (1999). Cell
cycle regulation of PML modification and ND10 composition. J. Cell Sci. 112,
4581–4588.
Feric, M., Vaidya, N., Harmon, T.S., Mitrea, D.M., Zhu, L., Richardson, T.M.,
Kriwacki, R.W., Pappu, R.V., and Brangwynne, C.P. (2016). Coexisting liquid
phases underlie nucleolar subcompartments. Cell 165, 1686–1697.
Fong, K.W., Li, Y., Wang, W., Ma, W., Li, K., Qi, R.Z., Liu, D., Songyang, Z., and
Chen, J. (2013). Whole-genome screening identifies proteins localized to
distinct nuclear bodies. J. Cell Biol. 203, 149–164.
Fromm, S.A., Kamenz, J., Nöldeke, E.R., Neu, A., Zocher, G., and Sprangers,
R. (2014). In vitro reconstitution of a cellular phase-transition process that
involves the mRNA decapping machinery. Angew. Chem. Int. Ed. Engl. 53,
7354–7359.
Grousl, T., Ivanov, P., Frydlova, I., Vasicova, P., Janda, F., Vojtova, J., Malínská, K., Malcova, I., Novakova, L., Janoskova, D., et al. (2009). Robust heat
shock induces eIF2alpha-phosphorylation-independent assembly of stress
granules containing eIF3 and 40S ribosomal subunits in budding yeast,
Saccharomyces cerevisiae. J. Cell Sci. 122, 2078–2088.
Han, T.W., Kato, M., Xie, S., Wu, L.C., Mirzaei, H., Pei, J., Chen, M., Xie, Y., Allen, J., Xiao, G., and McKnight, S.L. (2012). Cell-free formation of RNA granules: bound RNAs identify features and components of cellular assemblies.
Cell 149, 768–779.
Hanazawa, M., Yonetani, M., and Sugimoto, A. (2011). PGL proteins self associate and bind RNPs to mediate germ granule assembly in C. elegans. J. Cell
Biol. 192, 929–937.
Handwerger, K.E., Cordero, J.A., and Gall, J.G. (2005). Cajal bodies, nucleoli,
and speckles in the Xenopus oocyte nucleus have a low-density, sponge-like
structure. Mol. Biol. Cell 16, 202–211.
Hattersley, N., Shen, L., Jaffray, E.G., and Hay, R.T. (2011). The SUMO protease SENP6 is a direct regulator of PML nuclear bodies. Mol. Biol. Cell 22,
78–90.
He, X., Riceberg, J., Pulukuri, S.M., Grossman, S., Shinde, V., Shah, P., Brownell, J.E., Dick, L., Newcomb, J., and Bence, N. (2015). Characterization of the
loss of SUMO pathway function on cancer cells and tumor proliferation. PLoS
ONE 10, e0123882.
Huang, S.Y., Naik, M.T., Chang, C.F., Fang, P.J., Wang, Y.H., Shih, H.M., and
Huang, T.H. (2014). The B-box 1 dimer of human promyelocytic leukemia
protein. J. Biomol. NMR 60, 275–281.
Hyman, A.A., Weber, C.A., and Jülicher, F. (2014). Liquid-liquid phase separation in biology. Annu. Rev. Cell Dev. Biol. 30, 39–58.
Ishov, A.M., Sotnikov, A.G., Negorev, D., Vladimirova, O.V., Neff, N., Kamitani,
T., Yeh, E.T., Strauss, J.F., 3rd, and Maul, G.G. (1999). PML is critical for ND10
formation and recruits the PML-interacting protein daxx to this nuclear structure when modified by SUMO-1. J. Cell Biol. 147, 221–234.
Jain, S., and Parker, R. (2013). The discovery and analysis of P Bodies. Adv.
Exp. Med. Biol. 768, 23–43.
Jain, S., Wheeler, J.R., Walters, R.W., Agrawal, A., Barsic, A., and Parker, R.
(2016). ATPase-Modulated Stress Granules Contain a Diverse Proteome and
Substructure. Cell 164, 487–498.
Kaiser, T.E., Intine, R.V., and Dundr, M. (2008). De novo formation of a subnuclear body. Science 322, 1713–1717.
Kato, M., Han, T.W., Xie, S., Shi, K., Du, X., Wu, L.C., Mirzaei, H., Goldsmith,
E.J., Longgood, J., Pei, J., et al. (2012). Cell-free formation of RNA granules:
low complexity sequence domains form dynamic fibers within hydrogels.
Cell 149, 753–767.
Kroschwald, S., Maharana, S., Mateju, D., Malinovska, L., Nüske, E., Poser, I.,
Richter, D., and Alberti, S. (2015). Promiscuous interactions and protein disaggregases determine the material state of stress-inducible RNP granules. eLife
4, e06807.
Kwon, I., Kato, M., Xiang, S., Wu, L., Theodoropoulos, P., Mirzaei, H., Han, T.,
Xie, S., Corden, J.L., and McKnight, S.L. (2013). Phosphorylation-regulated
binding of RNA polymerase II to fibrous polymers of low-complexity domains.
Cell 155, 1049–1060.
Lallemand-Breitenbach, V., and de Thé, H. (2010). PML nuclear bodies. Cold
Spring Harb. Perspect. Biol. 2, a000661.
Li, P., Banjade, S., Cheng, H.C., Kim, S., Chen, B., Guo, L., Llaguno, M., Hollingsworth, J.V., King, D.S., Banani, S.F., et al. (2012). Phase transitions in the
assembly of multivalent signalling proteins. Nature 483, 336–340.
Lin, D.-Y., Huang, Y.-S., Jeng, J.-C., Kuo, H.-Y., Chang, C.-C., Chao, T.-T., Ho,
C.-C., Chen, Y.-C., Lin, T.-P., Fang, H.-I., et al. (2006). Role of SUMO-interacting motif in Daxx SUMO modification, subnuclear localization, and repression
of sumoylated transcription factors. Mol. Cell 24, 341–354.
Lin, Y., Protter, D.S.W., Rosen, M.K., and Parker, R. (2015). Formation and
maturation of phase separated liquid droplets by RNA binding proteins. Mol.
Cell 60, 208–219.
Luciani, J.J., Depetris, D., Usson, Y., Metzler-Guillemain, C., Mignon-Ravix,
C., Mitchell, M.J., Megarbane, A., Sarda, P., Sirma, H., Moncla, A., et al.
(2006). PML nuclear bodies are highly organised DNA-protein structures
with a function in heterochromatin remodelling at the G2 phase. J. Cell Sci.
119, 2518–2531.
Mao, Y.S., Zhang, B., and Spector, D.L. (2011). Biogenesis and function of
nuclear bodies. Trends Genet. 27, 295–306.
Mitrea, D.M., Cika, J.A., Guy, C.S., Ban, D., Banerjee, P.R., Stanley, C.B.,
Nourse, A., Deniz, A.A., and Kriwacki, R.W. (2016). Nucleophosmin integrates
within the nucleolus via multi-modal interactions with proteins displaying
R-rich linear motifs and rRNA. eLife 5, e13571.
Mohamad, N., and Boden, M. (2010). The proteins of intra-nuclear bodies: a
data-driven analysis of sequence, interaction and expression. BMC Syst.
Biol. 4, 44.
Molliex, A., Temirov, J., Lee, J., Coughlin, M., Kanagaraj, A.P., Kim, H.J., Mittag, T., and Taylor, J.P. (2015). Phase separation by low complexity domains
promotes stress granule assembly and drives pathological fibrillization. Cell
163, 123–133.
Muller, S., and Dejean, A. (1999). Viral immediate-early proteins abrogate the
modification by SUMO-1 of PML and Sp100 proteins, correlating with nuclear
body disruption. J. Virol. 73, 5137–5143.
Nisole, S., Maroui, M.A., Mascle, X.H., Aubry, M., and Chelbi-Alix, M.K. (2013).
Differential Roles of PML Isoforms. Front. Oncol. 3, 125.
Nott, T.J., Petsalaki, E., Farber, P., Jervis, D., Fussner, E., Plochowietz, A.,
Craggs, T.D., Bazett-Jones, D.P., Pawson, T., Forman-Kay, J.D., and Baldwin,
A.J. (2015). Phase transition of a disordered nuage protein generates environmentally responsive membraneless organelles. Mol. Cell 57, 936–947.
Parker, R. (2012). RNA degradation in Saccharomyces cerevisae. Genetics
191, 671–702.
Patel, A., Lee, H.O., Jawerth, L., Maharana, S., Jahnel, M., Hein, M.Y., Stoynov, S., Mahamid, J., Saha, S., Franzmann, T.M., et al. (2015). A Liquid-to-Solid Phase Transition of the ALS Protein FUS Accelerated by Disease
Mutation. Cell 162, 1066–1077.
Psakhye, I., and Jentsch, S. (2012). Protein group modification and synergy in
the SUMO pathway as exemplified in DNA repair. Cell 151, 807–820.
Ramaswami, M., Taylor, J.P., and Parker, R. (2013). Altered ribostasis: RNA-protein granules in degenerative disorders. Cell 154, 727–736.
Reijns, M.A., Alexander, R.D., Spiller, M.P., and Beggs, J.D. (2008). A role for
Q/N-rich aggregation-prone regions in P-body localization. J. Cell Sci. 121,
2463–2472.
Rojas-Fernandez, A., Plechanovova, A., Hattersley, N., Jaffray, E., Tatham,
M.H., and Hay, R.T. (2014). SUMO chain-induced dimerization activates
RNF4. Mol. Cell 53, 880–892.
Sahin, U., Ferhi, O., Jeanne, M., Benhenda, S., Berthier, C., Jollivet, F., Niwa-Kawakita, M., Faklaris, O., Setterblad, N., de Thé, H., and Lallemand-Breitenbach, V. (2014). Oxidative stress-induced assembly of PML nuclear bodies
controls sumoylation of partner proteins. J. Cell Biol. 204, 931–945.
Shen, T.H., Lin, H.-K., Scaglioni, P.P., Yung, T.M., and Pandolfi, P.P. (2006).
The mechanisms of PML-nuclear body formation. Mol. Cell 24, 331–339.
Srere, P.A. (1987). Complexes of sequential metabolic enzymes. Annu. Rev.
Biochem. 56, 89–124.
Teixeira, D., Sheth, U., Valencia-Sanchez, M.A., Brengues, M., and Parker, R.
(2005). Processing bodies require RNA for assembly and contain nontranslating mRNAs. RNA 11, 371–382.
Van Damme, E., Laukens, K., Dang, T.H., and Van Ostade, X. (2010). A manually curated network of the PML nuclear body interactome reveals an important role for PML-NBs in SUMOylation dynamics. Int. J. Biol. Sci. 6, 51–67.
Wang, J.T., Smith, J., Chen, B.C., Schmidt, H., Rasoloson, D., Paix, A., Lambrus, B.G., Calidas, D., Betzig, E., and Seydoux, G. (2014). Regulation of RNA
granule dynamics by phosphorylation of serine-rich, intrinsically disordered
proteins in C. elegans. eLife 3, e04591.
Weber, S.C., and Brangwynne, C.P. (2012). Getting RNA and protein in phase.
Cell 149, 1188–1191.
Weidtkamp-Peters, S., Lenser, T., Negorev, D., Gerstner, N., Hofmann, T.G.,
Schwanitz, G., Hoischen, C., Maul, G., Dittrich, P., and Hemmerich, P.
(2008). Dynamics of component exchange at PML nuclear bodies. J. Cell
Sci. 121, 2731–2743.
Zhong, S., Muller, S., Ronchetti, S., Freemont, P.S., Dejean, A., and Pandolfi,
P.P. (2000). Role of SUMO-1-modified PML in nuclear body formation. Blood
95, 2748–2752.
Article
Pre-assembled Nuclear Pores Insert into the Nuclear Envelope during Early Development
Graphical Abstract
Authors
Bernhard Hampoelz,
Marie-Therese Mackmull,
Pedro Machado, ..., Thomas Lecuit,
Yannick Schwab, Martin Beck
Correspondence
martin.beck@embl.de
In Brief
Rapidly growing embryos meet the
challenge of stocking the nuclear
envelope with nuclear pore complexes by
keeping a ready store of pores in stacked
membranous rings that feed into the
envelope without puncturing it.
Highlights
• Annulate lamellae (AL) NPCs insert into the nuclear envelope during interphase
• AL-NPCs are pore scaffolds devoid of most transport channel nucleoporins
• NE-openings enable AL insertion, yet the permeability barrier remains unperturbed
• AL-NPC insertion operates only before gastrulation
Hampoelz et al., 2016, Cell 166, 664–678
July 28, 2016 © 2016 The Author(s). Published by Elsevier Inc.
Article
Pre-assembled Nuclear Pores Insert
into the Nuclear Envelope
during Early Development
Bernhard Hampoelz,1 Marie-Therese Mackmull,1 Pedro Machado,2 Paolo Ronchi,2 Khanh Huy Bui,1,5 Nicole Schieber,4
Rachel Santarella-Mellwig,2 Aleksandar Necakov,1 Amparo Andres-Pons,1 Jean Marc Philippe,3 Thomas Lecuit,3
Yannick Schwab,2,4 and Martin Beck1,4,*
1European Molecular Biology Laboratory, Structural and Computational Biology Unit, 69117 Heidelberg, Germany
2European Molecular Biology Laboratory, Electron Microscopy Core Facility, 69117 Heidelberg, Germany
3Aix-Marseille Université, CNRS, IBDM UMR 7288, 13009 Marseille, France
4European Molecular Biology Laboratory, Cell Biology and Biophysics Unit, 69117 Heidelberg, Germany
5Present address: Department of Anatomy and Cell Biology, Groupe de Recherche Axé sur la Structure des Protéines (GRASP), McGill
University, Montreal, Quebec H3A 0C7, Canada
*Correspondence: martin.beck@embl.de
SUMMARY
Nuclear pore complexes (NPCs) span the nuclear envelope (NE) and mediate nucleocytoplasmic transport. In metazoan oocytes and early embryos,
NPCs reside not only within the NE, but also at
some endoplasmic reticulum (ER) membrane sheets,
termed annulate lamellae (AL). Although a role for AL
as NPC storage pools has been discussed, it remains
controversial whether and how they contribute to the
NPC density at the NE. Here, we show that AL insert
into the NE as the ER feeds rapid nuclear expansion
in Drosophila blastoderm embryos. We demonstrate
that NPCs within AL resemble pore scaffolds that
mature only upon insertion into the NE. We delineate
a topological model in which NE openings are critical
for AL uptake that nevertheless occurs without
compromising the permeability barrier of the NE.
We finally show that this unanticipated mode of
pore insertion is developmentally regulated and operates prior to gastrulation.
INTRODUCTION
In eukaryotes, the double membranous nuclear envelope (NE)
encloses the nucleoplasm and separates it from the cytoplasm.
The inner nuclear membrane (INM) provides contact with chromatin and the outer nuclear membrane (ONM) is continuous
with the endoplasmic reticulum (ER). The two bilayers are fused
at nuclear pore complexes (NPCs) that form aqueous channels
through which regulated transport of macromolecules occurs.
NPCs consist of multiple copies of ~30 different nucleoporins
(Nups) that are organized into biochemically distinct sub-complexes (Figures S1A, S1A′, and S1B). Two such modules, the inner ring complex (also called Nup93 complex) and the Y-complex (also called Nup107 complex) constitute the NPC scaffold
that is symmetric across the NE plane. FG-Nups (containing
phenylalanine-glycine-rich intrinsically disordered protein domains) dock onto the scaffold. They constitute the permeability
barrier and interact with translocating cargo complexes. Some
of them (e.g., Nup214/88, Nup358 [RanBP2], and Nup153) introduce asymmetry by specifically binding to the cytoplasmic or nuclear face of the NPC, respectively (reviewed in Grossman et al.,
2012) (Figure S1B).
The sheer size and compositional complexity of
NPCs render their assembly and membrane insertion a very intricate task. Two distinct NPC assembly pathways that are temporally separated during the cell cycle have been described. First,
during interphase, NPCs are assembled de novo onto an enclosed NE (D'Angelo et al., 2006). Interphase assembly occurs
ubiquitously throughout eukaryotes and strictly requires the
fusion of the INM and ONM by a mechanism that is only partially
understood (Doucet and Hetzer, 2010). Second, no membrane
fusion is required for NPC assembly at mitotic exit. This so-called
postmitotic assembly mode is restricted to eukaryotes that
disassemble their NPCs during mitosis into soluble sub-complexes after phosphorylation by mitotic kinases (Laurell et al.,
2011). In anaphase, de-phosphorylation of Nups is thought to
trigger the ordered re-assembly onto the separated chromatids
before or while membranes enclose daughter nuclei (Doucet
et al., 2010; Dultz and Ellenberg, 2010; Dultz et al., 2008). Both
insertion mechanisms rely on the stepwise recruitment of
pre-assembled sub-complexes. An insertion of pre-assembled
NPCs into the NE has (to the best of our knowledge) not yet
been described.
NPCs not only reside within the NE but are also found in
stacked cytoplasmic membranes termed annulate lamellae
(AL) that are a subdomain of the ER (Figure S1C) (Cordes
et al., 1996; Daigle et al., 2001). Based on two-dimensional
(2D) transmission electron micrographs these membrane stacks
have been perceived as parallel membrane sheets decorated
with NPCs (hereafter called AL-NPCs) that morphologically
appear similar to their counterparts on the nuclear envelope
(NE-NPCs) (Kessel, 1983). AL appear in some but not all transformed cell lines (Cordes et al., 1996; Daigle et al., 2001) and
are highly abundant in germ cells and early embryos throughout
Figure 1. AL-NPCs Insert into the Nuclear Envelope

(A and A′) Nuclear growth and NPC distribution during interphase in the Drosophila syncytium: Stills from a time-lapse movie recorded from a GFP::Nup107
expressing embryo imaged right after interphase onset (A) and 5 min later (A′). GFP::Nup107 localizes to the NE and to cytoplasmic foci (arrowheads) in (A), which
disappear at t + 5 min (A′).
(B–B″) Cytoplasmic foci of Nup107 fluorescence localize to AL-NPCs. Correlative light and electron microscopy (CLEM) of an RFP::Nup107 expressing interphase
embryo. RFP::Nup107 is concentrated along the NE and at foci (boxed in B) that correspond to NPCs along ER membranes (arrowheads in B′ and B″).
animal phyla, including Xenopus, Caenorhabditis elegans, sea
urchin, Drosophila, and also humans (Soupart and Strong,
1974). A role of AL as a storage compartment for maternally
deposited Nups that can be made available for meiosis and
the rapid cell cycles during early embryogenesis has been suggested (Lenart and Ellenberg, 2003; Longo and Anderson,
1968; Spindler and Hemleben, 1982) but not experimentally
proven. Despite these fundamental and long-standing pretensions the function of AL remains elusive and controversial, primarily for two reasons: (1) it has been difficult to conceive how
the insertion of parallel stacked membrane sheets containing
pre-assembled and possibly pre-oriented NPCs is topologically
possible; and (2) direct experimental evidence for a contribution
of AL-NPCs to the pool of NE-NPCs has never been obtained.
On the contrary, a previous study in Drosophila embryos has detected large soluble pools of transport channel Nups and
concluded that NPC insertion likely proceeds from soluble cytosolic Nups (Onischenko et al., 2004).
Here, we address the function of AL in the physiological
context of the Drosophila blastoderm embryo that is rich in
AL, while it undergoes a series of 13 synchronized mitoses
in a syncytium (Figure S2A). Subsequently, the plasma membranes enclose the cortically aligned somatic nuclei in the
extended 14th interphase, forming the first epithelial cell layer
before the embryo initiates gastrulation (Schejter and Wieschaus, 1993). This occurs concomitantly with the broad
onset of transcriptional activity on the zygotic genome, a
major developmental transition present in all metazoans (Newport and Kirschner, 1982). In the syncytial blastoderm, cell-cycle progression is very rapid, with interphase durations of
~10 min during the early cell cycles. At least in mammalian
cells, de novo NPC interphase assembly has been described
to proceed with markedly slower kinetics (Dultz and Ellenberg,
2010). This led us to hypothesize that NPC assembly into a
closed NE in Drosophila embryos might occur by a different,
faster mechanism. By tracking NPCs in living embryos,
we demonstrate direct uptake of AL-NPCs into the NE, as
the ER feeds nuclear expansion. We derive a topological
model that explains how the INM becomes continuous with inserting membrane sheets from the ER. We conclude that AL
insertion to the NE is a previously unanticipated mode of
NPC insertion that relies on pre-assembled, yet immature
NPC scaffolds and operates prior to gastrulation.
RESULTS
Nuclear Pores Insert from the ER into the NE
To investigate whether AL-NPCs contribute to the pool of NE-NPCs, we conducted live-imaging experiments in Drosophila
blastoderm embryos before formation of the first somatic
cell layer (Figure S2A). During this stage of development, AL
are abundant and thus could potentially serve as a reservoir
for NE-NPCs. To track NPCs throughout early embryogenesis,
we expressed functional GFP or RFP fusions of Nup107
(Katsani et al., 2008) to image scaffold Nups and injected
sub-critical concentrations of the fluorescently labeled lectin
wheat germ agglutinin (WGA) to label FG-Nups. In interphase,
GFP::Nup107 localized to the NE and to prominent foci
throughout the cytoplasm (Figure 1A), similar to structures
that were previously characterized as AL (Cordes et al., 1996;
Daigle et al., 2001; Onischenko et al., 2004, 2005). As expected,
these foci also stained positive for transport channel Nups (Figures 1D–1D″) and always localized to membranes (Figures
S2C–S2E″). To further confirm that these foci are morphologically identical to AL, we performed correlative light and electron
microscopy. RFP::Nup107 fluorescence was strongly enriched
along the NE and at AL (Figures 1B–1B″, S2B, and S2B′). We
conclude that fluorescence imaging in live embryos is well suited to study the spatiotemporal dynamics of annulate
lamellae.
Image quantification revealed a 2.5- to 3-fold increase in nuclear surface during interphases, indicating a considerable uptake of ER membranes within a few minutes (Figures 1A, 1A′,
and 1C). Thus, the surface area of the nucleus just before division
is more than twice as large as the combined surfaces of the two
daughter nuclei after division. This finding implies an excess of
nucleoporins with respect to the available nuclear surface after
mitosis. Both NE-NPCs and AL-NPCs disassemble during
mitosis (Cordes et al., 1996; Onischenko et al., 2005; Stafstrom
and Staehelin, 1984b). This leaves a fenestrated NE that, unlike
in vertebrates, stays throughout mitosis around the separating
sister chromatids and the mitotic spindle, except at centrosomes
(C) NPC density stays constant as nuclei grow. Quantification of normalized nuclear surface increase (blue curves) and the mean GFP::Nup107 fluorescence
intensity SD at the NE (red curves) during interphases. Values on both graphs are normalized to the earliest measured time point for each movie (n = 71 nuclei in
four embryos).
(D–E) The Y-complex protein Nup107 and WGA-labeled transport Nups co-localize at the NE and at AL-NPCs that insert to the NE. Top view still (D–D″) and
kymograph (E) from a time-lapse movie imaging a WGA-Alexa555 injected syncytial blastoderm embryo expressing GFP::Nup107 (see also Movie S2). Insertion is
captured in the kymograph (E) that spans the region of interest (ROI) boxed in (D).
(F–G) Stills (F–F″) and kymograph (G) of the blue shaded ROI from a time-lapse movie recording photo-converted EosFP::Seh-1 before (0 s, F) or after (6 s, F′, and
145 s, F″) photo-conversion of AL-NPCs adjacent to the NE. NPC transfer from AL to the NE is documented by the lateral dispersion of the converted signal, which
starts after 100 s (G).
(H) AL-NPC number drops during early interphases. Quantification of the relative AL-NPC number inferred from GFP::Nup107 fluorescence for four embryos live-imaged over interphases. AL-NPCs were counted from 1 μm distant z sections spanning nuclear height in a constant field of view comprising 10 nuclei for each
embryo.
(I) GFP::Nup107 fluorescence intensity shifts from AL to the NE in the first 25% of interphases, when AL number drops the most (H). Fluorescence intensities were
integrated over consecutive confocal slices covering the entire nuclear height at the NE (NE-NPCs) and at AL (AL-NPCs). Cytoplasmic GFP::Nup107 was
determined from the mean fluorescence intensity. Analysis was done on two embryos (n = 13 and 11 nuclei, respectively). See Figure 1A for representative image;
error bars represent SD over multiple ROIs. All images in Figure 1 are acquired from embryos in cycles 10–13 of the syncytial blastoderm stage.
See also Figure S1 and Movies S2 and S4.
Figure 2. AL-NPCs Resemble Pore Scaffolds

(A) Composition of NE-NPCs and AL-NPCs. Median intensity-based absolute quantification (iBAQ) scores of Nups detected in nuclei containing NE-NPCs,
microsomal membranes containing AL-NPCs and cytosol after fractionation of Drosophila embryos in the syncytial blastoderm stage (n = 3 biological replicates).
Nups are grouped into known subcomplexes and color-coded as represented in (E′).
(B) Western blot analysis of fractionated Drosophila syncytial blastoderm embryos. The Lamin Dm0 is exclusively nuclear (N), while α-tubulin is strongly enriched
in the cytoplasm (C), confirming fractionation quality. Detection with mAb414 recognizing a panel of FG-Nups reveals Nup358 predominantly in membranes (M)
and nuclei but absent from the cytosol. Other FG Nups are mostly soluble (see text). Amido black shows equal loading.
(Movie S1). Simultaneous with NE-NPC re-assembly at the
daughter nuclei in late mitosis, AL-NPCs appeared first at membranes at the spindle (Figures S2C–S2C″; Movie S1), consistent
with the idea of excess nucleoporins at late mitosis/early
interphase.
To address if AL-NPCs contribute to the NE-NPC pool during
the following interphase, we first determined if NE-NPC density
decreases during nuclear surface expansion. We quantified the
mean fluorescence intensities of GFP::Nup107 at the NE and
found that it stayed almost constant (Figure 1C). As a consequence, NPCs have to insert constantly into the NE as its surface increases. Conversely, AL-NPCs were highly abundant in
early but not late interphase (Figures 1A and 1A′). We therefore
investigated their fate during interphase progression by
tracking AL-NPCs in living embryos and found that they insert
into the NE (Figures 1D–1G and S2E–S2E″; Movies S2 and
S3), along ER membranes (Figures S2E–S2E″; Movie S3). To
directly confirm the transfer of Nups from the observed cytoplasmic foci to the NE, we imaged embryos expressing
photo-convertible Seh1-EosFP. After photo-conversion of a
fluorescent spot close to a nucleus, the signal remained locally
constrained for 100 s before it laterally resolved into the proximal NE over roughly the same time frame (Figures 1F–1F″ and
1G; Movie S4), suggesting a critical event prior to lateral diffusion. We conclude that AL-NPCs insert into the NE during blastoderm interphases.
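The argument that a constant NPC density on a growing surface requires continuous NPC insertion can be illustrated with a short calculation. The Python sketch below multiplies a normalized nuclear-surface trace by a normalized NE fluorescence trace to obtain the relative NE-NPC number implied at each time point. It assumes that the mean GFP::Nup107 intensity at the NE is proportional to NPC density; the function name and all numbers are hypothetical illustrations, not measured data.

```python
def relative_npc_number(area, density):
    """Relative NE-NPC number per time point.

    Both traces are normalized to their first time point, then multiplied:
    number ~ surface area x density (assumes intensity is proportional to density).
    """
    if len(area) != len(density):
        raise ValueError("area and density traces must have equal length")
    a0, d0 = area[0], density[0]
    return [(a / a0) * (d / d0) for a, d in zip(area, density)]


# Hypothetical traces in arbitrary units (not data from this study):
# the surface grows 2.7-fold while the density stays constant.
area = [100.0, 150.0, 210.0, 270.0]
density = [50.0, 50.0, 50.0, 50.0]

npc = relative_npc_number(area, density)
for t, n in enumerate(npc):
    print(f"time point {t}: relative NE-NPC number = {n:.2f}")
```

Under these assumed inputs, the implied NPC number simply tracks the surface area, so any fold increase in surface at constant density must be matched by the same fold increase in inserted pores.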
Previous EM-based morphometry suggested that the total
number of AL-NPCs stays constant in the syncytial blastoderm,
i.e., during the first 90 min of Drosophila embryogenesis (Onischenko et al., 2004). However, if AL-NPCs considerably
contribute to the pool of NE-NPCs by inserting into the NE while
it expands, their number should decrease at least temporarily on
much shorter timescales, namely during the 10 min of each
interphase. Indeed, we found that AL-NPCs diminished as interphases progressed (Figure 1H). This reduction was particularly
strong in the first half of interphase, when the rate of nuclear
growth and thus NPC insertion was highest (Figure 1C). To
estimate if the reduction of AL-NPC number reflects NPC redistribution from AL to the NE, we measured the respective
GFP::Nup107 fluorescence levels at both compartments in
three dimensions over time. For all quantified nuclei, the integrated NE fluorescence intensities of GFP::Nup107 increased
between 70% and 100% within the first quarter of interphase,
while inversely intensities at AL strongly decreased, mirroring
AL disappearance (Figure 1I). This supports a scenario in which
the pool of NE-NPCs is predominantly fed by integration of AL-NPCs. Alternatively, AL could disassemble and soluble Nups
could add to pore formation at the growing NE from the cytoplasm. Yet this is unlikely since the GFP::Nup107 background
intensity in the cytoplasm remained constant as AL disassembled (Figure 1I). We conclude that our data rather support
a scenario in which pre-assembled NPCs insert from AL into
the NE.
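The redistribution argument can be phrased as a simple bookkeeping check over the three measured pools (NE, AL, and cytoplasm). The following Python sketch computes the change in each pool between two time points and expresses the NE gain as a percentage. It assumes that integrated fluorescence is comparable between compartments, which is a simplification; the function name and the values are hypothetical and are not measurements from this study.

```python
def pool_changes(start, end):
    """Return the change in integrated intensity for each compartment."""
    return {name: end[name] - start[name] for name in start}


# Hypothetical integrated intensities (arbitrary units) at interphase onset
# and after the first quarter of interphase.
start = {"NE": 100.0, "AL": 120.0, "cytoplasm": 40.0}
end = {"NE": 185.0, "AL": 35.0, "cytoplasm": 40.0}

delta = pool_changes(start, end)
ne_gain_percent = 100.0 * delta["NE"] / start["NE"]

print(f"NE gain: {ne_gain_percent:.0f}%")
print(f"AL change: {delta['AL']:+.0f}, cytoplasm change: {delta['cytoplasm']:+.0f}")
```

In this assumed example the NE gain is mirrored by the AL loss while the cytoplasmic pool is unchanged, which is the pattern expected if pre-assembled NPCs move from AL to the NE rather than passing through a soluble intermediate.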
There are, however, major impediments that challenge the
notion that intact NPCs can insert into the NE: first, NPCs have
an inherent compositional directionality across the NE plane
(Figure S1B). If AL-NPCs were identical to NE-NPCs, they would
be assembled asymmetrically in the absence of a nuclear
compartment providing a directionality cue. They would also have to
be inserted into the NE in the correct orientation. Second, the
integration of NPC-containing ER sheets into an intact NE poses
striking topological obstacles. In particular, how an AL membrane sheet can become continuous with the INM of a sealed,
intact NE is far from obvious. How is AL-NPC insertion thus
possible?
AL-NPCs Resemble Pore Scaffolds
The asymmetry of NPCs derives from sets of FG-Nups that are
found exclusively either at the nucleoplasmic or cytoplasmic
side of the NPC, in contrast to the symmetrically embedded
scaffold Nups of the inner ring and Y-complexes (Figures S1A
and S1B). We therefore explored if NPC composition was preserved in AL. We subjected blastoderm embryos to subcellular
fractionation and comparatively analyzed fractions enriched for
nuclei, microsomal membranes containing AL (devoid of the
NE) and soluble cytosolic proteins by quantitative mass spectrometry (Figures S3A–S3D″, S4A, and S4B). We found that
AL-NPCs contain the full set of NPC scaffold components,
namely all the members of the inner ring and Y-complexes
(Figures 2A and S4B). In contrast, their levels were low or undetectable in the cytosol, with the exception of Sec13, a known
member of the cytosolic coatomer complex (Fath et al., 2007)
(Figures 2A and S4B). These data further support the above-proposed scenario in which soluble pools of scaffold Nups cannot
significantly contribute to the maintenance of NE-NPC number
during interphase (see also Figure 1I).
The FG-Nups 358 and 98 displayed a subcellular distribution
that was similar to scaffold Nups (Figures 2A and S4B). Both
have been recently shown to critically contribute to NPC scaffold
formation (Fischer et al., 2015; Stuwe et al., 2015; von Appen
et al., 2015). Notably, the presence of Nup98 in AL-NPCs, a
protein that is the essential constituent of the NPC permeability
barrier (Hulsmann et al., 2012), suggests that NPCs are impermeable for larger molecules at all times. In contrast, the members of the Nup62/58/54 and Nup214/88 (the latter called Mbo
in flies) complexes as well as the nuclear basket components
Tpr (called Mtor in flies) and Nup153 were absent in AL-NPCs
(Figures 2A–2D and S4B). Instead, the Nup214/88 and Nup62/
58/54 complexes were highly abundant in the cytosol (Figures
2A, 2B, and S4B), in agreement with previous biochemical results that identified certain FG-Nups to be predominantly soluble
and excluded from ER-membranes (Onischenko et al., 2004).
(C–D″) Top views onto a fixed Drosophila syncytial blastoderm embryo in interphase. The nuclear basket components Nup153 (C and C′) and Mtor (D and D′) are
absent from AL-NPCs (arrowheads in C″, D″), which stain positive for mAb414 (C and C″) or WGA (D and D″), both labeling FG-Nups (including Nup358).
(E and E′) AL-NPCs (E) are pore scaffolds made of transmembrane Nups, the inner ring, and Y-complex nucleoporins, extended by Nup358, which might or might
not be attached symmetrically across ER membranes. NE-NPCs (E′) recruit soluble Nups to construct the mature pore that is asymmetric across the NE.
See also Figures S3 and S4.
One might thus surmise that NPC assembly is completed after
insertion into the NE by recruiting soluble Nups from the cytosol
in order to establish directionality and transport competence.
Indeed, the nuclear accumulation of nls::GFP was delayed as
compared to the burst phase of AL-NPC insertion (Figures S3E
and S3F). We conclude that AL-NPCs are pore scaffolds devoid
of most FG-Nups and all nuclear basket components. With the
exception of Nup358, Nups that asymmetrically distribute
across the NE-plane in NE-NPCs are absent from AL-NPCs (Figures 2E and 2E′).
Topology of AL Insertion
To address how AL-NPC insertion is topologically possible, we
sought to identify putative steps of AL insertion by ultrastructural
analysis. We first analyzed sections through staged blastoderm
embryos by transmission electron microscopy after high-pressure freezing and freeze substitution. AL were apparent in the
cytoplasm throughout the embryo as interconnected stacks of
membranes containing NPCs (Figures S5A, S6A, and S6B–S6B″).
AL that were close to the NE and thus potentially could
engage in an insertion event often appeared continuous with
the NE (Figures 3A, 3B, S5B″, and S5C′). Strikingly, as evident
in multiple sections, these AL-NE fusion sites often were adjacent to apparent openings within the NE (Figures 3A and 3B).
At the edges of these openings, the INM and ONM were seen
to be fused in the electron micrographs, emphasizing that the
gaps are not sample preparation artifacts (Figures 3A and 3B).
We used correlative light and electron microscopy as described
above and recorded an electron tomogram at a site where a fluorescent spot of RFP::Nup107 close to the NE indicated a potentially ongoing insertion event (Figures 3C, 3C′, and S5B–S5B″).
Although the observed membrane topology in this region was
complex, it clearly showed the critical features: a patch of AL
engaged with the NE in direct proximity to NE openings. In
conclusion, the unanticipated discovery of NE openings offers
a topological explanation for how the INM becomes continuous
with AL (see below).
A limitation of the aforementioned analysis is that it resolves
membranes only when they are roughly aligned parallel to the
electron optical axis and thus manifest as a projection in the
electron micrographs. Because AL assume various orientations
with respect to the NE, it is relatively unlikely to capture both in a
favorable two-dimensional projection. To better resolve inserting
AL sheets in 3D, we used the slice-and-view technique, in which a
low-angle focused ion beam is used to mill away thin (5–10 nm)
layers of an embedded specimen, alternating with image acquisition by focused ion beam-scanning electron microscopy (FIB-SEM). The resulting volumes have an almost isotropic resolution
and can be virtually rotated to obtain slices in basically any
spatial direction. We first confirmed that the parallel AL-NPC-decorated
ER sheets are indeed highly interconnected in three
dimensions and link to the NE close to openings of the nuclear
membrane (Figure S5C″). NE-openings were frequent and surprisingly large (Figures 3D and 3E). By tangentially slicing the
NE in volumes obtained by FIB-SEM, we could resolve inserting
AL-sheets as part of an ER compartment that enclosed large
parts of the respective NE-opening and contained NPCs (Figures
3F–3F″ and 3G). We conclude that AL insertion typically occurs
in proximity to NE openings.
The three-dimensional data allowed us to deduce putative
topological intermediates of AL-insertion. Those involve establishing membrane connections from the adjacent sheet to the
nuclear membranes and in proximity to openings of the existing
NE. NE-openings could either form de novo or persist from the
previous mitosis (Figures 4A and 4A′). Importantly, their existence suggests a model for AL uptake that elegantly resolves
the topological puzzle: NE openings link the INM to the inserting
sheet and convert the latter into NE (Figures 3F, 3G, and 4A).
Driven by nuclear expansion, both the adopted novel and
the underlying previously present NE sheet laterally slide
away from each other and augment nuclear surface (Figures
4A and 4B′). These topological intermediates can be viewed
as part of a spatiotemporal continuum of AL and NE membranes. This model would predict that redundant pieces of NE
should result from AL insertion (Figures 4A and 4B′). We indeed
could confirm the existence of NPC-decorated, redundant
NE membranes in both micrographs (Figures S5D–S5F) as
well as nucleoplasmic GFP::Nup107 foci in live microscopy (Figure 6C). Such redundant NE could be resolved either by fission
or re-insertion in an equivalent way as AL insertion from the
cytoplasm.
The Permeability Barrier of the NE Is Maintained during
AL Insertion
The apparent NE openings suggest a compromised permeability barrier between the nucleoplasm and the cytosol, except
Figure 3. Topology of AL Insertion

(A–B) NE-openings accompany AL insertions. (A and A′) Transmission electron micrographs (TEM) of two consecutive serial sections of a nucleus with closely
aligned AL-NPCs (arrowheads). Note the clear INM-ONM connection of the encircled opening (A), which is replaced by intact NE in (A′), while the boxed opening is
seen in both sections. (B) Electron micrograph of a nucleus where the NE is opened (boxed) and continuous with an ER stretch.
(C) Capture of an AL-insertion into the NE by correlative light and electron microscopy (CLEM) and tomography (C′). (C) RFP::Nup107 localizes to the NE and is
enriched at proximal AL-NPCs (arrowhead).
(C′) Single slice through a tomogram recorded at the region depicted by the yellow box in (C). Arrowheads point to a membrane connection of the inserting AL to
the NE, adjacent to a NE-opening.
(D) Multiple NE openings (arrowheads) are apparent in a volume containing a nucleus that was obtained by focused ion beam-scanning electron microscopy (FIB-SEM; a single slice is shown superimposed with the NE that is isosurface-rendered in blue).
(E) Histogram of NE-opening diameters measured in TEMs as represented in (A, A′, and B). Most NE-openings are in the range of 400–800 nm.
(F–G) FIB-SEM visualizes the continuity of AL membrane sheets with the NE. Slices through a FIB-SEM volume tangential to the nucleus (F–F″) and isosurface
rendering of the same region (G). (F–F″) Slices at slightly different angles. Note that cross-sectioned NE with NPCs in side view (yellow arrowheads in F–F″) are
perpendicular to an AL membrane sheet that has a branched topology and contains multiple NPCs in top view (white arrowheads in F–F″).
Figure 4. Model of AL-NPC Insertion

(A–B′) Planar (A and A′) or three-dimensional (B and B′) model for AL-NPC insertion, as inferred from electron micrographs. Insertion involves the establishment of membrane connections from the inserting ER sheet containing AL-NPCs to the NE and opening of the NE adjacent to the connected ER sheet. NE openings could emerge de novo by an unknown mechanism (A) or alternatively could remain from the previous round of mitosis (A′). The connected ER sheet (arrowhead in B) becomes part of the NE while nuclear surface increases by lateral dissipation within linked membrane planes. Because pores limit lateral membrane dissipation, NE-NPCs located in between the NE opening and the AL-sheet connection predict the formation of transient redundant NE stretches (A and B′).
(C and C′) The permeability barrier of the NE would be compromised if inserting ER sheets fail to entirely surround (or sufficiently shield) the NE opening against the cytoplasm (C). Alternatively, no cytoplasmic influx through the NE opening would occur in a concealed compartment (C′).
that the NE openings would reside in a compartment that is
entirely surrounded by ER-membranes (Figures 4C and 4C′).
The electron microscopy data (Figures 3C′ and 3G) highlight
the topological complexity of AL insertion and indicate that
the same event might span a considerable fraction of the
nuclear surface area. As such, it is not ultimately possible
to conclude from the three-dimensional data whether or not
the NE remains topologically closed during AL insertion. We
therefore set out to experimentally test if the NE permeability
can be maintained despite AL fusion. We imaged GFP::
Nup107-expressing blastoderm embryos that were injected
with fluorescently labeled dextrans of different molecular
weight (Lenart and Ellenberg, 2006). Small dextrans of 10 and
25 kDa were not excluded by NPCs
and diffused into the nucleoplasm during
interphase (Figures 5A and 5A′). In
contrast, nuclei were impermeable to
155 kDa dextran (Figure 5A″), suggesting
an intact barrier. Importantly, 155 kDa
dextran did not even leak into the nucleoplasm as AL inserted to the NE (Figures
5B–5B″ and 5E; Movie S5), demonstrating that insertion of AL does not
interfere with the permeability barrier of
the nuclear membranes.
To test if an NE opening of the observed
size would in principle cause dextran
influx, we artificially ruptured the nuclear
membranes by performing laser nanosurgery on the NE. Using GFP::Nup107,
we targeted the NE with a 950 nm titanium-sapphire
laser and punctured the nuclear membranes (Figures 5C–5C″; Movie
S6). Successful puncture was reflected by a strong mechanical
response of the entire nucleus apparent as NE folding and tumbling (Movie S6). Strikingly, 155 kDa-dextran accumulated in the
nucleoplasm of punctured nuclei within tens of seconds (Figures
5C′ and 5C″; Movie S6) with kinetics that did not depend on the
size of the punctured region (Figure 5C″). These experiments
conclusively demonstrate that the permeability barrier of the
NE was disrupted after laser-induced rupture while it was not
impaired when AL inserted (Figure 5E), despite comparable dimensions of the respective NE openings apparent in electron
micrographs (Figure 3E). These findings are in line with our
topological model and suggest that the NE-openings are entirely
surrounded by ER membrane sheets.
NPC Organization and Insertion Mode Change during
Development
In mammalian cell lines, NPCs are stationary when embedded within
the NE but mobile along ER membranes in AL (Daigle et al.,
2001). Our results demonstrate frequent AL insertions in
blastoderm embryos and indicate that AL-NPCs predominantly contribute to an increased NE-NPC number during
nuclear expansion. Lateral mobility of NPCs within the NE
could facilitate their re-distribution following AL insertion. We
thus performed FRAP experiments on GFP::Nup107-expressing
embryos to test whether AL-NPCs become immobile upon
NE insertion. Strikingly, we observed fast recovery of GFP::
Nup107 all along the rim after photobleaching (Figures 6A
and 6A′). Together with our finding that NPC material laterally
dispersed after AL insertion to the NE (Figures 1F–1F″), we
conclude that in the Drosophila syncytial blastoderm NPCs
are mobile within the NE. In contrast, GFP::Nup107 fluorescence did not recover after photo-bleaching of nuclei at the
onset of gastrulation, suggesting that dispersion of pores
within the envelope was abolished (Figures 6B and 6B′). The
impaired NE-NPC mobility in those nuclei was reflected by
distinct principles of pore organization along the NE when
compared to nuclei in younger embryos. GFP::Nup107 distributed uniformly along the NE of blastoderm embryos but
appeared clustered into distinct steady foci just before
embryos started to gastrulate (Figures 6C and 6D). The switch
in NPC mobility and organization coincides with the activation
of the zygotic genome (Figure 7A). Thus pore organization at
the NE could be controlled by zygotic genes or by transcription-associated changes in chromatin. Consistent with
both hypotheses, injection of the RNA polymerase inhibitor
α-amanitin prevented clustering of NPCs on later stage nuclei
(Figure 6E).
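FRAP recovery of the kind described above is often summarized as a mobile fraction. The Python sketch below uses the common definition, (plateau − post-bleach) / (pre-bleach − post-bleach), to contrast a recovering and a non-recovering rim signal. This is offered only as an illustration of how such data can be quantified; the text does not state that this calculation was performed, and the function name and intensity values are hypothetical.

```python
def mobile_fraction(pre_bleach, post_bleach, plateau):
    """Fraction of the bleached signal that recovers.

    pre_bleach:  intensity before bleaching
    post_bleach: intensity immediately after bleaching
    plateau:     intensity once recovery has levelled off
    """
    if pre_bleach <= post_bleach:
        raise ValueError("pre-bleach intensity must exceed post-bleach intensity")
    return (plateau - post_bleach) / (pre_bleach - post_bleach)


# Hypothetical normalized intensities (not measured values):
# a syncytial blastoderm nucleus with fast recovery ...
blastoderm = mobile_fraction(pre_bleach=1.0, post_bleach=0.2, plateau=0.9)
# ... and an interphase 14 nucleus with almost no recovery.
interphase_14 = mobile_fraction(pre_bleach=1.0, post_bleach=0.2, plateau=0.25)

print(f"blastoderm mobile fraction:    {blastoderm:.2f}")
print(f"interphase 14 mobile fraction: {interphase_14:.2f}")
```

A value near 1 corresponds to laterally mobile NPCs, whereas a value near 0 corresponds to immobilized pores.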
The nuclear lamina, a meshwork of intermediate filament
proteins that underlies the NE and projects into the nucleoplasm in metazoa, is critical for NPC organization and mobility
within the NE (Daigle et al., 2001), but other NE proteins
could also be important for NPC mobility. Comparative analysis
of the proteomes of isolated nuclei from either blastoderm or
gastrulating embryos revealed a significant (p = 0.00024)
2.5-fold enrichment of a very prominent INM protein, lamin
B receptor (LBR), in older nuclei (data not shown). In mammalian cells LBR is recruited by the Y-complex member ELYS/
Mel28 to specific NE-subdomains and could thus directly
link to NPC distribution (Clever et al., 2012). Strikingly, LBR
was absent from the NE in syncytial embryos but became
localized to the rim of somatic nuclei during cellularization
in interphase 14 (Figures S7A and S7B). Notably, the protein
remained undetectable at the NE of nuclei from pole cells,
which are the posteriorly localized germ cell progenitors (Figures S7C and S7D). To address whether LBR is sufficient to
alter NPC organization at the NE, we ectopically expressed
the protein in WGA-injected syncytial blastoderm embryos.
In these embryos pores appeared clustered within the NE
and nuclei acquired an irregular morphology (Figure 6F).
Both phenotypes are reminiscent of wild-type nuclei at gastrulation onset. Strikingly, ectopic expression of LBR in the early
embryo also induced larger AL sizes (Figures 6F′ and 6G),
implying that LBR expression and NPC clustering counteract
AL-insertion.
Finally, we detected striking differences in the NPC insertion
mode between the different developmental stages. In contrast
to the early embryo (compare to Figure 1C), the mean fluorescence intensities of either GFP::Nup107 or fluorescently labeled
WGA at the nuclear rim strongly decreased as interphase 14
proceeded (Figure 6H), while nuclei significantly increase their
surface area (Fullilove and Jacobson, 1971). Interestingly, the
switch in NPC organization and insertion is concomitant with
the reported (Onischenko et al., 2004; Stafstrom and Staehelin,
1984a) and observed (Figures 6D and 6I) disappearance of AL-NPCs from the cortical nuclear layer at early gastrulation. At the
same time AL remained abundant in pole cells (Figure 6I), suggesting that nuclei from the prospective soma and germline
have different NE organizations, compatible with our results
on differential LBR localization (Figure S7). Jointly, our data
suggest that AL insertion to the NE is an early developmental
program that is reduced or lost as the embryo matures
(Figure 7).
Figure 5. The Permeability Barrier of the NE Is Maintained during AL Insertion

(A) Kymographs from time-lapse movies recorded in syncytial blastoderm embryos expressing GFP::Nup107 injected with fluorescently labeled dextrans of
different molecular weight. Regions of interest (ROIs) for kymographs span entire nuclei as schematically depicted. The NE is permeable for 10 kDa (A) and 25 kDa
dextran (A′), but not for 155 kDa dextran (A″).
(B–B″) The NE stays impermeable upon AL-insertion. Kymographs (scheme) of a time-lapse movie imaging an embryo expressing GFP::Nup107 (B and B′)
injected with dextran-155 kDa-TRITC (B and B″). Dextran stays cytoplasmic upon insertion of GFP::Nup107-labeled AL-NPCs (arrow). Colored ROIs refer to the
graph in (E).
(C–E) Laser puncture of the NE compromises its permeability barrier. (C) Top view still of a time-lapse movie imaging a GFP::Nup107 expressing syncytial embryo
injected with dextran-155 kDa-TRITC, where the NE of two nuclei was laser punctured simultaneously (indicated by the arrowhead and asterisk, respectively). (C′)
Kymograph of the respective movie along the ROI (white box in C). Dextran leaks into the nucleoplasm of the punctured nucleus (arrowheads in C and C′), but not
in the neighboring control nucleus. (C″) Quantification of the mean dextran-TRITC fluorescence intensities ± SD for the respective ROIs color-coded in (C).
Dextran accumulates in the nucleoplasm of nuclei within 20 s after puncture with an initial kinetics that is independent of the size of the punctured region. (D)
Representative kymograph for GFP::Nup107 (D′) and dextran-155 kDa-TRITC (D″) after NE-puncture; colored ROIs refer to the graph in (E). (E) Dextran leaks into
the nucleoplasm upon NE puncture, but not when AL insert to the NE. Quantitation of dextran influx upon NE-puncture and AL insertion. GFP::Nup107 and
155 kDa-dextran-TRITC levels were inferred from their respective fluorescence intensities, determined from line scans on ROIs in kymographs as exemplified in
(B′), (B″), (D′), and (D″). Mean fluorescence intensities ± SD are plotted as a function of the distance from the NE for control nuclei (no insertion, n = 10 nuclei),
nuclei with AL-NPC insertions (n = 7) or punctured nuclei (n = 7). Experiments were aligned by the respective maximal GFP::Nup107 intensity value, delineating the
position of the NE.
See also Movies S5 and S6.
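The alignment step described in this legend, in which line scans are registered by the position of maximal GFP::Nup107 intensity, can be sketched as follows. The Python example finds the index of the GFP peak (taken as the NE position) and re-expresses the dextran line scan as a function of signed distance from that peak. The function name, the pixel-based distance unit, and the example values are assumptions made for illustration only.

```python
def align_by_peak(reference, signal):
    """Re-index `signal` relative to the peak position of `reference`.

    reference: line scan of the NE marker (e.g. GFP::Nup107 intensities)
    signal:    line scan of the probe (e.g. dextran intensities), same length
    Returns a list of (offset_in_pixels, intensity) pairs, where offset 0
    is the position of the reference maximum.
    """
    if len(reference) != len(signal):
        raise ValueError("line scans must have equal length")
    peak = max(range(len(reference)), key=reference.__getitem__)
    return [(i - peak, value) for i, value in enumerate(signal)]


# Hypothetical line scans running from the cytoplasm into the nucleus.
gfp = [1.0, 2.0, 9.0, 3.0, 1.0]        # peak at index 2 marks the NE
dextran = [10.0, 9.0, 5.0, 1.0, 1.0]   # high outside, low inside the nucleus

aligned = align_by_peak(gfp, dextran)
for offset, value in aligned:
    print(f"offset {offset:+d} px: dextran = {value}")
```

Once every scan shares the same origin, scans from different nuclei can be averaged position by position to obtain mean and SD profiles.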
Figure 6. Developmental Regulation of NPC Organization and Insertion

(A–D) Lateral mobility and organization of NPCs change during development. Representative top view stills (A–D) and kymographs (A′ and B′) of time-lapse
movies imaging GFP::Nup107 in syncytial embryos (A, A′, and C) or before gastrulation onset in interphase 14 (B, B′, and D). GFP::Nup107 at the NE was photobleached (arrowheads in A′ and B′) at the depicted orange regions of interest (ROIs) (A and B) and recovered at syncytial blastoderm nuclei (A′) but not at nuclei
from interphase 14 embryos (B′). For both kymographs the respective ROIs are boxed in red in (A) and (B). (C and D) GFP::Nup107 distributes evenly along the NE
at the spherical nuclei of syncytial embryos (C) but appears clustered at the rim of the irregularly shaped nuclei at gastrulation onset (D).
(E) Pore clustering is zygotically induced. Top view still from a movie recording a GFP::Nup107 expressing embryo in interphase 14, injected with α-amanitin,
where nuclei stay round and pores fail to cluster due to inhibition of zygotic gene activation.
(F–G) The zygotically induced gene LBR is sufficient to cluster NPCs and increase AL size. (F and F′) Top view stills from two syncytial blastoderm embryos where
LBR expression was maternally induced. WGA labeled NPCs appeared clustered along the NE (F), similar to wild-type embryos in interphase 14 (B and D) and
accumulate in larger AL-NPCs (F′) (arrowheads). (G) Histogram of AL-NPC sizes measured from images as shown in (C and F′). AL-NPCs are larger in blastoderm
embryos where LBR expression was ectopically induced (n = 282 AL), compared to control embryos (n = 282).
(H) The mode of NPC insertion is developmentally controlled. NPC density along the NE was inferred from the mean fluorescence intensities of GFP::Nup107 or
fluorescently labeled WGA, which were normalized and plotted as a function of interphase progression. NPC densities stayed almost constant in syncytial interphases but strongly decayed in interphase 14 (see also Figure 1C). Plotted are mean fluorescence intensities ± SD for 10 nuclei from each imaged syncytial
blastoderm (n = 5) or cellularizing (n = 4) embryo.
(I) During interphase 14, AL-NPCs were diminished from the somatic nuclear layer at the embryo's cortex but not from pole cells.
See also Figure S7.
Figure 7. NPC Insertion in the Context of
Embryonic Development
(A) AL are abundant throughout the 120 min of
syncytial development but diminish from the
cortical layer in interphase 14, concomitant with
cellularization and the transcriptional activation at
zygotic induction.
(A′) NE-NPCs are laterally mobile and distributed all along the
NE in the syncytial blastoderm, but immobilize and
cluster at the NE starting with cellularization.
(B) In each preceding interphase of the syncytial
blastoderm, AL number oscillates within the cortical nuclear layer on a timescale of 10 min, with
increasingly longer interphases in each cycle. Inverse to AL-NPC number, which decreases in each
interphase, NE-NPC number increases together
with nuclear surface expansion.
(C and D) AL-NPC insertion to the NE occurs in the
range of 1–2 min. It involves an open NE and batch
insertion of AL-NPCs within a proximal ER sheet
(D). Lateral mobility allows NE-redistribution of inserted NPCs. Insertion of subsequent ER sheets
augments nuclear surface (C).
DISCUSSION
Collectively, the following scenario emerges from our data. AL
are abundant in early Drosophila embryos and predominantly
contribute to maintain the constant NE-NPC density in the expanding NE during interphase. The abundance of AL at the
cortical nuclear layer thereby oscillates together with the progression of the consecutive interphases until the start of global transcription, when AL disappear and the mode of NPC insertion
changes (Figures 7A and 7B). At the onset of each early interphase, AL-NPCs are assembled similarly to NE-NPCs, but since
the combined nuclear surface of the two daughter nuclei is
smaller than that of the parental nucleus, they remain in the cytoplasm. As interphases progress, AL-NPCs feed into
the pool of NE-NPCs alongside ER membranes that augment NE surface during
rapid nuclear expansion (Figure 7C). AL
insertion is enabled by NE openings that
might either persist from previous mitosis
or form de novo by an unknown mechanism (Figure 7D). Upon AL insertion, the
NE permeability barrier remains unperturbed, likely because the NE openings
are entirely surrounded by the ER
network. The inserting NPCs comprise
pre-assembled NPC scaffolds that recruit
the full set of Nups only subsequent to
insertion and only then establish transport
competence.
Why do the expanding nuclei of the syncytial blastoderm maintain a constant
number of NPCs per surface area despite
their transcriptional inactivity? One might
surmise that this is due to mechanical
properties but also temporal constraints. The insertion of NPCs
might be crucial to enable the massive influx of material into
the nucleoplasm during nuclear expansion (volume increase).
Indeed, the strained configuration of nuclei is reflected by their
strong mechanical response (NE tumbling) upon disruption of
the NE and permeability barrier after laser puncture. Second,
the batch transfer of entire NPC scaffolds as inherent parts of
membrane sheets overcomes the described kinetic constraints
of interphase assembly in mammalian cells, which are not compatible with the short interphases in the Drosophila syncytium (Dultz
and Ellenberg, 2010). Given the abundance of AL-NPCs and the
reported high insertion rate of NPCs into the NE of Xenopus
laevis oocytes (D'Angelo et al., 2006), it appears likely that similar
mechanisms operate in vertebrates. It remains unclear how
sufficient amounts of AL are generated to globally feed nuclear
surface expansion over multiple cell cycles until the start of
transcription. However, Nups are maternally provided and AL
are abundant not only at the cortical layer of nuclei but also within
the interior of the embryo (Figures S6A and S6B–S6B″). Therefore, a possibility that needs to be considered is that a source
of AL-NPCs already generated during oogenesis feeds nuclear
growth throughout the syncytial blastoderm.
In addition to their eminent role in transport, NE-NPCs organize the nuclear periphery by delineating zones of active
euchromatin as compared to transcriptionally repressed heterochromatin in between pores (Ptak et al., 2014). Crucial to
this is that NPCs are laterally immobile within the NE, which
was shown to depend on the nuclear lamina (Daigle et al.,
2001). Lamins are nuclear intermediate filament proteins
and come in two major types: B-type Lamins are ubiquitous,
while A-type Lamins are expressed exclusively when cells
differentiate. Both proteins engage in distinct meshworks and
also impact on NPC insertion rate (Lenz-Bohme et al., 1997;
Liu et al., 2000). Our work puts NPC organization and the
mode of pore insertion into a developmental context. We propose that in Drosophila AL insertion is innate to earliest
embryogenesis and diminishes when pores get laterally
restricted and cluster at the NE. There are no A-type lamins expressed at that stage, and specifically expressed INM proteins
could be crucial. Intriguingly, the formation of immobile pore
clusters coincides with the transcriptional upregulation of hundreds of genes at zygotic induction, a developmental transition
present in all metazoans that is accompanied by characteristic
changes in chromatin signatures (Rudolph et al., 2007; Vastenhouw et al., 2010). We reveal that the zygotically upregulated
INM protein LBR, a developmentally controlled INM tether of
peripheral heterochromatin (Solovei et al., 2013), is sufficient
to prematurely aggregate NPCs in blastoderm interphases,
when artificially expressed earlier in embryogenesis. This also
leads to larger AL likely because LBR counteracts AL insertion
for which lateral NPC mobility is required. Our data suggest a
zygotically induced regulation that links pore insertion and organization, NE composition and ultimately also chromatin organization at the nuclear periphery. All of these events eventually
contribute to the commitment of originally pluripotent somatic
nuclei into distinct lineages.
Detailed experimental procedures are available in the Supplemental
Information.
Embryo Injections, Live Imaging, and Immunostainings
Staged embryos were treated according to standard protocols and injected
with Alexa488 or Alexa555 conjugated WGA (100 mg/ml, Life Technologies),
α-amanitin (100 mg/ml, Sigma) or TRITC/FITC conjugated dextrans of
different molecular weight (Lenart and Ellenberg, 2006). Subsequently embryos were imaged on an inverted Zeiss LSM780 confocal microscope
equipped with a 63×/1.4 NA oil immersion objective. For immunostainings,
embryos were fixed in 4% formaldehyde and processed according to standard protocols.
Cell 166, 664–678, July 28, 2016
Sub-cellular Fractionation and Protein Identification by Mass Spectrometry
Dechorionated embryos were lysed and nuclei were isolated by centrifugation at 5,000 rpm for 13 min and stripped from attached membranes
by centrifugation (45 min, 12,000 rpm) through a 1 M sucrose cushion.
Microsomal membranes were isolated by spinning the supernatant from
the nuclear precipitate at 40,000 rpm for 45 min. Samples were further
processed and analyzed by shotgun mass spectrometry as previously
described (Mackmull et al., 2015). Raw files for quantitative label free
analysis were analyzed using MaxQuant (Cox and Mann, 2008) and the
MS/MS spectra were searched against the Drosophila Swiss-Prot entries
using the Andromeda search engine (Cox et al., 2011). Protein differential expression was evaluated using the Limma package. Differences
in protein abundances were statistically determined using the Student's
t test moderated by the empirical Bayes method. Significantly regulated
proteins were defined by a cut-off of log2 fold change ≤ −1 or ≥ 1 and
p value ≤ 0.01.
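The cut-off above can be illustrated with a minimal sketch. It assumes that log2 fold changes and p values have already been produced by the moderated t test (the Limma step itself is not reproduced here), and the `ProteinStat` record, its field names, and the function names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProteinStat:
    """Hypothetical container for one protein's differential-expression result."""
    protein_id: str
    log2_fold_change: float
    p_value: float


def is_significantly_regulated(stat, min_abs_log2_fc=1.0, max_p_value=0.01):
    """Apply the stated cut-off: log2 fold change <= -1 or >= 1, and p value <= 0.01."""
    return abs(stat.log2_fold_change) >= min_abs_log2_fc and stat.p_value <= max_p_value


def filter_regulated(stats):
    """Return only the proteins that pass both thresholds."""
    return [s for s in stats if is_significantly_regulated(s)]


if __name__ == "__main__":
    # Illustrative values only; these are not data from the study.
    example = [
        ProteinStat("protA", -1.4, 0.002),
        ProteinStat("protB", 0.6, 0.0001),
        ProteinStat("protC", 2.1, 0.05),
    ]
    print([s.protein_id for s in filter_regulated(example)])
```

Both conditions must hold, so a protein with a large fold change but a p value above 0.01 is excluded, as is a protein with a small fold change and a very low p value.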
Transmission Electron Micrograph, FIB-SEM, and Correlative Light
and Electron Microscopy Imaging
Embryos were high pressure frozen, freeze-substituted and infiltrated with
resin. Blocks were subsequently trimmed for FIB-SEM or cut into 300-nm
sections for correlative light and electron microscopy (CLEM) analysis using
an ultramicrotome. For serial transmission electron microscopy (TEM), the
resin-embedded embryos were trimmed and consecutive 100 nm distant sections were obtained with a section thickness of 80 nm. TEM imaging was
carried out on a FEI Tecnai F30 equipped with Gatan US4000 CCD camera,
operated at 300 kV or a FEI Biotwin equipped with an Olympus Keen View
G2 camera operated at 120 kV, respectively. The fluorescence microscopy
(FM) imaging was carried out as previously described (Avinoam et al., 2015;
Kukulski et al., 2011). Tomography was performed in 1° increments at
4,700× magnification on a FEI Tecnai F30 electron microscope. FIB-SEM
imaging was carried out on an Auriga 60 (Zeiss) using the Atlas3D software.
Datasets were acquired with 5 nm pixel size and 5 nm steps in z and aligned
using TrakEM (Fiji).
ACCESSION NUMBERS
The accession number for the mass spectrometry proteomics data reported in
this paper is ProteomeXchange Consortium: PXD004120.
seven figures, and six movies and can be found with this article online at
B.H. conceived the project, designed and performed experiments, analyzed
data, and wrote the manuscript. M.T.M. designed and performed experiments and analyzed data. P.M., P.R., K.H.B., N.S., R.S.M., A.A.P., A.N.,
and J.M.P. performed experiments. T.L. designed experiments. Y.S. designed experiments and oversaw the project. M.B. conceived the project,
designed experiments, analyzed data, oversaw the project, and wrote the
manuscript.
ACKNOWLEDGMENTS
We thank Drs. Iain Mattaj and Peter Lenart for critical reading of the manuscript
and Dr. Yannick Azou-Gros for experimental assistance. We are grateful to our
colleagues Drs. V. Doye, N. Dostatni, A. Ephrussi, S. deRenzis, M. Mavrakis, A.
Akhtar, G. Krohne, I. Mattaj, and J. Ellenberg for reagents. Stocks obtained
from the Bloomington Drosophila Stock Center (NIH P40OD018537) were
used in this study. We gratefully acknowledge support by the European Molecular Biology Laboratory (EMBL) advanced light microscopy (ALMF), electron
microscopy (EMCF), and proteomic (PCF) core facilities and Petra Riedinger
for graphical design. We are grateful to Drs. P. Paul-Gilloteaux, X. Heiligenstein, and S. Mosalaganti for assistance and expertise in correlative light
and electron microscopy analysis. B.H. was supported by the Agence Nationale de la Recherche (ANR) (Programme Blanc) NeMo (to T.L) and the European Molecular Biology Organisation (EMBO). K.H.B. was supported by postdoctoral fellowships from the Swiss National Science Foundation (SNF),
EMBO, and Marie Curie Actions. M.B. acknowledges funding by EMBL and
the European Research Council (309271-NPCAtlas).
Received: October 25, 2015
REFERENCES
Avinoam, O., Schorb, M., Beese, C.J., Briggs, J.A., and Kaksonen, M. (2015). ENDOCYTOSIS. Endocytic sites mature by continuous bending and remodeling of the clathrin coat. Science 348, 1369–1372.
Clever, M., Funakoshi, T., Mimura, Y., Takagi, M., and Imamoto, N. (2012). The nucleoporin ELYS/Mel28 regulates nuclear envelope subdomain formation in HeLa cells. Nucleus 3, 187–199.
Cordes, V.C., Reidenbach, S., and Franke, W.W. (1996). Cytoplasmic annulate lamellae in cultured cells: composition, distribution, and mitotic behavior. Cell Tissue Res. 284, 177–191.
Cox, J., and Mann, M. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372.
Cox, J., Neuhauser, N., Michalski, A., Scheltema, R.A., Olsen, J.V., and Mann, M. (2011). Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805.
D'Angelo, M.A., Anderson, D.J., Richard, E., and Hetzer, M.W. (2006). Nuclear pores form de novo from both sides of the nuclear envelope. Science 312, 440–443.
Daigle, N., Beaudouin, J., Hartnell, L., Imreh, G., Hallberg, E., Lippincott-Schwartz, J., and Ellenberg, J. (2001). Nuclear pore complexes form immobile networks and have a very low turnover in live mammalian cells. J. Cell Biol. 154, 71–84.
Doucet, C.M., and Hetzer, M.W. (2010). Nuclear pore biogenesis into an intact nuclear envelope. Chromosoma 119, 469–477.
Doucet, C.M., Talamas, J.A., and Hetzer, M.W. (2010). Cell cycle-dependent differences in nuclear pore complex assembly in metazoa. Cell 141, 1030–1041.
Dultz, E., and Ellenberg, J. (2010). Live imaging of single nuclear pores reveals unique assembly kinetics and mechanism in interphase. J. Cell Biol. 191, 15–22.
Dultz, E., Zanin, E., Wurzenberger, C., Braun, M., Rabut, G., Sironi, L., and Ellenberg, J. (2008). Systematic kinetic analysis of mitotic dis- and reassembly of the nuclear pore in living cells. J. Cell Biol. 180, 857–865.
Fath, S., Mancias, J.D., Bi, X., and Goldberg, J. (2007). Structure and organization of coat proteins in the COPII cage. Cell 129, 1325–1336.
Fischer, J., Teimer, R., Amlacher, S., Kunze, R., and Hurt, E. (2015). Linker Nups connect the nuclear pore complex inner ring with the outer ring and transport channel. Nat. Struct. Mol. Biol. 22, 774–781.
Fullilove, S.L., and Jacobson, A.G. (1971). Nuclear elongation and cytokinesis in Drosophila montana. Dev. Biol. 26, 560–577.
Grossman, E., Medalia, O., and Zwerger, M. (2012). Functional architecture of the nuclear pore complex. Annu. Rev. Biophys. 41, 557–584.
Hulsmann, B.B., Labokha, A.A., and Gorlich, D. (2012). The permeability of reconstituted nuclear pores provides direct evidence for the selective phase model. Cell 150, 738–751.
Katsani, K.R., Karess, R.E., Dostatni, N., and Doye, V. (2008). In vivo dynamics of Drosophila nuclear envelope components. Mol. Biol. Cell 19, 3652–3666.
Kessel, R.G. (1983). The structure and function of annulate lamellae: porous cytoplasmic and intranuclear membranes. Int. Rev. Cytol. 82, 181–303.
Kukulski, W., Schorb, M., Welsch, S., Picco, A., Kaksonen, M., and Briggs, J.A. (2011). Correlated fluorescence and 3D electron microscopy with high sensitivity and spatial precision. J. Cell Biol. 192, 111–119.
Laurell, E., Beck, K., Krupina, K., Theerthagiri, G., Bodenmiller, B., Horvath, P., Aebersold, R., Antonin, W., and Kutay, U. (2011). Phosphorylation of Nup98 by multiple kinases is crucial for NPC disassembly during mitotic entry. Cell 144, 539–550.
Lenart, P., and Ellenberg, J. (2003). Nuclear envelope dynamics in oocytes: from germinal vesicle breakdown to mitosis. Curr. Opin. Cell Biol. 15, 88–95.
Lenart, P., and Ellenberg, J. (2006). Monitoring the permeability of the nuclear envelope during the cell cycle. Methods 38, 17–24.
Lenz-Bohme, B., Wismar, J., Fuchs, S., Reifegerste, R., Buchner, E., Betz, H., and Schmitt, B. (1997). Insertional mutation of the Drosophila nuclear lamin Dm0 gene results in defective nuclear envelopes, clustering of nuclear pore complexes, and accumulation of annulate lamellae. J. Cell Biol. 137, 1001–1016.
Liu, J., Rolef Ben-Shahar, T., Riemer, D., Treinin, M., Spann, P., Weber, K., Fire, A., and Gruenbaum, Y. (2000). Essential roles for Caenorhabditis elegans lamin gene in nuclear organization, cell cycle progression, and spatial organization of nuclear pore complexes. Mol. Biol. Cell 11, 3937–3947.
Longo, F.J., and Anderson, E. (1968). The fine structure of pronuclear development and fusion in the sea urchin, Arbacia punctulata. J. Cell Biol. 39, 339–368.
Mackmull, M.T., Iskar, M., Parca, L., Singer, S., Bork, P., Ori, A., and Beck, M. (2015). Histone deacetylase inhibitors (HDACi) cause the selective depletion of bromodomain containing proteins (BCPs). Mol. Cell. Proteomics 14, 1350–1360.
Newport, J., and Kirschner, M. (1982). A major developmental transition in early Xenopus embryos: II. Control of the onset of transcription. Cell 30, 687–696.
Onischenko, E.A., Gubanova, N.V., Kieselbach, T., Kiseleva, E.V., and Hallberg, E. (2004). Annulate lamellae play only a minor role in the storage of excess nucleoporins in Drosophila embryos. Traffic 5, 152–164.
Onischenko, E.A., Gubanova, N.V., Kiseleva, E.V., and Hallberg, E. (2005). Cdk1 and okadaic acid-sensitive phosphatases control assembly of nuclear pore complexes in Drosophila embryos. Mol. Biol. Cell 16, 5152–5162.
Ptak, C., Aitchison, J.D., and Wozniak, R.W. (2014). The multifunctional nuclear pore complex: a platform for controlling gene expression. Curr. Opin. Cell Biol. 28, 46–53.
Rudolph, T., Yonezawa, M., Lein, S., Heidrich, K., Kubicek, S., Schafer, C., Phalke, S., Walther, M., Schmidt, A., Jenuwein, T., and Reuter, G. (2007). Heterochromatin formation in Drosophila is initiated through active removal of H3K4 methylation by the LSD1 homolog SU(VAR)3-3. Mol. Cell 26, 103–115.
Schejter, E.D., and Wieschaus, E. (1993). Functional elements of the cytoskeleton in the early Drosophila embryo. Annu. Rev. Cell Biol. 9, 67–99.
Solovei, I., Wang, A.S., Thanisch, K., Schmidt, C.S., Krebs, S., Zwerger, M., Cohen, T.V., Devys, D., Foisner, R., Peichl, L., et al. (2013). LBR and lamin A/C sequentially tether peripheral heterochromatin and inversely regulate differentiation. Cell 152, 584–598.
Soupart, P., and Strong, P.A. (1974). Ultrastructural observations on human oocytes fertilized in vitro. Fertil. Steril. 25, 11–44.
Spindler, M., and Hemleben, C. (1982). Formation and possible function of annulate lamellae in a planktic foraminifer. J. Ultrastruct. Res. 81, 341–350.
Stafstrom, J.P., and Staehelin, L.A. (1984a). Are annulate lamellae in the Drosophila embryo the result of overproduction of nuclear pore components? J. Cell Biol. 98, 699–708.
Stafstrom, J.P., and Staehelin, L.A. (1984b). Dynamics of the nuclear envelope and of nuclear pore complexes during mitosis in the Drosophila embryo. Eur. J. Cell Biol. 34, 179–189.
Stuwe, T., Bley, C.J., Thierbach, K., Petrovic, S., Schilbach, S., Mayo, D.J., Perriches, T., Rundlet, E.J., Jeon, Y.E., Collins, L.N., et al. (2015). Architecture of the fungal nuclear pore inner ring complex. Science 350, 56–64.
Vastenhouw, N.L., Zhang, Y., Woods, I.G., Imam, F., Regev, A., Liu, X.S., Rinn, J., and Schier, A.F. (2010). Chromatin signature of embryonic pluripotency is established during genome activation. Nature 464, 922–926.
von Appen, A., Kosinski, J., Sparks, L., Ori, A., DiGuilio, A.L., Vollmer, B., Mackmull, M.T., Banterle, N., Parca, L., Kastritis, P., et al. (2015). In situ structural analysis of the human nuclear pore complex. Nature 526, 140–143.
Article
Adjacent Codons Act in Concert to Modulate Translation Efficiency in Yeast
Authors
Caitlin E. Gamble, Christina E. Brule,
Kimberly M. Dean, Stanley Fields,
Elizabeth J. Grayhack
Correspondence
fields@uw.edu (S.F.),
elizabeth_grayhack@urmc.rochester.edu
(E.J.G.)
In Brief
Rather than protein synthesis relying
solely on readout of individual codons,
pairs of codons dictate translational
efficiency, suggesting unexpected
coupling between tRNA binding sites
within the ribosome.
Highlights
• 17 codon pairs in yeast mediate strong inhibition of translation
• Inhibition by codon pairs is distinct from dipeptide and individual codon effects
• Inhibitory pairs slow the ribosome on native mRNAs and involve wobble decoding
• Codon order is key to inhibition, implying distinct roles for each position
Gamble et al., 2016, Cell 166, 679–690

Caitlin E. Gamble,1,2,6 Christina E. Brule,3,4,6 Kimberly M. Dean,3,4,8 Stanley Fields,1,5,7,* and Elizabeth J. Grayhack3,4,7,*
1Departments of Genome Sciences and Medicine, University of Washington, Seattle, WA 98195, USA
2Program in Molecular and Cellular Biology, University of Washington, Seattle, WA 98195, USA
3Department of Biochemistry and Biophysics, School of Medicine and Dentistry, University of Rochester, Rochester, NY 14642, USA
4Center for RNA Biology, University of Rochester, Rochester, NY 14642, USA
5Howard Hughes Medical Institute, University of Washington, Seattle, WA 98195, USA
6Co-first author
7Co-senior author
8Present address: BD Biosciences, 2350 Qume Drive, San Jose, CA 95131, USA
*Correspondence: fields@uw.edu (S.F.), elizabeth_grayhack@urmc.rochester.edu (E.J.G.)
SUMMARY
Translation elongation efficiency is largely thought of
as the sum of decoding efficiencies for individual codons. Here, we find that adjacent codon pairs modulate translation efficiency. Deploying an approach in
Saccharomyces cerevisiae that scored the expression of over 35,000 GFP variants in which three adjacent codons were randomized, we have identified 17
pairs of adjacent codons associated with reduced
expression. For many pairs, codon order is obligatory for inhibition, implying a more complex interaction than a simple additive effect. Inhibition mediated
by adjacent codons occurs during translation itself
as GFP expression is restored by increased tRNA
levels or by non-native tRNAs with exact-matching
anticodons. Inhibition operates in endogenous
genes, based on analysis of ribosome profiling
data. Our findings suggest translation efficiency is
modulated by an interplay between tRNAs at adjacent sites in the ribosome and that this concerted
effect needs to be considered in predicting the functional consequences of codon choice.
INTRODUCTION
Translation elongation shapes the proteome, influencing the
amount of protein produced per mRNA and folding of the
nascent protein (Gingold and Pilpel, 2011; Ingolia et al., 2009;
Thanaraj and Argos, 1996). During translation elongation, ribosomes coordinate interactions of codons in mRNA with the anticodons of cognate tRNAs, resulting in addition of an amino acid
to the growing polypeptide, followed by a three-base translocation of the mRNA. Synonymous codons specify addition of the
same amino acid to a growing polypeptide chain, but differ in
their relative use in the genome, the abundance of the tRNAs
that decode them, and their requirement for wobble (non-Watson-Crick) decoding interactions between the third base of
the codon and the first base of the anticodon.
Codon choice modulates translation efficiency (Gingold and
Pilpel, 2011), protein folding (Thanaraj and Argos, 1996; Zhang
et al., 2009), and mRNA decay (Presnyak et al., 2015). A set of
optimal codons, decoded by abundant tRNAs, is implicated
in high-translation efficiency in Saccharomyces cerevisiae and
Escherichia coli (Burgess-Brown et al., 2008; dos Reis et al.,
2004; Gingold and Pilpel, 2011; Pechmann and Frydman,
2013; Presnyak et al., 2015; Sharp and Li, 1987; Welch et al.,
2009). The importance of codon choice is underscored by the
finding that codon use differs substantially between genes expressed in proliferating human cells and in differentiated tissues
(Gingold et al., 2014). However, the parameters that modulate
elongation are not well understood.
While suboptimal codon use could reflect a lack of selective
pressure (Plotkin and Kudla, 2011), in some cases suboptimal
codon use is functionally important, for instance, in the expression of the Neurospora clock protein FRQ (Zhou et al., 2013)
and the cyanobacteria oscillator kaiABC (Xu et al., 2013). The
prevailing hypothesis for how suboptimal codons affect translation has been that their decoding by low-abundance tRNAs
slows ribosome progress. Variation in the decoding rates of individual codons has been detected in some studies (Curran and
Yarus, 1989; Gardin et al., 2014; Lareau et al., 2014; Pedersen,
1984; Sørensen and Pedersen, 1991; Stadler and Fire, 2011),
but not in others (Ingolia et al., 2009; Pop et al., 2014; Subramaniam et al., 2014). Furthermore, it is not resolved how differences
in decoding impact translation efficiency, with different studies
suggesting that suboptimal codons affect mRNA decay (Presnyak et al., 2015), translation initiation rates (Chu et al., 2014),
or the recruitment of quality control systems (Letzring et al.,
2013).
Detection of codon-mediated effects is complicated by three
factors. First, changes in codon use affect mRNA sequence,
which also influences mRNA structure, protein and microRNA
binding sites, and splicing signals (Goodman et al., 2013; Kudla
et al., 2009; Weatheritt and Babu, 2013; Welch et al., 2009). Second, translation of particular amino acids or amino acid combinations, such as proline repeats, may affect both the rate and
efficiency of translation (Gutierrez et al., 2013; Lareau et al.,
2014). Third, codon-mediated effects almost certainly depend
on additional parameters beyond the single codon, including a
codon's location in the gene (Letzring et al., 2010; Pechmann
and Frydman, 2013; Tuller et al., 2010a, 2010b; Wolf and Grayhack, 2015) and sequence context (Boycheva et al., 2003; Fedorov et al., 2002; Moura et al., 2005).
Interactions between adjacent codons were first implicated in
translation efficiency by the biased use of codon context and
codon pairs in organisms in all three kingdoms (Fedorov et al.,
2002; Gutman and Hatfield, 1989). In human cells, recoding viral
genes with underused codon pairs reduces expression and
leads to attenuated viruses (Coleman et al., 2008). Moreover,
codon pairs and codon context affect the rate of translation elongation in the HisT leader peptide in Salmonella enterica (Chevance et al., 2014), while adjacent CGA codons inhibit translation
in S. cerevisiae more effectively than individual CGA codons
(Letzring et al., 2010). Thus, interactions between sites in the
ribosome may play important roles in regulating translation.
However, a major impediment to understanding translational
control mediated by codon choice has been the lack of an unambiguous method to identify codons or codon combinations that
reduce translation efficiency.
We reasoned that an analysis of extensive variation within a
small region could identify codon combinations that reduce
gene expression in the yeast S. cerevisiae. Large synthetic libraries of a reporter gene provide a robust tool for evaluating
the functional impacts of sequence variation (Goodman et al.,
2013; Kudla et al., 2009; Welch et al., 2009). We therefore
used fluorescence-activated cell sorting (FACS) and deep
sequencing to measure the expression of 35,811 GFP variants
in which three adjacent codons near the 5′ end of the coding
sequence were randomized. We identified 17 codon pairs
associated with low expression and examined their effects on
translation. We have found that most of these codon pairs substantially reduce the rate of translation elongation on native yeast
mRNAs and dramatically reduce expression in only a single
orientation, consistent with inhibition by the codon pair and not
by the sum of individual codon effects. We conclude that the
rate of translation elongation is modulated by the concerted
effects of tRNA:codon interactions in two adjacent sites in the
ribosome.
RESULTS
Analysis of 35,811 Three-Codon Variants Reveals Codon
Pairs Linked to Reduced Expression
To identify codons or codon pairs that substantially inhibit yeast
translation, we randomized three adjacent codons at amino
acids 6–8 of a fusion protein encoding superfolder GFP in the
chromosomally integrated RNA-ID reporter (Dean and Grayhack, 2012). Codon-mediated translational control has been
recapitulated in this reporter, in which a bidirectional GAL1,10
promoter separately drives expression of GFP and RFP, with
normalization of GFP to red fluorescent protein (RFP) used to
reduce transcriptional noise. We created two libraries that randomized the three codons (Figure 1A): the (VNN)3 library encoded each codon by VNN (V = A, C, or G) to avoid insertion
of stop codons, and the (NNN)3 library encoded each codon by
NNN. This approach seemed likely to comprehensively define interactions between adjacent codons, since each codon pair, the
reverse of each codon pair, and the two individual codons would
be represented many times in different contexts.
To detect differences in GFP expression, we used fluorescence-activated cell sorting (FACS) to separate yeast cells into
three fluorescence bins. For the (NNN)3 library, we made and
separately sorted two independent yeast libraries. We estimated
that the assay detected expression levels, relative to a no-insert
GFP reference, from 75%100% in bin 1, from 25%75% in
bin 2 (median 43% of bin 1 median), and from 2.5%25% in bin 3
(median 6% of bin 1 median) (Figure 1A); GFP variants with stop
codons migrated into the background bin. Following FACS, we
sequenced the three-codon insertions from cells in each bin,
carried out quality filtering, and determined the relative distribution of sequences. We estimated mean expression (GFPSEQ) for
each sequence from the sequence's distribution across
bins, applying the median fluorescence of a bin to all read
counts in that bin. GFPSEQ scores correlated across the three libraries (r = 0.91 to 0.93) (Figure S1A) and with mean GFP expression of 76 individual constructs measured by flow cytometry
(GFPFLOW) (r = 0.81), although binning limited the resolution
(Figure S1B).
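The GFPSEQ estimate described above can be sketched as a read-weighted mean of bin medians. This is a minimal illustration, not the authors' pipeline: it assumes that read counts per bin have already been normalized so that they are comparable across bins, and it expresses bin medians relative to the bin 1 median (1.0, 0.43, and 0.06, following the percentages quoted in the text). The function name and dictionary layout are hypothetical.

```python
# Bin medians relative to the bin 1 median, taken from the percentages quoted in the text.
RELATIVE_BIN_MEDIANS = {1: 1.00, 2: 0.43, 3: 0.06}


def estimate_gfp_seq(reads_per_bin, bin_medians=RELATIVE_BIN_MEDIANS):
    """Estimate mean expression for one sequence.

    reads_per_bin maps bin number -> (normalized) read count for this sequence.
    Every read in a bin is assigned that bin's median fluorescence, and the
    estimate is the mean over all reads.
    """
    total_reads = sum(reads_per_bin.values())
    if total_reads <= 0:
        raise ValueError("sequence has no reads in any bin")
    weighted_sum = sum(count * bin_medians[b] for b, count in reads_per_bin.items())
    return weighted_sum / total_reads


if __name__ == "__main__":
    # Illustrative counts only; these are not data from the study.
    print(estimate_gfp_seq({1: 90, 2: 10, 3: 0}))
    print(estimate_gfp_seq({1: 5, 2: 20, 3: 75}))
```

Because each bin contributes a single median value, the estimate cannot resolve differences within a bin, which matches the remark that binning limited the resolution.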
We considered that amino acid sequences encoded by the insertions could affect GFP stability or expression, although most
of these effects should be mitigated by using superfolder GFP,
which has robust fluorescence even when fused to several insoluble proteins (Pedelacq et al., 2006). Thus, for downstream analysis, we included only the 35,811 unique DNA sequences
specifying one of 5,148 tripeptides that had at least one synonymous sequence above the mean of all GFPSEQ scores (which
left out 4.1% of tripeptides and 6.2% of DNA variants). For
each DNA sequence, the highest scoring sequence encoding a
synonymous peptide served as its synonymous reference. We
scored expression due to codon usage (syn-GFPSEQ) as the
GFPSEQ ratio of a given sequence and its synonymous reference.
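The syn-GFPSEQ score defined above can be sketched as follows: each insert is translated, the highest GFPSEQ among inserts encoding the same peptide is taken as the synonymous reference, and each score is divided by that reference. This is an illustrative sketch using the standard genetic code; the function names are hypothetical, scores are assumed to be positive, and the upstream filtering of tripeptides is not reproduced.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(sequence):
    """Translate an in-frame DNA or RNA sequence into one-letter amino acids."""
    dna = sequence.upper().replace("U", "T")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def syn_gfp_seq(gfp_seq_scores):
    """Map each sequence to its GFPSEQ divided by the best synonymous GFPSEQ."""
    best_by_peptide = defaultdict(float)
    for sequence, score in gfp_seq_scores.items():
        peptide = translate(sequence)
        best_by_peptide[peptide] = max(best_by_peptide[peptide], score)
    return {
        sequence: score / best_by_peptide[translate(sequence)]
        for sequence, score in gfp_seq_scores.items()
    }


if __name__ == "__main__":
    # Illustrative scores only; these are not data from the study.
    scores = {"AGACCAGCT": 0.90, "CGACCGGCT": 0.18, "AGAATTGCT": 0.80}
    for seq, ratio in syn_gfp_seq(scores).items():
        print(seq, translate(seq), round(ratio, 3))
```

By construction the reference sequence for each peptide scores exactly 1, so any value below 1 reflects a difference between synonymous encodings of the same tripeptide.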
As expected, most synonymous variants had similar expression (Figure 1B; Table S1), with a mean syn-GFPSEQ of 0.954.
However, 1,119 DNA sequences (low variants) had syn-GFPSEQ
ranging from 0.059 through 0.647 (three SDs or more below the
mean). Intermediate variants comprised 5,127 sequences from
0.648 through 0.953; high variants comprised 24,417 non-reference (as well as 5,148 reference) sequences with syn-GFPSEQ
greater than 0.953.
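The three expression categories can be written as a small classifier, together with a check that the reported category sizes add up to the 35,811 sequences analyzed. This is a sketch under the assumption that scores are compared after rounding to three decimals, as the thresholds are reported; the constant and function names are hypothetical.

```python
LOW_MAX = 0.647            # three SDs or more below the mean of 0.954
INTERMEDIATE_MAX = 0.953   # upper bound of the intermediate category


def classify_variant(syn_gfp_seq_score):
    """Assign a syn-GFPSEQ score to the low, intermediate, or high category."""
    score = round(syn_gfp_seq_score, 3)
    if score <= LOW_MAX:
        return "low"
    if score <= INTERMEDIATE_MAX:
        return "intermediate"
    return "high"


# Category sizes as reported in the text.
REPORTED_COUNTS = {
    "low": 1_119,
    "intermediate": 5_127,
    "high (non-reference)": 24_417,
    "high (reference)": 5_148,
}

if __name__ == "__main__":
    print(classify_variant(0.059), classify_variant(0.800), classify_variant(1.0))
    print(sum(REPORTED_COUNTS.values()))  # total number of unique DNA sequences
```

The reported counts sum to 35,811, which agrees with the number of unique DNA sequences retained for downstream analysis.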
There were no examples in which the use of an individual
codon consistently reduced expression to a degree detectable
in our assay. The median syn-GFPSEQ for each set of variants
containing one or more copies of a given codon ranged from
0.97 to 1.00, but the use of broad expression bins limited our
ability to detect differences in GFPFLOW values between 75%
and 100% of the reference GFP. A subset of codons occurred
frequently in low variants (Table S2), suggesting that combined
use of particular codons may dramatically reduce expression.
To identify inhibitory codon pair candidates, we looked for
combinations of adjacent codons enriched in the low-variant
category. We found 293 six-base sequences (non-gapped
6-mers) enriched in the low variants at one or more of the four
possible starting positions of the nine-base insertions (permutation p value ≤ 0.001; Table S3). Most six-base sequences were
enriched at a single position, as might be expected if they form
part of a secondary structure or a recognition motif that also includes
common sequences. However, 28 of these 6-mers
(0.75% of all possible 6-mers without a stop codon) were enriched
at both in-frame start positions (permutation p value ≤
0.001 for each position) and comprised our initial list of inhibitory
codon pair candidates (Figure 1C).
Figure 1. Identification of Six Base Sequences Linked to Low GFP Expression
[Panel graphics not reproduced. Panel A steps: 1. Library of GFP variants (VNN VNN VNN or NNN NNN NNN library insertion; PGAL1,10 driving RFP and GFP); 2. Fluorescence-activated cell sorting (bins: bkgd, 3, 2, 1); 3. High throughput sequencing of bins. Panel C row labels: CUU-CAG, CUU-CGG, GUA-CCG*, GUA-CGG, GUG-CGA*, GUA-CGA*, CGA-GCG*, CGA-CGG*, CGA-CGA*, CGA-CUG*, CGA-CCG*, CUG-GCG, CGA-AUA*, CUG-CCG*, CUG-CGA*, CUC-CCG*, CUU-CUG*, AUA-CGG*, AGG-CGA*, AUA-CGA*, CGA-CAU, AGG-CGG*, CUC-AUA*, CUG-CUG*, CUC-CGG. Panel D pairs: Arg-Pro (CGA-CCG, AGA-CCA), Arg-Ile (CGA-AUA, AGA-AUU), Arg-Arg (AGG-CGG, AGA-AGA).]
(A) Schematic of the method to examine effects of
three randomized codons on superfolder GFP
expression, using the RNA-ID reporter. The FACS
sort of (NNN)3 library 1 is shown.
(B) Distribution of syn-GFPSEQ scores. Variants
were assigned to low- (magenta; n = 1,119), intermediate (gray; n = 5,127), and high- (gold;
n = 24,417, excluding high-expression synonymous references) expression categories.
(C) Significance of 6-mer enrichment in low-expression
variants by 6-mer position (1–4) in
the nine-base variable region (library insertion).
6-mers with at least one p value ≤ 0.001 are
plotted based on hierarchical clustering of positional permutation p values. 57 6-mers are not
plotted due to missing values; this includes 6-mers
that form an in-frame stop codon. 6-mers with a
p value ≤ 0.001 at both in-frame start positions
(1 and 4) are labeled (although CUG-AGG, CUG-AUA*,
and CUU-AGG are not plotted because they
form a stop codon at another position). Candidate
inhibitory pairs that remain enriched in a reduced
structure dataset are indicated with a star.
(D) Flow cytometry scatter plots from six individual
variants; label (GFP*100/RFP).
See also Tables S1, S2, S3, and S4.
Since strong RNA secondary structure in the 5′ end of an open
reading frame (ORF) can reduce translation efficiency in E. coli
and S. cerevisiae (Goodman et al., 2013; Kudla et al., 2009;
Shah et al., 2013; Tuller et al., 2010b), we investigated whether
enrichment of each 6-mer in low-expression variants is explicable
primarily by formation of strong secondary structure. We
identified a reduced-structure subset of
variants as all those with a similar
degree of structure to the majority of
high expression variants, based on both
local and global structure predictions
(Figure S1C; Table S1; Supplemental
Experimental Procedures). We then evaluated
whether each candidate pair remained
enriched among low-expression
variants present in the reduced-structure
subset. We found 20 of the 28 candidates
enriched at in-frame start positions (permutation
p value ≤ 0.055) and revised
our candidate list to include only these
(Figure 1C; Table S4). We conclude that
structure is unlikely to account for most
of the reduced expression by these 20
candidates.
Expression of GFP variants with a candidate inhibitory pair
was substantially reduced, with syn-GFPSEQ medians ranging
from 0.44 to 0.82 (Table S3). Candidate pairs were present in
29% (n = 319) of all low-expression variants. We validated the
inhibitory effects of the 20 inhibitory codon pair candidates by
flow cytometry of individual constructs. For each pair, we assessed
expression due to codon usage by comparing GFPFLOW
of two synonymous variants, one with the inhibitory codon pair
and the other with an optimized pair based on codon adaptation
index (CAI), which scores codons based on their frequency of
use in highly expressed genes (Sharp and Li, 1987). All variants
with an inhibitory pair candidate had lower GFPFLOW, ranging
from 14%–76% that of synonymous optimized variants (Figure 1D; Table S4).
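The positional bookkeeping behind the 6-mer analysis can be made concrete with a short sketch. It enumerates the four 6-mer start positions in a nine-base insert, marks positions 1 and 4 as in-frame, and counts codon pairs without a stop codon. The interpretation that "all possible 6-mers without a stop codon" refers to the 61 × 61 in-frame pairs of sense codons is an assumption; it appears consistent with the quoted 0.75%. Function names are hypothetical, and the permutation test itself is not reproduced.

```python
from itertools import product

STOP_CODONS = {"UAA", "UAG", "UGA"}


def six_mers_by_position(insert):
    """Return {start position (1-based): 6-mer} for a nine-base insert."""
    if len(insert) != 9:
        raise ValueError("expected a nine-base insert")
    return {start + 1: insert[start:start + 6] for start in range(4)}


def is_in_frame(position):
    """Positions 1 and 4 align a 6-mer with two complete adjacent codons."""
    return (position - 1) % 3 == 0


def count_sense_codon_pairs():
    """Count ordered pairs of adjacent codons in which neither codon is a stop."""
    codons = ["".join(c) for c in product("ACGU", repeat=3)]
    sense = [c for c in codons if c not in STOP_CODONS]
    return len(sense) ** 2


if __name__ == "__main__":
    positions = six_mers_by_position("CGACCGGCU")
    for pos, six_mer in positions.items():
        frame = "in-frame" if is_in_frame(pos) else "out-of-frame"
        print(pos, six_mer, frame)
    total_pairs = count_sense_codon_pairs()
    print(total_pairs, round(100 * 28 / total_pairs, 2))
```

Under this reading, 28 candidates out of 3,721 possible sense codon pairs corresponds to about 0.75%, the fraction quoted for the initial candidate list.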
Codon Pairs Mediate Frame-Dependent Inhibition in
Different Sequence Contexts
To assess the likelihood that inhibition is mediated by translation,
we examined the properties of the candidate pairs. If inhibition
were coupled to translation, then the enriched 6-base sequences would likely inhibit expression only when the two
codons were in-frame, but not when out-of-frame, where mechanisms decoupled from translation might explain enrichment.
For the 20 candidates, we compared the syn-GFPSEQ distribution of variants with these candidates at in-frame positions (the six-base sequences starting at positions 1 and 4) (Figure 2A, blue) to that of variants with the candidates at out-of-frame positions (the six-base sequences starting at positions 2 and 3) (Figure 2A, gray); 19 pairs had lower syn-GFPSEQ scores when the codon pairs were in-frame (corrected Wilcoxon p values ≤ 0.006; CUC-AUA not significant). Additionally, we compared syn-GFPSEQ distributions of variants with an inhibitory pair to that of variants with these codons at non-adjacent, in-frame positions (separated) (Figure 2A, purple). If the codon pair, rather than additive effects of single codons, mediates translation inhibition, then we would expect greater inhibition by adjacent codons than by non-adjacent codons. 17 of the 20 candidate pairs had lower syn-GFPSEQ scores when the codons were adjacent (corrected Wilcoxon p values ≤ 0.006; CUC-AUA, CUG-CUG, and CUU-CUG not significant). Thus, inhibition by these 17 pairs is dependent on both frame and adjacent positioning of the two codons.

Figure 2. Adjacent Codons Mediate Frame-Dependent Inhibition
(A) syn-GFPSEQ distribution of variants with each of the 20 inhibitory codon pair candidates. Variants with the indicated codon pair in-frame (blue) are compared to variants with the codon pair out-of-frame (the 6-mer at positions 2 and 3) (gray) and variants with the same two codons in-frame, but separated (purple). Boxplot shows median centerline and edges marking the first and third quartiles. Inhibitory pairs that depend on both frame and adjacent positioning (corrected Wilcoxon p values ≤ 0.006) are indicated with a star. Pairs with Wilcoxon p values > 0.006 are shaded in gray.
(B) The CGA-GCG pair is inhibitory in different contexts. The GFPFLOW ratio from each of three sets of variants is positioned above the corresponding variant in a syn-GFPSEQ boxplot of all variants with the CGA-GCG codon pair (identical to the blue CGA-GCG boxplot in 2A). The GFPFLOW ratio (inhibitory/optimal) is a comparison of GFPFLOW values from two synonymous variants, one with an inhibitory codon pair and the other with an optimized pair.
(C) Inhibitory pairs are effective in Renilla luciferase-GFP (light blue) or GLN4(1-99)-GFP (dark blue). Here, the GFPFLOW ratio compares variants with three copies of an inhibitory pair to synonymous variants with three copies of the optimized pair.
(D) Each codon in the CUC-CCG and CGA-CCG pairs contributes to inhibition. Schematics of the respective reporters (Renilla luciferase-GFP reporters contain three copies of the pair) and the GFPFLOW ratio from each of three sets of variants (inhibitory codon pair, optimized 5′ codon, and optimized 3′ codon) are shown.
See also Figure S2.
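The frame comparison can be made concrete with a short sketch. It assumes a nine-nucleotide insert of three codons, so that the six-base sequences starting at positions 1 and 4 coincide with codon boundaries and those starting at positions 2 and 3 do not. The function name and the example insert are illustrative and are not taken from the study.

```python
def six_mers_by_frame(insert):
    """Split the 6-mers of a 9-nt insert into in-frame and out-of-frame lists."""
    if len(insert) != 9:
        raise ValueError("expected a 9-nucleotide (three-codon) insert")
    # 1-based start positions 1 and 4 coincide with codon boundaries.
    in_frame = [insert[s - 1:s + 5] for s in (1, 4)]
    # 1-based start positions 2 and 3 straddle codon boundaries.
    out_of_frame = [insert[s - 1:s + 5] for s in (2, 3)]
    return in_frame, out_of_frame

# Hypothetical insert in which CGA-GCG occupies the second in-frame position.
in_frame, out_of_frame = six_mers_by_frame("AAACGAGCG")
```

A variant counts toward the in-frame distribution of a pair only when the pair appears in the first list; the same six bases found in the second list count toward the out-of-frame control.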
Because the boxplots revealed a range of syn-GFPSEQ scores from variants with the same inhibitory pair, we used GFPFLOW to obtain higher resolution measurements of relative inhibition by an inhibitory pair in different contexts, including in variants with high syn-GFPSEQ scores. For three variants containing CGA-GCG (Figure 2B) and three containing CGA-CGG (Figure S2A), the ratio of inhibitory to optimal GFPFLOW scores was always less than 0.66. Thus, the codon pair was inhibitory in different contexts, although the magnitude of inhibition varied. This variation could reflect effects from RNA structure or additional sequence context (nucleotide, codon, or amino acid). Additionally, each three-codon insert introduces four codon pairs (including the invariant codons at positions 5 and 9), all of which could affect translation.
If inhibition by codon pairs is a general function of their translation by the ribosome, then the codon pairs should reduce expression when positioned at diverse locations within the coding sequence. However, the magnitude by which codons affect expression can depend on their location relative to the start of translation; for example, CGA codon repeats are more inhibitory near the start of the coding sequence (Letzring et al., 2010; Wolf and Grayhack, 2015). Therefore, we tested whether inhibition occurs at internal locations by inserting three copies of an inhibitory codon pair at amino acid 100 (between an N-terminal GLN4(1-99) domain and GFP) and at amino acid 318 (between Renilla luciferase and GFP) (Letzring et al., 2010; Wolf and Grayhack, 2015); we carried out this test for the 12 pairs with the lowest syn-GFPSEQ medians. In each case, GFPFLOW with the inhibitory pairs was lower than with optimized pairs (from 20%–67%; Figure 2C). We also showed that increasing the copy number of the codon pairs results in greater inhibition (three pairs tested at amino acid 6 and two pairs tested at amino acid 100) (Figure S2B). Thus, each of these inhibitory codon pairs mediates reduced expression either near the start of translation or at internal coding sequence locations.
In two cases, we explicitly demonstrated that each codon in the pair is necessary for inhibition. First, we compared GFPFLOW of a variant with an inhibitory pair (CUC-CCG) to GFPFLOW of variants in which either the 5′ or 3′ codon was replaced with an optimized codon (UUG-CCG, CUC-CCA). The variant with the inhibitory pair showed dramatically lower expression (14% of the optimized variants; Figure 2D). Second, we tested three copies of the CGA-CCG pair at amino acid 100 and obtained similar results (Figure 2D). Thus, the inhibitory codon pairs mediate reduced expression when present in-frame in a coding sequence.

Figure 3. Codon Pair-Mediated Inhibition Affects Translation and Is Suppressed by Particular tRNAs
(A) Wobble decoding is prevalent in the 17 inhibitory codon pairs. Codon pairs (encoded amino acid pairs): AGG-CGA (R-R), AGG-CGG (R-R), AUA-CGA (I-R), AUA-CGG (I-R), CGA-AUA (R-I), CGA-CCG (R-P), CGA-CGA (R-R), CGA-CGG (R-R), CGA-CUG (R-L), CGA-GCG (R-A), CUC-CCG (L-P), CUG-AUA (L-I), CUG-CCG (L-P), CUG-CGA (L-R), GUA-CCG (V-P), GUA-CGA (V-R), GUG-CGA (V-R).
(B) Flow cytometry scatter plots show GFPFLOW from two sets of variants that contain an inhibitory pair (top) or a synonymous optimized pair (bottom), in cells with either an empty vector or a plasmid expressing the indicated tRNA; the non-native exact matching tRNA is also indicated by a star.
(C) The effect on the GFPFLOW ratio of expressing a tRNA that decodes the 3′ codon in an inhibitory pair. Vector, blue; native tRNA, light purple; non-native tRNA, dark purple. Error bars represent SD. NA, not applicable; ND, not determined.
See also Figure S3.
Wobble Decoding Is Central to Inhibition by Codon Pairs

The codon composition of the 17 pairs is consistent with the idea that these pairs inhibit translation and that wobble decoding is central to their effects. All ten codons in these pairs are implicated in poor translation by their infrequent use in highly expressed genes, as measured by CAI (Sharp and Li, 1987). The Arg CGA codon, the only codon in yeast decoded via a purine-purine I•A wobble, is found in more than half of the candidate pairs (Figure 3A). The other nine codons in the 17 pairs include all three codons decoded exclusively by U•G wobble, as well as six codons decoded by low-copy tRNAs (one or two gene copies), three of which are also decoded by a second tRNA via wobble interactions (Johansson et al., 2008) (Figure 3A). Wobble interactions have been implicated in both slow decoding (Lareau et al., 2014; Sørensen and Pedersen, 1991; Stadler and Fire, 2011) and in inefficient translation (Letzring et al., 2010).
To determine if defects in decoding inhibitory codon pairs are responsible for low expression, we evaluated the ability of overexpressed tRNAs to suppress the low GFPFLOW of variants with inhibitory codon pairs. We initially examined suppression by tRNAs that decode the 3′ codon of inhibitory pairs, since the 3′ codon is likely to occupy the ribosomal A site during the inhibitory reaction. The expression defect for variants with 10 of 12 pairs tested was suppressed either by increasing the abundance of a native tRNA or by expressing a non-native tRNA that enables decoding by Watson-Crick base pairing at all three bases (exact matching) (Figures 3B and 3C; not for AGG-CGA or CGA-CGG). Maximal suppression ranged from 1.8- to 7.7-fold increases in GFPFLOW (relative to an empty vector), and suppression was only slightly augmented in the one case tested by co-expression of two tRNAs, one for each codon (Figure S3A). For one of the pairs (CGA-CGG) in which tRNA for the 3′ codon did not suppress, the expression defect was strongly suppressed by a non-native exact matching tRNA that decodes the 5′ codon (see below). Thus, for these 11 tRNA-suppressible pairs, inhibition is due to a translation defect. Furthermore, since the expression defect for AGG-CGA was alleviated by shifting the reading frame (Figure S3B), we infer that inhibition in this case is also likely due to a translational defect.
To evaluate the role of wobble decoding in codon-mediated inhibition, we compared the degree of suppression by 3′ native tRNAs to that of 3′ exact matching, non-native tRNAs. The exact matching tRNA was more effective at suppressing inhibition by the eight tested pairs (substantially so for three pairs with a 3′ Pro CCG codon [Figures 3B and 3C] and four pairs with a 3′ Arg CGA codon, but marginally so for a pair with a 3′ Leu CUG codon [Figure 3C]). Because correcting wobble decoding improved translation more effectively than did increased amounts of the native tRNA, we conclude that I•A and U•G wobble base pairings contribute to inhibition of translation by codon pairs.
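Exact matching means Watson-Crick pairing at all three codon positions, so the anticodon of an exact matching tRNA is the reverse complement of the codon. The minimal sketch below shows this relationship; the function name is illustrative, and base modifications such as inosine are deliberately ignored.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def exact_matching_anticodon(codon):
    """Return the anticodon (5' to 3') that pairs Watson-Crick with all three codon bases."""
    return "".join(COMPLEMENT[base] for base in reversed(codon))

pro_anticodon = exact_matching_anticodon("CCG")  # 3' codon of CUC-CCG
arg_anticodon = exact_matching_anticodon("CGA")  # the Arg codon read by I•A wobble
```

The two results correspond to the non-native tRNAs named in the text: the CGG anticodon for Pro CCG and the UCG anticodon for Arg CGA.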
We also examined the GFP mRNA abundance from six variants with an inhibitory pair. The amount of GFP mRNA from each of these variants with an inhibitory pair was reduced relative to that from a synonymous optimized variant (Figure S3C), as might be expected since many translational defects result in mRNA degradation (Shoemaker and Green, 2012). For a variant with CUC-CCG, expression of the non-native, exact matching tRNAPro(CGG) suppressed both mRNA and GFP defects (Figure S3D), illustrating a link between mRNA and translation efficiency for this variant with an inhibitory pair.
Inhibition by Codon Pairs Implicates Interactions
between Sites in the Ribosome
Since the codons had to be adjacent for low expression mediated by codon pairs, we considered that these pairs might act in a concerted manner to mediate inhibition, with each codon in the pair playing a unique role in the inhibitory effect and occupying a specific position in the ribosome. If inhibition occurs when the 3′ codon enters or occupies the ribosomal A site, then the 5′ codon in the pair would occupy the P site. In this case, overproduction of a native tRNA that decodes the 5′ codon would not be expected to suppress inhibition by the codon pair. In testing ten pairs (excluding CGA-CGA, which has identical codons, and AGG-CGA, which is not tRNA suppressible), we found that increased expression of a native tRNA corresponding to the 5′ codon had no significant effect on inhibition by eight pairs, and only marginally suppressed inhibition by GUA-CGA (Figure 4A). For CUC-CCG, overproduction of the single copy native tRNALeu(GAG) substantially suppressed inhibition, but overproduction of a different native tRNA (tRNALeu(UAG)) that competes to decode the same codon by wobble interactions increased inhibition (Figure 4A). Thus, reducing the use of wobble decoding for the P site codon improved expression.
We demonstrated that charged tRNAArg(ICG) was increased 9-fold when the tRNA was expressed from a high copy 2μ plasmid (Figures 4A and 4B), even though this overproduction resulted in no suppression of inhibitory pairs with a 5′ CGA. Furthermore, the use of an even higher copy leu2-d 2μ plasmid (Beggs, 1978) resulted in a 20-fold increase in charged tRNAArg(ICG), but still no detectable suppression of these pairs (Figures 4B and 4C). By contrast, the non-native exact matching tRNAArg(UCG) suppressed expression defects (Figure 4A), and, thus, translation of the 5′ codons is also central to inhibition by codon pairs. These results are consistent with the idea that codon-anticodon interactions in the P site affect A-site decoding.
If the position of each codon in the ribosome is critical for codon pair-mediated inhibition, then the order of codons in an inhibitory codon pair should be important for inhibition. Of the 17 pairs identified in this study, the CGA-CGA pair is composed of identical codons and two sets of pairs are inhibitory with the codons in either order (AUA-CGA, CGA-AUA and CUG-CGA, CGA-CUG), leaving 12 inhibitory pairs with a single order of codons. For each of these 12, we compared the GFP expression of variants with the inhibitory pair to those with its reverse pair (i.e., with the two codons in reverse order). In each case, variants with the inhibitory pair tended to have lower syn-GFPSEQ scores than variants with the reverse pair (corrected Wilcoxon p value ≤ 0.006; Figure 4D). Similarly, two variants with the inhibitory pair CUC-CCG had low GFPFLOW relative to a synonymous variant with the optimized pair, whereas two variants with the reverse pair, CCG-CUC, had high GFPFLOW (Figure 4E). Thus, the idea that the 5′ codon in an inhibitory pair has a role distinct from the 3′ codon is supported by failure of overproduced native tRNAs that decode the 5′ codon to suppress inhibition (eight pairs), as well as by the dependence of GFP inhibition on the order of codons (12 pairs). We conclude that inhibition by most codon pairs is likely to involve interactions between tRNAs at adjacent sites in the ribosome.
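The count of 12 single-order pairs follows directly from the list of 17 inhibitory pairs. The sketch below reproduces that bookkeeping; the pair list is the one given in the text and Figure 3A, while the function names are illustrative.

```python
INHIBITORY_PAIRS = [
    "AGG-CGA", "AGG-CGG", "AUA-CGA", "AUA-CGG", "CGA-AUA", "CGA-CCG",
    "CGA-CGA", "CGA-CGG", "CGA-CUG", "CGA-GCG", "CUC-CCG", "CUG-AUA",
    "CUG-CCG", "CUG-CGA", "GUA-CCG", "GUA-CGA", "GUG-CGA",
]

def reverse_pair(pair):
    """Swap the 5' and 3' codons of a pair written as 'XXX-YYY'."""
    five_prime, three_prime = pair.split("-")
    return f"{three_prime}-{five_prime}"

def single_order_pairs(pairs):
    """Keep pairs whose reverse order is not itself in the inhibitory list."""
    known = set(pairs)
    return [p for p in pairs if reverse_pair(p) not in known]

single_order = single_order_pairs(INHIBITORY_PAIRS)
```

CGA-CGA is dropped because it is its own reverse, and the AUA/CGA and CUG/CGA pairs are dropped because both orders are inhibitory, which leaves the 12 pairs used for the reverse-order comparison.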
12 Inhibitory Codon Pairs Have Elevated Ribosome
Occupancies on Yeast Gene Transcripts
To assess the potential influence of inhibitory pairs on translation
of yeast genes, we examined the overall expression of genes
containing these pairs. The pairs occur 2,922 times in 1,868
genes (31.6% of the 5,917 yeast genes), including 2–8 occurrences in 659 genes (Engel et al., 2014). Consistent with the 17
inhibitory pairs having a negative impact on translation efficiency
and as expected from the low CAI of their individual codons,
these pairs predominantly occur in yeast genes with low to moderate expression (based on mRNA levels) (Figure 5A). Many
ORFs with an inhibitory pair tend to have reduced protein (Kulak
et al., 2014) per mRNA transcript (Presnyak et al., 2015)
compared to other ORFs within a similar CAI range (corrected t test p value ≤ 0.01; Figure 5B) or to ORFs with a reverse-order pair (corrected t test p value = 6.34 × 10^-5 for group of 12 pairs; Figure 5C).

Figure 4. Inhibition Depends on Codon Order and Pair Effect
(A) The effect on the GFPFLOW ratio of expressing a tRNA that decodes the 5′ codon of an inhibitory pair. Vector, blue; native tRNA, light orange; non-native tRNA, dark orange. Leu CUC is decoded by two native tRNAs; the exact matching tRNA is indicated by a star. Error bars represent SD. NA, not applicable. CGA-CGA data are also shown in Figure 3C.
(B) Charged tRNAArg(ICG) levels increase when tRNAArg(ICG) is expressed from either a 2μ or 2μ leu2-d vector, as measured with an acidic northern blot probed for tRNAArg(ICG) and tRNAPhe(GAA). Charged tRNA (black arrow) and uncharged tRNA (gray arrow) are indicated.
(C) Effects of increasing native tRNAArg(ICG) by expression from a 2μ leu2-d vector on the GFPFLOW ratio from each of eight sets of variants (leu2-d vector, blue; tRNAArg(ICG), gray). Error bars represent SD.
(D) syn-GFPSEQ distribution of variants with an inhibitory pair (blue) compared to that of variants with the same pair of codons in reverse order (pink). Distributions are plotted for the 12 pairs for which only a single order of the codons is present in the list of inhibitory pairs. Boxplot edges mark the first and third quartiles. Stars indicate a corrected Wilcoxon p value ≤ 0.006. Blue boxplots are identical to those in Figure 2A.
(E) Inhibition by the CUC-CCG pair depends on the order of the codons. The GFPFLOW ratio for two sets of variants, each with an inhibitory pair (blue) and the reverse pair (pink), is shown. Error bars represent SD.
See also Table S1.

Figure 5. Inhibitory Pairs Occur in Genes with Both Low Expression and Translation Efficiency
(A) Proportion of S. cerevisiae ORFs with at least one of the 17 inhibitory pairs (left) and the proportion of these ORFs present in each mRNA abundance quartile (right), as based on steady state, total mRNA (Presnyak et al., 2015). Q1 indicates the bottom 25% of S. cerevisiae transcript abundance. Panel labels: inhibitory pairs, n = 1,868 (32%); no inhibitory pairs, n = 4,049 (68%); Q1, n = 621; Q2, n = 713; Q3, n = 398; Q4, n = 98; N.D., n = 38.
(B) Estimated translation efficiency distribution (protein abundance [Kulak et al., 2014] normalized to mRNA [Presnyak et al., 2015]) for ORFs with at least one inhibitory pair (blue) or no inhibitory pair (gray) and grouped by CAI. CAI bins, labeled by their lower CAI limit, are 0.025 in size. Stars indicate a corrected t test p value ≤ 0.01 (*) or ≤ 3.69 × 10^-9 (***).
(C) Estimated translation efficiency distribution for ORFs with at least one of 12 inhibitory pairs for which the reverse-order pair was not in the inhibitory list (blue) and ORFs with at least one of the reverse-order pairs (pink). ORFs with both inhibitory and reverse-order pairs were excluded from the analysis. Stars indicate a corrected t test p value ≤ 6.34 × 10^-5.

We proceeded to investigate whether translation elongation slows at inhibitory codon pairs in yeast transcripts by examining existing yeast ribosome profiling data, in which ribosome locations and the identities of codons in the ribosome are inferred from the sequences of ribosome-protected mRNA fragments (footprints). Relative translation rates are inferred from the density of footprints, with positions of reduced translation speed yielding higher densities. Since cells treated with cycloheximide were recently shown to have altered footprint distributions (Hussmann et al., 2015), we evaluated a yeast experiment carried out without cycloheximide (Jan et al., 2014).
To assess each codon pair for evidence of reduced translation rates, we evaluated the overall footprint count at codon pairs relative to neighboring codon positions. For each pair, we located all of its sites in yeast ORFs and aligned windows of up to 100-codon positions, with the pair at the center of each window. At each of the aligned positions, we calculated ribosome occupancy by summing footprint counts across ORFs and normalizing to total counts from all positions. Based on an even distribution of footprints, we would expect baseline occupancy of about 0.01 at each position. Occupancies at those positions with the pair in ribosomal sites tended to be higher than baseline occupancies at surrounding codon positions (Figure 6A). By combining occupancies from two positions (with the pair in the ribosomal P, A sites and E, P sites), we obtained the pair's cumulative ribosome occupancy. We applied the same approach to calculate cumulative ribosome occupancies for dipeptides and individual codons.
Consistent with some previous reports (Artieri and Fraser, 2014; Hussmann et al., 2015; Lareau et al., 2014), Pro-Pro sites had the highest cumulative occupancy of sites for a dipeptide (0.04), and the codons CGA, CCG, and CGG had the highest cumulative occupancies for individual codons (0.04 to 0.05). Inhibitory pairs also had elevated cumulative occupancies. For each inhibitory pair, we evaluated the significance of its cumulative ribosome occupancy by comparison to 10,000 permutations of the footprint counts in each ORF and found that all 17 inhibitory pairs had higher occupancy than expected by chance (corrected permutation p value < 0.009), given ORF coverage and footprint distributions. Twelve inhibitory pairs had cumulative occupancies in the top 0.6% of all codon pairs (more than three SDs above the mean; Figure 6B). In particular, the four pairs with the strongest inhibitory effects in the GFP assay were among the five pairs with the highest occupancies (Figure 6B). Thus, inhibitory pairs tended to be translated slowly, and pairs with some of the highest occupancies also showed the greatest inhibition of GFP expression.
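The occupancy calculation described above (sum footprint counts at each aligned window position across sites, normalize to the window total, then add the two positions at which the pair sits in the P, A and E, P sites) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the five-position toy windows, and the indices chosen for the two ribosomal positions are all hypothetical.

```python
def window_occupancy(windows):
    """Sum footprint counts per aligned position and normalize to the window total."""
    width = len(windows[0])
    summed = [sum(w[i] for w in windows) for i in range(width)]
    total = sum(summed)
    return [count / total for count in summed]

def cumulative_occupancy(occupancy, pa_index, ep_index):
    """Add the occupancies at the positions with the pair in the P, A and E, P sites."""
    return occupancy[pa_index] + occupancy[ep_index]

# Toy footprint counts for two aligned sites of one pair (hypothetical data).
toy_windows = [
    [1, 1, 4, 3, 1],
    [1, 1, 2, 1, 1],
]
occupancy = window_occupancy(toy_windows)
cumulative = cumulative_occupancy(occupancy, pa_index=2, ep_index=3)
baseline = 1 / len(occupancy)  # even-distribution expectation per position
```

With five positions the even-distribution baseline is 0.2; with windows of about 100 positions it becomes the 0.01 quoted in the text.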
To assess the pairs' effects on ribosome occupancy, relative to potential individual codon and dipeptide effects, we first ranked synonymous codon pairs by each pair's cumulative occupancy. Inhibitory codon pairs tended to have some of the highest occupancies among codon pairs specifying a given dipeptide (Figure 6C). We also carried out direct comparisons between synonymous pairs using a Fisher's exact test on the footprint counts at each pair and its surrounding codon positions. We compared each inhibitory pair to two other pairs: a synonymous pair with the same 5′ codon, but an optimized 3′ codon, and a synonymous pair with the same 3′ codon, but an optimized 5′ codon. For 12 inhibitory pairs, the proportion of footprints at the inhibitory pair was higher than for each synonymous comparison (corrected Fisher's exact p value ≤ 6.79 × 10^-8; Table S5). In analyzing a separate ribosome profiling dataset by Lareau et al. (2014), we found ten of these inhibitory pairs also reached significance in synonymous comparisons (corrected Fisher's exact p value ≤ 0.002). We conclude that ribosomes tend to translate through 12 of 17 inhibitory codon pairs more slowly than through either of the individual codons across matching dipeptide sites.
We also evaluated the impact of codon order on ribosome occupancy. For 10 of the 11 pairs that differed from synonymous pairs (excluding the CGA-CGA pair), the proportion of footprints at the inhibitory pair was significantly higher than for the reverse pair (corrected Fisher's exact p value ≤ 4.63 × 10^-32; Figure 6D). Thus, we conclude that slower translation of these inhibitory pairs is, in large part, due to codon pair effects, rather than simply the result of sequential, individual codon effects.
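The synonymous and reverse-order comparisons both rest on Fisher's exact test applied to footprint counts. The sketch below implements a one-sided version from the hypergeometric distribution using only the standard library. The 2x2 layout (footprints at the pair versus at surrounding positions, for the inhibitory pair versus the comparison pair) is an assumption about how the counts were arranged, and the function name is illustrative.

```python
from math import comb

def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact p value for the 2x2 table [[a, b], [c, d]].

    Assumed layout: a, b = footprints at the inhibitory pair and at its
    surrounding positions; c, d = the same counts for the comparison pair.
    The p value is the probability of seeing a or more in the top-left cell
    with all margins fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / comb(n, col1)

# Small worked table with a known answer: (16 + 1) / 70.
p_value = fisher_exact_greater(3, 1, 1, 3)
```

For real footprint counts, which can be very large, an established statistics library would be the more practical choice; this version is meant only to show what the test computes.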
DISCUSSION
We establish that codon pairs affect translation elongation and
translation efficiency in yeast in a manner distinct from the effects of their individual constituent codons. For 17 inhibitory
codon pairs, we show that it is the pair, rather than the six-base sequence, the two individual codons, or the encoded
dipeptide, that is responsible for inhibition. GFP variants containing an inhibitory pair had significantly lower expression than variants in which the same six-base sequence was out of frame, the
two codons were present but separated, or one of the codons of
the pair was instead an optimal codon. We demonstrate that the
inhibition occurs during translation by suppressing it with overexpressed tRNA (11/12 pairs tested). Codon pair effects are
distinguished from individual codon effects by two findings reported here. First, the order of codons in the pair was required
for inhibition (for 12 of the 17 pairs). Second, translation rates
of many inhibitory pairs were slower (based on ribosome occupancies) than the rates of pairs encoding the same dipeptide
or of pairs with the reverse codon order. These findings implicate
interplay between adjacent ribosomal sites in codon pair-mediated inhibition.
Figure 6. Inhibitory Codon Pairs in Yeast Gene Transcripts Have Elevated Ribosome Occupancies
(A) Examples of ribosome occupancy for an inhibitory pair and surrounding baseline positions. At codon distance 0, the inhibitory pair is positioned in the P and A sites of the ribosome. Occupancy at each position is the sum of footprints across aligned ORFs and normalized to total footprints from all window positions.
(B) Median syn-GFPSEQ of variants with a given pair versus cumulative ribosome occupancy for two positions (with the pair in the P, A sites and E, P sites). Horizontal lines represent the mean occupancy of all codon pairs and 2 or 3 SDs above the mean (as indicated).
(C) Ranking of synonymous codon pairs by their cumulative ribosome occupancy (at P, A and E, P positions). Black dots below the bars indicate synonymous pairs used in Fisher's exact comparisons because they have a CAI-optimal codon and a 5′ or 3′ codon identical to one of the inhibitory pairs.
(D) Ribosome occupancy by position in the ribosome for inhibitory pairs (blue) and pairs with the reverse codon order (pink). Panels on the right show two sets of pairs, for which both codon orders were identified as an inhibitory codon pair. The black line indicates expected occupancy (0.01), based on an even distribution of footprints. Stars indicate inhibitory pairs with higher cumulative occupancy at the P, A-site and E, P-site positions compared to the reverse pair (one-sided Fisher's exact corrected p value ≤ 4.63 × 10^-32).

Codon-anticodon interactions at both the 5′ and 3′ codon play a major role in inhibition, as illustrated by three lines of evidence. First, most inhibitory pairs (15/17) have a codon that relies on wobble decoding, while synonymous pairs with codons that are decoded by the same tRNA species (but via Watson-Crick base pairing) were not inhibitory. Moreover, in 12 of the 17 inhibitory pairs, the 3′ codon is decoded by an abundant wobble decoding tRNA (encoded by 3, 5, 6 and 10 tRNA gene copies, with gene copy number strongly correlating with abundance [Tuller et al., 2010a]). Second, non-native exact matching tRNAs that decode the 3′ codons suppressed inhibition much more than did increased amounts of native, wobble decoding tRNAs (seven pairs). Third, exact matching tRNAs that decode the 5′ codons suppressed inhibition in some cases while overproduction of native wobble decoding tRNA did not. Wobble decoding has been implicated in both slow and inefficient decoding of individual codons (Lareau et al., 2014; Letzring et al., 2010; Sørensen and Pedersen, 1991; Stadler and Fire, 2011). Our findings are consistent with a model in which wobble decoding, rather than limited quantities of tRNA, is central to codon pair-mediated inhibition.
The ribosome is a highly coordinated machine with communication between tRNAs in the A, P, and E sites mediated by numerous protein and rRNA contacts (Demeshkina et al., 2010). It is well established that tRNA:codon interactions at the P site affect A site interactions, as during programmed frameshifting (Atkins and Björk, 2009) and in a post-peptide bond quality control mechanism in E. coli (Zaher and Green, 2009). However, it has not been appreciated that communication between tRNAs at adjacent sites plays a general role in regulating the rate and efficiency of translation elongation. Concerted effects of adjacent codons could occur at several steps in the elongation reaction, e.g., tRNA accommodation, formation of the hybrid state, translocation, or tRNA exit. Furthermore, inhibition mediated by different pairs may work by distinct mechanisms, since pairs differ with respect to their translation rate, dependence on codon order, requirement for wobble decoding, and even in suppression by overproduction of a native exact matching 3′ tRNA. However, we infer that acceptance of the 3′ codon into the A site is likely limiting for many of the identified pairs, since overproduction of the native 3′ tRNA frequently improved GFP expression. Thus, inhibitory effects depend on a complex interplay of the interactions between adjacent sites in the ribosome, codon-anticodon interactions, and acceptance of a codon into the A site.
That pairs of codons modulate translation efficiency may, in
part, explain why the effects of synonymous codons on translation efficiency have remained baffling (Plotkin and Kudla,
2011). Although several previous studies implicated codon
pairs in translation efficiency (Chevance et al., 2014; Coleman
et al., 2008; Gutman and Hatfield, 1989; Letzring et al., 2010),
most work has focused on the roles of individual codons (Plotkin and Kudla, 2011), with papers on codons outnumbering
papers on codon pairs or adjacent codons 175:1 (PubMed
citations of title and abstract). The prevailing model has been
that codons influence elongation efficiency primarily through
the small, additive effects of individual codons (Plotkin and Kudla, 2011) and indeed individual effects of some codons are
apparent (Hussmann et al., 2015; Lareau et al., 2014; Sørensen
and Pedersen, 1991; Stadler and Fire, 2011). However, we
observed that the effects of an individual codon differed
considerably depending on which other codons it was paired
with. For example, eight CGA-NNN codon pairs had synGFPSEQ medians between 0.44 and 0.73, while the remaining
53 such pairs had medians >0.91. Moreover, the existence of
strong inhibitory pairs calls into question the idea that many,
individually small events sum to a substantial effect on expression. Instead, a few inhibitory codon pairs may act as discrete
regulatory signals and could be as strongly selected as miRNA
recognition sequences.
Inhibitory codon pairs in yeast, and potentially in other
organisms, may have broad effects on translation efficiency,
protein folding, and mRNA decay. Understanding the mechanisms by which inhibitory codon pairs impact translation
is essential to predict the functional implications of codon
composition.
Library Construction, FACS, and Flow Cytometry
Construction and transformation of (NNN)3 and (VNN)3 libraries of GFP variants in the RNA-ID reporter were performed as described previously (Dean
and Grayhack, 2012). Growth of each library, fluorescence-activated cell
sorting of 39.5 million cells from each library, and analysis of individual
variants were performed as described previously (Dean and Grayhack,
2012), with differences noted in the Supplemental Experimental Procedures.
Oligonucleotides and plasmids employed in this work are listed in Tables S6
and S7.
Sequencing of GFP Three-Codon Insertions
From genomic DNA samples, we amplified GFP library fragments through 25
PCR cycles, using primers specific to the flanking regions and containing a
FACS bin-specific index (Table S6). We then pooled the amplified fragments
and sequenced on an Illumina GAII sequencer with single-end reads. For quality
control, we required each read to have accurately called six bases
(AACGCA) immediately downstream of our variable region and for each of
the nine variable base calls to have a score of Q30 or better. To compare
read counts across bins, we corrected for the number of cell sorting events
in a given bin. See additional filtering and scoring details in the Supplemental
Experimental Procedures.
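The read filter described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the `var_start` offset, and the exact form of the per-bin correction (read fraction scaled by sort events) are assumptions made for the example.

```python
ANCHOR = "AACGCA"    # constant bases expected immediately 3' of the variable region
VARIABLE_LEN = 9     # three inserted codons
MIN_PHRED = 30       # Q30 threshold from the text


def passes_qc(seq, quals, var_start):
    """True if the anchor matches exactly and all nine variable bases are >= Q30.

    var_start is a hypothetical 0-based offset of the variable region in the read.
    """
    var_end = var_start + VARIABLE_LEN
    if len(seq) < var_end + len(ANCHOR) or len(quals) < var_end:
        return False
    if seq[var_end:var_end + len(ANCHOR)] != ANCHOR:
        return False
    return all(q >= MIN_PHRED for q in quals[var_start:var_end])


def cells_per_bin(variant_reads, total_reads, sort_events):
    """Assumed correction: scale a variant's read fraction in each bin by that bin's sort events."""
    return [v / t * s if t else 0.0
            for v, t, s in zip(variant_reads, total_reads, sort_events)]
```

Scaling by sort events is only one plausible way to make read counts comparable across FACS bins; the actual scoring is described in the Supplemental Experimental Procedures.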
Analysis of Ribosome Profiling Datasets
We analyzed a whole-cell ribosome profiling sample with no cycloheximide
treatment from Jan et al. (2014). A-site codon footprint tallies were provided
by Jeff Hussmann as described in Hussmann et al. (2015). See additional details
in the Supplemental Experimental Procedures.
Acidic Northern Blot Analysis
Bulk RNA, prepared from 3 OD pellets, was resolved on 6.5% acrylamide
gels at pH 5 as described previously (Alexandrov et al., 2006).
Explanation of the Statistical Methods
To assess the significance of each 6-mer sequence's enrichment in low GFP
variants, we tracked occurrences of the 6-mer in low variants across
100,000 permutations. Variants were assigned to one of ten pools based on
GC count, and we shuffled the expression categories within each pool. From
this analysis we derived p values for the frequency of each 6-mer in low variants, based on the probability of obtaining as many, or more, low-variant
counts by chance.
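The GC-stratified permutation scheme in the preceding paragraph might look like the sketch below. The data layout (tuples of sequence, GC pool, and low/not-low label), the function name, and the default permutation count are assumptions for illustration; the p value is taken as the plain fraction of permutations with as many or more low-variant occurrences.

```python
import random
from collections import defaultdict


def stratified_permutation_p(variants, kmer, n_perm=1000, seed=0):
    """variants: list of (sequence, gc_pool, is_low) tuples (hypothetical layout).

    Returns the fraction of permutations in which the k-mer occurs in at least
    as many low variants as observed, shuffling labels only within each GC pool.
    """
    rng = random.Random(seed)
    has_kmer = [kmer in seq for seq, _, _ in variants]
    labels = [is_low for _, _, is_low in variants]
    pools = defaultdict(list)
    for i, (_, pool, _) in enumerate(variants):
        pools[pool].append(i)

    def low_count(lab):
        return sum(1 for h, low in zip(has_kmer, lab) if h and low)

    observed = low_count(labels)
    hits = 0
    for _ in range(n_perm):
        shuffled = labels[:]
        for idx in pools.values():
            perm = [labels[i] for i in idx]
            rng.shuffle(perm)          # expression categories stay inside their GC pool
            for i, low in zip(idx, perm):
                shuffled[i] = low
        if low_count(shuffled) >= observed:
            hits += 1
    return hits / n_perm
```

Shuffling within pools keeps the GC composition of the "low" class fixed, so a 6-mer is not scored as enriched merely because GC-rich variants tend to express poorly.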
We also estimated the chance probabilities of footprint densities. For each
pair, we carried out 10,000 permutations, in which we shuffled the A-site
codon footprint counts within each ORF with the pair and recalculated footprint density at ribosomal site positions. To directly evaluate the significance
of differences between synonymous pairs, we performed one-sided Fisher's
exact tests on 2 × 2 contingency tables with the footprint counts for each
pair at ribosome site positions and at the remainder of codon positions within
a 100-codon window. See additional details in the Supplemental Experimental
Procedures.
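For readers unfamiliar with the test, a one-sided Fisher's exact test on a 2 × 2 table reduces to a hypergeometric tail sum. The sketch below assumes rows are the two synonymous pairs and columns are footprint counts at the ribosome site positions versus the rest of the window; the function name and that table orientation are illustrative assumptions.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """P(X >= a) for the table [[a, b], [c, d]] with all margins held fixed.

    a, b: counts for pair 1 (site positions, remainder of window)
    c, d: counts for pair 2 (site positions, remainder of window)
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, col1)
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1))
    return tail / total
```

A small p value here would indicate that pair 1 carries more footprints at the site positions, relative to its surrounding window, than expected if both pairs shared the same distribution.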
three figures, and seven tables and can be found with this article online at
C.E.G., C.E.B., S.F., and E.J.G. wrote the manuscript. C.E.G. and C.E.B. acquired data and performed the computational and experimental analyses,
respectively; K.M.D. identified tRNA suppressible variants from a pilot screen;
and E.J.G. and S.F. supervised the work.
ACKNOWLEDGMENTS
We thank Eric Phizicky, Andrew Wolf, Scott Butler, Gloria Culver, Adam Geballe, David Mathews, David Morris, and Yi-Tao Yu for discussions and comments on the manuscript, Jeffrey Hussmann for assistance with ribosome
profiling data, Josh Hatfield, Shannon Schmitt, Blake Bentley, and Erin Eidschun for assistance with experiments, XiaoJu Zhang and David Mathews
for initial help analyzing codon use in the yeast genome, the URMC Flow Cytometry Resource and NCCR (1S10RR029229901) for technical support. This
work was supported by an NSF grant (MCB-1329545) (to E.J.G.) and an NIH
grant (1P41 GM103533) (to S.F.). C.E.B. was also supported by an NIH T32
Training Grant (GM068411), and C.E.G. was supported by an NSF Fellowship.
S.F. is an investigator of the Howard Hughes Medical Institute.
Received: December 18, 2015
REFERENCES
Alexandrov, A., Chernyakov, I., Gu, W., Hiley, S.L., Hughes, T.R., Grayhack, E.J., and Phizicky, E.M. (2006). Rapid tRNA decay can result from lack of nonessential modifications. Mol. Cell 21, 87–96.
Artieri, C.G., and Fraser, H.B. (2014). Accounting for biases in riboprofiling data indicates a major role for proline in stalling translation. Genome Res. 24, 2011–2021.
Atkins, J.F., and Bjork, G.R. (2009). A gripping tale of ribosomal frameshifting: extragenic suppressors of frameshift mutations spotlight P-site realignment. Microbiol. Mol. Biol. Rev. 73, 178–210.
Beggs, J.D. (1978). Transformation of yeast by a replicating hybrid plasmid. Nature 275, 104–109.
Boycheva, S., Chkodrov, G., and Ivanov, I. (2003). Codon pairs in the genome of Escherichia coli. Bioinformatics 19, 987–998.
Burgess-Brown, N.A., Sharma, S., Sobott, F., Loenarz, C., Oppermann, U., and Gileadi, O. (2008). Codon optimization can improve expression of human genes in Escherichia coli: a multi-gene study. Protein Expr. Purif. 59, 94–102.
Chevance, F.F., Le Guyon, S., and Hughes, K.T. (2014). The effects of codon context on in vivo translation speed. PLoS Genet. 10, e1004392.
Chu, D., Kazana, E., Bellanger, N., Singh, T., Tuite, M.F., and von der Haar, T. (2014). Translation elongation can control translation initiation on eukaryotic mRNAs. EMBO J. 33, 21–34.
Coleman, J.R., Papamichail, D., Skiena, S., Futcher, B., Wimmer, E., and Mueller, S. (2008). Virus attenuation by genome-scale changes in codon pair bias. Science 320, 1784–1787.
Curran, J.F., and Yarus, M. (1989). Rates of aminoacyl-tRNA selection at 29 sense codons in vivo. J. Mol. Biol. 209, 65–77.
Dean, K.M., and Grayhack, E.J. (2012). RNA-ID, a highly sensitive and robust method to identify cis-regulatory sequences using superfolder GFP and a fluorescence-based assay. RNA 18, 2335–2344.
Demeshkina, N., Jenner, L., Yusupova, G., and Yusupov, M. (2010). Interactions of the ribosome with mRNA and tRNA. Curr. Opin. Struct. Biol. 20, 325–332.
dos Reis, M., Savva, R., and Wernisch, L. (2004). Solving the riddle of codon usage preferences: a test for translational selection. Nucleic Acids Res. 32, 5036–5044.
Engel, S.R., Dietrich, F.S., Fisk, D.G., Binkley, G., Balakrishnan, R., Costanzo, M.C., Dwight, S.S., Hitz, B.C., Karra, K., Nash, R.S., et al. (2014). The reference genome sequence of Saccharomyces cerevisiae: then and now. G3 (Bethesda) 4, 389–398.
Fedorov, A., Saxonov, S., and Gilbert, W. (2002). Regularities of context-dependent codon bias in eukaryotic genes. Nucleic Acids Res. 30, 1192–1197.
Gardin, J., Yeasmin, R., Yurovsky, A., Cai, Y., Skiena, S., and Futcher, B. (2014). Measurement of average decoding rates of the 61 sense codons in vivo. eLife 3, e03735.
Gingold, H., and Pilpel, Y. (2011). Determinants of translation efficiency and accuracy. Mol. Syst. Biol. 7, 481.
Gingold, H., Tehler, D., Christoffersen, N.R., Nielsen, M.M., Asmar, F., Kooistra, S.M., Christophersen, N.S., Christensen, L.L., Borre, M., Sørensen, K.D., et al. (2014). A dual program for translation regulation in cellular proliferation and differentiation. Cell 158, 1281–1292.
Goodman, D.B., Church, G.M., and Kosuri, S. (2013). Causes and effects of N-terminal codon bias in bacterial genes. Science 342, 475–479.
Gutierrez, E., Shin, B.S., Woolstenhulme, C.J., Kim, J.R., Saini, P., Buskirk, A.R., and Dever, T.E. (2013). eIF5A promotes translation of polyproline motifs. Mol. Cell 51, 35–45.
Gutman, G.A., and Hatfield, G.W. (1989). Nonrandom utilization of codon pairs in Escherichia coli. Proc. Natl. Acad. Sci. USA 86, 3699–3703.
Hussmann, J.A., Patchett, S., Johnson, A., Sawyer, S., and Press, W.H. (2015). Understanding biases in ribosome profiling experiments reveals signatures of translation dynamics in yeast. PLoS Genet. 11, e1005732.
Ingolia, N.T., Ghaemmaghami, S., Newman, J.R., and Weissman, J.S. (2009). Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science 324, 218–223.
Jan, C.H., Williams, C.C., and Weissman, J.S. (2014). Principles of ER cotranslational translocation revealed by proximity-specific ribosome profiling. Science 346, 748–751.
Johansson, M.J., Esberg, A., Huang, B., Bjork, G.R., and Bystrom, A.S. (2008). Eukaryotic wobble uridine modifications promote a functionally redundant decoding system. Mol. Cell. Biol. 28, 3301–3312.
Kudla, G., Murray, A.W., Tollervey, D., and Plotkin, J.B. (2009). Coding-sequence determinants of gene expression in Escherichia coli. Science 324, 255–258.
Kulak, N.A., Pichler, G., Paron, I., Nagaraj, N., and Mann, M. (2014). Minimal, encapsulated proteomic-sample processing applied to copy-number estimation in eukaryotic cells. Nat. Methods 11, 319–324.
Lareau, L.F., Hite, D.H., Hogan, G.J., and Brown, P.O. (2014). Distinct stages of the translation elongation cycle revealed by sequencing ribosome-protected mRNA fragments. eLife 3, e01257.
Letzring, D.P., Dean, K.M., and Grayhack, E.J. (2010). Control of translation efficiency in yeast by codon-anticodon interactions. RNA 16, 2516–2528.
Letzring, D.P., Wolf, A.S., Brule, C.E., and Grayhack, E.J. (2013). Translation of CGA codon repeats in yeast involves quality control components and ribosomal protein L1. RNA 19, 1208–1217.
Moura, G., Pinheiro, M., Silva, R., Miranda, I., Afreixo, V., Dias, G., Freitas, A., Oliveira, J.L., and Santos, M.A. (2005). Comparative context analysis of codon pairs on an ORFeome scale. Genome Biol. 6, R28.
Pechmann, S., and Frydman, J. (2013). Evolutionary conservation of codon optimality reveals hidden signatures of cotranslational folding. Nat. Struct. Mol. Biol. 20, 237–243.
Pedelacq, J.D., Cabantous, S., Tran, T., Terwilliger, T.C., and Waldo, G.S. (2006). Engineering and characterization of a superfolder green fluorescent protein. Nat. Biotechnol. 24, 79–88.
Pedersen, S. (1984). Escherichia coli ribosomes translate in vivo with variable rate. EMBO J. 3, 2895–2898.
Plotkin, J.B., and Kudla, G. (2011). Synonymous but not the same: the causes and consequences of codon bias. Nat. Rev. Genet. 12, 32–42.
Pop, C., Rouskin, S., Ingolia, N.T., Han, L., Phizicky, E.M., Weissman, J.S., and Koller, D. (2014). Causal signals between codon bias, mRNA structure, and the efficiency of translation and elongation. Mol. Syst. Biol. 10, 770.
Presnyak, V., Alhusaini, N., Chen, Y.H., Martin, S., Morris, N., Kline, N., Olson, S., Weinberg, D., Baker, K.E., Graveley, B.R., and Coller, J. (2015). Codon optimality is a major determinant of mRNA stability. Cell 160, 1111–1124.
Shah, P., Ding, Y., Niemczyk, M., Kudla, G., and Plotkin, J.B. (2013). Rate-limiting steps in yeast protein translation. Cell 153, 1589–1601.
Sharp, P.M., and Li, W.H. (1987). The codon Adaptation Index--a measure of directional synonymous codon usage bias, and its potential applications. Nucleic Acids Res. 15, 1281–1295.
Shoemaker, C.J., and Green, R. (2012). Translation drives mRNA quality control. Nat. Struct. Mol. Biol. 19, 594–601.
Sørensen, M.A., and Pedersen, S. (1991). Absolute in vivo translation rates of individual codons in Escherichia coli. The two glutamic acid codons GAA and GAG are translated with a threefold difference in rate. J. Mol. Biol. 222, 265–280.
Stadler, M., and Fire, A. (2011). Wobble base-pairing slows in vivo translation elongation in metazoans. RNA 17, 2063–2073.
Subramaniam, A.R., Zid, B.M., and O'Shea, E.K. (2014). An integrated approach reveals regulatory controls on bacterial translation elongation. Cell 159, 1200–1211.
Thanaraj, T.A., and Argos, P. (1996). Ribosome-mediated translational pause and protein domain organization. Protein Sci. 5, 1594–1612.
Tuller, T., Carmi, A., Vestsigian, K., Navon, S., Dorfan, Y., Zaborske, J., Pan, T., Dahan, O., Furman, I., and Pilpel, Y. (2010a). An evolutionarily conserved mechanism for controlling the efficiency of protein translation. Cell 141, 344–354.
Tuller, T., Waldman, Y.Y., Kupiec, M., and Ruppin, E. (2010b). Translation efficiency is determined by both codon bias and folding energy. Proc. Natl. Acad. Sci. USA 107, 3645–3650.
Weatheritt, R.J., and Babu, M.M. (2013). Evolution. The hidden codes that shape protein evolution. Science 342, 1325–1326.
Welch, M., Govindarajan, S., Ness, J.E., Villalobos, A., Gurney, A., Minshull, J., and Gustafsson, C. (2009). Design parameters to control synthetic gene expression in Escherichia coli. PLoS ONE 4, e7002.
Wolf, A.S., and Grayhack, E.J. (2015). Asc1, homolog of human RACK1, prevents frameshifting in yeast by ribosomes stalled at CGA codon repeats. RNA 21, 935–945.
Xu, Y., Ma, P., Shah, P., Rokas, A., Liu, Y., and Johnson, C.H. (2013). Non-optimal codon usage is a mechanism to achieve circadian clock conditionality. Nature 495, 116–120.
Zaher, H.S., and Green, R. (2009). Quality control by the ribosome following peptide bond formation. Nature 457, 161–166.
Zhang, G., Hubalewska, M., and Ignatova, Z. (2009). Transient ribosomal attenuation coordinates protein synthesis and co-translational folding. Nat. Struct. Mol. Biol. 16, 274–280.
Zhou, M., Guo, J., Cha, J., Chae, M., Chen, S., Barral, J.M., Sachs, M.S., and Liu, Y. (2013). Non-optimal codon usage affects expression, structure and function of clock protein FRQ. Nature 495, 111–115.
Article
Genetic Codes with No Dedicated Stop Codon:

Context-Dependent Translation Termination
Graphical Abstract
Authors
Estienne Carl Swart, Valentina Serra,
Giulio Petroni, Mariusz Nowacki
Correspondence
mariusz.nowacki@izb.unibe.ch
In Brief
In some ciliates, all three stop codons
can either terminate translation or code
for an amino acid. Ribosomes may
interpret this ambiguity using
downstream features in the transcript,
indicating that translational termination
can be context-dependent.
Highlights
• Alternative nuclear genetic codes continue to be discovered in ciliates
• Genetic codes with stops and all their codons encoding standard amino acids exist
• Transcript ends may distinguish stop codons as such in ambiguous genetic codes
• The ability to resolve genetic code ambiguity may enable genetic code evolution
Swart et al., 2016, Cell 166, 691–702
July 28, 2016 © 2016 The Author(s). Published by Elsevier Inc.
Article
Genetic Codes with No Dedicated Stop Codon:
Context-Dependent Translation Termination
Estienne Carl Swart,1 Valentina Serra,2 Giulio Petroni,2 and Mariusz Nowacki1,*
1Institute of Cell Biology, University of Bern, 3012 Bern, Switzerland
2Department of Biology, University of Pisa, Pisa 56126, Italy
*Correspondence: mariusz.nowacki@izb.unibe.ch
SUMMARY
The prevailing view of the nuclear genetic code is that it is largely frozen and unambiguous. Flexibility in
the nuclear genetic code has been demonstrated in ciliates that reassign standard stop codons to amino
acids, resulting in seven variant genetic codes, including three previously undescribed ones reported
here. Surprisingly, in two of these species, we find efficient translation of all 64 codons as standard amino
acids and recognition of either one or all three stop codons. How, therefore, does the translation machinery
interpret a stop codon? We provide evidence, based on ribosomal profiling and stop codon depletion
shortly before coding sequence ends, that mRNA 3′ ends may contribute to distinguishing stop from
sense in a context-dependent manner. We further propose that such context-dependent termination/
readthrough suppression near transcript ends enables genetic code evolution.
INTRODUCTION
The first exceptions to the supposed universality of eukaryotic
nuclear genetic codes were reported in ciliates (Caron and
Meyer, 1985; Helftenbein, 1985; Horowitz and Gorovsky, 1985;
Preer et al., 1985). Subsequently, additional genetic codes
were discovered in other ciliates, all due to stop codon reassignments, and appear to recur independently in different ciliate lineages (Lozupone et al., 2001; Sanchez-Silva et al., 2003; Tourancheau et al., 1995). Genetic code evolution is considered to have
both an ancient phase, which gave rise to the standard genetic
code before the radiation of bacteria, archaea, and eukaryotes,
and a modern phase, which led to diversification from the standard code (Sengupta and Higgs, 2015). Thus far, alternative nuclear genetic codes have only been found in three major eukaryotic lineages other than ciliates. The first alternative nuclear
genetic code, discovered in ciliates, with the UAA and UAG
stop codons reassigned to glutamine, is also present in green
algae (Acetabularia and Batophora) (Schneider and de Groot,
1991; Schneider et al., 1989) and diplomonads (Keeling and
Doolittle, 1996). Alternative nuclear genetic codes, with CUG reassigned from leucine, also occur in the yeasts Candida albicans
(predominantly to serine) and Pachysolen tannophilus (to alanine)
(Gomes et al., 2007; Muhlhausen et al., 2016; Santos and Tuite, 1995).
Other than the diversity of genetic codes in ciliates, the greatest number of variant genetic codes are found in mitochondria
(Knight et al., 2001), whose diversification may have been facilitated by their small genomes and strong mutational biases,
which increase the likelihood of loss and reassignment of rare
codons (Osawa and Jukes, 1989). Expressed ciliate genomes
(macronuclear genomes) are not especially small (typically
50–100 Mb) (Swart et al., 2013), and the manner in which
changes in their genetic codes arose may not be as straightforward as that in smaller mitochondrial genomes. Alternative explanations for the evolution of ciliate genetic codes, such as
the abolishment of recognition of certain stop codons by mutations in the stop-recognizing translation termination factor eukaryotic release factor 1 (eRF1) allowing codon reassignment,
have therefore been proposed (Lozupone et al., 2001).
While the genetic code is classically taught as being unambiguous, and indeed may largely be so, we now know this is an oversimplification. Since the original discovery of the standard genetic
code, alternative translational interpretations of codons have
been found, most notably in the use of the UGA codon for selenocysteine incorporation, in the context of special mRNA stem-loops in the UTRs of a small number of protein-coding genes
(Nasim et al., 2000). An additional form of codon ambiguity, translational readthrough of stop codons, is now also recognized as
pervasive, but usually weak, in eukaryotes, occurring at a few
percent or less compared to the non-readthrough form (e.g.,
Dunn et al., 2013; Harrell et al., 2002; Roy et al., 2015). Translational readthrough usually gives rise to short protein extensions,
e.g., a median length of 35 amino acids in Drosophila (Jungreis
et al., 2011). Readthrough is enabled by near-cognate pairing of
tRNAs to codons, with either the first or third anticodon base noncanonically paired (Blanchet et al., 2014). Thus, there is competition for the same codons between eRF1 and tRNAs.
Although the options for engineering of new genetic codes with
artificial amino acids have been proliferating (Lemke, 2014), many
important questions about natural genetic codes remain unresolved. Among these questions are basic ones of how codons
are recognized in variant genetic codes with stop codon reassignments and whether there is competition between eRF1 and stop-cognate tRNAs for the same codons. Experimental evidence attempting to address the former problem has been conflicting,
supporting either loss or ongoing recognition of reassigned stop
codons by eRF1 (Eliseev et al., 2011; Lekomtsev et al., 2007;
Salas-Marco et al., 2006; Vallabhaneni et al., 2009).
With extensive sequence data spanning a wide range of
eukaryotes, including ciliates, now available, uncertain genetic
codes may be properly determined, and consequently, the
proposed basis for nuclear genetic code diversification is also
ripe for reinvestigation. We present the new genetic codes we
discovered in the course of screening a large collection of eukaryotic transcriptomes, how codons may have multiple meanings in two of these codes, and the consequences of tolerance
of genetic code ambiguity for genetic code evolution.
RESULTS
Genetic Codes in which All 64 Codons Encode Standard
Amino Acids
To identify and classify reassigned codons, we used a computational screening approach to search the Marine Microbial
Eukaryote Transcriptome Sequencing Project (MMETSP) transcriptomes (Keeling et al., 2014). We found that, like Blepharisma
americanum, Blepharisma japonicum uses UGA as a tryptophan codon, although it does so at low
levels (0.059%), and hence this reassignment may easily go undetected in small
sequence samples (Figures 1B, S1A, and S1B). Thus, given this reassignment
and previous experimental results (Eliseev et al., 2011), we deduce that
B. japonicum's eRF1 and at least one of its tryptophan tRNAs may be in competition for the
same codon.
Figure 1. New Genetic Codes
(A) Stop codon reassignments (Q, glutamine; W, tryptophan; C, cysteine; Y, tyrosine; *, stop) are
mapped onto an eRF1 maximum likelihood phylogeny. Homo sapiens (standard genetic code) is an
outgroup. Bootstrap support for every node is shown. Scale bar indicates amino acid substitutions
per site. UGA codons were previously found in the coding sequences of Blepharisma americanum and
were predicted to encode tryptophan (Eliseev et al., 2011; Lozupone et al., 2001). Experimental assays
in Blepharisma japonicum suggest its eRF1 recognizes all three standard stop codons (Eliseev et al.,
2011). It should be noted that ciliates from the family Mesodiniidae have both a unique genetic code
(UAG/UAA = UAR = tyrosine; UGA = stop) and extremely divergent rRNAs (Johnson et al., 2004).
(B) Predicted C. magnum genetic code. Stop codons are highlighted in orange. Predicted amino
acids are those with maximal heights. Codon usage inferred from translated BLAST matches is shown
below the codons. UAA and UAG codons were previously predicted to encode glutamine (Lozupone
et al., 2001; Tourancheau et al., 1995).
Because MMETSP represents the current broadest eukaryotic molecular diversity survey (Keeling et al., 2014) we
screened all its transcriptomes to search for new genetic codes.
In our screen, we discovered three new genetic codes among
24 ciliate species (Figures 1A, 1B and S1; Data S1A), but no
new codes in the remaining 265 eukaryotes (Data S1B). Unexpectedly, in two of these genetic codes, belonging to the heterotrichous ciliate Condylostoma magnum and an unclassified
karyorelict (18S rRNA 95% identical to that of Parduczia orbis
[Edgcomb et al., 2011]; Parduczia sp. hereafter), all three
stop codons are predicted to be reassigned to amino acids:
UAA = Q, UAG = Q, UGA = W. As the remaining C. magnum
and Parduczia sp. codons encode standard amino acids (Figures
1A and S1A), all 64 of their codons are translated. Hence, the
question is if and how translation termination occurs given these
codes.
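To make the predicted reassignments concrete, the sketch below builds the standard codon table and overrides the three stop codons as stated in the text (UAA = Q, UAG = Q, UGA = W). The names `STANDARD`, `AMBIGUOUS_CODE`, and `translate` are illustrative, and the function deliberately ignores the context-dependent termination discussed later, so every codon is read as sense.

```python
from itertools import product

BASES = "TCAG"
# Standard code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...)
STANDARD_AA = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), STANDARD_AA)}

# Predicted C. magnum / Parduczia sp. reassignments from the text
AMBIGUOUS_CODE = {**STANDARD, "TAA": "Q", "TAG": "Q", "TGA": "W"}


def translate(cds, table):
    """Translate complete codons of an in-frame DNA or RNA sequence with the given table."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return "".join(table[cds[i:i + 3]] for i in range(0, usable, 3))
```

With this table, `translate("AUGUAAUGAUAG", AMBIGUOUS_CODE)` reads straight through all three former stops, which is exactly why a purely table-based view cannot say where translation ends.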
Because the UGA codon usage in C. magnum, Parduczia sp.,
and B. japonicum is relatively low (0.042%, 0.120%, and
0.059%, respectively), to computationally assess the hypothesis
that the C. magnum and Parduczia sp. genes with in-frame UGA
codons are functional, and not simply pseudogenes with in-frame
stops, we sought essential single-copy genes with in-frame UGAs
and examined their substitution rates. In-frame UGA codons are
present in critical genes, such as C. magnum tryptophan-tRNA
ligase (Figure 2B; MMETSP0210: CAMNT_0008287141) and
eRF1 of Parduczia sp. (MMETSP1317: CAMNT_0047593165).
Substitution rates of genes such as these support the hypothesis
of functionality since they indicate strong purifying selection, e.g.,
for C. magnum tryptophan-tRNA ligase aligned to Oxytricha trifallax tryptophan-tRNA ligase, dN/dS is 0.013 (dN/dS = nonsynonymous substitutions per nonsynonymous site over synonymous
substitutions per synonymous site; dN/dS <1 indicates purifying
selection) (Yang, 2007). The hypothesis that UGA codons are
translated was assessed experimentally in two ways: we determined that UGA codons are translated as tryptophan by protein
mass spectrometry (Data S1D and S1E); using ribosome profiling
we observe that ribosomes efficiently translate through UGA
codons, as they also do through UAG and UAA codons (Figures
2B and S3E).
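As a reminder of what the dN/dS ratio expresses, here is a minimal counting-style sketch. It is not the maximum likelihood estimation cited in the text (Yang, 2007); the function name and the input counts in the usage note are hypothetical and serve only to illustrate the arithmetic.

```python
def dn_ds(nonsyn_subs, nonsyn_sites, syn_subs, syn_sites):
    """Ratio of nonsynonymous to synonymous substitution rates (per site)."""
    dn = nonsyn_subs / nonsyn_sites   # nonsynonymous substitutions per nonsynonymous site
    ds = syn_subs / syn_sites         # synonymous substitutions per synonymous site
    return dn / ds
```

For example, with made-up counts of 2 nonsynonymous changes over 200 sites and 30 synonymous changes over 100 sites, `dn_ds(2, 200, 30, 100)` is about 0.033, well below 1 and therefore in the range interpreted as purifying selection.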
The Genetic Codes of C. magnum and Parduczia sp. Are
Ambiguous
Given evidence that all three stop codons in the C. magnum
and Parduczia sp. genetic codes can be translated, we wished
to assess how translation termination occurs. To investigate
the nature of translation termination in C. magnum and Parduczia
sp. we began by examining histone H4 coding sequence ends,
since the proteins encoded by these sequences are among the
most highly conserved proteins and typically have the same
C-terminal residues (e.g., 95% of 105 reviewed UniProt histone
H4 proteins end with two glycines; Feb 9, 2015). With respect
to the conserved C-terminal amino acid of histone H4 homologs
in other eukaryotes, each of the C. magnum histone H4 paralog
coding sequences is expected to end with a C-terminal glycine codon (Figure 2C). The codon immediately following this,
either UAG or UGA, is therefore a candidate stop. The coding
sequence of the single histone H4 in the Parduczia sp. transcriptome is followed by a UGA codon at the expected stop position
(Figure 2C). With respect to aligned homologs from other
organisms, all the Parduczia sp. transcripts we inspected have
a UGA where a stop codon would normally be expected.
C. magnum also has transcripts that have only the possibility
of UAA stops in proximity to where stops are expected (Figures
S2B–S2D). From the sequence alignments, we therefore infer
that C. magnum's eRF1 recognizes all three standard stop codons and hence needs to outcompete stop-cognate tRNAs to
terminate translation.
To test whether translation termination occurs at the putative
histone H4 stop codons, we used ribosome profiling (ribo-seq).
For C. magnum's histone H4.1b and H4.1c forms, it can be
seen that translation terminates precisely at the predicted stop
codons (Figure 2D), whereas it does so with a small amount of
imprecision for H4.1d (Figure 3A; H4.1a was insufficiently covered
by ribo-seq reads to assess termination). In general, C. magnum
translation-terminating ribosome-protected fragments (RPFs) end 11/12 nucleotides (nt) after the stop
codon 3′ nt (Figure 3D; compare to sense codons in Figure 3C;
Figure 2D is a typical example). Consequently, both the primary
and secondary H4.1d stop codons, UAG and UAA, trigger translation
termination, and the typical histone H4 C-terminus may occasionally be extended by one or more amino acids.
While readthrough is conventionally classified as translation of
stop codons by near-cognate tRNAs, in C. magnum, which has
stop cognate tRNAs (see next section), translation through
stop codons by near-cognate tRNAs is effectively indistinguishable from translation by cognate tRNAs in ribo-seq data. Therefore, for the sake of simplicity, in C. magnum, we classify
readthrough as translation through codons that typically trigger
translation termination (as for H4.1d). It should be noted that in
C. magnum, multiple translation termination opportunities often
exist before the ribosome translates into poly(A) tails (on average
approximately five codons intervene between the primary and
additional downstream non-primary stops). As a consequence,
if extensions result from readthrough they are typically expected
to be very short. Even though multiple possible stop codons
exist, examples of imprecise termination as in H4.1d are in the
minority: 90% of transcripts examined with >20 RPFs situated
at their stops show no readthrough. Thus, overall readthrough is
quite low, e.g., a mean of <1.8% and median of 0% (Figure S3K).
The small amount of readthrough that does occur is most
readily detected when the ribosome occupies downstream
stops (Figure 3E).
Multiple lines of evidence therefore demonstrate that stop
codons as a class in the C. magnum and Parduczia sp. genetic
codes are ambiguous, whereas their individual codons are typically recognized unambiguously as either sense or stops, solving
the translation termination paradox.
In Search of tRNAs that Enable Stop Codon
Translation
All model ciliates have suppressor tRNAs that are complementary to and permit translation of reassigned stop codons (Eisen
et al., 2006; Hanyu et al., 1986; Kuchino et al., 1985). Although
we found a comprehensive set of tRNAs in our C. magnum
genome assemblies, including glutamine tRNAs capable of
recognizing UAA and UAG codons (Figures 4A and 4B; Data
S1G), we were unable to detect tRNATrps with UCA anticodons.
Given the high sequence coverage of the C. magnum macronuclear genome, it is unlikely that we missed tRNATrp(UCA)s. Ciliates possess both a micronuclear and a macronuclear genome,
with the former predominantly unsequenced in our C. magnum
assembly due to its comparatively low ploidy. It is also unlikely
that tRNATrp(UCA)s have gone undetected because they are micronuclear genome-encoded: although these genomes are transcriptionally active during ciliate sexual development they are
generally inactive during vegetative growth (Chen et al., 2014;
Nowacki et al., 2009) when many transcripts with UGA tryptophan codons are expressed. To test if CCA → UCA anticodon
editing produces a UGA-cognate tRNATrp, we sequenced RT-PCR products targeting nuclear genome-encoded tRNATrps
and examined tRNA reads from small RNA sequencing data,
but found no signs of significant anticodon editing (see Supplemental Experimental Procedures).
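The codon-anticodon relationships discussed here can be checked with a few lines of code. This sketch considers Watson-Crick pairing only (no wobble or modified bases), and both function names are hypothetical helpers introduced for illustration.

```python
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def cognate_anticodon(codon):
    """Watson-Crick anticodon, written 5'->3', for an RNA codon written 5'->3'."""
    return "".join(COMPLEMENT[b] for b in reversed(codon.upper()))


def mismatch_positions(codon, anticodon):
    """0-based codon positions that do not Watson-Crick pair with the anticodon."""
    expected = cognate_anticodon(codon)[::-1]   # anticodon bases aligned against codon 5'->3'
    aligned = anticodon.upper()[::-1]
    return [i for i, (e, a) in enumerate(zip(expected, aligned)) if e != a]
```

Under these assumptions, UGA pairs fully with a UCA anticodon, whereas the tRNATrp(CCA) anticodon mismatches UGA only at the third codon position (the first anticodon base), which is the near-cognate situation described in the text.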
All sequenced ciliate mitochondrial genomes encode a UGA-cognate tRNATrp(UCA) (Swart et al., 2013) and so does that
of C. magnum (Figure S4A). Experiments in cell-free lysates
show cytoplasmic ribosomes can use yeast mitochondrial
tRNATrp(UCA) to translate UGA codons (Tuite and McLaughlin,
1982). Thus, to determine whether C. magnum's mitochondrial
tRNATrp(UCA)s are used to translate its mRNA UGA codons, it
will be necessary to show these tRNAs are accessible to cytoplasmic ribosomes in quantities adequate for translation.
In standard genetic code organisms, readthrough UGA stop
codons are preferentially translated as tryptophan (e.g., for
Saccharomyces cerevisiae: UGA: 86% W, 7% C, 7% R) (Roy
et al., 2015) by near-cognate tRNATrp(CCA)s. Near-cognate
pairing of tRNATrp(CCA) to UGA may also be substantially
enhanced through particular mutations, e.g., in Escherichia coli
a tRNATrp(CCA) D-stem point mutation leads to 30× more tryptophan translation at UGA stop codons than the wild-type tRNA
(Hirsh, 1971; Hirsh and Gold, 1971). C. magnum has three types
of tRNATrp(CCA) (Figures S4B and S4C), and it will be necessary
to experimentally assess if any of these tRNAs permits efficient
translation of its mRNA UGA codons.
Stop Codon Recognition Switches from Sense in
Coding Sequences to Stop Near Transcript Ends
We assessed two hypotheses for how sense codons are distinguished from stop codons in ambiguous codes: (1) that there
are sequence-specific features (motifs) allowing discriminating
protein factors to bind nearby sense and stop codons, and
(2) that proximity to transcript ends results in recognition of
stops. We reject the hypothesis that specific sequences are
necessary for stop/sense discrimination for the following reasons: (1) the base composition around sense stop codons is
not constrained (Figure S5A), and (2) although the bases flanking
C. magnum stop codons are weakly biased (Figure S5B), and
such biases exist in other eukaryotes, where they are associated
with enhanced termination efficiency (McCaughan et al., 1995), it
is trivial to find sense stop codons with the preferred stop
codon flanking Us; thus, flanking bases cannot be sufficient to
distinguish stop codons.
We next assessed if the proximity of the stop codon to transcript ends might determine sense/stop state. While analyzing
ciliate 3′ UTRs we were struck by how short they are, with those
of heterotrichs the shortest of all (median lengths, excluding the
poly(A) tail and stop codon: 21–23 nt; Figure 5A). In the literature,
we could find no eukaryotes with shorter 3′ UTRs. In comparison,
yeast, metazoan, and plant 3′ UTRs typically have a >100 nt
length mode and may be considerably longer (Aoki et al., 2010;
Jan et al., 2011). Because poly(A) tails of certain C. magnum transcripts, especially those with UAA stop codons, start immediately after their stop codon (Figures 5B–5D), stops can be situated
adjacent to poly(A)-binding proteins (PABPs) in vivo, and
hence translation may be terminated with no additional information encoded by 3′ UTRs. Because the ribosome occupies 11 or
12 nucleotides downstream of C. magnum stop codons, even for
those transcripts with 3′ UTRs, there may be little room for ribosomes to maneuver past stop codons without displacing
PABPs. Given such short 3′ UTRs in ciliates, we therefore propose that nearby protein-bound poly(A) tails may contribute to
discriminating stop from sense.
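The 3′ UTR length measure used above (excluding the stop codon and the poly(A) tail) can be illustrated with a minimal sketch. It assumes that the poly(A) tail is simply the trailing run of A residues, which cannot separate tail-derived from genome-encoded As, and that the stop codon position is already known. The function name `utr3_length` and the toy sequences are hypothetical.

```python
from statistics import median


def utr3_length(transcript: str, stop_start: int) -> int:
    """3' UTR length, excluding the 3-nt stop codon and the poly(A) tail.

    transcript : mRNA sequence (5'->3') including its poly(A) tail
    stop_start : 0-based index of the first base of the primary stop codon
    """
    seq = transcript.upper()
    # Assumption: the poly(A) tail is the trailing run of A residues.
    polya_start = len(seq.rstrip("A"))
    # A stop codon completed by poly(A) bases gives a UTR length of zero.
    return max(0, polya_start - (stop_start + 3))


if __name__ == "__main__":
    # Toy transcripts (hypothetical): CDS "AUGGCU" + stop + optional UTR + poly(A).
    examples = [
        ("AUGGCUUAA" + "GCUUCG" + "A" * 20, 6),  # 6-nt 3' UTR
        ("AUGGCUUAA" + "A" * 20, 6),             # poly(A) starts right after the stop
        ("AUGGCUUGA" + "CU" + "A" * 15, 6),      # 2-nt 3' UTR
    ]
    lengths = [utr3_length(seq, stop) for seq, stop in examples]
    print(lengths, median(lengths))
```

The second toy transcript corresponds to the 3′ UTR-less case described in the text, where the poly(A) tail begins immediately after a UAA stop codon.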
The very low readthrough levels detected in C. magnum by
ribosome profiling imply that when stop codons are positioned
close to transcript ends the probable outcome is termination.
The few stop codons existing in the vicinity before stop codons (24–66 nt upstream; mean 50 nt upstream; 16 out of
1,672 transcripts) are efficiently translated and show no signs
of appreciable premature translation termination (Figure S3I).
Given the low tolerance of either readthrough or premature
translation termination, the prediction is that when codons
recognized inefficiently as either stop or sense arise in coding
sequences, they are deleterious. Thus, in the hypothesis of
discrimination of codons as stops close to transcript ends, if
stop codons arise just upstream of the proper stops, where
they might either be translated or result in premature termination,
they will be counterselected and hence decrease in frequency.
Consistent with this hypothesis, such a decrease in stop
codon frequency exists in the upstream coding sequence vicinity
of the stops in C. magnum (UAA, UAG, UGA) and Parduczia sp.
(UGA) (Figures 6 and S6). Conversely, no codons other than
stop codons become rare in coding sequences just before
the actual stops (e.g., C. magnum; Figure S6). Furthermore,
following cognate tRNA acquisition CAA and CAG frequencies
are expected to remain higher near stops than distal coding
sequence regions, since these codons may not freely mutate
to UAA and UAG without causing premature translation termination (Figure 6D; unlike any other codons [Figure S6]; given the low
UGA sense codon usage, only a small fraction of UGG codons
has mutated to UGA, and UGG codon frequencies are not expected to be higher near stops).
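The kind of positional counting behind this analysis can be sketched as follows. This is a simplified illustration rather than the procedure used for Figure 6: it assumes each input is an in-frame coding sequence whose final codon is the primary stop, and it measures distance to that stop rather than to the poly(A) tail. The names `upstream_stop_distances` and `distance_histogram` are hypothetical.

```python
from collections import Counter

STOP_CODONS = {"UAA", "UAG", "UGA"}


def upstream_stop_distances(cds: str) -> list[int]:
    """Distances (nt) from each internal in-frame UAA/UAG/UGA to the primary stop.

    cds : in-frame coding sequence whose last codon is the primary stop codon.
    """
    seq = cds.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    last = len(codons) - 1
    # Skip the final codon, which is the primary stop itself.
    return [3 * (last - k) for k, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]


def distance_histogram(cds_list: list[str]) -> Counter:
    """Pool the distances over many transcripts."""
    counts = Counter()
    for cds in cds_list:
        counts.update(upstream_stop_distances(cds))
    return counts


if __name__ == "__main__":
    toy = "AUG" "UAA" "GCU" "UGA" "GCU" "GCU" "UAA"  # hypothetical transcript
    print(upstream_stop_distances(toy))
    print(distance_histogram([toy, toy]))
```

A depletion of counts at small distances, relative to distal coding sequence, would correspond to the decline in stop codon frequency described above.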
DISCUSSION
Based on the observations of ribosome positioning and distribution of stop codons in transcripts, for translation in C. magnum
and Parduczia sp. we propose a model where translation, rather
than termination, is the default recognition mode for stop codons and where termination is due to the context-specific
override provided by transcript ends (Figure 7). Thus, at sense
stop codons, tRNAs outcompete eRF1, and at proper stop codons, eRF1 outcompetes tRNAs. The converse model (default
termination; context-specific translation) is not consistent with
our results, and given preexisting surrounding coding sequence
constraints, widespread context-specific translation signals
necessary to translate all the stop codons are exceedingly unlikely to arise.
Figure 2. Stop Codons in C. magnum and Parduczia sp.: Either Sense or Stop Codons
(A) C. magnum protein kinase alignment region highlighting putative sense stop codons. Standard genetic code stop codons are shown with stars, with
larger stars for UGA. MMETSP0210 IDs: CAMNT_0008311047, CAMNT_0008316317, CAMNT_0008295895, CAMNT_0008281491, CAMNT_0008274923,
CAMNT_0008274561, CAMNT_0008271577, CAMNT_0008291651, CAMNT_0008280967, CAMNT_0008289329.
(B) Ribosome-protected fragments (RPFs) mapped to a C. magnum tryptophan-tRNA ligase transcript (Data S1AC and S1AD). RPF coverage is calculated from
all the bases of 25–32 nt RPFs.
(C) Histone H4 C-termini and stop codons (gray arrow, coding sequence) from C. magnum, Parduczia sp., and Homo sapiens. Poly(A) tails are visible at
C. magnum and Parduczia sp. mRNA 3′ termini. Histone H4.1a–H4.1d: MMETSP0210 IDs: CAMNT_0008274265, CAMNT_0008297091, CAMNT_0008284521,
and CAMNT_0008296393; Parduczia sp. histone H4 is MMETSP1317 CAMNT_0047598059. H. sapiens histone H4 is GenBank: M16707.1. Judging from paired-end read mapping, the 3′ UTR of H4.1a is incorrectly fused to a downstream transcript.
(D) RPFs mapped to histone H4.1c (Data S1AE and S1AF).
See also Figure S2.
Figure 3. Ribosome Profiling Reveals Different Ribosome States at Stop Codons
(A) RPFs (25–32 nt) mapped to histone H4.1d (Data S1AG and S1AH). RPF 3′ termini counts are given at the sequence coverage steps: the first and second steps
correspond to ribosomes whose P-sites are the first and second stop codons, respectively.
(B) RPF read length distribution and frame distribution. For the 3 U TruSeq Ribo Profile nuclease digestion more mRNA reads were present due to lower rRNA
degradation, and most 30-nt RPFs have their 3′ ends in frame 3 (compare to Figures S3A and S3B).
(C and D) Distribution of 30-nt RPF 3′ ends around sense (C) and stop (D) UAG, UGA, and UAA codons (positions 1–3, indicated by dashed vertical lines) in Trinity
assembled transcripts. CDS, coding sequence; UTR, untranslated region. Putative ribosomal P- and A-site locations of translation terminating RPFs situated at
stop codons, based on that predicted for other eukaryotic ribosomes (Chung et al., 2015). Figures S3C–S3H show the distribution of RPF 3′ ends around individual stop codons. Though the termination signal is most pronounced for 30-nt RPFs, it is also exhibited by other RPFs (Figure S3J).
(E) Distribution of 30-nt RPFs for transcripts with detected readthrough (≥13 nt downstream of the primary stop codon); additional stop codons are located
downstream of the primary one, hence the region downstream of the primary stop may be either coding or untranslated.
See also Figure S3.
Figure 4. Predicted UAA- and UAG-Cognate C. magnum tRNAs
(A and B) UAA- and UAG-cognate glutamine tRNA secondary structures.
Bonds shown are predicted by the RNAfold web server (Lorenz et al., 2011)
(default parameters).
See also Figure S4.
Given the existence of transcripts without 3′ UTRs, we deduce
these regions are not essential for translation termination, and
we propose that the close proximity of a poly(A) tail and
poly(A)-interacting proteins, in particular PABPs, alone may be
necessary to trigger termination. Three prior observations favor
this hypothesis: (1) PABP overexpression enhances translation
termination when it is weak, implying that PABPs may be
involved in translation termination (Cosson et al., 2002); (2) tethering of a PABP 37–73 nt downstream of a premature stop codon
substantially decreases NMD and results in recruitment of the
translation termination factor eRF3, suggesting that PABP is
involved in discriminating stops from premature stops (Amrani
et al., 2004); and (3) PABPs bind to AU-rich RNA including 3′
UTRs (Baejen et al., 2014; Kini et al., 2016; Sladic et al., 2004).
Reassigned stop codons in C. magnum and Parduczia sp.
differ from conventional readthrough stops in standard genetic
code organisms because they are efficiently translated and
distributed throughout coding sequences, whereas conventional
readthrough stops are the major termination signals whose
disregard gives rise to modest levels of short protein extensions
(Dunn et al., 2013; Jungreis et al., 2011). From their distribution
throughout coding sequences, it is evident that most reassigned
codons in ciliates arose from substitutions of codons that were
already normally translated, rather than from readthrough stop
codons. Upon acquisition of a stop cognate tRNA, a shift in balance from translation termination to readthrough at stop codons
is expected. Normally this acquisition would immediately be
deleterious, due to the creation of aberrant C-terminal peptide
signals or the triggering of non-stop mRNA decay (Frischmeyer
et al., 2002) upon translation into mRNA poly(A) tails. By enforcing proper translation termination close to transcript ends, ciliates with ambiguous genetic codes provide a way of getting
around these problems.
Given that we detected no new genetic codes in 265 diverse
non-ciliate eukaryotic species from MMETSP, the abundance
of alternative genetic codes within ciliates is all the more striking.
Two hypotheses for the origin of genetic codes in ciliates are that
they were enabled by codon capture or eRF1 mutations. Under
the codon capture hypothesis (Osawa and Jukes, 1989)
when a codon disappears in a genome due to strong mutational
biases it may then be reassigned when a suitable cognate tRNA
arises (via tRNA duplication and anticodon mutation) and the
codon subsequently reappears. To date, all sequenced ciliate
genomes are AT rich (Aeschlimann et al., 2014; Aury et al.,
2006; Coyne et al., 2011; Eisen et al., 2006; Swart et al., 2013;
Wang et al., 2016). Reflecting their A/T mutational biases, among
eukaryotes with the highest UAA stop codon usage are standard
genetic code ciliates (Figures S7B–S7D; Data S1V). This suggests that the diversification of genetic codes from the standard
one could have followed UAG and UGA stop codon depletion in
ancestral ciliates with AT rich genomes. While codon capture is a
reasonable explanation for the evolution of the Blepharisma genetic code (UAA stop codon usage 91%), it does not readily
explain the origin of other ciliate genetic codes. For example,
in Euplotes sp., according to tRNA anticodon-codon wobble
rules, UGG codons are expected to be misread as cysteine
following the origin of a tRNACys(UCA).
Even when relaxing the stop codon disappearance criterion (via
genetic code ambiguity tolerance), codon capture cannot easily
explain the general UAG and UAA reassignment trends seen in
Figure 1A. In all ciliates with reassigned UAG and UAA codons
and complete macronuclear genomes, both tRNAs with anticodon
complements of these codons are present (Aeschlimann et al.,
2014; Aury et al., 2006; Coyne et al., 2011; Eisen et al., 2006; Swart
et al., 2013). In the event that the first acquisition during codon reassignment was a tRNA(UUA), by the codon-anticodon wobble
rules UAA and UAG would both be translated; however, as this requires prior UAA stop codon disappearance, it is contrary to the
ciliate mutational tendencies. If codon reassignment were to occur
after a tRNA(CUA) acquisition, only UAG codons would be translated, and under the codon capture hypothesis, genetic codes
with UAG reassignment alone should be common; however, this
is not observed. Therefore, codon capture alone cannot explain
the diversity of genetic codes in ciliates.
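The wobble argument in the preceding paragraphs can be made concrete with a minimal sketch. It assumes the simplified Crick wobble rules for an unmodified anticodon 5′ base (U reads A or G; C reads G; G reads C or U; A reads U) and ignores base modifications, which can alter decoding in real tRNAs. The name `codons_read` is hypothetical.

```python
# Strict Watson-Crick complements, used for codon positions 1 and 2.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Simplified Crick wobble rules (assumption: unmodified bases):
# anticodon 5' base -> codon 3' bases it can pair with.
WOBBLE = {"U": "AG", "C": "G", "G": "CU", "A": "U"}


def codons_read(anticodon: str) -> set[str]:
    """Codons (5'->3') readable by an anticodon (5'->3') under simple wobble rules."""
    a = anticodon.upper()
    first = COMPLEMENT[a[2]]   # codon position 1 pairs with anticodon position 3
    second = COMPLEMENT[a[1]]  # codon position 2 pairs with anticodon position 2
    return {first + second + third for third in WOBBLE[a[0]]}


if __name__ == "__main__":
    for anticodon in ("UUA", "CUA", "UCA", "CCA"):
        print(anticodon, sorted(codons_read(anticodon)))
```

Under these rules a tRNA(UUA) reads both UAA and UAG, a tRNA(CUA) reads UAG only, and a tRNA(UCA) reads both UGA and UGG, which is the basis for the expected misreading of UGG as cysteine after the origin of a tRNACys(UCA).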
As eRF1 recognizes stop codons, this protein could be a
determinant of genetic code reassignments in ciliates. Previously it was hypothesized that particular eRF1 amino acid substitutions are associated with each variant genetic code (Lozupone
et al., 2001). The additional ciliate genetic codes and eRF1 diversity present in ciliates and other eukaryotes present multiple
contradictions to the reported concordances between eRF1
amino acid substitutions and variant genetic codes (Lozupone
et al., 2001) (Figure S7A). Because no obvious associations between single eRF1 substitutions and variant genetic codes are
evident, any possible associations between genetic codes and
eRF1 changes must be more complex than individual amino
acid changes. The existence of the ambiguous ciliate genetic codes is also a challenge to explain by this hypothesis.
Because ciliate genetic code diversity does not seem to be
adequately explained by codon capture or eRF1 changes, we
instead propose that it is due to past genetic code ambiguity
tolerance and resolution, as exemplified by C. magnum and
Parduczia sp. Conversely, the inability to resolve ambiguity
favors the frozen state of the genetic code in other eukaryotes.
The codons in C. magnum and Parduczia sp. that are recognized
either by tRNAs or eRF1 represent precisely the type of intermediate states with multiple meanings originally proposed to occur
in the hypothesis of genetic code evolution through ambiguous
translational intermediates (Schultz and Yarus, 1994). We
furthermore propose that the evolution of very short, AU-rich 3′
UTRs and termination facilitated by poly(A) proximity have
enabled codon reassignment, as translational ambiguity due to
the acquisition of stop cognate tRNAs
could be suppressed at stops.
In light of the ambiguous genetic codes
presented here, it is worth reconsidering
the idea that the standard genetic
code is "one in a million" and is optimized
to minimize the effects of errors arising from mutations (Freeland
and Hurst, 1998) (although contested [Koonin and Novozhilov,
2009]). Naturally, organisms with only one or two stop codons
due to reassignments are more robust to sense-to-premature-stop
codon mutations than those with the standard genetic
code. Given that, other than in the vicinity of transcript ends,
stop codons are translated by default, the genetic codes of
C. magnum and Parduczia sp. may confer very high resistance
to substitutions that would cause premature translation termination in the standard genetic code. A potential drawback of such
robustness is that large insertions at 3′ transcript ends may
expose stops that were previously translated. However, large insertions likely occur much less often than substitutions, and the
strong purifying selection governing non-protein-coding regions
in the heterotrich and karyorelict genomes will inhibit progressive
transcript end lengthening.
Figure 5. Extremely Short and Nonexistent 3′ UTRs in Heterotrichs
(A) Ciliate 3′ UTR length distributions (lengths exclude the stop codon and poly(A) tail) for representatives of the ciliate genetic codes in Figure 1.
(B) Length distribution of C. magnum 3′ UTRs. Lengths are from the putative primary stop in the 60 nt window upstream of poly(A) sites and exclude the stop and
poly(A) tail lengths.
(C) A 3′ UTR-less gene (synaptobrevin homolog). Poly(A) tail-ending reads mapped to the genomic region encoding this gene are shown, and no other reads
extend beyond the poly(A) addition site. CDS, coding sequence (Data S1AI and S1AJ).
(D) RPFs mapped to a transcript of the gene in (C) (Data S1AK and S1AL).
See also Figure S6.
Figure 6. Terminal Stop Codon Decline Close to C. magnum Stops
Stacked bar graphs of stop codon counts are for
the transcript regions upstream of poly(A) tails
(position 0). Transcript ends include 0, 1, or 2 nucleotides of the poly(A) tail to complete the final
codon. 3′ UTRs occur in the region to the right of
the right-most dashed vertical line. Codons counted are those in the 1,672 poly(A)-tailed single gene,
single isoform Trinity assembled transcripts.
(A–C) The top three subgraphs are drawn in
decreasing order of ordinate limits. Vertical line
at 39 nt indicates approximately where most
downstream stops are either stop codons or
codons in 3′ UTRs. Codons whose sense/stop
states have not been determined are indicated by
amino acid/*. Transcripts with UGA codons upstream of 39 nt were visually classified based on
BLASTX searches. Upstream of 39 nt, UGA codons predominantly code for tryptophan; downstream of 39 nt, UGA codons are predominantly
stops or codons in 3′ UTRs downstream of primary
stops (both indicated by gray bars). In the genetic
codes of C. magnum and Parduczia sp. UGA is a
codon triality (codon duality is reviewed in Atkins
and Baranov, 2007), because in addition to being
interpreted as a tryptophan codon and a stop
codon, it also serves as a selenocysteine codon in
the context of SECIS elements. Pale gray bars
correspond to a transcript with an uncertain
C-terminal, as judged by BLAST.
(D) Standard glutamine and tryptophan sense
codon counts.
(E) Base frequencies are stable in the region of
stop codon decline (90 to 42 bases upstream of poly-As).
Figure 7. Model for Distinguishing Stops from Sense Stops
Representative regions from the same transcript
(MMETSP0210: CAMNT_0008285195), with translation through a UAG sense codon and termination
at a UAG stop codon (codon state verified by riboseq). CDS, coding sequence; 3′ UTR, 3′ untranslated region; eRF1,
eukaryotic release factor 1; eRF3, eukaryotic
release factor 3; PABP, poly(A)-binding protein;
standard amino acids are indicated by circles. Putative interaction between eRF3 and PABPs, as inferred from experimental evidence in yeast (Cosson
et al., 2002), is indicated by a dotted bidirectional
arrow. Ribosome position and the protected mRNA
span are illustrated as inferred from C. magnum
RPFs and from estimates of other eukaryotic ribosomes (Chung et al., 2015).
In summary, we propose that ambiguous ciliate genetic codes
are resolved by context-dependent translation termination, and
the reason why ciliates possess such diverse genetic codes is
that their ancestors had the ability to thrive for extended periods
with ambiguous genetic codes, as epitomized by C. magnum.
Together with the other variant genetic codes, these codes
show that the standard nuclear genetic code is not necessarily
an evolutionary dead end and that genetic codes can occasionally be observed in a state of flux. As highlighted here, the ambiguous genetic codes of C. magnum and Parduczia sp. also have
ramifications for our understanding of the suppression of translational readthrough, as well as how nonsense-mediated decay
(NMD) and selenocysteine translation operate (conserved proteins from both of these pathways are present in ciliates with
ambiguous genetic codes; see e.g., Figure S2E). To facilitate
future investigations concerning how sense is distinguished
from stop and related questions about codon disambiguation,
we have made a draft C. magnum macronuclear genome available under the accession number European Nucleotide Archive:
GCA_001499635.1.
EXPERIMENTAL PROCEDURES
See the Supplemental Experimental Procedures for additional detailed
protocols.
Transcriptomes Analyzed
Transcriptomes for C. magnum (MMETSP0210), Parduczia sp. (MMETSP1317),
and other eukaryotes assembled as part of MMETSP (Gentekaki et al., 2014;
Keeling et al., 2014) were used to identify genetic codes and analyze stop codon
usage. We also predicted genetic codes after de novo assembling the transcriptomes of two peritrichous ciliates: Campanella umbellaria and Carchesium polypinum (NCBI short read archive: SRR1768423 and SRR1768437, respectively;
data from a recent phylogenomic study) (Feng et al., 2015) with Trinity (Grabherr
et al., 2011) (default parameters, version: trinityrnaseq_r20140717).
Prediction of Alternative Stop Codon Reassignments

To predict codon reassignments, we simplified and refined the key steps of a
method developed for such prediction (Dutilh et al., 2011), which identifies codons aligned to conserved amino acids in hidden Markov models inferred from
multiple sequence alignments. Dutilh et al. (2011) may be consulted for a graphical outline and more details of the method. This method builds upon and advances the classical method of inspecting conserved positions in multiple
sequence alignments of homologous protein sequences to infer codon reassignments. First, we generated a database of peptide sequences by translating
nucleotide sequences in all six frames with the standard genetic code,
recording standard stop codons as X (any amino acid). Next, we used
HMMER 3.1b (http://hmmer.org) to search and align the hidden Markov models
from the Pfam-A protein domain database (release 27) (Finn et al., 2014) against
the translated sequences. Using a custom Python script, the alignment outputs
were filtered at a conditional e-value threshold <1e-10. We then simultaneously
scanned through the Pfam consensus, aligned database match and its underlying coding sequence, recording the codon and consensus amino acid for
well-conserved amino acids at ≥50% frequency in columns of the multiple
sequence alignment used to build the Pfam model. From the resultant counts
of aligned amino acid/codon pairs ($m_{i,j}$; $i = 1..64$ codons, $j = 1..20$ amino acids)
a 20 amino acid by 64 codon matrix, $M$, was created, with each entry scaled by
the sum of the counts for each amino acid (i.e., $M = m_{i,j} / \sum_{i} m_{i,j}$). This matrix was
used to generate a sequence logo with WebLogo 3.3 (Crooks et al., 2004) (command line switches: --scale-width no -c chemistry -U probability -A protein).
Note that the lower frequency amino acids shown in the genetic code logos
generated by this procedure typically reflect the underlying codon mutational
space, but may also be subject to noise, and the focus for codon reassignment
prediction should be on the highest frequency amino acid. Genetic code
sequence logos for all MMETSP transcriptomes are provided as Data S1A (ciliates) and Data S1B (nonciliates). See Table S1 for a summary of the ciliate genetic code predictions. An explanation of stop codon identification is provided
in the Supplemental Experimental Procedures.
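Two steps of the procedure above lend themselves to a short illustration: six-frame translation with standard stop codons recorded as X, and scaling of the codon/amino acid counts by the per-amino-acid totals. The following is a minimal sketch, not the custom script used in this study; the function names and the toy counts are hypothetical, and the HMMER search and alignment parsing are omitted.

```python
from collections import defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_stops_as_x(seq: str) -> str:
    """Translate one frame, writing X for stop codons and for unknown codons."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        peptide.append("X" if aa == "*" else aa)
    return "".join(peptide)


def six_frame_translations(seq: str) -> list[str]:
    """Translations of the three forward and three reverse-complement frames."""
    seq = seq.upper()
    return [translate_stops_as_x(strand[frame:])
            for strand in (seq, reverse_complement(seq))
            for frame in range(3)]


def scale_counts(counts: dict) -> dict:
    """Scale each (codon, amino acid) count by that amino acid's total count."""
    totals = defaultdict(int)
    for (_codon, aa), n in counts.items():
        totals[aa] += n
    return {(codon, aa): n / totals[aa] for (codon, aa), n in counts.items()}


if __name__ == "__main__":
    print(six_frame_translations("ATGTAAGGG"))
    # Toy counts (hypothetical): TAA mostly aligned to conserved glutamines.
    toy = {("TAA", "Q"): 30, ("CAA", "Q"): 60, ("CAG", "Q"): 10,
           ("TAA", "K"): 2, ("AAA", "K"): 98}
    scaled = scale_counts(toy)
    taa = {aa: v for (codon, aa), v in scaled.items() if codon == "TAA"}
    print(max(taa, key=taa.get), taa)
```

In the toy example the highest scaled value for TAA is glutamine, which is the quantity the text recommends focusing on when predicting a reassignment.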
Ribosome Profiling
Illumina's TruSeq Ribo Profile (Mammalian) kit was used for ribosome profiling.
A total of 32,000 C. magnum cells (strain COL2) were isolated, gently pelleted
at 280 × g for 2 min in 100 ml pear-shaped centrifuge tubes, then washed in
clean saline solution and centrifuged again at 280 × g for 2 min to remove
excess algae. The cleaned C. magnum cell pellet was incubated in saline solution with 0.1 mg/ml cycloheximide for 1 min. Cells were rinsed with 10 ml
PBS, 0.1 mg/ml cycloheximide, pelleted at 280 × g, and excess liquid was
removed with a micropipette. Pelleted cells were lysed in TruSeq Ribo Profile
lysis buffer using a syringe with a 21G needle. The TruSeq Ribo Profile protocol
was followed for the remaining ribosome profiling steps. Three concentrations
of TruSeq Ribo Profile Nuclease (3 U, 10 U, and 30 U) were used to generate
ribosome-protected fragments (RPFs), which were purified with MicroSpin
S-400 columns. Ribo-Zero Gold Yeast rRNA depletion was performed on purified RPFs. DNA libraries isolated from 15 (10 U) or 17 (3 U, 10 U) cycle PCRs
were multiplexed and sequenced on one lane of a HiSeq 2500 sequencer by
Fasteris SA (Switzerland). Ribosome profiling data are available from the European Nucleotide Archive: ERS1066482–ERS1066484. After adaptor trimming,
reads were mapped to 1,672 poly(A)-tailed, translation frame inferred Trinity
assembled transcripts (see the Supplemental Experimental Procedures)
with STAR (parameters: --alignIntronMin 12 --alignIntronMax 25). Reads
with 0 or 1 mismatches to the transcripts were used in ribo-seq analyses.
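The read filtering and frame assignment described here can be sketched as follows. This is an illustrative example under stated assumptions rather than the analysis code used in the study: reads are represented by a hypothetical `Read` tuple, and frames are numbered 1 to 3 from the position of the RPF 3′ terminal base relative to the first base of the coding sequence.

```python
from collections import Counter
from typing import NamedTuple


class Read(NamedTuple):
    start: int       # 0-based position of the read's 5' base on the transcript
    length: int      # read length in nt
    mismatches: int  # mismatches to the transcript


def three_prime_frames(reads, cds_start: int, rpf_length: int = 30,
                       max_mismatches: int = 1) -> Counter:
    """Count RPF 3' ends per reading frame (1, 2, or 3).

    Only reads of the requested length with at most `max_mismatches`
    mismatches are kept, mirroring the 0-or-1 mismatch filter in the text.
    """
    frames = Counter()
    for read in reads:
        if read.mismatches > max_mismatches or read.length != rpf_length:
            continue
        three_prime = read.start + read.length - 1  # 0-based last base of the read
        frames[(three_prime - cds_start) % 3 + 1] += 1
    return frames


if __name__ == "__main__":
    toy_reads = [  # hypothetical mapped reads
        Read(0, 30, 0), Read(3, 30, 1), Read(1, 30, 0),
        Read(0, 30, 2),  # dropped: too many mismatches
        Read(0, 28, 0),  # dropped: not a 30-nt RPF
    ]
    print(three_prime_frames(toy_reads, cds_start=0))
```

A strong excess of 3′ ends in one frame, as reported for the 30-nt RPFs, indicates that the fragments reflect codon-by-codon ribosome movement rather than random RNA degradation.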
ACCESSION NUMBERS
The accession number for the draft of the C. magnum macronuclear genome
reported in this paper is European Nucleotide Archive: GCA_001499635.1.
SUPPLEMENTAL INFORMATION
Supplemental Information includes seven figures, one table, and supplemental data and can be found with this
article online at http://dx.doi.org/10.1016/j.cell.2016.06.020.
AUTHOR CONTRIBUTIONS
E.C.S. performed the computational analyses and assisted in laboratory experiments. V.S. cultured C. magnum, isolated nucleic acids and proteins,
and performed laboratory experiments searching for tRNAs. E.C.S. and V.S.
performed ribosome profiling. M.N. supervised the project. E.C.S. drafted
the manuscript with input from V.S., G.P., and M.N.
ACKNOWLEDGMENTS
We thank Letizia Modeo for collecting C. magnum, Vittorio Boscaro for the
original C. magnum RNA isolation, Sophie Braga-Lagache and Manfred Heller
from the Mass Spectrometry and Proteomics Laboratory at the Children's University Hospital in Bern for mass spectrometry support, Deis Haxholli for initial
genetic code inspections and the M.N. lab members for support and discussion. This research was supported by grants from the European Research
Council (ERC) (EPIGENOME) and National Center of Competence in Research
(NCCR) RNA and Disease to M.N., and the European COST Action BM1102.
Cluster computing was performed at the Vital-IT Center for High-Performance
Computing (http://www.vital-it.ch) of the Swiss Institute of Bioinformatics.
Received: August 11, 2015
REFERENCES
Aeschlimann, S.H., Jonsson, F., Postberg, J., Stover, N.A., Petera, R.L., Lipps, H.J., Nowacki, M., and Swart, E.C. (2014). The draft assembly of the radically organized Stylonychia lemnae macronuclear genome. Genome Biol. Evol. 6, 1707–1723.
Amrani, N., Ganesan, R., Kervestin, S., Mangus, D.A., Ghosh, S., and Jacobson, A. (2004). A faux 3′-UTR promotes aberrant termination and triggers nonsense-mediated mRNA decay. Nature 432, 112–118.
Aoki, K., Yano, K., Suzuki, A., Kawamura, S., Sakurai, N., Suda, K., Kurabayashi, A., Suzuki, T., Tsugane, T., Watanabe, M., et al. (2010). Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics. BMC Genomics 11, 210.
Atkins, J.F., and Baranov, P.V. (2007). Translation: duality in the genetic code. Nature 448, 1004–1005.
Aury, J.M., Jaillon, O., Duret, L., Noel, B., Jubin, C., Porcel, B.M., Segurens, B., Daubin, V., Anthouard, V., Aiach, N., et al. (2006). Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444, 171–178.
Baejen, C., Torkler, P., Gressel, S., Essig, K., Soding, J., and Cramer, P. (2014). Transcriptome maps of mRNP biogenesis factors define pre-mRNA recognition. Mol. Cell 55, 745–757.
Blanchet, S., Cornu, D., Argentini, M., and Namy, O. (2014). New insights into the incorporation of natural suppressor tRNAs at stop codons in Saccharomyces cerevisiae. Nucleic Acids Res. 42, 10061–10072.
Caron, F., and Meyer, E. (1985). Does Paramecium primaurelia use a different genetic code in its macronucleus? Nature 314, 185–188.
Chen, X., Bracht, J.R., Goldman, A.D., Dolzhenko, E., Clay, D.M., Swart, E.C., Perlman, D.H., Doak, T.G., Stuart, A., Amemiya, C.T., et al. (2014). The architecture of a scrambled genome reveals massive levels of genomic rearrangement during development. Cell 158, 1187–1198.
Chung, B.Y., Hardcastle, T.J., Jones, J.D., Irigoyen, N., Firth, A.E., Baulcombe, D.C., and Brierley, I. (2015). The use of duplex-specific nuclease in ribosome profiling and a user-friendly software package for Ribo-seq data analysis. RNA 21, 1731–1745.
Cosson, B., Couturier, A., Chabelskaya, S., Kiktev, D., Inge-Vechtomov, S., Philippe, M., and Zhouravleva, G. (2002). Poly(A)-binding protein acts in translation termination via eukaryotic release factor 3 interaction and does not influence [PSI(+)] propagation. Mol. Cell. Biol. 22, 3301–3315.
Coyne, R.S., Hannick, L., Shanmugam, D., Hostetler, J.B., Brami, D., Joardar, V.S., Johnson, J., Radune, D., Singh, I., Badger, J.H., et al. (2011). Comparative genomics of the pathogenic ciliate Ichthyophthirius multifiliis, its free-living relatives and a host species provide insights into adoption of a parasitic lifestyle and prospects for disease control. Genome Biol. 12, R100.
Crooks, G.E., Hon, G., Chandonia, J.M., and Brenner, S.E. (2004). WebLogo: a sequence logo generator. Genome Res. 14, 1188–1190.
Dunn, J.G., Foo, C.K., Belletier, N.G., Gavis, E.R., and Weissman, J.S. (2013). Ribosome profiling reveals pervasive and regulated stop codon readthrough in Drosophila melanogaster. eLife 2, e01179.
Dutilh, B.E., Jurgelenaite, R., Szklarczyk, R., van Hijum, S.A., Harhangi, H.R., Schmid, M., de Wild, B., Francoijs, K.J., Stunnenberg, H.G., Strous, M., et al. (2011). FACIL: Fast and Accurate Genetic Code Inference and Logo. Bioinformatics 27, 1929–1933.
Edgcomb, V.P., Leadbetter, E.R., Bourland, W., Beaudoin, D., and Bernhard, J.M. (2011). Structured multiple endosymbiosis of bacteria and archaea in a ciliate from marine sulfidic sediments: a survival mechanism in low oxygen, sulfidic sediments? Front Microbiol 2, 55.
Eisen, J.A., Coyne, R.S., Wu, M., Wu, D., Thiagarajan, M., Wortman, J.R., Badger, J.H., Ren, Q., Amedeo, P., Jones, K.M., et al. (2006). Macronuclear genome sequence of the ciliate Tetrahymena thermophila, a model eukaryote. PLoS Biol. 4, e286.
Eliseev, B., Kryuchkova, P., Alkalaeva, E., and Frolova, L. (2011). A single amino acid change of translation termination factor eRF1 switches between bipotent and omnipotent stop-codon specificity. Nucleic Acids Res. 39, 599–608.
Feng, J.M., Jiang, C.Q., Warren, A., Tian, M., Cheng, J., Liu, G.L., Xiong, J., and Miao, W. (2015). Phylogenomic analyses reveal subclass Scuticociliatia as the sister group of subclass Hymenostomatia within class Oligohymenophorea. Mol. Phylogenet. Evol. 90, 104–111.
Finn, R.D., Bateman, A., Clements, J., Coggill, P., Eberhardt, R.Y., Eddy, S.R., Heger, A., Hetherington, K., Holm, L., Mistry, J., et al. (2014). Pfam: the protein families database. Nucleic Acids Res. 42, D222–D230.
Freeland, S.J., and Hurst, L.D. (1998). The genetic code is one in a million. J. Mol. Evol. 47, 238–248.
Frischmeyer, P.A., van Hoof, A., O'Donnell, K., Guerrerio, A.L., Parker, R., and Dietz, H.C. (2002). An mRNA surveillance mechanism that eliminates transcripts lacking termination codons. Science 295, 2258–2261.
Gentekaki, E., Kolisko, M., Boscaro, V., Bright, K.J., Dini, F., Di Giuseppe, G., Gong, Y., Miceli, C., Modeo, L., Molestina, R.E., et al. (2014). Large-scale phylogenomic analysis reveals the phylogenetic position of the problematic taxon Protocruzia and unravels the deep phylogenetic affinities of the ciliate lineages. Mol. Phylogenet. Evol. 78, 36–42.
Gomes, A.C., Miranda, I., Silva, R.M., Moura, G.R., Thomas, B., Akoulitchev, A., and Santos, M.A. (2007). A genetic code alteration generates a proteome of high diversity in the human pathogen Candida albicans. Genome Biol. 8, R206.
Grabherr, M.G., Haas, B.J., Yassour, M., Levin, J.Z., Thompson, D.A., Amit, I., Adiconis, X., Fan, L., Raychowdhury, R., Zeng, Q., et al. (2011). Full-length transcriptome assembly from RNA-Seq data without a reference genome.
Hanyu, N., Kuchino, Y., Nishimura, S., and Beier, H. (1986). Dramatic events in ciliate evolution: alteration of UAA and UAG termination codons to glutamine codons due to anticodon mutations in two Tetrahymena tRNAs. EMBO J. 5, 1307–1311.
Harrell, L., Melcher, U., and Atkins, J.F. (2002). Predominance of six different hexanucleotide recoding signals 3′ of read-through stop codons. Nucleic Acids Res. 30, 2011–2017.
Helftenbein, E. (1985). Nucleotide sequence of a macronuclear DNA molecule coding for alpha-tubulin from the ciliate Stylonychia lemnae. Special codon usage: TAA is not a translation termination codon. Nucleic Acids Res. 13, 415–433.
Hirsh, D. (1971). Tryptophan transfer RNA as the UGA suppressor. J. Mol. Biol. 58, 439–458.
Hirsh, D., and Gold, L. (1971). Translation of the UGA triplet in vitro by tryptophan transfer RNAs. J. Mol. Biol. 58, 459–468.
Horowitz, S., and Gorovsky, M.A. (1985). An unusual genetic code in nuclear genes of Tetrahymena. Proc. Natl. Acad. Sci. USA 82, 2452–2455.
Jan, C.H., Friedman, R.C., Ruby, J.G., and Bartel, D.P. (2011). Formation, regulation and evolution of Caenorhabditis elegans 3′UTRs. Nature 469, 97–101.
Johnson, M.D., Tengs, T., Oldach, D.W., Delwiche, C.F., and Stoecker, D.K. (2004). Highly divergent SSU rRNA genes found in the marine ciliates Myrionecta rubra and Mesodinium pulex. Protist 155, 347–359.
Jungreis, I., Lin, M.F., Spokony, R., Chan, C.S., Negre, N., Victorsen, A., White, K.P., and Kellis, M. (2011). Evidence of abundant stop codon readthrough in Drosophila and other metazoa. Genome Res. 21, 2096–2113.
Keeling, P.J., and Doolittle, W.F. (1996). A non-canonical genetic code in an early diverging eukaryotic lineage. EMBO J. 15, 2285–2290.
Keeling, P.J., Burki, F., Wilcox, H.M., Allam, B., Allen, E.E., Amaral-Zettler,
L.A., Armbrust, E.V., Archibald, J.M., Bharti, A.K., Bell, C.J., et al. (2014).
The Marine Microbial Eukaryote Transcriptome Sequencing Project
(MMETSP): illuminating the functional diversity of eukaryotic life in the oceans
through transcriptome sequencing. PLoS Biol. 12, e1001889.
Kini, H.K., Silverman, I.M., Ji, X., Gregory, B.D., and Liebhaber, S.A. (2016).
Cytoplasmic poly(A) binding protein-1 binds to genomically encoded sequences within mammalian mRNAs. RNA 22, 6174.
Knight, R.D., Freeland, S.J., and Landweber, L.F. (2001). Rewiring the
keyboard: evolvability of the genetic code. Nat. Rev. Genet. 2, 4958.
Koonin, E.V., and Novozhilov, A.S. (2009). Origin and evolution of the genetic
code: the universal enigma. IUBMB Life 61, 99111.
Kuchino, Y., Hanyu, N., Tashiro, F., and Nishimura, S. (1985). Tetrahymena
thermophila glutamine tRNA and its gene that corresponds to UAA termination
codon. Proc. Natl. Acad. Sci. USA 82, 47584762.
Lekomtsev, S., Kolosov, P., Bidou, L., Frolova, L., Rousset, J.P., and Kisselev,
L. (2007). Different modes of stop codon restriction by the Stylonychia and Paramecium eRF1 translation termination factors. Proc. Natl. Acad. Sci. USA 104,
1082410829.
Muhlhausen, S., Findeisen, P., Plessmann, U., Urlaub, H., and Kollmar, M.
(2016). A novel nuclear genetic code alteration in yeasts and the evolution of
codon reassignment in eukaryotes. Genome Res. Published online May 6,
2016. http://dx.doi.org/10.1101/gr.200931.115.
Nasim, M.T., Jaenecke, S., Belduz, A., Kollmus, H., Flohe, L., and McCarthy,
J.E. (2000). Eukaryotic selenocysteine incorporation follows a nonprocessive
mechanism that competes with translational termination. J. Biol. Chem. 275,
1484614852.
Nowacki, M., Higgins, B.P., Maquilan, G.M., Swart, E.C., Doak, T.G., and
Landweber, L.F. (2009). A functional role for transposases in a large eukaryotic
genome. Science 324, 935938.
Osawa, S., and Jukes, T.H. (1989). Codon reassignment (codon capture) in
evolution. J. Mol. Evol. 28, 271278.
Preer, J.R., Jr., Preer, L.B., Rudman, B.M., and Barnett, A.J. (1985). Deviation
from the universal code shown by the gene for surface protein 51A in Paramecium. Nature 314, 188190.
Roy, B., Leszyk, J.D., Mangus, D.A., and Jacobson, A. (2015). Nonsense suppression by near-cognate tRNAs employs alternative base pairing at codon
positions 1 and 3. Proc. Natl. Acad. Sci. USA 112, 30383043.
Salas-Marco, J., Fan-Minogue, H., Kallmeyer, A.K., Klobutcher, L.A., Farabaugh, P.J., and Bedwell, D.M. (2006). Distinct paths to stop codon reassignment by the variant-code organisms Tetrahymena and Euplotes. Mol. Cell.
Biol. 26, 438447.
Sanchez-Silva, R., Villalobo, E., Morin, L., and Torres, A. (2003). A new noncanonical nuclear genetic code: translation of UAA into glutamate. Curr. Biol. 13,
442447.
Santos, M.A., and Tuite, M.F. (1995). The CUG codon is decoded in vivo as
serine and not leucine in Candida albicans. Nucleic Acids Res. 23, 14811486.
Schneider, S.U., and de Groot, E.J. (1991). Sequences of two rbcS cDNA
clones of Batophora oerstedii: structural and evolutionary considerations.
Curr. Genet. 20, 173175.
Schneider, S.U., Leible, M.B., and Yang, X.P. (1989). Strong homology between the small subunit of ribulose-1,5-bisphosphate carboxylase/oxygenase
of two species of Acetabularia and the occurrence of unusual codon usage.
Mol. Gen. Genet. 218, 445452.
Schultz, D.W., and Yarus, M. (1994). Transfer RNA mutation and the malleability of the genetic code. J. Mol. Biol. 235, 13771380.
Sengupta, S., and Higgs, P.G. (2015). Pathways of Genetic Code Evolution in
Ancient and Modern Organisms. J. Mol. Evol. 80, 229243.
Sladic, R.T., Lagnado, C.A., Bagley, C.J., and Goodall, G.J. (2004). Human
PABP binds AU-rich RNA via RNA-binding domains 3 and 4. Eur. J. Biochem.
271, 450457.
Swart, E.C., Bracht, J.R., Magrini, V., Minx, P., Chen, X., Zhou, Y., Khurana,
J.S., Goldman, A.D., Nowacki, M., Schotanus, K., et al. (2013). The Oxytricha
trifallax macronuclear genome: a complex eukaryotic genome with 16,000 tiny
chromosomes. PLoS Biol. 11, e1001473.
Tourancheau, A.B., Tsao, N., Klobutcher, L.A., Pearlman, R.E., and Adoutte, A.
(1995). Genetic code deviations in the ciliates: evidence for multiple and independent events. EMBO J. 14, 32623267.
Lemke, E.A. (2014). The exploding genetic code. ChemBioChem 15, 1691
1694.
Tuite, M.F., and McLaughlin, C.S. (1982). Endogenous read-through of a UGA

termination codon in a Saccharomyces cerevisiae cell-free system: evidence
for involvement of both a mitochondrial and a nuclear tRNA. Mol. Cell. Biol.
2, 490497.
Lorenz, R., Bernhart, S.H., Honer Zu Siederdissen, C., Tafer, H., Flamm, C.,
Stadler, P.F., and Hofacker, I.L. (2011). ViennaRNA Package 2.0. Algorithms
Mol. Biol. 6, 26.
Vallabhaneni, H., Fan-Minogue, H., Bedwell, D.M., and Farabaugh, P.J. (2009).
Connection between stop codon reassignment and frequent use of shifty stop
frameshifting. RNA 15, 889897.
Lozupone, C.A., Knight, R.D., and Landweber, L.F. (2001). The molecular basis
of nuclear genetic code change in ciliates. Curr. Biol. 11, 6574.
Wang, R., Xiong, J., Wang, W., Miao, W., and Liang, A. (2016). High frequency
of +1 programmed ribosomal frameshifting in Euplotes octocarinatus. Sci.
Rep. 6, 21139.
McCaughan, K.K., Brown, C.M., Dalphin, M.E., Berry, M.J., and Tate, W.P.
(1995). Translational termination efficiency in mammals is influenced by the
base following the stop codon. Proc. Natl. Acad. Sci. USA 92, 54315435.
702 Cell 166, 691702, July 28, 2016
Yang, Z. (2007). PAML 4: phylogenetic analysis by maximum likelihood. Mol.

Biol. Evol. 24, 15861591.
Article
Complementary Contributions of Striatal Projection
Pathways to Action Initiation and Execution
Authors
Fatuel Tecuapetla, Xin Jin,
Susana Q. Lima, Rui M. Costa
Correspondence
fatuel@ifc.unam.mx (F.T.),
rui.costa@neuro.fchampalimaud.org
(R.M.C.)
In Brief
The direct and the indirect basal ganglia
pathways regulate movements by acting in
complementary supportive and
permissive manners, rather than by
having opposing hyperkinetic and
akinetic effects.
Highlights
• Both basal ganglia pathways are required for action sequence initiation and performance
• Striatonigral manipulations slowed, but striatopallidal manipulations aborted, initiation
• Striatonigral, but not striatopallidal, pathway activation prolonged action sequences
• Inhibition or overactivation of the striatopallidal pathway aborted action sequences
Tecuapetla et al., 2016, Cell 166, 703–715

Fatuel Tecuapetla,1,2,4,* Xin Jin,3 Susana Q. Lima,1 and Rui M. Costa1,4,*
1Champalimaud Neuroscience Programme, Champalimaud Centre for the Unknown, Avenida De Brasília, Lisbon 1400-038, Portugal
2Neuropatología Molecular, Instituto de Fisiología Celular, Universidad Nacional Autónoma de México, Ciudad Universitaria, Circuito exterior s/n, Ciudad de México 04510, Mexico
3Molecular Neurobiology Laboratory, The Salk Institute for Biological Studies, 10010 North Torrey Pines Road, La Jolla, CA 92037, USA
4Co-senior author
*Correspondence: fatuel@ifc.unam.mx (F.T.), rui.costa@neuro.fchampalimaud.org (R.M.C.)
SUMMARY
The performance of an action relies on the initiation
and execution of appropriate movement sequences.
Two basal ganglia pathways have been classically
hypothesized to regulate this process via opposing
roles in movement facilitation and suppression. By
using a series of state-dependent optogenetic manipulations, we dissected the contributions of each
pathway and found that both the direct striatonigral
pathway and the indirect striatopallidal pathway are
necessary for smooth initiation and the execution
of learned action sequences. Optogenetic inhibition
or stimulation of each pathway before sequence
initiation increased the latency for initiation: manipulations of the striatonigral pathway activity slowed
action initiation, and those of the striatopallidal
pathway aborted action initiation. The inhibition of
each pathway after initiation also impaired ongoing
execution. Furthermore, the subtle activation of
striatonigral neurons sustained the performance of
learned sequences, while striatopallidal manipulations aborted ongoing performance. These results
suggest a supportive versus permissive model,
where patterns of coordinated activity, rather than
the relative amount of activity in these pathways,
regulate movement initiation and execution.
INTRODUCTION
The ability to select the actions that we want to do in a particular
situation is critical for life. However, a particular action is typically
composed of complex sequences of movement. Therefore, besides initiating the appropriate movement sequence, it is also
important to monitor and maintain the performance of ongoing
movements after initiation. Basal ganglia-thalamo-cortical loops
are critical for the selection and organization of actions (Albin
et al., 1989; Alexander and Crutcher, 1990; Bar-Gad et al.,
2003; Cui et al., 2013; DeLong, 1990; Eliasmith et al., 2012; Graybiel, 1998; Hikosaka et al., 2000; Jin and Costa, 2010; Kravitz
et al., 2010; Mink, 1996). Previous studies have suggested that
activity in basal ganglia circuits is important for the initiation
and the performance of action sequences (Boyd et al.,
2009; Graybiel, 2005; Jin et al., 2014; Kao et al., 2005; Ölveczky
et al., 2005; Schmidt et al., 2013). It has been hypothesized that
the two projection pathways originating in the striatum, the
striatonigral pathway (the direct pathway, which projects directly to the basal
ganglia output) and the striatopallidal pathway (the indirect pathway), have
opposing roles in action selection and ongoing action modulation (Alexander and Crutcher, 1990; DeLong, 1990; Durieux
et al., 2012; Gerfen et al., 1990; Kravitz et al., 2010; Tai et al.,
2012). Consistently, it has been shown that the separate activation of each of these pathways reveals an opposing contribution
of each projection pathway to movement (Hikida et al., 2010;
Kravitz et al., 2010; Tai et al., 2012; Tecuapetla et al., 2014). However, recent studies examining the activity of these projection
pathways during the initiation of naturalistic movements or
learned action sequences have revealed that both projection
pathways are concurrently active during sequence initiation
(Cui et al., 2013; Isomura et al., 2013; Jin et al., 2014; Tecuapetla
et al., 2014) and are differentially modulated during sequence
performance (Jin et al., 2014).
It is therefore crucial to causally test if the activity in each of the
striatal projection pathways is critical for the initiation of actions.
Furthermore, it is also important to determine if the activity in
striatonigral and striatopallidal neurons is necessary for the
ongoing execution of action sequences after they are initiated.
In this study, we used state-dependent optogenetic manipulations to inhibit or enhance the activity of each striatal projection
pathway, independently or simultaneously, while animals initiated or executed a learned action sequence. The findings
presented here are not compatible with a rate model in which
the relative activity between these pathways promotes more or less
movement; rather, the findings suggest a model in which
complementary activity patterns in each of these projection
pathways are important for smooth action initiation and
execution.
RESULTS
Mice Learn to Press in Sequences or Bouts
Mice were trained on a self-paced operant task until they pressed
a lever in bouts or in sequences of more than one press (Jin and
Costa, 2010). After 1 day of magazine training and 3 days of
continuous reinforcement, animals were trained for 11–12 days in
a fixed ratio 8 schedule (FR8), where a reinforcer (10% sucrose)
was dispensed every eight presses. Animals gradually organized
their presses in sequences or bouts (Figure 1A) and decreased
the proportion of single lever presses (for C57BL/6J animals,
n = 8, see Figure 1B, upper; for the cohorts of Cre animals, see Figure 1B, bottom). The percentage of lever press sequences (bouts
with ≥2 lever presses) became stable after 4 days of training in the
FR8 schedule (FR8 sessions 4–11; p > 0.05; Kruskal-Wallis; both
for C57BL/6J and Cre animals; Figure 1B). The mean number of
lever presses per sequence increased with training and then remained stable (4 ± 0.6 on day 1; 6 ± 0.8 on day 6; 6 ± 0.7 on
day 11; Kruskal-Wallis, day 1 versus 6, p < 0.03; day 6 versus
11, p > 0.05). With training, mice developed a stereotyped path
from the magazine to the lever (after consuming the reward
they moved toward the lever to start the next sequence of lever
presses; Figure 1C; Movie S1). We placed an infrared beam in
this path and measured the latency of the animals to start a
new sequence of lever presses. We verified that after just a few
days of training animals showed a rather stable latency between
crossing the infrared (IR) beam and performing the first lever
press of a sequence (Figure 1D; top: C57BL/6J, n = 8; bottom:
Cre animals, at least six animals per group).
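The sequence measures reported above can be illustrated with a minimal sketch. It is not the authors' analysis code: it assumes, purely for illustration, that presses separated by no more than a fixed gap belong to the same bout. The gap value `max_gap_s`, the function names, and the timestamps are hypothetical; the criterion actually used follows the cited methods (Jin and Costa, 2010).

```python
def split_into_bouts(press_times, max_gap_s=2.0):
    """Group sorted lever-press timestamps (s) into bouts.

    max_gap_s is a hypothetical threshold chosen for illustration only.
    """
    bouts = []
    for t in press_times:
        if bouts and t - bouts[-1][-1] <= max_gap_s:
            bouts[-1].append(t)      # close enough to the previous press
        else:
            bouts.append([t])        # start a new bout
    return bouts


def bout_summary(press_times, max_gap_s=2.0):
    """Percentage of bouts with >= 2 presses, and mean presses per bout."""
    bouts = split_into_bouts(press_times, max_gap_s)
    if not bouts:
        raise ValueError("no lever presses supplied")
    n_sequences = sum(1 for b in bouts if len(b) >= 2)
    pct_sequences = 100.0 * n_sequences / len(bouts)
    return {
        "n_bouts": len(bouts),
        "pct_sequences": pct_sequences,
        "pct_single": 100.0 - pct_sequences,
        "mean_presses_per_bout": sum(len(b) for b in bouts) / len(bouts),
    }


# Hypothetical timestamps: a four-press bout, a single press, a three-press bout.
presses = [0.0, 0.5, 1.1, 1.6, 10.0, 20.0, 20.4, 20.9]
summary = bout_summary(presses)
print(summary)
```

Tracking `pct_single` across training days would give the kind of curve described for Figure 1B, under the stated assumption about how bouts are delimited.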
Inhibiting the Activity of Each Projection Pathway before
Sequence Initiation Increased the Latency to Initiate
Action Sequences
Previous studies have shown that the dorsolateral striatum (DLS)
is critical for the execution of well-learned motor sequences (Yin,
2009, 2010; Yin et al., 2005), and activity in striatal projection
neurons in DLS correlates with the initiation and execution of action sequences (Jin and Costa, 2010; Jin et al., 2014; Jog et al.,
1999). To achieve optogenetic manipulations of the activity of
specific projection pathways in DLS, mice expressing Cre recombinase under specific promoters were bilaterally injected
with viruses (AAV2/1) that express the opsin of interest in a
Cre-dependent manner (ArchT for inhibition and ChR2 for activation experiments; Figure 1E) and implanted with fiber optics
above the site of viral injection (Figure 2B).
To achieve simultaneous inhibition of both striatal pathways,
we injected a virus that expressed Archaerhodopsin in a Cre-dependent manner (Flex-ArchT-GFP) into the DLS of RGS9L-Cre
mice, which express Cre in the striatal projection neurons of
both pathways (Figures 1G–1I). To achieve selective expression
of ArchT-GFP in either the striatonigral or the striatopallidal pathway, we injected the Flex-ArchT-GFP virus into the DLS of
either D1-Cre (FK150 or the more striatal-specific EY217) mice,
or D2-Cre and A2A-Cre mice (ER43 and KG139), respectively
(Figures 1E, left, and 1J). We used more than one Cre line for
each cell type to ensure that the effects observed reflected the
manipulation of striatonigral and striatopallidal neurons (general
D1-Cre lines target neurons outside the striatum, and D2 receptors may be expressed in some striatal interneurons) (Cui et al.,
2013; Tai et al., 2012). Using stereological counting, we determined that 32%–35% of all neurons in the injection area were infected (Figure 1F). The expression of ArchT-GFP did not change
the intrinsic properties of the neurons for up to 3 weeks after
expression (Figure S1F).
We verified ex vivo and in vivo that green-light illumination
caused the inhibition of ArchT-GFP-expressing neurons (Figure S1). We used 25–35 mW measured at the fiber tip to achieve
a sufficiently large modulation in vivo. We calculated the number
of neurons in each hemisphere that could be optogenetically
manipulated based on 35% of the cells being infected (Figure 1F),
the spread of the viral infection (>0.5 mm; see Figure 1E), the
estimation of the light penetration (Tecuapetla et al., 2014), and
the density of neurons (see the Supplemental Experimental
Procedures; 6,900 neurons in 0.300 mm³). We estimated that
approximately 1,700 neurons per hemisphere received enough
light to be inhibited in the case of ArchT, and approximately
1,400 to be activated in the case of ChR2 (see the Supplemental Experimental Procedures).
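The figures quoted in this paragraph can be combined in a short piece of illustrative arithmetic. This is only a sketch under a stated assumption: it reads "6,900 neurons in 0.300 mm³" as the count of all neurons in that reference volume. The full calculation, including light penetration, is in the Supplemental Experimental Procedures and is not reproduced here; all variable and function names are hypothetical.

```python
# Figures quoted in the text.
NEURONS_IN_REFERENCE_VOLUME = 6900   # neurons in the reference volume
REFERENCE_VOLUME_MM3 = 0.300         # mm^3
INFECTED_FRACTION = 0.35             # fraction of cells infected (Figure 1F)
REPORTED_ARCHT = 1700                # neurons estimated to be inhibited
REPORTED_CHR2 = 1400                 # neurons estimated to be activated


def neuron_density_per_mm3(n_neurons, volume_mm3):
    """Neuron density implied by a count in a known volume."""
    return n_neurons / volume_mm3


def infected_neurons(n_neurons, infected_fraction):
    """Opsin-expressing neurons in the reference volume."""
    return n_neurons * infected_fraction


def implied_light_fraction(n_reported, n_infected):
    """Share of infected neurons that the reported estimate corresponds to.

    Assumption (not stated in the text): the reported number is a subset
    of the infected neurons in the same reference volume.
    """
    return n_reported / n_infected


density = neuron_density_per_mm3(NEURONS_IN_REFERENCE_VOLUME, REFERENCE_VOLUME_MM3)
infected = infected_neurons(NEURONS_IN_REFERENCE_VOLUME, INFECTED_FRACTION)
archt_fraction = implied_light_fraction(REPORTED_ARCHT, infected)
chr2_fraction = implied_light_fraction(REPORTED_CHR2, infected)

print(round(density), round(infected))
print(round(archt_fraction, 2), round(chr2_fraction, 2))
```

Under this reading, roughly 2,415 infected neurons sit in the reference volume, and the reported estimates would correspond to about 70% (ArchT) and 58% (ChR2) of them receiving sufficient light. These fractions are an inference from the quoted numbers, not values given by the authors.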
After injection of the virus, animals recovered for about 3–4 days
and then training started. After 11–12 days of training
on FR8, animals were subjected to optogenetic manipulation
test sessions. During test sessions, we performed an intra-session comparison for each animal between blocks in which the
light was on versus blocks in which the light was off (the behavioral session was divided into three blocks, unless otherwise
specified in the text), allowing us to compare the effects of
light in each animal within the same session (see Figure 2).
To study the contribution of each basal ganglia pathway in the
initiation of an individual action sequence, we manipulated the
activity of the striatal cells before the first lever press of each
bout or sequence of presses (Figure 2A). We took advantage
of the fact that, with training, animals developed not only stereotyped
sequences of lever pressing but also a preferred path from the
food magazine to the lever (see Movie S1). We placed a
vertical infrared beam between the food magazine and the lever;
this allowed us to trigger the optogenetic inhibition before the
first lever press (Figure 2), ensuring that striatal neurons were
inhibited before the first lever press (light on for 5 s, inhibition
starting on average 57 ± 20 ms after light onset). The latency
between IR beam crossing and first lever press on the test day,
before optogenetic manipulation, was as
follows (median [mean ± error] in s): RGS9-ArchT eYFP: 1.7
[2.3 ± 0.7]; D1 ArchT-GFP: 2.6 [4.5 ± 1.0]; A2A/D2 ArchT-GFP:
1.4 [2.7 ± 0.9]; RGS9L ChR2-eYFP 5 Hz: 3.1 [3.1 ± 0.4]; D1
ChR2-eYFP 5 Hz: 1.2 [1.3 ± 0.1]; A2A/D2 ChR2-eYFP 5 Hz: 1.1
[1.5 ± 0.3]; RGS9L ChR2-eYFP 14 Hz: 3.1 [3.1 ± 0.4]; D1 ChR2-eYFP
14 Hz: 1.1 [1.4 ± 0.3]; A2A/D2 ChR2-eYFP 14 Hz: 1.3
[2.7 ± 1.1].
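The latency measure described above (time from infrared beam crossing to the first lever press of the next sequence) can be sketched as follows. This is a minimal illustration with hypothetical timestamps and function names, not the authors' code. It assumes that a crossing followed by another crossing before any press is discarded; that rule is not stated in the text.

```python
from bisect import bisect_right
from statistics import mean, median


def initiation_latencies(beam_times, press_times):
    """Latency (s) from each IR beam crossing to the next lever press.

    Assumption for illustration: a crossing is skipped when the animal
    crosses the beam again before pressing the lever.
    """
    beams = sorted(beam_times)
    presses = sorted(press_times)
    latencies = []
    for i, beam in enumerate(beams):
        j = bisect_right(presses, beam)          # first press after this crossing
        if j == len(presses):
            continue                             # no press follows this crossing
        next_beam = beams[i + 1] if i + 1 < len(beams) else float("inf")
        if presses[j] < next_beam:
            latencies.append(presses[j] - beam)
    return latencies


# Hypothetical session: three beam crossings, each followed by a bout of presses.
beams = [10.0, 30.0, 55.0]
presses = [11.5, 12.0, 12.4, 32.5, 33.0, 58.0]
latencies = initiation_latencies(beams, presses)
print(latencies, median(latencies), mean(latencies))
```

Reporting both the median and the mean, as in the list above, is useful because latency distributions are typically skewed by occasional long pauses.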
We observed that simultaneous inhibition of the activity of both
striatal pathways increased the latency to perform the first press
of the sequence (RGS9L ArchT-GFP light off = 2.3 ± 0.7 versus
light on = 5.6 ± 1.4; n = 8; Z = 2.38; p < 0.05, Wilcoxon test; Figure 2C, middle). We also verified that this effect was not due to
a decrease in performance over the course of the test session
by comparing the relation between the light-on and light-off
blocks during the test session (L_on) with the same blocks in a
control session (with no light delivered during sequence initiation; L_off). This comparison corroborated that the simultaneous inhibition
of both pathways increased the latency to initiate an action
sequence (RGS9L ArchT-GFP L_off = 1.1 ± 0.1 versus L_on =
3.1 ± 1.1, n = 8; Z = 1.96; p < 0.05, Wilcoxon test; Figure 2C, right
upper; see Figures S2A–S2I for further comparison between the
[Figure 1 image. Recoverable panel labels: task schematic (FR8; lever press start and end, head entry, lick, 10% sucrose) on days 1 and 11 of training; lever press (%) versus days in FR8 for sequences of ≥2 lever presses and single lever presses; latency to start (s) versus days in FR8; ArchT-GFP and ChR2-eYFP expression with DAPI in RGS9L-Cre, D1-Cre, A2A-Cre, and D2-Cre mice (Cx, Str, GP, GPm, SNr); neurons expressing the opsin (%); number of cells for BAC D1 td-tomato and BAC D2 eGFP crosses; intensity/max intensity by position (GP, GPm, SNr).]
Figure 1. Learning to Initiate and Perform Sequences of Lever Press in Cre Lines Expressing Opsins in the Different Striatal Projection
Pathways
(A) Representative example of the behavioral time stamps from one C57BL/6J mouse during two stages of training. Vertical lines represent the time stamps of
lever presses; they are color-coded to highlight the first and last press in a sequence.
(B) Proportion of sequences of presses (black) and individual presses (red) as training progressed for a group of C57BL/6J mice (upper) and for the different
cohorts of Cre lines expressing the opsins used in this study (bottom).
(C) Experimental setup showing the operant box and the position of an infrared beam in the path from the reward magazine to the lever.
(D) Mean latency between crossing the infrared beam and the first lever press of a sequence, as training progressed, for C57BL/6J mice (upper) and for the
different cohorts of Cre lines (bottom). The vertical arrows in (B) and (D) indicate when the optogenetic manipulations started.
(E) Representative sagittal mouse brain slices for the different Cre lines used in this study expressing either ArchT-GFP (left) or ChR2-eYFP (right). The reporter
signal (GFP or eYFP) is present in the striatum and in the nuclei that receive inputs from the infected striatal cells. Bottom middle: representative picture combining
three frames in the z axes (1 mm distance; confocal image) and depicting the presence of ArchT-GFP in the periphery of the labeled NeuN-positive cells.
(F) Quantification of expression of the opsins in different sections of each Cre line, using stereological quantification. The horizontal and vertical axes are the
percent of sections and the percent of NeuN-positive cells that also showed labeling for the opsin.
(G) To evaluate whether the RGS9L Cre line targets the two striatal pathways, we injected virus that expresses protein reporters into the striatum of RGS9L Cre
animals generated by crossing RGS9L Cre animals with either D1 td-tomato or D2-eGFP.
(H) Estimation of the proportion of D1 td-tomato-positive neurons from the neurons expressing eYFP after viral injection in RGS9L Cre.
(I) Estimation of the proportion of D2-eGFP-positive neurons from the neurons expressing td-tomato after viral injection in RGS9L Cre.
(J) Quantification of pixel intensity normalized to maximum intensity from three different target nuclei of the striatum. Cx, cortex; Str, striatum; GP, globus pallidus;
GPm, globus pallidus medial; SNr, substantia nigra pars reticulata.
See also Figure S1 and Movie S1.
[Figure 2 image. Recoverable panel labels: (A) light-off (second block) and light-on (third block) schematics, with the infrared beam between magazine and lever triggering the light; (B) fiber tip placement for RGS9L-Cre, D1-Cre, and D2/A2A-Cre mice; latency plots (seconds, and ratio of third block to second block) for RGS9L-Cre (striatonigral and striatopallidal cells), D1-Cre (striatonigral cells), and A2A/D2-Cre (striatopallidal cells) under ArchT-GFP (light, 5 s), ChR2-eYFP (light, 5 Hz, 5 s), and ChR2-eYFP (light, 14 Hz, 5 s), each with eYFP controls.]
Figure 2. Optogenetic Modulations of the Activity of Striatal Projection Pathways before Sequence Initiation Increases the Latency to Initiate
the Action Sequence
(A) Experimental setup showing control blocks (light off, second block of a session) and optogenetic stimulation blocks (light on, third block of a session). To
investigate the contribution of the striatal pathway activity to the initiation of action sequences, light manipulations were triggered by crossing the infrared beam
when the animal moved from the magazine to the lever (red dot).
(B) Optic fiber placements for optogenetic manipulations in the DLS. Representative drawings depict the positions of the fiber tips from the animals used in this
study.
(C) Latency to initiate a sequence of lever presses for animals expressing either ArchT-GFP (dark green) or eYFP (light green) in both pathways (RGS9L); each
point is the mean latency for the session of one animal. Left: data from the last session without light manipulations (light off). Middle: a session with light manipulations (light on only during the third block). Right: ratios between the sessions with no light manipulations and those with light manipulation: Loff, off/off; Lon,
on/off.
(D and E) Same as in (C), except D1-Cre and D2/A2A-Cre animals were used, respectively.
(F–H) Experiments presenting the effects of optogenetic activation of the striatal projection pathways with a 5-Hz stimulation on the latency to initiate the action
sequence (ChR2-eYFP, blue; eYFP, light green) for RGS9L, D1-Cre, or D2/A2A-Cre animals, respectively.
(I–K) Same as in (F)–(H), except, in this case, optogenetic activation occurred at 14 Hz. *p < 0.05; Wilcoxon test. Data are presented as the mean and SEM.
blocks of no light manipulation during the session of optogenetic
inhibition versus the control session). Furthermore, we also verified
that this effect was specific to the presence of ArchT in the striatal cells, because we did not observe it in animals expressing
only eYFP (RGS9L eYFP L_off = 2.6 ± 1.0 versus L_on = 2.0 ±
0.3, n = 7; Z = 0.86; p > 0.05, Wilcoxon test; Figure 2C, right
bottom).
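The paired comparisons reported here (light off versus light on within the same animals, summarized by a Z value from a Wilcoxon test) can be sketched with the standard library. This is a minimal illustration that uses the normal approximation of the Wilcoxon signed-rank statistic without tie or continuity corrections; whether the authors' software applied such corrections is not stated. The latencies below are hypothetical, and `block_ratio` is a hypothetical helper mirroring the third-block to second-block ratio described for L_on and L_off.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks of values, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_z(off, on):
    """Normal-approximation Z for paired samples (no tie/continuity correction)."""
    diffs = [b - a for a, b in zip(off, on) if b != a]   # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean_w = n * (n + 1) / 4
    sd_w = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean_w) / sd_w


def block_ratio(third_block_latency, second_block_latency):
    """Ratio used to normalize a session: third block over second block."""
    return third_block_latency / second_block_latency


# Hypothetical mean latencies (s) for seven animals.
light_off = [1.2, 2.0, 1.5, 3.1, 2.4, 1.8, 2.9]
light_on = [3.0, 4.1, 2.2, 6.0, 5.5, 2.1, 7.3]
z = wilcoxon_z(light_off, light_on)
print(round(z, 2))
```

With small groups the statistic is bounded: when every animal changes in the same direction, Z depends only on the number of animals, so the test is sensitive to consistency across animals rather than to the size of the change.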
These results show, in a temporally precise manner, that striatal activity is required during action initiation, but they do not
show whether there is a specific contribution of each projection
pathway. Therefore, we investigated the effect of inhibiting each
pathway on action initiation. Inhibition of striatonigral pathway
activity before the first press of a sequence (using the D1-Cre lines) also increased the latency to initiate the action
sequence (D1-Cre ArchT-GFP, light off = 4.5 ± 1.0 versus light
on = 15.4 ± 6.2; n = 9, Z = 2.66; p < 0.01, Wilcoxon test; Figure 2D,
middle). This effect remained consistent when controlling for any
potential order effect of the light-on and light-off blocks during
the session (n = 9; Z = 2.07; p < 0.05, Wilcoxon test; Figure 2D,
upper right; Table S1) and was not observed in D1-Cre animals
expressing only eYFP (n = 6, Z = 1.57; p > 0.05, Wilcoxon
test; Figure 2D, bottom right; Table S1). Interestingly, inhibition
of the activity of the striatopallidal pathway had a similar effect
in increasing the latency to initiate a sequence of lever presses
(D2/A2A-Cre ArchT-GFP light off = 2.7 ± 0.9 versus light on =
7.3 ± 2.2, n = 10; Z = 2.80, p < 0.01, Wilcoxon test; Figure 2E,
middle). This effect remained significant when controlling for
any potential effect of the order of the blocks (n = 10; Z =
2.19; p < 0.03, Wilcoxon test; Figure 2E, upper right; Table
S1) and was not observed in D2/A2A-Cre animals expressing
only eYFP (n = 7; Z = 0.50; p > 0.05, Wilcoxon test; Figure 2E,
bottom right; Table S1). Interestingly, the magnitude of the increase in latency was similar whether both pathways were inhibited or only one of them was (Figure S2J;
p > 0.05; Kruskal-Wallis test).
In a separate set of experiments we expressed ChR2 in parvalbumin-positive striatal interneurons (PV-Cre) to indirectly inhibit
the activity of striatal projection neurons during action initiation.
Increasing the activity of PV-positive GABAergic striatal interneurons before sequence initiation (14 Hz stimulation) also resulted in
an increase in the latency to perform the first press of a sequence
(PV-Cre ChR2-GFP light off = 1.8 ± 0.4 versus light on = 2.9 ± 0.8
s, n = 12; Z = 2.98, p < 0.004, Wilcoxon test; Figure S3B, Light on).
This effect was consistent when compared with a similar block
in a control session (n = 11; Z = 2.22; p < 0.05, Wilcoxon test;
Figure S3B, left; Table S1) and was not a general effect of
manipulating the activity of striatal cells, as it was not observed
when ChR2 was expressed in cholinergic-positive interneurons
(n = 9; Z = 0.17; p > 0.05, Wilcoxon test; Figure S3C; Table S1).
Together, these experiments suggest that the activity of
striatal projection neurons is critical for action initiation and
that activity in both striatal pathways is necessary for the appropriate initiation of an action sequence.
Increasing the Activity of Each Projection Pathway
before Sequence Initiation Also Increased the Latency
for Action Initiation
The previous results question the classical rate model in which a higher
relative firing rate of direct versus indirect pathway neurons promotes movement, and the opposite promotes lack of movement.
We therefore tested directly whether activation of striatonigral or
striatopallidal neurons before movement initiation would promote
or inhibit movement initiation, respectively. We started by using a stimulation
frequency of 5 Hz, close to the average firing rate of striatal projection neurons in vivo (Costa et al., 2004; Jin and Costa, 2010; Tecuapetla et al., 2014). When we simultaneously stimulated the activity of both pathways before sequence initiation, we observed an
increase in the latency to initiate the action sequence (RGS9L
ChR2-eYFP 5 Hz, light off = 3.1 ± 0.4 versus light on = 5.8 ± 0.4,
n = 7, Z = 2.36; p < 0.05, Wilcoxon test; Figure 2F, middle). A similar
result was obtained if striatal projection neurons were stimulated
at a higher frequency (RGS9L ChR2-eYFP 14 Hz, light off =
1.9 ± 0.3 versus light on = 6.5 ± 0.5, n = 7, Z = 2.36; p < 0.05, Wilcoxon test; Figure 2I, middle). Similarly, when we enhanced the
activity of either striatonigral or striatopallidal neurons just before
the onset of action sequence initiation, we also observed an
increase in the latency for sequence initiation (D1 ChR2-eYFP
5 Hz, light off = 1.3 ± 0.1 versus light on = 6.1 ± 0.7; D1 ChR2-eYFP 14 Hz, light off = 2.7 ± 1.1 versus light on = 10.0 ± 2.3; D2/
A2A ChR2-eYFP 5 Hz, light off = 1.5 ± 0.3 versus light on = 3.3 ±
0.8; D2/A2A ChR2-eYFP 14 Hz, light off = 1.5 ± 0.2 versus light
on = 7.3 ± 0.9; p < 0.05, Wilcoxon test; Figures 2G, 2H, 2J, and
2K, middle). The same effects were observed when we normalized
the increase in latency of sessions with light stimulation to sessions without light stimulation (p < 0.05, Wilcoxon test; Figures
2G, 2H, 2J, and 2K, upper right in each figure; Table S1),
and no effects on latency were observed in eYFP-expressing animals (p > 0.05, Wilcoxon test; Figures 2G, 2H, 2J, and 2K, bottom
right in each figure; Table S1).
These experiments show that enhancing the activity of either
striatal projection pathway before the onset of action initiation
did not facilitate, but rather impaired action sequence initiation,
suggesting that specific activity patterns (Jin et al., 2014), rather
than rate of activity in these pathways, are critical for appropriate
action initiation.
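The paired light-off versus light-on comparisons above rely on the Wilcoxon test. As an illustration only, the following Python sketch computes the signed-rank Z statistic with the normal approximation, without continuity or tie correction. That choice is an assumption about how the reported Z values were obtained, and the latencies are hypothetical, not data from this study. When all seven animals change in the same direction, the statistic is about 2.37 in magnitude, which is in line with the Z = 2.36 reported for n = 7.

```python
import math


def wilcoxon_signed_rank_z(off, on):
    """Z statistic of the Wilcoxon signed-rank test (normal approximation).

    Assumptions: paired samples, zero differences dropped, average ranks
    for ties, no continuity correction and no tie correction of the variance.
    """
    diffs = [b - a for a, b in zip(off, on) if b != a]
    n = len(diffs)
    # Rank the absolute differences, giving tied values their average rank.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    # Sum of ranks belonging to positive differences (light on > light off).
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mean) / sd


# Hypothetical latencies (s) for seven animals; not values from the study.
latency_off = [2.1, 3.0, 3.5, 2.8, 4.0, 3.3, 3.0]
latency_on = [5.0, 6.2, 5.5, 6.9, 6.1, 5.2, 5.7]
print(round(wilcoxon_signed_rank_z(latency_off, latency_on), 2))  # prints 2.37
```

With n = 7, this is the largest magnitude the statistic can take, so a reported Z near 2.36 implies that every animal in the group shifted in the same direction.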
Manipulating the Activity of Striatonigral Neurons
Slowed Action Initiation, while Manipulating the Activity
of Striatopallidal Neurons Disrupted Action Initiation
To further characterize the increases in latency observed after inhibiting or stimulating the different basal ganglia pathways, we
analyzed the videos of the animals while initiating the action sequences (see the Experimental Procedures). We observed that
manipulations of the activity of the striatonigral pathway slowed
the initiation of the action sequence (Figures 3C and 3E, middle),
with the animals staying in the same path and zone of initiation
(Z1; determined for each animal; see the Experimental Procedures) and resuming pressing immediately after the stimulation
was turned off (Figure 3C; Movie S2). On the other hand, manipulations of striatopallidal activity prompted animals to leave
the initiation zone and to abort sequence initiation (inhibition:
D2/A2A-Cre ArchT-GFP switches between zone 1 and zone 2;
light off = 0.4 ± 0.2 [median = 0] versus light on = 2.2 ± 0.6
[median = 2]; n = 7; Z = 2.21; p < 0.05, Wilcoxon test; Figures 3D and 3E, lower; activation: D2/A2A-Cre ChR2-eYFP
14 Hz switches between zones 1 and 2; light off = 0.6 ± 0.5
[median = 0] versus light on = 11.1 ± 2.5 [median = 9], n = 9;
Z = 2.56, p < 0.02, Wilcoxon test) (Movie S3). Furthermore, the
number of switches after inhibition or overactivation of striatopallidal neurons was significantly higher than the number of
switches after inhibition or overactivation of striatonigral neurons
(p < 0.05; Kruskal-Wallis test; Figure S4). Simultaneous inhibition
or activation of both BG pathways did not cause an abortion of
action sequence initiation, indicating that it was the imbalance
between striatopallidal and striatonigral activity that disrupted
sequence initiation (Figures 3B and 3E, upper).
Therefore, although manipulations of activity in both pathways
caused increased latency for action initiation, this occurred for
different reasons: manipulating the activity of striatonigral neurons slowed action initiation, while manipulating the activity of
striatopallidal neurons aborted action initiation, and caused the
animals to switch to other behaviors.
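The number of switches out of the initiation zone quantified above can be counted from the tracked center of mass. The following is a minimal sketch, assuming Z1 is an axis-aligned rectangle and the track is a list of (x, y) positions, one per video frame. The function name, zone shape, and coordinates are hypothetical, since Z1 was defined for each animal from its own occupancy (see the Experimental Procedures).

```python
def count_zone_exits(track, zone):
    """Count transitions from inside to outside a rectangular zone.

    track: sequence of (x, y) center-of-mass positions, one per frame.
    zone:  (x_min, y_min, x_max, y_max); a hypothetical rectangular Z1.
    """
    x_min, y_min, x_max, y_max = zone
    exits = 0
    was_inside = False
    for x, y in track:
        inside = x_min <= x <= x_max and y_min <= y <= y_max
        if was_inside and not inside:
            exits += 1  # the animal has just left Z1
        was_inside = inside
    return exits


# Hypothetical zone and track (arbitrary units); not data from the study.
z1 = (0.0, 0.0, 10.0, 10.0)
track = [(5, 5), (9, 5), (12, 5), (13, 6), (8, 5), (5, 5), (5, 11), (5, 9)]
print(count_zone_exits(track, z1))  # prints 2
```

Counting only inside-to-outside transitions means that a long excursion into Z2 is scored as a single switch, regardless of how many frames it lasts.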
Inhibiting the Activity of Striatal Projection Pathways
during Action Sequence Performance Decreased
Pressing Frequency
To investigate whether the activity of each basal ganglia pathway
is required during the execution of action sequences, in separate
sessions, we inhibited the activity after sequence initiation. This
was achieved by triggering the light manipulations on the first
Cell 166, 703–715, July 28, 2016 707
[Figure 3 graphic; surviving panel labels: behavioral box schematic for light-off (2nd block) and light-on (3rd block) with magazine, lever, zone 1, and zone 2, where an infrared beam triggers the light; top view, cumulative frames, and tracking for RGS9-Cre, D1-Cre, and D2/A2A-Cre animals; number of switches out of Z1, light off versus on, for ArchT-GFP and ChR2-eYFP (5 Hz and 14 Hz).]
Figure 3. Direct Pathway Manipulation Slows, while Indirect Pathway Manipulations Abort the Initiation of the Action Sequences
(A) Experimental setup with an arrow highlighting the possibility that animals would get out of the zone of performance of the task (Z1) during optogenetic
manipulations. Z1 was defined for each animal based on individual occupancy. See the Supplemental Experimental Procedures.
(B–D) Left: gray pictures correspond to the upper view from three different animals in the behavioral box for each of the Cre lines depicted on the right (stimulation
at 14 Hz). Middle: heat color images of body occupancy from the videos acquired during the blocks of no light manipulation (off) versus the blocks of manipulation
(on). Color-code is normalized, where lighter colors show the areas in which animals spent more time during the corresponding block. Right: a representation of
the tracking of the center of mass for the same data. Light on, blue; light off, red.
(E) Quantification of the number of times animals left zone 1 for each manipulation. *p < 0.05, Wilcoxon test.
lever press of a sequence (Figure 4A). We observed that the
simultaneous or independent inhibition of each basal ganglia
pathway decreased the number of lever presses executed during the 5 s of light inhibition (p < 0.05, Wilcoxon test; not
observed in eYFP, p > 0.05, Wilcoxon test; Figure 4B, bottom;
Table S2). Furthermore, when analyzing the videos of the animals, we found that only the inhibition of the indirect pathway
increased the number of times that animals aborted
the ongoing performance (D2/A2A-Cre ArchT-GFP light off =
0.4 ± 0.2 [median = 0] versus D2/A2A-Cre ArchT-GFP light
on = 2.2 ± 0.6 [median = 2], n = 7, Z = 2.21, p < 0.03; Figure 4C).
These results show that the activity of both basal ganglia pathways is required for action performance and that appropriate activity of the indirect pathway is critical for animals to continue
ongoing behavior and not to switch to different behaviors.
Activation of the Direct Pathway after Sequence
Initiation Supports the Performance of Ongoing Action
Sequences
The results reported above are not consistent with a prokinetic/
antikinetic model for the different striatal projection pathways.
Rather, they suggest that both pathways are necessary for

movement initiation and performance, with the direct pathway
supporting the initiation and execution of the desired action (animals slow down/pause initiation and execution with direct
pathway inhibition) and the indirect pathway permitting it by
inhibiting competing actions (animals abort initiation and execution with indirect pathway inhibition). We therefore tested if
subtle biasing of the activity of direct pathway neurons after
sequence initiation would be sufficient to maintain ongoing
performance. Indeed, when we activated striatonigral neurons
at 5 Hz after sequence initiation, we observed an increase in
the number of lever presses in the sequence (D1-Cre ChR2-eYFP
5Hz-presses-L1 light off = 5.6 ± 0.3 versus D1-Cre ChR2-eYFP
5Hz-presses-L1 light on = 8.9 ± 0.4; n = 12; Z = 3.05; p <
0.002; Wilcoxon test; Figures 5A and 5B, Stim L1; Movie S4),
and during the period of the stimulation (D1-Cre ChR2-eYFP
5Hz-L1-stim light off = 4.5 ± 0.2 versus D1-Cre ChR2-eYFP
5Hz-L1-stim light on = 5.9 ± 0.6; n = 12; Z = 1.96; p < 0.05; Wilcoxon test; Figure 5C). The corresponding comparison with a
control session with no light manipulations showed that this
increase was not due to a change in performance along the
[Figure 4 graphic; surviving panel labels: light-off (2nd block) and light-on (3rd block) schematics with magazine, lever, Z1, and Z2, where the 1st lever press triggers light; number of lever presses during stimulation and on/off ratio for RGS9-Cre, D1-Cre, and D2/A2A-Cre animals (ArchT-GFP versus eYFP control); number of switches out of Z1, light off versus on.]
Figure 4. Inhibiting the Activity of Striatal Projection Pathways during
Sequence Execution Impairs Performance
(A) Experimental setup for control blocks (light off, second block) and blocks of
light manipulations (light on, third block). Note that, in this case, light inhibition
was triggered when animals performed the first lever press in a sequence of
lever presses.
(B) Number of lever presses during 5 s of light manipulation (on) and no light
manipulation (off) to inhibit the striatal projection pathways targeted by the
corresponding Cre mouse lines. Bottom: the same data as in the upper panels
is presented, except the data are normalized (dark green); the corresponding
control groups of animals expressing only eYFP are included. *p < 0.05; Wilcoxon test.
(C) Analysis presenting the number of crosses out of zone 1 (Z1), tracking the
body position for blocks of no light manipulation (off) versus the blocks of light
manipulation (on). *p < 0.05; Wilcoxon test.
See also Table S2 and Movie S4.
session (see Figure S5A). This increase in performance was not
observed in D1-Cre animals expressing only eYFP (D1-Cre ChR2-eYFP
5Hz-presses-L1 light off = 6.9 ± 0.2 versus D1-Cre ChR2-eYFP
5Hz-presses-L1 light on = 6.8 ± 0.4; n = 8; Z = 0.14; p > 0.05;
Wilcoxon test; Figure 5B), or in D2/A2A-Cre animals (Figure 6B).
Furthermore, this maintenance of ongoing actions by low-frequency activation of the direct pathway was observed even if
we delivered the light stimulation later during the sequence, for
example, after the fourth lever press of the sequence (D1-Cre
ChR2-eYFP 5Hz-presses-L4 light off = 7.2 ± 0.6 versus D1-Cre
ChR2-eYFP 5Hz-presses-L4 light on = 10.4 ± 0.6; n = 7; Z = 2.36;
p < 0.02; Wilcoxon test; Figures 5E and 5F, Stim L4). Interestingly, the latency to initiate the next sequence was not decreased
(D1-Cre ChR2-eYFP 5Hz-latency light off = 2.0 ± 0.3 versus D1-Cre
ChR2-eYFP 5Hz-latency light on = 3.8 ± 1.0; n = 10; Z = 1.88; p >
0.05; Wilcoxon test; Figure 5D), suggesting that this brief stimulation was not reinforcing.
To directly test if this effect was due to reinforcement of
ongoing behavior or to the activation of neurons involved
in the performance of the well-learned sequences, we performed
a separate experiment in which we stimulated the activity of the
direct pathway, except we did so early in training (first day of FR8
training, third block; matching the days of ChR2 eYFP expression). The same low-frequency activation of direct pathway
neurons (5 Hz) early in training did not produce a change in the
number of lever presses (D1-Cre ChR2-eYFP 5Hz-presses-L1-early
light off = 6.4 ± 0.4 versus D1-Cre ChR2-eYFP 5Hz-presses-L1-early
light on = 6.6 ± 0.4; n = 4; Z = 0.73; p > 0.05; Wilcoxon test; Figure 5B, Stim L1 early), supporting an interpretation where the increase in performance was due to the activation of neurons
involved in sequence performance after training.
To further test that the increase in pressing due to direct
pathway activation was not due to positive reinforcement of lever pressing,
we compared experiments of low-frequency activation of the
direct pathway in a block of light off before versus a block of light
off after the stimulation block. We reasoned that if the activation
of the D1-Cre ChR2-eYFP neurons during the stimulation block were
per se rewarding, we would expect an enhancement
in performance in the light-off block after stimulation. We found no difference
between the number of lever presses before a block of direct
pathway stimulation (light off-before) and immediately after (light
off-after; D1-Cre ChR2-eYFP light off-before = 5.7 ± 0.3 versus D1-Cre ChR2-eYFP light off-after = 6.5 ± 0.4; n = 12; Z = 1.25; p >
0.05; Wilcoxon test; Figure 5G). We also found no difference in
the latency to initiate the action sequence before and after stimulation (D1-Cre ChR2-eYFP 5Hz-latency light off-before = 2.0 ± 0.3
versus D1-Cre ChR2-eYFP light off-after = 3.8 ± 1.0; n = 10;
Z = 1.88; p > 0.05; Wilcoxon test; Figure 5H). Furthermore,
we observed that the average reward rate (ten per block)
did not change during the stimulation block (D1-Cre ChR2-eYFP
5Hz-reward/rate light off = 632 ± 322 s versus D1-Cre ChR2-eYFP
5Hz-reward/rate light on = 660 ± 353 s; n = 12; Z = 0.54; p >
0.05; Wilcoxon test).
These results are consistent with a view in which maintenance of the appropriate pattern of activation of dMSNs (direct
pathway medium spiny neurons) supports action performance.
However, they could also be viewed as supporting a prokinetic
rate model for striatonigral neurons. If the rate of dMSN activity
promoted the ongoing movement, then overactivation of
dMSNs should further promote movement and have the opposite effect of dMSN inhibition. If, on the other hand, it is the
[Figure 5 graphic; surviving panel labels: lever-press rasters by sequence number over time (s) for light off, light on (5 Hz L1), light on (5 Hz L4), and light on (14 Hz); number of lever presses and number of presses during stimulation, light off versus on (Stim L1, Stim L1 early, Stim L4; ChR2-eYFP versus eYFP); latency to start (s); number of presses in sequence and latency to start before versus after light (2nd versus 4th block); single Lp and Seq ≥ 2 Lp per block at 14 Hz; D1-Cre top view, cumulative frames, and tracking with number of switches out of Z1.]
Figure 5. Direct Pathway Activation during Sequence Execution Can Support Ongoing Action Performance
(A) Example of lever pressing from a D1-Cre mouse expressing ChR2-eYFP in the DLS direct pathway neurons during a no light block manipulation (left) and a light
block (5 Hz, 5 s, light activation was triggered by the first lever press of each sequence; Stim L1); note the increase in the number of lever presses during light stimulation.
(B) Left to right: quantification of the total number of lever presses for several animals subjected to the same manipulation as in (A). Middle (light green): control
experiments from animals expressing only eYFP that were subjected to the same protocol. Right: the total number of lever presses per sequence from a group of
animals subjected to stimulation immediately after the first lever press, except that the stimulation occurred early in training (early, during the first day of training in FR8).
(C) Mean number of lever presses during 5 s of light manipulation, 5 Hz, 5 s. In (A)–(C), the first press in the sequence triggered the light activation.
(D) Latency to start a sequence of lever presses without (off) and with a 5-Hz stimulation triggered by the first press in a sequence.
(E) Example of a D1-Cre animal as in (A), except the light manipulation is triggered by the fourth lever press in the sequence (5 Hz, 5 s, Stim L4).
(F) Behavior from several animals as in (E).
(G) Number of presses per sequence before (second) versus a block after light manipulations (fourth; in this case, the animals were trained with four blocks).
(H) Comparison of the latency to start a sequence of lever presses before (second) and after light manipulation (fourth).
(I) Example of the lever-pressing behavior from an animal as in (A), except activation is at 14 Hz. In this case, instead of an increase in performance, there is a
pause in the sequence of lever presses during stimulation, and pressing immediately resumes after the light is turned off.
(J) Left to right: quantification of individual presses (Single-Lp) and sequences of lever press (Seq ≥ 2 Lp) for D1-Cre ChR2-eYFP or D1-Cre eYFP mice stimulated
at 14 Hz. The rightmost plot shows the quantification of the left panels presented as the ratio between the block of light on/light off.
(K) Mean number of lever presses during 5 s of light stimulation, 14 Hz.
(L) Number of exits from the performance zone (Z1) during the block of no light versus the block of a 14-Hz stimulation.
[Figure 6 graphic; surviving panel labels: lever-press rasters by sequence number over time (s) for light off and light on (5 Hz L1), and for the 2nd, 3rd (light, 14 Hz), and 4th blocks; number of lever presses and number of presses during stimulation, light off versus on (Stim L1; ChR2-eYFP versus eYFP); latency to start (s) at 5 Hz and 14 Hz; single Lp and Seq ≥ 2 Lp per block at 14 Hz; D2/A2A-Cre top view, cumulative frames, and tracking with number of switches out of Z1.]
Figure 6. Increasing the Activity of the Striatopallidal Cells during the Execution of the Action Sequence Aborts Ongoing Performance
(A) Example of lever pressing for a D2/A2A-Cre animal expressing ChR2-eYFP in DLS during a no light block (light off, left) and a block of light stimulation triggered
by the first press (L1) in the sequence.
(B) Left to right: the total number of presses for several animals subjected to the same manipulation as in (A). Middle (light green): control experiments from animals
subjected to the same protocol, except the animals express eYFP.
(C) Mean number of lever presses during 5 s of light manipulation, 5 Hz. The first press in the sequence triggered light activation.
(D) Latency to start a sequence of lever presses for D2/A2A-Cre ChR2-eYFP animals during the block of stimulation at 5 Hz.
(E) Latency to start a sequence of lever presses for D2/A2A-Cre ChR2-eYFP animals during the block of stimulation at 14 Hz.
(F) Example of the performance of a D2/A2A-Cre ChR2-eYFP animal during no light blocks (left and right, light off) and the block with light stimulation at 14 Hz, triggered
by the first press in the sequence (L1) (light on, middle).
(G) Left to right: quantification of individual presses (single Lp) and sequences of lever press (Seq ≥ 2 Lp) from D2/A2A-Cre ChR2-eYFP or D2/A2A-Cre eYFP (light
on versus light off). Right: the quantification of the left, presented as the ratio between the blocks of light on/light off, is shown.
(H) Mean number of lever presses during 5 s of light manipulation, 14 Hz.
(I) Exits from the performance zone (Z1) during the block of light versus no light (14 Hz).
See also Figures S5, S6, and S7 and Movie S5.
appropriate level of activity that supports the execution of the
action sequence, then both inhibiting and overactivating dMSNs should impair ongoing movement. We observed
that further increasing the light power at the tip of the fiber further
increased the number of presses for animals for which lower
power did not have a large effect, but decreased it for animals
where lower power had a substantial effect (Figure S5A). This
indicates that there is an ideal number of activated cells to
cause the increase in behavior, and that activation of more
cells resulted in less behavior, and suggests that the sustained
performance effect results from the activation of a specific motor
pattern, and not from reinforcement of any behavior. We also
observed that increasing the frequency of stimulation (14 Hz,
well above the normal firing rate of MSNs) resulted in a decrease
in the number of lever presses during stimulation (D1-Cre ChR2-eYFP
14Hz-press-stim light off = 5.3 ± 0.4 versus D1-Cre ChR2-eYFP
14Hz-press-stim light on = 2.2 ± 0.3; n = 10; Z = 2.70; p <
0.007; Wilcoxon test; Figure 5K). eYFP controls did not show
any effect (Figure 7F). Interestingly, animals resumed pressing immediately after stimulation (Figure 5I), and there was
no change in the total number of sequences (D1-Cre ChR2-eYFP
14Hz-seq light off mean = 6.9 ± 0.8 versus D1-Cre ChR2-eYFP
14Hz-seq light on = 7.2 ± 0.7; n = 10; Z = 0.119; p > 0.05; Wilcoxon test; Figure 5J). Furthermore, analyzing the videos of the
[Figure 7 graphic; surviving panel labels: total number of lever presses during light-on blocks for RGS9-Cre, D1-Cre, and D2/A2A-Cre animals expressing ArchT-GFP (continuous light, 5 s) or ChR2-eYFP (5 Hz or 14 Hz, 5 s) versus eYFP controls, with the corresponding ratio.]
animals during stimulation, we observed that during 14-Hz
stimulation of the dMSN pathway, the animals remained in the
zone of performance (D1-Cre ChR2-eYFP switches between
zone 1 and zone 2, median 14Hz light off = 1 [mean ± error = 1.2 ±
0.4] versus D1-Cre ChR2-eYFP median 14Hz light on = 0.5
[mean ± error = 1.5 ± 0.8]; n = 8; Z = 0.00; p > 0.05, Wilcoxon
test; Figure 5L).
Taken together, these results suggest that subtle activation
of dorsolateral striatonigral neurons after sequence initiation
was sufficient to support the performance of well-learned
actions. This effect was not due to reinforcement and was likely
due to the stimulation of the specific ongoing motor pattern
because it was not observed in animals with low levels of
training and was impaired (but not aborted) by higher frequency
of activation of these neurons, or by the activation of more
neurons.
Activation of Striatopallidal Neurons after Sequence
Initiation Aborted Ongoing Action Sequences
Contrary to what was observed for dMSNs, stimulation of
iMSNs (indirect pathway medium spiny neurons) at 5 Hz after the sequences were initiated (after the first press) did not affect
the number of presses during light activation (D2/A2A-Cre
ChR2-eYFP 5Hz-presses light off = 4.52 ± 0.31 versus D2/A2A-Cre
ChR2 5Hz-stimulation light on = 5.0 ± 0.6; n = 13; Z = 1.15; p >
0.05; Wilcoxon test; Figure 6C) or the total number of lever presses
(D2/A2A-Cre ChR2-eYFP 5Hz-presses light off = 4.9 ± 0.3 versus
D2/A2A-Cre ChR2-eYFP 5Hz-presses light on = 5.8 ± 0.6; n = 13;
Z = 1.81; p > 0.05; Wilcoxon test; Figures 6A and 6B). However,
during activation of the iMSNs at a higher frequency
(14 Hz), many sequences were aborted, with animals abandoning the usual zone of performance (D2/A2A-Cre ChR2-eYFP
switches between zones 1 and 2, median 14Hz-stim light
off = 0 [mean ± error = 1.7 ± 1.4] versus D2/A2A-Cre ChR2-eYFP
median 14Hz-stim light on = 13.5 [mean ± error = 15.2 ±
5.6]; n = 8; Z = 2.31; p < 0.021, Wilcoxon test; Figure 6I; Movie
S5). There was a decrease in the total number of sequences during the stimulation block (D2/A2A-Cre ChR2-eYFP 14Hz-seq light
off = 8.6 ± 0.5 versus D2/A2A-Cre ChR2-eYFP 14Hz-seq light
on = 5.2 ± 0.8; n = 13; Z = 3.07; p < 0.003; Wilcoxon test; Figure 6G) and an increase in the number of trials in which mice
performed single lever presses (hence the decrease in total
presses per sequence; D2/A2A-Cre ChR2-eYFP 14Hz-SingleLp light
off = 0.3 ± 0.1 versus D2/A2A-Cre ChR2-eYFP 14Hz-SingleLp light
on = 3 ± 0.5; n = 13; Z = 2.95; p < 0.004; Wilcoxon test; Figures
6G and 6H). Neither 5 Hz stimulation nor 14 Hz stimulation of this
pathway after sequence initiation affected the latency to initiate
action sequences in the same block (Figures 6D and 6E; p >
0.05, Wilcoxon test).
These data complement the striatopallidal inhibition findings
and indicate that proper activity of striatopallidal neurons during
sequence performance permits ongoing actions to continue,
while too little or too much activity of the striatopallidal pathway
Figure 7. Total Number of Presses in Completed Sequences for All
the Manipulations Presented
(A–I) Four panels are shown in each row. Left: the overall performance during
the last session of FR8 training before light manipulations is shown. Left
middle: the intra-session comparison of the performance of animals during the
no light versus the light manipulation block is presented. Right middle: the
corresponding control animals for each Cre line are depicted. The panels on
the farthest right part of each row present the same data as in the three panels
explained above, except the data are normalized to the no light manipulation
block.
leads to sequence abortion and causes animals to switch to

different behaviors (Movie S5).
DISCUSSION
The present results demonstrate that activity of both striatal projection pathways in the DLS is required for proper action initiation
and for proper continuation of performance after initiation.
Manipulating the activity of the striatal projection pathways
before the initiation of an action sequence increased the latency
to initiate a sequence (Figure 2). However, the striatonigral
pathway manipulations slowed down or paused the initiation,
while the striatopallidal pathway manipulations increased the
number of aborted action sequence initiations and promoted
switching to other behaviors. Our findings support the simultaneous requirement of activity in both pathways for proper action
initiation (Cui et al., 2013; Gallistel, 1980; Hikida et al., 2010; Hikosaka et al., 2000; Isomura et al., 2013; Jin et al., 2014; Mink,
1996), but also highlight that activity in each pathway has
different contributions to initiation. These observations underscore the importance of using optogenetic inhibition (rather
than just activation) and state-dependent manipulations to probe
the role of ongoing activity in behavior and to disambiguate between different models of basal ganglia function. The results presented here are inconsistent with rate models of basal ganglia
activity, where the activation of the direct striatonigral pathway
would be prokinetic while the activation of striatopallidal neurons
would be antikinetic (Albin et al., 1989). Rather, these results
support models in which precise and concomitant activity patterns in both striatonigral and striatopallidal neurons are necessary for proper action selection, with the striatonigral pathway
supporting the selection/initiation of a particular motor program
and the striatopallidal neurons having a more permissive role, for
example, by preventing competing motor programs in the same
context (Hikosaka et al., 2000; Mink, 1996; Nishizawa et al.,
2012) or by stopping previous actions and promoting switching
to new ones (Sano et al., 2013; Schmidt et al., 2013).
Our findings, however, extend beyond these models,
as they show that the inhibition of both pathways after
action sequence initiation affects the ongoing performance by
decreasing the number of presses and slowing down movements during the period of inhibition (Desmurget and Turner,
2008) (Figure 4). Therefore, these results support a view where
activity in both BG pathways is also required for the appropriate
performance of learned action sequences. Subtle activation of
the striatonigral, but not striatopallidal pathway, after sequence
initiation, was sufficient to support the continued performance
of the ongoing sequence. This effect did not seem to happen
through reinforcement of the action upon striatonigral stimulation, as observed when manipulating dorsomedial striatum in
previous studies (Kravitz et al., 2012), because it did not happen
early in training (Figure 5B) and did not change the probability or
latency of an action initiation in the following trials (Figures 5G
and 5H). Rather, the effects are consistent with further modulation of the neurons that were active during sequence performance, given the enrichment of striatonigral pathway neurons
displaying sequence-specific sustained activity after learning
(Jin et al., 2014) and that the same stimulation before movement
initiation (5 Hz) did not promote movement. The interpretation
that the effects were due to the modulation of neuronal activity
patterns involved in action execution and not just a mere prokinetic or reinforcing effect is further supported by the fact that
activation of more cells, or the same neurons at a higher frequency, did not sustain action performance, and actually paused
it. On the other hand, proper activity of striatopallidal neurons
seems to be necessary for allowing ongoing actions to continue,
as both decreases and increases in activity led to abortion of
ongoing actions. These observations are consistent with recent
studies proposing that the direct pathway is more related to
exploitative behavior (Chakravarthy et al., 2010), while the
STN-GP complex is more involved in action shifting (Isoda and
Hikosaka, 2008; Monchi et al., 2006) or exploration (Chakravarthy et al., 2010).
The data presented here suggest that action initiation and
execution are very dynamic processes, with different neurons
of each projection pathway being involved in the different aspects of behavior (consistent with Jin et al., 2014). These results
also suggest new vistas on action selection models, given that it
does not seem to be the rate of activity of striatonigral and striatopallidal neurons that selects the desired patterns and inhibits
undesired patterns. Rather, it appears that precise activity patterns in each pathway are necessary for initiation and for execution of action sequences, since overactivation and inhibition of
both pathways seem to have similar effects in many of the experiments described. These findings suggest that the functional
anatomy and organization of the basal ganglia are more complex
than traditionally viewed. This is not surprising given the reports
that striatopallidal neurons send substantial collaterals to striatonigral neurons (Taverna et al., 2008; Tecuapetla et al., 2009),
that striatonigral neurons send collaterals to GPe (Cazorla et al.,
2014), that GPe arkypallidal neurons innervate back both
MSN types (Mallet et al., 2012), and that GPe neurons can innervate directly cortical structures (Saunders et al., 2015). Still, in
our experiments, the vast majority of neurons showed positive
modulation following optogenetic activation (Figure S6F), and
we observed the expected effects on the basal ganglia output
nucleus SNr (Figure S7). Therefore, it is possible that it is the
precise timing/pattern of basal ganglia output activity that is
important for action selection (Goldberg et al., 2012) or that the
projections originating from SNr are more heterogeneous than
previously appreciated.
Organization of Action Sequences
Some models postulate that the production of action sequences
could happen in a serial manner, where initiation of one element
triggers the execution of the subsequent, etc. (reflex chains
[Sherrington, 1906]). Others propose that action sequences could
be represented in a hierarchical manner (Gallistel, 1980; Graybiel, 1998; Hikosaka et al., 2000; Lashley, 1951), where higher
representations of the number and order of elements in a
sequence may monitor and control execution of modules. An
interesting observation of this study is that although inhibition
of both striatal pathways decreased the number of presses
during stimulation, the total number of lever presses in a
sequence did not change (Figures 7A, 7D, and 7G). This shows
that after light stimulation, animals performed the number of
presses necessary to reach the same number of presses performed in sequences without stimulation, suggesting that areas
other than DLS can monitor and control the number of presses
that are supposed to be done in a sequence. This is even more
striking in the case, for example, of 14-Hz stimulation of the
dMSN, where animals paused (Figures 5I and 5L), but resumed
performance immediately after pausing and completed the
same number of presses as in sequences without stimulation
(Figure 7F). Furthermore, with 5-Hz activation of the dMSN,
where animals increased the number of presses during stimulation (Figure 7E), they still performed more presses after stimulation, as if the presses driven by stimulation did not count for
the animal. Also, in the sequences that were not aborted during
14-Hz stimulation of the striatopallidal pathway (60%; Figure 6G),
animals performed sequences with the same number of presses
as with no light conditions (Figure 7I). These data suggest a
somewhat hierarchical organization of lever press sequences,
in which the total number of elements in a sequence can be monitored
and encoded in circuits outside of these basal ganglia pathways.
However, the finding that overactivation or inhibition of the striatopallidal
pathway (Figures 4C and 6F–6I) was sufficient to abort
ongoing sequences and cause a switch in behavior reveals
that sequence organization is more complex: changes
in activity of this pathway can abort an ongoing movement
sequence and lead the animal to switch behaviors. Therefore,
these data support a mixed model of sequence organization,
with an overall hierarchical organization of action sequences,
but with striatopallidal neurons monitoring ongoing performance
and having the ability to abort the ongoing sequence and promote switching to other behaviors.
Conclusions
In conclusion, the work presented here supports a model in
which both striatal projection pathways complement each other
during action initiation and performance, with the direct pathway
mainly supporting the selection/initiation and performance of
particular actions and the indirect pathway permitting proper
initiation by inhibiting competing movements or promoting
switching by monitoring ongoing performance and aborting
previous actions. These results show that appropriate complementary activity patterns in these pathways are critical for proper
motor control and may have important implications for pathological conditions that produce excessive repetitive behaviors or
excessive behavioral switching.
Animals
Male mice (2 to 4 months old) of the following lines were used: D1 Cre, A2A or D2 Cre, and RGS9L Cre backcrossed into Black C57BL/6J, or D1 td-tomato or D2 EGFP.
Training
We trained mice to develop self-paced sequences of lever presses as previously
described (Jin and Costa, 2010; Jin et al., 2014). A sequence of lever presses
was defined as a bout of consecutive lever presses with no entry into the
magazine and licking.
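The sequence definition above can be illustrated with a minimal sketch. It assumes behavioral events are available as (timestamp, label) pairs and that either a magazine entry or a lick closes the current bout; the event labels and the function name are hypothetical and are not taken from the original analysis code.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, hypothetical event label)


def split_into_sequences(events: List[Event]) -> List[List[float]]:
    """Group consecutive lever presses into sequences.

    Assumption: a "magazine" entry or a "lick" ends the current bout.
    """
    sequences: List[List[float]] = []
    current: List[float] = []
    for t, label in sorted(events):
        if label == "press":
            current.append(t)
        elif label in ("magazine", "lick"):
            # A magazine entry or lick closes any bout in progress.
            if current:
                sequences.append(current)
                current = []
    if current:
        sequences.append(current)
    return sequences


example_events = [
    (0.0, "press"), (0.4, "press"),
    (1.0, "magazine"), (1.2, "lick"),
    (3.0, "press"), (3.5, "press"), (3.9, "press"),
]
example_sequences = split_into_sequences(example_events)
```

In this example the five presses split into two sequences of two and three presses, separated by the magazine visit.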
Stereotaxic Virus Injections and Fiber Implantation
After anesthesia, each animal was bilaterally injected using glass pipettes
with 1 μl viral stock solution [DIO AAV2/1 ChR2-eYFP (UPENN), DIO AAV2/1
eYFP (UPENN), and DIO AAV2/1 ArchT-GFP (North Carolina)] by pressure
into the DLS, coordinates: AP: 0.5 mm, ML, 2.3 mm from bregma, and DV
2.3 mm below the surface of the brain. To deliver light into the striatum, a
fiber optic of 200 μm (ChR2) or 300 μm (ArchT) diameter was implanted.
Temporally Defined Optogenetic Striatal Manipulations In Vivo
To manipulate the BG pathways optogenetically before the initiation of a sequence of lever presses, we took advantage of the fact that animals
developed stereotypical sequences of lever presses (Figures 1A–1D; Movie S1).
We set a threshold (>6–10 licks, signaling consumption of the
reward previously delivered); the next time the animal moved from the
magazine to the lever, it broke the infrared beam, and this set a timestamp
to trigger light on and to quantify the latency to initiate the sequence of lever
presses (Figures 1C and 1D). To achieve the light manipulations during the
execution of the sequence, we used the timestamp of the first lever press (Figures 4, 5, and 6).
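The trigger logic described above can be sketched as a small state machine. This is an illustration only: the event labels, the function names, and the default lick threshold (`min_licks=6`) are assumptions chosen for the example and do not reproduce the acquisition software used in the study.

```python
from typing import List, Optional, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, hypothetical event label)


def find_light_triggers(events: List[Event], min_licks: int = 6) -> List[float]:
    """Return beam-break timestamps that follow enough licks to indicate
    that the previous reward was consumed (threshold is an assumption)."""
    triggers: List[float] = []
    lick_count = 0
    for t, label in sorted(events):
        if label == "lick":
            lick_count += 1
        elif label == "beam_break":
            if lick_count >= min_licks:
                triggers.append(t)
            # Leaving the magazine resets the lick counter either way.
            lick_count = 0
    return triggers


def initiation_latency(trigger_time: float, press_times: List[float]) -> Optional[float]:
    """Latency from the trigger to the first lever press at or after it."""
    later = [p for p in press_times if p >= trigger_time]
    return min(later) - trigger_time if later else None


example_events = [(float(i), "lick") for i in range(1, 8)]   # seven licks
example_events += [(10.0, "beam_break")]                     # qualifies as a trigger
example_events += [(15.0, "lick"), (16.0, "lick"), (20.0, "beam_break")]  # too few licks
example_triggers = find_light_triggers(example_events)
example_latency = initiation_latency(10.0, [9.0, 10.5, 11.0])
```

Here only the first beam break follows enough licks to count as a trigger, and the latency is measured to the first press after it. For manipulations during sequence execution, the timestamp of the first lever press would replace the beam-break timestamp.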
Anatomical Verification and Stereology Quantification
After extracting the brains of the experimental mice, sectioning the dorsal
striatum, and mounting and sealing the 50 μm sections, 40× magnification Z
stacks (50 × 50 × 30 μm; 2 μm interslice) were acquired from the upper right
quadrant using a randomly positioned grid (square grid 200 μm) covering
the dorsal striatum (ZEN lite software, Zeiss). These Z stacks were imported
into the Stereo Investigator software (MBF Bioscience), and the
NeuN-positive (neuronal marker; see middle bottom part of Figure 1E),
eYFP-positive, or GFP-positive cells were quantified. To evaluate whether the
RGS9L line targets the two striatal pathways, see the Supplemental Experimental Procedures.
Supplemental Information includes seven figures, two tables, and five movies and can be found with this article
online at http://dx.doi.org/10.1016/j.cell.2016.06.032.
F.T. and R.M.C. designed and wrote the study; X.J. provided advice about
setting the lever press task; and S.Q.L. provided advice for the study and
the opsins used.
ACKNOWLEDGMENTS
We thank C.R. Gerfen and Gensat for the BAC transgenic mouse lines used in
this study, J. Alves da Silva, G. Dugue, M. Lorincz, M. Correia, J. Almeida, and
A. Vaz for electrophysiology, genotyping, and histology support, G. Martins for
comments on the manuscript, and J.O. Ramirez-Jarquin for setting up the Cre
lines at UNAM. This work was supported by an FCT fellowship, a Consejo
Nacional de Ciencia y Tecnología (CONACyT-Mexico) grant (220412), and
a Dirección General de Asuntos del Personal Académico, UNAM Grant
(IA200815) to F.T. and a European Research Council Consolidator Grant, an
HHMI International Early Career Scientist Grant, an ERA-Net NEURON grant,
and a Simons Foundation grant (SFARI #294295) to R.M.C.
REFERENCES
Albin, R.L., Young, A.B., and Penney, J.B. (1989). The functional anatomy of
basal ganglia disorders. Trends Neurosci. 12, 366–375.
Alexander, G.E., and Crutcher, M.D. (1990). Functional architecture of basal
ganglia circuits: neural substrates of parallel processing. Trends Neurosci.
13, 266–271.
Bar-Gad, I., Morris, G., and Bergman, H. (2003). Information processing,
dimensionality reduction and reinforcement learning in the basal ganglia.
Prog. Neurobiol. 71, 439–473.
Boyd, L.A., Edwards, J.D., Siengsukon, C.S., Vidoni, E.D., Wessel, B.D., and
Linsdell, M.A. (2009). Motor sequence chunking is impaired by basal ganglia
stroke. Neurobiol. Learn. Mem. 92, 35–44.
Cazorla, M., de Carvalho, F.D., Chohan, M.O., Shegda, M., Chuhma, N., Rayport, S.,
Ahmari, S.E., Moore, H., and Kellendonk, C. (2014). Dopamine D2
receptors regulate the anatomical and functional balance of basal ganglia circuitry.
Neuron 81, 153–164.
Chakravarthy, V.S., Joseph, D., and Bapi, R.S. (2010). What do the basal
ganglia do? A modeling perspective. Biol. Cybern. 103, 237–253.
Costa, R.M., Cohen, D., and Nicolelis, M.A. (2004). Differential corticostriatal
plasticity during fast and slow motor skill learning in mice. Curr. Biol. 14,
1124–1134.
Cui, G., Jun, S.B., Jin, X., Pham, M.D., Vogel, S.S., Lovinger, D.M., and Costa,
R.M. (2013). Concurrent activation of striatal direct and indirect pathways
during action initiation. Nature 494, 238–242.
DeLong, M.R. (1990). Primate models of movement disorders of basal ganglia
origin. Trends Neurosci. 13, 281–285.
Desmurget, M., and Turner, R.S. (2008). Testing basal ganglia motor functions
through reversible inactivations in the posterior internal globus pallidus.
J. Neurophysiol. 99, 1057–1076.
Durieux, P.F., Schiffmann, S.N., and de Kerchove d'Exaerde, A. (2012). Differential
regulation of motor control and response to dopaminergic drugs by D1R
and D2R neurons in distinct dorsal striatum subregions. EMBO J. 31, 640–653.
Eliasmith, C., Stewart, T.C., Choo, X., Bekolay, T., DeWolf, T., Tang, Y., and
Rasmussen, D. (2012). A large-scale model of the functioning brain. Science
338, 1202–1205.
Gallistel, C.R. (1980). The organization of action: a new synthesis.
Am. J. Psychol. 94, 190–192.
Gerfen, C.R., Engber, T.M., Mahan, L.C., Susel, Z., Chase, T.N., Monsma, F.J.,
Jr., and Sibley, D.R. (1990). D1 and D2 dopamine receptor-regulated gene
expression of striatonigral and striatopallidal neurons. Science 250, 1429–1432.
Goldberg, J.H., Farries, M.A., and Fee, M.S. (2012). Integration of cortical
and pallidal inputs in the basal ganglia-recipient thalamus of singing birds.
J. Neurophysiol. 108, 1403–1429.
Graybiel, A.M. (1998). The basal ganglia and chunking of action repertoires.
Neurobiol. Learn. Mem. 70, 119–136.
Graybiel, A.M. (2005). The basal ganglia: learning new tricks and loving it. Curr.
Opin. Neurobiol. 15, 638–644.
Hikida, T., Kimura, K., Wada, N., Funabiki, K., and Nakanishi, S. (2010). Distinct
roles of synaptic transmission in direct and indirect striatal pathways to reward
and aversive behavior. Neuron 66, 896–907.
Hikosaka, O., Takikawa, Y., and Kawagoe, R. (2000). Role of the basal ganglia
in the control of purposive saccadic eye movements. Physiol. Rev. 80,
953–978.
Isoda, M., and Hikosaka, O. (2008). Role for subthalamic nucleus neurons
in switching from automatic to controlled eye movement. J. Neurosci. 28,
7209–7218.
Isomura, Y., Takekawa, T., Harukuni, R., Handa, T., Aizawa, H., Takada, M.,
and Fukai, T. (2013). Reward-modulated motor information in identified striatum
neurons. J. Neurosci. 33, 10209–10220.
Jin, X., and Costa, R.M. (2010). Start/stop signals emerge in nigrostriatal
circuits during sequence learning. Nature 466, 457–462.
Jin, X., Tecuapetla, F., and Costa, R.M. (2014). Basal ganglia subcircuits
distinctively encode the parsing and concatenation of action sequences.
Nat. Neurosci. 17, 423–430.
Jog, M.S., Kubota, Y., Connolly, C.I., Hillegaart, V., and Graybiel, A.M. (1999).
Building neural representations of habits. Science 286, 1745–1749.
Kao, M.H., Doupe, A.J., and Brainard, M.S. (2005). Contributions of an avian
basal ganglia-forebrain circuit to real-time modulation of song. Nature 433,
638–643.
Kravitz, A.V., Freeze, B.S., Parker, P.R., Kay, K., Thwin, M.T., Deisseroth, K.,
and Kreitzer, A.C. (2010). Regulation of parkinsonian motor behaviours by
optogenetic control of basal ganglia circuitry. Nature 466, 622–626.
Kravitz, A.V., Tye, L.D., and Kreitzer, A.C. (2012). Distinct roles for direct and
indirect pathway striatal neurons in reinforcement. Nat. Neurosci. 15, 816–818.
Lashley, K.S. (1951). Cerebral Mechanisms in Behavior; the Hixon Symposium
(New York: Wiley).
Mallet, N., Micklem, B.R., Henny, P., Brown, M.T., Williams, C., Bolam, J.P.,
Nakamura, K.C., and Magill, P.J. (2012). Dichotomous organization of the
external globus pallidus. Neuron 74, 1075–1086.
Mink, J.W. (1996). The basal ganglia: focused selection and inhibition of
competing motor programs. Prog. Neurobiol. 50, 381–425.
Monchi, O., Petrides, M., Strafella, A.P., Worsley, K.J., and Doyon, J. (2006).
Functional role of the basal ganglia in the planning and execution of actions.
Ann. Neurol. 59, 257–264.
Nishizawa, K., Fukabori, R., Okada, K., Kai, N., Uchigashima, M., Watanabe,
M., Shiota, A., Ueda, M., Tsutsui, Y., and Kobayashi, K. (2012). Striatal
indirect pathway contributes to selection accuracy of learned motor actions.
J. Neurosci. 32, 13421–13432.
Olveczky, B.P., Andalman, A.S., and Fee, M.S. (2005). Vocal experimentation
in the juvenile songbird requires a basal ganglia circuit. PLoS Biol. 3, e153.
Sano, H., Chiken, S., Hikida, T., Kobayashi, K., and Nambu, A. (2013). Signals
through the striatopallidal indirect pathway stop movements by phasic excitation
in the substantia nigra. J. Neurosci. 33, 7583–7594.
Saunders, A., Oldenburg, I.A., Berezovskii, V.K., Johnson, C.A., Kingery, N.D.,
Elliott, H.L., Xie, T., Gerfen, C.R., and Sabatini, B.L. (2015). A direct GABAergic
output from the basal ganglia to frontal cortex. Nature 521, 85–89.
Schmidt, R., Leventhal, D.K., Mallet, N., Chen, F., and Berke, J.D. (2013).
Canceling actions involves a race between basal ganglia pathways.
Nat. Neurosci. 16, 1118–1124.
Sherrington, C.S. (1906). The Integrative Action of the Nervous System
(C. Scribner's Sons).
Tai, L.H., Lee, A.M., Benavidez, N., Bonci, A., and Wilbrecht, L. (2012). Transient
stimulation of distinct subpopulations of striatal neurons mimics changes
in action value. Nat. Neurosci. 15, 1281–1289.
Taverna, S., Ilijic, E., and Surmeier, D.J. (2008). Recurrent collateral connections
of striatal medium spiny neurons are disrupted in models of Parkinson's
disease. J. Neurosci. 28, 5504–5512.
Tecuapetla, F., Koos, T., Tepper, J.M., Kabbani, N., and Yeckel, M.F. (2009).
Differential dopaminergic modulation of neostriatal synaptic connections of
striatopallidal axon collaterals. J. Neurosci. 29, 8977–8990.
Tecuapetla, F., Matias, S., Dugue, G.P., Mainen, Z.F., and Costa, R.M. (2014).
Balanced activity in basal ganglia projection pathways is critical for
contraversive movements. Nat. Commun. 5, 4315.
Yin, H.H. (2009). The role of the murine motor cortex in action duration and
order. Front. Integr. Neurosci. 3, 23.
Yin, H.H. (2010). The sensorimotor striatum is necessary for serial order
learning. J. Neurosci. 30, 14719–14723.
Yin, H.H., Ostlund, S.B., Knowlton, B.J., and Balleine, B.W. (2005). The role of
the dorsomedial striatum in instrumental conditioning. Eur. J. Neurosci. 22,
513–523.
Article
Presynaptic Excitation via GABAB Receptors in
Habenula Cholinergic Neurons Regulates Fear
Memory Expression
Authors
Juen Zhang, Lubin Tan, Yuqi Ren, ...,
Bernhard Bettler, Fengchao Wang,
Minmin Luo
Correspondence
luominmin@nibs.ac.cn
In Brief
Fear behavior in the face of danger relies
on presynaptic excitation in a habenular
circuit that is unexpectedly mediated by
GABAB receptors, synaptic proteins
typically responsible for inhibitory
signaling.
Highlights
• Habenula cholinergic neurons reduce fear memory expression
• Presynaptic GABAB activity of habenula neurons facilitates fear extinction
• GABAB drastically potentiates the corelease of multiple neurotransmitters
• GABAB amplifies presynaptic Ca2+ entry through Cav2.3 channels
Zhang et al., 2016, Cell 166, 716–728
Juen Zhang,1,2,5 Lubin Tan,1,2,5 Yuqi Ren,2,3 Jingwen Liang,2 Rui Lin,2,3 Qiru Feng,2 Jingfeng Zhou,2,3 Fei Hu,2 Jing Ren,2
Chao Wei,2 Tao Yu,2 Yinghua Zhuang,2 Bernhard Bettler,4 Fengchao Wang,2 and Minmin Luo1,2,*
1School of Life Sciences, Tsinghua University, Beijing 100084, China
2National Institute of Biological Sciences, Beijing 102206, China
3PTN Graduate Program, Peking University School of Life Sciences, Beijing 100081, China
4Department of Biomedicine, Institute of Physiology, Pharmazentrum, University of Basel, Klingelbergstrasse 50/70, 4056 Basel, Switzerland
5Co-first author
*Correspondence: luominmin@nibs.ac.cn
SUMMARY
Fear behaviors are regulated by adaptive mechanisms that dampen their expression in the absence
of danger. By studying circuits and the molecular
mechanisms underlying this adaptive response, we
show that cholinergic neurons of the medial habenula reduce fear memory expression through GABAB
presynaptic excitation. Ablating these neurons or inactivating their GABAB receptors impairs fear extinction in mice, whereas activating the neurons or their
axonal GABAB receptors reduces conditioned fear.
Although considered exclusively inhibitory, here,
GABAB mediates excitation by amplifying presynaptic Ca2+ entry through Cav2.3 channels and potentiating co-release of glutamate, acetylcholine, and
neurokinin B to excite interpeduncular neurons. Activating the receptors for these neurotransmitters or
enhancing neurotransmission with a phosphodiesterase inhibitor reduces fear responses of both
wild-type and GABAB mutant mice. We identify the
role of an extra-amygdalar circuit and presynaptic
GABAB receptors in fear control, suggesting that
boosting neurotransmission in this pathway might
ameliorate some fear disorders.
INTRODUCTION
Excessive fear or fear that persists in the absence of threat is
dysfunctional, as in phobias and post-traumatic stress disorder
(PTSD) (Milad and Quirk, 2012; VanElzakker et al., 2014). The formation and expression of fear memory involves the extended
amygdala and its connected brain areas (Janak and Tye, 2015;
LeDoux, 2000; Tovote et al., 2015). The circuit between the amygdala and the limbic prefrontal cortex is critical for reducing fear responses during the extinction phase (Herry et al., 2010; Myers and
Davis, 2007). However, the functions and molecular mechanisms
of extra-amygdalar circuits in fear extinction remain unclear.
716 Cell 166, 716–728, July 28, 2016 © 2016 Elsevier Inc.
The habenulo-interpeduncular pathway may regulate aversive
responses (Figures S1A and S1B). In mammals, the medial habenula (MHb) in the epithalamus receives excitatory signals from
brain areas that promote fear and anxiety (Yamaguchi et al.,
2013). A majority of MHb neurons express cholinergic markers
and corelease glutamate and acetylcholine to activate their postsynaptic neurons in the midbrain interpeduncular nucleus (IPN)
(Frahm et al., 2015; Hu et al., 2012; Qin and Luo, 2009; Ren
et al., 2011). IPN neurons then project caudally to a set of brainstem nuclei, where neurons send their outputs back to the limbic
forebrain areas (Goto et al., 2001; Pollak Dorocic et al., 2014)
(Figure S1A). The MHb in mammals is involved in behavioral
processes related to aversive and/or appetitive stimuli, such
as stress, pain, and nicotine addiction (Fowler et al., 2011; Frahm
et al., 2011, 2015; Hu et al., 2012; Kobayashi et al., 2013; Soria-Gomez et al., 2015). Disrupting this pathway in fish impacts
behavioral responses to aversive stimuli (Agetsuma et al.,
2010; Lee et al., 2010). It remains unclear whether the findings
from the studies of fish apply to mammals, because the fish
MHb homolog receives direct sensory but not limbic inputs (Stephenson-Jones et al., 2012).
Habenula cholinergic neurons express a particularly high level
of GABAB receptors along their axonal projections to the IPN
(Margeta-Mitrovic et al., 1999). GABAB is a key receptor for the
major neurotransmitter GABA (gamma aminobutyric acid), and
GABAB has to date been considered to be exclusively inhibitory
(Gassmann and Bettler, 2012). Activating GABAB suppresses
transmitter release at presynaptic sites and hyperpolarizes
membrane potentials at postsynaptic sites (Dutar and Nicoll,
1988a; Gassmann and Bettler, 2012; Newberry and Nicoll,
1984). Altered GABAB signaling is associated with anxiety and
fear disorders (Cryan and Kaupmann, 2005). The GABAB agonist
baclofen has been used to treat PTSD in clinical trials (Drake
et al., 2003; Manteghi et al., 2014), suggesting a potential role
in fear control for GABAB receptors that are expressed in MHb
neurons. However, no studies have as yet addressed the behavioral functions, physiological effects, or molecular mechanisms
of GABAB signaling in the habenulo-interpeduncular pathway.
Here, we analyzed the functions of habenula cholinergic neurons and their GABAB receptors in the acquisition and expression of cued fear memory. We found that both the neuronal
activity and the presynaptic GABAB signaling of these neurons
are critical for extinguishing fear memory. Critically, GABAB mediates its behavioral effect by strongly potentiating the corelease
of multiple neurotransmitters. At the molecular level, GABAB produces presynaptic excitation by facilitating Ca2+ entry through
the Cav2.3 channel. Our data demonstrate that GABAB signaling
in a specific synapse of an evolutionarily-conserved pathway
controls expression of fear memory, hence suggesting alternative therapeutic approaches to treating fear disorders.
RESULTS
Habenula Cholinergic Neurons Control Fear Memory
Expression
We first ablated cholinergic neurons in the MHb to test whether
they are required for forming or expressing fear memory. To
selectively kill these neurons, we infused the Cre-dependent adeno-associated virus (AAV) vector AAV-flex-taCasp3-TEVp into
the bilateral habenula of a ChAT-Cre mouse, which expressed
the Cre recombinase under the promoter of the gene encoding
choline acetyltransferase (ChAT) (Figures 1A and 1B). As a control for the accuracy of the ChAT-Cre driver line (Gong et al.,
2007), infusion of the AAV-DIO-EmGFP vectors led to the
expression of enhanced membrane green fluorescent protein
(EmGFP) in ChAT-immunopositive neurons within the MHb of
these mice (henceforth referred to as MHb-ChAT-EmGFP mice
for simplicity; Figure 1C). This approach also selectively labeled
ChAT-expressing axons from the MHb to the IPN (Figure S1C).
Consistent with the effect of taCasp3 on inducing cell apoptosis
(Yang et al., 2013), injecting AAV-flex-taCasp3-TEVp into the
ChAT-Cre (abbreviated as MHb-ChAT-taCasp3) mice eliminated
cholinergic neurons in the MHb and abolished cholinergic axonal
terminals in the IPN (Figures 1C, S1C, and S1D). The ChAT
expression pattern remained unchanged in other major cholinergic brain areas (Figures S1E–S1G), indicating that the vector-induced loss of cholinergic neurons was restricted to the
MHb and that cholinergic inputs to the IPN arose predominantly
from the MHb.
We compared the MHb-ChAT-taCasp3 mice with the MHb-ChAT-EmGFP control mice on a test of cued fear memory. In
this test, we conditioned mice by presenting a 20-s auditory
tone (conditioned stimulus [CS]) that co-terminated with a 1-s
footshock for five trials (Figure 1D). As training progressed,
both groups became increasingly immobilized (freezing) during
the tone. The two groups of mice exhibited similar freezing levels
during the conditioning phase (Figure 1E; see Table S1 for statistical analysis), suggesting that habenula cholinergic neurons are
dispensable for forming fear memory.
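As an illustration of how a freezing percentage per trial could be derived for the 20-s tones described above, the following sketch assumes that immobility has already been scored frame by frame as a boolean trace. The scoring method, frame rate, and function names are assumptions made for the example; only the 20-s tone and the co-terminating 1-s footshock come from the text.

```python
from typing import List, Sequence


def freezing_percent(
    immobile: Sequence[bool],
    frame_rate_hz: float,
    tone_onsets_s: Sequence[float],
    tone_duration_s: float = 20.0,
) -> List[float]:
    """Percentage of frames scored as immobile during each tone."""
    n_frames = int(round(tone_duration_s * frame_rate_hz))
    results: List[float] = []
    for onset in tone_onsets_s:
        start = int(round(onset * frame_rate_hz))
        window = list(immobile[start:start + n_frames])
        if not window:
            # Tone falls outside the recorded trace.
            results.append(float("nan"))
            continue
        results.append(100.0 * sum(window) / len(window))
    return results


def shock_onset(tone_onset_s: float, tone_duration_s: float = 20.0,
                shock_duration_s: float = 1.0) -> float:
    """Onset of a footshock that co-terminates with the tone."""
    return tone_onset_s + tone_duration_s - shock_duration_s


# Hypothetical 60-s trace at 1 frame per second with tones at 0 s and 30 s.
example_trace = [False] * 10 + [True] * 10 + [False] * 10 + [True] * 20 + [False] * 10
example_freezing = freezing_percent(example_trace, 1.0, [0.0, 30.0])
```

With a tone starting at 0 s, a co-terminating 1-s footshock would begin at 19 s under these defaults.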
We then examined whether ablating these neurons affected
fear memory expression. Following behavioral protocols for
testing fear extinction (Milad and Quirk, 2002; Soria-Gomez
et al., 2015), we placed a mouse in a test chamber that was
distinct from the training chamber and repetitively delivered CS
tones in the absence of footshock (ten trials with random intervals) (Figure 1D). The MHb-ChAT-EmGFP control mice showed
the normal response pattern: freezing responses decreased
gradually within a session and became fully extinguished after
3 days. The freezing levels of MHb-ChAT-taCasp3 mice were
initially similar to those of control mice, but decayed more slowly,
resulting in significantly higher freezing ratios, even after 5 days
(Figures 1F and 1G; Table S1). In a different experiment, we
tested animal freezing responses to a 180-s continuous auditory
tone (Soria-Gomez et al., 2015) (Figure 1D). In both test sessions
on days 1 and 6, the MHb-ChAT-taCasp3 mice showed long-lasting freezing responses during the tones, resulting in significantly higher total freezing time than for control mice (Figures
1H and 1I; Table S1). Ablating habenula cholinergic neurons
did not affect locomotor activity (Figures S1H and S1I; Table
S1), thus ruling out the possibility that a general suppression
of locomotor activity caused the freezing increase. Therefore,
habenula cholinergic neurons are necessary for reducing fear
responses to conditioned stimuli that no longer actively predict
threat.
We next examined whether activating habenula cholinergic
neurons would be sufficient for reducing freezing responses. In
these experiments, we used ChAT-ChR2-EYFP mice, in which
the light-sensitive cation channel ChannelRhodopsin2 (ChR2)
was selectively expressed in the somata and axons of cholinergic neurons (Figures 1J, S1J, and S1K) (Ren et al., 2011). Taking advantage of the fact that the axons of bilateral MHb neurons
converge into a single nucleus along the midline, we used one
optical fiber to activate the bilateral cholinergic inputs. The optical fiber was implanted into the dorsal IPN with its tip tapered to
minimize tissue damage (Figures 1J and S1L). We induced
strong fear responses by conditioning ChAT-ChR2-EYFP mice
with ten tone-footshock pairs (Figure S1M). During the extinction
phase in the following days, the mice in the stimulation group
received light pulses to activate the cholinergic terminals during
auditory tones (Figure 1K). Optogenetic stimulation resulted in
significantly lower levels of freezing throughout the test. As a
control, delivering light pulses to wild-type littermate mice and
omitting light pulses to the control ChAT-ChR2-EYFP mice had
no effect (Figures 1L and 1M; Table S1). These results therefore
suggest that activating habenula cholinergic neurons can effectively substitute for fear extinction.
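The light stimulation used here is specified in the Figure 1K legend as 5-ms pulses at 50 Hz. A minimal sketch of such a pulse train is given below, assuming (as an illustration only) that stimulation spans the full 20-s tone; the function names are hypothetical.

```python
from typing import List, Tuple


def pulse_train(
    start_s: float,
    duration_s: float,
    rate_hz: float = 50.0,
    pulse_width_s: float = 0.005,
) -> List[Tuple[float, float]]:
    """Return (on, off) times in seconds for each light pulse in the train."""
    period = 1.0 / rate_hz
    n_pulses = int(round(duration_s * rate_hz))
    return [
        (start_s + i * period, start_s + i * period + pulse_width_s)
        for i in range(n_pulses)
    ]


def duty_cycle(rate_hz: float = 50.0, pulse_width_s: float = 0.005) -> float:
    """Fraction of time the light is on."""
    return rate_hz * pulse_width_s


# Assumption: one train covering a 20-s tone that starts at t = 0 s.
example_train = pulse_train(0.0, 20.0)
example_duty = duty_cycle()
```

Under these assumptions a 20-s tone contains 1,000 pulses, and the light is on for about a quarter of the stimulation period.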
GABAB Activity in the Habenulo-Interpeduncular
Pathway Controls Fear Extinction
We investigated how GABAB receptors in habenula cholinergic
neurons may affect fear behavior. Both essential subunits,
GABAB(1) and GABAB(2), are richly expressed in habenula
cholinergic neurons and their axon terminals within the IPN
(Figures 2A, 2B, and S2A–S2C). To genetically inactivate the
GABAB(1)-encoding gene (Gabbr1) in cholinergic neurons, we
generated GABAB(1) conditional knockout (ChAT-GABAB(1)-KO; CKO) mice by crossing the ChAT-Cre mouse line with
the GABAB(1)lox511/lox511 line. In the ChAT-GABAB(1)-KO mice,
GABAB(1) expression was abolished in the cholinergic neurons
in the MHb and in their axonal terminals within the central
and rostral IPN (Figures 2C, 2D, and S2D–S2F); GABAB(1)
expression appeared normal in the lateral IPN, cortex, and hippocampus of these mice (Figures S2G–S2I). This allowed us to
investigate the potential role in fear memory expression for the
GABAB activity of cholinergic neurons.
Both ChAT-GABAB(1)-KO mice and their wild-type littermates
showed similar increases in freezing during the conditioning
[Figure 1 graphic, panels A–M. Recoverable labels: Freezing (%) plotted against Trials and Test day; groups lesion and ctrl; ChR2 laser on, ChR2 laser off, and ctrl laser on.]
Figure 1. Ablating Habenula Cholinergic Neurons Enhances the Expression of Conditioned Fear, whereas Activating These Cells Reduces
Fear Responses
(A and B) We generated a MHb-ChAT-taCasp3 (lesion) mouse or a MHb-ChAT-EmGFP (ctrl) mouse by infusing a mixture of AAV-DIO-EmGFP vectors and
AAV-flex-taCasp3-TEVp (A) or AAV-DIO-EmGFP only vectors into the MHb of a ChAT-Cre mouse (B).
(C) Coexpression of EmGFP (green) and ChAT (red) in the bilateral MHb of a control mouse (upper) and the lack of such expression and reduction of MHb volume
in a lesion mouse (lower).
(D) Methods for cued fear conditioning and tests. In test sessions, mice were exposed to either ten discrete conditioned stimuli (test 1) or one 180-s continuous
tone (test 2).
(E) Freezing response levels across trials during the cued fear conditioning on day 0.
(F) Freezing responses across trials in the extinction sessions on days 1–5, in which we presented ten discrete conditioned stimuli but omitted footshocks.
(G) Overall freezing responses of the lesion mice and control mice to discrete tones.
(H and I) Effects of lesion on the freezing responses (H) and overall freezing ratio (I) to a continuous 180 s tone.
(J) Distribution of ChR2-EYFP-expressing fibers (green) within the IPN and the method of delivering light with a tapered optical fiber.
(K) Method of testing the effect of optogenetic stimulation on freezing responses. Blue bar indicates light stimuli (5 ms light pulses at 50 Hz).
(L and M) Freezing responses across trials (L) and total freezing response levels (M) of the stimulation mice and control mice during extinction sessions. For a
control, we gave light stimulation to wild-type littermates or omitted light stimulation to ChAT-ChR2-EYFP mice.
*p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001; n.s., not significant; two-way ANOVA on difference between phenotypes or between stimulation and control
(E, F, H, and L) and t tests (G, I, and M). Error bars indicate SEM.
See Table S1 for statistical analyses in detail. See also Figure S1.
phase (Figure 2E; Table S1), indicating that the GABAB activity of
cholinergic neurons is not required for forming cued fear memory. In the extinction sessions using discrete CS, a tone without
shock elicited significantly higher freezing levels from GABAB CKO
mice, even after extinction training for 5 days (Figures 2F and 2G;
Table S1). When tested with a continuous 180-s tone, wild-type
mice gradually reduced their freezing, but the GABAB CKO mice
continued to freeze and exhibited significantly higher freezing
levels on days 1 and 6 (Figures S2J and S2K; Table S1). ChAT-GABAB(1)-KO mice did not show abnormality in locomotor activity before the tests, nor in general locomotion in an open field, nor
in anxiety levels in heightened or illuminated open spaces, nor
in pain sensitivity to footshock (Figures S2L–S2R; Table S1).
Apparently GABAB activity of the cholinergic neurons does not
affect various global functions. Rather, it is critical to controlling
fear extinction.
We then examined how pharmacological manipulation of
GABAB activity in the IPN of wild-type mice would affect fear responses. C57BL6/N wild-type mice were conditioned with five
tone-footshock pairs. Immediately before the extinction sessions, we infused the GABAB antagonist 2-OH-saclofen (saclofen) directly into the IPN (Figure S3A). Saclofen pretreatment
[Figure 2 graphic, panels A–G. Recoverable labels: GABAB(1), EYFP, and overlay images of MHb and IPN; Freezing (%) plotted against Trials and Test day; groups CKO and ctrl.]
Figure 2. Genetically Inactivating GABAB in Cholinergic Neurons Enhances the Expression of Learned Fear
(A and B) Strong GABAB(1) immunoreactivity (red) in habenula cholinergic neurons (green; A) and their axonal terminals in the IPN (green; B) of a ChAT-ChR2-EYFP
mouse.
(C and D) GABAB(1) immunoreactivity was markedly reduced in the MHb (C) and IPN (D) of a ChAT-GABAB(1)-KO (CKO) mouse. GABAB(1) expression remained
unchanged in the lateral IPN that received non-cholinergic inputs from the dorsal MHb.
(E) Freezing responses of both CKO mice and wild-type littermate controls (ctrl) during fear conditioning.
(F) Freezing responses of CKO mice and control mice during the fear extinction sessions on days 1–5. ****p < 0.0001 for difference between phenotypes in all
sessions, two-way ANOVA.
(G) Total freezing responses of the mutant and control mice. *p < 0.05; **p < 0.01; t tests. Error bars indicate SEM.
significantly increased the freezing responses to both discrete
trials of conditioned stimulus and a 180-s continuous tone (Figures 3A, 3B, S3B, and S3C; Table S1). Saclofen did not reduce
locomotor activity prior to the presentation of auditory tone (Figure S3D; Table S1), again indicating that GABAB activity in the
IPN reduces freezing levels but not general locomotion.
If blocking GABAB in the IPN could increase fear responses,
then we reasoned that activating GABAB might do the opposite,
reducing freezing responses during extinction tests. One day after a wild-type mouse was subjected to fear conditioning with ten
trials of tone-footshock stimuli, intra-IPN pretreatment of baclofen (3 pmol) accelerated fear extinction for all behavioral sessions that were based on discrete auditory tones (Figures 3C
and 3D; Table S1). At the high dose of 30 pmol, baclofen produced a sedative effect, and we needed to increase the waiting
time from 5 min to 1.5 hr before starting the tests. Possibly
because of the longer waiting time, this baclofen pretreatment
significantly reduced freezing in the extinction session on day 6,
but not on day 1, when we tested with a 180-s continuous tone
(Figures S3E–S3G; Table S1). Knocking out GABAB in cholinergic neurons prevented baclofen from facilitating fear extinction
(Figures 3E, 3F, and S3H–S3J; Table S1), demonstrating that the
fear-reducing effect of baclofen requires GABAB receptors
within the cholinergic neural processes.
GABAB Activity Mediates Presynaptic Excitation
To study the synaptic basis for the fear-reducing effect of GABAB
activity, we performed whole-cell patch recording from interpeduncular neurons in brain slices of ChAT-ChR2-EYFP mice (Figure 4A). Stimulating the ChR2-EYFP-expressing terminals with
a single brief light pulse evoked fast excitatory postsynaptic currents (EPSCs) (Figures 4A and 4B), whereas shining light onto
EmGFP-expressing terminals of a control mouse did not produce any current (Figure S4A). In agreement with glutamate corelease by the habenula cholinergic neurons (Hu et al., 2012; Ren
et al., 2011), the EPSCs were resistant to nicotinic blockers but
were blocked by an AMPA-type glutamate receptor antagonist
(Figures 4C and S4B).
We then examined how baclofen affects the glutamatergic
EPSCs. We had expected that it would reduce the EPSCs,
because GABAB action has to date been considered to be
purely inhibitory. Unexpectedly, the drug increased the EPSC
amplitude in a cell from 100 pA to nearly 2,000 pA (Figures 4B and 4C). For the entire population of recorded
Cell 166, 716–728, July 28, 2016
[Figure 3, panels A–F: plots of Freezing (%) against Trials and Test day for ACSF versus saclofen or baclofen infusions.]
Figure 3. GABAB Activity in the Interpeduncular Nucleus Facilitates Fear Extinction
(A and B) The effects of pre-test intra-IPN infusion of the GABAB antagonist 2-OH-saclofen (saclofen; 300 pmol) on freezing responses across trials (A) and total
freezing response levels (B) of C57BL6/N wild-type mice during the extinction sessions on days 1–3.
(C and D) The effects of pre-test injection of baclofen (3 pmol) into the IPN on the freezing responses across trials (C) and overall freezing ratio (D) of wild-type mice.
(E and F) Lack of significant effects by intra-IPN pretreatment of baclofen on the freezing responses across trials (E) and overall freezing ratio (F) of
ChAT-GABAB(1)-KO mice. Mice were trained with five conditioning trials in (A) and (B) and ten trials in (C)–(F).
*p < 0.05; **p < 0.01; ***p < 0.001; ****p < 0.0001; n.s., not significant; two-way ANOVA for difference between drugs and ACSF control in (A), (C), and (E)
and t tests in (B), (D), and (F). Error bars indicate SEM.
See Table S1 for statistical analyses in detail. See also Figure S3.
neurons, baclofen potentiated the EPSCs by more than 10-fold,
increasing the EPSC amplitude from 100 pA to >1,000 pA
(Figure 4D; Table S1). The potentiation was dose-dependent
and was saturable, with an EC50 of 1 μM (Figures S4C and
S4D); this value is consistent with the binding affinity of baclofen for GABAB receptors (Kaupmann et al., 1997). Baclofen did
not change the EPSC rising time (Figures S4E and S4F), suggesting that its effect is monosynaptic. Baclofen also potently
enhanced the EPSCs elicited by electrically stimulating the
fasciculus retroflexus (fr), which contains the incoming fiber
tract from the MHb (Figures S4G–S4I; Table S1). This rules
out the possibility that baclofen acts on ChR2 to enhance
EPSCs. After ablating habenula cholinergic neurons, electrical
stimulation failed to elicit any EPSCs, even in the presence of
baclofen (Figure S4J), indicating that baclofen selectively increases neurotransmitter release from the axonal terminals of
these neurons.
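A dose-dependent, saturable potentiation with a defined EC50 is commonly summarized with a Hill-type curve. The sketch below is only an illustration of that relationship, not the fitting procedure used in this study: the function name, the Hill coefficient of 1, and the 10-fold ceiling are assumptions chosen for the example, and the concentration is expressed in the same unit as the EC50.

```python
def hill_fold_potentiation(conc, ec50=1.0, hill_n=1.0, baseline=1.0, max_fold=10.0):
    """Fold change in EPSC amplitude predicted by a Hill-type curve.

    conc and ec50 must share the same concentration unit.
    hill_n and max_fold are illustrative assumptions, not fitted values.
    """
    if conc < 0 or ec50 <= 0:
        raise ValueError("conc must be >= 0 and ec50 must be > 0")
    # Fraction of the maximal effect reached at this concentration.
    fraction = conc ** hill_n / (ec50 ** hill_n + conc ** hill_n)
    return baseline + (max_fold - baseline) * fraction


# Half of the maximal increase is reached when conc equals ec50.
half_max = hill_fold_potentiation(1.0)
# At concentrations far above ec50 the curve approaches max_fold (saturation).
near_max = hill_fold_potentiation(1000.0)
```

With these assumed parameters the curve rises from no potentiation at zero dose, reaches half of its maximal increase at the EC50, and flattens at high doses, which is the qualitative shape implied by "dose-dependent and saturable".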
We carried out three experiments to test whether presynaptic
GABAB receptors, or a yet-to-be-identified receptor for baclofen, were mediating the unusual excitatory effect. First, knocking
out GABAB(1) in cholinergic neurons prevented baclofen from
potentiating EPSCs (Figures 4E and 4F; Table S1), indicating
that GABAB receptors in the presynaptic neurons are essential.
Second, antagonizing GABAB with saclofen reversibly abolished
baclofen's potentiatory effect (Figures S4K–S4M; Table S1),
demonstrating the necessity of GABAB activity within the IPN.
Third, baclofen did not alter inward currents induced by directly
puffing the glutamate agonist α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid (AMPA) onto the interpeduncular neurons of wild-type mice (Figures S4N and S4O; Table S1),
excluding the possibility that baclofen directly increased the
activity of postsynaptic glutamate receptors. We can thus
conclude that baclofen activates presynaptic GABAB receptors to boost glutamate
release.
Baclofen mimics the effect of endogenous GABA. Directly
perfusing GABA into the slice preparation significantly potentiated light-evoked EPSCs (Figures 4G–4I; Table S1). Furthermore, endogenously released GABA also enhances EPSCs. To
show this, we induced GABA release by optogenetically stimulating the GABAergic neuropil in the IPN of VGAT-ChR2-EYFP
mice, in which ChR2 expression was under the control of the
promoter of the gene encoding the vesicular GABA transporter
(VGAT) (Zhao et al., 2011) (Figures 4J, S4P, and S4Q). Optically-induced GABA release significantly enhanced the EPSCs
evoked by electrical stimulation of the incoming habenular fiber
(Figures 4K and 4L; Table S1).
GABAB Potentiates Neurotransmitter Corelease to Reduce Fear
In light of glutamate and acetylcholine corelease by habenula
cholinergic neurons, we asked whether GABAB activity could
also potentiate cholinergic EPSCs. In the presence of the glutamate antagonist 6,7-dinitroquinoxaline-2,3-dione (DNQX), a train
of light pulses (10 Hz, 5-ms pulses for 1 s) elicited a slow EPSC
of <50 pA in an interpeduncular neuron of a ChAT-ChR2-EYFP
mouse (Figures 5A and 5B). Following baclofen application,
the EPSC reached the peak amplitude of 700 pA and quickly
decayed despite continuous light stimulation. In most cells
(12/18), these slow EPSCs were fully blocked by nicotinic antagonists, indicating that they were cholinergic in nature (Figures 5A–5C; Table S1). For the 12 fully-blocked cells, baclofen
amplified cholinergic EPSCs 9-fold (Figure 5C; Table S1), indicating that GABAB activity can strongly enhance acetylcholine
release.
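The fold changes and paired comparisons reported for these recordings follow from per-cell amplitudes measured before and after drug application. The sketch below shows one way such a summary could be computed; the function name and the amplitude values are hypothetical, and defining fold change as the ratio of group means is an assumption (the mean of per-cell ratios is an equally plausible definition).

```python
import math
import statistics


def paired_summary(before, after):
    """Return (fold_change, t_statistic, degrees_of_freedom) for paired data.

    before/after: amplitudes from the same cells, in matching order.
    fold_change is the ratio of group means (an assumed definition).
    """
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 cells")
    diffs = [a - b for b, a in zip(before, after)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, using the sample standard deviation.
    sem_diff = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if sem_diff == 0:
        raise ValueError("all differences are identical; t statistic undefined")
    fold_change = statistics.mean(after) / statistics.mean(before)
    return fold_change, mean_diff / sem_diff, len(diffs) - 1


# Hypothetical amplitudes in pA for three cells, control then drug.
fold, t_stat, dof = paired_summary([40.0, 50.0, 60.0], [360.0, 450.0, 540.0])
```

The t statistic would then be compared against a t distribution with the returned degrees of freedom to obtain a two-sided p value; that lookup is omitted here to keep the example within the standard library.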
Baclofen also potentiated EPSCs mediated by the peptide
neurokinin B (NKB). This was revealed by our observation that
[Figure 4, panels A–L: recording schematics, raw EPSC traces, and plots of EPSC amplitude (nA) against Time (min) and across ctrl, baclofen or GABA, and wash conditions for ChAT-ChR2-EYFP, ChAT-GABAB(1)-KO, and VGAT-ChR2-EYFP mice.]
Figure 4. Presynaptic GABAB Activity Drastically Potentiates Glutamate Release from Habenular Neurons
(A–D) Baclofen (1 μM) produced an over tenfold increase in glutamatergic EPSCs. The cartoon in (A) illustrates the method of whole-cell recordings and
optogenetic stimulation of the ChR2-EYFP-expressing axons from the fasciculus retroflexus (fr). Raw traces (B) and a time-series plot of the EPSC amplitude (C)
show data from one representative cell. The blue dot in (B) indicates 5-ms blue light stimulation. Horizontal lines in (C) show the timing of drug applications.
Glutamatergic EPSCs were isolated with the GABAA blocker picrotoxin (50 μM) and a mixture of the nicotinic blockers hexamethonium (HMT, 50 μM) and mecamylamine (Mec, 5 μM). Numbers in (B) correspond to time in (C). n = 11 cells in (D).
(E and F) Raw traces from a single cell (E) and group data (F) show that knocking out GABAB(1) in cholinergic neurons abolished the potentiatory effect of baclofen
(n = 8 cells).
(G–I) GABA puff (1 mM) potentiated glutamatergic EPSCs. The schematic in (G) illustrates the method of recording and GABA puff. Raw traces in (H) show the
responses of a single cell and (I) shows group data (n = 7 cells).
(J–L) Optogenetically stimulating (5 ms; 10 Hz, 15 s) the GABAergic neuropil in the IPN of a VGAT-ChR2-EYFP mouse potentiated EPSCs that were evoked by
electrical stimulation of the fiber tract fr. (J) illustrates the method of recording and optogenetic stimulation. (K) shows the raw traces from a single cell, with
stimulation artifacts clipped for clarity. (L) shows group data (n = 6 cells).
***p < 0.001; **p < 0.01; n.s., not significant; two-sided paired t tests. Error bars indicate SEM.
some cells (6/18 cells) expressed a large inward current that was
resistant to nicotinic blockers (Figure 5D). Consistent with the
expression of NKB in habenular neurons (Marksteiner et al.,
1992), the presence of NKB receptor antagonists abolished
this additional current in the six cells tested (Figures 5E and
5F; Table S1). This provides evidence both that NKB serves as
a neurotransmitter in the brain and that GABAB activity gates
the release of NKB.
Given that all three coreleased neurotransmitters evoked
excitatory currents, we tested whether baclofen could indeed
increase the ability of habenular neurons to elicit action potential
firing in interpeduncular neurons. We stimulated ChR2-expressing axonal terminals at various frequencies (5-ms pulses at
1–20 Hz for 1 s) and recorded action potential firing from interpeduncular neurons using cell-attached recording. Baclofen
significantly increased the number of action potentials evoked
by individual light pulses, especially when the stimulation frequency was <10 Hz (Figures 5G, 5H, S4R, and S4S; Table
S1). Therefore, presynaptic GABAB activity facilitates the spread
of excitatory signals across the habenulo-interpeduncular
synapse.
Next, we examined whether activating the receptors of
the coreleased neurotransmitters could reduce fear memory
expression. Wild-type C57BL6/N mice were conditioned with
ten tone-footshock pairs. Before the tests in extinction sessions,
we separately infused the ionotropic glutamate receptor agonists (AMPA and NMDA), acetylcholine, or an NKB receptor
agonist (senktide) into the IPN of individual mice. All of these
drugs significantly reduced freezing responses (Figures 6A–6F;
Table S1). Similarly, intra-IPN pretreatment of these drugs
significantly reduced the freezing responses of ChAT-GABAB(1)-KO mice to levels comparable to those of wild-type mice (Figures S5A–S5E; Table S1). Therefore, activating the receptors
for the coreleased neurotransmitters reduces fear memory
expression.
The widespread expression of GABAB receptors in the brain
raises the concern that higher doses of baclofen may be sedative.
We tested whether fear could be reduced by potentiating the
neurotransmission of habenula cholinergic neurons in a GABAB-independent manner. Habenula cholinergic neurons richly express phosphodiesterase 2A (PDE2A), and blocking PDE2A
activity increases presynaptic cyclic adenosine monophosphate
(cAMP) levels and enhances neurotransmitter release (Hu et al.,
2012). Intra-IPN pretreatment of the selective PDE2A inhibitor
(Bay 60-7550) significantly reduced the freezing responses of
wild-type mice during the test sessions for cued fear memory (Figures 6G, 6H, and S6F–S6I; Table S1), suggesting an alternative
choice of drugs for the control of fear memory expression.
[Figure 5, panels A–H: raw EPSC traces, plots of EPSC amplitude (nA) against Time (min) under picrotoxin & DNQX, baclofen, HMT & Mec, and GR & SB, and evoked spikes per shot for ctrl versus baclofen.]
Figure 5. GABAB Activity Potentiates the Corelease of Acetylcholine and Neurokinin B and Increases the Effectiveness of Propagating Activity from the MHb to the IPN
(A–C) Raw trace (A) and time-series plot of EPSC amplitudes (B) from a representative cell and population data (C) show that baclofen (1 μM) potently
increased cholinergic EPSCs, which were isolated by the glutamate receptor blocker DNQX (10 μM) and subsequently eliminated by the nicotinic blockers
HMT (50 μM) and Mec (5 μM; n = 12 cells).
(D–F) Raw trace (D) and time-series plot of EPSC amplitudes (E) from a single cell and population data (F) show that, in some cells, baclofen potentiated
peptidergic EPSCs, which were resistant to nicotinic blockers but were abolished by a mixture of the NKB blockers GR159897 (GR; 5 μM) and SB222200
(SB; 20 μM; n = 6 cells).
(G and H) Raster plots of spiking activity of an IPN neuron (G) and mean firing rates of all tested cells (n = 13 cells in H) show the effect of baclofen on
light-evoked spiking responses in the IPN of ChAT-ChR2-EYFP mice. Each row in (G) indicates one trial; dots represent spikes. Blue dashed lines correspond to light pulses.
*p < 0.05; **p < 0.01; ***p < 0.001; two-sided paired t tests. Error bars indicate SEM.
See also Figure S4.
GABAB Activity Increases Presynaptic Ca2+ Entry through Cav2.3-Containing Channels
What could explain the GABAB receptor's novel excitatory effect? To address this question, we first tested whether presynaptic GABAB signals via a G protein.
Pertussis toxin inactivates some G proteins and blocks the inhibitory action of GABAB activity (Dutar and Nicoll, 1988b; Kajikawa et al., 2001). We injected
pertussis toxin into the IPN of a ChAT-ChR2-EYFP mouse and then recorded EPSCs 2 days after toxin treatment. Pertussis toxin, but not its control vehicle,
prevented baclofen from potentiating EPSCs, confirming that presynaptic GABAB acts through a pertussis toxin-sensitive G protein (Figures S6A–S6F; Table S1).
Increasing presynaptic cAMP levels significantly amplified glutamatergic EPSCs (Figures S6G–S6J; Table S1). However, baclofen did not alter cAMP
concentration (Figure S6G; Table S1); and when cAMP synthesis was blocked, it could still potentiate EPSCs (Figures S6K and S6L; Table S1). Thus, cAMP
cannot be a key downstream messenger for GABAB.
Ca2+ is another key second messenger of G protein-coupled receptors, and presynaptic Ca2+ is known to be essential for neurotransmitter release
(Wheeler et al., 1994). Although GABAB activity inhibits the activity of Ca2+ channels elsewhere (Mintz and Bean, 1993; Wu and Saggau, 1995), we
speculated that it may increase presynaptic Ca2+ levels to facilitate neurotransmitter release from habenula cholinergic neurons. To analyze baclofen's
effect on presynaptic Ca2+ levels, we expressed a genetically encoded Ca2+ indicator (GCaMP6) in habenular terminals by injecting AAV-DIO-GCaMP6m
vectors into the MHb of a ChAT-Cre mouse (Figures 7A–7C) (Chen et al., 2013). We then measured GCaMP fluorescence changes within the cholinergic
axonal terminals using 2-photon microscopy. Baclofen significantly increased Ca2+ transients in habenular terminals evoked by electrical stimulation
of their axons at either 1 Hz or 10 Hz (Figures 7D–7G, S6M, and S6N), demonstrating that GABAB activity amplifies presynaptic Ca2+ transients.
We identified the Ca2+ channel downstream of GABAB signaling. Nickel (Ni2+), a blocker of the R-type Ca2+ current (Wu et al., 1998), abolished the
increase of presynaptic Ca2+ influx (Figures 7H and 7I; Table S1). Ni2+ also prevented baclofen from potentiating EPSCs, whereas blockers of other
types of calcium channel (P/Q-type, T-type, N-type, or L-type) had no effect (Figures 7J–7L and S7A–S7L; Table S1). R-type Ca2+ channels
contain the Cav2.3 subunit (Sochivko et al., 2002), which is richly expressed in habenula cholinergic neurons and their axonal
terminals in the IPN (Parajuli et al., 2012) (Figures 7M and S7M).
Using the CRISPR/Cas9 genome editing tool (Cong et al., 2013),
we generated Cav2.3 null mice in the background of the ChAT-ChR2-EYFP mouse line (Figures 7N, S7N, and S7O). Genetically inactivating the Cav2.3-containing channel abolished baclofen's
potentiatory effect (Figures 7O–7Q; Table S1). Therefore, presynaptic GABAB receptors increase neurotransmitter release
by facilitating the opening of the voltage-dependent R-type
calcium channels formed with Cav2.3 subunits.
Figure 6. Activating Interpeduncular Neurons or Potentiating Habenular Neurotransmission Reduces Fear Memory Expression
(A and B) The effect of intra-IPN pretreatment of AMPA (3 pmol) and NMDA (12 pmol) on the freezing responses of wild-type mice across trials (A) and
overall freezing ratio (B) of wild-type mice during extinction sessions on days 1–3. Mice were conditioned with ten tone-shock pairs on day 0. For a
control, ACSF was infused prior to the extinction sessions.
(C–H) The effect of intra-IPN pretreatment of acetylcholine (0.6 nmol; C and D), the NKB agonist senktide (3 pmol; E and F), and the PDE2A inhibitor
Bay 60-7550 (30 pmol; G and H) on the freezing responses of wild-type mice.
*p < 0.05; **p < 0.01; ***p < 0.001; n.s., not significant; two-way ANOVA in (A), (C), (E), and (G) and t tests in (B), (D), (F), and (H). Error bars
indicate SEM.
See also Figure S5.
DISCUSSION
Animals and humans adjust their behavioral responses to threats
in an experience-dependent manner. Research on fear conditioning and extinction has focused predominantly on the cortico-amygdala circuit (Herry et al., 2010; Milad and Quirk,
2012; Myers and Davis, 2007; Tovote et al., 2015; VanElzakker
et al., 2014). Here, we demonstrate that habenula cholinergic
neurons and the GABAB activity in their axonal terminals control
fear memory expression through a unique signaling mechanism.
We report three major findings. First, ablating habenula cholinergic neurons or selectively inactivating their GABAB receptors
impairs the extinction of cued fear memory, whereas activating
these neurons or their GABAB receptors reduces the expression
of conditioned fear. Second, although previously considered
purely inhibitory, GABAB actually mediates
the fear-reducing effect by strongly potentiating the corelease of multiple neurotransmitters from habenula cholinergic
neurons. Third, GABAB produces presynaptic excitation by enhancing Ca2+ influx
through a special Ca2+ channel. These
conclusions establish the habenulo-interpeduncular pathway as the key site for
fear control and substantially expand our
understanding of the signaling capacity of
GABA and GABAB receptors. Moreover, our findings provide a
circuit and molecular explanation of baclofen's clinical efficacy
on fear-related disorders and suggest alternative pharmacotherapeutic solutions.
Habenula Cholinergic Neurons Suppress Fear Memory Expression
Selectively ablating cholinergic neurons in the mouse medial
habenula increases freezing responses to a cue that was previously associated with danger but no longer represents a threat.
This lesion does not affect fear memory formation or general
locomotor activity. Moreover, optogenetically stimulating the
axonal terminals of habenula cholinergic neurons reduces
freezing responses. The complementary loss- and gain-of-function phenotypes strongly suggest that the activity of habenula
cholinergic neurons plays an important role in suppressing the
expression of fear memory in mammals.
Habenula cholinergic neurons receive inputs from the septal
areas that promote anxiety (Yamaguchi et al., 2013). This at first
seems contradictory to our findings on the role of these neurons
in reducing fear. However, it is known that the MHb receives
inputs from regions other than the septal areas (Qin and Luo,
2009). By focusing on the output of habenula cholinergic neurons
rather than a subset of their inputs, this study reveals that the
activity of these neurons accelerates the decrease of freezing
[Figure 7, panels A–Q: AAV-EF1-DIO-GCaMP6m vector schematic, GCaMP6 and CaV2.3/EYFP images of the MHb and IPN, ΔF/F traces against Time (s) for ctrl, baclofen, and baclofen & Ni2+, and plots of EPSC amplitude (nA) against Time (min).]
Figure 7. GABAB Mediates Excitation by Increasing Presynaptic Ca2+ Entry through Cav2.3-Containing Channels
(A and B) Injecting AAV-DIO-GCaMP6m vectors (A) into the MHb of a ChAT-Cre mouse resulted in GCaMP6 expression in MHb cholinergic neurons and their
axonal terminals in the IPN (B).
(C–G) Baclofen (1 μM) increased GCaMP signals in the IPN. (C) Zoom-in view of GCaMP6 expression in the central IPN. (D) Pseudocolor representation of
GCaMP fluorescence change within the same area as (C) following electrical fiber stimulation (ten pulses at 10 Hz). (E) Raster plots of GCaMP6 signals for 189
terminal areas from three mice. (F and G) Average GCaMP signals in response to single-shot (F) and train (ten pulses at 10 Hz; G) electrical stimulation.
Dashed lines indicate the timing of stimulation. Line width indicates SEM.
(H and I) Ni2+ blocked baclofen's potentiatory effect on the evoked Ca2+ transients in the axonal terminals of habenula cholinergic neurons. (H) shows the
pseudocolor representation of GCaMP fluorescence changes within the IPN. (I) shows average amplitudes of Ca2+ transients evoked by single-shot electrical
stimulation.
(J–L) Data from a single cell (J and K) and the entire test group (L) show that Ni2+ (50 μM) reversibly blocked EPSC potentiation by baclofen (n = 6 cells).
(M and N) Cav2.3 immunoreactivity was observed in ChR2-EYFP+ terminals within the IPN of a ChAT-ChR2-EYFP Cav2.3+/+ mouse (M), but not a Cav2.3 null
mouse (N).
(O–Q) Data from a cell (O and P) and the entire test group (Q) demonstrate that mutating Cav2.3 prevented baclofen from potentiating EPSCs (n = 13 cells).
***p < 0.001; n.s., not significant; two-sided paired t tests in (I, L, and Q). Error bars indicate SEM.
See Table S1 for statistical analyses in detail. See also Figures S6 and S7.
responses to a previous danger-predicting cue. Our observations are consistent with the finding that inactivating or ablating
the fish homolog of the MHb causes helpless behaviors (Agetsuma et al., 2010; Lee et al., 2010), although it had been unclear
whether the fish effect could be extrapolated to mammals, as the
habenular structures in fish and mammals receive distinct inputs
(Stephenson-Jones et al., 2012). Our findings thus indicate that,
despite the difference in inputs across taxa, the habenulo-interpeduncular pathway is evolutionarily conserved to promote
active coping to aversive stimuli.
Because neither the MHb nor the IPN connects directly with
the amygdala, the current study raises the intriguing question
of how habenula neurons contribute to fear control. The habenulo-interpeduncular pathway connects the limbic forebrain
areas related to fear processing to several brainstem modulatory
centers, including the raphe nuclei (Figure S1A). Serotonergic
neurons in the dorsal raphe encode reward signals and selective
serotonin reuptake inhibitors alleviate the clinical symptoms
of PTSD (Li et al., 2016; Liu et al., 2014; Stein et al., 2000).
Possibly, habenula cholinergic neurons regulate the expression
of fear memory by controlling the activity of brainstem neurons,
including serotonergic neurons, thus modulating fear-related
forebrain areas, including the medial prefrontal cortex and amygdala (Herry et al., 2010; Myers and Davis, 2007).
GABAB Activity in Habenula Cholinergic Neurons
Facilitates Fear Extinction
A key finding of the present study is that GABAB activity on the
axonal terminals of habenula cholinergic neurons reduces
the expression of cued fear memory. Genetically inactivating
GABAB receptors in these neurons slows the decay of freezing
response levels during the extinction phase, but does not disrupt
fear memory formation or general locomotion, general anxiety, or
pain. Neurochemistry experiments further determine the IPN as
the key anatomical site for the fear-reducing effects of GABAB
activity. Blocking GABAB activity in the IPN immediately before
the extinction tests similarly increases freezing, whereas activating GABAB in the IPN reduces freezing in wild-type mice but
not in the GABAB conditional knockout mice. Tests using
discrete conditioned stimuli or 180-s continuous tone revealed
similar behavioral phenotypes, suggesting that the change in
fear responses is associated with memory expression rather
than sensory habituation.
Our results thus indicate that GABAB receptors in habenula
cholinergic neurons, particularly those in the axonal terminals,
are critically important for controlling responses to conditioned
stimuli that no longer predict threat. Pharmacological activation
of GABAB or the receptors for the coreleased neurotransmitters
in the IPN speeds up the decay, further supporting the assertion
that presynaptic GABAB activity facilitates fear extinction. On
the other hand, optogenetic stimulation of habenula cholinergic
neurons or infusion of a PDE2A inhibitor into the IPN reduces
the overall freezing response levels. Intra-IPN application of
saclofen increases the overall freezing response levels. Unlike
conditionally knocking out GABAB(1), infusing saclofen also inactivates GABAB receptors that are expressed by non-cholinergic
neurons within the IPN. Thus, the habenulo-interpeduncular
pathway can regulate both fear extinction and the overall fear
response intensity.
GABAB Mediates Presynaptic Excitation
Because GABAB activity had been thought previously to produce only inhibitory responses (Chalifoux and Carter, 2011;
Dutar and Nicoll, 1988a, 1988b; Gassmann and Bettler, 2012;
Newberry and Nicoll, 1984), it at first appears counterintuitive
that both the activity of habenula cholinergic neurons and the activity of GABAB receptors in their axonal terminals play similar
behavioral roles. Our finding that GABAB mediates strong presynaptic
excitation resolves this paradox. Activating GABAB
receptors with baclofen strikingly potentiates glutamatergic
EPSCs, whereas knocking out GABAB in cholinergic neurons
completely abolishes the potentiatory effect. Our study provides
the first demonstration that GABAB excites neurons. Baclofen
also dramatically potentiates the evoked cholinergic EPSCs,
thus revealing that a presynaptic receptor controls neurotransmitter corelease. GABAB activity further gates the release of
NKB to excite interpeduncular neurons, demonstrating both
that NKB is exocytotically released as an excitatory neurotransmitter and that a presynaptic receptor gates neuropeptide transmission. GABAergic neurons are densely distributed in the IPN (Hsu
et al., 2013), suggesting that GABA may function as a retrograde
messenger from interpeduncular neurons to amplify the excitatory effect of habenular outputs.
Thus, baclofen may reduce fear memory expression by potentiating neurotransmitter release from habenula cholinergic neurons to interpeduncular neurons. In support of this possibility,
pharmacological activation of receptors for glutamate, acetylcholine, and NKB within the IPN all facilitated fear extinction.
The fear-reducing effect of intra-IPN acetylcholine infusion is
consistent with the recent finding that inactivating CB1 cannabinoid receptors within habenula neurons increases cholinergic
release and reduces behavioral responses to aversive stimuli
(Soria-Gomez et al., 2015).
Our data provide plausible circuit- and molecular-level explanations for how baclofen alleviates the symptoms of PTSD patients (Cryan and Kaupmann, 2005; Drake et al., 2003; Manteghi
et al., 2014), suggesting that enhancing neurotransmission from
habenula cholinergic neurons may prove effective in treating
PTSD. One strategy is to target phosphodiesterase 2A, which
constitutively inhibits neurotransmission of habenula cholinergic
neurons by negatively coupling with the cAMP pathway (Hu
et al., 2012). Indeed, locally infusing a selective PDE2A inhibitor
into the IPN resulted in decreased freezing responses, suggesting that PDE2A inhibitors may be of use for treating anxiety disorders such as PTSD with the benefit of avoiding baclofen's
potentially sedative effect.
GABAB Activity Increases Presynaptic Ca2+ Influx
through Cav2.3
Finally, our study reveals the presence of novel signaling cascades for GABAB receptors (Figure S7P). Our results indicate
that, upon ligand binding, presynaptic GABAB receptors in habenula cholinergic axons dissociate the pertussis toxin-sensitive G protein Gαi/o-βγ, which in turn facilitates the opening of
voltage-dependent calcium channels to increase Ca2+ influx.
This effect is surprising, because the effects of presynaptic
G protein-coupled receptors on Ca2+ levels were until now
thought to be only inhibitory. Using both pharmacological
and genetic approaches, we identify the CaV2.3-constructing
Ca2+ channel as the key downstream mediator of GABAB
signals. It remains to be further dissected at the molecular
level precisely how dissociated G protein subunits couple
to Cav2.3. The IPN is the only brain region where CaV2.3
expression is predominantly presynaptic (Parajuli et al.,
2012), suggesting that the observed excitatory action requires
the compartmentalization of GABAB receptors, Gαi/o-βγ, and
CaV2.3 channels into precise microdomains in axonal terminals. The habenulo-interpeduncular pathway may serve as
a valuable model for dissecting the detailed molecular
mechanism underlying the rich signaling capacity of GABAB
receptors.
Taken together, this study reveals that GABAB receptors in
the habenula cholinergic neurons facilitate fear extinction by
potentiating corelease of multiple neurotransmitters. Malfunctions in the habenulo-interpeduncular pathway and in GABAB
signaling have been implicated in psychiatric disorders associated with fear and anxiety (Cryan and Kaupmann, 2005; Hikosaka, 2010; Lecourtier and Kelly, 2007). Our results suggest
that potentiating the habenulo-interpeduncular pathway, for example by
activating habenular GABAB receptors or inhibiting PDE2A, presents
a potentially effective therapeutic approach for treating such
disorders.
Detailed materials and methods are available in the Supplemental Experimental Procedures.
Mice
Adult mice of either sex were used. We used simplified genotypes
of mouse strains for clarity. Mouse strains include ChAT-ChR2-EYFP,
VGAT-ChR2-EYFP, ChAT-Cre, GABAB(1)lox511/lox511, Cav2.3 null, and wild-type C57BL6/N mice. All procedures were conducted with the approval
of the Animal Care and Use Committee of the National Institute of Biological Sciences, Beijing in accordance with governmental regulations of
China.
AAV Vectors and Injection
pAAV-flex-taCasp3-TEVp, pAAV-EF1a-DIO-GCaMP6m, and pAAV-EF1a-DIO-EmGFP were packaged into AAV serotype 2/9 vectors. AAV vectors were
injected with pressure into the bilateral MHb of anaesthetized mice.
Behavior
Fear conditioning and extinction were performed essentially as described
previously (Milad and Quirk, 2002; Soria-Gomez et al., 2015). For fear conditioning, auditory tones (20 s, 7.5 kHz, 85–90 dB) were coupled to co-terminating footshocks (1 s, 0.7 mA) for five or ten trials. For fear extinction, a
mouse was presented with conditioned auditory tones that were no longer
coupled to footshocks. We used two slightly different test paradigms to
apply tones. First, we repetitively applied the conditioned auditory tone
(20 s) for ten discrete trials during the daily extinction phases for 3–5 consecutive days. Alternatively, we applied a continuous auditory tone (7.5 kHz,
180 s) during the extinction sessions on days 1 and 6. Two trained observers
scored the mouse behavior off-line in a double-blind manner. Mice were
also tested for pain threshold, open field, light/dark box, and elevated
plus-maze by following protocols described in the Supplemental Experimental Procedures.
Optogenetic stimulation was carried out essentially as described elsewhere (Liu et al., 2014). Light was delivered through a tapered optical fiber
(5-ms pulse duration, 50 Hz frequency, 30 mW output power; fiber outside
diameter [OD] = 200 µm and numerical aperture [NA] = 0.39). For the intra-IPN drug injections, a guide cannula was implanted with its tip targeting
the dorsal IPN. Immediately prior to extinction tests, drugs (saclofen, baclofen, AMPA and NMDA, senktide, acetylcholine, Bay 60-7550) or artificial
cerebrospinal fluid (300 nl) was slowly infused into the IPN via an internal
cannula.
Slice Recording and 2-Photon Imaging
Slice preparation and whole-cell recordings were performed as described
elsewhere (Ren et al., 2011). GCaMP6m fluorescent signals were imaged
with a 20× water immersion objective on a 2-photon microscope (3 frames/s).
For photostimulation, blue light pulses (470 nm) were generated with an LED
and delivered via a 40× water immersion lens (5 ms, 14 mW). For electrical
stimulation, the fasciculus retroflexus was stimulated with bipolar stainless
steel microelectrodes (0.2 ms, 100–150 µA). The drugs were delivered either
through perfusion or by local ejection.
Measurement of cAMP Levels
Measurement of cAMP levels with ELISA kits was performed as described
elsewhere (Hu et al., 2012).
Histology and Immunohistochemistry
Histology, immunostaining, and fluorescent microscopy were performed essentially as described previously (Ren et al., 2011). For details on procedures and
antibodies used, see the Supplemental Experimental Procedures.
seven figures and one table and can be found with this article online at
J. Zhang and M.L. designed the experiments. J. Zhang, J.L., Y.R., L.T., J.
Zhou, C.W., and T.Y. performed behavioral assays. J. Zhang performed physiological recordings. J. Zhang, R.L., J.R., and F.H. performed calcium imaging.
L.T., Y.Z., F.W., and J. Zhang generated Cav2.3 mutant mice. Q.F., R.L., and
L.T. prepared AAV vectors. B.B. provided GABAB(1)lox511/lox511 mice. J. Zhang
and M.L. analyzed the data and wrote the paper.
ACKNOWLEDGMENTS
We thank G. Feng (Massachusetts Institute of Technology) for the ChAT-ChR2-EYFP and VGAT-ChR2-EYFP transgenic mice, D. Duan (University of
Missouri) for advice on AAV virus preparation, Y. Li (Peking University) for
anti-synaptophysin1 antibody, and P. Sterling (University of Pennsylvania),
G. Marsicano (INSERM), and D. Perkel (University of Washington) for comments and discussions. M.L. is supported by China MOST (2012CB837701,
2012YQ03026005, 2013ZX0950910, 2015BAI08B02), NNSFC (91432114),
and the Beijing Municipal Government.
Received: November 8, 2015
REFERENCES
Agetsuma, M., Aizawa, H., Aoki, T., Nakayama, R., Takahoko, M., Goto, M., Sassa, T., Amo, R., Shiraki, T., Kawakami, K., et al. (2010). The habenula is crucial for experience-dependent modification of fear responses in zebrafish. Nat. Neurosci. 13, 1354–1356.
Chalifoux, J.R., and Carter, A.G. (2011). GABAB receptor modulation of synaptic function. Curr. Opin. Neurobiol. 21, 339–344.
Chen, T.-W., Wardill, T.J., Sun, Y., Pulver, S.R., Renninger, S.L., Baohan, A., Schreiter, E.R., Kerr, R.A., Orger, M.B., Jayaraman, V., et al. (2013). Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature 499, 295–300.
Cong, L., Ran, F.A., Cox, D., Lin, S., Barretto, R., Habib, N., Hsu, P.D., Wu, X., Jiang, W., Marraffini, L.A., and Zhang, F. (2013). Multiplex genome engineering using CRISPR/Cas systems. Science 339, 819–823.
Cryan, J.F., and Kaupmann, K. (2005). Don't worry 'B' happy!: a role for GABA(B) receptors in anxiety and depression. Trends Pharmacol. Sci. 26, 36–43.
Drake, R.G., Davis, L.L., Cates, M.E., Jewell, M.E., Ambrose, S.M., and Lowe, J.S. (2003). Baclofen treatment for chronic posttraumatic stress disorder. Ann. Pharmacother. 37, 1177–1181.
Liu, Z., Zhou, J., Li, Y., Hu, F., Lu, Y., Ma, M., Feng, Q., Zhang, J.E., Wang, D., Zeng, J., et al. (2014). Dorsal raphe neurons signal reward through 5-HT and glutamate. Neuron 81, 1360–1374.
Dutar, P., and Nicoll, R.A. (1988a). A physiological role for GABAB receptors in the central nervous system. Nature 332, 156–158.
Manteghi, A.A., Hebrani, P., Mortezania, M., Haghighi, M.B., and Javanbakht, A. (2014). Baclofen add-on to citalopram in treatment of posttraumatic stress disorder. J. Clin. Psychopharmacol. 34, 240–243.
Dutar, P., and Nicoll, R.A. (1988b). Pre- and postsynaptic GABAB receptors in the hippocampus have different pharmacological properties. Neuron 1, 585–591.
Fowler, C.D., Lu, Q., Johnson, P.M., Marks, M.J., and Kenny, P.J. (2011). Habenular α5 nicotinic receptor subunit signalling controls nicotine intake. Nature 471, 597–601.
Frahm, S., Slimak, M.A., Ferrarese, L., Santos-Torres, J., Antolin-Fontes, B., Auer, S., Filkin, S., Pons, S., Fontaine, J.F., Tsetlin, V., et al. (2011). Aversion to nicotine is regulated by the balanced activity of β4 and α5 nicotinic receptor subunits in the medial habenula. Neuron 70, 522–535.
Frahm, S., Antolin-Fontes, B., Görlich, A., Zander, J.F., Ahnert-Hilger, G., and Ibañez-Tallon, I. (2015). An essential role of acetylcholine-glutamate synergy at habenular synapses in nicotine dependence. eLife 4, e11396.
Gassmann, M., and Bettler, B. (2012). Regulation of neuronal GABA(B) receptor functions by subunit composition. Nat. Rev. Neurosci. 13, 380–394.
Gong, S., Doughty, M., Harbaugh, C.R., Cummins, A., Hatten, M.E., Heintz, N., and Gerfen, C.R. (2007). Targeting Cre recombinase to specific neuron populations with bacterial artificial chromosome constructs. J. Neurosci. 27, 9817–9823.
Goto, M., Swanson, L.W., and Canteras, N.S. (2001). Connections of the nucleus incertus. J. Comp. Neurol. 438, 86–122.
Herry, C., Ferraguti, F., Singewald, N., Letzkus, J.J., Ehrlich, I., and Lüthi, A. (2010). Neuronal circuits of fear extinction. Eur. J. Neurosci. 31, 599–612.
Margeta-Mitrovic, M., Mitrovic, I., Riley, R.C., Jan, L.Y., and Basbaum, A.I. (1999). Immunohistochemical localization of GABA(B) receptors in the rat central nervous system. J. Comp. Neurol. 405, 299–321.
Marksteiner, J., Sperk, G., and Krause, J.E. (1992). Distribution of neurons expressing neurokinin B in the rat brain: immunohistochemistry and in situ hybridization. J. Comp. Neurol. 317, 341–356.
Milad, M.R., and Quirk, G.J. (2002). Neurons in medial prefrontal cortex signal memory for fear extinction. Nature 420, 70–74.
Milad, M.R., and Quirk, G.J. (2012). Fear extinction as a model for translational neuroscience: ten years of progress. Annu. Rev. Psychol. 63, 129–151.
Mintz, I.M., and Bean, B.P. (1993). GABAB receptor inhibition of P-type Ca2+ channels in central neurons. Neuron 10, 889–898.
Myers, K.M., and Davis, M. (2007). Mechanisms of fear extinction. Mol. Psychiatry 12, 120–150.
Newberry, N.R., and Nicoll, R.A. (1984). Direct hyperpolarizing action of baclofen on hippocampal pyramidal cells. Nature 308, 450–452.
Parajuli, L.K., Nakajima, C., Kulik, A., Matsui, K., Schneider, T., Shigemoto, R., and Fukazawa, Y. (2012). Quantitative regional and ultrastructural localization of the Ca(v)2.3 subunit of R-type calcium channel in mouse brain. J. Neurosci. 32, 13555–13567.
Hikosaka, O. (2010). The habenula: from stress evasion to value-based decision-making. Nat. Rev. Neurosci. 11, 503–513.
Pollak Dorocic, I., Fürth, D., Xuan, Y., Johansson, Y., Pozzi, L., Silberberg, G., Carlén, M., and Meletis, K. (2014). A whole-brain atlas of inputs to serotonergic neurons of the dorsal and median raphe nuclei. Neuron 83, 663–678.
Hsu, Y.W., Tempest, L., Quina, L.A., Wei, A.D., Zeng, H., and Turner, E.E. (2013). Medial habenula output circuit mediated by α5 nicotinic receptor-expressing GABAergic neurons in the interpeduncular nucleus. J. Neurosci. 33, 18022–18035.
Qin, C., and Luo, M. (2009). Neurochemical phenotypes of the afferent and efferent projections of the mouse medial habenula. Neuroscience 161, 827–837.
Hu, F., Ren, J., Zhang, J.E., Zhong, W., and Luo, M. (2012). Natriuretic peptides block synaptic transmission by activating phosphodiesterase 2A and reducing presynaptic PKA activity. Proc. Natl. Acad. Sci. USA 109, 17681–17686.
Janak, P.H., and Tye, K.M. (2015). From circuits to behaviour in the amygdala. Nature 517, 284–292.
Kajikawa, Y., Saitoh, N., and Takahashi, T. (2001). GTP-binding protein beta gamma subunits mediate presynaptic calcium current inhibition by GABA(B) receptor. Proc. Natl. Acad. Sci. USA 98, 8054–8058.
Kaupmann, K., Huggel, K., Heid, J., Flor, P.J., Bischoff, S., Mickel, S.J., McMaster, G., Angst, C., Bittiger, H., Froestl, W., and Bettler, B. (1997). Expression cloning of GABA(B) receptors uncovers similarity to metabotropic glutamate receptors. Nature 386, 239–246.
Kobayashi, Y., Sano, Y., Vannoni, E., Goto, H., Suzuki, H., Oba, A., Kawasaki, H., Kanba, S., Lipp, H.P., Murphy, N.P., et al. (2013). Genetic dissection of medial habenula-interpeduncular nucleus pathway function in mice. Front. Behav. Neurosci. 7, 17.
Ren, J., Qin, C., Hu, F., Tan, J., Qiu, L., Zhao, S., Feng, G., and Luo, M. (2011). Habenula cholinergic neurons co-release glutamate and acetylcholine and activate postsynaptic neurons via distinct transmission modes. Neuron 69, 445–452.
Sochivko, D., Pereverzev, A., Smyth, N., Gissel, C., Schneider, T., and Beck, H. (2002). The Ca(V)2.3 Ca(2+) channel subunit contributes to R-type Ca(2+) currents in murine hippocampal and neocortical neurones. J. Physiol. 542, 699–710.
Soria-Gomez, E., Busquets-Garcia, A., Hu, F., Mehidi, A., Cannich, A., Roux, L., Louit, I., Alonso, L., Wiesner, T., Georges, F., et al. (2015). Habenular CB1 receptors control the expression of aversive memories. Neuron 88, 306–313.
Stein, D.J., Seedat, S., van der Linden, G.J., and Zungu-Dirwayi, N. (2000). Selective serotonin reuptake inhibitors in the treatment of post-traumatic stress disorder: a meta-analysis of randomized controlled trials. Int. Clin. Psychopharmacol. 15 (Suppl 2), S31–S39.
Lecourtier, L., and Kelly, P.H. (2007). A conductor hidden in the orchestra? Role of the habenular complex in monoamine transmission and cognition. Neurosci. Biobehav. Rev. 31, 658–672.
Stephenson-Jones, M., Floros, O., Robertson, B., and Grillner, S. (2012). Evolutionary conservation of the habenular nuclei and their circuitry controlling the dopamine and 5-hydroxytryptophan (5-HT) systems. Proc. Natl. Acad. Sci. USA 109, E164–E173.
LeDoux, J.E. (2000). Emotion circuits in the brain. Annu. Rev. Neurosci. 23, 155–184.
Tovote, P., Fadok, J.P., and Lüthi, A. (2015). Neuronal circuits for fear and anxiety. Nat. Rev. Neurosci. 16, 317–331.
Lee, A., Mathuru, A.S., Teh, C., Kibat, C., Korzh, V., Penney, T.B., and Jesuthasan, S. (2010). The habenula prevents helpless behavior in larval zebrafish. Curr. Biol. 20, 2211–2216.
VanElzakker, M.B., Dahlgren, M.K., Davis, F.C., Dubois, S., and Shin, L.M. (2014). From Pavlov to PTSD: the extinction of conditioned fear in rodents, humans, and anxiety disorders. Neurobiol. Learn. Mem. 113, 3–18.
Li, Y., Zhong, W., Wang, D., Feng, Q., Liu, Z., Zhou, J., Jia, C., Hu, F., Zeng, J., Guo, Q., et al. (2016). Serotonin neurons in the dorsal raphe nucleus encode reward signals. Nat. Commun. 7, 10503.
Wheeler, D.B., Randall, A., and Tsien, R.W. (1994). Roles of N-type and Q-type Ca2+ channels in supporting hippocampal synaptic transmission. Science 264, 107–111.
Wu, L.G., and Saggau, P. (1995). GABAB receptor-mediated presynaptic inhibition in guinea-pig hippocampus is caused by reduction of presynaptic Ca2+ influx. J. Physiol. 485, 649–657.
Wu, L.G., Borst, J.G., and Sakmann, B. (1998). R-type Ca2+ currents evoke transmitter release at a rat central synapse. Proc. Natl. Acad. Sci. USA 95, 4720–4725.
Yamaguchi, T., Danjo, T., Pastan, I., Hikida, T., and Nakanishi, S. (2013). Distinct roles of segregated transmission of the septo-habenular pathway in anxiety and fear. Neuron 78, 537–544.
Yang, C.F., Chiang, M.C., Gray, D.C., Prabhakaran, M., Alvarado, M., Juntti, S.A., Unger, E.K., Wells, J.A., and Shah, N.M. (2013). Sexually dimorphic neurons in the ventromedial hypothalamus govern mating in both sexes and aggression in males. Cell 153, 896–909.
Zhao, S., Ting, J.T., Atallah, H.E., Qiu, L., Tan, J., Gloss, B., Augustine, G.J., Deisseroth, K., Luo, M., Graybiel, A.M., and Feng, G. (2011). Cell type-specific channelrhodopsin-2 transgenic mice for optogenetic dissection of neural circuitry function. Nat. Methods 8, 745–752.
Theory
The Synchronization of Replication and Division Cycles in Individual E. coli Cells
Authors
Mats Wallden, David Fange,
Ebba Gregorsson Lundius,
Ozden Baltekin, Johan Elf
Correspondence
johan.elf@icm.uu.se
In Brief
Cell-to-cell variation in division timing and
cell size in E. coli is due to differences in
growth rate, whereas replication is
triggered at an invariant
volume per chromosome.
Highlights
• Replication initiates at a nearly fixed volume per chromosome for all growth rates
• The time from initiation to division depends on the individual cell's growth rate
• Variation in growth rate sets the variation in generation time and division size
• E. coli appears as a sizer at slow growth and an adder at fast growth
Wallden et al., 2016, Cell 166, 729–739

Theory
The Synchronization of Replication
and Division Cycles in Individual E. coli Cells
Mats Wallden,1,2 David Fange,1,2 Ebba Gregorsson Lundius,1 Ozden Baltekin,1 and Johan Elf1,*
1Department of Cell and Molecular Biology, Science for Life Laboratory, Uppsala University, Husargatan 3, 75124 Uppsala, Sweden
2Co-first author
*Correspondence: johan.elf@icm.uu.se
SUMMARY
Isogenic E. coli cells growing in a constant environment display significant variability in growth rates,
division sizes, and generation times. The guiding
principle appears to be that each cell, during one
generation, adds a size increment that is uncorrelated to its birth size. Here, we investigate the mechanisms underlying this adder behavior by mapping
the chromosome replication cycle to the division
cycle of individual cells using fluorescence microscopy. We have found that initiation of chromosome
replication is triggered at a fixed volume per chromosome independent of a cell's birth volume and
growth rate. Each initiation event is coupled to a division event after a growth-rate-dependent time. We
formalize our findings in a model showing that cell-to-cell variation in division timing and cell size is
mainly driven by variations in growth rate. The model
also explains why fast-growing cells display adder
behavior and correctly predicts deviations from the
adder behavior at slow growth.
INTRODUCTION
Balanced growth requires that DNA replication keeps up with
cell-division events (Schaechter et al., 1958). This may, at first,
seem hard to achieve for rapid-growth Escherichia coli, where
it takes more time to replicate a chromosome than to double
the biomass. The solution to the apparent paradox was first
described by Cooper and Helmstetter (Cooper and Helmstetter,
1968). In their model, new rounds of DNA replication are started
before the previous round has finished (Figure 1A). As long as
cells, on average, initiate one round of replication per chromosomal origin per generation and divide only once following
each replication termination, the model produces stable cell cycles at all growth rates. An example of a deterministic, i.e., noise-free,
simulation of replication and division cycles including up- and downshifts in growth rates is given in Figure 1B.
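A deterministic simulation of this kind can be sketched in a few lines of Python. The sketch below is only an illustration under stated assumptions, not the authors' code: the function name `simulate_lineage`, the fixed time step, the starting volume, and the parameter values in the usage lines are all hypothetical. It combines the Cooper-Helmstetter rule (division a fixed time `tau` after each initiation) with initiation at a fixed volume per origin, which is the rule used for the simulated lineage in Figure 1B. Passing the growth rate as a function of time allows up- and downshifts.

```python
import math
from collections import deque


def simulate_lineage(mu, tau, v_init=1.0, t_end=1000.0, dt=0.01):
    """Follow one cell lineage deterministically.

    mu     : callable t -> growth rate (1/min); assumed form, allows shifts
    tau    : fixed time (min) from each initiation to its division (C+D)
    v_init : initiation volume per origin (arbitrary volume units)
    Returns a list of (division_time, volume_at_division) tuples.
    """
    volume = 0.5 * v_init        # hypothetical starting volume
    n_origins = 1                # origins present in the tracked cell
    pending = deque()            # division times scheduled by initiations
    divisions = []
    n_steps = int(round(t_end / dt))
    for step in range(1, n_steps + 1):
        t = step * dt
        volume *= math.exp(mu(t) * dt)          # exponential growth
        # Initiate whenever the volume per origin reaches the threshold.
        while volume / n_origins >= v_init:
            n_origins *= 2                      # every origin fires once
            pending.append(t + tau)             # schedule the division
        # Divide when the oldest scheduled division time is reached.
        if pending and t >= pending[0]:
            pending.popleft()
            divisions.append((t, volume))
            volume /= 2.0                       # follow one daughter
            n_origins //= 2
    return divisions


# Usage with illustrative (assumed) values: constant growth rate.
divs = simulate_lineage(lambda t: 0.0241, tau=72.0)
print(divs[-1])
```

With a constant growth rate the lineage settles into a cycle in which the division volume is close to `v_init * exp(mu * tau)` and divisions are one doubling time apart, even when `tau` exceeds the doubling time and replication rounds overlap.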
A missing component of Cooper and Helmstetter's model
is how the cell manages to trigger replication initiation once
per generation. An answer proposed by Donachie (1968) is that
replication initiates at a critical volume per origin. This guarantees that, on average, the concentration of origins is kept constant
as the cells' birth and division volumes change exponentially
in response to the growth rate (Schaechter et al., 1958)
(Figure 1B). However, Donachie's (1968) proposal was later
refuted on the basis that the introduction of extra copies of the
origin of replication region does not disrupt the cell cycle (Helmstetter and Leonard, 1987) and on evidence gathered from cells
synchronized at the time of division using the baby-column technique (Bates and Kleckner, 2005). It was instead proposed that
division issues a license for an initiation event to occur at a
well-defined time later. Recently, Hill et al. (2012) used the rifampicin (rif) run-out technique (Skarstad et al., 1986) in a number
of size mutants to re-establish the constant volume model. Unfortunately, it is not possible in these experiments to correlate
the replication initiation volume in one cell to its growth rate or division size. Without these correlated measurements, it is not
possible to determine what drives the variability and accuracy
of the division and replication cycles.
Recent developments in microscopy, microfluidics, and image
analysis techniques help us answer this question by circumventing the need to experimentally synchronize cell cycles; synchronization can be achieved in post-processing of the images
(Sliusarenko et al., 2011; Ullman et al., 2012; Wang et al.,
2010). These techniques provide direct observations of sizes
and lifespans of individual bacteria growing exponentially under
well-controlled conditions. For example, Wang et al. (2010)
demonstrate that no clear dependence can be inferred between
the age of a cell, as defined by the number of divisions since the
establishment of the oldest pole, and the growth rate.
The data from Wang et al. (2010) were later used by Osella et al.
(2014) to investigate the mechanisms governing cell division. The
authors conclude that the observed correlations are inconsistent
with either a purely time-dependent or purely size-dependent
control mechanism of division. Instead, they defined a phenomenological description that included both time and size dependence. The composite control scenario for division has
recently been explained by an adder model in which the volume added between one replication initiation event and the next is independent of cell size (Amir, 2014). Soon after, an alternative version
of the adder model was presented by Campos et al. (2014) and
Taheri-Araghi et al. (2015). Here, the volume added between
birth and division is independent of the individual cell's birth
size; this model has been shown to be consistent with an overwhelming
body of additional experimental data. Although the adder
model shows striking predictive power for cell-cycle-related distributions over a large range of growth conditions (Taheri-Araghi
et al., 2015), it is not known what gives rise to the adding.
Here, elucidation of the molecular mechanism underlying the
adder model is the main focus. Such a mechanism should
explain why cells behave as adders, what gives rise to the
observed correlations in division sizes and generation times,
and when deviations from the adder model should be expected.
Our approach is to make a single-cell version of the classical
Cooper-Helmstetter (CH) model for the coupling of replication
and division cycles in E. coli and test if the model accounts for
the observed cell-to-cell variation. To do this, we revisit the assumptions of the CH model with the following questions: what
determines the point in the cell cycle at which replication starts?
How long does it take to replicate the genome and divide? How
are these two stochastic parameters correlated and dependent
on the growth rate of an individual cell?
In order to acquire the data we need to answer these questions, we combine tracking of growing cells throughout their division cycles (Ullman et al., 2012) with fluorescence microscopy
of labeled DNA replication components (Adiciptaningrum et al.,
2015; Reyes-Lamothe et al., 2010). Following parameterization,
the model was validated based on its ability to predict the variation in division timing and cell size. Finally, we have used the
model to explain the adder behavior previously observed in
fast-growing cells, and we also accurately predict a deviation
from the adder model at slow growth.
Figure 1. Coupling of Replication and Division Cycles in E. coli
(A) An illustration of the Cooper-Helmstetter model, where division events are scheduled a fixed time after replication initiates, is shown for slow (top) and intermediate growth (bottom). Initiation once per generation results in overlapping replication cycles at fast growth.
(B) Simulated volume expansion and division for an idealized cell lineage going through an upshift and a downshift in growth rate. Replication is initiated at a fixed volume per chromosome (red circle), and the cells divide a fixed period of time later, including the required time for replication (C-period) and chromosome segregation and septum formation (D-period). A dashed green arrow indicates the relation between initiation of replication and its corresponding division.

RESULTS
Characterizing Replication and Division in Individual
Cells
To directly study the coordination of the replication and division
cycles, we imaged a fluorescently labeled epsilon subunit of
DNA polymerase (Pol) III, named DnaQ (Reyes-Lamothe et al.,
2010). Because of the low diffusivity of the Pol IIIs engaged in
replication, individual replisomes can be localized by using single-molecule fluorescence
imaging (Figure 2A). We ensure that labeling
does not influence growth (Figures S1A and
S1B) or replication initiation (Figures S1C–S1E).
Cell size and division events were determined by time-lapse phase-contrast microscopy of cells grown in a microfluidic device
(Ullman et al., 2012) (Figure 2D). This device keeps the E. coli
cells in a state of exponential growth for the days during which
they are simultaneously imaged in the microscope. The individual cell's growth rate, generation time, lineage, and size were obtained from phase-contrast images by using customized image
analysis and tracking tools (Magnusson et al., 2015; Sadanandan et al., 2014). The precision in estimating the division time using phase contrast was found to be 2 min based on a comparison
to the division time established using a fluorescent segmentation
marker (Figure S2). Individual cells' growth rates determined by
phase-contrast microscopy were very similar to those determined by fluorescence microscopy (Figure S3D). Individual cells
grew exponentially, independent of the position within the device
and unperturbed by imaging laser exposure (Figures S3A–S3C).
The individual cell's growth rate is defined by a fit of an exponential function $V_B e^{\mu t}$ to the cell's volume expansion, where $V_B$ is the
birth volume and $\mu$ is the growth rate. We observed significant
cell-to-cell variation in division size, generation time, and growth
rate (Figures 2E–2G). A model of the cell cycle should explain
how these distributions are related and what drives the variation.
Based on the assumption that the cells do not limit their growth
rate to reach a particular size or division time, we hypothesized
that the division time and size depend on the growth rate. We
therefore used growth rate as the basis for the single-cell version
of the CH model and tested if it could predict the variation in division time and cell size when the variation in growth rate was
used as an input.
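As an illustration of how a single cell's growth rate can be extracted from a volume time series, the sketch below fits the exponential V_B·exp(μt) by ordinary least squares on log-volumes. This is an assumption made for the example: the text states only that an exponential function is fitted, not which fitting procedure was used, and the function name and synthetic data are hypothetical.

```python
import math


def fit_exponential_growth(times, volumes):
    """Fit V(t) = V_B * exp(mu * t) by linear regression of ln(V) on t.

    times   : sequence of time points (min), measured from birth
    volumes : sequence of positive cell volumes at those times
    Returns (v_birth, mu): fitted birth volume and growth rate (1/min).
    """
    logs = [math.log(v) for v in volumes]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    mu = sxy / sxx                        # slope = growth rate
    v_birth = math.exp(y_mean - mu * t_mean)  # intercept at t = 0
    return v_birth, mu


# Synthetic, noise-free example (assumed values, not measured data).
ts = [2.0 * i for i in range(30)]
vs = [1.2 * math.exp(0.0111 * t) for t in ts]
v_b, mu_hat = fit_exponential_growth(ts, vs)
print(v_b, mu_hat)
```

A log-linear fit weights relative rather than absolute volume errors; a nonlinear least-squares fit on the raw volumes would be an equally plausible choice.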
Figure 2. Characterizing Growth and Chromosome Replication at the Single-Cell Level
(A) A fluorescence image with identified DnaQ-Ypet.
(B and C) Fluorescence and phase-contrast time lapse of an individual cell with fluorescently labeled replisomes.
(D) Automatically segmented phase-contrast image of E. coli growing in a microfluidic device.
(E) Distributions of single-cell growth rates for fast (black), intermediate (blue), and slow (red) conditions (see the Experimental Procedures for definitions) fitted to normal distributions (corresponding solid lines). Estimated averages for the growth rate distributions are $\langle\mu\rangle$ = 0.0241 min⁻¹, $\langle\mu\rangle$ = 0.0111 min⁻¹, and $\langle\mu\rangle$ = 0.0043 min⁻¹ and their corresponding SDs $\sigma_\mu$ = 0.0039, $\sigma_\mu$ = 0.0026, and $\sigma_\mu$ = 0.00098 for the fast, intermediate, and slow growth conditions, respectively. The number of cells included in the distributions is 5,385, 1,217, and 1,683 for fast-, intermediate-, and slow-growth conditions, respectively. The data were collected from six microfluidics traps in one microfluidic chip for each condition. Filtering criteria are described in the
(F) Distributions of generation times corresponding to (E).
(G) Distributions of birth volumes corresponding to (E).
See also Figures S1, S2, S3, and S5.

Initiation of Replication Occurs at a Constant Volume
The single-cell CH model requires that we determine when replication is initiated. Figures 3A–3C show the localization of the
replication complexes, the replisomes, along the long axis of
the cells as a function of the cell volume for three different growth
conditions. New rounds of replication were observed to start at
defined cell sizes (Figures 3A–3C, red dashed line). If the cells,
instead, were aligned by the time from division, the distribution
of replisomes was less coherent (Figures S4A–S4C). This suggests that control of the replication cycle is related to size rather
than to the time from division (Donachie, 1968; Hill et al., 2012).
The origin of replication locus, oriC, co-localized with the replisomes at replication initiation, as shown by a fluorescently
labeled MalI transcription factor bound at the oriC-proximal
bgLG locus (Figures 3D–3F). Furthermore, the number of
ongoing replication cycles was validated by replication run-out
experiments (Skarstad et al., 1986) adapted to the microfluidic
environment. For slow, intermediate, and fast growth, the number of origins was typically 1 or 2, 2 or 4, and 4 or 8, respectively
(Figure S4D). The observed initiation volumes (Figures 3A–3C,
red dashed lines), divided by the corresponding number of origins, are relatively invariant for the different growth conditions
(0.9–1.0 µm³). This is particularly clear for slow growth, where
a significant fraction of the cells initiated one round of replication
at 0.9 µm³ and another at 1.8 µm³ (Figure 3A). Replication initiation at integer multiples of a fixed volume, i.e., just after division
and just before division as seen in the slow-growth case, invalidates any model in which replication is initiated at a certain
time or added volume following division.
Initiation Volume Is Uncorrelated with Volume at Birth
and Growth Rate
Next, we asked if the small cell-to-cell variation in initiation volume per chromosome depends on the individual cell's birth volume or growth rate. The detection of DnaQ does not allow us to
reliably monitor the individual replication initiation events in individual cells due to fluorophore maturation, blinking and/or
bleaching. For this reason, we also studied replication using a
strain with a chromosomal seqA-ypet fusion (Babic et al.,
2008) (courtesy of the Waldminghaus lab). A large number of
SeqA molecules bind the hemimethylated DNA, which trails the
replication forks (Waldminghaus et al., 2012), resulting in highly
fluorescent dots at the sites of replication that can be detected
throughout the cell cycle of individual cells (Figure 4A) (Adiciptaningrum et al., 2015). The SeqA fusion strain did not display any
significant alteration in the location of replication events over the
cell cycle (Figure S5). However, different regions of the chromosome are hemimethylated for different periods of time (Campbell
and Kleckner, 1990), and the SeqA-Ypet signal will therefore not
be directly proportional to the number of replisomes. Given this
limitation, we only used this strain to determine the timing of
replication initiation in relation to the division events.
By studying the SeqA strain, we could determine that the cells
initiate replication at 0.92 µm³ in the slow-growth condition with
one chromosomal origin, oriC, and at 1.73 µm³ in the intermediate-growth condition with two oriCs. The SD for the cell-to-cell
variation in initiation volume is 0.07 µm³ for slow growth and
0.17 µm³ for intermediate growth. At fast growth, there are too
many ongoing rounds of replications to unambiguously identify
the initiation events. Although the variation in initiation volume
per oriC is only 10%, it should be seen as an upper limit since
any error in estimating the cell volume will contribute to this
number.
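The arithmetic behind the per-origin figures can be checked directly from the reported SeqA means and SDs. The short Python sketch below is illustrative only; the helper name is hypothetical, and the relative variation is taken here as SD divided by mean, which is an assumption about how the roughly 10% figure was obtained.

```python
def per_origin_stats(mean_volume, sd_volume, n_origins):
    """Return (initiation volume per origin, relative variation).

    mean_volume, sd_volume : mean and SD of the initiation volume
    n_origins              : number of oriC copies at initiation
    The relative variation (SD / mean) is the same whether it is
    computed for the whole-cell volume or per origin.
    """
    per_origin = mean_volume / n_origins
    relative_variation = sd_volume / mean_volume
    return per_origin, relative_variation


# Values reported in the text for the SeqA strain.
slow = per_origin_stats(0.92, 0.07, 1)          # slow growth, one oriC
intermediate = per_origin_stats(1.73, 0.17, 2)  # intermediate, two oriCs
print(slow, intermediate)
```

Both conditions give a per-origin initiation volume of roughly 0.9 and a relative variation below 10%, in line with the statement above.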
The small variation in initiation volume between cells is not
correlated with the individual cells growth rate or volume at birth
(Figures 4B and 4C). However, lack of correlation can be due to
many factors, including measurement errors in the birth volume,
VB. For this reason, we test the more specific prediction that
cells born small spend more time, tB, between birth and initiation. In fact, if the initiation volume VI is constant, we would
expect tB z(ln(VI)- ln(VB))/m. Using our measured average growth
rates, we have found that the birth volumes and initiation times
are related as expected (Figure 4D). This confirms that the lack
of correlation between $V_B$ and $V_I$ is not a consequence of inaccurate measurements. The fixed initiation volume per chromosome
can therefore be seen as a reset point for the otherwise correlated variations in the cell cycle.

Figure 3. Distributions of Replication Timing
[Panels A–C: DnaQ localisation (µm) versus cell volume (µm³). Panels D–F: oriC localisation (µm) versus cell volume (µm³).]
(A–C) The distributions of localized DnaQ along the long axis of the cell (x axis) for cells of different volumes (y axis) are shown. Panels correspond to slow- (A), intermediate (B), and fast-growth (C) conditions (see the Experimental Procedures for definitions). Replication initiation is indicated by red dashed lines. White dashed lines indicate average volumes at birth and division. Note that the two replication forks initiated from the same origin of replication are spatially too close to be visualized as separate distributions. The number of cells included in the distributions is 10,774, 5,946, and 4,257, for fast-, intermediate, and slow-growth conditions, respectively.
(D–F) Same as (A)–(C), but with localization of MalI-Venus bound to malO sites proximal to oriC.
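The prediction for the time from birth to initiation can be illustrated with a short calculation. The following is a minimal sketch, assuming purely exponential growth and initiation at exactly $V_I$; the function name, the growth rate of 0.005 min⁻¹, and the birth volumes are illustrative assumptions, and only the slow-growth initiation volume of 0.92 µm³ is taken from the text.

```python
import math

V_I = 0.92  # µm^3, slow-growth initiation volume quoted in the text


def time_to_initiation(v_birth, v_init, mu):
    """Predicted birth-to-initiation time for exponential growth.

    Assumes V(t) = v_birth * exp(mu * t) and initiation exactly at v_init.
    Returns 0.0 when the cell is born at or above v_init; this convention
    is an assumption of the sketch (such cells are born mid-replication).
    """
    if v_birth <= 0 or v_init <= 0 or mu <= 0:
        raise ValueError("volumes and growth rate must be positive")
    if v_birth >= v_init:
        return 0.0
    return math.log(v_init / v_birth) / mu


if __name__ == "__main__":
    mu = 0.005  # min^-1, illustrative value only
    for v_b in (0.5, 0.6, 0.7, 0.8):
        t_b = time_to_initiation(v_b, V_I, mu)
        print(f"V_B = {v_b:.2f} um^3 -> t_B = {t_b:6.1f} min")
```

Cells born smaller are assigned a longer waiting time, which is the relation tested against the data in Figure 4D.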
Time from Replication Initiation to Division Is Growth-Rate Dependent
The next step in making a single-cell version of the CH model
was to determine how much time cells spend replicating and
segregating their chromosomes and dividing (the C and the D
periods). Cooper and Helmstetter assumed this time to
be constant for generation times shorter than 60 min, and the
main question is thus whether this holds under all growth conditions. By
mapping replication to the division cycle using the DnaQ data,
we determined the C and D periods for the different growth conditions (Figures 5A–5C; see Determining the C+D periods in DnaQ-Ypet strain in the Supplemental Experimental Procedures). We
found that the average C+D period, $\tau$, is relatively constant
for the fast and intermediate growth conditions but is much
longer at slow growth (Figure 5C).
Previous replication run-out experiments in bulk (Michelsen
et al., 2003) suggest a functional form for the growth-rate dependence as $\tau = \alpha\mu^{-\beta} + \gamma$. In this equation, $\gamma$ can be thought of as the
minimal time for chromosome replication, chromosome segregation, and septum formation at fast-growth rates. Conversely,
at slow growth, these processes are limited by factors related
to growth, where a 1% decrease in growth rate results in approximately a $\beta$% increase in $\tau$. Fitting the data from the DnaQ strain,
which has been stratified based on growth rates in the different
media, gives $\alpha = 1.3$, $\beta = 0.84$, and $\gamma = 42$ min. This implies
that $\tau$ is strongly dependent on growth rate at slow growth.
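To make the fitted growth-rate dependence concrete, the sketch below evaluates the power law with the quoted parameters. It assumes that the growth rate is expressed in min⁻¹ and that the exponent enters with a negative sign, which is what the statement about a 1% decrease in growth rate implies; the function names and the two example growth rates are illustrative and not part of the original analysis.

```python
ALPHA, BETA, GAMMA = 1.3, 0.84, 42.0  # fitted values quoted in the text


def cd_period(mu, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """C+D period tau (min) for a growth rate mu (assumed to be in min^-1)."""
    if mu <= 0:
        raise ValueError("growth rate must be positive")
    return alpha * mu ** (-beta) + gamma


def cd_elasticity(mu, alpha=ALPHA, beta=BETA, gamma=GAMMA):
    """Relative sensitivity d ln(tau) / d ln(mu) = -beta * (1 - gamma / tau)."""
    tau = cd_period(mu, alpha, beta, gamma)
    return -beta * (1.0 - gamma / tau)


if __name__ == "__main__":
    for mu in (0.005, 0.02):  # illustrative slow and fast growth rates
        print(f"mu = {mu:.3f} min^-1: tau = {cd_period(mu):.0f} min, "
              f"elasticity = {cd_elasticity(mu):.2f}")
```

The elasticity approaches $-\beta$ only when $\gamma$ is small compared with $\tau$, that is, at slow growth; at fast growth $\tau$ flattens toward $\gamma$.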
To test whether the $\tau$ dependence estimated from the population averages also applies to individual cells, we used the data from the
SeqA-YFP strain, where initiation of replication could be accurately determined in individual cells (Figure 5D). Our results show
two things. First, cells with the same growth rate in different media have different $\tau$, and therefore $\tau$ does not depend only on
growth rate. The difference in $\tau$ for cells growing at the same
rate in different media, in combination with a fixed initiation volume, will result in different cell sizes, which is in agreement
with the observations by Taheri-Araghi et al. (2015). Second,
there is also a strong growth-rate dependence of $\tau$ at the single-cell level; in the same medium, individual cells growing slower
display a longer $\tau$ than individuals growing faster. For cells
growing under the slow-growth condition, the cell-to-cell variation is $\tau \approx \alpha/\mu$, i.e., an increase in generation time by 1% results
in an increase of $\tau$ by 1%. This strong response to the individual
cell's growth rate implies that slowly growing cells get more time
to grow before they divide than more rapidly growing cells in the
same medium.
Single-Cell Cooper-Helmstetter Model
On the basis of our data, we can recast the Cooper and Helmstetter model (Figure 1A) in a stochastic single-cell setting. The two
basic assumptions of the model are that replication is initiated
when the cell has grown to a fixed volume per origin (Figures 3
and 4) and that the cell divides at a growth-rate-dependent time
after initiation (Figure 5). The initiation volume, $V_I$, and the time between initiation and division, $\tau$, are parameterized according to
our experimental measurements (Figures 3, 4, and 5). To account
for the experimentally observed cell-to-cell variation, we can
introduce variability in the growth rate, $\mu$, the initiation volume,
$V_I$, the C+D period, $\tau$, or in combinations of the three. In Figure 6A,
we illustrate a few simulated cell cycles where the growth rates of
new-born cells are sampled from the experimental distribution.
When a cell grows past the replication initiation size, which is
independently sampled from the distribution of initiation sizes,
a division event is scheduled at a time $\tau$ later. (See Modeling
in the Supplemental Experimental Procedures for a more detailed
description of the simulation.) When a large number of cell cycles
have been simulated, the statistical properties of these can be
compared to the experimental distributions.
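The simulation logic described above can be sketched in a few lines. This is not the authors' simulation code (which is described in the Supplemental Experimental Procedures) but a minimal event-driven illustration under stated assumptions: growth is exponential with a rate drawn at birth, division is symmetric, all origins fire together when the volume per origin reaches a freshly sampled initiation volume, and each initiation schedules one division a time $\tau(\mu)$ later. The function name, the sampler callbacks, and all numerical values in the usage example are hypothetical.

```python
import heapq
import math
import random


def simulate_lineage(n_divisions, sample_mu, sample_vi, tau_of_mu, v0=1.0, seed=0):
    """Follow one lineage; return (birth volume, division volume, generation time)."""
    rng = random.Random(seed)
    t, v, n_ori = 0.0, v0, 1
    mu = sample_mu(rng)   # growth rate of the current cell (min^-1)
    vi = sample_vi(rng)   # initiation volume per origin for the next initiation
    pending = []          # min-heap of scheduled division times
    t_birth, v_birth = t, v
    cycles = []
    while len(cycles) < n_divisions:
        # Time until the volume per origin reaches vi (zero if already above it).
        dt_init = max(0.0, math.log(n_ori * vi / v) / mu)
        dt_div = pending[0] - t if pending else math.inf
        if dt_init <= dt_div:
            # Initiation: every origin fires; a division is scheduled tau later.
            t += dt_init
            v *= math.exp(mu * dt_init)
            n_ori *= 2
            heapq.heappush(pending, t + tau_of_mu(mu))
            vi = sample_vi(rng)
        else:
            # Division: record the finished cycle and follow one daughter.
            t += dt_div
            v *= math.exp(mu * dt_div)
            heapq.heappop(pending)
            cycles.append((v_birth, v, t - t_birth))
            v /= 2.0
            n_ori //= 2
            t_birth, v_birth = t, v
            mu = sample_mu(rng)  # the new-born cell draws a new growth rate
    return cycles


if __name__ == "__main__":
    # Illustrative slow-growth-like setting: tau = alpha / mu, Gaussian noise.
    cycles = simulate_lineage(
        n_divisions=1000,
        sample_mu=lambda rng: max(1e-4, rng.gauss(0.004, 0.23 * 0.004)),
        sample_vi=lambda rng: max(0.1, rng.gauss(0.92, 0.095 * 0.92)),
        tau_of_mu=lambda mu: 0.518 / mu,
    )
    mean_vd = sum(c[1] for c in cycles) / len(cycles)
    print(f"mean division volume: {mean_vd:.2f} um^3")
```

With all three samplers made constant, the lineage settles into a cycle with generation time $\ln 2/\mu$ and division volume $V_I e^{\mu\tau}$, which is a convenient sanity check before noise is added.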
In Figure 6B, we show the experimentally measured distributions of cell sizes at birth and division as a function of generation
times and growth rates, respectively. The same distributions
were also calculated from the model where the cell-to-cell variability is introduced in the growth rate only (Figure 6C). There is
an approximate agreement with the experimental
distributions, especially in the marginal distributions (Figure 6E),
but the cell size distributions are too narrow, especially for
slow growth (Figure 6C, red curves). This is because of the fixed
initiation volume and a perfect compensation in $\tau$ for variations in
growth rate. In Figures 6D and 6F, the variation in replication initiation volume and the C+D period was also included. With these
additional sources of variation, there is an overall excellent agreement with the experimental data, suggesting that the model
captures the most important aspects of cell-cycle variation.

Figure 4. Initiation of Replication in Individual Cells
[Axis labels: Initiation Volume (µm³), Growth Rate (min⁻¹), Initiation Time (min), Birth Volume (µm³); inset: Probability dot in first frame versus Birth Volume (µm³). Panel A rows: Interm. Growth, Slow Growth; time runs left to right.]
(A) Fluorescence image time-lapse series of the SeqA-Ypet strain. Top: the division cycle of one cell at slow growth. The frames are separated by 18 min. Bottom: one-and-a-half division cycles for a cell at intermediate growth. The frames are separated by 14 min. See also Movies S1 and S2 for additional examples of fluorescence image time-lapse series.
(B) Scatter plot of birth volumes and initiation volumes for individual cells at the slow-growth rate (red) and intermediate growth rate (blue). No initiations can be observed below the dashed line since this would correspond to initiation in the mother cell. The inset shows the fraction of cells that are born with ongoing replication for different birth volumes.
(C) The correlation of the individual cell's growth rate and initiation volume visualized in a scatter plot.
(D) The individual cell's birth volume plotted against the time to initiation. The dashed lines correspond to the theoretical predictions $t = (\ln V_I - \ln V_B)/\langle\mu\rangle$, assuming a constant initiation volume. The number of cells included is 605 for intermediate and 401 for slow-growth conditions.
Basis for the Adder Mechanism and When It Breaks Down
Given the accuracy of the model in describing the division timing
and cell sizes, we could use it to understand the mechanisms
underlying the adder model (Amir, 2014; Campos et al.,
2014; Taheri-Araghi et al., 2015; Voorn and Koppes, 1998). We
started by looking at what the single-cell CH model predicts
under the growth conditions for which the adder has been
described. Here, the growth is relatively fast; replication is initiated in the mother or grandmother generation, and $\tau$ is, as a first
approximation, constant. This means that, regardless of its birth
size, a cell will divide a fixed time after a replication initiation
event that occurred in a previous generation (Figure 7A). This
results in a volume expansion per generation that is uncorrelated
with the birth volume (Figure 7B) and instead depends on the
growth rate (Figure 6B), i.e., it is an adder.
The situation is different in slowly growing cells, where replication initiation occurs in the same generation as the corresponding
division event and $\tau$ is strongly dependent on growth rate. This
implies that slowly growing cells have more time to grow from
the initiation volume to cell division than more rapidly growing cells
(Figure 7C). Since $\tau \approx \alpha/\mu$, the division volume will be proportional
to the initiation volume, i.e., $V_D = V_I e^{\mu\tau} = V_I e^{\alpha}$. This striking relation
is seen for the slow-growing cells (Figure 7D). (The dashed line in
Figure 7D is not a fit but a theoretical prediction with slope $e^{\alpha}$, where
$\alpha$ is estimated by a regression of the C+D periods for slow growth
in Figure 5D to $\tau = \alpha/\mu$.) Furthermore, since the initiation volume
does not depend on the growth rate or the birth volume (Figure 4),
the division volume will be independent of growth rate and the
birth volume. This implies that the slowly growing cells behave
more like sizers, although the main reason is not an explicit
size sensor at division but rather at initiation of replication.
Furthermore, cells born small are predicted to add a relatively
greater volume compared to cells born large, resulting in negative
correlations between birth volume and added volume. This is also
observed in the experimental data (Figure 7B, red curves).
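The sizer-like behavior at slow growth follows directly from $\tau \approx \alpha/\mu$ and can be checked numerically. The sketch below is illustrative only: it assumes that initiation and division occur in the same generation, uses the slow-growth initiation volume of 0.92 µm³ and the value $\alpha = 0.518$ quoted in the Figure 7D legend, and the function names and test growth rates are hypothetical.

```python
import math

ALPHA = 0.518  # value quoted in the Figure 7D legend
V_I = 0.92     # µm^3, slow-growth initiation volume quoted in the text


def division_volume(v_init, mu, alpha=ALPHA):
    """Division volume when the C+D period is tau = alpha / mu."""
    tau = alpha / mu
    return v_init * math.exp(mu * tau)  # equals v_init * exp(alpha)


def added_volume(v_birth, v_init, mu, alpha=ALPHA):
    """Volume added between birth and division (requires v_birth < v_init)."""
    return division_volume(v_init, mu, alpha) - v_birth


if __name__ == "__main__":
    for mu in (0.003, 0.005):          # illustrative growth rates (min^-1)
        for v_b in (0.6, 0.8):         # illustrative birth volumes (µm^3)
            print(f"mu={mu}, V_B={v_b}: V_D={division_volume(V_I, mu):.3f}, "
                  f"added={added_volume(v_b, V_I, mu):.3f}")
```

Because the growth rate cancels in $\mu\tau$, the division volume is the same for every growth rate and birth volume, so the added volume falls one-for-one with the birth volume. This is the idealized limit without noise in $V_I$ or $\tau$.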
The negative correlation of the added volume to the birth volume in slow-growing cells (Figure 7B) is a clear deviation from
the adder model. This negative correlation would, however, also
be observed if the measured birth and division volumes were uncorrelated due to measurement errors. To reduce such errors as
much as possible, we repeated the slow-growth experiment with
cells expressing high concentrations of a fluorescent segmentation marker. By using a sensitive, high-pixel-density sCMOS
camera, we could image magnified cells at low laser power,
which allowed for high-resolution segmentation without using
active contour modeling. In Figure 7E, we show that the slow-growing cells still display a deviation from what would be
expected in the adder model.

Figure 5. The C+D Period for Different Growth Conditions
[Axis labels: C+D (min) versus Growth Rate (min⁻¹). Legend: Fast, Intermediate (37°C), Intermediate (30°C), Slow. Inset labels in (D): fit: 0.73/0.94 and fit: 14/0.45.]
(A and B) Illustrations of the procedure used to compute the C+D periods for slow (A) and intermediate (B) growth. The distributions of DnaQ as functions of volume from Figures 3A and 3B are concatenated based on division time to allow for C+D period determination. The inferred C and D periods are shown as gray bars. As in Figure 3, the red dashed line indicates initiation volumes, and the white dashed lines indicate division volumes.
(C) The C+D periods determined in the DnaQ-Ypet strain plotted against the growth rate. The data are fitted with a power-law curve, $1.3\mu^{-0.84} + 42$ min (dashed line). The data for slow (red circles) and intermediate growth (blue circles) are stratified based on growth rate (see Determining the C+D period for DnaQ-Ypet strain in the Supplemental Experimental Procedures). The number of cell cycles used in each point is, in order of increasing growth rate: 593, 1,427, 1,464, 595, 814, 2,023, 2,020, 844, 3,758, and 10,774.
(D) C+D periods measured for individual cells in the SeqA-YFP strain. The fitted curves are given in the inset.
DISCUSSION
The single-cell version of the Cooper and Helmstetter model
has two basic components: an initiation volume per chromosome that does not depend on the birth size of the cell or
its growth rate, and a growth-rate-dependent time for replication and division. The model explains the adder behavior
at fast growth and sizer behavior at slow growth. Here we
discuss what these observations say about the underlying
mechanisms.
Replication Initiation Size
We find that the cell-to-cell variation in volume at which DNA
replication initiates is approximately 10% (SD/mean). The observation that replication initiates at a relatively constant volume per
oriC is not novel in itself (Donachie, 1968; Hill et al., 2012; Wold
et al., 1994), but the single-cell time-lapse measurements also
allow us to determine that the initiation volume is not correlated
with the birth size of the cell or its growth rate. Our findings therefore imply that it is meaningful to think of replication initiation as
the starting point of the cell cycle.
Although we find that coordination between replication and division is assured by initiating DNA replication at a constant size
per chromosome origin, it is not likely that it is the oriC locus itself
that is regulated (Helmstetter and Leonard, 1987) but rather
some closely linked locus. A likely candidate is datA that binds
the active form of the replication initiation protein DnaA-ATP
and promotes its hydrolysis to its inactive form DnaA-ADP (Donachie and Blakely, 2003; Kasho and Katayama, 2013). Therefore, when datA is replicated, the initiation potential drops,
which makes datA a good candidate for how the control system
for initiation senses chromosome copy number. For a review of
DnaA-mediated replication initiation, see Skarstad and
Katayama (2013).
At fast growth, i.e., when the cell should initiate replication at
multiple origins, it is also important that this occurs synchronously (Skarstad et al., 1986). All origins have to fire within the
period when newly replicated origins are hemimethylated and
protected from re-initiation by sequestration for approximately
13 min (Lu et al., 1994). In Figure S5L, we show that at least
90% of the origins fire within this time at the intermediate growth
rate, where it is possible to identify the appearance of replication
forks at the quarter positions in the same cell.
What does the accuracy in initiating replication at a fixed
volume say about the underlying mechanism? The data show
that in 95% of the cells, initiation is triggered within a range
corresponding to a 50% change in volume. This implies that
the initiation rate, whatever determines it, has to increase sharply
in a narrow volume range. For example, if the instantaneous
initiation rate, $r$, responds to the volume, $V$, as a Hill function
$r = a\,(V/V_I)^n / (1 + (V/V_I)^n)$, the exponent, $n$, needs to exceed
20 in order to initiate in the observed volume range (see Initiation Sensitivity Analysis in the Supplemental Experimental Procedures). To achieve this, the cell needs a very sensitive control
system (Paulsson and Ehrenberg, 2001; Savageau, 1976). In a
simple control system, in which a repressor of replication would
dissociate from oriC due to dilution by volume growth, the
repression would at most decrease by 1% for a 1% volume
growth. The experimental data instead require a mechanism
that can respond by 20% to a 1% change in volume. This cannot
be achieved by models that rely on simple titration (Hansen et al.,
1991; Pritchard, 1968; Sompayrac and Maaloe, 1973). Plausible
mechanisms are instead based on energy-dependent cycling of,
for example, DnaA between its ATP and ADP forms (Kurokawa
et al., 1999; Sekimizu et al., 1987). This could, for example, be
a zero-order modification-demodification scheme (Koshland
et al., 1982) sensitive to the chromosome-to-volume ratio, or an
irreversible multi-step process (Paulsson and Ehrenberg,
2000) where DnaA-ATP builds up an initiation complex that is
interrupted by the incorporation of DnaA-ADP. The requirement
to reach the experimentally observed sensitivity should guide
further thinking about possible mechanisms.
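The link between the Hill exponent and the percentage response can be made explicit. For the Hill form above, the logarithmic sensitivity is $d\ln r/d\ln V = n/(1 + (V/V_I)^n)$, which is at most $n$ and equals $n/2$ at $V = V_I$. The sketch below evaluates this expression and checks it against a finite difference; it is an illustration with hypothetical function names and does not reproduce the analysis in the Supplemental Experimental Procedures.

```python
import math


def hill_rate(v, v_i, n, a=1.0):
    """Instantaneous initiation rate r = a (V/V_I)^n / (1 + (V/V_I)^n)."""
    x = (v / v_i) ** n
    return a * x / (1.0 + x)


def log_sensitivity(v, v_i, n):
    """Analytical d ln(r) / d ln(V) = n / (1 + (V/V_I)^n)."""
    return n / (1.0 + (v / v_i) ** n)


def numerical_log_sensitivity(v, v_i, n, rel_step=1e-6):
    """Central finite-difference estimate of d ln(r) / d ln(V)."""
    up = hill_rate(v * (1.0 + rel_step), v_i, n)
    down = hill_rate(v * (1.0 - rel_step), v_i, n)
    return (math.log(up) - math.log(down)) / (
        math.log(1.0 + rel_step) - math.log(1.0 - rel_step)
    )


if __name__ == "__main__":
    for n in (1, 5, 20):
        s = log_sensitivity(0.8, 1.0, n)
        print(f"n = {n:2d}: a 1% volume increase at V = 0.8 V_I "
              f"raises r by about {s:.1f}%")
```

With $n = 1$, which corresponds to the simple dilution picture, the response never exceeds 1% per 1% volume change, whereas a response of about 20% requires $n$ of at least 20.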
Figure 6. The Growth Rate Variation Propagated to Other Variables and Compared to Experimental Observations
[Panel titles: Experiment; Model (variation in μ); Model (variation in μ, τ, V_I); Marginal Distributions (variation in μ); Marginal Distributions (variation in μ, τ, V_I). Axis labels: Volume versus Time (A); Cell Volume (µm³); Growth Rate (1/min); Volume at Birth (µm³); PDF.]
(A) The cell volume (black solid line) of a single cell lineage simulated according to the single-cell CH model throughout an upshift in growth conditions. The growth rates are sampled from the observed distributions. When the cell passes the initiation volume(s) (red lines), a division event is triggered (green arrow) after a time period corresponding to the C and D periods (gray bars). Variation can be introduced in growth rates, initiation volume, or C+D period.
(B) Top: The joint distribution of generation time and cell size at birth and division as observed in the DnaQ-YPet experiments. Bottom: The same data plotted as a function of growth rate.
(C) Model predictions of the distribution in (B) when cell-to-cell variation is introduced in growth rate only. SD and means are as estimated from experiments (Figure 2F). Here, SD/mean = 0.16, 0.23, and 0.23 were used for fast-, intermediate, and slow-growth rate, respectively.
(D) Same as (C), but cell-to-cell variation is also introduced into the initiation volume (SD/mean = 0.095) and C+D period (SD/mean = 0.05 for fast and intermediate and SD/mean = 0.1 for slow growth).
(E) Marginal distributions for generation time and volume at birth from (C) shown as solid lines. Corresponding experimental distributions from (B) added for comparison (dashed lines).
(F) Marginal distributions for generation time and volume at birth from (D) shown as solid lines. Corresponding experimental distributions from (B) added for comparison (dashed lines). The cells included in experimental data are the same as in Figure 2.
Cell-to-Cell Variation in Growth Rate

We have found that the cell-to-cell variation in generation time
and division size is mainly caused by variation in growth rate.
A remaining question is what determines the cell-to-cell variation
in growth rate. Why do not all cells grow as fast as the fastest cell
for a specific growth medium, when cells with this composition
would presumably outcompete their siblings? The answer is
likely related to a high fitness cost of the hypothetical control systems that would be required to tune the cell composition to the
maximal growth in each specific condition. Instead, it appears
that the cell uses a slightly sloppy control system that results
in important components getting into suboptimal balance. Since
the growth-rate correlation is rapidly lost over generations (Figure S6), it appears as if cell division itself causes the suboptimal
composition (Huh and Paulsson, 2011), but we do not know
which components are out of balance. Recently, it has been reported that ribosomes are more unevenly inherited between
sisters than expected by randomly putting each ribosome in
one of the daughter cells (Chai et al., 2014). Thus, we tested whether
the difference in ribosome inheritance impacts the growth rate
of the sisters but found only limited correlation (Figures S7A–S7D).
It appears the situation is more complicated.
A possible key is that different experimental conditions display
different ranges of cell-to-cell variation in growth rate. Our setup
typically gives cell-to-cell variation in the growth rate similar to
that observed in Campos et al. (2014) but larger than that
observed by Taheri-Araghi et al. (2015). Although occasional experiments in our setup, for example, with the ribosome-labeled
strain (Figures S7E–S7J), also displayed smaller variation in growth
rates, we have not been able to determine a common denominator that explains the differences.

Figure 7. The Added Volume over One Division Cycle
[Panel titles: (A) Overlapping replication and fixed C+D; (C) Non-overlapping replication and variable C+D; Experiment; Model. Axis labels: Size versus Time (A and C); Added Volume (µm³) versus Birth Volume (µm³); Division Volume (µm³) versus Initiation Volume (µm³); Added Area (µm²) versus Birth Area (µm²).]
(A) An example of how division control according to the model at intermediate growth appears as an adder. The division volumes for rapid- (gray solid line) and slow-growth (black solid line) siblings will be different since the initiation occurred in the mother generation and the C+D period is insensitive to growth rate.
(B) The joint distribution of volume added during one generation and the birth volume are shown for slow, intermediate, and fast growth as red, blue, and black solid lines. The cells included are the same as in Figure 2.
(C) An example of how division control according to the model at slow growth with non-overlapping replication cycles appears as a sizer. This effect is illustrated with two siblings following division. In the rapid-growth case (gray solid line), the C+D period, following initiation, is shorter than in the slow-growth case (black solid line), and thus they will divide at the same size.
(D) The division volumes and initiation volumes of the SeqA-Ypet strain are plotted for slow (red) and intermediate growth (blue). The black line represents the prediction from $V_D = V_I e^{\alpha}$, where $\alpha$ is based on fitting the C+D period in Figure 5B to $\tau = \alpha/\mu$ ($\alpha = 0.518$).
(E) Same as the experimental data for slow growth in (B), but here size (in area) is determined by using a strain expressing a fluorescent segmentation marker (inset). The number of cells included in the distribution is 849.
See also Figure S5.
Growth-Rate-Dependent C+D Period

Our version of the Cooper-Helmstetter model explains the
observation that cells add a growth-condition-dependent volume each generation independently of birth size at fast and intermediate growth (the adder mechanism) (Campos et al., 2014;
Taheri-Araghi et al., 2015). In essence, cells with overlapping
replication cycles will complete the replication-division program
that was started in a previous generation at a time that is independent of their birth sizes. Hence, the division volume will not
depend on the birth volume in these cases. The model also predicts a deviation from the adder model under conditions
that had not been experimentally
studied prior to our work. At slow growth, where the replication-division program
is started and completed in the same generation, the program
takes much longer for cells growing slowly than for those
growing fast. It appears as if cell division is limited by making a
certain number of specific components in order to divide and
that this number is only reached at a specific size.
Perspective
The model we present structures our understanding of the E. coli
cell cycle and clarifies the mechanisms underlying the cell-to-cell
variation with respect to cell division. It also suggests three directions in which further studies are needed: (1) what causes
the sensitivity to be high enough to trigger replication at the
experimentally observed volume; (2) what causes the cell-to-cell
variation in growth rate; and (3) what are the factors limiting replication and division at the slow-growth rate?
Answering these questions will guide us toward a molecular
understanding of the E. coli cell-cycle control system.
EXPERIMENTAL PROCEDURES
Strains
DnaQ localization was investigated using the MG1655 strain JJC5350 (a gift
from the lab of Benedict Michelle; Reyes-Lamothe et al., 2010), carrying a genetic fusion of the replication factor DnaQ and the yellow fluorescent protein
YPet, encoded in the native dnaQ locus. The construct was also transferred
to the MG1655 strain BW25993 (Datsenko and Wanner, 2000) using a P1
phage. We found the average growth rate in bulk experiments (Figures S1A
and S1B) and the growth-rate distribution and DnaQ localization (Figures S1I
and S1J) in microfluidics experiments to be very similar. Based on these similarities, we used the longest high-quality imaging data series regardless of
strain origin (JJC5350 in fast- and intermediate growth conditions and
BW25993 in slow-growth conditions) when comparing cell physiology to simulations (Figures 2, 6, and 7B). To increase the number of imaged cell cycles in
the fast- and intermediate growth conditions, we used both strains for DnaQ
localizations in Figures 3A–3B.
SeqA localization (Figures 4, 5D, and 7D) was investigated using an MG1655
strain DS116 carrying a SeqA-YPet fusion. The strains were a kind gift from the
Waldminghaus lab, who replaced the native seqA gene with the seqA-ypet
fusion, constructed in the Radman lab (Babic et al., 2008).
The location of origins was investigated using strain JE200 (Figures 3D–3F),
in which a genetic fusion of the transcription factor malI and the gene encoding
the yellow fluorescent protein variant Venus was introduced in the origin-proximal bglG locus in strain BW25993 using the lambda-red protocol (Datsenko
and Wanner, 2000). The construct contains two tandem operator sites, malO,
to which MalI-Venus binds tightly. Further, the native malI gene, as well as
the native malO sites, was deleted using the lambda-red method. This minimal
construct was selected to avoid the risks associated with having a large number
of operator-transcription factor complexes present in the cell (Lau et al., 2003).
Precision in the determination of division timing (Figure S2; see Validation
of the phase contrast division classifier in the Supplemental Experimental Procedures) was investigated in strain EC442 (Soderstrom et al., 2014), containing a genetic fusion of the division factor, ftsQ, and a green fluorescent protein,
gfp, introduced in MG1655.
Accuracy in determining individual growth rates (Figure S3D; see Comparing
growth rates in phase contrast and fluorescence in the Supplemental Experimental Procedures) and the control experiment for deviation from adder
behavior (Figure 7E) were investigated in strain JE201, in which a gene encoding
a red fluorescent protein, tRFP, regulated by the constitutive ribosomal RNA
promoter P2rrnB was introduced at the intC locus using the lambda-red method
in BW25993.
The dependence of the growth rate on ribosome content (Figure S7; see
Dependence of growth rate and ribosome content and partitioning at birth
in the Supplemental Experimental Procedures) was studied in strain JE202,
in which gene rpsB was genetically fused to a yellow fluorescent protein,
Venus, and gene rplI to a red fluorescent protein, mCherry, by using the
lambda-red protocol. In both cases, the constructs replaced the native genes.
The rpsB and rplI genes express the proteins S2 and L9 that associate with
the small and large subunit of the ribosome, respectively.
Growth Conditions
Growth conditions are as follows: fast: M9 minimal medium and 0.4% glucose
supplemented with RPMI 1640 amino acids (R7131, Sigma-Aldrich) at
37°C; intermediate: M9 minimal medium and 0.4% succinate supplemented
with RPMI 1640 amino acids (R7131, Sigma-Aldrich) at 30°C; and slow:
M9 minimal medium and 0.4% acetate at 37°C.
All strains were grown in M9 minimal media, except for the experiment that determines the accuracy in individual growth rates (Figure S3D), in which the strain
JE201 was grown in Luria-Bertani liquid medium (LB) at 37°C. Strain EC442 was
grown under fast conditions with 5 mM IPTG present in the medium to induce the
expression of FtsQ-GFP molecules. For the control experiment showing deviations from adder behavior (Figure 7E), strain JE201 was grown in slow-growth
conditions. Strain JE202 was studied under fast-growth conditions. All media
used in the microfluidic experiments contained a surfactant, Pluronic F108
(CAS 9003-11-6, Sigma-Aldrich), at a final concentration of 0.85 g L⁻¹.
Microfluidic Sample Management
The preparation and operation of the microfluidic devices used in experiments
with strains JJC5350, JE200, DS116, EC442, JE201, and JE202 were performed as described in Ullman et al. (2012). For experiments under slow- and
intermediate growth conditions, the trap depth used was 800 nm. For all
other microfluidic experiments, 900 nm was used.
Microscopy and Imaging Conditions

All microscopy experiments were performed using an inverted microscope (Nikon Ti-E) with 100× oil-immersion objectives (either an Apo TIRF 1.49 NA or a
100× Plan Apo λ 1.45 NA). For phase-contrast imaging, a CFW-1312M (Scion),
a DMK 23U274 (The Imaging Source), or an Infinity 2-5M (Lumenera) camera
was used. Fluorescence and bright-field images were recorded on Andor
Ixon EMCCD cameras, unless otherwise stated. The Andor cameras were
equipped with an additional 2× (Diagnostic Instruments DD20NLT) or 2.5×
lens (Nikon Instruments).
JJC5350, DS116, and JE200 Imaging: phase-contrast images were acquired with a 125-ms exposure at a rate of 2 min⁻¹. For fluorescence imaging,
a 514-nm laser (Coherent Genesis CX STM) was used. Laser exposure
time and power were 1 s and 3.0 W cm⁻² per frame for experiments using
JJC5350, 4 s and 0.75 W cm⁻² for JE200, and 1 s and 0.3 W cm⁻² for
DS116. Fluorescence and bright-field imaging frequency were 1/3 min⁻¹.
EC442 Imaging: phase-contrast images were acquired with a 125-ms exposure and at a rate of 2 min⁻¹. For fluorescence imaging, a 488-nm laser (Cobolt
MLD) was used. Laser exposure time was 100 ms. Fluorescence and bright-field imaging frequency were 1 min⁻¹.
JE201 Imaging: for Figure S3D, phase-contrast images were acquired with a
125-ms exposure at a rate of 1 min⁻¹. For fluorescence imaging, a 561-nm
laser (Genesis MX, Coherent) was used. Laser exposure time was 300 ms.
Fluorescence and bright-field imaging frequency were 1 min⁻¹. For Figure 7E,
phase-contrast images were acquired with a 500-ms exposure at a rate of
1/3 min⁻¹. For fluorescence imaging, a 561-nm laser (Genesis MX, Coherent)
was used. Laser exposure time and power were 200 ms and 10 W cm⁻². Fluorescence and bright-field imaging frequency were 1/3 min⁻¹. Here, a Zyla 4.2
PLUS sCMOS (Andor) was used.
JE202 Imaging: phase-contrast images were acquired with a 125-ms exposure at a rate of 6 min⁻¹. For fluorescence imaging, a 514-nm laser (Fandango
150, Cobolt) and a 580-nm laser (F-04306-03, MBP Com.) were used. Laser
exposure time and power were 30 ms and 18.5 W cm⁻² for 514 nm and
200 ms and 61.5 W cm⁻² for 580 nm. Fluorescence and bright-field imaging frequency
were 1/3 min⁻¹.
Rif Run-out in Microfluidics: a Spectra light engine (Lumencor) with a bandpass filter transmitting light between 352 and 402 nm was used for fluorescence imaging (see Microfluidic Replication Run-Out Experiments in the
Supplemental Experimental Procedures for details).
The microscope was controlled using µManager (Edelstein et al., 2014), and
automated acquisitions were performed using an in-house µManager plugin.
Unless otherwise stated, time-lapse acquisitions were performed in parallel at
multiple microfluidic trap regions, one of which was not exposed to laser. Exceptions are JE201 and JE202 imaging, in which all traps were laser exposed,
and only one trap was imaged in the JE201 case. The duration of the acquisition
varied from 2–24 hr, depending on the growth conditions. In all cases, cells
were grown in the microfluidic devices for at least 24 hr prior to imaging to
ensure steady-state exponential growth before the start of image acquisition.
The temperature of the microfluidic device was maintained using a cage incubator (either OKO lab or Haison) encapsulating the microscope stage.
Cell Segmentation, Tracking, and Detection of Single Molecules
A custom-written, fully automated analysis pipeline written in MATLAB
was used to analyze the time-lapsed microscopy data. Cells in each
Cell 166, 729739, July 28, 2016 737
phase-contrast image were segmented using the method described in Sadanandan et al. (2014). An active contour model based on Sliusarenko et al.
(2011) was developed, and a contour was computed for each segmented object. Cells were tracked between frames using the method described in Magnusson et al. (2015). The output was filtered so that each cell cycle must have a
parent and two children and that the maximum displacement of that cell between any consecutive frames was less than the cell width. Growth-condition-dependent filter criteria were added so that only cells with a cycle time
of 50, 30, and 15 min were retained for slow, intermediate, and fast growth,
respectively. To attain high-precision volume estimates, the estimated volumes from the active contours were fitted to V(t) = V_B exp(μt), where t is the
time from cell birth, V_B is the birth volume, and μ is the growth rate. Cell generations were filtered based on root-mean-squared errors of the fit being
smaller than 5% of the average cell volume. The fitted volumes at times of division and initiation were used in all figures, except for Figure 3, where a large
number of cell generations were required and it is sufficient to know the size of
the cell regardless of division parameters. The analysis of the precision in division timing is described in the Supplemental Experimental Procedures (see
Validation of the phase contrast division classifier) (Figure S2). The determination of length, areas, volumes, and widths was based on the contour model
as in Sliusarenko et al. (2011).
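The exponential fit and the 5% error filter described above can be illustrated with a minimal Python sketch. This is not the authors' MATLAB pipeline: the function names are hypothetical, the fit is done by ordinary least squares on ln V (a simple stand-in for whatever fitting routine was actually used), and the "average cell volume" is assumed to be the mean volume of the generation being fitted.

```python
import math

def fit_exponential_growth(times, volumes):
    """Fit V(t) = V_B * exp(mu * t) by least squares on ln V (assumed method)."""
    logs = [math.log(v) for v in volumes]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    mu = sxy / sxx                             # growth rate
    v_birth = math.exp(y_mean - mu * t_mean)   # fitted volume at t = 0
    return v_birth, mu

def fit_rmse(times, volumes, v_birth, mu):
    """Root-mean-squared error of the fit, in volume units."""
    residuals = [v - v_birth * math.exp(mu * t) for t, v in zip(times, volumes)]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def passes_quality_filter(times, volumes, max_relative_rmse=0.05):
    """Keep a generation only if RMSE < 5% of its mean volume (assumed reference)."""
    v_birth, mu = fit_exponential_growth(times, volumes)
    rmse = fit_rmse(times, volumes, v_birth, mu)
    mean_volume = sum(volumes) / len(volumes)
    return rmse < max_relative_rmse * mean_volume
```

Fitting in log space weights small and large volumes differently from a direct nonlinear fit, so the two approaches can give slightly different parameters on noisy data.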
Phase-contrast and fluorescent images of strain JE202 and the fluorescent
images of JE201 were segmented and tracked as described above, but not
fitted with active contours.
Replisomes and origins were detected in the raw fluorescence images using
the method described in Hammar et al. (2014). The coordinates were set in a
cellular reference frame using the contour model derived from the phase-contrast image taken in conjunction with the fluorescence image.
seven figures, and two movies and can be found with this article online at
M.W., J.E., and D.F. designed the experiments. D.F., J.E., and M.W. developed the model. J.E., D.F., and M.W. wrote the paper; M.W., D.F., E.G.L.,
and O.B. made the experiments; M.W. and D.F. made the analysis codes
and performed analysis; and D.F. built the setup and made the simulation
code and the simulations.
ACKNOWLEDGMENTS
We thank Gustaf Ullman, Alexis Boucharin, Erik Marklund, and Prune Leroy for
valuable support in programming, data analysis, and cloning; Sajith Kecheril
and Carolina Whalby for providing image analysis code; Bill Soderstrom for
providing EC442; Benedict Michelle for providing JJC5350; and Torsten Waldminghaus and Daniel Schindler for providing DS116. This work was supported
by the European Research Council, Vetenskapsradet, and the Knut and Alice
Wallenberg Foundation.
REFERENCES
Adiciptaningrum, A., Osella, M., Moolman, M.C., Cosentino Lagomarsino, M., and Tans, S.J. (2015). Stochasticity and homeostasis in the E. coli replication and division cycle. Sci. Rep. 5, 18261.
Amir, A. (2014). Cell size regulation in bacteria. Phys. Rev. Lett. 112, 208102.
Babic, A., Lindner, A.B., Vulic, M., Stewart, E.J., and Radman, M. (2008). Direct visualization of horizontal gene transfer. Science 319, 1533–1536.
Bates, D., and Kleckner, N. (2005). Chromosome and replisome dynamics in E. coli: loss of sister cohesion triggers global chromosome movement and mediates chromosome segregation. Cell 121, 899–911.
Campbell, J.L., and Kleckner, N. (1990). E. coli oriC and the dnaA gene promoter are sequestered from dam methyltransferase following the passage of the chromosomal replication fork. Cell 62, 967–979.
Campos, M., Surovtsev, I.V., Kato, S., Paintdakhi, A., Beltran, B., Ebmeier, S.E., and Jacobs-Wagner, C. (2014). A constant size extension drives bacterial cell size homeostasis. Cell 159, 1433–1446.
Chai, Q., Singh, B., Peisker, K., Metzendorf, N., Ge, X., Dasgupta, S., and Sanyal, S. (2014). Organization of ribosomes and nucleoids in Escherichia coli cells during growth and in quiescence. J. Biol. Chem. 289, 11342–11352.
Cooper, S., and Helmstetter, C.E. (1968). Chromosome replication and the division cycle of Escherichia coli B/r. J. Mol. Biol. 31, 519–540.
Datsenko, K.A., and Wanner, B.L. (2000). One-step inactivation of chromosomal genes in Escherichia coli K-12 using PCR products. Proc. Natl. Acad. Sci. USA 97, 6640–6645.
Donachie, W.D. (1968). Relationship between cell size and time of initiation of DNA replication. Nature 219, 1077–1079.
Donachie, W.D., and Blakely, G.W. (2003). Coupling the initiation of chromosome replication to cell size in Escherichia coli. Curr. Opin. Microbiol. 6, 146–150.
Edelstein, A.D., Tsuchida, M.A., Amodaj, N., Pinkard, H., Vale, R.D., and Stuurman, N. (2014). Advanced methods of microscope control using μManager software. J. Biol. Methods 1, 1.
Hammar, P., Wallden, M., Fange, D., Persson, F., Baltekin, O., Ullman, G., Leroy, P., and Elf, J. (2014). Direct measurement of transcription factor dissociation excludes a simple operator occupancy model for gene regulation. Nat. Genet. 46, 405–408.
Hansen, F.G., Christensen, B.B., and Atlung, T. (1991). The initiator titration model: computer simulation of chromosome and minichromosome control. Res. Microbiol. 142, 161–167.
Helmstetter, C.E., and Leonard, A.C. (1987). Coordinate initiation of chromosome and minichromosome replication in Escherichia coli. J. Bacteriol. 169, 3489–3494.
Hill, N.S., Kadoya, R., Chattoraj, D.K., and Levin, P.A. (2012). Cell size and the initiation of DNA replication in bacteria. PLoS Genet. 8, e1002549.
Huh, D., and Paulsson, J. (2011). Non-genetic heterogeneity from stochastic partitioning at cell division. Nat. Genet. 43, 95–100.
Kasho, K., and Katayama, T. (2013). DnaA binding locus datA promotes DnaA-ATP hydrolysis to enable cell cycle-coordinated replication initiation. Proc.
Koshland, D.E., Jr., Goldbeter, A., and Stock, J.B. (1982). Amplification and adaptation in regulatory and sensory systems. Science 217, 220–225.
Kurokawa, K., Nishida, S., Emoto, A., Sekimizu, K., and Katayama, T. (1999). Replication cycle-coordinated change of the adenine nucleotide-bound forms of DnaA protein in Escherichia coli. EMBO J. 18, 6642–6652.
Lau, I.F., Filipe, S.R., Søballe, B., Økstad, O.A., Barre, F.X., and Sherratt, D.J. (2003). Spatial and temporal organization of replicating Escherichia coli chromosomes. Mol. Microbiol. 49, 731–743.
Lu, M., Campbell, J.L., Boye, E., and Kleckner, N. (1994). SeqA: a negative modulator of replication initiation in E. coli. Cell 77, 413–426.
Magnusson, K.E., Jalden, J., Gilbert, P.M., and Blau, H.M. (2015). Global linking of cell tracks using the Viterbi algorithm. IEEE Trans. Med. Imaging 34, 911–929.
Michelsen, O., Teixeira de Mattos, M.J., Jensen, P.R., and Hansen, F.G. (2003). Precise determinations of C and D periods by flow cytometry in Escherichia coli K-12 and B/r. Microbiology 149, 1001–1010.
Osella, M., Nugent, E., and Cosentino Lagomarsino, M. (2014). Concerted control of Escherichia coli cell division. Proc. Natl. Acad. Sci. USA 111, 3431–3435.
Paulsson, J., and Ehrenberg, M. (2000). Molecular clocks reduce plasmid loss rates: the R1 case. J. Mol. Biol. 297, 179–192.
Paulsson, J., and Ehrenberg, M. (2001). Noise in a minimal regulatory network: plasmid copy number control. Q. Rev. Biophys. 34, 1–59.
Pritchard, R.H. (1968). Control of DNA synthesis in bacteria. In DNA Synthesis, I. Molineux and M. Kohiyama, eds. (Springer), pp. 1–26.
Reyes-Lamothe, R., Sherratt, D.J., and Leake, M.C. (2010). Stoichiometry and architecture of active DNA replication machinery in Escherichia coli. Science 328, 498–501.
Sadanandan, S., Baltekin, O., Magnusson, K., Baucharin, A., Ranefall, P., Jalden, J., Elf, J., and Wahlby, C. (2014). Segmentation and track-analysis in time-lapse imaging of bacteria. IEEE J. Sel. Topics Signal Process 10, 174–184.
Savageau, M.A. (1976). Biochemical Systems Analysis: A Study of Function and Design in Molecular Biology (Addison-Wesley).
Schaechter, M., Maaloe, O., and Kjeldgaard, N.O. (1958). Dependency on medium and temperature of cell size and chemical composition during balanced growth of Salmonella typhimurium. J. Gen. Microbiol. 19, 592–606.
Sekimizu, K., Bramhill, D., and Kornberg, A. (1987). ATP activates dnaA protein in initiating replication of plasmids bearing the origin of the E. coli chromosome. Cell 50, 259–265.
Skarstad, K., and Katayama, T. (2013). Regulating DNA replication in bacteria. Cold Spring Harb. Perspect. Biol. 5, a012922.
Skarstad, K., Boye, E., and Steen, H.B. (1986). Timing of initiation of chromosome replication in individual Escherichia coli cells. EMBO J. 5, 1711–1717.
Sliusarenko, O., Heinritz, J., Emonet, T., and Jacobs-Wagner, C. (2011). High-throughput, subpixel precision analysis of bacterial morphogenesis and intracellular spatio-temporal dynamics. Mol. Microbiol. 80, 612–627.
Soderstrom, B., Skoog, K., Blom, H., Weiss, D.S., von Heijne, G., and Daley, D.O. (2014). Disassembly of the divisome in Escherichia coli: evidence that FtsZ dissociates before compartmentalization. Mol. Microbiol. 92, 1–9.
Sompayrac, L., and Maaloe, O. (1973). Autorepressor model for control of DNA replication. Nat. New Biol. 241, 133–135.
Taheri-Araghi, S., Bradde, S., Sauls, J.T., Hill, N.S., Levin, P.A., Paulsson, J., Vergassola, M., and Jun, S. (2015). Cell-size control and homeostasis in bacteria. Curr. Biol. 25, 385–391.
Ullman, G., Wallden, M., Marklund, E.G., Mahmutovic, A., Razinkov, I., and Elf, J. (2012). High-throughput gene expression analysis at the level of single proteins using a microfluidic turbidostat and automated cell tracking. Philos. Trans. R. Soc. Lond. B Biol. Sci. 368, 20120025.
Voorn, W.J., and Koppes, L.J. (1998). Skew or third moment of bacterial generation times. Arch. Microbiol. 169, 43–51.
Waldminghaus, T., Weigel, C., and Skarstad, K. (2012). Replication fork movement and methylation govern SeqA binding to the Escherichia coli chromosome. Nucleic Acids Res. 40, 5465–5476.
Wang, P., Robert, L., Pelletier, J., Dang, W.L., Taddei, F., Wright, A., and Jun, S. (2010). Robust growth of Escherichia coli. Curr. Biol. 20, 1099–1103.
Wold, S., Skarstad, K., Steen, H.B., Stokke, T., and Boye, E. (1994). The initiation mass for DNA replication in Escherichia coli K-12 is dependent on growth rate. EMBO J. 13, 2097–2102.
Resource
Integrated Proteogenomic Characterization of
Human High-Grade Serous Ovarian Cancer
Graphical Abstract
Authors
Hui Zhang, Tao Liu, Zhen Zhang, ...,
Daniel W. Chan, Karin D. Rodland,
the CPTAC Investigators
Correspondence
dchan@jhmi.edu (D.W.C.),
karin.rodland@pnnl.gov (K.D.R.)
In Brief
Layering proteomic and genomic data
from ovarian tumors provides insights
into how signaling pathways correspond
to specific genome rearrangements and
points to the benefit of using protein
signatures for assessing prognosis and
treatment stratification.
Highlights
• A comprehensive proteomic characterization of 174 ovarian tumors is presented
• Copy-number alterations affect the proteome in trans, converging on pathways
• Acetylation of histone H4 correlates with homologous repair deficiency status
• Protein and phosphoprotein abundance identifies pathways associated with survival
Zhang et al., 2016, Cell 166, 755–765

Resource
Integrated Proteogenomic Characterization of
Human High-Grade Serous Ovarian Cancer
Hui Zhang,1,15 Tao Liu,2,15 Zhen Zhang,1,15 Samuel H. Payne,2,15 Bai Zhang,1 Jason E. McDermott,2 Jian-Ying Zhou,1
Vladislav A. Petyuk,2 Li Chen,1 Debjit Ray,2 Shisheng Sun,1 Feng Yang,2 Lijun Chen,1 Jing Wang,3 Punit Shah,1
Seong Won Cha,4 Paul Aiyetan,1 Sunghee Woo,4 Yuan Tian,1 Marina A. Gritsenko,2 Therese R. Clauss,2 Caitlin Choi,1
Matthew E. Monroe,2 Stefani Thomas,1 Song Nie,2 Chaochao Wu,2 Ronald J. Moore,2 Kun-Hsing Yu,5,6 David L. Tabb,3
David Fenyo,7 Vineet Bafna,8 Yue Wang,9 Henry Rodriguez,10 Emily S. Boja,10 Tara Hiltke,10 Robert C. Rivers,10
Lori Sokoll,1 Heng Zhu,1 Ie-Ming Shih,11 Leslie Cope,12 Akhilesh Pandey,13 Bing Zhang,3 Michael P. Snyder,6
Douglas A. Levine,14 Richard D. Smith,2 Daniel W. Chan,1,16,* Karin D. Rodland,2,16,* and the CPTAC Investigators
1Department of Pathology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, USA
2Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
3Department of Biomedical Informatics, Vanderbilt University School of Medicine, Nashville, TN 37203, USA
4Department of Electrical and Computer Engineering, University of California, San Diego, La Jolla, CA 92093, USA
5Biomedical Informatics Training Program, Stanford University School of Medicine, Stanford, CA 94305, USA
6Department of Genetics, Stanford University School of Medicine, Stanford, CA 94305, USA
7Center for Health Informatics and Bioinformatics and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, NY 10016, USA
8Department of Computer Science and Engineering, University of California, San Diego, La Jolla, CA 92093, USA
9Bradley Department of Electrical and Computer Engineering, Virginia Polytechnic Institute and State University, Arlington, VA 22203, USA
10Office of Cancer Clinical Proteomics Research, National Cancer Institute, Bethesda, MD 20892, USA
11Department of Gynecology and Obstetrics, Johns Hopkins Medical Institutions, Baltimore, MD 21231, USA
12Department of Oncology, Johns Hopkins Medical Institutions, Baltimore, MD 21231, USA
13McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins Medical Institutions, Baltimore, MD 21287, USA
14Department of Gynecologic Oncology, Laura and Isaac Perlmutter Cancer Centre, NYU Langone Medical Center, New York, NY 10016, USA
15Co-first author
16Co-senior author
*Correspondence: dchan@jhmi.edu (D.W.C.), karin.rodland@pnnl.gov (K.D.R.)
SUMMARY
To provide a detailed analysis of the molecular components and underlying mechanisms associated
with ovarian cancer, we performed a comprehensive
mass-spectrometry-based proteomic characterization of 174 ovarian tumors previously analyzed by
The Cancer Genome Atlas (TCGA), of which 169
were high-grade serous carcinomas (HGSCs). Integrating our proteomic measurements with the
genomic data yielded a number of insights into disease, such as how different copy-number alterations influence the proteome, the proteins associated
with chromosomal instability, the sets of signaling
pathways that diverse genome rearrangements
converge on, and the ones most associated with
short overall survival. Specific protein acetylations
associated with homologous recombination deficiency suggest a potential means for stratifying patients for therapy. In addition to providing a valuable
resource, these findings provide a view of how the somatic genome drives the cancer proteome and of associations between protein and post-translational
modification levels and clinical outcomes in HGSC.
INTRODUCTION
A comprehensive molecular view of cancer is necessary for understanding the underlying mechanisms of disease, improving
prognosis, and ultimately guiding treatment (Hanahan and Weinberg, 2011). The Cancer Genome Atlas (TCGA) conducted an
extensive genomic and transcriptomic characterization of
ovarian high-grade serous carcinoma (HGSC) aimed at defining
the genomic landscape and aiding the development of targeted
therapies for this highly lethal malignancy (Cancer Genome Atlas
Research Network, 2011). Key findings from TCGA were: (1) the
prevalent role of TP53 mutations, (2) extensive DNA copy alterations, (3) preliminary transcriptional signatures associated
with survival, (4) varied mechanisms of BRCA1/2 inactivation,
and lastly, (5) CCNE1 aberrations. Subsequent analysis of
genomic data from the TCGA consortium led to the refinement
of the transcript-defined signatures, improving the statistical
association with patient outcome (Yang et al., 2013), and integrating microRNA and mRNA expression profiles associated
with HGSC to identify candidate microRNA targets (Creighton
et al., 2012).
While the insights from genomic analyses are substantial,
functions encoded in the genome are generally executed at the
protein level and are often further modulated by post-translational modifications (PTMs) (Vogel and Marcotte, 2012). For
example, TCGA used a reverse-phase protein array (RPPA)
analysis of 172 proteins (including 31 phosphoproteins with
phospho-specific antibodies) to generate a signature associated
with the risk of tumor recurrence (Yang et al., 2013). To obtain a
more comprehensive assessment of the complex ovarian HGSC
phenotype, the Clinical Proteomic Tumor Analysis Consortium
(CPTAC) conducted an extensive mass-spectrometry (MS)-based proteomic and phosphoproteomic characterization of
HGSC tumors characterized by TCGA, providing quantitative
measurements for a combined total of 9,600 proteins from 174
tumors (an average of 7,952 proteins per tumor) and a total of
24,429 phosphosites from 6,769 phosphoproteins in a subset
of 69 tumors (an average of 7,677 phosphosites per tumor).
Our results provide insights on HGSC biology and correlate differences in protein and PTM levels with a clinical outcome complementary to that of TCGA genomic analyses.
RESULTS
Proteomic Analysis of TCGA HGSC Samples
HGSC biospecimens and clinical data from 174 patients
collected by TCGA were analyzed at two independent CPTAC
centers, Johns Hopkins University (JHU) and Pacific Northwest
National Laboratory (PNNL); 32 samples were analyzed at both
JHU and PNNL. Tumors were selected using the associated TCGA metadata either (1) on the basis of
putative homologous recombination deficiency (HRD), defined
by the presence of germline or somatic BRCA1 or BRCA2 mutations, BRCA1 promoter methylation, or homozygous deletion of
PTEN (Woodbine et al., 2014) (122 samples; 67 classified as
HRD, 55 as non-HRD by the aforementioned criteria; JHU); or
(2) to maximize differences in overall survival (84 samples;
PNNL). For selection purposes, short survival was defined as
overall survival of fewer than 3 years, and long survival was
defined as greater than 5 years. All but five tumors had somatic
TP53 mutations, a characteristic feature of HGSC (Cancer
Genome Atlas Research Network, 2011); these five tumors
were subsequently reclassified as other than HGSC (Vang
et al., 2016) and removed from protein functional analyses
(e.g., subtyping and survival analyses). The tumor selection
criteria and the associated metadata are provided in Table S1.
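The two selection rules above can be written as a short Python sketch. The field names (`brca1_mutation`, `pten_homozygous_deletion`, and so on) and the "intermediate" label are hypothetical and are not taken from the TCGA metadata schema; censoring of survival times is ignored here for simplicity.

```python
def is_putative_hrd(tumor):
    """Putative HRD: BRCA1/2 mutation (germline or somatic), BRCA1 promoter
    methylation, or homozygous PTEN deletion. Keys are illustrative only."""
    return bool(
        tumor.get("brca1_mutation", False)
        or tumor.get("brca2_mutation", False)
        or tumor.get("brca1_promoter_methylation", False)
        or tumor.get("pten_homozygous_deletion", False)
    )

def survival_group(overall_survival_years):
    """Short: fewer than 3 years; long: greater than 5 years.
    The 'intermediate' label for everything else is an assumption."""
    if overall_survival_years < 3:
        return "short"
    if overall_survival_years > 5:
        return "long"
    return "intermediate"
```

A real selection would also need to handle patients who were still alive at last follow-up, which this sketch does not attempt.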
Proteomics measurements used isobaric tags for relative and
absolute quantitation (iTRAQ; Ross et al., 2004) in conjunction
with offline liquid chromatography fractionation via high-pH
reverse-phase liquid chromatography (RPLC) and online RPLC
with high-resolution tandem MS to provide broad coverage for
peptide and protein identification and quantification (Supplemental Experimental Procedures); this also alleviated quantitative interference potentially associated with the use of isobaric
tags. We used the relative abundance measurements for each
protein in the 32 patient samples analyzed at both JHU and
PNNL to normalize across the two analysis sites and then used
clustering, principal-component analysis (PCA) and statistical
tests to identify any significant batch effects associated with
the site of analysis (a detailed comparison of within-site, between-site, and between-sample measurement variability and
the process used to merge the JHU and PNNL data are given
in Figure S1). As shown in Figure S1C, the median coefficient of
variation (CV) between measurements at the two sites was 16%.
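As an illustration of how a between-site median CV of this kind could be computed, the following Python sketch takes paired relative abundances for the samples measured at both sites. The data layout (dictionaries keyed by protein and sample) and the use of linear-scale abundances with the sample standard deviation are assumptions; the procedure actually used is described in Figure S1 and the Supplemental Experimental Procedures.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)

def median_between_site_cv(site_a, site_b):
    """site_a, site_b: dicts mapping (protein, sample) -> relative abundance
    on a linear scale (hypothetical layout). Only keys measured at both
    sites contribute."""
    shared = site_a.keys() & site_b.keys()
    cvs = [coefficient_of_variation([site_a[k], site_b[k]]) for k in shared]
    return statistics.median(cvs)
```

With only two measurements per pair, each CV is a noisy estimate, which is one reason to summarize by the median across all pairs.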
A total of 9,600 proteins were identified with high confidence in
all tumors, and the relative abundances in each tumor are given
in Table S2. Functional analyses and proteome-transcriptome
associations were restricted to 3,586 proteins observed and
quantified in all 169 HGSC samples used for protein functional
analyses and where sample variability (signal) exceeded technical variability (noise) in the merged data (Table S2), calculated
as described in the Supplemental Experimental Procedures. On
average, we identified peptides covering 29% of the amino acids in
any of these 3,586 proteins in a given sample, with a range from
10% to 47%. In addition to protein abundance levels, phosphoproteomics data were acquired for 69 tumors with sufficient
sample material (Table S2). As with proteins, phosphopeptide abundances were calculated relative to the pooled reference sample.
Because isobaric labeling was performed prior to splitting the
samples for proteome and phosphoproteome analyses, phosphopeptide abundance could be corrected for changes in parent
protein abundance to identify differences in the relative extent of
phosphorylation at specific sites for each protein (Table S2).
Overall, we achieved a protein dynamic range encompassing
more than four orders of magnitude, ranging from low-level transcription factors to abundant structural proteins, e.g., actin and
tubulin.
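The correction of phosphopeptide abundance for parent protein abundance can be sketched as a subtraction of log2 ratios, since both are expressed relative to the same pooled reference. The Python below is a minimal illustration under that assumption; the function name, the dictionary layout, and the site identifiers are hypothetical, and the actual procedure is the one given in the Supplemental Experimental Procedures.

```python
def correct_for_protein_abundance(phosphosite_log2, protein_log2, site_to_protein):
    """Return log2 phosphosite ratios with the parent protein's log2 ratio
    subtracted. Sites whose parent protein was not quantified are skipped."""
    corrected = {}
    for site, value in phosphosite_log2.items():
        parent = site_to_protein[site]
        if parent in protein_log2:
            corrected[site] = value - protein_log2[parent]
    return corrected

# Illustrative values only: a site that is up 1.5 log2 units on a protein
# that is itself up 1.0 log2 unit reflects a 0.5 log2 unit change in the
# relative extent of phosphorylation.
example = correct_for_protein_abundance(
    {"PROT_A:S10": 1.5, "PROT_B:T5": -0.25},
    {"PROT_A": 1.0},
    {"PROT_A:S10": "PROT_A", "PROT_B:T5": "PROT_B"},
)
```

Subtracting in log space is equivalent to dividing the phosphopeptide ratio by the protein ratio on the linear scale.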
Proteogenomic Landscape of HGSCs
The degree to which alterations observed at the genome and
transcriptome levels are manifested at the protein level is variable, both qualitatively and quantitatively, and partially driven
by multiple levels of post-transcriptional regulation (Zhang
et al., 2014; Kislinger et al., 2006; Jovanovic et al., 2015; Mertins
et al., 2016). To identify peptides encoded by single amino-acid
variants (SAAVs), splice variants, and novel exons documented
by TCGA, mass spectra were searched against a custom graph
database (Woo et al., 2014a) containing all peptide variations
projected from the TCGA genomic analyses of the cohort using
a multi-stage analysis pipeline (Supplemental Experimental Procedures). The most frequently observed variant peptides represented SAAVs and gene-level events. The evidence supporting
each novel event is detailed in Table S2, including event type,
genomic location, abundance information, and estimated false
discovery rate (FDR). The validity of these variant peptides was
further evaluated by the synthesis of 20 examples selected at
random across the entire range of spectrum match scores. We
obtained tandem mass spectra for all 20 synthetic peptides
that matched spectra from our analyses, strongly supporting
our observation of these variant peptides; three representative
examples are given in Figure S1E. These results demonstrate
the ability to confidently detect, identify, and validate genome-level predictions at the protein level. More novel peptides would
likely be observed if there were sufficient samples for more
extensive fractionation and/or replicate analyses (Ruggles
et al., 2016); these limitations preclude any conclusions about
the biological significance of unobserved events. For example,
only two mutant p53 peptides were identified, despite the presence of p53 mutations in all tumors examined. Such low
coverage can occur for reasons that include: excessively large
or small tryptic fragments, inability to distinguish some amino
acids, some possible biases against highly hydrophobic and
hydrophilic peptides, or low-abundance peptides co-eluting with
very high-abundance peptides.
Figure 1. Correlations between mRNA and Protein Abundance in TCGA Tumors
[Top panel: probability density of Spearman's correlation; median = 0.45; 90.8% positive correlation; 79.4% significant positive correlation (adjusted p value < 0.01). Bottom panel, pathway labels (mean correlation; adjusted p value): Ribosome (0.15; 2.4e-10); Oxidative phosphorylation (0.24; 4.4e-08); mRNA splicing (0.28; 1.3e-06); Complement and coagulation (0.20; 4.3e-05); Acute inflammatory response (0.02; 8.1e-03); Interferon-responsive genes (0.67; 3.4e-07); Nucleotide metabolism (0.51; 1.7e-04); Amino acid metabolism (0.53; 5.7e-03).]
To assess the potential for post-transcriptional regulation
(e.g., translational efficiency or protein degradation), we compared each protein to its corresponding transcript across all tumors, and correlation was assessed for those pairs with reliable
measurements for both mRNA and protein, i.e., proteins with a
corresponding mRNA measurement observed in all 169 HGSC
tumors where sample variability exceeded technical variability
(3,196 pairs). We excluded 391 proteins observed without a
corresponding mRNA (e.g., HBA1) or discordant gene symbol
annotation in the protein database (e.g., THOC4). Overall,
79.4% of the mRNA-protein pairs showed statistically significant
positive Spearman correlations (Benjamini-Hochberg adjusted
p value < 0.01; Figure 1) when changes in mRNA abundance
were compared to changes in relative protein abundance. The
observed median r value of 0.45 for each mRNA-protein pair
across all 169 tumors is similar to those for observations in colorectal cancer (Zhang et al., 2014), breast cancer (Mertins et al.,
2016), and mouse tissues (Kislinger et al., 2006), although less
than those found for cell lines (Wu et al., 2013). In comparison,
the median correlation of paired protein measurements from
the same sample, but measured at two sites, was substantially
higher, at 0.69 (Figure S1).
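The per-gene comparison described above combines a rank correlation with a multiple-testing adjustment. The following Python sketch shows one way to compute Spearman's correlation (with average ranks for ties) and Benjamini-Hochberg adjusted p values using only the standard library. The function names are hypothetical, the p values are assumed to come from a separate significance test not shown here, and nothing in this sketch is taken from the authors' analysis code.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's correlation is the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def benjamini_hochberg(p_values):
    """Adjusted p values in input order: p * m / rank, made monotone from
    the largest p value downward and capped at 1."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In use, `spearman` would be applied to each mRNA-protein pair across tumors, and the resulting p values passed together to `benjamini_hochberg` before applying the 0.01 threshold.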
A wide range of mRNA-protein correlations was observed. In
general, weaker correlations were observed for highly stable
and abundant proteins associated with housekeeping or nonintrinsic functions (e.g., ribosomes, mRNA splicing, oxidative
phosphorylation, complement cascade), while more dynamic
proteins known to be transcriptionally regulated in response to
nutrient demand or other perturbations (e.g., nucleotide metabolism, amino acid metabolism, acute inflammatory responses)
were more highly correlated (Figure 1; Table S3). This result is
consistent with a previous colorectal cancer study (Zhang
et al., 2014) and supports the hypothesis that, while many biological functions are primarily regulated by mRNA abundance, post-transcriptional mechanisms likely have an important role in regulating certain housekeeping functions (Jovanovic et al., 2015;
Komili and Silver, 2008; Lu et al., 2007; Marguerat et al., 2012).
Additionally, we found that 23 mRNAs lacking poly(A) tails displayed lower correlation (mean 0.15) with their cognate proteins
than did polyadenylated mRNAs, consistent with the decreased
stability of mRNAs lacking poly(A) tails (Yang et al., 2011).
Figure 1 legend. mRNA and protein were correlated across all the
samples, resulting in 90.6% positive correlations,
and 79.4% were significantly correlated (Benjamini-Hochberg adjusted p value < 0.01), with a
mean Spearman's correlation of 0.38 and a median value of 0.45 (top panel). Different biological
pathways and processes showed significantly
different levels of correlation (bottom panel).
Metabolic pathways and the interferon response
displayed high mRNA-protein correlation; and
ribosome, mRNA splicing, oxidative phosphorylation, and complement and coagulation cascade
were poorly correlated. The mean correlation is
shown in parentheses, followed by Benjamini-Hochberg adjusted p values calculated using a
Kolmogorov-Smirnov test following the functional
group names from MSigDB (the Molecular Signature Database). Blue bars indicate positive correlations, and yellow indicates negative correlations;
individual proteins (represented as bars on the
x axis) are sorted by correlation from low to high
(bottom panel).
Clustering of Tumors Based on Protein Abundance
HGSC is the most common of the four major histological subtypes of epithelial ovarian cancer and is characterized by distinct
morphological features. Recent studies using mRNA abundance
data have suggested four transcriptomic HGSC subtypes designated as differentiated, immunoreactive, mesenchymal, and
proliferative (Yang et al., 2013). To build an unbiased molecular
taxonomy of ovarian HGSC, we used protein abundance data
from the 169 tumors to identify subtypes that might show biological differences that could be exploited in future studies. Figure 2
shows the resulting clustering analysis of individual tumors (vertical columns) by protein abundance (horizontal rows). The cluster assignment for each sample is provided in Table S4, and a
consensus value matrix for the subtype comparisons is shown
in Figure S2A. The results of a weighted gene co-expression
network analysis (WGCNA) (Langfelder and Horvath, 2008) of
protein functional enrichment by subgroups are provided for
the protein clusters in Figure S2B and Table S4. The enrichment
of KEGG (Kyoto Encyclopedia of Genes and Genomes) and Reactome ontologies in the WGCNA-derived modules are shown in
Figures S2C and S2D, respectively; membership of enriched
pathways is provided in Table S3.
Figure 2. Proteomic Subtypes and Corresponding Driving Protein Modules
Global proteomics abundance is shown in a heatmap, with TCGA samples represented by columns ordered by protein subtype and proteins represented by rows.
Color of each cell indicates Z score (log2 of relative abundance scaled by the protein's SD) of the protein in that sample; red is increased, and blue is decreased
(relative to the pooled reference). Transcriptome-based subtypes (Verhaak et al., 2013) and the proposed proteomic subtypes are indicated in color above the
heatmap. WGCNA-derived modules are delineated with the row color panel and annotated according to the pathways based on enrichment of KEGG and
Reactome ontologies.
[Labels recovered from the figure: Protein; mRNA; differentiated; immunoreactive; proliferative; mesenchymal; stromal; metabolic; n.a.; DNA replication; cell-cell communications; cytokine signaling; erythrocyte and platelet; ECM interaction; complement cascade; metabolism; Z-score.]
Four of the proteomic clusters showed a clear correspondence to the mesenchymal, proliferative, immunoreactive, and
differentiated subtypes defined by the TCGA transcriptome analysis (Figure 2; Figure S2E). A relatively small fifth cluster of
tumors (significantly enriched in proteins associated with extracellular matrix interactions, erythrocyte and platelet functions,
and the complement cascade) was also observed using
multiple approaches, including model-based clustering with
Bayesian information criteria, consensus clustering, and Visual
Statistical Data Analyzer (VISDA)-based sub-phenotype clustering. This new group could not be attributed to tissue source
site sampling bias or any other metadata category but may be
related to tumor characteristics, including vascularization and
tumor content, as the tumor purity score for this subtype (and
the mesenchymal subtype) was significantly lower than that of
the other three subtypes (Figure S2F). The clinical relevance of
these protein-based clusters will require validation in independent HGSC sample sets, particularly as no significant difference
in survival was observed (Figure S2G), similar to mRNA-based
clustering analysis (Cancer Genome Atlas Research Network,
2011).
Proteomic Analysis of CNA Effects
Chromosomal instability, marked by extensive copy-number alterations (CNAs) in each tumor, is a hallmark of HGSC and a likely
source of driver alterations in this disease (Cope et al., 2013; Kuo
et al., 2009; Kobel et al., 2008). CNAs can affect the abundance
of proteins at the same locus (cis effects) and may also act in
trans either directly or indirectly.
Hypothesizing that CNAs with strong trans effects are more
likely to elicit a molecular phenotype and confer selective advantages, we sought to identify those CNA regions that have the
broadest effect on protein expression. In all, 29,393 CNA loci
(Table S5) were compared to our global proteomics data, with
950,209 CNA/protein pairs (0.72% of the total) exhibiting significant association (Benjamini-Hochberg adjusted p value < 0.01).
We provide a complete list of the significantly associated proteins for each CNA in Table S5. The diagonal line evident in Figure 3 corresponds to cis effects, and vertical stripes correspond
to trans effects, in which changes in copy number affect expression of numerous proteins across the genome. We performed
the same analysis for mRNA to identify sites where associations
are transcriptionally mediated. A similar number of CNA/mRNA
pairs were found to be significantly associated (1,113,164 at a
Benjamini-Hochberg adjusted p value < 0.01; Figure 3). This
analysis revealed regions on chromosomes 2, 7, 20, and 22
correlated in trans with more than 200 proteins. In contrast to
colorectal cancer, where most of the trans-regulation of protein
by CNA was accompanied by similar changes in mRNA (Zhang
et al., 2014), we observed several loci associated with differences in protein abundance without a corresponding change in
mRNA. For example, large regions on chromosome 2 have relatively little trans effect on mRNA levels but are associated with
more than 200 proteins in trans. Dissecting the mechanisms by
Figure 3. Functional Impact of Copy-Number Alterations
The top panel shows the correlation of CNAs to
protein abundance (right) or mRNA (left), with significant positive correlations in red and negative
correlations in blue (Benjamini-Hochberg adjusted
p value < 0.01, Spearman's correlation coefficient). The x axis plots the 29,393 CNAs obtained
from TCGA. The y axis plots 3,202 proteins. Genes
are ordered by chromosomal location on both the
x and y axes. The bottom panel shows the summed number of significantly correlated proteins (or
mRNA) for each individual CNA. In blue is shown
the total number, and in black are those genes
significantly correlated as both mRNA and protein.
Where blue and black lines have similar magnitude, e.g., the hotspot on chromosome 20, the
CNA associations are shared between protein and
mRNA. Where a strong blue line has no mirrored
black line, e.g., the protein hotspot on chromosome 2, the associations with CNA are largely
unique.
[Panels: CNA-mRNA correlation and CNA-protein correlation; x axis, CNA location; y axis, gene location; bottom panels, number of significant correlations in trans.]
which a specific CNA can alter protein levels in trans without
affecting the corresponding mRNA is difficult, given the extent
of the amplified or deleted regions and the numerous genes
affected at a given locus. Possible mechanisms include cis-regulation of proteins associated with mRNA stability and translational efficiency, such as microRNAs and RNA-binding proteins.
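The association screen described above pairs a Spearman correlation for each CNA/protein (or CNA/mRNA) pair with Benjamini-Hochberg adjustment of the resulting p values. The following is a minimal sketch of those two steps, not the authors' pipeline; the function names are illustrative, and the p value for each correlation is assumed to come from a separate statistics routine.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

In this sketch, a CNA/protein pair would be called significant when its adjusted p value falls below the 0.01 threshold used in the text.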
Given the complex pattern of CNAs observed in HGSC, it has
been difficult to identify a limited number of high-impact genomic
alterations that could function as drivers of the disease. We interrogated the trans-affected proteins associated with each putative CNA for common functions, using pathways defined by
the KEGG, National Cancer Institute Pathway Interaction Database (NCI PID), and Reactome databases (Supplemental Experimental Procedures). Proteins associated with cell invasion and
migration and proteins related to immune function appeared to
be enriched in association with multiple CNAs (Figure S3; Table
S3). These observations suggest a convergence of multiple CNA
targets on a common set of biological functions; namely,
motility/invasion and immune regulation, functions that are
among the hallmarks of cancer (Hanahan and Weinberg, 2011).
Association of CNA trans-Affected Proteins with Overall
Survival
Availability of the TCGA survival data allowed us to use trans-affected protein data from the most influential CNAs (e.g., the
four altered regions described on chromosomes 2, 7, 20, and
22; Figure 3) to build a model of overall survival. Because each
CNA affects many proteins, we used a regression approach
that identifies parsimonious Cox proportional hazards models
with maximal predictive ability from the list of significantly correlated proteins for that CNA. We trained models on the proteomics data from PNNL (82 tumors) and tested the ability of the
model to predict survival times in the data from JHU (87 tumors,
not including the 32 overlapping). Each of the four most influential
CNA regions produced models that were strongly associated
with patient survival (p value < 0.01, FDR < 0.5%, based on
randomly selected proteins). A Kaplan-Meier plot illustrating
the predictive value of each of the locus-specific models after
validation is shown in Figure S4A; for the Kaplan-Meier plots,
high and low expression were defined relative to the median
for that signature across all samples, splitting patients into two
equal groups of 45 (Table S6). We examined the overlap in predictions for patients for the four signatures and found that the
predictions were unanimous (either high or low signature) for
62% of patients and 16% were evenly split, with the remaining
having one dissenting signature. This suggested the utility of
combining these signatures, using a voting method where the
number of high or low calls was counted for each patient; this
substantially improved prediction of survival time (p value =
1.9e-6; Figure 4). We also examined the effects of tumor stage,
tumor grade, patient age, surgical outcome, and platinum status
and found that our proteomic signatures were not improved by
the inclusion of these variables.
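The voting scheme described above can be sketched as follows: each locus-specific signature calls a patient high or low relative to the cohort median, and the number of high calls is counted per patient. This is an illustration under stated assumptions rather than the authors' code; the patient identifiers and scores are made up, and calling a score equal to the median "low" is an arbitrary choice.

```python
from statistics import median


def median_split(scores):
    """Call each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cut = median(scores.values())
    return {pid: ("high" if s > cut else "low") for pid, s in scores.items()}


def vote(signature_scores):
    """Count 'high' calls per patient across all signatures.

    signature_scores maps a signature name to {patient id: score}.
    """
    calls = [median_split(scores) for scores in signature_scores.values()]
    patients = calls[0].keys()
    return {pid: sum(c[pid] == "high" for c in calls) for pid in patients}


def agreement(high_votes, n_signatures):
    """Classify a patient's vote as unanimous, evenly split, or a majority."""
    if high_votes in (0, n_signatures):
        return "unanimous"
    if 2 * high_votes == n_signatures:
        return "split"
    return "majority"


# Hypothetical scores for four patients under four locus-specific signatures.
toy = {
    "chr2": {"p1": 0.9, "p2": 0.1, "p3": 0.8, "p4": 0.2},
    "chr7": {"p1": 0.7, "p2": 0.3, "p3": 0.2, "p4": 0.6},
    "chr20": {"p1": 0.8, "p2": 0.2, "p3": 0.9, "p4": 0.1},
    "chr22": {"p1": 0.6, "p2": 0.4, "p3": 0.1, "p4": 0.9},
}
votes = vote(toy)
```

With four signatures, a "majority" result corresponds to the case of one dissenting signature mentioned in the text.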
For each of these locus-specific models, we analyzed the
enrichment of genes and their regulatory sequences associated
with outcome and found that all four models had genes significantly enriched (Benjamini-Hochberg adjusted p value < 0.05)
in binding sites for the proliferation-associated serum response
factor (SRF), suggesting that SRF activity may be important in
ovarian cancer outcome. Additionally, there was significant
enrichment of proteins involved in the regulation of actin cytoskeleton, apoptosis, and adherens junction (Benjamini-Hochberg adjusted p value < 0.05). We examined the protein
abundance and phosphorylation status of SRF between short- and long-surviving groups and observed that they were higher
in short-surviving patients, but only slightly. Thus, although
SRF alone was not predictive, the trends in SRF abundance
and phosphorylation are consistent with the observation of
enrichment for proteins potentially regulated by SRF binding.
Details of the larger gene set analysis are provided in the Supplemental Experimental Procedures.
Figure 4. Kaplan-Meier Plot of Overall Survival Stratified by CNA-Derived Signatures
A survival plot is shown for a consensus of the four best signatures (see Figure S4A and Table S6) identified from analysis of proteins affected in trans by
CNAs. Models were trained using a lasso-based Cox proportional hazards
model on the samples from PNNL, and shown are the survival curves from
these models applied to the non-overlapping data from the JHU analysis, with
the up (red; above the median) and down (blue; below the median) signatures
each representing 45 patients. A vote was taken among the four most predictive signatures, and the results of this vote are shown. Probability of death is
shown on the y axis versus survival time in years on the x axis. Shaded ribbons
denote 95% confidence intervals.
See also Figure S4A and Table S6.
[Plot: Kaplan-Meier curves; x axis, time (years); y axis, overall survival; groups, signature relative level up versus down; p value = 2e-6.]
Several proteins shared across all the signatures are known to
be involved in cancer processes. Catenin alpha-2 (CTNNA2) is a cell-cell adhesion protein and tumor suppressor (Fanjul-Fernandez
et al., 2013). The Rho guanosine diphosphate (GDP) dissociation
inhibitor (GDI) beta (ARHGDIB) is involved in invasion and
migration in many cancers, and overexpression correlates with
progression in pancreatic carcinoma (Yi et al., 2015). Protein
kinase C and casein kinase substrate in neurons protein 2
(PACSIN2) is a repressor of cellular migration (Meng et al.,
2011). Finally, the guanosine-triphosphate (GTP)-binding nuclear protein RAN is a prognostic marker associated with
increased survival in epithelial ovarian cancer (Barrès et al.,
2010). The association of these previously described survival-related proteins with genes affected in trans by CNAs suggests
a potential mechanism for the parallel activation of multiple pathways associated with poor prognosis in HGSC.
As a comparison, we applied the previously described Provar
signature (Yang et al., 2013), comprising five proteins and four
phosphoproteins that showed good survival prediction in the
TCGA ovarian cancer dataset. We observed all proteins in the
Provar signature in our proteomic data, but only one of the phosphosites (epidermal growth factor receptor [EGFR] Y1173). Thus,
we used the RPPA data from the original signature (Figure S4B).
We found that the Provar signature was prognostic of survival
(Benjamini-Hochberg adjusted p value = 0.11) in the 67 patients
examined here (those with phosphoprotein data had somatic
TP53 mutations), but the statistical power of Provar is not as
high as the statistical power of signatures derived from the
CNA trans-affected proteins in this dataset. In addition, integrating the Provar signature with the CNA signatures did not
improve survival prediction (Figure S4C).
Figure 5. Number of CNAs Statistically Explained by Proteins Significantly Associated with CIN Index
A total of 128 proteins selected for their association with CIN were arranged
along the chromosomal locations of their corresponding genes (y axis). The
length of the light gray horizontal lines indicates the number of CNAs significantly correlated with the protein across all chromosomes except for the
protein's corresponding gene coding region (x axis). The top-ranked proteins
are annotated and highlighted with dark lines to show the bootstrap-estimated
95% confidence intervals.
Identification of Proteins Associated with Chromosomal
Structural Abnormality
The degree of chromosomal instability exhibited by a tumor can
be represented by a calculated chromosome instability (CIN) index, as described previously for lung cancer (Cancer Genome
Atlas Research Network, 2012). Identification of proteins associated with CIN may provide information on the processes
contributing to chromosomal instability, while the analysis of
trans-affected proteins described earlier is more closely related
to the downstream consequences of specific CNAs. Using a
bootstrapping method described in Supplemental Experimental
Procedures, we identified a proteomic signature, including 128
proteins (Table S7) showing significant correlation with the CIN
index (Benjamini-Hochberg adjusted p value < 1e-4, Spearman
correlation; Figure S5A). Functional annotation of these proteins
(Figure S5B) showed that proteins upregulated in tumors with a
high CIN index were preferentially involved in chromatin organization (p value = 6.90e-5), whereas proteins upregulated in tumors with a low CIN index were more often associated with cell
death (p value = 2.13e-5). Correlation analysis of protein abundances and CIN indicated that a small number of proteins could
account for the majority of the CIN (Figure 5); two of the most
strongly associated proteins, CHD4 and CHD5, are known to
be involved in chromatin organization (Liu and Matulonis, 2014).
Identification of Proteins Associated with HRD Status
Since HRD is associated with susceptibility to poly(ADP-ribose)
polymerase (PARP) inhibitors and improved survival (Farmer
et al., 2005; Liu and Matulonis, 2014), we sought to elucidate
systemic changes associated with HRD and identify biomarkers
that might be used to stratify patients for treatment. We defined
tumors with HRD as having either germline or somatic mutations
in BRCA1 or BRCA2, or BRCA1 promoter methylation, or homozygous deletion of PTEN (McEllin et al., 2010), with an overall
survival of >1.5 years, while non-HRD patients were defined as
lacking these genomic aberrations, with a follow-up or time to
death of <2.5 years; additional selection criteria include available
residual tissue volume and a tumor tissue contamination score
estimated using CNAs (Yu et al., 2011). Applying differential dependency network (DDN) analysis (Zhang et al., 2009) on a set of
171 BRCA1- or BRCA2-related proteins curated from the literature and from the cBio portal, we identified a sub-network of 30
proteins that displayed co-expression patterns differentiating
HRD from non-HRD patients (Figure 6; Table S7). Several of
the proteins in these modules are known to be involved in histone
acetylation or deacetylation, e.g., HDAC1, RBBP4, RBBP7,
EP300 (Mielcarek et al., 2015), and HUS1 (Cai et al., 2000).
Although statistical association cannot distinguish between
drivers and consequences of HRD status, the observed enrichment of proteins associated with histone acetylation motivated
us to use an effective database search strategy (Supplemental
Experimental Procedures) to identify and quantify acetylated
peptides from the global proteomic data. Comparative analysis
of 399 acetylated peptides identified 15 acetylated peptides
with significant differences between the HRD and non-HRD
samples (Table S7). Among these, dual acetylation at K12 and
K16 of histone H4 showed a significant
difference between HRD and non-HRD
samples (Figure 6); differential acetylation
of K12 and K16 was verified using synthetic peptides and targeted analysis using sequential windowed
data-independent acquisition of the total high-resolution mass
spectra (SWATH-MS). In cell-line data, acetylation of H4
has previously been reported to be involved in the choice of
DNA double-strand break (DSB) repair pathways (homologous
recombination or non-homologous end joining; Gong and Miller,
2013; Tang et al., 2013). This relationship is regulated partially by
HDAC1, a protein also identified in the DDN analysis. We
observed a significant enrichment of HDAC1 and its co-regulated proteins in tumors with HRD and low H4 acetylation relative
to non-HRD tumors with high H4 acetylation (DDN analysis: permutation tests, p value < 0.05; differentially acetylated peptides:
t tests, p value < 0.05, with an estimated FDR < 0.5% by
bootstrap/permutation tests). The combined observations of
increased HDAC1 and associated proteins at the pathway level,
together with decreased acetylation of H4 in HRD patients at the
PTM level, provide insight regarding the potential role of HDAC1
in modulating the choice of DSB repair pathways.
Figure 6. DDN Analysis and Lysine-Acetylation Analysis between HRD and Non-HRD
Patients
DDN analysis revealed a sub-network of proteins
that displayed distinct co-expression patterns
between HRD and non-HRD patients, where purple connections indicate protein correlations that
exist only in HRD samples and blue connections
indicate protein correlations that exist only in non-HRD patients. The links between the nodes are
drawn with two different thicknesses, indicative of
whether the connections meet the significance
threshold of 0.05 (thin lines) or 0.01 (thick lines).
Proteins with blue dotted circles are known to be
involved in histone acetylation or deacetylation.
Furthermore, identification and quantitation of
lysine-acetylated peptides in global proteomics
data showed that acetylation levels at K12 and
K16 of histone H4 are significantly different between HRD and non-HRD samples, suggesting a
role of histone H4 acetylation (together with
HDAC1) in modulating the choice of DSB repair
mechanisms.
[Plots: histone H4 GLGK(Ac)GGAK(Ac)R by HRD status (HRD negative versus HRD positive); global iTRAQ data, log ratio, p value = 0.039; SWATH data, peak area, p value = 0.028.]
Phosphoproteomic Analysis of Pathways Associated with Survival
An initial set of 24,464 different phosphopeptides (21,298 unique
phosphorylation sites) contained within 4,420 proteins having
data on net differences in phosphorylation (Table S2) was obtained. Since ischemia of the TCGA tumor samples may alter
phosphopeptide abundance, we removed any phosphorylation
sites previously shown to be altered by ischemia (Mertins
et al., 2014). We also removed three tumor samples having substantially higher than average missing values and two tumor
samples lacking somatic TP53 mutations (Supplemental Experimental Procedures). This yielded a final set of 7,675 different
phosphopeptides (6,802 unique phosphorylation sites) from
2,324 proteins (Table S2) that were mapped to pathways via
the NCI PID. The average net phosphorylation of all phosphopeptides mapping to a given pathway (i.e., the statistical average
across all phosphoproteins known to be in the pathway rather
than the statistically significant individual proteins) was used as
a measure of pathway activity and compared between short survivors (deceased, having survived <3 years; n = 17) and long survivors (patients surviving >5 years; n = 19), using both proteomic
and phosphoproteomic data.
Figure 7. Pathway Analysis Associated with Patient Survival
(A) Pathway components were statistically
analyzed using a two-sided t test between samples from patients surviving <3 years (short survival) and patients surviving >5 years (long survival)
for differences in CNA, mRNA expression, protein,
or phosphorylation abundance. All significant
pathways for mRNA and protein are shown (Benjamini-Hochberg adjusted p value < 0.05), and the
most significant pathways for phosphoproteomics
are plotted on the x axis as the log of the p value.
Results show that phosphorylation provides
additional information about the functional state of
tumors.
(B) Simplified PDGFR-beta pathway showing differences between short- and long-survivor groups
for protein abundance (p), mRNA abundance (m),
and phosphoprotein abundance (circled P).
[Panel A: bar chart of pathway enrichment (-log p value) for phosphopeptide, protein, transcript, and CNA data; labeled pathways: 44.8 RhoA regulatory, 31.0 PDGFR, 24.4 integrin-linked kinase, Notch, HER2/Neu, Rac1, Cxcr4, thrombin, IL-12, and thromboxane. Panel B: pathway diagram spanning extracellular, cytoplasm, and nucleus compartments; legend symbols mark increased (p < 0.05), increased (0.1 > p > 0.05), decreased (p < 0.05), decreased (0.1 > p > 0.05), or no difference.]
An issue in ranking the pathways is the distinction between
phosphopeptides that are unique to a single pathway, such as
the receptor peptides themselves, and peptides that map to proteins shared between pathways. In order to address this issue,
we performed two analyses: one of all proteins associated with
a specific pathway in the NCI PID and a second analysis in which
proteins common to multiple pathways
were excluded. Figure 7A shows a
ranking of activated pathways, as inferred
from the analysis of pathway-specific
components (i.e., differences in mRNA,
protein, and phosphoprotein levels) in
short versus long survivors. Fifteen pathways
showed increased phosphorylation at Benjamini-Hochberg adjusted
p values < 0.01, in contrast with transcriptional data for the same samples that
yielded only one statistically significantly
increased pathway, androgen receptor
signaling, which was also increased at
the phosphoprotein level (Figure 7A; Supplemental Experimental Procedures). All
pathways indicated in Figure 7A were
increased in association with short survival relative to long survival, though
some individual components within a
pathway showed opposite changes
(e.g., JNK1 transcript and protein levels,
see Figure 7B). The RhoA-regulatory, PDGFRB, and integrin-linked kinase pathways emerged as the most activated in this analysis (Figure 7A). Figure 7B shows the common and unique
elements of the PDGFRB pathway and the trends in mRNA, protein, and phosphoprotein levels in the comparison of short
versus long survivors, illustrating the benefit of combined protein
and phosphoprotein measurements, as well as the substantial
contribution of protein phosphorylation data to the overall
analysis.
We compared our results to the 31 phosphoproteins represented on the RPPA arrays used in the original TCGA ovarian cancer analysis (Cancer Genome Atlas Research Network, 2011);
RPPA identified differential phosphorylation of ERK1, RAF1,
and STAT3 in the same patient samples analyzed by MS. For
short-surviving versus long-surviving patients, ERK1 showed statistically significantly increased phosphorylation by RPPA, on the
same ERK1 phosphopeptides as found using MS, while AKT1,
RAF1, and STAT3 showed increased phosphorylation by both
RPPA and MS-based analyses but based upon distinct phosphopeptides (Table S2). Good overlap was observed between RPPA
and MS results for phosphoproteins on the RPPA array, but we
did not observe three of the specific phosphosites from the Provar
survival signature in our MS data, suggesting that proteomics is
more sensitive to overall pathway activity than specific regulatory
events in the pathway itself. However, many of the phosphopeptides identified by MS phosphoproteomics as components of the
PDGFR-beta pathway were not represented in the RPPA, illustrating the ability to identify phosphorylation events not readily
accessible with currently available antibodies (Figure 7B).
DISCUSSION
The addition of proteomic information, including PTMs, to the
rich genomic and transcriptomic data available from the ovarian
HGSC samples analyzed by TCGA has provided additional insights into the biology of HGSC, including the identification of
changes at the level of pathway activation. Integration of
genomic, proteomic, and phosphoproteomic measurements
identified differentially regulated pathways and functional modules that displayed significant associations with patient outcomes, including survival and HRD status. Cox proportional
hazard analysis of proteins associated with CNAs identified
overall survival signatures enriched for targets of the proliferation-associated transcription factor SRF, and integrated proteomic, phosphoproteomic, and transcriptomic data identified
pathways that differentiated patients on the basis of survival,
including the PDGFR-beta signaling pathway associated with
angiogenesis, the RhoA regulatory and integrin-linked kinase
pathways associated with cell mobility and invasion, and pathways associated with chemokine signaling and adaptive immunity. Proteins associated with cell invasion and motility emerged
from both the integration of CNAs with protein abundance data
and the phosphoproteomic investigation of upregulated pathways in short versus long survivors. Despite the complexity of
genomic alterations that characterize HGSC, these analyses
suggest a functional convergence on a subset of key pathways.
The association of increased invasiveness and motility with short
overall survival in this study may help to explain more aggressive
mechanisms of dissemination in ovarian cancer, including
recently reported hematogenous metastasis (Pradeep et al.,
2014), in addition to previously described lymphatic spread
and known peritoneal spread by direct extension.
Focusing on functional modules has also revealed potential
drivers of HGSC and more robust signatures for potentially stratifying ovarian HGSC patients into distinct cancer phenotypes to
inform therapeutic management. For example, the identification
of specific acetylation events associated with HRD, e.g., the
simultaneous acetylation of K12 and K16 on histone H4, may
provide an alternative biomarker of HRD and a rationale for the
selection of patients in future clinical trials of HDAC inhibitors,
alone or in combination with PARP inhibition. This may help to
resolve the current discrepancy between the initial observation
of limited single-agent activity of HDAC inhibitors in ovarian cancer (Mackay et al., 2010; Modesitt et al., 2008) and more recent
findings of a >40% response rate when used in combination with
cytotoxic chemotherapy in platinum-resistant patients (Dizon
et al., 2012). Similarly, comprehensive interrogation of protein
phosphorylation identified multiple pathways with significantly
increased phosphorylation in patients with poor clinical outcomes. Specifically, the observed activation of PDGFR pathways in a subset of patients with short overall survival could
potentially stratify patients for selective enrollment in trials of
anti-angiogenic therapy, which is particularly relevant to current
controversies over the use of bevacizumab as a first-line therapy
in ovarian cancer (Gadducci et al., 2015).
Overall, this work illustrates the ability of proteomics to complement genomics in providing additional insights into pathways
and processes that drive ovarian cancer biology and how these
pathways are altered in correspondence with clinical phenotypes. The comprehensive proteomic measurements for the
HGSC tumor samples provide a public resource of information.
Importantly, the ability to identify PTMs revealed a strong association between specific histone acetylation events and the HRD
phenotype. In addition, the enhanced view of pathway activity
provided by measurements of protein phosphorylation provides
a foundation for linking genotype to proteotype and, ultimately,
to phenotype for understanding the molecular basis of cancer.
Full methods are available in the Supplemental Experimental Procedures.
Tumor Samples
All tumor samples for the present study were obtained through the TCGA Biospecimen Core Resource, as described previously (Zhang et al., 2014). Samples were selected based on overall survival, HRD status, and availability.
Quantitative Proteomics
Tissue proteins were extracted and digested with trypsin; at each site, a portion
of each resulting sample was also used to create a pooled reference in which
each tumor sample contributed an equal percentage by peptide mass. The
pooled reference sample was included in each multiplexed, isobaric labeling
experiment to enable cross-experiment comparison in the entire sample cohort.
The patient samples and the pooled reference sample were each labeled
by different iTRAQ reagents (Sciex), combined, fractionated, and split for
integrated, quantitative global proteome (10%) and phosphoproteome (90%)
analysis using LC-tandem MS (LC-MS/MS) on an Orbitrap Velos mass spectrometer (Thermo Scientific). Raw data were processed for peptide identification
by database searching, and identified peptides were assembled as proteins and
mapped to gene identifiers for proteogenomic comparisons. Quantitation was
achieved by comparing the iTRAQ reporter ion intensities in each sample.
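As an illustration of reference-based quantitation, the sketch below expresses each sample channel as a log2 ratio to the pooled reference channel of the same experiment, so that values from different multiplexed experiments share a common scale. The use of log2 ratios, the channel labels, and the intensities are assumptions made for this example and are not taken from the study's processing pipeline.

```python
import math


def log2_ratios(reporter_intensities, reference_channel):
    """Return the log2 ratio of each sample channel to the pooled reference.

    reporter_intensities maps a channel label to its summed reporter ion
    intensity for one protein in one multiplexed experiment.
    """
    ref = reporter_intensities[reference_channel]
    return {
        channel: math.log2(value / ref)
        for channel, value in reporter_intensities.items()
        if channel != reference_channel
    }


# Hypothetical intensities: channel "117" carries the pooled reference here.
example = log2_ratios({"114": 200.0, "115": 50.0, "117": 100.0}, "117")
```

A positive value means the protein is more abundant in that tumor than in the pooled reference, and a negative value means it is less abundant.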
Comparison of mRNA and Protein Subtypes
A model-based clustering approach was used to model the protein abundance
data as a mixture of subtypes. Bayesian information criteria, statistical resampling, and VISDA-based sub-phenotype clustering approaches provided
similar results, with a stable optimization on five clusters as determined by
consensus clustering. WGCNA analysis was performed to infer genes or
gene networks that drive subtyping into five clusters, followed by correlation
with subtype as a trait.
Integration of Proteomics and CNA Data
Spearman correlation coefficients and corresponding adjusted p values
were calculated for each protein/transcript by CNA locus, and gene set enrichment analysis was used to infer function for groups of proteins significantly
correlated with a given CNA locus. Regression analysis was applied to
the list of trans-affected proteins correlated with each CNA to generate
parsimonious Cox proportional hazards models with maximal predictive ability, using the PNNL data for training and JHU data for testing. Kaplan-Meier
plots were used to visualize performance in predicting overall survival and progression-free survival. A CIN index (Cancer Genome Atlas Research Network,
2012) was calculated for each sample as the mean absolute value of copy-number measurements at the 29,393 selected loci. Bootstrap resampling
was used to select proteins correlated with CIN index at high confidence.
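The CIN index defined above reduces to a one-line calculation per sample. The sketch below assumes that the copy-number measurements are signed values centered on zero (for example, log ratios); that representation and the toy numbers are assumptions for illustration only.

```python
def cin_index(copy_number_values):
    """Mean absolute copy-number measurement across the selected loci."""
    return sum(abs(v) for v in copy_number_values) / len(copy_number_values)


# Hypothetical measurements at four loci for a single tumor sample.
sample_cin = cin_index([0.5, -0.5, 1.0, 0.0])
```

Gains and losses both raise the index because the absolute value is taken before averaging, so a tumor with many alterations of either sign scores higher than a near-diploid one.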
Protein Acetylation and HRD
Acetylated peptides were identified by searching the global proteomics data
for dynamic acetylation to lysines (+42 Da). Acetylation levels were compared
between HRD and non-HRD cases by t test. Targeted proteomics (SWATH;
Collins et al., 2013) was used to orthogonally quantify the acetylated peptide
with synthetic peptides as the internal standard.
Survival-Related Pathways Analysis
Phosphoproteome data from the short and long survivors were mapped to
signaling pathways in the NCI Pathway Interaction Database (PID) (http://pid.nci.nih.gov)
using the gene names. For each signaling pathway in the
PID, relative abundances for all phosphopeptides mapping to any pathway
component were identified and separated into short- and long-survivor
groups. The difference in distributions between the two sets of pathway-specific peptides, i.e., those associated with either short survivors or long survivors, was then assessed using a two-tailed t test. Similar enrichment analyses
were also performed using protein abundance, mRNA abundance, and CNA.
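The pathway comparison described above can be sketched in two steps: pool the relative abundances of every phosphopeptide that maps to a pathway component, separately for short and long survivors, and then compare the two pools with a t statistic. This is a hedged illustration, not the authors' implementation. The data layout and names are hypothetical, the unequal-variance (Welch) form of the statistic is an assumption because the text does not say which variant was used, and the two-tailed p value would come from a t distribution in a statistics library (for example, `scipy.stats.ttest_ind`).

```python
from statistics import mean, variance


def pathway_values(phospho, pathway_proteins, samples):
    """Pool values for all phosphopeptides of pathway proteins in the given samples.

    phospho maps (protein, peptide) to {sample id: relative abundance}.
    """
    return [
        value
        for (protein, _peptide), per_sample in phospho.items()
        if protein in pathway_proteins
        for sample, value in per_sample.items()
        if sample in samples
    ]


def welch_t(a, b):
    """Welch's t statistic for two groups; positive when mean(a) > mean(b)."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    return (mean(a) - mean(b)) / (va + vb) ** 0.5


# Hypothetical data: two short survivors (s1, s2) and one long survivor (l1).
toy_phospho = {
    ("PDGFRB", "pep1"): {"s1": 1.0, "s2": 0.5, "l1": -0.5},
    ("JUN", "pep2"): {"s1": 0.2, "l1": 0.1},
}
short_pool = pathway_values(toy_phospho, {"PDGFRB"}, {"s1", "s2"})
```

Excluding proteins shared between pathways, as in the second analysis described in the Results, would amount to filtering `pathway_proteins` before pooling.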
Data Repository
All of the primary MS data on TCGA tumor samples are deposited at the
CPTAC Data Coordinating Center as raw and mzML files and complete
protein assembly datasets for public access (https://cptac-data-portal.georgetown.edu).
five figures, and seven tables and can be found with this article online at
A video abstract is available at http://dx.doi.org/10.1016/j.cell.2016.05.069#mmc10.
CONSORTIA
The members of the National Cancer Institute Clinical Proteomics Tumor Analysis Consortium are Steven A. Carr, Michael A. Gillette, Karl R. Klauser, Eric
Kuhn, D.R. Mani, Philipp Mertins, Karen A. Ketchum, Ratna Thangudu, Shuang
Cai, Mauricio Oberti, Amanda G. Paulovich, Jeffrey R. Whiteaker, Nathan J.
Edwards, Peter B. McGarvey, Subha Madhavan, Pei Wang, Daniel Chan, Akhilesh Pandey, Ie-Ming Shih, Hui Zhang, Zhen Zhang, Heng Zhu, Leslie Cope,
Gordon A. Whiteley, Steven J. Skates, Forest M. White, Douglas A. Levine,
Emily S. Boja, Christopher R. Kinsinger, Tara Hiltke, Mehdi Mesri, Robert C.
Rivers, Henry Rodriguez, Kenna M. Shaw, Stephen E. Stein, David Fenyo,
Tao Liu, Jason E. McDermott, Samuel H. Payne, Karin D. Rodland, Richard
D. Smith, Paul Rudnick, Michael Snyder, Yingming Zhao, Xian Chen, David
F. Ransohoff, Andrew N. Hoofnagle, Daniel C. Liebler, Melinda E. Sanders,
Zhiao Shi, Robbert J.C. Slebos, David L. Tabb, Bing Zhang, Lisa J. Zimmerman, Yue Wang, Sherri R. Davies, Li Ding, Matthew J.C. Ellis, and R. Reid
Townsend.
Study Conception & Design, H. Zhang, T.L., Z.Z., R.D.S., D.W.C., and K.D.R.;
Investigation Performed Experiment or Data Collection, T.L., S.S., F.Y.,
J.-Y.Z., Lijun Chen, P.S., P.A., Y.T., M.A.G., T.R.C., C.C., S.T., S.N., R.J.M.,
H. Zhang, and H. Zhu. Computation & Statistical Analysis, Z.Z., S.H.P., Bai
Zhang, J.E.M., V.A.P., Li Chen, D.R., M.E.M., S.W.C., S.W., J.W., D.L.T.,
D.F., V.B., Y.W., and Bing Zhang. Data Interpretation & Biological Analysis,
T.L., H. Zhang, S.H.P., Z.Z., Bai Zhang, J.E.M., V.A.P., Li Chen, D.R., S.N.,
C.W., I.-M.S., A.P., M.P.S., D.A.L., R.D.S., D.W.C., and K.D.R. Writing Manuscript Preparation & Revision, H. Zhang, T.L., Z.Z., S.H.P., J.E.M., D.R., V.A.P.,
L. Cope, H.R., I.-M.S., A.P., M.P.S., D.A.L., R.D.S., D.W.C., and K.D.R. Supervision & Administration: H.R., E.S.B., T.H., R.C.R., L.S., R.D.S., D.W.C., and
K.D.R.
ACKNOWLEDGMENTS
This work was supported by National Cancer Institute (NCI) CPTAC awards
U24CA160019 and U24CA160036 and by NIH grant P41GM103493. The
PNNL proteomics work described herein was performed in the Environmental
Molecular Sciences Laboratory, a U.S. Department of Energy (DOE) National
Scientific User Facility located at PNNL in Richland, WA. PNNL is a multi-program national laboratory operated by the Battelle Memorial Institute for the
DOE under contract DE-AC05-76RL01830. Genomics data for this study
were generated by the TCGA Pilot Project, established by the NCI and the National Human Genome Research Institute.
REFERENCES
Barrès, V., Ouellet, V., Lafontaine, J., Tonin, P.N., Provencher, D.M., and Mes-Masson, A.M. (2010). An essential role for Ran GTPase in epithelial ovarian
cancer cell survival. Mol. Cancer 9, 272.
Cai, R.L., Yan-Neale, Y., Cueto, M.A., Xu, H., and Cohen, D. (2000). HDAC1, a
histone deacetylase, forms a complex with Hus1 and Rad9, two G2/M checkpoint Rad proteins. J. Biol. Chem. 275, 27909–27916.
Cancer Genome Atlas Research Network (2011). Integrated genomic analyses
of ovarian carcinoma. Nature 474, 609–615.
Cancer Genome Atlas Research Network (2012). Comprehensive genomic
characterization of squamous cell lung cancers. Nature 489, 519–525.
Collins, B.C., Gillet, L.C., Rosenberger, G., Röst, H.L., Vichalkovski, A.,
Gstaiger, M., and Aebersold, R. (2013). Quantifying protein interaction dynamics by SWATH mass spectrometry: application to the 14-3-3 system.
Nat. Methods 10, 1246–1253.
Cope, L., Wu, R.C., Shih, IeM., and Wang, T.L. (2013). High level of chromosomal aberration in ovarian cancer genome correlates with poor clinical
outcome. Gynecol. Oncol. 128, 500–505.
Creighton, C.J., Hernandez-Herrera, A., Jacobsen, A., Levine, D.A., Mankoo,
P., Schultz, N., Du, Y., Zhang, Y., Larsson, E., Sheridan, R., et al.; Cancer
Genome Atlas Research Network (2012). Integrated analyses of microRNAs
demonstrate their widespread influence on gene expression in high-grade serous ovarian carcinoma. PLoS ONE 7, e34546.
Dizon, D.S., Damstrup, L., Finkler, N.J., Lassen, U., Celano, P., Glasspool, R.,
Crowley, E., Lichenstein, H.S., Knoblach, P., and Penson, R.T. (2012). Phase II
activity of belinostat (PXD-101), carboplatin, and paclitaxel in women with previously treated ovarian cancer. Int. J. Gynecol. Cancer 22, 979–986.
Fanjul-Fernández, M., Quesada, V., Cabanillas, R., Cadiñanos, J., Fontanil, T.,
Obaya, A., Ramsay, A.J., Llorente, J.L., Astudillo, A., Cal, S., and López-Otín,
C. (2013). Cell-cell adhesion genes CTNNA2 and CTNNA3 are tumour
suppressors frequently mutated in laryngeal carcinomas. Nat. Commun. 4,
2531.
Farmer, H., McCabe, N., Lord, C.J., Tutt, A.N., Johnson, D.A., Richardson,
T.B., Santarosa, M., Dillon, K.J., Hickson, I., Knights, C., et al. (2005). Targeting
the DNA repair defect in BRCA mutant cells as a therapeutic strategy. Nature
434, 917921.
Gadducci, A., Lanfredini, N., and Sergiampietri, C. (2015). Antiangiogenic
agents in gynecological cancer: State of art and perspectives of clinical
research. Crit. Rev. Oncol. Hematol. 96, 113128.
Gong, F., and Miller, K.M. (2013). Mammalian DNA repair: HATs and HDACs
make their mark through histone acetylation. Mutat. Res. 750, 2330.
persistent or recurrent epithelial ovarian or primary peritoneal carcinoma: a

Gynecologic Oncology Group study. Gynecol. Oncol. 109, 182186.
Hanahan, D., and Weinberg, R.A. (2011). Hallmarks of cancer: the next generation. Cell 144, 646674.
Pradeep, S., Kim, S.W., Wu, S.Y., Nishimura, M., Chaluvally-Raghavan, P.,
Miyake, T., Pecot, C.V., Kim, S.J., Choi, H.J., Bischoff, F.Z., et al. (2014). Hematogenous metastasis of ovarian cancer: rethinking mode of spread. Cancer
Cell 26, 7791.
Jovanovic, M., Rooney, M.S., Mertins, P., Przybylski, D., Chevrier, N., Satija,
R., Rodriguez, E.H., Fields, A.P., Schwartz, S., Raychowdhury, R., et al.
(2015). Immunogenetics. Dynamic profiling of the protein life cycle in response
to pathogens. Science 347, 1259038.
Kislinger, T., Cox, B., Kannan, A., Chung, C., Hu, P., Ignatchenko, A., Scott,
M.S., Gramolini, A.O., Morris, Q., Hallett, M.T., et al. (2006). Global survey of
organ and organelle protein expression in mouse: combined proteomic and
transcriptomic profiling. Cell 125, 173186.
Kobel, M., Huntsman, D., and Gilks, C.B. (2008). Critical molecular abnormalities in high-grade serous carcinoma of the ovary. Expert Rev. Mol. Med.
10, e22.
Komili, S., and Silver, P.A. (2008). Coupling and coordination in gene expression processes: a systems biology view. Nat. Rev. Genet. 9, 3848.
Kuo, K.T., Guan, B., Feng, Y., Mao, T.L., Chen, X., Jinawath, N., Wang, Y., Kurman, R.J., Shih, IeM., and Wang, T.L. (2009). Analysis of DNA copy number
alterations in ovarian serous tumors identifies new molecular genetic changes
in low-grade and high-grade carcinomas. Cancer Res. 69, 40364042.
Langfelder, P., and Horvath, S. (2008). WGCNA: an R package for weighted
correlation network analysis. BMC Bioinformatics 9, 559.
Liu, J., and Matulonis, U.A. (2014). New strategies in ovarian cancer: translating the molecular complexity of ovarian cancer into treatment advances.
Clin. Cancer Res. 20, 51505156.
Lu, P., Vogel, C., Wang, R., Yao, X., and Marcotte, E.M. (2007). Absolute protein expression profiling estimates the relative contributions of transcriptional
and translational regulation. Nat. Biotechnol. 25, 117124.
Mackay, H.J., Hirte, H., Colgan, T., Covens, A., MacAlpine, K., Grenci, P.,
Wang, L., Mason, J., Pham, P.A., Tsao, M.S., et al. (2010). Phase II trial of
the histone deacetylase inhibitor belinostat in women with platinum resistant
epithelial ovarian cancer and micropapillary (LMP) ovarian tumours. Eur. J.
Cancer 46, 15731579.
Marguerat, S., Schmidt, A., Codlin, S., Chen, W., Aebersold, R., and Bahler, J.
(2012). Quantitative analysis of fission yeast transcriptomes and proteomes in
proliferating and quiescent cells. Cell 151, 671683.
McEllin, B., Camacho, C.V., Mukherjee, B., Hahm, B., Tomimatsu, N., Bachoo,
R.M., and Burma, S. (2010). PTEN loss compromises homologous recombination repair in astrocytes: implications for glioblastoma therapy with temozolomide or poly(ADP-ribose) polymerase inhibitors. Cancer Res. 70, 54575464.
Ross, P.L., Huang, Y.N., Marchese, J.N., Williamson, B., Parker, K., Hattan, S.,
Khainovski, N., Pillai, S., Dey, S., Daniels, S., et al. (2004). Multiplexed protein
quantitation in Saccharomyces cerevisiae using amine-reactive isobaric
tagging reagents. Mol. Cell. Proteomics 3, 11541169.
Ruggles, K.V., Tang, Z., Wang, X., Grover, H., Askenazi, M., Teubl, J., Cao, S.,
McLellan, M.D., Clauser, K.R., Tabb, D.L., et al. (2016). An analysis of the
sensitivity of proteogenomic mapping of somatic mutations and novel splicing
events in cancer. Mol. Cell. Proteomics 15, 10601071.
Tang, J., Cho, N.W., Cui, G., Manion, E.M., Shanbhag, N.M., Botuyan, M.V.,
Mer, G., and Greenberg, R.A. (2013). Acetylation limits 53BP1 association
with damaged chromatin to promote homologous recombination. Nat. Struct.
Mol. Biol. 20, 317325.
Vang, R., Levine, D.A., Soslow, R.A., Zaloudek, C., Shih, IeM., and Kurman,
R.J. (2016). Molecular alterations of TP53 are a defining feature of ovarian
high-grade serous carcinoma: a rereview of cases lacking TP53 mutations in
The Cancer Genome Atlas Ovarian Study. Int. J. Gynecol. Pathol. 35, 4855.
Verhaak, R.G., Tamayo, P., Yang, J.Y., Hubbard, D., Zhang, H., Creighton,
C.J., Fereday, S., Lawrence, M., Carter, S.L., Mermel, C.H., et al. (2013). Prognostically relevant gene signatures of high-grade serous ovarian carcinoma. J
Clin Invest. 123, 517525.
Vogel, C., and Marcotte, E.M. (2012). Insights into the regulation of protein
abundance from proteomic and transcriptomic analyses. Nat. Rev. Genet.
13, 227232.
Woo, S., Cha, S.W., Na, S., Guest, C., Liu, T., Smith, R.D., Rodland, K.D.,
Payne, S., and Bafna, V. (2014a). Proteogenomic strategies for identification
of aberrant cancer peptides using large-scale next-generation sequencing
data. Proteomics 14, 27192730.
Woodbine, L., Gennery, A.R., and Jeggo, P.A. (2014). The clinical impact of
deficiency in DNA non-homologous end-joining. DNA Repair (Amst.) 16,
8496.
Wu, L., Candille, S.I., Choi, Y., Xie, D., Jiang, L., Li-Pook-Than, J., Tang, H.,
and Snyder, M. (2013). Variation and genetic control of protein abundance in
humans. Nature 499, 7982.
Yang, L., Duff, M.O., Graveley, B.R., Carmichael, G.G., and Chen, L.L. (2011).
Genomewide characterization of non-polyadenylated RNAs. Genome Biol.
12, R16.
Meng, H., Tian, L., Zhou, J., Li, Z., Jiao, X., Li, W.W., Plomann, M., Xu, Z., Lisanti, M.P., Wang, C., and Pestell, R.G. (2011). PACSIN 2 represses cellular
migration through direct association with cyclin D1 but not its alternate splice
form cyclin D1b. Cell Cycle 10, 7381.
Yang, J.Y., Yoshihara, K., Tanaka, K., Hatae, M., Masuzaki, H., Itamochi, H.,
Takano, M., Ushijima, K., Tanyi, J.L., Coukos, G., et al.; Cancer Genome Atlas
(TCGA) Research Network (2013). Predicting time to ovarian carcinoma recurrence using protein markers. J. Clin. Invest. 123, 37403750.
Mertins, P., Yang, F., Liu, T., Mani, D.R., Petyuk, V.A., Gillette, M.A., Clauser,
K.R., Qiao, J.W., Gritsenko, M.A., Moore, R.J., et al. (2014). Ischemia in tumors
induces early and sustained phosphorylation changes in stress kinase pathways but does not affect global protein levels. Mol. Cell. Proteomics 13,
16901704.
Yi, B., Zhang, Y., Zhu, D., Zhang, L., Song, S., He, S., Zhang, B., Li, D., and
Zhou, J. (2015). Overexpression of RhoGDI2 correlates with the progression
and prognosis of pancreatic carcinoma. Oncol. Rep. 33, 12011206.
Mertins, P., Mani, D.R., Ruggles, K.V., Gillette, M.A., Clauser, K.R., Wang, P.,
Wang, X., Qiao, J.W., Cao, S., Petralia, F., et al.; NCI CPTAC (2016). Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 534,
5562.
Mielcarek, M., Zielonka, D., Carnemolla, A., Marcinkowski, J.T., and Guidez, F.
(2015). HDAC4 as a potential therapeutic target in neurodegenerative diseases: a summary of recent achievements. Front. Cell. Neurosci. 9, 42.
Modesitt, S.C., Sill, M., Hoffman, J.S., and Bender, D.P.; Gynecologic
Oncology Group (2008). A phase II study of vorinostat in the treatment of
Yu, G., Zhang, B., Bova, G.S., Xu, J., Shih, IeM., and Wang, Y. (2011). BACOM:
in silico detection of genomic deletion types and correction of normal cell
contamination in copy number data. Bioinformatics 27, 14731480.
Zhang, B., Li, H., Riggins, R.B., Zhan, M., Xuan, J., Zhang, Z., Hoffman, E.P.,
Clarke, R., and Wang, Y. (2009). Differential dependency network analysis to
identify condition-specific topological changes in biological networks. Bioinformatics 25, 526532.
Zhang, B., Wang, J., Wang, X., Zhu, J., Liu, Q., Shi, Z., Chambers, M.C., Zimmerman, L.J., Shaddox, K.F., Kim, S., et al.; NCI CPTAC (2014). Proteogenomic characterization of human colon and rectal cancer. Nature 513,
382387.
Cell 166, 755–765, July 28, 2016
Resource
Human SRMAtlas: A Resource of Targeted Assays to Quantify the Complete Human Proteome
Graphical Abstract
Authors
Ulrike Kusebauch, David S. Campbell,
Eric W. Deutsch, ..., Leroy Hood,
Ruedi Aebersold, Robert L. Moritz
Correspondence
aebersold@imsb.biol.ethz.ch (R.A.),
rmoritz@systemsbiology.org (R.L.M.)
In Brief
This resource enables the accurate
detection and quantification of any
known or predicted human protein from
complex biological samples.
Highlights
• Human SRMAtlas: 166,174 proteotypic peptides representing the human proteome
• Resource of verified high-resolution spectra and multiplexed SRM assays
• Supports proteome-scale quantification as well as hypothesis-driven research
• Web database with free unlimited access
Kusebauch et al., 2016, Cell 166, 766–778

Accession Numbers
GSE83654
Ulrike Kusebauch,1 David S. Campbell,1 Eric W. Deutsch,1 Caroline S. Chu,1 Douglas A. Spicer,1 Mi-Youn Brusniak,1
Joseph Slagel,1 Zhi Sun,1 Jeffrey Stevens,1 Barbara Grimes,1 David Shteynberg,1 Michael R. Hoopmann,1
Peter Blattmann,2 Alexander V. Ratushny,1,6 Oliver Rinner,2,3 Paola Picotti,2 Christine Carapito,2 Chung-Ying Huang,1
Meghan Kapousouz,1 Henry Lam,4 Tommy Tran,1 Emek Demir,5 John D. Aitchison,1,6 Chris Sander,5 Leroy Hood,1
Ruedi Aebersold,2,7,* and Robert L. Moritz1,*
1Institute for Systems Biology, Seattle, WA 98109, USA
2Department of Biology, Institute of Molecular Systems Biology, ETH Zurich, 8093 Zurich, Switzerland
3Biognosys AG, 8952 Schlieren, Switzerland
4Department of Chemical and Biomolecular Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong, China
5Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, NY 10065, USA
6Center for Infectious Disease Research, Seattle, WA 98109, USA
7Faculty of Science, University of Zurich, 8006 Zurich, Switzerland
*Correspondence: aebersold@imsb.biol.ethz.ch (R.A.), rmoritz@systemsbiology.org (R.L.M.)
SUMMARY
The ability to reliably and reproducibly measure any
protein of the human proteome in any tissue or cell
type would be transformative for understanding systems-level properties as well as specific pathways in
physiology and disease. Here, we describe the generation and verification of a compendium of highly specific assays that enable quantification of 99.7% of
the 20,277 annotated human proteins by the widely
accessible, sensitive, and robust targeted mass
spectrometric method selected reaction monitoring,
SRM. This human SRMAtlas provides definitive coordinates that conclusively identify the respective peptide in biological samples. We report data on 166,174
proteotypic peptides providing multiple, independent assays to quantify any human protein and
numerous spliced variants, non-synonymous mutations, and post-translational modifications. The data
are freely accessible as a resource at http://www.srmatlas.org/,
and we demonstrate its utility by
examining the network response to inhibition of
cholesterol synthesis in liver cells and to docetaxel
in prostate cancer lines.
INTRODUCTION
The ability to accurately and reproducibly detect and quantify
any protein of the human proteome is a main objective in the
life sciences. Achieving it would significantly contribute toward
understanding the biochemical base of living cells (Edwards
et al., 2011). In contrast to the human genome, which has been
determined in its entirety, the composition of the human proteome is still poorly defined. The prevalence of alternative splicing
and post-translational modifications increases the complexity to
an as-yet-unknown number of different proteoforms, and the
annotation of protein-coding regions and experimental evidence
for their validity are still being refined. Therefore, a well-defined
protein sequence database with annotated functional information and the identification and quantification of at least one protein from every protein-coding gene offer a pragmatic and useful
definition of a complete proteome (Mann et al., 2013).
The detection of proteins can be accomplished through mass
spectrometric and affinity-reagent-based methods. The Human
Protein Atlas, a systematic exploration of the human proteome
using antibody-based reagents, is a unique effort attempting
to characterize all human protein-coding genes (Uhlen et al.,
2005). Since the initial release in 2005, the Protein Atlas evolved
into a knowledge base that includes a diverse collection of
25,039 monoclonal and polyclonal antibodies, collectively targeting 17,005 proteins corresponding to 84% of the predicted
proteome (v.15).
For the mass spectrometric exploration of the proteome,
a range of techniques have been developed, and they can
be broadly grouped into data-dependent acquisition (DDA,
also known as shotgun or discovery proteomics) and targeted
mass spectrometry (MS) methods. Common to both methods
is that the sample proteins are first converted into peptides by
enzymatic digestion. They differ in the manner in which the
mass spectrometer (MS) is used to analyze the resulting peptide
mixtures. The majority of proteomic studies rely on the DDA
strategy that selects peptide precursor ions for collision-induced
dissociation (CID) from signals detected in a survey scan. The resulting fragment ion spectra are assigned to a peptide sequence
by peptide spectrum matching (PSM), and proteins are inferred
from confidently identified peptides. This workflow allows the
identification of thousands of proteins in a sample and provides
quantification via the presence of stable isotope-labeled reference peptides or through label-free methods (Bantscheff et al.,
2007). However, the biased precursor selection of the most
abundant peptide ions in complex samples by DDA limits the
reproducibility of data generated in repeat analyses. Also, to
[Figure 1 graphic. Workflow panels: the human proteome (UniProtKB/Swiss-Prot, 20,277 proteins); PABST (evaluate sequence characteristics, rank observed and predicted peptides, final suitability score), applied to observed peptides (empirical suitability score) and predicted peptides (PeptideSieve, Detectability Predictor, ESPP, STEPP, APEX; predictive suitability score); human proteome sequence variants and post-translational modifications (14,677 spliced isoforms, 32,418 SNPs, N-glycosylated proteins); selection of 166,174 proteotypic peptides; peptide synthesis and preparation of pools of 96 peptides; data-directed MS/MS on a 6530 Q-TOF at 5 CEs; Q-TOF spectral library and collision energy plot; QTrap spectral library. Linked resources: www.srmatlas.org, www.peptideatlas.org, www.proteinatlas.org, www.nextprot.org, www.pathwaycommons.org, www.srmcollider.org.]
Figure 1. Human SRMAtlas Development

The scheme outlines the workflow steps to generate SRM assays for every human protein. Peptides were selected for 20,277 proteins in the UniProtKB/SwissProt database as well as for spliced isoforms, SNPs, and modifications. Selection of PTPs was an iterative process by mining MS observed peptides in
PeptideAtlas and the use of prediction tools. The PABST algorithm evaluated sequence constraints and ranked observed and predicted peptides; the highest-
scoring peptides for each protein were selected for SRM assay development. Peptides were individually synthesized and pooled in sets of 96. Peptides were
analyzed on an Agilent 6530 Q-TOF with five CEs to acquire high-resolution MS/MS spectra to create spectral libraries and CE plots. SRM coordinates were
extracted from the spectral library to acquire chromatographic traces on an Agilent 6460 QQQ. SRM assays were also developed on a Sciex QTrap 5500, upon
Cell 166, 766–778, July 28, 2016
reach high proteome coverage, enormous numbers of peptides
need to be sampled, which, in turn, causes significant technical
and computational challenges at the level of PSM and protein
inference (Deutsch et al., 2015b). Therefore, DDA MS is well
suited to discover the components of a sample but less so for
the generation of reproducible quantitative data across many
samples.
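The abundance bias of DDA precursor selection can be illustrated with a toy top-N picker. This is a minimal sketch and not any instrument's acquisition logic: the function name `top_n_precursors`, the choice of N = 3, and all m/z and intensity values are illustrative assumptions.

```python
def top_n_precursors(survey_scan, n):
    """Return the m/z values of the n most intense ions in a survey scan.

    survey_scan: dict mapping precursor m/z -> intensity (toy values).
    """
    ranked = sorted(survey_scan, key=survey_scan.get, reverse=True)
    return set(ranked[:n])


# Two hypothetical replicate survey scans of the same sample: the abundant
# ions are stable, while the low-abundance ions fluctuate around the cutoff.
run_1 = {400.69: 9.0e6, 512.27: 7.5e6, 623.81: 1.10e5, 701.33: 1.05e5, 455.20: 0.95e5}
run_2 = {400.69: 9.2e6, 512.27: 7.1e6, 623.81: 0.98e5, 701.33: 1.02e5, 455.20: 1.07e5}

picked_1 = top_n_precursors(run_1, n=3)
picked_2 = top_n_precursors(run_2, n=3)

# Abundant precursors are selected for fragmentation in both runs ...
shared = picked_1 & picked_2
# ... while the last slot goes to a different low-abundance ion in each run.
only_in_one_run = picked_1 ^ picked_2
print(sorted(shared), sorted(only_in_one_run))
```

In this toy case the two abundant ions are fragmented in both runs, but the low-abundance ion that fills the third slot differs, which is the kind of run-to-run inconsistency that a targeted method avoids by fixing the ion list in advance.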
Selected reaction monitoring (SRM, also named multiple reaction monitoring [MRM]) instead is a targeted, quantitative technique that is characterized by a lower limit of detection, a wider
dynamic range and increased reproducibility. SRM is primarily
performed on triple quadrupole (QQQ) MS instruments where
the first quadrupole (Q1) filters the precursor ions of a peptide,
the second quadrupole (Q2) provides CID, and the third quadrupole (Q3) isolates predetermined fragment ions. This process results in a quantifiable signal represented as a chromatographic
trace. SRM is referred to as a targeted approach because only predetermined ions are measured. The two-level mass selection and the
non-scanning mode translate into increased specificity and
sensitivity and, in the presence of stable isotope-labeled standards,
into precise quantification. The pair of mass-to-charge (m/z) values
that is isolated in Q1 and Q3 is referred to as a transition, and
a set of transitions that determine a peptide signature is, in combination with the peptide's elution time, termed an SRM assay.
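To make the notion of a transition concrete, the sketch below computes Q1 (precursor) and Q3 (fragment) m/z values for an unmodified peptide from standard monoisotopic residue masses. It is an illustration only: the function names are hypothetical, the example peptide is arbitrary, all singly charged y ions are enumerated rather than the few most intense ones an actual assay would use, and modifications such as cysteine alkylation are not modeled.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, monoisotopic
PROTON = 1.007276   # mass of a proton


def precursor_mz(peptide, charge):
    """m/z of the intact peptide ion, i.e. the value filtered in Q1."""
    neutral = sum(RESIDUE_MASS[aa] for aa in peptide) + WATER
    return (neutral + charge * PROTON) / charge


def y_ion_mz(peptide, n_residues, charge=1):
    """m/z of the y ion made of the last n_residues residues (filtered in Q3)."""
    fragment = sum(RESIDUE_MASS[aa] for aa in peptide[-n_residues:]) + WATER
    return (fragment + charge * PROTON) / charge


def transitions(peptide, precursor_charge=2):
    """List (Q1 m/z, Q3 m/z, label) for the singly charged ions y1..y(n-1)."""
    q1 = precursor_mz(peptide, precursor_charge)
    return [(round(q1, 4), round(y_ion_mz(peptide, k), 4), f"y{k}")
            for k in range(1, len(peptide))]


for q1, q3, label in transitions("PEPTIDE"):
    print(f"{label}: Q1 {q1} -> Q3 {q3}")
```

Each printed line is one Q1/Q3 pair; a full SRM assay would additionally record the peptide's elution time and the relative intensities of the chosen transitions.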
SRM has the unique capability of rapidly quantifying targeted
proteins, their variants, and modifications through the detection
of suitable proteotypic surrogates as a multiplexed and cost-efficient alternative to antibody-based assays. In addition, peptide
affinity reagents can be used to specifically capture analytes
and enhance sensitivity (Anderson et al., 2004). SRM has been
applied for decades in the pharmaceutical industry to quantify
small molecules (Baty and Robinson, 1977) and evolved recently
into an established technique in the field of proteomics, due to
advanced technology and reproducibility across instrument platforms and laboratories (Addona et al., 2009).
However, SRM requires defining a priori a set of target proteins, optimal peptides, and assay parameters. This is not a trivial
task as not every peptide is suitable for SRM, and assays need to
be experimentally determined from selected peptides. The major
challenge of SRM is the initial effort to develop high-quality SRM
assays, which is, despite all progress, still a time-intensive process. Once an assay is developed, it can be applied perpetually
in a variety of studies.
Recently, we developed SRM assays for Saccharomyces cerevisiae (Picotti et al., 2009, 2013), Streptococcus pyogenes
(Karlsson et al., 2012), and Mycobacterium tuberculosis (Schubert et al., 2013) proteomes and successfully applied these assays to a wide range of protein studies in the respective species
(Ebhardt et al., 2015; Picotti and Aebersold, 2012). We and
several other laboratories have also developed SRM assays
for human proteins, typically for a small number of proteins in
the context of a specific biological study. The targeted approach
has progressively been applied to the quantification of
low-abundance proteins in complex matrices and the verification of
biomarker candidates, and it has proved successful in clinical settings (Craciun et al., 2016; Gillette and Carr, 2013; Huttenhain et al., 2012; Kennedy et al., 2014; Surinova et al., 2015).
However, for the human species no proteome-wide assay
resource has been available and experimental protein research
has therefore remained substantially limited in scope.
As a consequence of these factors, the majority of protein
research is still focused on the same relatively small subset of
proteins for which assays are readily available. Strikingly, the
population of proteins most frequently reported in the scientific
literature has not changed significantly since the publication
of the human genome and thus the definition, in principle, of
the proteome (Edwards et al., 2011). This indicates that the realization of the benefits of genomic knowledge for experimental
protein research critically depends on the availability of assays
supporting the quantification of any human protein.
In the present study, we developed a complete proteome-centric database for the targeted identification and quantification of any human protein of interest via SRM. We present a
unique compendium of SRM assays for essentially the entire
human proteome consisting of verified high-resolution, high
mass-accuracy MS fragment ions of each proteotypic peptide,
the chromatographic behavior of each peptide as an SRM trace,
and the relative quantitative response, all of which constitute an
SRM assay. We have compiled the data into a freely available
web-accessible database providing multiple SRM assays for
each protein, integrated with extensive bioinformatic knowledge
bases to establish a resource of assays to unambiguously identify and quantify any protein of the human proteome. We expect
that this resource will significantly advance protein-based experimental biology because any human protein can now, in principle, be quantified in any sample. We also expect that the
availability of reliable assays for the human proteome will contribute significantly to increasing the reproducibility of research
results on the human proteome.
RESULTS
To generate SRM assays for the entire human proteome, we
followed the process schematically illustrated in Figure 1. It consists of defining the target proteome, selection of proteotypic
peptides, development of SRM assays via synthetic peptides,
and compiling the data into a web-accessible resource. Here,
we describe each step of the process.
Step 1: Defining the Target Proteome
We used the 20,277 human protein sequences described in the
manually annotated and reviewed UniProtKB/Swiss-Prot database (http://www.uniprot.org/, release 2010-05) (Boutet et al.,
2007) as reference to select peptides for each protein (see also
Supplemental Information). We paid explicit attention to ensure
that membrane-bound proteins, large multi-domain proteins
the detection of a transition a full MS/MS spectrum was acquired to create a QTrap spectral library. SRM assay parameters including precursor and fragment ion
type, charge state, and rank order, elution time as well as chromatograms, MS/MS spectra, and CE plots are provided in the human SRMAtlas resource. The
human SRMAtlas is integrated with external knowledge bases providing comprehensive information on a protein of interest.
Figure 2. Human Proteome Coverage
The graph details the number of peptides per protein for empirically observed
peptides in the human PeptideAtlas (build 2010-05, blue) and for PTPs
selected for the human SRMAtlas (red). "5+" specifies five or more peptides.
"any" shows the number of proteins for which at least one peptide is available.
9,946 proteins (49.1% of the predicted human proteome) were described by
MS observed peptides in PeptideAtlas 2010. SRMAtlas provides 99.9%
proteome coverage (20,255 proteins) by synthetic peptides.
See also Tables S1 and S3.
and protein activation events resulting in non-tryptic cleavage
sites were equally represented. We extended this protein set to
address known protein isoforms, peptides containing SNPs
and N-glycosylation sites. A database of 20,277 proteins and
14,677 isoforms formed the basis for the SRMAtlas (Figure 1,
step 1).
Step 2: Selection of Proteotypic Peptides
The selection of peptides that unambiguously identify each human protein is a key step in the development of SRM assays.
We aimed to select at least the five best peptides for every
human protein-coding gene using several criteria. Primarily, we
chose proteotypic peptides (PTPs) (Kuster et al., 2005) due to
their high likelihood of being detected in subsequent measurements. We considered physicochemical properties including
length, hydrophobicity, and charge state, limitations in chemical
synthesis, reactive amino acid residues susceptible to oxidation,
pyroglutamate formation, and deamidation, and sequences that
are likely modified by post-translational modifications or contain
commonly occurring SNPs. These criteria are important and
often overlooked in selecting PTPs, as SRM assay development
and quantification generally depend on chemically synthesized
peptides. To select the optimal set of PTPs that constitute the
SRMAtlas, we preferably relied on empirical data. For those
proteins for which no empirical data were available, we computationally predicted the optimal set of peptides.
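A few of the sequence-level criteria named above can be sketched as a simple screen over in silico tryptic peptides. This is a rough illustration under stated assumptions, not the selection procedure used for the SRMAtlas: the function names, the 7 to 25 residue length window, the toy protein sequence, and the particular residues and motifs flagged are all hypothetical choices, and hydrophobicity, charge state, uniqueness within the proteome, PTM sites, and SNPs are not modeled.

```python
def tryptic_peptides(protein):
    """Cleave C-terminal to K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        is_last = i == len(protein) - 1
        if is_last or (aa in "KR" and protein[i + 1] != "P"):
            peptides.append(protein[start:i + 1])
            start = i + 1
    return peptides


def synthesis_flags(peptide, min_len=7, max_len=25):
    """Return the reasons a peptide might be penalized (empty list = none found)."""
    flags = []
    if not min_len <= len(peptide) <= max_len:
        flags.append("length")                      # assumed length window
    if "M" in peptide or "W" in peptide:
        flags.append("oxidation-prone residue")     # Met/Trp oxidize readily
    if peptide.startswith("Q"):
        flags.append("N-terminal Q")                # can cyclize to pyroglutamate
    if "NG" in peptide:
        flags.append("NG motif")                    # Asn-Gly deamidates easily
    return flags


toy_protein = "MKQNGSAPELTIDEKAAGLPEVVKRPSTR"  # made-up sequence
for pep in tryptic_peptides(toy_protein):
    print(pep, synthesis_flags(pep) or "no flags")
```

Here only `AAGLPEVVK` passes without flags; whether flagged peptides are excluded or merely down-weighted is a separate design decision.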
Selection of Peptides from Empirical Data
PeptideAtlas (http://www.peptideatlas.org/) (Desiere et al., 2005)
is a major MS repository that accepts raw MS data acquired from
biological samples generated by the scientific community and reanalyzes all MS data in a consistent process, including statistical
validation of the results, using the Trans-Proteomic Pipeline (TPP)
(Deutsch et al., 2015a). This database provides evidence of the
most consistently detected peptides per protein and their confident detection at very low false-discovery rates (FDR, usually
0.0002 at the PSM level corresponding to a 1% protein
FDR). At the time we specified the peptide set underlying this
study, the human PeptideAtlas (build 2010-05 internal) contained
106,184 distinct peptides identified in over 300 different experiments encompassing 59,142 MS runs of human cell lines, tissues, and fluids.
In a first step, we investigated how many of the 20,277
UniProtKB/Swiss-Prot proteins were represented by one to
five distinct MS observed peptides in the human PeptideAtlas.
9,946 proteins were observed by at least one peptide while
only 5,319 proteins (26% of the proteome) were identified by
five or more distinct peptides. Taken together, this demonstrated that observed peptides alone did not achieve full
coverage of the predicted human proteome and that 51% of
the human proteome had not yet been detected in MS approaches (Figure 2, blue bars; Table S1). Next, we screened
the human PeptideAtlas for the best PTPs and annotated each
PeptideAtlas observed peptide with an empirical suitability
score (ESS). The ESS takes into account the peptide probability,
the number of repeat identifications of the respective peptide,
and the selection criteria specified above (Figure 1, step 2).
The higher the ESS, the more suitable the peptide was deemed
for assay development.
Selection of Predicted Peptides
For proteins that had no empirical evidence of being detected by
MS or fewer than five PTPs in PeptideAtlas, we used published
and in-house algorithms to predict the best candidate peptides
for assay development (Figure 1, step 2). All algorithms provide
a score based on the sequence and physicochemical properties
of a peptide. Although each algorithm performed reasonably
well, the set of peptides with the highest scores determined by
each predictor overlapped less than expected. Therefore, we
developed a predictive suitability score (PSS) that allowed us
to computationally identify MS-suitable sequences for the
entire human proteome (Z.S., unpublished data). Briefly, we retrained PeptideSieve (Mallick et al., 2007) and devised a composite scoring scheme considering the results of the individual
algorithms to complement the observed peptides for assay
development.
Subsequently, we applied the PeptideAtlas best SRM transition (PABST) algorithm (E.W.D., unpublished data) to calculate
an adjusted suitability score for both observed and predicted
peptides by penalizing unfavorable sequence characteristics
described above using a multiplicative weight scoring system.
Finally, PABST ranked the adjusted scores of empirically observed
and computationally predicted peptides, and unique sequences
with the highest scores were selected for SRM assay development (Figure 1, step 2; Table S2).
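The general idea of a multiplicative penalty followed by ranking can be sketched as follows. The PABST algorithm itself is cited as unpublished, so nothing here reproduces it: the feature tests, the weight values, the base scores, and the function names are all hypothetical and serve only to show how multiplying a base suitability score by per-feature weights demotes peptides without excluding them.

```python
from math import prod

# Hypothetical feature tests and weights; a weight below 1 penalizes the peptide.
PENALTIES = [
    ("contains Met", lambda s: "M" in s, 0.5),
    ("N-terminal Gln", lambda s: s.startswith("Q"), 0.4),
    ("Asn-Gly motif", lambda s: "NG" in s, 0.3),
]


def adjusted_score(base_score, sequence):
    """Multiply the base score by the weight of every feature that applies."""
    weights = [w for _, applies, w in PENALTIES if applies(sequence)]
    return base_score * prod(weights)  # prod([]) is 1, so clean peptides keep their score


def rank(candidates, top=5):
    """candidates: (sequence, base_score) pairs, observed or predicted; sequences unique."""
    scored = {seq: adjusted_score(score, seq) for seq, score in candidates}
    return sorted(scored, key=scored.get, reverse=True)[:top]


# Made-up candidates for one protein, with made-up base suitability scores.
candidates = [
    ("AAGLPEVVK", 0.80),
    ("QNGSAPELTIDEK", 0.95),
    ("LSMEEFVK", 0.90),
    ("TVDLSSPEEAK", 0.70),
]
print(rank(candidates, top=3))
```

Because the penalties multiply rather than filter, the peptide with the highest base score drops to last place here but remains available if a protein offers no better alternative.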
This process was used to select at least five peptides per
protein if allowed by sequence constraints. For higher molecular
weight proteins (>50 kDa), we expanded the selection by
dividing each protein sequence into 10-kDa segments and
selected suitable peptides to provide assays for protein domains. For a small number of peptides, we allowed less strict
criteria with regard to length, hydrophobicity, and charge state,
to be able to select several peptides per protein and ensure as
many proteins as possible are considered for SRM assay development. While we penalized sequences with unfavorable motifs
and reactive amino acids, these were not entirely excluded, as
otherwise assays could not have been developed for several
proteins.
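The segmentation of large proteins into 10-kDa stretches can be approximated by residue count. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the text; the function name and the returned index-pair format are likewise illustrative, and an exact implementation would sum actual residue masses.

```python
AVG_RESIDUE_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def segments(sequence, segment_kda=10.0, min_protein_kda=50.0):
    """Split a protein into consecutive ~segment_kda windows if it is large enough.

    Returns (start, end) index pairs with end exclusive. Proteins at or below
    min_protein_kda are returned as a single segment.
    """
    approx_kda = len(sequence) * AVG_RESIDUE_DA / 1000.0
    if approx_kda <= min_protein_kda:
        return [(0, len(sequence))]
    step = round(segment_kda * 1000.0 / AVG_RESIDUE_DA)  # about 91 residues
    return [(i, min(i + step, len(sequence))) for i in range(0, len(sequence), step)]


large = "A" * 600   # placeholder sequence, roughly 66 kDa under the assumption
small = "A" * 300   # roughly 33 kDa, left unsegmented
print(len(segments(large)), segments(small))
```

Peptide selection would then be repeated within each window so that every region of a large protein is represented by at least one assay.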
Step 3: Extension of Peptide Selection for Protein
Isoforms, SNPs, and N-Glycosylated Proteins
To augment the assay development beyond a representative
product of the 20,277 UniProtKB/Swiss-Prot proteins, we
selected peptides identifying splice and sequence variants and
N-glycosylation sites (Figure 1, step 3). We attempted to select
at least one peptide to specifically identify splice variants
described in UniProtKB/Swiss-Prot Varsplic. Protein isoforms
originating from differentially spliced versions of a particular
mRNA are usually not characterized by several unique peptides,
but with our selection approach we chose 11,309 peptides that
allow the identification and quantification of unique splice forms.
Further, we selected all suitable C-terminal peptides resulting
in 6,820 additional peptides for the 20,277 proteins and 1,937
peptides for spliced variants. To account for sequence polymorphisms, we extended the selection to include major SNPs resulting in non-synonymous mutations. We chose 3,662 peptides
considering SNPs with a population frequency greater than
30% (=1,831 SNPs) using NCBI dbSNP (build 131) and selected
3,094 peptides (=1,547 SNPs) that fulfill the peptide selection
criteria. To identify peptides representing N-glycosylated proteins, 5,199 membrane proteins (Fagerberg et al., 2010), 1,748
secreted proteins (da Cunha et al., 2009), and 784 membrane
proteins from 47 tissue types were used to select 10,938 peptides spanning N-glycosites located in the extracellular protein
domain. Finally, for selecting peptides representing protein/
peptide hormones, we targeted both the standard UniProt
sequence as well as the mature form of these proteins considering their respective proteolytic cleavage sites to provide
SRM assays for both the pre/pro-hormone and the activated
form. We selected 142 peptides (124 distinct sequences) representing 129 proteins.
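As an illustration of what "spanning an N-glycosite" means at the sequence level, the sketch below looks for the canonical N-X-S/T sequon (X being any residue except proline). The text does not spell out the motif definition used, so the pattern and the function names are assumptions made for this example.

```python
import re

# Canonical N-glycosylation sequon: Asn, any residue except Pro, then Ser or Thr.
# The lookahead keeps matches zero-width so that overlapping sequons are found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(sequence):
    """Return 0-based positions of the Asn in every N-X-S/T sequon."""
    return [match.start() for match in SEQUON.finditer(sequence)]


def spans_glycosite(peptide):
    """True if the peptide contains at least one complete sequon."""
    return bool(sequon_positions(peptide))
```

A candidate peptide such as the hypothetical `AANGSK` would qualify, whereas `ANPTK` would not because proline occupies the middle position.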
Overall, with this iterative and comprehensive selection process, we determined 166,174 peptide sequences representing
20,255 proteins, which constitute 99.9% of the predicted human
proteome as defined in UniProtKB/Swiss-Prot. For 18,010 proteins (88.8% of the human proteome), we selected five or more
of the best PTPs; for 19,505 proteins (96.2%), three or more;
and for 19,985 proteins (98.6% of the human proteome), two or more (Figure 2,
red bars). Only 22 proteins remained inaccessible by tryptic peptides that pass the selection criteria and synthesis requirements;
thus, assays could not be developed for these proteins (Figure 2;
Table S3).
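The coverage percentages quoted above follow directly from the protein counts and the 20,277 UniProtKB/Swiss-Prot entries. A short check, using only numbers stated in the text (the dictionary keys are labels chosen for this example):

```python
TOTAL_PROTEINS = 20_277  # UniProtKB/Swiss-Prot proteins considered

# Proteins reached by the peptide selection, by minimum number of PTPs.
SELECTION_COUNTS = {
    "at least one peptide": 20_255,
    "five or more PTPs": 18_010,
    "three or more PTPs": 19_505,
    "two or more PTPs": 19_985,
}


def percent(count, total=TOTAL_PROTEINS):
    """Coverage as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


coverage = {label: percent(count) for label, count in SELECTION_COUNTS.items()}
inaccessible = TOTAL_PROTEINS - SELECTION_COUNTS["at least one peptide"]
```

The difference between the total and the covered proteins reproduces the 22 inaccessible proteins mentioned in the text.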
Step 4: Development of SRM Assays and a Complete
Human Peptide Library
To generate SRM assays, the peptides selected above were
chemically synthesized and used to generate fragment ion
spectra that were processed into consensus spectra and ultimately SRM assays.
Generation of Fragment Ion Spectra

The 166,174 selected peptide sequences were individually
chemically synthesized and used to generate high-resolution,
high-mass accuracy reference fragment ion spectra. To process
the large number of peptides, we established an assay development pipeline including a robotics platform and multiple
commonly used MS instruments duplicated at two geographical
sites. Pools of 96 peptides each were analyzed on a quadrupole
TOF MS (Agilent 6530 Q-TOF) in a data-directed approach using
exclusive lists based on the expected charge state of a peptide
as guidance for precursor selection. To increase the robustness
of the fragment ion spectra, we implemented a data acquisition
strategy in which each precursor was fragmented exclusively at
five different collision energies (CEs), and at least five MS/MS
spectra per CE were recorded (Figure 1, step 4). The simultaneous acquisition of multiple CEs obviates the need for subsequent CE optimization, a time-consuming aspect in the process
of developing SRM assays. A set of peptides was used for strict
retention time (RT) standardization across multiple MS instruments and to provide a catalog with observed RTs and iRT
values to enable multiplexed SRM analysis (Figure 1; Figure S1).
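Retention time standardization with a set of reference peptides is commonly done by fitting a linear map from observed RT to a normalized scale such as iRT. The sketch below shows that idea with an ordinary least-squares fit; the calibration procedure of the pipeline is not described here, so the function names and the use of a plain linear fit are assumptions.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("calibration retention times must differ")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def make_rt_to_irt(observed_rt, reference_irt):
    """Build a converter from instrument RT (min) to the normalized iRT scale."""
    slope, intercept = fit_line(observed_rt, reference_irt)
    return lambda rt: slope * rt + intercept
```

With hypothetical standards observed at 10, 20, and 30 min and assigned iRT values of 0, 50, and 100, `make_rt_to_irt([10, 20, 30], [0, 50, 100])(25)` maps a peptide eluting at 25 min to an iRT of 75. Fitting one converter per instrument puts all observed RTs on a shared scale.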
Generation of SRM Assays
To convert the fragment ion spectra into SRM assays, we subsequently generated consensus spectra. 3,250,015 spectra from
the 6530 Q-TOF (base CE only) were confidently assigned to
149,265 peptide sequences out of the 166,174 synthesized
peptides (89.8%). The five CE events for each peptide and their
high-quality PSMs provided 14,970,896 spectra for use in monitoring differential fragmentation at multiple low and high CE
values. We then generated consensus spectral libraries from
each CE event to provide plots for every peptide and charge
state that visualize optimal CE values for each individual fragment ion. The base CE, i.e., the CE value calculated from the
default CE versus precursor ion mass function, provided the
highest abundance signal for the majority of fragment ions. However, for some fragments the selection of a lower or higher CE
than the calculated base CE resulted in increased fragment ion
signal intensities.
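The per-fragment CE optimization described above amounts to comparing each fragment ion's intensity across the five CE events and keeping the setting with the strongest signal. A minimal sketch follows; the linear coefficients and the CE offsets are placeholder assumptions, since the text states only that a default CE versus precursor mass function and five CE values were used.

```python
def base_ce(precursor_mz, slope=0.036, intercept=-4.8):
    """Linear CE-versus-m/z function; the coefficients are placeholders."""
    return slope * precursor_mz + intercept


def ce_ladder(precursor_mz, offsets=(-6.0, -3.0, 0.0, 3.0, 6.0)):
    """Five collision energies around the base CE (offsets are hypothetical)."""
    base = base_ce(precursor_mz)
    return [round(base + offset, 2) for offset in offsets]


def best_ce_per_fragment(intensity_by_ce):
    """Map each fragment ion to the CE offset that gave its highest intensity.

    intensity_by_ce: {fragment_label: {ce_offset: intensity}}
    """
    return {
        fragment: max(by_ce, key=by_ce.get)
        for fragment, by_ce in intensity_by_ce.items()
    }
```

For a hypothetical peptide whose y7 ion peaks at the base CE while its b3 ion is strongest at the lower setting, `best_ce_per_fragment` returns offset 0 for y7 and the negative offset for b3, mirroring the observation that some fragments benefit from a CE below or above the calculated base value.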
Next, we extracted from the 6530 Q-TOF base CE spectral
library for each peptide and charge state SRM assay coordinates
to acquire the peptides' chromatographic traces on a triple
quadrupole MS (Agilent 6460 QQQ). Fragment ions with the
highest signal intensities and above the precursor m/z were preferably selected to obtain assays with optimal sensitivity and
selectivity. SRM chromatographic traces were successfully acquired for 126,712 peptides corresponding to a success rate of
84.9% based on the 6530 Q-TOF verified peptides that served
as input to generate these SRM assays.
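The transition selection rule stated above (highest intensity, preferably above the precursor m/z) can be sketched as a two-tier ranking. The tuple layout, the default of three transitions, and the fallback to fragments below the precursor m/z are assumptions for illustration only.

```python
def select_transitions(fragments, precursor_mz, n=3):
    """Pick n fragment ions for an SRM assay from a consensus spectrum.

    fragments: list of (label, mz, intensity) tuples.
    Fragments above the precursor m/z are preferred; within each tier the
    most intense ions come first. Ions at or below the precursor m/z are
    used only to fill remaining slots.
    """
    by_intensity = sorted(fragments, key=lambda f: f[2], reverse=True)
    above = [f for f in by_intensity if f[1] > precursor_mz]
    below = [f for f in by_intensity if f[1] <= precursor_mz]
    return (above + below)[:n]


# Hypothetical consensus spectrum for a doubly charged precursor at m/z 600.3.
example = [
    ("y8", 900.4, 500.0),
    ("b3", 300.2, 900.0),
    ("y6", 700.3, 800.0),
    ("y4", 450.2, 950.0),
    ("y9", 1000.5, 300.0),
]
top3 = [label for label, _, _ in select_transitions(example, 600.3)]
```

Fragments above the precursor m/z tend to suffer less interference from co-eluting singly charged species, which is why they are ranked first here even when a lower-mass ion is more intense.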
In addition, we determined the SRM signatures for all peptides
on a quadrupole-linear ion trap MS (Sciex QTrap 5500) instrument by acquiring SRM traces and full MS/MS spectra upon
the detection of a transition (Figure 1, step 4). We generated a
QTrap 5500 spectral library from 1,789,651 high-quality PSMs
that were assigned to 149,961 peptide sequences (90.2% of
the selected peptides).
SRM Assay Success
Figure 3. SRM Assay Success
(A) Number of developed SRM assays per instrument type in comparison to the number of synthesized peptides. The 6530 Q-TOF-extracted coordinates served
as input for the 6460 QQQ-derived SRM assays with a success of 84.9%. 6530 Q-TOF and QTrap 5500 combined result in 158,015 targeted assays constituting
95.1% of the selected peptides.
(B–G) Selected peptides (red) and their assay success rate (blue) in percentages are displayed by (B) peptide length, (C) expected charge state, (D) hydrophobicity
as SSRCalc value, (E) amino acid, (F) N-terminal amino acid, and (G) C-terminal amino acid.
See also Figure S2.

Whereas excellent peptide recoveries were achieved with each
quadrupole type instrument, the combined recovery exceeded
the results from each instrument type. In total, the recovery
yielded 158,015 peptides with verified fragment ion spectra
and SRM assay coordinates corresponding to 95.1% of all
selected peptides (Figure 3A). Peptides of seven to 20 amino
acids length constitute 91.4% of the Human SRMAtlas and
were identified with a 96% success rate, while peptides with
21–30 amino acids resulted in an 83% success rate in qualifying
fragmentation spectra. Peptides with an expected precursor
charge state (z) of 2 (61.3%), 3 (28.5%), and 4 (6.3%) were preferably selected and performed generally better compared to a
small number of peptides that fragmented with z = 1 (C-terminal
peptides) or z ≥ 5 (long peptides with several basic residues).
Further, we found that peptides with an SSRCalc value of seven
to 46 performed best and that cysteine-containing peptides
showed a decreased success rate compared to all other peptide
sequences (Figures 3B–3G; Figure S2).
Next, we reassessed the protein coverage achieved by the
158,015 successfully developed SRM assays, taking into account that some peptides failed to result in the correct synthesis
product or to fragment with sufficient quality. The generated assays covered 99.7% of the predicted human proteome with at
least one SRM assay per protein and provide a minimum of three
assays for 19,337 (95.4%) of all UniProtKB/Swiss-Prot annotated human proteins (Figure 4). We were able to develop a minimum of four assays for 91.5% and at least five assays for 85.3%
of the proteome, respectively. Taken together, on average, each
protein of the human proteome is represented by eight SRM
assays, and some by more than 25 peptides. The
assessment of the SRM assay chromatographic performance
utilizing the 6460 QQQs was similarly successful. For 98.9% of the predicted human proteome, we were able to acquire high-quality
SRM traces with at least one peptide per protein, while 90.3%
of the proteome is represented by three SRM assays.
During the course of the project, updated versions of the
UniProtKB/Swiss-Prot database were released. To account for
new protein entries, we developed 443 additional SRM assays
for 162 entries and included these assays in our database to
provide SRM assays for updated human reference proteomes
2014 (20,193 proteins) and 2015 (20,203 proteins) (Figure S3;
Table S4). Overall, we have successfully developed 158,015
mass spectrometric assays based on high-quality MS/MS
spectra and subsequent QQQ deployment with the use of
166,174 chemically synthesized peptides to reliably identify
and quantify essentially any human protein. The database of
SRM assays can be adapted to changes in genome annotation
with modest effort.

Figure 4. SRM Assay Coverage in the Human SRMAtlas
Assay coverage by peptides per protein and instrument is displayed in green
shades; selected peptides are shown in gray. 158,015 successfully developed
assays represent 99.7% (20,225 proteins) of the human proteome (dark
green). 95.4% of the human proteome is represented by at least three assays. 22
proteins are inaccessible.
Assessing the Peptide Selection Success in the Context
of Recent Public Data
Recent technical advances in high-resolution MS and efforts to
discover complete proteomes of mammalian cells and tissues
have led to a substantial increase in discovery proteomics
data. The state of the human proteome as viewed through
PeptideAtlas in 2015 (Deutsch et al., 2015b), which incorporates
data from large-scale proteomic measurements, reports 133
million high-quality PSMs, identifying more than 1 million distinct
peptides that collectively represent 14,070 (70%) confidently
identified human proteins, 5% ambiguous and 9% redundant
detections, leaving 16% (3,166 proteins) undetected. Given a
large number of peptides discovered since the peptide selection
for the SRMAtlas was performed, we retrospectively investigated the success of selecting suitable peptides that were
observed in the recent PeptideAtlas and were not available in
the initial selection database. We found that 84% of the newly
observed peptides that fulfill the selection criteria described
above were selected for the comprehensive human peptide
SRM assay development. Further, we ranked all observed peptides based on spectral count and determined that we selected
85% of the most abundant peptides by using our predictive algorithm, indicating the robustness of the computational peptide
selection algorithms used.
Step 5: Data Access through the Human SRMAtlas
Resource
With the intent to facilitate life science research, we developed
SRMAtlas (http://www.srmatlas.org/), a freely available resource
providing unlimited access to this unique compendium of targeted assays (Figure 1, step 5). A web interface allows researchers to query assays for their targets of interest. The query
returns verified assays including peptide sequence, precursor,
and fragment ions with their charge states, fragment ion rank order, collision energy for different MS instruments, retention time,
hydrophobicity, and peptide uniqueness within the annotated
human proteome. All MS/MS spectra and SRM chromatograms
are displayed together with collision energy plots for optimal CE
selection. We provide various assay download options such as
instrument specific transition lists for immediate import to the
MS method and subsequent acquisition. Default query settings
are provided for ease of use and all queries can be customized.
For workflows including the quantification with labeled standards, we implemented queries for transitions of the light endogenous peptide and the heavy isotope-labeled analog. The result
page in the human SRMAtlas not only reports verified assay
coordinates but also integrates with external knowledge bases
including neXtProt, PeptideAtlas, the Human Protein Atlas,
Pathway Commons, and SRMCollider offering comprehensive
information on a protein of interest.
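To show how light and heavy transition coordinates relate, the sketch below derives the heavy m/z from the light m/z for a peptide carrying a labeled C-terminal residue. The text does not specify which isotope labels the queries assume; the mass shifts used here are those of the widely used 13C6,15N2-lysine and 13C6,15N4-arginine labels, and the function name is hypothetical.

```python
# Mass shifts in Da for common C-terminal heavy labels
# (13C6,15N2-Lys and 13C6,15N4-Arg); assumed for this example.
HEAVY_SHIFT = {"K": 8.014199, "R": 10.008269}


def heavy_mz(light_mz, charge, peptide, ion_type="precursor"):
    """Return the m/z of the heavy analog of a precursor, y ion, or b ion."""
    if ion_type not in ("precursor", "y", "b"):
        raise ValueError(f"unsupported ion type: {ion_type}")
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    c_term = peptide[-1]
    if c_term not in HEAVY_SHIFT:
        raise ValueError("peptide must end in K or R for these labels")
    if ion_type == "b":
        # b ions are N-terminal fragments and do not contain the labeled residue.
        return light_mz
    return light_mz + HEAVY_SHIFT[c_term] / charge
```

For a doubly charged light precursor at m/z 500.0 ending in lysine, the heavy precursor is expected near m/z 504.007, while b-ion transitions are shared between the light and heavy forms.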
SRMAtlas Application
To demonstrate the utility of the SRMAtlas resource, we carried
out two studies. In study 1, we chose cellular cholesterol
regulation as an example of a clinically relevant pathway that
can be perturbed using drugs. The transcription factor SREBP2
induces expression of genes in the cholesterol biosynthesis
pathway if the endogenous levels of cholesterol are depleted
by inhibition with statins (Goldstein and Brown, 2015). A drug-induced gene module enriched for SREBP target genes was
identified that also contained unknown targets of this pathway
(Iskar et al., 2013). Hence, the objective of our test case was
2-fold: (1) to perform, using the SRMAtlas as a resource, a systematic proteomic quantification of enzymes in the cholesterol
synthesis pathway upon drug treatment, and (2) to assess if
the regulation of putative SREBP target genes identified in Iskar
et al. (2013) translates to the protein level upon classical perturbation of the SREBP pathway with statin.
To test for differential drug-induced regulation of protein
levels, we selected the two liver cell lines Huh7 and HepG2,
treated them with lipoprotein-deficient serum (LPDS) and atorvastatin and subsequently quantified the relative abundance of
target proteins using the assays obtained from the SRMAtlas.
We targeted 64 proteins with the SRMAtlas assays and unequivocally quantified one to three proteotypic peptides for 33
proteins (74 peptides total) in unfractionated total cell lysate of
Huh7 and HepG2 cells. After perturbation with LPDS and atorvastatin, 32 out of the 33 proteins showed regulation (Figure 5A;
Table S5). This included detecting peptides for 18 out of the 22
enzymes in the cholesterol synthesis pathway (Figure 5B).
All enzymes of the cholesterol synthesis pathway, except
LBR, increased their abundance upon stimulation of
SREBP2. HMGCR, FDFT1, and DHCR7 showed the strongest
response with an up to 16-fold increase in abundance. The
measured SRM chromatograms of peptide TQNLPNCQLISR
and LFSASEFEDPLVGEDTER from protein FDFT1 show the difference in signal abundance between treated and untreated cells
as representative examples (Figure 5C). The absence of regulation of LBR confirmed a previous report showing no increased
LBR expression in HepG2 cells upon SREBP activation (Bennati
et al., 2006), and LBR was also not present in the co-regulated
module (Iskar et al., 2013). In addition to the proteins in the
cholesterol synthesis pathway, the additional 14 proteins that
were part of the targeted gene module and present in other
cellular pathways also changed their expression substantially.
Most of these proteins are not established SREBP targets, and
this represents therefore an important confirmation of their regulation downstream of SREBP. Interestingly, the response in
Huh7 and HepG2 cells differed for some proteins, and thus the
proteins could be divided into clusters based on their response
to the drug treatment (Figure 5A). Cluster I consisted of proteins
that were increased similarly in both cell lines and contained
most of the enzymes in the cholesterol synthesis pathway. The
proteins in cluster II were mostly regulated in HepG2 cells and
contained proteins present in the mevalonate pathway, the first
part of the cholesterol biosynthesis. Hence, using the SRMAtlas
it was possible to efficiently profile the changes in protein abundance along a whole pathway, to confirm the co-regulation of
novel putative SREBP target genes and to examine the relationship of protein regulation in different pathways.
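The abundance changes reported for this study are log2 fold changes of treated versus control samples, with housekeeping proteins used for normalization (Figure 5). A minimal sketch of that calculation is given below; dividing by the mean housekeeping signal is an assumed normalization scheme, and the function names are illustrative.

```python
import math


def normalized(intensity, housekeeping):
    """Scale a peptide intensity by the mean housekeeping signal of its sample."""
    return intensity / (sum(housekeeping) / len(housekeeping))


def log2_fold_change(treated, control, hk_treated, hk_control):
    """log2 ratio of normalized treated versus normalized control intensity."""
    ratio = normalized(treated, hk_treated) / normalized(control, hk_control)
    return math.log2(ratio)
```

With equal housekeeping signals in both samples, a peptide rising from 100 to 1,600 intensity units gives a log2 fold change of 4, which corresponds to the 16-fold increase quoted for the most strongly responding enzymes.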
In study 2, we measured the effect of docetaxel treatment on
three differentially responsive prostate cancer cell lines, LNCaP,
DU145, and PC3. To select target proteins, we first determined a
transcriptional time course response by microarray analysis.
Prostate cancer is a leading cause of mortality in males. While
many men present with localized and curable disease, a large
number of deaths are driven by the development of metastatic
prostate cancer and few curative options. Docetaxel is the
first-line drug treatment for metastatic castrate-resistant prostate
cancer, but 50% of patients develop resistance (Antonarakis
and Armstrong, 2011). Docetaxel acts mainly through the significant uptake in cells and the inhibition of microtubule function
leading to mitotic arrest in the cell-cycle and cell death. While
the main action of taxanes like docetaxel is in overall cell cycle
arrest, there are many unknown activities of the drug that
contribute to its antitumor effects.
Performing microarray analysis of mRNA transcripts, we identified a dysregulated network of genes associated with docetaxel perturbation of the cell cycle and used SRMAtlas assays
to target the corresponding protein products at four time points
post-treatment (0–72 hr) and in untreated controls to investigate comparative proteome and mRNA transcript abundance
changes over time in each of these three cell lines. We targeted
36 proteins spanning nuclear proteins, cytosolic and membrane
proteins with SRMAtlas assays, and unambiguously quantified
33 proteins with one to four proteotypic peptides each (87 peptides total, HIST2H2AC through shared peptides) in unfractionated total cell lysates of these prostate cancer cell lines along
the time course. The combined mRNA and protein
abundance data were overall in agreement with each other,
showing larger abundance changes at 48 and 72 hr compared
to 8 and 24 hr but also highlighted the differences of the three
cell lines and some discordance for mRNA and protein abundance involved in the cell-cycle response (Figure 6; Table S5).
Cluster analysis also highlighted differences in mRNA and protein abundance especially for later time points (Figure S4). These
include concordantly decreasing abundance in proteins involved
in cell cycle, DNA repair, and nucleotide base synthesis (RRM2,
RFC3, TYMS, MCMs, and UBE2C). Discordance between
mRNA and protein abundance was observed for scaffolding
and structural stabilization proteins (KIF23 and NUSAP1). These
results detail regulation within the cell to stabilize cells undergoing stress and the concordant reduction in proteins involved in
normal cell-cycle action. The difference in timescales shows
the effect in transcript abundance occurring first, followed by
reduction in protein abundance as would be expected in normal
transcription/translation timescales. Of note, the discordance
of the Kinesin KIF23 abundance at the protein level for DU145
and PC3 is in agreement with previous studies detailing kinesin
overexpression and increased resistance to docetaxel in breast
cancer (De et al., 2009) and glioma cells (Takahashi et al., 2012).
Kinesins are key components in spindle movements during the
cell cycle and can presumably meliorate the action of docetaxel.
Inhibition of the kinesin complex or key members may interfere
with the resistance mechanism to docetaxel and highlight a
possible avenue for therapeutic intervention. Additionally, with
the deployment of SRM assays for KIF23, this assay could be
developed further to provide a new prognostic and diagnostic
marker of therapeutic resistance. This analysis demonstrates
the ease of rapidly deploying targeted quantitative assays to
study discrete sets of proteins without relying on antibodies and their
vagaries, an approach that is both laborious and costly.

Figure 5. Drug-Induced Inhibition of Cholesterol Synthesis
Systematic proteomic quantification of proteins from a reported gene module and enzymes in the cholesterol synthesis pathway upon drug treatment.
(A) The heatmap shows the change in protein abundance following the treatment with lipoprotein-deficient serum (LPDS) and atorvastatin compared to control
conditions (untreated and 0.01% DMSO) of the same cell line. The signal represents the mean result from three independent biological experiments and three
SRM analyses per sample. Proteins were hierarchically clustered according to the elicited response with the Ward2 algorithm and Euclidean distance. Based on
the clustering tree and the protein regulation, we defined five different clusters of proteins showing similar response (I–V). Proteins marked with asterisks were
included as housekeeping proteins for normalization.
(B) Shown are the measured proteins from the pathway synthesizing cholesterol from acetyl-CoA. The enzymes are sorted by their position in the pathway, and
the proteins are color coded according to the cluster in (A) they belong to (I)–(IV). The proteins or metabolites in italics have not been measured and are included for
completeness. The inhibition of HMGCR by atorvastatin and the negative feedback of SREBP2 are depicted.
(C) SRM chromatograms of peptides TQNLPNC[160]QLISR and LFSASEFEDPLVGEDTER from protein FDFT1 showing the difference in signal abundance between untreated cells and cells treated with LPDS + 5 µM atorvastatin as representative examples. The lower signal in untreated cells is magnified in the inset.
See also Table S5.

Figure 6. Network of Proteins Associated with Docetaxel Perturbation of the Cell Cycle
SRM-based quantification of a protein network in three prostate cancer cell lines, DU145, LNCaP, and PC-3, at four time points post-treatment with docetaxel
and in untreated controls in comparison to mRNA abundance changes. The microarray-derived functional network was visualized with Ingenuity Pathway
Analysis (IPA). The structure of the network is based on the IPA Core Analysis-, STRING-, and Pathway Commons-derived direct interactions and indirect relationships. Each heatmap visualizes the log2 abundance change of treated versus control cells for each time point in each cell line at the transcript (mRNA) and protein
(SRM) level. The signal represents the mean result from two technical replicates at the transcript level and three SRM analyses per sample.
DISCUSSION
We generated high-quality fragmentation spectra and verified
SRM assays for all accessible proteins described in the UniProtKB/Swiss-Prot database, a selection of spliced variants,
non-synonymous SNPs, and N-glycosylated proteins. This
should enable reliable and reproducible quantification of all annotated proteins in the human proteome.
From a systems perspective, biological processes constitute
networks of interacting molecules, and changes in network state
are informative about the biochemical state of the cell. To determine
the state of a network and to compare network states between samples, the parts of a network need to be consistently
detected in sample cohorts and quantified, at least at the level
of relative quantification. An incomplete parts list may not explain
observed processes. The Human SRMAtlas assays provide the
tools to reliably navigate predicted or experimentally derived
protein networks, to rapidly probe promising interaction partners
and to perform relative or, in the presence of isotope-labeled
reference peptides, absolute quantification and to thus provide
new insight into complex biological mechanisms.
The developed assays can be universally applied in different
sample types to target any protein of interest and its abundance
changes. The assays are beneficial in hypothesis-driven experiments focusing on a relatively small number of proteins, e.g.,
those that carry out a specific function or to probe a mechanism
but also in large association studies to monitor protein panels
across individuals. We demonstrated the utility of the resource
in two studies and a multitude of different applications can be envisioned. Specific enzyme classes and signaling pathways can
be studied. Proteins operating cooperatively in networks can
be measured across many samples to define, for instance, their
dynamic spatiotemporal changes in order to help identify the
current health status or disease state. The assays can be used
to investigate the response to system perturbations, enriched
sub-proteomes, to confirm protein interaction partners or to
analyze protein complexes to determine their stoichiometry.
The SRM assays are particularly useful in a clinical setting where
large volumes of genomic data suggest aberrations or dysregulation in biochemical pathways, thus providing functional hypotheses that are testable by the SRMAtlas resource. Current
biomarker verification strategies rely primarily on the development of antibodies for western blots and ELISA tests, a time-intensive and expensive process that limits the verification
of many candidate markers. Our assay resource can close
this gap by allowing immediate assay implementation and
subsequent high-throughput and cost-efficient verification of
numerous markers in large cohorts of patient samples. We
developed molecular assays with high specificity for the entire
proteome as an emerging alternative and complement to conventional antibody screenings. While it can be challenging to
verify the specificity of an antibody in discriminating antigenic
variants, SRM provides high specificity by molecular determination through the proteotypicity of the selected peptides and
several independent assays per protein.
N-glycosylated proteins are secreted or located on the cell surface and constitute a clinically interesting group of proteins that is
investigated for biomarkers and drug targets. We developed assays to target N-glycosylated proteins by specifically selecting
peptides that span the N-glycosylated sequence motif for use in
studies involving N-glycosylation affinity approaches. SNPs have
primarily been investigated in genome-wide association studies
to classify different traits and gained interest as markers to diagnose diseases and predict drug response. To investigate SNPs
resulting in non-synonymous sequence variants, we developed
assays for the most frequent mutations, which typically result in
protein malfunction and may expose an altered phenotype.
Although the detection of target peptides by SRM is highly
sensitive and specific, it can be challenging due to the
complexity and large dynamic range in the human proteome,
potentially resulting in interference of peptides with similar
mass and chromatographic properties. Such interference can
lead to a failure in detecting the target peptide. The SRM assays
can be deployed in any sample, but the occurrence of potentially
interfering transitions needs to be assessed in each experiment
to minimize false positives and imprecise quantification. Depending on the scope of a study, different levels of assay validation are recommended (Carr et al., 2014). While SRMAtlas
assays are robust and powerful to assess the protein abundance
in biological samples for research purposes, assays intended to
support clinical decisions would require further clinical validation
for robustness in large patient cohorts. Currently, the technique
cannot attain the entire range of proteins in higher eukaryotes;
therefore, optimized protein extraction, enrichment, and fractionation can further improve sensitivity.
While we present a truly comprehensive and highly characterized resource of assays for all human proteins, their variants and
some post-translational modifications, the human proteome is
complex, and the coverage by empirically observed peptides is
evolving and by no means complete. We provide very high
coverage (>99%) at the protein level, but the protein sequence
coverage by peptides, in their native and post-translational modified form, will increase as more data become available. Our established pipeline will allow us to develop SRM assays for newly
identified peptides and different post-translational modifications
in future efforts with ease. The human SRMAtlas is based on the
UniProtKB/Swiss-Prot database 2010 and 2015, but, as the
proteome continues to be refined, new assays can be added.
The human SRMAtlas not only facilitates the reliable identification of proteins, but also provides coordinates for the reproducible quantification of analytes. Absolute quantification of
peptides can be accomplished by adding known amounts of stable isotope-labeled standards. Label-free quantification can also
be performed by SRM but is only accurate for relative quantification given there is no standardization anchor point. Alternatively,
absolute label-free quantification based on few anchor points
can be pursued (Ludwig et al., 2012). Currently, the resource
does not provide experimentally determined limits of detection
(LOD) and limits of quantification (LOQ) for every SRM assay
as these parameters strongly depend on the individual setting
in which the assay is deployed (e.g., cell lysate versus plasma,
type of sample preparation, chromatographic performance of
LC system). In order to obtain the most accurate LOD/LOQ,
these values need to be determined in the individual sample
type in the context of each study. We introduced PASSEL (Farrah
et al., 2012) as a repository for SRM data to share deployed
assays and their performance in different matrices and similar
databases followed (Sharma et al., 2014; Whiteaker et al.,
2014). While SRM assays developed in different laboratories
may be available as part of a publication or through these repositories, the information is scattered, time-consuming to gather,
and currently limited to a small number of proteins (<1,000). In
contrast, the human SRMAtlas provides high-quality SRM assays including their spectral libraries for the entire predicted human proteome developed in a consistent manner. These assays
are generic and are not based on any sample type or biological
context but have proved to work in a number of settings (cell
lines, plasma, urine protein digests, etc.).
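Absolute quantification with a stable isotope-labeled standard reduces to scaling the known spiked amount by the light-to-heavy signal ratio. The sketch below takes the median ratio over several transitions; the use of the median, the femtomole unit, and the function names are assumptions for illustration rather than a prescribed workflow.

```python
from statistics import median


def light_to_heavy_ratio(light_areas, heavy_areas):
    """Median of per-transition light/heavy peak-area ratios."""
    ratios = [
        light / heavy
        for light, heavy in zip(light_areas, heavy_areas)
        if heavy > 0
    ]
    if not ratios:
        raise ValueError("no transition with a usable heavy signal")
    return median(ratios)


def absolute_amount(light_areas, heavy_areas, spiked_heavy_fmol):
    """Estimate the endogenous peptide amount from a known heavy spike-in."""
    return light_to_heavy_ratio(light_areas, heavy_areas) * spiked_heavy_fmol
```

For hypothetical peak areas of 2000, 4100, and 990 (light) against 1000, 2000, and 500 (heavy) with 50 fmol of heavy peptide spiked in, the median ratio is 2.0 and the estimate is 100 fmol. Taking the median limits the influence of a single transition affected by interference.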
The SRM assays presented here also provide a unique
resource for efforts seeking to provide protein-level evidence of
any human protein either previously observed or never observed
to date to advance our knowledge in human biology and complex diseases. The high-resolution, high-mass accuracy MS/
MS spectra generated from synthetic peptides constitute a
gold-standard fragmentation database that can provide additional confidence by spectral comparison with MS/MS spectra
derived from discovery proteomic studies. Recently, the concept
of SWATH-MS was introduced (Gillet et al., 2012), a data-independent MS technique that is less sensitive than SRM but
capable of generating a digital map of a large fraction of a proteome. The analysis of SWATH data requires information from
high-quality fragment ion spectral libraries and assays such as
the ones developed here to mine these complex fragment ion
maps.
In conclusion, the human SRMAtlas provides verified MS assays based on SRM technology developed in a uniform and
consistent process for essentially every protein of the human proteome. These assays can be rapidly deployed in systems biology
and biomedical studies to identify and quantify any human protein with high sensitivity and high selectivity, and to navigate complete proteome maps to understand their biological functions.
Peptide Selection
For every human protein in UniProtKB/Swiss-Prot release 2010-05, a set of
PTPs was selected by mining PeptideAtlas (build 2010-05 internal) or by bioinformatic prediction. The selection of peptides for new protein entries in
release 2014-11 and 2015-08 was based on PeptideAtlas build 2014-08.
Each peptide obtained a suitability score using the PABST algorithm. Selection
criteria include the following: fully tryptic, 7-30 amino acids, SSRCalc 10-46,
expected z of 2 to 4, and unique within the human proteome. Peptide selection
for spliced isoforms was based on UniProtKB/Swiss-Prot Varsplic (2010-06),
for SNPs on the subset of NCBI dbSNP (build 131) entries annotated in the
UniProt feature tables.
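The selection criteria listed above can be sketched as a simple filter. This is a minimal illustration, not the PABST implementation: the SSRCalc value and the expected charge are assumed to be precomputed and passed in, trypsin is modeled as cleaving after K or R with the proline rule ignored, and all function names are hypothetical.

```python
# Thresholds as stated in the selection criteria.
MIN_LEN, MAX_LEN = 7, 30
MIN_SSRCALC, MAX_SSRCALC = 10.0, 46.0
ALLOWED_CHARGES = {2, 3, 4}


def is_fully_tryptic(peptide, protein):
    """True if both peptide termini coincide with tryptic cleavage sites.

    Simplification: cleavage after K or R; only the first occurrence of
    the peptide in the protein is checked.
    """
    start = protein.find(peptide)
    if start < 0:
        return False
    end = start + len(peptide)
    n_term_ok = start == 0 or protein[start - 1] in "KR"
    c_term_ok = end == len(protein) or peptide[-1] in "KR"
    return n_term_ok and c_term_ok


def passes_selection(peptide, protein, ssrcalc, charge, proteome):
    """Apply the length, tryptic, SSRCalc, charge, and uniqueness criteria."""
    # Unique means the peptide maps to exactly one sequence in the proteome.
    unique = sum(peptide in seq for seq in proteome) == 1
    return (
        MIN_LEN <= len(peptide) <= MAX_LEN
        and is_fully_tryptic(peptide, protein)
        and MIN_SSRCALC <= ssrcalc <= MAX_SSRCALC
        and charge in ALLOWED_CHARGES
        and unique
    )
```

In practice, each peptide that passes such a filter would still be ranked by its suitability score before final selection.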
Peptide Synthesis
Peptides were synthesized via solid phase (Thermo-Fisher) or SPOT synthesis
(JPT Peptide Technologies). 96 peptides were pooled and subjected to MS
analysis.
MS Analysis
Peptides were analyzed on a G6530A nano HPLC Chip Cube Q-TOF LC-MS
system (Agilent Technologies). Spectra were acquired in a data-directed
approach using an exclusive precursor selection Auto MS/MS mode. Each
m/z was fragmented over a wide range of CEs. Further, peptides were
analyzed on a QTrap 5500 LC-MS with a Nano Spray Source III and Tempo
nano MDLC (Sciex) in SRM mode, triggering the acquisition of a full MS/MS
spectrum upon the detection of a transition.
Data Analysis
MS/MS spectra were searched with X!Tandem and Mascot against an artificial
protein database consisting of the 166,174 selected peptides. The search results were validated with the TPP. Consensus spectral libraries were created
with SpectraST. For each peptide precursor, up to ten fragment ions from
the 6530 Q-TOF consensus spectrum were extracted for transition verification.
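The extraction of up to ten fragment ions per precursor can be sketched as follows. The text does not state the ranking criterion, so ranking by intensity is an assumption made here for illustration; the tuple layout and the function name are likewise hypothetical.

```python
def select_transitions(fragments, max_ions=10):
    """Return up to max_ions fragment ions from a consensus spectrum.

    fragments: list of (label, mz, intensity) tuples.
    Assumption: ions are ranked by descending intensity; the paper only
    specifies the upper limit of ten fragment ions per precursor.
    """
    ranked = sorted(fragments, key=lambda frag: frag[2], reverse=True)
    return ranked[:max_ions]
```

A precursor with fewer than ten annotated fragments simply yields all of them, which matches the "up to ten" wording.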
SRM
Peptides were analyzed with the selected transitions on a G6460A nano HPLC
Chip Cube QQQ LC-MS (Agilent Technologies). Data were acquired in
dynamic MRM mode using the base CE.
AUTHOR CONTRIBUTIONS
R.L.M. and R.A. conceived the project. U.K., D.S.C., E.W.D., P.B., O.R., P.P.,
and R.L.M. designed and interpreted experiments. U.K., D.S.C., E.W.D.,
C.S.C., D.A.S., M.-Y.B., J. Slagel, Z.S., J. Stevens, B.G., D.S., M.R.H., P.B.,
A.V.R., C.C., C.-Y.H., M.K., and T.T. performed experiments and bioinformatics calculations. D.S.C., E.W.D., H.L., U.K., E.D., and C.S. carried out database design, integration, and deployment. U.K., D.S.C., R.A., and R.L.M.
wrote the manuscript. J.D.A., L.H., and all authors contributed to the preparation of the manuscript.
ACKNOWLEDGMENTS
This work was performed in part with federal funds from the American Recovery and Reinvestment Act (ARRA) funds through NIH, from the National
Human Genome Research Institute grant RC2HG005805 (to R.L.M.), the National Institute of General Medical Sciences under grant R01GM087221,
S10RR027584 and 2P50 GM076547/Center for Systems Biology (to
R.L.M.), the Luxembourg Centre for Systems Biomedicine/University
Luxembourg (to L.H.), the European Research Council grant ERC-2008-AdG 233226 and ERC-2014-AdG 670821 and the Swiss National Science
Foundation (grant #31003A-130530) (to R.A.), and DAAD (fellowship to
U.K.). We kindly thank K. Miller and C. Miller (Agilent Technologies), J. Louette, Drs. G. Sulyok and A. Schierholt (Thermo-Fisher), Dr. H. Wenschuh and
L. Eckler (JPT) for their support, Dr. S. Carr for early access to ESP predictor,
Drs. P. Gaudet and A. Bairoch for supporting the integration of neXtProt, and
T. Farrah, S-T. Kwok, A. Aksoy, and P. Shannon for excellent technical
support.
REFERENCES
Addona, T.A., Abbatiello, S.E., Schilling, B., Skates, S.J., Mani, D.R., Bunk,
D.M., Spiegelman, C.H., Zimmerman, L.J., Ham, A.-J.L., Keshishian, H.,
et al. (2009). Multi-site assessment of the precision and reproducibility of
multiple reaction monitoring-based measurements of proteins in plasma.
Anderson, N.L., Anderson, N.G., Haines, L.R., Hardie, D.B., Olafson, R.W., and
Pearson, T.W. (2004). Mass spectrometric quantitation of peptides and proteins using Stable Isotope Standards and Capture by Anti-Peptide Antibodies
(SISCAPA). J. Proteome Res. 3, 235–244.
Antonarakis, E.S., and Armstrong, A.J. (2011). Evolving standards in the treatment of docetaxel-refractory castration-resistant prostate cancer. Prostate
Cancer Prostatic Dis. 14, 192–205.
Bantscheff, M., Schirle, M., Sweetman, G., Rick, J., and Kuster, B. (2007).
Quantitative mass spectrometry in proteomics: a critical review. Anal. Bioanal.
Chem. 389, 1017–1031.
Resource
The human SRMAtlas is available at http://www.srmatlas.org/, SRMAtlas build
Complete Human SRMAtlas.
Baty, J.D.J., and Robinson, P.R. (1977). Single and multiple ion recording
techniques for the analysis of diphenylhydantoin and its major metabolite in
plasma. Biomed. Mass Spectrom. 4, 36–41.
ACCESSION NUMBERS
Bennati, A.M., Castelli, M., Della Fazia, M.A., Beccari, T., Caruso, D., Servillo,
G., and Roberti, R. (2006). Sterol dependent regulation of human TM7SF2
gene expression: Role of the encoded 3β-hydroxysterol Δ14-reductase in
human cholesterol biosynthesis. Biochim. Biophys. Acta 1761, 677–685.
The accession number for the prostate cancer cell line microarray data
reported in this paper is GEO: GSE83654.
four figures, and five tables and can be found with this article online at
Boutet, E., Lieberherr, D., Tognolli, M., Schneider, M., and Bairoch, A. (2007).
UniProtKB/Swiss-Prot. Methods Mol. Biol. 406, 89–112.
Carr, S.A., Abbatiello, S.E., Ackermann, B.L., Borchers, C., Domon, B.,
Deutsch, E.W., Grant, R.P., Hoofnagle, A.N., Huttenhain, R., Koomen, J.M.,
et al. (2014). Targeted peptide measurements in biology and medicine: best
practices for mass spectrometry-based assay development using a fit-for-purpose approach. Mol. Cell. Proteomics 13, 907–917.
Cell 166, 766–778, July 28, 2016
Craciun, F.L., Bijol, V., Ajay, A.K., Rao, P., Kumar, R.K., Hutchinson, J., Hofmann, O., Joshi, N., Luyendyk, J.P., Kusebauch, U., et al. (2016). RNA
sequencing identifies novel translational biomarkers of kidney fibrosis.
J. Am. Soc. Nephrol. 27, 1702–1713.
da Cunha, J.P.C., Galante, P.A.F., de Souza, J.E., de Souza, R.F., Carvalho,
P.M., Ohara, D.T., Moura, R.P., Oba-Shinja, S.M., Marie, S.K.N., Silva, W.A.,
Jr., et al. (2009). Bioinformatics construction of the human cell surfaceome.
De, S., Cipriano, R., Jackson, M.W., and Stark, G.R. (2009). Overexpression of
kinesins mediates docetaxel resistance in breast cancer cells. Cancer Res. 69,
8035–8042.
Desiere, F., Deutsch, E.W., Nesvizhskii, A.I., Mallick, P., King, N.L., Eng, J.K.,
Aderem, A., Boyle, R., Brunner, E., Donohoe, S., et al. (2005). Integration with
the human genome of peptide sequences obtained by high-throughput mass
spectrometry. Genome Biol. 6, R9.
Deutsch, E.W., Mendoza, L., Shteynberg, D., Slagel, J., Sun, Z., and Moritz,
R.L. (2015a). Trans-Proteomic Pipeline, a standardized data processing pipeline for large-scale reproducible proteomics informatics. Proteomics Clin.
Appl. 9, 745–754.
Deutsch, E.W., Sun, Z., Campbell, D., Kusebauch, U., Chu, C.S., Mendoza, L.,
Shteynberg, D., Omenn, G.S., and Moritz, R.L. (2015b). State of the human
proteome in 2014/2015 as viewed through PeptideAtlas: enhancing accuracy
and coverage through the AtlasProphet. J. Proteome Res. 14, 3461–3473.
Ebhardt, H.A., Root, A., Sander, C., and Aebersold, R. (2015). Applications of
targeted proteomics in systems biology and translational medicine. Proteomics 15, 3193–3208.
Edwards, A.M., Isserlin, R., Bader, G.D., Frye, S.V., Willson, T.M., and Yu, F.H.
(2011). Too many roads not taken. Nature 470, 163–165.
Fagerberg, L., Jonasson, K., von Heijne, G., Uhlen, M., and Berglund, L. (2010).
Prediction of the human membrane proteome. Proteomics 10, 1141–1149.
Farrah, T., Deutsch, E.W., Kreisberg, R., Sun, Z., Campbell, D.S., Mendoza, L.,
Kusebauch, U., Brusniak, M.-Y., Huttenhain, R., Schiess, R., et al. (2012).
PASSEL: the PeptideAtlas SRMexperiment library. Proteomics 12, 1170–1175.
Gillet, L.C., Navarro, P., Tate, S., Rost, H., Selevsek, N., Reiter, L., Bonner, R.,
and Aebersold, R. (2012). Targeted data extraction of the MS/MS spectra
generated by data-independent acquisition: a new concept for consistent
and accurate proteome analysis. Mol. Cell. Proteomics 11, O111.016717.
Gillette, M.A., and Carr, S.A. (2013). Quantitative analysis of peptides and proteins in biomedicine by targeted mass spectrometry. Nat. Methods 10, 28–34.
Karlsson, C., Malmstrom, L., Aebersold, R., and Malmstrom, J. (2012). Proteome-wide selected reaction monitoring assays for the human pathogen
Streptococcus pyogenes. Nat. Commun. 3, 1301.
Kennedy, J.J., Abbatiello, S.E., Kim, K., Yan, P., Whiteaker, J.R., Lin, C., Kim,
J.S., Zhang, Y., Wang, X., Ivey, R.G., et al. (2014). Demonstrating the feasibility
of large-scale development of standardized assays to quantify human proteins. Nat. Methods 11, 149–155.
Kuster, B., Schirle, M., Mallick, P., and Aebersold, R. (2005). Scoring proteomes with proteotypic peptide probes. Nat. Rev. Mol. Cell Biol. 6, 577–583.
Ludwig, C., Claassen, M., Schmidt, A., and Aebersold, R. (2012). Estimation of
absolute protein quantities of unlabeled samples by selected reaction monitoring mass spectrometry. Mol. Cell. Proteomics 11, 013987.
Mallick, P., Schirle, M., Chen, S.S., Flory, M.R., Lee, H., Martin, D., Ranish, J.,
Raught, B., Schmitt, R., Werner, T., et al. (2007). Computational prediction
of proteotypic peptides for quantitative proteomics. Nat. Biotechnol. 25,
125–131.
Mann, M., Kulak, N.A., Nagaraj, N., and Cox, J. (2013). The coming age of complete, accurate, and ubiquitous proteomes. Mol. Cell 49, 583–590.
Picotti, P., and Aebersold, R. (2012). Selected reaction monitoring-based proteomics: workflows, potential, pitfalls and future directions. Nat. Methods 9,
555–566.
Picotti, P., Bodenmiller, B., Mueller, L.N., Domon, B., and Aebersold, R. (2009).
Full dynamic range proteome analysis of S. cerevisiae by targeted proteomics.
Cell 138, 795–806.
Picotti, P., Clement-Ziza, M., Lam, H., Campbell, D.S., Schmidt, A., Deutsch,
E.W., Rost, H., Sun, Z., Rinner, O., Reiter, L., et al. (2013). A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis.
Nature 494, 266–270.
Schubert, O.T., Mouritsen, J., Ludwig, C., Rost, H.L., Rosenberger, G., Arthur,
P.K., Claassen, M., Campbell, D.S., Sun, Z., Farrah, T., et al. (2013). The Mtb
proteome library: a resource of assays to quantify the complete proteome of
Mycobacterium tuberculosis. Cell Host Microbe 13, 602–612.
Sharma, V., Eckels, J., Taylor, G.K., Shulman, N.J., Stergachis, A.B., Joyner,
S.A., Yan, P., Whiteaker, J.R., Halusa, G.N., Schilling, B., et al. (2014). Panorama: a targeted proteomics knowledge base. J. Proteome Res. 13, 4205–4210.
Surinova, S., Radova, L., Choi, M., Srovnal, J., Brenner, H., Vitek, O., Hajduch,
M., and Aebersold, R. (2015). Non-invasive prognostic protein biomarker signatures associated with colorectal cancer. EMBO Mol. Med. 7, 1153–1165.
Goldstein, J.L., and Brown, M.S. (2015). A century of cholesterol and coronaries: from plaques to genes to statins. Cell 161, 161–172.
Takahashi, S., Fusaki, N., Ohta, S., Iwahori, Y., Iizuka, Y., Inagawa, K., Kawakami, Y., Yoshida, K., and Toda, M. (2012). Downregulation of KIF23 suppresses glioma proliferation. J. Neurooncol. 106, 519–529.
Huttenhain, R., Soste, M., Selevsek, N., Rost, H., Sethi, A., Carapito, C., Farrah, T., Deutsch, E.W., Kusebauch, U., Moritz, R.L., et al. (2012). Reproducible
quantification of cancer-associated proteins in body fluids using targeted
proteomics. Sci. Transl. Med. 4, 142ra94.
Uhlen, M., Bjorling, E., Agaton, C., Szigyarto, C.A.-K., Amini, B., Andersen, E.,
Andersson, A.-C., Angelidou, P., Asplund, A., Asplund, C., et al. (2005). A human protein atlas for normal and cancer tissues based on antibody proteomics. Mol. Cell. Proteomics 4, 1920–1932.
Iskar, M., Zeller, G., Blattmann, P., Campillos, M., Kuhn, M., Kaminska, K.H.,
Runz, H., Gavin, A.C., Pepperkok, R., van Noort, V., and Bork, P. (2013). Characterization of drug-induced transcriptional modules: towards drug repositioning and functional understanding. Mol. Syst. Biol. 9, 662.
Whiteaker, J.R., Halusa, G.N., Hoofnagle, A.N., Sharma, V., MacLean, B., Yan,
P., Wrobel, J.A., Kennedy, J., Mani, D.R., Zimmerman, L.J., et al.; Clinical Proteomic Tumor Analysis Consortium (CPTAC) (2014). CPTAC Assay Portal: a
repository of targeted proteomic assays. Nat. Methods 11, 703–704.
Editorial Note
Systemic Spread of Sequence-Specific Transgene RNA
Degradation in Plants Is Initiated by Localized
Introduction of Ectopic Promoterless DNA
Olivier Voinnet, Philippe Vain, Susan Angell, and David C. Baulcombe*
*Corresponding author
(Cell 95, 177–187; October 16, 1998)

We, the editors of Cell, were contacted by the corresponding author, Dr. David Baulcombe, and the first author, Dr. Olivier Voinnet.
They informed us that, in Figure 6C, lanes 6 and 7 were intended to show two different negative controls, but one of the lanes
was erroneously duplicated. The authors were not able to locate the original data and could not determine how the error arose.
Without access to the original data, a correction of this figure panel is not possible. Our evaluation of the other figures of the paper did
not reveal any additional irregularities. Given the age of the paper and that the duplicated lane does not compromise the conclusions
of the paper, based on the information available to us at this time, we will take no further action.
Cell 166, 779, July 28, 2016
Editorial Note
A Viral Movement Protein Prevents Spread of the
Gene Silencing Signal in Nicotiana benthamiana
Olivier Voinnet, Carsten Lederer, and David C. Baulcombe*
(Cell 103, 157–167; September 29, 2000)

We, the editors of Cell, were contacted by the corresponding author, David Baulcombe, who informed us that this paper contains an
unacknowledged image duplication. The mock control (Mock:M) lane shown in the northern blot experiment in Figure 3D is the same
as the mock control lane in Figure 5D. Dr. Baulcombe informed us that these two experiments were carried out at the same time, run
on a single gel, and exposed on the same autoradiograph and that they shared a negative (mock) control run in a single lane. Therefore, Figures 3D and 5D present the relevant lanes of each experiment plus the shared mock control. Dr. Baulcombe provided us with
a copy of the original autoradiograph for these experiments. We have evaluated the data and confirmed his explanation for this
duplication.
According to our current policy (although the paper was published before this stated policy was established), reuse of the negative
control should have been mentioned in the figure legend. However, as this issue does not call into question the integrity of data collection and/or presentation overall, and given the age of the paper, we have decided against publishing a Correction of the figure legend.
Based on the information available to us at this time, we will take no further action.
780 Cell 166, 780, July 28, 2016
Editorial Note
Ordered Recruitment of Transcription
and Chromatin Remodeling Factors to a Cell Cycle
and Developmentally Regulated Promoter
Maria Pia Cosma, Tomoyuki Tanaka, and Kim Nasmyth*
(Cell 97, 299–311; April 30, 1999)

Concerns about duplicated images in Cosma et al. (Cell, 1999) and Cosma et al. (2001, Mol. Cell 7, 1213–1220) were brought to our
attention by a reader. We, the editors of Cell and Molecular Cell, have investigated the matter, communicating with the corresponding
author, Dr. Kim Nasmyth; the first author, Dr. Pia Cosma; The Research Institute of Molecular Pathology (IMP), where the research in
question was conducted; and the Center for Genomic Regulation, Dr. Cosma's current institute, which conducted its own investigation. The IMP located Dr. Cosma's notebooks and provided her with high-resolution copies. As part of our investigation, Dr. Cosma
brought those copies to the Cell Press office, where we went through them with her, identifying data for the figures in the paper. The
notebooks contained original images, alternate exposures, and/or replicate data for most of the figures in the papers, providing support for the reported findings. In a few instances, original data could not be located, making it difficult to assess the concerns raised
about those specific data panels.
While we understand the reasons that the figures in the paper were flagged by the community, in our judgment the burden of proof for
determining inappropriate data handling or image duplication has not been met. Furthermore, the available original data support the
findings of the papers. With these things in mind, based on the information available to us at this time, we have decided not to take any
further action. This statement is to notify the community of our investigation and findings.
Cell 166, 781, July 28, 2016
Editorial Note
LINE-1 Activity in Facultative
Heterochromatin Formation during
X Chromosome Inactivation
Jennifer C. Chow, Constance Ciaudo, Melissa J. Fazzari, Nathan Mise, Nicolas Servant, Jacob L. Glass, Matthew Attreed,
Philip Avner, Anton Wutz, Emmanuel Barillot, John M. Greally, Olivier Voinnet, and Edith Heard*
(Cell 141, 956–969; June 11, 2010)

We, the editors of Cell, were contacted by the corresponding author, Dr. Edith Heard, who informed us that this paper contains unacknowledged lane splices in Figures 2C and 4C. In the preparation of these figures, lanes that were not relevant to the experiment
presented were removed; however, these splice marks were not indicated in the figure or explained in the legend. Dr. Heard provided
us with scans of the original data, which were prepared in her lab. We have confirmed that, in each case, the splice removes irrelevant
lanes from a single gel.
Although splicing should have been indicated in the figure and explained in the legend according to our current policies, this issue
does not call into question the integrity of data collection and overall presentation; we have therefore decided against publishing a
Correction. Based on the information available to us at this time, we will take no further action.
Cell 166, July 28, 2016 ©2016 Published by Elsevier Inc.
DOI http://dx.doi.org/10.1016/j.cell.2016.07.022
See online version for legends and references
[Figure: SnapShot schematic with panels for C. elegans, Drosophila, and mouse, showing neuronal mechanisms (sensory perception, cellular stress, metabolic homeostasis, and neuroendocrine signaling) and downstream mechanisms in peripheral tissues that lead to longevity.]
SnapShot: Neuronal Regulation of Aging
Heather J. Weir and William B. Mair
Department of Genetics and Complex Diseases, Harvard T.H. Chan School of Public Health,
665 Huntington Avenue, Boston, MA 02115, USA
The nervous system orchestrates whole-body homeostatic mechanisms by integrating extrinsic and intrinsic signals and communicating these to peripheral tissues. To
date, much attention has been paid to the nervous system as a central regulator of metabolic homeostasis. However, despite a wealth of data linking metabolic processes to
multiple facets of the aging process, neuronal mechanisms that regulate organismal aging itself are just coming into focus in both invertebrate and mammalian systems. A
growing number of well-established longevity regulators, including dietary restriction (DR), sirtuins, insulin/IGF-like signaling (IIS), and AMP-activated protein kinase (AMPK),
are now known to act via neuronal mechanisms. As such, the nervous system is emerging as a crucial therapeutic target to combat age-related diseases and promote healthy
aging.
Neuronal mechanisms broadly fall into four highly interconnected categories, which converge to regulate common downstream outputs in peripheral tissues.
Sensory Mechanisms
Sensory perception of environmental cues and extrinsic signals regulates the rate of aging via multiple mechanisms. Several of these converge on the perception of nutrient
availability. DR, which promotes healthy aging across the evolutionary spectrum, activates the transcription factor SKN-1 in a subset of C. elegans neurons, altering systemic
metabolism. Nutrient type also influences lifespan, acting via the neuromedin U receptor in C. elegans (Alcedo et al., 2013). Highlighting the importance of perceived nutrient
availability, ablation or reduction in activity of olfactory neuronal subpopulations in C. elegans and Drosophila extends lifespan via enhanced stress resistance in peripheral
tissues (Alcedo et al., 2013). The gustatory system further influences lifespan but in a bidirectional manner, as distinct gustatory neuronal subpopulations oppositely affect
lifespan via both IIS-dependent and -independent downstream mechanisms (Alcedo et al., 2013). Perception of other environmental stimuli in addition to nutrient availability
also regulates aging. For example, low oxygen levels promote longevity in C. elegans via the stabilization of neuronal hypoxia-inducible factor 1 (HIF-1), triggering cell-nonautonomous signaling to the intestine to activate xenobiotic detoxification mechanisms (Leiser et al., 2015). Less is known about the contribution of sensory mechanisms to
aging in mammalian systems, although that is changing. For example, pain perception in mice promotes longevity via a calcium signaling cascade that regulates neuropeptide
signaling, leading to improved metabolic health (Riera et al., 2014). The hypothalamus, which receives multiple sensory inputs, is also a key site of longevity regulation via other
mechanisms, suggesting that sensory influences on the aging process may well be conserved in mammals.
Metabolic Mechanisms
Mechanisms that control energy homeostasis and metabolic function are closely linked to the sensory system and also regulate aging via neuronal actions. Activation of
the cellular energy sensor and DR mediator AMPK promotes longevity via neuronal pathways in both Drosophila and C. elegans. In Drosophila, neuronal activation of AMPK
results in cell-autonomous and non-autonomous induction of autophagy, which is sufficient to promote longevity (Ulgherait et al., 2014). In C. elegans, AMPK activation promotes longevity by phosphorylation-dependent nuclear exclusion of CREB-regulated transcriptional coactivator (CRTC-1) in neurons, resulting in altered systemic mitochondrial metabolism and dynamics (Burkewitz et al., 2015). Notably, although a role for neuronal AMPK is yet to be demonstrated in mammals, promotion of longevity in mice via
reduced pain perception also requires neuronal nuclear exclusion of CRTC1 (Riera et al., 2014). Another key metabolic sensor, SIRT1, is an NAD+-dependent deacetylase that
has considerable interplay with AMPK. Neuronal-specific overexpression of SIRT1 is sufficient to promote longevity in mice via hypothalamic action that promotes systemic
metabolism (Satoh et al., 2013).
Cellular Stress
Perturbations to cellular homeostasis result in activation of cell-autonomous and non-autonomous stress responses that can influence aging via communication to and
from the nervous system. Organelle-specific stress responses that detect loss of proteostasis via the unfolded protein response (UPR) have a prominent role. Neuronal
induction of the endoplasmic reticulum UPR (UPRER) transcriptional program delays aging in C. elegans via improved stress resistance in peripheral tissues (Schinzel and
Dillin, 2015). Similarly, mitochondrial stress in neurons induces the mitochondrial UPR (UPRmt) in peripheral tissues and extends C. elegans lifespan (Schinzel and Dillin, 2015).
Responses to age-related chronic cellular stress are also orchestrated in the nervous system by heat shock factors (HSF). Neuronal activation of HSF-1 delays aging in C.
elegans via pro-longevity gene transcription in the intestine (Douglas et al., 2015). Delayed aging as a consequence of neuronal stress responses may be conserved in mammals, as exogenous administration of heat shock protein 70 (Hsp70) improves neuronal function and extends lifespan in mice (Bobkova et al., 2015).
Neuroendocrine Mechanisms
A number of direct roles for the regulation of aging by neurotransmitter and hormone signaling have been identified. IIS, a highly conserved pathway that influences aging
across multiple species, is regulated by the nervous system. Studies using both invertebrates and vertebrates have demonstrated that reduced IIS specifically in neurons is
sufficient to promote longevity, indicating a conserved role for this integral neuroendocrine mechanism. In Drosophila, reduced production of insulin-like peptides (DILPs)
either by ablation of the neurosecretory cells that produce them or by activation of dFOXO in the head fat body extends lifespan via repression of insulin signaling in the
peripheral fat body (Alcedo et al., 2013). In mice, brain-specific deletion of the insulin receptor substrate 2 (Irs2) or inactivation of insulin-like growth factor receptor (IGF-1R)
both extend lifespan with concomitant alterations to growth, fat storage, and glucose homeostasis in peripheral tissues (Alcedo et al., 2013). Beyond IIS, inhibition of serotonin
signaling increases C. elegans lifespan via mechanisms associated with DR (Alcedo et al., 2013). Hormone signaling from the hypothalamus further regulates aging in mice;
reduced activity of hypothalamic IκB kinase-β (IKK-β)/nuclear factor κB (NF-κB) prevents age-related decline in gonadotrophin-releasing hormone (GnRH), resulting in neurogenesis and cognitive improvement, improved peripheral tissue homeostasis, and lifespan extension (Zhang et al., 2013).
REFERENCES
Alcedo, J., Flatt, T., and Pasyukova, E.G. (2013). Front. Genet. 4, 71.
Bobkova, N.V., Evgen'ev, M., Garbuz, D.G., Kulikov, A.M., Morozov, A., Samokhin, A., Velmeshev, D., Medvinskaya, N., Nesterova, I., Pollock, A., and Nudler, E. (2015). Proc. Natl.
Acad. Sci. USA 112, 16006–16011.
Burkewitz, K., Morantte, I., Weir, H.J., Yeo, R., Zhang, Y., Huynh, F.K., Ilkayeva, O.R., Hirschey, M.D., Grant, A.R., and Mair, W.B. (2015). Cell 160, 842–855.
Douglas, P.M., Baird, N.A., Simic, M.S., Uhlein, S., McCormick, M.A., Wolff, S.C., Kennedy, B.K., and Dillin, A. (2015). Cell Rep. 12, 1196–1204.
Leiser, S.F., Miller, H., Rossner, R., Fletcher, M., Leonard, A., Primitivo, M., Rintala, N., Ramos, F.J., Miller, D.L., and Kaeberlein, M. (2015). Science 350, 1375–1378.
Riera, C.E., Huising, M.O., Follett, P., Leblanc, M., Halloran, J., Van Andel, R., de Magalhaes Filho, C.D., Merkwirth, C., and Dillin, A. (2014). Cell 157, 1023–1036.
Satoh, A., Brace, C.S., Rensing, N., Cliften, P., Wozniak, D.F., Herzog, E.D., Yamada, K.A., and Imai, S. (2013). Cell Metab. 18, 416–430.
Schinzel, R., and Dillin, A. (2015). Curr. Opin. Cell Biol. 33, 102–110.
Ulgherait, M., Rana, A., Rera, M., Graniel, J., and Walker, D.W. (2014). Cell Rep. 8, 1767–1780.
Zhang, G., Li, J., Purkayastha, S., Tang, Y., Zhang, H., Yin, Y., Li, B., Liu, G., and Cai, D. (2013). Nature 497, 211–216.
