You are on page 1of 12

A Secure Submission System for Online Whistleblowing Platforms

uldenring, Eleanor Rieffel, Sven Dietrich, Lars Ries


Volker Roth, Benjamin G
Freie Universitat Berlin, FX Palo Alto Laboratory,

Stevens Institute of Technology
January 29, 2013
arXiv:1301.6263v1 [cs.CR] 26 Jan 2013

Abstract lature. This gave rise to whistleblowing websites such


as Wikileaks. However, the proliferation of surveillance
Whistleblower laws protect individuals who inform the technology and the retention of Internet protocol data
public or an authority about governmental or corpo- records [4] has a chilling effect on potential whistleblow-
rate misconduct. Despite these laws, whistleblowers ers. The mere act of connecting to a pertinent Website
frequently risk reprisals and sites such as WikiLeaks may suffice to raise suspicion [20], leading to cautionary
emerged to provide a level of anonymity to these indi- advice for potential whistleblowers.
viduals. However, as countries increase their level of The current best practice for online submissions is to
network surveillance and Internet protocol data reten- use an SSL [19] connection over an anonymizing net-
tion, the mere act of using anonymizing software such as work such as Tor [17]. This hides the end points of the
Tor, or accessing a whistleblowing website through an connection and it protects against malicious exit nodes
SSL channel might be incriminating enough to lead to and Internet Service Providers (ISPs) who may other-
investigations and repercussions. As an alternative sub- wise eavesdrop on or tamper with the connection. How-
mission system we propose an online advertising network ever, this does not protect against an adversary who can
called AdLeaks. AdLeaks leverages the ubiquity of un- see most of the traffic in a network [12, 18], such as na-
solicited online advertising to provide complete sender tional intelligence agencies with a global reach and view.
unobservability when submitting disclosures. AdLeaks
In this paper, we suggest a submission system for on-
ads compute a random function in a browser and sub-
line whistleblowing platforms that we call AdLeaks. The
mit the outcome to the AdLeaks infrastructure. Such
objective of AdLeaks is to make whistleblower submis-
a whistleblowers browser replaces the output with en-
sions unobservable even if the adversary sees the entire
crypted information so that the transmission is indistin-
network traffic. A crucial aspect of the AdLeaks design
guishable from that of a regular browser. Its back-end
is that it eliminates any signal of intent that could be
design assures that AdLeaks must process only a frac-
interpreted as the desire to contact an online whistle-
tion of the resulting traffic in order to receive disclo-
blowing platform. AdLeaks is essentially an online ad-
sures with high probability. We describe the design of
vertising network, except that ads carry additional code
AdLeaks and evaluate its performance through analysis
that encrypts a zero probabilistically with the AdLeaks
and experimentation.
public key and sends the ciphertext back to AdLeaks. A
whistleblowers browser substitutes the ciphertext with
1 Introduction encrypted parts of a disclosure. The protocol ensures
that an adversary who can eavesdrop on the network
Corporate or official corruption and malfeasance can communication cannot distinguish between the trans-
be difficult to uncover without information provided by missions of regular browsers and those of whistleblow-
insiders, so-called whistleblowers. Even though many ers browsers. Ads are digitally signed so that a whistle-
countries have enacted, or intend to enact, laws meant blowers browser can tell them apart from maliciously
to make it safe for whistleblowers to disclose miscon- injected code. Since ads are ubiquitous and there is no
duct [3, 28], whistleblowers fear discrimination and re- opt-in, whistleblowers never have to navigate to a partic-
taliatory action regardless, and sometimes justifiably ular site to communicate with AdLeaks and they remain
so [11, 26]. unobservable. Nodes in the AdLeaks network reduce the
It is therefore unsurprising that whistleblowers often resulting traffic by means of an aggregation process. We
prefer to blow the whistle anonymously through other designed the aggregation scheme so that a small number
channels than those mandated by whistleblowing legis- of trusted nodes with access to the decryption keys can

1
recover whistleblowers submissions with high probabil- tance and its inclusion would distract from our contri-
ity from the aggregated traffic. Since neither transmis- bution. The second threat we exclude is that of a flood-
sions nor the network structure of AdLeaks bear infor- ing attack on our submission system. While we have
mation on who a whistleblower is, the AdLeaks submis- thoughts on how to limit some attacks of this kind we
sion system is immune to passive adversaries who have prefer to make a solid first step towards unobservabil-
a complete view of the network. ity before considering the next step in our research. We
In what follows, we detail our threat model and our hope that the next step will not become necessary be-
assumptions, we give an overview over the design of cause this means that countries we believe liberal have
AdLeaks, we analyze its scalability in detail, we report gone too far down the slope towards totalitarianism al-
on the current state of its implementation and we explain ready. The third threat we exclude is denial of service
how AdLeaks uses cryptographic algorithms to achieve by means of fake transmissions. This threat manifests
its security objectives. at the level of the editorial process that separates the
chaff from the wheat among the potentially many sub-
mitted disclosures. We consider this threat out of scope
2 Threat Model in this part of our work. The fourth threat we exclude
is that of malware and spyware. For example, sensitive
The primary security objective of AdLeaks is to conceal documents in PDF format may contain JavaScript that
the presence of whistleblowers, and to eliminate network emits a beacon whenever the document is viewed. A
traces that may make one suspect more likely than an- careless whistleblower who opens a sensitive document
other in a search for a whistleblower. This security ob- on his home machine may expose himself or herself in
jective is more important to AdLeaks than, for exam- that fashion [5]. Similarly, if a whistleblowers computer
ple, availability. We rather risk that a disclosure does is infected by a malware or spyware then the whistle-
not come through than compromise information about blower has no security.
a whistleblower. In what follows we detail the threats
our system architecture addresses as well as the threats
it does not address. 2.3 Security-Related Assumptions
AdLeaks ads require a source of randomness in the
2.1 Threats in Our Scope browser that is suitable for cryptographic use. Moreover,
the source must be equally good on regular browsers
AdLeaks addresses the threat of an adversary who has a and on the browsers of whistleblowers. If a whistle-
global view of the network and the capacity to store or blowers browser looks decidedly more random than
obtain Internet protocol data records for most communi- other browsers then whistleblowers can be readily identi-
cations. The adversary may even require anonymity ser- fied. Unfortunately the random numbers most browsers
vices to retain connection detail records for some time generate are far from random. However, there are good
and to provide them on request. The adversary may reasons for browser developers to support cryptograph-
additionally store selected Internet traffic and he may ically secure random number generators in the near-
attempt to mark or modify communicated data. How- term [15], for example, to prevent illicit user track-
ever, we assume that the adversary has no control over ing [25]. In the meantime, entropy collected in the
users end hosts, and he does not block Internet traf- browser may be folded into a pseudorandom generator
fic or seizes computer equipment without a court order. using SJCL [34]. We therefore decided to move forward
We assume that the court does not per se consider orga- with our research assuming that browsers will soon be
nizations that relay secrets between whistleblowers and ready for it.
journalists as criminal. The objective of the adversary is We assume that whistleblowers use AdLeaks only on
to uncover the identities of whistleblowers. The threat private machines to which employers have no access. In
model we portrayed is an extension of [4] and it is likely fact, sending information from work computers even us-
already a reality in many modern states, or it is about to ing work-related e-mail accounts is a mistake whistle-
become a reality. For reasons we explain in the follow- blowers make frequently. We hope that the software dis-
ing section we do not consider additional threats that tribution channels we discuss in Section 3.3 will help
we would doubtless encounter, for example, in techno- reminding whistleblowers to not make that mistake.
logically advanced totalitarian countries.

2.2 Threats Not in Our Scope 3 System Architecture


We exclude blocking from our threat model and our dis- AdLeaks consists of two major components. The first
cussion because we do not contribute to blocking resis- component is an online advertising network compara-

2
Other users
Whistleblower

Guards

Disclosure
Whistleblower laws protect individuals who inform the public or an authority
about governmental or corporate misconduct. Despite these laws, whistleblow-
ers frequently risk reprisals and sites such as WikiLeaks emerged to provide a
level of anonymity to these individuals. However, as countries increase their

Aggregators
level of network surveillance and Internet protocol data retention, the mere
act of using anonymizing software such as Tor, or accessing a whistleblowing
website through an SSL channel might be incriminating enough to lead to in-
vestigations and repercussions. As an alternative submission system we propose
an online advertising network called AdLeaks. AdLeaks leverages the ubiquity
of unsolicited online advertising to provide complete deniability of the intent
to submit a disclosure. AdLeaks ads compute a random function in a browser
and submit the outcome to the AdLeaks infrastructure. Such a whistleblowers
browser replaces the output with encrypted information so that the transmis-
sion is indistinguishable from that of a regular browser. We describe the design
of AdLeaks and evaluate its performance through analysis and experimentation.

http://nytimes.com

Decryptor
Enc(m1) ... Enc(mn)

replace
Disclosure
Whistleblower laws protect individuals who inform the public or an authority
about governmental or corporate misconduct. Despite these laws, whistleblow-
ers frequently risk reprisals and sites such as WikiLeaks emerged to provide a
level of anonymity to these individuals. However, as countries increase their

Ad Enc(0) level of network surveillance and Internet protocol data retention, the mere
act of using anonymizing software such as Tor, or accessing a whistleblowing
website through an SSL channel might be incriminating enough to lead to in-
vestigations and repercussions. As an alternative submission system we propose
an online advertising network called AdLeaks. AdLeaks leverages the ubiquity
of unsolicited online advertising to provide complete deniability of the intent
to submit a disclosure. AdLeaks ads compute a random function in a browser
and submit the outcome to the AdLeaks infrastructure. Such a whistleblowers
browser replaces the output with encrypted information so that the transmis-
sion is indistinguishable from that of a regular browser. We describe the design
of AdLeaks and evaluate its performance through analysis and experimentation.

Figure 1: Illustrates the architecture of the AdLeaks system. Aggregators reduce the incoming traffic so that
submissions can be funneled to a trusted decryptor through a household DSL line. denotes the decryption key.

ble to existing ones. The network has advertising part- AdLeaks ad, the embedded JavaScript encrypts a zero
ners (the publishers) who include links or scripts in their probabilistically with the embedded public key and sub-
web pages which request ads from the AdLeaks net- mits the ciphertext to a guard. The guard strips unnec-
work and display them. Publishers may receive com- essary encoding and protocol meta-data from the request
pensation in accordance with common advertising mod- and forwards the ciphertext to an aggregator. An aggre-
els, for example, per mille impressions, per click or per gator aggregates the ciphertexts it receives per second
lead generated. Advertisers run campaigns through the and transmits them to the decryptor. What makes this
AdLeaks network. AdLeaks may additionally run cam- setting challenging is that we want to limit the band-
paigns through other ad networks to extend its reach, for width of the decryptor to a household Internet connec-
example, funded by donations or profits from its own op- tion so that we can keep a close eye on the all-important
erations. The ecosystem of partners and supporters may machine with the decryption keys. The aggregation
include large newspapers, bloggers, human rights organi- leverages the homomorphic properties of the Damg ard-
zations and their affiliates. For example, Wikileaks has Jurik (DJ) encryption scheme [16], which means that the
partnered with organizations such as Der Spiegel, El Pas product of the ciphertexts is an encryption of the sum of
and the New York Times, and OpenLeaks had hinted at the plaintexts. We chose the DJ scheme because it has
support by Greenpeace and other organizations. The a favorable plaintext to ciphertext ratio.
key ingredient of an AdLeaks ad is not its visual dis- The decryptor decrypts the downloaded ciphertexts
play but its active JavaScript content. Supporters who and, if it finds data in them, reassembles the data
would forfeit significant revenue when allocating adver- into files. The files come from whistleblowers. In or-
tising space to AdLeaks ads have a choice to only em- der to submit a file, a whistleblower must first obtain
bed the JavaScript portion. The JavaScript is digitally an installer that is digitally signed and distributed by
signed by AdLeaks and contains public encryption keys. AdLeaks. This is already a sensitive process that sig-
The second major component of AdLeaks is its sub- nals intent. We defer the discussion of safe distribution
mission infrastructure. This infrastructure consists of channels for the installer to section 3.3. Installing the
three tiers of servers. We refer to these tiers as guards, obtained software likewise signals the intent to disclose a
aggregators and decryptors. When a browser loads an secret, and therefore it is crucial that the whistleblower

3
verifies the signature before running the installer, and as- 3.2 Decryption
sures himself that the signer is indeed AdLeaks. Other-
It is substantially cheaper to multiply two DJ cipher-
wise, he is vulnerable to Trojan Horse software designed
texts in the ciphertext group than it is to decrypt one.
to implicate whistleblowers. When run, the installer pro-
Furthermore, the product of ciphertexts decrypts to the
duces an instrumented browser and an encryption tool.
sum of the plaintexts in the plaintext group. Recollect
The whistleblower prepares a file for submission by run-
that we expect to receive a large number of white ci-
ning the encryption tool on it. The tools output is a
phertexts, that is, encryptions of zeroes. This leads to
sequence of ` ciphertexts. Henceforth, whenever an in-
the following optimization: we form a full binary tree
strumented browser runs an ad signed by AdLeaks, it
of fixed height, initialize its leaves with received cipher-
replaces the scripts ciphertext with one of the ` cipher-
texts c1 , . . . , cn and initialize each inner node with the
texts it has not already used as a replacement.
product of its children. Then, we begin to decrypt at
In order to distinguish ciphertexts that are encryp-
the root. If the plaintext is zero then we are done with
tions of zeros from ciphertexts that are encryptions of
this tree, because all nodes in the tree are zeroes. Oth-
data we refer to the former as white and to the latter as
erwise, the decryption yields = + 6= 0 where ,
gray. If the aggregator aggregates a set of white cipher-
are the plaintexts of the left and right child, respectively.
texts then the outcome is another white one. If exactly
We decrypt the left child, which yields , and calculate
one gray ciphertext is aggregated with only white ones
the plaintext of the right child as = (without
then the outcome is gray as well. If we decrypt the out-
explicit decryption). If or are zeroes then we ignore
come then we either recover the data or we determine
the corresponding subtree. Otherwise, we recurse into
that there was no data to begin with. If two or more
the subtrees that have non-zero roots. If a node is a leaf
gray ciphertexts are aggregated then we cannot recover
then we decrypt and verify it. If we find it invalid then
the original data from the decryption. We call this event
we ignore the leaf. Otherwise we forward its plaintext to
a collision and we refer to such an outcome as a black
the file reassembly process. We quantify the benefits of
one. Obviously, we must expect and cope with collisions
this algorithm in Section 6.4.
in our system. In what follows, we elaborate on details
of the design that are necessary to turn the general idea
into a feasible and scalable system. 3.3 Software Dissemination
We cannot simply offer the installer software for down-
3.1 Disclosure Preparation load because the adversary would be able to observe
that. Instead, we pursue a multifaceted approach to
In order to handle collisions, the encryption tool breaks software distribution. Our simplest and preferred ap-
a file into blocks of a fixed equal size and encodes them proach involves the help of partners in the print media
with a loss tolerant Fountain Code. Fountain codes en- business. At the time of writing, popular print media
code n packets into an infinite sequence of output pack- often come with attached CDROMs or DVDs that are
ets of the same size such that the original packets can be loaded with, for example, promotional material, games,
recovered from any n0 of them where n0 is only slightly films or video documentaries. Our installer software can
larger than n. For example, a random linear Fountain be bundled with these media. Our second approach is
Code decodes the original packets with probability 1 to encode the installer into a number of segments us-
from about n + log2 (1/) output packets [27]. Let n00 ing a Fountain Code. In this approach, AdLeaks ads
be somewhat larger than n0 and let m1 , . . . , mn00 be the randomly request a segment that the browser loads into
Fountain encoding of the file. The tool then generates the cache. A small bootstrapper program extracts the
a random file identification number k and computes: segments from the browser cache and decodes the in-
ci = Enccca 1 (EncData2 (mi , k||i||n)) for 1 i n
00
staller from it when enough of them have been obtained.
where 1 is an aggregator key and 2 is the actual sub- Since extraction happens outside the browser it cannot
mission key. The purpose of the dual encryption will be observed from within the browser. The bootstrapper
become clear in Section 5.5. We assume that the outer can be distributed in the same fashion. This reduces
encryption is a fast hybrid IND-CCA secure cipher such the distribution problem to extracting a specific small
as Elliptic Curve El Gamal with AES in OCB mode [32]. file from the cache, for example, by searching for a file
We defer the specification of the inner encryption scheme with a specific signature or name in the cache directory.
to Section 4. It assures that, when the decryptor receives This task can probably be automated for most platforms
the ciphertexts, it can verify the integrity of individual with a few lines of script code. The code can be pub-
chunks and of the message as a whole and he can as- lished periodically by trusted media partners in print
sociate the chunks that belong to the same submission or verbatim in webpages or it could even be printed on
with all but negligible probability (in |k|). T-Shirts. Our third approach is to enlist partners who

4
bundle the bootstrapper with distributions of popular We assume that |r0 |, |r1 |, |r2 | are polynomial in the se-
software packages so that many users obtain it along curity parameter. Here, r0 corresponds to k||i||n as
with their regular software. With our multifaceted ap- we introduced it in Section 3.1. We define EncZero =
proach we hope to make our client software available to EncData(0, 0). Aggregation is simply the multiplication
most potential whistleblowers in a completely innocuous of the respective ciphertext components. We establish
and unobservable fashion. the correctness of decryption next. Let c, t and c0 , t0 be
two ciphertexts. Then

4 Ciphertext Aggregation 0 0
c c0 = (m + m0 ; hH(m,r0 )+H(m ,r0 ) g r1 +r1 )
0

0
Our ciphertext aggregation scheme is based on the t t0 = ((r0 ||r1 ) + (r00 ||r10 ); g r2 +r2 )
0
Damg ard-Jurik (DJ) scheme, which IND-CPA secure = ((r0 + r00 )||(r1 + r10 ); g r2 +r2 )
and is also an isomorphism of
if r0 , r1 , r00 , r10 are left-padded with zeroes, which we
s : ZN s ZN ZN s+1
s
hereby add to the requirements. The amount of padding
s (a; b) 7 (1 + N )a bN mod N s+1 determines how many ciphertexts we can aggregate in
where N is a suitable public key. The parameter s con- this fashion before an additive field overflows into an ad-
trols the ratio of plaintext size and ciphertext size. We jacent one and corrupts the ciphertext. If we use B bits
use two two DJ encryptions c, t to which we jointly re- of padding then we can safely aggregate up to 2B cipher-
fer as a ciphertext. We refer to t separately as the tag. texts. A length of B = 40 is enough for our purposes.
The motivation for this arrangement is improved per- Observe that the aggregation of two outputs of EncData
formance. We wish to encrypt long plaintexts and the is valid if and only if
costs of cryptographic operations increase quickly for
H(m, r0 ) + H(m0 , r00 ) H(m + m0 , r0 + r00 ) (1)
growing s. Therefore we split the ciphertext into two
components. We use a shorter component with s = 1, modulo (N ). This amounts to finding a collision in
which allows us to test quickly whether the ciphertext H and we assume that this is infeasible if H resembles a
encrypts data or a zero. The actual data is encrypted pseudorandom function. On the other hand, if one input
with a longer component with s > 1. The two compo- to the aggregation is an output of EncData and another
nents are glued together using Pedersons commitment input is an output of EncZero then by the definitions of
scheme [29], which is computationally binding and per- our algorithms we have
fectly hiding. This requires two additions to the public
key, which are a generator g of the quadratic residues of H(m, r0 ) + 0 H(m + 0, r0 + 0)
ZN and some h = g x for a secret x. Instead of commit-
ting to a plaintext the sender commits to the hash of the modulo (N ), which is trivially fulfilled. Hence, a valid
plaintext and some randomness. We use a collision resis- data encryption can be modified into another valid en-
tant hash function H for this purpose, which outputs bit cryption of the same data but not into a valid encryp-
strings of length |N/16|. Furthermore, let R be a source tion of different data. For this reason our scheme is not
of random bits. The details of the data encryption and IND-CCA secure, although it would achieve the weaker
decryption algorithms are as follows: notion of Replayable CCA [8] if we removed the special
case m, r0 = 0. Unfortunately we need some special case
EncData(m, r0 ) =
to enable aggregation. Therefore we wrap the output of
r1 , r2 R
EncData into an IND-CCA secure outer encryption in
chk if m, r0 = 0 then 0
order to prevent adversaries from collecting samples for
else H(m, r0 )
use in a bait attack (see Section 5.5).
c (m; hchk g r1 ))
t (r0 ||r1 ; g r2 )
return c, t 5 Security Properties
DecVrfy(c, t) = 5.1 Traffic Analysis
(m; k), (r0 ||r1 ; ) 1 (c), 1 (t)
chk if m, r0 = 0 then 0 AdLeaks funnels all incoming transmissions to the de-
else H(m, r0 ) cryptor, and transmissions occur without any explicit
if hchk g r1 = k then user interaction. Hence, the posterior probability that
return m, r0 anyone is a whistleblower, given his transmission is ob-
return served anywhere in the AdLeaks system, equals his prior

5
probability. From that perspective, AdLeaks is immune strategy to gain information on who is sending data to
against adversaries who have a complete view of the net- AdLeaks. The adversary samples ciphertexts of suspects
work. Furthermore, AdLeaks deployment model is suit- from the network and aggregates the ciphertexts for each
able to leapfrog the long-drawn-out deployment phase of suspect. He prepares a genuine-looking disclosure that
anonymity systems that rely on explicit adoption. For is enticing enough so that the AdLeaks editors will want
example, if Wikipedia deployed an AdLeaks script then to publish it with high priority. We call this disclosure
AdLeaks would reach 10% of the Internet user popula- the bait. The adversary then aggregates suspects ci-
tion overnight, based on traffic statistics by Alexa [1]. phertexts to his disclosure and submits it. If AdLeaks
does not publish the bait within a reasonable time in-
terval then the adversary concludes that the suspect is
5.2 Denial of Service by Blocking
a whistleblower. The reasoning is as follows. If the sus-
AdLeaks is vulnerable to blocking. Because anyone pect ciphertexts were zeroes then the bait is received
should be able to use AdLeaks, anyone must be able to and likely published. Since the bait was not published,
obtain the AdLeaks client software, including the adver- the suspect ciphertexts carried data which invalidated
sary. It is easy to turn the client software into a classifier the bait ciphertexts. This idea can be generalized to an
that learns blocking filters for AdLeaks ads. Further- adaptive and equally effective non-adaptive attack that
more, it is harder in the AdLeaks case to bypass blocking identifies a single whistleblower in a group of W suspects
than it is in the case of, for example, censorship resis- at the expense of log2 W baits. For this reason, AdLeaks
tance software. A whistleblower cannot trust anyone and employs an outer encryption which prevents this attack.
therefore it is very risky for him to obtain helpful infor- However, if an adversary takes over an aggregator then
mation by gossiping, which works for Tor. Neither can he is again able to launch this attack. Therefore, aggre-
he count on the help of Internet Service Providers, which gators should be checked regularly, remote attestation
is the basis for systems such as Telex [42], Cirripede [21] should be employed to make sure that aggregators boot
and Decoy Routing [24]. We are not aware of a practi- the correct code, and keys should be rolled over regu-
cal mechanism that is applicable to AdLeaks and that larly. Note that it may take months before a disclosure
provides satisfactory security guarantees. Therefore, we is published and that a convincing bait has a price
defer countermeasure design to future research. the adversary must leak a sufficiently attractive secret
in order to make sure it is published. From this, the
adversary only learns that a suspect has sent something
5.3 Denial of Service by Flooding but not what was sent.
If we deployed one decryption unit and operated at its
approximate limit, that is, 51480 concurrent whistle- 5.6 Bandwidth-Based Attacks
blowers, then it would receive already about 2827 disclo-
sures per day on average. In other words, the editorial The adversary may submit a bait while sending a suspect
backend of AdLeaks would be overwhelmed even before chunk at a rate that is close to or exceeds the data capac-
the technical infrastructure is saturated. Under these ity between aggregators and decryptors. If the suspect
conditions, the benefit of protecting against flooding at- chunk is a zero then the recovery probability of AdLeaks
tacks is debatable. remains within the expected bounds. If, however, the
suspect chunk is a data chunk then, with good prob-
ability, AdLeaks does not recover the bait. However,
5.4 Transmission Tagging
AdLeaks will notice the reduced recovery probability and
We have shown before that the encryption scheme may react, for example, by rolling over to a new key or
AdLeaks uses is secure against adaptive chosen cipher- even by pushing warnings to whistleblowers via its ad
text attacks as long as aggregators are honest. This pre- distribution mechanism.
vents adversaries from tagging or adding chunks en route
to aggregators. The Fountain code ensures that AdLeaks
5.7 Client Compromise
can determine when it has enough chunks to recover the
entire disclosure. This prevents any attempts to tag dis- If the computer of a whistleblower is compromised then
clsoures by means of dropping or re-ordering chunks. the whistleblower loses all security guarantees. Limited
resilience to detection could perhaps be achieved by em-
5.5 Dishonest Aggregators ploying malware-like hiding-tactics. However, since the
AdLeaks client is public, it is merely a matter of time
Assume that AdLeaks did not use outer encryption. until its tactics are reverse engineered and a detection
Then adversaries might employ the following active software is written. If the detection software can be

6
pushed to suspects computers then whistleblowers can ability at least 0.9 then submitting a 2 MB file requires
be uncovered. In order to limit the risk, a production- about 1010 transmissions or 21 1 days on average.
grade implementation of the AdLeaks client should offer
a function to delete itself securely when it is not needed
anymore.
6.2 Network Load
A Base64 encoded ciphertext requires transmitting
about 4496 bytes if we include 400 bytes worth of HTTP
6 Scalability headers. At 50 transmissions a day this adds up to an
additional daily network load of 220 KB/user. Since flat
Our goal in this section is to characterize how well rates for Internet access are common in developed coun-
AdLeaks scales with a growing number of users and tries, we believe this is insignificant for users in these
whistleblowers. Among other dimensions, we explore the countries. Also, given that whistleblowers provide large
necessary and sufficient size of the required infrastruc- societal benefits, society may well be willing to pay even
ture and the time it takes to submit a file. An estimate a small, but significant, cost in bandwidth, beyond that
of the financial feasibility of the operation can be found needed by AdLeaks. Although one might get concerned
in Section 7. that the accumulated load poses a problem at Internet
scale. This is not likely the case. The quoted amount
of data is less than the size of an average web page on
6.1 Submission Duration
the wire [31], which Google engineers found to be about
In the absence of better data, we analyzed the Wikileaks 320KB in 2010. Hence, the load AdLeaks adds to the
archives available from wlstorage.net and estimated the network is well within users network traffic variance. It
sizes of disclosures as follows: we counted top-level files may not even be noticed against the backdrop of video
and archives as individual disclosures; we counted the streaming and increasing Internet use.
contents of subdirectories as a single disclosure unless
the contents also appeared in an archive. We found that
6.3 Guards and Aggregators
70% of the disclosure estimates were less than 2 MB
(about 20% were larger than 4 MB) and chose 2 MB to In order to establish a lower bound of the request rate
be our target size. that AdLeaks servers would be able to process we de-
Zhang and Zhao conducted a study on tabbed brows- ployed guard and aggregator prototypes (see Section 8)
ing behavior [43] and measured 89,851 page loadings on EmuLab [41]. We used six pc3000 nodes (3.0 GHz 64-
distributed over 20 participants and 31 days, which bit Xeon processors, 2 GB DDR2 RAM, 1 Gbit connec-
amounts to about 145 page loadings per user per day. tivity) for guards, one d710 node (2.4 GHz 64-bit Quad
Webpages often display multiple ads. It is common to Core Xeon E5530 processor, 12 GB RAM, 1 Gbit con-
display a horizontal ad at the top and one or more verti- nectivity) for one aggregator and another pc3000 node
cal ads in the margins. The popularity of a website also for one decryptor. The guard servers sent chunks to the
plays an important role for how many ad loadings it can aggregator through a reverse SSH tunnel. The aggrega-
trigger. News websites elicit frequent and regular page tor aggregated the incoming chunks and sent aggregates
loadings and are particularly suitable for our purposes. to the decryptor through a reverse SSH tunnel. The de-
Fortunately for us, they also have incentives to support cryptor performed the tree decryption and discarded the
online whistleblowing systems. Furthermore, they might decrypted chunks. Overload situations were easily ob-
support persistent ad-less ads, that is, AdLeaks scripts served through buffer growth, that is, the incoming rate
that do not take up page real estate and therefore do not exceeded the rate at which the aggregator processed and
compete with other ad revenue sources for the news out- sent its aggregates. The aggregator operated stable at
let. Lastly, our ads can trigger multiple transmissions an incoming rate of 8500 chunks per second or roughly
at random intervals if we are below our target. For this 209 Mb/s. Since our prototype does not yet implement
reason, we decided that it is fair to assume we would get the outer encryption we measured the overhead of ellip-
about 50 transmissions per day and user from our ads. tic curve based key exchanges separately. For a 256-bit
For a sound level of security, we use a public key mod- key we measured 0.006632 ms with a standard devia-
ulus with 2048 bits. The DJ scheme is expensive for tion of 0.000171 ms (100 iterations). If we correct for
increasing plaintext lengths and based on initial tests we this overhead then the aggregator handles 8046 reqs/s.
settled on parameters that support 2303 bytes for data Since guards merely strip some encoding from incoming
use. Our file would thus require about 911 blocks. If we requests and pass them on we did not measure them sep-
account for the encoding overhead and further assume arately and instead assume that they perform similar to
that AdLeaks can recover a data transmission with prob- aggregators without outer encryption, that is, we assume

7
they handle 8000 reqs/s.
Web surfers are most active during certain time win-
dows during the day. If active times are reasonably uni-
form across the population then they shift with the time
zone, which spreads the active windows. This prompts 1
the following assumptions, which may capture, for ex- 0.8
ample, the situation in the United States: the active
0.6
time of users is between 6pm and 1am, and they live in
three adjacent time zones. This means, conservatively, 0.4
that the 50 transmissions per user we assumed before 0.2 80
concentrate within an 11 hours window. The U.S. have 100
0 120
about 138 million broadband Internet users at the age of 1600 140 k
1200 160
18 years and older [35]. The expected load on guards is m 800
400 180
therefore about 174243 reqs/s. However, we are rather
interested in the peak load, for example, the load that is
not exceeded in 99% of all cases. Towards this end we
model arrival times using a Poisson distribution. For the Figure 2: Shows the graphs of the recovery probability
calculation of the peak load we use the fact that the Pois- for varying numbers of data chunks and aggregates for
son distribution is very close to the normal distribution t = 1 and t = 4. Contour lines indicate a 0.9 probability
for a mean larger than about 20. It is well known that level.
about 99.7% of normally distributed events are within 3
standard deviations from the mean. Since the variance
of the Poisson distribution equals its mean we have that, This yields the same overall number m of aggregates.
in 99.85% of all cases, the load will be at most 175495 A gray ciphertext is recoverable if it is recoverable from
reqs/s. Using this as our basis we conclude that we need any of the t sets. Hence, the probability we seek is one
22 guard units and 22 aggregators. minus the probability that we fail to recover the data
from all of the t sets of aggregates. Both probabilities
are given by the following formulas:
6.4 Data Recovery
1 k1
By design, the decryptor scalability does not depend on Pr[i = 1] = (1 )
m
the number of users but only on the number of whistle- 1 k1 t
blowers. This is what allows us to upper bound the Pr[i = 1] = 1 (1 (1 ) )
m/t
number of copies of the AdLeaks decryption key. Based
on speed test reports on regional ISPs their download Figure 2 illustrates the effect for t = 1 and for t = 2.
bandwidths range from 4 megabit per second (Mb/s) to The contour lines indicate where the recovery probability
31 Mb/s for a household Internet connection. As a basis becomes larger than 0.9. For example, if aggregators
for subsequent estimation we use the median rounded to can transfer 768 aggregates per second to a decryptor
Mb, that is, we assume that decryptors can download and whistleblowers send at most 82 gray ciphertexts per
ciphertexts from aggregators with 18 Mb/s. At 3072 second and we set t = 1 then AdLeaks can recover each
bytes per ciphertext this translates to 768 aggregate ci- transmission with a probability of at least 0.9.
phertexts per second. Next, we estimate the number of cryptographic opera-
We estimate the data recovery probability of AdLeaks tions AdLeaks must perform in order to decrypt with its
under the following aggregation model. Given k gray ci- tree decryption algorithm. Assume AdLeaks builds trees
phertexts and an arbitrary number of white ones, what is of depth n. This requires 2n 1 modular multiplications.
the probability that the data from a random gray cipher- The expected number of decryptions is
text can be recovered if we can transmit m aggregates n
from aggregators to decryptors? The data of a gray ci- 1 X i n+1i
E(X) = 1 + 2 (1 (1 p)2 )
phertext is recoverable if it is aggregated only with white 2 i=1
ones. Therefore, we seek the probability that, given any
gray ciphertext, all other gray ciphertexts fail to be as- Proof. The root of the tree must always be decrypted,
signed to the same aggregate, of which there are m. If k hence the expectation is always at least 1. Since the
is small with respect to m/t for some t then we can im- algorithm only decrypts left children and not right chil-
prove our recovery probability as follows. We perform t dren, we need to count only half of the remaining nodes.
independent aggregations with t sets of m/t aggregates. Recall that a right child is calculated by subtracting its

8
1 6.6 Decryptors
The performance of the decryptor is bound by the cost
Percent decryptions needed

0.8 of two operations: the time it takes to test whether a


tree node is white, and the time it takes to perform a
full decryption and verification. We measured 0.011578
0.6
and 0.192843 seconds for these operations on a single
Our example core (2.66GHz Intel Xeon) with negligible standard de-
0.4 viation. If we assume that the decryptor has 11 cores
available for decryption then we estimate that our run-
0.2 tree depth 4 ning example requires 2 decryptors. Since the necessary
tree depth 8 resources for decryption scale linearly with the number of
tree depth 16
0 whistleblowers this means that AdLeaks can serve about
0 0.2 0.4 0.6 0.8 1 25740 concurrent whistleblowers with just one unit, for
example, a dual 6-core Mac Pro.
Probability that a ciphertext is gray/black

Figure 3: Shows the normalized savings of the tree de- 6.7 Client Measurements
cryption algorithm for various tree sizes and load char-
Our JavaScript implementations of the DJ scheme lever-
acteristics. For t > 8 the graph looks identical to the
age several optimizations [23] that improve efficiency
graph for t = 8.
over unoptimized DJ by a factor of 8 to 32 in our mea-
surements. As a side effect, the bit lengths of two pa-
rameters of our inner encryption scheme, namely r1 , r2
sibling from its parent. At level i from the root, starting
(see Section 4), bound the time it takes to encrypt in the
under the root node, we have 2i nodes. We have to de-
browser. Reasonable values range from 512 bits (prob-
crypt a left child at level i if its parent is not zero. The
ably sufficient) to 2044 bits (paranoid). We measured
probability that the parent is not zero is one minus the
these times on an Intel i5-2500K CPU at 3.30 GHz with
probability that its 2t+1i leaves are all zeroes.
Chromium Version 20.0.1132.47. For 512 bits it took
7.55 seconds (s = 0.06, speedup 32), for 1024 bits it
If we divide the expected number of decryptions by 2n took 14.67 seconds (s = 0.05, speedup 16) and for
decryptions (the nave approach) and plot the normal- 2044 bits it took 28.68 seconds (s = 0.38, speedup 8).
ized results for several values of n then we obtain the Since Java is still significantly faster than JavaScript we
graphs in Figure 3. The graphs tell us, for example, that assume that these times will become smaller still in the
we expect to save 0.61% of the decryptions if AdLeaks future.
operates at its limit, that is, a recovery probability of
0.9 and 82/768 0.11% gray or black ciphertexts. The
lighter the load is the more we save. 7 Financial Viability
Given the server resources AdLeaks requires it is pru-
6.5 Number of Whistleblowers dent to ask what are the costs the AdLeaks organiza-
tion has to bear. We found that dual quad-core servers
We characterize next how many concurrent whistleblow- with unmetered 1Gb interfaces are available for less then
ers a link capacity of 82 data chunks per second can 400 USD/month. At this price, the infrastructure for the
support. Since no data set exists that we could use to guards and aggregators necessary to serve 138 million
estimate the arrival times of data chunks, we model them users would cost 579 USD/day. While this seems high it
as a Poisson process that we approximate by a normally is instructive to look at the revenue side. Each ad loading
distributed process as before. To be on the safe side corresponds to what is called an impression in the on-
we seek a safe average sending rate r so that the actual line advertising business. The prices for impressions are
sending rate does not exceed 82 reqs/s in 97.725% of all typically quoted as costs per mille, or CPM. Top ranking
cases, that is, the second quantile. The safe
average rate websites can command prices over 100 USD while run-
can be found easily by solving r + 2 r = 82 for r, off-the-mill websites receive in the order of 0.25 USD.
which yields 65 reqs/s. At 50 transmissions per day and Cost per click models or cost per conversion models gen-
whistleblower this means that AdLeaks can serve 51480 erate additional revenue, which we ignore for the sake of
concurrent whistleblowers at any time with a 18 Mb/s conservatism. If AdLeaks had 138 million users who see
uplink for the decryptor. 5 AdLeaks ads per day on average and if AdLeaks paid

9
out 0.25 USD per mille impressions and if its markup was tributed sensing and diagnostics. In preserving the pri-
0.34 percent or better then AdLeaks would break even vacy of web-based email [7] one can take advantage of a
on its infrastructure costs. For comparison, ValueClick spread-spectrum approach for hiding the existence of a
Inc. reported a gross profit of over 96 million USD on a message, but it is not secure against a global attacker.
revenue of about 161 million USD in its second quarter Membership-concealment [36] can also be used to hide
of 2012, which translates to about 59 percent profitabil- the real-world identities of participants in an overlay net-
ity, and reported 25 cents of net income per common work, but this doesnt suffice for AdLeaks.
share. The numbers suggest that, if AdLeaks was run as In censorship resistance, there is Publius [40] which is
a not-for-profit operation, it could gain market share by an anonymous publication system but does not offer any
offering very competitive pricing while earning a decent sort of connection-based anonymity. Collage [6] stegano-
plus. This is in sharp contrast to contemporary whistle- graphically embeds content in cover traffic such as photo-
blowing platforms who depend entirely on donations. sharing sites and implements a rendezvous mechanism
to allow parties to publish and retrieve messages in this
cover traffic, but Alice and Bob must exchange a key a
8 Implementation priori. More recent work [22] explores an approach that
assumes the ability of being able to globally check and
We developed fully-functional multi-threaded aggrega- retrieve all blog posts in real time and determining and
tion and decryption servers with tree decryption sup- extracting all the embedded content.
port as well as a Fountain Code encoder and decoder.
Another related area is secure data aggregation in
Decryptors write recovered data to disk and the decoder
wireless sensor networks [2] or WSNs. One can try
recovers the original file. We also developed a fake guard
to securely aggregate encrypted data [9], which identi-
server which is capable of generating and sending chunks
fies the key stream in the header and requires remov-
according to a configurable ratio of white and gray ci-
ing a stream for each ciphertext received. The latter
phertexts. All servers connect to each other through
wouldnt scale for AdLeaks because it requires millions
SSH tunnels via port forwarding. The entire implemen-
of keystream removals per aggregate. One approach [37]
tation consists of 101 C, header and CMake files with
for aggregating in multicast communication uses the
7493 lines of code overall. This includes our optimized
Okamoto-Uchiyama encryption scheme for secure aggre-
DJ implementation [23], which is based on a library by
gation, which resembles Pedersons commitment scheme
Andreas Steffen, a SHA-256 implementation by Olivier
very closely. The difference is really that AdLeaks is
Gay, and several benchmarking tools. Our ads imple-
not used in the aggregate but instead deals with colli-
ment the DJ scheme based on the JSBN.js library and
sions. Other work [38, 13] targets the security of sta-
use Web Workers to isolate the code from the rest of
tistical computations on the inputs from various sen-
the browser. The entire ad currently measures less than
sor nodes. The two key differences in the approaches
81 KB. The size can be reduced further by eliminating
found in WSNs are: (i) WSNs want correctly aggre-
unused library code and by compressing it. The ad sub-
gated data that allows for unencrypted sending, whereas
mits ciphertexts via XmlHttpRequests. We instrumented
our approach seeks the opposite of that, and (ii) the at-
the Firefox browser for our prototype and patched the
tacker wants to have a tainted input accepted, whereas
source code in two locations. First, we hook the com-
in our context he wants to learn the content of the input.
pilation of Web Worker scripts and tag every script as
In WSNs both event-driven and query-based processing
an AdLeaks script if it is labeled as one in lieu of car-
is of interest, with most approaches focusing on query-
rying a valid signature. We placed a second hook where
based solutions, whereas AdLeaks is event-driven. That
Firefox implements the XmlHttpRequest. Whenever the
also means that we dont know a priori which client
calling script is an AdLeaks script running within a Web
sends what in what round. It is difficult for us to ex-
Worker, we replace the zero chunk in its request with a
change keys beforehand and our approach remains uni-
data chunk.
directional, i.e. we cannot distribute keys. Lastly, in
WSNs clients are not trusted initially and are later vet-
9 Related Work ted, whereas in our approach clients are never trusted.

Closely related to our work, there is early work on DC


Nets [14, 39] which aims to provide a cryptographic 10 Conclusions
means to hide who sends messages, the use of Raptor
codes [10] to implement an asynchronous unidirectional AdLeaks leverages the ubiquity of online advertising to
one-to-one and one-to-many covert channel using spam provide anonymity and unobservability to whistleblow-
messages, and anonymous data aggregation [30] for dis- ers making a disclosure online. The system introduces

10
a large amount of cover traffic in which to hide whistle- [8] R. Canetti, H. Krawczyk, and J. Nielsen. Relaxing
blower submissions, and aggregation protocols that en- chosen-ciphertext security. In Proc. CRYPTO,
able the system to manage the huge amount of traffic volume 2729 of LNCS, pages 565582. Springer,
involved, enabling a small number of trusted nodes with 2003.
access to the decryption keys to recover whistleblowers
submissions with high probability. We analyzed the per- [9] C. Castelluccia, A. C.-F. Chan, E. Mykletun, and
formance characteristics of our system extensively. Our G. Tsudik. Efficient and provably secure
research prototype demonstrates the feasibility of such a aggregation of encrypted data in wireless sensor
system. We expect many aspects of the system can be networks. ACM Trans. Sen. Netw.,
improved and optimized, providing ample opportunity 5(3):20:120:36, June 2009.
for further research. [10] A. Castiglione, A. De Santis, U. Fiore, and
F. Palmieri. An asynchronous covert channel using
Acknowledgements spam. Comput. Math. Appl., 63(2):437447, Jan.
2012.
The first, second and last author are supported by an
endowment of Bundesdruckerei GmbH. An abridged ver- [11] E. R. Center. 2011 National Business Ethics
sion of this paper has been accepted for publication in Survey. Online at http://www.ethics.org/nbes,
the proceedings of Financial Cryptography and Data Se- 2345 Crystal Drive, Suite 201, Arlington, VA
curity 2013 [33]. Copies of the abridged version will 22202, USA, 2012.
eventually become available at http://www.springer.
de/comp/lncs/. [12] S. Chakravarty, A. Stavrou, and A. D. Keromytis.
Traffic analysis against low-latency anonymity
networks using available bandwidth estimation. In
References Proc. ESORICS, pages 249267. Springer, 2010.

[1] Alexa. Online at http://www.alexa.com, Apr. [13] H. Chan, A. Perrig, B. Przydatek, and D. Song.
2012. Sia: Secure information aggregation in sensor
networks. J. Computer Security, 15(1):69102,
[2] H. Alzaid. Secure Data Aggregation in Wireless Jan. 2007.
Sensor Networks. PhD thesis, Queensland
University of Technology, March 2011. [14] D. Chaum. The dining cryptographers problem:
Unconditional sender and recipient untraceability.
[3] D. Banisar. Whistleblowing International Journal of Cryptology, 1:6575, 1988.
Standards and Developments. Transparency 10.1007/BF00206326.
International, Feb. 2009.
[15] D. Dahl and R. Sleevi. Web cryptography API.
[4] S. Berthold, R. Bohme, and S. Kopsell. Data W3C Draft, October 2012.
retention and anonymity services. In The Future of
Identity in the Information Society, volume 298 of [16] I. Damgard, M. Jurik, and J. Nielsen. A
IFIP Advances in Information and Communication generalization of Pailliers public-key system with
Technology, pages 92106. Springer, 2009. applications to electronic voting. International
Journal of Information Security, 9:371385, 2010.
[5] B. M. Bowen, S. Hershkop, A. D. Keromytis, and
S. J. Stolfo. Baiting inside attackers using decoy [17] R. Dingledine, N. Mathewson, and P. Syverson.
documents. In Proc. SecureComm, pages 5170. Tor: the second-generation onion router. In Proc.
Springer, 2009. USENIX Security Symposium, pages 303320,
2004.
[6] S. Burnett, N. Feamster, and S. Vempala.
Chipping away at censorship firewalls with [18] K. P. Dyer, S. E. Coull, T. Ristenpart, and
user-generated content. In Proc. USENIX T. Shrimpton. Peek-a-boo, i still see you: Why
Security, pages 2929, 2010. efficient traffic analysis countermeasures fail. In
IEEE Symposium on Security and Privacy, pages
[7] K. Butler, W. Enck, J. Plasterr, P. Traynor, and 332346, 2012.
P. McDaniel. Privacy preserving web-based email.
In Proc. International Conference on Information [19] A. Freier, P. Karlton, and P. Kocher. The secure
Systems Security, ICISS, pages 116131. Springer, sockets layer (SSL) protocol version 3.0. RFC 6101
2006. (Historic), Aug. 2011.

11
[20] S. Gustin. Columbia university reverses [32] P. Rogaway, M. Bellare, and J. Black. OCB: A
anti-WikiLeaks guidance. block-cipher mode of operation for efficient
http://www.wired.com/threatlevel/2010/12/ authenticated encryption. ACM Trans. Inf. Syst.
columbia-wikileaks-policy/, Dec. 2010. Secur., 6:365403, Aug. 2003.
[21] A. Houmansadr, G. T. K. Nguyen, M. Caesar, and [33] V. Roth, B. Guldenring, E. Rieffel, S. Dietrich,
N. Borisov. Cirripede: Circumvention and L. Ries. A secure submission system for online
infrastructure using router redirection with whistleblowing platforms. In Proc. Financial
plausible deniability. In Proc. ACM CCS, 2011. Cryptography and Data Security, LNCS. Springer
Verlag, 2013. To appear.
[22] L. Invernizzi, C. Kruegel, and G. Vigna. Message
In A Bottle: Sailing Past Censorship. In Hot [34] E. Stark, M. Hamburg, and D. Boneh. Symmetric
Topics in Privacy Enhancing Technologies, Vigo, cryptography in javascript. In Proc. ACSAC,
Spain, 2012. pages 373381. IEEE Computer Society, 2009.

[23] M. J. Jurik. Extensions to the Pailler [35] Persons using the Internet in and outside the
Cryptosystem with Applications to Cryptological home. http://www.ntia.doc.gov/files/ntia/
Protocols. Dissertation, BRICS, Department for data/CPS2010Tables/t11_1.txt, Jan. 2011.
Computer Science, University of Aarhus, Aug. [36] E. Vasserman, R. Jansen, J. Tyra, N. Hopper, and
2003. Y. Kim. Membership-concealing overlay networks.
[24] J. Karlin, D. Ellard, A. W. Jackson, C. E. Jones, In Proc. ACM CCS, pages 390399, 2009.
G. Lauer, D. P. Mankins, and W. T. Strayer. [37] A. Viejo, Q. Wu, and J. Domingo-Ferrer.
Decoy routing: Toward unblockable internet Asymmetric homomorphisms for secure
communication. In Proc. Workshop on Free and aggregation in heterogeneous scenarios. Inf.
Open Communications on the Internet. USENIX, Fusion, 13(4):285295, Oct. 2012.
2011.
[38] D. Wagner. Resilient aggregation in sensor
[25] A. Klein. Temporary user tracking in major networks. In Proc. ACM Workshop on Security of
browsers and cross-domain information leakage Ad Hoc and Sensor Networks, SASN, pages 7887.
and attacks. Technical report, Trusteer, 2008. ACM, 2004.
Online at http://www.trusteer.com.
[39] M. Waidner and B. Pfitzmann. The dining
[26] K. J. Lennane. Whisteblowing: a health issue. cryptographers in the disco: Unconditional sender
British Medical Journal, 307:667670, Sept. 1993. and recipient untraceability with computationally
[27] D. MacKay. Fountain codes. IEE Proceedings, secure serviceability. In EUROCRYPT, volume
152(6), Dec. 2005. 434 of LNCS, pages 690690. Springer, 1990.
[40] M. Waldman, A. Rubin, and L. Cranor. Publius:
[28] A. Osterhaus and C. Fagan. Alternative to Silence
A robust, tamper-evident, censorship-resistant and
Whistleblower Protection in 10 European
source-anonymous web publishing system. In Proc.
Countries. Transparency International, 2009.
USENIX Security Symposium, pages 5972, Aug.
[29] T. Pedersen. Non-interactive and 2000.
information-theoretic secure verifiable secret
[41] B. White, J. Lepreau, L. Stoller, R. Ricci,
sharing. In Proc. CRYPTO, volume 576, pages
S. Guruprasad, M. Newbold, M. Hibler, C. Barb,
129140. Springer, 1992.
and A. Joglekar. An integrated experimental
[30] K. P. N. Puttaswamy, R. Bhagwan, and V. N. environment for distributed systems and networks.
Padmanabhan. Anonygator: privacy and integrity In Proc. OSDI, pages 255270, Dec. 2002.
preserving data aggregation. In Proc.
[42] E. Wustrow, S. Wolchok, I. Goldberg, and J. A.
ACM/IFIP/USENIX 11th International
Halderman. Telex: Anticensorship in the network
Conference on Middleware, Middleware, pages
infrastructure. In Proc. USENIX Security
85106. Springer, 2010.
Symposium, Aug. 2011.
[31] S. Ramachandran. Web metrics: Size and number [43] H. Zhang and S. Zhao. Measuring web page
of resources. Online at https://developers. revisitation in tabbed browsing. In Proc. CHI,
google.com/speed/articles/web-metrics, May pages 18311834, 2011.
2010.

12

You might also like