Fragoso, S. Mapping Brazil's Connectivity – do we really get more than we give? Presented
at the IR 6.0, 6th International Conference of the Association of Internet Researchers,
Chicago, USA, October 2005.
This paper presents and discusses the first steps – and above all the first impasses
– of research that started in January 2005². The research set out to identify and
discuss the hyperlinks connecting sites on the World Wide Web that are registered
under the Brazilian domain (.br) with sites of other nationalities. It is unlikely
that a country code Top Level Domain³ (ccTLD) includes all of the sites created by
people and institutions of that nationality, or all of the pages hosted on servers
located in the territory of that country. Many Brazilians host their sites
under other ccTLDs, one of the reasons being the disproportionate amount
of bureaucracy required to register a .br domain in comparison with that required in
1
This paper presents the partial results of research sponsored by the Conselho Nacional
de Desenvolvimento Científico e Tecnológico - CNPq, a Brazilian government organ which
promotes scientific and technological development.
2
The research team consists of Rosana Vieira de Souza, M.A. (Associate Researcher);
Theo Lucas de S. Felizolla, Maria Cândida Lucca di Primio and Ana Lúcia Migowski
(Research Assistants).
3
Domain Names refer to specific computers on the Internet and distinguish each one from
all others. The last part of a Domain Name is a Top Level Domain (TLD). There are two
main types of TLD: generic and country code. Generic Top Level Domains (gTLDs), such
as .com, .org or .net, are to be used by the general Internet public, in principle
distinguishing a particular type of association. Country Code Top Level Domains (ccTLDs),
such as .br, .ca or .ar, identify a particular country or geographical territory.
other nations. On the other hand it is not difficult to find sites belonging to other
nationalities in the .br domain. It is known for example that “multinational
companies commonly register their name under many domains to protect their
brands” (Gomes e Silva, 2005, p. 2). It is considered here, however, that the set of
pages hosted under the ccTLD .br is representative of the pages published by the
social actors of Brazilian nationality on the World Wide Web, not just because this
is indicated by everyday experience, but more importantly because, despite the
initial uptake of the Internet in Brazil having been rather late, the number of .br
domains registered increased very rapidly (Figure 1) and reached meaningful
figures in a few years. The country is currently ninth on the worldwide list of nations
by number of hosts (second in the Americas, behind only the USA, and
first in Latin America, with more than three times as many hosts as second-placed
Argentina).
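The distinction between generic and country code TLDs on which this sampling rests can be illustrated with a minimal Python sketch. The function name `classify_tld` and the short set of generic TLDs are assumptions made for illustration only. Treating any two-letter final label as a country code is a heuristic, not a registry lookup.

```python
# Illustrative sketch: classify a hostname by its Top Level Domain.
# GENERIC_TLDS is a small hand-picked set (an assumption), not a full registry.
GENERIC_TLDS = {"com", "org", "net", "edu", "gov", "mil"}


def classify_tld(hostname: str) -> tuple[str, str]:
    """Return (tld, kind), where kind is 'gTLD', 'ccTLD' or 'unknown'."""
    # The TLD is the last dot-separated label of the hostname.
    tld = hostname.strip().rstrip(".").lower().rsplit(".", 1)[-1]
    if tld in GENERIC_TLDS:
        return tld, "gTLD"
    # Heuristic: two-letter alphabetic labels are treated as country codes.
    if len(tld) == 2 and tld.isalpha():
        return tld, "ccTLD"
    return tld, "unknown"


for host in ["www.unisinos.br", "www.aoir.org", "example.ar"]:
    print(host, classify_tld(host))
```

As the text itself cautions, a .br label classified in this way says where a name is registered, not who published the site or where it is hosted.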
Due in part to the lack of an empirical tradition in communications research, not
even the Google⁴ phenomenon was capable of leading Brazilian researchers in the
direction of hyperlink analysis. It is not uncommon, however, to find sites with a .br
Top Level Domain in the samples taken by researchers of other nationalities. The
fact that these investigations have been predominantly carried out by authors from
the Northern hemisphere has resulted in the particular sets of data concerning
Brazil not being examined or discussed at great length.
4
Google was the first search engine to make use of an algorithm that used the link
structure of the web to predict the best quality matching pages. After Google’s success,
the efficacy of Google’s PageRank algorithm has even been taken as a given by related
research that has borrowed aspects of its functionality (cf. Thelwall inpraiseofgoogle.pdf),
notably the analogy between the creation of a link to a site and an academic citation as a
measure of popularity and/or importance. (googlepaper).
Figure 1: Number of hosts with ccTLD .br per year. Source: ISC Internet Domain Survey, Internet
Systems Consortium. Data available from http://www.isc.org [18th September, 2005]
The results of some surveys in which .br domain sites appear collaterally are
particularly interesting. In his Master’s dissertation, Halavais (1998) carried out one
of the first investigations of the relationship between the structure of web linkage
and national and territorial borders.
Given that the total number of hosts registered with .br domains was still relatively
low (Figure 1) and that the tool used by Halavais for the construction of his quasi-
random sample⁵ probably implied a bias toward English language sites, the study
registered just 9 sites with their domain registered in Brazil, the only developing
nation to be named in the sample⁶. According to the data presented, these 9 sites
with .br domains received 0.2% (123) of the total international inlinks verified in the
5
Halavais’ sample consisted of 4,000 sites drawn with a randomizer which was a feature
of Excite’s Webcrawler search engine (the ‘roulette’ page). (Halavais, 1998, p. 62).
sample, which implies an average of 13.67 links per .br site, placing Brazilian
websites in sixth place with respect to their international connectedness⁷ (Table 1).
For a developing country with low indices of digital inclusion and education,
and whose population is far from being proficient in English, this position is
surprising. Of the 5 countries whose domains contain more connected sites than
6
The sample may, however, have included sites from other developing nations, which
were aggregated in the ‘Others’ category.
7
It is worth noting that the connectivity indicators used by Halavais do not describe
the link totals for a given country, but correspond to the average proportion of links for
sites in a given country (Halavais, 1998, p. 62).
those of .br, in only 2 cases is English not the principal language: Switzerland and
Japan. Switzerland, as is widely known, is a particularly multicultural country, with as
many as three different official languages (German, French and Italian). It is also a
highly developed country, described in the CIA World Factbook as a “stable
modern market economy with low unemployment, a highly skilled labor force, and
a per capita GDP larger than that of the big Western European economies” (10th
place in the world by GDP per capita in 2004, est. $33,800). Located in the
heart of Europe, it has a comprehensive state educational system that supports the 99%
literacy rate⁸ estimated in 1988. Japan, despite having a very different
cultural and ethnic composition from Switzerland, has a very similar profile in terms of
education and economic prosperity: a 99% literacy rate (data for 2002) and a GDP
per capita that was estimated at $29,400 in 2004 (CIA, 2005, s.p.). In fact, the only
other ‘developing nation’ to appear in the table of connectivity presented by
Halavais is the Republic of South Africa⁹, which, like the other Southern
Hemisphere nations that are included (Australia and New Zealand), has English as
one of its principal languages.
Also projecting the patterns of linkage onto the national frontiers given in
‘traditional’ geopolitical maps, Barnett, in co-authorship with Park and
others (2003) and with Jun (2004), based his considerations concerning the
international information flow on the Internet on data that indicated separately the
number of inlinks (received) and outlinks (sent) by domain for 47 different
8
This value from the CIA World Factbook takes as its basis the proportion of people aged
15 and over who are capable of reading and writing.
9
According to the CIA World Factbook, South Africa has an estimated GDP per capita of
$11,100, a literacy rate of 86.4% and 50% of its population living below the poverty line (2000
est.). Brazil’s GDP per capita is estimated by the same source at $8,100, its literacy rate
is also estimated at 86.4% and the percentage of its population living below the poverty line was
estimated at 22% (1998).
nationalities¹⁰. Working from this data, the authors estimate the degree of centrality
of different nodes in the network¹¹ according to two values: the total number of
links with other nodes (Freeman, 1979) and the eigenvector measure (Borgatti,
2005, p. 61)¹². Reworking this information with a multidimensional scaling of the
network, Barnett et al. found a completely interconnected system in which the US
occupies the most central position. Next most central are Australia, the UK, China,
Japan, Canada and Germany. The authors call attention to Norway’s position in
the two-dimensional graphical representation of the network, which is more central
than that of other Nordic nations and closer to the US than expected, something they
attribute to Norwegian efforts to market through the web. According to the two-
dimensional representation and the color code used by the authors, Brazil appears
to be positioned practically as centrally as Norway (Figure 2). The only comment
possibly related to this positioning in the Barnett et al. paper concerns
the fact that links between Brazil and Portugal are particularly strong, as pointed
out by previous authors (such as Bharat et al., 2001, p. 5)¹³.
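The two centrality values mentioned above can be illustrated on a toy network. The sketch below is not the procedure of Barnett et al.: the country codes and link counts are invented, the network is treated as undirected, and the eigenvector measure is approximated by plain power iteration.

```python
# Toy, symmetric link counts between country domains (invented numbers).
LINKS = {
    ("us", "uk"): 9, ("us", "br"): 4, ("us", "pt"): 2,
    ("uk", "pt"): 1, ("br", "pt"): 3,
}


def neighbours(links):
    """Build an undirected weighted adjacency map from pair counts."""
    adj = {}
    for (a, b), weight in links.items():
        adj.setdefault(a, {})[b] = weight
        adj.setdefault(b, {})[a] = weight
    return adj


def degree_centrality(adj):
    """Total number of links a node shares with all other nodes."""
    return {node: sum(nbrs.values()) for node, nbrs in adj.items()}


def eigenvector_centrality(adj, iterations=100):
    """Power iteration: a node scores highly when its neighbours score highly."""
    score = {node: 1.0 for node in adj}
    for _ in range(iterations):
        new = {n: sum(w * score[m] for m, w in adj[n].items()) for n in adj}
        norm = max(new.values())  # rescale so the top node has score 1.0
        score = {n: value / norm for n, value in new.items()}
    return score


adj = neighbours(LINKS)
degree = degree_centrality(adj)
eigen = eigenvector_centrality(adj)
for node in sorted(adj, key=eigen.get, reverse=True):
    print(node, degree[node], round(eigen[node], 3))
```

In this invented example the two measures produce the same ordering, but they need not: a node with few links can still score well on the eigenvector measure if those links attach it to highly central nodes.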
10
The initial sample of Barnett et al. and Barnett and Jun did not include Brazilian TLDs,
being composed of the TLDs of the member nations of the OECD (except Poland) and six
generic TLDs (.com, .net, .edu, .mil, .org, .gov).
11
Understanding as nodes the countries to which the domains encountered pertain, the
centrality of each domain reflects its importance, influence and pre-eminence in the
network.
12
A node has a high eigenvector centrality when it is connected to many nodes which are
themselves connected to many nodes.
13
Bharat et al. did not include, in this 2001 work, data about the indegree or weighted
indegree of websites in the .br domain.
Figure 2: International Internet Hyperlink Structure. Reproduced from Barnett and Jun, 2004, slide
15 (figures also in Barnett et al., 2003). Thickness of the connection line is proportional to the
number of hyperlinks between two countries (50,000 links is the minimum value for a connection to
be indicated). The intensity of the circle representing each country indicates its centrality in the
network. Black arrows and country names were added to emphasize the features which interest the
argument herein developed.
It is also worth noting that the analysis of the bandwidth network by the same
authors resulted in a graphical representation in which once more the US was the
most central country, followed by the UK, Germany, Hong Kong, Singapore, Japan,
France and Italy. This time, Brazil occupies what appears to be a less central
position and is definitely more closely related to its neighbors in Latin America
(Figure 3).
Barnett et al. (2003) and Barnett and Jun (2004) consider that such results
corroborate the claims of World-System Theory (Wallerstein, 1979) in indicating
the existence of an information flow in the center-periphery direction, “with the
United States and the wealthier nations of Western Europe at the center and the
poor less developed nations of Latin America, Asia and Africa along the margins”
(Barnett et al., 2003, p. 11). With regard to bandwidth, the data concerning Brazil
seem to confirm the existence of a hierarchy between the central and peripheral
nations, in that there are structural connections from the periphery to the hub, but
not among the peripheral nations themselves. “The U.S. dominates internet flows
due to its central position in the network. While there are some flows entirely within
Europe or the Asian-Pacific region and limited flows within Latin America, flows
between these localities primarily go through the U.S.” (Barnett et al., 2003, p. 11).
Figure 3: International Internet Infrastructure. Reproduced from Barnett and Jun, 2004, slide 16
(figures also in Barnett et al., 2003). Thickness of the connection line is proportional to the
bandwidth capacity between two countries (13Mbps is the minimum value for the presence of a
connection to be indicated). Colors indicate membership in a cluster. Black arrows and country
names were added to emphasize the features which interest the argument herein developed.
would be expected. Looking at the detail of the inlinks and outlinks found for each
nationality (ccTLD) in the work of Barnett et al. and Barnett and Jun, a still more
intriguing result appears: the sites with a .br domain feature as receiving more
international links than they provide (Table 2).
At first sight it can appear that the larger number of links into sites with a .br
domain than the number of links out of these is indicative of an information flow
that runs from the informationally rich nations to the informationally poor ones,
corroborating once more World-System Theory. Links between websites,
however, are not equivalent to tracks along which goods, capital, humans – or
even information itself – flow, as if going from the source anchor to the link target.
On the contrary: at least since the seminal work in which Brin and Page
presented the prototype of Google (1998), the quantity of inlinks received by a web
page has been widely accepted as an indicator of its importance. According to the
rationale behind this argument, the creation of a link functions as an endorsement
of the destination page by the publisher who established the connection. Thus,
when I place a link to AoIR on my personal webpage, I give the AoIR site a link
with which I declare it to be a destination that I consider pertinent to the readers of
my Web page. In the terms of Walker (2002), with this reference I “create value” for
the AoIR site. It is evident that the value that a given page is capable of
aggregating to another by the establishment of a link is proportional to the value of
the page that contains the outlink: a connection on the first page of Yahoo! (which
receives a high number of daily visitors) aggregates more value to the AoIR site
than a connection from my personal page (which occasionally goes days without a
single visit).
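The idea that a link is worth more when it comes from a page that is itself valued is the intuition behind PageRank (Brin and Page, 1998). The following is a simplified sketch of that idea, not Google's actual implementation. The four-page graph, the page names and the damping value of 0.85 are assumptions chosen for illustration.

```python
# Hypothetical miniature web: page -> pages it links to.
GRAPH = {
    "portal": ["aoir", "news"],
    "news": ["portal"],
    "personal": ["aoir", "portal"],
    "aoir": ["portal"],
}


def pagerank(graph, damping=0.85, iterations=50):
    """Simplified PageRank: each page repeatedly shares its score among its outlinks."""
    pages = list(graph)
    rank = {page: 1.0 / len(pages) for page in pages}
    for _ in range(iterations):
        # Every page receives a small baseline score regardless of inlinks.
        new = {page: (1.0 - damping) / len(pages) for page in pages}
        for page, targets in graph.items():
            share = damping * rank[page] / len(targets)
            for target in targets:
                new[target] += share
        rank = new
    return rank


ranks = pagerank(GRAPH)
for page, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{page:9s} {score:.3f}")
```

Here the hypothetical "personal" page receives no inlinks and therefore keeps only the baseline score, so the link it gives to "aoir" carries far less weight than the one given by "portal".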
Seen through this lens, the pre-eminence of inlinks to sites in the .br domain found
in the samples of Barnett et al. (2003) and Barnett and Jun (2004) appears
somewhat more paradoxical: what attributes would lead publishers of various
nationalities to create a profusion of links to a developing, Latin American country,
which speaks Portuguese? It is true that Brazil was not part of the set of
nationalities initially selected to comprise the authors’ sample, which probably
created a bias toward the inclusion of .br pages that possessed international inlinks
(to the exclusion of the many that do not receive such links). There is no reason,
though, for the methods and criteria in the sampling of Barnett et al. and Barnett
and Jun to have induced selection of pages with lower numbers of international
outlinks. Further, in respect to the degree of outlinkage of sites in the .br domain it
must be added that Brazil also appears among the 20 ccTLDs with the highest
weighted outdegree¹⁴ in the research done by Bharat and Ruhl (2001, p. 4-5).
Veloso et al. (2000, p. 5) observe that most of the content within websites with a .br
domain is in Portuguese (more than 75%). This being so, the existence of a
significantly larger number of international inlinks in comparison to outlinks in
Brazilian websites contradicts the supposed concentration of the international
structure of the web around websites in English (Keniston, 1999). It also goes
against the current conception of Brazilian people and institutions as being
particularly open to and desirous of international contact while the country itself is
unable to attract the interest of other nations around the world.
14
Outdegree being the value that represents the number of distinct hosts to which the host
in question provides links, the weighted outdegree is the total number of hyperlinks
established by the host in question to the pages of other hosts. Conversely, the indegree is
the number of distinct hosts which link to the corresponding host and weighted indegree is
the number of hyperlinks to pages on the corresponding host from other hosts (Bharat and
Ruhl, 2001, p. 3).
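The four measures defined in this note can be made concrete with a short Python sketch. The list of hyperlinks is invented, and the function name `degree_measures` is an assumption. Each tuple stands for one hyperlink from a page on the first host to a page on the second.

```python
from collections import Counter

# Each tuple is one hyperlink (source host, target host); invented data.
HYPERLINKS = [
    ("a.br", "b.com"), ("a.br", "b.com"), ("a.br", "c.pt"),
    ("b.com", "a.br"), ("c.pt", "a.br"), ("c.pt", "a.br"), ("c.pt", "b.com"),
]


def degree_measures(hyperlinks, host):
    """Return the plain and weighted in/outdegree of `host`, ignoring self-links."""
    out_counts = Counter(t for s, t in hyperlinks if s == host and t != host)
    in_counts = Counter(s for s, t in hyperlinks if t == host and s != host)
    return {
        "outdegree": len(out_counts),                    # distinct hosts linked to
        "weighted_outdegree": sum(out_counts.values()),  # hyperlinks sent
        "indegree": len(in_counts),                      # distinct hosts linking in
        "weighted_indegree": sum(in_counts.values()),    # hyperlinks received
    }


print(degree_measures(HYPERLINKS, "a.br"))
```

The plain measures count hosts and the weighted measures count hyperlinks, so a host that receives many links from a single partner has a low indegree but a high weighted indegree.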
Our first objective was to produce a mapping of the pattern of links between
Brazilian websites and those of other nationalities, projecting the potential
information flows (corresponding to the international hyperlinks) onto a traditional
world map. Having done this we would target our second objective, namely the
qualitative characterization, by sample, of the hyperlinks found in the most visible
sites (those that receive the highest number of international inlinks) within the most
significant flows (initially understood as the most numerous).
The first stage of the research required the building of a sample of websites in the
.br domain that receive (or emit) links from (or to) sites in other countries. To
ensure the collection of all the data needed to carry out the second stage,
we needed to know not only the address (URL) of every .br site that made up
the sample, but also the addresses of the sites
that originated or were the destination of the verified international inlinks and
outlinks relating to the .br sites. We believed that this could be done quickly: it
would be sufficient to reproduce the method adopted by Barnett et al. (2003) and
Barnett and Jun (2004), which was to search via AltaVista using the query syntax
<domain:xx AND link:yy>, where .xx and .yy correspond to the most frequent
TLDs¹⁵ and to the .br domain. We were aware of some concerns about the use of
AltaVista for collecting this type of data, but it appeared to us, at first, that the
simplicity of the procedure compensated for any inaccuracies that we
might find.
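The set of queries implied by this procedure can be sketched as follows. The code only builds query strings in the <domain:xx AND link:yy> form quoted above and sends nothing to any search engine. The list of TLDs is a short illustrative subset, not the 20 TLDs of the study, and the function name `build_queries` is an assumption.

```python
# Build query strings of the form "domain:xx AND link:yy" for each TLD pair.
# TLDS is an illustrative subset (an assumption), not the study's actual list.
TLDS = ["com", "net", "uk", "ar", "pt"]


def build_queries(home="br", others=TLDS):
    """Return queries for both link directions between `home` and each other TLD."""
    queries = []
    for tld in others:
        # Pages within the home domain that link to the other TLD (outlinks).
        queries.append(f"domain:{home} AND link:{tld}")
        # Pages within the other TLD that link to the home domain (inlinks).
        queries.append(f"domain:{tld} AND link:{home}")
    return queries


for query in build_queries():
    print(query)
```

With the 20 TLDs of the preliminary design this would amount to 40 queries, which is why the procedure seemed quick to reproduce.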
The results we obtained, however, were shockingly inconsistent, indicating a level
of instability in the search engine that was impossible to ignore (Figure 4).
15
In the preliminary conception of our investigation, these were the top 20 TLD names by host count in
January 2005 according to the Internet Systems Consortium (http://www.isc.org/).
Figure 4: At top left the screen of AltaVista with the result of a search using the syntax indicated by
Barnett et al. <domain:br AND link:uk>. The engine found 0 pages. In the centre above, the result of
a search with just the command <domain:br> and a result of 194,000,000 pages. To the right
above, the result of a search with the command <link:uk> and the outcome: 448 pages. A second
trial was made with the command <domain:.br AND link:.br> (37 pages, lower left) and <link:.br>
(65 pages, lower right). Evidently there should not exist only 65 pages with links to other .br domain
items amongst the 194,000,000 pages listed by AltaVista. The majority of the pages found using the
query term link contained the text <link> and <.br> but, curiously, not all of them proved to have this
connection with the search command used.
16
A wildcard is a special symbol which stands for one or more characters.
Figure 5: To the left, the results of searching for sites with .br domain that offer links to
http://www.aoir.org (9 results) and sites within the .ar domain that provide links to
http://www.unisinos.br (281 results). To the right, corresponding searches that attempted potential
wildcards (0 results).
On the other hand, as it was already planned that the second stage of the research
would concentrate on the sites with the highest international visibility, it appeared
reasonable to us to consider the indexing and retrieval by the search engines as an
additional indicator of the visibility of the .br sites in the sample selection. There
are, after all, three basic methods of finding a website: (a) using search engines,
(b) following hyperlinks given on other sites – which is the reasoning behind the
presupposition that there is a directly proportional relationship between the number
of inlinks and the visibility of the site – or (c) entering a previously known URL,
whether supplied by a person or institution, online or offline, or remembered from a
previous visit to the site. The analysis of hyperlinks is based, above all, on access of
type (b). The use of search engines for the composition of the sample involves
access of type (a), and in so doing emphasizes the considerably higher visibility of
the sites indexed by the search engines when compared to those not indexed. The
use of inlinks as a measure of visibility by the search engines themselves, the
basis of Google’s PageRank and now also used by Yahoo!, reiterates the
presence of access modes (a) and (b) in our sample. To include, minimally,
access of type (c)¹⁸, we also added to our sample some of the Top 100 Third-Level
Domain Names as indicated by the Internet Systems Consortium for July 2005,
whose countries of origin (indicated by the ccTLDs) were included in our initial list
of ccTLDs for analysis.
None of these considerations means, however, that the largest of all the obstacles
encountered to date has been overcome. As with AltaVista, none of the other
search engines that we tested was capable of carrying out searches that combined
a domain restriction with the location of inlinks in the way that our research
required. We therefore opted to adopt a mixed procedure, combining the potential
of the search engines with that of individual site mappings made using limited-
18
Which, it is worth noting, emphasizes a type of visibility whose inducers can be located
inside or outside the Web.
19
Many Brazilian universities do not use the SLD .edu.br, being registered as
just .br (Comitê Gestor da Internet no Brasil, 20/09/2005, s.p.).
20
Public Access data, at http://registro.br.
number of results found by each search engine and the total number of registered
hosts should function as an indication of the representativity of the sample with
respect to the complete universe of hosts. It is not reasonable to use for this
calculation the number of results indicated by the engines at the end of each
search (in the form <1-100 results of about n,000 for…>). After all, just as the
total number of pages that a search engine claims to have indexed in its database
has always been questionable, the page estimates listed at the top of a web results page
have never been accurate either²¹. Additionally, the degree of success of the
clustering²² done by each search engine would greatly influence the total number
of results indicated and the number of distinct DNS hosts effectively located.
21
A good listing and discussion of the factors that compromise the reliability of this data
can be found, for example in Price, 2005, s.p.
22
In the present context, clustering means showing only one page from the same DNS per
results page.
Figure 6: Partial view of Google’s first results page for site:.edu.br on 23rd June, 2005.
The number of results effectively offered by the engines²³ varied between 800 and
1000 addresses. These were stored as HTML files (Figure 6). The raw initial
sample was composed of 28 lists with an average of 900 addresses per list
(around 25,200 URLs). The URLs of the host pages were manually extracted from
the HTML files (Figure 7) and then organized and counted using a Perl script
(repeated addresses were substituted by an indication of how often they appeared
in the original list). The repetitions of host pages were not frequent, indicating that
the clustering of both search engines, Google and Yahoo!, is efficient. The total
number of results with single URLs listed was considered to be the real sample
size obtained for each TLD or SLD and will be used to calculate the quantitative
representativity of the samples.
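The extraction and counting steps described above can be sketched in Python. This is a stand-in written for illustration, not the manual procedure or the Perl script actually used. The sample HTML, the host names and the function name `count_hosts` are invented, and only the Python standard library is assumed.

```python
from collections import Counter
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a saved results page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def count_hosts(html_text, suffix=".edu.br"):
    """Count how often each host under `suffix` appears among the links."""
    parser = LinkCollector()
    parser.feed(html_text)
    hosts = (urlparse(href).hostname or "" for href in parser.hrefs)
    return Counter(host for host in hosts if host.endswith(suffix))


# Invented fragment standing in for one stored results page.
SAMPLE = """
<a href="http://www.exemplo.edu.br/">A</a>
<a href="http://www.exemplo.edu.br/cursos">B</a>
<a href="http://outra.edu.br/index.html">C</a>
<a href="http://www.google.com/help">D</a>
"""

for host, occurrences in count_hosts(SAMPLE).most_common():
    print(occurrences, host)
```

Counting by host rather than by full URL mirrors the substitution of repeated addresses by a count, and the number of keys in the resulting counter corresponds to the effective sample size for the list.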
23
In other words, those results whose URLs were really made available by the engine, in
contrast to the total results that the engine announced as having been located but the
addresses of which were not made available to the user.
Figure 7: Partial view of three phases of the cleaning process for Google’s first results page for
site:.edu.br on 23rd June, 2005.
24
‘Google Dance’ was a commonly used denomination of the index update of the Google
search engine, which was undertaken about once a month, and during which the Google
search results varied significantly. It has been reported that since 2003 Google has been
updating its index continuously, thus Google Dance no longer happens. There are claims,
however, that there has to be an update of the complete index once in a while and that this
could still cause similar disruptions in Google’s search results.
Figure 8: Google’s results for site:.edu.br on 23rd June, 2005 after counting (left) and organized in
decreasing number of occurrences.
We are currently working on the crossing of the lists by domain type and between
search engines. Once we have the results of this process, we shall pass on to
mapping the sites that have appeared with the highest frequency in our searches.
It is believed that a local crawler, covering up to five levels of depth, will be
sufficient to obtain a first indication of the existence or otherwise of international
outlinks. To locate inlinks, searches will be made with the query syntax
<domain:.xx AND link:URL> targeting the host page of each site and a random set
of pages from the 5 levels identified through the mapping.
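A depth-limited mapping of this kind can be sketched as a breadth-first traversal. The example below walks an in-memory dictionary rather than fetching pages, so the site map and all URLs are hypothetical. Whether the host page (level 0) counts as one of the five levels is left open here: `max_depth` is simply the deepest level visited.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical site map standing in for fetched pages: URL -> links found on it.
PAGES = {
    "http://site.br/": ["http://site.br/a", "http://www.aoir.org/"],
    "http://site.br/a": ["http://site.br/b", "http://exemplo.pt/x"],
    "http://site.br/b": ["http://site.br/"],
}


def crawl_outlinks(start, pages, max_depth=5, home_suffix=".br"):
    """Breadth-first walk of one site; returns [(depth, page, international_link)]."""
    home_host = urlparse(start).hostname
    seen = {start}
    queue = deque([(start, 0)])  # the host page is level 0
    found = []
    while queue:
        url, depth = queue.popleft()
        for link in pages.get(url, []):
            host = urlparse(link).hostname or ""
            if host == home_host:
                # Internal link: follow it unless already seen or too deep.
                if link not in seen and depth + 1 <= max_depth:
                    seen.add(link)
                    queue.append((link, depth + 1))
            elif not host.endswith(home_suffix):
                # Link to a host outside the home ccTLD: record, do not follow.
                found.append((depth, url, link))
    return found


for depth, page, link in crawl_outlinks("http://site.br/", PAGES):
    print(depth, page, "->", link)
```

Recording the level at which each international outlink is found also provides the depth value needed later for the qualitative analysis.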
With the mapping in hand we will move on to the qualitative analysis of the verified
international hyperlinks, in principle characterizing them by function (structural or
associative, as in Obendorf and Weinreich, 2003), the types of elements that
function as source and destination anchors (text or image), the meaning of the
source and destination anchors and their context (in the middle of text, at the start
of text, at the end of text, in a list etc.). The type of website will also be analyzed
(looking at, over and above the SLD type, the effective content of the page) and
the depth at which the international hyperlink is located (the host page being level
0).
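One possible way of recording these characteristics for each hyperlink is sketched below as a Python data class. All field names and the example values are hypothetical. The categories are taken from the description above, and the coding scheme itself is still to be defined in the research.

```python
from dataclasses import dataclass, asdict
from typing import Literal


@dataclass
class HyperlinkRecord:
    """One coded international hyperlink (hypothetical schema)."""
    source_url: str
    target_url: str
    function: Literal["structural", "associative"]
    source_anchor_type: Literal["text", "image"]
    target_anchor_type: Literal["text", "image"]
    anchor_meaning: str   # free-text note on what the anchors convey
    context: str          # e.g. "middle of text", "start of text", "list"
    site_type: str        # judged from the SLD and the effective content
    depth: int            # level of the page holding the link; host page = 0

    def __post_init__(self):
        if self.depth < 0:
            raise ValueError("depth must be 0 (host page) or greater")


# Invented example record.
example = HyperlinkRecord(
    source_url="http://site.br/a",
    target_url="http://www.aoir.org/",
    function="associative",
    source_anchor_type="text",
    target_anchor_type="text",
    anchor_meaning="name of an association cited as a reference",
    context="list",
    site_type="academic",
    depth=1,
)
print(asdict(example))
```

Keeping one record per hyperlink, rather than per site, would allow the same site to contribute links of different functions and depths to the analysis.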
Conclusion
In the conception of our research, we planned to reproduce the procedures
adopted by Barnett et al. (2003) and Barnett and Jun (2004), considering this
replication to be a preliminary step that would be completed quickly. However, on
attempting to replicate the procedure we found that it was now impossible to obtain
the desired results using the techniques described. Reviewing the literature, we
considered and discussed some alternative techniques for the collection of
international linkage data – all of which demanded resources that we did not have
available.
Faced with the non-viability of collecting a quantitatively representative sample of
.br sites with international hyperlinks, we sought to develop a set of procedures
that would allow us to obtain data from samples that may possibly be smaller than
usual. The methodological strategies that we are proposing are based upon
chaining together a sequence of qualitative selections and as a result are
particularly labor intensive.
Bibliography
Borgatti, S.P., “Centrality and network flow”, Social Networks, volume 27 number 1,
p. 55-71, 2005. Analytic Technologies. Available online at
http://www.analytictech.com/borgatti/papers/centflow.pdf (28th September, 2005).
Brin, S. and Page, L., "The Anatomy of a Large-Scale Hypertextual Web Search Engine".
Proceedings of the Seventh International World-Wide Web Conference, Elsevier
Science B.V., 1998. Available at
http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm (12th June,
2005).
Cullen, D., Yahoo! buys! Overture!, The Register: Sci/Tech News for the World, 14th July
2003. Available online at
http://www.theregister.co.uk/2003/07/14/yahoo_buys_overture/ (14th September,
2005)
Halavais, A. M. C., Measuring National Borders on the World Wide Web. Thesis submitted
in partial fulfillment of the requirements for the degree of Master of Arts, University
of Washington, 1998. Available online at http://alex.halavais.net/research/thesis.pdf
(20th September, 2003).
Halavais, A. M. C., Networks and Flows of Content on the World Wide Web. International
Communication Association Convention, 2003. Available online at
http://alex.halavais.net/research/halavais-ica03a.pdf (12th March, 2005).
Keniston, K., “Language, Power and Software” in C. Ess and F. Sudweeks (eds.) Culture,
Technology, Communication: Towards an Intercultural Global Village, 1999. New
York, Suny Press. Available online at
http://web.mit.edu/~kken/Public/PDF/Language%20Power%20Software.pdf (20th
September, 2005)
Overture Services, AltaVista Help, Search, Special Search Items. Available online at
http://www.altavista.com/help/search/syntax (30th May, 2005).
Page, L. et al., The PageRank Citation Ranking: Bringing Order to the Web, Stanford
Digital Library Technologies Project, 1998. Available at
http://dbpubs.stanford.edu/pub/1999-66 (10th March, 2005).
Price, G., More on the Total Database Size Battle and Googlewhacking With Yahoo.
Search Engine Watch Blog, 11th August, 2005. Available online at
http://blog.searchenginewatch.com/blog/050811-231448 (17th September, 2005)
Richardson, T., “Altavista flogged to Overture”, The Register: Sci/Tech News for the World,
19th February 2003. Available online at
http://www.theregister.co.uk/2003/02/19/altavista_flogged_to_overture/ (14th
September, 2005)
Sullivan, D., Who Powers Whom? Search Providers Chart. Search Engine Watch Reports,
23rd July, 2004. Available online at
http://searchenginewatch.com/reports/article.php/2156401 (7th June, 2005)
Thelwall, M., Research Note: in praise of Google finding law journal websites. Online
Information Review, volume 26, number 4, 2002, p. 271-272. Available online at
http://www.scit.wlv.ac.uk/~cm1993/papers/2002_In_praise_of_Google.pdf (18th
September, 2005).
Thelwall, M., SocSciBot 3, Link crawler for the social sciences. Available online at
http://socscibot.wlv.ac.uk/ (29th September, 2005)
Vaughan, L. and M. Thelwall, “Search Engine Coverage Bias: Evidence and Possible
Causes”. Information Processing and Management: an International Journal,
volume 40, issue 4, May, 2004. ACM Digital Libraries, available online at
http://www.acm.org/dl.cfm (25th September, 2005) [restricted access]
Veloso, E. et al., “Um retrato da web brasileira”, Anais do XXI Seminário Integrado de
Hardware e Software (SEMISH 00), 2000, Curitiba, Paraná, Brazil. Available online
at http://stat.akwan.com.br/~golgher/semish00.ps.gz. (15th July, 2005)
Walker, J., "Links and Power: The Political Economy of Linking on the Web", Proceedings
of the thirteenth ACM conference on Hypertext and hypermedia - Hypertext 2002.
Baltimore: ACM Press, 2002. 78-79. ACM Digital Libraries, available online at
http://www.acm.org/dl.cfm (7th June, 2005) [restricted access]
Wallerstein, I., El Moderno Sistema Mundial. Madrid, Siglo Veintiuno Editores, 1979.
Weinreich, H. et al., The Look of the Link - Concepts for the User Interface of Extended
Hyperlinks. Proceedings of the Twelfth ACM conference on Hypertext and
Hypermedia – Hypertext 2001. Denmark, ACM Press, 2001. 19-28. ACM Digital
Libraries, available online at http://www.acm.org/dl.cfm (3rd September, 2005)
[restricted access].