Mine the gap - a multi-method investigation of web-based groupware use
by
Kristian Billeskov Bøving
kristian@billeskov.dk
Supervisor:
Professor Klaus Bruhn Jensen
Department of Film and Media Studies
University of Copenhagen
Abstract
Computer mediated communication in organizations today is characterized by the
introduction of packaged, generic computer media based on Internet technology to support
communication and collaboration. This challenges conventional views and theories on how
technology is related to the context (i.e. social structures) in which it is embedded. One type
of technology introduced in organizations is virtual workspaces, a specific type of web-based
groupware. This thesis investigates the adoption and use of a virtual workspace technology in
an organization. It studies how the technology is adopted and integrated in the organization
and in specific work practices. The theory of genres of organizational communication is used
as a framework for specifying the context relevant for understanding the adoption of the
technology.
The study of computer mediated communication in temporally and geographically
distributed settings poses methodological challenges as to the observation of usage. This
study explores the triangulation of different methods for analyzing usage of the technology
and develops a method for utilizing and integrating log file analysis in a case study, which can
serve as a possible response to the methodological challenge.
Abstract.....................................................................................................................................i
Acknowledgements .................................................................................................................ii
Table of headers .................................................................................................................... iii
Table of contents ....................................................................................................................iv
Introduction .............................................................................................................................1
Framing the research ...............................................................................................................6
Organizational computing...................................................................................................6
Two imperatives and an alternative................................................................................7
IT and organizational design ..........................................................................................9
Organizations and task-technology fit..........................................................................11
Media Richness theory and its critics...........................................................................13
The emergent perspective .............................................................................................15
Structuration theory as a basis for an emergent perspective .......................................16
Adaptive structuration theory .......................................................................................18
Technology-in-practice.................................................................................................20
Selecting the relevant social structure ..........................................................................22
Organizational communication.........................................................................................22
CSCW............................................................................................................................23
Research in Group Decision Support Systems (GDSS) ..............................................24
Units of analysis for organizational communication .......................................................26
Genres of communication.............................................................................................26
The role of computer media in genre theory................................................................28
Genre repertoire and genre system ...............................................................................29
The consequence of introducing genres of communication ........................................30
Research method ...................................................................................................................32
Defining the object of study .........................................................................................32
Schools of IS research methods....................................................................................34
The case study or field study approach ........................................................................37
On Generalizability .......................................................................................................39
Level two generalizations in my study:........................................................................40
Combining Research methods ......................................................................................41
Two hypothetical studies of Lotus Quickplace............................................................43
The research design...........................................................................................................44
Mine the gap - a multi-method investigation of web-based groupware use
iv
The interviews...............................................................................................................47
Log file analysis ............................................................................................................49
The survey .....................................................................................................................52
Level-one generalizations .............................................................................................54
Sampling in a case study...............................................................................................56
HTTP-log analysis for CMC.................................................................................................58
Web mining and web usage mining .............................................................................59
A survey of the research in web usage mining ............................................................60
Mining computer-mediated communication ................................................................62
HTTP-log analysis and cryptanalysis...........................................................................64
The practical process of log analysis................................................................................65
Mapping user actions to log lines.................................................................................66
Breaking the code in Lotus Quickplace............................................................................69
Some generic technical challenges of HTTP-log analysis...............................................70
Identifying the user .......................................................................................................70
Handling caching ..........................................................................................................72
Consistency of the resource ID.....................................................................................73
Introducing virtual workspaces.............................................................................................74
Introduction .......................................................................................................................74
Method...............................................................................................................................76
Three economic models of virtual workspaces ................................................................76
The design process of virtual workspaces........................................................................78
The standards process ...................................................................................................81
The application development process ..........................................................................82
The adoption of virtual workspaces .............................................................................83
Design and the use of metaphors ......................................................................................84
The design strategy for virtual workspaces..................................................................85
Approaches to modelling the anticipated use...............................................................86
The use of metaphors ........................................................................................................89
The metaphorical landscape .........................................................................................90
The house, room or the office.......................................................................................90
The domains of reference. ............................................................................................92
Summary of virtual workspaces .......................................................................................94
The study of Quickplace use.................................................................................................95
Characterising Quickplace use..............................................................................................97
The implementation of QP..........................................................................................100
Introduction
The research documented in this thesis has been conducted under the auspices of the
DIWA (Design and use of Interactive Web Applications) research program. The ambitions of
the program are expressed in the following way:
“The goal of the program is to examine how Web-technology - as a networked,
distributed computing platform - is changing organizational IS development and use.”
Source: www.diwa.dk
This thesis investigates the adoption and use of a specific web-based communication
technology in a specific organization. The investigated technology is Lotus Quickplace,
marketed by IBM as "Instant, Secure Team Workspaces for the Web". Lotus Quickplace is a
type of web-based application that is called virtual workspace in this thesis. Virtual
workspaces have emerged as part of the .com boom. They are applications that are inspired by
groupware technologies, and they are designed to support a group of people working together
remotely. As the marketing message from IBM indicates, the value proposition is that a group
of people can instantly establish a platform for cooperation on e.g. a project across
geographical and organizational boundaries.
Virtual workspaces started out as applications that could be leased via the web for a low monthly fee or as ad-ware, but the largest commercial success seems to have come from selling the product under a more traditional software license model. The technology has spread as an internal collaboration tool in large organizations and as a platform for consulting organizations collaborating with customers on engagements.
The first commercial virtual workspaces were offered in the second half of 1999 and, according to Gartner, "team collaboration support" is now a maturing market segment (IBM 2002). "Team collaboration support" covers more or less the same applications as those denoted virtual workspaces in this thesis. According to IBM, Lotus Quickplace is used in 60% of Fortune 100 companies (IBM 2002). Virtual workspaces thus seem to have become a commercial success in terms of licenses sold.
The promise of "instant" collaboration across geographical and organizational boundaries naturally raises suspicion. My own experience with using Lotus TeamRoom, a predecessor to Quickplace built on the proprietary Lotus Notes platform, as a consultant in IBM Global Services was that such a tool is difficult to utilize and in many cases actually fails. A genuine fascination with the possibilities of utilizing virtual
workspaces, combined with an interest in the obstacles involved, spurred an interest in
investigating how virtual workspaces are actually adopted and used in an organizational
setting.
The opportunity to study the use of a virtual workspace arose in Beta Corporation, a
partner in the DIWA research program. Beta is a Nordic financial corporation that was
formed as the result of a merger during 2000 of a Danish financial corporation named Alpha,
a Swedish-Finnish financial corporation, and a Norwegian financial corporation. In May 2000
Beta had implemented Lotus Quickplace as a technology to support projects spanning more
than one country.
One of the interesting aspects of virtual workspaces introduced in organizations is that
they are introduced as though the magic word "instant" marketed by vendors can be taken
literally. At Beta the technology is introduced "as is" without education or guidelines for how
to use it. A virtual workspace is a generic groupware solution that can be used in a wide
variety of ways. It typically offers basic support for controlling access to documents, sharing
documents, integration with e-mail, discussions, and synchronous chat. This leaves the users
of a virtual workspace with the task of finding out what it should be used for and how it
should be integrated in their existing work and with the existing media for communication.
This approach to the implementation of IT-systems is new - at least at Beta. The only
predecessor might be e-mail. E-mail has been introduced in corporations as an open
communication platform without an explicit purpose or specific guidelines for how it should
be integrated into the work practice. E-mail therefore seems to be the computer medium that
can provide the best inspiration for understanding how virtual workspaces are adopted in an
organization.
The open character of the virtual workspace technology and the way it is implemented at
Beta calls for a closer examination of how it is actually integrated in diverse work practices.
Investigating this integration has been the main driver of the research process and is reflected in the research question formulated to guide the investigation.
The main purpose of the investigation is therefore to understand in detail how virtual
workspaces interact with the social structures of organizations in which they are adopted. An
important aspect of this is to understand the actual role of the technology in relation to the
work practice and in relation to other available technologies for communication and
collaboration.
A secondary purpose of the research has been to experiment with different methods for
analyzing the use of a virtual workspace. The use of virtual workspaces is distributed in time
and space, which renders traditional observation of their use very difficult. This poses a methodological challenge for detailed studies of technology use and has therefore been the other main driver of the research process. Virtual workspaces offer an opportunity
that has not been developed in previous research on the use of technology in organizations.
Virtual workspaces are web-based. This means that the client used is a browser and that the
communication between client and server is achieved via a web or HTTP-server. All HTTP-
servers adhere to a de facto standard for logging activity known as the Common Log File
Format. This provides a possibility for using the HTTP-log as part of the empirical
investigation of virtual workspace use.
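The Common Log File Format mentioned above can be parsed with a few lines of code. The following is a minimal sketch in Python; the sample log line is a constructed example, not taken from the Beta material, and real server logs may contain additional vendor-specific fields beyond the seven shown here.

```python
import re

# Common Log Format: host ident authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def parse_clf_line(line):
    """Parse one Common Log Format line into a dict, or return None if malformed."""
    m = CLF_PATTERN.match(line)
    if m is None:
        return None
    entry = m.groupdict()
    # The request field combines the HTTP method, the resource, and the protocol.
    parts = entry["request"].split()
    if len(parts) == 3:
        entry["method"], entry["resource"], entry["protocol"] = parts
    return entry

# A constructed example line (not from the case study):
line = '10.0.0.1 - jdoe [10/May/2000:13:55:36 +0200] "GET /ProjectX/index.html HTTP/1.0" 200 2326'
entry = parse_clf_line(line)
```

Because the authenticated user and the requested resource appear in every line, each log entry can in principle be read as "who touched which document when", which is what makes the log usable as empirical material.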
My investigation combines interviews, document analysis, a questionnaire, and HTTP-log analysis in the study of technology use, and it maintains a particular focus on the possibilities of log analysis. The analysis of HTTP-log files is being
used to some degree in HCI (Human-Computer Interaction) and commercially to analyze the
interaction between a user and a web-application. It has not yet been used to understand
communication between users mediated by a web application. The thesis proposes some
promising applications and identifies limitations and pitfalls of using HTTP-log analysis for
understanding computer-mediated communication.
The thesis is divided into six main sections, which are presented here:
Framing the research
The purpose of this section is to provide the theoretical framework for investigating how
virtual workspaces are adopted in an organization. This involves a discussion of theories that
have been important in previous research on organizational computing, especially in the
research of related technologies such as group support systems, CSCW and e-mail. I have
chosen structuration theory developed by Anthony Giddens (1984) and extended by Klaus
Bruhn Jensen (2000) as the overall framework for understanding the dynamics of social
structures in the organization. I have taken Yates' and Orlikowski's (1992) application of
genre theory and structuration theory in the concept of genres of organizational
communication to specify the social structures relevant for understanding the adoption and
use of Lotus Quickplace at Beta. Lastly, I have used Orlikowski's (2000) distinction between
the technology as an artefact and the technology-in-practice to explain the interaction between
the properties of the virtual workspaces and the social structures of the organization.
Research method
This section discusses methodological issues related to doing research on the use of
technology in an organization, and presents the research design. It focuses on the issues
related to combining qualitative and quantitative traditions of research. The research
presented in this thesis combines interviews, a survey, and HTTP-log analysis as the primary
sources of data in a case study. As we shall see, this has both advantages and potential pitfalls.
The research design and the collection and analysis of data are presented here to enable
the reader to assess the conclusions, which have been drawn when the results of the case
study are presented.
HTTP-log analysis
A separate section is devoted to HTTP-log analysis for two reasons. HTTP-log analysis
and log analysis in general is not a commonly used method in research in technology use in
organizations. The methodological and practical challenges of using log analysis are therefore
discussed in detail. The second reason is that the use of HTTP-log analysis to investigate
computer mediated communication presents a novel approach to utilizing HTTP-logs. The
previous analyses of HTTP-logs in HCI and in commerce have focused on session-based
analysis. A document-based analysis is introduced as a method for analyzing computer-
mediated communication.
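The distinction between the two modes of analysis can be sketched in code. The following Python fragment is purely illustrative: the field names and the sample entries are invented for the example and do not come from the case material.

```python
from collections import defaultdict

def sessions_by_user(entries):
    """Session-based view: group the requested resources by the user who made them.
    This is the perspective of HCI and commercial web analytics."""
    sessions = defaultdict(list)
    for e in entries:
        sessions[e["user"]].append(e["resource"])
    return dict(sessions)

def readers_by_document(entries):
    """Document-based view: for each resource, which users touched it.
    For computer-mediated communication this shows who read or wrote a document."""
    readers = defaultdict(set)
    for e in entries:
        readers[e["resource"]].add(e["user"])
    return dict(readers)

# Invented sample entries, shaped like parsed Common Log Format lines:
entries = [
    {"user": "anna", "resource": "/qp/report.doc"},
    {"user": "bo",   "resource": "/qp/report.doc"},
    {"user": "anna", "resource": "/qp/agenda.doc"},
]
```

The same log lines feed both views; the analytical shift is only in what the entries are grouped by, which is why the document-based analysis can be added to a standard log-processing pipeline at little extra cost.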
Introducing virtual workspaces
Preceding the reports from the case study, a separate section is devoted to an analysis of
the virtual workspace technology. This I have done by means of a comparative analysis of
seven different virtual workspace products. Both the design process and the user interface and
functionality are analyzed. The design process of virtual workspaces is characterized both by a heavy reliance on the development of Internet standards and by the fact that they are designed as generic software products. The analysis of the user interface and functionality shows that the
functionality across the products is more or less the same. The main differences between the
products lie in the approaches taken to model the anticipated use situation and in the different
metaphors used in the design. The "metaphorical landscape" is introduced as a method for
analyzing user interfaces.
The study of Quickplace use
The analysis and findings from the case study at Beta are divided into five sub-sections,
which reflect the explorative nature of the study.
The first sub-section gives an overall characterization of the introduction and adoption of
Lotus Quickplace at Beta. The history of Lotus Quickplace at Beta is recounted, and the use of QP is characterized, based primarily on data from the survey and the log analysis.
The next sub-section looks more closely at three QPs at Beta and gives a detailed
account of the use of these. In this section three different instantiations of genres of
communication from the three Quickplaces are analyzed in detail as a basis for understanding
the exact role of the Lotus Quickplace technology. This analysis also addresses the role of
additional media. The analysis shows that genres of communication involve more than one
medium. The study of the instantiations of genre exemplifies a valuable contribution from log
analysis to the understanding of computer media use.
Following this I zoom out again to explore another perspective on the use of log
analysis. The document-based log analysis is used as a method for characterizing patterns of
computer-mediated communication. In this analysis different data-mining techniques are
applied.
The fourth section suggests a theoretical model for understanding a specific aspect of the
use of virtual workspaces: maintaining a folder structure is a central aspect of integrating a
virtual workspace within a work practice, and a model is suggested to explain how folder
structures develop over time. The model is substantiated by an analysis of the development of
folder structures in three Quickplaces based on log analysis and interviews.
Based on the model for understanding folder structures and the study of genres of
communication some general implications are drawn on how we should understand the virtual
workspace technology.
Conclusions and implications
The last section summarizes and evaluates the findings of the thesis. This section is
divided in three parts. The first part addresses the findings related to the research question
posed. The second part summarizes the experiences with using log analysis, and the last part
is devoted to the presentation of some practical consequences of the findings for designers
and users of virtual workspaces.
Organizational computing
The idea of using computers to support various tasks in organizations has existed since
the early days of computing. At first, when computer power was expensive, they were
primarily used for processing information such as handling invoices or inventories. One of the
first industries to exploit computers was the banking industry. Now, computers have entered
most aspects of organizational life. They have moved from being conceived solely as information-processing machines and are now also researched and used as a medium for communication.
Research into issues regarding organizational computing has been conducted using a
variety of names: Information Systems, Management Information Systems, information
processing systems, information and decision systems, organizational information systems,
etc. Orlikowski and Baroudi (1991) use the broad term "Information Systems Research" to
denote research that from different theoretical perspectives studies information technology in
organizations. Information systems research is divided into a number of sub-disciplines such as DSS (Decision Support Systems), GDSS (Group Decision Support Systems), CSCW (Computer Supported Cooperative Work), CMC (Computer Mediated Communication), PD (Participatory Design), HCI (Human-Computer Interaction), and others.
Most research into organizational computing is focused on producing theories that can be useful for guiding practice. M. Lynne Markus and Daniel Robey state the imperative of IS research rather precisely:
“Good theory guides research, which, when applied, increases the likelihood that
information technology will be employed with desirable consequences for users,
organizations, and other interested parties.” Markus and Robey (1988) p. 583
This characteristic implies that IS research is performed in the same system in which it is later applied: it is done with the purpose of being applied in the same kind of organizations in which it is performed. The results of the research presented in this thesis are intended to promote the understanding of how IT is integrated in organizations, and they are intended to be applicable in improving the design and adoption of computer media in organizations.
Markus and Robey (1988) distinguish between three conceptions of the relation between IT and organizations. In the technological imperative, organizational forms act as the dependent variable and IT is treated more or less consistently as an independent variable.
The organizational imperative treats the IT as the dependent variable. It holds that
“...human actors design information systems to satisfy organizational needs for information.”
Markus and Robey (1988) p. 587. One example of the organizational imperative is media richness theory, which holds that information needs vary with task equivocality and variety and that media are chosen accordingly (Daft and Macintosh (1981), Daft and Lengel (1986)). According to this theory, e-mail would be used for routine communications, while face-to-face communication would be used for equivocal and unique situations. Another example is
theories of task-technology fit, for example Zigurs and Buckland (1998). The organizational
imperative argues basically that managers and developers of IT can choose rationally to
design an IT system according to their needs and that subsequently the system will be used
accordingly. As Markus and Robey (1988) note, the empirical support for the organizational imperative is limited. In her classic case study, Markus (1994) has shown that media richness theory is not a good theory for understanding how e-mail is used.
As a third perspective Markus and Robey present the emergent perspective.
“The emergent perspective holds that the uses and consequences of information technology emerge unpredictably from complex social interactions.” Markus and Robey (1988) p. 588
An early example of the emergent perspective used in an empirical study is Barley's
(1986) study on the use of CT scanners in two radiology departments. The study of two
radiology departments introducing and using CT scanners shows that the same properties of
the technology produced two different outcomes. Barley concludes his article by stating that “structuring theory”, as he labels his contribution, “...departs from previous approaches to the
study of technology by postulating that technologies are social objects capable of triggering
dynamics whose unintended and unanticipated consequences may nevertheless follow a
contextual logic.” Barley (1986) p. 107
He argues here that neither the technological nor the organizational imperative is useful
for explaining his case. Rather, studying the context in which the technology is put to use will
produce useful theories for understanding the relationship between IT and organizations. The
theoretical background on which Barley builds his theory is structuration theory.
In the following, the three perspectives on the relation between technology and
organization are introduced in more detail by using examples of research performed from the
perspectives. The purpose is to clarify some of the issues involved in the relationship and thus
the background for the thesis.
The primary means of coordination used in the four organizational types are described
below the name of the type, so whereas the machine bureaucracy primarily coordinates
activities through standardized work processes and output, the professional organization
coordinates primarily through standardized skills and norms.
According to Groth, the basic contributions of IT are the following three, ranked by significance:
1. IT can process information outside of the mind
2. IT improves information storage capabilities
3. IT improves communication
This ranking of IT contributions shows how information processing is still considered the most important aspect of IT. It also shows that while this thesis explores the possibility of using IT to support communication, some authors, such as Groth, express their disbelief in the value of supporting human communication with IT compared to other uses.
“... not to say that personal communication and networking are unimportant, but they
have limited potential compared to other uses of information systems.” Groth (1999) p. 14
Additionally, the ranking documents a contemporary account of the technological imperative. Groth's vision for the future organizational design is the model-driven organization.
“The main constituting parts of the organization will be the integrated computer-based systems, their programmed patterns of action, and, implicitly, the conceptual model they are based on. The coordination of the organization members will then be mediated mainly by the systems and thereby (logically) by the model, not by direct human communication.” Groth (1999) p. 356
His central point is that the model you create of an organization is no longer a passive
model like an organization diagram, but that once it is built into a computer system it
becomes an active model. It is active because it simultaneously defines and describes the
organization. This characteristic was first expressed, although with a completely different
emphasis, by Shoshana Zuboff (1988) as a fundamental duality of information technology.
“Activities, events, and objects are translated into and made visible by information when
a technology informates as well as automates.” p. 10
While previous technologies only produced concrete products, IT simultaneously
generates information about the process in which the products are produced. Groth uses the
duality of information technology in an argument for letting the model drive the organization
in the sense that the model is embedded in IT systems which at the same time describe and
"are" the producing organization. The central systems of financial organizations like the one
studied in the present thesis can actually be analyzed in this way. Groth's own examples
include airplane design and car production. He concludes with three basic models: the regulating model, the mediating model, and the assisting model. These three models result in five different organizational forms: the joystick organization, the flexible bureaucracy, the interactive ad-hocracy, the meta-organization, and the organized cloud (pp. 402-403).
Groth's work illustrates research on IT in organizations from a perspective other than the one chosen for this thesis. Groth suggests that other approaches, rather than the support of
communication between humans, might produce more radical results in terms of
organizational effectiveness. Groth's approach is basically to build a model organization and
represent the model in an IT system. In fact, he conceives that the building of the model is the
same process as the building of the IT system. The IT system should then guide the operations
of the organization. However, the approach of groupware and communication technologies is
to construct simple tools that let people construct simple local models of coordination.
The bank of the case study is, like most other organizations, an organization that uses
both strategies for organizational computing. Processes such as financial transactions or
counselling processes for people wishing to acquire loans are modelled so that the
coordination between employees who take part in these processes is mediated by the model
and not directly by the employees. At the other end the use of virtual workspaces, e-mail,
LAN-drives etc., is meant to support the local construction of coordination/communication
between employees.
Besides serving as a contemporary account of the technological imperative, Groth also expresses an often-heard opinion that groupware is not worth betting on. His opinion
aside, it is a fact that groupware is being used in a number of organizations. E-mail is now a
medium as important as the telephone in many organizations, and other kinds of groupware
have also been deployed, and among these, the virtual workspaces of our case study
organization.
Underlying both the technological and the organizational imperative is the belief that the relationship between organizational structure and the properties of the technology should be studied independently of specific organizational configurations and specific technologies.
Theories of task-technology fit are examples of the organizational imperative.
A school of thought that has had an impact on the study of groupware is decision support theory (see e.g. Jarvenpaa (1989)). Decision support theory hinges on the idea that organizations are units producing decisions. A natural role for IT systems is therefore to support decision-making.
Saunders and Jones (1990) have, for example, attributed the use of media to different
phases in the organizational decision-making process. They divide the decision-making
process into three phases: identification, development, and selection. The role of written media such as e-mail is attributed to the last of these phases.
A special type of decision is the group decision, and Group Decision Support Systems have consequently been built to support them. This area has been intensively investigated; see Fjermestad and Hiltz (1998-1999) and Fjermestad and Hiltz (2000) for overviews of the empirical research.
Stemming from the decision support school are theories of task-technology fit. The task-technology fit theories are based on the idea that there are a limited number of task types that can be mapped to the design of an IT system. The IT system should thus enable the group of people performing the task to accomplish it better: faster, with fewer errors, with fewer conflicts, or with greater satisfaction.
Zigurs and Buckland (1998) present a theory of task-technology fit for Group Support
Systems. We have previously quoted the rationale behind the proposed theory:
“Can we specify particular combinations of task and GSS (Group Support Systems)
technology that will enhance group performance?” Zigurs and Buckland (1998) p. 314
The task-technology fit theory answers the question positively based on a typology of
tasks and a typology of characteristics of computer support. The different tasks are divided by
their structural properties, and the typology consists of tasks labelled simple, problem, decision, judgement, and fuzzy. The types of computer support are reproduced in the table below.
Communication support: “any aspect of the technology that supports, enhances or defines the capability of group members to communicate with each other.” p. 319

Process structuring: “any aspect of the technology that supports, enhances, or defines the process by which groups interact, including capabilities for agenda setting, agenda enforcement, facilitation, and create a complete record of group interaction.” p. 319

Information processing: “the capability to gather, share, aggregate, structure, or evaluate information, including specialized templates.” p. 319
By introducing different kinds of fit between technology and task, Zigurs and Buckland
propose a model for perfect matches between technology and task. The match is expressed in
five propositions, one for each type of task. For example the proposition on simple tasks
states:
“P1: Simple tasks should result in the best group performance (as defined for the specific task) when done using a GSS configuration that emphasizes communication support.” p. 326
The proposition for fuzzy tasks states:
“P5: Fuzzy tasks should result in the best group performance (as defined for the specific
task) when done using a GSS configuration that emphasizes communication support and
information processing, and includes some process structuring.” p. 328
The theory of task-technology fit exhibits a common approach to researching and
theorizing in IS. It defines some generic structure of tasks or processes, which can be matched
to IT systems with certain properties, independently of the specific organizational setting. The
theory also exhibits some of the limitations of this kind of theorizing.
Firstly, the categories for both task and computer support are too generic to be of
practical use in the design of IT systems. As the first proposition quoted above shows, linking a simple group task with communication support is of limited value for designing IT
or for making decisions about implementing IT in an organization. Communication support is
a very abstract notion that could involve a number of different media such as telephone, e-
mail, instant messaging, virtual workspaces, etc.
While acknowledging that existing organizational structures and issues of
implementation are not insignificant, the theory states that, despite all the noise, at the core
there is a perfect match between task and technology. As we shall see, when we turn to an
overview of the research on Group Decision Support Systems, the empirical evidence for this
kind of perfect match is lacking.
As an example of the structural properties relevant to the adoption of the technology, Orlikowski mentions the competitive culture of the consulting organization. This culture had the
consequence that employees were reluctant to share documents with each other.
The study exemplifies the emergent perspective in the sense that it identifies specific
cognitive and social structures in the organization that are necessary for understanding how
the technology is used.
Regardless of whether generic theories of IT and organizational design (e.g. Groth
(1999)) or of task-technology fit are true or useful, a study such as Orlikowski's identifies
specific problems of introducing IT that are perhaps of more practical importance. There is, however, an ambiguity in her interpretation of the results, which could be read as pointing in two directions. On one reading, the users' mental models of the technology are an issue of implementation: the fact that the technology was implemented in a certain way by management could explain the mental models. On the other reading, the identification of the structural properties of the organization points in a direction that has greater consequences for how the relationship between IT and organization should be understood. In later papers she focuses on the social structures as the most profound finding (e.g. Orlikowski (2000)).
Both Markus (1994) and Orlikowski (1996) argue for the importance of the specific
social structures in organizations for the understanding of how IT is introduced in
organizations. Structuration theory formulated by Anthony Giddens has been proposed as a
theoretical frame for addressing this. Jensen (2000) conceives of the relationship between agents, structure and media as a trichotomy that extends Giddens' dichotomy of structure and agents.

[Figure: Jensen's trichotomy of structure, agent, and medium]
While Giddens' concept of duality of structure addresses how social structures interact
with agents, the introduction of the media element results in three different kinds of
interactivity. This distinction allows us to specify the relationship studied in this thesis. While
the discussion until now has been structured around the relationship between organizations
and technology, structuration theory includes the agents as an important element. The agents
create and recreate social structures.
The three interactivities are agent - medium, agent - structure, and medium - structure. The interactivities, specified in a setting of organizational computing and computer media, result in three different kinds of relations to look for:

Agent - medium: the relationship between agent and medium is the relationship studied in HCI. It analyzes the agents' interactions with a medium.

Agent - structure: the relationship between agent and structure is e.g. the study of computer mediated communication. It studies how computer media affect communication.

Structure - medium: this relation studies how social structures in an organization interact with the properties of computer media.
The focus of this thesis is to study the agent - structure interactivity. While this is the focus, the strength of Jensen's model is that it insists that the three elements cannot be studied independently. While the focus is on the interactivity between agent and structure, it is the
purpose of the research to analyze how this interactivity is affected by the medium. The
agent-medium and structure-medium interactivities should be included in observations on
agent-structure interactivity.
Structuration theory has been used as the starting point for creating a number of theoretical accounts of how collaborative technologies are adopted in organizations. It has been used for
characterizing two different processes in relation to IT: the design of IT, and the adoption of
IT. This distinction has not until recently (Orlikowski (2000) provides an attempt) been
clarified. Orlikowski and Robey (1991), Lyytinen and Ngwenyama (1992), Orlikowski
(1992), Desanctis and Poole (1994) all focus their use of structuration theory on how social
structures are built into IT. Social structures are therefore primarily thought of as something
built into IT.
Two specific uses of structuration theory have offered themselves as theories for
understanding Group Support Systems (Desanctis and Poole (1994)) and CSCW (Lyytinen
and Ngwenyama (1992)). We shall therefore deal with them in more detail.
actions), their sophistication (see Desanctis and Gallupe (1987)) and by comprehensiveness
or richness. The more comprehensive the system the greater the number of features
offered to users.
2. The spirit of these features
Spirit can be identified by treating the technology as a "text" and developing a reading of
its philosophy based on analysis of:
- the design metaphor
- naming and presentation of features
- the nature of the user interface
- training materials and on-line guidance
- other training or help provided with the system
The structure built into the technology is a social structure along with others like the "task" and the "organizational environment" (examples provided by Desanctis and Poole (1994)).
They describe the second aspect of structuring (the adoption of IT) in the following way:
"When the social structures of the advanced information technology are brought into
action, they may take on new forms. That is, interpersonal interaction may reflect rules and
resources that are modified from the advanced information technology. For example, when a
group uses voting rules built into a GDSS, it is employing the rules to act, but - more than this
- it is reminding itself that these rules exist, working out a way of using the rules, perhaps
creating a special version of them. In short the group is producing and reproducing the GDSS
rules for present and future use."
This description of the process of adopting a technology actually ignores the social
structures existing in the context where an IT system is put to use. It only deals with the social
structures that are built into the IT system. AST uses "appropriation" as the concept for
explaining how the technology is integrated in the work environment. In the fourth
proposition they state that:
"New social structures emerge in group interaction as the rules and resources of an
advanced information system are appropriated in a given context and then reproduced in
group interactions over time." Desanctis and Poole (1994)
By appropriation they mean the following aspects of adoption:
- A structural feature of the system can be appropriated in different ways by the group: it can be adopted directly, combined with other social structures, or interpreted and reflected upon.
- Appropriations can be faithful or unfaithful, according to whether they adhere to the "spirit of the technology".
- Features can be appropriated for different instrumental uses.
The problem with AST is twofold. Firstly, technology is considered as something that represents social structure. Orlikowski (2000) notes that this is a misinterpretation of Giddens'
theory. As Jensen's (2000) development of the structuration theory suggests, we should
separate the media (IT) from the social structures. The concept of the duality of structure
implies that social structures only exist through the enactment of agents in recurrent
situations. Contrary to this, properties of technology, despite their origin in social processes,
have a material existence independent from their use.
The second problem with AST has to do with the limited aspects of the use situation
captured by the concept of appropriation. The concept only focuses on how the structures that are built into the technology affect the use. It therefore tends to ignore social structures such as existing patterns of communication, power structures, and culture.
Technology-in-practice
Orlikowski (2000) argues that we should draw a distinction between the technology as
artefact, and the technology-in-practice. The technology as artefact is an entity with certain
properties that we for example describe as functionality. These properties should neither be
thought of as properties determining specific uses, nor as social structures that are “built” into
the artefact, as suggested by AST and Lyytinen and Ngwenyama (1992). In other words, we should distinguish clearly between the social processes involved in the design of IT and the social structures involved in adopting IT. The result of the social processes of design exists as fixed properties of an IT artefact when the IT is adopted.
What happens in the use situation is that users interact with some properties of the
technology at hand, while ignoring most of them, and in this interaction create and recreate
the social structures that constitute work.
"Through their regularized engagement with a particular technology (and some or all of
its inscribed properties) in particular ways in particular conditions, users repeatedly enact a
set of rules and resources, which structures their ongoing interactions with that technology."
Orlikowski (2000) p. 407
Thus she corrects the categorical mistake made by AST and her own previous attempts
in Orlikowski (1992) and Orlikowski and Robey (1991). The technology as artefact thus
represents the material properties of a technology that are fixed at the point of adoption.
Technology-in-practice is characterized in the following way:
“These enacted structures of technology use, which I term technologies-in-practice, are
the sets of rules and resources that are (re-) constituted in people’s recurrent engagement
with the technologies at hand.” Orlikowski (2000) p. 407
While users interact with some of the properties of the IT artefact, they do not interact with all of them, nor can the designer predict which properties users will engage with. This differs from AST, which hinges on the notion that the properties of IT have specific effects on the adoption. AST leaves some room for variation, but basically posits a specific link between the properties of IT and the resulting use patterns.
Orlikowski mentions the World Wide Web technology originally designed by Tim
Berners-Lee (Berners-Lee and Fischetti (1999)), as an example of a technology which would
defy the notion of a direct link between its design and the resulting use. Another obvious
example is the e-mail technology, which Orlikowski has herself studied but for some reason
does not mention. In Bøving (2001) the design of the e-mail standard is analyzed and there is
by no means a direct link between its design and the use observed in the numerous studies on
e-mail in organizations. (See Garton and Wellman (1995) for an overview on e-mail
research).
The conceptual distinction between the technology as artefact and the technology-in-
practice also seems a useful distinction for understanding the observations made in this thesis.
The character of the design process of virtual workspaces as well as the use patterns observed
in the case study suggest that there is no direct link between the properties of the technology
and the resulting use patterns. At least there are other social structures more important for
understanding how the technology is adopted.
Organizational communication
The research field of organizational communication has been around since the late 1930s
(Jablin and Putnam (2000)) and has from different perspectives studied diverse aspects of the
communication taking place in organizations.
According to Stanley Deetz (Jablin and Putnam (2000) p. 4 - 5), organizational communication can be approached through three different conceptions:
1. Organizational communication as a specialty of communication departments and
communication associations.
2. Communication as a phenomenon that exists in organizations alongside other phenomena
3. Communication as a way to describe and explain organizations
In the third approach organizational communication becomes an alternative theory of
organizations to the decision making (e.g. Simon (1977)) and information processing (e.g.
Media Richness Theory) approaches, which we have discussed previously and which are
primarily based on psychological and economic theories.
In the context of this thesis, communication is treated as such an alternative theory of
organizations. Before we proceed to the choice of a theory that uses structuration theory as
the starting point for understanding organizational communication and its relation to
computer media, we need to deal more specifically with alternative theories for understanding
groupware. The virtual workspace technology studied here should be considered a kind of
groupware, and both the CSCW and GSS traditions offer frameworks for understanding
groupware.
Mine the gap - a multi-method investigation of web-based groupware use
22
Framing the research
CSCW
The IS research tradition of which we have seen a number of examples, and studies of
the impact of communication technologies on organizational communication, are
characterized by a modest interest in specific technology designs. Most articles refer to rather abstract characterizations of technologies’ properties such as “communication support”, “decision modeling” and “rule-writing capability” (Desanctis and Gallupe (1987)).
Computer Support for Cooperative Work has established itself as a research tradition with a biennial American conference since 1986, a biennial European conference since 1989, and the Journal of CSCW. Grudin (1994) provides an overview of the CSCW tradition; Hughes, Randall et al. (1991) and Schmidt and Bannon (1992) attempt a definition of the field of CSCW. The CSCW tradition seems rather isolated from the other related IS disciplines presented so far and is characterized by many experiments with the construction of CSCW systems (e.g. Conklin and Begeman (1988), Bentley and Dourish (1995), Bentley, Horstmann et al. (1997), Guzdial, Rick et al. (2000)). It is also characterized by numerous accounts of methods for designing CSCW systems (e.g. Grudin (1991), Grudin (1994), Teege (2000), Büscher, Gill et al. (2001)).
The tradition has also produced theories of the nature of cooperative work relevant for
understanding CSCW. These are based on notions such as a distinction between "work" and
"coordination of work", and a notion of articulation work. CSCW draws on a number of
research fields, and there is no unified theory of work underlying CSCW. However,
cooperation, coordination and articulation work stand out as central concepts in the CSCW
understanding of work.
The rationale behind CSCW is, as implied in the acronym, to support what is termed
"cooperative work".
"CSCW should be conceived as an endeavour to understand the nature and
characteristics of cooperative work with the objective of designing adequate computer-based
technologies". Schmidt and Bannon (1992)
At least for several researchers in CSCW, coordination and articulation are seen as basic
concepts for understanding the work supported by computers Schmidt and Bannon (1992),
Schmidt and Simone (1996), Suchman (1996), Divitini and Simone (2000). The concept of
articulation work stems from Strauss (1985). In the words of Schmidt and Bannon (1992)
articulation work amounts to:
"First, the meshing of the often numerous tasks, clusters of tasks, and segments of the
total arc. Second, the meshing of efforts of various unit-workers (individuals, departments
etc.). Third, the meshing of actors with their various types of work and implicated tasks."
Schmidt and Bannon (1992)
Underlying the idea of articulation work is a distinction between articulation work and
more basic work tasks. This distinction indeed makes sense in settings where physical labour is involved, for example in a production facility: the coordination of efforts through the means of articulation is distinct from the completion of the actual work of, for example, manipulating steel. Whether the distinction is relevant for symbolic work settings (e.g. office work) is more ambiguous. One "side-effect" of the notion of articulation work concerns the role of communication in understanding work.
The role of communication in this theory of work is that it is conceived as a means of
articulating work. This has the "hidden" consequence that communication is not understood
as basic work. The purpose of a CSCW system is to facilitate the articulation of work and
"thus augment the capacity of the ensembles in articulating their distributed work." (Schmidt and Bannon (1992)). In production settings where the basic work is seen as the physical
manipulation of materials, the role of communication as a means for the articulation of work
seems like a very useful distinction. In the settings of work dealt with in the context of this
thesis, where all that is manipulated is symbols, the role of communication as an articulator of
work seems limiting. In symbolic work communication should be considered basic work. The
outcomes of the activities of symbolic work could, in many cases, actually be characterized as
communication.
In an early experiment, Watson, Desanctis et al. (1988) assigned groups to one of three conditions: a computer-based support system (GDSS), a paper and pencil based support system, or no support at all. The common task of the groups was to act as a philanthropic organization deciding how to allocate money among six projects competing for funding. The results of the experiment were such that:
“In general, the GDSS technology appeared to offer some advantages over no support,
but little advantage over the pencil and paper method of supporting group discussion.”
Watson, Desanctis et al. (1988) p. 463
Group Decision Support Systems are in general systems that support meetings where
people are interacting synchronously. One GDSS called GroupSystems (now a commercial
company www.groupsystems.com) has been used in 55% of 54 case and field studies of
GDSS Fjermestad and Hiltz (2000). GroupSystems is a tool that supports online meetings and
has tools for supporting the decision process, including collaborative generation of ideas (an electronic whiteboard), surveys, and voting among the participants. The typical experimental research design also places people in the same location, each with a computer screen.
While the term GDSS denotes research generally based on decision-making, with the "group" as a special case, the broader term "Group Support Systems" has emerged. GSS also includes experiments that are not specifically focused on decision-making, and includes what is referred to as CMC systems.
At least 200 experiments have been conducted on GSS (Fjermestad and Hiltz (1998-
1999)) compared to 54 case and field studies (Fjermestad and Hiltz (2000)).
“The results show that the modal outcome for GSS systems compared with face-to-face
(FtF) methods is "no difference," while the overall percentage of positive effects for
hypotheses that compare GSS with FtF is a disappointing 16.6 percent.” Fjermestad and Hiltz
(1998-1999) p. 7
The approach taken by GSS and GDSS has not produced much empirical evidence for
the underlying theories of group decision-making and task-technology fit. The research in GSS is typically based on the notion of a group task. A recent theory of task-technology fit was introduced previously; it defined task in the following way:
“Thus, a group task is defined here as the behavior requirements for accomplishing
stated goals, via some process, using given information.”
Zigurs and Buckland (1998)
Apart from the lack of empirical evidence for its usefulness, the notion of task is problematic for studying the use of virtual workspaces in another sense as well. When we look at the use of Lotus Quickplace at Beta, the process of defining and changing tasks during the integration of the technology is important. In the settings we look at, tasks are rarely defined in advance: the goals are clear only on an abstract level and are not entirely agreed upon by all members, the process is not defined, and the information needed is not given in advance. Thus the three defining criteria for a task as formulated by Zigurs and Buckland are not met.
Much of what goes on in the settings where Lotus Quickplace is used is that members of
a group negotiate goals, process, and information needed. This is an important part of
working and part of the rationale behind designing a technology such as Lotus Quickplace.
Instead of focusing on the technologies-in-practice, the concepts of group decision process and task assume a rational definition of work that ignores the specific social structures involved in the adoption of IT.
Genres of communication
The theory of genres of organizational communication is based on two major theoretical
developments. The first is that of structuration theory, which we have dealt with earlier, and
the second is the concept of genre, which is drawn from rhetorical theory.
As noted in Orlikowski and Yates (1994) the approach taken is to treat organizational
communication as an alternative theory of organizations to the decision making and
information processing approaches we have discussed above, which are primarily based on
psychological and economic theories. In this respect they agree with Deetz (2000) mentioned
earlier.
The theory of genres of organizational communication has adopted the concept of genre from rhetorical theory and uses the premises of structuration theory as a framework for understanding organizational communication, how it develops, and how media, and especially computer media, affect and are affected by the genres of communication. From rhetorical theory Orlikowski and Yates adopt Miller's (1984) definition of genres as "typified rhetorical actions based on recurrent situations" (p. 159).
They define the overall concept of genres of organizational communication in the
following manner:
“…a genre of organizational communication is a typified communicative act having a
socially defined and recognized communicative purpose.” Yates and Orlikowski (1992) p. 3
Genres of organizational communication are therefore types of communicative acts such
as project meetings or meeting agendas. Henceforth genre is used as an abbreviation of genre
of organizational communication.
Three different aspects define genres: the social rules, form, and content. While the form
and content are the observable properties of the genre the social rules are social structures in
Giddens' sense, which are only observable through their effect on the form and content of the
genre. The content of the genre includes “… social motives, themes, and topics being expressed in the communication.” (Yates and Orlikowski (1992) p. 301). The form is “… the observable physical and linguistic features of the communication." The medium used for the genre is considered an aspect of the form of the genre.
"Media are the physical means by which communication is created, transmitted, or
stored. Genres are typified communicative actions invoked in recurrent situations and
characterized by similar substance[content] and form." Yates and Orlikowski (1992) p. 319
A project meeting, for example, has certain social rules. There is a project manager who
issues invitations for the meeting and decides who should participate. A social rule dictates
formal cancellation if one cannot participate in the meeting. The form of the project meeting concerns aspects such as the typical existence of an agenda, the fact that the meeting is held in a room, and the appointment of a person to produce minutes from the meeting. The content of the
project meeting concerns the themes dealt with in the meeting. (A project meeting typically
discusses the project plan and whether the project is on target or delayed.)
The agenda of a meeting is another example. Its social rules may include that it is produced by the chairperson of the meeting, in some cases at the meeting itself and in other cases sent out in advance. The form of the agenda could be that it is a Word document attached to an e-mail with the invitation to the meeting. The content of the meeting agenda is the subjects to be dealt with at the meeting, in a specific order.
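For readers who think in data structures, the three defining aspects of a genre can be sketched as a record. This is purely a hypothetical illustration of the distinction, not part of the genre theory itself; all names are invented here, and the values merely restate the agenda example above.

```python
from dataclasses import dataclass

@dataclass
class Genre:
    """Hypothetical record of a genre of organizational communication.

    Form and content are the directly observable aspects; the social
    rules are social structures in Giddens' sense, observable only
    through their effect on form and content.
    """
    name: str
    social_rules: list  # e.g. who produces the communicative act, and when
    form: list          # observable physical/linguistic features, incl. the medium
    content: list       # themes and topics expressed

meeting_agenda = Genre(
    name="meeting agenda",
    social_rules=["produced by the chairperson",
                  "sometimes written at the meeting, sometimes sent in advance"],
    form=["Word document attached to the e-mail invitation"],
    content=["subjects of the meeting, in a specific order"],
)
```

The point of the sketch is only to keep the three aspects apart when analyzing concrete communicative acts.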
A very important feature of genres is how they change, and this is where Yates and
Orlikowski put Giddens' idea of structuration into play. As introduced earlier, one of the
central concepts in the theory of structuration is the duality of structure: social structures regulate how agents interact in situations, while the same agents, in the same situations, create, recreate, and change the very structures that regulate them. This is known as the process of structuration.
Adhering to the distinction between technology as an artefact with fixed properties and
technology-in-practice, the concept of genres of organizational communication offers a
specification of relevant social structures for understanding the use of computer media. The
process in which genres of communication are reproduced and changed using computer
media is thus an example of technology-in-practice.
While computer media will affect the existing genres, the reverse effect is perhaps the
most important contribution for understanding the adoption of computer media:
"Any time a new communication medium is introduced into an organization, we expect
that existing genres of communication will influence the use of this new medium, though the
nature of this influence will reflect the interaction between existing genres and human action
within specific contexts." Yates and Orlikowski (1992) p. 318
Genre systems consist of multiple genres. The meeting documentation, for example, consists of meeting logistics, meeting agenda, and meeting minutes.
The study is based on the contents of all messages posted to three TeamRooms over a seven-month period. The total body of messages was 492, and the content analysis was followed by interviews with members from each of the TeamRooms.
Genre systems clarify that communicative acts cannot be understood without
understanding preceding and subsequent communicative acts. The concept of genre systems
will be used as the primary unit of analysis when adopting the genre theory in the context of
this thesis.
As stated in the introduction of this thesis the secondary purpose of the thesis is to
experiment with multi-method studies of the use of virtual workspaces. More specifically the
use of log file analysis will be explored in combination with interviews and survey data. The theory of genres of organizational communication has implications for how to research genres.
Yates and Orlikowski (1992) suggest that both diachronic analysis and synchronic analysis of
genres of communication could be useful. Their study of how the memo genre has evolved in
business organizations is an example of a diachronic analysis. On synchronic analysis they
state the following:
"Synchronic analyses would identify the existing genres influencing communication and
media use within certain contexts, either by searching for the presence of well-established
genres such as the memo or the meeting, or by identifying genres based on detailed analysis
of communication form, substance, and the invoking situation." Yates and Orlikowski (1992)
p. 322
The investigations presented in this thesis identify genres based on a detailed analysis of
specific situations in which a genre is used. The purpose of the study is not to identify the
typical genres used in relation to a virtual workspace, but to provide examples of the detailed
adoption of specific genres and illustrate how log analysis can provide additional insights in
this process. Particularly for the analysis of genre systems (multiple related communicative
"moves"), log analysis provides important insights into the relationship between the
individual communicative moves not captured by content analysis alone.
Genre studies are typically based on the analysis of content. This is the case both in the context of organizational communication and in the aesthetic analysis of media products in general, such as films and novels.
“Genre analysis requires qualitative textual analysis of messages to understand the
situations within which certain genres are invoked and their shared purpose, substance and
form.” Orlikowski and Yates (1994)
Textual analysis is not used as a method in the study reported here. This will, of course, limit what can be said about the content aspects of a genre. Instead, the study shows how log file analysis, which provides a detailed account of how communicative actions are linked temporally in a genre system, can extend the analysis of genres and genre systems.
Research method
Before we can proceed to the analysis of the virtual workspace technology, and report
from the case study, we need to reflect on the research methods chosen for the investigation.
A research method and a research design allow certain kinds of conclusions to be drawn and rule out others. This section is devoted to discussing methodological issues of studying the use of computer media in an organization, issues specifically related to utilizing log analysis, and to presenting the research design of the case study.
In this thesis I devote relatively more text to these methodological considerations than is customary, due to the experimental character of the case study. There are two main issues that have not been resolved systematically in the literature on research method:
1. How can we use quantitative data from user actions to investigate a research question which would normally be classified as an interpretive research question lending itself to qualitative methodologies?
2. What is the validity of HTTP-log files when using them to analyse the use of a web-based technology such as Lotus Quickplace?
The purpose of this section is, firstly, to discuss the methodological issues of combining quantitative and qualitative data, and secondly, to report the considerations made during the different phases of the case study, so that the reader can judge the conclusions drawn against the actual process of gathering and analysing data.
A central issue in the research question is the relation between the technology and the social structures. As discussed in the previous section on existing research, there are two
dominant imperatives for this relation: the technological imperative, and the organizational
imperative. The technological imperative states that the properties of the technology affect the
social structures of the organization (e.g. the organizational form (Leavitt and Whisler (1958),
Simon (1977), Groth (1999))). In positivist research terms, the social structures are treated as
the dependent variable in the technological imperative. The organizational imperative treats
the technology as the dependent variable, and states that IT is designed to satisfy needs that
are a result of the social structures in an organization (e.g. the media richness theory (Daft and Macintosh (1981), Daft and Lengel (1986))).
As Orlikowski and Iacono note, studying either the social implications of technology or
the properties of the technology independently from its use is analytically hazardous.
“By following specific artefacts over time, it should become clear that changes occur not
only in the social, behavioural, and economic circumstances within which the artefacts are
embedded (resulting in the so called “societal” or “organizational transformations” that we
hear so much about) but also that changes are constantly occurring in the IT artefacts
themselves – whether through invention, innovation, regulation, expansion, slippage,
upgrades, patches, cookies, viruses, workarounds, wear and tear, error, and failure.”
Orlikowski and Iacono (2001) p. 132
As others also have stressed (see e.g. Markus and Robey (1988)), both imperatives are
too general to be useful in understanding how a technology such as Lotus Quickplace
interacts with the social structures at Beta. While this opens the space for new understanding
of the relationship between technology and organizations, it leaves us with the rather daunting
task of answering the question:
What properties of the artefact and which social structures are relevant to the
relationship between technology and social structures?
Let me repeat the research question in order to bring it into the light of this discussion:
How do the properties of an IT artefact for organizational communication in a group of
people interact with the social structures and result in a work practice in which the IT artefact
plays a role?
The phrases “properties” and “interact with” indicate that the research here is based on
the assumption that we cannot understand the relation between technology and social
structures if we separately choose either one of the imperatives. This is true on a general level
and implies that theories, which do not deal with the specific social structure in the
organization, will not improve the understanding of how technology and organization
interact. From another perspective, however, we should not ignore that technologies do affect organizations and vice versa. When technology and organization meet, factors and consequences combine in different ways, producing a result in which both the technology and the organization are affected. The properties of the IT artefact will allow for certain usages of the technology and rule out others.
The concept of “social structures” used in the research question is as vague a term as any. Social structures can be investigated on many levels in an organization. In the context of this thesis, social structures are specified through the notion of genres of organizational communication. The kinds of social structures studied here are those relevant to the genres of communication used by “groups of people”. The notion of groups of people means that we are not dealing with e.g. strategic communication from management, for which the Intranet is used at Beta.
The formulation of the research question also indicates (at least it should!) the kind of answer one might expect to provide. The purpose of a case study such as the one reported here is to provide a set of generalizations which make the observations relevant to researchers or professionals dealing with related issues.
I can rule certain generalizations out:
- It is not the purpose of this thesis to draw conclusions on the general behaviour,
ethics, aesthetics etc. of people. No generic sociological implications are drawn from
the research.
- It is not the purpose to draw sociological implications on the dynamics of
organizations from the study.
- It is not the purpose of this study to provide better algorithms or designs of virtual
workspaces.
The purpose of the study is to understand the object, a virtual workspace, and its relation to the social setting of use. The research provided here is intended for other IS researchers as well as practitioners (not primarily for sociologists or humanists in general), and its generalizations are made in order to improve the understanding, implementation and design of other IT systems for communication in other organizations.
As the field of IS is interdisciplinary and combines research traditions from natural
science, humanities, and sociology, the choice of method is not at all self-evident.
While the terms positivist and interpretive originally refer to epistemologies, in IS they refer rather to approaches to research. They are named after their underlying epistemology, but carry with them different theoretical frameworks, methodologies, data analysis methods, and data collection methods.
Orlikowski and Baroudi characterize positivist research as studies which
“…are premised on the existence of a priori fixed relationships within phenomena which
are typically investigated with structured instrumentation. Such studies serve primarily to test
theory in an attempt to increase predictive understanding of phenomena.” Orlikowski and
Baroudi (1991) p. 5
The theory of task-technology fit (Zigurs and Buckland (1998)) or media richness theory
(Daft and Lengel (1986)) are examples of theories which are characterized as positivist.
Empirical research based on a positivist epistemology can, in principle, use both qualitative
and quantitative methods, but positivist IS research is associated with quantitative methods.
Quantitative research methods investigate the relationship between one or a few dependent
variables and a number of independent variables. Examples mentioned previously include the
experimental research performed in the tradition of GSS (see Fjermestad and Hiltz (1998-
1999) for an overview). Gallupe, Desanctis et al. (1988) provide a specific example in which
measures of decision quality and individual perceptions are dependent variables, and the
difficulty of the decision task and whether the process is supported by a GDSS are
independent variables. This yields four experimental conditions (high vs. low difficulty *
GDSS support vs. no GDSS support).
Orlikowski and Baroudi provide the following characterization of interpretive research:
Interpretive research “… assume that people create and associate their own subjective and
intersubjective meanings as they interact with the world around them. Interpretive
researchers thus attempt to understand phenomena through accessing the meanings that
participants assign to them.” p. 5
While the positivist research approach involves quantitative methods, the interpretive
research approach is associated with qualitative methods taken from sociology, ethnography
and other related fields.
The positivist research approach was the first and dominant approach in IS research.
Interpretive and critical research have emerged as rival approaches. Very illustrative of this is
the abstract of Benbasat, Goldstein et al. (1987). "The article defines and discusses one of
these qualitative methods - the case research strategy. Suggestions are provided for
researchers who wish to undertake research employing this approach."
In terms of placing the present case study within the three traditions, the research question rules out critical research. It is not the aim of this case study to shed light on underlying contradictions and alienating social conditions.
As for the remaining two research traditions, it is the intention of this work to try to
combine interpretive and positivist research traditions. I have not in my research proposed
specific propositions or hypotheses that I wished to test, and in that sense neither the
theoretical framework nor the methodology are taken from the positivist tradition. On the
level of data analysis methods and data collection methods I have however used methods
from the positivist research tradition.
The study presented here is based on the theoretical framework presented earlier, which
in turn is based on two interrelated perspectives for understanding computer media.
Technology can on the one hand be described as an artefact with certain properties. On the
other hand it can be described as something embedded in a social practice referred to as
technology-in-practice. For understanding technology-in-practice, the theory of genres of
communication is used. This theory carries with it a certain methodology, as well as data
analysis methods and data collection methods. The empirical research performed using genre
theory uses the analysis of texts such as e-mails or documents in a specific groupware
application, but the purpose of this study is to explore new methods for collecting and
analyzing data. Rather than basing the research on a pre-packaged research approach
described in the six levels presented by Jensen (2002), part of the purpose of the study is to
explore how log files and log analysis as data collection and data analysis methods can be
combined with interview and survey data. Some of the results presented will be based on the
theoretical framework of genre theory, but some of them will be characterized by a focus on
exploring the value of log analysis for field study research of computer media use.
The case study approach is used in IS research alongside the more traditional approaches of controlled experiments and surveys. Next to surveys and controlled experiments, it is the most widely used research design in IS (Orlikowski and Baroudi (1991)). The case study strategy is described in e.g. Benbasat, Goldstein et al. (1987), Yin (1994), and Walsham (1995). The concept of field studies has also been used in IS (Klein and Myers (1999)), and there seems to be no general agreement on the distinction between case and field study, other than as an indicator of the research background of the researcher.
The main characteristic of a case or field study in IS is, as in most other disciplines, that phenomena are studied in their natural contexts (as opposed to experimental laboratories) and that inferential statistics are not used in the process of generalizing observations.
Yin (1994) p. 13 defines a case study as "an empirical inquiry that investigates a
contemporary phenomenon within its real-life context, especially when the boundaries
between phenomenon and context are not clearly evident." According to this broad definition,
the study reported in this thesis is an example of a case study.
Yin (1994) identifies six sources of evidence or data relevant to performing case studies.
These are presented in the following table with a description of the types of data used in this
study.
• Documentation: Standard Operating Procedures for using Lotus Quickplace and the Intranet have been studied.
• Archival Records: Applications for opening a Lotus Quickplace, gathered since the introduction of the technology in the organization.
• Interviews: Interviews with managers of Quickplaces and with the people responsible for the introduction of the technology have been conducted.
• Direct Observations: HTTP-log files could be characterized as a kind of direct observation.
• Participant-observation: —
• Physical artefacts: The Lotus Quickplace technology has been analyzed both in terms of its functionality for the user, and for understanding the relationship between what a user does and how this is represented in the log file.
HTTP-log files are characterized as a special type of direct observations, which combine
certain characteristics of archival records and direct observations. Archival records are
characterized by Yin (1994) p. 80 as:
• stable - can be retrieved repeatedly
• unobtrusive - not created as a result of the case study
• exact - contains exact names and details of an event
• broad coverage - long span of time, many events and many settings
• precise and quantitative
All of these properties are also properties of log files. They differ, however, in a very
important way from archival records in that they are not produced intentionally by members
of the organization studied. The information present in the log files is a combined product of
both the de-facto standard HTTP-log format and of the technical design of the Lotus
Quickplace technology. In this sense log files are very different from an archive of all
applications for using Lotus Quickplace, which have been sent to the technical manager of the
Lotus Quickplace server.
In this respect it might be better to characterize the HTTP-log as a type of direct
observation. Yin characterizes direct observations as:
• reality - covers events in real time
• contextual - covers context of the event
Clearly HTTP-log files only capture a very limited aspect of events, and they do not
capture what Yin calls the context of the event. Comparing them to Yin's typology, log files
represent a new type of data for case studies and the study presented here is partly devoted to
exploring their qualities as a data source for case or field studies.
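To make the nature of this data source concrete, the following sketch parses a single line in the de-facto standard HTTP log format (the Common Log Format). The example line, including the user name and path, is hypothetical, and the fields actually logged depend on the server configuration.

```python
import re
from datetime import datetime

# Regular expression for the de-facto standard Common Log Format:
# host ident authuser [timestamp] "request" status bytes
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<method>\S+) (?P<path>\S+) \S+" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_clf_line(line):
    """Parse one Common Log Format line into a dictionary, or return None."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["time"] = datetime.strptime(entry["time"], "%d/%b/%Y:%H:%M:%S %z")
    return entry

# A hypothetical log line of the kind a Quickplace server might produce:
line = ('10.0.0.5 - jdoe [12/Mar/2001:14:02:31 +0100] '
        '"GET /gic/PageLibrary/doc1.htm HTTP/1.0" 200 4523')
entry = parse_clf_line(line)
```

Each parsed entry ties an authenticated user to a timestamped request for a specific page, which is precisely the kind of record on which the analyses in this thesis depend.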
On Generalizability
Before we get down to the discussion of data analysis methods and data collection methods, a little more space will be devoted to the issue of generalizability. Discussing generalizability is a useful way into the issues of combining research traditions. The generalizations made from a study constitute its results, but more than that: the way the generalizations are made determines whether the study counts as valid within a research tradition. Therefore we shall discuss some general (!) issues of generalizing, as a background for the generalizations made from the present case study.
A central question that has consequences both for the design of an empirical study and
the results drawn from the study is the question of generalizability. First of all, empirical
research must show results or insights that in some way are general beyond the empirical
setting. This means that it must be generalizable beyond the organization in which the case
study has been undertaken or beyond the specific technology used.
The most common way of thinking about generalizability is the statistical generalization
from a sample to a population. The methods of statistical inference are used to assess whether
a characteristic of a sample (e.g. that 30% of males between 20 – 34 watch football against
only 5% of females over 50 found in a sample of 2000 Danes) can be generalized to the
whole population (the Danish population). One of the basic requirements for a valid
generalization from sample to population is that the sample is chosen randomly.
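To illustrate this kind of statistical generalization, the margin of error around a sample proportion can be computed with the usual normal approximation. The figures below echo the hypothetical example above and are not real survey data.

```python
import math

def proportion_confidence_interval(successes, n, z=1.96):
    """95% normal-approximation confidence interval for a sample proportion."""
    p = successes / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return p - margin, p + margin

# Hypothetical: 300 of 1000 sampled males aged 20-34 report watching football.
low, high = proportion_confidence_interval(300, 1000)
```

Given a randomly drawn sample, the interval states the range within which the population proportion is expected to lie; this is the inferential machinery that, as noted next, does not carry over to qualitative generalization.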
This kind of generalization is not the only one relevant to research. In qualitative research, the generalization from sample to population and the methods of inferential statistics do not apply. Two other kinds of generalization are relevant here: 1.) generalization across technologies with similar properties, and 2.) generalization across organizations with similar social structures.
Let us take an example. At some point in this thesis I will conclude that, based on the case study, "end-user design is essential for establishing use of Lotus Quickplace at Beta".
Considering generalizations under 1.) concerns properties of the Lotus Quickplace
technology. If the technology has properties that we can find in other technologies, and if
these properties are essential for the relation of the technology to end-user design, then we
would have something that is generalizable to these other technologies. Following from that,
we would expect that exchanging Lotus Quickplace for another technology X with the same
essential properties would produce the conclusion: "end-user design is essential for
establishing the use of technology X at Beta".
Considering generalizations under 2.) concerns the social structures of the organization.
Rather than looking for properties of the technology, we would be looking for social
structures of Beta that firstly can be found in other organizations, and secondly are essential
for its relation to end-user design. By combining the two, we are striving for a generalization
of our conclusion as stating something like "end-user design is essential for establishing the
use of technology X in organization Y."
The generalizations both on technology and genres of communication are based on
specific properties. If we generalize from Beta to other organizational settings, we do so
because of some specific properties of the genres of communication at Beta. The type-two
generalizations such as the example above will therefore essentially be a discussion of which
properties of both technology and genres of communication (which we have observed in the
case study) are the essential ones. That is to say, the essential ones for the relationship
between technology and social structures stated in the generalized conclusion.
The type-two generalizations made in a thesis are usually made at the end, to allow the reader to assess whether he or she agrees with them, so we shall leave them for now and turn to the issues of how to combine research approaches and present the design of
the present research. This includes the practicalities of combining data collection methods and
data analysis methods in a case study.
The advantages and challenges of combining research methods have been discussed by a
number of authors (see e.g. Kaplan and Duchon (1988), Lee (1991), Mingers (2001),
Jensen (2002)). John Mingers argues that:
“Different methods generate information about different aspects of the world. The
information is used to construct theories about the world, which in turn condition our
experience of the world. It is both desirable and feasible to combine together different research methods to gain richer and more reliable research results.” Mingers (2001) p. 243
It sounds both true and important to try to combine research methods. The difficult
question that follows is how this combination can be exercised in practice. The paper by Lynne Markus (1994), which we have discussed earlier, is an empirical example of
combining quantitative and qualitative methods. The main purpose of her paper is to
challenge media richness theory and propose better alternatives based on social construction.
In her study, Markus uses a survey based on statistical sampling to test the propositions from
media richness theory. In parallel, she uses text analysis of e-mails and interviews, which
were then analyzed interpretively as a basis for proposing better alternative explanations for
e-mail use.
“…a better explanation for e-mail use patterns at HCP [ the case study organization ]
may be found in views shared by most HCP managers about what various media were good
for – social definitions of media appropriateness that do not necessarily reflect the material
characteristics of the technology, such as its objectively-defined or individually-perceived
degree of information richness.” Markus (1994) p. 519
Her approach to the combination of research methods is to divide her research question
into two parts. The first to test the propositions of information richness theory; the second to
propose better alternative explanations of e-mail use. Together the combination creates a
much stronger argument than researching the two parts in two separate case studies. If the
organizations had differed, one might have argued that organizational differences
undiscovered by the case studies could disturb the result.
Markus (1994) combines quantitative and qualitative methods in a more specific way than described above. When she combines the analysis of e-mail archives with interviews, she engages in what is called triangulation. The term triangulation is taken from trigonometry. Webster's Dictionary defines it as:
“any similar trigonometric operation for finding a position or location by means of bearings from two fixed points a known distance apart” (Webster's Dictionary)
Triangulation is originally the technique for determining a position in two-dimensional space by means of two measurement points and the geometry of triangles. Triangulation is used metaphorically in multi-method research approaches to describe the study of a phenomenon from different angles, thereby gaining a better view of its character. In Jensen (2002) p. 272 it is described as "a general strategy for gaining several perspectives of the same phenomenon."
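To unpack the metaphor, the underlying geometric operation can be sketched as follows; the coordinates and bearings are of course made up for illustration.

```python
import math

def triangulate(ax, ay, bx, by, bearing_a, bearing_b):
    """Locate a point from two fixed observation points and the bearings
    (in radians, measured from the positive x-axis) at which the point is
    seen from each. Assumes neither bearing is vertical."""
    # Each bearing defines a ray: y - ay = tan(bearing_a) * (x - ax), etc.
    ta, tb = math.tan(bearing_a), math.tan(bearing_b)
    # Solve the two line equations for their intersection point.
    x = (by - ay + ta * ax - tb * bx) / (ta - tb)
    y = ay + ta * (x - ax)
    return x, y

# Hypothetical: observers at (0, 0) and (10, 0) sight the target at 45
# and 135 degrees respectively; the two rays intersect at (5, 5).
x, y = triangulate(0, 0, 10, 0, math.radians(45), math.radians(135))
```

The point of the metaphor is visible in the geometry: a single bearing fixes only a direction, and only the combination of two independent sightings fixes a position.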
Norman K. Denzin (Denzin (1989), Denzin and Lincoln (2000)) is one of the developers
of triangulation as a strategy for integrating multiple perspectives on the same phenomenon.
He distinguishes four basic types of triangulation (reprinted from Denzin and Lincoln
(2000)):
1. Data triangulation: the use of a variety of data sources in a study
2. Investigator triangulation: the use of several different researchers or evaluators
3. Theory triangulation: the use of multiple perspectives to interpret a single set of data
4. Methodological triangulation: the use of multiple methods to study a single problem
Before I proceed to the presentation of the research design, which will allow us to
discuss the issues raised here much more concretely, I will present two hypothetical studies.
This will clarify in which sense I combine qualitative and quantitative research methods.
The experimental study would investigate the effect of a new design of the document
views in Lotus Quickplace on the time spent to find a specific document. Document views are
the lists of documents in a repository, which are used to select the document you wish to read.
Our hypothesis would be that design Y shortens the time spent on finding a document
compared to the traditional design X. The independent variable would then be the design of
the document views and the dependent variable would be search time. We would then design
an experiment where a number of randomly selected people would solve a pre-defined task of finding certain documents using the new design Y. A control group given the old design X would solve the same tasks. The results would then be analyzed to see whether there was a significant effect of the independent variable on the dependent variable. (See e.g. Smith, Cadiz et al. (2000) for a
similar research design).
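The analysis step of such an experiment can be sketched minimally as follows, using made-up search times and a hand-rolled Welch t statistic; a real study would also compute degrees of freedom and a p-value.

```python
import math
import statistics

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples with unequal variances."""
    mean_a, mean_b = statistics.mean(sample_a), statistics.mean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    # Standard error of the difference between the two sample means.
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    return (mean_a - mean_b) / se

# Hypothetical search times in seconds: old design X vs. new design Y.
times_x = [42, 51, 38, 47, 55, 44, 49, 41]
times_y = [31, 36, 28, 39, 33, 30, 35, 32]
t = welch_t(times_x, times_y)
```

A large positive t would indicate that the shorter search times under design Y are unlikely to be due to chance, supporting the hypothesis.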
The statistical study would investigate the ideal size for the group of users using a
Quickplace. The hypothesis to be tested would be that the size of the group is significant for
the successful use of a Quickplace. The independent variable would be the size of the group,
and the dependent variable would be successful use. One would then randomly select a sample of Quickplaces. The study could, through log analysis, capture the independent variable by counting the number of unique users present in the log files (or establish the number of users by some other means). The dependent variable could be captured by a questionnaire sent to a sample of users, to establish whether the Quickplace is a success or not.
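Capturing the independent variable from the logs might look like the following sketch, assuming (hypothetically) that a Quickplace name and an authenticated user name can be extracted from each log entry.

```python
from collections import defaultdict

def unique_users_per_place(entries):
    """Count distinct authenticated users per Quickplace.

    Each entry is assumed (hypothetically) to be a (place, user) pair
    extracted from the server's HTTP-log, with "-" marking requests
    made by unauthenticated users."""
    users = defaultdict(set)
    for place, user in entries:
        if user != "-":          # skip unauthenticated requests
            users[place].add(user)
    return {place: len(names) for place, names in users.items()}

# Hypothetical extracted log entries:
entries = [("gic", "anna"), ("gic", "bo"), ("gic", "anna"),
           ("sales", "carl"), ("sales", "-")]
counts = unique_users_per_place(entries)
```

Note that such a count measures only users who leave traces in the log, which is one reason the log data alone cannot settle what counts as "successful use".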
It is well known in IS that both the implementation process and the organization in which the technology is introduced are important factors in determining the success of a technology. This could be controlled for by choosing a sample of users that all had a similar implementation process, or by documenting the implementation process through, for example, the questionnaire and having the sample reflect different types of implementation processes.
The two hypothetical studies exemplify research designs which would appear possible to pursue in the case study presented here. However, there are two reasons why this is not the case:
1. They would not produce answers to the questions I am interested in asking.
2. They would have required a more conscious top-down research design than has actually been the case.
Let us look at how the case study actually was designed (or emerged).
DIWA is an acronym for Design and use of Interactive Web Applications, and the program
performed empirical studies of both design and use processes of interactive web applications.
Our study was classified as a study of use. We worked with partly individual research questions, and this has affected in particular the way in which we conducted the interviews.
Besides the case study at Beta, the present thesis also contains a comparative analysis of technologies similar to Lotus Quickplace (virtual workspaces). The comparative analysis is included in the thesis because it strengthens some of the conclusions drawn from the case study. As discussed earlier, technology can partly be conceived as an artefact and partly as part of a technology-in-practice. In the following section the technology will be analyzed as an artefact. I will devote my main energy to the methodological issues of the case study, since the case study forms the primary data source of the thesis.
The case study has been primarily based on three sources of data:
1. Interviews with selected managers of the Lotus Quickplaces and the people
responsible for introducing and managing the technology.
2. HTTP-log files from the Lotus Quickplace server
3. A survey among managers of the Lotus Quickplaces
The numbering of the data sources does not rank them according to importance; rather, it reflects the time sequence of the study.
[Figure: timeline of the study, showing its phases: the use case study, the interviews, the log analysis, and the pilot survey within the Quickplace study]
The results of the first part of the study entitled “Use Case study” are not reported in this
thesis. This study analyzed the use of “use cases” in an IT-development project. Use cases
(see Fowler and Scott (1997)) are a specific genre of documents used in various phases of the
development of an IT-system. Some of the results of the study are reported in Bøving (2001).
The study of the use of "use case" documents and the technologies supporting their use could have been relevant to this thesis. The problem is that the technology used in the IT-development project for exchanging "use case" documents was a custom groupware system based on Lotus Notes. Mixing two studies based on two different technologies would blur the conclusions presented here.
In spite of this, I mention the use case study because insights into and interests in the
workings of the organization as well as document genres from this study have been used as
input for the Lotus Quickplace (QP) study.
The research design of our case study was not finalized before starting the study. From the outset we planned, and agreed with our Beta contact person, that we would perform approximately 10 interviews and obtain log files from the Lotus Quickplace server for analysis. As the analysis of the log files proceeded, I had the idea that a survey might provide useful insight into what the Quickplaces were used for. The log files do not reveal the purpose of the activity observed, nor do they link the activity to concepts such as work, teams or groups. Despite the fact that log files were a planned part of the empirical data from the outset, the analyses of the log files have developed significantly over time. At the start, I did not have a clear idea of what information the log files could provide and which specific analyses would be useful.
Beta has been a partner of the DIWA project since the launching of the project in 1999.
This includes a study of the Intranet in the Danish part of the organization, the use case study
mentioned above, and a study of a unit responsible for supporting the organization with
methods for managing projects and re-using knowledge across projects.
Preceding my engagement as a Ph.D. researcher in the DIWA project, I worked as a management consultant for IBM Global Services. There I was involved in a project spanning several phases and approximately one and a half years of planning and construction of the Intranet for the Danish part of the organization. This work has given me much insight into the processes of this organization, and specifically into how IT is managed and projects are done. It has also provided experience in interacting with this organization, which has eased the process of getting access to information.
The experiences gained from my engagement as a consultant have provided background insights beneficial to the general introduction to Beta and Lotus Quickplace. Some of the knowledge of how IT is managed and implemented stems from the interviews conducted as part of the case study, but some of it (such as my understanding of Standard Operating Procedures and how they are used) stems from my experience as a consultant. This blend of roles in my relationship with Beta could be seen as a potential methodological problem. When we conducted the interviews, I knew some of the interviewees from my engagement as a consultant and therefore had a special relationship to them. One could argue that this would influence their answers. I feel confident that I have taken precautions to avoid this in my study. The reader is asked to make up his or her own mind, as I seek to lay open the sources of the descriptions I provide and the conclusions I draw.
The interviews
The interviews, conducted in collaboration by Jesper Simonsen, Keld Bødker, Jens Kaaber Pors and myself, constitute the first data collection from the case. The purpose of the
interviews was partly to investigate how Quickplace was implemented and used in general,
and partly to investigate a specific Quickplace called GIC, which was used by the "Group
International Communication" department. Our primary contact at Beta was employed in this
department, which was also the department responsible for introducing the Lotus Quickplace
technology in the organization.
For the planning of the interviews, we used a number of tools which I had previously used as a management consultant, primarily because we had differing research questions. An obvious problem for a researcher collaborating with researchers who have differing interests in the interviews is that you only get answers to your own questions in the interviews you conduct yourself - unless you do something about it. We used
a data-gathering matrix (see appendix 1). This is a tool for planning the gathering of data used
in the issue-based consulting methodology, which is used widely in consulting companies
(e.g. IBM). The issue-based approach is, for example, used by Kunz and Rittel (1970) to build
an issue-based information system, IBIS. The data-gathering matrix is a document that is
created collaboratively by the participants. The idea is that the data collection can be distributed among the participants based on a shared understanding. The rows of the matrix contain issues (or research questions), hypotheses, and key questions whose answers should confirm or disconfirm the hypotheses. The columns of the matrix contain data sources; in our case these were interviews, log files, the survey, and QP observations/document analysis.
The data gathering matrix was used to create an “interview matrix” (see appendix 2).
The rows of the interview matrix contained the key questions checked in the data-gathering
matrix as relevant for the interviews. Rather than having data sources as columns in the
matrix, each interviewee was given a column. We went through the interview matrix together and, for each key question, checked off the person we thought could answer it. The selection of interviewees for each question was based on our knowledge of the interviewees gained from our primary contact at Beta, and on the fact that we had about one hour for each interview. The interview guides (see appendix 3 for an example) were then generated automatically from the interview matrix, followed by adjustments, since some of the interviews would otherwise have had too many questions.
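The mechanics of generating an interview guide from such a matrix can be sketched as follows. The interviewee names and key questions here are invented for illustration; the actual matrices are reproduced in appendices 1-3.

```python
# Hypothetical sketch of the interview matrix: rows are key questions,
# columns are interviewees, and a check marks who should be asked what.
# All names and questions below are invented for illustration.
interview_matrix = {
    "How was Quickplace introduced?":        {"Alice": True,  "Bob": False},
    "Who decides when a QP is created?":     {"Alice": True,  "Bob": True},
    "How is the GIC Quickplace used daily?": {"Alice": False, "Bob": True},
}

def interview_guide(matrix, interviewee):
    """Collect the key questions checked off for one interviewee."""
    return [q for q, cols in matrix.items() if cols.get(interviewee)]

print(interview_guide(interview_matrix, "Bob"))
```

Generating the guides mechanically is what ensures that every participant's research questions are covered across the interviews as a whole.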
The data gathering matrix and the interview matrix enabled us to create interview guides
which consisted of a collection of questions which would answer (at least partially) all of our
individual research questions. The rigid character of the data-gathering matrix had the effect
that we only used it for planning the interviews and not for guiding the analysis of the data.
None of us had a clear enough picture of what the survey and the log file analysis should be
used for at that point in time. However, it ensured that we were able to plan an interview
process where different research questions could be investigated using the same data.
Besides the interviews, the log files have been the most important source of data for this study. While interviewing is a well-proven technique in case and field studies, the use of log files as data is not. Therefore I will devote considerable space to the discussion of log file analysis.
Hyper Text Transfer Protocol (see RFC 1945 for HTTP 1.0 and RFC 2068 for HTTP 1.1). It serves requests from a browser or some other HTTP-compliant client.
All HTTP-servers (see www.netcraft.com for a list of web servers and surveys of the most popular ones) have built-in support for maintaining an HTTP-log. The HTTP-log is not specified in a standard, but two formats have emerged as de facto standards.
The fact that the logging format follows a de facto standard makes it much cheaper to analyze the log files, because it is possible to build general software that can analyze log files across different sites and different HTTP-server implementations. This was one of the reasons for using the HTTP-log of the Lotus Quickplace as data for the case study.
In the IS tradition of studying IT systems, log files can be used as a means of observing how an IT system is actually used. When one studies the use of an IT system, one can either study users' accounts of how they use it, or try to observe the users while they use it. Log file analysis has only been used in experiments in laboratory settings and not as part of field studies. One can only speculate about the reasons. One possible reason is that only with the standards-based Internet has the analysis of use patterns come into focus. It may also be due to the fact that log files are problematic for observing use for a number of reasons:
1. They show very few aspects of a user's activities. Two identical lines in a log file might therefore document use processes that would differ significantly in a direct observation of use.
2. Log files are typically not designed by the researchers. The creators of the software typically design them with another purpose in mind than the study of use patterns. Therefore the log files often do not contain the information required by a researcher. This problem can be solved in very controlled settings, such as in experimental research where the IT system is custom designed to log the wanted information.
For real-life studies of IT use, using log files as a data source is not common. I have been unable to track down a case study or other empirical real-life study in IS using log files as a data source. Therefore the use of log files in this thesis must be considered explorative. As we shall see when the results of the log file analysis are presented, log files might serve as a mediator between qualitative accounts of usage based on observations of use, which suffer from a lack of any indication of their generality, and surveys, which give a general picture of usage that is not related to how systems actually get used.
While the log files have serious shortcomings, as we shall see in more detail later on,
they still seem to be an interesting solution to the problem of observing use of Lotus
Quickplace. Generally, systems for distributed collaboration have some built-in challenges for
the case study researcher:
1. The users are distributed geographically.
2. The usage is distributed in time.
The problem with the geographical distribution of users is that they are difficult for a researcher to observe. If the users are few in number and in predictable locations, it might be possible to observe usage, but if there are many, it soon becomes infeasible. In addition, web-based systems, including Lotus Quickplace, can be used from different client machines. It is therefore unpredictable in which locations the usage takes place. Finally, the usage is distributed over a timescale of up to weeks and months.
[Figure: events related to the same document occurring at different locations (B, C, ...) and at different points in time (T1, T2, ...), linked together through the log file]
Through the ID of a document on which a user performs some action, and a username in
the log file, we can link together events that are spread organizationally and temporally,
which would otherwise have been difficult for a researcher to discover.
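The linking described above can be illustrated with a small sketch. The tuple format (timestamp, user, document ID, action) and the sample events are assumptions for illustration, not the actual Quickplace log format.

```python
# Minimal sketch: each parsed log line is (timestamp, user, doc_id, action).
# The field names, IDs and actions below are invented sample data.
events = [
    ("2001-09-03 09:12", "user_a", "doc42", "create"),
    ("2001-09-05 14:30", "user_b", "doc42", "read"),
    ("2001-09-20 11:02", "user_c", "doc42", "edit"),
    ("2001-09-21 08:45", "user_a", "doc99", "create"),
]

def document_history(events, doc_id):
    """Reassemble the temporally and organizationally distributed
    events around one document, ordered by timestamp."""
    return sorted(e for e in events if e[2] == doc_id)

for ts, user, doc, action in document_history(events, "doc42"):
    print(ts, user, action)
```

The same grouping by document ID underlies the document-based analyses reported later.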
The qualities and deficiencies of the HTTP-log files as data for a case study might be
summarized as follows:
- Indices: they provide perfect information about a very limited aspect of usage. Log lines are indices of use.
- Consistent: they provide this information consistently across time.
- Analyzable: the data is very easy to analyze using data mining techniques, compared to richer observations such as videos, tape recordings, researchers' notes, etc.
These features of log files make them useful for studies that span many users and a long time-
span.
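As a tiny illustration of the "analyzable" quality, activity across a long time-span can be aggregated in a few lines; the timestamps here are invented.

```python
from collections import Counter

# Illustrative only: counting events per calendar month across a long
# observation period, something infeasible with video or field notes.
timestamps = ["2001-02-14", "2001-02-20", "2001-03-01", "2001-11-30"]
per_month = Counter(ts[:7] for ts in timestamps)  # "YYYY-MM" as the key
print(per_month.most_common())
```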
The survey
Doing a survey was not on the agenda for the joint DIWA group (Jesper, Keld, Jens and
myself). As I started working on the analysis of the log files it became clearer that, without
some account of what the QPs were used for, and how they were used, the information we
could deduce from the log files would be limited. Therefore I planned a survey to answer
some overall questions regarding the use of the QPs, and some more specific questions about
how it was used together with other media, etc.
The only way of doing a survey that would not consume all of my research time was to use the web: paper-based as well as telephone-based surveys would have required too many resources. Another good reason was that, according to our contact at Beta, the organization had good experience with conducting web-based surveys internally. I customized a freeware survey application built on top of Lotus Notes and hosted it on a computer at the University. This allowed me to customize the look of the application so that it included a Beta logo and looked familiar to the respondents.
The response rate of the questionnaire was 46% completed, with a further 12% partially completed. 46% is a good response rate, also compared to the usual rate at Beta, and 12% partially completed questionnaires is not a very high percentage. It is therefore likely that using a web-based questionnaire, compared to more well-proven methods, did not distort the results.
The survey was conducted in two stages: a pilot stage with 10 respondents and a full stage with 123. Prior to the pilot, the technical set-up as well as the questions were tested on selected researchers in the DIWA project. The pilot was designed to test several aspects of the questionnaire:
- to test the technical set-up of e-mail invitations and links to the survey application.
- to test the wording of the invitation.
- to test the questions and descriptions for errors.
- to test whether we received answers that made sense in relation to what we would
like to know.
The pilot revealed some technical problems with the set-up and some minor errors in the wording of the questions. None of the questions were changed significantly, and none were removed. After implementing the changes from the pilot, we issued invitations to 123 respondents on 23 November 2001. This resulted in 33 answers. On 5 December 2001 we reissued invitations to the 92 who had not responded, which produced an additional 24 answers - all in all, 57 completed answers. A completed answer does not necessarily imply that all questions were answered, but implies that the respondent clicked through the entire questionnaire. Typically, though, the respondents answered all questions.
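The reported completion rate follows from simple arithmetic on the figures above:

```python
# Checking the reported response rate: 57 completed answers out of
# 123 invitations (33 after the first invitation, 24 after the reminder).
invited = 123
completed = 33 + 24
rate = completed / invited
print(f"{rate:.0%}")  # 46%
```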
Mine the gap - a multi-method investigation of web-based groupware use
Research method
another round of selection among the managers of the 77 selected QPs. From the interviews I knew that some managers had been appointed as QP managers merely because they were real-life managers. Some of these managers were not using the QP at all. Since I was interested in accounts of how it was used, asking non-active managers would produce problematic answers. I
took the list of 77 Quickplaces and the list of managers and made queries in the database to
see which managers were active in the three-month period before the questionnaire was sent
out.
As this selection process shows, the respondents were not chosen as a random sample of all users. This was not possible because nobody knew who the users were. There was user information in the log file, but the user name in the QP was not identical to the user's ID in the e-mail system.
Level-one generalizations
As discussed earlier, level-one generalizations are generalizations used in empirical
research guided by the methods of inferential statistics.
An important distinction in quantitative research is that between reliability and validity. Validity concerns how well that which is measured actually captures the desired variable. Reliability concerns how well the sample measured can be generalized to the population of objects about which the question is posed. Reliability is assessed using the methods of inferential statistics.
As to the question of validity, one of the challenges of the survey was how the respondents were selected. Our practical limitation was that we could only survey managers of the QPs, since we had access only to their e-mail addresses. The first challenge was that this group of users was not representative of the whole group of QP users. If we asked them questions as users of QP, the reliability of conclusions drawn for all users of QP at Beta would be poor. Instead, they were asked questions as representatives of uses of QPs. The population is thus uses of QPs rather than users of QP. Because I asked them questions as representatives of uses of QPs, the reliability of the sample is increased. The question then becomes whether asking the managers is a valid way of gaining descriptions of uses of Quickplaces. This problem of validity will be addressed as I report the results, because it can only be assessed for single questions.
Quantitative methods are typically used to measure associative or causal links between a few variables, whereas qualitative research investigates relationships between more variables at a time. In qualitative research, such relationships are not usually expressed in terms of variables. In the present case study, where quantitative and qualitative methods are combined, variables can be used as a way of discussing reliability. All members of a sample share some characteristics and differ in others, called the variables. The characteristics that are shared are called background variables. The variables are the ones we inquire about in the sample. The quantitative methodology and theoretical framework investigate relationships between dependent and independent variables. Statistical methods such as regression analysis are then typically used to analyze the relationship between one dependent and one or more independent variables.
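As a minimal illustration of this approach, ordinary least squares with one dependent and one independent variable can be computed directly. The data below are synthetic and purely illustrative, not from the study.

```python
# Ordinary least squares for one independent variable x and one
# dependent variable y, on a perfectly linear toy data set (y = 2x + 1).
x = [1.0, 2.0, 3.0, 4.0]
y = [2 * v + 1 for v in x]

n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
intercept = mean_y - slope * mean_x
print(slope, intercept)  # recovers 2.0 and 1.0
```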
As we have discussed previously, it makes sense to distinguish between level-one and
level-two generalizations. A level-two generalization concerns the generalization from
empirical statements to theoretical ones. Level-one generalizations are the generalizations
ruled by the discipline of inferential statistics. They concern the generalizations made from
one or more samples observed to the population from which the samples are drawn. Because
both log file analysis and the survey are part of the investigation it is relevant to discuss
whether it makes sense to apply the methods of inferential statistics in our case study.
The basic issue of inferential statistics is the relationship between the sample and the population (Bakeman, 1992; Gunter, 2002). The method of sampling (or lack of method)
determines the conclusion that can be drawn about the population. The first step is to define
the population relevant for the investigation. In the present case one could set up a few
alternative possibilities:
- Uses of virtual workspaces in general
- Uses of virtual workspaces in intra-organizational settings
- Uses of Lotus Quickplace in Financial institutions
- …
Common to these suggestions for populations is that they go beyond the setting of the
study, which is the use of Lotus Quickplace at Beta. If we chose one of these populations, the
next step would be to argue for the principle used for sampling. In any case, the sampling of both the technology and the organization would be useless for such generalizations because it is not random: both were chosen based on convenience. While one could perhaps argue that the sampling of the technology resembles a random choice, the choice of organization certainly does not. Therefore I will not spend more time on level-one generalizations beyond
the population of “use of Lotus Quickplace at Beta”. This is perfectly in line with the overall
characterization of the study as a case study. In the following considerations, "use of Lotus
Quickplace at Beta" is our population. The conclusions we draw on this population will then
form part of the basis for level-two generalizations.
Regarding the issue of sampling in the population “use of Lotus Quickplace at Beta”:
The log file analyses are generally made on the whole population. This means that we
have made the analysis on all log file data from the whole 10-month period. Sampling is
generally used in data mining when the amount of data exceeds what is technically feasible.
We did not run into that limit (although some of our queries in the database took several days
to complete).
For some specific analyses we set up criteria for selecting a collection of documents
because we, for example, only wanted documents for which we had a full lifecycle. This was
the case also with the selection of QPs. These criteria are relevant for the specific analyses
and will be dealt with when the results are presented.
Before we proceed to the actual results, I wish to spend some additional time on the analysis of HTTP-logs. This will present information that will enable the reader to critically evaluate the results of the log analysis. My discussion of HTTP-log analysis is extensive because it is not a well-described method in IS research or in research on computer-mediated communication.
computer-mediated communication. HTTP-log analysis is established as a method for
understanding how single users interact with a web site in the research tradition of Human-
Computer Interaction (HCI). It is argued that HTTP-logs also offer a valuable source for
analyzing computer-mediated communication via web sites, not just in case study research of
computer use in organizations, as is the case in this study, but as a generic method for various
purposes. The HTTP-log analysis performed in this study thus illustrates a use of HTTP-logs
not previously described in research or performed in practice.
The logging of activity on an HTTP-server is typically done using "the common log file format", which has emerged as a de facto standard (see Appendix 5).
HTTP-logs are attractive compared to other logs because they represent a standardized
format for logging. This has some interesting consequences. Firstly, it means that analytical
tools for HTTP-logs can be built across diverse systems, because they use the same HTTP-log
standard. Secondly, it enables researchers to more easily compare studies of use across
different technologies.
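To make the format concrete, a common-log-format line can be parsed with a single regular expression. The sample line is the well-known example from the Apache documentation, not data from this study.

```python
import re

# One regular expression covering the fields of the common log file format:
# host, ident, authuser, timestamp, request line, status, and bytes sent.
CLF = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) (?P<proto>[^"]+)" '
    r'(?P<status>\d{3}) (?P<bytes>\S+)'
)

line = ('127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] '
        '"GET /apache_pb.gif HTTP/1.0" 200 2326')
m = CLF.match(line)
print(m.group("user"), m.group("url"), m.group("status"))
```

Because the format is the same across server implementations, one parser like this can feed analyses of very different sites.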
The standardized format of HTTP-logs also limits the type of analysis one can perform. This has led to a working draft under the W3C (World Wide Web Consortium) to define an "Extended log file format" (not to be confused with the extended common log file format), which standardizes the description of the data logged from a specific HTTP-server rather than defining the data itself, as the common log file format does. This would enable more flexible logging that can be suited to specific purposes.
[Figure: a model of web mining, distinguishing an agent-based approach and a database approach]
Web content mining is the mining of the content on the Internet. The Google search engine is a practical application of web content mining: it analyzes the contents of the Internet and indexes them so that users can search the index. This is a relatively simple process of web content mining, which is interesting mainly
because of the huge amount of data analyzed. Usually, web content mining aims at establishing models that describe the data rather than just indexing it for users to search, as Google does.
The model above distinguishes between the agent-based approach and the database
approach to web content mining. The agent-based approach searches for information and
patterns of information where the information is, whereas the database approach is focused on
structuring web-data in a database to make it available for structured querying using
languages such as SQL (Structured Query Language). The Google search engine is an
example of the agent-based approach.
Web usage mining is the discipline from which our analyses of log files set out. Web usage mining is the application of data mining techniques to discover use patterns in HTTP-log files. It is still a young discipline and is characterized by a very strong drive from industry. The next section provides an overview of the types of research that have been conducted on web usage mining.
for analyzing on-line shopping behavior are now built into commercial data mining products
and services (e.g. Clementine from SPSS, WebHound from SAS, SurfAid from IBM).
Marketing based on log analysis:
Clustering of users based on usage, correlated with other data such as customer segmentation models (e.g. Minerva or Kompass) or buying history, has been researched as an input for marketing. The goal is to direct marketing more precisely towards potential customers; one of the buzzwords is personalized marketing (Büchner and Mulvenna, 1998). While based on transactional data rather than on the analysis of HTTP-log files, the marketing e-mails sent out by the Amazon bookstore illustrate this principle.
This overview shows that web usage mining has been driven strongly by the commercial interest in utilizing the web. The applications of the methods are all focused on the interaction between single users and a web site. The field could therefore be characterized partly as a contribution to research in HCI (Human-Computer Interaction) and partly to marketing and sales research.
Some of the analyses of HTTP-log files in this thesis will illustrate potential applications of HTTP-log analysis for CMC.
The next sections will be devoted to some practical problems involved in using HTTP-
logs for analysis. Some of the problems cover both session-based log analysis and document-
based log analysis, while others specifically deal with doing document-based analysis. The
purpose of the description in relation to this thesis is to clarify the process of doing log
analysis. This will enable us to judge the implications drawn when the results of the analysis are presented later.
hypotheses on how a line in the log file is related to a user's actions, and testing the validity of
the hypotheses in an on-going learning cycle.
The process of decrypting text consists of multiple cycles of hypothesis formulation and hypothesis testing. The cryptanalyst assumes that there is a fairly simple relationship between an unknown text and the encrypted text. In most cryptography this relation has two elements (Singh, 1999): the algorithm and the key.
[Figure: the encryption process - the cleartext and the encryption key are fed into the encryption algorithm, producing the encrypted text]
The algorithm specifies the mathematical relationship between three entities: the
encrypted text, the clear text (decrypted) and the encryption key. Often the cryptanalyst
knows the algorithm and needs to identify the encryption key.
The analogous task of analyzing log files looks like this: as we have seen above, in the
explanation of the common log file format, some user action causes a number of lines to be
written to the log. In the common log file format, the only field available for understanding
the relationship between user actions and the log line is the URL field. Thus we can
schematize the log file analysis in the same manner as the process of cryptanalysis.
[Figure: the analogous schema for log file analysis - a user action is transformed by the application server architecture and an action code into the URL of a log line]
4. Generation of data matrices based on the cleansed data. A lot of the analyses of the
log data are made on matrices containing aggregated information. An example can be found
in appendix 6. In our case this was done in a relational database.
5. Analysis and visualization of the data matrices. In our case this was achieved using a number of tools: Clementine from SPSS, SQL, Excel spreadsheets, and the SPSS statistical package.
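Step 4, the generation of aggregated data matrices in a relational database, can be sketched with an in-memory SQLite database. The table layout and sample rows are invented for illustration; the actual matrices were built from the full log data.

```python
import sqlite3

# Sketch of step 4: aggregating cleansed log events into a data matrix
# with SQL, here in an in-memory SQLite database with invented rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (user TEXT, doc TEXT, action TEXT)")
con.executemany("INSERT INTO events VALUES (?, ?, ?)", [
    ("user_a", "doc1", "read"), ("user_a", "doc1", "edit"),
    ("user_b", "doc1", "read"), ("user_b", "doc2", "read"),
])

# One row per document: total events and number of distinct users.
matrix = con.execute(
    "SELECT doc, COUNT(*) AS events, COUNT(DISTINCT user) AS users "
    "FROM events GROUP BY doc ORDER BY doc").fetchall()
print(matrix)
```

The resulting matrix is the kind of aggregate that the later analyses and visualizations operate on.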
In the following, some aspects of these processes are dealt with more extensively.
A mouse click, or the typing of a URL in the browser's address field followed by pressing return, eventually produces five log lines in the server HTTP-log. The problem is then to locate the resource relevant for the analysis. Typically, the relevant resource is the one that uniquely identifies that the user is in fact looking at the contents of a specific document.
For most analyses, the identification of the resource, which contains the contents of the
document on which the user performs the actions, is the goal. In certain specialized analyses it
might be different. In the process of analyzing the log files of the Lotus Quickplace server we
discovered that some of the .gif files loaded were named after the folders of the QP. It was
therefore possible to analyze the folder structure of the QP through the log file. For this
specialized analysis the contents of documents were unimportant.
This problem of a one-to-many relationship between a user action and log lines can be solved in a number of ways. One element in solving the one-to-many puzzle is to identify the main resource of interest. When users read or edit an HTML page, there is one resource that is unique for this action and several that are not. How this is organized depends entirely on the architecture of the server. Two scenarios are important to distinguish in this respect: we might call the first the file server scenario and the second the application server scenario.
The file-server scenario:
This is the original HTTP-server scenario. It is still being used and is characterized by its
simplicity.
[Figure: the file-server scenario - a browser requests resources (HTML, PDF, GIF, JPEG, ...) from an HTTP-server that serves them directly from a file system]
In this scenario, log analysis is simple:
- All resources requested by a browser via HTTP exist on the server in a hierarchical file system. The HTTP-server has a mapping table between URLs and places in the file system, so that, for example, the URL http://www.billeskov.dk/publications.html maps to C:\www\html\publications.html.
- Usually the file-naming convention tells something about the contents of the page being viewed. So just by looking at the name of the resource, you can determine information about the contents.
- Often, the hierarchy of the file system is mapped to the information architecture of the web site (such that, for example, all HTML files related to product presentations are placed in the folder "\products" in the file system).
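The mapping in the first and third points can be sketched as follows, using the document-root example from above. Treating the parent folder as a content category is an illustrative assumption about a site's information architecture.

```python
from pathlib import PureWindowsPath

# Sketch of the file-server mapping: the URL path maps directly into a
# document root, and folder names carry the information architecture.
DOCROOT = PureWindowsPath(r"C:\www\html")

def url_to_path(url_path):
    """Map a URL path such as /products/widget.html into the docroot."""
    return DOCROOT / url_path.lstrip("/")

p = url_to_path("/products/widget.html")
print(p)             # the file served for this URL
print(p.parts[-2])   # the parent folder, here read as a content category
```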
Even though the file system scenario is simple in some ways, it faces problems when:
- the information architecture does not relate to the hierarchical structure of the file
system.
- the complexity of the web site (number of resources and number of links) exceeds
what can be grasped by the mind of the log-analyst.
Attempts have been made to solve the problem of complexity by machine-analyzing the contents of individual resources (see e.g. IBM SurfAid, http://www.ibm.com/surfaid, which uses a text-mining clustering algorithm to categorize pages by analyzing their contents).
The application server scenario:
Today many web sites are managed via an application server. This means that our
scenario above is now more complicated.
[Figure: the application-server scenario - a browser sends HTTP requests to an HTTP-server, which passes them via a CGI interface to an application server; the application server in turn accesses a file system through a file-system interface and a relational database through a database interface]
Application servers are used for a number of reasons: to handle more complex interactions than reading HTML pages, to handle authentication, to handle scalability, etc. While the file server scenario utilizes the URL standard (RFC 2396) for accessing resources in the form http://host/resource, in an application server everything following http://host is not standardized and depends solely on the internal architecture of the application server. This means that the techniques used in the file server scenario cannot be applied. In the file server scenario it is often possible to provide standardized solutions to the analysis of log files independent of the architecture of the specific web site; all standardized log analysis tools are based on the file server scenario, and they are therefore not applicable to application servers.
In the application server scenario it is therefore necessary to perform the code-breaking
process described above. I will now go through the process with the Lotus Quickplace server.
identifying that the action had taken place. This we repeated a number of times for each type
of action and in various combinations until we had a reasonably good idea of how to identify,
in the log, that a user had performed one of the action types.
After this, we performed a test that included a sequence of all the action types. The log file was then searched to see whether the actions predicted from the log files were identical to the actions actually performed. This test exposed some problems with the criteria and meant that the process was repeated for some of the action types. After the second overall test, the predictions made from searching the log files matched the actions performed.
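The resulting criteria can be thought of as a small rule-based classifier that is tested against actions we performed ourselves. The URL patterns below are invented stand-ins, not the actual Quickplace criteria.

```python
import re

# Hypothetical URL patterns illustrating the code-breaking cycle:
# guess a rule linking URL shapes to action types, then test it against
# log lines produced by actions we know we performed.
RULES = [
    (re.compile(r"\?OpenDocument$"), "read"),
    (re.compile(r"\?EditDocument$"), "edit"),
    (re.compile(r"\?CreateDocument$"), "create"),
]

def classify(url):
    """Predict the action type behind a logged URL, or 'unknown'."""
    for pattern, action in RULES:
        if pattern.search(url):
            return action
    return "unknown"

# The "known plaintext": actions we performed ourselves while logging.
observed = [("/QP/doc123?OpenDocument", "read"),
            ("/QP/doc123?EditDocument", "edit")]
assert all(classify(url) == action for url, action in observed)
```

When a prediction disagrees with a known action, the rule set is revised and the test repeated, which is exactly the hypothesis-testing cycle described above.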
As for the problem of establishing the relationship between properties of the URL and the contents of the document, the Lotus Quickplace server provided no possibility of doing this. This is very often the case in the application server scenario. The documents on Lotus Quickplace were all identified in the URL via a 32-character hexadecimal code, which gave no indication whatsoever of the contents of the documents. For the analyses where this relation was important, the file names of attached files were used to indicate the contents of the document. This was, for example, necessary for the analysis of specific genres of communication.
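Extracting the hexadecimal document ID and the attachment file name from a URL can be sketched as follows. The URL shape is an assumption modelled on the description above, not the exact Quickplace URL syntax.

```python
import re

# Hypothetical URL shape: an opaque 32-character hexadecimal document ID
# followed by the attachment file name, the only hint about contents.
url = "/QuickPlace/gic/0/0F9A3C1B2D4E5F60718293A4B5C6D7E8/attach.doc"

m = re.search(r"/([0-9A-Fa-f]{32})/([^/?]+)$", url)
doc_id, filename = m.groups()
print(doc_id[:8], filename)  # the ID reveals nothing; the file name does
```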
This section has explained some aspects of the first three phases of the log analysis. The
remaining two phases will be dealt with when the results of the analysis are presented.
identity of the user changes. The IP address remains the same across a use session, but for analyses over longer periods of time it is highly unreliable. Therefore it cannot be used for document-based analysis of CMC.
2. Even within the same session, the IP address can pose problems. People connecting through a proxy server will all appear in the log with the same IP address. Thus an analysis might collapse 10 or 100 users into one.
including the ones related to studying CMC, could only be based on data in the period before
5/10/2001.
Handling caching
Besides the problem of identifying the user, caching poses another challenge to the reliability of the results of log analysis. Caching means that documents and other resources on the web server are stored in other places to save network traffic. There are two scenarios in which a mouse click does not produce lines in the server HTTP-log because of caching:
Browser caching:
The browser caches resources from the web according to two sets of rules:
1. A browser can be set to cache files in different ways. In Internet Explorer, the options
for when the browser should check for an updated page are “Never”, “Once per session”
and “Always”. If set to “Once per session”, the browser checks only once during the
period in which a browser window is open.
2. For each HTML page an additional rule applies. The browser caching settings are
only used when an HTML page has not expired. HTML page headers have a field for
defining an expiration time. If this expiration time is before the time on the client
machine, the browser will automatically check for an updated version of the HTML
page.
If a page has not expired and the browser cache settings are set to “Once per
session”, each additional mouse click on the same HTML page in the same session will
not produce any additional lines in the HTTP log. This is of course important to know if
one wants to study the exact path a user has taken.
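The interaction of the two rules can be sketched as a freshness check: the browser asks the server for a new copy only when the page has expired relative to the client clock, or when its caching setting says so. This is a deliberately simplified model of browser behaviour, not an exact description of any real browser:

```python
from datetime import datetime

def must_revalidate(expires, client_now, setting, checked_this_session):
    """Decide whether the browser requests a fresh copy from the server.

    A simplified sketch of the two rules described in the text."""
    if expires is not None and expires <= client_now:
        return True                        # rule 2: the page has expired
    if setting == "always":
        return True
    if setting == "once per session":
        return not checked_this_session    # rule 1: once per open window
    return False                           # setting == "never"

now = datetime(2001, 10, 5, 12, 0)
past = datetime(2001, 10, 5, 11, 0)
future = datetime(2001, 10, 5, 13, 0)

assert must_revalidate(past, now, "once per session", True)      # expired
assert not must_revalidate(future, now, "once per session", True)
assert must_revalidate(future, now, "always", True)
```

Whenever `must_revalidate` returns `False`, the click leaves no trace in the server HTTP log, which is exactly the threat to log reliability discussed here.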
In the case of Lotus Quickplace, all browsers at Beta were set up in a unified way, so
browser caching was not a problem. Generally, browser caching poses a greater
problem for session-based analysis than for document-based analysis, because session-
based analysis examines sequences of events within the same session, where caching
suppresses repeated requests.
Server/Proxy caching:
For performance reasons it is not uncommon that frequently requested files from an
HTTP server are cached. This means that they are not served by the HTTP server itself, but
by a cache server placed between the browser and the HTTP server on the network.
Requests are then not logged at the HTTP server, because they never reach it. The Lotus
Quickplace server did not use caching.
This section has dealt with some rather detailed problems related to the analysis of log
files. It has also presented a new approach to log analysis: it has been shown how HTTP logs
can be used as a basis for studying computer-mediated communication. Before we proceed to
the presentation of the results from the case study, including the use of log analysis, we need
to take a closer look at the technology which has been introduced in the organization. As
discussed previously, the specific properties of the technology artefact are essential for
understanding how it is adopted in the organization.
Introduction
The www is no longer simply a medium for publishing information. It is being used for a
wide range of purposes such as commercial transactions, shopping, and so on. One of the
trends is to use Internet technologies to support collaborative work in teams and projects. The
promised advantages include, among others, the ability to enable geographically dispersed
teams to work together, to improve collaboration within the team, and to lower the cost of
setting up inter-organizational projects or teams.
From August 1999, when Intranets.com began offering a virtual workspace service, it has
been possible to lease an application for collaboration over the Internet either at a monthly
rate or as ad-ware. Since then a large number of companies have offered this service, and
several of them have already disappeared again.
My first interest in this type of application came from the sheer enthusiasm concerning
the possibility of obtaining a shared workspace application for collaboration almost for free.
This illustrates the new economic possibilities of the Internet and the economic effects of
infrastructure and standard protocols.
Virtual workspaces are also interesting in other respects. They exemplify how the design
process of software applications has changed, and how the distinction between design and use
is shifting. The design of virtual workspaces is very open in the sense that it is designed to
support a wide variety of use situations or genres of communication.
Virtual workspaces also exemplify how the design of the protocols and standards of the
Internet has an increasing significance for the end-user situation. This is the case, as we will
see, with both the HTTP and the SMTP standards. Seen in the light of emerging standards
such as XML, which is a standard for modelling data and creating standard data models, this
trend is due to continue and enter new territories.
The virtual workspaces studied are very similar in terms of functionality yet project very
different images to the user. An analysis of the metaphors used reveals different strategies for
modelling the anticipated use situation. The differentiator in this type of application is not the
functionality but the metaphors used, and the strategy for modelling the use situation.
Virtual workspace is suggested here as a name for the type of application studied. PCWorld
has also used this term1. There is no unified and agreed name and definition of a virtual
workspace. Other names in use include: The Digital Workspace, Virtual Office, Team
Workspace, Worksite, and Teamware. Most of these names are registered trademarks,
which prevents them from being used to name a type.
A virtual workspace is an application that facilitates people working together. Typically
this means that it has facilities for sharing files, engaging in discussions, sharing a calendar,
integration with e-mail, synchronous chat, and simple work-flow. All virtual workspaces also
provide access control and model the user in standard profiles such as managers, authors, and
readers.
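The standard user profiles can be sketched as a simple mapping from role to permitted actions. The role and permission names below are hypothetical, chosen only to illustrate the idea of graded access:

```python
# Hypothetical mapping of the standard profiles to permissions.
PERMISSIONS = {
    "reader":  {"read"},
    "author":  {"read", "create", "edit_own"},
    "manager": {"read", "create", "edit_own", "edit_any", "configure"},
}

def allowed(role, action):
    """Access control check: is this action permitted for this role?"""
    return action in PERMISSIONS.get(role, set())

assert allowed("author", "create")
assert not allowed("reader", "create")
assert allowed("manager", "configure")
```

As the Beta case will show, the way such a mapping distributes rights projects a particular image of how the group is organized.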
A virtual workspace is an application which is available as a service over the Internet. It
is designed to be ready for use by a group of people. In the marketing of virtual workspaces,
terms such as "instant collaboration" and "lets your whole team start working immediately" are
typical. The findings from the Beta study will show that, while there are no technical
difficulties in setting up a virtual workspace, "instant collaboration" is not a term well suited
to describing the adoption of the technology. On the contrary, it is a complex task to integrate
the virtual workspace into the genres of communication already existing in the organization.
Appendix 8 is a compiled list of virtual workspace products available at the time of the
analysis, together with some products that have appeared since this analysis took place.
1
http://www.zdnet.com/pcmag/stories/reviews/0,6755,2619206,00.html
Mine the gap - a multi-method investigation of web-based groupware use
Introducing virtual workspaces
Virtual workspaces are not new in terms of functionality, nor are they based on new
insights into the workings of groups. Shared workspace applications and groupware have
existed for a while, both in the scientific community and commercially, and the virtual
workspaces studied here present neither new functionality nor unique new approaches
to collaboration support. Research in both CSCW and GSS has investigated
the kind of functionality offered. Commercially, the Lotus Notes platform is an example of
groupware adopted in many organizations that offers the same functionality as virtual
workspaces, based on proprietary protocols.
In October 1995, GMD made the first version of their BSCW system (Basic Support for
Co-operative Work) available for the public to test and use (see Bentley, Horstmann et al.
(1997); Appelt (1999)). BSCW was the first groupware system based on web protocols such as
HTTP, HTML and TCP/IP. It has facilities for file sharing, discussions, user modelling and
access control. The BSCW system seems to be a main inspiration for all the
commercial products studied here.
Method
The empirical basis for my analysis of virtual workspaces is a selection of seven
commercially available products (see Appendix 8). All are web-based applications providing
a set of functions which enable communication, the exchange of documents, discussions and
collaboration within a group of people. All the virtual workspaces studied were available
from an ASP (Application Service Provider) at a monthly rate or as ad-ware. The criterion for
selecting the products to be analyzed was primarily their availability for testing without cost.
The results presented are drawn from analyses of the application user interfaces and self-
observation in simple use situations. The other Ph.D. students from the DIWA project and I
have used the applications in different situations for collaborative work. The functionality
of the applications has also been analyzed in detail. Appendix 9 provides an example of the
guide used for the analysis of the virtual workspaces, with answers taken from the analysis of
Lotus Quickplace.
Standards
The first virtual workspaces (except for BSCW, which was a research project) were
based on the .com model. In this model, a company develops the virtual workspace and offers
it as a service directly to users. The primary medium of communication with customers is the
Internet, and in many cases the service is available as ad-ware, where users get a free service
in return for being exposed to banner ads. This model has turned out not to be viable, and by
the end of 2002 very few virtual workspaces were based solely on the .com model. The
companies have either closed or combined the .com model with the ASP or the traditional
model.
The ASP model divides the development of the technology and the offering of the
service to customers between two companies. The software development companies lease the
software to ASPs, who offer the virtual workspace directly to users.
In both the .com and ASP models, the virtual workspace is offered as a service to be
paid for on a per-unit-of-use basis. You do not buy a software license with unlimited use but
pay specifically for each unit of use. The ASP or .com company also hosts the application and
the data, so that all the customer needs is a browser and Internet access. The typical unit of
use by which payment is measured is one month of use by a given maximum number of users.
In the traditional model for offering virtual workspaces, a software license is sold to an
organization, which installs and runs the software in its internal IT department, as is the case
at Beta. The main reason for this is that financial corporations have a history of focusing on
security and the protection of corporate data. It was therefore not an option to have a .com or
ASP host the data.
The case of virtual workspaces shows how the .com model, which has turned out to be
problematic, has produced a type of application that is finding its way into organizations
primarily through the ASP and traditional model.
The customers to whom the service is marketed are, for example, projects in and
between organizations of varying sizes. As we shall see in the section on the metaphors used,
the virtual workspaces target slightly different customer settings. The ASPs all provide
a very easy start-up of the virtual workspace, with no installation required. All that is needed
to start a virtual workspace is a browser.
takes place as a product development process in the software organization, and the
customization process takes place at the customer, handled by different people within a
different organizational setting. With Internet-based applications such as virtual workspaces,
the design process is best described as three distinct processes: the first is the development of
the standards of the Internet, the second is the application design process, and the third is the
customization or adoption of the technology in the organization of use.
The virtual workspaces showcase a design process that has changed in two ways.
Firstly, the development of standards is becoming a process that is significant not only to the
developers but also to the way IT systems are adopted in the use organization. Secondly,
an increasing proportion of the design process is left to be solved in the use situation. Virtual
workspaces are examples of applications that are open to many different forms of integration
into work settings, and processes which in standard systems such as SAP or work-flow
systems are the task of specialists who customize the software are now the domain of end-
users.
In traditional in-house development projects, the design process is typically organized as
a project in the IT department, perhaps with representatives from the users. Once the project is
completed, the application is taken into use, and any necessary re-design will generally take
place as a new project in the IT department. Broadly speaking, there is one design process
and one use process.
With standard applications such as SAP, Lotus Notes and others, the design process is
divided into a software design process taking place at the software company and an extensive
customization process in the use organization. Here the design process is divided into two
organizationally and temporally distinct processes. Typically the customization is done
in-house, organized as a project in the same way as the traditional design process. The
customization or adoption of virtual workspaces, and of other applications such as e-mail,
WikiWikiWeb (Leuf and Cunningham (2001)), or CoWeb (Guzdial, Rick et al. (2000)), is
accomplished by users more or less as part of their daily work, and in many cases without
explicit attention to it as a design process.
The most important feature of virtual workspaces and other web-based applications is
their reliance on standards. They all use the browser as the client and rely on TCP/IP, HTTP,
HTML, and the group of standards governing e-mail. (Bøving (2001) contains an analysis of the
standards underlying the e-mail system and how they affect the use patterns of e-mail.) These
standards impose a number of constraints and "ways of doing things" which are important for
understanding how the application will be developed and used. In addition to the well-
established standards, new standards have been developed but not yet adopted in virtual
workspaces. These are XML (Extensible Markup Language) and WebDAV (Web-based
Distributed Authoring and Versioning). XML is a standard for structuring information,
including providing standard data structures for specific purposes known as DTDs
(Document Type Definitions) or Schemas. WebDAV is an extension to the HTTP 1.1
protocol and handles the locking of files for editing as well as version control and other
related features.
The effects of standards are of course not new in systems development, but their
importance has increased significantly, and they are, in the case of RFC 822 (Standard for the
format of ARPA Internet text messages), HTML and XML, not only concerned with
providing basic communication protocols but also with forming the content of the application.
The problem of editing documents via HTTP provides a specific example of how the
standards process is important for the use situation in the case of virtual workspaces (see Dix
(1997) for an extensive discussion of CSCW and Internet protocols). The virtual workspaces
rely on the HTTP protocol, which is a stateless protocol for requesting and sending resources
between a client (browser) and a server (HTTP server). That the protocol is stateless means
that when the server has sent what the client asked for, it forgets everything about the
transaction except perhaps writing it to an HTTP log. In a virtual workspace, where part of
the purpose is sharing documents in the making, this is somewhat inconvenient, since
one risks simultaneous editing of different versions of a document, resulting in a
conflict when the changes are sent back to the server. Because of this, all of the virtual
workspaces have implemented file locking, which enables a user to lock a file against
editing or reading by other users while downloading a copy of the document for editing,
either directly in the browser or in another application such as Word.
The locking of files is implemented differently in each of the workspace applications and
is generally not very intuitive to use. As a response to this deficiency in HTTP, WebDAV is
being developed as a standard protocol under the IETF (Internet Engineering Task Force). If
WebDAV is adopted by the software organizations who develop virtual workspaces, the
locking of documents for editing and version control will be defined by the standard rather
than by the software organization. This example illustrates how the protocol sets up
conditions which have consequences for the design of the application and, subsequently, for
the end-user.
Above I have illustrated how the development of standards is becoming an important
factor in the design of applications such as virtual workspaces. One other major factor should
be mentioned, concerning the tools and methods used to develop the
application. The virtual workspaces divide into three main groups according to the basic
model they are built on. Most of the applications (e.g. eRoom, Projectplace, BSCW) are based
on the concepts of files and folders known from file systems in operating systems such as
Windows. Lotus Quickplace is based on the concepts of forms, documents and views known
from the world of Lotus Notes. Forms are document templates or data structure definitions on
which individual documents are based; views are collections of documents selected by some
common property. Views serve the same purpose as folders but have a more flexible
structure. Quickteam is based on a generic object-oriented model. The whole organization of
Quickteam points to an object modelling approach. There are many entities such as issues,
documents, goals, and events, which share the same general properties, such as the possibility
of assigning security settings. It would be rather simple to reconstruct the class hierarchy
underlying the application. The logic of classes, objects, properties and inheritance is very
evident in the way things are organized.
This is exemplified in the modelling of a project. A project is not modelled as an
aggregate class containing tasks, documents, etc.; the project class only contains a schedule.
Instead, a project is modelled as a property of all objects such as documents, tasks, etc. In the
file system approach, a project is either the whole room, as in eRoom, or a folder, as in
BSCW. In the Lotus Notes approach, a project is not modelled and would therefore first be
modelled in the use situation, as a view collecting a number of documents or by naming a
Quickplace after the project it supports.
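The difference between the two modelling strategies can be sketched as two tiny data models: an aggregate project that contains its tasks and documents, versus work objects that each carry the project name as a property, with a "project" then being just a selection over all objects. All class and field names here are hypothetical illustrations:

```python
from dataclasses import dataclass, field

# Strategy A (file-system / eRoom style): the project aggregates its contents.
@dataclass
class AggregateProject:
    name: str
    documents: list = field(default_factory=list)
    tasks: list = field(default_factory=list)

# Strategy B (Quickteam style): "project" is a property of every object;
# a project view is simply a selection over all objects.
@dataclass
class WorkObject:
    kind: str      # "document", "task", "issue", ...
    title: str
    project: str   # the project is an attribute, not a container

objects = [
    WorkObject("document", "Spec", project="Launch"),
    WorkObject("task", "Review spec", project="Launch"),
    WorkObject("issue", "Budget", project="Merger"),
]

# Reconstructing the "Launch" project is a query, not a lookup in a container.
launch = [o.title for o in objects if o.project == "Launch"]
assert launch == ["Spec", "Review spec"]
```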
Generally, we can depict the design process of virtual workspaces as three processes
which are distinctly different in character:
both from academia and the commercial world. It is typically done as a sideline
(Nebengeschäft) for the members of the group. Both organizations have defined a standard
process for developing standards (see Bradner (1996), Jacobs (2001)).
As an example, the WebDAV working group under the IETF was approved in March 1997,
and WebDAV has the status of "Proposed Standard" from the IESG (Internet Engineering
Steering Group). "Proposed Standard" is the first maturity level of an IETF standard; the next
is "Draft Standard" and the last is "Internet Standard". By comparison, HTTP 1.1 now has the
status of "Draft Standard", while TCP, IP and FTP are "Internet Standards".
It is, however, important to note that the standards development processes of the W3C and
the IETF are not the only relevant standards processes. The Internet has several examples of
de-facto standards, which have typically emerged because a software company has been so
successful that competitors have adopted its conventions. The "common log file format" and
the "extended common log file format" are both examples of de-facto standards. The cookie is
another example, developed by Netscape as part of the Navigator browser. A third
example is JavaScript, which was also a Netscape invention.
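The de-facto standardization of the common log file format is what makes generic log-analysis tooling possible in the first place: every line has the same fixed shape (host, identity, authenticated user, date, request, status, size). A minimal parser sketch:

```python
import re

# One line in the common log file format:
#   host ident authuser [date] "request" status bytes
CLF = re.compile(
    r'^(\S+) (\S+) (\S+) \[([^\]]+)\] "([^"]*)" (\d{3}) (\S+)$'
)

def parse_clf(line):
    """Parse a common-log-format line into a dict, or None if malformed."""
    m = CLF.match(line)
    if m is None:
        return None
    host, ident, user, date, request, status, size = m.groups()
    return {"host": host, "user": user, "date": date,
            "request": request, "status": int(status),
            "bytes": None if size == "-" else int(size)}

line = ('10.0.0.1 - kristian [05/Oct/2001:12:00:00 +0200] '
        '"GET /ws/doc?OpenDocument HTTP/1.0" 200 2326')
entry = parse_clf(line)
assert entry["user"] == "kristian" and entry["status"] == 200
```

The log records used in this example are invented; the field layout is the shared, de-facto part.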
environment, and it is a critical success factor that products are developed to be very
flexible. The shifts in demand and the constant release of products by competitors mean that
software development must be able to change significant aspects of products at late stages of
the development process (MacCormack, Verganti et al. (2001)).
The description of standards development and its importance for how applications are
designed and packaged into products, together with the description of the development
process, has highlighted some structural factors relevant for understanding how virtual
workspaces turn out as they do. The model of the three major processes involved in
understanding the role of virtual workspace technologies is characterized as "levels of
structuring a domain of work". This means that the standards set up both constraints and
possibilities for the development of the applications. More specifically, the application
development process produces an artefact with certain properties, which will constrain some
possible technologies-in-practice and enable others.
The support for designing folders is approached quite differently in the virtual
workspaces, mainly in the way the creation of folders is organized. In, for example, eRoom,
Quickteam, and Projectplace, any member of the group can create folders. In Lotus
Quickplace the manager is the only one allowed to create structures for organizing content.
One aspect of designing the document structure is not modelled or considered by any of the
applications: the process of deciding on a structure for the documents. The decision
process of structuring the documents in the virtual workspace could be the task of a project
manager or a project librarian, or it could involve the whole group. As we shall see in the case
study at Beta, the decision process on structuring is approached differently across the
Quickplaces.
Another illustration is the discussion forum. All virtual workspaces model a discussion
forum where members can post information and others can respond to it in a threaded way. A
discussion forum is thus a very open structure, hospitable to many different genres of
communication. In a project these could be: announcements from the project manager, issues
which need resolution, questions, etc.
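The threaded structure underlying every such forum can be sketched as messages that optionally point at a parent; note that the structure itself carries no genre information, which is exactly why it is open to so many genres:

```python
from dataclasses import dataclass, field

@dataclass
class Message:
    author: str
    text: str
    replies: list = field(default_factory=list)

    def reply(self, author, text):
        """Post a threaded response to this message."""
        child = Message(author, text)
        self.replies.append(child)
        return child

# A root posting with two threaded responses, one of them nested.
root = Message("pm", "Issue: which vendor should we choose?")
answer = root.reply("anna", "Vendor X has local support.")
answer.reply("bob", "But vendor Y is cheaper.")

def thread_size(msg):
    """Count the message and all responses beneath it."""
    return 1 + sum(thread_size(r) for r in msg.replies)

assert thread_size(root) == 3
```

The example content is invented; the point is that an announcement, an issue for resolution, and a question would all be stored in this same generic tree.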
In one of the interviews conducted at Beta, the manager of a Quickplace reported trying to
model the issue resolution genre. He was managing a geographically distributed project
group, which met face-to-face in workshops every two weeks and otherwise communicated
using the telephone, e-mail and a Lotus Quickplace. At one of these workshops the project
decided on a number of issues which the participants in the different countries should each
investigate. The project manager posted each of the issues as root documents in the discussion
forum, anticipating that each of the countries would respond to each issue so that the
project could take an informed decision at the next workshop. However, nobody responded in
the Quickplace, though all had done their work and were ready to report back when the project
met again. They simply did not use the Quickplace, because the project manager did not use an
existing genre of communication and was suggesting a new one without making it explicit.
He did nothing but post the issues, leaving the project members to guess how they should
respond.
These illustrations exemplify the challenge of adopting the virtual workspace in a work
setting, which is analyzed in greater detail in the report from the Beta case study.
different strategies used by the application developers for modelling the anticipated use
situation.
The most interesting aspect of modelling the anticipated use situation is the use of
metaphors. Metaphors are the primary means by which the virtual workspaces are
differentiated. As noted previously, the functionality offered across the different virtual
workspaces is more or less the same. The typical approach to comparing software focuses on
functionality, but this approach does not work when assessing applications such as virtual
workspaces.
logically into issues, hypotheses on the issues, and key questions which confirm the
hypotheses, and leave the interactions untouched.
which match different phases of a typical consulting project. So, in fact, the management of
the project is modelled in different phases such as Project Initiation, Development, Testing,
and so on.
The different approaches to modelling the idea of a project illustrate how the virtual
workspaces have different strategies for modelling the anticipated use situation. Whereas
the functionality of Quickplace and eRoom is more or less the same, the modelling of the
anticipated use situation is quite different. As we will see in the reports from the case study,
Lotus Quickplace, which was implemented to support projects, is also used to support other
groups of people. It can only be speculation, but this pattern would probably not have
emerged had Beta chosen to implement eRoom instead.
There are two underlying trade-offs in play here: specificity vs. generality and flexibility
vs. complexity. The specificity/generality trade-off is important in standard applications such
as these, given the underlying economic model described earlier. One wants specificity
in relation to the work situation the application is supposed to support, in order to
create perceived value. On the other hand, one wants generality, so that no design work is
required from the ASP or software company for each copy of the software leased or sold.
The other trade-off, between flexibility and complexity, is actually two trade-offs. The
first is well known from systems engineering: maximum flexibility is desired so that one
can cope with changes in the use situation without having to re-design; however, flexibility
increases the complexity of the application and thereby the costs of designing and maintaining
it. In addition, and here is the second variant of the trade-off, flexibility also
adds complexity to the process of adopting the technology. People using Quickplace or
BSCW have to deal with a more complex design process when modelling a project than
people using Projectplace or eRoom.
While these trade-offs are attached to virtual workspaces as a type, the outcome in
the individual workspaces has been quite different. Which approach is the most successful can
only be decided in the use situation.
Another aspect in which the virtual workspaces differ is the way the organization of the
users is modelled. The manner in which user rights are distributed projects an image of how
the users are organized.
Quickplace, Projectplace, eRoom, and Quickteam project quite different images of the
organization of the group or project which uses the workspace. Projectplace projects an
image of peer-to-peer collaboration with no hierarchical structure. Quickplace, on the other
hand, projects the image of a manager (or a librarian hired by a manager), a group of core
members, and peripheral members who can only read documents.
The two examples, modelling a project and modelling the organization of the users, are
prototypical examples of how virtual workspaces which are quite similar in functionality
choose to model different aspects of an anticipated use situation. The user rights management
clearly shows a difference in the image of the use organization projected into the application
by the developers. The result is that the different virtual workspaces create quite different
starting points for the establishment of technologies-in-practice.
The use patterns of Lotus Quickplace at Beta show that in most of the Quickplaces a
few persons are assigned the role of manager, a small group of people author documents,
while a larger group of people only read documents. This pattern thus reflects the pattern
anticipated in the design of Lotus Quickplace. It would not have been a practical problem to
assign the role of manager to everybody in a Quickplace, but for some reason the model
suggested by the application is used. Again, the consequences of choosing another virtual
workspace technology such as eRoom can only be speculated upon, but perhaps rather
different technologies-in-practice would have emerged. It is notable that Beta did not
consciously choose to implement Lotus Quickplace rather than, for example, eRoom because it modelled
the right aspects of the use situation. The choice of virtual workspace was made because the
IT department had previous experience with Lotus Notes, the platform on which Lotus
Quickplace is developed.
Merriam-Webster defines a metaphor as:
“a figure of speech in which a word or phrase literally denoting one kind of object or idea is used
in place of another to suggest a likeness or analogy between them.”
(www.merriam-webster.com)
Goodman (1976) describes one important aspect of a metaphor, which is not covered by
the Merriam-Webster definition.
“Now a metaphor typically involves a change not merely of range but also of realm”
(Goodman 1976) p. 72
A metaphor changes the range of a predicate. When you use “sad” to describe music, the
predicate, which we literally apply to human feelings, has changed its range to include
music metaphorically. However, as soon as the predicate “sad” is applied to music, it also
becomes possible to apply “happy”, “depressing” or “gay”. A predicate such as “sad” is always
part of a symbol scheme, e.g. the scheme of human feelings, and when “sad” is applied
metaphorically to music, the rest of the scheme becomes applicable in the new realm.
This property of metaphors is central to explaining the use of metaphors in the virtual
workspaces. Some of the metaphors are explicit, while others are implicit or potential
metaphors.
Nelson Goodman thus offers us three different types of metaphor to look for: the
explicit ones, as when applying “sad” to music; the hidden ones, such as “gay” or “happy”;
and the symbol scheme or realm, which in this case would be “human feelings”.
metaphor, and the second branch uses a room, house or a huddle, which focuses on the idea
of a place where people meet and do things together.
The method for identifying the root metaphors was to collect the symbols used in the
application or used to describe the application in help systems and marketing material. The
BSCW system was the only virtual workspace that provided some reflection on its design;
BSCW is a scientific project, as opposed to the rest, which are all commercial. In the
description of the system (Appelt (1999)), the author reflects upon the use of metaphor in
BSCW:
“The BSCW system is based on the metaphor of shared workspaces.”
Exactly how the “shared workspaces” metaphor informed the design of BSCW is not
reflected upon in the article. The interface of BSCW seems to point in many directions.
Quickplace, eRoom and Huddle247 use the house, room or huddle, respectively, as the
root metaphor. The metaphor is used directly in the naming and marketing messages of the
three applications:
“eRoom is the digital workplace on the web that enables distributed teams to work
together on their complex business projects.” (www.eroom.com)
“Lotus QuickPlace is the self-service Web tool for team collaboration. QuickPlace
enables the creation of a team workspace on the Web -- instantly!”
(www.lotus.com/quickplace)
The metaphor is also used in the interface design. One enters a Quickplace via a URL,
logs in, and then sees a user interface which one assumes is the same for all members (in fact
it is not, since documents and rooms can be hidden from certain members). The same is the
case with eRoom and Huddle247.
The other group of virtual workspaces uses the personal office space as the root
metaphor. Projectplace, TeamNow, and HotOffice use this approach. While the marketing
messages all focus on the possibility of collaboration between geographically separated
people, the different metaphors are reflected in the start-up screens. When one enters the virtual workspace, a personal interface is presented in which the projects one participates in are modelled. This does not produce the feeling of a shared place common to all the members of the project or team.
The two overall metaphors used in the virtual workspaces are thus the room or house,
and the personal office. It is again important to note that most of the functionality is the same. One example is the personal storing of files: applications such as Quickplace and eRoom, which use the room or house metaphor, also allow files to be stored without being visible to others, just as the applications using the personal office metaphor do.
[Figure: the metaphorical landscapes of three of the virtual workspaces. One landscape groups Calendar, Contacts, Phone messages, documents and e-mail under "My Desk" and Bulletin Board, Departments, Users and Files under "The Office"; another contains Tools (List, Note, Poll) and realms such as "Client Engagement", "Personal org.", "Project" and "Public" with Discussion and "New Product Launch"; a third contains Discussion, Members, Tasks and Notify together with the roles Reader, Author and Manager, and Organization, Authoring and Revision.]
The metaphorical landscapes reveal a mixture of realms from which the metaphors are drawn. As an example of how my analysis was made, we can look at the discussion facility.
The discussion facility is in all three cases the standard threaded discussion. The user can post
messages and respond to the messages as well as respond to the responses. In eRoom,
discussion is used in conjunction with a poll. A poll can be performed as part of a discussion,
so that one response in a discussion could be to define a poll for the members of the eRoom.
This is why the elements of poll and discussion are gathered in the “public“ realm as they
borrow from public discussions and democratic decision making. HotOffice, on the other
hand, names the same threaded discussion facility a “bulletin board”. I have not traced it back to the physical bulletin board of a university or an office building, because HotOffice uses it together with a chat function, which is a standard synchronous non-threaded chat. The original realm of the bulletin board metaphor in HotOffice is therefore rather the BBSs (Bulletin Board Systems) on the Internet. In Quickplace discussion is associated with members and
should therefore be assigned to a generic realm of groups where members discuss.
Another example, which illustrates the different realms from which the metaphors are
drawn, is the facility for sharing files. The eRoom relies on metaphors from PCs and uses
"folders" in association with "files". Also the graphical design of the folder structure
resembles what is seen in, for example, the Explorer application of the Windows operating
system. In Quickplace the metaphors of "document", "Library", and "Index" are used in
conjunction. These metaphors stem from the realm of the archive or library, where documents
are stored for future use.
As was the case with the different strategies for modelling projects and distributing rights to users, the consequences of applying the different metaphorical landscapes can only be speculated upon.
One valuable source of data in the log was the naming of folders. For some reason, the .gif pictures used as buttons for the folders in the QP
are named after the text on the button. The text on the button is the name of the folder. This
means that a button named “presentations” can be clicked and will take you to a folder which,
if used with some sense, contains material related to the concept of presentations. Since these
.gif files were loaded when a person used the QP, it gave us access to a very detailed account
of how the folder structure of each QP has developed over time.
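The way folder structures can be reconstructed from these .gif requests can be sketched as a small log-parsing routine. The log line format, the field positions and the QP and folder names below are assumptions for illustration; the actual web server log format at Beta may well differ.

```python
import re
from collections import defaultdict

# Assumed log line shape: "<date> <ip> GET /<qp-name>/<button-text>.gif <status>"
LOG_LINE = re.compile(
    r"(?P<date>\d{4}-\d{2}-\d{2}) \S+ GET /(?P<qp>[^/]+)/(?P<folder>[^/ ]+)\.gif"
)

def folder_history(log_lines):
    """Map each QP to its folders and the date each folder was first seen.

    Because each button .gif is named after the folder it leads to, the first
    request for that .gif dates the folder's appearance in the QP.
    """
    first_seen = defaultdict(dict)  # qp -> {folder name: first date observed}
    for line in log_lines:
        m = LOG_LINE.search(line)
        if m:
            first_seen[m["qp"]].setdefault(m["folder"], m["date"])
    return first_seen

log = [  # hypothetical entries
    "2001-05-14 10.1.2.3 GET /np_example/presentations.gif 200",
    "2001-05-20 10.1.2.9 GET /np_example/presentations.gif 200",
    "2001-06-02 10.1.2.3 GET /np_example/minutes.gif 200",
]
print(folder_history(log))
```

Replaying the log through such a routine yields, per QP, the order and timing in which folders appeared, which is the basis for studying how folder structures develop.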
Structuring the contents of a QP is an important aspect of making a QP work (Bannon
and Bødker (1997), Bannon (2000), Schmidt and Israel (2000)). Besides being a communication tool, a QP is a place to store information in a way that makes it easier for the users to find and work on. Our analysis of the development of folder structures in the three
exemplars suggests that information structuring should not be characterized as a static
librarian discipline but rather as something performed by users as a part of their work. The
analysis concludes with the presentation of a functional model that attempts to explain the
dynamics of folder structures.
[Figure: total page reads on the QP server per week over the log period, spanning two calendar years (y-axis from 0 to 180,000 page reads).]
The chart shows the overall activity on the QP server in the whole log period
summarized per week. This shows a steady increase in activity over the 10-month period
summarized across all QPs. 80 different QPs were active in the first month of logging. In the
last month of the log period 126 QPs were active. The number of active QPs had therefore
risen by 58%. By comparison, the overall activity had risen by 275%. Not only was there an
increase in the number of QPs but the average activity per QP had risen over the log period by
138%.
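The three growth figures are mutually consistent: average activity per QP scales as total activity divided by the number of active QPs. A minimal arithmetic check, using only the counts reported above:

```python
# Counts from the log period reported above.
qps_first, qps_last = 80, 126   # active QPs in first vs last month of logging
activity_growth = 2.75          # overall activity rose by 275%

qp_growth = qps_last / qps_first - 1          # ~0.575, i.e. ~58%
# Average activity per QP = total activity / number of QPs, so its growth is:
per_qp_growth = (1 + activity_growth) / (1 + qp_growth) - 1   # ~1.38, i.e. ~138%

print(f"QPs: +{qp_growth:.1%}, activity per QP: +{per_qp_growth:.1%}")
```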
QP was only offered to the headquarters of the bank. Retail banks historically have a
sharp distinction between the customer-facing branches and the headquarters where IT,
Communications, Human Resources, etc. are placed. Corporate and institutional banking and
investment banking are also considered part of the “headquarters”.
Around July 2001, GIC (the Danish part), which initiated the use of QP in the organization and commissioned its use, told us that QP was probably going to be closed down. The reason was a missing approval from the department of IT Security. According to IT Security, QP had some features which violated Beta's IT security policy. However, QP was not shut down. After an
intense political struggle between IT Security and the users represented by the
Communications Department, a compromise was agreed where IT Security took over issuing
QPs to new users and QP remained operational in the organization. A main reason why QP
remained in the organization was the fact that there were over 80 active QPs at that time. For
practical reasons it was impossible to close down. Some of the activities supported by QP,
such as the translation of the quarterly financial reports, were indeed business critical. Closing
it down would have created many problems, and would have required that some alternative
technology be found to replace it.
In order to understand this conflict, we need to take a closer look at the security
architecture of QP. Once the opening of a new QP is granted, two QP managers are assigned
to it by IT Security. After this appointment the two managers have full control over security
in the QP; they are even able to invite other managers. The distributed nature of the security
model in QP was originally motivated by the need to ensure privacy of data to the users in an
ASP context. The QP manager defines who can use the QP and the author of a document
solely defines who is able to read and edit it. This distributed security model also enables a
manager to create new "sub-rooms" potentially inaccessible by the two QP managers
originally appointed by IT Security.
It is obvious that QP hereby compromises the hierarchical and centrally managed
security model normally used at Beta. Neither does the central IT security unit have any way
of controlling access to rooms or documents, nor does a QP manager have any means of
controlling what is in “his/her” QP.
What we see is a tradition of centrally managing both the technology and the use of the
technology on a macro-level at Beta. Their tradition is not suited to handling a technology
such as QP where both access rights, what the system should be used for, and how it should be used are defined at the level of the individual QP (the micro level). The tradition of centrally
managing IT is also evident when we look at the way QP was implemented.
Despite the conflict, QP has remained at Beta, and in June 2002, when I last had contact with Beta, QP was still a part of the IT infrastructure offered to units that work across the Nordic countries.
The IT management of the QP technology has caused a number of conflicts. At Beta
there is a long tradition of assigning system owners to IT systems. The system owner is
typically the manager of a business unit. The role of the system owner is to define the purpose
of the system and define rules for the proper use of it. As an example, the communications
department is the system owner of the Intranet. It has been rather difficult to find someone willing to play the role of system owner of QP. The reason is that QP, like other virtual workspaces, makes it nearly impossible to exercise the role of system owner: the technology is managed in a very decentralized way, and it is very generic, which makes it very difficult to define a “proper” use and to exercise control.
The typical IT system at Beta has a surveillance functionality, which enables the system
owner (or the system administrator on his behalf) to oversee and control the actual use of the
system. In QP this surveillance functionality does not exist. As previously discussed, QP is
originally designed as an application for an ASP environment. In an ASP environment, the
last thing you would want, as a customer leasing a QP, is for some system administrator to
have unlimited access to the documents you choose to put in it. For that reason, nobody but the members of the individual QP has access to it, and only they define who else should have access to the QP.
The typical IT system at Beta has an SOP (Standard Operating Procedure), which documents the “proper” use of it. The SOP describes who should use the system, what it should be used for, and how it should be used. When an IT system is put to use, the system owner writes an SOP, which is meant for users and managers of the system. There are SOPs for, for example, issuing a mortgage to a private customer, which tell the bank assistant step by step how it should be done, including the use of the supporting IT system. In the case of QP, it took more than one year to come to an agreement about an SOP. It has been very hard for the people responsible for the implementation to formulate an SOP for an open technology such as QP.
So, QP can be characterized as a “rebel” technology in a financial institution where IT is traditionally very centrally managed. The organization was used to technologies whose proper use was defined centrally.
Not only the character of the technology, but also the implementation of the QP
technology has been different from normal.
The implementation of QP
The QP implementation effort is understood more clearly when contrasted with the way
one of the pre-merger banks previously implemented an Intranet. Intranets have in some
organizations been the first experiments with bottom-up IT initiatives, where Intranet-sites
have emerged without a planned change approach (Bansler, Damsgaard et al. (2000)). This
was however not the case at Beta. The Intranet implementation implied the defining of a
number of communication channels, as well as roles and work-flows for the publication of
information through these channels. The implementation of the Intranet included a formalized
education effort where editors and authors, in a two-day seminar, learned about system
features as well as how to write for the new medium. Also, all readers were introduced to the Intranet by video presentations within the organizational units. The SOP written for the
Intranet is a 50+ page document.
Very little effort was made to implement QP in the organization. Some resources were spent on customizing the look of the application, but apart from that the only formal means of implementation were e-mails sent to potential QP managers, and oral communication. The e-mails were sent by the initiator of the QP initiative to people in his network and then forwarded.
The e-mail contained an instruction to people who wanted to begin using a QP. Potential
QP managers should send an e-mail to IT Operations applying for a QP. The original idea was
that the application should contain a business justification, but in practice all applications
were approved. The rule of thumb for granting an application for a QP was that the use was
justified when the proposed members were from geographically dispersed organizational
units.
A part of our data from the case study consists of the applications for QPs sent to the QP
administrator. The conclusion after reading them is that very few applications had business
justifications for using a QP. Most of them simply contained a request to start up a QP with
the names and e-mail addresses for the proposed managers.
In contrast to the Intranet implementation there was neither educational effort of users
nor any guidelines on how the QP could and should be used to support various
communicational and collaborative needs. A 5-page SOP (compared to the 50+ page Intranet SOP) was written containing information about how to open and close down a QP, but it was not issued until one year after the introduction of QP. As to how they should set up and use the
QP, the users were left with the general guidelines provided by the software manufacturer.
As noted above, creating and setting up a QP is by default distributed to the manager(s)
of the QP. The QP manager(s) defines the initial structure (rooms and folders) of the QP and
the authorization structure.
The start-up of a new QP is initiated by someone sending an e-mail to Security with the
application for opening a QP. The e-mail typically contains names of at least two persons who
are assigned the roles of managers in the room.
Being a manager firstly allows one to invite other members to the QP and assign them
rights as either reader, author or manager. Secondly, the manager can create and name new
folders and sub-rooms in the QP. [Figure: the start-up screen shown to the user after the QP has been created. The layout and graphics are customized at Beta, but the functionality and default folder structure are the same as in the standard product.]
The default folder structure of the QP consists of a welcome page and seven different
folders. The folders are:
- Discussion: a threaded document repository, which allows people to post
comments to published documents.
- Library: a simple document repository, which does not allow threaded comments.
- Calendar: a basic calendar where one can post events, meetings etc.
- Tasks: a simple planning tool which allows one to create tasks with start-date and
duration and visualize the tasks in a simplified GANTT-like manner.
- Index: a repository for all documents in the QP. All the contents of the other
folders are also available here.
- Customize: only available for managers. It allows them to create sub-rooms,
forms for structured data entry, and to change the appearance of the QP.
- Members: the list of members in the QP. Here the manager can also invite new
members or expel existing members who are no longer welcome.
Technically, the start-up consists of inviting members and possibly changing the default folder structure to suit the needs of the group. This is of course but a small part of the effort needed to actually make a group begin using it. The analysis of the development of folder structures will go into one aspect of this.
QP is the first technology at Beta to have been implemented using what Bansler, Damsgaard et al. (2000) call an "improvisational" approach. The use, which we will study in detail, is not framed by strict limitations defined centrally in an SOP and distributed to users through, for example, educational sessions.
LAN-drive: LAN-drives had been available at Beta headquarters for a number of years before QP arrived. A LAN-drive is only available in a certain geographical unit. LAN-drives are used to store files. I studied the use of the LAN-drive as a part of the “Use-case study” (Bøving (2001)) mentioned previously in the section on research method. LAN-drives were used in the project primarily as personal backup. Interviews, the analysis of documents on the drive, and the naming of folders (e.g. “Bob’s documents”) all indicated this. There were a few accounts of people who used a folder on the drive to exchange documents.
Intranet: each of the three banks that merged into Beta had their own Intranet. The Danish one was by far the best developed (according to the interview with our primary contact at Beta). From version two, the Danish Intranet consisted of six overall types of information:
1. News: a highly structured set of communication channels targeted at specific
organizational, geographical and functional units.
2. Reference information: a collection of handbooks and SOPs. All SOPs for the use of
IT systems were available here.
3. Homepages: a tool available for creating information targeted at the members of an
organisational or a geographical unit.
4. Discussion forums: a collection of discussion forums, which were hardly used.
5. Tools: a collection of tools targeted at specific tasks and not related to communication.
6. Bulletin board: used for various informal information, from the bank's wine and sports clubs to the list of available holiday houses to let.
I have no accounts of the contents of the Intranet in the other countries, but the Danish
Intranet gives an overall idea of the possibilities available.
These communication media together with the QP technology form an infrastructure for
communication. The functionality offered in these media is partly supplementary and partly
overlapping. For example, both QP and e-mail have calendar functionality, and both the LAN-drive and QP enable shared storing of documents. A quick look at the functionality offered in the different media shows a communication infrastructure that is not clearly divided in terms of functionality. In the survey, I asked the users two questions on their combination of QP use with the other available communication media.
The following graph shows the result of asking the following inclusive question in the
survey:
“Which other media do you use to communicate or exchange files with the other
members of the QP?”
[Figure: share of respondents using each medium to communicate or exchange files with the other QP members (inclusive question): E-mail, Face to Face, Telephone, Intranet, LAN-drive and Regular mail; y-axis 0–100%.]
Clearly QP does not work as a communication medium with an isolated and well-defined role. It is used sometimes in combination with, and sometimes instead of, alternative media. As the answers to the next exclusive question indicate, e-mail is the most important competitor and supplement to QP.
“Please select the medium most frequently used by you to communicate with the other
members of the QP.”
[Figure: the medium most frequently used to communicate with the other QP members (exclusive question): e-mail at 74%, the next medium at 14%, and the remaining media at 5% or less.]
We will investigate the combination of media in some more detail later on. It would
appear that even when we study a single genre of communication, different media are put to
use. This seems to challenge the way genres of communication are studied traditionally by
analyzing the contents of one medium. The ad-hoc combination of communication media in a
communication situation also challenges the deep-seated notion of a rational relation between
functionality and the use of an IT-medium. We will analyze this in more detail in the section
on design-in-use.
The next four sections contain some general observations on how the QP technology was used in the 10-month period of investigation. The first two sections are observations on how the technology had spread in the organization. The means by which it had been introduced produced some uses of QP which were not intended at the outset. The third section contains statistics on the sizes of the QPs observed in terms of the number of active users. The fourth section contains an analysis of the QPs started during the log period, which shows very diverse outcomes of deploying a QP.
Unintended uses
The intention of the people who implemented QP in the organization was to supply a
tool for merger projects which would support a faster merger process and a lower level of
travel expenses. As the results from the survey reported below show, the use of QP has grown into areas other than the one originally intended.
I asked the respondents of the survey to characterize their use of their QP in terms of
four types of use. These were: to support organizational units, to support different recurrent tasks such as translating the quarterly financial reports, to support special interest groups, and to support projects, which was the original intent. The types of use were derived from the
interviews conducted.
Types of use
[Figure: distribution of the QPs by type of use: an organisational unit (20), a project (17), a group performing a recurrent task (7), a special interest group (6), other (3).]
The graph shows the distribution of QPs according to the typology, and therefore gives
an indication of the purpose of using QPs. The “other” category of the graph was written by
the respondents in a free-text answer. The three responses in the “other” category covered one
empty description, a QP spanning more than one project and a QP spanning more than one
organizational unit.
The respondents’ answers suggest that the use of QP had evolved without central control. This is a very different pattern from what is usually seen in the bank. E-mail and LAN-drives were the bank's only previous experience with a technology that expands with very little central control.
[Figure: number of QPs (of the 45 survey responses) with members from each major organizational unit. Corporate and Institutional Banking and IT appear most frequently, and Retail Banking also has a high presence.]
The graph should be interpreted in the following manner: in 28 of the 45 QPs that
answered the survey, Corporate and Institutional Banking was selected. 50% of the QPs reported that all members were from one organizational unit, while the other 50% had members from two or more units. The names of organizational units used in the survey were the major
organizational units of the company. It is therefore very likely that, for example, a QP
reporting only members from IT in the survey response will have members who span both
countries and functional divisions of IT. All major organizational units in the bank were
represented in the QPs. The reason why Corporate and Institutional Banking and IT were present in so many QPs is probably that the strategy of merging the banks focused on aligning IT to save costs, and additionally on aligning Corporate and Institutional Banking,
because many corporate and institutional customers operate internationally. Retail Banking's high presence is probably due to the fact that it is by far the biggest business area in terms of people employed.
The survey also asked a question on the geographical distribution of the QP members,
which was phrased: "In which countries do the members of the QP primarily work (Where are
their offices located)?" The respondents were asked to select one or more countries from the list of countries in which Beta has offices.
[Figure: share of QPs with members primarily working in each country: Denmark, Sweden, Norway, Finland, United Kingdom, USA, Singapore and Poland; the non-Nordic countries appear in 11% of the QPs or fewer.]
Not surprisingly, the four Nordic countries are represented in almost all of the QPs, and
all but four QPs had more than one country represented. This corresponds to the fact that
applications for a QP were only accepted if the users came from more than one country.
While most major organizational units on the organizational chart are represented, there is a striking difference in activity across geographies.
[Figure: distribution of the overall activity on the QP server by country: Denmark 71%, Finland 13%, Sweden 12%, Norway 4%.]
This chart shows the distribution of the overall activity observed in the log based on the
IP-addresses of the requests made to the QP server. The network architecture allowed us to
identify from which country the request was made. The chart is slightly misleading, since the requests displayed as coming from the four Nordic countries also include requests made from offices outside the Nordic countries. The amount of traffic stemming from other countries is, however, undoubtedly very small and fairly evenly divided across the four Nordic countries, so the chart still provides a reasonable picture of the actual distribution.
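The attribution of requests to countries can be sketched as a simple prefix lookup on the source IP address of each log entry. The prefixes below are invented placeholders; Beta's actual internal address plan is of course not public.

```python
from collections import Counter

# Invented internal network prefixes -- placeholders, not Beta's real plan.
PREFIX_TO_COUNTRY = {
    "10.1.": "Denmark",
    "10.2.": "Sweden",
    "10.3.": "Norway",
    "10.4.": "Finland",
}

def activity_by_country(request_ips):
    """Tally one unit of activity per request, keyed by inferred country."""
    tally = Counter()
    for ip in request_ips:
        country = next(
            (c for prefix, c in PREFIX_TO_COUNTRY.items() if ip.startswith(prefix)),
            "Other",
        )
        tally[country] += 1
    return tally

sample = ["10.1.2.3", "10.1.9.8", "10.2.0.1", "10.4.4.4"]
print(activity_by_country(sample))
```

Dividing each country's tally by the total then gives the shares shown in the chart.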
The distribution of activity across the countries can be interpreted as a result of the fact that the technology was introduced by the Danish part of the organization. It could also reflect a difference across the countries in experience of working with collaborative technologies. There are no data to support this interpretation other than an indication made in one of the interviews with a Danish QP manager.
[Figure: histogram of QP sizes in intervals of roughly 20 users (1–22, 22–42, …, 248 and more). The counts per interval are 120, 19, 9, 7, 3, 4, 3, 1, 0, 2, 1, 0, 0 and 1, i.e. 170 QPs in total.]
The histogram shows a huge concentration of QPs with more than 1 and fewer than 22 users. The following histogram zooms in to give a detailed account of this distribution.
[Figure: histogram of the 120 QPs with fewer than 22 users, in intervals of about two users (2, 3.9, 5.8, …, 19.1 and more). The counts per interval are 19, 20, 17, 14, 11, 10, 9, 8, 4, 3 and 5.]
The histograms show a large diversity of QP sizes. This suggests quite diverse uses of
the technology. A possible hypothesis for the sizes of the QPs is that there would be an “ideal” size, which most of the QPs would have, and that few QPs would have far fewer or far more users than the average. This hypothesis does not, however, match the data. We shall see
later that the number of users in the QPs is distributed according to a power-law distribution.
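A power-law fit of a size distribution can be sketched with the standard maximum-likelihood estimator for the exponent, alpha_hat = 1 + n / Σ ln(x_i / x_min). The sizes below are synthetic, drawn for illustration only; the exponent of the actual thesis data is not reproduced here.

```python
import math
import random

random.seed(1)

# Draw 170 synthetic QP sizes from a (rounded) power law P(x) ~ x^-2, x >= 1,
# via inverse-CDF sampling -- illustrative stand-ins for the observed sizes.
alpha_true = 2.0
sizes = [max(1, round((1 - random.random()) ** (-1 / (alpha_true - 1))))
         for _ in range(170)]

# Continuous maximum-likelihood estimate of the exponent with x_min = 1:
# alpha_hat = 1 + n / sum(ln(x_i / x_min))
x_min = 1
alpha_hat = 1 + len(sizes) / sum(math.log(s / x_min) for s in sizes)
print(f"estimated exponent: {alpha_hat:.2f}")
```

The estimate lands near the true exponent of 2; the rounding to whole users introduces a small bias, which is why a fit on real data would normally also involve choosing x_min carefully.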
The QP called NP_InternationalDivision exemplifies the QPs with many users. Within the log period, 203 users had been active, using 5297 different documents.
The QP supports an organizational unit called "International Division", which is an
organizational unit spanning the four Nordic countries as well as all other countries where
Beta is present. The QP is used for holiday lists, to support credit projects, distribute credit
limits and related information on issuing credits to large customers, and for marketing
materials (responses to the survey question: "Please give some specific examples of tasks the
Nordicplace is used for?").
Another example is the NP_MarketRiskReports, which had 35 active users within the
log period working on 1896 documents. The QP was used to collect daily risk reports in the
form of spreadsheets from different parts of the organization and to consolidate them into one
spreadsheet, providing a daily snapshot of the overall risk situation in the organization. In
contrast to NP_InternationalDivision, this QP was used to support one very specific task.
A third example is NP_nnn, which was used to support a group of IT people working with the Lotus Domino platform. It had 50 active users working on 114 different documents
within the log period. The QP is used to support "Discussions, experiments, programming,
documents. All relevant topics that have to do with the Domino platform within Beta". (Quote
from the survey).
The large concentration of QPs with a small number of users shown in the histograms is
also a reflection of the fact that not all QPs are successful in being integrated in a work
practice.
QP lifecycles
In order to look closely into how new QPs actually evolve, I will present an analysis of the QPs that were started in the log period.
The general usage statistics presented at the beginning of this section are a generalization of 170 QP lifecycles. Some of the QPs were active before we started logging, some died out during the period of logging, and some started up during the log period. For a few we may have the full lifecycle represented in the log data. None of the QPs residing on the server
were deleted during the period of logging. There were no procedures for closing down a QP
and archiving or deleting its contents, so QPs only die out in the sense that they are no longer
used.
In order to analyze the lifecycle of a QP, 37 QPs were selected, all of which were started in the period of logging. On a per-week basis, the number of active users, the number of document reads and the number of document edits were plotted for each of the 37 QPs. These graphs give an overall idea of the level of activity or, indeed, whether there was activity at all. The 37 graphs are reproduced in appendix 10.
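The weekly series behind these graphs can be produced by a simple aggregation over parsed log records. The record shape (QP name, week number, user id, action) is an assumption about what the parsed log provides.

```python
from collections import defaultdict

def weekly_series(records):
    """Per (QP, week): number of document reads, edits and distinct active users.

    Each record is assumed to be a tuple (qp, week, user, action), with action
    being "read" or "edit".
    """
    stats = defaultdict(lambda: {"reads": 0, "edits": 0, "users": set()})
    for qp, week, user, action in records:
        cell = stats[(qp, week)]
        cell["reads" if action == "read" else "edits"] += 1
        cell["users"].add(user)
    return {key: (c["reads"], c["edits"], len(c["users"]))
            for key, c in stats.items()}

records = [  # hypothetical parsed log records
    ("np_example", 1, "u1", "read"),
    ("np_example", 1, "u2", "edit"),
    ("np_example", 2, "u1", "read"),
]
print(weekly_series(records))
# {('np_example', 1): (1, 1, 2), ('np_example', 2): (1, 0, 1)}
```

Plotting the three components of each tuple against the week number reproduces the kind of per-QP graphs shown below.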
The three graphs below show three typical use patterns observed. The first use pattern is from a QP used to support a project. This QP was started up 13 weeks after the start of the log period and showed an activity level that grew throughout the remaining log period without signs of decline.
[Figure: np_fx-mm-globalisation — weekly reads, edits and active users over the log period, with activity growing steadily from around week 13.]
The second QP was used to support a short-term project, which started and closed down during the log period. The project supported by the QP had the purpose of implementing a new corporate name and logo throughout the merged organization. It demonstrates a short-term use of a QP. Since the closing of the project, there has been almost no activity in the QP.
[Figure: np_name-change — weekly reads, edits and active users, showing a burst of activity during the project and almost none after it closed.]
The last use pattern is that of a QP which never got started, the example here being a QP that was supposed to support a system development project. Clearly the QP was never integrated into the practice of the project.
[Figure: np_fonda — weekly reads, edits and active users, showing almost no activity throughout the log period.]
One should be cautious about drawing conclusions from these graphs. It is impossible to deduce anything about whether a QP was actually used for anything sensible simply from a plot of the number of users and the activity. One observation is, however, possible: if the QP showed no activity, or only very little activity in very few weeks, it should be characterized as dead. About 14 of the 37 QPs never really got started: they showed a little activity for only a few weeks, or very fragmented activity over a longer period. The graphs tell us nothing about the reason for this, but we can conclude that it is not straightforward to begin using a QP. The nature of the technology, which we have analyzed earlier in the comparative analysis of virtual workspaces, allows for a wide range of uses. This is also confirmed by the study of use at Beta. It is very different from applications targeted at specific uses, such as an application for calculating interest rates on a loan to a customer, or the Intranet as it has been implemented at Alpha.
Instead of approaching the adoption of QP in the organization as problematic because so
many of the QPs never get to be used, one could re-direct the argument. The adoption of this
type of technology occurs even though its use is prescribed neither in the technology nor in
the implementation process. The adoption documents the ability of groups of users to re-think
their communication processes and integrate the technology in these processes.
Both in terms of the increase in the number of active QPs and in terms of the activity of each QP, the use of the technology evolved during the period in which it was observed. It is not a technology that is stagnating or dying out. The people responsible for the QP technology discussed how to replace it because it did not fit the IT-management practice of the organization. A replacement would create many problems, because the technology had been integrated in very diverse communication practices; it would therefore be very difficult to replace with something else. Replacing an application which serves a specific purpose, such as calculating loan offers to customers, is much easier to handle centrally.
The QP technology is not easy to integrate into a communication or work practice. The statistics of the QPs that started during the log period show that many of them never really got started at all. The technology therefore involves a large economic overhead: many QPs start up, and only some of them survive to become part of the communication infrastructure of a group of people.
In order to understand the microstructures of the QPs, of which we have until now only seen crude patterns, we will focus on three QPs to study the users and the usage in more detail. This will also assist us in understanding the process of integrating QP in a work practice.
NP_Solo-ID
This QP served as a project repository. The project was a Nordic IT project with the purpose of finding an IT solution that would give each customer one ID to use across all Internet-based services offered by the bank. The customers would then have only one cryptographic key and associated password for all the different services. The project could be characterized as a technical infrastructure project with the purpose of producing a common solution to be used by all Internet-based customer services. It therefore had many stakeholders, including the managers of existing customer services, of customer services under development and of planned services, as well as the people responsible for the overall IT strategy. The project was therefore organized with a relatively small core team, led by the project manager, who drove the definition and implementation of the solution forward, and a larger outer group of stakeholders, who were involved in the decision process and who needed to be informed about the progress of the project.
The project was led by a Danish project manager, and the core team, which conducted the actual development work, included a few IT architects, IT specialists and business people placed in Denmark. The stakeholders included people from the other Nordic countries.
The project held bi-weekly meetings. According to the project manager, it had been a challenge to try to work together in the newly merged corporation across the old organizational and geographical boundaries. Therefore, they had decided to meet this frequently in spite of the cost of transportation.
According to the project manager, the QP was used to store and share all relevant
information about the project, including documentation of the process and the solution, as
well as the documentation of decisions, meeting minutes, IT solution documentation, business
process documentation, presentations and material from suppliers of technical solutions.
The QP was started at the initiative of the project manager, and he also decided the
initial folder structure. The management of the folder structure had later been delegated to
people responsible for different parts of the project.
GIC
The GIC QP served as a communication tool for a Nordic organizational unit, which was
formed after the merger. The acronym GIC stands for Group Identity and Communication and
the organizational unit consisted of people working with external and internal
communication. The responsibility for the Danish Intranet was placed in this unit and it also
included the people involved in translation work. There was a large overlap between the members of the GIC and the IC QP. The GIC QP was used for meeting agendas, meeting minutes and the holiday list, and was also used to store presentations that had been used at different venues.
The GIC QP was started at the same time as the organizational unit, and the decision to
use it was taken at a workshop where most of the members of the organizational unit were
present. It had four managers who could invite members and maintain the folder structure.
IC
The IC QP served primarily as a tool for a number of translation tasks. It was used for the translation of the financial reports, which were released quarterly. It was also used for the production and translation of an internal magazine for all employees in the organization and for the translation of press releases, as we will study in detail in the next section.
The manager of the translators in the organization started the IC QP. As to the reasons
for starting to use the QP, she states:
“If we should ever make the translation process work, we needed some drive or something where one could be sure that what was there was the right one.”
From the beginning, the purpose of the IC QP was not to inform people, as in GIC where meeting minutes were distributed; it was to be used for supporting specific work processes.
“I mean we use it for other purposes than just a medium to keep people informed. One
could see it as a place where you go get some things and work on them and put them back
when you are finished.”
They had been using e-mail and LAN-drives previously, but had had problems with controlling versions of documents. Also, the e-mail system and LAN-drive were not available after the merger, when they needed to communicate across country boundaries.
The number of documents used in the three QPs shows a striking difference between IC and the two other QPs: IC had approximately four times as many documents as the two others.
The matrix also shows that the average document contained one attachment in GIC and
IC while the average NP_solo-ID document contained two. It is therefore safe to say that
most of the actual contents of the documents used across the QPs are stored in attached files.
The three genres we will analyze later also share this characteristic. The QP technology
allows for a number of different ways of storing contents, but the one used here was to attach
a file to a document. This meant that the contents of the file were not directly visible before
the file was downloaded.
The average number of edits, between 0,2 and 0,4, indicates that most documents were not edited after they had been created. The QP technology offers several possibilities for having more than one user edit a document, and it automatically locks the document to prevent two users from editing it at the same time. These possibilities are hardly used in the three QPs and, as the general statistics will show, this is also the case across all QPs.
The document use statistics for the three QPs shown above hide an important characteristic of how documents are used. A large proportion of the documents in the QPs can be characterized as dead documents. A dead document is a document where the number of users is one and the lifespan is less than a day; it is therefore a document that is created by a user and never used again by anyone.
For GIC, 87% of the documents were dead. This indicates great difficulties in establishing genres of communication that utilize the QP. The average share of dead documents across all QPs at Beta was 72%, so the difficulty of utilizing the QP was a general phenomenon, which was especially striking in the case of GIC. While the percentage of dead documents in NP_solo-ID was close to the average, IC had a significantly lower percentage. It seems, therefore, that IC had been more successful in utilizing QP.
A phenomenon well known from LAN-drives is that they are used as private archives. The analysis I conducted in the IT development project, before the QP study, included an analysis of the structure and use of the LAN-drive used by the project. The conclusion was that it was mostly used as a private archive, where people placed their files and were the only ones to access them. One might suspect that this was the case in QP as well. A document in QP that served as a private archive was a document with only one user and a lifespan of more than one day. The tendency to use QP as a private archive was, however, not prevalent, neither in the three QPs nor generally: the average across all QPs was 4%; for IC it was 2%, for GIC 1,4% and for NP_solo-ID 6%.
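The definitions of dead documents and private archives can be operationalized directly on per-document log statistics. The following sketch illustrates this classification; the field names are my own and do not reflect the actual QP log schema.

```python
from dataclasses import dataclass

@dataclass
class DocStats:
    """Per-document usage statistics aggregated from the log
    (illustrative fields, not the actual QP log schema)."""
    n_users: int          # distinct users who touched the document
    lifespan_days: float  # days between first and last logged action

def classify(doc: DocStats) -> str:
    # Dead: created by one user and never used again by anyone.
    if doc.n_users == 1 and doc.lifespan_days < 1:
        return "dead"
    # Private archive: one user only, but revisited over more than a day.
    if doc.n_users == 1:
        return "private archive"
    # Living: used by more than one user.
    return "living"
```

Applied over all documents of a QP, this yields exactly the percentages reported above.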
The table below is cleansed of dead documents and personal archives and clarifies the different patterns in the QPs. Only documents with more than one user are included.
Statistics of   # of        Avg.       Avg. #   Avg. #    Avg. #   Avg. #      Avg. #
living docs     documents   lifespan   users    uploads   reads    downloads   edits
IC              255         12,1       4,2      1,4       8,1      4,6         0,6
GIC             8           16,6       17,9     1,3       54,0     14,6        0,1
NP_solo-ID      36          40,0       4,7      1,7       11,6     6,4         1,2
The GIC QP use was actually limited to 8 documents, whose full lifecycle fell between 5/5 2001 and 5/10 2001. On average, these were read 54 times by 18 users. Compared to the other QPs there
were significantly more readers. The GIC QP seems primarily to have been used for
publishing very few documents to a larger group of people such as the management meeting
agenda genre analyzed in the next section.
A document in NP_solo-id was edited 1,2 times on average, which was more than the
two others. This makes sense if the QP works as a project repository where documents are
worked on while they reside in the QP.
For all three QPs, more than 95% of the activity was conducted by half of the users. The other half of the users were therefore very peripheral to the QPs and should be characterized as virtually inactive. The table also shows that IC had a larger proportion of “core” users: the most active 5% of its users had a lower share of the activity, while the most active 25% had approximately the same share of activity as in GIC and NP_solo-ID.
Another way of characterizing the users is to look at the author/reader ratio. Apart from
the level of activity, a distinction should be made between the users who only read documents
from the QP and the users who authored documents meant for others.
                              Number of users   Number of authors   Author percentage
International-Communications  81                29                  36%
GIC                           108               8                   7%
NP_solo-id                    97                8                   8%
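The author percentage in the table is simply the number of authors divided by the number of users, rounded to a whole percent. A minimal sketch, using the figures from the table:

```python
def author_percentage(n_users: int, n_authors: int) -> int:
    """Share of users who authored at least one document, in whole percent."""
    return round(100 * n_authors / n_users)

# (users, authors) per QP, taken from the table above.
qps = {"International-Communications": (81, 29), "GIC": (108, 8), "NP_solo-id": (97, 8)}
for name, (users, authors) in qps.items():
    print(f"{name}: {author_percentage(users, authors)}%")  # 36%, 7%, 8%
```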
The image of a limited number of people being responsible for most of the activity is repeated when we look at the percentage of authors. For GIC and NP_solo-ID, respectively, 7% and 8% of the users were authors, while the rest only read the documents. While the tables do not document the link directly, the authors were in the group of the most active users.
There is a striking difference in the author percentage between IC and the two others. This matches the descriptions derived from the interviews and the survey. While GIC and NP_solo-id were used to distribute information (e.g. meeting minutes) and document work (e.g. solution documentation and presentations), IC was used to support work processes in which people posted articles for the internal magazine, or press releases that had been translated; therefore 36% were authors of documents. It is clear that for GIC and NP_solo-id the task of distributing information and documenting work was executed in a centralized manner.
The distribution of authors and readers across all QPs shows that 11,9% of the users
were authors. The patterns of GIC and NP_solo-id are therefore much closer to the overall
image of how the usage of QP was divided among the users.
We have seen a wide spectrum in the activity levels of the users. The following table shows statistics on how often the average user visited the QP. The numbers are measured on a per-week basis. The week unit was chosen based on the hypothesis that a user who has visited the QP once a week can be considered active. The level of activity measured across the whole period, as used above, gives no indication of how concentrated the usage was; in principle, the most active user could have used the QP during only one week, being otherwise totally inactive.
                              Avg. no. of users   Avg. weeks of       Avg. weeks of activity /
                              pr. week            activity pr. user   total no. of weeks
International-Communications  18,8                9,3                 25%
GIC                           19,5                6,9                 19%
NP_solo-id                    17,9                7,1                 19%
The table shows that 19% of the users of GIC and NP_solo-id used the QP during an average week; in other words, the average user used the QP every fifth week. The average user in IC used the QP every fourth week.
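The right-hand column of the table is the average number of active weeks per user divided by the length of the log period. A sketch of that calculation; the 37-week log period is an assumption inferred from the reported percentages, not a figure stated in the log:

```python
# Assumed length of the log period in weeks (inferred, not stated in the log).
TOTAL_WEEKS = 37

def activity_share(avg_weeks_active: float) -> int:
    """Average weeks of activity per user as a whole-percent share of the log period."""
    return round(100 * avg_weeks_active / TOTAL_WEEKS)

# Average weeks of activity per user, from the table above.
for name, weeks in [("International-Communications", 9.3), ("GIC", 6.9), ("NP_solo-id", 7.1)]:
    print(f"{name}: {activity_share(weeks)}%")  # 25%, 19%, 19%
```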
The following graphs display the number of users for each week in the log period. The
week numbers on the x-axis correspond to calendar weeks.
<<Chart: NP_solo-id — number of users per week (calendar weeks 23–53 and 1–7); y-axis: no. of users>>
NP_solo-id shows a rather stable number of users per week. Between 15 and 30 users used the QP, except during the public holidays, when the number of users dropped significantly. The activity seems to have dropped in the last period, which might indicate that the project was ending or entering a new phase.
<<Chart: IC weekly users — number of users per week (calendar weeks 23–53 and 1–7)>>
The number of IC users increased rather dramatically, from a level of 6–8 users before the summer holiday to a level of 25–30 after it. This is probably because the IC QP had started to support new tasks. The press release which we are going to analyze in detail as a genre was produced in week 33; this was the first time the QP was used for the translation of press releases. The use of IC to support the production and translation of the internal magazine also started after the summer holidays. After the dramatic increase, the weekly number of users stayed between 15 and 30 for the rest of the log period.
<<Chart: GIC — number of users per week (calendar weeks 23–53 and 1–7)>>
Compared to NP_solo-ID and IC, the GIC QP had a much more unstable number of weekly users. Apart from the peak in week 47, the graph also shows a tendency towards fewer users per week. The activity during the late autumn and early winter was at the same level as during the summer holiday. As we will see later in the section on folder structures, the peak in week 47 was concurrent with a major reorganization of the GIC QP. Similarly, the dramatic increase in the number of weekly users in IC occurred concurrently with a major reorganization of the QP. While the IC reorganization seems to have been a success, the dropping number of users in GIC might indicate that the reorganization had been less successful. We will deal with this in more detail in the section on folder structures.
share presentations. The interpretation of "sharing" changes radically when provided with the
knowledge that all but 8 documents in GIC were never used by anyone. The term "share" as
Linda used it therefore means that she had the intention of sharing the presentation with
others. A study of GIC based on interviews would have required interviews with all 108 users
of GIC to provide this insight.
As the example with Linda shows, communication does not follow from good intentions.
In the context of this thesis the concept of a genre of communication is used as a basis for
understanding what makes QP a medium for communication. The analysis of instantiations
where a genre is established with QP as the medium will provide us with a greater
understanding of this issue.
1. The authors of articles for the internal magazine place their finished articles in the IC
QP in the folder named after the language in which it is written. (In appendix 12 the
folders called "Nordic ideas - English" and so on can be found in depiction of the IC
folder history.)
2. The translators download the articles and translate them into English (typically one
translator per source language). Once they have completed the draft translation, they
e-mail it back to the author and when he/she has accepted the draft it is sent to a
proof-reader. The translator then places the final English version in the “Nordic
ideas - English” folder.
3. A deadline is agreed by which all the articles should be available in English. The
   translators then translate the articles into the other languages, and when they have
   completed the translations, they also place them in the QP.
If an article is changed after the translation process has started, the author must place the new version in QP and notify the involved translators by telephone or e-mail. By the deadline, all the articles should be available, collected in the folders by language.
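The steps above can be summarized as a simple sequence of document states. The sketch below is my own schematic rendering of the workflow; the state labels are mine, not terms used at Beta:

```python
from typing import Optional

# Each article moves through these states; the medium used at each step
# (QP or e-mail) is noted in the comments.
TRANSLATION_STEPS = [
    "author places article in language folder in QP",             # step 1 (QP)
    "translator downloads article and drafts English version",    # step 2 (QP)
    "draft e-mailed to author for acceptance",                    # step 2 (e-mail)
    "accepted draft sent to proof-reader",                        # step 2 (e-mail)
    "final English version placed in 'Nordic ideas - English'",   # step 2 (QP)
    "after deadline, article translated into other languages",    # step 3
    "all language versions placed in QP",                         # step 3 (QP)
]

def next_step(step: str) -> Optional[str]:
    """Return the step following `step`, or None when the workflow is done."""
    i = TRANSLATION_STEPS.index(step)
    return TRANSLATION_STEPS[i + 1] if i + 1 < len(TRANSLATION_STEPS) else None
```

The point of the rendering is that the QP appears only at the hand-over points of the workflow, while the intermediate exchanges run over e-mail, an observation that the log analysis below will confirm.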
This description of the genre of translation is taken from the translation of the internal magazine. While it may not be identical in its details to other translation processes, it exemplifies a genre that is used for the translation of the annual and interim financial reports, the magazine and the press releases.
Yates and Orlikowski (1992) identify three criteria for defining something as a genre of communication:
“…that is, the business letter and the recommendation letter, the meeting and the
personnel committee may all be designated as genres of communication if there can be
identified for each a recurrent situation, a common subject (either very general or more
specific), and common formal features.” Yates and Orlikowski (1992)
The translation process can be defined as a genre of organizational communication in the
following way:
The recurrent situation occurs when someone in the organization requests that the translation unit translate a certain document. Most of these tasks recur at fixed intervals: the financial reports occur each quarter, and so do the press releases connected with the release of the financial report; the internal magazine is also published four times a year. Once the request is made, the manager of the unit (Lone) initiates the work by informing the translators and by sending them the document to be translated. A translation process always has a deadline, so the length of the process is always known before it starts.
The common subject of the translation genre is always that it is about the translation of
a document. The subject of the documents to be translated of course varies.
The common formal features of the translation process are described above for the translation of the magazine. The translator collects the document from the QP or receives it via e-mail (as we shall see, e-mail is used in the translation of the press release). In some cases it is first translated into English and then from English into the other languages. A translation is typically conducted by one person. In cases of tight deadlines, as with the financial reports, the translation process begins on drafts of the documents. This requires that the translators coordinate the translation with the author of the original text; in these cases the translation to one language requires more cycles, as the original document is finalized in parallel with the translation. In the case we will study in detail, the document was, however, finalized before the translation process started.
The detailed analysis of the translation of a press release is based solely on data from the log file analysis. This allows us to focus on the specific role of the QP technology in the genre of translation, and on the limits of interpretations made from log data. The visualizations of the process can be found in full in appendix 13.
The first indication of the translation process in the log file is that Eva Berg, on 15/8 at 12:00, uploaded two files to a document (a document is equivalent to an HTML page which can contain attached files), which we shall name “Danish”. These files were named 16press-dk.doc and 16aug-dk.doc. The “16” probably refers to the deadline of the translation process. As a qualification of this interpretation, the Beta website had an announcement dated 16/8, which stated that the interim financial report would be released on 22/8.
The chart below shows the full lifecycle of the “Danish” document. Each dot represents an action, and a number denotes the type of action (8 for uploading an attachment, 7 for loading the document in a browser and 2 for downloading a file attachment). The graph shows a very short lifecycle of the document. After the last download of the attached files at 9:00 on 16/8 by Lena, the document was not touched again in the log period, and neither was it deleted, as this would have shown in the log. The document was not edited after uploading, since this would have produced an action of type 4. This means that no new versions of the attached files were uploaded in the document's lifecycle.
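For readability, the numeric action types referenced in this analysis can be decoded as in the following sketch, which renders a document lifecycle from log rows. The row format shown is illustrative, not the actual QP log format:

```python
# Action types referenced in the analysis (other types exist in the log).
ACTION_TYPES = {
    2: "download file attachment",
    4: "edit document",
    7: "load document in browser",
    8: "upload attachment",
}

def lifecycle(rows):
    """Render a readable lifecycle from chronologically ordered
    (timestamp, user, action_code) rows."""
    return [f"{ts}  {user}: {ACTION_TYPES[code]}" for ts, user, code in rows]

# The "Danish" document, as reconstructed from the log (illustrative rows).
danish = [
    ("15/8 12:00", "Eva Berg", 8),  # uploads 16press-dk.doc and 16aug-dk.doc
    ("16/8 09:00", "Lena", 2),      # last download of the attached files
]
```

The absence of any action of type 4 in such a listing is what licenses the conclusion that a document was never edited after upload.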
Eva Berg, who uploaded the Danish files, also uploaded the Norwegian and the English versions later on the same day. On the next day (16/8), at 11:00, she created a new document (which we name “samling1”) and uploaded all language versions of the two press releases to that document, amounting to ten attached files. By the time she did this, all language versions were available as documents in the QP: Lone had uploaded the Swedish version and Kati had uploaded the Finnish version during the afternoon of 15/8.
The following chart summarizes the actions performed by Eva Berg on all the
documents involved in the translation process.
<<evaberg0.jpg>>
If we assume that QP was used as the coordination tool for the translation process, we would predict that Eva Berg had downloaded the Swedish and Finnish versions from the QP in order to be able to collect them in “samling1”. However, this was not the case. Some other medium must have been used to send her the files, and probably the e-mail system was used. Eva Berg's role in the translation process could therefore be characterized as that of a proxy in relation to the QP. A proxy is a widely used concept in IT, meaning someone who does something on behalf of others. Eva Berg acted as a proxy in relation to the use of QP in the sense that the translators of the Swedish and Finnish versions sent the documents to Eva by e-mail instead of uploading them directly to the QP. The reason for the proxy role might be that the two translators stuck to the way the translation process had worked before, when e-mail was used. Unfortunately, the log data cannot tell us how widespread the proxy role is in QP use.
If we turn to the other people using the “Danish” document, all of them either simply looked at the document or downloaded the attached files, and all these actions happened in the period from the creation of the “Danish” document to the creation of the “samling1” document.
In addition to the observation made on the actions of Eva Berg, another observation strengthens the interpretation that the QP was used as a secondary medium for coordination: approximately five hours after Eva Berg had created the “samling1” document, Lone, the manager of the translation unit, created another document (“samling2”) and uploaded all ten files.
As the chart for Lone shows (please refer to appendix 13), she had herself uploaded the Swedish versions and downloaded the Finnish versions. In order for her to upload the ten files, she must have had them sent to her, probably by e-mail. Eva Berg acted as a proxy, and so did Lone, who uploaded the Swedish versions (she was Danish, had no special knowledge of Swedish and therefore could not have been the translator).
From the first analysis of the detailed process we can make two interesting observations:
1. While Lone described QP as the medium used to coordinate the translation process,
   the log file analysis tells us that at least e-mail was also used. Actually, it seems that
   e-mail was used as the primary way of routing the documents from the translators to
   the people responsible for publishing them.
2. Eva Berg and Lone Kaas seem to have acted as proxies for some translators, while
   other translators seem to have uploaded documents themselves.
Lone reported in the interview that the e-mail notification function was used when a document was published. While the files were sent as e-mail attachments as the primary means of routing the work, the QP notification function must explain the observation that the files were nonetheless downloaded from the QP and read by others.
Six people read the “Danish” document during its short lifecycle. All these readings took place before all five language versions (ten files) were collected in “samling1”. The two readings by Kati and Tore of the “Danish” document were of action type 7, which means that they were simply checking to see whether the documents were there. Kati uploaded the two Finnish files, and immediately afterwards she loaded the “Danish” document, as if she were checking that the Danish version was also there. This indicates that the QP was used as a way of checking the status of the translation process.
The other four readings of the “Danish” document were of action type 2, which means that one or more of the attached files were downloaded. Susanna, Claus and Lena downloaded both files, while Per downloaded one file and then, after an hour, downloaded the same file again.
The six people who read the “Danish” document must all have been notified of its publication by means of the notification function in QP. We can only speculate about their reasons for reading the document, but it is likely that at least some of them read it for proofing or approval. If they provided input on the Danish press releases to the translator Eva Berg, they must either have e-mailed the comments, phoned her or, most likely, have walked across the room and provided the input face-to-face: all Danish members of the communications department were located in one open office space. The finding, as in the analysis of the actions of Eva Berg and Lone Kaas, is that QP was not used as the only tool to support the coordination of the translation process. The use of QP was mixed with the use of e-mail, phone calls and face-to-face conversations.
While the hypothesis that the readings of the “Danish” document had a correction and approval function could not be validated, it was validated for one of the readings of “samling1”. Two hours after Eva Berg published “samling1”, Jens downloaded the two Danish files from it. Jens was the press officer for Beta in Denmark and was therefore responsible for all communication with the Danish press, including the two Danish press releases analyzed here. We conducted an interview with Jens, in which he specifically addressed his role as approver.
A number of people read “samling1” and, after a three-hour break in activity in the QP, Lone created a new document, “samling2”, and uploaded all ten files. There is one probable explanation for this: the readers of “samling1” had provided input on the different language versions, and the updated versions had been routed via e-mail to Lone. Again the finding is the same: QP was used together with e-mail, phones and face-to-face encounters.
What is then the role of QP in this genre of communication? From the interview with
Lone, it seemed that QP was used as the coordination tool for the translation process. The
result of the analysis of the log files gives us a rather different picture.
Firstly, the primary routing of documents from the translators was only partially done using QP. E-mail must have been a central part of it; otherwise we cannot make sense of the action patterns observed in the log file. The routing of documents is, however, not the only relevant aspect of the translation process, and in the other aspects QP seems to have played a more central role. The diagrams show that for all published documents regarding this translation process, a number of people downloaded (and probably read) the attached files, either simply to inform themselves of the contents of the press release, or to provide corrections or approvals. In the cases where the readers fed comments back into the translation process, they did so by other means than the QP.
Secondly, the use of QP was limited to simple processes of uploading and reading. The files that were posted in the QP were not edited after their creation. The back-and-forth exchanges of new versions between the translators and readers must have been accomplished using e-mail. This initially seems strange: QP provides good facilities for locking documents in and out, and thus supports a document being edited several times by different people without the risk of versions getting out of sync.
In the interview, Lone stated that “…only the things that are finished are put in the Quickplace”. This might provide an explanation for the observed pattern that documents were simply uploaded and read, without any edits. However, the logical consequence of the principle stated by Lone would be that once the “Danish” document was uploaded, it would serve as the final document. This is not the pattern observed in the diagrams: the QP was in fact used for draft versions, which were later collected in a final version. What the principle stated by Lone does explain is why the revision processes were divided into simple upload and read processes in the QP.
One of the questions arising here is whether they were using QP as intended in its design. In the words of DeSanctis and Poole (1994), we might ask whether it was used according to “the spirit” of the technology. In this context, the question might be answered with both a yes and a no. The no answer would argue that QP has facilities for handling the full process of translation, including letting people other than the author provide direct input for corrections, and for handling different concurrent versions of the documents; according to this line of argument (which is very common in IS), QP was not used according to its design. The yes answer would argue that they used some of the facilities provided by QP and integrated them with the use of other media in a way which worked to their own satisfaction.
There is a third response to the question of whether QP was used as intended in the design: that the question itself is wrong. The way in which the QP was used together with other media, as observed in this genre of communication, prevents a clear answer to the question. The problem of answering it points to a distinction between design and use that is common in the study of computer systems. It seems that the observations of the use of QP challenge this distinction in terms of who designs and uses, and where and when design and use take place. We will return to this discussion after collecting more evidence that the distinction needs rework when applied to computer media such as QP.
The translation of a press release showed us how QP was used to support a genre of communication that involved several documents and several exchanges of documents. The two remaining genres analyzed are of a simpler nature. They are taken from the GIC QP and represent a significant amount of its activity. As documented earlier, only eight documents were used by more than one user in the period between 5/5 2001 and 5/10 2001, and the two genres account for two of the eight.
The holiday list was one of the most visited documents in the GIC QP. The document was created on 7/5 2001 and was still used at the end of the log period. The analysis presented here only covers the period before 5/10 2001, because a change in the logging format caused by a new version of the QP server prevented proper identification of users after this date.
The holiday list was used as follows: when a member of GIC wanted to plan a vacation
he/she downloaded the Excel sheet from the QP. The first graph shows the uploads, edits of
the document and downloads of the attached Excel sheet containing the holiday list. Souma
was the responsible person for maintaining the holiday list. She uploaded the spreadsheet and
conducted all edits of the documents.
The graph shows extensive downloads of the spreadsheet on two occasions. This is
probably in response to an e-mail from Souma requesting the individual holiday plans from
each member of the organizational unit.
A closer look at Souma's actions reveals that she performed 14 edits of the document during the period of analysis. Each edit was preceded by a download of the spreadsheet, indicating that the QP was used to maintain the master copy of the sheet.
A graph of five days of activity in June illustrates how the holiday list was maintained.
The triggering event was that some user downloaded the spreadsheet. Probably he/she
was planning a holiday and wanted to check with the other members' holiday plans. The next
observation in the log was that Souma downloaded the spreadsheet and thereafter edited the
document, indicating that she was uploading a new version of the spreadsheet. Like in the
case of the translation genre some other media must have been used in between the two
events in the log. Probably the user who downloaded the spreadsheet sent an e-mail to Souma
telling her when he wanted to have a holiday. She then read the e-mail and made the
appropriate changes to the sheet and uploaded it.
Apart from the e-mail to Souma, the holiday was probably coordinated directly with colleagues, either in face-to-face conversations, by phone calls or by e-mail.
The observation in this case is the same as with the translation regarding the combination of media. The spreadsheet in QP was used for some aspects of communicating and coordinating the employees' holiday plans, while other aspects relied on e-mail, telephone and face-to-face conversations.
It is also characteristic that one person maintained the holiday list. Souma was a
secretary and therefore had no power to decide when people should take their vacation. It
would have been simple to distribute the update of the spreadsheet to the individual users. It is
a general observation, as we will see in the statistical analysis of document lifecycles, that
only in extremely rare cases do more than one person make changes to the same document.
The upload of the document by Souma must have been accompanied by an e-mail
notification, explaining why so many people were aware that the document had been
uploaded.
The meeting agenda genre served to inform the employees of the GIC department about
the subjects of the management meeting, and it is a simple example of one-way
communication. According to an interview with Linda, the publishing of the agenda also
served as a chance for people to respond to the agenda and provide input or comments. The
log documents that the responses were not mediated by the QP, but were probably mediated
by the telephone or face-to-face communication.
The downloads of the meeting agenda 8 and 12 days after the meeting was held indicate
that it also had an archival function.
Analyzed in the terms of the genre theory, the recurrent situation initiating the genre is
the production of the agenda for the management meeting. This agenda is agreed in the
management group, and for this process the QP is not used. Once the agenda is settled in the
management group it is ready for publishing. The publishing of the document should take
place before the meeting so that employees have a chance of commenting or contributing information or insights.
The common subject for the genre is of course the agenda of the management meeting.
The subjects dealt with at the different meetings will vary, but the stable factor is the fact that
it is the agenda for the GIC management meeting.
The form of the genre is that the secretary of the manager uploads the agenda once it is
settled and notifies all employees of the department of the publication. Some of the employees then read the agenda and possibly respond to it using some other medium than the QP.
The point made here is not that I as a researcher or system developer have overlooked some
crucial little detail in the coordination work that makes me misinterpret the work situation.
Neither is it the point that some ignorant system developer has designed the software so that it
cannot be used for the task, which it was meant for. The point is rather that the existing genres
of communication, in which e-mail is a well-established medium, are social structures that are
not changed from one day to the other.
An interesting difference between the three QPs studied in detail hinges on the
relationship to existing genres of communication. The use of NP_solo-ID and IC was characterized by the fact that they served well-established genres in the organization. Projects had exchanged descriptions and kept project archives, and the translators had translated and coordinated their translations before the introduction of the QP technology. In the case of
GIC it was harder to identify the genres of communication on which it should be based. While
the communication needs of a project and translation tasks are pretty straightforward, on the
level of an organizational unit, they are not. The two genres that seem to have been successful
in GIC were the meeting agenda and the holiday list. These genres have most likely been
established at the time the GIC organizational unit was formed. The holiday list was not a
new genre and the members of GIC must have been familiar with it before it was put in QP.
In the case of the management meeting agenda, the genre was certainly a new invention that
was connected to the merger of the communications departments into GIC. Before the merger
the Danish organization was characterized by a fairly simple structure with one manager. The
meeting agenda genre was therefore a new genre that was unique to QP. The question is in
what sense it was uniquely integrated with the QP technology.
The meeting agenda genre provides an opportunity to pinpoint the relationship between
the QP technology and the social structures of Beta and highlight the difference between the
QP technology as an artefact and the technology-in-practice. Since the management meeting
agenda is unique to the QP technology one might speculate whether it is a consequence of
introducing the QP technology and whether the genre is directly related to unique properties
of the QP technology. This is hardly the case even with a new genre. First of all, the QP
properties utilized in the management-meeting genre are not unique. E-mail could have
served exactly the same purpose of distributing the agenda. Secondly, the genre was introduced as a consequence of changes in the social structures: it arose because the communication departments had merged into one unit, with a resulting new management structure.
It is probably the case that the decision to publish the management meeting agenda for
the purpose of getting input and informing employees was influenced by the presence of a
communication infrastructure that allows easy distribution of information to a group of
specific people. Only in this abstract sense do the properties of the technology artefact affect the genres of communication. We cannot pinpoint specific properties of the artefact that have caused specific properties of the genre. Rather it exemplifies the emergence of a
technology-in-practice in the form of a genre of communication that includes utilizing some
properties of the artefact.
Up till now, the reports on the use of QP at Beta have exhibited a continuous zooming in
from an overall characterization to the analysis of three specific instantiations of genres in
which the QP was integrated. In the next section we shall zoom out again and attempt an
overall characterization of QP document use. The next section is solely devoted to document-
based HTTP-log analysis that will both illustrate the potential of this kind of log analysis and
provide some interesting findings for the characterization of QP use.
Different results can derive from the analysis of log files. When the global models are compared to the instantiations of the genres, the trade-off between investigating many aspects of use in one situation and investigating one aspect across a number of situations becomes clear.
The first step of the process of deriving the basic document life cycles is to define how a
document life cycle should be represented. This determines the relevant methods available in
statistics. We worked on two different definitions, which were distinguished by the manner in
which time was represented.
1. A document life cycle is represented as a sequence of different types of actions on
the same document.
2. A document life cycle is a collection of properties of a document that characterizes
how it has been used.
The first representation is the most complex but would also cover more aspects of use. It would distinguish a document that was first edited two times and then read seven times from one that was first edited, read three times, edited once more and finally read four times. Sequence analysis is the relevant statistical method for this representation of a document life cycle. It analyzes a sequence of events and produces results of the form: event 1, event 2 -> event 3. This can be read as: if event 1 occurs and event 2 occurs after event 1, then event 3 will occur after event 2 with a given probability. Sequence analysis is used in HTTP-log analysis for analyzing click streams.
The click streams are the most probable paths through a web site. It has been identified as a
very useful method for analyzing shopping behaviour in on-line stores (Srikant and Agrawal
(1995), Srikant and Agrawal (1996)). The sequence analysis was tried out using a matrix
derived from the database and the sequence analysis algorithm in Clementine from SPSS but
did not produce useful results. The main practical problem was to set the time limits of events (within what period of time should actions on a document be grouped as one event, and what period of time should separate two events). The difference in lifespan between the
meeting agenda and the holiday list very clearly illustrates this problem. It was simply not
possible to define these time limits because of an extreme variance in the data. Besides the
practical problems, a general limitation of sequence analysis is that it cannot represent users.
It can only analyze types of events over time.
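The rule form described above can be sketched as a toy illustration of sequential rules in the spirit of Srikant and Agrawal's sequential pattern mining; this is not the Clementine algorithm used in the study, and the event names and log data below are invented for the example.

```python
# Toy sketch of a sequential rule: estimate the confidence that a consequent
# event follows an ordered antecedent pattern within per-document sequences.
# Event names and the example "logs" are illustrative assumptions.

def follows_in_order(seq, pattern):
    """True if the events in `pattern` occur in `seq` in that order."""
    it = iter(seq)
    return all(event in it for event in pattern)

def rule_confidence(sequences, antecedent, consequent):
    """Confidence of: antecedent (in order) -> consequent afterwards."""
    matches = [s for s in sequences if follows_in_order(s, antecedent)]
    if not matches:
        return 0.0
    hits = [s for s in matches
            if follows_in_order(s, antecedent + [consequent])]
    return len(hits) / len(matches)

# Toy document life cycles represented as ordered event sequences
logs = [
    ["upload", "read", "edit", "read"],
    ["upload", "read", "read"],
    ["upload", "read", "edit"],
]
conf = rule_confidence(logs, ["upload", "read"], "edit")  # 2 of 3 sequences
```

The time-limit problem noted above would appear here as the prior choice of how to segment the raw log into such sequences, which is exactly what the variance in the data prevented.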
The second representation selects a number of properties of individual documents that are descriptive of their life cycle: for example, the lifespan, the number of users, the number of downloads etc. of each document. In statistical terms this means an analysis of multiple
variables simultaneously. The statistical definition of finding types of document life cycles
based on multiple properties is to group instances according to multiple variables, known as
non-hierarchical clustering. The most popular method for non-hierarchical clustering is K-
means clustering (See e.g. Hand, Mannila et al. (2001)). The K-means clustering process
starts by choosing the number of clusters one wishes to end up with as a result of running the
algorithm on the data. Initially the algorithm defines k cluster centres in an n-dimensional
space (where n is the number of variables). Thereafter it iterates through all instances and
assigns the instances to their closest cluster and recalculates the cluster centre. This process is
iterated until no change occurs in the assignment of instances.
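The iterative steps just described can be sketched in a few lines. This is a minimal illustration of k-means, not the Clementine implementation used in the study; the 2-D toy data and the deterministic choice of initial centres (simply the first k points) are assumptions made for the example.

```python
# Minimal k-means sketch following the steps described above.
def kmeans(points, k, iterations=100):
    # choose k initial cluster centres; here simply the first k points
    centres = [points[i] for i in range(k)]
    assignment = [None] * len(points)
    for _ in range(iterations):
        # assign every instance to its closest cluster centre
        new_assignment = [
            min(range(k),
                key=lambda c: sum((p - q) ** 2
                                  for p, q in zip(pt, centres[c])))
            for pt in points
        ]
        # iterate until no change occurs in the assignment of instances
        if new_assignment == assignment:
            break
        assignment = new_assignment
        # recalculate each cluster centre as the mean of its members
        for c in range(k):
            members = [pt for pt, a in zip(points, assignment) if a == c]
            if members:
                centres[c] = tuple(sum(dim) / len(members)
                                   for dim in zip(*members))
    return centres, assignment

# two obviously separated groups of 2-D points; k = 2 recovers them
data = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
        (5.0, 5.0), (5.1, 4.9), (4.9, 5.1)]
centres, labels = kmeans(data, 2)
```

In practice the choice of initial centres matters; production implementations typically use random or spread-out starting points rather than the first k instances.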
The weakness of this representation of the document life cycle is that it ignores the order
of events. Instead it gives us a possibility to choose any property of the document life cycle
for defining the types.
Choosing a sample for the analysis
As the basis for analyzing document life cycles we used a matrix that collected data for all documents used across all QPs in the log period. This amounts to 14049
documents. Because the server was upgraded on the 5/10 2001, we had to limit the analysis to
documents with a life cycle that ended before this date. The server upgrade changed the log
format and meant that e.g. the calculation of the number of readers of each document would
be misleading after 5/10 2001. This left us with 5826 documents. The amount of documents
allowed us to perform the analysis without further sampling. It is not possible to determine whether the sample represents all 14049 documents.
Selecting the properties of the documents
For all documents we calculated how long the document had shown activity in the log, how many times it had been accessed, and by how many different people (for all details on the matrix, please refer to Appendix 6).
Specifically for the cluster analysis a number of properties were chosen to characterize
each document life cycle. These properties were selected because they were relevant for the
characterization of the life of a document. For all documents in the sample we calculated the
following properties and collected them in a matrix:
Property            Definition
No. of uploads      occurrences of action type 8
No. of edits        occurrences of action type 4
The document properties provide a simple picture of how documents were used ignoring
special actions such as document moves and ignoring the sequence in which actions were
performed.
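The construction of such a per-document matrix from a raw action log can be sketched as follows. Only the two action-type codes from the table above (8 for uploads, 4 for edits) are taken from the text; the log records, document ids and function name are invented for the example.

```python
# Sketch: derive a per-document property matrix from (document_id, action_type)
# log records. Action codes 8 = upload and 4 = edit follow the table above;
# everything else here is an illustrative assumption.
from collections import Counter

def document_matrix(log):
    counts = {}
    for doc, action in log:
        counts.setdefault(doc, Counter())[action] += 1
    # one row of properties per document, ignoring action order
    return {doc: {"uploads": c[8], "edits": c[4]}
            for doc, c in counts.items()}

log = [("d1", 8), ("d1", 4), ("d1", 4), ("d2", 8)]
matrix = document_matrix(log)
```

Further columns (reads, downloads, number of distinct users, lifespan) would be added in the same way before feeding the matrix to the clustering.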
K-means clustering of the document sample
The process of k-means clustering was iterative. Initially the matrix was fed into the k-means algorithm with its raw values. It turned out that a few documents had very extreme values in some of the variables (e.g. number of reads). This produced very uneven clusters in terms of size: most documents were put in one cluster, and the remaining clusters contained very few documents. The clusters therefore did not provide a broad description of the documents but characterized only the few with extreme values. Because of this the
values of the variables were categorized based on two principles: the number of documents in
each category should be equal and the number of categories should be small and relevant for
interpretation. The categories are described in appendix 15.
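The first principle, equal document counts per category, is in effect equal-frequency (quantile) binning, which can be sketched as follows. The category labels A, B, C follow the appendix convention, but the toy values and resulting cut points are illustrative, not those in appendix 15.

```python
# Sketch of equal-frequency categorization: values are binned so that each
# category holds roughly the same number of documents. Toy data only.

def equal_frequency_bins(values, n_bins):
    """Return a label (A, B, ...) per value, with near-equal bin sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = [None] * len(values)
    for rank, i in enumerate(order):
        # integer arithmetic splits the ranks into n_bins equal groups
        labels[i] = chr(ord("A") + rank * n_bins // len(values))
    return labels

reads = [0, 1, 1, 2, 4, 7, 9, 15, 40]   # toy "number of reads" per document
labels = equal_frequency_bins(reads, 3)
```

This directly addresses the skewness problem: a handful of extreme values no longer dominates the distance computation, because every category ends up with a comparable share of the documents.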
Another observation from the initial clustering was that a large proportion of the
documents were “dead” documents. 72% of all documents analyzed were only used by one
user and had a lifespan of less than one day. These documents were simply created by a user
and never accessed again. This is an interesting finding in its own right and we shall get back
Mine the gap - a multi-method investigation of web-based groupware use
143
The study of Quickplace use
to it later, but the extreme skewness in the data produced by the dead documents was a
problem for the clustering algorithm. We therefore chose to leave out the 72% from the
cluster analysis. They were very easy to characterize as a type of document life cycle without
clustering and would only disturb the process.
With 28% of the documents and 8 variables we started the clustering process. We had to
make two other adjustments before doing the final clusters. A correlation analysis between
the variables showed that there was no significant correlation between the lifespan of a
document and any of the other variables. So, while lifespan = 0 was important in
distinguishing dead documents, it turned out to be insignificant to the rest of the documents.
This lack of correlation is illustrated in the following graph:
The last adjustment to the data for the k-means clustering was to leave out the number of
readers and the number of editors and just use "no. of users" as the variable describing the
number of persons who had accessed the document. As the hierarchical analysis of document types will show, all but 22 documents had, if they were edited at all, only one user doing the editing.
Setting the number of clusters when doing k-means clustering is a non-trivial task. In some analyses, such as analyzing credit card use patterns to predict fraud, k is straightforward because you want to isolate the cases where a card is used fraudulently.
In our case the number of types of document life cycles was of course unknown at the outset.
We tried out k values from three to twelve. The iterations with different values of k showed
that 5 clusters provided the best grouping of properties. When comparing the clusters from k = 3 to k = 5 it was clear that the 5 clusters evolved from the 3, so that two of the clusters for k = 3
would divide in two for k = 5. For k > 5 the algorithm kept the 5 clusters from k = 5 more or
less the same while the rest of the clusters were very small. The output of the clustering
algorithm for k = 3 to k = 8 can be found in Appendix 16.
The clusters produced for k = 5 were then used as the basis for a characterization of the six types of document life cycles. As an illustration, the interpretation process of Cluster 1 is presented below. The definition of the categorization of the document properties into A, B, C etc. can be found in appendix 15:
Cluster 1 (467 examples):
Uploads: B -> 0.858672, C -> 0.141328. The number of uploads in this cluster is typically 1, because 86% of the occurrences in the cluster have 1 upload.
Edits: A -> 0.732335, B -> 0.152034, C -> 0.115632. The number of edits is 0-1, with 0 being the most typical.
Reads: A -> 0.010707, B -> 0.025696, C -> 0.239829, D -> 0.434689, E -> 0.239829, F -> 0.049251. The number of reads of the documents is between 4 and 15.
Downloads: A -> 0.023555, B -> 0.201285, C -> 0.762313, D -> 0.012848. The number of downloads of attached files is between 1 and 8.
Users: A -> 0.014989, B -> 0.070664, C -> 0.730192, D -> 0.184155. The documents in this cluster typically have more than three users.
If we take the upload property, the numbers show the share of the documents in the cluster that have each categorical value. For example, 0.858672 or 85.9% of the examples have B, which refers to 1 upload, while 0.141328 or 14.1% have C, which refers to more than one upload.
Basic characteristics:
1 upload of an attached file, 0-1 edits, between 4 and 15 reads, 1-8 downloads and more than three, mostly 3-5, users.
Cluster 2 (325 examples): Publish a document with no attachments, which is read by a small number of users.
Basic characteristics:
0 uploads (20 percent have more than 1 upload), 0 downloads and 1-5 users
Cluster 3 (271 examples): Publish an attached file, which is downloaded by a larger number of readers.
Basic characteristics:
More than one upload and more than 8 downloads, more than 10 reads and more than 3
readers (with more than 5 in 79% of the cases).
Cluster 4 (245 examples): Publish a document that is hardly read; if it has an attachment it is not downloaded.
Basic characteristics:
0 downloads and 0 – 3 reads, 0 edits, and 0 or 3 – 5 users.
Cluster 5 (250 examples): Publish one or more files that are downloaded by one or two readers.
Basic characteristics:
1 or more uploads of a file, 0 edits, 1 – 5 reads and 0-2 downloads by mostly 2 users
Cluster 6 (4215 examples): dead documents.
Basic characteristics:
Created by one user and never accessed again.
The first observation from the clusters is that the three genres of communication studied are not typical of the overall use of documents. The meeting agenda genre and the holiday list genre both belong to cluster 3, and the translation genre consists of 7 documents also belonging to cluster 3. Cluster 3 consists of 271 examples or 4.6% of the sample of 5826
documents. It would have been nice to connect the cluster analysis with the genre analysis by
providing a genre analysis exemplifying each of the clusters. There were not sufficient descriptions of use available from the interviews to provide a basis for interpreting the patterns in the log files to exemplify each cluster. Apart from the fact that sufficient data was not available, suggesting a connection between the genre analysis and the cluster analysis would also be unjustified. To suggest that a genre exemplifies a document cluster is true only in the sense that they have properties in common. The fact that all three genres analyzed stem from the same cluster should raise a suspicion as to the relation between genre analysis and
the clustering of document life cycles. Firstly, the suspicion concerns the role of time: for the analysis of genre, time is central, while in the statistical analysis of documents time turned out not to be significant. Secondly, it illustrates the difference between the two analyses. Genre analysis takes into account the specific work practice and the meaning assigned by users to the documents used in the genre. In contrast, it is the purpose of the cluster analysis to ignore the specific work practice and make statements that are generalized from it.
The second observation that does not stem from the k-means clustering process but was
discovered during the initial statistical analysis is that most documents are dead documents.
72% of all documents published are never again used by anyone. Of course one cannot know
for sure whether they will ever be used. The 4215 dead documents have all been published in
the period 5/5-2001 – 5/10-2001, and it is known positively that they were not used more than once before 19/2-2002, when the log period ends. This gives a minimum period of 4 1/2 months over which to measure the “deadness” of the documents.
One kind of conclusion to draw from the percentage of dead documents is to question
the usefulness of the technology. I think it is justified to say that the 72% of the documents in
the QPs might as well not have been there. If people just put stuff there without using it for
anything it might as well be closed down. The genre analysis however shows that it is
certainly possible to use the QP technology as a communication tool for doing work and
coordinating things. One might rather just observe that integrating an open technology such as
QP that can support many different genres of communication is not straightforward.
Obviously a lot of unsuccessful attempts are made to use the QP technology.
For each top-down type of document, the average value of some of the properties was
calculated as a way of characterizing the documents of each type. These averages are
exhibited below.
Document type             % of document sample   Average values for properties
Dead documents            72.3                   Uploads=2.58, Downloads=0.24, Edits=0.12
Short term coordination   4.8                    Users=2.48, Uploads=1.50, Reads=4.45, Downloads=1.67, Edits=0.30
Personal archive          4.1                    Lifespan=12.95, Uploads=2.64, Reads=4.80, Downloads=2.12, Edits=0.90
Publish document          2.5                    Lifespan=29.41, Users=3.86, Reads=10.44, Edits=0.99
Publish files             16.8                   Lifespan=25.83, Users=5.58, Uploads=2.69, Reads=13.88, Downloads=8.17, Edits=0.81
The top-down typology identifies two types of documents not visible in the cluster analysis. Firstly, it identifies short-term coordination. The short-term coordination documents have a lifespan of less than one working day. Some of the documents involved in the translation of the press release have this characteristic. On average they have 1.5 attachments and are read by 2.5 users. Secondly, the personal archive also exists in QP. 4% of the documents in the document sample are used by only one user, and have an average lifespan of 13 days.
The next model serves to clarify how the documents are divided according to how they
are edited. This model only includes documents with a lifespan of at least one day and more
than one user.
The model shows that most documents are not edited at all. While representing 12.7% of the total document sample, they represent 66% of the documents with a lifespan of at least one day and more than one user.
The most striking observation of this analysis is the number of documents edited by
several users. One of the features of the QP technology is that it supports collaboration on
documents, where several users can edit the same document without the risk of overwriting
each other's changes. It can be used to co-author documents without e-mailing the document
back and forth. In the data this happens on 22 occasions, or 1.9% of the documents with a
life cycle of at least one day and more than one user. It may be concluded that this feature of
QP is not used at Beta. The empirical data does not provide a specific explanation for this, but
one explanation can be ruled out. The respondents in the survey were asked whether they
wrote collaboratively with other people.
[Figure: survey responses to the question of whether respondents wrote collaboratively with other people; answer categories Often / Seldom / Never, y-axis 0-60%]
The reason is not that people don't write collaboratively, they just do it without the
support of QP.
Whether some of the dead documents are personal back-ups or not, the observation based on the interview with Linda from GIC can be generalized to the use of QP across the whole organization.
In the analysis of the genre of translation and the meeting agenda, we observed that the
use of QP documents was very simple. All documents followed the pattern where the
document is created and a file is attached, and subsequently read and downloaded by a
number of users. None of the documents were changed after creation nor were more people
involved in changing them. From the typologies we can generalize this observation.
Only 22 documents out of the sample of 5826, or 0.3%, were edited by more than one person. Thus QP is not used as a means for collaborative writing. This does not imply that users of QP do not write collaboratively; they just use other communication technologies to do it. Even though all interviewees complain about the e-mail system as a means of coordinating the production of documents, they seem to use it anyway. It is clear also to the interviewees that the functionality of QP is better suited to support collaborative writing. This finding shows that theories of task-technology fit (Zigurs and Buckland (1998)) are not very useful in understanding the use of QP and e-mail.
The clustering of document life cycles has shown that CMC log-analysis can serve as a
useful tool in a case study. The statistical generalizations of document life cycles have been
used to test the generality of findings from the interviews. Apart from that, the analysis has
also produced findings that are completely invisible to interviewees or respondents in the
survey. The fact that 72% of the documents are dead is not observable through interviews or
surveys. This would have required cross-checking who had accessed individual documents, which is not possible in practice.
The document with the largest number of uses was given rank 1. The number of uses was then
plotted as a function of the rank on a double logarithmic graph.
As the graph shows, the plot is almost a straight line. The calculated Pearson correlation is –0.971, where –1 describes a perfect inverse proportionality.
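The computation behind this check can be sketched as follows: rank items by use count and correlate log(rank) with log(count). The counts below are synthetic, generated from an exact 1/rank law, so the correlation necessarily comes out close to –1; this mirrors the form of the analysis, not the Beta data.

```python
# Sketch of the rank-frequency check: Pearson correlation between log(rank)
# and log(number of uses). A value near -1 indicates a Zipf-like power law.
# The counts are synthetic (an exact 1/rank law), not the QP log data.
import math

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# synthetic use counts following a 1/rank law, most-used item first
uses = sorted((round(1000 / r) for r in range(1, 51)), reverse=True)
ranks = range(1, len(uses) + 1)
r = pearson([math.log(k) for k in ranks], [math.log(u) for u in uses])
```

Plotting the same two log-transformed series produces the double logarithmic graph referred to above.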
The second hypothesis was tested by counting the number of unique users for each of the
QPs. The QPs were ranked in the same manner as the documents and the number plotted as a
function of the rank.
The Pearson correlation is –0.970 or very close to the correlation calculated for the
documents.
The two hypotheses are hereby confirmed by our analysis. Both the number of uses of documents and the number of users per QP are distributed according to Zipf's law. The question arising now is: "so what?" In the context of this thesis it does not seem like an explanation of how QP is used at Beta. It completely ignores the social structures of the organization, which according to the research question in this thesis, as well as in most IS research, are considered crucial for understanding the use of technology. Huberman and Adamic (1999), Adamic and Huberman (2000), Huberman (2001) and others seem to treat it as a kind of natural law of information.
Zipf's law has been applied in the economy of the www (Adamic and Huberman (2000))
to describe the structure of the market of web sites as a "winner-takes-all" market. It has also
been used to design caching algorithms for web-servers (Breslau, Cao et al. (1998), Breslau,
Cao et al. (1999)). The fact that the frequency of access to the documents on a web server is
distributed according to Zipf's law is used to decide which pages should be cached with the
purpose of increasing the performance of the web-server. A number of possible implications
of the confirmation of the Zipf's law hypotheses in our case can be drawn:
Zipf's law is independent of information architecture and centralized authoring.
It has previously been shown that the use of web pages on a web-server is distributed
according to Zipf's law. These analyses have been performed on web pages residing in an
information architecture and for web sites where the information distribution is centralized. In
the context of the QP technology there is no central information architecture, because each QP has its own information architecture, and the number of authors is 394 or 11.9% of the total users.
Zipf's law questions task-technology fit
As we have seen in the section on previous research a lot of research in IS either tries to
build theories on task technology fit or hinges on the notion that there is an ideal fit between
the task at hand and the design of the technology. While not explicitly stated in the literature, a
task-technology fit theory of the use of QP would assume that there is an ideal relationship
between number of users and the use of a QP. According to this hypothesis the distribution of
number of users in a QP would be something close to a normal distribution where most QPs
would have the same number of users and very few would have either very few or very many
users. This is refuted by the fact that the number of users follows a power-law distribution.
While this result does not refute the overall idea of task-technology fit as a way of
understanding the use of IT, it questions whether it is a good theory for understanding QP or
other technologies that are open for diverse kinds of usage.
Zipf's law might solve the archival problems of virtual workspaces
The Zipf distribution of web pages on a web site has been used to design caching
algorithms. The fact that document use follows the same distribution might be used for archival
algorithms for virtual workspaces. One of the problems inherent in using virtual workspaces
in an organization as well as a general problem in the knowledge management technology
discipline is to control the archival of documents that might be valuable to reuse. There is no
archival process defined for QP use and therefore the process of reusing documents across
different projects is not supported. The fact that document use is distributed according to
Zipf's law might be used to design an archival algorithm. By automatically archiving 25% of
the most used documents in QP one would archive all of the documents that have been used
by more than one user. It might turn out to be a quick way of solving a task that is otherwise
extremely complex because it requires individual judgments of an archival person. One of the
problems of centrally archiving documents for later reuse by someone else is also that the
person who created the document does not have an incentive to spend time on the archival
process.
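The archival heuristic described above can be sketched in a few lines. The function name is illustrative, and the 25% threshold is the figure reported for the Beta data; it would need to be re-estimated for any other workspace.

```python
def select_for_archive(doc_use_counts, fraction=0.25):
    """Select the most-used documents for automatic archival.

    doc_use_counts: dict mapping document id -> number of recorded uses.
    Because document use follows Zipf's law, archiving the top `fraction`
    of documents captured, in the data analysed here, every document used
    by more than one user. The default of 0.25 is the figure from the
    text, not a universal constant.
    """
    # Rank documents by recorded uses, most used first.
    ranked = sorted(doc_use_counts, key=doc_use_counts.get, reverse=True)
    # Archive the top fraction; always at least one document.
    cutoff = max(1, int(len(ranked) * fraction))
    return ranked[:cutoff]
```

Such a rule could run periodically against the use counts extracted from the HTTP log, replacing the individual judgments of an archivist with a simple cutoff.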
After this short detour into the world of inferential statistics, we shall return to the question of the relationship between properties of the technology artefact and social structures, drawing on the concept of technology-in-practice from Orlikowski (2000). First the theoretical model is presented, and thereafter the empirical evidence for it. While the empirical evidence is limited to the QP technology at Beta, the model is proposed as a generic one that can be applied to other technologies and certainly to other organizational settings.
The reason for viewing the folder structure as a homeostatic variable in a functional relationship is to offer a view of classification as something other than a more or less scientific discipline of sorting entities into the best structure.
The functional relationship introduces the dynamics observed in the QP folder structures. The environmental conditions for a folder structure change over time; this either causes the folder structure to change or produces an unstable situation. The mere statement that a functional relationship exists, however, provides limited insight. The strength of the model is tested in the identification of the specific structural factors and disturbances.
A functional model is introduced in addition to the classic discipline of classification also because the virtues of the librarian's and the scientist's classification are not the most important here. The model rests on the hypothesis that a good folder structure (classification) is one that aims at equilibrium between the folder structure, the structural factors and the disturbances. This quality criterion is much more important than the virtues of consistency and stability known from the classification discipline (Bailey (1994)).
By selecting a functional explanation we also assume that, given the same structural factors and the same disturbances, the homeostatic variable - the resulting folder structure - will be similar.
The purpose of the following will firstly be to show how the empirical data support the
relationship between the structural factors, the disturbances and the homeostatic variable.
Secondly, the purpose will be to identify the relevant structural factors and disturbances for
the folder structures. Without this identification the model would be too generic and only
describe very few aspects of the data.
For the analysis the three exemplars GIC, NP_solo-ID and IC are chosen, partly because they were chosen for the genre analysis and partly because of the data available from interviews in which the process of structuring the folders is described.
Properties of the technology:
One structural factor is the properties of the technology itself, for example the concept of folders and the possibility of creating sub-rooms. Another is the concept of documents and the possibility of attaching multiple files to each document. The comparative analysis of virtual workspaces also showed that the metaphors of the user interface should be included among the structural factors.
Organizational Classifications:
Organizational classifications are the classifications surrounding the users of the QP. The organizational diagram is an important classification that is visible in GIC. Another is the classification of tasks in a project, which can be observed in NP_solo-ID. Organizational classifications, though, encompass much more than what is visible in, for example, the organizational diagram. They might be captured by the notion of code in semiotic theory (Eco (1976)) or the "symbol scheme" of Goodman (1976).
Genres of communication:
The third structural factor is the genres of communication (and genre systems) supported
by the QP and thus also the folder structure. The genres of communication capture the
communication processes in which the QP is embedded. The work of genres as a structural
factor can be observed in a very direct sense in IC where the genre of translation is reflected
in the naming of folders such as "Press releases - in progress" and "Press releases - info". Of
course the genres are not always reflected directly in the folder structure.
The types of disturbances in the model are new documents, new work tasks, and new members. Disturbances generally represent the changes happening in the environment of the folder structure that cause instability.
New documents:
As is well known from the personal folder structures on a PC, new documents are introduced into an existing structure and some of them do not seem to fit. Over time this creates instability in the equilibrium. The users might not explicitly notice the instability at first, as it develops gradually: even in a period where a folder name remains constant, the new documents put into the folder gradually change the interpretation of the folder.
New work tasks:
The introduction of new work tasks that are to be supported by the QP is an obvious disturbance; it can be observed, for example, in IC, where the translation of press releases is introduced as a new task supported by the QP.
New members:
New members (or users) of the QP also introduce a disturbance. They disturb in two
ways. Firstly, they disturb the equilibrium because they interpret the folder structure
differently than the existing users. Secondly, they may produce instability because they
introduce new documents in the QP and new work tasks to be supported by the folder
structure.
The disturbances are very closely related. The introduction of a new work task that
should be supported by the QP will typically include new documents and perhaps also new
members. Also new members may introduce new documents and tasks.
Columns one and two are snapshots of the number of folders on June 1, a little less than a month after the beginning of the log period, and on January 10, a little more than a month before its end. Column three counts the folders registered on June 1 that were deleted in the period up to January 10. Column four counts the folders created after June 1 that were still active on January 10. This gives a crude picture of the amount of change to the folder structures, ignoring folders that were both created and discontinued between the two dates. The table enables us to calculate a change index that measures the degree of lasting change in the folder structure; the changes are lasting because the index ignores folders that are created and quickly deleted again.

Change index = (new folders + deleted folders) / initial folders
The change index of IC is 0.61, for GIC 0.45 and for NP_solo-ID 0.78. A change index of 0 would indicate a stable folder structure, while an index of 1 would indicate radical change, e.g. a doubling of the number of folders. Based on the change index scores, it is safe to say that none of the three QPs has a stable folder structure.
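The change index can be computed directly from two snapshots of the folder structure extracted from the log files. In this sketch (names are illustrative) a snapshot is a set of folder identifiers, so folders both created and deleted between the two snapshots never appear, and only lasting change is measured, as in the formula above.

```python
def change_index(snapshot_start, snapshot_end):
    """Change index between two folder snapshots (sets of folder ids).

    0 indicates a stable structure; 1 indicates radical change,
    e.g. a doubling of the number of folders.
    """
    deleted = len(snapshot_start - snapshot_end)  # gone by the second snapshot
    new = len(snapshot_end - snapshot_start)      # created later, still active
    return (new + deleted) / len(snapshot_start)
```

For example, a QP starting with ten folders, of which three are deleted while four new lasting folders are added, scores (4 + 3) / 10 = 0.7.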
The following descriptions of how the three QP folder structures change serve to
illustrate the process of reaching equilibrium in the functional relationship.
NP_solo-ID is, as explained in the section on the three exemplars, used as a project repository. This explains the gradual increase in the number of folders: the old documents in the QP remain relevant as documentation while new documents on new subjects arrive, creating the need for more folders. Another observation supports the view that NP_solo-ID is a project repository and that this affects the structure of the QP. Besides the creation and deletion of folders, the movement of documents from one folder to another and the deletion of documents are important to understanding the development of the folder structure. The number of documents moved around in NP_solo-ID is significantly higher than in the other two QPs: of the 138 documents active in the period, 44 (32%) were moved, and 28 documents were deleted. The following graph shows the number of new folders created and the number of document moves per week from week two of the log period (week no. 20) until week 51.
[Figure: New folders created and document moves per week in NP_solo-ID, weeks 20-51.]
We can observe a close correlation between the creation of new folders and document moves. This indicates that new folders also change the relevant classification of existing documents: a new folder is created as a reaction to instability caused by new documents introduced in the QP. While some folders are created proactively, e.g. the folder structure created initially by the manager, folders in NP_solo-ID are just as often created reactively.
The moving of documents, like the maintenance of the folder structure, is distributed among several users. Three users perform the document moves: the project manager has performed 21 moves, a second user 20 and a third user 3.
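Series like the ones plotted here can be derived from the event log by a simple per-week tally. The event encoding below is hypothetical, standing in for records extracted from the Quickplace HTTP log; only the tallying logic is the point.

```python
from collections import Counter

def weekly_counts(events, kind, weeks):
    """Count events of a given kind per week.

    events: iterable of (week_number, kind) tuples, e.g. derived from the
    HTTP log ("new_folder", "move", ...). Returns counts aligned to `weeks`,
    ready to be plotted or correlated.
    """
    tally = Counter(week for week, k in events if k == kind)
    return [tally[w] for w in weeks]

# Hypothetical events; the real series comes from the Quickplace log files.
events = [(20, "new_folder"), (20, "move"), (21, "move"),
          (22, "new_folder"), (22, "move"), (22, "move")]
weeks = range(20, 23)
folders = weekly_counts(events, "new_folder", weeks)
moves = weekly_counts(events, "move", weeks)
```

The two resulting series can then be compared week by week, as in the correlation between folder creation and document moves observed above.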
[Figure: Number of weekly users in the IC QP, week 23 through week 7 of the following year, with the major reorganisation of the IC QP marked.]
The correspondence between the increase in weekly users and the major reorganization indicates that the reorganization was done proactively, as preparation for new work tasks; the creation of new folders named "Press-releases..." also supports this. The decrease in the number of folders indicates a reactive element as well: some folders had turned out not to be used, or were no longer used.
While the maintenance of folders is centralized, the deletion of documents, an important task related to the maintenance of the folder structure, is distributed among 18 users.
[Figure: New folders created and document deletions per week in IC, weeks 20-51.]
The chart does not exhibit the same relationship between changes in the folder structure and document deletions as was observed between folder changes and document moves in NP_solo-ID.
In contrast to NP_solo-ID, the maintenance of the folder equilibrium is not continuous and distributed but takes place through a major reorganization and some minor adjustments. Thus the timing of re-establishing equilibrium after the introduction of disturbances differs significantly between IC and NP_solo-ID.
The increase in the complexity of the folder structure happens concurrently with a decrease in activity, both in overall terms and in the number of weekly users.
The folder structure of GIC can be interpreted, according to the functional relationship, as an example of the absence of genres of communication as a structural factor guiding the folder structure of the QP. Except for the meeting agenda and holiday list genres analyzed in the section on genres of communication, there seems to be no guidance from genres in the structuring of the QP.
Linda - one of the managers of the GIC QP - addresses specifically the issues concerning
the folder structure of the QP. The following quotes can qualify the interpretation of the GIC
folder structure:
“Whether organizing after subject is a better idea…the question is then whether one can
agree on what a specific subject covers.”
“We use communication plans quite a lot. Typically that involves more than one section.
And that would be hard to put in.”
Linda points to problems with the fact that the GIC QP is structured according to the organizational diagram. What characterizes the two quotes is that she discusses the use of the QP in the mode of hypotheses; the interviews with the managers of NP_solo-ID and IC both describe the creation and maintenance of the folder structure as something related to real events, not hypothetical situations. Her own understanding of the problem with the folder structure is that it is hard to decide between principles of structuring: one possibility, according to her, is the organizational diagram, the other is to structure by subject. The problem underlying the apparent issue of choosing the "right" structuring principle is clarified in the following quote.
“I think even if we have all sat down and discussed it [the structuring of the NordicPlace] this time, we would not have agreed, because we didn’t have an idea of what it was. Maybe we didn’t have an idea of what type [of information] should be out there [in the NordicPlace]".
The fact that no genres of communications are established that utilize the GIC QP
produces the problem of deciding on a relevant folder structure.
As in NP_solo-ID and IC there is more than one manager, and as in NP_solo-ID more than one person makes changes to the folder structure. Linda addresses this as a problem.
“And I think personally that there is a problem in being four managers. Well it takes a
lot of coordination there…because the one…well it requires that you have an agreement on
how you wish to structure it, so suddenly some folder appears that doesn’t fit in with the
rest.”
[Figure: Number of weekly users in the GIC QP, week 23 through week 7 of the following year.]
Unlike in IC, the increase in weekly users is not permanent: the week after the reorganization the number of users drops to an average lower than before. While the IC reorganization correlated with the introduction of new work tasks, and thus also new documents, this is not the case in GIC.
The GIC QP is primarily structured according to the organizational diagram of the GIC organizational unit. Analyzed as a static classification, the folder structure of the GIC QP is the most consistent of the three: it classifies according to the organizational diagram of the GIC department. This illustrates very clearly the point of introducing folder structures as a functional relationship. The criteria of consistency and completeness are not important in their own right; the virtue of a folder structure in a QP is that it supports genres of communication and uses terms that are part of the surrounding classifications in the organization. According to the interviews, the GIC folder structure appears to be the one made with the most systematic and careful consideration of the three. The problem is that the structural factor of genres of communication is not there to guide it. Expressed in everyday language, the users do not know what they should use the QP for.
The functional model does not address an important finding from the analysis of genres of communication in QP: most genres involve multiple media. With respect to the issue of structuring information in computer media, or in IT systems in general, this points to the fact that the structures present in different media affect each other. This is addressed in the following section, which presents a model of how the functional relationships used to explain structures in specific media affect each other.
[Figure: Model of the levels of functional relationships, showing the central Intranet and IT-system structure at the organizational level and the QP structure at the group level.]
This is illustrated in the model. The model suggests distinguishing between three
relevant levels of functional relationships. The first level is the structures relevant to the
individuals in an organization. These include the folder structure of the PC that can be
understood using the same functional relationship as the QP structure (with slightly modified
disturbances and structural factors). The next level is the group of people using the QP. The
third level is the structures that exist across the organization. This includes the structure of the
Intranet and the structure of the central IT-system. The structure of the Intranet is directly
observable while the structure of the central IT-system is observable in what is referred to as
the "data model" at Beta. This "data model" is an Entity-Relationship diagram e.g. specifying
which information is stored for each employee in the organization.
The model thus provides an overview of the relationships between the different functional models that apply to each individual structure. It gives an overview of the infrastructure of classifications embedded in technologies in an organization, and it specifies that the structural factors identified in the functional model for virtual workspaces are not the only factors affecting the creation of folder structures. The model thus takes the functional relationship beyond the scope of one technology used by one group and applies it to the whole organization.
Genres are maintained and changed through continuous enactment. When QP is introduced, the genres change during this continuous enactment: the technology is gradually integrated into certain genres and eventually new genres are created.
Two other observations support this. Perhaps most strikingly, 72% of all documents were characterized as dead documents: they were simply placed in a QP and never used again. Although other explanations may account for some of the dead documents, they support the finding that many attempts at using QP to communicate are unsuccessful because the QP has only partly been integrated into genres of organizational communication. It also indicates that users explore the new medium in a trial-and-error manner. Although there is no direct evidence in the observations, it might be the case that new genres emerge in this trial-and-error manner. Not all genres are planned and then executed as, for example, the translation genre was, and it would be an interesting subject for further research to study how individual genres emerge without conscious planning by the participants.
The present study has presented two interesting findings in relation to existing uses of
genre theory in the study of the use of IT in organizations. Firstly, the study of the three
instantiations of genres has shown that genres or genre systems span multiple media. The
detailed study of the use of QP for the translation genre, for example, showed that QP was used together with e-mail and probably also telephone and face-to-face communication.
Secondly, the use of log analysis for the study of genre instantiations illustrated a useful
viewpoint for studying the enactment of genres and understanding in detail the role played by
the computer medium. We will deal with the second finding in the next section.
The close integration of different media in the enactment of a genre also points to a general precaution for the study of how computer media are integrated in work practice: understanding this integration requires an understanding of the supplementary and overlapping media available to users. While this observation is not new, it has perhaps not been taken seriously enough. It suggests that studies of individual computer media should be
supplemented by studies of infrastructures of computer media. An infrastructure of computer
media is the suite of individual computer media available to a group of people, for example,
the members of an organization or a project. The notion of infrastructure should be considered
a supplementary research object to that of the individual medium when conducting empirical
research as well as in the theories of how media are adopted in an organization.
The research question formulated in the introduction to this thesis was the following:
user from objects that will gradually adopt a technology in more or less predictable patterns to
agents that actively design the integration of the IT artefact in a work setting.
The introduction of the concept of end-user design might be problematic because it blurs the distinction between designing IT-systems and working and communicating through IT-systems. However, the analysis of virtual workspaces has at least shown that the line between design and use has shifted considerably from the traditional world of customized IT-systems such as workflow systems. Some of the processes previously handled by professional designers are now left to end-users, and whether these should be denoted use, adoption, appropriation or end-user design is a question for debate. At least they should not be ignored.
Because the log data are systematically collected by the HTTP server, and because they are structured so that computer-based analysis tools can manipulate them, they lend themselves to quantitative analysis. The types of document life cycles generated from the cluster analysis and the top-down analysis are examples of using data mining techniques to look for global models in the data. In statistical terms they merely used descriptive statistics; the methods of mathematical (inferential) statistics were not used to assess the generality of the results. In one case a quantitative hypothesis was tested using inferential statistics: the hypothesis that the number of uses of documents and the number of users per QP were distributed according to Zipf's law was tested, and the Pearson correlation was calculated as a way of assessing the reliability of the correlation found in the data. The distribution according to Zipf's law was an interesting finding in its own right, but it is somewhat off the track of understanding the role of Lotus Quickplace at Beta.
The log analysis has proven useful in relation to the part of the research question connected with two concepts addressing the relation between technology and social structure. Firstly, it was used for investigating genres of organizational communication in QP; secondly, it was used as a basis for understanding the dynamics of folder structures, which is an important part of establishing a technology-in-practice.
In both the analysis of genre instantiations and the analysis of folder structure dynamics it proved useful as a means of triangulation. In the case of genres, descriptions of genres of communication were triangulated with indications of how the QP was used in a specific instantiation of that genre. This provided a very precise analysis of the role of the technology.
It showed, for example, that e-mail was the most important medium for coordinating the translation; the IC QP had but a secondary role.
Approaching the evaluation of the use of a communication technology by interviewing the users has turned out to be problematic. The use of QP is so intertwined with other media – in this case e-mail in particular – that interviewing people about a specific technology is not a good approach if one wishes to understand how that technology is used and integrated in a work practice. The interview situation, which is framed to be about the use of one technology, seems to make the interviewee focus on that technology and ignore the other technologies involved. Interviews on the usage of one medium tend to exaggerate the role of that medium.
This exaggeration of the role of a technology is also a general methodological finding. In order to research the role of a technology in an organization it is necessary to address the whole set of technologies available to users – in the case of virtual workspaces, the technological infrastructure available for communication. The triangulation of interviews and log analysis has shown that the design of the interview guides in this case produced answers that would in some cases have been misleading had they been used as the only empirical data.
Log analysis turned out to be useful for another kind of triangulation as well. It can provide a perspective on use that cannot be captured through interviews, because the interviewee has no information on how the workspace is used beyond his own personal experiences. The document life cycle is a perspective on use that is practically impossible to extract from interviews or direct observations. A very specific example was the finding that 72% of documents were dead. Another is that although virtual workspaces are characterized as tools for collaboration and coordination, most documents had a very simple life cycle: they were created, and after that a number of users would read them. In only 22 cases were documents edited by more than one person.
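As an illustration of how such life cycle types can be derived from log data, the following sketch classifies a document from its ordered event list. The three labels mirror the categories discussed in the text (dead documents, simple created-then-read documents, and documents edited by several people), but the rules and names are this sketch's own, not the cluster analysis actually used in the study.

```python
def classify_life_cycle(events):
    """Classify a document from its ordered log events.

    events: list of (user, action) pairs, action in {"create", "read", "edit"}.
    "dead"        - created and never touched again;
    "simple"      - created, then only read (or edited by its sole author);
    "co-authored" - edited by more than one distinct user.
    """
    editors = {user for user, action in events if action in ("create", "edit")}
    later = [action for _, action in events[1:]]
    if not later:
        return "dead"
    if len(editors) > 1:
        return "co-authored"
    return "simple"
```

Running such a classifier over all documents in a workspace yields exactly the kind of aggregate figures reported above (e.g. the share of dead documents).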
The techniques of data mining can produce patterns and link different observations that would otherwise have been invisible. The analysis of the folder structures uses the history of the folder structures and combines it with user activity, document moves and deletions, all derived from the log files. These data were then triangulated with interviews on how the folder structure was maintained and what the QP was generally used for. Linking these would probably never have occurred without the techniques of data mining, and the functional model for explaining folder structures is a theory that would not have been possible without data mining of the log files. In principle it would be possible to observe the history of the folder structure by visiting the QP daily or weekly throughout the whole period, but it could not have been related to figures on activity. A study of folder structures using observation and interviews would probably have focused on documenting the structure, interviewing the person responsible for the changes and inquiring into the reasons for them.
While the log analysis here was used for understanding the role of technology in genres of communication, another sociological method also lends itself to investigation through document-based log analysis. Communication between users in a virtual workspace can be used to draw social networks. Social network analysis (Harary, Norman et al. (1965)) is a method for describing the structure of a social setting that can be based on many different types of data, and the document-based log analysis provides excellent data for drawing social networks based on who accesses the same documents in a virtual workspace. Social network analysis has, for example, been used as a method for analyzing the concept of community in computer media (Wellman, Salaff et al. (1996)). Some experiments with drawing the social networks of the three QPs were conducted in the course of the work on this thesis. No final results are presented here, however, partly because of time constraints and partly because social network analysis lends itself to the investigation of other social structures than genres of communication, e.g. the concept of virtual community.
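A document-based log readily yields such a network. The sketch below (names are illustrative) links two users whenever they have accessed the same document and weights the link by the number of shared documents; the resulting edge list could be handed to standard social network analysis tools.

```python
from collections import Counter
from itertools import combinations

def coaccess_network(accesses):
    """Build a weighted social network from document accesses.

    accesses: iterable of (user, document) pairs extracted from the log.
    Returns {frozenset({u, v}): weight}, where the weight is the number
    of documents the two users have both accessed.
    """
    readers = {}
    for user, doc in accesses:
        readers.setdefault(doc, set()).add(user)
    edges = Counter()
    for users in readers.values():
        for u, v in combinations(sorted(users), 2):
            edges[frozenset((u, v))] += 1
    return dict(edges)
```

For instance, two users who both read the same press release become linked, so heavily shared folders would show up as dense clusters in the drawn network.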
Log files can be characterized as "new" in the sense that they are not covered by Yin's
categories and in the sense that they have not been used for case study research in IS.
Log files share characteristics with both direct observations and archival records.
Archival records are characterized by Yin (1994, p. 80) as:
• stable - can be retrieved repeatedly
• unobtrusive - not created as a result of the case study
• exact - contains exact names and details of an event
• broad coverage - long span of time, many events and many settings
• precise and quantitative
All of these properties, except perhaps "exact", are also properties of a log file. The analysis of HTTP-log files has shown that it is often difficult to link the log file data to names and details that are part of the social discourse in the organization.
A log file differs from archival records in one very important way: it is not the product of an intentional archiving process by members of the organization studied. In the case of the HTTP log, the log files are a product partly of the de facto HTTP-log standard and partly of the technical architecture of the Lotus Quickplace technology. In this sense log files are very different from archival records. An example of archival records used in this study is the archive of all applications for opening a Lotus Quickplace sent to the technical manager of the Lotus Quickplace server.
It might be better to characterize the HTTP-log as a kind of direct observation. Yin
characterizes direct observations as:
• reality - covers events in real time
• contextual - covers context of the event
Clearly HTTP-log files capture only a very limited aspect of events, and they do not capture what Yin calls the context of the event. Log files have another very important characteristic: the data are structured computer records, which means that they are directly available for analysis using data mining techniques. In sum we can characterize log files by the following properties:
• reality - covers events in real time
• stable - can be retrieved repeatedly
• unobtrusive - not created as a result of the case study
• broad coverage - long span of time, many events and many settings
• precise and quantitative
• structured data records - analyzable through data mining techniques
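The "structured data records" property is what makes the rest of the analysis possible. As a sketch, the following parses lines in the Common Log Format, the de facto HTTP-server log standard referred to above, into records ready for data mining; the actual Quickplace log layout may differ in detail, and the field names are this sketch's own.

```python
import re

# Common Log Format pattern; the real Quickplace log layout may differ.
CLF = re.compile(
    r'(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

def parse_line(line):
    """Turn one raw log line into a dict of named fields, or None if malformed."""
    m = CLF.match(line)
    return m.groupdict() if m else None
```

Each parsed record carries the user, the accessed path and a timestamp, which is exactly what the document-use and folder-structure analyses in this thesis require as input.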
Combining log analysis with interviews in a case study also calls for investigator triangulation, simply because experienced interview researchers are seldom experienced data miners at the same time.
The research presented in this thesis has shown that log analysis is a method for investigating computer mediated communication, whether in organizational settings or not, which should be taken seriously. It not only lends itself to the testing of quantitative hypotheses but can also be applied in case or field studies as a means of triangulation. Log analysis can help solve the problem of directly observing CMC. Direct observation, combined with interviews and other reflections on computer use, works in settings where use is not temporally and geographically distributed, and it can be used to analyze the gap between what people say and what they do. In situations where this is not possible, I recommend that one "mine the gap".
Practical implications
The purpose of this section is to provide implications of the study of the QP technology
at Beta, which can be of practical value to people who are either constructing virtual
workspace technologies or are planning to implement them.
In this thesis log analysis has been used to analyze the use of QP in order to understand how it is used at Beta. The information gained from log analysis might also be used directly by the users and by the virtual workspaces themselves. The analysis of document use according to Zipf's law suggests that the distribution of document use can inform decisions on which documents to delete or archive and which to keep in the folder structure. The fact that 72% of all documents were never used could be useful information for users reflecting on their use of the virtual workspace. As an example, the BSCW system (Bentley, Horstmann et al. (1997), Appelt (1999)) displays the number of readers who have accessed a document as an inherent property of the document. Other statistics of document usage could, for example, be used as a basis for maintaining folder structures. Given that 72% of documents are never used, the number of documents is not a good indication of the value of a folder; displaying document usage statistics for each folder in a virtual workspace would provide a better basis for considering a change in the folder structure. The use of other users' use of information in the design of IT systems has previously been suggested by e.g. Dieberger, Dourish et al. (2000).
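Such per-folder usage statistics are straightforward to derive once document reads have been counted from the log. A sketch, with illustrative names:

```python
from collections import defaultdict

def folder_usage(doc_reads, doc_folder):
    """Aggregate document read counts per folder.

    doc_reads: {document id: number of reads from the log};
    doc_folder: {document id: folder name}.
    Returns total reads per folder - a better basis for judging a
    folder's value than its raw document count, given that most
    documents are never used at all.
    """
    totals = defaultdict(int)
    for doc, reads in doc_reads.items():
        totals[doc_folder[doc]] += reads
    return dict(totals)
```

A workspace could display these totals next to each folder name, in the spirit of BSCW's per-document read counts.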
When introducing a virtual workspace, then, one should think of it as part of an infrastructure, and its use should be defined in relation to e-mail, the Intranet, etc.
The analysis of how technologies-in-practice emerge as virtual workspaces are
integrated in genres of communication suggests a conscious approach to starting a virtual
workspace. The adoption of a virtual workspace should be seen as a design process where
existing genres of communication are changed and integrated with the technology. There is
no specific use inscribed in the properties of the technology, and it is the responsibility of the
group of users to define in which ways the virtual workspace should be integrated in their
work practice.
The functional model for understanding the dynamics of folder structures carries some
consequences for how users of virtual workspaces should think about and maintain a folder
structure. The folder structure is a "living" structure, which requires constant re-working.
Both centralized and distributed approaches to managing the folder structure seem to work, and
in both approaches a change log for the virtual workspace visible to all users could provide a
simple means of communicating changes and help the users to adjust their personal
interpretations of folder contents. The change log would document the change made to the
structure and the arguments for doing so.
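Such a change log could be very simple. The sketch below is a hypothetical illustration, not a feature of QP: each entry records who changed what and why, so that every workspace member can adjust their interpretation of the folder contents.

```python
from dataclasses import dataclass, field
from datetime import datetime

# A minimal, hypothetical change log for a workspace's folder structure.
@dataclass
class ChangeEntry:
    author: str
    change: str      # what was changed, e.g. "merged /drafts into /reports"
    rationale: str   # the argument for making the change
    timestamp: datetime = field(default_factory=datetime.now)

class FolderChangeLog:
    def __init__(self):
        self.entries = []

    def record(self, author, change, rationale):
        self.entries.append(ChangeEntry(author, change, rationale))

    def visible_to_all(self):
        """Render the log as lines every workspace member can read."""
        return [f"{e.author}: {e.change} ({e.rationale})"
                for e in self.entries]

changelog = FolderChangeLog()
changelog.record("anna", "merged /drafts into /reports",
                 "drafts were duplicated across both folders")
print(changelog.visible_to_all()[0])
```

Whether the right to record entries is centralized in one maintainer or distributed to all members is exactly the design choice the two approaches to folder management leave open.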
"My lord, facts are like cows. If you look them in the face hard enough, they generally
run away."
Dorothy L. Sayers
References
Adamic, L. A. and B. A. Huberman (2000). "The nature of markets in the World Wide Web."
Quarterly Journal of Electronic Commerce 1(1): 5-12.
Andersen, J., R. S. Larsen, et al. (2000). Analyzing Clickstreams Using Subsessions. ACM
Third International Workshop on Data Warehousing and OLAP(DOLAP'00),
Washington DC, ACM.
Appelt, W. (1999). WWW Based Collaboration with the BSCW System. SOFSEM'99,
Milovy, Czech Republic, Springer Lecture Notes in Computer Science.
Ashforth, B. E. and R. H. Humphrey (1997). "The ubiquity and potency of labeling in
organizations." Organization Science 8(1): 43-58.
Bailey, K. D. (1994). Typologies and taxonomies : an introduction to classification
techniques. Thousand Oaks, Calif., Sage Publications.
Bakeman, R. (1992). Understanding social science statistics : a spreadsheet approach.
Hillsdale, N.J., L. Erlbaum.
Bannon, L. (2000). Understanding common information spaces in CSCW. Workshop on
common information spaces. Copenhagen.
Bannon, L. and S. Bødker (1997). Constructing Common Information Spaces. European
conference on Computer Supported Cooperative Work, Lancaster, UK, Kluwer
Academic Publishers, Netherlands.
Bansler, J., J. Damsgaard, et al. (2000). "Corporate Intranet Implementation: Managing
Emergent Technologies and Organizational Practices." Journal of the Association for
Information Systems 1(10).
Barley, S. R. (1986). "Technology as an Occasion for Structuring - Evidence from
Observations of Ct Scanners and the Social-Order of Radiology Departments."
Administrative Science Quarterly 31(1): 78-108.
Benbasat, I., D. K. Goldstein, et al. (1987). "The Case Research Strategy in Studies of
Information Systems." MIS Quarterly 11(3): 369-386.
Bentley, R. and P. Dourish (1995). Medium versus mechanism: Supporting collaboration
through customisation. Xerox. London.
Bentley, R., T. Horstmann, et al. (1997). "The World Wide Web as enabling technology for
CSCW: The case of BSCW." Journal of Computer Supported Cooperative Work 6(2-
3): 111-134.
Berners-Lee, T. and M. Fischetti (1999). Weaving the Web : the original design and ultimate
destiny of the World Wide Web by its inventor. San Francisco, HarperSanFrancisco.
Bowker, G. C. and S. L. Star (1999). Sorting things out : classification and its consequences.
Cambridge, Mass., MIT Press.
Bradner, S. (1996). The Internet Standards Process -- Revision 3, IETF. 2002.
Breslau, L., P. Cao, et al. (1998). On the Implications of Zipf's Law for Web Caching. 3rd
International WWW Caching Workshop.
Breslau, L., P. Cao, et al. (1999). Web Caching and Zipf-like Distributions: Evidence and
Implications. IEEE INFOCOM, New York.
Büchner, A. G. and M. D. Mulvenna (1998). "Discovering Internet Marketing Intelligence
through Online Analytical Web Usage Mining." SIGMOD Record 27(4): 54-61.
Büscher, M., S. Gill, et al. (2001). "Landscapes of Practice: Bricolage as a Method for
Situated Design." Computer Supported Cooperative Work 10(1): 1-28.
Bøving, K. B. (2001). Datastructuring, Standards, and Knowledge Work. Proceedings of the
24th Information Systems Research Seminar in Scandinavia, Ulvik, Norway,
Department of Information Science, University of Bergen.
Bøving, K. B. (2001). "Digitalt samarbejde via Virtual Workspaces." Internethåndbogen,
Børsens forlag.
Bøving, K. B. (2002). Digitale web-baserede samarbejdssystemer. Digital genopbygning af
den offentlige sektor. København, Forvaltningshøjskolen.
Bøving, K. B. and L. H. Pedersen (2002). Design for Dummies: Understanding Design Work
in Virtual Workspaces. PDC2002, Malmö, Sweden.
Callon, M. and J. Law (1989). "On the Construction of Sociotechnical Networks: Content and
Context Revisited." Knowledge and Society 8: 57-83.
Carmel, E. (1995). "Cycle-time in packaged software firms." Journal of Product Innovation
Management 12(2): 110-123.
Carmel, E. and B. J. Bird (1997). "Small is beautiful: a study of packaged software
development teams." Journal of High Technology Management Research 8(1): 129-
148.
Carmel, E. and S. Sawyer (1998). "Packaged software development teams: what makes them
different?" Information Technology & People 11(1): 7-19.
Conklin, J. and M. Begeman (1988). gIBIS: A Hypertext Tool for Exploratory Policy
Discussion. CSCW'88, Portland, Oregon, Association of Computing Machinery.
Cooley, R., J. Srivastava, et al. (1997). Web mining: Information and pattern discovery on the
world wide web. 9th IEEE International Conference on Tools with Artificial
Intelligence (ICTAI'97), Newport Beach, California.
Cooley, R., P.-N. Tan, et al. (1999). WebSIFT: The Web Site Information Filter System.
Proceedings of the 1999 KDD Workshop on Web Mining, San Diego, CA, Springer-
Verlag.
Daft, R. L. and R. H. Lengel (1986). "Organizational Information Requirements, Media
Richness and Structural Design." Management Science 32(5): 554-571.
Daft, R. L. and N. B. Macintosh (1981). "A Tentative Exploration into the Amount and
Equivocality of Information-Processing in Organizational Work Units." Administrative
Science Quarterly 26(2): 207-224.
Davidson, E. J. (2000). "Analyzing genre of organizational communication in clinical
information systems." Information Technology & People 13(3): 196-209.
Deetz, S. (2000). Conceptual Foundations. The New Handbook of Organizational
Communication. F. M. Jablin and L. Putnam, Sage Publications: 3 - 46.
Denzin, N. K. (1989). The research act : a theoretical introduction to sociological methods.
Englewood Cliffs, N.J., Prentice Hall.
Denzin, N. K. and Y. S. Lincoln (2000). Handbook of qualitative research. Thousand Oaks,
Calif., Sage Publications.
Desanctis, G. and R. B. Gallupe (1987). "A Foundation for the Study of Group Decision
Support Systems." Management Science 33(5): 589-609.
Desanctis, G. and M. S. Poole (1994). "Capturing the Complexity in Advanced Technology
Use - Adaptive Structuration Theory." Organization Science 5(2): 121-147.
Dieberger, A., P. Dourish, et al. (2000). "Social Navigation: Techniques for Building More
Usable Systems." Interactions 7(6): 36-45.
Divitini, M. and C. Simone (2000). "Supporting Different Dimensions of Adaptability in
Workflow Modeling." Computer Supported Cooperative Work 9(3/4): 365-397.
Dix, A. (1997). "Challenges for Cooperative Work on the Web: An Analytical Approach."
Journal of Computer Supported Cooperative Work 6(2-3): 135-156.
Eco, U. (1976). A theory of semiotics. Bloomington, Indiana University Press.
Engeström, Y. (1987). Learning by expanding : an activity-theoretical approach to
developmental research. Helsinki, Orienta-Konsultit Oy.
Engeström, Y., R. Miettinen, et al. (1999). Perspectives on activity theory. Cambridge ; New
York, Cambridge University Press.
Fiedler, K. D., V. Grover, et al. (1996). "An Empirically Derived Taxonomy of Information
Technology Structure and Its Relationship to Organizational Structure." Journal of
Management Information Systems 13(1): 9-34.
Fjermestad, J. and S. R. Hiltz (1998-1999). "An Assessment of Group Support Systems
Experimental Research: Methodology and Results." Journal of Management Information
Systems 15(3): 7-149.
Fjermestad, J. and S. R. Hiltz (2000). Case and Field Studies of Group Support Systems: An
Empirical Assessment. 33rd Hawaii International Conference on System Sciences,
Hawaii, IEEE.
Fjermestad, J. and S. R. Hiltz (2000). "Group support systems: A descriptive evaluation of
case and field studies." Journal of Management Information Systems 17(3): 115-159.
Fowler, M. and K. Scott (1997). UML distilled : applying the standard object modeling
language. Reading, Mass., Addison Wesley Longman.
Fu, Y., K. Sandhu, et al. (1999). Clustering of Web Users Based on Access Patterns.
Proceedings of the 1999 KDD Workshop on Web Mining, San Diego, CA, Springer-
Verlag.
Gabaix, X. (1999). "Zipf's law for cities: an explanation." Quarterly Journal of Economics
114(3): 739-767.
Gallupe, R. B., G. Desanctis, et al. (1988). "Computer-Based Support for Group Problem-
Finding - an Experimental Investigation." MIS Quarterly 12(2): 277-296.
Garton, L. and B. Wellman (1995). "Social Impacts of Electronic Mail in Organizations: A
Review of the Research Literature." Communication Yearbook 18: 434-453.
Giddens, A. (1984). The Constitution of Society. Berkeley, University of California Press.
Glymour, C., D. Madigan, et al. (1996). "Statistical Inference and Data Mining."
Communications of the ACM 39(11): 35-41.
Goodman, N. (1976). Languages of Art, Hackett Publishing.
Groth, L. (1999). Future organizational design : the scope for the IT-based enterprise.
Chichester, England ; New York, John Wiley & Sons.
Grudin, J. (1991). "The Development of Interactive Systems: Bridging the Gaps Between
Developers and Users." IEEE Computing 24(4): 59-69.
Grudin, J. (1994). "CSCW: History and Focus." IEEE Computing 27(5): 19-26.
Grudin, J. (1994). "Groupware and social dynamics: Eight challenges for developers."
Communications of the ACM 37(1): 92-105.
Gunter, B. (2002). The quantitative research process. A Handbook of Media and
Communication Research. K. B. Jensen. London, Routledge: 209-234.
Guzdial, M., J. Rick, et al. (2000). Recognizing and Supporting Roles in CSCW. CSCW
2000, Philadelphia, PA, ACM.
Habermas, J. (1981). Theorie des kommunikativen Handelns. Frankfurt am Main, Suhrkamp.
Hamilton, A. (2000). "Metaphors in theory and practice: the influence of metaphors on
expectation." ACM Journal of Computer Documentation 24(4).
Hand, D. J., H. Mannila, et al. (2001). Principles of data mining. Cambridge, Mass., MIT
Press.
Harary, F., R. Z. Norman, et al. (1965). Structural models: an introduction to the theory of
directed graphs. New York, Wiley.
Hidber, C. (1998). Online Association Rule Mining. Berkeley, CA, International Computer
Science Institute, University of California at Berkeley.
Huberman, B. A. (2001). The laws of the Web : patterns in the ecology of information.
Cambridge, Mass., MIT Press.
Huberman, B. A. and L. A. Adamic (1999). "Growth Dynamics of the World Wide Web."
Nature 401(131).
Hughes, J., D. Randall, et al. (1991). CSCW: Discipline or Paradigm? Second European
Conference on CSCW (ECSCW'91), Amsterdam.
IBM (2002). Lotus QuickPlace Homepage. 2002.
IBM (2002). Research Lists IBM Lotus Software As A Leader In Team Collaboration Tools.
Cambridge, Mass.
Jablin, F. M. and L. Putnam, Eds. (2000). The new handbook of organizational
communication : advances in theory, research, and methods. Thousand Oaks, Calif.,
Sage Publications.
Jacobs, I. (2001). World Wide Web Consortium Process Document, W3C (World Wide Web
Consortium). 2002.
Jacobson, I. (1992). Object-Oriented Software Engineering: A Use Case Driven Approach,
Addison-Wesley.
Jarvenpaa, S. L. (1989). "The Effect of Task Demands and Graphical Format on Information-
Processing Strategies." Management Science 35(3): 285-303.
Jensen, K. B. (2000). Interactivities: Constituents of a Model of Computer Media and
Communication. Moving Images, Culture, and the Mind. I. Bondebjerg. Luton,
University of Luton Press.
Lyytinen, K. and O. K. Ngwenyama (1992). "What does computer support for cooperative
work mean? A structurational analysis of computer supported cooperative work."
Accounting, Management and Information Technology 2(1): 19-37.
MacCormack, A., R. Verganti, et al. (2001). "Developing Products on Internet Time: The
Anatomy of a Flexible Development Process." Management Science 47(1): 133-150.
Markus, M. L. (1983). "Power, Politics, and MIS Implementation." Communications of the
ACM 26(6): 430-444.
Markus, M. L. (1994). "Electronic Mail as the Medium of Managerial Choice." Organization
Science 5(4): 502-527.
Markus, M. L. and D. Robey (1988). "Information Technology and Organizational-Change -
Causal-Structure in Theory and Research." Management Science 34(5): 583-598.
Masseglia, F., P. Poncelet, et al. (1999). "Using Data Mining Techniques on Web Access
Logs to Dynamically Improve Hypertext Structure." ACM SIGWEB letters 8(3): 1-19.
McGrath, J. E. (1984). Groups : interaction and performance. Englewood Cliffs, N.J.,
Prentice-Hall.
Miller, C. R. (1984). "Genre as Social Action." Quarterly Journal of Speech 70: 151-167.
Mingers, J. (2001). "Combining IS research methods: Towards a pluralist methodology."
Information Systems Research 12(3): 240-259.
Mintzberg, H. (1979). The structuring of organizations : a synthesis of the research.
Englewood Cliffs, N.J., Prentice-Hall.
Mobasher, B., R. Cooley, et al. (2000). "Automatic personalization based on Web usage
mining - Web usage mining can help improve the scalability, accuracy, and flexibility
of recommender systems." Communications of the ACM 43(8): 142-151.
Mobasher, B., H. Dai, et al. (2001). Effective Personalization Based on Association Rule
Discovery from Web Usage Data. WIDM'01 3rd ACM Workshop on Web Information
and Data Management, Atlanta, Georgia, ACM.
Ngwenyama, O. K. and A. S. Lee (1997). "Communication richness in electronic mail:
Critical social theory and the contextuality of meaning." MIS Quarterly 21(2): 145-167.
Orlikowski, W. (1996). Learning from Notes: Organizational Issues in Groupware
Implementation. Computerization and controversy : value
conflicts and social choices. R. Kling. San Diego, Academic Press.
Orlikowski, W. and J. Baroudi (1991). "Studying information technology in organizations:
research approaches and assumptions." Information Systems Research 2(1): 1-28.
Orlikowski, W. and D. Robey (1991). "Information Technology and the Structuring of
Organizations." Information Systems Research 2(2): 143-169.
Orlikowski, W. J. (1992). "The Duality of Technology - Rethinking the Concept of
Technology in Organizations." Organization Science 3(3): 398-427.
Orlikowski, W. J. (2000). "Using technology and constituting structures: A practice lens for
studying technology in organizations." Organization Science 11(4): 404-428.
Orlikowski, W. J. and C. S. Iacono (2001). "Research commentary: Desperately seeking the
"IT" in IT research - A call to theorizing the IT artifact." Information Systems Research
12(2): 121-134.
Orlikowski, W. J. and J. Yates (1994). "Genre Repertoire - the Structuring of Communicative
Practices in Organizations." Administrative Science Quarterly 39(4): 541-574.
Orlikowski, W. J., J. Yates, et al. (1995). "Shaping Electronic Communication - the
Metastructuring of Technology in the Context of Use." Organization Science 6(4): 423-
444.
Pei, J., J. Han, et al. (2000). Mining Access Patterns Efficiently from Web Logs. Proceedings
Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD'00).
Rice, R. E. and U. E. Gattiker (2000). New Media and Organizational Structuring. The New
Handbook of Organizational Communication. F. M. Jablin and L. Putnam, Sage
Publications: 544 - 581.
Saunders, C. and J. W. Jones (1990). "Temporal Sequences in Information Acquisition for
Decision-Making - a Focus on Source and Medium." Academy of Management Review
15(1): 29-46.
Schmidt, K. and L. Bannon (1992). "Taking CSCW seriously - Supporting Articulation
Work." Journal of Computer Supported Cooperative Work 1(1-2): 7 - 40.
Schmidt, K. and U. Christensen (2000). Using classification in common information spaces.
Workshop on Common information spaces. Copenhagen.
Schmidt, K. and R. Israel (2000). Cooperative Management of common information spaces.
Workshop on classification schemes in cooperative work, ACM conference on CSCW.
Philadelphia.
Schmidt, K. and C. Simone (1996). "Coordination Mechanisms: Towards a Conceptual
Foundation of CSCW Systems Design." Computer Supported Cooperative Work 5(2-
3): 155-200.
Shipman III, F. M. and C. C. Marshall (1999). "Formality Considered Harmful: Experiences,
Emerging Themes, and Directions on the Use of Formal Representations in Interactive
Systems." Journal of Computer Supported Cooperative Work 8: 333-352.
Simon, H. A. (1977). The new science of management decision. Englewood Cliffs, N.J.,
Prentice-Hall.
Singh, S. (1999). The code book : the evolution of secrecy from Mary, Queen of Scots, to
quantum cryptography. New York, Doubleday.
Smith, M., J. J. Cadiz, et al. (2000). Conversation Trees and Threaded Chats. ACM2000
Conference on Computer Supported Cooperative Work, Philadelphia, PA, ACM press.
Spiliopoulou, M. (2000). "Web Usage Mining for Site Evaluation: Making a Site Better Fit
its Users." Communications of the ACM 43(8): 127-134.
Spiliopoulou, M., C. Pohle, et al. (1999). Improving the Effectiveness of a Web Site with
Web Usage Mining. Workshop on Web Usage Analysis and User Profiling
(WebKDD99), San Diego, August 1999.
Srikant, R. and R. Agrawal (1995). Mining sequential patterns. Int'l Conference on Data
Engineering (ICDE), Taipei, Taiwan.
Srikant, R. and R. Agrawal (1996). Mining sequential patterns: generalizations and
performance improvements. Fifth Int'l Conference on Extending Database Technology
(EDBT), Avignon, France.
Stinchcombe, A. L. (1968). Constructing social theories. New York, Harcourt Brace &
World.
Strauss, A. (1985). "Work and the Division of Labor." The Sociological Quarterly 26(1): 1-
19.
Su, Z., Q. Yang, et al. (2002). "Correlation-Based Web Document Clustering for Adaptive
Web Interface Design." Journal of Knowledge and Information Systems 4(2).
Suchman, L. (1996). Supporting Articulation Work. Computerization and Controversy. R.
Kling, Academic Press: 407-423.
Teege, G. (2000). "Users as Composers: Parts and Features as a Basis for Tailorability in
CSCW Systems." Computer Supported Cooperative Work 9(1): 101-122.
Toolan, F. and N. Kushmerick (2002). Mining web logs for personalized site maps.
International Conference on Web Information Systems Engineering, Singapore.
VanGundy, A. B. (1988). Techniques of structured problem solving. New York, Van
Nostrand Reinhold.
Walsham, G. (1995). "Interpretive case studies in IS research: nature and method." European
Journal of Information Systems 4: 74-81.
Watson, R. T., G. Desanctis, et al. (1988). "Using a GDSS to Facilitate Group Consensus -
Some Intended and Unintended Consequences." MIS Quarterly 12(3): 463-478.
Websters (2002). Merriam-Webster Collegiate Dictionary.
Weisberg, H. F., J. A. Krosnick, et al. (1996). An introduction to survey research, polling, and
data analysis. Thousand Oaks, Calif., Sage Publications.
Wellman, B., J. Salaff, et al. (1996). "Computer Networks as Social Networks: Collaborative
Work, Telework, and Virtual Community." Annual Review of Sociology 22: 213-238.