Evaluating Training Effectiveness: An Integrated Perspective

EVALUATING TRAINING EFFECTIVENESS:
AN INTEGRATED PERSPECTIVE IN MALAYSIA
Lim Guan Chong

Master of Business Administration (Finance)
International Graduate School of Management

Division of Business and Enterprise
University of South Australia
Submitted on this 5th of August in the year 2005 for the partial requirements of the degree of
Doctor of Business Administration
DOCTOR OF BUSINESS ADMINISTRATION
PORTFOLIO SUBMISSION FORM
Name: Lim Guan Chong Student Id No: 0111487H
Dear Sir/Madam
To the best of my knowledge, the portfolio contains all of the candidate's own work
completed under my supervision, and is worthy of examination.
I have approved for submission the portfolio that is being submitted for
examination.
Signed:
Dr Travis Kemp/Professor Dr Leo Ann Mean Date
Supported by:
Dr Ian Whyte Date

Chair: Doctoral Academic Review Committee
International Graduate School of Business
DBA Portfolio Declaration
I hereby declare that this paper submitted in partial fulfillment of
the DBA degree is my own work and that all contributions from any
other persons or sources are properly and duly cited. I further declare
that it does not constitute any previous work whether published or
otherwise. In making this declaration I understand and acknowledge
any breaches of the declaration constitute academic misconduct which
may result in my expulsion from the program and/or exclusion from the
award of the degree.
Signature of candidate:
Lim Guan Chong Date:5th August 2005
TABLE OF CONTENTS
Portfolio Submission Form
Portfolio Declaration
Acknowledgements
Overview
1 Research Paper 1 Methodological Issues In Measuring Training Effectiveness
1.1 Abstract
1.2 Introduction
1.3 Approaches to Training Evaluation
1.3.1 Discrepancy Evaluation Model
1.3.2 Transaction Model
1.3.3 Goal-Free Model
1.3.4 Systemic Evaluation
1.3.5 Quasi-Legal Approach
1.3.6 Art Criticism Model
1.3.7 Adversary Model
1.3.8 Contemporary Approaches - Stufflebeam's Improvement-Oriented Evaluation (CIPP) Model, 1971
1.3.9 Cervero's Continuing Education Evaluation, 1984
1.3.10 Kirkpatrick Model, 1959a, 1959b, 1960a, 1960b, 1976, 1979, 1994, 1996a, 1996b, 1998
1.4 Critical Review
1.5 Future Research
1.5.1 The Transfer Component
1.5.2 Evaluating Beyond the 4 Levels
1.5.3 Incorporating Competency-based Approach into Training Evaluation
1.5.4 Multi-Rater System in Training Evaluation
1.6 Conclusion
1.7 References for Paper One
2 Research Paper 2 Evaluating Training Effectiveness: An Empirical Study of Kirkpatrick Model of Evaluation in the Malaysian Training Environment For The Manufacturing Sector
2.1 Abstract
2.2 Introduction
2.3 Training Practices in Malaysia
2.4 Practice of Evaluation in Training
2.5 Training Evaluation Practices In Malaysia
2.6 Methodology of Study
2.6.1 Questionnaire Construction
2.6.2 The Sample and Sampling
2.6.3 Questionnaire Responses
2.7 Findings and Discussion
2.8 Limitations of Study
2.9 Conclusion
2.10 References for Paper Two
2.11 Appendix A The Questionnaire for Research Paper Two
3 Research Paper 3 Multi-rater Feedback For Training And Development: An Integrated Perspective
3.1 Abstract
3.2 Introduction
3.3 The Use of Multi-rater Feedback
3.4 The Effectiveness Of Multi-rater Feedback For Development
3.5 The Effectiveness Of Multi-rater Feedback For Appraisal
3.6 The Variation Of Multi-rater Feedback Information
3.7 Multi-rater Feedback Practices In Malaysia
3.8 Integrating Multi-rater Feedback With Developmental Tool
3.9 Multi-rater Feedback: Process Consultation As A Developmental Tool
3.10 Micro Perspective Of Conversation Theory In Process Consultation
3.11 An Integrated Approach for Post Multi-rater Feedback Development
3.12 Conclusion
3.13 References for Paper Three
Acknowledgements
I am sincerely grateful to my supervisors, Dr Travis Kemp and Professor Leo Ann
Mean, who have been so supportive, taking the time to look through my papers
and giving me tremendously useful feedback and suggestions.
First and foremost, thanks to my spouse, Linda Liew Mei Ling who acted as my
research assistant and has put in her late nights and thoughtful moral support
throughout this endeavor.
Special thanks are reserved for my friends who acted as my proof readers who
never let me produce less than the best I had to offer.
In particular, my sincerest thanks to my respondents, relatives, families and other
parties who have supported me along the way and helped me find the time to
complete my thesis.
Finally, my utmost appreciation to the University of South Australia, International
Graduate School of Management for their support and enthusiasm for achieving
excellence in education.
Lim Guan Chong

Overview
Most organizations recognize that training must be a worthwhile effort: there must be
returns in labour productivity after training. Yet evaluation is possibly the least developed
aspect of the training cycle. This research portfolio looks at the effectiveness of Kirkpatrick's
Four Levels of Evaluation, with emphasis on the assessment of the methodology within the
training perspective.
Evaluating training is typically linked with measuring change and quantifying the degree of
change which leads to performance. Measuring gains in organization effectiveness that
resulted from training interventions is probably the most difficult task in training evaluation.
This research portfolio, as a partial fulfillment of the requirement of the degree of Doctor of
Business Administration, develops a series of ideas that expand on traditional approaches to
training evaluation. The research portfolio is divided into three papers.
Paper 1 critically reviews the methodological problems faced when adopting the evaluation
model developed by Donald L. Kirkpatrick in 1959. A series of industry studies
shows little application of this approach. The literature provides little understanding
of the transfer of learning component when the Kirkpatrick model is used to determine
training effectiveness. Most current researchers find that future research on training
evaluation lies in the effectiveness of transfer of the skills learned. The objective of this
research portfolio, through an anatomy of this classical theory, is to address its
weaknesses by re-focusing on the issue of transfer of learning as a major key to unlock the
model's practicality and validity.
Paper 2 adopts a survey method to track the history, rationale, objectives, implementation and
evaluation of training initiatives in the Malaysian manufacturing sector. It utilizes the survey
research to triangulate reliable and convincing findings.
The research looks at the extent to which the Kirkpatrick model is practised in the Malaysian
manufacturing sector. This paper reports the practice of Kirkpatrick's 4 levels of evaluation
and the effectiveness of this evaluation model within the Malaysian manufacturing sector.
Paper 3 is on the effective use of the multi-rater feedback system in providing multi-source
information and creating self-awareness based on individual strengths and weaknesses. One
underlying rationale for such systems is their potential impact on the individual's
self-awareness, which is thought to enhance performance at the development stage.
This paper serves as a conceptual paper, which studies how multi-rater feedback could
effectively lead to a successful developmental process through process consultation in the
context of the Malaysian training environment. Through the years, training evaluation culture in
Malaysia has not been properly developed. A comprehensive approach is necessary for
organizations to see the benefits of conducting pre training analysis. This should be followed
by an effective development plan so that a comprehensive training approach could be
instilled in the Malaysian environment.
The process consultant holds the key to effective development process by using a multi-rater
assessment as a pre-training gap analysis. Process consultation provides the opportunity to
'check and balance' the degree of learning and development activities through reflections,
problem solving capabilities and application of theory throughout the developmental process.
Good conversation was introduced as an intervention tool to complement double loop
learning during process consultation.
This portfolio systematically discusses the issue of training evaluation faced by the Malaysian
manufacturing sector. It is recommended that an integrated model approach comprising
preliminary and post assessment using multi-rater feedback, followed by a developmental
process using process consultation, complemented by good conversation as an intervention
tool, may serve as a rational balance between training financial outlays and development
outcome.
Research Paper I
METHODOLOGICAL ISSUES IN MEASURING

THE TRAINING EFFECTIVENESS
Lim Guan Chong

University of Hull

Methodological Issues in Measuring
Training Effectiveness
Lim Guan Chong

1.1 Abstract
This literature review examines the effectiveness and the methodological issues related to
Kirkpatrick's four-level model of evaluation and its application to training. The paper first
examines the extent to which Kirkpatrick's evaluation model has been used by organizations to
measure learning outcomes, reactions towards development, transfer of learning, change of
behavior and return on investment after training. Research was conducted to determine the
weaknesses of this model faced by most practitioners. An examination of this classical
theory was carried out to address the weaknesses of this model by re-focusing on the issue of
transfer of learning as a key to unlock the model's practicality and validity.
1.2 Introduction
Training evaluation is regarded as an important human resource development strategy.
However, there seems to be widespread agreement that systematic evaluation is the least well
carried out training activity. Chen and Rossi (1992) commented that evaluation knowledge
found in the literature has not been fully utilized in program evaluation. This suggests that
training evaluation has not been culturally embedded in most organizations. The first reason
could be that companies lack knowledge of how to conduct training evaluation. Secondly, the
available training evaluation models are not sufficient in providing a total approach for
effective training evaluation. This is further evidenced by a study on the benefits of training
in Britain, which revealed that 85 percent of British companies make no attempt to assess the
benefits gained from undertaking training (HMSO, 1989).
Since evaluation started in the area of education, most of the early definitions were in that
area. Tyler (1949) was the first researcher to define evaluation as a process of determining to
what extent the educational objectives are actually being realized by the curriculum and
instruction. The early researchers emphasized the need to look at attaining objectives as an
important process in determining the effectiveness of any programs. This was found in the
study by Steel (1970), who compared effectiveness of the program with its cost. Boyle and
Jahns (1970) defined evaluation as the determination of the extent to which the desired
objectives have been attained or the amount of movement that has been made in the desired
direction. Further study by Provus (1971) conceptualized the need to have a certain standard
of performance as an objective-based criterion to judge the success of the program. His
model made comparisons between this preset standard and what actually exists. Noe (2000)
defined evaluation by referring to training evaluation as the process of analyzing the
outcomes needed to determine if training was effective. However, Goldstein and Ford (2002)
were of the opinion that evaluation is a systematic collection of descriptive and judgemental
information necessary to make effective training decisions which are related to the selection,
adoption, value, and modification of various activities.
After many in-depth studies were conducted on training evaluation, and given high expectations
of cost-effectiveness from training, the term evaluation has been given a broader perspective
in which it no longer focuses on achieving program objectives but mainly covers the
methodology element of evaluation (Brinkerhoff, 1988; Goldstein, 1986; Junaidah, 2001;
Shadish & Reichardt, 1987; Stufflebeam & Shinkfield, 1985). The basis of goal-based
process formed only part of the overall evaluation process, unlike in the past when
researchers used one preferred methodological principle to assess the degree to which
training had attained their goal. With the availability of a wider range of philosophical
principles and scientific methodologies, many social scientists emphasized scientific rigor in
their evaluation models, and this is reflected in their definition of the field (Junaidah, 2001).
The evaluation model of these social scientists involves primarily the application of scientific
methodologies to study the effectiveness of the programs. These evaluators emphasized the
importance of experimental designs (Goldstein & Ford, 2002), quantitative measures (Rossi
& Freeman 1993) and qualitative assessment (Wholey, Hatry & Newcomer, 1994).
Contemporary social scientists such as Cascio (1989), Mathieu and Leonard (1987), Morrow, Jarrett
and Rupinski (1997), and Tesoro (1998) even adopted utility analysis in evaluating the worthiness
and effectiveness of the programs.
In brief, the concept of evaluation consists of two distinct definitions: congruent and
contemporary definitions (Junaidah, 2001). The congruent definition is more concerned with
meeting the desired objectives. It is a process of collecting information, judging the worth or
value of the program and ensuring training objectives are met. The contemporary definition
of evaluation places emphasis on scientific investigation to facilitate decision-making.
Stufflebeam (1971) mentioned that evaluation is the process of delineating, obtaining and
providing useful information for judging decision alternatives. This can be seen from the
evolution of the early 70s models to the current contemporary evaluation models.
1.3 Approaches to Training Evaluation
Evaluation in its modern form has developed from attempts to improve the educational
process (Bramley, 1996). Evaluating the effectiveness of people became popular at about the
same time as scientific management, and school officials began to see the possibility of
applying these concepts to school improvement (Bramley, 1996). Tyler's (1949) model is
generally considered an early prominent evaluation model, which was designed to compare the
value of progressive high-school curricula with more conventional ones (Stufflebeam &
Shinkfield, 1985).
Tyler (1949) introduced the Basic Principles of Curriculum and Instruction, which is
organized around four main concerns:
What educational purposes should the organization seek to attain?
How to select learning experiences that are likely to be useful in achieving these
purposes?
How can the selected learning experiences be organized for effective instruction?
How can the effectiveness of these learning experiences be evaluated?
Tyler laid the foundation for an objective-based style of evaluation. Objectives were seen as
being critical because they were the source for planning, guiding the instruction and
preparing the test and measurement procedures. Tyler's objective-based evaluation model
concentrates on clearly stated objectives by changing the evaluation from appraisal of
students to appraisal of programs. He defined evaluation as assessing the degree of
attainment of the program objectives. Decisions made on any program had to be based on the
goal congruence between the objectives and the actual outcomes of the program (Stufflebeam
& Shinkfield, 1985).
1.3.1 Discrepancy Evaluation Model
The Discrepancy Evaluation Model, developed by Provus (1971), is used in situations where a
program is examined through its development stages with the understanding that each stage
(which Provus defines as design, installation, process, product and cost-benefit analysis) is
measured against a set of performance standards (objectives). The cost-benefit analysis
identifies the potential benefits of the training before it is carried out. The expected
behaviours which result from the training are agreed upon between the trainer and the
trainees. The analysis also establishes training objectives, which are defined as changes in
work behaviour and increased levels of organizational effectiveness (Bramley & Kitson,
1994). The program developers had certain performance standards in mind regarding how the
program should work and how to identify if it were working. The discrepancies that are
observed between the standards and the developed design are communicated back to the
relevant parties for review or further corrective action. A discrepancy evaluator's role is to
determine the gap between what is and what should be. This model helps the evaluators to
make decisions based on the difference between preset standards and what actually exists
(Boulmetis & Dutwin, 2000).
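The gap computation at the heart of the discrepancy approach can be illustrated with a minimal sketch. The stage names follow the five stages listed above; the numeric standards, the observed scores, the 0-100 scale and the tolerance threshold are hypothetical values chosen purely for illustration and are not drawn from Provus (1971).

```python
from dataclasses import dataclass


@dataclass
class StageResult:
    stage: str
    standard: float  # preset performance standard ("what should be")
    actual: float    # observed performance ("what is")

    @property
    def discrepancy(self) -> float:
        # A positive value means performance fell short of the standard.
        return self.standard - self.actual


def flag_discrepancies(results, tolerance=0.0):
    """Return the stages whose shortfall exceeds the tolerance."""
    return [r.stage for r in results if r.discrepancy > tolerance]


# Hypothetical scores on a 0-100 scale, for illustration only.
results = [
    StageResult("design", standard=80, actual=85),
    StageResult("installation", standard=75, actual=70),
    StageResult("process", standard=80, actual=78),
    StageResult("product", standard=70, actual=55),
    StageResult("cost-benefit analysis", standard=60, actual=60),
]

for r in results:
    print(f"{r.stage}: discrepancy = {r.discrepancy}")

# Stages that would be communicated back to the relevant parties for review.
needs_review = flag_discrepancies(results, tolerance=3)
print(needs_review)
```

In this sketch a single number stands in for each standard. In practice the standards may be qualitative and multidimensional, so the arithmetic should be read only as a way of making the "what is" versus "what should be" comparison concrete.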
Provus's Discrepancy Evaluation Model can be considered an extension of Tyler's earlier
objective-based model, in which a set of performance standards must be derived to serve as the
objectives on which the evaluation of the program is based. Furthermore, the model may
also be viewed as having properties of both formative and summative evaluation (Boulmetis
& Dutwin, 2000). The design stage comprises the needs analysis and program planning
stages; installation and process are parts of the implementation stage where formative
evaluation is done; and the product and cost-benefit analysis stages comprise a summative
evaluation stage.
Formative evaluation focuses on the process criteria to provide further information to
understand the training system so that the intended objectives are achieved (Goldstein &
Ford, 2002). Brown, Werner, Johnson and Dunne (1999) note several potential benefits of
formative evaluation. The program could be assessed half way through to see whether it is
on track, effectively performed, and whether the activities are meeting the needs of the
training. The evaluator determines the extent to which the program is running as planned,
measures the program progress in attaining the stated goals, and provides recommendations
for improvement. The evaluation findings in these reports and the monitoring data could be
used to end a program in midstream (Goldstein & Ford, 2002). Unlike formative evaluation,
summative evaluation is fairly stable and does not allow adjustments during the program
cycle. Summative evaluation involves evaluating and determining whether the program has
experienced any unplanned effects. It helps organizational decision makers decide whether
to use the program again or improve it in some way. Campbell (1988) discriminates between
two types of summative evaluations; the first evaluation simply questions whether a
particular training program produces the expected outcome. The second evaluation compares
and investigates the benefits and viability of programmed instruction procedures. By
comparing the two evaluations, it was found that programmed instruction produces quicker
mastery of the subject, but the eventual level of learning retention is the same with either
technique (Campbell, 1988).
Provus's Discrepancy Evaluation Model provides information for establishing measures of
training success by determining whether the actual content of the training material would
develop knowledge, skill and ability (KSA) and eventually lead to successful job
performance. However, too many subjective issues exist, especially in the
setting up of the performance criterion. The chosen criterion is based on the relevance of
three components: knowledge, skill and ability, which are necessary to succeed in the training
and eventually on the job. Considering that modern approaches to assessing training
programs must be examined with a multitude of measures, including participant reactions,
learning, performance, and organizational objectives, it is necessary for training evaluators to
view the performance criteria as multidimensional (Goldstein & Ford, 2002). Training can
best be evaluated by examining many independent performance dimensions. However, the
relationship between measures of success should be closely scrutinized because the
inconsistencies that occur often provide important insights into training procedures
(Goldstein & Ford, 2002). Decisions and feedback processes depend on the availability of all
sources of information. There are many different dimensions in which the performance
criteria can vary. Issues like relevance and reliability of the criterion are important to consider
should one wish to adopt this discrepancy evaluation model. There are several considerations
in the evaluation of the performance criteria. These include acceptability to the organization,
networks and coalition that can be built between trainees and realistic measures (Goldstein &
Ford, 2002).
Responsive approaches used in the goal-free model are better evaluative approaches as there
is considerable variation in what the objectives of a program are thought to be. Responsive
approaches are a form of action research which involves the stakeholders in the data
collection process (Bramley, 1996). The intention is not to attribute causality, but to gain a
sense of the value of program from different perspectives. The term "responsive evaluation"
was first used by Stake (1977) to describe a strategy in which the evaluator is less concerned
with the objectives of the program than its effect in relation to the concerns of interested
parties, namely the stakeholders.
The responsive approach involves protracted negotiations with a wide range of stakeholders
in constructing the report. It is thus more likely to reflect their reality and be useful for them.
However, the underlying philosophy of responsive evaluation is different from the goal-based
approach. Evaluators are seen as subjective partners and the evaluation is based upon a joint-
collaborative effort which results in findings being constructed rather than revealed by the
investigation. Truth is a matter of consensus among informed parties. Facts have no meaning
except within some value framework. Phenomena can only be understood in the context in
which they are studied; generalization is not possible.
The suggested method intends to achieve progressive focus by giving more attention to
emerging issues rather than seeking the truth. Legge (1984) introduced a model similar to
goal free evaluation which evaluates planned organizational change. The evaluation is a joint,
collaborative process, which results in something more constructed than revealed by the
investigation. Legge (1984) suggests that instead of attempting evaluation as a thoroughly
monitored research, a contingency approach should be adopted. The contingency approach
is used to decide which approach is more appropriate or best matches the functional
requirements of the evaluation exercise. Campbell (1988) revealed that internal validity of
the scientific approach may not be so crucial. To increase internal validity, the legitimate
stakeholders should agree on the evaluation approach. The emphasis on internal validity in
the scientific approach will frequently imply controlling key aspects of the context and many
organizational variables. This may lead to rather simplified information which clients find
difficult to use because it does not reflect their perception of organizational reality. Due to
this strong bipolarity between practitioners and academics, not many responsive evaluations
have been described in the training literature (Bramley, 1996).
1.3.2 Transaction Model
The Transaction Model developed by Stake (1977) affords a concentration of activity among
the evaluator, participants and the project staff (Madaus, Scriven & Stufflebeam, 1986). This
model combines monitoring with process evaluation through regular feedback sessions
between evaluator and staff. The evaluator uses a variety of observational and interview
techniques to obtain information and the findings will be shared with all the relevant parties
to improve the overall program. The evaluator participates and provides project activities.
Besides trying to obtain objectivity, the evaluators use subjectivity in the transaction model.
This model may have a goal-free or a goal-based orientation. Findings are shared with the
staff of all the projects in order to improve both individual and overall projects (Boulmetis &
Dutwin, 2000).
1.3.3 Goal-Free Model
Unlike early models, the goal-free model developed by Michael Scriven is a model that
involved methodological studies and processes (Popham, 1974). The evaluation model
examines how the program is performing and how the program could address the needs of the
client population. Program goals are not the criteria on which evaluation is based. However,
it is a data gathering process which studies actual happenings and evaluates the effectiveness
of the program in meeting the client's needs. The evaluator has no preconceived notions
regarding the outcome of the program (as opposed to the goal-based model). Categories of
evaluation naturally emerge from the evaluator's actual observation. Once the data have been
collected, the evaluator attempts to draw conclusions about the impact of the program in
addressing the needs of the stakeholders.
However, this model has its weakness in terms of its subjective measures. Some hold that
the evaluator must be an expert in the relevant field, while others say that
no expertise is better (Rossi & Freeman, 1993). Some researchers said that an evaluator who
is not familiar with the nuances, ideologies and standards of a particular professional area
will presumably not be biased when observing and collecting data on the activities of a
program. They maintain, for example, that a person who is evaluating a program to train
dental assistants should not be a person trained in the dental profession. But other researchers
allege that a person who is not aware of the nuances, ideologies and standards of the dental
profession may miss a good deal of what is important to the evaluation. Both sides agree that
an evaluator must attempt to be an unbiased observer and be adept at observation and capable
of using multiple data collection methods (Wholey, Hatry & Newcomer, 1994). This is a
topic of debate among many experts. Scriven suggested using two goal-free evaluators, each
working independently, to address the preconceived issues and reduce possible bias in
evaluation (Scriven, 1991).
A study by O'Leary (1972) illustrates the importance of considering other dimensions of the
criteria. She used a program of role-playing and group problem-solving sessions with hard-
core unemployed women. At the conclusion of the program, the trainees had developed
positive changes in attitude toward themselves. However, it also turned out that these
changes did not reflect the lack of positive attitudes toward their tedious and structured jobs.
These trainees apparently raised their levels of aspiration and subsequently sought
employment in a working setting consistent with their newly found expectations. It was
obvious that the trainees were leaving the job as well as experiencing positive changes in
attitude. However, there are many other cases in which the collection of a variety of criteria
related to the objectives is the only way to effectively evaluate the training program
(Goldstein & Ford, 2002). This has caused goal-based evaluation to lose ground during the last
20 years because of the growing conviction that evaluation is actually a political process and
that the various values held in the society are not represented by an evaluative process which
implies that a high degree of consensus is possible (Bramley, 1996).
Further studies by Parlette and Hamilton (1977) rejected the classical evaluation system,
which focuses on an objective reality assumed to be equally relevant to all stakeholders, and
acknowledged the diversity posed by different interest groups. They suggested
"illuminative evaluation", which is concerned with description and interpretation rather than
with measurement and prediction.
1.3.4 Systemic Evaluation
Systemic evaluation analyses the effectiveness of the whole system and enhances the
interfaces between the sub-systems in such a way as to increase the effectiveness of the
system. That is what the "system approach" sets out to do (Rossi & Freeman 1993). The most
comprehensive purpose of systemic evaluation is to find out to what extent training has
contributed to the business plans of various parts of the organization and consider whether the
projected benefits obtained outweigh the likely cost of training.
The main questions, which this strategy sets out to answer, are (Bramley, 1996):
Is the program reaching the target population?
Is it effective?
How much does it cost?
Is it cost effective?
These questions are used to derive facts about the program, for example by defining the size of the target
population and working out the proportion that has attended the training, rather than opinions on
whether useful learning has taken place. Effectiveness is difficult to measure as the word may
mean different things to different people. However, the model seems to measure the quantity
rather than the quality of what is being done.
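The quantitative character of these questions can be made concrete with a short sketch. The function names and all figures below are hypothetical and serve only as an illustration; the source does not prescribe any particular formula, and the benefit-to-cost ratio is just one simple way of asking whether projected benefits outweigh the cost of training.

```python
def reach(attended: int, target_population: int) -> float:
    """Proportion of the target population that attended the training."""
    if target_population <= 0:
        raise ValueError("target_population must be positive")
    return attended / target_population


def cost_per_trainee(total_cost: float, attended: int) -> float:
    """Average training cost for each person who attended."""
    if attended <= 0:
        raise ValueError("attended must be positive")
    return total_cost / attended


def benefit_cost_ratio(projected_benefit: float, total_cost: float) -> float:
    """Ratio above 1 means projected benefits outweigh the training cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return projected_benefit / total_cost


# Hypothetical figures, for illustration only.
target_population = 200
attended = 150
total_cost = 45_000.0
projected_benefit = 60_000.0

coverage = reach(attended, target_population)
unit_cost = cost_per_trainee(total_cost, attended)
ratio = benefit_cost_ratio(projected_benefit, total_cost)

print(f"Reach: {coverage:.0%}")
print(f"Cost per trainee: {unit_cost:.2f}")
print(f"Benefit-to-cost ratio: {ratio:.2f}")
```

None of these figures says anything about whether useful learning took place, which is the limitation noted above: the measures capture how much was delivered and at what cost, not how well.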
In the system analysis model, the evaluator looks at the program in a systematic manner,
studying the input, throughput and output (Rivlin, 1971).
Inputs are elements that come into the system (i.e. clients, staff, facilities and resources).
Throughput consists of things that occur as the program operates, for example, activities,
client performance, staff performance, and adequacy of resources such as money, people and
space. Output is the result of program-staff effectiveness, adequacy of activities etc. The
evaluator mainly examines the program efficiency in light of these categories.
1.3.5 Quasi-Legal Approach
Quasi-legal evaluation operates in a court of inquiry manner. Witnesses are called to testify
and tender evidence. Great care and attention is taken to hear a wide range of evidence
(opinions, values and beliefs) collected from the program. This approach is basically used to
evaluate social programs rather than formally evaluate training or development activities.
Quasi-legal evaluation was reported flawed by Porter and McKibbin (1988) in the area of
management education in the USA. The substantial information received from stakeholders
was analysed by a small group of professors from a business school. The students were
basically satisfied with the qualification which they had obtained and found the course
worthwhile and useful. However, the researchers criticized that young graduates who attend
MBA courses have never worked in an organization and thus do not understand the sort of
issues, which should be the basic discussion material of MBA courses. A similar problem
arose with Constable and McCormick's (1987) report on the demand for and supply of
management education and training in the UK. The researchers found that judgement by
insufficiently impartial judges in the quasi-legal approach may be irrelevant, biased or
inconclusive (Bramley, 1996).
1.3.6 Art Criticism Model
In the Art Criticism Model developed by Eisner (1997), the evaluator is a qualified expert in
the nuances of the program and becomes the expert judge of the program's operation. The
success of this model depends heavily upon the evaluator's judgment. The intended outcome
may come in the form of critical reflection and/or improved standards. This model could be
used when a program wishes to conduct a critical review of its operation prior to applying for
funding or accreditation.
1.3.7 Adversary Model
In Owen's Adversary Model, the evaluator facilitates a jury that hears evidence from
individuals on particular program aspects (Madaus, Scriven & Stufflebeam, 1986). The jury
uses multiple criteria to "judge" evidence and make decisions on what has happened. This
model can be used when there are different views of what is actually happening in a program
such as arguments for and against program components.
1.3.8 Contemporary Approaches - Stufflebeam's Improvement-Oriented
Evaluation (CIPP) Model, 1971
Stufflebeam considers that the most important purpose of evaluation is not to prove but to
improve (Stufflebeam & Shinkfield, 1985). The four basic types of evaluation in this model
are context (C), input (I), process (P) and product (P).
Context evaluation defines the relevant environment and identifies training needs,
opportunities and specific problems. Input evaluation provides information to determine how
resources can be used most efficiently to meet program objectives. The results of input
evaluation are often seen as policies, budgets, schedules, proposals and procedures. Process
evaluation provides feedback to individuals responsible for implementation. It is
accomplished through providing information for preplanned decisions during implementation
and describing what actually occurs. This includes reaction sheets, rating scales and content
analysis. Ultimately, product evaluation measures and interprets the attainment of program
goals. Contemporary approaches could take place both during and after the program with the
aim to improve program evaluation by expanding the scope of evaluation through its four
basic types of evaluation (Madaus, Scriven & Stufflebeam, 1986).
The CIPP model was conceptualized as a result of attempts to evaluate projects that had been
funded through the Elementary and Secondary Education Act of 1965 (Stufflebeam, 1983). To conduct
CIPP model evaluation, the evaluator needs to design preliminary plans and deal with a wide
range of choices pertaining to evaluation. This requires collaboration between clients and
evaluators as a primary source for identifying the interest of the various stakeholders.
1.3.9 Cervero's Continuing Education Evaluation, 1984
In his book "Effective continuing education for professionals", Cervero suggested
seven categories of evaluation questions organized around seven criteria to determine
whether programs were worthwhile (Cervero, 1988). The seven criteria are (a) program
design and implementation, (b) learner participation, (c) learner satisfaction, (d) learner
knowledge, skills and attitudes, (e) application of learning after the program, (f) impact of
application of learning and (g) program characteristics associated with outcomes.
Program design and implementation is concerned with what was planned, what was actually
implemented and the congruence between the two. Factors such as the activities of learners
and instructors and the adequacy of the physical environment for facilitating learning are
common questions which are asked in this category.
Learner participation has both quantitative and qualitative dimensions. The quantitative
dimension deals with evaluative questions that are most commonly asked in any formal
program. The data is not used to infer answers in the other categories. Qualitative data is
collected in an anecdotal fashion by unobtrusively observing the proceedings of the
educational activities.
Learner satisfaction is concerned with the participants' reaction and is collected according to
various dimensions, such as content, educational process, instructor's performance, physical
environment and cost.
Learner knowledge, skills and attitudes focus on changes in the learner's cognitive,
psychomotor and affective goals. Normally, the evaluator will adopt a pen and paper test to
judge the effectiveness of these categories.
Application of learning addresses the degree of skill transfer to the actual work place. The
impact of application of learning focuses on the second-order effects, which means the
transfer and impact on the public (Cervero, 1988).
Program characteristics are associated with the outcome of the program. There are two kinds
of evaluative questions: the implementation questions and the outcome questions.
Implementation questions are useful for determining what happened before and during the
program. Outcome questions are useful for determining what occurred as a result of the
program.
The seven categories in this model are not viewed as a hierarchy (Junaidah, 2001). Cervero's
ideas have several antecedents in the evaluation literature. His framework was influenced by
Kirkpatrick's (1959) and Tyler's (1949) models. It is considered to be a comprehensive
model as it covers all the stages involved, from the program design stage to the
outcome stage. However, this evaluation model may be viewed as too tedious to
implement due to its complexity. The author is too immersed in gathering facts about the entire
process and ignores the efficiency of the whole evaluation process. This makes the model
more summative than formative in nature.
1.3.10 The Kirkpatrick Model, 1959a, 1959b, 1960a, 1960b, 1976, 1979,
1994, 1996a, 1996b, 1998
One of the most widely used models for classifying the levels of evaluation, used by Barclays
Bank PLC, Reeves in 1996 and others, was developed by Kirkpatrick. His model looks at
four levels of evaluation, from the basic reaction of the participants to the training through
to its impact on the organization. The intermediary levels examine what people learned from the
training and whether learning has affected their behaviour on the job. Level one (Level 1)
concerns itself with the most immediate reaction of participants and is easily measured by
simple questionnaires after the training. Level two (Level 2) is harder to measure and is
concerned with measuring what people understood and how they were able to demonstrate
their learning in the work environment. Level two (Level 2) can be measured by pen and
paper tests or through job simulations. Level three (Level 3) looks at the changes in people's
behaviour towards the job. For example, after a writing skills course, did the individual make
fewer grammatical and spelling errors and were their memos easier to understand? Level
four (Level 4) measures the "result" gained from the training. It focuses on the impact of the
training on the organization rather than the individual.
Kirkpatrick (1959) developed this coherent evaluation model by producing what was thought
to be a hierarchical system of evaluations, which indicates effectiveness through:
Level 1 (Reaction)
Level 2 (Learning)
Level 3 (Behaviour)
Level 4 (Results)
Kirkpatrick's (1994) Training Evaluation Model
Reaction     How did the participants react to the training?
Learning     What information and skills were gained?
Behaviour    How have participants transferred knowledge and skills to their jobs?
Results      What effect has training had on the organization and the achievement
             of its objectives? (Timely and quality performance appraisals are a
             corporate goal)
Kirkpatrick was the first researcher to develop a coherent evaluation strategy by producing
what was thought to be a hierarchy of evaluations, which would indicate benefit (Plant &
Ryan, 1994).
Level 1 Reaction Evaluation
Kirkpatrick proposed the use of a post course evaluation form to quantify the reactions of
trainees. Evaluation at this level is associated with the terms "happiness sheet" or "smile
sheet" because reaction information is usually obtained through a participatory
questionnaire administered near or at the end of a training program (Smith, 1990).
Studies on evaluation mechanisms have shown that such evaluation sheets are not held in
high esteem, despite their general use by trainers in many organizations and in institutions
of higher learning (Bramley, 1996; Clegg, 1987; Love, 1991; Rae, 1986). Clegg (1987)
found that training evaluation was conducted for 75 percent of training programs done in
organizations. A study by Dawson (1993) found that Level 1 evaluation sheets were
ubiquitous.
Level 2 Learning Evaluation
The learning level is concerned with measuring the learning of principles, facts, techniques
and skills presented in a program (Kirkpatrick, 1994). Tyler (2002) found that 32
percent of companies in America have carried out post-training evaluation at Level 2.
Another study, conducted by Mathews, Ueno, Kekale, Repka, Pereira and Silva (2001)
on 450 companies in the UK, Portugal and Finland and focused on training quality and
training evaluation, showed that 40 percent of UK companies, 31 percent of Finnish
companies and 51 percent of Portuguese companies conduct formal assessment of the learning
of the principles, facts, skills and attitudes which were specified as training objectives.
This level evaluates the knowledge, skills development and attitudinal changes that have
taken place. Examination of both knowledge and attitudinal outcomes is important to
increase coverage of training impacts because the pattern of change can vary between the
pre-test and post-test (Basadur, Graen & Scandura, 1986; Kraiger, Ford & Salas, 1993).
Researchers either assessed change before and after a program (Basadur et al., 1986;
Bretz & Thompsett, 1992), or looked merely at the post-training attainment score
(Davis & Mount, 1984; Warr & Bunce, 1995). Measures of learning should be objective,
with quantifiable indicators of how new requirements are understood and absorbed. This
data is used to confirm that participant learning has occurred as a result of the training
initiative (Phillips & Stone, 2002).
Level 3 Behavioural Evaluation
Job performance after training is referred to as behaviour by Kirkpatrick (1959, 1976)
and transfer by Alliger, Tannenbaum, Bennett, Traver and Shotland (1997). Level 3
evaluates the extent to which the "transfer" of knowledge, skills and attitudes has
occurred. Tyler (2002) reported that only 9 percent of American industries have carried
out post-training evaluation at this level. The focal point is on performance at work after
a program. It is essential to record before and after performance, but sometimes
self-reports are obtained if information is unavailable to an evaluator (Wexley & Baldwin,
1986). It determines the extent of change in behaviour that has taken place and how this
behaviour would be transferred to the workplace. It further encourages one to take into
account the possible factors in the job environment that could prevent the application of
the newly learned knowledge and skills, since a positive climate is important for
transfer.
Level 4 Results Evaluation
The evaluation of a particular training program becomes more complex as one progresses
through the levels of the Kirkpatrick model. Results can be defined as the final results that
occurred because the participants attended the training program. This includes increased
production, improved quality, increased sales and productivity, higher profits and return
on investment. Level 4 evaluation observes changes in the performance criteria (i.e. key
result areas) of organizational effectiveness. This level anticipates the gains the
organization can expect from a training event. This level of evaluation is made more
difficult as organizations often demand that the explanation be given in financial terms
with measurable quantifiers (Redshaw, 2001).
Since Kirkpatrick's ideas were first published in 1959, much debate has been
recorded on this model. Despite criticism, the Kirkpatrick model is still the one most generally
accepted by academics (Blanchard & Thacker, 1999; Dionne, 1996; Kirkpatrick, 1996a;
1996b; 1998; Phillips, 1991). However, research conducted in the United States has
suggested that US organizations generally have not adopted all of Kirkpatrick's 4-level
evaluation (Geber, 1995; Holton, 1996). This is especially true for the last two, more
difficult, levels of Kirkpatrick's hierarchy (Geber, 1995). In a survey of training in the USA,
Geber (1995) reported that for companies with 100 or more employees, only 62 percent
assessed behavioural change. Geber's (1995) results also indicated that only 47 percent of
US companies assess the impact of training on organizational outcomes. This poses a good
research question about the model's methodology and it forms the basis for epistemological
studies around the methodology.
Kirkpatrick's work has received a great deal of attention within the field of training
evaluation (Alliger & Janek, 1989; Blanchard & Thacker, 1999; Campion & Campion, 1987;
Connolly, 1988; Dionne, 1996; Geber, 1995; Hamblin, 1974; Holton, 1996; Kirkpatrick,
1959; 1960; 1976; 1979; 1994; 1996a; Newstrom, 1978; Phillips, 1991). His concept calls
for four levels of evaluation namely reaction, learning, behaviour and results. His four levels
of training effectiveness stimulated a number of supportive and conflicting models of varying
levels of sophistication (Alliger & Janek, 1989; Campion & Campion, 1987). There are
models and methods that incorporate financial analyses of training impact (Swanson &
Holton, 1999). However, Warr, Allan and Birdi (1999) conducted a longitudinal study of the
first three levels of training evaluation. The study examined the relationships
between evaluation levels, the individual and organizational predictors of each level and the
differential prediction of attainment versus change scores. The study showed that immediate and
delayed learning were predicted by the trainee's motivation, confidence and use of learning
strategies. The researchers highlighted that it is preferable to measure training outcomes in
terms of change from pre-test to post-test, rather than merely through attainment (post-test)
scores (Warr, Allan & Birdi, 1999).
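The distinction between attainment and change scores can be illustrated with a minimal sketch. The scores below are hypothetical and are not taken from Warr, Allan and Birdi (1999); the function names are likewise assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical pre- and post-test scores (percent correct) for five trainees.
# These numbers are invented for illustration, not drawn from any cited study.
pre_scores = [40, 55, 70, 35, 60]
post_scores = [65, 70, 75, 60, 80]


def attainment_scores(post):
    """Attainment: the post-test score taken on its own."""
    return list(post)


def change_scores(pre, post):
    """Change: post-test score minus pre-test score for each trainee."""
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    return [after - before for before, after in zip(pre, post)]


attainment = attainment_scores(post_scores)
change = change_scores(pre_scores, post_scores)

print("attainment per trainee:", attainment)
print("change per trainee:", change)
print("mean attainment:", mean(attainment))
print("mean change:", mean(change))
```

In this invented data the third trainee has the second-highest attainment score but the smallest change score, which is the kind of difference that an attainment-only measure would hide.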
A review of the most popular procedures used by US companies to evaluate their training
programs showed that over half (52 percent) use assessments about participants' satisfaction
with the training. 17 percent assessed application of the trained skills to the job and 13
percent evaluated changes in organizational performance following the training. 5 percent
tested for skill acquisition immediately after training while 13 percent of American
companies carried out no systematic evaluation of their training programs (Mann &
Robertson, 1996). Many of these procedures reflect Kirkpatrick's four levels of reaction,
learning, behaviour and results, which will be discussed further.
More than 50 available evaluation models use the framework of the Kirkpatrick model (Phillips,
1991). Currently, the majority of employee training is evaluated at Level 1. Evaluation at
Level 1 is associated with the terms smile sheet or happiness sheet, because reaction
information is usually obtained through a participatory questionnaire administered near the
end or at the end of a training program (Smith, 1990). The specific indication of the smile
sheet or happiness sheet is enjoyment of the training, perceptions of its usefulness and its
perceived difficulty (Warr & Bunce, 1995).
Phillips and Stone (2002) enhanced the popularity of the Kirkpatrick model by adding a
fifth level to the existing 4-level model, though they further argued that the 4-level model
is inadequate in capturing the return on investment aspect of the training outcome. Phillips and
Stone's (2002) 5-level evaluation model was seen as an extension of Kirkpatrick's 4-level
evaluation model as different companies have their own definition of pay offs to measure the
training results. Return on investment compares the training's monetary benefits with the
cost of the training, so that the true value of the training to the organization can be assessed.
Converting data to monetary values is the first phase in putting training initiatives on the
same level as other investments that organizations make (Phillips, 2002). However, return on
investment cannot be used to cover other variables that may affect the results (e.g. culture,
productivity). Kirkpatrick (1994) refuted this idea by claiming that there are many ways to
measure training results. This raises the question of whether training evaluation should be
viewed only as a measure of financial benefits. Lewis and Thornhill (1994) are of the opinion
that there should be 5 levels of evaluation, measuring the effects of training on the
department (i.e. Level 4) and on the whole organization (i.e. Level 5). Lewis and Thornhill
(1994) emphasized the need to look at values and organizational culture as variables for
measuring training effectiveness.
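The comparison of monetary benefits with training costs described above can be sketched as follows. The sketch assumes the commonly stated forms of the benefit-cost ratio (benefits divided by costs) and of return on investment (net benefits divided by costs, expressed as a percentage); the function names and the figures are hypothetical.

```python
def benefit_cost_ratio(benefits, costs):
    """Monetary benefits of training divided by its costs."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return benefits / costs


def roi_percent(benefits, costs):
    """Net benefits (benefits minus costs) as a percentage of costs."""
    if costs <= 0:
        raise ValueError("costs must be positive")
    return (benefits - costs) / costs * 100


# Hypothetical figures: a program costing 100,000 that is credited
# with 150,000 in monetary benefits after data conversion.
program_benefits = 150_000
program_costs = 100_000

print("benefit-cost ratio:", benefit_cost_ratio(program_benefits, program_costs))
print("ROI (%):", roi_percent(program_benefits, program_costs))
```

The calculation itself is trivial; the difficult step, as the text notes, is converting training outcomes to monetary values in the first place, and the sketch takes those values as given.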
In recent times others have tried to make the system easier to deal with. Warr et al. (1999)
came up with the context, input, reaction and outcome (CIRO) evaluation system, with the
context part going some way towards front-loading the evaluation and partly towards
mirroring the Kirkpatrick model. Dyer (1994) proposed an evaluation system that suits all
organizations, irrespective of size or diversity of operation. It is a system that is relatively
easy to come to terms with and can be implemented at all the hierarchical stages of an
organization. It fits the individual and it fits the whole organization. The system puts
Kirkpatrick's evaluation system against a mirror. The benefits of using Kirkpatrick's Mirror
should be self-evident to anyone involved in management. Application of the paradigm
allows the individual to become more business focused, and if adopted universally should
provide efficient and effective training throughout any organization (Dyer, 1994).
A different model was used in a study by Shireman (1991) on the evaluation of a
hospital-based health education program. The study adopted the CIPP model in examining the type of
evaluation which was being conducted in the hospital. A structured questionnaire was sent to
a stratified random sample of 160 hospitals of four different sizes in four mid-western states.
The result showed that 48 percent of the respondents reported that product evaluations were
usually done and less than 25 percent reported that other types (i.e. context, input, process) of
evaluations were done. The product evaluation is outcome-based and quite similar to
Kirkpatrick's end process evaluation. Both types of evaluations require appropriate data
collection activities.
The Kirkpatrick model was used by most researchers as an initial framework for generating
evaluation models. This paper addresses the methodological issues surrounding the taxonomy of
the Kirkpatrick model as an area for epistemological study. The theoretical and empirical
literature on the Kirkpatrick model will be critically evaluated and further research opportunities
will be outlined.
1.4 Critical Review
Phillips (1991) concluded that out of more than 50 evaluation models available, the
evaluation framework that most training practitioners use is the Kirkpatrick model. Though
the model seems to have weathered well, it has also limited our thinking on training evaluation
and possibly hindered our ability to conduct meaningful training evaluation (Bernthal, 1995).
More than ever, training evaluation must demonstrate improved performance and financial
results. But in reality, according to Garavaglia (1993), training evaluation often assesses
whether the immediate objectives have been met; specifically, how many items were
answered correctly on the post-test. Some base their evaluation only on trainee reaction, the
first level of the Kirkpatrick model developed in 1959 (Brinkerhoff, 1988). Such information
gives organizations no basis for making strategic business decisions (Davidove & Schroeder,
1992). Most practitioners are familiar with Kirkpatrick's 4-level evaluation model but many
never seem to get beyond Levels 1 and 2 (Regalbutto, 1992). Numerous organizations have
adapted the model presented by Kirkpatrick to suit their own situations; this seems to
have caused the growth of generic models (Dyer, 1994).
Kirkpatrick called for a definite approach to the evaluation model. All 4 levels must be
measured to ensure effectiveness of the whole evaluation system since each level provides
different kinds of evidence.
This view was supported by Hamblin (1974), who suggested that reaction leads to learning
and learning leads to change in behaviour, which subsequently leads to changes in the
organization. He further stated that the chain can be broken at any link and that a positive
reaction is necessary to create positive learning. According to Bramley and Kitson (1994),
there is not much evidence to support this linkage. Further research carried out by Alliger
and Janek (1989) found only 12 articles which attempted to correlate the various levels
advocated by Kirkpatrick. Although there are problems of external validity with such a small
data set, the tentative conclusion was that there was no relationship between reaction and the
other three levels of evaluation criteria. A correlation study run on these four
levels of evaluation showed insignificant results. A literature search based on Kirkpatrick's
name yielded 55 articles, but only 8 described evaluation results and none described
correlations between levels (Toplis, 1993). The conclusion was that good reactions did not
predict learning, behaviour or results.
A series of industrial surveys conducted in the last 30 years shows little application of all 4
levels of the Kirkpatrick model. Surveys conducted since 1970 showed that most industrial
trainers rely on student reaction, fewer on test learning and almost none on test application
and benefit (Brandenburg, 1982; Plant & Ryan, 1994; Raphael & Wagner, 1972). In the last
20 years, a number of writers claimed to have performed a full Kirkpatrick evaluation;
however, the linkages described in connecting the training event with the outcome are
subjective and tenuous (Salinger & Deming, 1982; Sauter, 1980).
A survey conducted by the Bureau of National Affairs and the American Society for Training and
Development (ASTD) in 1969 using questionnaires indicated that most of the companies
conducted Level 1 evaluation and took unsystematic approaches to Level 2 evaluation (Raphael &
Wagner, 1972). The survey indicated that problems of evaluation at higher levels were
mainly due to a lack of understanding of the approach used. The Kirkpatrick model seems to
offer a one-size-fits-all solution to measure training effectiveness. However, there has been
little evidence of the contribution and reliability of this model despite great industrial emphasis in this area.
The Kirkpatrick model focuses mainly on immediate outcomes rather than the processes leading to
the results. In fact, the improvement of these processes is the main force of effectiveness
(Murk, Barrett & Atchade, 2000). The following questions were never successfully addressed:
How a person's motivation level affects learning behaviour
The degree of superiors' support after the training
The extent to which training interventions were appropriate for meeting needs
Longer-term effects of the training, the pay-off in determining a course's overall impact and cost-effectiveness
The conduciveness of the training environment
An empirical study by Warr, Allan and Birdi (1999) showed that external processes like
increasing confidence and motivation levels of trainees as well as use of certain learning
strategies are important contributing factors towards training effectiveness. A 2-day training
course was studied on 23 occasions over a 7-month period in the Institute of Work
Psychology, UK. Technicians who attended the training courses which involved operating
electronic tools were asked to complete a knowledge test questionnaire on arrival and at the
end of the course. A follow up questionnaire was mailed to the trainees one month later.
More than 70 percent of the respondents returned the questionnaire. The questionnaire was
designed to capture what the researchers defined as third factors (e.g. confidence, perception,
motivation, learning strategies, age). The results showed a non-significant correlation
between reactions towards the course and job behaviour. Perceptions of course difficulty
were significantly negatively associated with frequency of use of equipment. Correlations
between Level 2 and Level 3 evaluations were small. Learning scores and changes in those
scores (Level 2) were strongly predicted by trainees' specific reactions to the course, but those
reactions were not significantly associated with later job behaviour (Level 3) (Warr, Allan &
Birdi, 1999).
Alliger and Janek (1989) carried out a meta-analysis of studies where reaction measures had been
related to measures of learning (11 studies) and changes in behaviour (9 studies). They found
that positive reactions did not predict learning gains better than negative ones (the average
correlation between reactions and amount of learning was .02), nor were they any better at
predicting changes in behaviour after the program (the average correlation was .07).
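The kind of statistic being averaged in such a meta-analysis is a correlation coefficient between two levels of evaluation within a single study. A minimal sketch of a Pearson correlation between reaction ratings and learning gains is given below; the data and the function name are hypothetical and do not reproduce any of the cited studies, and a real meta-analysis would additionally weight and combine coefficients across studies.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two samples of equal length, at least 2 values each")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a sample has no variance")
    return sxy / sqrt(sxx * syy)


# Hypothetical data for six trainees in one study:
# Level 1 reaction ratings (1-5 scale) and Level 2 learning gains (points).
reaction_ratings = [4, 5, 3, 5, 4, 2]
learning_gains = [10, 5, 12, 8, 15, 9]

r = pearson_r(reaction_ratings, learning_gains)
print("reaction vs learning correlation:", round(r, 2))
```

A coefficient near zero, as reported in the meta-analysis, would mean that knowing a trainee's reaction rating tells the evaluator almost nothing about how much that trainee learned.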
Bramley and Kitson (1994) asserted that measuring learning is problematic because designing
a reliable measuring instrument is difficult and the necessary skills are often not available.
Grove and Ostroff (1990) pointed out that training directors often do not possess the essential
skills to conduct training evaluation. This could be part of the reason why companies are
reluctant to evaluate their training effectiveness.
Though Kirkpatrick's traditional assessment methods were widely used on Level 1 and 2
evaluations, the benefits of collecting data at each level are unclear. This uncertainty may
result in organizations failing to evaluate training completely or selecting forms of evaluation
that may not be reliable. Inadequacies in the Kirkpatrick model at each level force one to look
for other possible measures. Therefore, one may argue that to make the Kirkpatrick model
definite, a more detailed assessment method must be conducted at each level to ensure
practicality, validity and applicability (Mann & Robertson, 1996).
Mann and Robertson (1996) undertook to investigate the utility of various methods used in
evaluating training programs. Twenty-nine subjects were selected from a three-day training
seminar for the European National Run in Geneva, Switzerland. The seminar was a computer
training event (on e-mail and the Internet) for youth workers, and trainees were asked to
complete training evaluation forms before and after the training program and by post one
month later. Sixteen people returned this final questionnaire. Each questionnaire contained
three sets of questions designed to measure knowledge, attitudes and self-efficacy. The
results cast doubt on the value of the data received from reaction and learning levels.
Recommendations were made based on the following findings:
Measuring learning (Level 2) as a method of evaluating training effectiveness is
important. The study showed that not all of what is learned immediately after training
is retained one month later. This means that the practitioner should be aware that
training effectiveness may be short-term.
To ensure a more realistic evaluation at Level 2, one must be cautious about the pre- and
post-course evaluation method proposed by Kirkpatrick. The time frame for learning
to take place was never specified. An appropriate measuring model is necessary to
determine the extent to which learning has taken place. In other words, the Kirkpatrick model
lacks longitudinal considerations.
Measuring changes in learning through data collection as prescribed by Kirkpatrick
(in absolute terms) offered no value in predicting how well a person can perform the skills
attained from the training after a one-month period.
A positive attitude has no bearing on how well a person can perform a
trained task after a month. A reaction evaluation that shows a positive attitude
has no direct linkage to performance.
However, individual self-efficacy did not decrease over time. Empirical studies have
shown that self-efficacy correlates with actual performance (Kraiger, Ford & Salas,
1993). One might look at the possibility of measuring self-efficacy instead of
reaction evaluation. In other words, self-efficacy offers more tangible results as
compared to reaction evaluation.
The failure of the Kirkpatrick model in Level 3 and Level 4 evaluation is due to the lack of a
defined framework and of specific tools appropriate for measuring transfer of learning
since its first introduction 40 years ago. It is necessary, at the most basic level, to have a body
of case studies from which generalizations can be drawn and thus hypotheses formed.
However, this body of information has not been published (Bramley & Kitson, 1994).
The issue here is whether or not the knowledge taught during training is being transferred or
demonstrated by the trainees on the job. The transfer component of training evaluation was
examined by Olsen (1998) in a study conducted in 1996. Transfer is evidence of whether
what has been learned is actually being used on the job for which it was intended.
The survey asked questions regarding how Kirkpatrick's 4-level evaluations were performed,
what percentage of payroll was spent on training, how much training was actually transferred
to the job and what specific items would enhance the level of transfer. A content analysis
was carried out on the 138 survey comments received on how the respondents made estimates
of the percentage of transfer value they reported. Follow up interviews were also undertaken
to provide additional clarification on responses and record impressions and opinions about
the data collection. The results showed that the percentage of transfer depended on the type
of training. Technical training showed the best rate of transfer, whereas soft skills
(interpersonal) do not transfer as readily and are not easily observed. Transfer is not so
readily apparent in the effective work areas (Olsen, 1998).
Bramley (1996) offered an explanation of why evaluation is not being carried out at the
behaviour and results levels. Traditionally, most trainers use individual and educational
models of the training process. The process has its limitations, as the emphasis is on encouraging
individuals to learn something rather than to find uses (if any) for the learning.
1.5 Future Research
Bramley and Kitson (1994) argued that the problems of evaluation at Levels 3 and 4 were not
well understood because not enough evaluation of this kind has been carried out. This is due
to the fact that effective measurement methods for Levels 3 and 4 are not available and the
amount of work in setting up the criteria for measuring these two levels is time-consuming. It
is apparent that the incompleteness of the Kirkpatrick model lies in its Levels 3 and 4 of
evaluation.
1.5.1 The Transfer Component
The transfer component is a potential area for future research. Transfer of training can be
defined as 'the application of knowledge, skills and attitudes learned from training on the job
and subsequent maintenance of them over a certain period of time' (Baldwin & Ford, 1988;
Xiao, 1996). This process does not appear to have received much attention since most
organizations were apparently looking primarily at Levels 1 and 2 evaluations. Early studies
lacked a theoretical framework to guide these investigations (Baldwin & Ford, 1988).
A survey conducted by Cheng and Ho (1998) revealed that there were inconsistent findings
on the variables that promised positive training transfer. The main intention of further
research is to develop common variables that are critical to different training and transfer
situations, including the establishment of common scales or instruments that can be used in
different research settings.
The current approach, which uses variables such as individual ability, motivation and
environmental favourability, has shown a profound effect on training transfer research (Noe &
Schmitt, 1986). However, this approach raises the question of application. This is because
individual differences (e.g. self-efficacy and locus of control) are expected to exert
considerable influence on transfer outcome (Cheng & Ho, 1998).
A longitudinal study would be a better way of measuring the effectiveness of transfer of
learning. It is argued that trainees who show similar levels of transfer performance after a
short period of training may differ substantially in the long run (Kraiger, Ford & Salas, 1993).
Therefore, another major aspect of transfer research is to examine the level of newly acquired
knowledge, skills or behaviour retained in the transfer settings after a longer period of time.
For example, research should record the changes in terms of levels of skill proficiency as a
function of time after training.
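A record of proficiency as a function of time could be kept along the lines of the minimal sketch below. It is illustrative only: the follow-up schedule in weeks, the 0-100 proficiency scale, the sample data and the names `retention_curve` and `retained_fraction` are assumptions made for this example and are not drawn from the studies cited. The sketch shows how two trainees with the same score shortly after training can diverge at later follow-ups.

```python
from statistics import mean


def retention_curve(records):
    """Return (weeks_after_training, mean_score) pairs ordered by time.

    records: iterable of (trainee_id, weeks_after_training, score) tuples.
    The 0-100 score scale is a hypothetical convention for this sketch.
    """
    by_week = {}
    for _trainee, weeks, score in records:
        by_week.setdefault(weeks, []).append(score)
    return [(weeks, mean(scores)) for weeks, scores in sorted(by_week.items())]


def retained_fraction(records, trainee_id):
    """Last recorded score divided by first recorded score for one trainee."""
    own = sorted((w, s) for t, w, s in records if t == trainee_id)
    if not own or own[0][1] == 0:
        raise ValueError("no usable baseline score for this trainee")
    return own[-1][1] / own[0][1]


# Hypothetical data: trainees A and B score the same one week after
# training but differ substantially at 12 and 26 weeks.
records = [
    ("A", 1, 80), ("B", 1, 80),
    ("A", 12, 76), ("B", 12, 60),
    ("A", 26, 72), ("B", 26, 40),
]

curve = retention_curve(records)
fraction_a = retained_fraction(records, "A")
fraction_b = retained_fraction(records, "B")
```

With these made-up figures the group mean declines at each follow-up, while the per-trainee fractions show that the decline is concentrated in trainee B. A short-term measurement alone would not distinguish the two trainees.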
1.5.2 Evaluating Beyond the 4 Levels
In considering the above studies, an effective evaluation should measure beyond
reaction, learning, behaviour and results. Lewis and Thornhill (1994) suggested that an
effective training evaluation needs to be integrated and matched to the culture of the
organization. This integrated, culturally related approach is advocated because it would
minimize the risk of not meeting the objectives of carrying out training at the input
stages as well as evaluating reactions and impact at the outcome stage. This brings a more
strategic approach to identifying and prioritizing training needs in relation to
organizational objectives.
To justify the training evaluation results, we may consider Brinkerhoff's (1987) criticism of
the Kirkpatrick model, which concentrates only on the outcome of training. This is further
supported by Bernthal (1995), who found it necessary to look for a broader linkage between
training and the organizational context. Bernthal (1995) introduced the training-impact tree
method for measuring organizational context. This is done by listing the barriers to training and
the factors that facilitate training next to their associated values and practices, which are
aligned with the organization's objectives.
Although the Kirkpatrick model focuses on the attainment of tangible outcomes, it is important to
note that the question of measuring intangible outcomes that are related to training
effectiveness must not be ignored. Kirkpatrick (1994) revisited his 4-level evaluation model
and stated that as long as the evidence collected is beyond a reasonable doubt, one should be
satisfied with the evidence. Perhaps an experienced training practitioner may want to explore
the possibility of integrating the absolute 4-level evaluation model with other process models.
As a result, the gap that exists in short- and long-term measures of training evaluation
may be minimized. Future research may build on deriving an integrated model that
would complement both absolute and process evaluation of training effectiveness.
1.5.3 Incorporating Competence-based Approach into Training Evaluation
The aim of future research is to develop a comprehensive training evaluation by
incorporating the absolute Kirkpatrick model with the competence-based process. The
competence-based assessment system could be used in collecting sufficient evidence to
determine whether individuals are performing competently in their jobs.
Strebler, Robinson and Heron (1997) identified two different meanings of the term
competency: behaviours that an individual needs to perform a job, and
minimum standards of performance. The term competency has been used to refer to
both behaviours and performance standards. Competence-based assessment
helps to provide a behaviourist framework for learning in training evaluation. A
behaviourist approach to learning provides simpler tasks for the trainer and clarity of
outcome for the learner (Hoffmann, 1999). Another definition of competencies is the quality
of outcome which may be used to evaluate gains in productivity or efficiency in the
workplace as a result of training (Strebler et al., 1997).
Further research by Sternberg and Kolligian (1990) defined competency as the underlying
attributes of a person such as their knowledge, skills or abilities. The use of this definition
created a focus on the inputs required of individuals in order for them to produce competent
performances. This is aligned with the traditional training evaluation approach of measuring
knowledge, skills and abilities of a person after training. Rowe (1995) suggested that
competence-based assessment which looks at evaluating the whole process of learning should
consist of:-
Objective: The trainer should exhibit clear learning objectives and methods for
obtaining those objectives.
Evidence: Evidence must be provided to indicate competent performance.
Observation: An assessor looks out for competent performance.
Peers' Comments: Comments are obtained from work colleagues, peers and customers.
The key point is that a competence-based model supplements knowledge-based
achievements. Programs will be designed by permitting competence-based models to build
on knowledge-based achievement. In this way knowledge supports work, learning supports
skill and theory supports practice (Rowe, 1995).
The competence-based method would be able to assess whether knowledge and skills learned
are being effectively applied in the workplace and whether the trainee can now be described
as competent after completion of a training program.
This integrated model could also be used prior to designing a training program in order to
establish development needs and to determine training program content.
1.5.4 Multi-Rater Feedback System in Training Evaluation
There does not appear to be a distinct individual who founded or invented this process and
according to Moses, Hollenbeck and Sorcher (1993), the term multi-rater feedback is
misleading as it suggests a newly discovered concept, whereas they argue that perceptions of
people have been available as long as there have been people to observe them.
Nowack (1993) presents a useful summary of some of the reasons for the increased use of
multi-rater feedback in organizations:
The need for a cost-effective alternative to assessment centers;
The increasing availability of assessment software capable of summarizing data from
multiple sources into customized feedback reports;
The need for continuous measurement of improvement efforts;
The need for job-related feedback for employees affected by career plateauing; and
The need to maximize employee potential in the face of technological change,
competitive challenges and increased workforce diversity.
From the organizational perspective, multi-rater feedback can be used solely for
developmental purposes. Romano (1994) and Atwater et al. (1993) found that the most
common use is in the area of training and development. The overall net effect of training and
development should enhance organizational performance.
From the individual perspective, the feedback is invaluable because it comes from numerous
sources, providing multiple perspectives and opinions. Each opinion and perspective may
provide relevant yet different feedback (Atwater et al., 1993; Hazucha et al., 1993; Tornow,
1993). This form of feedback can increase the reliability, fairness and acceptance of the data
by the person being rated (London, Wohlers & Gallagher, 1990). This occurs because the
feedback is received from multiple sources and not just from one rater.
One of the advantages of using multi-rater feedback is that it provides the opportunity for
individuals who are being assessed to compare their self perceptions against the perceptions
of others regarding their behaviour (Rosti & Shipper, 1998).
The difference in perspective between the rater and the ratee is not treated as an error but is a
source of information which can enhance personal learning. Ratees can learn from the
discrepancy between self rating and the rating of others.
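The self-versus-others comparison can be illustrated with a minimal sketch. Everything specific in it is an assumption made for the example: the 1-5 rating scale, the behaviour labels, the sample ratings and the function name `self_other_gaps` do not come from the studies cited. The sketch computes, for each behaviour, the ratee's self rating minus the mean rating given by others, so that the gap is reported as information rather than treated as error.

```python
from statistics import mean


def self_other_gaps(self_ratings, other_ratings):
    """Return {behaviour: self rating minus mean of others' ratings}.

    self_ratings: {behaviour: rating the ratee gave themselves}
    other_ratings: {behaviour: [ratings from supervisors, peers, subordinates]}
    A positive gap means the ratee rates themselves higher than others do.
    """
    gaps = {}
    for behaviour, own in self_ratings.items():
        others = other_ratings.get(behaviour)
        if not others:
            continue  # no observer data, so no gap can be computed
        gaps[behaviour] = own - mean(others)
    return gaps


# Hypothetical ratings on an assumed 1-5 scale.
self_ratings = {"delegation": 4, "listening": 3, "planning": 5}
other_ratings = {"delegation": [2, 3, 4], "listening": [4, 4, 4]}

gaps = self_other_gaps(self_ratings, other_ratings)
```

In this made-up case the ratee over-rates their delegation and under-rates their listening relative to observers, and "planning" is omitted because no observer rated it. Either kind of gap is a possible starting point for the learning described above.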
The use of multi-rater feedback provides a natural method for both enhancing learning of the
participants and improving the evaluation process. Feedback is seen as a critical element in
effecting change (Bennis, Benne & Chin, 1969). Multi-rater feedback could serve
as an unfreezing process in Lewin's (1948) model of change. This would enhance the ratee's
learning by creating doubts about the ratee's current performance standard and providing an
opportunity for prospective development. Most training evaluation models emphasize the
absolute outcome of training. However, multi-rater feedback involves a change process
in which the resultant behaviour involves reinforcement of past performance and also provides
an opening for future learning. Thus, collecting multi-rater feedback before and after training
will enhance learning and provide at least part of the data needed to evaluate training.
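A before-and-after comparison of this kind might be tabulated as in the sketch below. The rating scale, the behaviour labels, the sample data and the function name `rating_change` are hypothetical and serve only to illustrate the idea. The sketch reports the change in the mean observer rating for each behaviour rated at both points in time.

```python
from statistics import mean


def rating_change(before, after):
    """Return {behaviour: mean rating after training minus mean before}.

    before, after: {behaviour: [ratings from multiple raters]}.
    Only behaviours rated at both points in time are compared.
    """
    common = before.keys() & after.keys()
    return {b: mean(after[b]) - mean(before[b]) for b in sorted(common)}


# Hypothetical observer ratings on an assumed 1-5 scale.
before = {"coaching": [2, 3, 3, 4], "feedback": [3, 3, 3, 3]}
after = {"coaching": [3, 4, 4, 5], "feedback": [3, 3, 3, 3], "planning": [4, 4]}

change = rating_change(before, after)
```

A raw difference of this sort is only part of the evidence. Without a comparison group it cannot by itself show that the training, rather than some other factor, produced the change.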
Moses et al. (1993) provide the following criticisms of multi-rater feedback:
It relies on generalized traits as there is a limited or non-existent frame of reference
for making rater/observer judgments.
It is based on an individual's memory, which can often yield incomplete descriptions of
past performance.
The observer may be unable to interpret behaviours.
It relies on the instrument designers' scoring system, factor analysis or data collection
methods to interpret the information for the participant.
The main argument of Moses et al. (1993) is that multi-rater feedback is based on other
people's observations and that such observations are often incomplete descriptions of past
performance because the observer does not know what to look for. The unresolved issue is
what behaviours to study. Multi-rater feedback has been used to identify the behaviour of
effective management. There is a lack of sufficient definitional detail to study managerial
proficiency or the effectiveness of training (Morrison & McCall, 1978; Schriesheim & Kerr,
1977). Yukl (1994) argued that further refinement of these constructs is needed by identifying
specific skills which make up each construct. Hence, developing the constructs and
establishing their validity is important prior to training.
Multi-rater feedback has been found to be widely used in managerial and leadership
development programs (Cacioppe, 1998; Cacioppe & Albrecht, 2000; Garavan, Morley &
Flynn, 1997; McCauley & Moxley, 1996; Thach, 2002). However, its usage in other
fields needs further research and exploration. This is further supported by Rosti and Shipper
(1998) in their study on the impact of training in a management development program based
on multi-rater feedback.
1.6 Conclusion
It is widely acknowledged that the Kirkpatrick evaluation model has provided the most
basic thinking on training evaluation throughout this decade. However, Kirkpatrick's
4-level evaluation model appears to be incompletely applied by
industry. No significant success has been identified from the use of the 4-level evaluation
model by the majority of organizations that have conducted training evaluations.
Based on this literature review, it may be concluded that the Kirkpatrick model has not reached a
stage of clarity that allows in-depth training evaluation to be carried out. The model provides
training managers with a systematic idea of what training evaluation is;
however, training measurement methods were not well explored or detailed.
While training has been conceptualized as a continually evolving process, the existing
literature appears to have failed to provide adequate strategies for organizations wanting to
evaluate the immediate, as well as the long-term, effectiveness and value of their training
efforts.
At face value, the literature shows that the full Kirkpatrick evaluation strategy is being widely
applied; however, more detailed analysis found that none were able to demonstrate Level 4
evaluation and of those who claimed evaluation at Levels 2 or 3, none were able to
demonstrate a systematic approach to the problem.
Arguably, the dilemma in adopting Kirkpatrick's taxonomy as a comprehensive and
integrated approach to evaluation lies in both the qualitative and quantitative attempts that
may or may not provide good phenomenological studies. Further analysis of the method
shows considerable confusion as to what is, or is not, a valid indicator for evaluation. Clearly,
there has been little change in terms of level of confidence towards the reliability of training
evaluation, notwithstanding greater emphasis on this key organizational development
process.
The weaknesses of the Kirkpatrick model have created opportunities for future research on
incorporating competencies and the multi-rater feedback approach into the long-term evaluation
of training.
These weaknesses have also opened up opportunities for further research on the transfer of
learning, especially studies of its longitudinal and application effects.
1.7 References for Paper One
Alliger, G.M. & Janek, E.A. 1989, 'Kirkpatrick's levels of training criteria: thirty years
later', Personnel Psychology, vol. 42, pp. 331-342.
Alliger, G.M., Tannenbaum, S.I., Bennett, W., Traver, H. & Shotland, A. 1997, 'A meta-
analysis of the relations among training criteria', Personnel Psychology, vol. 50, pp.
341-358.
Atwater, L., Roush, P. & Fishthal, A. 1993, The Impact of Upward Feedback on Self and
Follower Ratings of Leaders, Centre for Creative Leadership, New York.
Baldwin, T.T. & Ford, J.K. 1988, 'Transfer of training: a review and directions for future
research', Personnel Psychology, vol. 41, pp. 63-105.
Basadur, M., Graen, G.B. & Scandura, T.A. 1986, 'Training effects on attitudes toward
divergent thinking among manufacturing engineers', Journal of Applied Psychology,
vol. 71, pp. 612-617.
Bennis, W.G., Benne, K.D. & Chin, R.1969, The Planning of Change, 2nd edn, Holt,
Rinehart & Winston, New York.
Bernthal, P.R. 1995, 'Education that goes the distance', Training and Development, vol. 49,
no. 9, pp. 41.
Blanchard, P.N. & Thacker, J.W. 1999, Effective Training, Systems, Strategies and
Practices, Prentice Hall Publisher, New Jersey.
Blanchard, P.N., Thacker, J.W. & Way, S.A. 2000, 'Training evaluation: perspectives and
evidence from Canada', International Journal of Training and Development, vol. 4,
no.4, pp. 295-303.
Boulmetis, J. & Dutwin, P. 2000, The ABCs of Evaluation: Timeless Techniques for
Program and Project Managers, Jossey-Bass Publisher, San Francisco.
Boyle, P.G. & Jahns, I. 1970, 'Program development and evaluation' in Handbook of adult
education, eds Smith, R.M., Aker, G.F. & Kidd, J.E., Macmillan Company, New
York, pp. 70.
Bramley, P. & Kitson, B. 1994, 'Evaluating training against business criteria', Journal of
European Industrial Training, vol. 18, no.1, pp. 10-14.
Bramley, P. 1996, Evaluating Training Effectiveness, McGraw-Hill, Maidenhead and New
York.
Brandenburg, D. 1982, 'Training evaluation: what is the current status?' Training and
Development Journal, pp. 14-19.
Bretz, R.D. & Thompsett, R.E. 1992, 'Comparing traditional and integrative learning
methods in organizational training programs', Journal of Applied Psychology, vol. 77,
pp. 941-951.
Brinkerhoff, R.O. 1987, Achieving results from training, Jossey-Bass Publisher, San
Francisco.
Brinkerhoff, R.O. 1988, 'An integral evaluation model for human resource development',
Training and Development Journal, vol. 42, no. 2, pp. 66-68.
Brown, K.G., Werner, M.N., Johnson, L.A. & Dunne, J.T. 1999, Formative evaluation in
Industrial/Organization Psychology: further attempts to broaden training evaluation,
presented at a symposium on training evaluation: advances and new directions for
research and practice, Society of Industrial and Organizational Psychology, Atlanta.
Cacioppe, R. 1998, 'An integrated model and approach for the design of effective
leadership development programs', Leadership and Organization Development
Journal, vol. 19, no. 1, pp. 44-53.
Cacioppe, R. & Albrecht, S. 2000, 'Using 360-degree feedback and the integral model to
develop leadership and management skills', Leadership and Organization
Development Journal, vol. 21, no. 8, pp. 390-404.
Campbell, J.P. 1988, Training Design for Performance Improvement, in Productivity in
Organizations, eds Campbell, J.P. & Campbell, R.J., Jossey-Bass Publisher, San
Francisco.
Cascio, W.F. 1989, Using utility analysis to assess training outcomes, in Training and
Development in Organizations, ed. I.L. Goldstein, Jossey-Bass, San Francisco.
Cervero, R.M. 1988, Effective Continuing Education for Professionals, Jossey-Bass
Publisher, San Francisco.
Campion, M.A. & Campion, J.E. 1987, 'Evaluation of an interview skills training
program in a natural field setting', Personnel Psychology, vol. 40, no. 4, pp. 675-91.
Chen, H.T. & Rossi, P.H. 1992, Using Theory to Improve Program and Policy
Evaluations, Greenwood Press, Westport, CT.
Cheng, E. & Ho, D. 1998, 'The effects of some attitudinal and organizational factors on
transfer outcome', Journal of Managerial Psychology, vol. 13, no. 5/6, pp. 309-317.
Clegg, W.H. 1987, 'Management training evaluation: an update', Training and Development
Connolly, M.S. 1988, 'Integrating evaluation, design and implementation', Training and
Development Journal, vol. 42, no. 2, pp.20-23.
Constable, J. & McCormick, R. 1987, The Making of British Managers, BIM, CBI, London.
Davidove, A.E. & Schroeder, P.A. 1992, 'Demonstrating ROI of training' Training and
Davis, B.L. & Mount, M.K. 1984, 'Effectiveness of performance appraisal training using
computer assisted instruction and behaviour modeling', Personnel Psychology, vol.
37, pp. 439-452.
Dawson, R.P. 1993, Model of evaluations of equal opportunities training in local
government with special reference to women, unpublished PhD thesis, South Bank
University, London.
Dionne, P. 1996, 'The evaluation of training activities: a complex issue involving
different stakes', Human Resource Development Quarterly, vol. 7, pp. 279-86.
Dyer, S. 1994, 'Kirkpatrick's mirror', Journal of European Industrial Training, vol. 18,
no. 5, pp. 31-32.
Eisner, E.W. 1997, The Enlightened Eye: Qualitative Inquiry and the Enhancement of
Educational Practice, 2nd edn., Merrill, New York.
Garavaglia, L.P. 1993, 'How to ensure transfer of training', Training & Development
Journal, vol. 47, no. 10, pp. 63-68.
Garavan, T.N., Morley, M. & Flynn, M. 1997, '360-degree feedback: its role in employee
development', Journal of Management Development, vol. 16, no.2, pp. 134-147.
Geber, B. 1995, 'Does your training make a difference? Prove it!', Training and
Development Journal, vol. 3, pp. 27-34.
Goldstein, L.I. 1986, Training in Organizations: Needs Assessment, Development and
Education, Cole Publishing Company, California.
Goldstein, L.I. & Ford, J.K. 2002, Training in Organizations: Needs Assessment,
Development and Evaluation, Thomson Learning, Wadsworth, Canada.
Grove, E.A. & Ostroff, C. 1990, Program evaluation, in Developing Human Resources, eds
Wexley, K. & Hinnicks, J., BNA Books, Washington D.C.
Hamblin, A.C. 1974, Evaluation and Control of Training, McGraw-Hill Publisher, New
York.
Hazucha, J.F., Hezlett, S.A. & Schneider, R.J. 1993, 'The impact of 360-degree feedback on
management skills development', Human Resource Management, vol. 32, pp. 325-
351.
HMSO 1989, Training in Britain: A Study of Funding, Activity and Attitudes, Her
Majesty's Stationery Office, London.
Hoffmann, T. 1999, 'The meanings of competency', Journal of European Industrial
Training, vol. 23, no. 6, pp. 275-285.
Holton, E.F. III 1996, 'The flawed four-level evaluation model', Human Resource
Development Quarterly, vol. 7, pp. 5-21.
Junaidah, H. 2001, 'Training evaluation: clients' roles', Journal of European Industrial
Kirkpatrick, D.L. 1959a, 'Techniques for evaluating training programs: part 1 - reaction',
Journal of American Society for Training and Developing, vol. 13, pp. 3-9.
Kirkpatrick, D.L. 1959b, 'Techniques for evaluating training programs: part 2 - learning',
Journal of American Society for Training and Developing, vol. 13, no. 12, pp. 21-26.
Kirkpatrick, D.L. 1960a, 'Techniques for evaluating training programs: part 3- behaviour',
Kirkpatrick, D.L. 1960b, 'Techniques for evaluating training programs: part 4 - results',
Kirkpatrick, D.L. 1976, Evaluation of Training, Training and Development Handbook: A
guide to human resource development, 2nd edn, Craig, R.L.O., McGraw-Hill
Publisher, New York.
Kirkpatrick, D.L. 1979, 'Techniques for evaluating training programs', Training and
Kirkpatrick, D.L. 1994, Evaluating Training Programs: The Four Levels, Berrett-Koehler
Publishers, San Francisco.
Kirkpatrick, D.L. 1996a, 'Great ideas revisited', Training and Development Journal,
vol. January, pp. 54-59.
Kirkpatrick, D.L. 1996b, 'Invited reaction: reaction to Holton article', Human Resource
Development Quarterly, vol. 7, pp. 23-24.
Kirkpatrick, D.L. 1998, Evaluating Training Programs: The Four Levels, Berrett-
Koehler Publishers, San Francisco.
Kraiger, K., Ford, J.K. & Salas, E. 1993, 'Application of cognitive, skill-based and affective
theories of learning outcomes to new methods of training evaluations', Journal of
Applied Psychology, vol. 78, no. 2, pp. 311-328.
Legge, K. 1984, Evaluating Planned Organizational Change, Academic Press, London.
Lewin, K. 1948, Resolving social conflicts, Harper & Bros Publishers, New York, NY.
Lewis, P. & Thornhill, A. 1994, 'The evaluation of training: an organizational culture
approach', Journal of European Industrial Training, vol. 18, no. 8, pp. 25-32.
London, M., Wohlers, A.J. & Gallagher, P. 1990, '360-degree feedback surveys: a source of
feedback to guide management development', Journal of Management Development,
vol. 9, pp. 17-31.
Love, A.J. 1991, Internal Evaluation: Building Organizations From Within, Sage
Publication, California, CA.
Madaus, G.F., Scriven, M.S. & Stufflebeam, D.L. 1986, Evaluation Models: Viewpoints
on Educational and Human Services Evaluation, Kluwer-Nijhoff Publishing, Boston.
Mann, S. & Robertson, I. T. 1996, 'What should training evaluation evaluate?'
Journal of European Industrial Training, vol. 20, no. 9, pp. 14-20.
Mathieu, J.E. & Leonard, R.L. Jr. 1987, 'Applying utility concepts to a training program in
supervisory skills: a time-based approach', Academy of Management Journal, vol.
30, pp. 316-335.
Mathews, B.P., Ueno, A., Kekale, T., Repka, M., Pereira, Z.L. & Silva, G. 2001, 'Quality
training: needs and evaluation-findings from a European survey', Total Quality
Management, vol. 12, no. 4, pp. 483-490.
McCauley, C.D. & Moxley, R.S. Jr. 1996, Developmental 360: How Feedback Can Make
Managers More Effective, Jossey-Bass Publisher, San Francisco.
Morrison, A.M. & McCall, J.D. 1978, Feedback to Managers: A Comprehensive
Review of Twenty-four Instruments, Centre for Creative Leadership, Greensboro, NC.
Morrow, C.C., Jarrett, M.Q. & Rupinski, M.T. 1997, 'An investigation of the effect and
economic utility of corporate-wide training', Personnel Psychology, vol. 50, pp. 91-
119.
Moses, J., Hollenbeck, G.P. & Sorcher, M. 1993, 'Other people's expectations', Human
Resource Management, vol. 32, Summer Fall.
Murk, P., Barrett, A. & Atchade, P. 2000, 'Diagnostic techniques for training and
education: strategies for marketing and economic development', Journal of
Workplace Learning, vol. 12, no. 7, pp. 296-306.
Noe, R.A. & Schmitt, N. 1986, 'The influence of trainee attitudes on training
effectiveness: test of a model', Personnel Psychology, vol. 39, pp. 497-523.
Noe, R.A. 2000, Employee Training and Development, McGraw-Hill Publisher, New York.
Nowack, K. 1993, '360-degree feedback: the whole story', Training and Development
Newstrom, J.W. 1978, 'The problem of incomplete evaluation of training', Training and
O'Leary, V.E. 1972, 'The Hawthorne effect in reverse: effects of training and practice on
individual and group performance', Journal of Applied Psychology, vol. 56, pp. 491-
494.
Olsen, J. H. Jr. 1998, 'The evaluation and enhancement of training transfer', International
Journal of Training and Development, vol. 2, no. 1, pp. 61-75.
Parlette, M. & Hamilton, D. 1977, 'Evaluation as a new approach to the study of innovative
programmes', in Beyond the Numbers Game, eds Hamilton, D. et al., Macmillan,
London.
Phillips, J.J. 1991, Handbook of Training Evaluation and Measurement Methods, Gulf
Publishing Company, Houston, TX.
Phillips, J.J. 2002, Return on Investment in Training and Performance Improvement
Programs, 2nd edn, Butterworth-Heinemann, Woburn, MA.
Phillips, J.J. & Stone, R.D. 2002, How to Measure Training Results, A Practical Guide to
Tracking the Six Key Indicators, McGraw-Hill Publisher, New York.
Plant, R.A. & Ryan, R.J.1994, 'Who is evaluating training?', Journal of European Industrial
Popham, W. J. 1974, Evaluation in Education: Current Applications, Berkeley,
McCutchan, California.
Porter, L., & McKibbin, L. 1988, Future of Management Education and Development Drift
Or Thrust Into the 21st Century?, McGraw-Hill Publisher, New York.
Provus, M. 1971, Discrepancy Evaluation, Berkeley, McCutchan, California.
Rae, L. 1986, How to Measure Training Effectiveness, Gower Publications, Aldershot,
London.
Raphael, M. & Wagner, E. 1972, 'Training surveys surveyed', Training and Development
Journal, vol. 26, pp. 10-14.
Redshaw, B. 2001, 'Evaluating organizational effectiveness', Measuring Business
Excellence, vol. 5, no. 1, pp. 16-18.
Regalbutto, G.A. 1992, 'Targeting the bottom line', Training and Development Journal, vol.
46, no. 4, pp. 29-32.
Rivlin, A.M. 1971, Systematic Thinking for Social Action, Brookings Institution,
Washington.
Romano, C. 1994, 'Conquering the fear of feedback', Human Resource Focus, vol. 71, no. 3.
Rossi, P.H. & Freeman, H.E. 1993, Evaluation: A Systematic Approach, 5th edn, Sage
Publication, California.
Rosti, R.T. Jr. & Shipper, F. 1998, 'A study of the impact of training in a
management development program based on 360 feedback', Journal of Managerial
Psychology, vol. 13, no.1/2, pp. 77-89.
Rowe, C. 1995, 'Incorporating competence into the long term evaluation of training and
development', Industrial Commercial Training, vol. 27, no.2, pp. 3-9.
Salinger, R. & Deming, R. 1982, 'Practical strategies for evaluating education', Training and
Sauter, J. 1980, 'Purchasing public sector executive development', Training and
Schriesheim, C.A. & Kerr, S. 1977, 'Theories and measurement of leadership: a critical
appraisal of present and future directions', in Leadership: The Cutting Edge, eds
Hunt, J.G. & Larson L.L., Southern Illinois University Press, Carbondale, IL.
Scriven, M. 1991, Evaluation Thesaurus, Sage Publication, Newbury Park, California.
Shadish, W. R. & Epstein, R. 1987, 'Patterns of program evaluation practice among
members of the evaluation research society and evaluation network', Evaluation
Review, vol. 11, no. 5, pp. 555-590.
Shadish, W.R. & Reichardt, C.S. 1987, 'Evaluation studies', Evaluation Review, vol. 12, pp.
13-30.
Shireman, J.A.R. 1991, Utilization of program evaluation for decision making regarding
hospital based patient/client focused health education programs, doctoral dissertation,
University of Iowa, dissertation abstracts international, 52/12A, AA C9212928.
Smith, A.J. 1990, 'Evaluation of management training subjectivity and the individual',
Journal of European Industrial Training, vol. 14, no. 1, pp. 12-15.
Stake, R. 1977, 'Responsive evaluation', in Beyond the Numbers Game, eds Hamilton, D.,
Jenkins, D., King, C., MacDonald, B. & Parlett, H.M., Macmillan, London.
Steel, S. 1970, 'Program evaluation: a broader definition', Journal of Extension, vol. 13, pp.
13-20.
Sternberg, R. & Kolligian, J. Jr. 1990, Competence Considered, Yale University Press, New
Haven, CT.
Strebler, M., Robinson, D. & Heron, P. 1997, 'Getting the best out of your
competencies', Institute of Employment Studies, University of Sussex, Brighton.
Stufflebeam, D.L. 1971, Education Evaluation: Decision Making, by the PDK national study
committee on education, Itasca, Ill.: F.E. Peacock Publisher Inc, Boston.
Stufflebeam, D.L. 1983, 'The CIPP model for program evaluation', in Evaluation Models,
eds Madaus, G.F., Scriven, M.S. & Stufflebeam, D.L., Kluwer-Nijhoff Publishing,
Boston, pp. 117-141.
Stufflebeam, D.L. & Shinkfield, J.A. 1985, Systematic evaluation, Kluwer Nijhoff
Publishing, Boston.
Swanson, R.A. & Holton, E.F. 1999, Results: How to Assess Performance, Learning And
Perceptions in Organizations, Berrett-Koehler Publishers, San Francisco.
Tesoro, F. 1998, 'Implementing an ROI measurement process at Dell Computer',
Performance Improvement Quarterly, vol. 11, pp. 103-114.
Thach, E.C. 2002, 'The impact of executive coaching and 360-feedback on leadership
effectiveness', Leadership and Organization Development Journal, vol. 23, no. 4,
pp. 205-214.
Toplis, J. 1993, 'Training evaluation reflections on the first steps', European Work
Organization Psychology, vol. 2, no. 2, pp. 146-152.
Tornow, W.W. 1993, 'Perceptions or reality, is multiple-perspective measurement a
means or an end?', Human Resource Management, vol. 32, no. 2 & 3, pp. 209-408.
Tyler, R.W. 1949, Basic Principle of Curriculum and Instruction, University of Chicago
Press, Chicago.
Tyler, R.W. 2002, 'Evaluating evaluations', Human Resource Magazine, vol. June, pp. 85-
93.
Warr, P. & Bunce, K. 1995, 'Employee age and voluntary development activity',
International Journal of Training and Development, vol. 2, pp. 190-204.
Warr, P., Allan, C. & Birdi, K. 1999, 'Predicting three levels of training outcome', Journal
of Occupational and Organizational Psychology, vol. 72, pp. 351-375.
Wexley, K.N. & Baldwin, T.T. 1986, 'Post-training strategies for facilitating positive
transfer: an empirical exploration', Personnel Psychology, vol. 29, pp. 503-520.
Wholey, J.S., Hatry, H.P. & Newcomer, K.E. 1994, Handbook of Practical Program
Evaluation, Jossey-Bass Publisher, San Francisco.
Xiao, J. 1996, 'The relationship between organizational factors and the transfer of
training in the electronics industry in Shenzhen, China', Human Resource
Development Quarterly, vol. 7, no. 1, pp. 55-73.
Yukl, G.A. 1994, Leadership in Organizations, 2nd edn, Englewood Cliffs, Prentice Hall
Publisher, New Jersey.
Research Paper 2
EVALUATING TRAINING EFFECTIVENESS:
AN EMPIRICAL STUDY OF KIRKPATRICK
MODEL OF EVALUATION IN THE
MALAYSIAN TRAINING ENVIRONMENT FOR
THE MANUFACTURING SECTOR
Lim Guan Chong
University of Hull
2.1 Abstract
This research adopted an empirical approach to track the history, rationale, objectives and the
implementation of training evaluation initiatives in Malaysia's manufacturing sector. Since
the establishment of the Human Resource Development Fund, training activities in Malaysia
have increased. The majority of Malaysian organizations that conduct training are doubtful
about how training activities could add value to organizational performance and justify
their training investment. This research provides an understanding of training evaluation
culture within the Malaysian manufacturing sector and the effectiveness of Kirkpatrick's
4-level evaluation model as applied to the Malaysian manufacturing sector.
2.2 Introduction
The Malaysian government is committed towards education, training and human resource
development. The government recognizes the importance of human resource development in
its quest for achieving a fully developed nation status. This commitment has translated into
the establishment and growth of the training practice in the country.
Previously the sole provider of training, the government has adopted the policy of
involving private enterprises in all aspects of training. Meeting training needs has become
vital to the development of capital-intensive and value added industries. Apart from
44
involving enterprise to make training more market-driven, there is a need for enterprise to
share the burden of training. In the Seventh Malaysia Plan, the private sector was expected to
play a more active role in upgrading the qualification and skill of its workers (Junaidah,
2001).
2.3 Training Practices in Malaysia
Training activities within Malaysian companies lag behind those in countries such as Singapore, Japan
and Korea. Training activities in Malaysia are mainly conducted by large multinational
companies. The International Labour Organization's study in 1997 showed that Malaysia
ranked 12th in terms of providing in-company training (Junaidah, 2001).
The Malaysian government passed a new Act of Parliament entitled Human Resources
Development Act in 1992, to encourage and stimulate the private sector to introduce training
and development for its employees (HRDC, 1992). The objective of this Act is to set aside
accumulated funds to promote training activities within the organization. Under this Act,
companies with more than 50 employees will have to contribute 1 percent of their total staff's
monthly salary to the Ministry of Human Resources through the Human Resources
Development Council (HRDC). The fund, known as the Human Resources Development
Fund (HRDF), was launched in January 1993. The government set up the HRDC to manage
this fund by identifying the systematic training needs and approving relevant training
programs required by organizations. The levy is partially refunded under special schemes
known as Training Aid Scheme and Approved Training Program (ATP) Scheme to the
respective organizations once the training program is completed. The policy lays down the
parameters for a Human Resource oriented development strategy that is designed to mobilize
national effort to increase technological capabilities and competitiveness as well as create
a highly skilled, productive, disciplined and efficient workforce. This strategy would aid
Malaysia's transition into an industrialized economy. Private sector companies are also
expected to enhance their training activities by utilizing the HRDF and participating in skill
development programs run by the state governments (MEPU, 1996). Since the establishment
of the HRDC, how has the Malaysian manufacturing sector gained from the training
conducted? Information on how training benefits organizations would help the
Malaysian government to chart the progress and expected time frame needed for Malaysia to
transform into an industrialized economy.
The need to develop a highly trained workforce is evident from the increase of more than 200
management consulting and training institutions, professional associations and management
schools operating in Malaysia (Arthur Anderson & Co, 1991). The number of employees
who return to formal education and training has increased consistently since 1972 (Ahmad,
1998). The government set up the National Institute of Public Administration Malaysia
(INTAN) which is responsible for training government employees in administration and
management (Junaidah, 2001).
There are some real difficulties in assessing the full extent of skill development for
government training in Malaysia even after conducting evaluation (Mirza & Juhary, 1995).
Firstly, much of skill development takes place in the private sector. Most skills, even those
involving advanced manual skills, are acquired on the job. Secondly, skill development
during employment tends to be demand-driven (Pillai, 1994). Workers gain experience on
the job and upgrade their skills when they are exposed to a higher skill level. A study by
Pillai and Othman (1994) showed that the budget for training and education in Malaysia has
increased by 40 percent. Company emphasis has been on improving the quality of training to
help develop a competent labour force that improves the competitiveness of the industrial
sector in Malaysia. This new demand will force employers to further develop employee
competencies. Saiyadain (1995) found that as many as 82.6 percent of organizations
sponsored their managers for training, and on average these organizations spent 4.65 percent
of the managerial payroll on training managers. This suggests that the number of knowledge
workers and new knowledge-based opportunities can be expected to increase dramatically in the
next few years.
2.4 The Practice of Evaluation in Training
Although the methodology of evaluating training effectiveness may look fair, it could make it
difficult to express rational criticism. A survey by Wagel (1977) found that 75 percent of
companies have no formal method for evaluating training effectiveness. In a subsequent
survey by Easterby-Smith (1985), the results showed that out of 15 organizations with 320 to
300,000 employees, only one conducted some form of evaluation on a regular basis, which
was a post-course questionnaire. According to Rowe (1992), although every training manual
gives lip service to evaluation, it is notoriously difficult to carry out effectively. The
extensive survey by Plant and Ryan (1994) served to further underline the lack of widespread
sophistication in evaluation. They point to budget cutting and economic pressures as being
possible explanations. A recent study by Blanchard, Thacker and Way (2000) on 202
organizations in Canada reported that more than half of the organizations are not
comprehensively evaluating their training.
According to Carnevale and Schulz (1990), the American Society for Training and
Development (ASTD) research indicated that the most popular reasons for evaluation are to
gather information to help decision makers improve the training process and facilitate
participants' job performance. This explains why the outcome-based Kirkpatrick model is so
popularly used. Evaluation also helps measure the degree of improvement in application and
assesses how well the learner achieves the established goals (Attkinsson, Sorenson,
Hargreaves & Horowitz, 1978).
For the past 30 years the Kirkpatrick model has been considered the most prominent training
evaluation model (Bernthal, 1995). Phillips (1991) concluded that, out of more than 50
evaluation models available, the evaluation framework that most training practitioners use is
the Kirkpatrick model. It is easy to find firms that practice training evaluation. However,
most firms only conduct post-course evaluation using Kirkpatrick's Level 1 evaluation.
Another important purpose for training evaluation is to meet the accountability requirements
of funding groups or clients (Rossi & Freeman, 1993). The demand for accountability has
been the major impetus for program evaluation since the 1980s. Fiscal constraints have
increased the competition among companies' activities for available dollars and raised the
question of value for money from their activities (Ruthman & Mowbray, 1983).
Training evaluation is more than a set of empirical methods governed solely by the standards
of social science. Judgments on the quality of program evaluation must also be based on
criteria that are meaningful both to immediate users and the larger system in which the
program is embedded (Corday & Lipsey, 1986).
Phillips (1991) stated that when it comes to training evaluation, there still appears to be more
talk than action. In many organizations, training evaluation is either ignored or approached in
an unsystematic manner. Previous literature (Davidove & Schroeder, 1992; Shelton &
Alliger, 1993; Smith, 1990) demonstrated that training evaluation is unsystematic and based
on simple means. Gutek (1988) stated that there was little or no demand on the part of the
organization to seriously evaluate a training program. Most organizations evaluate their
training programs by emphasizing one or more levels of Kirkpatrick model (Chen & Rossi,
1992). The researchers, however, commented that evaluation knowledge found in the
literature is not being fully utilized in evaluation practices.
Admittedly it is difficult to completely ascertain a training program's effectiveness. What
works at a particular time at a particular training location with a group of participants may not
necessarily work as well when transferred to another time, setting and group (Junaidah,
2001).
Bramley and Kitson (1994) asserted that measuring learning is problematic because it is
difficult to design a reliable measuring instrument. There are also few people who possess
the necessary skills to evaluate training, so these skills are often not available. Grove
and Ostroff (1990) mentioned that training directors often do not possess the necessary skills
to conduct training evaluation. However, Bramley (1996) mentioned that the lack of training
evaluation skills could be due to the methodological weakness embedded within the
Kirkpatrick model of evaluation.
In addition to the unavailability of a reliable measuring instrument, Barron (1996)
commented that management does not demand evaluation because it
believes that training will be reflected in an employee's work performance. The research by
Smith and Piper (1990) supported this view and showed that trainers openly said, "We do just
what we are asked to do: deliver training. We do not do what we are not asked to do:
improve human performance in the workplace". Smith and Piper (1990) also mentioned this
as one of the reasons for providing training but not evaluation. The research found that their
clients did not request an evaluation. This could be the reason why training providers do
not evaluate their products.
Research by the ASTD in 1990 showed that most companies now conduct some form of
evaluation of their training programs. Practitioners tend to use different methodology and
approaches. In examining evaluation methods in business-education partnerships, Erickson
(1991) found that there is little standardization in the methodology. Shadish and Epstein
(1987) conducted a study to look at program evaluations among members of the Evaluation
Research Society and Evaluation Network. They found that practitioners had different
methodologies as well as different assumptions about evaluation. In their study, three patterns
of practices emerged from the evaluation practices which they labeled the academic pattern,
decision-driven pattern and the outcome pattern.
Heneman and Schwab (1986) stated that the evaluation of training programs in practice
differs from the theory and models in the literature. Many authors commented that
once participants leave the training setting, program providers seldom attempt to determine
the effect of their program. Indeed, the word evaluation raises all sorts of emotional defense
reactions. Such a response indicates a low level of commitment among training professionals
toward evaluation. Most of the time, the practices are informal, unsystematic and based on
one popular model. However, in the study by Junaidah (2001) on Malaysian training
evaluation practices, it was found that evaluation was moderately formal, comprehensive and
systematic but could be further improved. Nevertheless, it is uncertain whether this so-called
comprehensive approach to training evaluation is within the taxonomy of the Kirkpatrick
framework. Currently, there is little literature on the evaluation system within the Malaysian
context.
2.5 Training Evaluation Practices in Malaysia
Validation of training effectiveness and benefits of training and development programs have
gained importance in public and private sectors in Malaysia. The Malaysian government
places great emphasis on program evaluation and appointed two federal agencies to be
responsible for evaluation. They are the National Institute of Evaluation and the Evaluation
Unit at the Prime Minister's Department. This unit is responsible for evaluating special
governmental projects and programs (Maimunah, 1990). Another evaluating body is the
Publication and Consultancy Bureau which carries out evaluation for government training.
There are three types of evaluation process currently being practiced in the agency. The
formal training evaluation uses standard evaluation questionnaires and oral evaluation in the
form of informal discussions, while the informal evaluation is conducted during training
(Junaidah, 2001).
The reasons why Malaysian organizations do not evaluate training may lie in the inability to
develop relevant measuring tools or the difficulty in determining which performance
outcomes are attributed to training.
The rise in the awareness of training evaluation during the Malaysian economic downturn in
1997 has increased the pressure for organizations to justify the investment cost placed on
training (Junaidah, 2001). Organizations realized that training must be a worthwhile effort
and this raises the need for measuring training effectiveness. Evaluating training
effectiveness does not seem to be the culture of most organizations in Malaysia. Thousands
of training programs have been conducted in Malaysia since the rise of the HRDF (Mirza &
Juhary, 1995). However, effectiveness in terms of productivity, skills improvement, increase
in performance standards and return on investment is still unknown. Training should be
evaluated to learn the weaknesses of the training program. The selection criteria for
evaluation should be able to find out the improvement in the participants' work performance.
The need for greater quality management during the economic downturn forced Malaysian
companies to upgrade their current version of the International Organization for Standardization (ISO) standard to
ISO 9001:2000, which emphasizes documenting the training evaluation process.
Companies that pursued this latest version of ISO are required to justify their training efforts
and money spent by linking skill development with the quality philosophy of the company.
As organizations pursue the latest version of ISO, evaluating training ranks high among top
management as a means of justifying training investment (Junaidah, 2001). The opportunity
cost of foregoing training commitment has become extremely high. More than ever, training
evaluation must demonstrate improved performance and financial results. As the investment
spent on training is costly, it is understandable why top managers wish to see value for
money and demand justification for training cost. Training providers need to show clients that
they are getting good returns on their investment in training. This demand for accountability
has been the major impetus for training in the past few years (Junaidah, 2001).
Most organizations in Malaysia have sufficient training facilities. Most managers are
sponsored to attend training programs on production, general management and human
resources management for an average of 2 days (Mirza & Juhary, 1995). On average
organizations spend 4.65 percent of the managerial payroll on training (Saiyadain, 1995).
The measurement of training effectiveness varies from organization to organization. A few
organizations have developed systematic plans to follow up on training. The top
management's attitude towards training has been identified as a critical factor in effective
operationalization of training (Mirza & Juhary, 1995). In organizations where the top and
middle management have been perceived to be supportive, training seems to have contributed
to the overall growth. But how far the evaluation process has been conducted to prove the
growth is still questionable. In order to improve the overall effectiveness of training, all
organizations should undertake training evaluation effectively. As mentioned by Brinkerhoff
(1988), training needs to adopt evaluations and measuring systems that can improve the
feedback mechanism in order to build their response capacity. A system of pre-course
evaluation followed by post-course evaluation may help in setting relevant expectations for
improvement.
A serious gap in the Malaysian training context is the insufficient information on the number,
nature and content of training facilities in the country. The skill-level at which the output
would fit into the labour market is not known while the syllabus, duration and quality of
training vary from one agency to another. This is due to the lack of collaboration and
consultation between industry and training institutions. The quality of training is not up to the
mark. Trainees have theoretical knowledge but little practical experience (Pillai, 1994).
There has been limited study on training evaluation practices in Malaysia. A training
evaluation research by Shamsuddin (1995) was on the contextual factors associated with
evaluation practices of selected adult and continuing education providers in Malaysia.
According to him even though the management directed an evaluation to be conducted, it was
only for a narrow purpose. It was used to demonstrate program success by showing how
good the training was and how many people received the training, which is merely Level 1
evaluation. The wider purpose of program evaluation such as measuring the acquired
learning (Level 2), program impact (Level 3) and cost effectiveness (Level 4) was not the
management priority. According to Shamsuddin (1995), the clients were not aggressive
stakeholders who cared and demanded accountability from the training providers. Their
behaviour and characteristics did not push the training provider to examine the real effect of
the programs in terms of learning gain and program effectiveness.
Besides Shamsuddin's (1995) study, four other studies conducted locally included the
element of evaluation practice. The first study by Hamid, Mohd, Muhamad and Ismail
(1987) asked 235 organizations if management education in Malaysia significantly provides
candidates with a set of skills. Organizations found that 67.6 percent of management
programs offered by local universities and colleges are too theoretical. Out of 121
respondents, 60.3 percent indicated that training is important while the rest felt the contrary.
This study focused on reaction evaluation (Level 1) to study the participants' satisfaction
level towards the overall programs. Another study conducted by Asma (1994) examined the
design of training practices of four training providers in Malaysia and found that the
evaluation practiced by the trainers did not conform to any theory and most of the evaluations
used were ad hoc and informal.
Mirza and Juhary (1995) conducted a study on local and multinational organizations and
found that in the majority of these organizations, even though managers who return from training
may write a report, no formal systematic mechanism exists to assess how well they are
utilizing their training in the organizations. The research further found that participants were
only encouraged to apply learning at work, and organizations did not make the effort to find out what caused
the change. The result indicates that measuring training effectiveness
is not popularly practiced. Organizations feel that if learning does not take place, it would
show in the next appraisal report. Participants who have learned something should have
applied it, and it is therefore considered unnecessary to track changes in performance.
Mirza and Juhary's (1995) study also revealed that most organizations in Malaysia evaluate
training effectiveness on a superficial level. Some encourage their managers to try out new
ideas while others do not show the same kind of support. Unfortunately for most companies,
measuring training effectiveness may not be practiced organization-wide. This is because
measuring training effectiveness has never been a policy in most organizations. Lack of
support by most department heads is deterring most organizations from carrying out post-
training evaluation. Most organizations felt that if they had a more supportive top
management they could have established systems for measuring training effectiveness.
The most recent study was by Junaidah (2001) on training evaluation practices by training
institutions in Malaysia. The study showed moderately formal training evaluation practices
by Malaysian training practitioners. However, the researcher was uncertain whether these
training practitioners applied the taxonomy of Kirkpatrick model in training evaluation
practices.
Generally, training evaluation practices in Malaysia are either not done or if done, do not
follow any theory suggested in the literature. There is a paucity of detailed evidence of direct
causal links between investment in training and the resultant return in the form of increased
performance. Brandenburg (1982) suggested that training practitioners
tended not to conduct evaluation or, if they did, relied heavily on soft information
evaluation methods and did not disseminate the results widely. Pauzi (1985) felt that part of
the problem lies in the attitude of the top management who do not show full commitment to
the evaluation process.
A further study is needed to examine current training evaluation practices in Malaysia and to
understand recent developments in this practice. It is important to understand training effectiveness in
Malaysia, and it is worthwhile to analyze the training evaluation process as it has been carried out
in the country. This study would contribute to the existing body of knowledge as current
information on training evaluation is inadequate. Since a large number of professional
associations, private consultants and management schools in universities are organizing
training programs in Malaysia, the results of the study would indicate areas where training
evaluation could be practiced for different training programs.
2.6 Methodology of Study
The most recent surveys of training and evaluation practices in Malaysia were conducted by
Hamid et al. (1987), Asma (1994), Mirza and Juhary (1995), Shamsuddin (1995) and
Junaidah (2001). The dearth of published materials on training and development activities of
managers in Malaysia has prompted this study.
This explorative study was conducted to understand the evaluation culture and the
extensiveness of training evaluation practices in Malaysia. The lack of baseline information
prevented the evaluation of transfer learning. This prompted the use of an empirical approach in
this study. The study evaluates the perceptual effects on both management and
non-management levels of training programs in the manufacturing sector. The survey asked about the
level of training evaluation performed, the percentage of payroll spent on training, the
impediments to training and the percentage of training transferred to the job. Follow-up
interviews were also undertaken to provide additional clarification and interpretation of
responses and to enable impressions and opinions about the data to be recorded accurately.
2.6.1 Questionnaire Construction
A comprehensive survey of the literature was done to find out the degree of training
evaluation being conducted by training practitioners in Malaysia. The survey questions asked
the degree to which training evaluation practices were conducted in Malaysia based on
Kirkpatrick's 4 levels of evaluation (Kirkpatrick, 1959a, 1959b, 1960a, 1960b, 1976, 1979).
Examples of questions are:
Reaction: How did the participants react to the training?
Learning: What information and skills were gained?
Behavior: How have participants transferred knowledge and skills to their jobs?
Results: What effect has training had on the organization and achievement of its objectives?
The instrument was designed primarily based on the published work of Blanchard, Thacker
and Way (2000) with modification based on the Malaysian training environment. The
modifications to the Blanchard et al. questionnaire include rephrasing and simplifying
question structure to suit local linguistic understanding. Words which were ambiguous or
liable to be misunderstood were replaced. These modifications were applied in order to encourage a more
accurate response. Care was taken to ensure that simple and clear questions were used to
seek information on significant areas of training evaluation activity in Malaysia. The
questionnaire can be found in Table 4.
The questionnaire consists of 34 questions. There are 8 questions in Level 1, 5 in Level 2, 13
in Level 3 and 8 in Level 4. Level 3 was constructed with the most questions as it asked
about practices for measuring transfer learning. Practitioners could use a variety of
assessments to measure transfer learning, hence the survey questions ask for details of the practices
undertaken by practitioners.
The questions in the questionnaire were randomly sorted to avoid bias caused by the
order of the questions. The survey questions used a 5-point Likert scale to permit good scale
discrimination.
A panel of experts which consisted of training professionals from the Malaysia Institute of
Management was used to evaluate the items in the questionnaire. Extensive pilot testing was
undertaken by the training professionals to ensure that the questions were easily understood.
The internal consistency was determined using the Cronbach alpha method. The Cronbach
alpha coefficient is 0.8458.
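The internal consistency check described above can be illustrated with a minimal sketch of the standard Cronbach alpha formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total scores), where k is the number of items. The function name and the small response matrix below are hypothetical and serve only as illustration; they are not the study's data, and the study itself reports using SPSS rather than this code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Hypothetical helper for illustration; uses sample variances throughout.
    """
    k = len(responses[0])  # number of items in the instrument
    if k < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item (column) across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert responses (4 respondents x 3 items), not study data.
demo = [
    [4, 5, 4],
    [3, 3, 2],
    [5, 5, 5],
    [2, 3, 2],
]
alpha = cronbach_alpha(demo)
print(round(alpha, 3))
```

Values closer to 1 indicate that the items vary together across respondents. For the actual instrument, the matrix would hold one row per returned questionnaire and one column per question.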
2.6.2 The Sample and Sampling
To improve the effectiveness and efficiency in terms of time and resources, a purposeful
sampling technique was employed. The sample comprised manufacturing-based companies found
in the HRDC Directory. The HRDC Directory listed approximately 5000 organizations but
only 40 percent of those listed are manufacturing-based companies. The questionnaires
were sent to 2000 manufacturing-based companies with more than 50 employees. The
questionnaires were posted between December 2003 and January 2004. The questionnaires
were addressed to the Personnel and Human Resources Managers of the organizations. A
self-addressed stamped envelope was enclosed to maintain anonymity on the return of the
completed questionnaires through the postal service.
2.6.3 Questionnaire Response
The questionnaires were posted to 2000 manufacturing organizations in Malaysia found in
the HRDC Directory. The appeal highlighted the focus of the study, i.e. training evaluation
activities that relate to the benefits of training.
Of the 2000 questionnaires posted, 94 were returned with a note that the organizations had
closed down or had moved to a new address. This reduced the original sample of 2000 to
1906. Reminder notes were sent out three weeks after the first posting in order to encourage
a greater response rate. However, only 109 completed questionnaires were returned.
The overall lack of organizational response can be attributed to a variety of causes: low
interest, lack of time to respond, current restructuring of the organization, unavailable contact
person, and outdated addresses.
2.7 Findings and Discussion
Data was analysed using SPSS for XP Windows (Version 13). Statistical significance was
accepted at the 0.05 level of significance. A total of 5.5 percent of the questionnaires were
returned. Part 1 of the questionnaire gathered information on the background of the
companies. It was found that out of the 109 companies, 46 percent are multinational
companies while 54 percent are Malaysian companies. Part 2 of the questionnaire gathered
information on the organization's commitment to training. The results are shown in Table 1.
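The return rate can be checked against the counts reported in Section 2.6.3. The short sketch below uses only those reported figures (2000 posted, 94 undeliverable, 109 completed); the variable and function names are illustrative. It suggests that the quoted 5.5 percent corresponds to the 2000 questionnaires posted, while the rate against the reduced sample of 1906 would be slightly higher.

```python
posted = 2000        # questionnaires posted (Section 2.6.3)
undeliverable = 94   # returned because the organization had closed or moved
completed = 109      # completed questionnaires returned

# Sample remaining after removing undeliverable questionnaires.
effective_sample = posted - undeliverable


def pct(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


rate_on_posted = pct(completed, posted)
rate_on_effective = pct(completed, effective_sample)

print(f"Effective sample: {effective_sample}")
print(f"Return rate on all posted: {rate_on_posted:.2f} percent")
print(f"Return rate on effective sample: {rate_on_effective:.2f} percent")
```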
Table 1. Commitment to Training

Commitment to Training                                          Statistics (n = 109)

Does your organization conduct training programs                Yes = 100 percent
for employee development                                        No = 0 percent

Does your organization conduct training needs analysis before   Yes = 41.3 percent
conducting any training programs                                No = 58.7 percent
                                                                Multinational = 39
                                                                Malaysian companies = 6

What type of training is conducted by your organization

Management                                                      45.9 percent
e.g. leadership, supervisory, managing change,
communication, human relations and interpersonal skills

Organization Specific                                           18.9 percent
e.g. training programs related to policies, values, cultures,
goals and objectives of the whole organization

Technical                                                       64.3 percent
e.g. quality, productivity, product training, IT training,
accounting system and job related training

Personal Improvement                                            24.2 percent
e.g. motivation, time management, self development,
managing self, presentation skills and business
communication skills

Others                                                          0 percent

A total of 41.3 percent of organizations agreed that a training needs analysis was conducted
prior to conducting any training program. The rest of the organizations conduct training to
meet an immediate need of the organization, such as low productivity or a morale problem, or in reaction to
a crisis, and such training is frequently not coordinated with other functions of the organization. The lack of
baseline information prevents evaluation and no meaningful comparison of the participant's
performance before and after training can occur.
The results indicate that 64.3 percent of organizations organized technical training. A large
number of organizations felt the need to upgrade the technical competence of their employees
in the areas of quality, productivity, product training, IT training, accounting system and job
related training. Of all the organizations interviewed, 65 percent reported that they have
extended their range of products during the last two years and 88 percent had made changes
to machinery and equipment.
Management training was ranked second at 45.9 percent, followed by personal
improvement at 24.2 percent. One fifth of the organizations are also concerned with
management training. Many feel that skills such as leadership, supervision, managing
change, communication, human relations and interpersonal skills are needed for management
development. Although organization specific training is an emerging area, only about 18.9
percent of the organizations feel the need to impart training in this field.
Table 2 shows the level of evaluation conducted on management and non-management
training by the organization.

Table 2. Training Evaluation Practices in Organization

Training Evaluation Practices in Organization     Statistics (n = 109)
Level 1 reaction evaluation                       35 percent
Level 2 learning evaluation                       25 percent
Level 3 behavioural evaluation                    16.5 percent
Level 4 results evaluation                        11 percent
No training evaluation practices                  12.5 percent

The results indicate that out of 109 companies, 35 percent of organizations conducted Level 1
evaluation by measuring the participant's reactions towards the training program while 25
percent of the organizations conducted Level 2 evaluation by measuring the participant's
degree of learning as the result of the training initiatives. Only 16.5 percent of organizations
conducted Level 3 evaluation by measuring the changes in the participant's behaviour
towards the job after each training program. Only 11 percent of organizations
quantified the results of training and calculated the return on investment in training, which is
classified as Level 4 evaluation. The remaining 12.5 percent of organizations have never
conducted training evaluation after each training program. The results indicate that more
than half of the organizations do not evaluate their training at the behavioural or the results
levels. The reason for this is that the training function is sometimes seen as an isolated and
peripheral function, which is not truly integrated into the job setting (Olsen, 1998).
The means and standard deviations of the four levels of training evaluation for all 109
companies are shown in Table 3.
Table 3: Means and Standard Deviations of the Four Levels of Training Evaluation

Level       Mean ± SD
Level 1     3.63 ± 0.62
Level 2     3.41 ± 0.62
Level 3     3.26 ± 0.63
Level 4     2.99 ± 0.68

Note: Likert scale, where 5 = strongly agree; 4 = agree; 3 = neutral; 2 = disagree; 1 = strongly disagree
The mean for Level 1 evaluation is 3.63, which suggests that the majority
of organizations conduct Level 1 evaluation after each training program. The mean score for Level 2 evaluation is
3.41, which indicates that some companies conduct Level 2 evaluation selectively, mostly
for technical training. The mean for Level 3 evaluation is 3.26. The result
indicates that measuring behavioral changes on the job after training is not
popular among these manufacturing organizations. This could be due to the unavailability of
specific tools to measure the subjective changes in behavior. The mean score for Level 4
evaluation is 2.99, indicating that the majority of these manufacturing organizations do not
conduct results evaluation. This was further confirmed in a follow-up interview, in which it was
mentioned that the benefits of training are not easily measured in quantitative terms and most
benefits cannot be measured immediately.
The means and standard deviations of the 34 questions in the instrument for all 109
companies are shown in Table 4.
Table 4: Means and Standard Deviations of 34 Questions in the Instrument
Item                                                                  Mean  SD
Level 1 - Reaction Evaluation
1. Departmental heads conducted collective opinions from              4.12  0.689
   participants with regards to the training program conducted
2. Evaluate perceptions of participants on key benefits and value     3.03  0.934
   arising from training
3. Conduct training environmental audit to track participants         4.03  0.724
   satisfaction after training
4. Focus on perception of trainees towards the training program       4.38  0.862
5. Measure trainers competency and credibility after each             2.74  0.908
   training program
6. Most training programs conduct post course reaction                3.89  0.715
   evaluation after training
7. Always make an effort to ask participants whether they enjoy       4.20  0.815
   attending the training programs
8. Measure the accuracy of the training program in addressing         2.67  1.021
   the exact requirement of the job
Level 2 - Learning Evaluation
1. Allow participants to write down what they have learned            3.69  0.641
   which might be useful for their work
2. Conduct pen and paper test for measuring the amount of             4.28  0.703
   knowledge gained from a training program
3. Administer a test before and after training with regards to the    3.41  0.912
   knowledge gained from a training program
4. Identify the principles, facts and techniques learned by           2.98  1.090
   participants
5. Participants were asked if there were any barriers preventing      2.69  0.932
   them from using what they have learned
Level 3 - Behavioral Evaluation
1. Develop performance-based tests as part of the training            2.89  0.909
   evaluation
2. Assess the level of transfer of learning to the job                3.04  0.994
3. Measure the success rate of participants performing each item      3.23  0.089
   learned
4. Define an action plan for participants and evaluate the            3.43  0.745
   implementation success rate
5. Identify specific skill improvement as a result of a training      3.93  1.079
   program
6. Measure positive changes in personnel efficiency and               3.77  0.931
   effectiveness after training
7. Measure the behavior changes resulting from the training           3.51  1.099
   program
8. Organize the trainer's follow up session to track the              3.28  1.141
   participant's behavioral change after training
9. Use observation techniques to monitor changes of behavior          2.62  1.062
   and attitudes resulting from the training program
10. Conduct work performance evaluation in the workplace after        2.71  0.703
    training
11. Observing and documenting the practice of knowledge and           3.32  0.773
    skills learned by the trainee into the workplace
12. Assess the increase in knowledge and skills as well as attitude   2.84  0.842
    change of trainees
13. Conduct a preview session with your trainee to specify the        3.79  0.952
    expected objectives to achieve from the training
Level 4 - Results Evaluation
1. Measure the level of productivity before and after a training      2.56  0.721
   program
2. Link effectiveness of training to financial benefit                2.91  0.668
3. Conduct cost-benefit analysis on training programs conducted       3.10  0.711
4. Measuring the worthiness of attending training in terms of         3.35  0.823
   cost and time away from work
5. Measure the tangible cost in terms of reduced cost and             2.82  0.913
   improved quality after training
6. Calculate the cost of training and its impact towards              2.71  0.793
   organization improvements
7. Compare the cost of training program with benefits obtained        3.24  0.894
   from it
8. Finding evidence of direct links between training investment       3.18  0.615
   and returns from training
Note: Likert scale: where 5 = strongly agree; 4 = agree; 3 = neutral; 2 = disagree; 1 = strongly disagree
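As a consistency check, the level means reported in Table 3 can be approximately recovered by
averaging the item means listed in Table 4 for each level. The short Python sketch below does
this. It assumes that each level score is the unweighted average of its items, which the study
does not state explicitly, and the variable names are illustrative only.

```python
from statistics import mean

# Item means transcribed from Table 4, grouped by evaluation level.
item_means = {
    "Level 1": [4.12, 3.03, 4.03, 4.38, 2.74, 3.89, 4.20, 2.67],
    "Level 2": [3.69, 4.28, 3.41, 2.98, 2.69],
    "Level 3": [2.89, 3.04, 3.23, 3.43, 3.93, 3.77, 3.51,
                3.28, 2.62, 2.71, 3.32, 2.84, 3.79],
    "Level 4": [2.56, 2.91, 3.10, 3.35, 2.82, 2.71, 3.24, 3.18],
}

# Level means as reported in Table 3.
reported = {"Level 1": 3.63, "Level 2": 3.41, "Level 3": 3.26, "Level 4": 2.99}

# Assumption (not stated in the source): a level score is the unweighted
# average of the item means belonging to that level.
level_means = {level: mean(values) for level, values in item_means.items()}

for level, value in level_means.items():
    gap = value - reported[level]
    print(f"{level}: averaged = {value:.3f}, "
          f"reported = {reported[level]:.2f}, gap = {gap:+.3f}")
```

Under this assumption the averaged values agree with Table 3 to within about 0.01, a gap small
enough to be explained by rounding in the published item means.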
The results indicate that Level 1 evaluation (reaction) is the most widely practised form of
training evaluation. A high mean score of 4.38 indicates that the majority of Malaysian
manufacturing companies focus on the perception of trainees towards the training program.
Managers also play an active role in Level 1 evaluation by collecting opinions from
participants about the training program conducted (mean score 4.12). Measuring the accuracy
of a training program in addressing the exact requirement of the job is the least practised
item, as indicated by a low mean score of 2.67.
Among Level 2 practices, the use of pen and paper tests to measure the knowledge gained from
a training program is the most popular among these manufacturing companies, with a mean score
of 4.28; administering a test both before and after training scores lower, at 3.41. The
lowest mean score for Level 2 evaluation, 2.69, indicates that organizations seldom ask
participants whether there were any barriers preventing them from using what they have
learned.
Level 3 evaluation is modestly practised by manufacturing companies in Malaysia. The
highest mean score of 3.93 indicates that the majority of these manufacturing companies
identify specific skill improvement as a result of a training program. The use of
observation techniques to monitor changes of attitude and behaviour resulting from the
training program shows the lowest mean score of 2.62.
The apparent lack of practice in Level 4 evaluation (results) is probably due to the
additional effort and potential complexity involved. This is reflected in the survey results,
which indicate low interest in conducting cost-benefit analysis of training among these
organizations. Measuring the worthiness of attending training in terms of cost and time away
from work showed a mean score of 3.35 and is regarded as one of the most popular practices of
Level 4 evaluation among these organizations. Calculating the cost of training and its impact
towards organization improvements showed one of the lowest mean scores, 2.71; in Table 4 only
measuring the level of productivity before and after a training program scored lower, at 2.56.
Independent-samples t-tests were used to test for significant differences in the four levels
of training evaluation between multinational and Malaysian companies. Significant differences
were found at Level 1, Level 2, Level 3 and Level 4 between multinational companies (N = 50)
and Malaysian companies (N = 59) at p < 0.05 (see Table 5).
Table 5: Summary of t-tests of the four levels of training for multinational companies
and Malaysian companies
Company          Level 1        Level 2        Level 3        Level 4
                 (Mean ± SD)    (Mean ± SD)    (Mean ± SD)    (Mean ± SD)
Multinational    3.78 ± 0.52    3.69 ± 0.56    3.63 ± 0.48    3.50 ± 0.53
Malaysian        3.49 ± 0.67    3.17 ± 0.57    2.94 ± 0.56    2.56 ± 0.46
t-value          2.635 *        4.758 *        6.794 *        9.838 *
* p < 0.05
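The t-values in Table 5 can be approximately reproduced from the summary statistics alone. The
sketch below assumes a pooled-variance (equal variances) independent-samples t-test; the study
does not say which variant was used, and the function name `pooled_t` is illustrative. Because
the published means and standard deviations are rounded, the recomputed values are close to,
but not identical with, those in Table 5.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for an independent-samples t-test computed from
    summary statistics, assuming equal population variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df

# Group sizes reported in the text.
N_MULTINATIONAL, N_MALAYSIAN = 50, 59

# Table 5 rows: (multinational mean, SD, Malaysian mean, SD).
table5 = {
    "Level 1": (3.78, 0.52, 3.49, 0.67),
    "Level 2": (3.69, 0.56, 3.17, 0.57),
    "Level 3": (3.63, 0.48, 2.94, 0.56),
    "Level 4": (3.50, 0.53, 2.56, 0.46),
}

results = {}
for level, (m1, s1, m2, s2) in table5.items():
    t_value, df = pooled_t(m1, s1, N_MULTINATIONAL, m2, s2, N_MALAYSIAN)
    results[level] = (t_value, df)
    print(f"{level}: t = {t_value:.2f}, df = {df}")
```

With 107 degrees of freedom the two-tailed critical value at p = 0.05 is roughly 1.98, so all
four recomputed statistics remain significant, in line with the table.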
The results indicate that the majority of multinational companies operating in Malaysia have
a clearer objective of what ought to be done and have enshrined this in mission statements on
training. These multinational companies provide training and development for all employees
in all areas of operations with expensive investment and serious attempts to produce a
competent and quality workforce. The results show that multinational companies judge
training effectiveness as their immediate reaction to training evaluation. These multinational
companies applied formal and systematic procedures and processes to assess training
effectiveness as compared to Malaysian companies.
The results show that the majority of Malaysian companies did not conduct Level 3 and
Level 4 evaluations. Most Malaysian companies seem to lack the formal mechanism to
assess training effectiveness. The results of the t-tests were further confirmed by interviews
which suggested relatively mild commitment of top management to training and some
resistance by middle management to the function of training in Malaysian companies.
Training seems to be a low priority area and training evaluation is conducted on an ad hoc
basis. Part of this could be because identifying individual performance improvement after
training is regarded as a tedious and lengthy process.
Only six Malaysian companies conducted training needs analysis prior to conducting
training. This finding was further confirmed by an interview, in which it was mentioned that
managers at times send employees to training programs just to fill a quota. These employees
are not the intended participants of the training program and return without having learned
much. Since they have little commitment to learning, Level 2, Level 3 or Level 4 evaluation
cannot meaningfully take place. This trend is reflected in fewer Malaysian companies
practising Level 2, Level 3 and Level 4 evaluation as compared to multinational
companies.
2.8 Limitations of Study
The number of respondents was relatively low. Even though the majority of manufacturing
companies that have more than 50 employees are registered with the Human Resource
Development Council (HRDC), the actual number of organizations that actively participate in
training and development is rather low.
Out of the 2000 manufacturing-based organizations listed in the HRDC Directory, less than
40 percent conducted at least one training program in a year (HRDC, 2003). Details of
companies that do not participate in training were not disclosed by HRDC, because HRDC does
not want training providers to seek out organizations with high unused funds. Since the
organizations that do not conduct training programs could not be identified, the survey was
sent to all 2000 manufacturing-based organizations. Hence, the majority of these
organizations, which do not conduct training, could not answer the questionnaire.
The success of the study depended on the willingness of respondents to cooperate. Some may
not have seen the value in participation, while others may have viewed the topic as sensitive
or irrelevant. This remained a constraint even though reminder notes were sent out three weeks
after the first posting to encourage a greater response rate. A comparison between
respondents and non-respondents would have been helpful. Unfortunately, data were not
available for making such comparisons in this study.
2.9 Conclusion
The Kirkpatrick model is considered one of the most prominent models of evaluation
practised in Malaysia, yet its four levels of evaluation are not well adopted. This study
reveals that training evaluation carried out by most organizations in Malaysia is mainly used
to judge trainees' reactions. A culture of 'fill in one of these before you go' typically
pervades training evaluation. Most organizations lack formal and systematic mechanisms to
assess training effectiveness. Many companies remain blissfully unaware of how much they
spend on training and of whether it is effective or not. Indeed, even the use of
expensive external trainers does not appear to trigger detailed evaluations.
The majority of Malaysian organizations show little or no interest in conducting training
evaluation and have even less interest in using the results of evaluation as a method of
assessing effectiveness. Some find evaluation difficult, as it is almost impossible to
determine which participant efforts are attributable to training and which are not.
Although the Kirkpatrick model of evaluation serves as a framework for training outcomes,
most practitioners do not know what evaluation criteria to look for. This confusion over the
actual outcome possibly hinders the ability to conduct Level 3 and Level 4 evaluations
meaningfully.
Hence this research gap presents an opportunity to examine in detail the specific outcomes
required from training and the transfer component of training. This study will determine what
strategies might be most helpful in maximizing the transfer of learning and in constructing
an appropriate model for evaluation.
2.10 References for Paper Two
Ahmad, R.H. 1998, 'Educational development and reformation in Malaysia: past, present
and future' Journal of Educational Administration, vol. 36, no. 5, pp. 462-475.
Attkinson, C. C., Sorenson, J.E., Hargreaves, W.A. & Hororwitz, M.J. 1978,
Evaluation of Human Service Programs, Academic Press, London.
Arthur Anderson & Co. 1991, Professional Services in Malaysia, Arthur Anderson & Co.,
Kuala Lumpur, Malaysia.
Asma, A. 1994, Training design development: The practice of four development agencies in
Malaysia, Unpublished Ph. D. dissertation, University Pertanian Malaysia, Serdang.
Barron, T. 1996, 'A new wave in training funding', Training and Development
Bernthal, P. R. 1995, 'Evaluation that goes the distance', Training and Development,
vol. 49, no. 9, pp. 41-49.
Blanchard, P.N., Thacker, J.W. & Way, S.A. 2000, 'Training evaluation: perspectives and
evidence from Canada', International Journal of Training and Development, vol. 4,
no.4, pp. 295-303.
Bramley, P. 1996, Evaluating Training Effectiveness, McGraw-Hill, Maidenhead and New
York.
Bramley, P. & Kitson, B. 1994, 'Evaluating training against business criteria', Journal of
European Industrial Training, vol. 18, no.1, pp. 10-14.
Brandenburg, D.C. 1982, 'Training evaluation: what's the current status', Training and
Brinkerhoff, O.R. 1988, 'An integrated evaluation model of HRD', Training and
Carnevale, A. P. & Schulz, E.R. 1990, 'Return on investment: according to training',

Chen, H.T. & Rossi, P.H. 1992, Using Theory to Improve Program and
Policy Evaluations, Greenwood Press, Westport, CT.
Davidove, A.E. & Schroeder, P.A. 1992, 'Demonstrating ROI of training',

Erickson, M.R.C. 1991, Business-education partnerships: a study of evaluation
methods, Doctoral dissertation, the George Washington University, dissertation
abstracts international, vol. 52/07A, AAC9133008.
Easterby-Smith, M. 1985, 'Training course evaluation from an end to a means',
Personnel Management, vol. April, pp. 25-27.
Gutek, S.P. 1988, 'Training program evaluation: an investigation of perceptions and
practice in non-manufacturing business organizations', doctoral dissertation, Western
Michigan University, Kalamazoo, MI, dissertation abstracts international, vol. 49/05a,
AAC8811388.
Groove, E.A. & Ostroff, C. 1990, 'Program evaluation', in Developing Human Resource, eds
Wexley, K. & Himicks, J., BNA Books, Washington D.C.
Heneman, H.G. & Schurab, D.P. 1986, Human Resource Management, Irwin, Illinois.
Hamblin, A.C. 1974, Evaluation and Control of Training, McGraw-Hill, New York.
Hamid, A.A., Mohd, S., Muhamad, A.H. & Ismail, Z. 1987, Management Education in
Malaysia, in Developing managers in Asia, eds Tan Jing Hee & You Poh Seng,
Addison-Wesley, Singapore.
Human Resource Development Council 1992, Human Resource Development Act 1992,
Ministry of Human Resource, Kuala Lumpur, Malaysia.
Human Resource Development Council 2003, Ministry of Human Resource, Kuala Lumpur,
Malaysia.
Junaidah, H. 2001, 'Training evaluation: clients' role', Journal of European Industrial

Journal of American Society for Training and Developing, vol. 13, pp. 3-9,
Kirkpatrick, D.L. 1960a, 'Techniques for evaluating training programs: part 3- behaviour',
Kirkpatrick, D.L. 1960b, 'Techniques for evaluating training programs: part 4 - results',
Kirkpatrick, D.L. 1976, Evaluation of Training, Training and Development Handbook: A
guide to human resource development, 2nd edn, Craig, R.L.O., McGraw-Hill
Publisher, New York.
Kirkpatrick, D.L. 1979, 'Techniques for evaluating training programs', Training and
Malaysia Economic Planning Unit 1996, Seventh Malaysia Plan: 1996-2000, Government
Printer, Kuala Lumpur, Malaysia.
Maimunah, I. 1990, Extension: Implication to Community Development, 2nd ed, Dewan
Bahasa and Pustaka, Kuala Lumpur, Malaysia
Mirza, S.S. & Juhary, H.A. 1995, Managerial training and development in Malaysia,
Malaysian Institute of Management, Malaysia.
Olsen, J.H. 1998, 'The evaluation and enhancement of training transfer', International
Journal of Training and Development, vol. 2, no. 1, pp 61-75.
Pauzi, M. 1985, 'Training nuisance, 12th ARTDO International Conference', Petaling Jaya,
Malaysia, 22-27 July.
Phillips, J.J. 1991, Handbook of Training Evaluation and Measurement Methods, Gulf
Publishing Company, Houston, TX.
Pillai, P. 1994, Industrial Training in Malaysia: Challenge and Response, ISIS Publication,
Setiakawan Printers Sdn Bhd, Malaysia.
Pillai, P. & Othman, R. 1994, 'Learning to work, working to learn', Institute of Strategic and
Institutional Studies, Kuala Lumpur.
Plant, R.A. & Ryan, R.J.1994, 'Who is evaluating training?', Journal of European Industrial
Rowe, C. 1992, 'How useful was it? The problem of evaluating in-house training programs',
Industrial and Commercial Training, vol. 24, no. 7, pp. 14-18.
Rossi, P.H. & Freeman, H.E. 1993, Evaluation: A Systematic Approach, 5th edn, Sage
Publications, California.
Ruthman, L. & Mowbray, G. 1983, Understanding Program Evaluation, Sage Publication,
London.
Saiyadain, M.S. 1995, 'Perceptions of sponsoring managers, training organizations, and top
management attitude toward training', Malaysian Management Review, vol. 30, no. 4,
pp. 69-74.
Shadish, W.R. & Epstein, R. 1987 'Patterns of program evaluation practice among members
of the evaluation research society and evaluation network', Evaluation Review, vol.
11, no. 5, pp. 555-590.
Shamsuddin, A. 1995, 'Contextual factors associated with evaluation practices of selected
adult and continuing education providers in Malaysia', unpublished PhD dissertation,
University of Georgia, Athens, GA.
Shelton, S. & Alliger, G. 1993, 'Who's afraid of level of evaluation?', Training and
Smith, A. 1990, 'Evaluation of management training subjectivity and the individual', Journal
of European Industrial Training, vol. 14, no. 1, pp. 12-15.
Smith, A.J. & Piper, J.A. 1990, 'The tailor-made training maze: a practitioner's guide to
evaluation', Journal of European Industrial Training, vol. 14, no. 8, pp. 2-24.
Wagel, H.W. 1977, 'Evaluating management development and training programmes',
Personnel Management, vol. 54, no. 4.
2.11 Appendix A The Questionnaire for Research Paper Two
This survey is about ....
Enormous resources of time, money, and energy are invested in every imaginable kind of
training and development program. Little effort is invested in discovering the how well those
process work, how they might be improved or, indeed, if they work at all. It is important for
organization that uses training and development activities to seek practical ways of
evaluating those activities.
With greater emphasis by the Ministry of Human Resources since the enactment of Human
Resources Development Act, 1992, there is a need to improve the effectiveness of training
activities in Malaysia in order to achieve greater productivity among the workforce.
However, effective evaluation requires the examination of training outcomes at several levels
of evaluation. This research study is designed to study to what extent the Malaysian
manufacturing sectors have carried out training evaluation and how these organizations have
benefited from the training event. The information you provide will help us better
understand the quality and effectiveness of training evaluation system that has so far being
carried out within the Malaysian context.
Because you are the one who can give us a correct picture of how you experience conducting
training evaluation, I wish to invite you to participate in this research study. The results
will be presented in an aggregate and untraceable manner.
If you have any enquiry about this research or the questionnaire, feel free to contact me, Lim
Guan Chong, at No. 54, Jalan SS2167, 47300 Petaling Jaya, Selangor Darul Ehsan, or my cell
phone 019-4781553, or my e-mail cipdlgc@tm.net.my. You can also contact my supervisors,
to verify this survey and my doctoral candidateship: Dr. Travis Kemp (e-mail:
travis@teleran.com.au) or Professor Dr. Leo Ann Mean (e-mail: drleo@itd.com.my).
Part 1: Tell us about your Organization
Name of organization:
Type of company:
Multinational
Malaysian companies
Nature of business:
Manufacturing
Service
Others, please specify
Part 2: Commitment to Training
Do your organization conduct training program (in house training program, public program
and on-the-job training) for employees development?
Yes
No
Do your organization conduct training needs analysis before conducting any training
programs?
Yes
No
What types of training programs conducted by your organization?
Management (e.g. leadership, supervisory, managing change, human relation and
interpersonal skills, communication)
Organizational specific (e.g: training programs related to whole organization policies, values,
culture, goals and objectives)
Technical (e.g. quality, productivity, product training, IT training, accounting system and job
related training)
Personal Improvement (e.g. motivation, time management, self development, managing self,
presentation skills and business communication skills)
Others. Please specify
Part 3: Training Evaluation Practices
Instructions: Please indicate your agreement and disagreement that truly represents the practice in
your organization on a scale of 5 (strongly agree), 4 (agree), 3 (neutral), 2 (disagree) to 1
(strongly disagree), to express your view.
Training Evaluation Practices (5 = Strongly Agree, 4 = Agree, 3 = Neutral, 2 = Disagree, 1 = Strongly Disagree)
1 Most training programs conduct post course reaction evaluation after training   5 4 3 2 1
2 Always make an effort to ask participants whether they enjoy attending the training programs   5 4 3 2 1
3 Departmental heads conducted collective opinions from participants with regards to the training program conducted.   5 4 3 2 1
4 Participants were asked if there were any barriers preventing them from using what they have learned   5 4 3 2 1
5 Allow participants to write down what they have learned which might be useful for their work   5 4 3 2 1
6 Define an action plan for participants and evaluate the implementation success rate   5 4 3 2 1
7 Conduct pen and paper test for measuring the amount of knowledge gained from a training program   5 4 3 2 1
8 Administer a test before and after training with regards to the knowledge gained from a training program.   5 4 3 2 1
9 Develop performance-based tests as part of the training evaluation   5 4 3 2 1
10 Identify specific skill improvement as a result of a training program   5 4 3 2 1
11 Measure positive changes in personnel efficiency and effectiveness after training   5 4 3 2 1
12 Measure the behaviour changes resulting from the training program   5 4 3 2 1
13 Conduct a preview session with your trainee to specify the expected objectives to achieve from the training   5 4 3 2 1
14 Organize the trainer's follow up session to track the participant's behavioural change after training   5 4 3 2 1
15 Measuring the worthiness of attending training in terms of cost and time away from work   5 4 3 2 1
16 Use observation techniques to monitor changes of behaviour and attitudes resulting from the training program.   5 4 3 2 1
17 Measure the level of productivity before and after a training program   5 4 3 2 1
18 Link effectiveness of training to financial benefit   5 4 3 2 1
19 Conduct cost-benefit analysis on training programs conducted   5 4 3 2 1
20 Evaluate perceptions of participants on key benefits and value arising from training   5 4 3 2 1
21 Identify the principles, facts and techniques learned by participants   5 4 3 2 1
22 Measure the tangible cost in terms of reduced costs and improved quality after training   5 4 3 2 1
23 Measure the accuracy of the training program in addressing the exact requirement of the job   5 4 3 2 1
24 Measure the success rate of participants performing each item learned   5 4 3 2 1
25 Conduct training environmental audit to track participants satisfaction after training   5 4 3 2 1
26 Measure productivity improvement after each training   5 4 3 2 1
27 Calculate the cost of training and its impact towards organization improvement   5 4 3 2 1
28 Conduct work performance evaluation in the workplace after training   5 4 3 2 1
29 Measure focus on perception of trainees towards the training program   5 4 3 2 1
30 Assess the increase in knowledge and skills as well as attitude change of trainees   5 4 3 2 1
31 Compare the cost of training program with benefits obtained from it   5 4 3 2 1
32 Observing and documenting the practice of knowledge and skills learned by the trainee into the workplace.   5 4 3 2 1
33 Assess the level of transfer of learning to the job   5 4 3 2 1
34 Measure trainers competency and credibility after each training program   5 4 3 2 1
35 Finding evidence of direct links between training investment and returns from training.   5 4 3 2 1
Thank you for your participation!
Research Paper 3
MULTI-RATER FEEDBACK FOR TRAINING

AND DEVELOPMENT: AN INTEGRATED
PERSPECTIVE
Lim Guan Chong

University of Hull

3.1 Abstract
This paper looks at the difference between success and failure of multi-rater feedback in
enhancing employees' self awareness and encouraging them to engage in development
programs. Multi-rater feedback is basically used as an unfreezing process in which
employees are motivated to rethink their behaviour and its impact on others. It provides
employees with good data from multiple perspectives, encouraging openness to listening and
to accepting their own weaknesses for development. A comprehensive integral model
encompassing process consultation and good conversation is used to facilitate effective
development after multi-rater feedback. Process consultation provides a prescriptive approach
to help employees recognize and accept responsibility for the difference in perception. The
flexibility of process consultation should be enhanced by integrating good conversation to
promote ideal communication and interaction between the process consultant and employees,
which will eventually build trust for open learning and development. Post multi-rater
feedback is introduced to assess the degree of performance improvement resulting from the
development program.
3.2 Introduction
Multi-rater or 360-degree feedback has gained wide acceptance and usage to support the
development of leadership and management skills (Cacioppe & Albrecht, 2000). The multi-rater
feedback process provides comprehensive feedback collected from the people around ratees in
the workplace. One underlying rationale for such a system is its potential impact on the
target ratee's self awareness: increasing self awareness is thought to enhance development
(Ashford, 1993; Mount, Judge, Scullen, Sytsma & Hezlett 1998). According to Van Veslor
et al. (1993), the number of multi-rater feedback instruments has increased significantly in
the past 15 years. It is estimated that American companies spent $152 million in 1992 on this
form of feedback for development (Hoffman, 1995). Multi-rater feedback was first
introduced to the UK in the early 1990s, and has spread quickly across public and private
sector organizations (Fletcher & Baldry, 2000). The spread is based on the perceived benefits
of a fairer and more accurate representation of performance, which creates development and
learning potential that can consequently motivate changes in behaviour. In a review of 20
organizations responding to the delivery of multi-rater feedback, London and Smither (1995)
found that 40 percent of the respondents always linked multi-rater feedback to specific
developmental activity. According to Moses, Hollenbeck and Sorcer (1993), there does not
appear to be a distinct individual who founded or invented the multi-rater feedback process.
They argued that multi-rater feedback has been mistaken for a newly discovered concept,
whereas perceptions of people have been available for as long as there were people around to
observe them.
3.3 The Use of Multi-rater Feedback
The implementation of multi-rater feedback varies among organizations. The widespread
adoption of multi-rater feedback and other multi-source feedback is based on the perceived
benefits of a fairer and more accurate representation of performance, because it offers a more
rounded assessment of the individual, not just the top-down perspective of conventional
appraisal. It is an empowering mechanism, which allows subordinates to exert some
influence over the way they are managed. The same is true for peers, who can give feedback
that helps improve a colleague's role performance as a team member. Moreover, the multi-rater
feedback system provides a natural method for both enhancing learning and improving
performance. As the complexity of job functions increases in the workplace, it is crucial for
employees to receive feedback from a variety of constituencies and not only through the
traditional superior-subordinate appraisal approach. This feedback facilitates self awareness
by enabling participants to compare their own perceptions of their skills and personal style
with the perceptions of important observers in their work environment. Multi-rater systems
are assumed to improve performance by increasing self awareness through diversified
information from multi-rater feedback (Borman, 1997). Ratees who receive feedback or
appraisal on their performance from a variety of sources are expected to take responsibility
for improving their current job performance, so that they continue to add value to the
organization and are prepared for the future.
Continuing development has been the key priority for most organizations in keeping the
workforce updated with on-going technological changes. Measuring and improving worker
performance has become increasingly important for organizations to stay competitive.
According to Nowack (1993), the increased use of multi-rater feedback in organizations was
mainly due to the increasing need for continuous measurement of improvement efforts; the
need for job-related feedback for employees affected by career plateauing; and the need to
maximize employee potential in the face of technological changes, competitive challenges
and increased workforce diversity.
Multi-rater feedback has also been seen to increase reliability, fairness and acceptance of the
data by the person being rated (London, Wohlers & Gallagher, 1990). This is because
feedback is received from multiple sources and not just from one. A study conducted in an
American company showed that only 3.9 percent of staff felt that feedback should come
solely from the superior, while 94.8 percent felt that feedback should come from both
superior and co-workers (Cacioppe & Albrecht, 2000). This result indicates that there is an
on-going trend in American companies to have performance feedback from multiple sources
despite the fact that there may always be variation in terms of perceptual differences between
the self and others on the feedback results. Although multi-rater feedback system provides
the ratee with greater information as a base for development, Tornow and London (1998)
suggested that multiple feedback sources require balancing, as multiple sources may
potentially offer conflicting viewpoints to the ratee. However, the differences in perspective
between the rater and the ratee should not be treated as an assessment error. This is further
supported by Ashford (1993) who found that multi-rater feedback is important to the ratee
because the information could further stimulate the ratee's cognitive reactions that would
likely influence subsequent behavioral changes. A multiple feedback system is a source
of information, which can enhance personal learning by providing the opportunity to ratees
who are being assessed to compare their self-perceptions against the perceptions of others
regarding their behaviour. Multi-rater feedback is simply a set of performance-related
information which is essential for learning and development.
Bennis, Benne and Chin (1969) are of the view that multi-rater feedback is a critical element in
effecting change in performance evaluation. According to Zemke and Zemke (1995), adults
undertake learning experiences when they see a need to acquire a new or different skill or
knowledge. Multi-rater feedback provides the opportunity for open communication between
rater and ratee to discuss the ratee's past behaviour and weaknesses, encouraging openness to
hearing and accepting feedback. Such feedback and open communication are instrumental in
an unfreezing process in which the ratee is motivated to rethink his or her weaknesses and
strengths (Shipper & John, 1992). McCauley and Moxley (1996) also viewed multi-rater
feedback as an instrument in an unfreezing process, in which ratees will have a chance to re-
think their previous and current behaviour based on the discrepancies of the results between
self and others and how their weaknesses would affect others. Hence, conducting
multi-rater feedback before and after training provides an avenue for the training provider to
evaluate performance changes. This shows that the feedback received can be used both as
reinforcement of past learning and also an opportunity for future learning (Rosti & Shipper,
1998).
Evidence from various settings has demonstrated an association between self awareness and
performance outcome (Fletcher, 1997). It has been found that high self awareness is related
to high performance ratings in various aspects (Atwater, Ostroff, Yammarino & Fleenor,
1998; Bass & Yammarino, 1991; Furnham & Stringfield, 1994). Nasby (1989) is of the
opinion that ratees with high self awareness are better able to integrate varied feedback into
self-perception in order to reach a higher performance outcome. Ashford (1984) found that
people with low self awareness are more likely to ignore or discount feedback about them
and will have a negative attitude towards work. This indicates that a highly self-aware ratee
is likely to exert self-positivism to accept feedback and show self-motivation for
improvement (Fletcher & Bailey, 2003).
However, according to Greguras, Ford and Brutus (2003), further research needs to be
conducted to investigate the effectiveness of multi-rater feedback systems in increasing one's
self awareness. This will lead to eventual improvement in one's performance, as ratees
may react differently to the different sources of information they receive. Information obtained
from different sources would have different effects on the ratee's self awareness. If the
feedback proves to be true, how would the ratee react to this perceptual reality? Conway and
Huffcutt (1997) commented that although different raters present different information, not
much research has explored how the ratee attends to, integrates, and uses the information
from the various raters. Although a multi-rater feedback system increases one's self awareness
and leads to further improvement, a further specific mechanism needs to be included
(Hazucha, Hezlett & Schneider, 1993; Reilly, Smither & Vasilopoulos, 1996; Walker &
Smither, 1999). The specific mechanism refers to identification of appropriate personality
traits, skills or competency needed by the ratees; establishing an appropriate feedback rating
approach and selection of relevant raters. This proposed mechanism must be embedded
within the feedback system prior to the implementation. This will increase ratee readiness in
accepting the multi-rater feedback and lead to individual self awareness for further
improvement. Many studies have been conducted to discover specific interventions of the
multi-rater feedback mechanism.
3.4 The Effectiveness of Multi-rater Feedback for Development
Fletcher and Bailey (2003) found that multi-rater feedback provides the opportunity for the
ratee and raters to agree on the level of competence that is needed. Church (1997) also
supports this view and suggested that multi-rater feedback provides both the rater and ratee
with the opportunity to agree on the development needs of a required performance standard,
competency and skills necessary for the ratee. Both the rater and ratee would have an
opportunity to clarify their respective expectations in order to develop a psychological
contract or agreement. The ratee would be more focused on what is needed by people working
around them and raters would have a better understanding of the ratee's strengths and
weaknesses. In certain multi-rater feedback systems this is known as a gap analysis process
and other multi-rater literature refers to it as congruence-d (Warr & Bourne, 1999). Edward
(1993, 1994) stated that d is the score difference between the ratee and other raters.
However, Fletcher and Bailey (2003) commented that telling a ratee the d score is of no use
unless the rater can provide specific and meaningful information to reduce the gap between
ratee's and raters' scores. Congruence-d is obtained by subtracting the average score of the
other raters from the self-rating for each feedback questionnaire item, and dividing that value
by the standard deviation of the raters' and ratee's scores (Warr & Bourne, 1999). The level of
self awareness is signified by the d score. If the d score is equal to 0, this signifies complete
agreement between self-ratings and others' ratings on all items. Studies of disagreement between
ratees' and raters' ratings have generally shown low correlations between rating sources. This shows
that different rater sources actually provide different information (Conway and Huffcutt,
1997; Harris and Schaubroeck, 1988). Ashford (1993) and Brutus, London and Martineau
(1999) conducted studies on the relative impact of different rater sources, which showed that raters have
different implications for the development of the ratee. The studies discovered that subordinate
ratings had the largest impact on goal selection, followed by peers and superior. This shows
that the selection of information from different rater sources is important for ratees to decide
which rater source is most qualified and which feedback is important for further improvement
(Kluger & DeNisi, 2000). Mount (1984) supports the validity of subordinate ratings and
indicated that the majority of ratees show approval for subordinate ratings for developmental
purposes (Bernardin, Dahmus & Redmon, 1993; Facteau et al., 1997, 1998; London et al.,
1990; McEvoy, 1990).
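The congruence-d calculation described above can be illustrated with a short sketch. It is a minimal illustration only and not the scoring procedure of any particular instrument: the function name congruence_d is hypothetical, and the sketch assumes that the standard deviation is taken over the pooled self and other ratings for a single item and that a population standard deviation is used. The source does not specify either point.

```python
from statistics import mean, pstdev


def congruence_d(self_rating, other_ratings):
    """Congruence-d for one questionnaire item (illustrative sketch).

    d = (self-rating - mean of the other raters' scores) / standard deviation.
    Assumption: the standard deviation is the population SD of all scores
    (self plus others) on this item.
    """
    if not other_ratings:
        raise ValueError("at least one other rating is required")
    all_scores = [self_rating, *other_ratings]
    spread = pstdev(all_scores)
    gap = self_rating - mean(other_ratings)
    if spread == 0:
        # All scores are identical, so self and others agree completely.
        return 0.0
    return gap / spread


# Hypothetical ratings on a 1-5 scale for a single item.
print(congruence_d(4, [4, 4, 4]))  # complete agreement gives d = 0
print(congruence_d(5, [3, 3]))     # self-rating above others gives d > 0
```

Under these assumptions a positive d indicates that the ratee rates himself or herself above the other raters, a negative d indicates the reverse, and values near 0 indicate close agreement between self and others.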
A further study by Greguras, Ford and Brutus (2003) of 213 managers used a policy-capturing
design that allowed factors (i.e. leading others, general administrative performance, building
working relationships and overall performance) to be manipulated. The study showed that
superior ratings would be weighted more heavily than peer or subordinate ratings for the
ability to lead others, general administrative performance, building working relationships and
overall performance. Ratees will attend more to peer ratings than subordinate ratings for
general administration of roles and responsibilities because peers are more likely to
understand the ratee's duties, which are similar to their own. However, a study by Atwater,
Roush and Fischthal (1995) showed that ratees attend more to subordinate ratings as
compared to peer ratings for the ability to lead others as subordinates have first-hand
experience with the ratee's leadership behaviour. Selection of feedback information is tied
closely to the ratee's perception. Needless to say, ratee development success is closely related
to the ratee's perception of the source of information. Users should consider whether multi-
rater feedback is best used for development or only provides different dimensions of reference for the
ratees.
3.5 The Effectiveness of Multi-rater Feedback for Appraisal
There are debates on the use of multi-rater feedback for appraisal and development (Bracken,
Dalton, Jako, McCauley & Pollman, 1997). According to London et al. (1990) and Antonioni
(1994), respondents will answer questions differently if it is for appraisal purposes. A study
by London and Smither (1995) showed that 40 percent of people who provided multi-rater
feedback ratings said they would have altered those ratings if the company planned to use
them for evaluation or appraisal. According to McEvoy and Buller (1987), ratees view the
process as most useful when used for development as opposed to appraisal. London and
Beatty (1993) found evidence to support this. They reported that 34 percent of the
respondents in their study would rate their superior differently if the feedback were shared
with their superior. Hence, there is still an element of fear for individuals to appraise their
superiors honestly. Further studies should be carried out to determine the capability of
multi-rater feedback that is used in performance appraisal. Few researchers agree that multi-
rater feedback is useful solely for developmental purposes as it is also widely used in managerial and
leadership development programs (Cacioppe, 1998; Cacioppe and Albrecht, 2000; Garavan,
Morley & Flynn, 1997; McCauley & Moxley, 1996; Thach, 2002). O'Reilly (1994)
suggested that when multi-rater feedback is used for development purposes, scores from
raters do not vary much. However, this was not the case for formal performance appraisals.
3.6 The Variation of Multi-rater Feedback Information
A study by Kluger and DeNisi (1996) on the effectiveness of multi-rater feedback
interventions showed that only one-third actually yielded positive improvements in
performance. There is an urgent need to take a closer look at the effectiveness of multi-rater
feedback in performance development. Feedback is invaluable to the ratee as it comes from
multiple sources, and provides multiple perspectives. Each opinion or perspective may
provide relevant yet different feedback for the ratee to focus upon (Atwater & Yammarino,
1993; Hazucha et al., 1993; Tornow, 1993). Ghorpade (2000) commented that having more
information does not necessarily mean a higher accuracy rate, and information provided by
just a superior is not necessarily partial. If the source does not have an opportunity to
observe the ratee's behaviour, or does not recognize the requirements of a particular
performance dimension, feedback from the source may be inaccurate for the ratee's
development. Therefore the quality of ratings from different sources for a particular
dimension should be assessed (Kluger & DeNisi, 1996). London and Smither (1995) stated
that ratings provided by different raters are likely to be inconsistent, which may create
much confusion and disagreement over the results and may not support future development.
According to Moses et al. (1993), multi-rater feedback relies solely on the instrument scoring
system or data collection methods to interpret the information for ratees. Moses et al. (1993)
argued that multi-rater feedback is based on people's observations and the observer may not
know what behaviour to look for. If the primary purpose of multi-rater feedback is to
identify developmental opportunities, then a set of competent performance behaviours has to
be identified and communicated to all raters prior to the process. This would enable the rater
to understand the required habits, behaviors or styles so that a proper and fair judgment
towards the ratee's performance is ensured. The rater's feedback is important and may have
an impact on the ratee's subsequent developmental priorities.
Factors that shape the rater's feedback, such as perception bias, cultural issues and gender, should
also be given special attention (Cacioppe & Albrecht, 2000). An example of perception bias is the
belief that a man will show better leadership than a woman. A study of three organizations, with a total of over
20,000 employees, showed that there was a positive correlation between performance and age
until the age of 45 (Cacioppe & Albrecht, 2000). The study indicates that raters are likely to
stereotype younger ratees as performing better than older ratees, or older ratees may have
better experience compared to younger ratees. However, by looking at the cultural
dimension, Leslie, Gryskiewicz and Dalton (1998) argued that multi-rater feedback might not
necessarily be well accepted by cultures in certain countries. Some cultures do not subscribe
to the same notion that feedback is valuable and can guide manager development. For
instance, cultures such as the French may place more value on lineage or social class than
developing managers. Different cultures may find it a shock to be asked personal
information regarding their superiors. American managers find it difficult to get those that
report directly to them to give negative feedback (Wilson et al., 1996). Another example is
the Asian value of face-saving, where a request for information needed for multi-rater feedback
may come across as offensive (Wilson et al., 1996). An organization that wishes to conduct
multi-rater feedback needs to take a closer look at the age, culture and gender of the raters and
ratees, as these may affect the effectiveness of the multi-rater feedback process.
Honey and Mumford (1982) reflected that in the event of self assessment, most managers are
poor reflectors. They prefer to charge on with new ideas rather than look backwards and
reflect on how things might have gone better. However, according to Waldman, Atwater and
Antonioni (1998), individuals who rated themselves higher are likely to have higher self-
esteem and self-concept. Disagreement over the result could be a threat to the ratee's self
esteem and weaken their motivation for further development. Special caution needs to be
taken in designing multi-rater feedback in order to minimize the potential of the ratee being
pessimistic and to ensure that the ratee's self-image is converted to productive behavioural
change (Wood, Allen, Pillenger & Kahn, 1999). The feedback process should be designed as
a tool to ensure effective interpretation of information received from multi-rater feedback to
stimulate individual and organization improvement in attaining strategic business objectives
(Heisler, 1996). Information from multi-rater feedback is mainly used for developing people
but increasingly, it is being used for strategic planning in training and development (Romano,
1994; Atwater et al., 1993). Research conducted with 48,000 participants indicated that
multi-rater feedback could successfully contribute to the effectiveness of training and
development (Cacioppe & Albrecht, 2000).
3.7 Multi-rater Feedback Practices in Malaysia
In the Malaysian training environment, multi-rater feedback could be used as one of the
assessment models for training and development. Alimo-Metcalfe (1998) commented that
multi-rater feedback should only be used in the context of assessment for development.
Payne (1998) supported the view that multi-rater feedback could be a potentially powerful
and even dangerous tool. Therefore it should be confined to the developmental arena and
used by people who know what they are doing.
Training evaluation and assessment practices in Malaysia are still considered to be at an
elementary stage. A study conducted by Zakaria and Rodzhan (1993) on 94 manufacturing
and service organizations in Malaysia found that only 44 percent of respondent organizations
conducted formal training. Of those who conducted formal training, 23 per cent did not
conduct any training needs assessment. The main reason was lack of expertise to perform
assessment. Among these respondents, the main source of information for training needs
assessment was the problems faced by their organizations. This evidence shows the lack of
attention given to transference of skills in training evaluation and feedback. Therefore, it is
wise to establish and instill the right approach to training and development as jobs today are
increasingly complex, and the traditional method of having a superior rate a subordinate's
performance is inadequate in giving quality information to improve performance and skills.
The training culture in Malaysia has been indirectly influenced by multinational companies
operating in Malaysia. This is supported by a survey by Wan Aziz (1994), which showed that the
majority of multinational companies operating in Malaysia brought in a training culture. A
survey by Zakaria and Rodzhan (1993) on 108 manufacturing companies suggested that
about 67 percent of the multinational companies interviewed conducted general and specific
training programs for all levels of staff. These multinational companies in Malaysia need to
conduct training because they require highly-skilled manpower who are able to operate new
and sophisticated machinery or research product improvement. A study by Wan Aziz (1994)
on 120 companies showed that 55.6 percent of Malaysian-owned companies conduct training.
This shows that Malaysian-owned companies are emulating the training culture of
multinational companies in order to cope with a challenging environment. Research by
Junaidah (1999) showed that Malaysian companies feel discouraged when undertaking
training, as they are not able to mark the progress of development after training. The main
reason may lie in their inability to see the tangible benefits of training (Saiyadain &
Juhary, 1995). The majority of Malaysian companies conduct training needs assessment on a general
basis. Zakaria and Rodzhan (1993) found that only 16 per cent of Malaysian companies
indicated that their training needs assessment was based on the strategic plan of the
organization. This indicates a lack of strategic orientation in the way training was conducted
in Malaysian companies. Components of training and development in an organization need
to cohere with one another in supporting organization strategy.
During the pre-training stage, a needs assessment is crucial in identifying relevant skills
needed by the individual to contribute to the strategic objectives of the company. Multi-rater
feedback would complement the needs analysis by providing the ratee with multi-source
feedback for further development. The ratee will be given an opportunity to understand their
strengths and weaknesses from a different source and focus on reducing weaknesses and
maintaining strengths. In Malaysia, training needs assessments are not conducted by measuring
individual skill deficiency but through general perceptions of a few top executives in the
whole department or organization. Organizations see training as an organizational need
rather than an individual need. Mirza and Juhary (1995) found that training organizations in
Malaysia offered training programs that were relevant to the needs of the organizations but
were too theoretical, one-shot with no follow-up, and not interactive. Organizations have
neither the professional competence nor the resources to identify training needs and mount
relevant training programs. Mirza and Juhary (1995) indicated that the stated flaws of
training could be attributed to the partial training culture brought by multinational companies
in Malaysia. They commented that assessment is difficult; it is almost impossible to
determine which employee weakness can be addressed by training. The culture of
conducting training evaluation among Malaysian companies was simply not popular or
encouraging.
According to June and Rozhan (2000), given no proper pre- and post-training evaluation, the
organization will be constrained in its ability to link training with strategic objectives. It
would be difficult for training and development to have a meaningful impact on
organizational effectiveness. Their study also provided the argument that multi-rater
assessment is not practiced by Malaysian-owned companies for development. If Malaysian
companies wish to conduct complete training and proper assessment, it is wise to use multi-
rater feedback in the training needs assessment so that organizations would be more focused
in the development process and able to measure its effectiveness.
Shipper and John (1992) found that multi-source information may be a mechanism for open
communication among diverse groups to establish a proper psychological contract and clarify
expectations. This is supported by Luthans and Farner's (2002) study, which used the Kirkpatrick
(1994) training evaluation framework integrated with multi-rater feedback on 409 expatriate
workers from 49 multinational companies to examine whether transfer of learning to the job was well
received. The mentioned training evaluation framework may be applied to local managers
who work in multinational companies in Malaysia and who are not clear about the cultures brought
in by expatriates and the expectations of their foreign counterparts within the company.
Therefore, multi-rater feedback, which has been described as a needs analysis process, will
clarify the ratee's expectations with the people working around them (Fletcher & Bailey, 2003).
Instilling multi-rater feedback as part of the pre-training needs analysis in Malaysia can
bring practical benefits for the organization by focusing on a particular behaviour or key
competency that is necessary for employee development. Employees will also have the
chance to audit self-perception against others through this self awareness mechanism, which
will result in higher work performance (Atwater et al., 1998; Bass & Yammarino, 1991;
Furnham & Stringfield, 1994). The information received from the multi-rater feedback
would impact on the targeted individual's self awareness and lead to the achievement of
agreed developmental needs (Fletcher & Bailey, 2003). Indeed, research has confirmed that
the use of multi-rater feedback is one of the best methods to promote ratees' self awareness of
their strengths and skill deficiencies (Hagberg, 1996; Rosti & Shipper, 1998; Shipper &
Dillard, 2000). Multi-rater feedback has been defined as an information gathering process
from relevant observers and is linked to specific business needs or objectives. Therefore, a
multi-rater feedback refers to the practice of providing an employee with perceptions of his or
her performance competencies from numerous sources (Cacioppe & Albrecht, 2000). By
reviewing different perceptions of their performance competencies, ratees can confirm their
strengths as well as identify their blind spots, habits, behaviours and styles, which may have
an adverse impact on others and their developmental priorities. This process helps a ratee to
focus on and develop performance competencies through a well-structured development
process. Waldman et al. (1998) were concerned about the lack of research examining the
effectiveness of multi-rater feedback on the performance developmental cycle.
3.8 Integrating Multi-rater Feedback with Development Tool
Organizations need to look at development as a continuous process by incorporating
a development model in the multi-rater feedback system. If the purpose of having multi-rater
feedback is not clear and not integrated with the developmental systems, it will come across
as a passing trend. This is shown by Judge and Cowell (1997) using executive coaching as a
development process after conducting multi-rater assessment. The study showed that the
combination of multi-rater feedback coupled with individual coaching as a developmental
process increased leadership development effectiveness by 60 percent. This was based on the
direct report and peer post-survey feedback. Another study by Heisler (1996a) used a
comprehensive combined model which integrated multi-rater feedback with the leadership
and management skills development process. This approach was applied to a sample of 304
superiors and more than 1000 subordinates. The result showed that the ratees felt an increase
of ownership towards their personal and professional development. Ratees reported
improved communication and interaction with their superiors, peers and subordinates
(Heisler, 1996a). Effective communication and interaction between raters and ratee will
reduce possible multi-rater feedback drawbacks.
The developmental process of multi-rater feedback involves a great deal of cognitive
complexity and acknowledgement of the validity and legitimacy of the feedback. It also
requires balancing multiple or conflicting perspectives and balancing a sense of self with the
larger context and role requirements. There should be some mechanism to address the
discrepancy between the ratee's and rater's feedback in order to make it into a coherent
developmental tool. The identified discrepancies can be used to assist ratees in developing
their personal action plan for development. Research is needed to clarify and validate the
most effective concept design to develop the ratee after multi-rater feedback.
3.9 Multi-rater Feedback: Process Consultation as a Development Tool
Process consultation is an ideal support tool for development (Schein, 1997). The process
consultation session should be conducted after multi-rater feedback so that it turns out to be a
very positive experience, regardless of discrepancies in the results. Process consultation is an
ongoing development system approach in which a skilled third party (the process consultant) works
with ratees and helps them learn about their competency gaps from the multi-rater feedback
process. The process consultant should emphasize the ratee's strengths and improve on
what the ratee does best, not what he or she does worst. In spite of this, if different world-views
arise between the process consultant and the ratee (client), the process consultant may use
non-directive techniques in order to help the client recognize and accept responsibility for the
deficiency (Hall, Otazo & Hollenbeck, 1999; Judge & Cowell, 1997; Thach & Heinselman,
1999).
Process consultation is based on the idea that ownership of the issues of concern remains with
the ratee, who has actively participated in defining the key issues resulting from multi-rater
feedback and formulating a solution that is culturally appropriate (Schein, 1987). The role of
the consultant revolves around facilitation and engaging in a helpful relationship with the
ratee, rather than simply being a provider of expertise. The process consultant's role is more
nondirective and questioning as he or she gets the groups to solve their own problems
(French & Bell, 1999). This approach increases the likelihood of confronting the most
pressing issues and helps the ratee benefit from problem-solving skills needed for ongoing
organizational change.
Schein (1987) commented that process consultation is not one single thing the process
consultant does but a set of paramount goals that the process consultant helps the ratee (client)
achieve in changing and resolving key issues of concern through different interventions. Although
information on the stages of change (Lovelady, 1989) and the focus of intervention
(Fagenson & Burke, 1990) is important, it reveals too little about the specific activities that
process consultants engage in, and the skills they need to accomplish them successfully.
Schein (1987) concluded that process consultants make interventions in the following order:
agenda setting, feedback of observations or other data, counseling and coaching, and
structural suggestions if any. During the process consultation, ratees (clients) who wish to
change their traditional practices and behaviors need to be given the opportunity to reflect on
a wide range of meaningful feedback. Without reflection, changing the ratee's behaviour or
performance is mere lip service. Process consultants will also be given the opportunity to
reflect their feelings, thoughts and perceptions on ratee's development. Through the
reflection process, process consultants will be able to evaluate the degree of reaction and
learning of the ratee. This is supported by Kolb's (1984) learning theory which states that an
individual will learn effectively if he or she is able to reflect on the feedback received.
It is important to take a closer look at a process consultant's actual intervention role which
involves intertwining events, issues, thoughts, emotions and human interactions. Schein
(1987) and Weisbord (1988) showed appreciation for the complex role and behavioural
repertoire required by the process consultant. Most research does not distinguish between the
different settings and contexts for consultancy practice (Chapman, 1998). The question arises
on whether the process consultant engages in different activities in their work within the
organization. If so, what particular skills are required for them to successfully develop a
ratee (client)? This question is of interest to many people including the process consultants
themselves. Chapman (1998) asserted that a successful facilitation process requires building
emotional ties between the process consultant and client through good communication and
interpersonal skills. According to Kirkpatrick (1959), adults must be motivated to learn.
Hence, through effective communication and interpersonal interactions, the development of
a psychological contract and emotional ties between both parties will motivate them to
participate in the development plan (Wolfe & Kolb, 1984).
3.10 Micro Perspective of Conversation Theory in Process Consultation
Pask's (1975) work in developing a human learning system through conversation theory may
be used to enhance facilitation between the process consultant and client. Conversation
theory is a framework for intervention analysis called the conversations model developed by
Ledington in 1989. The elements of the framework are individuals, groups or organizations
that formed. The framework is used to manage an intervention and the intervener is free to
construct a social group or community. Conversation is a means of knowledge acquisition
and is a process in gaining self-understanding and mutual understanding. It is also a way to
achieve predetermined objectives by using specific strategies (Navarro, 2001). The specific
strategies could be used in a fair manner by trying to genuinely convince the other
participants in a good conversation session.
The conversation model would be able to guide process consultants on how to go through
various intervention strategies so that the client is stimulated to tell his or her story with
minimal disruption of either the process or content. This can be done if every learning
conversation is followed by reflections by both the process consultant and client. Pask's
(1976a) conversation theory mentioned that reflection would bring about a desired emergent
behaviour which shows what the participants have learned and achieved, how the participants
have interacted interpersonally, and what the participants need to learn in the future.
Pask's (1976a) conversation theory contemplated the phenomenon of human learning as the
result of an emergent process of conversation such as linguistic interaction based on
conscious, conceptual resonance between several P-individuals. These P-individuals can be
distinct points of view within a biological individual, different biological individuals or even
specific groups of them. A P-individual is an effective participant in a conversation, which
connects with many P-individuals. He suggested the existence of a close relationship between
these two aspects: P-individuals as perspectives and P-individuals as participants.
According to Pask's (1976a) conversation theory, the process of communication can also be
considered as a P-individual: a strict conversation is a prototypical P-individual. There are
three different conceptual aspects coexisting in the idea of a P-individual: the concept of a
cognitive perspective, the concept of a participant (in a conversation) and that of a whole
conversation. The conversation is a P-individual, and so are the participants who converse
with each other (Pask, 1961).
Hence, good conversation is an important concept derived from Pask's conversation theory.
It forms the basis for effective process consultation by encouraging the process consultant and
his or her client to reproduce new behaviour through mutual information transfer and a network of
concepts. Pask (1975) used letter notation to explain his conversation
process: A (the process consultant) is conscious with B (the client), and both
commit themselves to some dependency or relationship T (the agreed course of action).
The commitment of A and B to T is sought because it supposedly leads to desired
outcomes. In an analogous manner, the performance of a client in a conversation will
potentially involve the whole personality and not just the epistemic resource. Pask (1975) did
not consider information and conversation as a pre-selection of interaction but as a
consequence of the emergence of new realities when a given system interacts with other
systems. This emergence is due to the synchronization effect of the two systems.
The creation of the learning context requires self awareness as well as a social context for
intentional interaction (Black & Mendenhall, 1990). The learning context can be facilitated
by developing good conversation between the process consultant and his or her client. Good
conversation creates a form of conversation between the process consultant and client where
norms of discourse are developed consensually, values and assumptions can be surfaced and
tested, and all voices can be heard (Schuurman & Veermans, 2001). Through good
conversation the process consultant can enhance transfer learning and proceed with the
development process for his or her client.
Good conversation creates a cycle of effective transaction between the process consultant and
the client, who come into conversation and learn from each other. Research on stereotyping has
found an association between the level of self-acceptance a client feels and the tendency to
stereotype or accept others (Adorno, Frenkel-Brunswik, Levinson & Sanford, 1950; Rubin,
1967). Therefore, we expect that as clients increasingly accept themselves, they are more
able to let go of their prejudices and stereotypes of the process consultant. When clients are
fully free to speak, and feel they are genuinely being heard, the affirmation they experience
enhances self-acceptance. This enables them to listen more completely and sustains the
synergistic cycle of being heard and experiencing increased self-acceptance (Rogers, 1970).
Through good conversation, the possibility of stereotyping is minimized, allowing the
process consultant to play his or her role more effectively.
3.11 An Integrated Approach for Post Multi-rater Feedback Development
Post multi-rater feedback development starts with a contact client, also known as the ratee,
with whom the process consultant meets concerning his or her performance deficiencies.
Whether or not that client admits to owning the performance deficiencies that are to be worked
on during development, the process consultant would not want to be prematurely
perceived as an expert. The process consultant would want the client to feel helped after a
few meetings. The client should feel that every conversation is helpful, especially during
early interactions.
According to Schein (1997), the process consultant and client have something to learn from
each other during the development process. He came up with eight general principles to
improve the flexibility of process consultation. They are: always be helpful, always deal with
reality, access your ignorance, everything you do is an intervention, it is the client who owns
the problem, go with the flow, be prepared for surprises and learn from them, share the
problem. These eight general principles govern the process consultant's roles and
relationship with the client. Chapman (1998) said that a process consultant should adopt
flexible consulting roles. Some clients may need a mentor and adviser on general
management matters as much as they require a facilitator and project manager. He further
commented that good process consultants help to identify the real issues and challenges
facing the organization as well as discuss a tailor-made process for constructive change.
Although Schein's eight general principles enhance the flexibility of the process
consultant's role, they do not specify how an effective dialogue session can be established
between the process consultant and his or her client. Vygotsky (1978) mentioned that a
psychological contract can be established through effective dialogue and that it helps the
process consultant and client reach higher levels of understanding. Establishing a
psychological contract gives the process consultant and client an opportunity to learn
important new things about a situation when they explore it together.
According to Schein (1997), the client owns the problem and has to live with the
consequences of the problem and the solution. Therefore the consultant must not take the
problem away from the client, because the client is the person best placed to understand and
appreciate what the next steps should be. The client's involvement depends on his or her
willingness to discuss issues openly and on the trust placed in the consultant.
Sometimes the client hides the real problem because he or she is testing the consultant to
determine whether the relationship is characterized by sufficient trust to reveal what may be
very intimate and personal information. Trust building therefore requires more 'good
conversation' between both parties to explore their commitment and intention.
Learning about good conversations and adjusting our responses to different individuals,
groups and issues appropriately, can have a dramatic impact on outcomes for individuals,
teams and whole organizations. Any significant human learning is not just cognitive
information-processing but also moral and aesthetic co-construction of parts of our life-world
(Boyd, 2001). The process consultant and client may have conflicting views and feelings if
both parties hold strongly to their beliefs or worldviews. Resolving such conflicting
beliefs or worldviews demands that new realities be generated through synchronisation of
perceptual differences (Navarro, 2001). Reflections provide an avenue for both parties to
understand each other's views and learn from each other's differences. The process
consultant may use reflections to help his or her client focus on one behavioural change the
client would like to make as a result of the experience: "What do you want to work on, and are
you willing to make a commitment to change?" This gives the reflection an action
component, which is often beneficial.
Learning occurs in two forms: single-loop and double-loop (Argyris, 1994). Single-loop
learning asks a one-dimensional question to elicit a one-dimensional answer. Double-loop
learning takes an additional step or, more often than not, several additional steps. It turns the
question back on the questioner. It asks what the media call follow-ups. A double-loop
process might also ask why the current setting was chosen in the first place. Because
double-loop learning depends on questioning one's own assumptions and behaviour, the apparently
benevolent strategy of always being considerate and positive is actually anti-learning
(Argyris, 1994). Admittedly, being considerate
and positive can contribute to the solution of single-loop problems, for example cutting costs.
But it will never help people figure out why they lived with problems for years, why they
covered up, why they were so good at pointing to the responsibility of others and so slow to
focus on their own. The notion of good conversation expands the phenomenon of ideal
speech to include ideal listening and to promote interaction. For ideal listening to occur,
individuals must feel secure and accepting enough of themselves to be open to new
possibilities. Enhanced self-acceptance can contribute to the possibility of valuing the
diversity of others. Thus, the responsibility of the process consultant includes nurturing
clients' self-acceptance and inspiring a sense of personal power among the people around them.
This framework promotes a deep commitment to empathetic interaction between the process
consultant and client to construct a shared reality as a common setting for the development
pathway (Navarro, 2001).
In good conversation theory, learning is approached from an inside-out perspective based on
personal experience (Hunt, 1987). Under the process consultation practice, individual
personal experience is required to be reflected on. Reflection pinpoints and dramatizes what
individuals have learned and achieved, how individuals have interacted interpersonally, and
what individuals need to learn in the future. Through valuing each person's individual
experience, the uniqueness of every person is assumed and considered a resource. With the
more typical outside-in approach to learning, the dissimilarity of each person is considered a
problem to be solved (Hunt, 1987). The point of departure for learning in 'good
conversation' is not only the presumption that each individual is different, but that diversity is
an inherent resource. In this consensual and self-reflective process, as more and more diverse
reflections become fully heard within the group, the values and perspectives of each member
influence others, and the process of mutual socialization evolves.
The process consultant must become a reflective practitioner and learner, besides helping
the individual benefit from the double-loop processes (Argyris & Schon, 1978). Process
consultants must diagnose the issues and take action to improve the practice by involving
themselves directly
and fully, preparing to investigate such experiences from as many different perspectives as
possible and patterning their observations into meanings through reflection. Schein (1997)
did not mention documentation of the agreed course of action. In the absence
of a documented agreement between the process consultant and client, commitment to fulfilling
the course of action is unlikely. Schuurman and Veermans (2001) derived two
classes of consequences from conversation theory: the weak consequence and the strong
consequence. The weak consequence stresses observation; the strong consequence stresses
control. The weak consequence concentrates on keeping records of the experimental subject,
the closed conversation and the topics of exchange. The strong consequence notes the records of
agreements derived from the outcome of negotiations between two parties. Together, these
consequences will bring about total commitment between the process consultant and the
client. The documentation process that records the reflections made between the two parties
during the learning process creates an obligation to be fulfilled. In addition,
the record keeping gives both parties a view of their progress towards their
development goals.
Pask (1961, 1965, 1975a, 1975b, 1976a, 1976b) introduced both the object-language and
meta-language to explain the required exchanges during the learning process. He stressed the
need for researchers to distinguish object-language from meta-language in any
learning interface (Schuurman & Veermans, 2001). The object-language comprises a system
of expression (i.e. sentences during conversation) belonging to the object of study. These
sentences should be internal expressions of the object, that is, they should reflect properties
of the object, and they should conform to well-defined rules (in this case, the
developmental pathway undertaken by the process consultant and the client). Within the
meta-language, a new object-language can be proposed. If the new object-language fits the
purpose (i.e. the learning objective) better than the original object-language, it can
replace the original. However, this is only possible if the process consultant and/or the client
knows what to replace. Keeping object-language and meta-language apart allows revisions to be
tracked. This is a crucial prerequisite for systematic inquiry (De Zeeuw, 1995). The
record-keeping process is key to making this a success (Schuurman &
Veermans, 2001). Pask (1965) considered interaction between object-language and
meta-language pivotal in learning and in human performance in general. He observed
object-language and meta-language interactions in order to study how conversations are
punctuated by agreements (including agreements to disagree). According to Pask, researchers
need to keep a proper record of the interaction between object-language and meta-language and to
mark all agreements. The agreements recorded in such a controlled conversation serve as true
hard data. He
argued that psychological experiments start with basic meta-language interactions: the
experimenter and experimental subject have to agree on their respective roles (Schuurman &
Veermans, 2001). The meta-language interactions that Pask strongly advocated should serve
as the basis of process consultation sessions, where emergent behaviour for learning is
likely to happen.
However, conversation theory has a few drawbacks: the theory itself
eliminates some basic traits of the human mind and of human interaction and ignores other
aspects of human reality (Navarro, 2001). A further obstacle to a straightforward
application of conversation theory to the study of social realities is its strong dependence on
Pask's theory rather than on the study of real social life, specifically of human interactions.
Even so, in a real social environment and natural
conversation situation, the strength of the theory lies in its ability to bring specific
aspects of the world into sharp focus. Massaro and Cowan (1993) suggested that in building
a community of good conversation, people are required to put themselves in the shoes of
others and to empathise if they are to arrive at consensually developed norms. Through
empathy for others, they can begin to understand, give life and feeling to, and even accommodate
the consequences of norms distinct from their own. It has to be an attempt
to truly see the world as the other sees it, understand the real-life situation of the other and
adopt the other's perspectives and values (Massaro & Cowan, 1993). One assumption of good
conversation is its essential dynamic quality and process, resisting a tendency to control for
predictability. This form of conversation implicitly and explicitly sets the conditions for
valuing individuals or organizations through the integration of affective and cognitive modes
of experience and learning (Argyris, 1994). In this initial phase of the process, the process
consultant's role is particularly important. As part of the norm creation process, the process
consultant needs to model a respectful and inclusive approach continually
to foster a safe, receptive space for the conversation to unfold (Argyris, 1994).
3.12 Conclusion
Although research on multi-rater feedback assessment indicates that different rater sources
provide different information, the multi-rater feedback technique is still useful at the preliminary
stage to provide information or create self awareness of individual strengths, weaknesses or
blind spots. One underlying rationale for such systems is their potential impact on the target
individual's self awareness, since increasing self awareness is thought to enhance
performance. This paper provides a concept of how multi-rater feedback can lead to a
successful developmental process through process consultation in Malaysia. Through the
years, a training evaluation culture has not been properly practised in Malaysia. It is therefore
recommended that a proper approach be used to enable organizations to see the benefits of
a pre-training needs analysis and an effective development approach, so that
comprehensive training and development can be instilled in the
Malaysian environment. Hence, the process consultant holds the key to an effective
development process using multi-rater assessment as a pre-training gap analysis.
Process consultation provides the opportunity to check and balance the degree of learning and
development activities through reflection, problem solving capabilities and application of
theories throughout the developmental process. The flexibility of process consultation should
be enhanced by integrating conversation theory, using good conversation and documentation
of the pre-agreed commitment of action, known as reflection. This will promote ideal
communication and interaction between the process consultant and client, which will
eventually build trust for open learning and development. Good conversation is an important
intervention tool that has potential for applying effective human communication, decision
making, and policy making in the development process through single-loop and double-loop
learning.
The multi-rater feedback approach also gathers information from various sources in order to
evaluate the level of transfer learning of an individual at the end of the development stage of
process consultation. It is recommended that an integrated and comprehensive model be
adopted, comprising preliminary multi-rater feedback assessment followed by a developmental
process using process consultation and good conversation, in an effort to facilitate transfer
learning to the organization.
3.13 References for Paper Three
Adorno, T.W., Frenkel-Brunswik, E., Levinson, D.J. & Sanford, R.N. 1950, The
Authoritarian Personality, Harper and Brothers, New York.
Alimo-Metcalf, B. 1998, 'Editorial 360-degree assessment and feedback',
Professional Forum, vol. 6, no. 1, pp. 16-18.
Antonioni, D. 1994, 'Designing an effective 360-degree appraisal feedback system',
Personnel Psychology, vol. 47, pp. 349-356.
Argyris, C. & Schon, D. 1978, Organizational Learning: A Theory of Action Perspective,
Addison-Wesley, Reading, MA.
Argyris, C. 1994, 'Good conversation that blocks learning', Managerial Excellence, Harvard
Business Review, vol. 15, pp. 303-317.
Ashford, S.J. 1984, 'Self-assessments in organizations: a literature review and integrative
model', Research in Organizational Behaviour, vol. 11, pp. 133-174.
Ashford, S.J. 1993, 'The feedback environment: an exploratory study of cue use', Journal of
Organizational Behaviour, vol. 14, pp. 201-224.
Atwater, L.E. & Yammarino, F.J. 1993, 'Personal attributes as predictors of superiors' and
subordinates' perceptions of military academy leadership', Human Relations, vol. 46,
pp. 645-668.
Atwater, L.E., Ostroff, C.M., Yammarino, F.J. & Fleenor, J.W. 1998, 'Self-other agreement:
does it really matter?', Personnel Psychology, vol. 51, no. 3, pp. 577-598.
Atwater, L.E., Roush, P. & Fischthal, A. 1995, 'The influence of upward feedback on self-
and follower ratings of leadership', Personnel Psychology, vol. 48, pp. 35-49.
Bass, B.M. & Yammarino, F.J. 1991, 'Congruence of self and others' leadership ratings of
naval officers for understanding successful performance', Applied Psychology: An
International Review, vol. 40, no. 4, pp. 437-454.
Bennis, W.G., Benne, K.D. & Chin, R. 1969, The Planning of Change, 2nd edn, Holt,
Rinehart and Winston, New York, NY.
Bernardin, H.J., Dahmus, S.A. & Redmon, G. 1993, 'Attitudes of first line supervisors
towards subordinate appraisals', Human Resource Management, vol. 32, pp. 315-324.
Black, J.S. & Mendenhall, M. 1990, 'Cross-cultural training effectiveness: a review and a
theoretical framework for future research', Academy of Management Review, vol. 15,
no. pp. 113-136.
Borman, W.C. 1997, '360-degree ratings: an analysis of assumptions and research agenda
for evaluating their validity', Human Resource Management Review, vol. 7, pp. 315-
324.
Boyd, G. 2001, 'Reflections on the conversation theory of Gordon Pask', Kybernetes, vol.
30, no. 5/6, pp. 560-570.
Bracken, D.W., Dalton, M.A., Jako, R., McCauley, C.D. & Pollman, V.A. 1997, Should
360-degree Feedback Be Used Only for Developmental Purposes? Greensboro, NC:
Center for Creative Leadership.
Brutus, S., London, M. & Martineau, J. 1999, 'The impact of 360-degree feedback on
planning for career development', Journal of Management Development, vol. 18, pp.
676-693.
Cacioppe, R. 1998, 'An integrated model and approach for the design of effective
leadership development programs', Leadership and Organization Development
Cacioppe, R. & Albrecht, S. 2000, 'Using 360-degree feedback and the integral model to
develop leadership and management skills', Leadership and Organization
Cacioppe, R. & Albrecht, S. 2000, 'Differing perceptions of managers: behaviours
using the holon leadership-management model', in Parry, K. edn. forthcoming.
Chapman, J. 1998, 'Do process consultants need different skills when working with
nonprofits?', Leadership and Organization Development Journal, vol. 19, no. 4, pp.
211-215.
Church, A.H. 1997, 'Managerial self awareness in high performing individuals in
organizations', Journal of Applied Psychology, vol. 82, pp. 281-292.
Conway, J.M. & Huffcutt, A.I. 1997, 'Psychometric properties of multi-source performance
ratings: a meta-analysis of subordinate, supervisor, peer and self-ratings', Human
Performance, vol. 10, pp. 331-360.
De Zeeuw, G. 1995, 'Values, science and the quest for demarcation', System Research, pp.
15-24.
Edwards, J.R. 1993, 'Problems with the use of profile similarity indices in the study of
congruence in organizational research', Personnel Psychology, vol. 46, pp. 641-65.
Edwards, J.R. 1994, 'The study of congruence in organizational behaviour research: critique
and proposed alternative', Organizational Behaviour and Human Decision Processes,
vol. 58, pp. 51-100.
Facteau, J.D., Facteau, C.I., McGonigle, T.P. & Fredholm, R.I. 1997, Characteristics of
feedback and managers' reactions in multi-source appraisal systems, paper presented
at the 12th annual conference of the Society of Industrial and Organizational
Psychology, St. Louis, MO.
Fagenson, E. & Burke, W. 1990, 'Organization development practitioners' activities and
interventions in organizations during the 1980s', Journal of Applied Behavioural
Science, vol. 26, no. 3, pp. 285-297.
Fletcher, C. 1997, 'Self awareness: a neglected attribute in selection and assessment?',
International Journal of Selection and Assessment, vol. 5, no. 3, pp. 183-187.
Fletcher, C. & Baldry, C. 2000, 'A study of individual differences and self awareness in the
context of multi-source feedback', Journal Of Occupational And Organizational
Psychology, vol. 73, pp. 303-319.
Fletcher, C. & Bailey, C. 2003, 'Assessing self awareness: some issues and methods',
Journal of Managerial Psychology, vol. 18, no. 5, pp. 395-404.
French & Bell 1999, Organization Development: Behavioural Science Interventions for
Organization Improvement, 6th edn, Prentice-Hall Publisher, New Jersey.
Furnham, A. & Stringfield, P. 1994, 'Correlates of self and subordinate ratings of managerial
practices as a correlate of supervisor evaluation', Journal of Occupational and
Organizational Psychology, vol. 67, no. 1, pp. 57-67.
Garavan, T.N., Morley, M. & Flynn, M. 1997, '360-degree feedback: its role in employee
development', Journal of Management Development, vol. 16, no.2, pp. 134-147.
Ghorpade, J. 2000, 'Managing five paradoxes of 360-degree feedback', Academy of
Management Executive, vol. 14, pp. 140-50.
Greguras, G.J., Ford, J.M. & Brutus, S. 2003, 'Manager's attention to multi-source feedback',
Journal of Management Development, vol. 22, no. 4, pp. 345-361.
Hagberg, R. 1996, 'Identify and help executives in trouble', Human Resource Magazine,
vol. 41, no. 8, pp. 88-92.
Hall, D., Otazo, K. & Hollenbeck, G. 1999, 'Behind closed doors: what really happens in
executive coaching', Organizational Dynamics, vol. 27, no. 3, pp. 39-58.
Harris, M.M. & Schaubroeck, J. 1988, 'A meta-analysis of self-supervisor, self-peer and
peer-supervisor ratings', Personnel Psychology, vol. 41, pp. 43-62.
Hazucha, J.F., Hezlett, S.A. & Schneider, R.J. 1993, 'The impact of 360-degree feedback on
management skills development', Human Resource Management, vol. 32, pp. 325-
351.
Heisler, W.J. 1996a, '360-degree feedback: an integrated perspective', Career Development
International, vol. 1, no. 3, pp. 20-23.
Hoffman, R. 1995, 'Ten reasons you should be using 360-degree feedback', Human
Resource Management Magazine, vol. 40, no. 4, pp. 82-86.
Honey, P. & Mumford, A. 1982, Manual of Learning Styles, Honey Publication,
Maidenhead.
Hunt, D.E. 1987, Beginning With Ourselves, Brookline Books, Cambridge, MA.
Judge, W. & Cowell, J. 1997, 'The brave new world of executive coaching', Business
Horizons, vol. 40, no. 4, pp. 71.
Junaidah, H. 1999, Training Management: A Malaysian Perspective, Prentice-Hall
Publisher, Pearson Education, Malaysia.
June, M.L.P. & Rodzhan, O. 2000, 'Management training and development practices of
Malaysian organizations', Journal of the Malaysian Institute of Management,
Malaysian Management Review, vol. 35, no. 2, pp. 77-85.
Journal of American Society for Training and Developing, vol. 13, pp. 3-9,
Kirkpatrick, D.L. 1994, Evaluating Training Programs: The Four Levels, Berrett-Koehler
Publishers, San Francisco.
Kluger, A.N. & Denisi, A.D. 1996, 'The effects of feedback interventions on performance:
historical review, a meta-analysis and a preliminary feedback intervention theory',
Psychological Bulletin, vol. 119, pp. 254-284.
Kluger, A.N. & Denisi, A.D. 2000, 'Feedback effectiveness: can 360-degree appraisals be
improved?', Academy of Management Executive, vol. 14, pp. 129-139.
Kolb, D. 1984, Experiential Learning, Prentice-Hall Publisher, New Jersey.
Leslie, J., Gryskiewicz, N. & Dalton, M. 1998, 'Understanding cultural influences on the
360-degree feedback process', in Maximizing the Value of 360-degree Feedback: A
Process for Successful Individual and Organization Development, eds Tornow, W. &
London, M., Jossey-Bass, San Francisco, pp. 196-216.
London, M. & Beatty, R.W. 1993, '360-degree-feedback as a competitive advantage', Human
Resource Management, vol. 2-3, pp. 353-372.
London, M. & Smither, J.W. 1995, 'Can multi-source feedback change perceptions of goal
accomplishment, self evaluations and performance related outcomes? Theory-based
applications and directions for research', Personnel Psychology, vol. 48, pp. 803-839.
London, M., Wohlers, A.J. & Gallagher, P. 1990, '360-degree feedback surveys: a source of
feedback to guide management development', Journal of Management Development,
vol. 9, pp. 17-31.
Lovelady, L. 1989, 'The process of organization development: a reformulated model of the
change process, Part 1', Management Decision, vol. 27, no. 4, pp. 143-154.
Luthans, K.W. & Farner, S. 2002, 'Expatriate development: the use of 360-degree feedback',
Journal of Management Development, vol. 21, no. 10, pp. 780-793.
Massaro, D.W. & Cowan, N. 1993, 'Information processing models: microscopes of the
mind', Annual Review of Psychology, vol. 44, pp. 383-425.
McCauley, C.D. & Moxley, R.S. Jr. 1996, Development 360: how Feedback Can Make
Managers More Effective, Jossey-Bass Publisher, San Francisco.
McEvoy, G.M. 1990, 'Public sector managers' reactions to appraisals by subordinates',
Public Personnel Management, vol. 19, pp. 201-212.
McEvoy, G. M. & Buller, P.F. 1987, 'User acceptance of peer appraisals in an industrial
setting', Personnel Psychology, vol. 40, pp. 785-797.
Mirza, S.S. & Juhary, H.A. 1995, Managerial training and development in Malaysia,
Malaysian Institute of Management, Malaysia.
Mount, M.K. 1984, 'Psychometric properties of subordinate ratings of managerial
performance', Personnel Psychology, vol. 37, pp. 687-701.
Mount, M.K., Judge, T.A., Scullen, S.E., Sytsma, M.R. & Hezlett, S.A. 1998, 'Trait, rater,
and level effects in 360-degree performance ratings', Personnel Psychology, vol. 51,
pp. 557-576.
Moses, J., Hollenbeck, G.P. & Sorcher, M. 1993, 'Other people's expectations', Human
Resource Management, vol. 32, Summer Fall.
Nasby, W. 1989, 'Private self-consciousness, self awareness and the reliability of self-
reports', Journal of Personality and Social Psychology, vol. 56, no. 6, pp. 950-957.
Navarro, P. 2001, 'The limits of social conversation', Kybernetes, MCB University Press,
vol. 30, no. 5/6, pp. 771-788.
Nowack, K. 1993, '360-degree feedback: the whole story', Training and Development
O'Reilly, B. 1994, '360-degree feedback can change your life', Fortune Magazine, vol. 130,
no. 8, pp. 93-97.
Pask, G. 1961, An Approach to Cybernetics, Hutchinson, London.
Pask, G. 1965, Inleiding tot de Cybernetica, Het Spectrum, Utrecht.
Pask, G. 1975a, The Cybernetics of Human Learning and Performance, Hutchinson,
London.
Pask, G. 1975b, Conversation, Cognition and Learning: A Cybernetic Theory and
Methodology, Elsevier, Amsterdam.
Pask, G. 1976a, Conversation Theory: Applications in Education and Epistemology,
Elsevier, Amsterdam and New York.
Pask, G. 1976b, Revisions in the foundations of cybernetics and general systems theory as a
result of research in education, epistemology and innovation (mostly in man-machine
systems), proceedings of the 8th International Congress on Cybernetics, Namur, vol.
6, no. 11, September, pp. 83-109.
Payne, T. 1988, 'Editorial 360-degree assessment and feedback', International Journal of
Selection and Assessment, vol. 6, no. 1.
Reilly, R.R., Smither, J.W. & Vasilopoulos, N.L. 1996, 'A longitudinal study of upward
feedback', Personnel Psychology, vol. 49, pp. 599-612.
Rogers, C.R. 1970, Encounter Groups, Harper and Row, New York.
Romano, C. 1994, 'Conquering the fear of feedback', Human Resource Focus, vol. 71, no. 3.
Rosti, R.T., Jr & Shipper, F. 1998, 'A study of the impact of training in a management
development program based on 360-degree feedback', Journal of Managerial
Psychology, vol. 13, pp. 77-89.
Rubin, I. 1967, 'The reduction of prejudice through laboratory training', Journal of Applied
Behavioural Science, vol. 3, no. 1.
Saiyadain, M.S. & Juhary, A. 1995, 'Managerial training and development in Malaysia',
Journal of the Malaysia Institute of Management, Management Review, vol. 5, pp.
23-36.
Schein, E.H. 1987, Process Consultation, vol. 2, Addison-Wesley, MA.
Schein, E.H. 1997, 'The concept of 'client' from a process consultation perspective', Journal
of Organizational Change Management, vol. 10, no. 3, pp. 202-216.
Schuurman, J.G. & Veermans, K. 2001, 'Conversation and research', Kybernetes, vol. 30, no.
7/8, pp. 881-890.
Shipper, F. & Dillard, J.E. Jr. 2000, 'A study of impending derailment and recovery of
middle managers across career stages', Human Resource Management, vol. 39, no. 4,
pp. 331-345.
Shipper, F. & John, J. 1992, 'Employees' feedback: its use for management development and
the results in a government organization', in Fargher, J.S. edn., Proceedings of
Symposium on Productivity and Quality Improvement with a Focus on Government,
Industrial Engineering and Management Press, Washington, DC.
Thach, E.C. 2002, 'The impact of executive coaching and 360-degree feedback on leadership
effectiveness', Leadership and Organization Development Journal, vol. 23, no. 4, pp.
205-214.
Thach, I. & Heinselman, T. 1999, 'Executive coaching defined', Training and Development
Journal, vol. 53, pp. 34-39.
Tornow, W.W. 1993, 'Perceptions or reality: is multi-perspective measurement a means or
an end?', Human Resource Management, vol. 32, no. 2 and 3, pp. 221-230.
Tornow, W.W. & London, M. 1998, Maximizing the Value of 360-Degree Feedback: A
Process for Successful Individual and Organizational Development, Jossey-Bass
Publisher, San Francisco.
Van Velsor, E., Taylor, S. & Leslie, J.B. 1993, 'An examination of the relationship among
self-perception accuracy, self awareness, gender and leaders' effectiveness', Human
Resource Management, vol. 32, summer fall, no. 2/3, pp. 249-263.
Vygotsky, L.S. 1978, Mind in Society, Harvard University Press, Cambridge.
Waldman, D.A., Atwater, L.E. & Antonioni, D. 1998, 'Has 360-degree feedback gone
amok?', Academy of Management Executive, vol. 12, no. 2, pp. 86-94.
Walker, A.G. & Smither, J.W. 1999, 'A five-year study of upward feedback: what managers
do with their results matters', Personnel Psychology, vol. 52, pp. 393-423.
Wan Aziz, W.A. 1994, 'Transnational corporations and human resource development',
Personnel Review, vol. 23, no. 5, pp. 50-69.
Warr, P. & Bourne, A. 1999, 'Factors influencing two types of congruence and similarity as
related to interpersonal evaluation in manager-subordinate dyads', Academy of
Management Journal, vol. 23, pp. 320-30.
Weisbord, M. 1988, 'Towards a new practice theory of OD: notes on sharpshooting and
moviemaking', in Research in Organizational Change and Development, eds
Pasmore, W. & Woodman, R., JAI Press, Greenwich, CT, vol. 2, pp. 59-96.
Wilson, M.S., Hoppe, M.H. & Sayles, R.S. 1996, Managing Across Cultures: A Learning
Framework, Centre for Creative Leadership, Greensboro, NC.
Wolfe, D.M. & Kolb, D.A. 1984, 'Career development, personal development and
experiential learning', in Organization Psychology: Readings on Human Behaviours
in Organizations, 4th edn, Prentice-Hall, NJ.
Wood, R., Allen, T., Pillenger, T. & Kahn, N. 1999, '360-degree feedback: theory, research
and practice', in Human Resource Strategies: An Applied Approach, eds Travaglione,
T. & Marshall, V., McGraw-Hill, Sydney.
Zakaria, I. & Rodzhan, O. 1993, Human resource development practice in the manufacturing
sector in Malaysia: an empirical assessment, Paper Presented at the Seminar on
Human Resource Management, Faculty of Business Management, Universiti
Kebangsaan Malaysia.
Zemke, R. & Zemke, S. 1995, 'Adult learning: what do we know for sure?', Training and