
Performance Appraisal

Job Analysis

Performance Standards: Criteria

Performance Appraisal

Job Performance Criteria

Objective

Production data
Sales volumes
Tenure or turnover
Absenteeism
Accidents
Theft

Problems?
Unreliability
Focus on outcome of behavior
Modification of performance
by situational characteristics

Subjective (Judgmental data)

Correlation between objective and subjective performance
measures (Bommer et al., 1995): r = .39
(These two are NOT interchangeable)
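The .39 figure above is a Pearson correlation. A minimal sketch of how such a correlation is computed; the objective (sales) and subjective (rating) scores below are invented for illustration.

```python
# Hedged sketch: computing a Pearson correlation between an objective
# measure (sales volume) and a subjective one (supervisor rating).
# All data values are invented.
from statistics import mean, stdev

sales   = [12, 18, 9, 25, 16, 20]   # hypothetical objective data
ratings = [3, 4, 2, 5, 3, 4]        # hypothetical supervisor ratings (1-5)

def pearson_r(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (stdev(x) * stdev(y))

r = pearson_r(sales, ratings)  # high here only because the data are invented
```

In real samples such as Bommer et al.'s meta-analysis, the correlation is far from 1, which is why the two kinds of criteria are not interchangeable.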

Performance Appraisal: subjective data

Graphic rating scales


Employee-comparison methods

Rank order
Paired comparison
Forced distribution

Behavioural checklist and scales

Behaviourally-anchored rating scale (BARS)


Behavioural-observation scale (BOS)

Performance Appraisal

Examples of graphic rating scales

Job Knowledge:   High ________________ Low

Quality of work: Superior | Above Average | Average | Below Average | Unacceptable

Dependability:   Rate this employee's dependability by assigning
a score according to the following scale: ______
1 to 5   (poor): gives up quickly
6 to 10  (average): does the routine work
11 to 15 (good): rarely gives up

Practical judgment

Graphic Rating Scale

Advantages

Simple!
Easy to develop

Disadvantages

Lack of clarity and definition:
what is meant by "quality of work"? By "poor" or "average"?

No control over central tendency

Employee comparison methods

Ranking
Paired comparison
Forced distribution
(5% = very poor; 25% = poor; 40% = average; 25% = good;
5% = very good)

Advantage
Avoid central tendency
Helpful in making employment decisions

Disadvantage
Hard to compare employees across different
departments
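The forced-distribution percentages above can be sketched in code. A minimal illustration, assuming employees are already rank-ordered; the function name and data are invented.

```python
# Hypothetical sketch: assigning ranked employees to a forced
# distribution (5% very poor, 25% poor, 40% average, 25% good,
# 5% very good). Employees are ordered from worst to best.
def forced_distribution(ranked_names):
    cutoffs = [(0.05, "very poor"), (0.25, "poor"), (0.40, "average"),
               (0.25, "good"), (0.05, "very good")]
    n = len(ranked_names)
    result, start = {}, 0
    for prop, label in cutoffs:
        count = round(prop * n)
        for name in ranked_names[start:start + count]:
            result[name] = label
        start += count
    for name in ranked_names[start:]:  # rounding leftovers go to the top bin
        result[name] = "very good"
    return result

labels = forced_distribution([f"emp{i}" for i in range(20)])
```

With 20 employees this yields bins of 1, 5, 8, 5, and 1, which makes the cross-department comparability problem concrete: the percentages are imposed regardless of the true performance distribution in each unit.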

Behavioural checklist and scales

Developed to overcome the problems of graphic rating
scales and hence provide more accurate and valid
performance ratings

Based on the critical incident technique (CIT)

Types of Behavioural scales


BARS
BOS

An example of BARS
USE OF KNOWLEDGE

Scale: Very high (9) down to Very low (1)
[definition should follow]

Behavioural anchors (bank-teller examples):

A customer wanted to deposit a large amount of money. The teller
explained to the customer that he could earn more interest on a
money market account than with a savings account.

A customer applied for a new auto loan and had an E/I too high for
approval. This employee suggested a lower-priced auto with a lower
payment to reduce his E/I.

When a customer called, this employee accurately answered her
question about finance charges.

When a customer came to the bank for a loan, this employee had to
search for instructions and kept the customer waiting.

A customer wanted to cash a large check. The teller said that it
could not be cashed but did not realize it was all right as long as
the customer had that amount in her account.

An example of BOS

Performance dimension: Review previous work performance

1. Communicates mistakes in job activities to subordinates
   Almost Never 1  2  3  4  5 Almost Always

2. Praises subordinates for good work behavior
   Almost Never 1  2  3  4  5 Almost Always

3. Discusses hindrances in completing projects
   Almost Never 1  2  3  4  5 Almost Always

...

11. Inspects quality of output materials
    Almost Never 1  2  3  4  5 Almost Always

12. Reviews inventory of necessary parts and equipment
    Almost Never 1  2  3  4  5 Almost Always
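A BOS dimension score is typically the sum of the item frequency ratings. A minimal sketch; the item texts and ratings below are invented.

```python
# Hypothetical sketch: scoring one BOS dimension by summing item
# frequency ratings (1 = almost never ... 5 = almost always).
items = {
    "Communicates mistakes to subordinates": 4,
    "Praises subordinates for good work behavior": 5,
    "Discusses hindrances in completing projects": 3,
}

dimension_score = sum(items.values())   # observed frequency total
max_score = 5 * len(items)              # ceiling for this dimension
print(f"{dimension_score}/{max_score}")  # prints "12/15"
```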

Behavioural scales vs. GRS

The scale format has little (or no) impact on
the quality of ratings (Landy & Farr, 1980)
No one format is consistently better than the
others
Why bother, then?

HOWEVER,

Increased feelings of justice and fairness


Favorable reactions from raters
Useful for developmental purposes
Learning effect
Legally more defensible (maybe)

Therefore, still worth using Behavioural scales

Who should rate?

Supervisors: a primary source

Most employees' preference
May be too result-oriented
Limited opportunities to observe interpersonal aspects

Subordinates

Little info about task performance, but good
opportunities to observe interpersonal behaviours
Uncomfortable in the rater role?

Who should rate?

Self

Unlikely to be used as the sole method of
evaluation
But this source is well-informed
More lenient than supervisor ratings (who's
wrong?)
1. Self-ratings move closer to supervisors' when
extensive performance feedback is given (Steel
& Ovalle, 1984)
2. Self-ratings are less lenient if raters know that
the ratings will be checked against some
objective criterion

Who should rate?

Peers

Good opportunities to observe both task and


interpersonal behaviors
Can observe uncensored behaviours
Multiple ratings are usually available
Friendship/rivalry effect
Range restriction (unwillingness to differentiate among
their peers)
Uncomfortable in the role of rater

Why ratings differ ?

Potential explanations

Egocentric bias

Differences in organizational level

Correlations between rating sources
(Harris & Schaubroeck, 1988)

Self-supervisor: r = .36
Self-peer: r = .35
Supervisor-peer: r = .62

Differences in rating means

Self-supervisor: d = .70
Self-peer: d = .28
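The d values above are standardized mean differences between sources. A hedged sketch of how such a d is computed; the rating data below are invented.

```python
# Sketch: standardized mean difference (d) between self-ratings and
# supervisor ratings, in the spirit of Harris & Schaubroeck (1988).
# A positive d means self-ratings are more lenient. Data are invented.
from statistics import mean, stdev

self_r = [4.2, 3.8, 4.5, 4.0, 4.4, 3.9]   # hypothetical self-ratings
sup_r  = [3.5, 3.2, 4.0, 3.4, 3.8, 3.3]   # hypothetical supervisor ratings

# Pooled standard deviation across the two sources
pooled_sd = ((stdev(self_r) ** 2 + stdev(sup_r) ** 2) / 2) ** 0.5
d = (mean(self_r) - mean(sup_r)) / pooled_sd
```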

360 degree feedback

Information from self, supervisors, peers, and
subordinates is used as a source of
developmental feedback
Issues

Disagreement among sources


Harris & Schaubroeck (1988):
r between self and supervisor/peer ratings in the .30s
r between supervisor and peer ratings in the .60s

Not necessarily a bad thing

Negative reactions to peer or upward feedback?
Developmental purposes only, or administrative
decision purposes as well?

Bettenhausen & Fedor (1997)


Expectations that peer and upward appraisals
would generate positive outcomes?
[Bar chart: expected positive outcomes (0-4 scale) of peer and
upward appraisals, shown separately for administrative vs.
developmental purposes]

Other issues in performance appraisal

Rater error & accuracy

Halo
Leniency/Severity
Central tendency (Range restriction)

Improving Rating accuracy

Rater Error & Accuracy

Leniency

Shift of mean rating away from the scale midpoint
Skewness of the rating distribution

Central tendency or range restriction

SD across ratees within dimensions

Halo: the rater's tendency to let a global evaluation color
ratings on specific dimensions, or the rater's unwillingness
to discriminate among separate aspects of a ratee's
performance

Inter-correlation among dimension ratings
SD of ratings across dimensions
Size of the first unrotated factor

[Table: Supervisor A's ratings, Ratees 1-4 x Dimensions 1-3,
SD across ratees = 0.5]

[Table: Supervisor B's ratings, Ratees 1-4 x Dimensions 1-3,
SD across ratees = 3.59]

[Table: Supervisor A's ratings, Ratees 1-4 x Dimensions 1-3,
SD across dimensions = 3.06]

[Table: Supervisor B's ratings, Ratees 1-4 x Dimensions 1-3,
SD across dimensions = 0.58]
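The indirect error indicators above can be sketched from a ratee-by-dimension rating matrix. A minimal illustration; the matrix, scale midpoint, and variable names are all invented for this sketch.

```python
# Hedged sketch: indirect "rater error" indicators from a rating
# matrix (rows = ratees, columns = dimensions). Data are invented;
# the scale is assumed to run 1-9 with midpoint 5.
from statistics import mean, stdev

ratings = [   # 4 ratees x 3 dimensions
    [9, 8, 9],
    [3, 2, 3],
    [7, 6, 7],
    [5, 5, 4],
]

# Central tendency / range restriction: SD across ratees within each
# dimension (small values suggest the rater bunches everyone together)
sd_within_dim = [stdev(col) for col in zip(*ratings)]

# Halo: SD of ratings across dimensions within each ratee (small values
# suggest one global impression colours every dimension)
sd_across_dims = [stdev(row) for row in ratings]

# Leniency: shift of the mean rating away from the scale midpoint
leniency = mean(r for row in ratings for r in row) - 5
```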

Rater Error & Accuracy

Do these error measures correlate negatively
with accuracy measures?
(Murphy & Balzer, 1989)

Not really
The use of rater error measures as an indirect
indication of accuracy is not recommended

Rater Error & Accuracy

Then what is a direct measure of accuracy?

We need a true score

True score: the rating that would be expected
from an unbiased, careful rater who completed
the rating task under optimal conditions

Cronbach's accuracy components

[Table: Rater A's observed ratings x_ij (Ratees 1..i x Dimensions 1..j)
side by side with the true ratings t_ij, with ratee means
(mean x_i / mean t_i), dimension means (mean x_j / mean t_j), and
grand means (Grand X / Grand T) in the margins]

Cronbach's accuracy components

E  = (x̄ - t̄)²
     Elevation: accuracy of the overall rating level

DE = (1/n) Σi [(x̄i - x̄) - (t̄i - t̄)]²
     Differential elevation: accuracy in discriminating
     among ratees

SA = (1/k) Σj [(x̄j - x̄) - (t̄j - t̄)]²
     Stereotype accuracy: accuracy in diagnosing strengths
     and weaknesses of work groups

DA = (1/nk) Σij [(xij - x̄i - x̄j + x̄) - (tij - t̄i - t̄j + t̄)]²
     Differential accuracy: accuracy in diagnosing strengths
     and weaknesses of individuals
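The four components can be sketched directly from the ratee-by-dimension matrices of observed and true ratings. A hedged illustration; the function name and example matrices are invented.

```python
# Hedged sketch of Cronbach's accuracy components for one rater:
# x = observed ratings, t = true scores, both n ratees x k dimensions.
# Example matrices are invented.
def accuracy_components(x, t):
    n, k = len(x), len(x[0])
    xm = [sum(row) / k for row in x]          # ratee means
    tm = [sum(row) / k for row in t]
    xj = [sum(col) / n for col in zip(*x)]    # dimension means
    tj = [sum(col) / n for col in zip(*t)]
    gx = sum(xm) / n                          # grand means
    gt = sum(tm) / n

    E = (gx - gt) ** 2
    DE = sum(((xm[i] - gx) - (tm[i] - gt)) ** 2 for i in range(n)) / n
    SA = sum(((xj[j] - gx) - (tj[j] - gt)) ** 2 for j in range(k)) / k
    DA = sum(((x[i][j] - xm[i] - xj[j] + gx)
              - (t[i][j] - tm[i] - tj[j] + gt)) ** 2
             for i in range(n) for j in range(k)) / (n * k)
    return E, DE, SA, DA

# A perfectly accurate rater scores 0 on every component
same = [[3, 4], [5, 2]]
print(accuracy_components(same, same))  # prints (0.0, 0.0, 0.0, 0.0)
```

Note how the components decompose the error: a rater who simply rates everyone one point too high (observed = true + 1) shows E = 1 but DE = SA = DA = 0.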

Improving rating accuracy

Rater training

Rater error training

Performance dimension training

Familiarize raters with the dimensions on which
performance is rated prior to observing performance
Involves reviewing the rating scales and having raters
participate in developing them

Frame-of-reference training
(Woehr & Huffcutt, 1994)

Training raters with respect to performance standards
as well as performance dimensionality
Providing the definition of each dimension and sample
behavioural incidents for each dimension
I.e., training raters to share and use a common
conceptualization of performance

Behavioural observation training

Note taking, diary keeping

Rater training

(Woehr & Huffcutt, 1994)

[Bar chart: rating accuracy effect sizes (0 to 0.9) for the four
training approaches: RET, PDT, FOR, BOT]

Rater training

(Woehr & Huffcutt, 1994)

[Bar chart: observational accuracy effect sizes (-0.2 to 0.5) for
RET, PDT, FOR, BOT]
