
Research performance indicators at Aalborg Hospital: initial analysis

Jens Peter Andersen, MLISc


Address: Medical Library, AHSIC, Aalborg Hospital, Aarhus University Hospital, Sdr. Skovvej 15, DK-9000 Aalborg, Denmark

Abstract
OBJECTIVES: To apply and evaluate a system of research performance indicators at the department level of a university hospital. METHODS: The performance indicators used in the study are differentiated into five main areas of research activity, namely funding, scholarly activities, mediation, merits and networking, and innovation and technology transfer. These areas are divided into various indicators, using a scoring system which allows for some types of comparisons. The scholarly activities stand out as the only activity without a threshold for maximum score. The model used for this score is akin to the Norwegian model for academic performance as well as the Australian research quality framework. RESULTS: The complete system of performance indicators is applied to a 2007 dataset for the first time; hence the results are purely indicators of which departments perform well in given areas, and in which fields certain activities are focused. Scholarly activities in particular are reported for 2002-2007, offering a timeline for these activities. CONCLUSIONS: The 2007 performance analysis mainly serves as a reference point for future evaluations, as the framework is being used for the first time as a strategic research management tool. However, the results of the initial analysis also serve as actual indicators. These indicators manage to express a general overview of the five main areas of performance as well as pinpoint areas of specialised activity and potential areas of concern.

Introduction
Over the last eight years, Aalborg Hospital has made substantial changes to research leadership, organisation and infrastructure in order to qualify as part of Aarhus University Hospital. This transition to university hospital came into effect in 2003, and the increased focus on research has remained since. In order to evaluate this focus and investment, the hospital designed a research assessment framework which was implemented in 2008. The framework consists of five main assessment areas: funding, scholarly activities, mediation, merits and networking, and innovation and technology transfer. Each of these areas has several indicators on the departmental level, using a scoring system which allows some types of comparisons. The actual indicators will be described in the methods section. The first results from the entire framework (2007 data) are reported in this paper, along with a wider range of data on the purely scholarly activities (2002-2007). These are currently submitted for review in a Danish national journal. A second objective of this paper is to evaluate the assessment framework, with respect to the complete framework as well as the more specific publication data subset. Findings reported in this paper are the initial part of an ongoing analysis. Future work will include analysis of other indicators as well as adding a time dimension where the current data are restricted to one year.

Framework
The purpose of the Aalborg Hospital research assessment framework is to evaluate the research and development activities at the hospital at the department level. The evaluations are used to identify areas which require or allow for potential investments. As mentioned earlier, the framework uses five main areas. These areas, including their specific indicators, are listed below with a description of each area and the intent of the assessment.

Funding
The funding area is split into two categories, internal and external funding.



While they are listed as two categories below, they are reported as one collective area in the assessment framework.
Activity | Points
R&D employees / total employees | 1 point per %, max 10 points
# employees with academic degree | 0.5 points per employee, max 5 points
# PhD students | 1 point per student, max 5 points
Internal financing | 2 points per % of department budget, max 10 points
Total max: | 30 points

Table 1 Internal funding activities

Activity | Points
Funds and private donations | 1 point per DKK 50,000 per R&D employee, max 10 points
EU grant | 2.5 points per grant, 5 points for coordination, max 5 points
NIH grant | 2.5 points per grant, 5 points for principal investigator, max 5 points
Total max: | 20 points

Table 2 External funding activities

Given the above, the maximum achievable points for this area are 50. As this particular area deals with investments in research and development, it may be seen as a counterweight to the other areas; high scores in this area should preferably lead to high scores in the other areas. However, the score in this category should not be seen as directly proportional to that of the other categories.

Scholarly activities
The scholarly activities area is the only one which does not have a maximum score. Maximums for the other areas have been implemented in order to minimise skewness, as some indicators have a more easily achievable score than others. The scholarly scoring system, however, is intended to grant a score to any and all scientific publications (of appropriate quality, see below).

Activity | Points
Normalised impact | No maximum
Books | 10 points per scientific book as single author / editor, partial points for partial authoring
Book chapters | 1 point per chapter
Theses | 10 points per doctoral degree, 5 points per PhD degree
Research-based education | 1 point per course, max 10 points
Total max: | none

Table 3 Scholarly activities

While the scores for books, chapters and theses are fairly straightforward, journal articles are awarded points according to the so-called normalised impact. This method is an adaptation of an early version of the Norwegian national assessment model, specialised for the health institutions (1). The method is a way of objectively measuring the quantity and quality of an institution's research (expressed as scientific publications). The actual scoring system and its implications will be discussed in the methods section.

Mediation
The mediation area is the smallest area, representing the mediation of research to the public. This includes written submissions to newspapers and popular scientific magazines as well as mentions in the media.

Activity | Points
Articles, written submissions | 1 point per article, max 10 points
Mentions and interviews in the media | 1 point per mention / interview, max 10 points
Total max: | 20 points

Table 4 Mediation activities

While mediation is seen as an important part of research, it has not been possible to identify indicators other than those listed in table 4 that can realistically be measured objectively at Aalborg Hospital.

Merits and networking
The purpose of this area is to assess mediation to the scientific community as well as other scholarly activities. While one might assess many other indicators than those listed in table 5, the chosen indicators represent the most important networking activities and scholarly honours.

Activity | Points
Congress participation with contribution | 1 point per lecture or poster, max 10 points
Review of professor | 1 point per review, max 10 points
Review of theses | 1 point per thesis, max 5 points
Contracts with other research institutions | 2 points per contract, max 10 points
Awards | 1 point per award, 0.5 points per poster award, max 5 points
Total max: | 50 points

Table 5 Merits and networking activities

All indicators in this area are straightforward sums at the department level.



Innovation and technology transfer
The innovation and technology transfer area deals with activities related to patents, intellectual property rights (IPR), the establishment of new enterprises and cooperation with external industries. The activities behind the indicators generally require many working hours, which is reflected in the high scores awarded for a single unit of any indicator.
Activity | Points
Patent application | 2 points per patent application, max 10 points
Patent approved | 5 points per patent approved in Europe / USA, max 5 points
Sale of IPR, including licenses | 2 points per agreement, max 10 points
Start-up enterprise established | 5 points per enterprise, max 5 points
Consultancies | 2 points per agreement, max 10 points
Cooperation with medical industry | 2 points per agreement, max 10 points
Total max: | 50 points

Table 6 Innovation and technology transfer activities

Other factors such as income generated from the sale of IPR are not recorded. The purpose of the assessment framework is primarily to promote the activities listed, and not to report details on each indicator, e.g. by distinguishing between more and less profitable patents.
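To make the capped scoring above concrete, the following is a minimal Python sketch of how raw values could be converted into points under the rules of table 1. The function name and arguments are illustrative assumptions, not part of the framework or of the Regis system described later.

```python
def internal_funding_score(rd_staff_pct, academic_degree_staff, phd_students, internal_budget_pct):
    """Illustrative scoring of the internal funding indicators (table 1); all caps taken from the text."""
    score = min(1.0 * rd_staff_pct, 10)           # 1 point per % R&D employees, max 10
    score += min(0.5 * academic_degree_staff, 5)  # 0.5 points per employee with academic degree, max 5
    score += min(1.0 * phd_students, 5)           # 1 point per PhD student, max 5
    score += min(2.0 * internal_budget_pct, 10)   # 2 points per % of department budget, max 10
    return min(score, 30)                         # internal funding area capped at 30 points

# Hypothetical department: 6% R&D staff, 8 academic employees, 3 PhD students, 2% internal financing
print(internal_funding_score(6, 8, 3, 2))  # 6 + 4 + 3 + 4 = 17.0
```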

Methods and materials

The following section describes the data used and their collection, some methodological considerations (particularly about the normalised impact method) and the evaluation methods used.

Data collection
Data for this study have been collected from several different sources and gathered in a single MySQL database dubbed Regis. Data on funding have been collected using a local research database as well as annual reports from the individual departments. All data on merits, networking and media attendance have been collected using PU:RE[1]. Data on EU and NIH grants are continually registered by the EU office at Aalborg Hospital, while all contracts, patents and all other indicators for the innovation and technology transfer area are registered by the contract office. All the above data are entered into Regis as raw values, and points are automatically calculated from these. Computations of points are thus identical for any department at any time. Publication data are collected by the medical library using PubMed (first author address and author names of the most active researchers), Web of Science (address) and SCOPUS (affiliation) as well as annual reports from the departments. No single collection method is complete, but the combination of these three databases and the annual reports offers a close to complete registration of all publications from Aalborg Hospital. Each publication is catalogued by a librarian, attaching exact, authorised local affiliations and a document type (for journal content distinguishing between articles, reviews, editorials, letters and comments, where only the first two are accepted for normalised impact calculation). Only those publications where Aalborg Hospital is actually mentioned in the author affiliation of the published paper are registered in Regis, and calculations are performed automatically. Publication data are registered from 2002, while other data are registered from 2007. In this assessment, 43 departments from Aalborg Hospital have supplied data, with a total of 2,751 publications. However, not all of the registered publications count in the assessment, e.g. published abstracts, pamphlets, working papers etc.

Normalised impact
The most advanced measure of the assessment framework is the normalised impact for journal articles and reviews. The purpose of this measure is to report both the quantity and the quality of research at Aalborg Hospital. The quantity is quite simply the number of publications (journal articles and reviews, not counting editorials, abstracts, letters etc.) in peer-reviewed journals, while the quality is determined by the journal in which the item is published. Traditionally, this might have been done using journal impact factors; however, the Norwegian model (1)[2] uses journal performance indicators (2) instead to group journals into three levels. The levels are labelled A-C, with A representing the top 5% of ISI-indexed journals, based on journal performance indicators, level B the next 15% and level C all remaining peer-reviewed journals. Using this method, one loses some information, as the exact values of the JPIs are normalised into three groups. However, this loss of information allows for greater comparability, ensuring that no single journal will skew the distribution.

[1] PU:RE, the Publication and Research platform: a web-based solution for registering certain activities (e.g. conference participation) and publications. It is mandatory for all researchers at Aarhus University (including the hospitals) to register relevant activities here.
[2] In the early version adapted by Aalborg Hospital.



From level A, an additional level A* is extracted, represented by five general journals of especially high esteem. These journals, like the levels themselves, were adopted from the Norwegian model. The journals at level A* are Lancet, New England Journal of Medicine, Proceedings of the National Academy of Sciences, Nature and Science. Each level of journals awards a different score to items published therein, as shown in table 7.
Level | Journal score (SJ) | N
A* | 10 | 5
A | 5 | 140
B | 3 | 351
C | 1 | 388

Table 7 Scores at the four journal levels. N is the number of journals at each level.

While N for levels A*-B in table 7 represents the complete list of ISI-indexed journals, level C only contains those journals which have actually been used for publication by researchers from Aalborg Hospital. Papers are assigned a score based on the level of the journal in which they have been published. The corresponding journal score, SJ, is fractioned based on the number of collaborators and counted in the assessment framework. We have chosen to use author affiliations to identify collaborators, as the aim of the framework is to assess activities on the department level (comparable to affiliations) and not on a personal level. If an item is published by a single affiliation, it receives the full score of the journal level. However, if several, N, affiliations collaborate, the first affiliation (identical to the affiliation of the first author) receives half the score [F1], and the remaining affiliations receive an equal fraction of the remaining half [F2]:

[F1]  $S_A = \frac{1}{2} S_J$

[F2]  $S_A = \frac{S_J}{2(N-1)}$

Using the affiliation as the unit for fractioning points was initially chosen in order to obtain an exact, reproducible result, as practically all journal articles published in the health sciences contain author affiliations. Other units might have been chosen, such as author names, and other distributions might also have been chosen (e.g. the national Danish bibliometric indicator promotes cooperation between different institutions by adding 25% to SJ prior to fractioning). Using normalised impact as an indicator of research performance is inherently problematic. As pointed out several times (3;4), citations to papers in a single journal are strongly skewed and appear more like a Bradford distribution than a Gaussian distribution. Measuring the impact of a given paper based on the journal in which it is published is clearly an approximation. However, as there is some relation between journal impact factor and the use of journals (5) as well as researcher rankings of journals (6), the impact factor does work as an indicator. It should, however, not be confused with or used as exact, actual impact or quality. Actual impact (the cumulated citations to a given paper) is by far a more exact measure of that paper's quality (assuming it may be measured using citations). However, citations take time; in the health sciences, citation half-life has been found to be six years (7), which would make any performance assessment historical. Such historical analysis might not be very useful in a world developing as fast as the health sciences; however, it could be useful for documenting the validity of older data, possibly allowing conclusions to be generalised to newer data. Thus, we have chosen to use an indicator over actual, exact citation data, as timeliness was very important for the assessment framework. Using Journal Performance Indicators as this indicator has been adopted from the aforementioned Norwegian model. Other measures could have been used as well, e.g. the Journal Impact Factor or the newer Eigenfactor (8).
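A small sketch of the normalised impact calculation for a single paper, assuming the journal-level scores of table 7 and the fractioning rules [F1] and [F2]. The names are illustrative; this is not the actual Regis implementation.

```python
JOURNAL_SCORE = {"A*": 10, "A": 5, "B": 3, "C": 1}  # S_J per journal level (table 7)

def affiliation_scores(level, n_affiliations):
    """Split the journal score S_J between the collaborating affiliations of one paper.

    The first affiliation (that of the first author) receives half of S_J [F1];
    the remaining N-1 affiliations share the other half equally [F2].
    """
    s_j = JOURNAL_SCORE[level]
    if n_affiliations == 1:
        return [s_j]                             # a single affiliation keeps the full score
    first = s_j / 2                              # [F1]
    rest = s_j / (2 * (n_affiliations - 1))      # [F2]
    return [first] + [rest] * (n_affiliations - 1)

# A level A paper with three collaborating affiliations: 2.5 + 1.25 + 1.25 = 5.0
print(affiliation_scores("A", 3))
```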

Evaluation
The data reported in the following results section will be split into two groups: the complete assessment of all five indicators on the department level, and the publication data used for normalised impact computations. The purpose of the results section is to evaluate the data in order to report findings as well as to evaluate the assessment framework. The five indicators are naturally reported as raw values; however, these do not offer much actual information about the departments or the assessment framework. In order to evaluate the framework, a cluster analysis will be performed, attempting to group departments into research profiles. The exact procedure will be explained below. The publication data from 2002-2007 will be analysed separately in order to report findings as well as to analyse certain parameters of the evaluation method. This will be described in more detail below.

Clustering
The purpose of clustering departments is to identify similarities in research profiles, based not so much on size as on distribution.



That is, a small department focusing on mediation should be clustered with a large department having the same focus. One way of doing this would be to normalise scores according to the number of employees. However, this would require all departments to have the same distribution of employee types, which is not the case. Instead, this may be solved by representing each department as a five-dimensional vector, each dimension representing one performance indicator. The performance vector, P, of department i may then be defined as [F3]:

[F3]  $P_i = \{x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5}\}$

where $x_{i1}, x_{i2}, x_{i3}, x_{i4}, x_{i5}$ are the values of the funding, scholarly activities, mediation, merits and networking, and innovation and technology transfer indicators, respectively. Such vectors may be compared using various similarity measures to decide which departments are most similar. Which similarity measure to use depends on the case. Some of the most used measures are the Euclidean distance [F4] and the cosine similarity measure [F5] (9):

[F4]  $S(P_i, P_j) = \sqrt{\sum_{k=1}^{5} (x_{ik} - x_{jk})^2}$

[F5]  $S(P_i, P_j) = \frac{\sum_{k=1}^{5} x_{ik} x_{jk}}{\sqrt{\sum_{k=1}^{5} x_{ik}^2}\,\sqrt{\sum_{k=1}^{5} x_{jk}^2}}$

Both similarity measures can be used for describing the likeness of two departments. However, the Euclidean distance describes the distance between the endpoints of the vectors in the five-dimensional vector space, i.e. the length of the difference vector, and thus the absolute values of the performance indicators weigh heavily. The cosine similarity measure instead describes the angle between the two vectors, meaning that the absolute values of the indicators do not count, while their relative values do. An exact cosine match (S=1) would mean department A could be described as a scaling of department B, e.g. every indicator in A is exactly twice the value in B, or [F6]:

[F6]  $P_A = s \cdot P_B$

where s is any positive real number. The cosine similarity measure is closely related to Pearson's correlation coefficient, commonly denoted r. However, when data are not centred, as is the case here, cosine has the advantage of using the original values of the vectors (10).
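To illustrate the difference between the two measures, here is a brief Python sketch using hypothetical indicator vectors (not actual department data), where department B is an exact scaling of department A:

```python
import numpy as np

def euclidean(p, q):
    """Euclidean distance between two performance vectors [F4]."""
    return np.sqrt(np.sum((p - q) ** 2))

def cosine(p, q):
    """Cosine similarity between two performance vectors [F5]."""
    return np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

# Hypothetical indicator values; department B scores exactly twice department A on every indicator
dep_a = np.array([10.0, 5.0, 2.0, 8.0, 1.0])
dep_b = 2 * dep_a

print(euclidean(dep_a, dep_b))  # about 13.9: the distance grows with absolute size
print(cosine(dep_a, dep_b))     # 1.0: an identical research profile, regardless of scale
```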

Using the cosine similarity measure for clustering departments thus fulfils the purpose of identifying research profiles independently of their size. Forming clusters based on cosine similarities (values ranging from 0 to 1) requires an n×n similarity matrix, C, where n is the number of departments involved and the value of Cij is given by [F7]:

[F7]  $C_{ij} = S(P_i, P_j)$

From such a matrix, clusters may be formed using a range of different clustering algorithms. These may basically be split into hierarchical and partitional methods. Partitional methods often require certain values to be known prior to analysis, e.g. the k-means algorithm requires the number of clusters to be predefined (11), or produce potentially overlapping clusters, e.g. when using graph-theoretic distance (12). As a classification of departments into research profiles is desired, clusters should not overlap, and the number of clusters is not known either. Thus, the hierarchical approach appears to be the most useful, as it does not require any of this information. The hierarchical method may form clusters of varying strengths based on a predefined similarity threshold, or one may produce a dendrogram illustrating clusters at different similarity levels. As any single choice of similarity threshold would not be meaningful in this context, the dendrogram approach is chosen. The inclusion of a department in a cluster will occur at the lowest similarity level, meaning the included department has a similarity of at least x to all existing members of the cluster. This is similar to the complete-link clustering method (with a predefined similarity threshold). If grouping the departments into research profiles is possible, the raw values of the framework may be used to classify departments, allowing for greater comparison between these.

Publication analysis
The purpose of analysing publication data specifically is to evaluate the normalised impact measure used to indicate a large part of the scholarly activities. Specifically, I will investigate whether there is a correlation between the number of collaborating author affiliations and the level of the journal a given paper is published in. This is interesting as a presumption about such a relationship is used as an argument for not explicitly crediting large collaborations.



This leads to a research hypothesis:

H1: There is a statistically significant difference between papers published in low versus high level journals regarding the number of collaborating departments, as indicated by author affiliations.

Thus, the corresponding null hypothesis may be formulated as:

H1_0: There is no statistically significant difference between papers published in low versus high level journals regarding the number of collaborating departments, as indicated by author affiliations.

The data used for statistical testing of the null hypothesis will be drawn from three different populations: papers from journals at level C, level B and the combined A+A* levels. The combination of A+A* is necessary as the number of papers published in A* journals is too small to represent an individual population. Also, the journals at level A* were manually selected from level A in the first place. The data in each population are the number of collaborating departments of each individual paper (ordinal data). As the data cannot be expected to be normally distributed, the Mann-Whitney U test will be used to test pairs of populations, with a required significance of p<0.05 in order to reject the null hypothesis for a given pair.
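A minimal sketch of how such a test could be run with SciPy, using illustrative collaborator counts rather than the actual study data:

```python
from scipy.stats import mannwhitneyu

# Illustrative numbers of collaborating affiliations per paper for two of the populations
level_c_collaborators = [1, 2, 2, 3, 3, 3, 4, 5]
level_ab_collaborators = [2, 3, 4, 4, 5, 6, 7, 9]

# Two-sided Mann-Whitney U test; the null hypothesis is rejected for this pair if p < 0.05
u_statistic, p_value = mannwhitneyu(level_c_collaborators, level_ab_collaborators,
                                    alternative="two-sided")
print(u_statistic, p_value)
```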

Results
In this first section, data from the complete 2007 performance assessment are reported and analysed, followed by the 2002-2007 publication analysis.

Clustering
The complete data from the 2007 performance assessment may be found in table 8. In the following, the five main indicators will be labelled by numbers as follows:
1. Funding
2. Scholarly activities
3. Mediation
4. Networking and merits
5. Innovation and technology transfer

Table 8 Complete indicator scores for Aalborg Hospital departments, 2007

From the data in table 8, a similarity matrix has been formed, containing cosine similarities as described in the methods section. Agglomerative hierarchical clustering has been performed using the R statistical computing software, which, however, requires dissimilarity data rather than similarities. This is simply a matter of transforming all similarity values, S, into dissimilarity values, S_D [F8]:

[F8]  $S_D = 1 - S$

This means values of 0 indicate a perfect match, while values of 1 indicate a complete mismatch. The resulting hierarchical, complete-link clusters are illustrated as a dendrogram in figure 1. The one outlying department (Neurology) is not counted in the following discussion. I have chosen to form clusters at the 0.15 cut-off, which is the same as a similarity threshold of 0.85. This level has been chosen because a reasonably large variation in similarities occurs at this value. The same could be said about 0.3-0.6; however, only two clusters would be formed at that value, where five are formed at 0.15. These five clusters are labelled P1-P5 as indicated in figures 2a-e.
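The analysis in this study was performed in R; the following is a rough Python sketch of the same steps (cosine similarity matrix [F7], dissimilarity transform [F8] and complete-link clustering), using placeholder indicator vectors rather than the table 8 values. Calling scipy.cluster.hierarchy.dendrogram on Z would produce a figure-1-style plot.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Rows are departments, columns the five indicators (placeholder values, not the table 8 data)
P = np.array([
    [23.6, 1.0, 12.5, 9.0, 0.0],
    [0.5, 0.0, 0.0, 22.0, 0.0],
    [8.2, 7.0, 25.0, 11.0, 0.0],
    [4.9, 1.0, 4.0, 16.0, 12.0],
])

# Cosine similarity matrix C [F7], turned into dissimilarities S_D = 1 - S [F8]
unit = P / np.linalg.norm(P, axis=1, keepdims=True)
C = unit @ unit.T
D = np.clip(1 - C, 0, None)

# Complete-link agglomerative clustering on the condensed (upper triangle) dissimilarities
condensed = D[np.triu_indices_from(D, k=1)]
Z = linkage(condensed, method="complete")
clusters = fcluster(Z, t=0.15, criterion="distance")  # the 0.15 cut-off used in the paper
print(clusters)
```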



Figure 1 Dendrogram showing clusters of departments

Figure 2a-e Clusters P1-P5, showing the values of all departments on all five indicators. The values connected by lines represent the mean values of those indicators; the line itself serves entirely as a visual aid and carries no meaning.

In general, all clusters have some value for funding, which is why this will not be commented on further for the individual clusters. Cluster P1 clearly consists of a group of very active research departments. The scholarly activities in this cluster are especially high, and there is also some representation from networking and merits, but practically no mediation or innovation and technology transfer. P2 appears more as an all-round group with decent values in all areas except mediation. P3 has a clear focus on networking activities and merits, while the other values are not very high. The departments in P4 are hardly represented at all. A likely explanation is that many of these departments are mainly clinical departments with very little focus on research, but another explanation might be that reports from some of these departments have been lacking information. P5 resembles P2 somewhat, however with a greater emphasis on innovation and technology transfer. I find that each cluster describes a distinct type of department, allowing classification of these departments into four active categories (P1-P3 and P5) and one passive category (P4).

Publication analysis

During the period 2002-2007, the number of papers published each year from Aalborg Hospital has greatly increased (roughly doubled), and so has the normalised impact (points) of these papers (figure 3). While the normalised impact has increased less than the number of papers, the mean value of points per paper (figure 4) has not changed dramatically. However, the largest change in points per paper appears at the same time (2005-2006) as the largest annual increase in the number of papers. This raises the question of whether this is due to publishing in lower level journals or to a change in the mean number of collaborating departments per paper, as both parameters would affect the mean points per paper.

Figure 3 Number of papers published from Aalborg Hospital each year from 2002-2007 and the cumulative points (normalised impact) for these papers.

Figure 4 Mean points per paper from Aalborg Hospital each year from 2002-2007.

Figures 5 and 6 show these parameters, and it appears obvious that the increase in the mean number of collaborating departments is greater than any change in journal level.

Figure 5 Relative distribution of papers on the four journal levels.

Figure 6 Mean number of collaborating departments per paper.

Seeing that such a change in the number of collaborators has occurred, it becomes even more important that the argument for not explicitly rewarding such collaboration - that collaborations in general publish in higher level journals - holds true. The distribution of collaborators per paper in the three populations (levels C, B and A, see methods) is illustrated in figure 7.

Figure 7 Relative number of papers (y-axis) written by varying numbers of collaborators (x-axis). Each series represents papers published at the respective level, and the relative number is derived from the total number of papers published at that level from 2002-2007.

Testing the probability that the populations are alike gives the following results: p(A,B)=0.298, p(A,C)=0.000, p(B,C)=0.000. Given the significance threshold of 0.05, the null hypothesis holds for the case (A,B), while it does not hold for the cases (A,C) and (B,C). This means that there is no statistically significant difference in the number of collaborators for papers published in level A and level B journals; however, both populations are significantly different from the level C papers. The median number of collaborators for levels A and B is 4, while it is 3 for level C, meaning there are significantly more collaborators per paper in the A and B populations. This means that the argument for not rewarding these collaborations explicitly holds. There may be other arguments, and there may be other distributions elsewhere, but for the given premise, we are satisfied that collaborations are able to reward themselves by achieving publication in higher level journals.

Discussion
The results have shown that cluster analysis allows us to narrow the departments down to four research profiles and a group of inactive departments. These profiles have distinct features, adding new value to the evaluation; those departments which do not obviously distinguish themselves, e.g. due to the low scores of small departments, may be evaluated more appropriately by fitting them into one of these research profiles. The clusters created are dependent on the clustering algorithm used, which is why it may be fruitful to investigate whether the results vary much when clustering approaches other than the one proposed in this paper are used. This will also be necessary when the time dimension is added in future assessments. A method must be devised for either displaying changes in clusters over time or creating clusters based not only on indicator scores but also on changes in these.

It has been shown that papers written by many collaborating departments are generally published in higher level journals than those written by fewer collaborators. However, at the same time, the results show little change over time in the relative distribution of journal levels published in, while there is still a drop in points per paper. This indicates that the effect of collaboration might not be as large as needed to maintain the points per paper, or that the high number of collaborators on some papers skews the distribution. While the initial argument has been shown to hold true, its effect might still be too small. A possible solution could be to implement a minimum score per collaborator, e.g. 1/10th of SJ, which would also mean an increase in total points. However, one might fear that such a scoring scheme could lead to a behaviour change where (local) collaborators are added pro forma to boost local scores. While there is no guarantee such behaviour would develop, one must still consider the possibility and its implications in contrast to the lost points per paper.


The discussion about the scoring system for departments may also be carried over to authors: should the points possibly be fractioned per author instead of per department? It appears obvious that the use of authors is by far a more direct way of fractioning scores, in particular when many authors from one department collaborate with few authors from another: each department would receive the same score using the current method, which does not appear to be entirely fair. When we chose to use departments, it was for several reasons. The above example is not very common; most often, each department will not be represented by more than one or two authors[3]. Also, it is often easier to identify and register each affiliation address than the relationship between these addresses and the authors, which would also involve considerably more manual work. All affiliation registrations are performed manually in order to maintain as high a data quality as possible. This has made it practically impossible to use author data for fractioning in our case, yet it remains interesting to investigate further. The assessments reported in this paper are mostly descriptive, and while they are interesting in themselves, a point of reference is lacking. It is my hope that other university hospitals will try our assessment framework so that there will be other institutions we can compare our results to, and vice versa.

[3] This holds for local publications; it is not a general assumption.
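As a sketch of one of the alternatives discussed above, the following hypothetical Python variant of the fractioning rules guarantees each collaborator a minimum share (here one tenth of SJ, as suggested). It is not part of the framework; the function and argument names are illustrative.

```python
def fractioned_scores(s_j, n_affiliations, min_share=None):
    """Department-based fractioning [F1], [F2], optionally with a minimum share per collaborator."""
    if n_affiliations == 1:
        return [s_j]
    shares = [s_j / 2] + [s_j / (2 * (n_affiliations - 1))] * (n_affiliations - 1)
    if min_share is not None:
        # Raise any share below the floor; note that this inflates the total points awarded
        shares = [max(share, min_share) for share in shares]
    return shares

s_j = 5  # a level A journal
print(sum(fractioned_scores(s_j, 10)))                      # 5.0 under the current rules
print(sum(fractioned_scores(s_j, 10, min_share=s_j / 10)))  # 7.0: totals grow with the floor
```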

Conclusions
This paper has presented a novel research assessment framework, reported results for Aalborg Hospital, Aarhus University Hospital, for 2007, and provided a detailed publication analysis for 2002-2007. The presented results mainly serve as a reference point for future evaluations, as the framework is being used for the first time as a strategic research management tool. Using cluster analysis, it has been possible to identify four research profiles into which most departments at Aalborg Hospital fit. These profiles simplify assessment and allow smaller departments to be more precisely characterised. The method also allows identification of areas of potential interest, or areas which require attention. The publication analysis shows growth in the number of papers published during the period, and it has been shown that papers published by several collaborators are generally able to achieve a higher journal score.

Acknowledgements
A special thanks to Søren Lundbye-Christensen from the Centre for Cardiovascular Research at Aalborg Hospital for inspiration.

References
(1) Sivertsen G. Måling av forskningsaktivitetene ved helseforetakene. Vitenskapelige artikler og doktorgrader som resultatindikatorer. NIFU skriftserie 1/2003. Oslo: NIFU; 2003.
(2) Garfield E. Use of Journal Citation Reports and Journal Performance Indicators in measuring short and long term journal impact. Croat Med J 2000;41(4):368-74.
(3) Seglen PO. The skewness of science. J Am Soc Inf Sci 1992;43(9):628-38.
(4) Opthof T, Coronel R, Piper HM. Impact factors: no totum pro parte by skewness of citation. Cardiovasc Res 2004;61:201-3.
(5) Tsay MY. The relationship between journal use in a medical library and citation use. Bull Med Libr Assoc 1998;86(1):31-9.
(6) Saha S, Saint S, Christakis DA. Impact factor: a valid measure of journal quality? J Med Libr Assoc 2003;91(1):42-6.
(7) Tsay MY. Library journal use and citation half-life in medical science. J Am Soc Inf Sci 1998;49(14):1283-92.
(8) Bergstrom CT, West JD, Wiseman MA. The Eigenfactor metrics. J Neurosci 2008;28(45):11433-4.
(9) Schneider JW, Borlund P. Matrix comparison, Part 1: Motivation and important issues for measuring the resemblance between proximity measures or ordination results. J Am Soc Inf Sci Technol 2007;58(11):1586-95.
(10) Anderberg M. Cluster analysis for applications. New York: Academic Press; 1973.
(11) MacQueen J. Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; 1967 Jun 21; Berkeley: University of California Press; 1967. p. 281-97.
(12) Augustson JG, Minker J. An analysis of some graph theoretical cluster techniques. J ACM 1970;17(4):571-88.


