IBM Software Business Analytics

Government

Predictive Analytics in Human Capital Management


Contents:
Introduction
Predictive analytics in recruitment
Predictive analytics in career management
Predictive analytics in employee retention
Conclusion
IBM SPSS products for predictive analytics
About IBM Business Analytics

Introduction

Human Capital Management involves a number of different areas, including recruitment management, career training and skills development, and the reduction of turnover or attrition. In this paper, we will not focus on providing a comprehensive set of best practices for these areas. Instead, we will focus on providing concrete examples of how statistical analysis and predictive analytics have helped organizations solve specific problems related to human capital management.

In general, predictive analytics combines technologies for analyzing past, present, and projected future relationships among data points with decision management technologies for delivering predictive insights and recommended actions to the systems or people that can effectively implement them.

SPSS was one of the pioneers in the field of data analysis and continues to be one of the most popular and widely used software applications. As a new member of the IBM organization, SPSS brings its leading-edge analytic products and solutions to an even greater number of organizations worldwide. IBM SPSS offerings include industry-leading products for data and text mining, data collection, statistics and management that can assist in your company's human capital management through the predictive analysis of recruitment priorities, career management, and employee satisfaction and retention.

IBM SPSS tools are based on industry standards and can easily integrate with your existing infrastructure to improve accuracy, decrease manpower and minimize loss. The combined effort of IBM and SPSS brings you the utmost in flexibility in the kinds of data you mine and how you deploy results.

Highlights:
IBM SPSS predictive analytics can help organizations solve several specific problems related to human capital management. By becoming better able to anticipate and plan for their needs, organizations can more efficiently recruit staff, make the best use of their skills and help people advance in their careers.

IBM SPSS technologies for predictive analytics encapsulate advanced mathematical and statistical expertise to extract predictive knowledge that, when deployed into existing processes, makes them adaptive to improve outcomes. Our predictive analytics software will help you:

- Capture all the information you need about people's attitudes and opinions
- Predict the outcomes of interactions before they occur
- Act on your insights by embedding analytic results into business processes

Predictive analytics in recruitment


Recruitment refers to the process of sourcing, screening, and selecting people for a job at an organization or firm, or for a vacancy in a volunteer-based organization or community group. While managers or administrators can undertake some components of the recruitment process, mid- and large-size organizations and companies often retain professional recruiters (internal or external to the organization) or outsource some of the process to recruitment agencies.

Traditionally, recruiting has relied heavily on the experience of the individual recruiter in determining the viability and likely success of a potential recruit. A recruiter with knowledge and experience of a particular field can often determine the likely success of an employment candidate by analyzing and screening that person's credentials, past work history, and associated skills. The recruiter's experience also allows him or her to quickly focus on the factors that are most likely to determine success in a given job or employment environment. However, there are situations in which the volume of potential recruits or the intricacies of a specific job requirement can overwhelm the efforts of even the best individual recruiter.

Through the use of statistical analysis and predictive analytics, a recruiter or recruiting agency can apply the experience and intuition of expert recruiters in creating a model that helps an organization to prioritize and target the individuals most qualified for a specific position.

As an example, one of the branches of the U.S. military is responsible for getting more than 100,000 new recruits under contract every year. In order to achieve this goal, the recruiting command must target the right group of people through a variety of marketing campaigns, public speaking engagements, and personal interviews. These efforts will deliver approximately 600,000 leads that must then be prioritized and sent to individual recruiters. The local recruiters will then focus on the candidates that they feel are most likely to agree to contracts, and least likely to attrite after being contracted.

Business benefits:
Some of the aspects of human capital management that benefit most from IBM SPSS predictive analytics are:

- Recruitment: More easily identify the most suitable candidates to fill positions
- Career management: Spot the criteria that predict high on-the-job performance and job satisfaction
- Retention: Determine which factors contribute to employee attrition, allowing for better workforce planning

IBM SPSS modeling for lead prioritization


The process of creating a predictive model to provide a prioritized list to local recruiters in the example above begins with data collection. Data points for a potential recruit can be obtained through a variety of methods, including in-person or phone interviews, online surveys, and write-in forms. In very large organizations, such as military organizations, this data is often passed to the recruiting office without any personally identifiable information. If this is the case, it is difficult to determine whether that individual was contracted and what degree of success he or she had in his or her career. Anonymous data can be used to create a general predictive model describing whether the recruiting office is attracting the interest of the right type of individual, but it is typically not sufficient for creating a model that predicts recruiting success at an individual level.

To successfully create a predictive model for recruiting, the recruiting office will need to collect that same information and maintain it alongside a historical list of which individuals were successfully signed or contracted. That information can then be maintained alongside a database of each person's career performance record in order to create predictive models for retention at an individual level.

In situations where it is possible to retain demographic characteristics and a history, such as for a university recruiting and enrollment drive, predictive models can be created that help explain what factors are predictive of enrollment. The model can then be applied to new recruits to gauge their enrollment potential.

Figure 1 is an example of data collected in a recruiting and enrollment drive by a university. When a potential student has contact with the university, the information provided by the student is maintained in a database. Over time, additional data is collected through phone calls initiated by the student or school and through a number of visits, and it includes standardized test scores as well as demographic information. Academic recruitment offices also typically maintain a flag field that states whether that person applied for enrollment in the university. Over a period of time, the university may have collected data on thousands of potential students. An experienced recruiting staff would likely be able to focus on a few key variables that their experience shows determine whether a potential student will apply to the school.

Figure 1: Data describing characteristics of potential students and their interactions with the school helps recruiters focus their efforts on those most likely to enroll.

Figure 2: An interactive, 3-D view of enrollment by the number of contacts and interest in a department.

Figure 3: An interactive heat map showing combined SAT scores by enrollment status and interest in a department.

The office, however, may be collecting a good number of likely enrollment predictors, including distance from school, high school grade point averages, and department or major of interest. Individually, there are many variables that could be identified as important factors in determining a likely applicant. However, the challenge for any recruiter, when faced with a rich set of data, is determining the relative importance of each variable. In addition, it is important to determine how the likelihood of enrollment might be affected by different combinations of values in the predictor variables. For example, a person with a high combined SAT score might still have a low likelihood of applying if he or she has had very little contact with the university. Likewise, those with a small number of contacts with the university might still have a high likelihood of applying if their current residence is a very short distance from the university.

By using a predictive modeling algorithm, this school's recruiters were able to create models that distinguished which combinations of variables and values tended to lead to an enrollment application and which did not. The resulting model can then be used to score new cases for which the outcome is not yet known. The model not only predicts which cases are likely to lead to enrollment, but also provides a propensity score for each classification, making it easy to prioritize cases for the recruiters.

A view of the model, in Figure 4, shows an example rule found in a dataset. This model was generated by an auto-modeling classification technique. Auto-modeling selects the best technique based on the data and the outcome and automatically creates a powerful ensemble (combination) model that is typically more stable and more accurate than one based on a single technique. When the predictive model is applied to new data, every case is scored against the set of rules created by the model in order to classify the likely outcome, as shown in Figure 5.
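As a rough illustration of scoring and prioritization, the sketch below applies a simple additive logistic model to a few new leads and sorts them by propensity. It is not the auto-modeling technique described above; the coefficients, field names, and leads are hypothetical and chosen only to show the mechanics, and "propensity" is taken here to mean the modeled probability of applying.

```python
import math

# Hypothetical coefficients for illustration only; a real model would be
# fitted to historical inquiry and application data.
COEFFICIENTS = {
    "intercept": -4.0,
    "sat_hundreds": 0.25,     # combined SAT score divided by 100
    "contacts": 0.60,         # number of contacts with the university
    "distance_100mi": -0.50,  # distance from campus in units of 100 miles
}


def propensity(lead):
    """Return the modeled probability (between 0 and 1) that a lead applies."""
    z = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["sat_hundreds"] * lead["sat"] / 100.0
        + COEFFICIENTS["contacts"] * lead["contacts"]
        + COEFFICIENTS["distance_100mi"] * lead["distance_miles"] / 100.0
    )
    return 1.0 / (1.0 + math.exp(-z))


def score_leads(leads, threshold=0.5):
    """Attach a prediction and a propensity to each lead, highest first."""
    scored = []
    for lead in leads:
        p = propensity(lead)
        scored.append({**lead, "will_apply": p >= threshold, "propensity": round(p, 3)})
    return sorted(scored, key=lambda row: row["propensity"], reverse=True)


# Hypothetical new leads for which the outcome is not yet known.
leads = [
    {"id": "A", "sat": 1500, "contacts": 0, "distance_miles": 400},
    {"id": "B", "sat": 1100, "contacts": 1, "distance_miles": 5},
    {"id": "C", "sat": 1200, "contacts": 5, "distance_miles": 50},
]
ranked = score_leads(leads)
for row in ranked:
    print(row["id"], row["will_apply"], row["propensity"])
```

Under these assumed coefficients, the lead with the highest SAT score but no contacts ranks last, which mirrors the point that no single variable settles the outcome.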

Predictive analytics in career management


Predicting the success of a potential employee or recruit in a given work environment is a more difficult task than the recruiting process described in the previous section. Once an employee is hired, there are numerous additional variables that affect a successful outcome for that person's career. For example, over time, a person hired to fill a specific job requirement may eventually be assigned tasks outside of the position he or she was hired for. Those changes have to be accounted for over time. Additionally, changes in management, co-workers, and mission goals, among other factors, can positively or negatively affect the performance of the individual.

As in the previous example, collecting a more complete set of data will usually provide more flexibility in creating a predictive performance model, and will likely lead to a more accurate model. However, in a work environment where the job requirements are dynamic and skill levels can vary widely, the prediction of future performance for a specific job function is akin to hitting a moving target. In these cases, it is often useful to find a set of criteria that focus on common attributes that can be easily compared from one person to another or one department to another. These attributes are often obtained through opinion surveys, satisfaction surveys, and past performance reviews. This type of data, while often subjective, can provide a framework for creating a predictive performance model.

Figure 4. A rule describing which students are most likely to apply for enrollment.

Figure 5. This shows two new variables (columns). The first is a prediction of whether this person is likely to apply at the university. The second is the propensity score for this prediction.

IBM SPSS modeling for performance prediction


Can we assume that an employee who has consistently positive performance reviews, considers his or her job motivating, and reports a high degree of job satisfaction is more likely to have a high level of performance, regardless of specific job duty, title, or pay grade? We can never dismiss the importance of specific and quantifiable job skills, experience, or intelligence. However, it is likely that in the absence of a complete and common set of quantifiable data for model creation, data collected on a person's beliefs, outlook, and attitude can improve the accuracy of a predictive performance model.

This hypothesis can be tested by analyzing job performance in a controlled environment, where every person shares a similar environment and similar work expectations, and where, for the most part, changes are applied equally to all persons. An example of such an environment is a university-level military academy. In a military academy, almost all individual variables, such as academic expectations, student housing, and extracurricular activities, are fairly well controlled. In addition, all students share a somewhat common expectation of their careers after graduation.

For example, a military academy collected more than 200 separate data points from incoming freshmen. Recruiters and administrators were interested in creating a model to predict the likelihood of an incoming cadet successfully completing the four-year program, as well as the likelihood of the cadet exceeding the minimum length of service commitment after completing the academic program.

Given the large number of potential predictor variables in the dataset, the first step, after data cleansing, was to screen, rank, and select the predictor variables. In this example, all candidate predictors were screened to remove unwanted or problematic variables, such as variables with too many missing values, variables whose values are unique to each record (such as an ID), or variables with a very low coefficient of variation. The ranking process calculated the importance value of each variable from the p value of the appropriate statistical test of association between the candidate predictor and the target variable. In this case, each predictor was tested against the target flag field (yes or no) that indicated whether an incoming student exceeded the target length of service (5 years). Finally, the selection process used a statistical measure based on the total number of candidate predictors in order to select a subset of the most relevant or highly ranked predictors.

The four most important predictors were obtained from quantifiable data. This result was expected, since past academic success is often an indication of future academic success and dedication to a specific long-term goal. A more surprising result was that of the other 32 predictors chosen as most important in predicting the future performance of an incoming freshman, 21 were based on their opinions, values, and beliefs. The students' opinions on their life, academic priorities, dress code, and separation from family and friends all ranked higher than high school SAT or ACT test scores in determining future success. Figure 6 shows how attitudinal data and structured data may be combined for analysis.
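The screening and ranking steps can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the procedure the academy used: the thresholds, the variable names, and the eight-row dataset are hypothetical, and the ranking handles only yes/no (0/1) predictors with a chi-square test of association against the yes/no target.

```python
import math
from statistics import mean, pstdev


def screen(columns, max_missing=0.5, min_cv=0.1):
    """Drop predictors that are mostly missing, ID-like, or nearly constant.

    `columns` maps a predictor name to its list of values (None = missing).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    kept = []
    for name, values in columns.items():
        present = [v for v in values if v is not None]
        if not present or len(present) < (1 - max_missing) * len(values):
            continue  # too many missing values
        if all(isinstance(v, (int, float)) for v in present):
            center = mean(present)
            if center != 0 and pstdev(present) / abs(center) < min_cv:
                continue  # very low coefficient of variation
        elif len(present) > 1 and len(set(present)) == len(present):
            continue  # every value is unique, so the field behaves like an ID
        kept.append(name)
    return kept


def chi_square_p(xs, ys):
    """p value of a chi-square test of association between two 0/1 variables."""
    n = len(xs)
    counts = [[0, 0], [0, 0]]
    for x, y in zip(xs, ys):
        counts[x][y] += 1
    row = [sum(counts[0]), sum(counts[1])]
    col = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]
    if 0 in row or 0 in col:
        return 1.0  # a constant variable carries no evidence of association
    stat = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = row[i] * col[j] / n
            stat += (counts[i][j] - expected) ** 2 / expected
    # Survival function of the chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(stat / 2.0))


def rank(columns, target, names):
    """Order predictors from smallest p value (strongest association) up."""
    scored = [(name, chi_square_p(columns[name], target)) for name in names]
    return sorted(scored, key=lambda item: item[1])


# Hypothetical data: 1 = exceeded the target length of service.
target = [1, 1, 1, 1, 0, 0, 0, 0]
columns = {
    "cadet_id": ["c1", "c2", "c3", "c4", "c5", "c6", "c7", "c8"],
    "essay_score": [None, None, None, None, None, 3, None, 4],
    "height_cm": [170, 171, 170, 172, 171, 170, 171, 172],
    "values_tradition": [1, 1, 1, 1, 0, 0, 0, 0],
    "left_handed": [1, 0, 1, 0, 1, 0, 1, 0],
}
kept = screen(columns)
ranked = rank(columns, target, kept)
print(kept)
print(ranked)
```

A selection step would then keep only the top of this ranking; one common convention, assumed here rather than taken from the paper, is to report importance as 1 minus the p value.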

Figure 6. IBM SPSS Modeler can consolidate data visually from multiple sources, such as demographics data and attitudinal data.

Modeling employee satisfaction


As noted in the previous example, employee opinions and outlooks can be an important predictor of performance. Similarly, opinion surveys and employee feedback are useful in determining satisfaction and improving employee loyalty. A number of survey research methods might be useful in analyzing employee satisfaction. It is beyond the scope of this paper to determine the most appropriate survey methods and questions for obtaining this information. Instead, we will focus on the extraction of meaning from employee surveys and the analysis of survey responses. One survey research organization found two questions that drive job satisfaction and employee commitment:

- Do I like my experiences working at this job?
- Do I approve of how this organization functions?

From those two questions, a number of other more specific questions can be derived that explore the different aspects of the work environment, including recognition, teamwork, and pay and benefits. These questions can be asked in a number of different ways. Some of the questions can be easily presented as binary-response questions. For example, an employer may ask, "Do you feel that your health and safety is a priority within our company?" Other questions are better presented as multiple-choice questions or in Likert-scale format. (The format of a typical five-level Likert item is: 1=strongly disagree, 2=disagree, 3=neither agree nor disagree, 4=agree, and 5=strongly agree.)
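As a small illustration of how Likert responses become analyzable data, the sketch below codes the five response labels listed above as 1 through 5 and summarizes one survey item. The function name, the summary fields, and the sample responses are hypothetical.

```python
from collections import Counter

# Coding of the five-level Likert item described in the text.
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neither agree nor disagree": 3,
    "agree": 4,
    "strongly agree": 5,
}


def summarize(responses):
    """Convert response labels to codes and summarize one survey item."""
    codes = [LIKERT[r.strip().lower()] for r in responses]
    counts = Counter(codes)
    return {
        "n": len(codes),
        "mean": sum(codes) / len(codes),
        # Share of respondents who chose "agree" or "strongly agree".
        "share_agree": sum(1 for c in codes if c >= 4) / len(codes),
        "distribution": {level: counts.get(level, 0) for level in range(1, 6)},
    }


# Hypothetical responses to a single item.
responses = ["Agree", "Strongly agree", "Disagree", "Agree", "Neither agree nor disagree"]
summary = summarize(responses)
print(summary)
```

Reporting the distribution alongside the mean is a deliberate choice, since Likert codes are ordinal and an average alone can hide a split between strong agreement and strong disagreement.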

Using and analyzing responses to open-ended questions


In addition, some opinions can only be properly addressed through the use of open-ended survey questions. These typically begin with words such as "Why" and "How," or phrases such as "Tell me about...". Often, they are not technically a question, but a statement that implicitly asks for a response. [1] The analysis of open-ended questions requires a larger investment of time and resources from an organization's human resources department. It also requires a different set of analysis tools, commonly referred to as text analytics or text mining.

IBM SPSS text analytics provides a technical foundation for extracting usable knowledge from unstructured text data through identification of core concepts and sentiments. Text analytics allows users to understand the relationships between concepts and the sentiment around concepts, and ultimately to create a structure for unstructured text data that can be integrated with analytics. IBM SPSS technology uses a linguistics-based approach, rather than a frequency-based or statistics-based approach, which provides for richer analysis and a deeper understanding of the underlying concepts.

Figure 7: A view into text analytics within IBM SPSS Modeler Premium. On the left is a list of extracted categories and on the right is a visual representation of the linkages between concepts and sentiments (sentiment analysis).

For example, a survey may be administered to gauge sentiment towards a new recruiting initiative. Some of the questions are typically Likert-scale items (for example, "How effective was this initiative?" 1=very effective, 2=somewhat effective, 3=neutral, 4=somewhat ineffective, 5=very ineffective). The final question is open-ended and asks each respondent to comment on the new recruiting initiative. Considering the large number of respondents providing comments, reading and categorizing every person's open-ended responses is a daunting task. Manual categorization typically results in inconsistencies between analysts and is also time consuming.

By using IBM SPSS Modeler Premium's integrated text mining workbench, natural language processing is used to extract concepts from the survey comments. Rich, industry-specific linguistic resources that span over 180 linguistic taxonomies allow the user to explore relationships between concepts and sentiments within the text. For example, concepts related to a soldier's family may be automatically included in a type called "family." Text link analysis extracts not just the type, but the sentiment associated with the type. For example, a comment related to worries about leaving family may be categorized into a "negative family" category, while a comment related to the positive feelings around providing for the family may be categorized into a "positive family" category. Finally, a set of categories is created to provide a high-level grouping for each response. This categorization can be at either the concept level or the concept and sentiment level.
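The idea of grouping comments into concept and sentiment categories can be shown with a toy example. The sketch below is a plain keyword lookup, which is exactly the kind of simple approach the linguistics-based technology described above goes beyond; the word lists, category names, and comments are hypothetical, and one sentiment is assigned to the whole comment rather than linked to each concept.

```python
import re

# Hypothetical mini-lexicons; real linguistic resources are far richer.
CONCEPT_TYPES = {
    "family": {"family", "wife", "husband", "kids", "children", "parents"},
    "pay": {"pay", "salary", "bonus"},
}
POSITIVE = {"proud", "glad", "good", "provide"}
NEGATIVE = {"worried", "worry", "miss", "afraid", "bad"}


def categorize(comment):
    """Return categories such as 'negative family' for one survey comment."""
    tokens = set(re.findall(r"[a-z']+", comment.lower()))
    has_positive = bool(tokens & POSITIVE)
    has_negative = bool(tokens & NEGATIVE)
    if has_positive and has_negative:
        sentiment = "mixed"
    elif has_positive:
        sentiment = "positive"
    elif has_negative:
        sentiment = "negative"
    else:
        sentiment = "neutral"
    # A type applies when any of its words appears in the comment.
    types = sorted(t for t, words in CONCEPT_TYPES.items() if tokens & words)
    return [f"{sentiment} {t}" for t in types]


# Hypothetical open-ended responses.
comments = [
    "I am worried about leaving my family.",
    "Glad I can provide for my kids with the bonus.",
    "The website was slow.",
]
for comment in comments:
    print(categorize(comment), "<-", comment)
```

The resulting category labels can be stored as flag fields next to the structured survey answers, which is what allows text-derived information to enter a predictive model.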

The most important benefit of integrating text analytics into modeling is the ability to improve model accuracy through data that is considerably richer in content than structured data alone.

Predictive analytics in employee retention


Employee attrition involves both direct and indirect costs, and neither type is trivial. In fact, the cost of employee turnover in for-profit organizations has been estimated to be up to 150% of the employee's remuneration package. Direct costs relate to leaving costs, replacement costs, and transition costs, while indirect costs relate to the loss of production, reduced performance levels, unnecessary overtime, and low morale. [2]

Although it might initially appear that there is a strong positive correlation between an increase in employee benefits and a reduction in attrition or turnover, studies have shown that often this is not the case. The effect of benefits on turnover varies somewhat by industry group. Adding benefits is a more effective way of lowering turnover among firms that have mainly part-time employees than among those with mostly full-time employees.

While there might be some commonality in factors leading to turnover and attrition, different types of organizations often have unique environments that can positively or negatively affect turnover rates. Statistical analysis and predictive analytics techniques have been successfully applied in uncovering some of these factors in a number of different organizations.

As an example, a branch of the military was interested in uncovering the factors that lead to attrition in soldiers. In examining the data, there were certain variables that appeared as likely candidates for predicting attrition. To start with, a comparison was made between data on a person's area of expertise or PMOS (Primary Military Occupational Skill) and that person's current job or DMOS (Duty Military Occupational Skill). Researchers expected to see a higher rate of attrition among soldiers who were trained to do one job but were currently assigned to do something different.

Further analysis found that the answer to predicting attrition lies not in any one factor but rather in a combination of factors, and in specific values within those factors that together lead to a higher or lower tendency for attrition. The predictive model was able to uncover the predictive value of individual factors, as well as to determine which combination of factors was most predictive of attrition. The model may reveal exactly how specific values and ranges of time in service, time left in service, education level, current marital status, pay grade, and a specific job duty all combine to determine the likelihood of attrition. With this information, this military command can not only intelligently target those individuals who are most likely to attrite, but can also work towards addressing the conditions that might cause dissatisfaction among their soldiers.
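To see how a combination of factors can be predictive when a single factor is not, consider the sketch below, which computes attrition rates for each combination of chosen factors. The records are hypothetical and deliberately constructed to make the point; they are not findings from the study, and the field names and job codes are placeholders.

```python
from collections import defaultdict


def attrition_rates(records, factors):
    """Attrition rate for each observed combination of the given factors."""
    totals = defaultdict(lambda: [0, 0])  # key -> [attrited count, total count]
    for rec in records:
        key = tuple(rec[f] for f in factors)
        totals[key][0] += rec["attrited"]
        totals[key][1] += 1
    return {key: attrited / total for key, (attrited, total) in totals.items()}


# Hypothetical records: pmos = trained job, dmos = current job, attrited = 1 if left.
records = [
    {"pmos": "A", "dmos": "B", "married": True, "attrited": 1},
    {"pmos": "A", "dmos": "B", "married": True, "attrited": 1},
    {"pmos": "B", "dmos": "A", "married": False, "attrited": 0},
    {"pmos": "B", "dmos": "A", "married": False, "attrited": 0},
    {"pmos": "A", "dmos": "A", "married": True, "attrited": 0},
    {"pmos": "B", "dmos": "B", "married": True, "attrited": 0},
    {"pmos": "A", "dmos": "A", "married": False, "attrited": 1},
    {"pmos": "B", "dmos": "B", "married": False, "attrited": 1},
]
for rec in records:
    # Derived factor: is the soldier working outside the trained specialty?
    rec["mos_mismatch"] = rec["pmos"] != rec["dmos"]

by_mismatch = attrition_rates(records, ["mos_mismatch"])
by_pair = attrition_rates(records, ["mos_mismatch", "married"])
print(by_mismatch)
print(by_pair)
```

In this constructed data the mismatch flag alone shows the same attrition rate in both groups, while the pairing with marital status separates the records completely. A predictive model automates this kind of search across many more factors and value ranges than a hand-built cross-tabulation can cover.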

Conclusion
This paper presents some of the challenges associated with human capital management and some of the analytic techniques that have been effective in solving some specific issues within this space. It is important to recognize that the techniques presented do not demonstrate the only way of solving some of these problems. The application of predictive technologies has the potential of revolutionizing how human capital is managed. By using predictive analytic techniques to examine the wealth of data that is typically available to human resource departments, organizations can find those hidden patterns and relationships within the data and provide a view into the future.

IBM SPSS products for predictive analytics


Improving the way in which your organization manages people, your valuable human capital, can be critical to your ability to achieve your objectives. The IBM SPSS product line offers a host of capabilities that can assist your organization in understanding people's views, opinions, and aspirations, and then predicting their likelihood to exhibit certain behaviors. Put another way, our software helps you understand what people want and what they are likely to do, so that you can attract the people you need, keep them on the job and help them progress and grow in value to your organization. Our products deliver the following functionality:

- IBM SPSS Data Collection: Get an accurate view of opinions and attitudes with a feature-rich suite of survey research software.
- IBM SPSS Statistics: Be confident in your results and your decisions, with the rich features available in the most widely used suite of statistical software in the world.
- IBM SPSS Modeler: Discover hidden relationships in both structured and unstructured (text) data and anticipate the outcomes of future interactions with IBM SPSS Modeler Professional for modeling structured data and IBM SPSS Modeler Premium for both structured and unstructured data.
- IBM SPSS Collaboration and Deployment Services: Manage analytical assets, automate processes and share results with the business to drive reliability, consistency and excellence.
- IBM SPSS Decision Management: Place the power of predictive analytics in the hands of business users to automatically deliver high-volume, optimized decisions at the point of impact.

References
1. http://www.mediacollege.com/journalism/interviews/open-ended-questions.html
2. State of Wyoming: http://doe.state.wy.us/LMI/0203/a2.htm

About IBM Business Analytics


IBM Business Analytics software delivers complete, consistent and accurate information that decision-makers trust to improve business performance. A comprehensive portfolio of business intelligence, predictive analytics, financial performance and strategy management, and analytic applications provides clear, immediate and actionable insights into current performance and the ability to predict future outcomes. Combined with rich industry solutions, proven practices and professional services, organizations of every size can drive the highest productivity, confidently automate decisions and deliver better results. As part of this portfolio, IBM SPSS Predictive Analytics software helps organizations predict future events and proactively act upon that insight to drive better business outcomes. Commercial, government and academic customers worldwide rely on IBM SPSS technology as a competitive advantage in attracting, retaining and growing customers, while reducing fraud and mitigating risk. By incorporating IBM SPSS software into their daily operations, organizations become predictive enterprises able to direct and automate decisions to meet business goals and achieve measurable competitive advantage. For further information or to reach a representative visit www.ibm.com/spss.

© Copyright IBM Corporation 2010

IBM Corporation
Route 100
Somers, NY 10589

US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.

Produced in the United States of America
May 2010
All Rights Reserved

IBM, the IBM logo, ibm.com, WebSphere, InfoSphere and Cognos are trademarks or registered trademarks of International Business Machines Corporation in the United States, other countries, or both. If these and other IBM trademarked terms are marked on their first occurrence in this information with a trademark symbol (® or ™), these symbols indicate U.S. registered or common law trademarks owned by IBM at the time this information was published. Such trademarks may also be registered or common law trademarks in other countries. A current list of IBM trademarks is available on the web at "Copyright and trademark information" at www.ibm.com/legal/copytrade.shtml.

SPSS is a trademark of SPSS, Inc., an IBM Company, registered in many jurisdictions worldwide. Other company, product or service names may be trademarks or service marks of others.

Business Analytics software

IMW14291USEN-01
