Uploaded by Rochana Ramanayaka

MV - Factor

Faculty of Science, University of Colombo

Session 3: Factor Analysis

Factor Analysis

Analysis of Interdependence: for data reduction and the discovery of underlying themes in the data

The term factor analysis was first introduced by Thurstone (1931).

Factor analysis tries to simplify attitudinal data by providing an alternative way of looking at it.

What are the main underlying themes in the data? Which perceptions are related?

FA is based on analysing the correlation matrix of attributes and aims to identify questions that measure what respondents see as similar or related concepts.

Essentially, factor analysis is applied as a data reduction or structure detection method.

Illustration

- Can a set of 30 imagery statements for the shampoo category be simplified without any loss of information?
- There seem to be as many as 42 purchase decision criteria for my category. Can you help summarize these criteria?
- What would Customer Service Orientation consist of? Which variables? Can the variables be grouped into themes / dimensions?
- I want to deploy an objectively tested, valid scale for my Customer Satisfaction studies. My team has developed a huge battery of statements. Can you help?

Factor Analysis

- Investigates interrelationships among variables.
- Variable reduction exercise: reduces the variables to a smaller set of factors with as little loss of information as possible.
- Used to define or discover themes or underlying (latent) dimensions of a large set of attributes / variables.
- Often an intermediate step to some other procedure, e.g. factors are used as independent variables in multiple regression.
- Interdependence technique: no variable is designated dependent or independent.
- All variables should be metric (interval).
- Large samples are preferred.
- The worth of the solution often depends on the intuitive interpretability of the factors rather than on statistical rules.

To understand the principle behind factor analysis, let us consider just two variables from a study measuring people's satisfaction with their lives:

1. How satisfied are they with their hobbies?
2. How intensely are they pursuing their hobbies?

One can summarize the correlation between the two variables in a scatter plot. A regression line can then be fitted that represents the best summary of the linear relationship between the two variables.

[Figure: Simple Regression. Scatter plot of Intensity of pursuing hobby (y-axis) against Satisfaction with hobbies (x-axis), both measured on 7-point scales.]

Ajay Macaden

If we could define a variable representing the regression line, then that variable would capture the essence of the two variables.

In a sense, we have reduced two variables to one factor.

Combining two correlated variables into one factor illustrates the basic idea of Factor Analysis or Principal Component Analysis (PCA). If we extend the two-variable example to multiple variables, the computations become more involved, but the basic principle of representing two or more variables by a single factor remains the same.
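The two-variable idea above can be sketched numerically. This is a minimal illustration, not the deck's own calculation: the ratings are simulated (hypothetical), and NumPy is assumed to be available. It standardizes both variables, takes the leading eigenvector of their correlation matrix, and reports how much of the total variance the single combined factor captures.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical 7-point ratings: intensity tracks satisfaction plus noise.
n = 200
satisfaction = rng.integers(1, 8, size=n).astype(float)
intensity = np.clip(np.round(satisfaction + rng.normal(0, 1.0, size=n)), 1, 7)

X = np.column_stack([satisfaction, intensity])
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # standardize both variables

R = np.corrcoef(Z, rowvar=False)          # 2 x 2 correlation matrix
eigvals, eigvecs = np.linalg.eigh(R)      # eigh returns eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

factor = Z @ eigvecs[:, 0]                # one combined score per respondent
share = eigvals[0] / eigvals.sum()        # share of total variance in that score

print("correlation between the two variables:", round(R[0, 1], 3))
print("share of variance captured by one factor:", round(share, 3))
```

For two standardized variables with correlation r, the eigenvalues are 1 + |r| and 1 - |r|, so the stronger the correlation, the more of the total variance the single factor carries.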

Orthogonal Factors

After we have found the line on which the variance is maximal, there remains variability around this line. We continue and define another line that maximizes the remaining variability. In this manner, consecutive factors are extracted. Because each factor is defined to maximize the variability that is not captured by the preceding factors, consecutive factors are independent of each other.

Put another way, consecutive factors are uncorrelated, or ORTHOGONAL, to each other.
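The extraction process described above can be sketched as a loop: find the direction of maximal variance, remove the variance it captures, and repeat. The data below are simulated (hypothetical), and the helper name `leading_direction` is an assumption made for this sketch. The final two printouts check that the extracted directions are orthogonal and that the resulting scores are uncorrelated.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical data: 300 respondents, 5 correlated rating variables.
n, p = 300, 5
latent = rng.normal(size=(n, 2))
weights = rng.normal(size=(2, p))
X = latent @ weights + 0.5 * rng.normal(size=(n, p))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

def leading_direction(C):
    """Return the direction of maximal variance of C and that variance."""
    vals, vecs = np.linalg.eigh(C)   # ascending order, so the last one is largest
    return vecs[:, -1], vals[-1]

C = np.cov(Z, rowvar=False)
directions, variances = [], []
for _ in range(3):
    v, lam = leading_direction(C)
    directions.append(v)
    variances.append(lam)
    C = C - lam * np.outer(v, v)     # remove the variance already captured

V = np.column_stack(directions)
scores = Z @ V                        # consecutive factor scores
score_corr = np.corrcoef(scores, rowvar=False)

print(np.round(variances, 3))         # decreasing variances
print(np.round(V.T @ V, 3))           # approximately the identity: orthogonal directions
print(np.round(score_corr, 3))        # approximately the identity: uncorrelated scores
```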

Customers were asked to rate bus travel on a number of attributes on a 10-point scale:
1 = Doesn't describe bus travel at all
10 = Totally describes bus travel

Relaxed, Friendly, Nervous, Tolerate it, Easy, Interesting, Uncertain, Waste of time

Which statements did they rate similarly, i.e. which statements are correlated? These point to common themes in the data.

[Table: Correlations between the eight statements: Q1 Relaxed, Q2 Friendly, Q3 Nervous, Q4 Tolerate it, Q5 Easy, Q6 Interesting, Q7 Uncertain, Q8 Waste of time]

Component loadings

Statement            Component 1   Component 2
Q2 Friendly             0.823
Q1 Relaxed              0.803
Q6 Interesting          0.732
Q5 Easy                 0.725
Q4 Tolerate it          0.456
Q7 Uncertain                          0.767
Q3 Nervous                            0.697
Q8 Waste of time                      0.691

Smaller cross-loadings shown: -0.186, -0.265, 0.253 and -0.144.

- The first four statements load mainly on the first factor: positive about bus travel.
- The other four load on the second factor: negative about bus travel.
- "Tolerate it" loads on both.

Component loadings

Statement            Component 1   Component 2
Q2 Friendly             0.82
Q1 Relaxed              0.80
Q6 Interesting          0.73
Q5 Easy                 0.72
Q4 Tolerate it          0.45
Q7 Uncertain                          0.77
Q3 Nervous                            0.70
Q8 Waste of time                      0.70

- The first four statements load mainly on the first factor: positive about bus travel.
- The other four load on the second factor: negative about bus travel.
- "Tolerate it" loads on both.

Principal components analysis (PCA):

The most common form of factor analysis, PCA seeks a linear combination of variables such that the maximum variance is extracted from the variables. It then removes this variance and seeks a second linear combination which explains the maximum proportion of the remaining variance, and so on. This is called the principal axis method and results in orthogonal (uncorrelated) factors. PCA analyzes total (common and unique) variance.

[Table: Correlations, grouped: Q1 Relaxed, Q2 Friendly, Q5 Easy, Q6 Interesting, Q4 Tolerate it, Q3 Nervous, Q7 Uncertain, Q8 Waste of time]

Total Variance Explained

            Initial Eigenvalues          Extraction Sums of           Rotation Sums of
                                         Squared Loadings             Squared Loadings
Component   Total   % of Var.  Cum. %    Total   % of Var.  Cum. %    Total   % of Var.  Cum. %
 1          5.459   49.628     49.628    5.459   49.628     49.628    2.894   26.312     26.312
 2          1.249   11.359     60.986    1.249   11.359     60.986    2.634   23.948     50.260
 3           .900    8.179     69.165     .900    8.179     69.165    2.080   18.905     69.165
 4           .830    7.546     76.711
 5           .631    5.736     82.448
 6           .478    4.348     86.795
 7           .431    3.917     90.713
 8           .353    3.208     93.921
 9           .295    2.682     96.603
10           .204    1.850     98.453
11           .170    1.547    100.000

Extraction Method: Principal Component Analysis.

How much of the total variation in the data is explained by the factors? The factors should explain at least 2/3 of the variance. In this data, the first three factors explain 69% of the variance.
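The percentage columns of the Total Variance Explained output follow directly from the eigenvalues. A minimal sketch, using the eigenvalues printed in that output and assuming 11 standardized variables (one per component), so that the total variance is 11. Small differences in the third decimal place come from the eigenvalues being rounded in the output.

```python
# Eigenvalues as printed in the Total Variance Explained output.
eigenvalues = [5.459, 1.249, 0.900, 0.830, 0.631, 0.478,
               0.431, 0.353, 0.295, 0.204, 0.170]
n_variables = len(eigenvalues)  # each standardized variable contributes variance 1

# Percent of variance = eigenvalue / number of variables.
percent = [100 * ev / n_variables for ev in eigenvalues]

# Cumulative percent = running total of the percentages.
cumulative = []
running = 0.0
for pct in percent:
    running += pct
    cumulative.append(running)

for i, (ev, pct, cum) in enumerate(zip(eigenvalues, percent, cumulative), start=1):
    print(f"{i:>2}  {ev:6.3f}  {pct:7.3f}  {cum:8.3f}")
```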

Rotated Component Matrix

Component
   1       2       3     Statement
 .865    .257   -.006    Offers value-for-money products and services
 .836    .101    .192    Offers wide range of products and services to suit different needs
 .741    .197    .432    Progressive and provides innovative insurance solutions
 .657    .326    .267    Has expertise in providing insurance solutions
 .251    .849    .086    A reputable insurance provider/company
 .187    .809    .208    An insurance company I can trust
 .425    .593    .283    Global insurance company
 .074    .575    .458    An insurance company with financial strength
 .172    .086    .821    One of the insurance companies that I would first recommend to my customers
 .200    .342    .689    Has strong working relationships with its distributors/intermediaries
 .334    .481    .543    Established local insurance company

Extraction Method: Principal Component Analysis. Rotation Method: Varimax with Kaiser Normalization.

Review factor loadings to decipher the factors. The factor loadings are the correlations between the factor and the attribute.

Factor 2: Reputation

The statements loading most heavily on component 2 are:
- A reputable insurance provider/company (.849)
- An insurance company I can trust (.809)
- Global insurance company (.593)
- An insurance company with financial strength (.575)

A three-factor solution is selected for these data:

1. Practical solutions
2. Reputation
3. Distribution / how well established
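The statement that factor loadings are correlations between the factor and the attribute can be checked numerically. The sketch below uses simulated (hypothetical) data and unrotated principal components. It compares loadings computed as eigenvector times the square root of the eigenvalue with correlations computed directly between each standardized variable and each component score.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical data: 6 attributes driven by 2 underlying themes plus noise.
n, p = 500, 6
latent = rng.normal(size=(n, 2))
W = np.array([[0.9, 0.8, 0.7, 0.0, 0.1, 0.0],
              [0.0, 0.1, 0.0, 0.9, 0.8, 0.7]])
X = latent @ W + 0.5 * rng.normal(size=(n, p))
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

R = np.corrcoef(Z, rowvar=False)
vals, vecs = np.linalg.eigh(R)
order = np.argsort(vals)[::-1][:2]        # keep the two largest components
vals, vecs = vals[order], vecs[:, order]

loadings = vecs * np.sqrt(vals)           # p x 2 matrix of unrotated loadings
scores = Z @ vecs                         # component scores per respondent

# Correlate every variable with every component score directly.
direct = np.array([[np.corrcoef(Z[:, j], scores[:, k])[0, 1]
                    for k in range(2)] for j in range(p)])

print(np.round(loadings, 3))
print(np.round(direct, 3))                # matches the loadings
```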

Key Concepts

Eigenvalue

Also called the characteristic root. The eigenvalue for a given factor measures the variance in all the variables which is accounted for by that factor. The ratio of eigenvalues is the ratio of explanatory importance of the factors with respect to the variables. If a factor has a low eigenvalue, it contributes little to the explanation of variance in the variables and may be ignored as redundant with more important factors.

Variance Explained

To get the percent of variance in all the variables accounted for by each factor, sum the squared factor loadings for that factor (column) and divide by the number of variables. (Note that the number of variables equals the sum of their variances, as the variance of a standardized variable is 1.) This is the same as dividing the factor's eigenvalue by the number of variables.
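As a worked check of this rule, the sketch below sums the squared loadings in each column of the insurance example's rotated component matrix and divides by the number of variables. The assignment of loadings to the 11 statements is an assumption (a row-by-row reading of the flattened output). The results agree, up to rounding, with the reported rotation sums of squared loadings (2.894, 2.634, 2.080) and the 69% total.

```python
# Rotated loadings, read row by row (assumed order of the 11 statements).
loadings = [
    (0.865, 0.257, -0.006),
    (0.836, 0.101, 0.192),
    (0.741, 0.197, 0.432),
    (0.657, 0.326, 0.267),
    (0.251, 0.849, 0.086),
    (0.187, 0.809, 0.208),
    (0.425, 0.593, 0.283),
    (0.074, 0.575, 0.458),
    (0.172, 0.086, 0.821),
    (0.200, 0.342, 0.689),
    (0.334, 0.481, 0.543),
]
n_variables = len(loadings)

# Sum of squared loadings per factor (column).
ssl = [sum(row[k] ** 2 for row in loadings) for k in range(3)]

# Percent of variance per factor = sum of squared loadings / number of variables.
pct = [100 * s / n_variables for s in ssl]

for k, (s, share) in enumerate(zip(ssl, pct), start=1):
    print(f"factor {k}: sum of squared loadings = {s:.3f}, % of variance = {share:.2f}")
print(f"total % of variance = {sum(pct):.2f}")
```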

Types of Rotation

Varimax rotation is an orthogonal rotation of the factor axes to maximize the variance of the squared loadings of a factor (column) on all the variables (rows) in a factor matrix, which has the effect of differentiating the original variables by extracted factor. Each factor will tend to have either large or small loadings of any particular variable. A varimax solution yields results which make it as easy as possible to identify each variable with a single factor. This is the most common rotation option.
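A minimal sketch of a varimax rotation follows, using a commonly seen SVD-based iteration rather than any particular package's routine. The function names and the example loadings are hypothetical, and Kaiser normalization is not applied. A clean two-factor structure is deliberately turned by 30 degrees, and the rotation is then asked to find a clearer orientation again.

```python
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Orthogonally rotate a (variables x factors) loading matrix."""
    p, k = loadings.shape
    rotation = np.eye(k)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Target matrix derived from the varimax criterion.
        target = rotated ** 3 - rotated * (rotated ** 2).sum(axis=0) / p
        u, s, vt = np.linalg.svd(loadings.T @ target)
        rotation = u @ vt                  # nearest orthogonal matrix
        if previous != 0.0 and s.sum() < previous * (1 + tol):
            break
        previous = s.sum()
    return loadings @ rotation, rotation

def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return (loadings ** 2).var(axis=0).sum()

# Hypothetical clean structure: three variables per factor.
simple = np.array([[0.8, 0.0], [0.7, 0.0], [0.9, 0.0],
                   [0.0, 0.8], [0.0, 0.7], [0.0, 0.6]])
theta = np.deg2rad(30)
turn = np.array([[np.cos(theta), -np.sin(theta)],
                 [np.sin(theta),  np.cos(theta)]])
unrotated = simple @ turn                  # every variable now loads on both factors

rotated, rotation = varimax(unrotated)
print(np.round(unrotated, 2))
print(np.round(rotated, 2))
print(varimax_criterion(unrotated), varimax_criterion(rotated))
```

Because the rotation is orthogonal, each variable's communality (the row sum of squared loadings) is unchanged; only the way the variance is divided between the factors changes.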

Promax rotation is an alternative non-orthogonal (oblique) rotation method which is computationally faster than the direct oblimin method and therefore is sometimes used for very large datasets.

There are two criteria for choosing the number of factors:

The Kaiser Criterion

We retain only factors with an eigenvalue greater than 1. In essence, this is like saying that unless a factor extracts at least as much variance as the equivalent of one original variable, we drop it.

The Scree Test

A graphical method is the scree test. We plot the eigenvalues in a simple line plot and look for the place where the smooth decrease of eigenvalues appears to level off to the right of the plot. To the right of this point, presumably, one finds only "factorial scree" ("scree" is the geological term for the debris which collects on the lower part of a rocky slope).

The scree plot helps you to determine the optimal number of components. The eigenvalue of each component in the initial solution is plotted. Generally, the components on the steep slope are extracted.
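Both criteria can be tried on the eigenvalues from the insurance example's Total Variance Explained output. This is only a sketch: the text bars stand in for a scree plot, and judging where the curve levels off remains a matter of interpretation.

```python
# Eigenvalues from the Total Variance Explained output of the insurance example.
eigenvalues = [5.459, 1.249, 0.900, 0.830, 0.631, 0.478,
               0.431, 0.353, 0.295, 0.204, 0.170]

# Kaiser criterion: keep components whose eigenvalue exceeds 1.
kaiser_keep = [i for i, ev in enumerate(eigenvalues, start=1) if ev > 1.0]
print("Kaiser criterion retains components:", kaiser_keep)

# Text scree plot: one bar per component (10 characters per unit of eigenvalue).
for i, ev in enumerate(eigenvalues, start=1):
    print(f"{i:>2} {ev:5.3f} " + "#" * round(ev * 10))

# Drop between consecutive eigenvalues; the steep part of the slope comes first.
drops = [eigenvalues[i] - eigenvalues[i + 1] for i in range(len(eigenvalues) - 1)]
print("drops:", [round(d, 3) for d in drops])
```

Only two eigenvalues exceed 1 here, while the worked example reports a three-factor solution explaining 69% of the variance, so the rules are a starting point rather than a final answer.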

[Figure: Scree Plot. Eigenvalue (y-axis) against Component Number (x-axis).]

Statistics to look at

KMO measure: should be more than 0.5

Tells whether the partial correlations between variables are small or large.

Ranges from 0 to 1 and should be close to 1. Below 0.5 implies factor analysis won't be useful.

Bartlett's test of sphericity

Tells whether the variables are correlated or not.

The significance level should be very small, say below 0.05, and not above 0.10.
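Both statistics can be computed from a correlation matrix. The sketch below uses the standard textbook formulas, not SPSS's own code, and assumes NumPy and SciPy are available. The function names and the simulated data are hypothetical. KMO compares squared correlations with squared partial correlations. Bartlett's statistic is -(n - 1 - (2p + 5)/6) ln|R| with p(p - 1)/2 degrees of freedom.

```python
import numpy as np
from scipy.stats import chi2

def kmo_measure(R):
    """Kaiser-Meyer-Olkin measure for a correlation matrix R."""
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)        # partial correlations (off-diagonal entries)
    off = ~np.eye(R.shape[0], dtype=bool)  # mask selecting off-diagonal entries
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

def bartlett_sphericity(R, n):
    """Bartlett's test that R is an identity matrix, for n respondents."""
    p = R.shape[0]
    _, logdet = np.linalg.slogdet(R)
    statistic = -(n - 1 - (2 * p + 5) / 6) * logdet
    dof = p * (p - 1) // 2
    return statistic, dof, chi2.sf(statistic, dof)

# Hypothetical data: four ratings sharing one common theme.
rng = np.random.default_rng(3)
n = 300
common = rng.normal(size=(n, 1))
X = common @ np.array([[0.8, 0.7, 0.9, 0.6]]) + 0.6 * rng.normal(size=(n, 4))
R = np.corrcoef(X, rowvar=False)

kmo_value = kmo_measure(R)
statistic, dof, p_value = bartlett_sphericity(R, n)
print("KMO:", round(kmo_value, 3))
print("Bartlett chi-square:", round(statistic, 1), "df:", dof, "p-value:", p_value)
```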

Choosing the number of factors is an art, as much as a science

Usual practice is to run several alternative analyses. The collaborative judgment of researcher and analyst is important to generate a solution that provides a plausible explanation and interpretation of the factors.

We must achieve a balance between, on the one hand, having enough factors to explain the variation in the original data satisfactorily and, on the other, not having so many factors that little or no data reduction has been achieved. Look for at least 65-70% of variance explained with scale data, but 50% or more with binary data.

How big a sample is needed?

- The larger the sample size, the more accurately we can estimate the correlations between questions and the more repeatable the analysis will be.
- A sample of 400 or more should provide a stable factor analysis.
- Minimum sample size of about 200.

- Preferably interval data, as the correlations are better estimated (a 5- or 7-point Likert agree/disagree scale is actually ordinal data but is treated as interval).
- Binary (yes/no) variables often have lower correlations.

Question

Factor - respondent vs response ?

Factor vs Cluster ?

- Summarises large amounts of data
- Identifies patterns that can otherwise be hard to find
- Because factors are based on patterns in the data, the analysis rests on actual results, not on preconceptions or questionnaire issues
- Used in conjunction with multiple linear regression (MLR)

But...

- Not all variability in the data is usually accounted for in factor analysis
- Factors can be hard to interpret, as they represent many measures
- Factors depend on the data, and can differ for different sets of data

What it does

- Identifies underlying families of parameters that are highly correlated; each family represents a different factor.
- Helps reduce data from a large number of parameters to a small number of factors.
- Produces a set of independent variables to be used for further analysis.

Examples of application

- What are the main characteristics on which consumers base the brand images in their minds?
- What are the main service aspects retailers consider when evaluating overall satisfaction with the service provided by a supplier?
- Which 10 or 15 attributes should I finalize for measurement from a list of 30 attributes?

Requirements

- Type of scales: interval.
- Free association data (ordinal type) could be treated as binary interval data.
- Some binary nominal scales (involving opposites) could be treated as interval.
- Exclude variables with very low variance, if any.
- If a pair of variables has a very high correlation, keep one and exclude the other.

- Factor loading: the correlation between a factor and a standardized variable. A higher loading means the variable carries more weight in the factor.
- Use Promax rotation to get correlated factors, for exploratory understanding.
- Use Varimax rotation when uncorrelated factors are required, e.g. for use in regression and cluster analysis.
- Remove variables which load on multiple factors for cleaner solutions, especially for regression.

Guidelines

- Number of factors, based on:
  - total variance explained above 60%
  - eigenvalue above 1
  - number of variables divided by 3 or 4
- KMO measure (0-1) should be close to 1. Below 0.5 implies factor analysis won't be useful.
- Bartlett's test: the significance level should be very small, say below 0.05, and not above 0.10.

Analyze > Data Reduction > Factor

- Extraction > Method > Principal components (> Scree plot)
- Rotation > Method > Varimax
- Eigenvalues over 1, or number of factors = ?
- Descriptives > KMO and Bartlett's test of sphericity
- Scores > Save as variables > Method: Regression
- Options > Missing values > Exclude cases pairwise
