
SPSS & R

By

Gilbert MacKenzie
The Centre of Biostatistics, University of Limerick

CBS, University of Limerick

Assess, York, Nov 2009

Introduction

This is a Then and Now talk

Revisit my York 2001 critique
Review progress with SPSS since then
Talk about the opportunities with R


Then Assess Talk 2001

My York 2001 talk criticised the package in terms of:

Technical Content
Inconsistencies
Programme Structure


Then Assess 2001 Technical Content

Then weak, mainly in relation to complex modelling:

Linear Mixed Models (Now fixed)
Generalised Linear Models (Now fixed)
Generalised Linear Mixed Models (Now in R)
General Purpose MLE fitting? (Now in R)

NB: Then, the website referred to CNLR


Then Using CNLR


Fitting the Generalised Time Dependent Logistic Survival Model (MacKenzie, 1996, JRSS D, 45, 1, 21-34; SIM, 1997, 16, 1831-1843.)

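GTDL:

$$\lambda(t \mid x) = \lambda \cdot \frac{\exp(t\alpha + x'\beta)}{1 + \exp(t\alpha + x'\beta)}$$

TDL:

$$\lambda(t \mid x) = \frac{\exp(t\alpha + x'\beta)}{1 + \exp(t\alpha + x'\beta)}$$

RR:

$$\rho(t \mid x_1, x_0) = \exp[(x_1 - x_0)'\beta] \cdot \frac{1 + \exp(t\alpha + x_0'\beta)}{1 + \exp(t\alpha + x_1'\beta)}$$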
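The GTDL quantities on this slide can be sketched in R as follows. This is a minimal illustration, not the author's code: the function names are hypothetical, the symbols lambda, alpha and beta follow the CNLR programme in this talk, and the covariate effect of 0.5 is an invented value used only to exercise the functions. The survivor function is assumed to take the same form as the `pred` expression in the CNLR programme.

```r
# GTDL hazard: lambda times a logistic function of t*alpha + x'beta
gtdl_hazard <- function(t, x, lambda, alpha, beta) {
  eta <- t * alpha + sum(x * beta)
  lambda * plogis(eta)
}

# GTDL survivor function (same form as `pred` in the CNLR programme)
gtdl_surv <- function(t, x, lambda, alpha, beta) {
  xb <- sum(x * beta)
  ((1 + exp(t * alpha + xb)) / (1 + exp(xb)))^(-lambda / alpha)
}

# Relative risk of covariate vector x1 versus x0 at time t
gtdl_rr <- function(t, x1, x0, alpha, beta) {
  exp(sum((x1 - x0) * beta)) *
    (1 + exp(t * alpha + sum(x0 * beta))) /
    (1 + exp(t * alpha + sum(x1 * beta)))
}

# Illustrative values only (the second element of beta is hypothetical)
lambda <- 1
alpha  <- -0.065
beta   <- c(-1.629, 0.5)
x0 <- c(1, 0)   # intercept, covariate = 0
x1 <- c(1, 1)   # intercept, covariate = 1

rr_direct <- gtdl_rr(2, x1, x0, alpha, beta)
rr_ratio  <- gtdl_hazard(2, x1, lambda, alpha, beta) /
             gtdl_hazard(2, x0, lambda, alpha, beta)
```

Because alpha is not zero, the relative risk changes with t, which is what makes the model non-proportional-hazards.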


Then CNLR Novel Survival Programme


model program b0 = 0.05 alpha = 0.01.
compute const = 1.
compute fi = 0.
compute lambda = exp(fi).
compute term0 = b0*const.
compute term1 = term0 + alpha*surtim.
compute pi = exp(term1)/(1+exp(term1)).
compute qi = 1-pi.
compute wi = (1+exp(term0)).
compute pred = ( (1+exp(term1)) / (1+exp(term0)) )**(-lambda/alpha).
compute loss = -1*(di*fi+di*ln(pi)+(lambda/alpha)*(ln(qi)+ln(wi)) ).
cnlr di /pred = pred /loss=loss /save=pred /bootstrap.

First time CNLR is used for survival analysis. This is the method, but watch out: the bootstrap does not respect the censoring distribution.
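For comparison, the CNLR loss function can be transcribed into an R function suitable for a general minimiser. This is a hedged sketch rather than the author's programme: the function name `gtdl_negloglik` is hypothetical, the five observations are made up for illustration, and, as in the CNLR programme, fi is held fixed at 0 so that only b0 and alpha are free.

```r
# Minus log-likelihood for the intercept-only GTDL model,
# following the CNLR loss term by term.
# theta  = c(b0, alpha); surtim = survival times;
# di     = censoring indicators (1 = event, 0 = censored)
gtdl_negloglik <- function(theta, surtim, di, fi = 0) {
  b0     <- theta[1]
  alpha  <- theta[2]
  lambda <- exp(fi)
  term0  <- b0
  term1  <- b0 + alpha * surtim
  log_pi <- plogis(term1, log.p = TRUE)            # ln(pi)
  # (lambda/alpha) * (ln(qi) + ln(wi)) is the log survivor function
  log_S  <- -(lambda / alpha) * (log1p(exp(term1)) - log1p(exp(term0)))
  -sum(di * fi + di * log_pi + log_S)
}

# Made-up data, used only to show that the function evaluates
surtim <- c(1, 2, 3, 5, 8)
di     <- c(1, 0, 1, 1, 0)

# Value at the CNLR starting values b0 = 0.05, alpha = 0.01
val <- gtdl_negloglik(c(0.05, 0.01), surtim, di)
```

With real data the same function could be passed to `nlm` together with starting values, for example `nlm(gtdl_negloglik, c(0.05, 0.01), surtim = surtim, di = di, hessian = TRUE)`.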


CNLR SPSS Results


Parameter Estimates, Bootstrap (a,b)

                                   95% Confidence Interval     95% Trimmed Range
Parameter   Estimate   Std. Error  Lower Bound  Upper Bound    Lower Bound  Upper Bound
b0           -1.629       .075       -1.782       -1.475         -1.765       -1.465
alpha         -.065       .012        -.090        -.039          -.090        -.038

a. Based on 30 samples.
b. Loss function value equals 2056.044.

Parameter Estimates, Bootstrap (a,b)

                                   95% Confidence Interval     95% Trimmed Range
Parameter   Estimate   Std. Error  Lower Bound  Upper Bound    Lower Bound  Upper Bound
b0           -1.629       .073       -1.771       -1.486         -1.772       -1.499
alpha         -.065       .011        -.087        -.042          -.087        -.042

a. Based on 1000 samples.
b. Loss function value equals 2056.044.


NLM R Results

Parameter Estimates, Bootstrap (a,b)

                                   95% Confidence Interval     95% Trimmed Range
Parameter   Estimate   Std. Error  Lower Bound  Upper Bound    Lower Bound  Upper Bound
b0           -1.629       .075       -1.782       -1.475         -1.765       -1.465
alpha         -.065       .012        -.090        -.039          -.090        -.038

a. Based on 30 samples.
b. Loss function value equals 2056.044.

Parameter Estimates, Bootstrap (a,b)

                                   95% Confidence Interval     95% Trimmed Range
Parameter   Estimate   Std. Error  Lower Bound  Upper Bound    Lower Bound  Upper Bound
b0           -1.629       .073       -1.771       -1.486         -1.772       -1.499
alpha         -.065       .011        -.087        -.042          -.087        -.042

a. Based on 1000 samples.
b. Loss function value equals 2056.044.


Now! SPSS Version 16 with R


Seems to solve many of these problems at a stroke

Calling R programme code within SPSS
Passing SPSS objects (eg data) to R
Returning R objects (eg results) to SPSS
For SPSS users this is an immediate major advance


Now! How to run R in SPSS-V16


comment SPSS Syntax file .
comment Preamble SPSS .
GET FILE='C:\Data\SPSSDATA\R_SPSS\lung_cancer.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.
freq surtim/histogram.
comment Invoke R wrap code.
BEGIN PROGRAM R.
library(survival)                       # load libraries
library(graphics)
cancer <- spssdata.GetDataFromSPSS()    # fetch SPSS data
print(colnames(cancer))                 # check variables
subset <- cancer[1:10, 1:8]             # subset data
print(subset)
print(mean(cancer$surtim))              # send to spv
print(var(cancer$surtim))               # send to spv
#print(hist(cancer$age))
#print(nlm)
END PROGRAM.
comment more SPSS commands and/or R blocks .

Now! Block Structure of R in SPSS-V16


comment Preamble SPSS .
GET FILE='C:\Data\SPSSDATA\R_SPSS\lung_cancer.sav'.
DATASET NAME DataSet1 WINDOW=FRONT.

BEGIN PROGRAM R.
library(survival)                       # load libraries
cancer <- spssdata.GetDataFromSPSS()    # fetch SPSS data
results1 <- nlm()
spsspivottable.Display(results1, title="Results1", format=formatSpec.GeneralStat)
END PROGRAM.

BEGIN PROGRAM R.
results2 <- nlm()
spsspivottable.Display(results2, title="Results2", format=formatSpec.GeneralStat)
END PROGRAM.

BEGIN PROGRAM R.
results3 <- nlm()
spsspivottable.Display(results3, title="Results3", format=formatSpec.GeneralStat)
END PROGRAM.

Now! The nlm function in R


This is a general minimisation routine to minimise a function of p variables.

Typical usage, ie a call:

nlm.res <- nlm(loglik, theta, extra=extra, hessian=TRUE)

where:

loglik = a user written likelihood function loglik(param, extra)
theta = a vector of parameter starting values
extra = an object containing other quantities required by loglik
hessian = TRUE forces computation of the Hessian matrix H(θ)
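A small self-contained call shows the mechanics. The quadratic objective `f` and its extra argument `a` are illustrative only and are not part of the talk. The full argument name `hessian` is used here because arguments that follow `...` in the signature of `nlm` are not partially matched, so an abbreviated name would be passed on to the objective function instead.

```r
# Toy objective: squared distance from a target vector `a`.
# `a` plays the role of `extra`: it is passed through nlm's `...`.
f <- function(x, a) sum((x - a)^2)

res <- nlm(f, p = c(10, 10), a = c(3, 5), hessian = TRUE)

res$estimate   # parameter values at the minimum
res$minimum    # value of f at the minimum
res$hessian    # numerically estimated Hessian at the minimum
res$code       # convergence code (1 or 2 indicate probable convergence)
```

When the objective is minus a log-likelihood, `res$hessian` is the observed information matrix evaluated at the estimate.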


Now! Some Standard Theory

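$H(\theta) = D^2\{\log_e L(\theta)\}$    Hessian
$I_o(\theta) = -H(\theta)$    Observed Information
$I(\theta) = -E[H(\theta)]$    Fisher Information
$\Sigma(\theta) = [I(\theta)]^{-1}$    Covariance Matrix
$\Sigma(\theta) \approx [I_o(\theta)]^{-1}$    Covariance Matrix
$v(\theta) = \mathrm{diag}\{[I_o(\theta)]^{-1}\}$    Vector of variances
$se(\theta) = \sqrt{v(\theta)}$    Vector of Std. Errors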
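These steps map directly onto the output of `nlm`. The sketch below uses a toy normal model with made-up data purely to illustrate the chain from Hessian to standard errors; the data, the function `negloglik` and the log-sd parametrisation are assumptions for the example and do not come from the talk. Because `nlm` minimises minus the log-likelihood, the Hessian it returns is already the observed information.

```r
# Made-up sample, for illustration only
y <- c(4.1, 5.3, 6.0, 4.8, 5.5, 6.2, 5.1, 4.7)

# Minus log-likelihood of a normal model; theta = c(mean, log(sd))
negloglik <- function(theta, y) {
  -sum(dnorm(y, mean = theta[1], sd = exp(theta[2]), log = TRUE))
}

fit <- nlm(negloglik, p = c(5, 0), y = y, hessian = TRUE)

obs_info <- fit$hessian       # observed information at the estimate
cov_mat  <- solve(obs_info)   # estimated covariance matrix
v        <- diag(cov_mat)     # vector of variances
se       <- sqrt(v)           # vector of standard errors

# Wald-type 95% confidence intervals
ci <- cbind(lower = fit$estimate - 1.96 * se,
            upper = fit$estimate + 1.96 * se)
```

For this model the standard error of the mean should be close to the maximum likelihood estimate of the sd divided by sqrt(n), which gives a quick sanity check on the numerical Hessian.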


Now! The loglike function in R


This is a user written function that returns minus the log-likelihood, so that minimising it with nlm maximises the log-likelihood. Here is one for the Exponential Model (1 parameter).

Input Parameters

# must set up extra before call
loglikexp <- function(x, extra) {
  fi <- x
  lam <- exp(fi)           # keep scalar hazard lam > 0
  ti <- extra[,1]          # vector of survival times
  deltai <- extra[,2]      # vector of censoring indicators
  loglike <- -sum(deltai*fi - lam*ti)   # minus the log-likelihood
  loglike                  # return the value visibly
}

Return Value

call result <- loglikexp(param, extra)
result    # displays loglike on console in R
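The exponential model has a closed-form maximum likelihood estimate, the number of events divided by the total time at risk, which makes it a convenient check on the `nlm` machinery. The sketch below fits `loglikexp` to simulated censored data; the sample size, true hazard of 0.1 and uniform censoring scheme are assumptions chosen only for illustration.

```r
# Minus log-likelihood for the exponential model, as on the slide
loglikexp <- function(x, extra) {
  fi     <- x
  lam    <- exp(fi)        # keep scalar hazard lam > 0
  ti     <- extra[, 1]     # survival times
  deltai <- extra[, 2]     # censoring indicators (1 = event)
  -sum(deltai * fi - lam * ti)
}

# Simulated right-censored data (illustrative settings)
set.seed(1)
n          <- 200
event_time <- rexp(n, rate = 0.1)
cens_time  <- runif(n, 0, 30)
ti         <- pmin(event_time, cens_time)
deltai     <- as.numeric(event_time <= cens_time)
extra      <- cbind(ti, deltai)

# Fit by minimising minus the log-likelihood over fi = log(lam)
fit        <- nlm(loglikexp, p = 0, extra = extra, hessian = TRUE)
lam_hat    <- exp(fit$estimate)

# Closed-form MLE: events / total time at risk
lam_closed <- sum(deltai) / sum(ti)
```

The two estimates `lam_hat` and `lam_closed` should agree to several decimal places; a discrepancy would point to a problem in the likelihood code or the starting value.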

Now! Some Survival Examples in R

Basic Exponential Survival

Non-PH GTDL
Non-PH ~ Gamma Frailty

Move to Syntax File Now!


Now! Setting up the Software & Interfaces -steps


Install SPSS Version 16
Go to a CRAN mirror, download R V2.5 and install
Download the R integration installer and documentation from www.spss.devcentral and install
Read the R integration document first
Run SPSS using blocked syntax files


Then Assess 2001 - Conclusions


Obvious need to improve the technical content of SPSS
Create a consistent programming environment
Support an object-oriented language
Encourage a technical development group


Now! Overall Conclusions R

Basic technical deficiencies covered by R
Need to develop output interface
Need to allow R graphical objects out or to chart editor
Guidance on future SPSS development strategies for R integration
Progress indeed!!
