
PSPP

A Purdue STAT 582 User Manual

Kyle Bemis

1. Introduction to PSPP

PSPP is a free, open-source, cross-platform statistical program being developed as an alternative to the proprietary statistical package SPSS. PSPP attempts to replicate SPSS functionality and interface while remaining free and keeping its source open to the public, so that developers can edit and improve it as a community. Despite being in development since 1995, PSPP has not yet reached a 1.0 release, and its current stable version is 0.6.2. This is because its developers work on it as a labor of love, and they are devoted to PSPP remaining a free and open-source alternative. However, this means that PSPP remains a work in progress, and much of its statistical functionality has yet to be realized.

Though its future looks bright, PSPP in its current form is severely limited in its usefulness for much of what statisticians do on a regular basis. One major reason is that its language is still under development, and this has taken priority over the development of the graphical user interface, so many of the features that have been implemented in code have yet to reach the GUI that most users will likely wish to use. This documentation focuses on the current capabilities of PSPP's graphical front-end, which is launched as psppire. It features drop-down menus for statistical functionality and data manipulation, as well as a spreadsheet window for entering data.

2. Entering and Manipulating Data

As the graphical front-end of PSPP, psppire appears as a spreadsheet-like interface that allows the user to enter data into cells, as in Excel and other spreadsheet applications. This allows easy entry and editing of data. It includes two views, called Data View and Variable View, which appear as tabs at the bottom-left of the spreadsheet window and are used to switch between these modes. PSPP can open SPSS data files as well as CSV files. See the figures below.

Data View

Variable View

As seen above, Data View shows the actual values of the cells for every row and column of data. This is the mode for editing and manipulating the actual values of observations. Variable View displays characteristics of each variable, such as the kind of variable (e.g., continuous, categorical), and allows the user to edit them. The Insert Variable button creates a new variable.

The Transform drop-down menu offers several options for manipulating variables. Selecting the menu item Transform -> Compute gives the user free rein to use all of PSPP's mathematical functions on variables. The variable y1 seen in the screenshots above was created by applying a transformation to x1 as shown:

3. Analysis of Univariate Data

The Analyze menu provides two submenus that allow for analysis of univariate data. Choosing Analyze -> Descriptive Statistics, as the name implies, gives options for descriptive statistics of a dataset. For example, choosing the Descriptives item from this menu opens a window that allows the user to choose a variable and several common statistical measures that describe its distribution, such as the mean and variance, as seen below:

Choosing all options will give an output like the following for y1:
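For readers who want to check such output by hand, the same kinds of measures can be reproduced outside PSPP. The following is a minimal Python sketch, not PSPP output: the data values are hypothetical stand-ins for y1, and the set of measures shown is an assumption about what a typical descriptives table reports.

```python
import math
import statistics

# Hypothetical sample standing in for the variable y1 (not the manual's data).
y1 = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

n = len(y1)
std_dev = statistics.stdev(y1)  # sample standard deviation (n - 1 denominator)

summary = {
    "n": n,
    "mean": statistics.mean(y1),
    "variance": statistics.variance(y1),  # sample variance (n - 1 denominator)
    "std dev": std_dev,
    "std error": std_dev / math.sqrt(n),  # standard error of the mean
    "min": min(y1),
    "max": max(y1),
    "range": max(y1) - min(y1),
}

for name, value in summary.items():
    print(f"{name:>10}: {value:.4f}")
```

The sample (n - 1) forms of the variance and standard deviation are used here because they are the conventional choice for descriptive summaries of a sample.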

Choosing the Analyze -> Compare Means menu gives access to typical t-tests. For example, with the One Sample T-Test item, the user can input the mean to test under the null hypothesis, and PSPP will conduct a typical one-sample t-test and output the results.

At this time, PSPP has no histograms, Q-Q plots, or other graphs of univariate data that are accessible from the graphical front-end. Some of these are accessible from the command line, however.

4. Analysis of Multivariate Data

Again, analysis of multivariate data is accessible from the Analyze menu. Analyze -> Descriptive Statistics offers the menu item for Crosstabs analysis. This option gives descriptive statistics for two variables at once. In addition to providing statistics as simple and vital as Pearson's correlation coefficient, this menu item is particularly useful because it is where PSPP calculates chi-square tests. In the Crosstabs window, clicking the Statistics button offers the many other descriptive statistics available. Chi-square testing is always on by default.

For testing multiple variables, the Analyze -> Compare Means -> Paired Samples T-Test menu item gives a straightforward two-variable t-test:
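To make the computation behind a paired-samples t-test concrete, here is a minimal Python sketch using only the standard library. It is an illustration under stated assumptions, not PSPP's implementation: the data are hypothetical, the helper name paired_t is invented for this example, and only the t statistic and degrees of freedom are computed (a p-value would additionally require the t distribution).

```python
import math
import statistics

# Hypothetical paired observations; not taken from the manual's dataset.
x1 = [10.0, 12.0, 9.0, 11.0, 13.0]
y1 = [12.0, 14.0, 10.0, 14.0, 15.0]


def paired_t(a, b):
    """Return (t statistic, degrees of freedom) for a paired-samples t-test.

    Hypothetical helper: tests whether the mean of the pairwise
    differences a[i] - b[i] is zero.
    """
    if len(a) != len(b):
        raise ValueError("paired samples must have the same length")
    diffs = [u - v for u, v in zip(a, b)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)   # sample standard deviation
    std_err = sd_diff / math.sqrt(n)    # standard error of the mean difference
    return mean_diff / std_err, n - 1


t_stat, df = paired_t(x1, y1)
print(f"t = {t_stat:.4f}, df = {df}")
```

A paired test is equivalent to a one-sample t-test on the pairwise differences with a null-hypothesis mean of zero, which is why the sketch reduces both variables to a single list of differences.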

From the same Analyze -> Compare Means menu, the user can access PSPP's ANOVA test.

Finally, linear regression is available from the Analyze -> Linear Regression menu item. This opens a window similar to those for the t-tests and ANOVA. As in the other windows, the model is built simply by placing the variables into their respective roles (i.e., dependent, independent). Placing multiple variables in the independent role gives multiple linear regression. The output from regressing y1 on x1 is shown below:
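As a rough illustration of what a simple linear regression of one dependent variable on one independent variable computes, the following Python sketch fits a least-squares line by hand. It is a sketch under assumptions, not PSPP's output or code: the data are hypothetical, the helper name simple_ols is invented here, and only the intercept, slope, and R-squared are reported.

```python
import statistics

# Hypothetical data; x1 is the independent variable, y1 the dependent one.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
y1 = [2.0, 4.0, 5.0, 4.0, 5.0]


def simple_ols(x, y):
    """Return (intercept, slope, r_squared) for the least-squares line.

    Hypothetical helper for a single-predictor regression of y on x.
    """
    mean_x = statistics.mean(x)
    mean_y = statistics.mean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variation in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return intercept, slope, 1 - ss_res / ss_tot


intercept, slope, r_squared = simple_ols(x1, y1)
print(f"y1 = {intercept:.3f} + {slope:.3f} * x1   (R^2 = {r_squared:.3f})")
```

Multiple linear regression, with several variables in the independent role, follows the same least-squares principle but requires solving a system of equations rather than the closed-form slope used here.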

5. Further Information

This is, unfortunately, the extent of PSPP's capabilities that are currently accessible via the graphical user interface. More information can be found at http://www.gnu.org/software/pspp/pspp.html.
