
Learning Analytics Tools & Techniques For Analysing Large Volumes Of Educational Data
Dr. Richard Price,
Data Scientist,
Planning Services,
Flinders University

History of the Techniques and Methods of Learning Analytics

Learning analytics draws upon techniques from a number of established fields:


Statistics
Artificial Intelligence
Machine Learning
Data mining
Social Network Analysis
Text Mining and Web Analytics
Operational Research
Information Visualization

Application domains such as business intelligence, national security intelligence
and learning analytics all have an interest in analysing large volumes of data
from disparate data sources, and are providing the business cases for the rapid
growth in big data and data analytics.

Learning analytics supports both the business and teaching functions of the
learning institution.

Data Types

Structured data
Typically stored in databases or spreadsheets and managed in accordance with a
standardised storage format and ontology, e.g. names, place names.
E.g. SATAC applications, load, enrolments, FLO usage data

Unstructured data
Text, audio, imagery, video
E.g. student email, chat rooms, questionnaire responses, lecture videos (audio & video)

Different data types lend themselves to different analytical techniques. Unstructured data
often requires pre-processing before structured data analysis techniques can be applied.
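One simple form of such pre-processing is turning free text into a table of word counts, which can then be analysed like any other structured data. The sketch below is illustrative only: the two questionnaire responses are invented, and the tokenising rule (lower-case alphabetic words) is an assumption rather than the approach used in the projects described here.

```python
import re
from collections import Counter

def word_counts(text):
    """Lower-case the text, split it into alphabetic tokens and count them."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(tokens)

# Hypothetical free-text questionnaire responses (invented for illustration).
responses = [
    "The lecture videos were helpful",
    "Lecture pace was too fast",
]

# Shared vocabulary across all responses, in alphabetical order.
vocabulary = sorted({word for r in responses for word in word_counts(r)})

# One row of counts per response: the unstructured text is now a structured table.
table = [[word_counts(r)[word] for word in vocabulary] for r in responses]

for row in table:
    print(dict(zip(vocabulary, row)))
```

Each row has one column per vocabulary word, so the result can be fed to the structured techniques described in the following sections.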

Unstructured data analysis


Text: document clustering, topic detection, entity extraction (people, places,
locations, dates, times etc.), sentiment analysis (+, -)
Audio: speaker identification, language identification, speech to text, keyword spotting
Video analysis: face recognition, object recognition, target tracking

Structured Data Analysis


Descriptive statistics
Sums, means, standard deviations, basic plotting (graphs, charts, histograms).

Data visualisation
Tools that enable the human to see meaningful patterns in data.

Machine learning
Tools that enable computers to find patterns in data to perform classification,
clustering or prediction, e.g. decision trees, neural networks, support vector
machines, linear regression, self-organising maps, k-means.

Predictive analytics
Algorithmic approaches (generally machine learning) for predicting key target
variables of interest.
Example LA projects: identification of at-risk students (Student Success Project),
future University enrolments, topic enrolments.
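As a minimal illustration of the descriptive statistics listed above, the sketch below uses Python's standard statistics module. The weekly login counts are invented for illustration and are not FLO usage data.

```python
import statistics

# Hypothetical weekly login counts for five students (invented values).
logins = [12, 15, 9, 20, 14]

total = sum(logins)
mean = statistics.mean(logins)
std_dev = statistics.stdev(logins)  # sample standard deviation (n - 1 denominator)

print(f"sum={total}, mean={mean}, std dev={std_dev:.2f}")
```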

Data Visualisation
Structured Data

Unstructured Data

Advanced Data Visualisation


Combining Structured & Unstructured Data Sources

Predicting Enrolments From Applications Data

Aim: To predict next year's commencing load using the past 3 years of
SATAC applications data.

Predictions are made at the applicant level, not from a time series.

Adopted a decision-tree-based machine learning approach.

Input variables for each applicant included: academic performance,
schooling, demographics (e.g. age, gender and postcode), information
regarding each of the applicant's preferences, such as preference
number, course, institution and institution campus, and a number of
proximity variables.

Output (target) variable: whether the student was enrolled at Flinders
University at the Semester 1 census.

Predicting Enrolments From Applications Data

The three Ps - Prestige, Proximity & Price


Proximity input variables
For two given points P1 = (lat1, lon1) and P2 = (lat2, lon2), with latitudes and
longitudes in radians, the haversine (great-circle) distance in kilometres between
P1 and P2 is defined as:
d(P1,P2) = ACOS(SIN(lat1)*SIN(lat2) + COS(lat1)*COS(lat2)*COS(lon2 - lon1)) * 6371
where 6371 is the Earth's mean radius in kilometres.
The distance was calculated between each applicant's primary residence and all
major SA university campuses, with each value being an input into the
machine learning algorithm.
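The distance calculation can be sketched in Python as follows. The first function follows the ACOS form quoted above, which is strictly the spherical law of cosines. The second is the textbook haversine form, included only to show that both give the same great-circle distance. The function names, the campus labels and all coordinates are hypothetical and chosen purely for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0

def great_circle_km(lat1, lon1, lat2, lon2):
    """Distance in km between two points given in decimal degrees,
    using the ACOS form quoted above (spherical law of cosines)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = (math.sin(p1) * math.sin(p2)
                 + math.cos(p1) * math.cos(p2) * math.cos(dlon))
    # Clamp to [-1, 1]: rounding can push the value just outside acos's domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle) * EARTH_RADIUS_KM

def haversine_km(lat1, lon1, lat2, lon2):
    """Same distance via the haversine form, which is better behaved
    numerically for very short distances."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = (math.sin(dlat / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# Hypothetical applicant residence and campus locations (illustrative only).
home = (-35.00, 138.50)
campuses = {"campus_a": (-34.90, 138.60), "campus_b": (-35.10, 138.55)}

# One proximity input variable per campus.
proximity = {name: great_circle_km(*home, *loc) for name, loc in campuses.items()}
print(proximity)
```

Note that Python's trigonometric functions expect radians, so coordinates held in decimal degrees must be converted first, as done here with math.radians.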

Two models were developed: a) from the 1st week in September, b) from the 2nd week in January.

Training data consisted of 3 years of data (2011, 2012 & 2013) to predict 2014
enrolments: 25,551 training examples for September and 74,516 for January.

A number of commonly used machine learning algorithms could have been used;
we chose to adopt a CHAID decision tree algorithm.
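CHAID (chi-squared automatic interaction detection) grows a tree by testing each categorical predictor against the target with a chi-square test and splitting on the most strongly associated one. The sketch below shows only that split-selection idea, under loud simplifications: it compares raw chi-square statistics, which is valid here only because both predictors have the same number of categories, and it omits the category merging and adjusted p-values of full CHAID. The applicant records and variable names are invented, and this is not the software used in the project.

```python
from collections import Counter

def chi_square(values, targets):
    """Pearson chi-square statistic for a categorical predictor vs. a target."""
    n = len(values)
    cell = Counter(zip(values, targets))   # observed count per (category, target)
    row = Counter(values)                  # totals per predictor category
    col = Counter(targets)                 # totals per target class
    stat = 0.0
    for v in row:
        for t in col:
            expected = row[v] * col[t] / n
            stat += (cell[(v, t)] - expected) ** 2 / expected
    return stat

# Hypothetical applicants (invented): 1 = enrolled at census, 0 = did not.
enrolled = [1, 1, 1, 0, 0, 0, 1, 0]
predictors = {
    "first_preference": ["yes", "yes", "yes", "no", "no", "no", "yes", "no"],
    "gender": ["f", "m", "f", "m", "f", "m", "f", "m"],
}

# Score every predictor and pick the one most associated with the target.
scores = {name: chi_square(vals, enrolled) for name, vals in predictors.items()}
best = max(scores, key=scores.get)
print(scores, "-> split on", best)
```

In a full tree, the same test would be repeated within each resulting subgroup until no predictor shows a significant association.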

Predicting Enrolments From Applications Data


Results:

Model       Number of Applicants   Predicted          Actual             % Error
            (Predictions)          Commencing Load    Commencing Load
September    8557                  1394               1365                2.1
January     26457                  4340               3858               12.5
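The % Error column is consistent with the difference between predicted and actual load expressed as a percentage of the actual load. The slides do not state the formula, so that definition is an assumption; the short check below shows it reproduces the reported figures.

```python
def percent_error(predicted, actual):
    """Prediction error as a percentage of the actual value."""
    return (predicted - actual) / actual * 100

# (predicted, actual) commencing load taken from the results table.
results = {"September": (1394, 1365), "January": (4340, 3858)}

for model, (predicted, actual) in results.items():
    print(f"{model}: {percent_error(predicted, actual):.1f}%")
```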

Lift Versus Output Percentile Profiles For the September Model
[Figure: lift profiles, two panels: Training and Validation]
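For readers unfamiliar with lift, one common definition is the rate of positive outcomes among the top-scoring fraction of cases divided by the overall positive rate, so a lift of 1 means no better than random selection. The sketch below computes that cumulative form. Whether the profiles in the figure are cumulative is not stated, so this is an assumption, and the scores and outcomes are invented.

```python
def cumulative_lift(scores, outcomes, fraction):
    """Lift for the top `fraction` of cases when ranked by model score."""
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    base_rate = sum(outcomes) / len(outcomes)
    return top_rate / base_rate

# Hypothetical model scores and actual outcomes (1 = enrolled), invented values.
scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
outcomes = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

for fraction in (0.2, 0.5, 1.0):
    print(f"top {fraction:.0%}: lift = {cumulative_lift(scores, outcomes, fraction):.2f}")
```

Computing this profile separately on training and validation data and comparing the two curves is one way to check that a model generalises.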

Predicting Enrolments From Applications Data

The strong consistency of the lift profiles between the training results and
the test and validation results is indicative of structural patterns of
behaviour that appear to exist across applicants to South Australian
universities.

These patterns of behaviour appear to be captured by the rules contained
within the decision trees produced during the training stage of the
modelling process.

A paper reporting this work has been accepted for presentation at the
Australian Association for Institutional Research Forum in November, with
possible publication in the Journal of Institutional Research.

If future years' performance proves to be similar, the approach should
be able to provide valuable support to the management of the
applications process.

Predicting Topic Enrolments


Planning Services was approached by the School of Nursing to predict
future topic enrolments to assist in resource and placement
management.
The primary focus was on predicting topic enrolments for 2nd year
undergraduate nursing topics.
This is a largely deterministic program, complicated by pre-requisites,
large numbers of advance credit 2nd year commencers,
relatively high percentages of part-time students and a lack of
historical training data due to a major course restructure in
2013.

Predicting Topic Enrolments

A similar machine learning (decision tree) approach was adopted; however, input variables
consisted only of course code, attendance type and previous topics passed (no student
demographic or BOA information).

Binary target variable: 1 = did enrol in target topic, 0 = did not enrol in target topic.
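One possible way to lay out a training record from these inputs is sketched below. The course codes, topic codes and the encode_student helper are all hypothetical, and the 0/1 indicator layout is an assumption made for illustration. A decision tree algorithm that handles categorical variables directly would not need this encoding.

```python
def encode_student(course_code, attendance_type, topics_passed, courses, topics):
    """Turn one student's inputs into a flat row of 0/1 indicators."""
    row = [1 if course_code == c else 0 for c in courses]      # one column per course
    row.append(1 if attendance_type == "part-time" else 0)     # attendance type flag
    row += [1 if t in topics_passed else 0 for t in topics]    # one column per topic
    return row

# Hypothetical course and 1st year topic codes (invented for illustration).
courses = ["COURSE_A", "COURSE_B"]
topics = ["TOPIC_101", "TOPIC_102", "TOPIC_103"]

# A part-time student in COURSE_A who has passed two of the three topics.
features = encode_student("COURSE_A", "part-time",
                          {"TOPIC_101", "TOPIC_103"}, courses, topics)
target = 1  # did enrol in the target topic

print(features, target)
```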

Under the new program, 2nd year topics are being run for the first time in 2014. Therefore we
only have 1st year 2013 students to train and test on. Testing gave promising results and a
model was developed to predict topic enrolments for 2015.

Predictions for all seven 2nd year nursing topics were provided and validated by the School
as being consistent with their estimates.

The School of Nursing has requested that the approach become part of its standard
business process in future years, and discussions are underway as to how Planning
Services can meet this request.

The School of Education, Humanities and Law has provided 12 topics of interest to help
Planning Services further develop the approach within a less constrained course structure.

In Conclusion

Learning analytics is still in its infancy.

The Student Success Project, Topic Enrolment and University
Enrolment Prediction projects have demonstrated some early promise.

Across the University we have the technical expertise and strong
management support to progress learning analytics at Flinders.

We are particularly keen to work with the faculties to progress analytics in
support of the teaching function.

We are performing research-like activities within an operational environment
and are looking for trailblazers without the fear of failure.

We're keen, enthusiastic and we're here to help!
