AQM (2015-2016) – TransLink
(South Coast British Columbia Transportation Authority)

The 2015 capstone project was completed for the TransLink Transportation Analytics team. The goal was to analyze 5.4 GB of bus data that contained GPS data, complex bus sensor data (bicycle rack, automated passenger count, etc.), stop arrival times, and numerous other variables. Three student groups were formed, each analyzing a different aspect of the data:

Machine Learning Project
The goal of the machine learning group was to apply an unsupervised approach to identify patterns in the high-dimensional data. The group used variations of K-means clustering and self-organizing map (SOM) algorithms to arrange bus lines into distinct groups, each characterized by specific explanatory variables for arrival delay with differing magnitudes of effect. The results revealed hidden relationships and patterns within the data, and the bus line clusters were used as diagnostic tools for bus delay. Clustering and SOM can yield more nuanced insights because the algorithms are very sensitive to subtle differences in the explanatory variables used to explain arrival delay variability.
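As a rough illustration of the clustering idea, the sketch below runs plain K-means (Lloyd's algorithm) on a handful of bus lines. It is not the group's actual method, which used variations of K-means and SOM. The line names, the two features (mean arrival delay in minutes and mean passenger load) and all values are hypothetical, and real features would normally be standardized first.

```python
import math
import random

def kmeans(points, k, iters=100, seed=0):
    """Plain Lloyd's algorithm. Returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k of the data points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                mean = tuple(sum(c) / len(members) for c in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[j])  # empty cluster: keep as is
        if new_centroids == centroids:
            break  # centroids stopped moving, so the assignment is stable
        centroids = new_centroids
    return centroids, labels

# Hypothetical bus lines: (mean arrival delay in minutes, mean passenger load)
lines = {
    "A": (0.5, 0.30), "B": (0.7, 0.35), "C": (0.6, 0.25),
    "D": (4.8, 0.90), "E": (5.2, 0.85), "F": (5.0, 0.95),
}
names = list(lines)
features = [lines[n] for n in names]
centroids, labels = kmeans(features, k=2)
clusters = {name: label for name, label in zip(names, labels)}
```

Here `clusters` maps each line to a cluster index. Lines in the same cluster have similar delay and load profiles, which is the sense in which clusters can serve as a diagnostic grouping.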

Statistical Modeling Project
The focus of the statistical modeling group was to apply several dimension reduction and variable selection procedures (PCA, LASSO, Elastic Net), then use the resulting subset of important explanatory variables to construct a generalized linear model (GLM) to predict the average travel time of a particular bus route. Accurate travel time prediction is important to customer satisfaction for TransLink, as inaccurate bus arrival times are a primary source of customer complaints.

Heuristic Optimization Project
The goal of the heuristic optimization group was to build an optimization model that reduced or reallocated the number of buses running on a particular route while ensuring that the buses stayed within a certain “rider comfort” range and an optimal amount of demand was met. Meeting the same rider demand with more efficient scheduling allowed fewer buses to be used on almost every route. This efficient routing method saves TransLink the operational cost of each bus that is no longer needed.
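The trade-off between bus count, rider comfort and demand can be sketched with a simple sizing rule. This is only an illustration under stated assumptions, not the group's optimization model: the route names, rider counts, the 50-seat bus and the 80% comfort threshold are all hypothetical.

```python
def buses_needed(riders_per_hour, trips_per_bus_per_hour, seats, max_load_pct):
    """Fewest buses that keep the average load at or below max_load_pct percent."""
    # Riders one bus can carry per hour at the comfort limit, scaled by 100
    # so the arithmetic stays in integers.
    capacity_x100 = trips_per_bus_per_hour * seats * max_load_pct
    # Integer ceiling division: round up to a whole number of buses.
    return -(-riders_per_hour * 100 // capacity_x100)

def surplus_by_route(routes, seats=50, max_load_pct=80):
    """Return {route: current buses minus buses needed}; negative means a shortfall."""
    return {
        name: r["buses"] - buses_needed(
            r["riders_per_hour"], r["trips_per_bus_per_hour"], seats, max_load_pct
        )
        for name, r in routes.items()
    }

# Hypothetical routes and demand figures
routes = {
    "R1": {"buses": 10, "riders_per_hour": 600, "trips_per_bus_per_hour": 2},
    "R2": {"buses": 4, "riders_per_hour": 400, "trips_per_bus_per_hour": 2},
    "R3": {"buses": 6, "riders_per_hour": 240, "trips_per_bus_per_hour": 1},
}
surplus = surplus_by_route(routes)
```

A positive entry in `surplus` marks buses that could be removed or moved to a route with a negative entry while the comfort limit is still respected.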

AQM (2016-2017) – TransLink and Best Buy Canada


TransLink:

  • Advanced Clustering: We’re continuing our work with the Service Analytics team to build more advanced clustering methods to understand the arrival delay of bus lines. We will also use large-scale, high-performance supercomputing to push the envelope in analyzing bus data.
  • Social Media Analysis: A new area for TransLink, using sentiment analysis to understand the impact of transportation delays as expressed on social media.

Best Buy Canada: 

  • Social Media Sentiment Analysis: Utilizing sentiment analysis and natural language processing to understand how customers “feel” about the Best Buy social media presence.
  • Improving customer engagement on social media: Building different strategies Best Buy can use to increase customer engagement on social media, and using statistical models to understand the drivers of customer engagement.
  • Advanced methods: Exploring applications of deep neural networks and latent Dirichlet allocation (LDA) on Best Buy Canada’s social media data.

AQM (2017-2018) – TransLink, Best Buy Canada and Lululemon


TransLink:

  • Further explore methods and tools that help TransLink understand, predict and better plan for arrival delay of buses.

Best Buy Canada: 

  • Further explore methods and tools that help Best Buy Canada better understand their customers on social media.


Lululemon:

  • The core need for Lululemon is a better understanding of how their customers move and the associated impact on their bodies. We’ll be working with the R&D team to provide advanced statistical and machine learning tools to understand patterns in the biometric data they’ve collected.