iSchool Capstone

2016

Predicting Student Churn

Each year, roughly 30% of first-year students at baccalaureate institutions do not return for their second year, and over $9 billion is spent educating these students. Yet little quantitative research has analyzed the causes of and possible remedies for student attrition. Here, we describe initial efforts to model student dropout using the largest known dataset on higher education attrition, which tracks the demographics and transcript records of over 32,500 students at one of the nation's largest public universities. We used logistic regression, random forest, and k-nearest neighbors models, and boosted accuracy through feature engineering. On a balanced dataset, the models reach 66% accuracy, 16 percentage points over baseline. We hope this project will inspire universities to use machine learning to identify at-risk students. They can then target retention efforts toward these students, putting loan money to better use and helping as many students as possible earn degrees.
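
The role of the balanced dataset and the baseline can be illustrated with a minimal sketch. The abstract does not say how the data was balanced, so the downsampling approach, the function names, and the toy records below are all assumptions made for illustration.

```python
import random

def balance_by_downsampling(rows, label_key, seed=0):
    """Downsample every class to the size of the smallest class (assumed method)."""
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    n = min(len(group) for group in by_class.values())
    rng = random.Random(seed)
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n))
    return balanced

def majority_baseline(rows, label_key):
    """Accuracy of always predicting the most common class."""
    counts = {}
    for row in rows:
        counts[row[label_key]] = counts.get(row[label_key], 0) + 1
    return max(counts.values()) / len(rows)

# Hypothetical records: 7 students who returned, 3 who did not
rows = [{"returned": True}] * 7 + [{"returned": False}] * 3
balanced = balance_by_downsampling(rows, "returned")
```

On the toy data, the majority-class baseline is 0.7 before balancing and 0.5 after, so any gain over the baseline on the balanced set reflects what the model learned rather than the class imbalance.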

PrepSmart

Test Innovators helps students prepare for standardized tests such as the ISEE and SSAT. Students sign up on its platform and take practice exams. To date, more than 20,000 students have signed up, and the company possesses more than 5,000,000 questions on its system. The company wanted a system that automatically analyzes students' progress on their practice tests and recommends ways to improve their scores. We built PrepSmart, a system that automatically learns from the past experiences of students who are similar to the current test taker and uses them to suggest how that test taker can improve. Recommendations can cover strong and weak subject areas, time spent answering questions, additional question banks, and tutoring services. PrepSmart helps students improve their overall scores and helps the sponsor generate additional revenue by selling the recommended question banks and tutoring services.
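
One way to find "students who are similar to the current test taker" is a nearest-neighbor lookup over per-student feature vectors. The sketch below is only an illustration of that idea: the abstract does not describe PrepSmart's actual method, and the feature choices, student ids, and function name are hypothetical.

```python
import math

def nearest_students(target, others, k=2):
    """Return the ids of the k students whose feature vectors are closest to target."""
    def distance(a, b):
        # Plain Euclidean distance between two equal-length feature tuples
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    ranked = sorted(others, key=lambda sid: distance(target, others[sid]))
    return ranked[:k]

# Hypothetical features: (math accuracy, verbal accuracy, avg seconds per question / 100)
past = {
    "s1": (0.60, 0.80, 0.55),
    "s2": (0.90, 0.40, 0.30),
    "s3": (0.62, 0.78, 0.60),
}
current = (0.61, 0.79, 0.58)
similar = nearest_students(current, past)
```

A recommender built this way could then look at what the most similar past students did before their scores improved and surface those steps to the current test taker.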

RCMD:ME, A Media Explorer

Before RCMD:ME, media recommendation services recommended media solely based on a user’s preferences within the same media type. For example, GoodReads recommends books to users based on other books they like. However, no recommendation platform bridges multiple media forms: someone who likes The Road might want to find musical artists, movies, and books that other fans of that book also like (e.g. 67% of people who like The Road also like Apocalypse Now). We have developed a recommendation service that offers a new way to find otherwise unknown relationships between different forms of media. Our service allows artists to gain exposure in a way that respects independent artists and strengthens ties between print and digital media. With RCMD:ME, users can find new media more easily, which reduces wasted time and frustration and ultimately improves the media discovery process as a whole.
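
A statistic such as "X% of people who like one title also like another" can be computed from per-user like sets. The sketch below assumes a simple co-occurrence count; it is not RCMD:ME's actual implementation, and the users, the extra titles, and the function name are made up for illustration.

```python
from collections import defaultdict

def co_like_rates(likes, anchor):
    """Return {item: share of the anchor's fans who also like that item}."""
    fans = [items for items in likes.values() if anchor in items]
    if not fans:
        return {}
    counts = defaultdict(int)
    for items in fans:
        for item in items:
            if item != anchor:
                counts[item] += 1
    return {item: n / len(fans) for item, n in counts.items()}

# Hypothetical likes; each item is a (media type, title) pair
likes = {
    "ana": {("book", "The Road"), ("film", "Apocalypse Now"), ("music", "Nick Cave")},
    "ben": {("book", "The Road"), ("film", "Apocalypse Now")},
    "cy": {("book", "The Road"), ("book", "Blood Meridian")},
    "dee": {("film", "Apocalypse Now")},
}
rates = co_like_rates(likes, ("book", "The Road"))
```

Because every item carries its media type, the same count yields cross-media suggestions (a film or musician for a book) without any special handling. In this toy data, two of the three users who like The Road also like Apocalypse Now.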

Slate

From hardware setup to data acquisition and analysis, depression researchers go through a long, convoluted process to understand electroencephalogram (EEG) data (popularly called brain waves) collected from patients. Acquired data is often messy and difficult to process, and analysis can take from hours to days to return results. Our project, Slate, aims to change that by providing a platform that streamlines this process and offers a fully interactive 3D model for visually digesting the data. Slate is a cross-platform desktop application that consists of two parts: an EEG data processing tool and a data visualization tool that allows depression researchers to view and analyze brain signals obtained through a non-invasive, commercial EEG headset. Slate uses real-time modeling and neuroimaging techniques to support a more meaningful understanding of EEG data when interpreting and identifying depressive patterns in research, with the goal of ultimately serving patients.

Spokin

Spokin is an experimental online civic network created by the non-profit research and development organization Third Place Technologies. Spokin’s purpose is to let users connect with the organizations, projects, events, and other happenings they feel connected to within their communities. Our capstone project uses D3, a JavaScript visualization library, to visualize the connections between people and organizations in the Spokin user network. This will allow Spokin’s audience to find new users and organizations to connect with by seeing how their current connections compare to others’.

SPSInteractive

In today’s society, we make most of our decisions by comparing all of the options side by side. When it comes to public schools, though, this is virtually impossible: there is no convenient, simple way for the average person to compare every Seattle public school side by side. SPSInteractive, which is sponsored by Microsoft’s Civic Technology and Engagement department, aims to fill this void by providing a suite of interactive visualizations that make it easy to compare all of the schools in Seattle. We include test scores, graduation rates, demographics, and many other metrics that matter when studying a school. Not only will this help parents decide which school is the best fit for their child, but it will also help policy makers form a better picture of the Seattle Public School System and make the best possible decisions for the next generation of students.

SPYRAL

Companies around the world are investing in cybersecurity but are not confident in their security controls because the technology landscape is ever changing. Spyral, our risk scoring framework, aims to provide operational cyber awareness by giving companies fast, actionable insight into their risk posture. The National Vulnerability Database (NVD) provides a vulnerability score for different software flaws. Spyral extracts this vulnerability score from the NVD and combines it with other operating environment factors, such as the number of machines in the environment, cloud infrastructure, and affiliation with external entities, to produce a risk score contextual to the company. Spyral is also industry agnostic: it does not take special regulations into consideration, so its scoring is based only on operating conditions. This information will be presented in an information visualization dashboard that provides quick insight into a company's risk posture.
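
The idea of adjusting a vulnerability score by operating environment factors can be sketched as follows. This is an illustrative formula only: the abstract does not give Spyral's actual weights or scoring function, so the `Environment` fields, the weights, and the cap at 10 are all assumptions.

```python
import math
from dataclasses import dataclass

@dataclass
class Environment:
    machines: int           # hosts running the affected software
    uses_cloud: bool        # whether cloud infrastructure is involved
    external_partners: int  # affiliated external entities

def contextual_risk(base_score, env):
    """Scale a 0-10 vulnerability score by environment exposure (made-up weights)."""
    if not 0.0 <= base_score <= 10.0:
        raise ValueError("base_score must be between 0 and 10")
    exposure = 1.0
    exposure += 0.1 * math.log10(max(env.machines, 1))  # grows slowly with fleet size
    exposure += 0.2 if env.uses_cloud else 0.0
    exposure += 0.05 * min(env.external_partners, 10)   # capped partner contribution
    return min(10.0, base_score * exposure)

small = Environment(machines=1, uses_cloud=False, external_partners=0)
large = Environment(machines=1000, uses_cloud=True, external_partners=4)
```

With these hypothetical weights, the same flaw scores higher in the larger, cloud-connected environment than on a single isolated machine, which is the contextual effect the abstract describes.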

Spyro

Spyro: Empowering Familial Genetic Research Through Data Visualizations

The field of precision medicine is projected for rapid growth in the next decade and genome analysis is at the forefront of this trend. However, current genome analysis tools have various issues: they have high learning curves, they are visually unappealing, and they have no easy way to compare multiple individuals’ genome data. In response to these issues, we have partnered with Spiral Genetics, a Seattle-based bioinformatics startup, to develop Spyro, a user-focused genome visualization tool. By leveraging Spiral Genetics’ next generation sequencing data, our tool allows researchers to utilize a comprehensive set of features in a simple, intuitive interface. Researchers can analyze an individual’s genetic variants, as well as make comparisons between multiple individuals. Because Spyro is focused on representing variant detection, researchers in this field can quickly adopt our tool and start analyzing.

Steam Achievement Classification & Analytics Project

Classification, whether for groceries, books, music, or the countless other commodities that require effective browsing and retrieval, brings with it the challenge of organizing materials in a coherent manner. Console and computer games are no different in that respect, but they pose an additional challenge. As an interactive medium, games are largely defined by the choices and actions that players are allowed to make. The Steam Achievement Classification Analysis Project is a quantitative exploratory analysis of the rewards given to players in the form of “achievements”. By text-mining those achievements and clustering games based on them, it provides insight into how thematically similar games differ in the tasks players are directed to accomplish. As the research shows, although existing tags and categories can reasonably define groups of interactive games, those groupings can often mask serious variations in the interactive dimension of those games.

Surveying Usage of Academic Research in Journalism

With so many contradictory media articles citing research, knowing what to believe can be frustrating for everyday media consumers. Without diving into the academic text themselves, consumers can find it hard to know how much authority to attribute to the research cited (Is the journal being cited well respected? Are the findings controversial? Do similar studies exist with different results?). The particular factors that influence how much coverage academic research receives from journalists are not well understood, which further complicates the situation. We have gathered and analyzed descriptive data about media articles and the academic works they cite to help us understand what sort of research receives attention from journalists. Through this research we hope to give consumers a better understanding of the relationship between journalism and academic research, and thereby enable them to form more informed interpretations of the information they encounter in media.