iSchool Capstone

2020


Patent Evaluator

Applying for a patent is currently an expensive process, requiring many work hours and high costs to create, edit, and submit applications to the United States Patent and Trademark Office. This project used machine learning methods to analyze method claim text and predict whether a method claims application would be rejected. It aims to help stakeholders use the results from these models to streamline their patent application process and, in the future, to apply the models to better determine which words may contribute to an application being rejected.

Project Detox

Among gamers playing the top 15 games, roughly 70% have experienced online abuse. Project Detox empowers the Gaming Safety team at Microsoft to provide a safe environment for their customers worldwide. We are implementing an automated testing framework for their toxicity classifiers. After comparing the models' performance against each other and on different kinds of data, we generated intuitive reports that enable stakeholders to make data-driven decisions. All of this has been packaged into pipelines that automate the process and eliminate manual work.

PUSHSPRING Marketplace Search

Our marketplace search recommender system is a web app for exploring the apps related to a persona or a given mobile app, based on PushSpring's local user data rather than generic results such as those Google gives. The system was implemented with both CNN and NLP techniques to maximize diversity and relevance in recommendations. Combining the benefits of content-based and user-based recommendation models, our model produces valuable insights that can't be found elsewhere.

SightLife International Operations Global Scorecard

SightLife is a leading global health organization working with low- and middle-income countries, aiming to eradicate corneal blindness by 2040. Historically, metric owners have tracked their progress using manual platforms, leading to a higher chance of errors. Our web application enables metric owners to easily enter data and visualize trends. This helps them identify pain points faced by the team, allowing them to make informed and effective decisions. The dashboard is home to all metric areas and gives leadership and other teams a platform to see how each area is doing across the organization.

SkillEdge

In a world with infinite demands on our time, it can be difficult to commit to learning something new. Our attention gets scattered and diffused. SkillEdge solves this issue by combining industry research and organizational techniques to make learning new skills achievable, productive, and fun. We do this through analytic charts, based on your reported behavior, that show you where you may need to improve and where your strengths are.

The modern product manager’s toolkit! Sentiment Analysis and Topic Modeling

Smartsheet is an enterprise platform that helps automate processes across a broad array of use cases and strives to improve its offerings based on customer feedback. Understanding and leveraging unstructured comments and emails from customers is challenging, so we propose a combination of topic modeling and sentiment analysis techniques to gather actionable insights. These insights are aggregated and presented to stakeholders as a dashboard, enabling them to make data-driven decisions that improve user experience. The solution focuses on identifying negative feedback from users in order to highlight potential areas of improvement and inform the future product roadmap.

Venture Funding and Patent Portfolio

With the rise in venture capital investments, it is necessary to provide investors with information beyond traditional data sources. A good investment indicator is a company's patent data. Patents let investors gauge the technical capability and intellectual property of a company, which enables sounder investments. However, a major drawback is the lack of a consolidated data source providing all of this information. The project aims to resolve this by developing a novel algorithm for fast fuzzy matching of companies between Pitchbook funding data and United States Patent and Trademark Office (USPTO) data.

Which is the Safest Car for You? Deriving Safety Score from Real World Collision Data

Which car model is the safest in a collision? The National Highway Traffic Safety Administration has a popular car safety rating system based on crash tests in controlled environments, but what about a safety rating based on real-world accident data? Using extensive vehicle collision records from across the US, provided by VinAudit, we built a regression model to evaluate vehicle safety for each make/model/year and created a web application that allows people to look up and browse safety scores. We provide a brand-new perspective for consumers who care about car safety.

2019


Analyzing The Gates Foundation's Grant Giving Process

The Gates Foundation Grand Challenges has teamed up with InfoVision in order to develop a point-in-time first encounter analysis and bias report regarding their current blind review methodology. The Grand Challenges Explorations (GCE) team needs to know whether their “blind review” process of grants is working to ensure all types of organizations are being considered and brought into the loop. There should also be no biases altering decisions throughout.

Carrier Network Traffic Analysis - Data-driven solutions to keep the world connected

Network traffic and accurate location data draw significant interest from network operators and many other industries for their marketing and service improvement potential. At the same time, such data draw significant scrutiny due to privacy concerns. This project attempts to address those concerns by building a machine learning model based on truly anonymized data from which all privacy-related information has been stripped. The solution helps improve consumer confidence by eliminating privacy exposure, while allowing the sponsor to save significant resources by simplifying the process of wrangling data from complex and fragmented sources.