Big Data, Small Team: How HotelTonight’s Demand Prediction Runs On Machine Learning

Machine learning isn’t just for giants like Google, Facebook, and IBM anymore. These days, startups of all sizes are working to incorporate machine learning and AI into their core products. Thanks to simpler machine learning models, the toughest problems in tech are now an algorithm away from being solved.

At HotelTonight, we take the same approach to machine learning that we do to all of our engineering projects: We quickly build simple tools that make the user experience better. We’re a small, scrappy team, but we’ve been able to train machine learning models that understand market dynamics and predict hotel demand by leveraging our existing infrastructure and focusing on achievable goals.

In the past, we used machine learning to predict market dynamics and decide where we should focus our supply efforts. We trained the models with five years’ worth of data that we’d already accumulated, which cut costs and paved the way for us to use machine learning technologies in more of our customer- and hotel-facing products.

For our first external-facing machine learning effort, the Demand Prediction project, we enhanced a recent feature that guides our hotel partners on how to move up in our rankings. Instead of just encouraging them to improve their rankings, we wanted to show them exactly how much more demand they could expect if they increased their rank. That way, they could have a clearer view of the marketplace and better optimize their allotment, as well as provide a better customer service experience. Everybody wins.

Previously, our guidance lacked any description of the benefits of improving their rank.

To solve this, we could have built a complete model of booking behavior based upon the volatility of the market, the types of users visiting our app, or a multitude of other factors. Instead, we kept it simple with data that we already had available, and focused on predicting the number of rooms that will be sold at the current price for a particular hotel on a particular day.

We now can frame the guidance around the number of additional rooms that a hotel would sell by applying our suggestion. In this example, the hotel is under-allocated, meaning that they could sell more rooms than they currently have loaded in our marketplace, even without lowering their price.
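The arithmetic behind that framing is simple enough to sketch. The snippet below is an illustration only: the function names and inputs are hypothetical, and it assumes guidance is the gap between predicted demand at the current price and the rooms a hotel has loaded.

```python
def additional_rooms(predicted_demand: int, loaded_allotment: int) -> int:
    """Rooms a hotel could still sell at its current price beyond what it has loaded.

    Both arguments are hypothetical inputs: the predicted number of rooms sold
    at the current price, and the rooms currently loaded in the marketplace.
    """
    return max(0, predicted_demand - loaded_allotment)


def is_under_allocated(predicted_demand: int, loaded_allotment: int) -> bool:
    """A hotel is under-allocated when predicted demand exceeds its allotment."""
    return additional_rooms(predicted_demand, loaded_allotment) > 0


# Example: demand for 12 rooms is predicted, but only 8 are loaded.
print(additional_rooms(12, 8))    # 4 more rooms could be sold
print(is_under_allocated(12, 8))  # True
```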

We soon discovered that more limitations were needed based on what the data was telling us. For instance, some markets didn’t generate enough demand to allow us to train accurate models. It also became obvious that the last-minute nature of our app meant that we didn’t have enough data to give quality recommendations until there were fewer than 48 hours remaining in the booking window.
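These two limitations amount to a gate in front of the guidance. Here is a minimal sketch of such a gate, not our production code: the 48-hour cutoff comes from the paragraph above, while the minimum-bookings threshold and all names are assumptions made for illustration.

```python
MAX_HOURS_REMAINING = 48      # cutoff described above
MIN_TRAINING_BOOKINGS = 500   # hypothetical threshold, chosen only for this example


def can_show_guidance(market_training_bookings: int, hours_remaining: float) -> bool:
    """Return True only when both limitations are satisfied.

    market_training_bookings: historical bookings available for the market
    hours_remaining: hours left in the booking window
    """
    enough_data = market_training_bookings >= MIN_TRAINING_BOOKINGS
    close_enough = hours_remaining < MAX_HOURS_REMAINING
    return enough_data and close_enough


print(can_show_guidance(1000, 24))  # True: busy market, inside the window
print(can_show_guidance(1000, 72))  # False: too far from check-in
print(can_show_guidance(10, 24))    # False: market too quiet to model
```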

The last-minute nature of our app means that the majority of our bookings come on the same day as the check-in.

Again, we could have spun our wheels thinking about how to solve these problems in all sorts of convoluted ways. Instead, we settled on a strategy of training individual machine learning models, one for each combination of market and day of week. This wasn’t the first strategy we tried, but having all of the data we needed to train new models at our fingertips made it easier to iterate on our models and try new things. The more granular models far outperformed the generic models, taking us from an average accuracy of 73 percent to 87 percent.
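The one-model-per-segment idea can be sketched in a few lines. This is an illustration under loud assumptions: the "model" here is just a historical mean standing in for whatever learner is actually trained, and the row layout, function names, and sample data are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def train_segment_models(rows):
    """Fit one stand-in model per (market, day_of_week) segment.

    rows: iterable of (market, day_of_week, rooms_sold) tuples.
    Each "model" is only the mean of historical rooms sold, a placeholder
    for a real learner.
    """
    grouped = defaultdict(list)
    for market, day_of_week, rooms_sold in rows:
        grouped[(market, day_of_week)].append(rooms_sold)
    return {segment: mean(values) for segment, values in grouped.items()}


def predict(models, market, day_of_week):
    """Predict rooms sold for a segment, or None if no model was trained for it."""
    estimate = models.get((market, day_of_week))
    return None if estimate is None else round(estimate)


# Made-up history for illustration only.
history = [("SF", "Fri", 10), ("SF", "Fri", 14), ("SF", "Mon", 3)]
models = train_segment_models(history)
print(predict(models, "SF", "Fri"))   # 12
print(predict(models, "NYC", "Fri"))  # None: no data for that segment
```

Keying the models by segment keeps each one small and makes it cheap to retrain or swap out a single market without touching the rest.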

No amount of data can change the fact that demand itself can be unpredictable. One of the biggest challenges with predictive modeling is ensuring users don’t lose trust when your predictions aren’t always right. Even small errors like being off by one predicted booking can harm your credibility. We tackle this by closely monitoring the predictions we’re making and identifying when a model isn’t returning accurate results. Then we iterate on the model and work to make it more accurate.

We measure our model’s performance by the percentage of predictions that fall within two bookings of the actual number. This chart shows the model getting more accurate as we get closer to the end of the booking window.
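As a rough sketch, that metric could be computed as follows. The function name and sample numbers are made up, and the sketch assumes "within two bookings" is inclusive, so a prediction that is off by exactly two still counts as a hit.

```python
def within_tolerance_rate(predicted, actual, tolerance=2):
    """Fraction of predictions within `tolerance` bookings of the actual count."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one prediction is required")
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance)
    return hits / len(predicted)


# Made-up numbers: the errors are 1, 4, 0, and 2 bookings.
rate = within_tolerance_rate([5, 8, 2, 10], [6, 4, 2, 12])
print(f"{rate:.0%}")  # 75%
```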

We also keep an eye on our predictions at the hotel level, so we can identify the partners that our algorithms really struggle with, and potentially stop trying to give them this type of guidance. Taking the time to add monitoring to our data pipeline, model creation, and prediction accuracy gave us much more confidence in the feature.
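Hotel-level monitoring of this kind might look like the sketch below. It is an assumption-laden example rather than our pipeline: the record layout, the 70 percent cutoff, and the function name are hypothetical, and it reuses the two-booking tolerance from the accuracy measure above.

```python
from collections import defaultdict


def struggling_hotels(records, tolerance=2, min_rate=0.7):
    """Return the hotel ids whose within-tolerance rate falls below `min_rate`.

    records: iterable of (hotel_id, predicted, actual) tuples.
    min_rate is a hypothetical cutoff chosen only for this example.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for hotel_id, predicted, actual in records:
        totals[hotel_id] += 1
        if abs(predicted - actual) <= tolerance:
            hits[hotel_id] += 1
    return sorted(h for h in totals if hits[h] / totals[h] < min_rate)


# Made-up records: hotel "a" is predicted well, hotel "b" is not.
records = [("a", 5, 5), ("a", 3, 4), ("b", 10, 2), ("b", 4, 4)]
print(struggling_hotels(records))  # ['b']
```

Hotels returned by a check like this are candidates for having the guidance switched off until the model improves for them.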

The Demand Prediction project has moved our engineering team forward in some really exciting ways. It forced us to formalize our real-time event stream, to vet the machine learning solutions that were available to us, and to think about and understand the factors that influence demand in our markets.