Hello everyone. If you haven’t read the LinkedIn article on the power of BQML, which compares its prowess with custom algorithms by sheer numbers, then please click here, on this LinkedIn link.

To reiterate the case study from LinkedIn: this is about a hotel where 60-70% of bookings come from OTAs (Online Travel Agents). For this reason, OTAs get a discount of up to 30-40% on the booking price. Part of this discount is passed on to the customer by the OTA, and the rest they keep for themselves.

Needless to say, with a 30-40% discount on 60-70% of bookings, the hotel’s margin is taking a big hit. The hotel now wants to increase margins from these customers and is looking for ideas.

One of the ideas that popped up was selling ancillary services to these customers, such as room service, F&B, gym, spa and massage. To increase adoption of these services, the marketing team came up with the idea of running a campaign with 3 touch points:

  1. Awareness: Introduction to the services on the day of the booking
  2. Tactical/Conversion: 4 days prior to the visit, a user gets a 10% discount voucher for these services
  3. Top of mind: On the day of the visit, again an awareness campaign about the services

This idea seemed OK to the product team, but providing an additional 10% discount to everyone seemed a little counterintuitive when the biggest reason for launching this campaign is to bolster margins. This is when the data team was brought in and asked how the marketing strategy could be improved. It was decided that 3 groups of customers would be created:

  • Strong ayes: Customers who are most likely to avail ancillary services. These folks won’t get the offer
  • Strong nays: Customers who are least likely to avail ancillary services. These folks won’t get the offer either
  • The maybes: Customers who have some likelihood of taking up the offer. These folks will get the 10% discount offer, as suggested by the marketing team

While working with the client, we have historical data on all the customers who came from OTAs, along with information on whether they ordered any ancillary services during their visit. The database has the following variables:

  1. Customer ID: Each customer has a unique customer ID
  2. Room type: What kind of room was booked, ‘Standard’, ‘Deluxe’ or ‘Premium’
  3. Gender: Gender of the person who made the booking
  4. Age: Age of the primary occupant
  5. Days: Number of days the booking is made for
  6. Revenue: Revenue made from the booking
  7. Reason: ‘Business’, ‘Personal’ or ‘Event’ visit
  8. OrderFlag: A field marked ‘1’ or ‘0’. ‘1’ if an ancillary order was placed, and ‘0’ if no order was placed.

Now that you have read the case study, you can download the data from here:

Do remember that you will have to encode the text fields as numeric fields before you can apply the classifier algorithms. If you have any difficulty with this, please download the encoded data from here.
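
If you would like to encode the text fields yourself, here is a minimal sketch of one-hot encoding in plain Python. The column names (`RoomType`, `Gender`, `Reason`) and the two sample rows are assumptions made up for illustration; the actual download may use different names and values.

```python
# Hypothetical sample rows; column names and values are illustrative only.
rows = [
    {"CustomerID": 1, "RoomType": "Deluxe", "Gender": "F", "Age": 34,
     "Days": 2, "Revenue": 400.0, "Reason": "Business", "OrderFlag": 1},
    {"CustomerID": 2, "RoomType": "Standard", "Gender": "M", "Age": 51,
     "Days": 5, "Revenue": 650.0, "Reason": "Personal", "OrderFlag": 0},
]
TEXT_FIELDS = ["RoomType", "Gender", "Reason"]  # assumed text columns


def one_hot_encode(rows, text_fields):
    """Replace each text field with one 0/1 column per observed category."""
    # Collect the categories that actually occur in the data for each field.
    categories = {f: sorted({r[f] for r in rows}) for f in text_fields}
    encoded = []
    for r in rows:
        # Keep the numeric fields unchanged.
        out = {k: v for k, v in r.items() if k not in text_fields}
        for f in text_fields:
            for c in categories[f]:
                out[f"{f}_{c}"] = 1 if r[f] == c else 0
        encoded.append(out)
    return encoded


encoded = one_hot_encode(rows, TEXT_FIELDS)
print(encoded[0])
```

One-hot encoding is only one option. It avoids implying an order between categories such as ‘Business’ and ‘Personal’, at the cost of adding more columns.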

Once you have downloaded the data, I would also recommend normalizing it before you apply the algorithms.
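
As an illustration, normalization could be done with min-max scaling, which maps each numeric column to the range 0 to 1. This is a sketch of one common choice, not necessarily the one used in the LinkedIn article, and the sample values below are made up.

```python
def min_max_scale(values):
    """Scale a list of numbers linearly so the minimum is 0 and the maximum is 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical values for the Age and Revenue columns.
ages = [20, 30, 40]
revenues = [100.0, 100.0, 100.0]

print(min_max_scale(ages))      # [0.0, 0.5, 1.0]
print(min_max_scale(revenues))  # [0.0, 0.0, 0.0]
```

Scaling matters most for distance-based or gradient-based classifiers, where a large-valued column such as revenue could otherwise dominate smaller ones such as days.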

Once you have created your classification models, go ahead and identify your 3 customer groups, that is:

  • Strong Ayes
  • Strong Nays
  • Maybes
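
One possible way to form these three groups is to threshold the probability your classifier predicts for `OrderFlag = 1`. The sketch below assumes cut-offs of 0.3 and 0.7, which are hypothetical and should be tuned against your own model and the campaign budget; the probabilities shown are made up as well.

```python
def segment(prob, low=0.3, high=0.7):
    """Assign a customer to a group from the predicted order probability.

    `low` and `high` are assumed cut-offs, not values from the case study.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if prob >= high:
        return "Strong Aye"   # likely to order anyway: no discount
    if prob <= low:
        return "Strong Nay"   # unlikely to order even with a discount
    return "Maybe"            # gets the 10% discount voucher


# Hypothetical predicted probabilities keyed by customer ID.
predictions = {101: 0.92, 102: 0.05, 103: 0.48}
groups = {cid: segment(p) for cid, p in predictions.items()}
print(groups)
```

Only the customers labelled "Maybe" would then receive the voucher in the second touch point.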

I hope all this information helps you take a stab at a very relevant hotel problem. Please feel free to put your comments below.

