July 17, 2024
Prioritizing Home Attributes Based on Guest Interest | by Joy Jing | The Airbnb Tech Blog | Feb, 2023

How Airbnb leverages ML to derive guest interest from unstructured text data and provide personalized recommendations to Hosts

By: Joy Jing and Jing Xia

At Airbnb, we endeavor to build a world where anyone can belong anywhere. We strive to understand what our guests care about and match them with Hosts who can provide what they’re looking for. What better source for guest preferences than the guests themselves?

We built a system called the Attribute Prioritization System (APS) to listen to our guests’ needs in a home: What are they requesting in messages to Hosts? What are they commenting on in reviews? What are common requests when calling customer support? And how does it differ by the home’s location, property type, price, as well as guests’ travel needs?

With this personalized understanding of what home amenities, facilities, and location features (i.e. “home attributes”) matter most to our guests, we advise Hosts on which home attributes to acquire, merchandize, and verify. We can also display to guests the home attributes that are most relevant to their destination and needs.

We do this through a scalable, platformized, and data-driven engineering system. This blog post describes the science and engineering behind the system.

What do guests care about?

First, to determine what matters most to our guests in a home, we look at what guests request, comment on, and contact customer support about the most. Are they asking a Host whether they have wifi, free parking, a private hot tub, or access to the beach?

To parse this unstructured data at scale, Airbnb built LATEX (Listing ATtribute EXtraction), a machine learning system that can extract home attributes from unstructured text data like guest messages and reviews, customer support tickets, and listing descriptions. LATEX accomplishes this in two steps:

The named entity recognition (NER) module uses textCNN (convolutional neural network for text) and is trained and fine-tuned on human-labeled text data from various data sources within Airbnb. In the training dataset, we label each phrase that falls into one of the following five categories: Amenity, Activity, Event, Specific POI (e.g. “Lake Tahoe”), or generic POI (e.g. “post office”).

The entity mapping module uses an unsupervised learning approach to map these phrases to home attributes. To achieve this, we compute the cosine distance between the candidate phrase and the attribute label in the fine-tuned word embedding space. We consider the closest mapping to be the referenced attribute, and can calculate a confidence score for the mapping.
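The mapping step can be illustrated with a minimal sketch. The three-dimensional vectors, the attribute names, and the function `map_phrase_to_attribute` are made up for illustration, and using cosine similarity as the confidence score is an assumption; the post does not say how LATEX computes its confidence.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def map_phrase_to_attribute(phrase_vec, attribute_vecs):
    """Return (attribute, cosine distance, confidence) for the closest attribute.

    Cosine distance is 1 - cosine similarity. The confidence used here is
    simply the cosine similarity, which is an illustrative assumption.
    """
    distances = {
        name: 1.0 - cosine_similarity(phrase_vec, vec)
        for name, vec in attribute_vecs.items()
    }
    best = min(distances, key=distances.get)
    distance = distances[best]
    confidence = 1.0 - distance
    return best, distance, confidence


# Toy embeddings (hypothetical values, not real model output).
ATTRIBUTE_VECS = {
    "hot tub": [0.9, 0.1, 0.0],
    "free parking": [0.0, 0.2, 0.9],
    "wifi": [0.1, 0.9, 0.1],
}
jacuzzi_vec = [0.8, 0.2, 0.1]  # embedding of the extracted phrase "jacuzzi"

attribute, distance, confidence = map_phrase_to_attribute(jacuzzi_vec, ATTRIBUTE_VECS)
print(attribute, round(distance, 3), round(confidence, 3))
```

In this toy space the phrase “jacuzzi” lands closest to the “hot tub” label, so it would be counted as a mention of that attribute.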

We then calculate how frequently an entity is referenced in each text source (i.e. messages, reviews, customer service tickets), and aggregate the normalized frequency across text sources. Home attributes with many mentions are considered more important.
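As a rough sketch of this aggregation, the snippet below normalizes mention counts within each source and then averages across sources. The mention lists are invented, and the equal weighting of sources is an assumption; the post does not specify the normalization or the weights.

```python
from collections import Counter


def normalized_frequencies(mentions):
    """Share of all mentions in one text source that refer to each attribute."""
    counts = Counter(mentions)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {attr: n / total for attr, n in counts.items()}


def aggregate_across_sources(mentions_by_source):
    """Average the per-source normalized frequencies (equal weights assumed)."""
    combined = Counter()
    for mentions in mentions_by_source.values():
        for attr, freq in normalized_frequencies(mentions).items():
            combined[attr] += freq
    n_sources = len(mentions_by_source)
    return {attr: total / n_sources for attr, total in combined.items()}


# Hypothetical attribute mentions already extracted from each text source.
mentions_by_source = {
    "messages": ["wifi", "wifi", "parking", "hot tub"],
    "reviews": ["hot tub", "hot tub", "wifi", "parking"],
    "support_tickets": ["wifi", "parking"],
}

scores = aggregate_across_sources(mentions_by_source)
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

Normalizing within each source first keeps a high-volume source, such as messages, from drowning out a low-volume one, such as support tickets.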

With this system, we’re able to gain insight into what guests are interested in, even highlighting new entities that we may not yet support. The scalable engineering system also allows us to improve the model by onboarding additional data sources and languages.

An example of a listing’s description with keywords highlighted and labeled by the Latex NER model.

What do guests care about for different types of homes?

What guests look for in a mountain cabin is different from what they look for in an urban apartment. Gaining a more complete understanding of guests’ needs in an Airbnb home enables us to provide more personalized guidance to Hosts.

To achieve this, we calculate a unique ranking of attributes for each home. Based on the characteristics of a home (location, property type, capacity, luxury level, etc.), we predict how frequently each attribute will be mentioned in messages, reviews, and customer service tickets. We then use these predicted frequencies to calculate a customized importance score that’s used to rank all possible attributes of a home.

For example, let us consider a mountain cabin that can host six people with an average daily price of $50. In determining what’s most important for potential guests, we learn from what’s most talked about for other homes that share these same characteristics. The result: hot tub, fire pit, lake view, mountain view, grill, and kayak. In contrast, what’s important for an urban apartment is parking, restaurants, grocery stores, and subway stations.

Image: An example image of a mountain cabin home
An example of home attributes ranked for a mountain cabin vs an urban apartment.
Image: An example of an urban apartment home

We could directly aggregate the frequency of keyword usage among similar homes. But this approach would run into issues at scale; the cardinality of our home segments could grow exponentially large, with sparse data in very unique segments. Instead, we built an inference model that uses the raw keyword frequency data to infer the expected frequency for a segment. This inference approach is scalable as we use finer and more dimensions to characterize our homes. This allows us to support our Hosts to best highlight their unique and diverse collection of homes.
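The sparsity problem with direct aggregation comes down to simple arithmetic, sketched below. The dimension names, their cardinalities, and the listing count are hypothetical round numbers chosen for illustration, not Airbnb figures.

```python
from math import prod


def segment_count(dimension_sizes):
    """Number of distinct segments when every dimension value can co-occur."""
    return prod(dimension_sizes.values())


# Hypothetical segmentation dimensions and how many values each can take.
dimensions = {
    "market": 1000,
    "property_type": 20,
    "capacity_bucket": 5,
    "price_tier": 5,
}
listings = 1_000_000  # hypothetical number of homes

base_segments = segment_count(dimensions)
print(base_segments, listings / base_segments)

# Adding a single extra dimension multiplies the segment count again.
finer = dict(dimensions, luxury_level=4)
finer_segments = segment_count(finer)
print(finer_segments, listings / finer_segments)
```

Under these assumed numbers the average segment holds only a couple of homes, and one more dimension leaves fewer than one home per segment. Raw per-segment keyword counts would then be too sparse to rank attributes reliably, which is the motivation for inferring expected frequencies with a model instead.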

How can guests’ preferences help Hosts improve?

Now that we have a granular understanding of what guests want, we can help Hosts showcase what guests are looking for by recommending which home attributes to acquire, merchandize, and clarify.

But to make these recommendations relevant, it’s not enough to know what guests want. We also need to be sure about what’s already in the home. This turns out to be trickier than asking the Host because of the 800+ home attributes we collect. Most Hosts aren’t able to immediately and accurately add all of the attributes their home has, especially since amenities like a crib mean different things to different people. To fill in some of the gaps, we leverage guests’ feedback on amenities and facilities they have seen or used. In addition, some home attributes are available from trustworthy third parties, such as real estate or geolocation databases that can provide square footage, bedroom count, or whether the home overlooks a lake or beach. We’re able to build a truly complete picture of a home by leveraging data from our Hosts, guests, and trustworthy third parties.

We utilize several different models, including a Bayesian inference model that increases in confidence as more guests confirm that the home has an attribute. We also leverage a supervised neural network WiDeText machine learning model that uses features about the home to predict the probability that the next guest will confirm the attribute’s existence.
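To illustrate how confidence can grow with guest confirmations, here is a minimal Beta-Bernoulli sketch. This is a textbook stand-in, not the model described in the post, whose details are not given; the uniform prior and the function name `attribute_posterior` are assumptions.

```python
def attribute_posterior(confirmations, denials, prior_alpha=1.0, prior_beta=1.0):
    """Posterior mean and variance of P(home has the attribute).

    Each guest response is treated as a Bernoulli observation with a
    Beta(prior_alpha, prior_beta) prior (uniform by default).
    """
    alpha = prior_alpha + confirmations
    beta = prior_beta + denials
    total = alpha + beta
    mean = alpha / total
    variance = (alpha * beta) / (total ** 2 * (total + 1))
    return mean, variance


# Two guests confirmed a crib versus twenty guests confirmed it.
mean_few, var_few = attribute_posterior(confirmations=2, denials=0)
mean_many, var_many = attribute_posterior(confirmations=20, denials=0)

print(round(mean_few, 3), round(var_few, 4))
print(round(mean_many, 3), round(var_many, 4))
```

With more confirmations the posterior mean moves toward 1 and the variance shrinks, which matches the behavior described above: the estimate becomes more confident as more guests agree.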

Together with our estimate of how important certain home attributes are for a home, and the likelihood that the home attribute already exists or needs clarification, we’re able to give personalized and relevant recommendations to Hosts on what to acquire, merchandize, and clarify when promoting their home on Airbnb.
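One way to picture how the two signals might combine is the small decision rule below. The thresholds, the function name `recommend_action`, and the rule itself are hypothetical; the post does not describe the actual logic.

```python
def recommend_action(importance, p_exists, min_importance=0.1, low=0.2, high=0.8):
    """Pick a Host recommendation for one attribute of one home.

    importance: customized importance score for this home (0..1, assumed scale)
    p_exists:   estimated probability that the home already has the attribute
    All thresholds are illustrative assumptions.
    """
    if importance < min_importance:
        return None  # guests rarely care about it here, so recommend nothing
    if p_exists >= high:
        return "merchandize"  # the home likely has it, so highlight it
    if p_exists <= low:
        return "acquire"  # the home likely lacks it, so suggest adding it
    return "clarify"  # uncertain, so ask the Host to confirm


# Hypothetical (importance, p_exists) estimates for a mountain cabin.
cabin = {
    "hot tub": (0.40, 0.95),
    "fire pit": (0.30, 0.05),
    "kayak": (0.20, 0.50),
    "subway station": (0.01, 0.00),
}
actions = {attr: recommend_action(imp, p) for attr, (imp, p) in cabin.items()}
print(actions)
```

Under these assumed thresholds, an important attribute that is probably present gets merchandized, one that is probably absent becomes an acquisition suggestion, and one with an uncertain status triggers a request for clarification.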

Cards shown to Hosts to better promote their listings.

What’s next?

This is the first time we’ve known what attributes our guests want down to the home level. What’s important varies greatly based on home location and trip type.

This full-stack prioritization system has allowed us to give more relevant and personalized advice to Hosts, to merchandize what guests are looking for, and to accurately represent popular and contentious attributes. When Hosts accurately describe their homes and highlight what guests care about, guests can find their perfect vacation home more easily.

We’re currently experimenting with highlighting amenities that are most important for each type of home (e.g. kayak for a mountain cabin, parking for an urban apartment) on the home’s product description page. We believe we can leverage the knowledge gained to improve search and to determine which home attributes are most important for different categories of homes.

On the Host side, we’re expanding this prioritization methodology to encompass additional tips and insights into how Hosts can make their listings even more desirable. This includes actions like freeing up popular nights, offering discounts, and adjusting settings. By leveraging unstructured text data to help guests connect with their perfect Host and home, we hope to foster a world where anyone can belong anywhere.

If this type of work interests you, check out some of our related positions at Careers at Airbnb!

It takes a village to build such a robust full-stack platform. Special thanks to (alphabetical by last name) Usman Abbasi, Dean Chen, Guillaume Guy, Noah Hendrix, Hongwei Li, Xiao Li, Sara Liu, Qianru Ma, Dan Nguyen, Martin Nguyen, Brennan Polley, Federico Ponte, Jose Rodriguez, Peng Wang, Rongru Yan, Meng Yu, Lu Zhang for their contributions, dedication, expertise, and thoughtfulness!