Listings are NOT Properties

When I started working at realtor.com, a colleague who had been working in real estate tech for a long time told me something very useful that I’ll never forget. It’s one of the first things I now say to any team, stakeholder, or other colleague when we discuss real estate data.

Listings are not properties.

This baffled me at first. How could listings NOT be properties? Aren’t properties the thing that the listing is selling? Well, yes and no. To understand, let’s back up and get a larger picture of the data. Understanding this difference will save you and your team hours of headaches and prevent you from making the biggest newbie mistake I’ve seen teams make.

There are two categories of property data

Property and property-related data is at the center of everything for both our real estate and mortgage operations. There are two main categories of property data: property records and listing records.

As a rule of thumb, think of property data as the facts around a physical asset (a building or parcel of land), while the listing is an advertisement (it doesn’t even need to be tied to a physical property, as in the case of a new home build or floor plan).

Think of the property as a hot band on tour (I love The Killers), while listings are the flyers in every city advertising where the band is playing and when. Now you’re closer to understanding the difference.

What are properties and where do they come from?

Well, simply put, properties are physical assets that exist in the real world, while a listing is an advertisement of that asset when it comes up for sale. Property data typically comes from tax records compiled by county auditors, and it includes all of the facts about the property.

The easiest way to get property data is from an aggregator such as CoreLogic, Data Tree/First American, or ATTOM. These companies can also offer other data products (including proprietary information) that provide even more details about a property, such as rooftop geocodes, tax history, automated valuation models (AVMs), deed and mortgage history, and even liens. These data feeds are licensed by the aggregator, which sets the terms and conditions. There are roughly 150M properties in the United States across roughly 3,100 counties.

What are listings and where do they come from?

Listing data comes directly from Multiple Listing Services (MLSs). As I mentioned, listings are essentially advertisements that an agent or licensed salesperson compiles when the property goes on the market for sale. The data is technically owned by the listing brokerage, but the MLS has a fiduciary duty to store the data, provide a search interface of the listings to agents, and license the data to brokerages in accordance with National Association of REALTORS (NAR) policy and/or the local real estate association policies.

When an agent adds a listing to their local MLS, the listing record contains facts about the property (just like in the public property data) AND copyrighted information such as photos, the listing description, and the listing status (whether the listing is active on the market, in contract, sold, or whatever other status depending on the local market).

Unlike property data, there is no single source for listing data in the United States. There are roughly 500 MLSs in the United States, and if you want every listing you have to go to each one individually (there is a long tail where roughly 300 MLSs account for over 90% of US listings).

I’ll write more on the steps you have to go through to get listing data from MLSs in another post. For now I’ll talk about the differences between property and listing data and how they might affect your decisions.

The “Duplication” problem with listings

When you work with listing data from multiple MLSs, especially where MLSs are grouped around a particular geographical area, you’ll quickly find that there are often multiple listing records for the same property. These so-called “duplicate” (I actually prefer to call them “siblings”) records can be a source of confusion for a lot of teams new to working with real estate listings. 

In fact, if you remember that listings are simply advertisements, it makes a little bit more sense. The geographical boundaries of MLSs aren’t always clear-cut, and you’ll often see MLSs overlap each other in many markets. For example, Georgia has a state-wide MLS (GAMLS), but there’s also First MLS (FMLS), which serves Atlanta and its surrounding neighborhoods. In the Atlanta area, it’s very common to see the same property listed in both MLSs as agents try to expand their marketing reach as widely as possible.
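To make the "siblings" idea concrete, here is a minimal Python sketch that groups listing records from different MLSs that appear to advertise the same place. The `Listing` fields, the `sibling_key` helper, and the sample addresses are all hypothetical, and the address normalization is deliberately naive. A real pipeline would need a proper address standardizer and would still miss listings with withheld addresses.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    mls: str          # source MLS, e.g. "GAMLS" or "FMLS"
    listing_id: str   # ID assigned by that MLS (only unique within it)
    address: str
    postal_code: str


def sibling_key(listing: Listing) -> tuple[str, str]:
    """Naive grouping key: cleaned-up street address plus 5-digit ZIP.

    Assumption for illustration only; real address matching is much harder.
    """
    cleaned = listing.address.upper().replace(".", "").replace(",", "")
    normalized = " ".join(cleaned.split())  # collapse repeated whitespace
    return (normalized, listing.postal_code.strip()[:5])


def group_siblings(listings: list[Listing]) -> list[list[Listing]]:
    """Group listings that share a key, keeping every record (no deletion)."""
    groups: dict[tuple[str, str], list[Listing]] = defaultdict(list)
    for listing in listings:
        groups[sibling_key(listing)].append(listing)
    return list(groups.values())
```

Note that the sketch keeps every sibling rather than discarding "duplicates", since each record is a separate advertisement that may carry its own photos, description, and status.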

On the other hand, when it comes to property data, you’ll see less duplication from any particular single source. Each property record in the United States has a Federal Information Processing Standard (FIPS) code for the county and an Assessor’s Parcel Number (APN) for the property’s parcel. APNs by themselves aren’t guaranteed to be unique, as it’s up to each county how it generates and assigns IDs. Combining the APN with the FIPS code should give you a decent unique identifier, and the aggregators themselves might do additional work on their end to generate their own unique property IDs.
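As a rough illustration of the FIPS-plus-APN identifier, the Python sketch below builds a composite key. The `PropertyKey` type and `make_property_key` function are hypothetical names, and the APN cleanup (dropping punctuation and uppercasing) is an assumption made for the example. Counties format APNs differently, so check whether that cleanup is safe for your data before relying on it.

```python
from typing import NamedTuple


class PropertyKey(NamedTuple):
    fips: str  # 5-digit county FIPS code (2-digit state + 3-digit county)
    apn: str   # parcel number after the assumed cleanup below


def make_property_key(fips: str, apn: str) -> PropertyKey:
    """Combine county FIPS and APN into one composite identifier."""
    # Restore leading zeros that are often lost when FIPS is stored as a number.
    fips_norm = fips.strip().zfill(5)
    # Assumed normalization: keep only letters and digits, uppercased.
    apn_norm = "".join(ch for ch in apn if ch.isalnum()).upper()
    if not apn_norm:
        raise ValueError("APN is empty after normalization")
    return PropertyKey(fips_norm, apn_norm)


# The same APN string in two different counties yields two different keys.
key_a = make_property_key("6037", "4321-005-012")
key_b = make_property_key("13121", "4321-005-012")
```

Treating the key as a pair rather than a concatenated string avoids accidental collisions and keeps the county easy to filter on.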

The problem with addresses

The fact that the property-to-listing relationship is 0:N (in computer science terms this means “zero to many”) really shows itself when you start to grapple with addresses for listings. Listings don’t have to be tied to a specific parcel (as with a floor plan for a new build), and in some cases the seller can withhold the address. If you are trying to use addresses to link listing and property records together, this can create some real headaches. Also consider the fact that there are parcels with multiple addresses (such as multi-family units or cases where there’s one address for deliveries and another for utilities).
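One way to picture this in a data model is a listing record where both the address and the property link are optional. The Python sketch below is only an illustration: `ListingRecord`, its fields, and `partition_by_link` are hypothetical names, not part of any MLS or aggregator schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ListingRecord:
    listing_id: str
    address: Optional[str] = None      # None when the seller withholds it
    property_id: Optional[str] = None  # None when no parcel could be linked


def partition_by_link(
    listings: list[ListingRecord],
) -> tuple[dict[str, list[ListingRecord]], list[ListingRecord]]:
    """Split listings into those linked to a property and those that are not."""
    linked: dict[str, list[ListingRecord]] = {}
    unlinked: list[ListingRecord] = []
    for listing in listings:
        if listing.property_id is None:
            unlinked.append(listing)  # still a valid listing, just unlinked
        else:
            linked.setdefault(listing.property_id, []).append(listing)
    return linked, unlinked
```

The point of the sketch is that unlinked listings are a normal, expected bucket and not an error state to be cleaned away.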

The biggest mistake I’ve seen people make

The biggest newbie mistake I’ve seen is designing a system that creates a tight rather than a loose coupling between properties and listings. This is particularly fatal if your goal is to develop a listing search. If you architect your system around properties and then attach listings, expecting them to be a perfect match, you are setting yourself up for trouble. The best approach is to loosely couple listing and property records and build your searches to treat them separately, depending on your specific use cases.
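A minimal Python sketch of what loose coupling could look like follows, assuming listings and properties live in separate collections joined by an optional link table. The function name, the dictionary shapes, and the `price` field are all hypothetical. The search runs over listings alone and only decorates results with property facts when a link happens to exist.

```python
from typing import Any, Optional


def search_listings(
    listings: list[dict[str, Any]],
    links: dict[str, str],                  # listing_id -> property_id (may be sparse)
    properties: dict[str, dict[str, Any]],  # property_id -> property facts
    max_price: int,
) -> list[dict[str, Any]]:
    """Search listings on their own; attach property data only if linked."""
    results = []
    for listing in listings:
        if listing["price"] > max_price:
            continue
        property_id: Optional[str] = links.get(listing["listing_id"])
        # A missing link or missing property never drops the listing.
        prop = properties.get(property_id) if property_id is not None else None
        results.append({**listing, "property": prop})
    return results
```

Because the link table is separate, it can be rebuilt or corrected as matching improves without touching either the listing records or the property records.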

Hopefully, now that you understand the differences between property data and listing data, you can more effectively develop your systems to handle both types of records. Fully understanding the differences will save you and your team months of headache and frustration, especially as you scale out your solution to multiple markets. Good luck!