Garbage In, Garbage Out (GIGO)

“Garbage in, garbage out” is a term I learned shortly after joining Emetry. Although I did not have a catchy phrase for it before joining, I understood firsthand the challenges of working with dirty data, cleaning it, and trying to extract insights from it. The problem is that you need rich, reliable data to know how to move forward and iterate over time.

Why It Exists

There are many reasons why dirty data exists. Poor data collection, system migrations, and incompatible tech stacks are a few of the worst offenders. Many winery commerce systems include tools to dedupe customers, but often the problem does not get enough time or attention to make a difference.

Why Dirty Data Matters

The problem with ignoring dirty data, and not solving the root causes of how it gets into your system, is that it compounds over time. Most wineries have years and years of customer history that spans system changes and new employees. Employees who have been around a while know what to look for. They can work around the dirty data challenges by filtering spreadsheets and creating ad hoc member types that exclude or include customer groups. The problem is that when these employees leave, their knowledge of the database and of the skeletons in its closet often leaves with them.

Getting Insights from Dirty Data Is Hard

Dirty data is detrimental to your business. It skews metrics, making it more likely that companies measure things that don’t matter. I believe this is also why so many spray-and-pray email models exist.

Why Spray and Pray Has Prevailed

There are so many wonderfully smart marketers out there, but that hasn’t changed the fact that, aside from basic segmentation (think wine club, purchased Zin before, etc.), we primarily rely on mass emails. The problem is that, over time, customers are continuously bombarded with content that is not relevant to them or to where they are in their journey with your brand. This leads to a deterioration of brand interest. Engagement does not always mean more emails. However, if you build relationships with timely and relevant touchpoints, you are far more likely to keep that customer on a journey with your brand. To send better content and build up relationships, you need clean data.

Bad Data = Poor Results

Your data fidelity also affects the effectiveness of any third-party systems you use. This is where “garbage in, garbage out” applies. Flawed data coming into any system will produce erroneous results regardless of the quality of the analytics. For example, if your tasting room data is not flowing back into your database, you have no way of knowing when that same customer purchases via your website, or that they started their journey with your brand at your cellar door.

Or consider the use of the generic TR Customer. This is the placeholder customer that many wineries use to capture tasting room sales, especially during peak traffic periods. Unfortunately, this type of customer not only prevents you from following a consumer through their journey and re-engaging them, but it can also skew analysis around customer lifetime value, market opportunities, and segmentation.
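To make the skew concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the order totals, the customer IDs, and the "TR-GENERIC" placeholder ID are assumptions, and total spend per customer is only a crude stand-in for lifetime value. It compares the average with and without the placeholder record.

```python
from collections import defaultdict

# Hypothetical orders as (customer_id, order_total) pairs.
# "TR-GENERIC" is an assumed ID for the generic tasting room customer.
ORDERS = [
    ("C1", 120.0), ("C1", 80.0),
    ("C2", 60.0),
    ("TR-GENERIC", 45.0), ("TR-GENERIC", 90.0),
    ("TR-GENERIC", 30.0), ("TR-GENERIC", 75.0),
]


def average_spend_per_customer(orders, exclude=()):
    """Average total spend per customer ID, skipping any IDs in `exclude`."""
    totals = defaultdict(float)
    for customer_id, order_total in orders:
        if customer_id not in exclude:
            totals[customer_id] += order_total
    if not totals:
        return 0.0
    return sum(totals.values()) / len(totals)


with_placeholder = average_spend_per_customer(ORDERS)
without_placeholder = average_spend_per_customer(ORDERS, exclude={"TR-GENERIC"})
print(round(with_placeholder, 2), round(without_placeholder, 2))
```

In this toy data, four separate tasting room sales collapse into one "customer" who looks like your biggest spender, which pulls the average up. Excluding the placeholder fixes the average but still leaves those four buyers invisible, which is the real cost.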

Not investing in things like data hygiene, new skills, and customer understanding continually creates data debt that causes challenges in the future. The longer you wait, the harder it will be to clean the data, because the volume of debt keeps growing.

Cleaning Your Data

Depending on what system you are on, cleaning your database can be quite the undertaking. Start slow, determine your most significant pain points, and pick a place to start. 

Here are the Common Types of Dirty Data

  1. Incomplete data: Key fields are empty, like address information, birthdate, or email. Without this information, you cannot segment your sales and marketing by age or location, or even contact the customer.
  2. Duplicate data: Most wineries deal with duplicate customer records. While there are tools to dedupe, duplicates that are left in place inflate your database and make it impossible to track a customer’s sales accurately.
  3. Incorrect data: Field values fall outside the valid set of values for that field. For example, the state code field contains California instead of CA, or an email is entered in the address field.
  4. Inaccurate data: Similar to incorrect data, except that the value is technically valid but wrong. The best example here is birthdate. 1/01/1900, anyone? Typos in addresses also fall into this category. Small errors in an address can cause wine to be delivered to the wrong place.
  5. Poorly documented processes: Undocumented member types, products that are set up without vital information like varietal or packaging size, and new accounts created for anonymous customers (as a marketer, this one hurts my heart).
  6. Inconsistent data: Data redundancy spread across systems. For example, most wineries have customer information in multiple systems, and the data is often not kept in sync.
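Several of these categories can be flagged with a simple script before any cleanup begins. The sketch below is illustrative only: the record layout, field names, valid state list, and placeholder birthdate are all assumptions, not the export format of any particular commerce system. It flags incomplete, duplicate (by normalized email), incorrect (state code), and inaccurate (placeholder birthdate) records.

```python
from collections import Counter

# Hypothetical customer records; field names are assumptions.
CUSTOMERS = [
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com",
     "state": "CA", "birthdate": "1984-06-12"},
    {"id": 2, "name": "Ann Lee", "email": "ANN@example.com ",
     "state": "California", "birthdate": "1984-06-12"},
    {"id": 3, "name": "Bo Diaz", "email": "",
     "state": "OR", "birthdate": "1900-01-01"},
]

VALID_STATES = {"CA", "OR", "WA"}        # shortened list, for illustration only
PLACEHOLDER_BIRTHDATES = {"1900-01-01"}  # assumed default entered at checkout


def normalize_email(email):
    """Trim whitespace and lowercase so near-identical emails compare equal."""
    return email.strip().lower()


def audit(customers):
    """Return {customer id: list of dirty-data flags} for each record."""
    email_counts = Counter(
        normalize_email(c["email"]) for c in customers if c["email"].strip()
    )
    report = {}
    for c in customers:
        issues = []
        if not c["email"].strip() or not c["birthdate"].strip():
            issues.append("incomplete")
        if c["email"].strip() and email_counts[normalize_email(c["email"])] > 1:
            issues.append("duplicate")
        if c["state"] not in VALID_STATES:
            issues.append("incorrect")
        if c["birthdate"] in PLACEHOLDER_BIRTHDATES:
            issues.append("inaccurate")
        report[c["id"]] = issues
    return report


for customer_id, issues in audit(CUSTOMERS).items():
    print(customer_id, issues)
```

Matching on email alone will miss duplicates that use different addresses, so treat a report like this as a way to size the problem and pick a starting point, not as a complete dedupe.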

Some of these are fixable internally with time and a little spreadsheet handiwork. Depending on what you’re working with, you might consider appending your data with a profile stitching company. If you are just looking for the basics (name, address, birth date, etc.), it’s pretty affordable. If you decide to go this route, keep in mind that not all stitching companies are equal. Be sure to ask about their match rates and how they source their data.

Preventing Future Dirty Data 

Once you have started a data cleanse project, don’t undo all of your hard work by letting dirty data wiggle its way back in. Collecting clean data at the point of sale is likely going to be a challenge, but this post is a great place to start. Take the time to make all your systems get along with each other. In an ideal world, everything will be connected: Google Analytics, Commerce, Email, Point of Sale, Social…I could keep going. Realistically, at a minimum, your point of sale, e-commerce, and wine club systems should all integrate flawlessly. From there, add email. You will save yourself time, energy, and gray hairs by not continually importing and exporting every time you want to send an email (another reason spray and pray occurs).

A Future with Clean Data Is Bright

Having your email system integrated also enables timely automated emails beyond “Yay, Your Order Shipped”. Just imagine a world where your wine club member receives a shipment and, two weeks later, is greeted with an email survey asking how it was. Based on their response, you can invite them to reorder in 3-4 weeks, or someone on your team can reach out personally. This is just the beginning. There is a vast world of possibilities out there when we take steps to clean our data and leverage the right systems and software for our businesses.