By Sean Duffy on Sunday, 17 November 2024
Category: Email Strategy

Why your ESP’s data model matters

"They basically all do the same thing".

That’s how a lot of marketers view ESPs, and if we only consider the basic purpose of sending and tracking a bunch of emails then yes, this is true. But those of you in the Only Influencers (OI) community know this isn’t really the case.

Most of you have probably used an ESP that really wasn’t suited to your needs. That doesn’t necessarily mean it’s a bad platform - you can’t get hundreds or even thousands of users with something fundamentally bad - but it shows how every brand needs to find the right fit for them.

And one of those key points of differentiation is the underlying data model of the ESP. That has changed dramatically over the years and there is a trend towards what are called ‘schema-less’ data models. But is that right for everyone? And what are the advantages and disadvantages of each?

Let’s take a tour through the history of ESPs and their data models to work that out.

But before we do that I’d just like to point out I won’t be mentioning any vendor names. I don’t want this article to sell a specific vendor, nor criticise others. I’ll also be generalising - yes, there are exceptions to most rules!

The original way

When the first email platforms arrived in the 1990s, most enterprise brands were still on AS/400 systems, arguably technology closer to the first ever computers that were invented to help crack the Enigma code in WW2 than the cloud of today.

And marketers didn't really have any tremendously advanced requirements - merging in customer name was often as advanced as it got. You'd load in your CSV file, often doing all the segmentation outside of the ESP, and then maybe extract out bounce and unsubscribe info so your data team could update the goliath data warehouse.

With only 'flat' fields per customer (e.g. you can store their name because that exists once, but you can’t store what they have ordered because there could be many of those) segmenting on past behaviours inside the platform was really limited (unless you added some serious hacks) and triggers weren't even a thing back then.

We have moved on from here, but this history still matters. Many of these original platforms are still around and still sit on the same foundations, which makes it difficult for them to grow their capabilities. Trying to bring an original platform up to 2024 standards is like trying to build a skyscraper on the foundations of a bungalow.

And trust me, they still exist!

The relational way

As email marketing matured, marketers needed more data in their platforms to segment and personalise. A new breed of platforms arrived that typically get referred to as 'legacy' today.

These gave marketers the ability to add additional tables of data such as orders, what items were in the order, product data and anything else they had. Some simplified these to create a standard data model based around ecommerce while others gave you carte blanche to add whatever you needed. This is where the term ‘relational’ came from - the extra tables of data about what the customer ordered, what they’ve added to a wish list and so on all ‘relate’ to the customer record.
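To make the idea of 'relating' tables concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names, columns and sample rows are all hypothetical and not taken from any vendor; the point is only that a segment such as "everyone who has ordered footwear" becomes a join from the customer record out to the related tables.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, first_name TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(id),
        status TEXT,
        ordered_on TEXT
    );
    CREATE TABLE order_items (
        order_id INTEGER REFERENCES orders(id),
        product TEXT,
        category TEXT
    );
""")
conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", [
    (1, "ana@example.com", "Ana"),
    (2, "ben@example.com", "Ben"),
])
conn.executemany("INSERT INTO orders VALUES (?, ?, ?, ?)", [
    (10, 1, "shipped", "2024-03-02"),
    (11, 1, "shipped", "2024-09-14"),
    (12, 2, "shipped", "2024-10-01"),
])
conn.executemany("INSERT INTO order_items VALUES (?, ?, ?)", [
    (10, "Trail shoes", "footwear"),
    (11, "Rain jacket", "outerwear"),
    (12, "Wool socks", "footwear"),
])

def segment_by_category(category):
    """Return emails of customers who have ordered from a category."""
    rows = conn.execute(
        """
        SELECT DISTINCT c.email
        FROM customers c
        JOIN orders o ON o.customer_id = c.id
        JOIN order_items i ON i.order_id = o.id
        WHERE i.category = ?
        ORDER BY c.email
        """,
        (category,),
    ).fetchall()
    return [email for (email,) in rows]

footwear_buyers = segment_by_category("footwear")
outerwear_buyers = segment_by_category("outerwear")
```

Every new kind of data means another table and another join, which is where both the power and the complexity of this approach come from.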

The perceived benefit here was that marketing no longer needed IT teams to constantly cut data lists for them, as they could do it all in the platform's segmentation tools. Automated campaigns became mainstream as marketers could set up triggers on the back of data flowing automatically into the ESP.

Some legacy platforms that are still actively developed (many got bought by big tech companies and are slowly wasting away) are still a viable option today. But over time they have generated some challenges:

The overall main risk is that the complexity of these tools makes it very difficult for marketing to get anything done.

The modern way

A new wave of vendors started popping up in the mid/late 2010s disrupting the market and aiming to solve the challenges of these big, clunky relational-database-based ESPs.

A typical approach here was to store customer data in JSON format. This allowed nesting of data containing multiple values such as their content preferences so you wouldn’t need the complexity of an added table to store it in.

Nor would engineers need to create new fields (in most cases) - they just throw more data at it as they need to. This is often why they are referred to as schema-less, but you’ll also hear terms like NoSQL.
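As a rough illustration of what 'schema-less' means in practice, the sketch below treats a customer profile as a plain JSON document. The field names and the set_attribute helper are hypothetical, invented for this example rather than taken from any platform's API.

```python
import json

# Hypothetical profile document; field names are illustrative only.
profile = json.loads("""
{
  "email": "ana@example.com",
  "first_name": "Ana",
  "content_preferences": ["running", "hiking"]
}
""")

def set_attribute(doc, path, value):
    """Set a nested attribute by dotted path, creating levels as needed."""
    keys = path.split(".")
    node = doc
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return doc

# No "add a column" step: new nested attributes are simply written.
set_attribute(profile, "loyalty.tier", "gold")
set_attribute(profile, "loyalty.points", 1250)

def prefers(doc, topic):
    """Check a multi-value attribute without needing a separate table."""
    return topic in doc.get("content_preferences", [])
```

The multi-value preferences live inside the profile itself, and the loyalty data was added without anyone defining a field first.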

And they all have the concept of ‘events’. Rather than having all these extra tables to store order information, it is sent in as an event that sits against the customer record. These events are designed for real-time use and can trigger actions in the ESP immediately, such as starting a journey, whereas the older relational systems tend to rely on clunky schedules looking for new records.
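A minimal sketch of the events idea, assuming a hypothetical track function and an in-memory event log (real platforms expose this through their own APIs, which will look different). The event is stored against the customer and any matching trigger fires at the moment it arrives, rather than waiting for a scheduled job to spot a new record.

```python
from collections import defaultdict
from datetime import datetime, timezone

event_log = defaultdict(list)   # customer_id -> list of events
triggers = defaultdict(list)    # event name -> list of handler functions
started_journeys = []           # stand-in for "start a journey"

def on(event_name, handler):
    """Register a handler to run whenever the named event arrives."""
    triggers[event_name].append(handler)

def track(customer_id, event_name, properties):
    """Store the event against the customer and fire triggers immediately."""
    event = {
        "name": event_name,
        "properties": properties,
        "received_at": datetime.now(timezone.utc),
    }
    event_log[customer_id].append(event)
    for handler in triggers[event_name]:
        handler(customer_id, event)
    return event

def start_post_purchase(customer_id, event):
    # Hypothetical journey name; records which order started the journey.
    started_journeys.append(
        (customer_id, "post_purchase", event["properties"]["order_id"])
    )

on("order_placed", start_post_purchase)
track("cust_1", "order_placed", {"order_id": "A100", "total": 59.0})
track("cust_1", "page_viewed", {"url": "/sale"})
```

Both events end up on the customer's log, but only the order starts a journey.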

This usually comes with a strong benefit in speed and performance. I’ve run sub-second segment calculations in some of these platforms on millions of records. Generally speaking, these systems can handle far more scale than the legacy platforms could before they started creaking and groaning.

But despite all of these advances they aren’t for everyone. Here are some limitations:

1) One time write of events

With the exception of one of these vendors, once you’ve written an event such as an order you can’t update it. Is that a problem? Actually, yes, it can be a major issue that ends up with brands having to do workarounds.

Let's take the example of a retailer that has added order events and set up a post purchase experience off the back of it. In a relational system, if the customer cancels or returns the order you can update the status and your journey will adapt. But in this type of platform you can’t do that, so you have to create another event for a cancelled or returned order. And then you’d have to stop the whole journey to prevent any emails going out about their item, even if the cancellation or return was for a different order or set of items.
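To show what that workaround involves, here is a sketch that rebuilds the current state of each order from an append-only list of events. The event names and properties are hypothetical, and this is logic a brand would typically have to run outside the ESP, not something these platforms are known to offer.

```python
# Hypothetical event names and properties; real platforms will differ.
events = [
    {"name": "order_placed",    "order_id": "A100", "items": ["shoes", "socks"]},
    {"name": "order_placed",    "order_id": "A101", "items": ["jacket"]},
    {"name": "order_cancelled", "order_id": "A101"},
    {"name": "item_returned",   "order_id": "A100", "item": "socks"},
]

def current_orders(events):
    """Fold an append-only event list into the current state of each order."""
    orders = {}
    for e in events:
        oid = e["order_id"]
        if e["name"] == "order_placed":
            orders[oid] = {"status": "placed", "items": list(e["items"])}
        elif e["name"] == "order_cancelled" and oid in orders:
            orders[oid]["status"] = "cancelled"
            orders[oid]["items"] = []
        elif e["name"] == "item_returned" and oid in orders:
            if e["item"] in orders[oid]["items"]:
                orders[oid]["items"].remove(e["item"])
    return orders

def items_safe_to_email_about(events):
    """Items still held by the customer, across all non-cancelled orders."""
    return sorted(
        item
        for order in current_orders(events).values()
        if order["status"] != "cancelled"
        for item in order["items"]
    )

state = current_orders(events)
safe_items = items_safe_to_email_about(events)
```

In a relational system the same outcome is a single status update on the order row. Here the original events never change, so every consumer has to replay them to know what is still true.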

In certain markets this is even more critical. Anywhere you have events such as travel, fitness classes or concert tickets, you have two dates - the day they book and the day they attend. A lot of these systems struggle to deal with this: you’ll be able to trigger and segment campaigns off the booking date, but nothing off the event date.
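The two-date problem can be sketched as follows. The booking date is when the event was written, while the attendance date is just another property inside the payload, so a reminder keyed to it needs its own scheduled check. The field names and the two-day window are assumptions made for the example.

```python
from datetime import date, timedelta

# Hypothetical booking events. "booked_on" is when the event was written;
# "attend_on" is just another property inside the payload.
bookings = [
    {"customer": "ana@example.com", "booked_on": date(2024, 11, 1),
     "attend_on": date(2024, 12, 20)},
    {"customer": "ben@example.com", "booked_on": date(2024, 11, 3),
     "attend_on": date(2024, 11, 19)},
]

def reminders_due(bookings, today, days_before=2):
    """Customers whose attendance date is exactly `days_before` days away."""
    target = today + timedelta(days=days_before)
    return [b["customer"] for b in bookings if b["attend_on"] == target]

due_today = reminders_due(bookings, today=date(2024, 11, 17))
due_later = reminders_due(bookings, today=date(2024, 12, 18))
```

A check like this usually ends up running as a daily job outside the ESP, which then feeds the platform a fresh "reminder due" event to trigger from.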

A third challenge relating to this is if you ever want to update any historical data against these events. For example, if you want to segment based upon product category they’ve ordered in the past, but in the last year you’ve changed the way you manage product categories you’ll have a real mess trying to make it work as all the categories will be all over the place.

It’s for these reasons that I always end up with the one vendor that can handle updates.

2) Lots of raw data but limited toolsets

An events model allows you to store endless data about customers - everything they’ve ordered, browsed, added to a wish list, and so on.

But the toolsets to segment and personalise with this event data are limited. Sure, you can find out those who have ordered 3 times in the last year - simple counts and matching are possible. But you can’t find out the last thing they did, nor segment upon how much they’ve spent in the last year, or what their favourite brand is.

And they always have the limitation of not being able to use this event data to personalise unless the email was triggered by the event. So good luck running that personalised reactivation campaign you had planned based upon what they bought previously.

Now, doing this in legacy platforms isn’t always possible, nor is it ever straightforward enough for a typical marketer to achieve. But where it is possible, at least it can still be done. You can easily end up with a new ESP, chosen with the intention of marketing being self-sufficient with no IT support, but quickly realise you are back to square one when IT needs to start preparing custom data for you every 5 minutes.
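For a sense of what that custom data preparation looks like, here is a sketch of a data team's job that rolls raw order events up into flat profile fields the ESP can then segment and personalise on. The event shape, the field names and the 365-day window are all assumptions, and the function expects at least one event.

```python
from collections import Counter
from datetime import date, timedelta

# Hypothetical order events for one customer.
order_events = [
    {"ordered_on": date(2023, 6, 10), "brand": "Acme",   "total": 40.0},
    {"ordered_on": date(2024, 2, 5),  "brand": "Acme",   "total": 60.0},
    {"ordered_on": date(2024, 8, 21), "brand": "Zephyr", "total": 25.0},
    {"ordered_on": date(2024, 10, 2), "brand": "Acme",   "total": 35.0},
]

def summarise(events, today):
    """Roll events up into flat fields. Assumes `events` is not empty."""
    cutoff = today - timedelta(days=365)
    recent = [e for e in events if e["ordered_on"] >= cutoff]
    last = max(events, key=lambda e: e["ordered_on"])
    brands = Counter(e["brand"] for e in events)
    return {
        "orders_last_year": len(recent),
        "spend_last_year": sum(e["total"] for e in recent),
        "last_order_on": last["ordered_on"],
        "last_brand": last["brand"],
        "favourite_brand": brands.most_common(1)[0][0],
    }

profile_fields = summarise(order_events, today=date(2024, 11, 17))
```

None of this is hard, but it is work that sits with a technical team and has to be rerun every time the underlying events change.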

And no, adding a CDP doesn’t necessarily improve these issues - most of them have the same challenges with their data models.

The future way?

The latest trend I'm observing, including from the modern breed of ESP we’ve just talked about, is to not actually store any data inside the platform.

Big brands have invested heavily in data lakes or tech like Snowflake where all company and customer data is managed in one single source of truth.

There is a view by some that keeping a sync of that data in other platforms like an ESP creates extra work for the tech team, creates risk of data being out of date, and whenever you need to add new fields or tables more development resources are eaten up.

But at this moment in time this is not a view I especially share (although I'm aware I didn't see all the benefits of the modern marketing platforms when they first arrived).

This is because they tend to take a step backward in usability. They require more technical knowledge to use - the pitch from these ESPs is aimed firmly at Data Engineering teams, and it is they who see the benefits, not the end marketers.

What does this all mean?

As I read back what I’ve written above it might come across that no ESP is exactly optimal. And that is true - perfect does not exist. It really is a case of weighing up the trade-offs between different platforms to find the one that BEST suits your needs.

And as the shift from legacy to modern platforms continues, it’s increasingly easy to think of the modern ones as just better. But there can be significant compromises in capabilities when you make that leap. This won’t be a lift and shift where everything is better. You’ll have to do more work around your data and resources to make any new platform work for you.

But if we are aware of these limitations and trade-offs, we can make an informed decision about which ESP is best for our current and future needs.

Photo by Ness fu on Unsplash
