Predictive Segmentation 101: A Brief History and Core Principles

Imagine this—your nephew’s birthday is in two weeks, and he loves Hot Wheels cars. You decide to buy him one and start browsing an online store. You expect to be done in 10 minutes max, but instead you end up scrolling for hours through the list of all toys the store has. No filters, no categories, no way to narrow down your search. Frustrating, isn’t it?

Fortunately, most online stores aren’t like this. And if they were, they’d go out of business quickly. Yet, this mirrors the way marketers have approached customer segmentation for ages. 

When marketers create customer lists, it often resembles that unfiltered product feed. All the necessary data is there, but it’s not structured. There are no clear customer profiles with a shiny “I’m Ready to Buy!” tag beside their name.

Turning these lists into actionable data requires a great deal of effort, and even then, the results can be inconsistent.

Over time, marketers have developed new methods for structuring this data (i.e. segmenting the lists), but none have achieved 100% accuracy.

However, with recent advancements in data science and machine learning, we can now get closer to precision than ever before (and this will remain the best we can do until fully functioning crystal balls and B2C-friendly mind-reading solutions arrive).

Why Do We Need a History Lesson In the First Place?

You may be tempted to jump right into our subject matter. Isn’t that why you’re here? However, without a brief overview of how we got here, understanding why we need predictive segmentation and what it addresses can be challenging.

As segmentation methods evolved, marketers went from blasting their entire customer lists with the same messages (hoping for the best) to more targeted campaigns, yielding better results.

Yet, despite all these efforts, one fundamental issue remained unaddressed until recently. Curious to learn more? Buckle up, we’re going back in time.

A History of Marketing Segmentation

While predictive segmentation might sound like yet another trendy buzzword from the AI era, its history goes back to the 20th century.

Before predictive segmentation: Early methods and static data

Before segmentation, marketers had to send the same message to their entire list of customers. As businesses grew, the need for higher accuracy and return on investment became pressing, especially given the costs of print advertising and direct mail—the primary marketing channels at the time. 

In the early 20th century, marketers were already experimenting with different ways to structure their campaigns. In the 1950s, these efforts finally took off, largely thanks to Wendell R. Smith. In his 1956 paper “Product Differentiation and Market Segmentation as Alternative Marketing Strategies,” he laid the groundwork for segmentation theory.

Early segmentation relied on demographic and geographic data. Marketers divided their customers into groups based on variables like age, gender, and location. This was a major improvement but still insufficient. Soon, other approaches emerged.

RFM analysis

By the 1960s-1970s, direct marketers developed RFM analysis to better manage their customer lists. They identified three key factors that contribute to the likelihood of a customer’s future purchases: recency (how recently a customer purchased), frequency (how often they purchase), and monetary value (how much they spend).

By combining these factors, marketers could create accurate segments predicting customer value for a business. Customers who score high in all three areas are likely to buy repeatedly and should be treated as VIP buyers. 

RFM analysis remains in active use among marketers. Powered by modern automation, it can be very effective when used in combination with other segmentation tools.
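
To make the three factors concrete, here is a minimal RFM scoring sketch in Python. The data, column names, and simple 1-3 bucket scoring are illustrative assumptions, not any specific tool’s method.

```python
# Minimal RFM scoring sketch (illustrative only; column names and the
# simple 1-3 bucket scoring are assumptions, not a specific tool's method).
import pandas as pd

# Hypothetical order history: one row per purchase.
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "order_date": pd.to_datetime(
        ["2024-01-05", "2024-03-20", "2023-11-02",
         "2024-02-14", "2024-03-01", "2024-03-28"]),
    "amount": [40.0, 55.0, 20.0, 95.0, 60.0, 30.0],
})

snapshot = orders["order_date"].max() + pd.Timedelta(days=1)

rfm = orders.groupby("customer_id").agg(
    recency=("order_date", lambda d: (snapshot - d.max()).days),  # days since last order
    frequency=("order_date", "count"),                            # number of orders
    monetary=("amount", "sum"),                                   # total spend
)

# Score each dimension 1-3 (3 = best); lower recency is better.
rfm["r_score"] = pd.qcut(rfm["recency"], 3, labels=[3, 2, 1])
rfm["f_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 3, labels=[1, 2, 3])
rfm["m_score"] = pd.qcut(rfm["monetary"], 3, labels=[1, 2, 3])

# Customers scoring 3 on all three dimensions are the "VIP buyers" mentioned above.
print(rfm)
```

In practice, marketers often concatenate the three scores into a single code (e.g., “333” for top customers) and attach campaigns to each code.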

Psychographic segmentation

In the 1960s-1970s, a new method called psychographic segmentation was developed, allowing marketers to gain a better understanding of their customers by analyzing more nuanced criteria. This approach considers factors such as lifestyle, social class, personality traits, values, attitudes, and interests.

Behavioral segmentation

Around the same time, behavioral segmentation emerged. This method goes beyond static data, taking into account customers’ behavior, including purchase history, interactions with the brand, preferences, and more.

While already a robust method, behavioral segmentation became even more effective in the 1990s. The rapid development of ecommerce and CRM systems enabled marketers to better track customer behavior, including minor signals like browsing history, product wish lists, and webpage bounces.

Classic marketing campaigns that relied on early methods

Let’s have a look at successful examples of marketing campaigns that relied on traditional methods.

How Marlboro completely changed cigarette marketing with a single campaign

Before the 1950s, cigarettes were primarily marketed to women, with imagery of glamor and elegance. Marlboro, known for its mild taste, was perceived as a women’s cigarette. However, everything changed in 1954.

Advertising legend Leo Burnett created a campaign with a radically new appeal. Instead of targeting women, the new advertising campaign featured a rugged cowboy who radiated masculinity and strength. This approach resonated with men, who soon embraced the rebranded product.

Over the years, the Marlboro Man became a pop-culture staple. While the campaign is rightly criticized for its negative health impact due to increased smoking, from a marketing standpoint, it revolutionized the market by targeting a different audience.

How Volkswagen carved a huge chunk of the US car market for itself by changing its appeal

The US car market in the mid-20th century was peculiar. The main selling points were size and power: the bigger, the better. Buyers were also skeptical of anything produced outside the US.

German car manufacturer Volkswagen, with its tiny Beetle, was at a huge disadvantage then. There was almost no chance the car designed in Nazi Germany could compete with all-American brands like Ford and Chevy. They needed a different approach.

In 1959, Helmut Krone and Julian Koenig from the Doyle Dane Bernbach agency created a new campaign for the Beetle titled “Think Small.” It targeted price-conscious buyers who wanted a simple car that was easy to service and didn’t consume much fuel.

As a result, this decision to target an entirely different psychographic segment led to a major increase in Volkswagen’s market share in the US.

Problems with early methods of segmentation

While demographic or behavioral segmentation offered significant improvements in marketing results, several issues remained.

Lack of precision: Early methods relied on broad criteria, leading to large and often inaccurate customer groups.

Static data: Traditional approaches considered factors that rarely change. While separating men from women helped, it didn’t capture the smaller, ever-changing differences within these segments.

Limited storage capacity: Earlier storage solutions could not handle the massive volumes of data. The cost of storage was high, and systems weren’t optimized for handling diverse data types.

Inefficient data collection: Before the Internet and social media, there wasn’t enough accessible data generated to create actionable datasets. Integrating data from different sources was complex and time-consuming. And there were no methods for collecting and processing real-time data.

Technological limitations: Until relatively recently, few systems were capable of processing truly large datasets. Traditional databases and processing frameworks were not designed with scalability in mind, and analytical tools weren’t powerful enough for complex analytical operations.

While methods like cluster analysis partially addressed these problems, they were primarily explanatory. They identify certain characteristics or habits (e.g., recent purchases or frequent buying) and explain that customers who have them are more likely to buy again.

All these approaches are retrospective. They analyze past results but cannot predict future outcomes.

What if you could know your results in advance? Not just consider past performance, but have a clear expectation of what comes next. That’s where we need predictive modeling.

The rise of predictive segmentation in marketing

In her article “To Explain or to Predict?”, Galit Shmueli provides a great definition of this concept:

“I define predictive modeling as the process of applying a statistical model or data mining algorithm to data for the purpose of predicting new or future observations.”

Prof. Shmueli highlights two components of predictive modeling: a statistical model (or algorithm) and data. Both of these have seen major breakthroughs in the 21st century.

The history of digital databases dates back to the 1960s. Robert Kestnbaum, an early innovator in the field of customer-centered databases, developed complex strategies for analyzing customer data in new, previously unseen ways.

While these methods were ingenious, they were limited by the infrastructure of the time and thus were insufficient to achieve major results. 

The hardware advancements in the 2000s led to an explosive growth of cloud computing services like Amazon Web Services (AWS), Google Cloud Platform (GCP), and Microsoft Azure. As a result, storing and processing customer data became dramatically more efficient and affordable for businesses.

Moreover, with the evolution of e-commerce, social media, and the Internet of Things, brands soon got access to an enormous amount of information about their customers.

However, the final piece of the puzzle came with improved machine learning algorithms, which allowed businesses to go beyond data collection and storage. Now, they could also leverage all that data, driving insights and making informed decisions. 


Critical Problem Solved By Predictive Segmentation

Early methods significantly improved marketing results. However, all of them shared one major fault, one that wasn’t solved until predictive segmentation arrived.

Factors like recency of purchase, demographic profile, or browsing behavior are linked with a higher chance of conversion, but none of them causes conversion directly.

Traditional segmentation is correlational in nature. It looks at factors associated with the desired marketing outcomes, such as the probability of purchase or churn. However, it doesn’t model these outcomes directly.

Correlation ≠ causation

Correlation doesn’t imply causation because it only indicates a relationship between two variables, not that one causes the other. Other factors, such as coincidence or external variables, may also play a role.

Predictive modeling eliminates the need to test dozens of criteria to find the one that correlates most strongly with the intended result. Instead, it lets us model the likelihood of the conversion action itself.
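
As an illustration, here is a minimal sketch of that idea using scikit-learn: a model scores each customer’s conversion probability directly, and segments are built from the predicted scores rather than from any single correlated factor. The feature names, training data, and thresholds are made-up assumptions, not a specific vendor’s model.

```python
# Minimal predictive-segmentation sketch (illustrative only; feature names,
# data, and thresholds are assumptions, not a specific vendor's model).
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: behavioral features per customer and whether
# they converted (1) or not (0) in the following month.
X_train = np.array([
    # days_since_last_purchase, orders_last_90d, emails_opened_last_30d
    [3,  4, 6],
    [45, 1, 1],
    [10, 2, 4],
    [90, 0, 0],
    [5,  3, 5],
    [60, 1, 2],
])
y_train = np.array([1, 0, 1, 0, 1, 0])

model = LogisticRegression().fit(X_train, y_train)

# Score new customers and segment them by predicted conversion probability.
X_new = np.array([[7, 2, 5], [75, 0, 1]])
probs = model.predict_proba(X_new)[:, 1]

for p in probs:
    segment = "high intent" if p >= 0.6 else "nurture" if p >= 0.3 else "low intent"
    print(f"predicted conversion probability: {p:.2f} -> segment: {segment}")
```

The point is not the particular algorithm but the output: a probability per customer, which can then be cut into segments however the campaign requires.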

Early Implementations of Predictive Segmentation

Predictive methods have been used in different industries since the mid-20th century. The financial industry used credit scoring to assess risk. In healthcare, predictive models proved helpful in forecasting the spread of diseases. And retailers used modeling to predict when they would need to restock their inventory.

However, accurate predictive segmentation required a lot of data and thorough preparation, making it accessible only for the corporate sector. Let’s look at the first commercially successful applications of predictive segmentation.

Meta Ads (previously Facebook Ads)

With its access to vast amounts of customer data, Meta was among the early adopters of predictive modeling. In March 2013, the company revealed a new option for advertisers—lookalike audiences.

Before this, advertisers relied solely on demographic and interest-based criteria for targeting ads.

With complex algorithms at the core of lookalike audiences, it became possible to reach new levels of precision: the platform analyzes an advertiser’s existing customers and finds similar prospects.

To use this feature on Meta Ads, you need to set three variables:

  1. Source: your page’s followers, audiences from other campaigns, or even a customer list uploaded as a file.
  2. Location: country or regions from which the algorithm will draw people for the lookalike audience.
  3. Audience size: a percentage of people most similar to your source audience. For instance, say you’ve uploaded a customer base of 15,000 names and you want to create a lookalike audience in a region with 100 million Meta users. A 10% audience size will create an audience of the 10 million people most similar to your original list of 15,000. The 1% option narrows it further to only 1 million.

With lookalike audiences, advertisers on Meta gained a powerful tool that allowed them to focus on other areas instead of endlessly testing different targeting criteria.
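
Conceptually, a lookalike-style expansion boils down to ranking a large pool of prospects by how similar they are to the seed audience and keeping the top N%. The sketch below shows that idea in Python; it is not Meta’s actual algorithm, and the numeric profile features are invented for illustration.

```python
# Conceptual lookalike-audience sketch (illustrative only; this is NOT
# Meta's actual algorithm, and the profile features are made up).
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical numeric profiles: [age, sessions_per_week, avg_order_value]
seed_audience = rng.normal(loc=[34, 5, 60], scale=[4, 1, 10], size=(1_000, 3))
prospects     = rng.normal(loc=[40, 3, 45], scale=[12, 3, 25], size=(100_000, 3))

# Represent the seed audience by its average profile (a deliberate simplification).
seed_centroid = seed_audience.mean(axis=0)

# Rank prospects by distance to the centroid and keep the closest 10%,
# mirroring the "audience size" percentage described above.
distances = np.linalg.norm(prospects - seed_centroid, axis=1)
audience_pct = 0.10
cutoff = np.quantile(distances, audience_pct)
lookalike = prospects[distances <= cutoff]

print(f"lookalike audience size: {len(lookalike)} of {len(prospects)} prospects")
```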

Google Ads (previously Google AdWords)

The same year, Google introduced a feature called similar audiences. Analogous to Meta’s lookalike audiences, it targeted people similar to those on existing remarketing lists. Ten years later, in 2023, Google retired similar audiences due to data collection and privacy concerns.

However, some of their functionality is still retained in optimized targeting, audience expansion, and smart bidding options.

Amazon

The ecommerce giant pioneered predictive methods in ecommerce with collaborative filtering, used to predict the likelihood of next purchases. Amazon used the two main types of collaborative filtering to generate its product recommendations: user-based filtering (recommending what similar customers bought) and item-to-item filtering (recommending products related to items the customer has already bought or viewed).

With access to large volumes of customer data, Amazon kept developing and improving its predictive algorithms to get even better results. In 2020, Amazon released the MQ Transformer, a model capable of studying its own history to improve prediction accuracy and decrease volatility.
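
To make the mechanism concrete, here is a minimal item-to-item collaborative filtering sketch. It is illustrative only, not Amazon’s production algorithm: items are compared by how similar their purchase patterns are across customers, and a customer is recommended the items most similar to the ones they already bought.

```python
# Minimal item-to-item collaborative filtering sketch (illustrative only;
# not Amazon's production algorithm). Items are scored by cosine similarity
# of their purchase vectors across customers.
import numpy as np

# Rows = customers, columns = items; 1 means the customer bought the item.
purchases = np.array([
    [1, 1, 0, 0],   # customer A bought items 0 and 1
    [1, 1, 1, 0],   # customer B bought items 0, 1, 2
    [0, 1, 1, 1],   # customer C bought items 1, 2, 3
    [0, 0, 0, 1],   # customer D bought item 3
])

# Cosine similarity between item columns.
item_vectors = purchases.T.astype(float)
norms = np.linalg.norm(item_vectors, axis=1, keepdims=True)
similarity = (item_vectors @ item_vectors.T) / (norms @ norms.T)

def recommend(customer_row: np.ndarray, top_n: int = 2) -> list[int]:
    """Recommend unpurchased items most similar to what the customer bought."""
    scores = similarity @ customer_row      # aggregate similarity to owned items
    scores[customer_row == 1] = -np.inf     # never recommend what they already have
    return list(np.argsort(scores)[::-1][:top_n])

print(recommend(purchases[0]))  # suggests items for customer A, e.g. item 2 first
```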

Other examples

British retailer Tesco launched its Clubcard loyalty program in 1995. It was among the first loyalty programs that extensively used data analytics.

Tesco gathered customer behavior and preference data through Clubcard transactions to predict future purchases and tailor recommendations.

In the early 2000s, Andrew Pole, a statistician at the US-based retailer Target, developed a model to predict customer pregnancy. Pole identified around 25 products that correlated strongly with pregnancy (such as unscented lotion and cotton balls).

This allowed Target to target (pun intended) customers with baby-related offers before the baby was even born—way before other retailers had a chance to do it.

Netflix, the pioneering streaming service, used predictive segmentation to become the leading company in the industry. Netflix’s algorithms predict which shows and movies users are likely to enjoy based on their viewing history, ratings, and preferences.

They even create original series that are likely to become hits based on customer preferences for certain genres and actors.

Conclusion

Predictive segmentation has a long and colorful history. It’s not a passing fad but a technique that’s been a long time in development. Now, it has finally become accessible to anyone, not just big corporations. 

It offers marketers a powerful way of reaching their goals while keeping customers happy. It improves the ROI of marketing efforts, minimizes the guesswork, and reduces the waste of budget.

Stay tuned for our next article, where we’ll explore the inner workings of predictive algorithms. Meanwhile, you can check how the Ukrainian retailer Prom increased sales by 10% with personalized product recommendations.
