By: Robert Cardarelli, EVP Analytics, Ipsos MMA; John Guild, EVP Technology & Development, Ipsos MMA; Aniello Bianco, Sr. Director Analytics & Digital Attribution, Ipsos MMA

Ipsos MMA has been developing marketing mix models, unified measurement, Agile Attribution and commercial effectiveness solutions for clients for over 30 years, analyzing over $125B in investments and producing more than $33B in incremental client value. We have witnessed firsthand the significant benefits quantitative measurement offers to companies of all sizes and across a range of industry sectors.

As the appetite for measuring marketing and broader commercial investments grows, we are excited to see publishers like Meta and Google release open source frameworks that can help expand the use of measurement techniques across a broader array of companies. Every marketing-driven company today must implement holistic cross-platform marketing and commercial measurement of their investments – a core fundamental to understanding which activities and spends drive incremental value on a country-by-country, global and portfolio scale. The availability and diversity of data today makes its use a true competitive advantage – and failing to use it an equally real disadvantage.

While we applaud these open source MMM initiatives, they have raised questions and led to some uncertainty in the marketplace pertaining to the types of services that best serve a company’s investment requirements. To successfully integrate commercial measurement solutions, it’s important to understand your needs and objectives and evaluate and plan for what is required to achieve lasting and ongoing value from them.  To help provide some guidance on the topic, here we look to clarify what open source MMM options accomplish – and what they do not – within the broader context of marketing performance measurement programs.

The Rise of Open Source MMM

In today’s data-driven advertising world, where privacy is a top concern and signal loss is increasingly worrying, Marketing Mix Modeling (MMM) has, as never before, become an essential tool for advertisers. Its evolution – incorporating a richer, broader and more granular set of commercial variables, delivered faster and linked directly in real time to business investment needs – has enabled companies to measure the impact of their marketing as well as other commercial investments (price, promotion, operational, etc.) while adjusting for non-marketing factors. All of this can now be done while taking into account the synergies and halo effects of the collective set of investments. This helps advertisers make better-informed decisions about their budgets to maximize how their investments work together and the effectiveness of their paid, owned, and earned media channels.

With the declining popularity of traditional Multi-Touch Attribution (MTA) methods, publishers are increasingly turning to holistic MMM solutions to give advertisers the insights they need to optimize their platform marketing spending and achieve a higher marketing return on investment (MROI).  This includes measuring both the short-term impact on sales as well as the longer-term tail related to brand equity.

Given the criticality of measuring growing marketing and commercial budgets, it’s not surprising that open source MMM tools are gaining interest – some provided directly by publishers, such as Meta’s Robyn and Google’s recently released Meridian, and others from the open source community, such as PyMC. Publishers are optimistic that advertisers will leverage these tools to better understand performance, even at the ad format level.

As privacy regulations increasingly limit third-party individual-level data and render it obsolete, the media industry is leaning toward MMM. With new AI-powered advertising products, such as Google’s Performance Max and Meta’s Advantage Plus, being introduced, adopting MMM is critical. In fact, most MMM solution providers today, including Ipsos MMA, which leverages them within its integrated MMM/Agile Attribution solution, are increasingly using AI and machine learning to deliver richer, faster sets of insights that can be linked directly to incrementality.

Case Example: Meridian by Google

Google has recently released Meridian, its own open source MMM solution. It consists of a set of Python libraries with some basic fundamentals built into it that can enable data science teams to explore MMM-style measurement using well-regarded and standard techniques.

Meridian uses a Bayesian linear regression model to estimate media effects. In classical linear regression, the weights on each variable (the model parameters, or coefficients) are treated as fixed but unknown quantities: the model estimates them by minimizing error and provides confidence intervals without any prior assumptions on the weights. Bayesian linear regression instead places probability distributions on the model parameters. These assumed distributions, known as prior distributions, encode expert knowledge about the parameters and are combined with the observed data to produce posterior distributions.

Bayesian linear regression works well when prior knowledge about the distribution of the estimates can compensate for uncertainty or unavailability of information in the data. Although Bayesian methods have been applied to machine learning problems and have been a building block of AI since the 1980s, they are hard for the uninitiated to handle: the modeler needs both a good understanding of probability theory and significant computational power.
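To make the distinction concrete, here is a minimal, self-contained sketch – not Meridian’s actual code, and with purely illustrative data – of Bayesian linear regression with a conjugate Gaussian prior, where the posterior can be computed in closed form rather than sampled:

```python
import numpy as np

# Illustrative sketch only: Bayesian linear regression with a
# conjugate Gaussian prior on the weights, solved in closed form.
rng = np.random.default_rng(0)

# Simulated weekly data: sales driven by two media channels plus noise.
n_weeks = 104
X = rng.uniform(0, 1, size=(n_weeks, 2))   # media spend (scaled)
true_w = np.array([0.8, 0.3])              # "true" channel effects
sigma = 0.2                                # observation noise
y = X @ true_w + rng.normal(0, sigma, n_weeks)

# Prior: w ~ N(0, tau^2 I). Because the prior is conjugate, the
# posterior is also Gaussian, with mean and covariance given by:
tau = 1.0
prior_prec = np.eye(2) / tau**2
post_cov = np.linalg.inv(X.T @ X / sigma**2 + prior_prec)
post_mean = post_cov @ (X.T @ y) / sigma**2

print("posterior mean:", post_mean)              # close to true_w
print("posterior std: ", np.sqrt(np.diag(post_cov)))
```

Unlike a classical fit, the output is a full distribution over each channel’s effect; the prior (here a diffuse zero-mean Gaussian) shrinks the estimates slightly toward zero when the data are weak.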

Since Meridian uses Bayesian linear regression, pre-existing beliefs about media effectiveness are required in its model estimation, expressed as statistical prior distributions. In this process, media effectiveness is typically assumed to be positive and is represented by a prior distribution defined only over positive values. By default, Meridian uses half-normal distributions for media variables to ensure the posterior distribution is positive.

Furthermore, even when sticking to the default priors, users may not readily know how to adjust the parameters to incorporate their information correctly. For example, with the half-normal, if a user chooses a smaller variance to express a higher degree of knowledge about the prior, they essentially specify that a media channel is believed to have little to no effect. This is not immediately evident to the user and can result in unexpected behavior and results.
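A quick simulation illustrates the pitfall. The scale values below are hypothetical, chosen only to show that a small half-normal scale does not express generic “high confidence” – it concentrates virtually all of the prior mass near zero:

```python
import numpy as np

# Hypothetical illustration of half-normal prior scales.
rng = np.random.default_rng(1)

def halfnormal_samples(scale, n=100_000):
    """Draw from HalfNormal(scale) as |N(0, scale^2)|."""
    return np.abs(rng.normal(0.0, scale, n))

tight = halfnormal_samples(0.05)   # "confident" small-scale prior
wide = halfnormal_samples(1.0)     # diffuse, default-style prior

# With scale=0.05, roughly 95% of the prior mass sits below 0.1,
# i.e., the prior says the channel almost certainly has no effect.
print("P(effect < 0.1 | scale=0.05):", (tight < 0.1).mean())
print("P(effect < 0.1 | scale=1.0): ", (wide < 0.1).mean())
```

A modeler who shrinks the scale intending to express certainty about a known, sizable effect is in fact telling the model the opposite: that the effect is near zero.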

Some key characteristics of Google’s Meridian include:

  • National and geo-level modeling with 2-3 years of weekly data
  • Priors can be specified by the modeler, with different types of priors available
  • Joint estimation of the saturation and lag effect for the media variables
  • Simulation and media budget optimization

While implementing Meridian has some benefits, it requires careful handling and a thorough understanding of the assumptions and interpretations involved. Specialized knowledge is needed to specify priors correctly and fully understand their implications. One concrete example is that Meridian emphasizes the availability of reach and frequency in the model specification; however, not all marketing channels provide this information, making it challenging to compare media channels with and without reach and frequency.

Considerations for Internalization of Open Source MMM

MMM is a broad, cross-functional organizational initiative. Although data science is central to it, focusing only on this aspect without considering the wider requirements of a successful measurement program would be a mistake. A good rule of thumb is that core data science tasks, such as statistical modeling, account for roughly 10% of the total effort in a measurement program. This raises the question: what are you in the market for? A tool or a solution?

Holistic Data is Critical

It’s easy to underestimate the importance of gathering non-marketing variables, especially for organizations focusing on digital marketing. The accessibility of data from digital platforms often leads to a bias in favor of these channels. Econometric techniques can accurately analyze the impact of a wide range of consumer-facing activities, but only with access to a comprehensive dataset. When data is lacking, these techniques may suffer from omitted variable bias, which serves only to reinforce existing biases and/or produce inaccurate results.
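A toy simulation shows how omitted variable bias plays out. All names and numbers here are hypothetical: sales depend on both paid search and a promotion that tends to run alongside it, and dropping the promotion from the model inflates the search coefficient:

```python
import numpy as np

# Hypothetical simulation of omitted-variable bias.
rng = np.random.default_rng(2)

n = 500
promo = rng.uniform(0, 1, n)                   # promotion intensity
search = 0.7 * promo + rng.uniform(0, 0.3, n)  # spend correlated w/ promo
sales = 1.0 * search + 2.0 * promo + rng.normal(0, 0.1, n)

# Full model: both drivers included.
X_full = np.column_stack([search, promo])
coef_full, *_ = np.linalg.lstsq(X_full, sales, rcond=None)

# Mis-specified model: promotion omitted.
coef_omit, *_ = np.linalg.lstsq(search[:, None], sales, rcond=None)

print("search effect, full model:   ", coef_full[0])  # ~1.0
print("search effect, promo omitted:", coef_omit[0])  # biased upward
```

The mis-specified model credits search with the promotion’s impact, which is exactly the failure mode that a digital-only dataset invites.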

Every marketing initiative must begin with ingesting data representing every brand interaction with potential consumers. These data sources are diverse, spanning marketing (linear/CTV, online video, display, paid social, search, events, sponsorship, influencer, etc.), operations (price & promotion, assortment, new products, CRM, website, etc.) and external factors (macro-economic, competition, weather, seasonality, etc.). Given the current data-sharing limitations, however, a pragmatic approach is necessary: leveraging expert analysis of data sources and an understanding of industry dynamics to ensure the use of observed data or the creation of synthetic controls. This is crucial not only for measuring the impact of non-marketing factors but also for providing accurate measurement overall.

Low Latency Continuous Measurement is a Requirement

Strategic one-time cross-channel optimizations produce, by their very nature, sub-optimal results: sub-optimal compared to optimizations that run on the latest data, consider the newest consumer sentiment, and reflect the measurement of last week’s campaign and its associated messaging.

Investing in data and model operations is essential to establishing a continuous measurement and optimization culture. Automated data ingestion and quality assurance programs, model version control, and value measurement, validation and tracking are as important as – if not more important than – the core measurement itself.

It is this combination that enables companies to achieve transformational, incremental value from their investments. Without them, there’s a risk of ending up with an expensive academic, analytic exercise that is lacking in the necessary holistic sets of data and difficult to validate from a true incrementality standpoint.  Creating real, measurable, and trusted value comes from an always-on, always-connected, end-to-end set of processes that align data, models, measurement, planning and recalibration to create outputs that can be consistently understood and validated.

Benchmarks and Norms Balance Art and Science

MMM experts bring these capabilities to advertisers, along with a database of industry norms derived from thousands of cross-industry expert analyses assessed and analyzed over decades, to guide the analytics and identify outliers. When done correctly and holistically, MMM is commonly known as Commercial Effectiveness or Business Drivers Analysis and is one of the most cross-functional initiatives an organization can implement.

Priors are Important but Need to be Handled with Care

We find it positive that various open source MMMs use statistical priors in Bayesian frameworks. However, when misused, they can become a powerful tool for producing results that simply reiterate, for example, the findings of a poorly designed experiment. Careful consideration of the data points used to inform the model is required. Ideally, these data points should never rely solely on a small set of experiments within your organization; that knowledge should be balanced and weighted against a broader set of industry/category norms and benchmarks.

Scenario Planning is a Marketing Function

Forward-looking scenario planning that can be trusted and validated is critical to usage and adoption.  Results must make sense and differences be explainable across internal and external partners. That begins with transparency, a critical factor in achieving real organizational change, which involves sharing measurement results directly with marketers. If running a “what-if” scenario involves using TensorFlow python code in a Colab environment, only a few people will likely be able to effectively use the insights and take advantage of the potential gains. It’s essential to have user-friendly planning tools capably supporting internal and external users in a collaborative manner to optimize and continuously foster a culture of data-driven planning that is accessible to everyone.

Consider the Total Cost of Ownership of Your Measurement Ecosystem

Tools such as Google’s Meridian or Meta’s Robyn bear exploring, as in our opinion, the more marketers embrace measurement as a whole, the better. These Python libraries provide a set of MMM capabilities that companies can build upon.

However, organizations looking to build an integrated, always-on and connected culture of quantitative planning that consistently, year-over-year produces more sales, better MROI and stronger customer equity for their brands should be sure they are considering the Total Cost of Ownership of the measurement ecosystem.  And it’s not just providing sets of modeled outputs.

Recommendations must be tracked, measured, evaluated and the results recalibrated in a manner that provides clients with the agility, speed and flexibility to quickly react and adjust to maximize the impact of their investments. The core data science components are crucial but a surprisingly small piece of the pie.

It is the sum of the parts that make successful MMM solutions work.  These also include specialized and experienced MMM resources, infrastructure, data feeds, quality management and harmonization, model ops, benchmarking and scenario planning tools, not to mention the work related to changing embedded behaviors related to processes that do not heavily rely on data-driven decisions. Moreover, it is absolutely essential that the team and software involved be able to fully integrate both the internal and external media and business planning groups in a manner that’s consistent with their ‘go to market’ strategy and tactics.

It is not uncommon for a global MMM program to require 15-20 dedicated resources to support continuous data, modeling, training, insight development, cross-functional organization integration and adoption, and market activation. This resource requirement and the specialized combination of skills and integration points should not be underestimated when budgeting for a sustainable, always-on and connected internal capability.

The evolution of data and analytics has enabled companies to leverage terabytes and petabytes of data to gain measurable competitive advantages that materially and positively impact their top and bottom lines. Fully realizing the gains and return on investment from measurement programs means ensuring your marketers can leverage the richness of data available today (even in a post-cookie world), both via advanced data science techniques and end-user planning tools that operate at sufficient granularity and remain easy to use.