{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "(mmm_tv_intercept)=\n", "# MMM with time-varying parameters (TVP)\n", "\n", "In classical marketing mix models, the effect of advertising (or other factors) on sales is assumed to be constant over time. Similarly, baseline sales (what you would have sold without marketing) are also assumed to be constant. This is a simplification that _typically doesn't_ match reality. There will be times when, for various reasons, your ads are more effective, or when your product just _sells better_.\n", "\n", "This _time-varying effect_ is something we can capture with a time-varying parameter. In a marketing mix model, the variation could stem from changing trends, unexpected events, and other external factors that the model does not control for. For example, if you sell sunglasses or ice cream, an unusually sunny spring will impact both your baseline sales and likely also the effect of your ads on short-term sales.\n", "\n", "👉 In this notebook, we demonstrate how, and when, to use a time-varying parameter for the intercept in an MMM, using `pymc-marketing`'s `MMM` model.\n", "\n", "The API is straightforward:\n", "\n", "```python\n", "mmm = MMM(\n", "    adstock=GeometricAdstock(l_max=10),\n", "    saturation=LogisticSaturation(),\n", "    date_column=\"date\",\n", "    channel_columns=[\"channels\"],\n", "    control_columns=[\"control\"],\n", "    time_varying_intercept=True,  # 👈 This is it!\n", ")\n", "```\n", "\n", "🤓 Under the hood, the time-varying intercept is modeled as a Gaussian Process (specifically a [Hilbert Space GP](https://www.pymc.io/projects/docs/en/stable/api/gp/generated/pymc.gp.HSGP.html) to speed things up), constrained to have mean $\\mu=1$ and then multiplied by a _baseline intercept_. So if the sampler infers that the baseline intercept is 1000 sunglasses sold per week, then the GP models the percentage deviation from that baseline over time. 
Have a look at the implementation of `MMM` for concrete structural details.\n", "\n", "Below, we give three simple usage examples:\n", "\n", "1. **Yearly seasonality**: The intercept is a cosine function with a period of one year. Normally, one would use a Fourier basis to model seasonality, but let's see what happens when we use a time-varying intercept 🤷‍♂️.\n", "2. **Upward trending sales**: The intercept is a linearly increasing function, mimicking overall sales growth not explained by marketing or controls. Again, you would normally use a linearly increasing control variable for this, but let's see what happens when we use a time-varying parameter.\n", "3. **Unexpected events**: The intercept is a flat line, except for intermittent, randomly placed spikes and dips. This is a more realistic scenario, where baseline sales are not constant but shift due to unexpected factors.\n", "\n", "We conclude that while the GP-based time-varying intercept *can* technically do the job for seasonality and trends, it's not the most efficient way to do so (choose a Fourier basis or linear trend instead). However, to capture unexpected events that no other variable can explain, it's very powerful 💪." 
] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import warnings\n", "from datetime import date\n", "\n", "import arviz as az\n", "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import numpy.typing as npt\n", "import pandas as pd\n", "import pymc as pm\n", "\n", "from pymc_marketing.mmm import MMM, GeometricAdstock, LogisticSaturation\n", "from pymc_marketing.prior import Prior" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "SEED = sum(map(ord, \"Time varying parameters are awesome!\"))\n", "rng = np.random.default_rng(SEED)\n", "\n", "warnings.filterwarnings(\"ignore\")\n", "\n", "az.style.use(\"arviz-darkgrid\")\n", "plt.rcParams[\"figure.figsize\"] = [12, 7]\n", "plt.rcParams[\"figure.dpi\"] = 100\n", "\n", "%load_ext autoreload\n", "%autoreload 2\n", "%config InlineBackend.figure_format = \"retina\";" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load synthetic data\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For this example, we load some simulated consumer goods marketing spend/control data.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### → Load input and define columns\n" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", " | Weeks | \n", "Google Search | \n", "DV360 | \n", "AMS | \n", "TV | \n", "VOD | \n", "OOH | \n", "Radio | \n", "Numeric Distribution | \n", "RSP | \n", "Promotion | \n", "target1 | \n", "target2 | \n", "|
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | \n", "2020-01-06 | \n", "2.414281 | \n", "3.179336 | \n", "2.112389 | \n", "1.326498 | \n", "13.974318 | \n", "1.544316 | \n", "4.754408 | \n", "0.000000 | \n", "0.741301 | \n", "3.643304 | \n", "0.969624 | \n", "8.126478 | \n", "6.840064 | \n", "
1 | \n", "2020-01-13 | \n", "1.953829 | \n", "3.712402 | \n", "1.122114 | \n", "0.841185 | \n", "8.097841 | \n", "1.458398 | \n", "5.536986 | \n", "0.000000 | \n", "0.701279 | \n", "3.643304 | \n", "0.853508 | \n", "7.033357 | \n", "5.944537 | \n", "
2 | \n", "2020-01-20 | \n", "1.445275 | \n", "6.610630 | \n", "3.793022 | \n", "0.885655 | \n", "11.670006 | \n", "2.742102 | \n", "0.000000 | \n", "0.854066 | \n", "0.712682 | \n", "3.643304 | \n", "0.974842 | \n", "9.265676 | \n", "6.553764 | \n", "
3 | \n", "2020-01-27 | \n", "3.695156 | \n", "2.694912 | \n", "2.016691 | \n", "1.130929 | \n", "9.872921 | \n", "4.760902 | \n", "0.000000 | \n", "0.963224 | \n", "0.718657 | \n", "3.643304 | \n", "1.000000 | \n", "9.445138 | \n", "7.825555 | \n", "
4 | \n", "2020-02-03 | \n", "1.909138 | \n", "3.047636 | \n", "1.887042 | \n", "1.478925 | \n", "7.598348 | \n", "2.926870 | \n", "0.000000 | \n", "1.475399 | \n", "0.713845 | \n", "3.643304 | \n", "0.937466 | \n", "8.671742 | \n", "6.847199 | \n", "