Attribution Modeling Explained

These days, customers find your product through a variety of marketing channels (ad platforms, partnerships, content, organic, etc.). It is important to understand how these channels work together in driving conversions. After all, the journey of the converted customer is the one we really care about, the one we want to promote and replicate. To understand this journey, we use multi-channel attribution modeling.

Let’s say the following diagram represents your customer’s journey. How much credit does each channel deserve? Let’s explore several ways to distribute the credit.

Journey

First Click Attribution

First click attribution gives 100% of the credit to the first touchpoint. In our example, display would get 100% of the credit.

first_click_attribution

First click attribution is useful for figuring out how customers originally found your product, but it doesn’t shed much light on the touchpoints that drive conversions.

First click attribution is akin to giving my first girlfriend 100% of the credit for me marrying my wife.

– Avinash Kaushik

Although I love this quote, it’s not accurate. First click attribution is actually akin to giving the friend who introduced you to your wife full credit for your marriage.

Last Click Attribution

Last click attribution gives 100% of the credit to the last touchpoint. In our example, remarketing would get 100% of the credit.

last-click-attribution

Last click, or last interaction, is the classic model used in many reporting tools. It’s only good for figuring out which touchpoints are driving the actual conversions; it completely ignores the rest of the referral touchpoints.

Linear Attribution

Linear attribution is the most basic way of dividing the credit for a conversion. It splits the credit equally among all of the referring touchpoints.

linear-attribution-model

This model is useful when analyzing conversions with long sales cycles, where all the touchpoints are important in building a brand image.
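The three models so far can be sketched in a few lines of Python. This is a minimal illustration, not the product’s implementation: the function names are made up for the example, and the journey’s middle touchpoints are hypothetical (the post only states that display comes first and remarketing last).

```python
def first_click(journey):
    """Give 100% of the credit to the first touchpoint."""
    return {journey[0]: 1.0}


def last_click(journey):
    """Give 100% of the credit to the last touchpoint."""
    return {journey[-1]: 1.0}


def linear(journey):
    """Split the credit equally among all touchpoints."""
    share = 1.0 / len(journey)
    credit = {}
    for channel in journey:
        # A channel that appears twice collects two equal shares.
        credit[channel] = credit.get(channel, 0.0) + share
    return credit


# Hypothetical journey: only the first and last channels come from the post.
journey = ["display", "social", "email", "remarketing"]

print(first_click(journey))
print(last_click(journey))
print(linear(journey))
```

Each function returns a mapping from channel to its fraction of the conversion, so the values always sum to 1.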

Time Decay Attribution

The time decay model is the most advanced model we provide. It assigns credit to each touchpoint based on how many days before the conversion it occurred.


time-decay-attribution-model

The calculation we use for this is:

y = 2^{-x/7}

where x is the number of days before the conversion that the referral happened. The 7 in the equation is the half-life: a touchpoint that occurs 7 days before another touchpoint receives half as much credit.

For example, a user visits your site from a Google display ad, a remarketing ad and then finally a social channel, with the following timeline:

attribution-model

Based on the equation above, we would split the credit among the channels as follows. Each channel’s share is its weight divided by the sum of all three weights:

                DISPLAY     REMARKETING   SOCIAL
Weight          2^{-8/7}    2^{-4/7}      2^{-1/7}
Value           0.453       0.673         0.906
Share of credit 22.29%      33.13%        44.58%
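The same arithmetic can be checked with a short script. This is a sketch of the formula as stated above; the function names are illustrative, and the day counts (8, 4 and 1) are read off the exponents in the table.

```python
HALF_LIFE_DAYS = 7  # the 7 in the equation


def time_decay_weight(days_before_conversion, half_life=HALF_LIFE_DAYS):
    """Raw weight for one touchpoint: 2 ** (-x / half_life)."""
    return 2 ** (-days_before_conversion / half_life)


def time_decay_credit(touchpoints):
    """Normalize the raw weights so the shares sum to 1."""
    weights = {
        channel: time_decay_weight(days)
        for channel, days in touchpoints.items()
    }
    total = sum(weights.values())
    return {channel: weight / total for channel, weight in weights.items()}


# Days before the conversion, taken from the exponents in the table.
touchpoints = {"display": 8, "remarketing": 4, "social": 1}

for channel, share in time_decay_credit(touchpoints).items():
    weight = time_decay_weight(touchpoints[channel])
    print(f"{channel}: weight {weight:.3f}, share {share:.2%}")
```

The printed weights and shares should land on the figures in the table, give or take rounding in the last digit.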

So, What Should I Use?

The truth of the matter is, there’s no silver bullet for modeling attribution. Our team has found that the best way to get an accurate understanding is to compare the numbers across all of the models. This is easy to do using the model selector in Attribution. Give it a try with our demo data.

The SaaS Calculator: How Much Should I Spend to Acquire a Customer?

SaaS companies typically spend money upfront to acquire customers, then have to wait many months before recurring revenue makes up for the initial cost to acquire. Revenue from a customer is defined equally by the length of time they stay and the size of their monthly payment. This is a common problem: you spend a lot of money upfront to acquire customers who are only valuable if they stay for a long time. We are betting on our future product with our current cash. Anyone building a SaaS company faces this problem, but if growth erodes profit, how do we separate the viable businesses from the bottomless money-pits?

In this post we will explore the industry benchmarks for acquiring customers. I’m going to do this using Bret Victor’s Tangle library. Drag the blue numbers left or right, and adjust them until they reflect your company.

Monthly Average Revenue Per User (ARPU)

So let’s assume a simple case: you have a software company with 100 customers, and you make, on average, $3000 in recurring revenue per month. This means you have $30.00 Average Revenue Per User every month. That’s relatively straightforward. There’s also expansion revenue, but we will bundle that into churn for simplicity.

Average Revenue Per User: $30.00

Monthly Churn

Your churn is a measurement of how many customers leave your product. Let’s say that 90 days ago you had $2000 monthly recurring revenue. If we ignore the new customers, those customers from 90 days ago now account for $1800 remaining monthly recurring revenue. This would mean you have 3.70% monthly churn and 45.06% yearly churn.[1]

Monthly Churn: 3.70%
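The post does not spell out the formula behind these two percentages. The sketch below is one reading that happens to land on the same figures, and it should be treated as an assumption: it divides the lost revenue by the remaining revenue, then scales linearly using 30-day months and a 365-day year. The function names are illustrative.

```python
DAYS_PER_MONTH = 30   # assumption: a month is treated as 30 days
DAYS_PER_YEAR = 365


def simple_monthly_churn(mrr_then, mrr_now, days_elapsed):
    """Lost revenue relative to remaining revenue, spread evenly per month."""
    lost = mrr_then - mrr_now
    churn_over_period = lost / mrr_now  # assumption: remaining MRR is the base
    months = days_elapsed / DAYS_PER_MONTH
    return churn_over_period / months


def simple_yearly_churn(monthly_churn):
    """Scale the monthly rate linearly to a year (no compounding)."""
    return monthly_churn * DAYS_PER_YEAR / DAYS_PER_MONTH


monthly = simple_monthly_churn(mrr_then=2000, mrr_now=1800, days_elapsed=90)
yearly = simple_yearly_churn(monthly)
print(f"{monthly:.2%} monthly churn, {yearly:.2%} yearly churn")
```

As footnote [1] says, this is a simplified calculation; other definitions of churn will give different numbers from the same inputs.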

Lifetime Value Calculation

We can divide 1 by the monthly churn rate to get an idea of how many months your customers would be expected to stay. So for our calculation, 1 / 3.70% monthly churn = 27.0 months of expected revenue. Earlier we calculated that the average customer spends $30.00 per month and now we see the average customer stays for 27.0 months. Multiply these two numbers together and we see that an average customer would have a $810 lifetime value.

Lifetime Value: $810
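The lifetime value arithmetic above is short enough to write out. This is a minimal sketch using the numbers from the example; the function names are illustrative.

```python
def average_revenue_per_user(monthly_recurring_revenue, customers):
    """Monthly ARPU: recurring revenue divided by customer count."""
    return monthly_recurring_revenue / customers


def expected_lifetime_months(monthly_churn):
    """Expected customer lifetime is the reciprocal of monthly churn."""
    return 1 / monthly_churn


def lifetime_value(arpu, monthly_churn):
    """LTV: monthly revenue times expected months as a customer."""
    return arpu * expected_lifetime_months(monthly_churn)


arpu = average_revenue_per_user(3000, 100)   # $30.00
months = expected_lifetime_months(0.037)     # about 27 months
ltv = lifetime_value(arpu, 0.037)            # roughly $810

print(f"ARPU ${arpu:.2f}, lifetime {months:.1f} months, LTV about ${ltv:.0f}")
```

Because the churn rate is rounded to 3.70% here, the unrounded result comes out a dollar or so away from the $810 shown above.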

Spend Less than 1/3 of LTV

A simple and prevalent model is spending 1/3 of your customer lifetime value [2]. Using the numbers we calculated above, we would have a $810 LTV divided by 3 for a $270 Maximum Customer Acquisition Cost. The idea here is to reserve 2/3 of your gross revenue for product development, operating expenses, taxes and profit.

Spend less than $270
Which is 1/3 of a $810 customer lifetime value.

Spend less than 12 Months of Revenue

Another common way to think about this is to spend a set number of months of income on customer acquisition, usually 12 months or less [2]. So if your customers spend an average of $30.00 per month, you spend 12 months of income or less on acquisition. This would give us a $360 maximum CAC. This model forces you to think of acquisition costs as money that you recoup over time, and it re-frames churn as lost money. This can have a profound effect on the way you think about churn.

Spend less than $360
Or one year’s income at $30.00 per month.
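The two benchmarks can be compared side by side. This is a small sketch using the example numbers from this post; the function names and defaults are illustrative.

```python
def max_cac_from_ltv(ltv, divisor=3):
    """Benchmark 1: spend no more than 1/3 of lifetime value."""
    return ltv / divisor


def max_cac_from_payback(arpu, payback_months=12):
    """Benchmark 2: spend no more than N months of revenue."""
    return arpu * payback_months


ltv_ceiling = max_cac_from_ltv(810)          # 270.0
payback_ceiling = max_cac_from_payback(30)   # 360

print(f"1/3 of LTV: ${ltv_ceiling:.0f}")
print(f"12 months of revenue: ${payback_ceiling:.0f}")
```

With these inputs the LTV rule is the stricter of the two. Which one binds for your company depends on your churn: the higher the churn, the lower the LTV ceiling, while the payback ceiling stays the same.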

The SaaS Conundrum – Profitability is Elusive

If you spend 12 months of income to acquire a customer, you are in the red for 12 months. Losing that customer any time in the first 12 months should be treated as a failure. In the last few years, we’ve seen an increase in SaaS companies offering discounts for yearly pre-payment. This is for good reason: SaaS companies typically face a cash crunch as their growth accelerates. 12 months is a long time to wait for revenue to catch up with spending. Couple this cash crunch with the common expectation of exponential growth and you can see why startups are typically not profitable for many years.

Spending 1/3 of LTV and recouping the cost of acquisition within 12 months are just benchmarks. The reality is that you know your startup better than anyone. These benchmarks are meant to get you started but growth is unique for every SaaS business. Your model will grow in complexity as you learn more about your market and your best growth channels.

Get in touch in the comments if you have any feedback or additions.

[1] This is a simplified churn calculation. You should read Steven Noble’s Definition of churn, Jason Cohen’s post on the topic and Joel York’s analysis for more information. Churn changes as your product improves and competitors enter your market. Churn is a deep concept and building multi-year models around it is a risky way to run your business.

[2] Both the 12 month benchmark and the 1/3 of LTV benchmark come from David Skok’s excellent article on SaaS Metrics. They have become common benchmarks but every startup is unique. For a nuanced analysis of the topic take a look at David Kellogg’s post on CAC ratio.

Apple: let’s solve iOS attribution for good

Apple recently announced that they will be releasing a new “Sources” tab in iTunes Connect. iTunes will now track the referring url and an optional campaign ID, passed in as a url param. This is a huge improvement, and it will work great for developers that build exclusively on iOS.

The problem is, iOS doesn’t dominate the market anymore. Most developers need to build for both iOS and Android. If you’re building on both platforms, having your data in an iTunes silo is hugely problematic. We’re going to explore how attribution works, why it’s a problem on iOS and what Apple could do to solve it (hint: Android has already done it).

The problem with mobile attribution

If you sell things on a website, there are excellent ways to see what you’ve gained for all that effort you put into marketing. You can easily see where people came from. The web has HTTP referrers built right into the protocol, and in the cases where that doesn’t work you can use UTM tags or campaign-specific landing pages.

Knowing what works is critical. Even if you’ve built something people want, you still need to get their attention and show them what you’ve built. When resources are limited, dumping what little you have into the wrong channel or the wrong campaign can have real consequences. Knowing matters.

When you put something on an app store, you don’t have any say over the tracking data they give you. You don’t get to see where people came from or what params were appended to the url. It’s very difficult to know where you should spend the limited resources you have. Until recently, this was the plight of every app developer publishing to an app store.

How it works with Android and Google Play

Google has come up with a simple, open solution to this problem. They pass the params from the app store URL into the app when it is first launched. Developers can now see where their customers came from. It’s a simple solution that takes advantage of best practices on the web. It works for every ad platform, it doesn’t compromise our privacy, and it works for every use case we can imagine.

Apple’s closed garden approach

Apple’s solution looks similar on the surface, with one important exception. Apple passes the params to their own tracking system but doesn’t make them available to the developer. Let’s take a look at how it works:

itunes-connect-sources-pid-cid

https://itunes.apple.com/us/app/twodots/id880178264?pid=facebook-ads&cid=spring-blast
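To make the params concrete, here is how the pid and cid values in the example link can be read with Python’s standard library. This is only an illustration of what the link carries; the function name is made up for the example.

```python
from urllib.parse import parse_qs, urlsplit


def campaign_params(url):
    """Return the query-string params of a link as a plain dict."""
    query = urlsplit(url).query
    # parse_qs returns a list per key; keep the first value of each.
    return {key: values[0] for key, values in parse_qs(query).items()}


link = (
    "https://itunes.apple.com/us/app/twodots/id880178264"
    "?pid=facebook-ads&cid=spring-blast"
)

print(campaign_params(link))
```

These are the values that iTunes Connect records for its own reports and that never reach the app itself.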

This is a great start but without the raw params, developers can’t see where actual users came from. Without the ability to tie the data together, developers cannot do a few really critical things. Let’s explore what those things are and why they are important.

Advantage to passing params: Deferred Deep Links

Here’s a scenario: you’re browsing Amazon.com on an iPhone looking at a fancy new coffee grinder. A message pops up, letting you know that Amazon has an app. You download the app, open the app, and get dumped into a blank slate. There is no reference to coffee grinders. “That’s foolish,” you say. “Shouldn’t Amazon know I was looking at Porlex coffee grinders?”

Amazon doesn’t know because there is no (supported) way for developers to pass data through the App Store. If we could append params to the App Store links and load those params at app launch, we could simply pass through the Porlex grinder product ID and load it when the app was first launched. Apple has done a great deal to support deep linking, but they’ve ignored the critically important first link.

Advantage to passing params: Discount links and app sales

Here’s another scenario: you have a successful company, and you’re launching an iOS app with in-app purchases. You want to offer a discount to your mailing list. Maybe you want to give them 50% off in-app purchases. Great idea, but it’s not possible. There’s no (supported) way to identify which users came from your email blast on iOS.

Advantage to passing params: See ROI across devices

Let’s look at a common scenario for people who sell the same app across multiple app stores. Maybe you sell a productivity app on iOS and Android. You’re in the middle of a Christmas promotion and you want to see how your campaign is doing on Twitter versus Facebook. There’s no way to merge the data; the iTunes half is locked in a silo.

iOS Attribution Approaches

As engineers, having a problem means we build a solution. The solution, in this case, is for ad networks to fingerprint every device that clicks an ad. This can be done using the Apple-regulated IDFA (ID For Advertisers) or by fingerprinting the device using a combination of browser information and the IP address.

Attribution Method: Using the ID For Advertisers (IDFA)

When you click an app ad on Facebook, your unique IDFA is stored on Facebook’s servers along with a reference to which ad was clicked. You include the Facebook SDK in your app so that your app can send Facebook the IDFA of every user that opens your app. When Facebook sees a match, they will respond with “yes, we sent them.” This is a very reliable method, because it’s based on a unique ID that rarely ever changes.
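The matching step can be pictured as a simple lookup. The sketch below is a toy model of the flow described above, not Facebook’s actual API or SDK: the class, its methods, and the sample IDs are all hypothetical.

```python
class AdNetwork:
    """Toy stand-in for an ad network's click store (hypothetical)."""

    def __init__(self):
        self._clicks = {}  # IDFA -> ID of the ad that was clicked

    def record_click(self, idfa, ad_id):
        """Called when a device clicks an app ad."""
        self._clicks[idfa] = ad_id

    def match_install(self, idfa):
        """Called when the app's SDK reports an opening device.

        Returns the ad ID if this device clicked an ad, else None.
        """
        return self._clicks.get(idfa)


network = AdNetwork()
network.record_click("IDFA-1234", ad_id="spring-blast-ad")  # sample values

print(network.match_install("IDFA-1234"))  # a match: "yes, we sent them"
print(network.match_install("IDFA-9999"))  # no click on record
```

The lookup is exact because both sides see the same device-wide identifier, which is also the root of the privacy caveat below.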

Limited platforms caveat: Because IDFAs require an integration with each advertising platform, it is unlikely that you will find an SDK that is integrated with industry-specific ad platforms, affiliate sites, or any custom ad deals you’ve made.

Waste caveat: Identifying users by IDFA requires an unruly amount of code. There are SDKs on top of other SDKs, and any app advertising on more than one platform ends up with megabytes of unnecessary code. Spread across hundreds of apps on millions of phones, this adds up to colossal waste. A phone’s storage space should be reserved for music, photos, and code that provides value.

Privacy caveat: IDFAs should not need to exist. This is a unique identifier – specific to your iPhone – that is the same across every app. This ID is passed around an entire ecosystem of analytics providers, ad networks, and individual apps. In many cases it is saved along with geographic information, photos, and other personal information. Unique IDs don’t exist on desktops and they shouldn’t need to exist on phones. If we passed the params through the app store we could get rid of IDFAs. That’s a huge win for privacy.

Attribution Method: Fingerprinted Redirect with URL Parameters

Tracking with UTM params is great. What if we could somehow pass an arbitrary list of params from the app store into our app? Well, you can. It’s just not as reliable. The way to do this is to set up a link redirect service, similar to bit.ly. Every time you link to the app store, it goes through this redirect service. When a request hits the redirect service, it stores the IP address of the request along with the user agent making the request and any other identifying information available. That information is used to create a semi-unique device fingerprint.

If the user downloads the app, that same fingerprint data is sent from the app on first launch. If a match is found, the params from the app store listing are passed back into the app, along with the referring URL. This can be used to identify anything you want to track about the source – including discount codes, affiliate markers, and cross-platform campaign names. This concept is further explained in Implementing Deferred Deep Linking on the URX blog and it has been productized by Tapstream. We’re working on a similar methodology at Attribution that we hope Apple will make obsolete.
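The click-then-match flow can be sketched as follows. This is a simplified illustration under stated assumptions, not how Tapstream, URX or Attribution implement it: the class and method names are hypothetical, and the fingerprint is just a hash of the IP address plus a device signature string (for example device model and OS version) that both the redirect and the app are assumed to be able to produce.

```python
import hashlib


def fingerprint(ip_address, device_signature):
    """Semi-unique device fingerprint from coarse, shared attributes."""
    raw = f"{ip_address}|{device_signature}".encode("utf-8")
    return hashlib.sha256(raw).hexdigest()


class RedirectService:
    """Hypothetical redirect service that remembers click params."""

    def __init__(self, app_store_url):
        self.app_store_url = app_store_url
        self._pending = {}  # fingerprint -> click details

    def handle_click(self, ip_address, device_signature, params, referrer):
        """Store the params, then send the visitor on to the app store."""
        key = fingerprint(ip_address, device_signature)
        self._pending[key] = {"params": params, "referrer": referrer}
        return self.app_store_url

    def handle_first_launch(self, ip_address, device_signature):
        """Return the stored click details on a match, else None."""
        key = fingerprint(ip_address, device_signature)
        return self._pending.pop(key, None)


service = RedirectService("https://itunes.apple.com/us/app/twodots/id880178264")
service.handle_click(
    "203.0.113.7", "iPhone6,1 iOS 8.1",
    params={"discount": "50OFF"}, referrer="https://example.com/newsletter",
)

print(service.handle_first_launch("203.0.113.7", "iPhone6,1 iOS 8.1"))
print(service.handle_first_launch("198.51.100.9", "iPhone6,1 iOS 8.1"))
```

Two visitors with the same IP address and device signature produce the same key, so the later click overwrites the earlier one.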

Reliability caveat: Fingerprinting can break down if many people in the same place download an app at once. Imagine launching an iOS app at SXSW. When a throng of people with the same iPhone, running the same iOS, all download the app from the same cell towers, there’s no way to tell them apart. Compared to desktop computers, iPhones are less unique and hence they are harder to fingerprint. In most cases this is not problematic but it can break down at conferences and events.

There’s also a speed/accuracy tradeoff. Additional fingerprinting information is available via Javascript, but loading an actual page and executing Javascript code may noticeably slow the redirect. Nobody wants that.

Apple: just pass us the params

All of this would just go away if Apple passed the params from App Store URLs into apps. No more privacy issues, no more heavy SDKs, and no more developers pulling their hair out. As a bonus, app developers would have the ability to provide a much better first-launch experience with deferred deep links. Everybody wins.

We can’t think of any reason why Apple would choose not to pass the params. Maybe an oversight, or maybe a conscious decision. Either way, we hope they change it.

Apple, please, just pass the params.

Why Do I Have So Much Direct Traffic?

Direct traffic means that someone was familiar enough with your brand to go directly to your site. That’s an accomplishment. The problem is, a lot of the traffic that is being tracked as ‘direct’ may be referral traffic that wasn’t tracked.

HTTPS

The most common cause of dropped referrers is a link that goes from an https page to your http page. This isn’t a bug; in fact, it’s in the W3C spec. W3C believes encrypted headers should stay encrypted, and passing them to an unencrypted page could cause a security leak (W3C Spec). So what’s a marketer to do? It’s simple: serve everything over https. It can take a little more overhead to set up, but it means your tracking will work better and your visitors may even feel more secure being on a page with a little green lock.
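The downgrade rule described above fits in one small function. This is a sketch of that rule only; it ignores the other causes of dropped referrers covered below, and the function name is illustrative.

```python
from urllib.parse import urlsplit


def referrer_survives(source_url, destination_url):
    """False when a link goes from an https page to an http page."""
    source_scheme = urlsplit(source_url).scheme
    destination_scheme = urlsplit(destination_url).scheme
    is_downgrade = source_scheme == "https" and destination_scheme == "http"
    return not is_downgrade


print(referrer_survives("https://blog.example.com/post", "http://yoursite.example/"))
print(referrer_survives("https://blog.example.com/post", "https://yoursite.example/"))
```

Only the first call reports a dropped referrer, which is why moving your own site to https removes this cause entirely.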

Referrer Blocking

So https is part of the problem, but it’s not the whole story. Browsers, security software, and even links themselves can also block the referrer field from being sent. Some people really don’t like the fact that the HTTP spec sends referrers. Here’s a quote from a Microsoft forum:

What is wrong with Microsoft Internet Explorer developers since they do not understand that this is a security issue that needs to be addressed right god damn now?! I do not want referrer information leaked to god knows who when I am surfing the web in the privacy of my own home! Microsoft FIX THIS RIGHT NOW !!!!!!!!!!

Cool_1

That’s an extreme case but the fact is, some people don’t want you to see where they came from. That pressure has led to the development of extensions for every browser that block referrers, default settings in Norton and ZoneAlarm to block referrers and even an HTML5 spec for dropping rel="noreferrer" directly into the link.

Emailed links, SMS links, App links, PDF links, etc.

Browsers are not the only things that can contain links. Traffic can come from outside the web, and generally that traffic will come through without a referrer and get swept right into ‘direct’. There’s not much we can do about this, since the referrer field was never meant to track links outside the web.

Redirects and link shorteners

Link shorteners can send along the referrer if they use a permanent 301 redirect. If they use a Javascript redirect or if they use a temporary 302 redirect, you’ll just see ‘direct’.

Solution: UTM Everything

UTM tags are really the only reliable way to track your traffic. As a rule, you should UTM tag every link you have the power to tag. This includes ads, links from social media, links from your email campaigns, links in your PDF e-books, links you text to your mother. Everything. All the links. Tag them all. The reason for this is that you control your UTM tags. There are no settings, devices, or HTML specs that can disable them.
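Tagging is easy to automate. Here is a minimal helper that appends the standard utm_source, utm_medium and utm_campaign params to any link; the function name and the example values are illustrative.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_utm(url, source, medium, campaign):
    """Return url with UTM params added, keeping existing params."""
    utm = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
    }
    parts = urlsplit(url)
    # Keep existing params, but drop any stale UTM values we are replacing.
    existing = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in utm
    ]
    query = urlencode(existing + list(utm.items()))
    return urlunsplit(parts._replace(query=query))


print(add_utm(
    "https://example.com/pricing?plan=pro",
    source="newsletter", medium="email", campaign="spring-blast",
))
```

Running every outbound link through a helper like this keeps the naming consistent, which matters as much as the tagging itself when you compare campaigns later.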

Cross Device Attribution

More than ever, people are visiting your product from phones, tablets, laptops and work computers.

Cross device tracking is going from an optional nicety to a necessity. Identifying your users on each device is the key to correlating their sessions and properly attributing referral sources.

Identifying Users

Identifying users with a unique user ID is essential. When identify is called for a specific user, all previous and future events on that device will be associated with that user. The key here is that previous events are associated as well. This allows for properly tracking many different cross device scenarios, such as the following:

  • Marcus is browsing his news feed on his phone and clicks a post about your product. He’s interested, but doesn’t sign up right away.
  • Later that week, Marcus decides to check out your product again, but this time visits your domain directly from his computer. He decides to sign up. When he does so, your software makes an identify call with a unique user ID.
  • In a couple days, Marcus logs in to your app from his phone. On login, your software makes an identify call with Marcus’s unique user id. All of his browsing history on this device is now properly tied to his account and that original click from his news feed will be properly attributed for his conversion.

This might seem like an edge case, but these days it’s too common to ignore.
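The Marcus scenario can be modeled in a few lines. This is a toy sketch of the idea, not the Attribution API: the Tracker class, its methods, and the device and user IDs are all hypothetical. Events are stored against a device, and identify links a device to a user, so events recorded before the identify call are picked up too.

```python
class Tracker:
    """Toy event store that stitches devices to users (hypothetical)."""

    def __init__(self):
        self.events = []          # one dict per tracked event
        self.device_to_user = {}  # device ID -> user ID

    def track(self, device_id, name, referrer=None):
        """Record an event against a device, identified or not."""
        self.events.append(
            {"device_id": device_id, "name": name, "referrer": referrer}
        )

    def identify(self, device_id, user_id):
        """Link a device to a user; covers past and future events."""
        self.device_to_user[device_id] = user_id

    def events_for_user(self, user_id):
        """All events from every device linked to this user."""
        return [
            event for event in self.events
            if self.device_to_user.get(event["device_id"]) == user_id
        ]


tracker = Tracker()
tracker.track("phone", "visit", referrer="news feed")  # anonymous click
tracker.track("laptop", "visit", referrer=None)        # direct visit
tracker.track("laptop", "signup")
tracker.identify("laptop", "user-42")                  # sign up on laptop
tracker.identify("phone", "user-42")                   # later login on phone

first_touch = tracker.events_for_user("user-42")[0]
print(first_touch["referrer"])
```

Once the phone is identified, the original news feed click becomes the first touch in the user’s history, ahead of the direct visit that led to the signup.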

How To

Properly tracking this scenario is simple:

  • Make an identify call on your client side software (JS, iOS, Android, etc) whenever a user signs up or logs in (and pass a unique user id)

In order to associate traits with your user, we recommend making an additional server side identify call. This way you can associate information that you might not have in your client software.

It’s fairly common to call identify using the email address instead of a user ID. This is, of course, a problem when a user changes their email. Tracking with the ID is really the best practice.

Why Your Javascript Tracking Leaks Data

Most of the modern analytics packages are based on event calls. When someone signs up, you fire an event. When they buy something, you fire an event. These events make up the bulk of the data that we use to track growth, a/b test our designs, and understand our products. The problem is that Javascript tracking is unreliable.

Here’s a timeline of what Javascript tracking should look like:

javascript-event-tracking

This is what should happen.

When someone performs an action, the Javascript on that page will fire an event to a third-party tracking platform. When the platform (such as Google Analytics, Mixpanel, or, cough, Attribution) receives the event, the tracker will fire a callback telling the page “You can load the next page now, we received the event.” Then you direct the browser to the link they clicked on. The problem is, you don’t want to wait very long for that callback. You have to decide for yourself how long is too long, but for the sake of our demo we’ll assume a 300ms timeout. Unless you noticeably slow your conversion flow, you will miss some events:

javascript-tracking-timeout

In this case you missed the callback. Chances are the event wasn’t stored, although in some cases the event will be stored but you just missed the callback. When the event isn’t stored you miss out on important data. So how do you fix this?
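The race between the callback and the timeout can be simulated outside the browser. The sketch below uses Python’s asyncio as a stand-in for the page’s Javascript; the function names are illustrative and the network delays are made-up values.

```python
import asyncio

TIMEOUT_SECONDS = 0.3  # the 300ms timeout assumed above


async def send_event(network_delay_seconds):
    """Stand-in for the round trip to the tracking platform."""
    await asyncio.sleep(network_delay_seconds)
    return "received"


async def track_then_navigate(network_delay_seconds):
    """Wait for the callback, but never longer than the timeout.

    Returns True if the callback arrived before the page moved on.
    """
    try:
        await asyncio.wait_for(
            send_event(network_delay_seconds), timeout=TIMEOUT_SECONDS
        )
        return True
    except asyncio.TimeoutError:
        # The page navigates anyway; the callback was missed.
        return False


print(asyncio.run(track_then_navigate(0.05)))  # fast network
print(asyncio.run(track_then_navigate(0.5)))   # slow network
```

One difference from a real browser: here the slow request is cancelled outright when the timeout fires, whereas in the browser the event may or may not still reach the tracker, as described above.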

The solution is fairly simple: use server side tracking wherever possible.

Call `track` events from the server, call `identify` and `page` from the client.

Events like identify and page are called when the page loads, so assuming normal human browsing, these calls should have enough time to reach a server. Most interesting ‘events’ involve a page load: form submissions, button clicks, etc.

The best practice is to use server side tracking wherever possible. Calling identify on the page with Javascript is still necessary; it cannot be replaced by a call from the server. Calling page from the server is possible but probably not necessary. Gathering critical event data in the browser is not recommended, and will probably leave you scratching your head at why the numbers don’t add up.

As a bonus, calling events from the server means that you can also correlate things that happen off-site to specific users. This is really valuable when sales happen offline or in a separate app.

Server side tracking can take more effort to set up but the benefits are enormous.