Privacy is a very important issue when it comes to digital. The way data is collected online and what happens to it is a much-scrutinized issue (and rightly so).
Digital data collection is also exceedingly complex, perhaps a reflection of the organic nature, and subsequent explosion, of the internet. If even sophisticated users find it difficult to know everything, one can hardly expect normal digital users to know what's really happening.
For example, people are really shocked when they hear that even with no web analytics or advertising analytics tool on a site, their behavior on the site gets automatically logged into server web logs. Information like IP address, the page requested, time stamps, browser ids and more is stored. These server logs can then be used to do basic reporting using off-the-shelf software.
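To make the server log point concrete, here is a minimal sketch of reading one log entry. It assumes the common Apache/Nginx "combined" log format (your server may be configured differently), and the sample line, the `parse_log_line` name and the field names are made up for illustration.

```python
import re
from datetime import datetime

# Assumption: the server writes the Apache/Nginx "combined" log format.
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_log_line(line):
    """Return the fields of one combined-format log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    return fields


# Made-up sample entry (203.0.113.7 is a reserved documentation address).
sample = (
    '203.0.113.7 - - [10/Oct/2012:13:55:36 +0000] '
    '"GET /blog/cookies HTTP/1.1" 200 2326 '
    '"http://www.example.com/" "Mozilla/5.0 (Windows NT 6.1)"'
)

hit = parse_log_line(sample)
print(hit["ip"], hit["path"], hit["time"].isoformat(), hit["user_agent"])
```

Notice that no cookie and no JavaScript tag is involved: the IP address, page, time stamp and browser id all arrive with the request itself.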
Another example: people don't realize that, depending on the browser you use, being in Private Browsing or Incognito or InPrivate mode does not mean that no data about you is collected by the sites you visit. Being in InPrivate or Incognito mode simply means that no data (cookies, history, etc.) is stored on your computer when you close the browser. If your employer or ISP monitors your web usage, then they can still track you when you are using Private Browsing/InPrivate/Incognito mode. [Be careful! :)]
So, it is complicated. You can understand why there's a lot of confusion and scrutiny.
Two recent flare-ups highlight this scrutiny. The first was around Facebook and Twitter tracking your behavior across the web as you visited sites that use Facebook and Twitter social buttons, or integrate FB's commenting system. The second flare-up is the evolving regulation in the European Union around the use of cookies.
In this post I want to cover the second issue, implications of some of the still-evolving EU cookie regulations. Though Europe is our primary focus, regardless of where you are located you'll learn about web privacy, data collection, optimal tool decisions and how best to plan your data strategy.
Since this is not a blog about legal issues (and I'm not a lawyer!) I will not focus on whether these regulations are good or great or how you can comply with them or what tracking you can and can't do. Please reach out to a local technology lawyer who can help you navigate those issues. Please work hard to ensure you're in compliance with local laws.
What I want to do is cover the implications of whether the use of cookies is permitted or not. My hope is to simply help you internalize the impact of these decisions on reports, which metrics might be impacted and which will be fine, as well as what types of decisions you can still make with confidence and which decisions you might make with a grain of salt.
Web Data Collection Context: Cookies and Tools
Before we go forward let's set some context and try to get on the same page with some terms that we'll use for the rest of this article.
Cookie:
A cookie is a small text file placed on your computer. Cookies contain a small amount of anonymous information that allows a website to know that you have visited in the past, responded to a campaign, or had x items in your cart (which allows the site to retain those items in your cart the next time you come). And other such uses.
Cookies are always set on behalf of the website owner. For example, they'll explicitly implement a tracking solution (like SiteCatalyst, comScore), typically via JavaScript tags, or an advertising solution (like DoubleClick, Kenshoo) or social buttons/commenting systems.
First-party Cookies:
These are cookies set on your behalf under your own domain. For example, when you visit this site, Google Analytics (which only uses first-party cookies) will set a small text file on your computer. This small text file (the cookie) can only be read on the browser it was set on, and only on this website. In other words, it only tracks what you do here and nowhere else.
Cookies are never permanent; they can disappear for any number of reasons. But first-party cookies, because of the above behavior, are the most persistent in the sense that they are rejected least often, they are preserved the longest, and they are the least likely to be deleted by "cookie cleaners."
Third-party Cookies:
These cookies are set on your behalf via a different domain. For example, if this website used Facebook's commenting system, then every time you visit this site your behavior would be tracked. But because this cookie (and your anonymous data in the cookie) is not set on this domain (kaushik.net), your behavior across other sites can also be tracked by this cookie. So if after you visit this website you visit Gawker and then CatsThatMakeAnalystsLaugh.com, then that behavior will also be tracked by Facebook.
Data in the cookie is anonymous, but when you go back and visit facebook.com, the privacy policies on facebook.com dictate how that data, along with your behavior on facebook.com, is used.
Third-party cookies are used by tons of providers. Perhaps the most common users are advertising platforms (Yahoo!, DoubleClick, Microsoft and more). Third-party cookies are critical when it comes to behavior targeting / ad re-targeting tools which rely on the ability to tie one person's behavior across multiple websites.
Because of the behavior described in this section, third-party cookies are the least persistent. They are more often rejected due to default browser settings, user choices, cookie cleaners, etc.
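If the first-party vs. third-party distinction still feels abstract, this small sketch shows the basic test: does the domain the cookie is set under belong to the site in the address bar? It is a simplification (real browsers compare registrable domains using the Public Suffix List), and `is_first_party` is a hypothetical helper name, not part of any tool.

```python
def is_first_party(page_host, cookie_domain):
    """Rough check: is the cookie's domain the page's host or a parent of it?

    Simplified on purpose; browsers use the Public Suffix List to compare
    registrable domains, which this sketch does not do.
    """
    page_host = page_host.lower().rstrip(".")
    cookie_domain = cookie_domain.lower().strip(".")
    return page_host == cookie_domain or page_host.endswith("." + cookie_domain)


# A cookie set under the site's own domain while you read the site.
print(is_first_party("www.kaushik.net", ".kaushik.net"))   # True
# A cookie set under facebook.com while you read the same page.
print(is_first_party("www.kaushik.net", "facebook.com"))   # False
```

The same facebook.com cookie would be first-party when you are actually on facebook.com, which is why it can connect your behavior across every site that embeds Facebook's features.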
Some web analytics tool providers still use third-party cookies, and often their customers are unaware. This is quite inadvisable. Check what type of cookies your web analytics tool uses. With third-party cookies your data is so sub-optimal that no matter how much the pain, please immediately take steps to shift from third-party to first-party cookies.
Web Analytics Tools:
Tools like Google Analytics, CoreMetrics, Baidu TongJi, StatCounter etc., whose primary purpose is to measure user behavior on one website: Yours!
Advertising Analytics Tools:
Tools that you implement along with, say, display advertising or social features on your site. The primary purpose is to provide you a service (ads, comments plugins, social buttons) and track user behavior *across sites.*
This is a lot of perhaps complex information. But if we get into privacy/data collection discussions, it is important that we understand these basic terms. Without them it is impossible to understand what regulation should take place and what the implications of these regulations (or our privacy controls) are.
Remember: Cookies are an important part of understanding user behavior on a site or multiple sites. But even if you are not using cookies, or digital analytics tools, behavioral data of your users is still being logged by your website's servers in web log files.
European Privacy Regulations: Implications
I'm not a lawyer, nor do I play one on TV :), so I'm not going to opine on the why's and the how's of the law. Please consult with a lawyer in your local legal jurisdiction.
At the moment there is a central EU e-Privacy directive that is in various stages of interpretation and implementation by individual EU member countries. The directive requires obtaining consent prior to tracking. One specific portion of the directive hence applies to the usage of cookies.
Since the interpretation is a bit unique across EU members, a site located in the UK might ask for a different permission, using a different method, and permit tracking of different things than a site in Germany or Holland or Spain.
Broadly speaking, there seem to be four buckets of implementation currently underway in Europe when it comes to the cookie part of the law. Let's look at just one thing: Implication of each implementation (Government requirement) on the data you collect and the web data analysis that you'll be able to do.
1. No change in the law related to cookies.
La vita è bella.
Life continues as normal. Focus on picking the best web metrics, actually do analysis of that data, toil day and night to deliver superior web experiences and improve digital profitability of your company.
Please assign someone in the company to stay in very close touch with government regulations and recommendations. Please ensure that your privacy policy is transparent and up to date about the data you collect and the choices your website visitors have for not being tracked.
PS: Oh and if your web analytics tool uses third-party cookies, switch to first-party, and if it can't use first-party then it is time to say sayonara to the tool.
2. No change to first-party cookies. Third-party cookies require opt-in.
A number of EU countries are in this bucket. Implications?
For first-party cookies: The data you are collecting with your web analytics tool about user behavior on your site is just fine. Use this data to make life better for everyone. [If your web analytics tool is using third-party cookies (boo!) then everything below applies to you.]
For third party cookies: When users visit your website, you'll present them with a notice to opt-in to being tracked using third-party cookies. They can choose yes or no. You'll ensure your web serving platform remembers that setting and acts accordingly.
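Here is a minimal sketch of what "remembers that setting and acts accordingly" could look like on the server side for this bucket. Everything in it is an assumption for illustration: the `third_party_consent` cookie name, the "yes"/"no" values and the `plan_response` helper are hypothetical, and what you are actually allowed to load is a question for your lawyer, not for this code.

```python
from http.cookies import SimpleCookie

CONSENT_COOKIE = "third_party_consent"  # hypothetical first-party cookie name


def read_consent(cookie_header):
    """Return True/False for a stored choice, or None if the visitor has not chosen yet."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    if CONSENT_COOKIE not in jar:
        return None
    return jar[CONSENT_COOKIE].value == "yes"


def plan_response(cookie_header):
    """Decide what to serve, assuming only third-party cookies need opt-in."""
    consent = read_consent(cookie_header)
    return {
        "show_opt_in_notice": consent is None,    # no choice recorded yet
        "load_first_party_analytics": True,       # unaffected in this bucket
        "load_third_party_tags": consent is True  # only after an explicit yes
    }


print(plan_response(None))
print(plan_response("third_party_consent=yes"))
print(plan_response("third_party_consent=no; theme=dark"))
```

The visitor's choice is itself stored in a first-party cookie here, which works in this bucket because first-party cookies are not restricted.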
If the user accepts the cookie, there is no impact on your data.
If the user says no to being opted-in: In most cases ads, social commenting systems etc, that use third-party cookies, will continue to work. Ads will show up, comments will be accepted. But there will be an impact on your advertising analytics solutions.
The data you'll be able to collect will be worse than it was before (and remember, it was fragile before). The number of impressions, click-through rates, conversions, view-thrus and other metrics reported by your ad platforms will be less precise than they were before.
If you use behavior targeting/ad re-targeting/remarketing solutions then your effectiveness with these solutions will be reduced. That's simply because retargeting/re-marketing relies on leveraging third-party cookies to observe a person's behavior across websites and then deliver optimally targeted content/ads.
If you use social commenting platforms provided by third parties, the user will see that you no longer remember them.
If the users don't accept third-party cookies, then on some platforms you won't be able to track their behavior away from your site. On one social platform they very cleverly track this behavior: Bonita sees your post on the social platform. Bonita then visits your website, which was in your social post. Bonita then comments on your post. Bonita then also clicks on the social share button and shares your post back on the social platform. This tracking primarily works today because of third party cookies. This might not work going forward so you won't be able to analyze this behavior.
The full impact will depend on the digital advertising analytics tool you are using. Please call your account representative at your tool's vendor. Use the information above to ask specific questions. (For a minority of vendors it is very difficult to get straight answers.) Adjust your data analysis strategy accordingly.
3. Both first-party and third-party cookies require asking users for permission/opt-in.
A couple of EU countries are going to be in this bucket. It is not completely clear if "no" to cookies means that zero tracking can be done (or if only cookies can't be stored). Assuming the opt-in is just for cookies (i.e., data stored on a customer's browser)…
If the person opts-in, nothing changes. Both your web analytics and your advertising analytics tools are going to report the data fine.
If the person rejects the request and does not opt into setting cookies in their browser then one of these two scenarios will occur:
1. Some web analytics tools will still collect the data ("hits" really). They will then try to "intelligently stitch" the session together. Since each "hit" is reported to the tool (via the JavaScript tag), a bunch of data comes through even if the cookie data does not. So these tools will use the anonymous browser id and the IP address and other such elements to do something like:
"All these hits look like they are coming from the same browser, their time stamps are really close to each other and while we don't have cookies these 'hits' start at time x and finish at time x+5 and so they must be one visit. We'll stitch the hits together and report that as one visit."
All the data is anonymous (as it would have been in the case of a normal web analytics tool as well, almost all of which don't collect personally identifiable information).
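The stitching logic quoted above can be sketched roughly like this. It is my guess at the general idea, not any vendor's actual algorithm: the 30-minute inactivity gap, the `stitch_visits` name and the hit fields are all assumptions.

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed inactivity timeout


def stitch_visits(hits, gap=SESSION_GAP):
    """Group cookieless hits into visits by (ip, user_agent) and time proximity."""
    visits = []
    open_visit = {}  # (ip, user_agent) -> the visit (list of hits) in progress
    for hit in sorted(hits, key=lambda h: h["time"]):
        key = (hit["ip"], hit["user_agent"])
        current = open_visit.get(key)
        if current is not None and hit["time"] - current[-1]["time"] <= gap:
            current.append(hit)  # close enough in time: same visit
        else:
            current = [hit]      # first hit, or too long a pause: new visit
            open_visit[key] = current
            visits.append(current)
    return visits


t0 = datetime(2012, 5, 1, 9, 0)
browser_a = {"ip": "203.0.113.7", "user_agent": "Browser A"}
browser_b = {"ip": "198.51.100.4", "user_agent": "Browser B"}
hits = [
    {**browser_a, "time": t0, "page": "/"},
    {**browser_a, "time": t0 + timedelta(minutes=2), "page": "/blog"},
    {**browser_b, "time": t0 + timedelta(minutes=3), "page": "/"},
    {**browser_a, "time": t0 + timedelta(minutes=5), "page": "/about"},
    {**browser_a, "time": t0 + timedelta(days=1), "page": "/"},
]

visits = stitch_visits(hits)
print(len(visits), [len(v) for v in visits])
```

In this made-up data the three hits from Browser A within five minutes become one visit. The hit a day later starts a new visit, and without a cookie there is nothing reliable to tie it back to the same visitor.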
Implication on the Visits metric: You get the best case scenario of a Visit. Not perfect, but close enough.
Implication on Unique Visitors: Since the cookies are not stored, every time the user of that browser comes back to the site he will be identified as a New Visitor. So Unique Visitor counts will be imprecise. By how much will depend on how many people exhibit that behavior.
Implication on New and Returning Visitors: See above, these numbers will be imprecise (with New Visitors being overstated). For this reason you can see why metrics like Recency and Loyalty will also be wrong.
If your web analytics tool is exhibiting the above behavior then there will be little to no impact on data for dimensions like referring websites, keywords etc. and metrics like time on page or total page views. There will be some impact on visit-level metrics like time on site or page views per visit (because remember the "visits stitching" is a very informed guess).
Please check with your vendor if they are using this method. To the best of my knowledge, only a couple do.
2. Most web analytics tools will detect an inability to use cookies (after the user says no when presented with a choice in the opt-in) and they will not collect any data for that browser/visit. The motivation is to present you with data that is clean (or as clean as it would normally have been) and that you can confidently analyze.
Google Analytics falls into this bucket.
So the behavior of people who opt out of cookies won't be measured and won't be represented in the data. Hence your data will represent less than the total.
If first-party cookies are not accepted then the number of people not accepting will remain an unknown unknown: remember, they are opting out of being tracked (and that includes tracking the fact that they don't want to be tracked).
Also remember, as I'd mentioned in the opening, your web server is still likely collecting all the hits (requests). Web servers don't, by default, set cookies and hence don't have that information in the web logs. But information like IP address, browser user agent id, time stamps, page urls and much more are recorded in web logs. These logs can be parsed using freely available web log parsing solutions.
While reports from your web log parser won't give the type of robust reporting you can get from a SiteCatalyst or Yahoo! Web Analytics, you can still report a lot of user behavior using these web logs. Make sure that your privacy policy clearly states this to your users, and please consult with a local law expert for guidance.
4. [Nuance on the cookie issue:] Cookies are fine, IP addresses can't be collected or only partly collected.
One European country, and perhaps more in the future, is in this bucket. The government has said that IP addresses should be considered as personally identifiable information (PII) and not collected by web analytics tools.
If this is you, then the report that will be most impacted is the Geography/Location report. It will show imprecise data. This will be regardless of the type of cookies you are using.
In Google Analytics this report is located in: Audience > Demographics > Location.
If your web analytics tool allows you to report out IP addresses (Google Analytics does not) and match them back to companies, etc., then you won't be able to do that precisely either.
All other data should be fine.
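To see why the Geography/Location report gets fuzzy, here is a sketch of one way an IP address could be "only partly collected": keep the network part and zero out the rest. The prefix lengths (24 bits for IPv4, 48 bits for IPv6) and the `mask_ip` name are my assumptions for illustration; what your tool does, and what your regulator accepts, may differ.

```python
import ipaddress


def mask_ip(address, v4_prefix=24, v6_prefix=48):
    """Zero out the host part of an IP address before it is stored.

    The default prefix lengths are illustrative assumptions, not a legal
    or vendor standard.
    """
    ip = ipaddress.ip_address(address)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


# Reserved documentation addresses, used here as made-up examples.
print(mask_ip("203.0.113.77"))
print(mask_ip("2001:db8:abcd:12:34:56:78:9a"))
```

Every visitor on the same network block now looks identical to the tool, so a location lookup can still land on roughly the right region but loses precision below that.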
Those are the four key scenarios that we are dealing with at the moment. I hope you understand better the implication on your web analytics and digital advertising analytics tools.
Quick Repeat Summary of Cookies.
The current crop of web analytics tools rely on cookies to more accurately identify a unique browser.
They typically don't track a person. If you use three browsers on your computer then you appear as three unique visitors to a web analytics tool. (Solutions like Google Analytics explicitly prohibit you from collecting personally identifiable information.)
The rare exception might occur if, say, you log into all three browsers using a unique login id you'd created with the company. In that case the company has a choice to use that login to match back all three anonymous cookies to one person. But this happens in the company backend (say a CRM system or a Data Warehouse) and not in the web analytics tool.
There was a story recently in the New York Times about Target doing that to create unique people profiles. Again, that would happen in the company's backend systems and not the web analytics tools.
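A tiny sketch of the difference between the two counts. The cookie ids, the `member-42` login id and the lookup table are made up; the point is only that the web analytics side sees cookies, while the join to a person needs a table that lives in your backend.

```python
# What the web analytics tool sees: one anonymous cookie id per browser.
hits = [
    {"cookie_id": "c-chrome", "page": "/"},
    {"cookie_id": "c-firefox", "page": "/blog"},
    {"cookie_id": "c-safari", "page": "/about"},
    {"cookie_id": "c-chrome", "page": "/blog"},
]

# What only the company backend (say a CRM) could know, because the
# person logged in from each browser. Hypothetical data.
cookie_to_login = {
    "c-chrome": "member-42",
    "c-firefox": "member-42",
    "c-safari": "member-42",
}


def count_unique_visitors(hits):
    """Unique Visitors as the analytics tool counts them: distinct cookies."""
    return len({hit["cookie_id"] for hit in hits})


def count_known_people(hits, cookie_to_login):
    """Distinct logged-in people, after joining cookies to backend logins."""
    return len({
        cookie_to_login[hit["cookie_id"]]
        for hit in hits
        if hit["cookie_id"] in cookie_to_login
    })


print(count_unique_visitors(hits), count_known_people(hits, cookie_to_login))
```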
Closing Context: Don't "freak out" about the missing data.
If you think back to how we've measured the effectiveness of marketing in the past (or even today for TV, Radio, Magazines, Newspapers, Billboards, etc.), you'll realize that what we call measurement is essentially a glorified faith-based initiative.
If you think back to how we've measured user experience (using observational studies, lab tests, follow-me-homes, etc.), you'll realize that we used observation of 10 or 100 or 1000 people to extrapolate to what millions of our actual users do.
Now if you consider the data you are collecting with your digital analytics solutions, you'll most definitely marvel at how much data we have. A lot. 80%? 90%? 60%? It is a ton more than from any other channel.
So you can cry about all the data you won't have or don't have. Or you can be happy that you still have 5,000 times more than you have on any other channel on the planet and analyze that data and use the insights. If in the past, when you had 100% of the data, you might have made a big massive huge data-driven decision, now you might just make a big massive data-driven decision.
Still better than the faith that powers offline advertising, right?
Don't let the quest for perfection stop you from making a decision today based on good enough.
Ok, it's your turn now.
If you live in Europe, how are you adapting to the implementation of the cookie directive in your country? Have you noticed an impact on your advertising analytics or web analytics solution? Is your company paralyzed, or still using the "good enough" data (I know it can be hard for Sr. Managers)? If you live in the rest of the world, do you understand cookies better now? If you are a technology expert, what's missing from the article above? Anything you would change?
Please share your feedback, critique, kudos and delightful perspectives via comments.
Thanks.
PS: A couple of bonus items for you…
#1: This article outlines everything, in simple English, you ever wanted to know about how GA uses cookies. The cookie names, exactly what they do, when they expire and more: Cookies & Google Analytics
#2: My privacy policy states in simple English what I track and how you can opt out of every analytics solution used on this website, and more: Occam's Razor Privacy Policy
#3: Two of my favorite articles about cookies: Why The Guardian uses cookies. The New York Times' cookie page.
Enjoy!