Guest post by Tony Ademi.
Pretty much everyone involved in digital marketing knows that bots account for a lot of Internet traffic. But you may not realize just how big that percentage is, or which techniques you can use to prevent bot traffic from distorting your analytics data.
What type of bots should be avoided, and which ones are best to allow? Here’s a deep dive into how to identify and get control over bot traffic on your website and in your traffic analytics.
What exactly is bot traffic in your Google Analytics data?
Bots, spiders, and other software apps run automated tasks on the internet, and there are far more of them than you may imagine. By most estimates, bots generate more than 40% of all internet traffic.
That's a substantial share: nearly half of internet activity isn't human. Of course, some bots are beneficial, such as the crawlers run by Reddit and Google, but plenty of others exist to steal your sensitive information.
Whether good or bad, bots aren't human beings, so it's important to recognize bot traffic in Google Analytics.
Identify your bot traffic
Determining the sources and extent of bot traffic can be tricky, because the names and origin points of unwanted attention and unusual hits are continually changing. Here’s how to exclude bot traffic in Google Analytics:
- Go to View Settings in the Admin section and check the option "Exclude all hits from known bots and spiders." Many marketers worry that this option will change the data they've already collected, but it won't, so there's nothing to worry about.
- Alternatively, create a filter to exclude traffic you've already identified: set up a new view, leave the bot-exclusion setting unchecked, and add a filter that excludes traffic by Hostname, Source, and so on. Test the filter on this view to confirm it works correctly before applying it to your Master view.
- Consider using the Referral Exclusion List. It can be found under Admin section > Property column > Tracking Info. Adding a domain here excludes it from your referral data in Google Analytics, so if you spot suspicious domains, you can keep them out of your future reports.
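The same idea can be sketched offline. If you export your traffic data (for example, as a CSV), you can drop rows whose source matches a blacklist of domains you have already identified as bots. The column names and blacklist entries below are assumptions for illustration, not part of any Google Analytics API:

```python
import csv
import io

# Hypothetical blacklist of referral domains you have identified as bots.
BOT_DOMAINS = {"trafficbot.life", "bot-traffic.xyz"}

def filter_bot_rows(csv_text):
    """Return only rows whose 'source' column is not on the bot blacklist."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row["source"] not in BOT_DOMAINS]

# Tiny example export: one row per traffic source.
sample = "source,sessions\ngoogle,120\ntrafficbot.life,900\nbing,40\n"
clean = filter_bot_rows(sample)
# Only the google and bing rows survive the filter.
```

This mirrors what the Referral Exclusion List does inside Google Analytics, but applied to data you have already pulled out.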
What are good and bad bots?
Bot traffic is any traffic that doesn't come from a real person. Good bots automate useful tasks; examples include the bots behind voice assistants like Alexa, Cortana, and Siri. The most common types of good bots are:
- Website health checking bots
- Commercial crawlers
- Bots from search engines (Google, Bing)
- Bots that convert sites to mobile content
All of these types of bots play an essential role in keeping websites running effectively. Blocking good bots from entering your site might adversely affect your traffic, so be cautious.
Then there are bad bots. These are the bots responsible for spamming and stealing sensitive information. Many of them scrape and crawl data from websites in order to republish the content elsewhere. Here are some of the most common types of bad bots:
- Web scrapers (except those used for ethical scraping purposes)
- Spam bots
- Hacker bots
- Bots that try to impersonate someone
Filter by hostnames
You can create view filters in Google Analytics using specific hostnames. To do this, find the hostname of the bot traffic by adding Hostname as a second dimension in the Source/Medium report to confirm it's bot traffic, then create an include filter for your own hostname.
Sometimes your site is hit not by actual visits but by fake hits sent straight to your tracking code, also known as "ghost spam." Filtering by hostname is an excellent way to remove a large amount of this spam traffic in one go, without manual intervention.
To do this, create a filter for your view and do the following:
Include only > Hostname > That contain > yourdomain.com
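The include-only logic above can be sketched in a few lines. This is a minimal illustration applied to exported hit data, assuming a simple list of dicts with a "hostname" field; it is not how Google Analytics implements filters internally:

```python
def include_hostname(rows, domain="yourdomain.com"):
    """Keep only hits whose hostname contains your own domain.

    Ghost spam is sent with fake or missing hostnames, so anything
    that doesn't mention your domain gets discarded.
    """
    return [r for r in rows if domain in r["hostname"]]

# Example: one legitimate hit, one ghost-spam hit with a fake hostname.
hits = [
    {"hostname": "www.yourdomain.com", "page": "/"},
    {"hostname": "ghost-spam.site", "page": "/"},
]
print(include_hostname(hits))  # only the first hit remains
```

Because the filter is an allow-list rather than a block-list, new spam domains are excluded automatically without you having to enumerate them.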
How do you find bot traffic?
So, who visited your site—was it a human or a bot? Fortunately, there are several different ways to determine this.
- Sudden spikes: If you see a sudden spike in traffic with no apparent cause (no new content or announcement, no mention in the Wall Street Journal), there's a solid chance it's bot traffic. Most experts agree this is a standard way to identify bots: if your site had 500 visits yesterday and 10,000 today, something is suspicious. Maybe something wonderful happened, but more likely it's bot activity. You can identify bot traffic by scrolling through the sources in Google Analytics; a traffic bot might show up as something like "trafficbot.life."
- Strange reports: Take a second to dive into your campaign reports. If you see a campaign you didn't set up, it may be a bot. You can isolate these using custom filters that exclude every other form of traffic.
- Strange referral traffic: Monitor your referral traffic sources consistently. If you see strange domains in your referral traffic that you know aren't real sites, exclude them through the Referral Exclusion List. Referral spam of this kind is usually sent by spambots, so you can also maintain a referral spam blacklist to decide which sources to filter out.
- Very low time on page: Another common bot signal is visitors who stay on your site for one second or less. Set up an unfiltered view in your Google Analytics account so you can spot this.
What should your data quality review cycle look like?
When reviewing data quality, you should always be going through a cycle. Here’s what it should look like:
- Analyze your sources of website traffic.
- Identify any “strange” behaviors or unusual patterns in your traffic (e.g., high traffic on a Sunday when visits are usually down).
- Dig deeper to determine the source of the issue.
- Fix the problem by editing your view filters.
- Document the issue and the fix.
- Repeat on a continuous cycle as you analyze traffic on a daily, weekly, or monthly basis.
What kind of limitations are there when you exclude bot traffic?
You can manually exclude bots, but this isn't best practice and is not guaranteed to be 100% reliable. Bot creators are growing more sophisticated and writing code that behaves more like human visitors. It's increasingly common for bots to spread their requests across different IP addresses, including residential IPs, to look less like bots.
Finally, there's the Referral Exclusion List, which may not be the most reliable option. Excluding a referral source only removes those hits from your referral reports; they may still show up as direct traffic, possibly causing even more damage. In other words, it's often better to use other exclusion methods.
Removing bots might not always be enough
Excluding bots from your Google Analytics platform is a great idea, but it doesn't stop bot traffic from attacking your app, APIs, and website. Even if bots no longer show up in your website performance data, they can still slow your site down and hurt the overall user experience.
Google Analytics data helps you make much better marketing decisions, but it takes dedicated online fraud protection to combat account takeover (ATO), content theft, credential stuffing, and the risk of DDoS.
In other words, removing bots from your Google Analytics data cleans up your reporting but does not stop bots from attacking your platform. In short, your best bet is to ban malicious bots from your website and apps altogether, not only exclude them from your data.
It’s not fun having to identify and remove bots from your Google Analytics platform. But if a significant share of your traffic is coming from a bunch of spam bots, crawlers, and spiders, you’ll get a distorted view of your digital marketing success. So, filter them out and continuously monitor for any suspicious activity.
Bots can be good or bad, but it’s vital to identify the bad bots before they skew your Google Analytics data.
Tony Ademi is a freelance SEO content and copywriter. He has been in the writing industry for three years and has managed to write hundreds of SEO-optimized articles. Moreover, he has written articles that have ranked #1 on Google. Tony’s primary concern when writing an article is to do extensive research and ensure that the reader is engaged until the end.