404 not found: A better approach to track missing pages with Piwik

You don't only want to know about the pages your visitors like. You should also know about the pages they missed due to broken links and bad referers. Here's how you can get that information with Piwik.

Written by: Matthias
Published on: 2014-10-31

Looking for a web analytics solution, we’ve been evaluating Piwik for a while now. We need to provide our customers with a tool that is powerful enough to answer their most important questions, yet is easy to use and meets all regulatory and data protection standards. Piwik really seems to be a good fit here.

Now one question that typically arises is which URLs lead to HTTP 404 errors – that is, which URLs were requested but could not be found. You might want to look at this from two perspectives:

  • Which are the "most missed" pages? ("Did we inadvertently remove popular content?")
  • What are the top referring URLs visitors come from when they end up on the 404 page?

The Piwik FAQ has a simple recipe for this use case. Basically, it re-defines the page title to "404" + the URL of the missing page + the URL of the referring page.
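For comparison, a title-based recipe of this kind can be sketched roughly as follows. This is a minimal sketch, not a copy of the FAQ text: the helper name `build404Title` and the exact title layout are assumptions made for illustration, while `setDocumentTitle` is the tracker method that overrides the page title.

```javascript
// Hypothetical helper: packs the missing URL and the referrer into one title string.
function build404Title(pageUrl, referrer) {
  return "404/URL = " + encodeURIComponent(pageUrl) +
         "/From = " + encodeURIComponent(referrer);
}

// Only runs in a browser, on the 404 error page, before trackPageView is queued.
if (typeof document !== "undefined") {
  window._paq = window._paq || [];
  window._paq.push([
    "setDocumentTitle",
    build404Title(location.pathname + location.search, document.referrer)
  ]);
}
```

Both pieces of information end up in a single string, which is why they can only be reported on together.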

That "kind of" works because the information will show up in the "Actions > Page titles" report. However, it is not very elegant because it abuses the page title and you cannot report on the two dimensions independently.

Segments to the rescue

Here’s what seems to be a better suggestion to me: On your 404 error page, track two custom variables, "Error Code" and "HTTP Referer", in the "page" scope. You can learn more about Custom Variables in the Piwik docs.

The extra tracking code you need to add basically looks like

_paq.push(["setCustomVariable", 1, "Error Code", 404, "page"]);
_paq.push(["setCustomVariable", 2, "HTTP Referer", document.referrer, "page"]);

Make sure you add these custom variables before the trackPageView call, otherwise they won’t be tracked.
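To make the ordering explicit, here is a minimal sketch of how the calls could be queued on the error page. The wrapper function `track404` is a hypothetical helper introduced only for illustration; the pushed commands are the ones described above, followed by the usual `trackPageView`.

```javascript
// Hypothetical helper: queues both custom variables first, then the page view.
function track404(paq, referrer) {
  paq.push(["setCustomVariable", 1, "Error Code", 404, "page"]);
  paq.push(["setCustomVariable", 2, "HTTP Referer", referrer, "page"]);
  // The page view has to come last so that it carries the variables set above.
  paq.push(["trackPageView"]);
  return paq;
}

// Only runs in a browser, on the 404 error page.
if (typeof document !== "undefined") {
  window._paq = window._paq || [];
  track404(window._paq, document.referrer);
}
```

If your regular tracking snippet already queues `trackPageView`, the two `setCustomVariable` lines simply need to appear above it on the error page template.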

Then, create a new segment in Piwik like so:

(Screenshot: creating a segment in Piwik using custom variables)

This segment acts like a filter and only shows you the requests that ended on the 404 error page. Enable it and then have a look at the following reports:

  • "Actions > Pages" will show you all page URLs that were missing. You can also use the "Transitions" feature here to find out where your visitors came from (i.e. which page contained the broken link).
  • On "Visitors > Custom variables", drill down into the "HTTP Referer" variable. This will give you the distinct URLs (internal and external) that sent your visitors to a missing page.
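If you prefer to pull the first report programmatically, the same filter can probably be expressed as a segment string for the Piwik Reporting API. The following is a sketch under assumptions: the base URL, site ID and token are placeholders, the helper `build404ReportUrl` is hypothetical, and the segment assumes "Error Code" is stored in custom variable slot 1 as in the tracking code above.

```javascript
// Hypothetical helper: builds a Reporting API URL for "Actions > Pages",
// restricted to page views that carry the custom variable Error Code = 404.
function build404ReportUrl(baseUrl, idSite, tokenAuth) {
  var params = new URLSearchParams({
    module: "API",
    method: "Actions.getPageUrls",
    idSite: String(idSite),
    period: "month",
    date: "today",
    format: "JSON",
    // ";" combines the two conditions with AND; slot 1 is an assumption.
    segment: "customVariablePageName1==Error Code;customVariablePageValue1==404",
    token_auth: tokenAuth
  });
  return baseUrl + "/index.php?" + params.toString();
}

// Placeholder values for illustration only.
var reportUrl = build404ReportUrl("https://piwik.example.com", 1, "YOUR_TOKEN");
```

`URLSearchParams` takes care of URL-encoding the segment, which matters because it contains spaces, "=" and ";".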

If your error page template uses a common page title ("page not found") for all missing pages, here is another piece of information you might find useful:

  • Go to "Actions > Page titles" and find your error page. You can then use "Transitions" to learn where your visitors moved to when they hit a broken link. This is valuable insight that can help you to improve your error pages and guide your visitors back to the website in case things don’t work as expected.

One thing you should keep in mind is that "Transitions" shows you – for a particular page – which other pages your visitors came from and where they moved to afterwards. In contrast, the "Referrers" menu only deals with the external websites users came from at the beginning of their visit. So "Referrers" won't help you find the URLs of pages containing bad or broken links.

Oh, and just in case you’re using Symfony 2 (which we love!): be sure you don’t miss WebfactoryPiwikBundle, which can help you integrate Piwik and track visitor actions, and WebfactoryExceptionsBundle, which helps you build nice-looking and helpful error pages.

Did we spark your interest?

If so, we'd love to hear from you. Don't hesitate to get in touch with us if you have any questions or further remarks. And if you want to discuss any type of project, product, problem or idea with us, we would like that even more!