Identifying the mole when Google Analytics data does not match in-house reporting data

The Need

The client reported a discrepancy of more than 25% between its in-house transactional reporting system and Google Analytics, and engaged Nabler to find the root cause of the understated transaction data in Google Analytics.

About Client

The client is one of the leading ecommerce sites in India, selling consumer products across several categories. Nabler helped the client reduce its transactional discrepancy in Google Analytics from 47% to 11.6%.

Our Approach

At Nabler, we have seen this kind of issue with multiple ecommerce clients over the years. Looking at the industry average, there is a difference of approximately 32-36% between the transaction numbers reported by in-house transactional reporting systems and those reported by external web analytics solutions. Digging deeper into the issue, there are multiple contributing factors. Some of the common ones we have encountered are:

  • Alternative Payment Methods offered on the website, such as PayPal, Google Checkout, Cash on Delivery, etc.
  • Non-tracking of Shipping Charge, Taxes, Promotional Discounts, and Gift Wrap Cost data.
  • Offline orders via Phone and POS.

In the client's case, we validated all of the common factors above, but none of them turned out to be the issue. So we formed a few hypotheses to nail down the problem. This is how we approached it and found the root cause:

Hypothesis 1

Are we seeing differences between the in-house transactional reporting system and Google Analytics on a daily basis, or does it occur inconsistently?

To validate this hypothesis, we trended the differential data at the day level. The difference was spread across all days, so this possibility was ruled out.
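The day-level check can be sketched roughly as follows. This is a minimal illustration, not the analysis actually run for the client: it assumes in-house orders are available as (order ID, date) pairs and that the transaction IDs tracked by Google Analytics have been exported as a list. The function name and the sample data are hypothetical.

```python
from collections import Counter
from datetime import date


def daily_discrepancy(inhouse_orders, ga_transaction_ids):
    """Return {day: share of in-house orders missing from GA} for each day."""
    tracked = set(ga_transaction_ids)
    total = Counter()
    missing = Counter()
    for order_id, day in inhouse_orders:
        total[day] += 1
        if order_id not in tracked:
            missing[day] += 1
    return {day: missing[day] / total[day] for day in sorted(total)}


# Made-up sample data, for illustration only.
orders = [
    ("A1", date(2024, 1, 1)), ("A2", date(2024, 1, 1)),
    ("B1", date(2024, 1, 2)), ("B2", date(2024, 1, 2)),
]
ga_ids = ["A1", "B2"]
rates = daily_discrepancy(orders, ga_ids)
```

A roughly flat series of rates, as in the sample, points away from a day-specific cause such as a release or an outage.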

Hypothesis 2

Does the disparity between the two systems occur at particular hours of the day?

To validate this hypothesis, we categorized the differential data by time of day and then trended it. Again, as expected, the disparity occurred at all times of day.
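An hour-of-day breakdown might look like the sketch below. It assumes each in-house order carries a timestamp; the function name and sample values are hypothetical. Missing orders are reported alongside total orders for each hour, since raw counts alone would mostly mirror traffic volume.

```python
from datetime import datetime


def missing_by_hour(inhouse_orders, ga_transaction_ids):
    """Return a 24-item list of (missing, total) order counts per hour of day."""
    tracked = set(ga_transaction_ids)
    missing = [0] * 24
    total = [0] * 24
    for order_id, placed_at in inhouse_orders:
        total[placed_at.hour] += 1
        if order_id not in tracked:
            missing[placed_at.hour] += 1
    return list(zip(missing, total))


# Made-up sample data, for illustration only.
hourly_orders = [
    ("A1", datetime(2024, 1, 1, 9, 15)),
    ("A2", datetime(2024, 1, 2, 9, 40)),
    ("A3", datetime(2024, 1, 1, 21, 5)),
]
by_hour = missing_by_hour(hourly_orders, ["A1"])
```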

Hypothesis 3

Are there any products or product types which are not getting captured?

To validate this hypothesis, we compared the Product IDs in the differential data against the Product IDs that GA had captured successfully. The test revealed that several of the products in the differential data had been tracked by GA at some point.
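The Product ID comparison amounts to a set intersection. The sketch below assumes both lists of IDs are available as plain strings; the function name and the sample IDs are hypothetical.

```python
def split_missing_products(missing_product_ids, ga_product_ids):
    """Split product IDs from untracked orders into those GA has captured
    at some point and those it has never captured."""
    missing = set(missing_product_ids)
    seen_in_ga = set(ga_product_ids)
    return sorted(missing & seen_in_ga), sorted(missing - seen_in_ga)


# Made-up IDs, for illustration only.
sometimes_tracked, never_tracked = split_missing_products(
    ["P1", "P2", "P3"], ["P2", "P3", "P9"]
)
```

Products that land in the first list are tracked in some orders but not others. Products in the second list are never tracked at all, which makes them candidates for a product-specific fault.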

Next, we created several advanced segments in the Google Analytics interface to find the root cause, and one of the segments showed some interesting information. Here is the rule behind that segment:

[Image: advanced segment rule]

This segment gave us the visits in which the confirmation page was rendered but no transaction data was tracked. The client confirmed that, according to the site's architecture, this should not have been possible.
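The same condition can be approximated offline on exported session data. The sketch below is an assumption-laden illustration, not the segment rule itself: the confirmation URL, the session record layout, and the function name are all hypothetical.

```python
CONFIRMATION_PATH = "/checkout/confirmation"  # hypothetical URL, not the client's


def untracked_confirmations(sessions):
    """Return IDs of sessions that viewed the confirmation page
    but recorded no transaction."""
    return [
        s["id"]
        for s in sessions
        if CONFIRMATION_PATH in s["pages"] and s["transactions"] == 0
    ]


# Made-up session records, for illustration only.
sessions = [
    {"id": "s1", "pages": ["/product/1", CONFIRMATION_PATH], "transactions": 1},
    {"id": "s2", "pages": ["/product/2", CONFIRMATION_PATH], "transactions": 0},
    {"id": "s3", "pages": ["/product/3"], "transactions": 0},
]
suspect_sessions = untracked_confirmations(sessions)
```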

Looking at the page report under this segment, we recorded all the product pages viewed in those visits. Next, we discreetly made test purchases of randomly chosen products and noted whether each one fired the transaction code. During one such purchase, we noticed that the necessary ecommerce tags did not fire for one particular product.

We looked closely at the source code of the confirmation page and found that an apostrophe (') within the product name was breaking the ecommerce JavaScript code, because the apostrophe also acts as a string delimiter and so ended the product name string early. This was behind the 47% discrepancy reported by the client in the business problem stated above.
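The failure mode can be reproduced with a small server-side templating sketch. It assumes the page builds its JavaScript by concatenating the product name into a single-quoted string; the client's actual template is not shown in this case study, so the variable name, the product name, and both helper functions are hypothetical.

```python
import json


def naive_js_literal(value):
    # Unsafe: wraps the raw value in single quotes without escaping it.
    return "'" + value + "'"


def safe_js_literal(value):
    # json.dumps produces a double-quoted, escaped string, which is also
    # a valid JavaScript string literal.
    return json.dumps(value)


name = "Men's Shirt"  # made-up product name
broken = "var productName = " + naive_js_literal(name) + ";"
fixed = "var productName = " + safe_js_literal(name) + ";"

print(broken)  # var productName = 'Men's Shirt';
print(fixed)   # var productName = "Men's Shirt";
```

In the first line of output the string ends after `Men`, so the browser hits a syntax error and none of the tracking code in that script runs. Escaping at the point where the value is written into the page avoids this for any product name. Note that values embedded in an inline script block may also need `</` sequences handled separately.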

The Results

After the fix, the transactional discrepancy dropped to 11.6%.

Google Analytics also began capturing several other products that had not been tracked earlier.
