Analytics are an integral part of any application, and today data is one of the most valuable assets a business has. We rely on analytics not only to give our users a better experience but also to provide better reporting for our business division and our partners.
In June 2017, we rewrote our analytics API, a.k.a. Genzo. We upgraded it to Rails 5, refactored it for better readability, removed legacy and unused code, and sped it up, because the amount of data had been growing steadily and was putting pressure on the servers. Initially the API was written as a single-controller app: everything lived in the ApplicationController, and everything else was just stubs for the actions. In the rewrite we split the code into concerns, modularising it so that anyone working on it can understand it more easily. We were also able to return better error codes in responses, and by no longer swallowing errors we found some legacy event data that was being silently ignored rather than logged. Below is a comparison of the code before and after the rewrite, measured against Sandi Metz's rules. It is always hard to remove 100% of legacy code; there is always some code you have to live with forever.
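To illustrate the idea of splitting one large controller into concerns, here is a minimal sketch. It uses plain Ruby modules so that it runs without Rails; in a Rails app the same shape would normally be built with ActiveSupport::Concern. The module names, the required keys, and the 422 error code are hypothetical and not taken from Genzo.

```ruby
# Hypothetical concern: validates an incoming event payload.
module EventValidation
  REQUIRED_KEYS = %w[event_name timestamp].freeze

  # Returns the required keys that are missing from the payload.
  def missing_keys(payload)
    REQUIRED_KEYS.reject { |key| payload.key?(key) }
  end
end

# Hypothetical concern: maps a validation result to an HTTP status code.
module ErrorResponses
  def status_for(missing)
    missing.empty? ? 204 : 422
  end
end

# The controller stays small because each responsibility lives in its own module.
class EventsController
  include EventValidation
  include ErrorResponses

  def create(payload)
    status_for(missing_keys(payload))
  end
end
```

Each module can be read and tested in isolation, which is the main readability gain over a single controller that does everything.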
While rewriting Genzo we also made it multithreaded. V1 was blocking: it held the user's request until the entry was written to the log, something the user is not concerned about and shouldn't be kept waiting for. This was also causing data loss, as users would navigate away from the resource and terminate the call. V2, however, doesn't hold the request until the entry is logged; it validates the data, returns the response, and logs the data in the background. Below is a comparison of the heaviest component in Genzo for V1 and V2. V1 suffered from peaks, whereas V2 delivers a consistent response time.
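The "validate, respond, then log in the background" flow can be sketched with a queue and a single worker thread. This is an illustration under assumptions, not Genzo's actual implementation: the class name, the validation rule, and the 422 status for invalid events are all hypothetical.

```ruby
require "json"

# Hypothetical logger: the request path only validates and enqueues,
# while a worker thread performs the slow write.
class BackgroundEventLogger
  def initialize(io)
    @io = io
    @queue = Queue.new
    @worker = Thread.new do
      # Queue#pop blocks until an item arrives; nil is our stop signal.
      while (event = @queue.pop)
        @io.puts(JSON.generate(event))
      end
    end
  end

  # Called on the request path: returns a status without waiting for the write.
  def handle(event)
    return 422 unless event.is_a?(Hash) && event.key?("event_name")

    @queue << event
    204
  end

  # Drains pending events and stops the worker, e.g. at shutdown.
  def shutdown
    @queue << nil
    @worker.join
  end
end
```

Because the write happens off the request path, a user navigating away after the response no longer cancels the logging of that event. The trade-off is that queued events live in memory until written, so a real service would need a strategy for draining the queue on shutdown.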
The enhancement I consider most significant was changing the status code for a successful response from 200 to 204. It might seem like a small change, but it is an impactful one. Firstly, 200 isn’t the right code to use for an event that doesn’t indicate a state change; also, a 200 response is expected to come with a body. 204 is designed to indicate success without a body in the response, and also to indicate that the application state shouldn’t change due to this event. W3 explains it as follows:
“If the client is a user agent, it SHOULD NOT change its document view from that which caused the request to be sent. This response is primarily intended to allow input for actions to take place without causing a change to the user agent’s active document view, although any new or updated metainformation SHOULD be applied to the document currently in the user agent’s active view.”
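To make the difference concrete, here is a small sketch using Rack-style response triplets of status, headers, and body. The JSON body shown for the 200 case is hypothetical, since the actual V1 body is not given here. In a Rails controller, a bodiless 204 is typically sent with `head :no_content`.

```ruby
# Rack-style responses: [status, headers, body_parts].

# A 200 response that carries a small (hypothetical) JSON body.
OK_WITH_BODY = lambda do |_env|
  body = '{"status":"ok"}'
  headers = {
    "content-type" => "application/json",
    "content-length" => body.bytesize.to_s
  }
  [200, headers, [body]]
end

# A 204 response: success, with no body at all.
NO_CONTENT = lambda do |_env|
  [204, {}, []]
end

# Number of body bytes a response would put on the wire.
def payload_bytes(response)
  response[2].sum(&:bytesize)
end
```

Per request the saving is only a handful of bytes plus the headers that describe them, but for an endpoint that receives every analytics event it is multiplied by the total number of hits.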
Due to this change we made massive savings on our CDN bandwidth. Below it can be seen that the number of hits has been growing month on month, whereas bandwidth has been decreasing. (We launched these changes at the end of June, where you see a 6% decrease in data usage. During this period we saw that our Android app was replaying events, so we rolled back, assuming something wasn’t right in V2. Upon investigating, we found that the app was matching on the response body rather than on the response status from the server; admittedly, that is not a very elegant way of parsing a server response. To avoid the conflict we kept V1 on 200 and V2 on 204, and exonerated V2 of any wrongdoing. The rollback therefore contributed to the increased data usage in July, a discrepancy in the data from the period when we were investigating.)
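The replay issue comes down to how a client decides that an event was delivered. The sketch below contrasts the two approaches; it is written in Ruby for consistency with the rest of this post rather than as the Android app's actual code, and the method names and the "ok" body marker are hypothetical.

```ruby
# Fragile: treats the event as delivered only if the body contains a marker.
# A 204 response has no body, so this check fails and the event is replayed.
def delivered_by_body?(body)
  body.to_s.include?("ok")
end

# Robust: any 2xx status counts as success, with or without a body.
def delivered_by_status?(status_code)
  (200..299).cover?(status_code.to_i)
end

# The client replays an event only when delivery was not confirmed.
def should_replay?(status_code)
  !delivered_by_status?(status_code)
end
```

With the status-based check, a server can move from 200 to 204 without the client noticing any difference.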
We learned many important lessons during this rewrite of our analytics API: good coding practices allow for better performance, and seemingly small changes can have a major impact.
Originally published at geeks.wego.com on February 1, 2018.