Week 07/09/2017 - 07/16/2017 Profiling & Performance Improvement

My primary focus during the past week was improving the performance of the event handling system on the Java server.

Context:
We use one thread to process user actions related to room transactions, for instance a user moving from one room to another. I added instrumentation to collect profiling data a few weeks ago and noticed that user disconnects take a long time to process, while other transactions on the thread usually finish within 5ms.

I added timers around the sections of code that process such events and ran profiling on production this week, and the result was astonishing: the majority of the time was spent on a synchronized block.
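For readers who want to try the same kind of instrumentation, here is a minimal sketch of a per-section timer. It is not the code from our server: `SectionTimer`, its methods, and the section names are hypothetical, and it assumes all timed sections run on a single thread (as they do on our room transaction thread), so it does no locking of its own.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: accumulates elapsed time per named code section. Not thread-safe. */
class SectionTimer {
    private final Map<String, Long> totalsNanos = new LinkedHashMap<>();

    /** Runs the section and adds its elapsed time to the running total for that name. */
    void time(String section, Runnable body) {
        long start = System.nanoTime();
        try {
            body.run();
        } finally {
            // Record the time even if the section throws.
            totalsNanos.merge(section, System.nanoTime() - start, Long::sum);
        }
    }

    /** Total time spent in the section so far, in milliseconds (0 if never timed). */
    double totalMillis(String section) {
        return totalsNanos.getOrDefault(section, 0L) / 1_000_000.0;
    }

    public static void main(String[] args) {
        SectionTimer timer = new SectionTimer();
        Object user = new Object(); // stand-in for the real user object

        // Section names are made up for illustration.
        timer.time("leaveRoom", () -> { /* room bookkeeping would go here */ });
        timer.time("synchronized(user)", () -> {
            synchronized (user) { /* in-memory user update would go here */ }
        });

        System.out.println("leaveRoom: " + timer.totalMillis("leaveRoom") + " ms");
        System.out.println("synchronized(user): " + timer.totalMillis("synchronized(user)") + " ms");
    }
}
```

Wrapping the whole `synchronized` statement, rather than only its body, is what makes lock wait time show up in the numbers.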

In short, we kick off a database update on a different thread and move on with the current thread, and the next thing the current thread does is synchronize on the user object. Unfortunately, the database updater job locks the user object as well, so the current thread blocks until the update finishes; essentially we're doing the database update on the same thread. And worst of all, the lock in the database updater is not even necessary.

Removing the unnecessary synchronized keyword gave me a big performance improvement, from an average of 25ms to 0.5ms. And by rolling this change out to production, I successfully reduced the server pool size by 20%.
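The contention described above is easy to reproduce in isolation. The sketch below is only an illustration under assumptions, not our production code: `LockContentionDemo`, `User`, and `simulateDbUpdate` are made-up names, and the database call is faked with `Thread.sleep`. It measures how long the calling thread waits to enter `synchronized (user)` after handing the update to another thread, with and without the lock in the updater.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class LockContentionDemo {
    /** Hypothetical stand-in for the user object. */
    static final class User {}

    /** Returns how long (ms) the caller waits for the user lock while a fake DB update runs. */
    static long waitForUserLockMillis(boolean updaterLocksUser, long dbMillis) {
        User user = new User();
        ExecutorService dbExecutor = Executors.newSingleThreadExecutor();
        CountDownLatch updaterStarted = new CountDownLatch(1);
        try {
            // Kick off the "database update" on a different thread.
            dbExecutor.submit(() -> {
                if (updaterLocksUser) {
                    synchronized (user) { // the unnecessary lock
                        updaterStarted.countDown();
                        simulateDbUpdate(dbMillis);
                    }
                } else {
                    updaterStarted.countDown();
                    simulateDbUpdate(dbMillis);
                }
            });
            // Wait until the updater is running so the demo is deterministic.
            updaterStarted.await();

            long start = System.nanoTime();
            synchronized (user) {
                // quick in-memory work on the user would go here
            }
            return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for updater", e);
        } finally {
            dbExecutor.shutdown(); // lets the running task finish, then stops the thread
        }
    }

    /** Fakes a slow database call by sleeping. */
    static void simulateDbUpdate(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        System.out.println("updater locks user:   " + waitForUserLockMillis(true, 200) + " ms");
        System.out.println("updater lock removed: " + waitForUserLockMillis(false, 200) + " ms");
    }
}
```

With the lock in place, the wait should be close to the full duration of the simulated update; with it removed, the calling thread should get the lock almost immediately. Whether a lock like this is safe to remove depends on what the updater actually touches, so treat this as a way to see the effect, not as a general recipe.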

Perf metric from Datadog:
[Figure: perf before and after release]
The details are redacted, but one can clearly see the big perf boost after the change was released.
