Metrics Driven Development – How did I reduce AWS EC2 costs to 27% and improve 25% in latency

TL;DR Recently, I did some work related to auto scaling and performance tuning. As a result, the costs reduced to 27% and service latency improved 25%.

Overall Instance Count And Service Latency Change


  • React Server Side Render performs not good under Nodejs Cluster, consider using a reverse proxy, e.g. Nginx
  • React V16 Server Side Render performs much faster than V15, 40% in our case
  • Use smaller instances to get better scaling granularity if possible, e.g. change C4.2xLarge to C4.Large
  • AWS t2.large performs 3 times slower than C4.large on React Server Side Render
  • AWS Lambda performs 3 times slower than C4.large on React Server Side Render
  • There’s a race condition in Nginx http upstream keep alive module which generates 502 Bad Gateway errors (104 connection reset by peer)


Here’s the background of the service before optimization:

Aws Lambda retry behaviours on stream-based event sources

From the documentation, AWS Lambda will retry failed function on stream-based events sources.

By using Node.js, we can fail the function by many different ways, e.g. using callback to return error, throw exception directly, throw exception inside Promise, using Promise.reject. Then the questions is, what’s the proper way to let AWS Lambda know it needs a retry?

I did an quick test on following scenarios by setting up DynamoDB Stream and event mappings. It’s fun to have a guess which one will be retried and which one won’t.

Different ways to end the function

  • On Exception
module.exports.throwException = (event, context) => {
  throw new Error('something wrong');

Continue reading “Aws Lambda retry behaviours on stream-based event sources”