I’m working on a project which needs to fetch messages from hundreds of SQS queues. We’re using SQS long polling to reduce the number of empty responses. It was very quick to get response at first when there are only dozen queues. As we added more and more queues, the performance getting worse and worse. It takes 60 seconds to get the response when there’s 300 queues and WaitTimeSeconds set to 10 seconds.
We are using Node.js in single thread mode, and I believe that it could handle 10 thousands connections without any problem because most of the tasks are IO processing. We also created an AWS support case, but nobody clearly answered the question.
Using AWS SDK to reproduce the issue
I start to troubleshoot the problem, the first step is reproduce the issue using a simple code which makes it easier to find out the issue. Continue reading “Troubleshooting of blocked requests when fetching messages from hundreds SQS queues”