This is Embarrassing: More About Today’s Outage
Today, we experienced an unscheduled outage of a little over 2 hours. That is highly unusual for Pure Chat, and we thought it was worth providing a little more explanation…

Some Background

Over the past 5 months, we have been carefully and methodically learning and planning how to migrate Pure Chat’s entire infrastructure from dedicated servers to virtual machines hosted on Amazon Web Services (AWS). On December 29th, we finally pulled the trigger and made this huge move.

The transition was naturally quite a large undertaking. Pure Chat has oodles of services, sites, and supporting infrastructure to serve up our home page, dashboard, mobile apps, visitor tracking, and APIs. All of this technology works together to deliver an awesome, seamless experience. Our transition to AWS opens up completely new possibilities for scalability, reliability, and new features for Pure Chat, so it’s super exciting!

What Happened Today

Managing infrastructure and software inside of AWS has some unique challenges that we are still working through. Today, we encountered a perfect storm of them. We deployed a new version of our software to increase database query performance on a critical piece of code. Even though the new code had been tested in our staging environment, Amazon Elastic Beanstalk was not happy with our change in production, to say the least. It deployed the new version of our code to some of the...