Ask, approve and act on AWS infrastructure within Slack

Change an EC2 instance type from within Slack after getting approval.

Dan Moore
Nov 25th, 2019

Wrapping up commonly used commands into a Slackbot can be a great way to make complex processes accessible to more users. However, what happens if the process is potentially damaging? In that case, you may want an approval process.

All the better if such an approval process happens within Slack. Doing so will leverage the communications platform everyone is using as well as increase ambient awareness of what’s happening. And both the approver and requestor can use whatever means they want to access Slack (web, desktop, mobile).

Scale an EC2 instance

I just finished a sample application (check it out) which lets you scale an EC2 instance up and down within the comfort of your Slack command line. Because this is a destructive process, there’s an approval step as well.

approvalbot resize-ec2-instance i-xxxxxx approver @ApproverUser

This command will, if approved, resize the instance whose ID is passed to it. It is also smart enough to use the credentials of the user who requested the change, not the approver, and to prevent anyone except the named approver from clicking the approve button.
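To make the approver check concrete, here is a minimal Python sketch of parsing the command above and deciding who may click the button. It is not the sample application's code: the names `ResizeRequest`, `parse_resize_command` and `may_approve` are hypothetical, and comparing plain user names is a simplification (Slack typically delivers mentions as user IDs, so a real bot would compare those).

```python
import re
from dataclasses import dataclass

# Assumed command shape, taken from the example above:
#   approvalbot resize-ec2-instance <instance-id> approver @<user>
COMMAND_RE = re.compile(
    r"^approvalbot\s+resize-ec2-instance\s+(?P<instance_id>i-[0-9a-zA-Z]+)"
    r"\s+approver\s+@(?P<approver>\S+)$"
)


@dataclass(frozen=True)
class ResizeRequest:
    instance_id: str
    requester: str  # the user who typed the command; their credentials are used
    approver: str   # the only user allowed to approve


def parse_resize_command(text: str, requester: str) -> ResizeRequest:
    """Turn the chat message into a structured request, or raise ValueError."""
    match = COMMAND_RE.match(text.strip())
    if match is None:
        raise ValueError(
            "expected: approvalbot resize-ec2-instance <instance-id> approver @<user>"
        )
    return ResizeRequest(match["instance_id"], requester, match["approver"])


def may_approve(request: ResizeRequest, clicking_user: str) -> bool:
    """Only the approver named in the command may approve, not the requester."""
    return clicking_user == request.approver
```

The bot would call `may_approve` when a button click arrives and ignore (or politely reject) clicks from anyone else.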

A system of record

What I like about this is that it brings an action (“restart the EC2 instance”) entirely into Slack, where it can be tracked and observed by everyone. If everyone can observe it, everyone can learn from it. The fact that this action was taken based on a discussion or other events that are also recorded in Slack is a great, low-effort way to educate other team members. The approval process is also fast, simple and clearly defined: the approver clicks a button before the action is taken, and can do that from wherever they access Slack.

An alternative system of record for AWS changes is infrastructure as code: a tool such as Terraform, a PR process, and deployment with a CI/CD tool like Jenkins. Both serve similar purposes; the CI/CD solution is more bulletproof but less accessible, less searchable and less visible to other team members. (Where do you hang out more? In Slack or on GitHub?) And of course you could write an integration with GitHub to have the instance type change create a commit in Terraform rather than directly access the APIs, getting the best of both worlds.

Next steps

Of course, this example is a bit contrived. Unless I were sure the EC2 instances were behind a load balancer, or were not receiving traffic, I wouldn’t want to resize them, as that would impact the end user.
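The reason a resize is disruptive is that an EBS-backed instance has to be stopped before its type can be changed. A rough sketch of that sequence with boto3 follows; it is an illustration, not the sample application's actual code. The function name `resize_instance` is hypothetical, and `ec2` is assumed to be a boto3 EC2 client created from the requester's credentials.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> None:
    """Stop the instance, change its type, then start it again.

    `ec2` is assumed to be a boto3 EC2 client, for example
    boto3.Session(...).client("ec2"), built with the requester's credentials.
    """
    # The instance is unavailable from here until it is running again.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # The type can only be modified while the instance is stopped.
    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
```

Passing the client in as a parameter keeps the credentials decision (requester versus approver) outside the function and makes the sequence easy to exercise against a stub.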

While there is value in being able to resize an EC2 instance from within Slack for the documentation and learning benefits, the real power comes when you enable more complicated, semi-automated workflows and hide their complexity behind a Slack facade. Other tasks that require human decision making in the loop but are complex enough to automate include:

  • standing up a system with a new software version and running a performance test on it. If the test meets performance criteria, promote the code. If it doesn’t, message someone and have them decide on next steps. These could vary depending on the performance report: if there is a small performance impact, maybe run the tests again. If there is a large delta, the code could get sent back for a code review.
  • making a copy of a portion of your production database (asking the initiator how big a copy to make, based on the use case), scrubbing the data and standing it up for issue troubleshooting.
  • gathering information from a subsystem of an application, chosen by a user, including log files from EC2 instances, performance graphs from Grafana and user behavior from Google Analytics, to assist in troubleshooting an outage.
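As an illustration of the first workflow above, the bot could compute a suggested next step from the performance report and post it to Slack for a human to confirm or override. This is only a sketch: the function name, the use of latency in milliseconds, and the 2% and 10% thresholds are all assumptions chosen for the example.

```python
from enum import Enum


class NextStep(Enum):
    PROMOTE = "promote the code"
    RERUN_TESTS = "run the tests again"
    CODE_REVIEW = "send back for code review"


def suggest_next_step(
    baseline_ms: float,
    candidate_ms: float,
    tolerated: float = 0.02,         # assumed: up to 2% slower still passes
    small_regression: float = 0.10,  # assumed: up to 10% slower counts as "small"
) -> NextStep:
    """Suggest an action from a latency comparison; a human makes the final call."""
    if baseline_ms <= 0:
        raise ValueError("baseline_ms must be positive")
    # Relative slowdown of the new version compared with the baseline.
    delta = (candidate_ms - baseline_ms) / baseline_ms
    if delta <= tolerated:
        return NextStep.PROMOTE
    if delta <= small_regression:
        return NextStep.RERUN_TESTS
    return NextStep.CODE_REVIEW
```

For example, `suggest_next_step(100, 130)` suggests a code review, and the message in Slack would carry buttons letting the person on call accept or change that suggestion.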

Building a Slack mini app which lets you suggest an action, get approval and then take the action, all timestamped within Slack and using the credentials of the suggester to affect cloud infrastructure, is a powerful way to tame a complicated process or set of processes. If you want to explore this further, fork my sample application.