Producer/Consumer queues with Reserved and Spot Instances

Introduction

Amazon EC2 is AWS’s pay-per-use web service that provides flexible cloud computing power to everyone, whether you are an individual just beginning to explore the cloud or you work for a multi-billion-dollar corporation.

EC2 is offered in a wide range of instance types and sizes, each optimized for specific situations. There are also different ways to pay for them, so you can choose the most beneficial option for your particular use case.

In this guide, we’ll take a look at those payment options and use a simple but effective, well-known computational problem to investigate how the different payment options can work together in different situations. The problem is the ‘producer-consumer queue problem’: a queue constantly grows with information provided by producer nodes, while consumer nodes take items from the queue and process them. The objective is to gain hands-on experience with AWS EC2 by understanding how the different payment offers fit together and work alongside other well-known services (such as SQS, Auto Scaling, and CloudWatch).

Requirements

For this guide, you will need:

  • A valid AWS account

    • If you follow the steps within this guide, you should expect to spend less than $2.00 (USD) depending on how many instances you run and how long you run them. Tip: always remember to create a billing alarm in your AWS Account so that you aren’t surprised with unwanted charges.

    • An IAM user with valid credentials to launch EC2 On-Demand and Spot Instances, check CloudWatch metrics, and create and read from an SQS queue.

    • A VPC and subnets set up.

    • You’ll need to know how to launch an Instance and how to create an AMI.

  • An SSH terminal so you can log in to your instances and run commands

    • Linux or Mac OS terminal, or Putty if you’re on Windows.
  • Text editor

    • Atom, Nano, VIM, TextEdit, or Notepad will do, but I recommend PyCharm (there is a community edition) for editing the *.py files.
  • Python 2.7+ installed

    • I used 2.7 to run these commands.
  • AWS SDK for Python

    • You can find the step-by-step installation here.
  • AWS CLI

    • You can find the step-by-step installation here. I highly recommend choosing the pip install method.

    • Remember to create a credentials file so that you do not put your Access Key/Secret Access Key in any script. Personally, I like to create a named profile for each AWS account, which makes the AWS CLI easier to use.
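As a sketch, a named profile in ~/.aws/credentials looks like the following (the profile name matches the scripts later in this guide, and the key values below are AWS’s documented example placeholders, not real credentials):

```ini
[yourprofilename]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
```

You can then select the profile with aws --profile yourprofilename, or in boto3 with boto3.Session(profile_name='yourprofilename').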

Concepts

Let’s quickly look over the payment options offered by AWS. As of Jan 2017, AWS lists three payment options:

  1. On-Demand Instances (OD): The simplest pay-per-use method. You pay a standard hourly rate while an instance runs. This is fairly straightforward: each instance type/size has a rate associated with it, and you get billed at the end of the month for how many hours of each type/size you used.

  2. Reserved Instances (RI): If you already know that you’ll need a particular instance type/size for at least one year, you make a request to reserve it and get better rates in exchange. This may be a great deal: you can save up to 75% by reserving for 12 or 36 months (sorry, nothing in between) and choose to pay all-upfront, partial-upfront or even with no-upfront (depending on your account history). This can be very efficient, especially when you already know the minimum load that your application requires.

  3. Spot Instances (SI): Much like a stock market, you place bids for what you’re interested in running. Each instance’s price fluctuates based on supply-demand. Depending on the current price of an instance, you can get the best deals: it is common to see savings of up to 80%, not rare to see 90% and, in some cases, the savings can be even greater.
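To make the comparison concrete, here is a back-of-the-envelope sketch using purely illustrative hourly rates (these are not real AWS prices; the discount factors are simply picked from the ranges above):

```python
# Back-of-the-envelope monthly cost comparison using illustrative rates.
# None of these numbers are real AWS prices; they only mirror the
# discount ranges described above (up to ~75% for RI, ~80% for SI).
HOURS_PER_MONTH = 730  # average number of hours in a month

on_demand_rate = 0.10                  # $/hour, hypothetical OD rate
reserved_rate = on_demand_rate * 0.40  # assuming a 60% RI discount
spot_rate = on_demand_rate * 0.20      # assuming an 80% SI discount

def monthly_cost(rate, hours=HOURS_PER_MONTH):
    """Cost of one instance running at `rate` $/hour for `hours` hours."""
    return rate * hours

print("On-Demand: ${0:.2f}".format(monthly_cost(on_demand_rate)))
print("Reserved:  ${0:.2f}".format(monthly_cost(reserved_rate)))
print("Spot:      ${0:.2f}".format(monthly_cost(spot_rate)))
```

Running it prints roughly $73.00, $29.20 and $14.60 per month, respectively — a gap that only grows as your fleet does.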

Amazon also lists a 4th payment option, Dedicated Hosts, which is a way of reserving a physical host and saving money in some cases. Those cases are often license-related: for instance, you can plan a migration if your company already has a license (MS Windows Server, MS SQL Server, SUSE Enterprise Server, etc.) that restricts the software to a particular VM, or to a specific number of sockets or cores. When running your software on Dedicated Hosts, you gain visibility and control over the placement of instances on dedicated hardware, making it easier to meet regulatory and legal requirements. However, for practical reasons, I will not go further into it in this guide.

The 3 aforementioned payment methods have, of course, some subtleties involved, but I’ll highlight just one for now: RIs are (and I stress this) really just a payment method, while SIs are instances themselves.

In practical terms, this means that if you are running OD instances and realize that RIs can save you money, you just need to place RI requests matching those OD instances’ attributes (the most important being instance type, size, and operating system) and you’re done. There is no downtime, since no technical change is required. SIs, however, represent instances that need to be spun up and down. That means you should design your application to make the best use of them: be prepared to launch new instances automatically and, most importantly, design your application to be resilient when an SI is terminated without your consent. The latter happens when the current instance price goes above your bid, meaning you are not willing to pay more than a specific rate, so AWS shuts your instances down. It may not be clear at first how this can benefit your application, but once you understand it, you may be able to save a lot.
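One concrete way to prepare for that termination: AWS publishes a two-minute warning through the instance metadata service before reclaiming a Spot Instance. Below is a minimal polling sketch — the metadata URL is the real documented endpoint, but what your worker does on a notice (e.g. stop taking new messages and drain) is left to your application:

```python
# Poll the EC2 instance metadata service for the Spot two-minute
# termination notice. The endpoint answers HTTP 200 with a timestamp
# when termination is scheduled; otherwise it returns 404 (and it is
# unreachable when not running on EC2 at all).
try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2

TERMINATION_URL = ('http://169.254.169.254'
                   '/latest/meta-data/spot/termination-time')

def termination_scheduled(url=TERMINATION_URL, timeout=1):
    """Return True if a Spot termination notice is currently posted."""
    try:
        urlopen(url, timeout=timeout)
        return True   # 200 OK: a termination timestamp was returned
    except Exception:
        return False  # 404, timeout, or not running on EC2
```

A worker would call termination_scheduled() between queue messages and, when it returns True, finish the current message and stop polling for new ones.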

For our example, we’ll also need to use other AWS resources. For the sake of simplicity, we will not focus on the details of these. Instead, I’ll just list their overall definitions so that someone with no prior AWS knowledge may understand what is happening.

The first one is Simple Queue Service (SQS), one of the first publicly available AWS services, which provides a scalable and fully managed message queuing service. Imagine a queue shared across a distributed system by different nodes, without you having to spend the time and effort of setting up and operating a queue cluster yourself.

The second one is Auto Scaling (AS), a utility service that helps you automatically spin your EC2 Instances up and down based on metrics and thresholds that you define.

The last one is CloudWatch, a monitoring service that keeps track of several metrics for AWS resources, and which can also collect custom metrics.

Hands On

Our problem consists of a simple queue that should be fed by a producer worker (represented here as a single EC2 instance, which could easily grow to several instances) and consumed by a consumer worker (represented here by multiple instances).

[Diagram: a producer instance feeding an SQS queue that is consumed by multiple worker instances]


The diagram above illustrates what we are building. The producer constantly feeds the queue with new items. Each queue item is a random integer token, which tells the worker to sleep for the specified time before polling the next object from the queue. While it sleeps, other workers may pop the next element off the queue and process it (in this case, sleep for some time). The randomness means the queue may grow or shrink at a variable rate, depending on how long each worker is kept waiting. This toy example actually represents a real-world class of problems that appears in different situations, such as:

  • An e-commerce system with a queue handling the orders placed. Each order can take a random time to be processed, depending on the number of checks performed by the system;

  • A batch report system that fills several reports in sequence. Each report may be slowed depending on the size of its dataset and its complexity, but many reports can be generated and sent in parallel.

Let’s start by creating our queue. The following python script file (QueueManager.py) will do.


#!/usr/bin/python

import boto3

# Assuming that you have a valid named profile configured in
# your ~/.aws/credentials
session = boto3.Session(profile_name='yourprofilename')
sqs = session.client('sqs')

# Create a new queue named 'myQueue'
queue = sqs.create_queue(QueueName='myQueue')

# We can now access identifiers and attributes
print(queue)

Now we need to create the QueueProducer.py, which will keep posting messages to the queue until manually stopped.


#!/usr/bin/python

import boto3
from random import randint
from time import sleep

# Assuming that you have a valid named profile configured in
# your ~/.aws/credentials
session = boto3.Session(profile_name='yourprofilename')
sqs = session.resource('sqs')

# Gets 'myQueue' queue as an object
queue = sqs.get_queue_by_name(QueueName='myQueue')

# Let's initiate an infinite loop so that it keeps running
# until the process is killed
while True:
    # Creates a new message with a random integer as value
    integerValue = randint(5, 10)
    message = queue.send_message(MessageBody=str(integerValue))

    # Checking the message created
    print("\tContent: {0}".format(str(integerValue)))
    print("\tMessageId created: {0}".format(message.get('MessageId')))
    print("\tMD5 created: {0}".format(message.get('MD5OfMessageBody')))

    # Sleeps for some time (in seconds) before creating another one
    sleep(0.5)

And finally, let’s create the QueueConsumer.py, which will keep polling the queue for new messages. For each message, it reads the content and, if it is a valid integer, sleeps the thread for the specified amount of time.


#!/usr/bin/python

import boto3
from time import sleep

# Just a helper function to check if
# a string represents an integer
def check_integer(a_str):
    try:
        int(a_str)
        return True
    except ValueError:
        return False

# Assuming that you have a valid named profile configured in
# your ~/.aws/credentials
session = boto3.Session(profile_name='yourprofilename')
sqs = session.resource('sqs')

# Gets 'myQueue' queue as an object
queue = sqs.get_queue_by_name(QueueName='myQueue')

while True:
    # Process messages
    for message in queue.receive_messages():
        if check_integer(message.body):
            value = int(message.body)
            print('Waiting for {0} seconds'.format(message.body))

            # Sleeps for the specified time in the message
            sleep(value)
        else:
            print('Invalid value: "{0}"'.format(message.body))

        # Deletes message from queue
        message.delete()

Great. After running QueueManager.py once, we’ll have a valid SQS queue, and we can run QueueProducer.py and QueueConsumer.py to, respectively, create and consume messages. Run them on your local computer to confirm that everything is OK before launching an EC2 instance. When you do, you’ll probably notice that the producer fills the queue much faster than the consumer drains it, which would lead to a huge queue after some time. The first step to solve this is to run several copies of QueueConsumer.py concurrently so that messages are consumed faster than they are created. Of course, that would soon hit your local computer’s limits, and that’s exactly where we see the beauty of cloud scalability.
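Before launching anything, you can see the effect of adding consumers with a purely local stand-in: the sketch below swaps SQS for Python’s own multiprocessing queue (no AWS calls, and the token sleep times are scaled down 1000x) just to illustrate several workers draining one queue in parallel.

```python
# Local stand-in for the producer/consumer pair: multiprocessing.Queue
# plays the role of SQS so everything runs on your own machine.
import multiprocessing
from time import sleep

def worker(in_queue, done_queue):
    """Consume tokens until a None sentinel arrives."""
    while True:
        token = in_queue.get()
        if token is None:
            break
        sleep(token / 1000.0)   # scaled-down "processing" time
        done_queue.put(token)   # report the token as processed

def drain(tokens, n_workers=4):
    """Run n_workers consumers over a local queue; return processed tokens."""
    in_queue = multiprocessing.Queue()
    done_queue = multiprocessing.Queue()
    for token in tokens:
        in_queue.put(token)
    for _ in range(n_workers):
        in_queue.put(None)      # one shutdown sentinel per worker
    workers = [multiprocessing.Process(target=worker,
                                       args=(in_queue, done_queue))
               for _ in range(n_workers)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    return sorted(done_queue.get() for _ in tokens)

if __name__ == '__main__':
    print(drain(list(range(5, 15))))   # 10 tokens, 4 workers
```

With n_workers=1 the drain time is roughly the sum of all tokens; with 4 workers it approaches a quarter of that — exactly the lever that Auto Scaling will pull for us in the cloud.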

First, launch an Amazon Linux t2.nano instance. Remember to attach a security group with the port 22 opened. Log in via SSH to your newly created EC2 instance and install boto3 by running:

sudo pip install boto3

Create a QueueConsumer.py file (nano QueueConsumer.py) in your home directory and paste in the content that you previously tested locally.
Remember to also configure your AWS credentials on your EC2 instance by editing the ~/.aws/credentials file. We could make use of IAM roles and policies, but for the sake of simplicity, let’s use your IAM user credentials. In the real world, you should not save credentials on instances and should always use IAM roles when possible.

Test your script by running:

/usr/bin/python /home/ec2-user/QueueConsumer.py

If you have some messages in your queue, the prompt should print the “Waiting for .. seconds” message and you’ll know that it is OK. Now we must make sure that this script starts automatically when the instance boots. Edit crontab (remember to run as sudo) and insert the @reboot entry below:

sudo crontab -e
@reboot /usr/bin/python /home/ec2-user/QueueConsumer.py

A good way to verify everything is by shutting down the instance and starting it back again. While doing this, keep an eye on your SQS Console:

image2.png

After restarting the instance, you should refresh the SQS listing to check that the Messages Available in the queue are actually decreasing – meaning your instance is successfully reading them.

Now create an AMI based on your instance. By doing it now, you are baking QueueConsumer.py and the crontab entry into the AMI, meaning that all future instances launched from this AMI will behave the same way. From the EC2 instance listing (remember to switch to the correct region), select your instance, stop it (if still running), and right-click it to select Image -> Create Image, pictured below:

image3.png

Give it a meaningful name and a nice description (make it a habit!):

image4.png

After hitting the “Create Image” button, you can click the “Check status” link that is shown to monitor the image creation. When it’s finished, the “Status” will change to “available”.

image5.png
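The same step can also be scripted. Here is a hedged sketch of the equivalent boto3 call — the instance id is a placeholder, and the actual API call is left commented out so nothing runs against your account by accident:

```python
# Script equivalent of the console's Image -> Create Image step.
# The instance id below is a placeholder; uncomment the boto3 lines
# to run it for real (the EC2 API action behind it is CreateImage).

def build_create_image_request(instance_id, name, description):
    """Assemble the parameters for ec2.create_image()."""
    return {
        'InstanceId': instance_id,
        'Name': name,
        'Description': description,
    }

params = build_create_image_request(
    instance_id='i-0123456789abcdef0',  # placeholder id
    name='queue-consumer-ami',
    description='Amazon Linux with QueueConsumer.py and crontab entry')

# import boto3
# session = boto3.Session(profile_name='yourprofilename')
# ec2 = session.client('ec2')
# print(ec2.create_image(**params)['ImageId'])
```

Scripting this becomes handy once you rebuild the AMI regularly, e.g. every time QueueConsumer.py changes.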

Now we need to create a Launch Configuration, which holds the information needed to launch more instances based on your newly created AMI through the Auto Scaling Group we’ll create later. On the left side of your EC2 console, click “Launch Configurations” and then the “Create Auto Scaling group” button. Click the “Create launch configuration” button and, on the next screen, select “My AMIs” in the left menu. You should see the AMI you just created.

image6.png

Select your AMI and carry on through the next steps; they are very similar to launching a single EC2 instance. Remember that all parameters selected for the Launch Configuration will be replicated to the EC2 instances it creates.

When you finish creating the Launch Configuration, the “Create Auto Scaling Group” wizard will pop up, based on your Launch Configuration name:

image7.png

Remember to set the “Group size” to “Start with 0 instances”. Tip: Select more than one subnet, based on different AZ’s, for a Highly Available architecture. This is very useful for bigger projects.

On the “Configure Scaling Policies” step, change the scale limits to be between 0 and 10. We’ll need two scaling policies here; both are simple, so I have skipped the detailed steps. Name them “myScalingPolicy” and “myScalingDownPolicy” (or names you find more appropriate), do not select any alarm, and have both wait 60 seconds before taking another action. The first policy should add 1 instance, and the second should set the group to 0 instances. In real situations, it is important to tune the thresholds and policy limits to find the optimal setting for your auto scaling group. After editing, your screen should look like:

image8.png
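For reference, the two policies above map onto the Auto Scaling put_scaling_policy API. A sketch of the request parameters follows — the group name is assumed to match whatever you named your group, and the actual calls are commented out:

```python
# The two scaling policies from the wizard, expressed as API parameters.
scale_up_policy = {
    'AutoScalingGroupName': 'myAutoScalingGroup',  # your group's name
    'PolicyName': 'myScalingPolicy',
    'AdjustmentType': 'ChangeInCapacity',
    'ScalingAdjustment': 1,    # add 1 instance
    'Cooldown': 60,            # wait 60s before acting again
}
scale_down_policy = {
    'AutoScalingGroupName': 'myAutoScalingGroup',
    'PolicyName': 'myScalingDownPolicy',
    'AdjustmentType': 'ExactCapacity',
    'ScalingAdjustment': 0,    # set the group to 0 instances
    'Cooldown': 60,
}

# import boto3
# session = boto3.Session(profile_name='yourprofilename')
# autoscaling = session.client('autoscaling')
# for policy in (scale_up_policy, scale_down_policy):
#     print(autoscaling.put_scaling_policy(**policy)['PolicyARN'])
```

‘ChangeInCapacity’ adjusts relative to the current group size, while ‘ExactCapacity’ pins it to an absolute value — which is why the scale-down policy can jump straight back to 0 instances.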

In “Configure Notifications,” you may create a topic to get notifications when instances are launched or terminated, or fail to launch or terminate. On “Configure Tags,” create a key “Name” with value “linuxacademy-worker-based-on-asg” and make sure “Tag New Instances” is selected. By doing this, all new instances launched by your Auto Scaling Group will inherit this Name tag (making it easier to manage them later). Tagging instances is good practice and should be a habit.

Now navigate to your CloudWatch console, select “Metrics” in the left menu, click the SQS metrics group in the main panel, click “Queue Metrics,” and select the “ApproximateNumberOfMessagesVisible” metric.

image9.png

When clicking on the “Graphed metrics” tab, you’ll see options for your monitoring. Make sure to change the Statistic to “Sum” and the Period to “5 minutes”:

image10.png

On the “Actions” column, click the bell icon (the “Create Alarm” icon). It will open the “Create Alarm” pop-up. You can create a simple notification for when the alarm enters the Alarm state and another for the OK state. For testing, you can create it using your email (you’ll receive a confirmation email that you must accept before receiving notifications). But the real settings will look like the following, where the auto scaling group launches 1 instance every time the CloudWatch alarm is triggered, and sets the instance count to 0 when the alarm returns to the OK status.

image11.png
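The same alarm can be described programmatically through CloudWatch’s put_metric_alarm API. Here is a sketch of the parameters matching the settings above — the threshold and the two action ARNs are placeholders you would replace with your own values:

```python
# The SQS backlog alarm expressed as put_metric_alarm parameters.
# Statistic/Period match the "Sum over 5 minutes" choice made above;
# the threshold and the ARNs are placeholders.
alarm = {
    'AlarmName': 'myQueue-backlog',
    'Namespace': 'AWS/SQS',
    'MetricName': 'ApproximateNumberOfMessagesVisible',
    'Dimensions': [{'Name': 'QueueName', 'Value': 'myQueue'}],
    'Statistic': 'Sum',
    'Period': 300,                 # 5 minutes, in seconds
    'EvaluationPeriods': 1,
    'Threshold': 100,              # illustrative backlog size
    'ComparisonOperator': 'GreaterThanOrEqualToThreshold',
    'AlarmActions': ['arn-of-myScalingPolicy'],    # placeholder ARN
    'OKActions': ['arn-of-myScalingDownPolicy'],   # placeholder ARN
}

# import boto3
# session = boto3.Session(profile_name='yourprofilename')
# session.client('cloudwatch').put_metric_alarm(**alarm)
```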

At this point, we already have a fully managed, scalable queue that receives messages and creates new instances to consume them when the load exceeds a certain threshold.

As we are only dealing with a small group of t2.nano instances for such a short period, our AWS bill shouldn’t be more than $1 USD (as long as you don’t keep playing with the queue and the auto scaling group). In bigger applications, an auto scaling group can handle tens of larger instances that process more data for longer periods, which causes a significant cost increase. For such situations, Spot Instances help alleviate some of the cost.

Let’s create another Launch Configuration, but this time with Spot requests. Navigate to the EC2 console and click “Launch Configurations”. Select the one you created earlier, right-click it, and select “Copy launch configuration”. You’ll see the review page, but do not confirm it yet. Click the “Choose Instance Type” step and select a larger instance, like an m4.2xlarge. Click next (and confirm that you are actually changing the type). On “Configure details”, change the name of your new Launch Configuration, check the “Request Spot Instances” box, and take a look at the current Spot prices for this instance. You should see something like:

image12.png

If you are in the us-west-2 region, you’d pay $0.44/hour for an m4.2xlarge instance, with the 8 vCPUs and 32 GB of RAM it brings (plus being EBS-optimized), but with floating Spot prices we can save around 75% of that cost. A t2.nano costs only $0.006/hour, but it brings a single vCPU and 0.5 GB of RAM. It seems a pretty good deal! So, set a reasonable maximum price; for this example, I set $0.12.

Now, let’s update the auto scaling group: select “Auto Scaling Groups” in the left panel, right-click your existing auto scaling group, and select “Edit”. In the bottom panel, select the name of your newly created Launch Configuration in the Launch Configuration combo box.

image13.png

And now, every time the alarm is triggered, our auto scaling group will add an m4.2xlarge instance if the Spot price is below $0.12, getting a much bigger instance for about 25% of its On-Demand price. But what if the price goes beyond this limit? The request would not be fulfilled, no instances would be launched, and your queue would keep growing. This is why a hybrid approach is a good fit for this kind of situation: if you keep one instance constantly running (for example, a t2.nano), you will always have a worker node consuming your queue. When the alarm triggers (meaning a large number of new requests came in and the single worker could not process them quickly), a bunch of Spot Instances would be launched. Taking this cost-conscious approach further, you could buy an RI for that 24/7 instance, achieving an auto-scalable, cheap, elegant solution for handling a fully managed queue in the cloud.
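A quick back-of-the-envelope calculation shows why the hybrid shape pays off, using the rates discussed above (t2.nano at $0.006/hour, m4.2xlarge at $0.44 On-Demand versus the $0.12 Spot cap). The number of burst hours is hypothetical, and the further RI discount on the baseline instance is ignored:

```python
# Rough monthly cost of the hybrid design: one always-on t2.nano plus
# m4.2xlarge burst capacity. Burst hours are a hypothetical workload.
HOURS_PER_MONTH = 730

baseline = 0.006 * HOURS_PER_MONTH    # 24/7 t2.nano worker
burst_hours = 50                      # hypothetical monthly burst usage
spot_burst = 0.12 * burst_hours       # bursts at the Spot price cap
od_burst = 0.44 * burst_hours         # the same bursts On-Demand

print('Hybrid (nano + Spot bursts): ${0:.2f}'.format(baseline + spot_burst))
print('Nano + On-Demand bursts:     ${0:.2f}'.format(baseline + od_burst))
```

With these illustrative numbers the Spot-backed variant costs about $10.38 against $26.38 per month, and the gap widens as burst hours grow.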

Wrap up

In this guide, we listed the different payment methods that AWS offers for EC2 instances and their main use cases. We also explored a classic computational problem by implementing a solution with a mix of Spot Instances and On-Demand instances, automatically launched and terminated by an auto scaling group that is triggered by a CloudWatch alarm monitoring an SQS queue.
