At Exotel, our platform is hosted on AWS. We have scaled exponentially over the last few years. What was a month's worth of transactions a few years back now happens in a single day!
What we use at Exotel
At Exotel, we use a variety of AWS services such as EC2, DynamoDB, ElastiCache and so on. EC2 instances give us great flexibility in deploying and managing the services we develop in-house. Managed services like Elasticsearch Service and Elastic MapReduce take care of cluster management and let the developer focus purely on application development.
The Managed Vs. Self Hosted Dilemma
As developers, one decision we often face is which option to choose: a managed AWS service such as Elasticsearch Service or Elastic MapReduce, or our own Elasticsearch or Hadoop cluster running on EC2 instances. This decision involves multiple factors: cost, capacity planning, infrastructure management, the level of support from AWS, ease of maintenance and some understanding of how the service actually works.
In this discussion, we consider two AWS services, namely Elastic MapReduce (EMR) and Elasticsearch Service (ES). We will analyse the pros and cons of using these managed services versus launching our own clusters. The goal is not to recommend one option over the other but to explain the various aspects that need to be considered when making the decision.
Know what is inside
AWS services are, no doubt, fantastic to use. One downside, though, is that we tend to treat them as black boxes and assume they will simply scale when we want them to. That, however, is a trap. Take EMR, for example. EMR is a managed AWS service that lets you launch a Hadoop or Spark cluster in a few minutes, run your Hadoop/Spark job and terminate the cluster when the job is completed. It also supports various connector libraries that let you ingest and process data stored in AWS storage services like S3 and DynamoDB, which is pretty cool. But if you look closely, EMR is hosting open source products such as Apache Hadoop and Spark, which were developed by hundreds or even thousands of contributors. So the code base is not entirely managed by AWS.
One good comparison here is running HBase on EMR vs. running DynamoDB (both are NoSQL data stores). If something goes wrong with DynamoDB, it is easy for AWS to investigate and fix the issue because they own the code base completely. If something goes wrong with HBase, it is probably not as easy for AWS to fix. Moreover, EMR supports only a single master node at the moment and does not support high availability, which is critical for an HBase cluster. So if the master goes down, your HBase cluster goes down with it.
Cost vs Ease of Use
In addition to the above, the EMR provisioning scheme is different from that of other AWS services. These are small details that you need to be wary of. When you launch a Hadoop or Spark cluster, EMR provisions the EC2 instances that make up your cluster in “your” AWS account. This means that if you launch a 100-node cluster but the EC2 instance quota in your account is 20, you will receive an error message saying “Exceeded your EC2 instance quota”. In other words, the user is still responsible for capacity planning and for making sure the account has sufficient EC2 instance quota before launching an EMR cluster.
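The quota check described above can be done before a launch is attempted. The following is a minimal sketch, assuming you already know your account's instance quota and how many instances are currently running; the function name and its parameters are hypothetical and are not part of any AWS API.

```python
def can_launch_cluster(requested_nodes, running_instances, instance_quota):
    """Return (ok, shortfall) for a cluster launch against an EC2 instance quota.

    All three inputs are plain integers supplied by the caller; nothing here
    talks to AWS.
    """
    available = instance_quota - running_instances
    shortfall = max(0, requested_nodes - available)
    return shortfall == 0, shortfall


# The example from the text: a 100-node cluster against a quota of 20.
ok, shortfall = can_launch_cluster(
    requested_nodes=100, running_instances=0, instance_quota=20
)
print(ok, shortfall)  # False 80
```

A shortfall greater than zero means a quota increase has to be requested before the cluster can be launched.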
The positive side is that, since the instances are launched in your account, you can use reserved EC2 instances that you have already purchased, or spot instances, to optimise the cost. From a cost perspective, EMR charges you over and above what you pay for the EC2 instance. For example, if you launch a single-node EMR cluster with an m4.xlarge EC2 instance, you will be paying $0.239 (EC2 charge) + $0.06 (EMR charge) = $0.299 per hour.
Now let us look at the AWS ES service. The ES service makes it really easy to launch an Elasticsearch cluster. Again, as in the case of EMR, this service hosts an open source product (Elasticsearch) on AWS infrastructure. But this service does a better job than EMR in provisioning EC2 instances. It does not use EC2 instances from your account to launch a cluster. Rather, the EC2 instances are owned by the ES service. So the ES service completely takes care of capacity planning.
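The hourly cost of an EMR cluster can be worked out as follows. This is a small sketch that uses the two m4.xlarge rates quoted above; the function name is illustrative, and the rates should be replaced with current pricing for your region and instance type.

```python
from decimal import Decimal

# Rates quoted in the text for m4.xlarge (USD per instance-hour).
EC2_HOURLY = Decimal("0.239")
EMR_HOURLY = Decimal("0.06")


def emr_hourly_cost(nodes, ec2_rate=EC2_HOURLY, emr_rate=EMR_HOURLY):
    """Total hourly cost: every node pays the EC2 rate plus the EMR surcharge."""
    return nodes * (ec2_rate + emr_rate)


print(emr_hourly_cost(1))  # 0.299
```

If reserved or spot instances are used, only `ec2_rate` changes in this sketch; the EMR charge is added on top either way.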
The downside, however, is that you cannot use reserved instances as you can in the case of EMR. This affects your costs negatively.
What we chose at Exotel
Here is our experience of using the ES service at Exotel. We started out using the ES service to launch our production Elasticsearch cluster. It was only later that we realised that the ES service does not support basic authentication (username/password) for access to Kibana (the Elasticsearch user interface). It supports an IP-based access policy, but the policy then needs to be updated every time the IP address changes. Although we can work around these limitations, it would be great if the ES service supported such a feature out of the box.
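To illustrate why an IP-based access policy is tedious to maintain, here is a sketch of how such a policy document could be generated. The policy follows the standard IAM JSON layout with an `aws:SourceIp` condition; the domain ARN, account ID and IP address are placeholders, and the helper name is hypothetical.

```python
import json


def ip_access_policy(domain_arn, allowed_ips):
    """Build a resource policy that allows access only from the given IPs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": "*"},
                "Action": "es:*",
                "Resource": f"{domain_arn}/*",
                "Condition": {"IpAddress": {"aws:SourceIp": sorted(allowed_ips)}},
            }
        ],
    }


# Placeholder ARN and a documentation-range IP address, not real values.
policy = ip_access_policy(
    "arn:aws:es:us-east-1:123456789012:domain/example-domain",
    ["203.0.113.10"],
)
print(json.dumps(policy, indent=2))
```

Every time an allowed address changes, the policy has to be regenerated and applied to the domain again, which is the maintenance burden described above.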
From a cost perspective, you will be paying roughly 1 – 1.5% of the EC2 instance price for every EC2 instance launched as part of an Elasticsearch cluster. The table below shows the cost difference between a cluster managed by the ES service and a self-managed Elasticsearch cluster. It clearly shows that running our own Elasticsearch cluster is more cost effective. At Exotel, gross margins are everyone’s business, even an engineer’s, so optimising cost is always a high priority. For this reason, we stopped using the ES service and launched our own Elasticsearch cluster.
The downside to this approach is that we need to set up our own monitoring and alerting to manage the Elasticsearch cluster, which the ES service provides out of the box. For high availability, the replication factor for the Elasticsearch cluster is set appropriately and there are multiple EC2 instances in the cluster, so a single node failure will not bring the entire cluster down. Note that the ES service offers additional features like automated snapshots, but our use case does not demand them.
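In Elasticsearch, the replication factor is the index setting `number_of_replicas`. The sketch below shows what such settings look like and gives a simplified estimate of how many data-node failures an index can survive without losing data. The shard, replica and node counts are illustrative assumptions, not Exotel's actual configuration, and the estimate ignores master election and cluster quorum.

```python
def index_settings(primary_shards, replicas):
    """Index settings body with an explicit replication factor."""
    return {
        "settings": {
            "number_of_shards": primary_shards,
            "number_of_replicas": replicas,
        }
    }


def node_failures_tolerated(data_nodes, replicas):
    """Simplified estimate of data-node failures survivable without data loss.

    Each shard has replicas + 1 copies, and copies of the same shard are never
    placed on the same node, so the number of copies is capped by the node count.
    """
    copies = min(replicas + 1, data_nodes)
    return copies - 1


print(index_settings(primary_shards=5, replicas=1))
print(node_failures_tolerated(data_nodes=3, replicas=1))  # 1
```

With a single node, the same function returns 0 regardless of the replica setting, which is why the replication factor and the number of instances have to be chosen together.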
| | Elasticsearch service | Self-managed Elasticsearch cluster |
|---|---|---|
| Reserved Instance | – | Yes (1-year term, paid all upfront) |
| EC2 instance pricing (per hour) | $0.980 | $0.353 |
| Cluster cost per month (31 days) | $2187.36 | $787.89 |
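The monthly figures in the table can be reproduced from the hourly prices. The sketch below assumes a three-instance cluster running around the clock; the instance count is not stated in the text and is inferred here because it is the value that makes the table's numbers work out.

```python
from decimal import Decimal

HOURS_PER_MONTH = 24 * 31  # the table uses a 31-day month


def monthly_cluster_cost(hourly_rate, nodes, hours=HOURS_PER_MONTH):
    """Monthly cost in USD for `nodes` instances billed at `hourly_rate`."""
    return Decimal(hourly_rate) * nodes * hours


# Hourly prices from the table; 3 nodes is an inferred assumption.
managed = monthly_cluster_cost("0.980", nodes=3)
self_managed = monthly_cluster_cost("0.353", nodes=3)
print(managed, self_managed, managed - self_managed)
```

This gives 2187.360 for the managed cluster and 787.896 for the self-managed one, which the table shows as $787.89.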
Overall, EMR and ES are great services and do a fabulous job of cluster management. But, as engineers ourselves, we believe you should know about these subtle but critical differences before using them. We hope this information helps you make a better decision about using these AWS services.