Cloud Security

Serverless 101: When It Makes Sense and When It Doesn’t

By Xiao Sun

[This post is Part One of a three-part series. Part Two will cover How to Develop Serverless Functions, and Part Three will cover Serverless Security.]

 

The use of serverless cloud services for enterprise applications continues to increase as new applications are developed that can leverage these new computing platforms. The benefits of serverless include on-demand access to compute resources without the expense or headache of managing the underlying infrastructure.

For example, consider this use case. You want to provide a service that lets a user upload and download documents. On upload, your service spell checks the document, converts it to PDF format, and saves it to a database. On download, it queries the database and sends the document back.

The Old Way – Do it On-Premises

In a typical deployment on your company network, you will need to provision or allocate a webserver (physical or virtual), deploy your application, and connect a database server to save the documents.

Doing it this way is not very scalable because you don’t know in advance whether your webserver can handle user demand. You may end up wasting resources, or being unable to process all service requests during a peak season or a special promotion.

If the infrastructure in your company is deployed using a modern Kubernetes cluster, pods can be dynamically scaled up or down for the service, and with the right configuration you may be able to address the scalability issue.  But there are still drawbacks.

On-premises solutions all face a common issue: the machine your application runs on needs to be maintained by you. You need to patch, upgrade, and defend the entire infrastructure and the applications running on it. The operating system is managed by you, the network infrastructure is maintained by you, and any breach in your network from any source can make your service vulnerable to exploits and attacks. Securing the entire infrastructure includes patching, monitoring, and firewalling. In addition, there are physical infrastructure issues such as equipment durability, upgrades, power supplies, and physical security. It’s a constant struggle to make sure the service is available 24/7 and is efficient and secure.

A Good Solution – Rent a VM/Container or Deploy a Kubernetes Managed Service    

One solution is to use cloud infrastructure services. For example, in AWS you can rent an EC2 instance, or if your company already has a Kubernetes production cluster in the cloud you can deploy the pods for your service there.

However, you will still need to maintain the VM's OS and make sure no attacker breaches the OS or runs other processes on it. But compared to an on-premises deployment, the attack surface is much smaller, and the maintenance responsibility is now largely shifted to your cloud provider. It is now the job of AWS to keep the underlying infrastructure running 24/7 and to let you scale up or down.

Sounds great, but can we do even better? What are the drawbacks of such a solution?

  1. You still need to maintain the operating system, which means knowing which OS to use, installing all dependencies, implementing virus protection, and patching vulnerabilities. This means you can’t just have application developers running code. You also need system admins to maintain, monitor, and secure the environment. And you still need to defend against attacks on the system, instead of just hardening your own code.
  2. If your infrastructure is running 24/7, it will cost you money even if it is unused or lightly used. And to make it worse, it’s open for probes, scanning, exploits and attacks as long as it is running.

Ideally, you’d want your service active ONLY when someone or something is using it.

A Better Solution – Serverless

This is where serverless computing comes in handy. Today there are two leading public serverless service providers, AWS Lambda and Microsoft Azure Functions.

We will describe in more detail how serverless works in the next post. For now, just think of serverless as the ability for you to provide only the application code needed to do the required functions: the PDF format translation, saving to a database, and responding to a query. The serverless cloud provider takes care of the rest. The advantages of this approach are clear.

1. You only develop application code for your service logic. You don’t need to know anything about the operating system, container build, networking, memory consumption, or CPU requirements. In the example below the user document will be part of an event.


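# pseudo code function to handle upload
# translate_to_pdf() and save_file_to_db() are placeholders for your own logic

import boto3
from boto3.dynamodb.conditions import Key

def upload_handler(event, context):
    # The uploaded document arrives as part of the event
    pdf_doc = translate_to_pdf(event["doc"])
    file_id = save_file_to_db(pdf_doc)
    # -1 signals that the save failed
    if file_id != -1:
        return "success"
    return "error"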


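# pseudo code function to handle download
# query_file() and get_file() are placeholders for your own logic

import boto3
from boto3.dynamodb.conditions import Key

def download_handler(event, context):
    # Look up the stored document by the file name carried in the event
    file_id = query_file(event["file_name"])
    # -1 signals that no matching file was found
    if file_id != -1:
        pdf_doc = get_file(file_id)
        return pdf_doc
    return "error"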

2. It is low cost, especially if your service is not heavily invoked by user or API requests. Because you pay per invocation, a lightly used service costs very little.

3. There are no system level administration tasks for you to perform. Scaling up/down is performed by the cloud provider.

4. You are charged only when your invocations or your database storage consumption go beyond the free tier. So it’s zero cost for you if no one is using it.
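
To make the pay-per-use billing described in the list above more concrete, here is a rough sketch of a monthly compute estimate. The prices and free-tier allowances are illustrative assumptions modeled on AWS Lambda's request and GB-second pricing, and they may not match current rates. The function name and its parameters are hypothetical, and database storage charges are left out.

```python
# Illustrative pricing assumptions (not authoritative; providers change prices).
PRICE_PER_MILLION_REQUESTS = 0.20   # USD, assumed
PRICE_PER_GB_SECOND = 0.0000166667  # USD, assumed
FREE_REQUESTS = 1_000_000           # assumed monthly free-tier requests
FREE_GB_SECONDS = 400_000           # assumed monthly free-tier compute


def estimate_monthly_cost(invocations, avg_duration_ms, memory_mb):
    """Estimate a monthly compute bill in USD under the assumptions above."""
    # Compute usage is measured as memory (GB) multiplied by run time (seconds).
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    # Only usage beyond the free tier is billed.
    billable_requests = max(0, invocations - FREE_REQUESTS)
    billable_gb_seconds = max(0.0, gb_seconds - FREE_GB_SECONDS)
    request_cost = billable_requests / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    compute_cost = billable_gb_seconds * PRICE_PER_GB_SECOND
    return round(request_cost + compute_cost, 2)


if __name__ == "__main__":
    # An unused service, a lightly used one, and a heavily used one.
    for calls in (0, 100_000, 50_000_000):
        print(calls, estimate_monthly_cost(calls, avg_duration_ms=200, memory_mb=512))
```

Under these assumed numbers, the unused and the lightly used service both stay inside the free tier, and only the heavily used one produces a bill.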

Serverless functions are stateless and execute specific logic for a very short time. The system is provisioned dynamically by the cloud provider for each running instance, and therefore the attack surface is very small.

Serverless functions are designed to provide a single service with limited runtime and complexity. You can combine a large number of these functions with other forms of services, like traditional webservers or container-based services, to build a large, complicated system. An example of such a system can be found at https://labouardy.com/
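
As a separate and much smaller illustration (not taken from the linked site), the sketch below shows the idea of composing a service out of single-purpose functions. It is a local stand-in only: the in-memory dictionary replaces the real database, and the `dispatch` function, the `action` field, and the response shapes are all hypothetical. In a real deployment the cloud provider's event sources would do the routing.

```python
import uuid

# In-memory stand-in for the document database (assumption, for local runs only).
_DOCUMENTS = {}


def upload_handler(event, context):
    """Store the document carried in the event and return its id."""
    file_id = str(uuid.uuid4())
    _DOCUMENTS[file_id] = {"name": event["file_name"], "doc": event["doc"]}
    return {"status": "success", "file_id": file_id}


def download_handler(event, context):
    """Return the stored document for the requested id, if there is one."""
    record = _DOCUMENTS.get(event["file_id"])
    if record is None:
        return {"status": "error", "reason": "not found"}
    return {"status": "success", "doc": record["doc"]}


# Hypothetical routing table: one small function per action.
ROUTES = {"upload": upload_handler, "download": download_handler}


def dispatch(event, context=None):
    """Route an event to the single-purpose function that handles it."""
    handler = ROUTES.get(event.get("action"))
    if handler is None:
        return {"status": "error", "reason": "unknown action"}
    return handler(event, context)


if __name__ == "__main__":
    uploaded = dispatch({"action": "upload", "file_name": "a.txt", "doc": "hello"})
    print(dispatch({"action": "download", "file_id": uploaded["file_id"]}))
```

Each handler does one job and knows nothing about the others, which is what lets a provider scale them independently.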

Serverless Limitations

Most serverless limitations come from the fact that a ‘function as a service’ has been designed to elastically perform a single function. You can combine a large number of these functions to build a larger, more complicated service, but each individual function is subject to the following limitations:

  1. Each function has limited code size and runtime. AWS Lambda, for example, publishes its limits in its documentation. The key point is that a function is required to finish its logic within 15 minutes, and the size of the code needs to be below 50 MB zipped.
  2. There are only a handful of languages supported for developing the function. AWS Lambda natively supports Java, Go, PowerShell, Node.js, C#, Python, and Ruby code. Of course you can use the AWS Lambda Runtime API to add support for other languages, but that is rarely used for production.
  3. There is a performance penalty if your function starts cold. The latency of a ‘cold start’ depends on your package size, number of dependencies, language chosen, and some variable factors like region and time of day, and is roughly several hundred milliseconds. If your function was active recently (within several minutes), it will be considered ‘warm’ (the micro-VM is still active), and the performance penalty is small: less than 50 milliseconds for most functions.
  4. Functions are stateless. Each instance of your function may be invoked on a different micro-VM, and using S3 storage to save state between invocations may not be a good choice. So, your application needs to be stateless and run independently of other instances.
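
The cold/warm distinction and the statelessness requirement above can be observed with a small sketch. It relies on the commonly used pattern that module-level code runs once per execution environment, so module-level variables survive only while that one instance stays warm. The handler name and the response fields are hypothetical.

```python
# Module-level code runs once per execution environment, i.e. on a cold start.
_COLD_START = True
_INVOCATIONS_IN_THIS_INSTANCE = 0


def handler(event, context):
    """Report whether this invocation hit a cold or a warm instance."""
    global _COLD_START, _INVOCATIONS_IN_THIS_INSTANCE
    was_cold = _COLD_START
    _COLD_START = False
    _INVOCATIONS_IN_THIS_INSTANCE += 1
    return {
        "cold_start": was_cold,
        # Per-instance counter only: another instance of the same function
        # starts again from zero, so this is not reliable application state.
        "invocations_in_this_instance": _INVOCATIONS_IN_THIS_INSTANCE,
    }


if __name__ == "__main__":
    print(handler({}, None))  # first call in this process behaves like a cold start
    print(handler({}, None))  # second call reuses the already initialized module
```

Because the counter resets whenever the provider creates a new instance, anything that must survive across invocations has to live in an external store rather than in the function itself.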

If serverless still makes sense for you, we can dive deeper into the implementation details. In the next post we will discuss the mechanisms behind AWS Lambda to better understand how functions are created and run on Lambda and what you can achieve by leveraging serverless computing.

 

About the Author

Xiao Sun

Xiao Sun is a software engineer at NeuVector.
