Riccardo Mocchetti

Using Functional Programming When Building Cloud Native Applications with AWS Lambda

September 17, 2019 by Riccardo Mocchetti

I’d like to share with you the architecture and programming pattern we've been using to build Cloud Native applications. The use case I'm going to present is a RESTful application deployed on AWS, using AWS API Gateway and AWS Lambda. I want to show how by adopting specific AWS components and programming paradigms, we can increase the reliability of the applications we write, while also improving their maintainability.

In particular, I'm going to focus on AWS Lambda and Functional Programming (FP) with TypeScript, a typed superset of JavaScript.

If you are already familiar with FP concepts like currying, monads and immutability, then feel free to skip a few sections and go to 'Composing the Service'.

If you want, you can find the examples in the post and the whole application in this repository. The code is written in TypeScript and uses the fp-ts library for the functional aspect.

I have the Adidas Platform Team to thank for this post, as our project together inspired it.

Why Functional Programming?

Whenever you talk to an experienced functional programmer, watch a keynote, or read about FP on other blogs, the term 'pure function' often comes up.

A pure function is the basic building block of any FP application. The Mostly Adequate Guide to FP defines it as follows.

A pure function is a function that, given the same input, will always return the same output and does not have any observable side effect.

I'm going to describe side effects better in the following sections, but the idea is that a pure function calculates its output without interacting with or modifying anything external to the function itself (application state, databases, and other I/O streams).
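
To make the definition concrete, here is a minimal sketch contrasting an impure function with a pure one. The names `nextId` and `incrementId` are hypothetical and are not part of the example application.

```typescript
// Impure: the result depends on, and changes, state outside the function.
let counter = 0;
const nextId = (): number => {
  counter += 1; // side effect: mutates external state
  return counter;
};

// Pure: the result depends only on the argument and nothing else is touched.
const incrementId = (current: number): number => current + 1;

console.log(nextId(), nextId()); // two different results for the same (empty) input
console.log(incrementId(1), incrementId(1)); // the same result for the same input
```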

Pure functions, by their definition, give us the following benefits.

  • Readability and self-documentation. Pure functions are self-contained; everything we need to understand the behaviour of the function is either defined as input parameter or in the body.
  • Caching. If the result depends only on the input and nothing else, then for every input we can precalculate the output and cache it. This technique is generally called memoisation.
  • Simplified testing. Testing can be performed by simple input/output assertions. If we need to mock an external service, we can pass the mock as part of our input parameters.
  • Immutability. If our functions don't have side effects, it means we can execute them without worrying about coordinating changes to our application state. This is particularly useful in concurrent programming.
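
The caching benefit can be sketched in a few lines. The `memoise` helper below is a hypothetical, simplified implementation for single-argument functions, not a library API. The `calls` counter exists only to show how often the wrapped function really runs.

```typescript
// Hypothetical helper: caches the results of a single-argument function.
const memoise = <A, B>(fn: (a: A) => B): ((a: A) => B) => {
  const cache = new Map<A, B>();
  return (a: A): B => {
    if (cache.has(a)) {
      return cache.get(a) as B;
    }
    const result = fn(a);
    cache.set(a, result);
    return result;
  };
};

let calls = 0;
const slowSquare = (n: number): number => {
  calls += 1; // instrumentation for this demonstration only
  return n * n;
};

const fastSquare = memoise(slowSquare);
fastSquare(4); // computed
fastSquare(4); // served from the cache, slowSquare is not called again
```

This only works safely because the result of the wrapped function depends on nothing but its input.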

 

However, all this does not come for free. Pure functions fundamentally change the way we write our code. In this blog post, I want to describe what my thought process is like when I write an FP application and show how, by applying a few principles, we can see the benefits listed above.

Why AWS Lambda?

AWS Lambda is a fully managed environment that runs our code (with some limits).

The model is straightforward. We deploy our code in what’s called a lambda function. Whenever a lambda function receives an event from one of the supported AWS services, it triggers our code, passing the event as a parameter.

It sounds perfect! FP is all about defining our program as functions, so AWS Lambda gives us a convenient abstraction to interact with other AWS services. Everything is an event that we receive as a parameter.

One of the most common uses of AWS Lambda is to receive events from AWS API Gateway to build a REST application. The API Gateway deals with receiving, parsing, authorising, and potentially validating requests. The request is passed to a lambda function as an event and triggers its execution.

We can already see the advantage of using this model when we develop our REST application. The API Gateway already implements the logic to process HTTP requests. We don't have to think about it. All we are interested in is our event input and the business logic we need to apply.

This model is also valid to interact with other AWS services. For example, AWS SQS, a managed service to push and pull messages to and from a queue. In this case the logic to retrieve messages from the queue is already implemented by AWS and the lambda function receives the messages in the form of events.

A Real Example

All the examples in the post are going to be based on a real application. This application is extremely simple: the backend for a blog. It offers a REST API that allows two operations:

  • 'POST /blogposts' to create a blog post
  • 'GET /blogposts' to retrieve the full list of posts.

Request Validation

One of the first things I think about when writing a REST application is how I want to validate requests coming from the user.

AWS Lambda helps us here because an HTTP request is just an event. In particular, one that looks like this.


{
	body: '{ "some": "text" }',
	queryStringParameters: { "param": "queryParam" },
	pathParameters: { "param": "pathParam" },
	httpMethod: 'POST',
	path: '/blogposts'
	// plus other attributes
}
    

The pattern to implement validation is pretty much always the same.

A request comes in. It goes through a validation function. If the validation passes, then the request is allowed to proceed. Otherwise, we return an error message to the user.

Everything looks straightforward, but it can become quite complicated depending on the type of validation. The risk is that our validation function becomes big, difficult to maintain and not reusable.

The answer to this problem is function composition.

What’s a ‘Function Composition’?

According to the community Wiki for the Haskell language:

Function composition is the act of pipelining the result of one function, to the input of another, creating an entirely new function.

This definition suggests we can break down the function that validates the whole payload into smaller functions, each validating a part of the payload, which we can then compose together.

The advantage of following this approach is that we can think about our smaller functions in a reusable way. We can then put them all together as we would with building blocks when playing with LEGOs.

Let's say we want to implement validation for the 'POST /blogposts' request, and that the request is valid if:

  • There are no path parameters
  • There are no query parameters
  • We can parse the request body without errors.

If we wanted, we could implement other functions to check the correctness of the body, but this is good enough to explain how function composition works.

Here is a first implementation of our validation rules.


const queryParamsIsNull = (event: APIGatewayEvent) => {
	if (event.queryStringParameters !== null) {
		throw new ApplicationError(
			'Error parsing request query params',
			['Query params should be empty'],
			StatusCodes.BAD_REQUEST
		);
	}
	return event;
};
const pathParamsIsNull = (event: APIGatewayEvent) => {
	if (event.pathParameters !== null) {
		throw new ApplicationError(
			'Error parsing request path params',
			['Path params should be empty'],
			StatusCodes.BAD_REQUEST
		);
	}
	return event;
};
const asUserPostEvent = (event: APIGatewayEvent) => {
	try {
		const parsedBody = event.body
			? JSON.parse(event.body)
			: {};
		return {
			...event,
			body: parsedBody
		};
	} catch (error) {
		throw new ApplicationError(
			'Error parsing request body',
			['Invalid JSON'],
			StatusCodes.BAD_REQUEST
		);
	}
};

These three functions have a few properties that are worth noting.

  • They are generic enough to be applied to every request.
  • They take the whole event as a parameter.
  • They focus on the event attribute they are validating.
  • The input of one function is the output of another function.

 

'queryParamsIsNull' and 'pathParamsIsNull' take the 'event' as input and return the 'event' if the validation is successful. They can be executed in any order.

'asUserPostEvent' parses the 'body' of our request and returns a new object containing the parsed body along with all other request parameters.

We can write the validation function for the 'POST /blogposts' as the composition of our three functions:


const validateCreatePostEvent = compose(asUserPostEvent, queryParamsIsNull, pathParamsIsNull);

Here, 'compose' is a utility function that most FP libraries already provide, so we don’t have to write it ourselves. It lets us express function composition more elegantly.

We can also write our validation function without the 'compose' helper in a less readable, but more explicit fashion:


const validateCreatePostEvent = (event: APIGatewayEvent) =>
	asUserPostEvent(queryParamsIsNull(pathParamsIsNull(event)));

The two definitions are equivalent. If we look closely at our second definition of 'validateCreatePostEvent', we notice that 'compose' applies functions from right to left.
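
For readers curious about what such a helper does, here is a minimal sketch. It is a simplified assumption, not the library implementation: this `compose` only accepts functions that all map a value of type `T` to another `T`, whereas library versions also track types that change along the pipeline. The `trim`, `upper`, and `exclaim` functions are hypothetical.

```typescript
// Simplified compose: every function maps a T to a T.
const compose = <T>(...fns: Array<(value: T) => T>) =>
  (input: T): T => fns.reduceRight<T>((acc, fn) => fn(acc), input);

const trim = (s: string): string => s.trim();
const upper = (s: string): string => s.toUpperCase();
const exclaim = (s: string): string => `${s}!`;

// Applied right to left: trim first, then upper, then exclaim.
const shout = compose(exclaim, upper, trim);
console.log(shout("  hello ")); // "HELLO!"
```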

Handling Exceptions

I need to make a confession. The examples I used in the previous section are convenient to explain the concept of composition, but they all have a significant issue: they all break the flow of the program.

In every example we've seen so far, whenever we want to fail, we throw an exception. Exceptions make us lose control of the program flow. The exception needs to be picked up by something else, and that 'something else' has to deal with it.

This introduces the problem that the correctness of the program depends on 'something else'. It makes the program harder to test because we can't just rely on our inputs anymore.

Furthermore, it makes the program less readable: to understand how it behaves, we can't just look at the function itself; we also have to consider the context in which the function runs.

So how do we avoid this? How do we return different values depending on the result of the validation and still maintain a usable interface? Moreover, how do we compose our functions so that we have one single flow, independent of the validation’s result?

Either Left or Right

The first step is to rewrite our functions to not throw exceptions. Let's have a look at the following validation rule.


const bodyNotNull = (event: APIGatewayEvent) => {
	if (event.body === null) {
		return new ApplicationError(
			'Error parsing request body',
			['Body cannot be empty'],
			StatusCodes.BAD_REQUEST
		);
	}
	return event;
};

The 'bodyNotNull' function returns an 'ApplicationError' instead of throwing it like an exception. This function no longer breaks the program flow: it always returns something, and the output depends only on the input. Unfortunately, it is awkward to work with because it does not have a consistent interface: callers have to check which of the two types they received.

What we need is to return something that can behave either as an 'APIGatewayEvent' or as an 'ApplicationError'.

In FP such a thing exists, and it takes the name of, unsurprisingly, Either.

'Either' assumes a 'Left' value or a 'Right' value. The convention is that the 'Left' value represents an error state, while the 'Right'  value represents a successful computation.
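
As a rough mental model, 'Either' can be pictured as a tagged union. The sketch below is hand-rolled for illustration and is not the fp-ts implementation. The `safeDivide` and `render` functions are hypothetical examples.

```typescript
// Hand-rolled for illustration; fp-ts ships its own Either.
type Left<L> = { readonly tag: "Left"; readonly value: L };
type Right<R> = { readonly tag: "Right"; readonly value: R };
type Either<L, R> = Left<L> | Right<R>;

const left = <L, R>(value: L): Either<L, R> => ({ tag: "Left", value });
const right = <L, R>(value: R): Either<L, R> => ({ tag: "Right", value });

// Hypothetical example: a division that reports failure as a value.
const safeDivide = (a: number, b: number): Either<string, number> =>
  b === 0
    ? left<string, number>("Cannot divide by zero")
    : right<string, number>(a / b);

// The caller always receives an Either and inspects the tag.
const render = (result: Either<string, number>): string =>
  result.tag === "Left" ? `error: ${result.value}` : `ok: ${result.value}`;

console.log(render(safeDivide(10, 2))); // "ok: 5"
console.log(render(safeDivide(1, 0))); // "error: Cannot divide by zero"
```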

We can now rewrite the validation function to use this new concept:


const bodyNotNull = (event: APIGatewayEvent): Either<ApplicationError, APIGatewayEvent> => {
	if (event.body === null) {
		// We failed the validation, so we return a Left value
		return left(
			new ApplicationError(
				'Error parsing request body',
				['Body cannot be empty'],
				StatusCodes.BAD_REQUEST
			)
		);
	}
	// Our validation passed, so we return a Right value
	return right(event);
};

If we take a closer look at the return type of the 'bodyNotNull' function, we can see that we return an 'Either' that holds either an 'ApplicationError' or an 'APIGatewayEvent'. We build the concrete value using the 'left' and 'right' functions.

The disadvantage of introducing 'Either' is that now we can no longer compose our validation functions like we were doing before. This is because the input parameters of our functions are not compatible anymore with the value they return.

We need a new way to compose our functions together.

Functions as Chains

If you ask a Haskell programmer, ‘How do I chain a series of functions together, and also capture their errors?’, you would probably receive this answer: ‘Oh, that's easy, just use a monad!’

Unfortunately, when I heard this answer for the first time, it did not make sense to me. I'm going to summarise the way I think about monads hoping that it might help those who, like me, come from an imperative programming style.

My definition is an extreme simplification of the definition of a monad, and I think FP purists will turn their noses up. However, I believe it's good enough if you want to start using monads in your application.

I see a monad as a container of one item. This container has three parts:

  1. a type that describes the behaviour of the container
  2. a constructor to build a container with an item in it
  3. one or more operations to combine (compose) monads with each other. Each operation generates a new monad from the item contained in it.

 

Each one of the three parts also needs to respect a few mathematical laws. I won't go into more details. If you feel adventurous, you can read all about monads in this paper.

The constructor of a monad is generally called 'of':


	MonadA.of(val)
    

It creates a monad of type 'MonadA' that contains the value 'val'.

To access 'val' and operate on it, we can use 'map'. It takes a function as a parameter and returns a new monad containing the output of the function.


	MonadA.of(1).map(one => one + 1)
	// gives us MonadA.of(2)

Another useful way to access 'val' is to use 'chain'. Also known as 'flatMap', it is similar to 'map' but expects the input function to return the new monad:


	MonadA.of(1).chain(one => MonadA.of(one + 1))
	// gives us MonadA.of(2)
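
To make 'MonadA' less abstract, here is a minimal sketch of a one-item container with the three operations described above. The `Box` class and its `unwrap` method are hypothetical names chosen for this illustration. It is the simplest possible container and adds no behaviour of its own.

```typescript
// Hypothetical single-item container standing in for "MonadA".
class Box<A> {
  private readonly value: A;

  private constructor(value: A) {
    this.value = value;
  }

  // Constructor: put a value in the container.
  static of<A>(value: A): Box<A> {
    return new Box(value);
  }

  // map: apply a plain function to the item and re-wrap the result.
  map<B>(fn: (a: A) => B): Box<B> {
    return Box.of(fn(this.value));
  }

  // chain: apply a function that already returns a Box, without nesting.
  chain<B>(fn: (a: A) => Box<B>): Box<B> {
    return fn(this.value);
  }

  // Escape hatch used only to inspect the result in this example.
  unwrap(): A {
    return this.value;
  }
}

const mapped = Box.of(1).map(one => one + 1);
const chained = Box.of(1).chain(one => Box.of(one + 1));
console.log(mapped.unwrap(), chained.unwrap()); // 2 2
```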

Good. So, how are monads going to help us?

You will be glad to know that 'Either' is a monad, and it behaves in a particular way.

A 'Left' value ignores any attempt to 'map' or 'chain' over it, by just returning its value without applying the function. A function is only applied to instances of 'Right' values.
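
The short-circuiting behaviour can be sketched with a small hand-rolled class. This is an illustration under stated assumptions, not the fp-ts implementation: the `Either` class below and the `parseAge`, `checkAdult`, and `show` helpers are all hypothetical.

```typescript
type State<L, R> =
  | { readonly tag: "Left"; readonly value: L }
  | { readonly tag: "Right"; readonly value: R };

// Hand-rolled for illustration only; fp-ts provides its own Either.
class Either<L, R> {
  private readonly state: State<L, R>;

  private constructor(state: State<L, R>) {
    this.state = state;
  }

  static left<L, R>(value: L): Either<L, R> {
    return new Either<L, R>({ tag: "Left", value });
  }

  static right<L, R>(value: R): Either<L, R> {
    return new Either<L, R>({ tag: "Right", value });
  }

  // map: transform the Right value; a Left passes through unchanged.
  map<B>(fn: (r: R) => B): Either<L, B> {
    const s = this.state;
    return s.tag === "Left" ? Either.left<L, B>(s.value) : Either.right<L, B>(fn(s.value));
  }

  // chain: like map, but fn already returns an Either, so nothing is nested.
  chain<B>(fn: (r: R) => Either<L, B>): Either<L, B> {
    const s = this.state;
    return s.tag === "Left" ? Either.left<L, B>(s.value) : fn(s.value);
  }

  // fold: collapse both branches into a single value.
  fold<T>(onLeft: (l: L) => T, onRight: (r: R) => T): T {
    const s = this.state;
    return s.tag === "Left" ? onLeft(s.value) : onRight(s.value);
  }
}

const parseAge = (raw: string): Either<string, number> => {
  const age = Number(raw);
  return Number.isInteger(age)
    ? Either.right<string, number>(age)
    : Either.left<string, number>(`'${raw}' is not an integer`);
};

const checkAdult = (age: number): Either<string, number> =>
  age >= 18
    ? Either.right<string, number>(age)
    : Either.left<string, number>("must be 18 or older");

const show = (result: Either<string, number>): string =>
  result.fold(error => `Left(${error})`, age => `Right(${age})`);

console.log(show(parseAge("42").chain(checkAdult))); // Right(42)
console.log(show(parseAge("abc").chain(checkAdult))); // Left('abc' is not an integer)
```

In the second call, `checkAdult` is never invoked because `parseAge` already produced a 'Left'.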

Let's see how we can use the 'Either' monad to compose our validation functions together:


const validateCreatePostEvent = (event: APIGatewayEvent) =>
	either.of<ApplicationError, APIGatewayEvent>(event)
		.chain(pathParamsIsNull)
		.chain(queryParamsIsNull)
		.chain(bodyNotNull)
		.chain(asUserPostEvent);

'either.of<...>(event)' creates a new 'Either' monad containing the API Gateway event. We then 'chain' together each validation function. Remember that every validation function takes the 'event' and returns a new 'Either' monad, so we cannot use 'map': it would wrap the result in a second 'Either' instead of flattening it.

Every function in the chain is applied to the event as long as the value returned is 'Right'. When a validation function returns 'Left', every succeeding function is ignored due to the behaviour of the 'Either' monad, and the error is returned at the end of the chain.

Calling the Database

So far, we have seen how to write pure functions that interact only with in-memory variables. However, when we write real applications, we often need to call external services or a database to store our data.

These interactions differ from what we’ve seen so far in that they change something that is external to our functions. In other terms, they introduce a side effect.

A side effect is a change of system state or observable interaction with the outside world that occurs during the calculation of a result.

FP tells us that our functions must return the same output if we provide the same input. However, when we interact with a database, the value we return often depends on the state of the database. So how do we keep our code pure without giving up on storing and retrieving our precious data?

The answer FP gives us is simple: we delegate the execution. Let's see how:


const storeById = (id: string, data: object) =>
	db.store({...data, id});
const storeByIdDelegated = (id: string, data: object) =>
	() => db.store({...data, id});

Instead of calling the database directly as we do in the first function, we return a function that calls the database with the parameters we receive.

For every pair 'id, data', we return the function that stores 'id, data' in the database, so technically we are returning the same output given the same input.
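
A small sketch shows what delegation buys us. Here `fakeDb` and `stored` are hypothetical in-memory stand-ins for the real `db`, used only so that the example can run on its own.

```typescript
// Hypothetical in-memory stand-in for the article's `db`.
const stored: object[] = [];
const fakeDb = {
  store: (item: object): void => {
    stored.push(item); // the side effect we want to postpone
  },
};

const storeByIdDelegated = (id: string, data: object) =>
  () => fakeDb.store({ ...data, id });

const pending = storeByIdDelegated("42", { title: "Hello" });
console.log(stored.length); // 0: nothing has been stored yet

pending(); // the caller decides when the side effect happens
console.log(stored.length); // 1
```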

I know, this doesn't sound very useful. But let's see how FP makes use of it:


	task.of(2).map(two => two + 4).map(console.log)

We are creating an instance of a 'Task' (we don't know what it is yet), with '2' contained in it. We then map '2' to a function that adds '4' and then we log the returned value. We expect to see '6' on our screen.

If you run this in the console, you'll be surprised. Nothing is printed. It's as if nothing ever happened. However, when we do


	task.of(2).map(two => two + 4).map(console.log).run()

we see that the value '6' is displayed.

You might have guessed by now that 'Task' is another monad. And you are right, but it belongs to a particular kind of monad called IO.

IO is a generic FP interface that lets us interact with external components as we would typically do. However, under the hood, it does not execute our functions immediately; it wraps them in another function, as we did with 'storeByIdDelegated'.

'Task' specialises IO in that it offers us a friendlier interface for when we need to interact with our database asynchronously.
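
To show the idea of wrapping asynchronous work in a function, here is a minimal hand-rolled `Task`. It is a sketch under stated assumptions and not the fp-ts implementation. The `log` array is a hypothetical stand-in for `console.log` that lets us observe when the work really runs.

```typescript
// Hypothetical minimal Task: wraps a function that returns a Promise.
class Task<A> {
  private readonly thunk: () => Promise<A>;

  constructor(thunk: () => Promise<A>) {
    this.thunk = thunk;
  }

  static of<A>(value: A): Task<A> {
    return new Task<A>(() => new Promise<A>(resolve => resolve(value)));
  }

  // map builds a new, still unexecuted Task.
  map<B>(fn: (a: A) => B): Task<B> {
    return new Task<B>(() => this.thunk().then(fn));
  }

  // run is the only place where the wrapped work actually starts.
  run(): Promise<A> {
    return this.thunk();
  }
}

const log: number[] = [];
const program = Task.of(2)
  .map(two => two + 4)
  .map(six => {
    log.push(six);
    return six;
  });

console.log(log.length); // 0: building the Task ran nothing
program.run().then(() => console.log(log.length)); // 1: run() executed the mapped functions
```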

Unfortunately, we cannot just use 'Task'. It is best to assume that our database is going to fail from time to time, so to represent failure scenarios we are going to use 'TaskEither', which combines the behaviour of 'Task' and 'Either':


const createPostIO = (database: DB) => (event: UserPostEvent) =>
	tryCatch<ApplicationError, UserPost>(
		() => database.createPost(event.body),
		reason => new ApplicationError(
			'Error storing item',
			[reason as string],
			StatusCodes.SERVER_ERROR
		)
	);

A few things to note.

  • We pass the 'database' as a parameter. This is particularly useful for testing.
  • 'tryCatch' is a utility method to create 'TaskEither' monads. It takes two functions as parameters. The first function returns our 'Right' value, the second one returns a 'Left' value in case the first function fails. We want to return the post if we created it successfully. Otherwise, we return an error.
  • I'm using a technique called currying to define 'createPostIO'. It allows me to fix the first parameter, like in the following functions:

const createPostWithDynamoDB = createPostIO(dynamoDB)
const createPostWithMockDB = createPostIO(mockDB)
    

This way, I can preserve the behaviour of 'createPostIO' and only change the implementation of my database. It’s useful for testing.
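
The sketch below puts these three points together in a self-contained form. It is an illustration under stated assumptions, not the fp-ts implementation: `TaskEither` is modelled as a plain function returning a Promise of a tagged union, `tryCatch` is hand-rolled, errors are plain strings instead of 'ApplicationError', and the `DB` interface with `workingDB` and `brokenDB` is hypothetical.

```typescript
type Either<L, R> =
  | { readonly tag: "Left"; readonly value: L }
  | { readonly tag: "Right"; readonly value: R };

// Here a TaskEither is a lazy Promise of an Either that never rejects.
type TaskEither<L, R> = () => Promise<Either<L, R>>;

// Hand-rolled stand-in for the library's tryCatch.
const tryCatch = <L, R>(
  work: () => Promise<R>,
  onRejected: (reason: unknown) => L
): TaskEither<L, R> => () =>
  new Promise<R>(resolve => resolve(work())).then(
    (value): Either<L, R> => ({ tag: "Right", value }),
    (reason): Either<L, R> => ({ tag: "Left", value: onRejected(reason) })
  );

// Hypothetical database interface with two implementations.
interface DB {
  createPost(body: string): Promise<string>;
}
const workingDB: DB = { createPost: body => Promise.resolve(`stored: ${body}`) };
const brokenDB: DB = { createPost: () => Promise.reject(new Error("connection lost")) };

// Curried: fix the database first, supply the request body later.
const createPostIO = (database: DB) => (body: string) =>
  tryCatch<string, string>(
    () => database.createPost(body),
    reason => `Error storing item: ${String(reason)}`
  );

const createPostWithWorkingDB = createPostIO(workingDB);
const createPostWithBrokenDB = createPostIO(brokenDB);

createPostWithWorkingDB("hello")().then(result => console.log(result.tag)); // Right
createPostWithBrokenDB("hello")().then(result => console.log(result.tag)); // Left
```

Swapping the database implementation changes the outcome, while the code that builds the task stays the same.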

Composing the Service

In the previous sections, we looked at how to use 'Either' to do validation and 'TaskEither'  to send requests to our database. These two monads allow us to define both successful behaviours and error behaviours of our application without introducing any side effect in our code.

We can write the validation function to use the 'Either' monad:

 
const validateCreatePostEvent = (event: APIGatewayEvent) =>
	either.of<ApplicationError, APIGatewayEvent>(event)
		.chain(pathParamsIsNull)
		.chain(queryParamsIsNull)
		.chain(bodyNotNull)
		.chain(asUserPostEvent);
    

The interaction with the database is implemented using the 'TaskEither' monad.


const createPostIO = (database: DB) => (event: UserPostEvent) =>
	tryCatch<ApplicationError, UserPost>(
		() => database.createPost(event.body),
		reason => new ApplicationError(
			'Error storing item',
			[reason as string],
			StatusCodes.SERVER_ERROR
		)
	);
    

However, if we try to chain these two functions together, our compiler is not happy because 'Either' and 'TaskEither' are two different types.

The easiest and most consistent way to fix this is to exploit the fact that we can also use 'TaskEither' with in-memory variables and update the definition of the validation functions.


const validateCreatePostEvent = (event: APIGatewayEvent) =>
	taskEither.of<ApplicationError, APIGatewayEvent>(event)
		.chain(pathParamsIsNull)
		.chain(queryParamsIsNull)
		.chain(bodyNotNull)
		.chain(asUserPostEvent);
    

Now, every function in the chain returns a 'TaskEither'.

The next step is to implement the business logic for a 'POST /blogposts' request. Being now masters of function composition, we can write:


const createPost = (event: APIGatewayEvent, database: DB) =>
	taskEither.of<ApplicationError, APIGatewayEvent>(event)
		.chain(validateCreatePostEvent)
		.chain(createPostIO(database));
    

Our new 'createPost' function takes the event and composes together all the functions to do validation and interact with the database, waiting for somebody to call 'run()'.

But what happens if we execute it now? What is the user going to see if the operation succeeds? And what if it fails?

'TaskEither' offers us another interface we can use to specify what happens in case our value is 'Right' or 'Left'. This interface is the 'fold' method:


const createPost = (event: APIGatewayEvent, database: DB) =>
	taskEither.of<ApplicationError, APIGatewayEvent>(event)
		.chain(validateCreatePostEvent)
		.chain(createPostIO(database))
		.fold(
			error => errorResponse(error),
			result => successResponse(StatusCodes.CREATED, result)
		);
    

'fold' merges our two branches, 'Left' and 'Right', into one outcome. If the computation fails, 'fold' executes the function passed as the first parameter. Otherwise, it calls the second function.

At this point, we have our first service. We can deal with all requests to store new posts in a way that is reliable, easy to read, easy to test, and easy to extend.

We can use the same techniques and principles to handle 'GET /blogposts' requests.

Handling Requests

Now that we have our services, we can see how to configure AWS Lambda to run our code.

When we create a new lambda function, one of the parameters we are allowed to specify is the 'handler'. The handler is the function that the lambda function calls whenever an event is received.


// We specify the database implementation
const db = DB.inMemory();
const createPostHandler = (event: APIGatewayEvent) =>
	createPost(event, db).run();
    

All our handler needs to do is to take the API Gateway event as input and trigger the execution of the 'TaskEither' monad, by calling the 'run()' function.

Conclusion

In this blog post, we have seen how functional programming and AWS Lambda can help us write Cloud Native applications that are more reliable, easier to read, and easier to test.

We still have to deploy our application, but that is a topic of its own. Let’s leave that for a future blog post.

 
