The vision for our system architecture was clear from the beginning: we wanted to build a website containing an application which would allow logged-in users to upload an image file and receive a fast prediction from a deep-learning model as to which objects (distinct, predefined parasite eggs), if any, were present in the image.

The vision was clear, but making it a reality was not so straightforward. In the end, however, we pulled it off and built a robust system architecture that we can readily build upon going forward. The diagram below shows a sketch of the system architecture, and I will detail all the major parts of the process. Hopefully this post will help others who are trying to build similar functionality into their own projects.

Parasite ID system architecture diagram

You’ll notice that the diagram is split into two parts. On the left is the model serving system architecture and on the right is the data ingestion, model building and deployment. This split shows the two independent set-ups in place, and how they are connected. This post will primarily cover the left half of the diagram: the model serving system architecture; diving into the data ingestion, model building and deployment is out of scope for this blog post. The important thing to be aware of is that the image data is cleaned and stored independently from the model serving system architecture, and that the same is true for the building and deployment of our deep-learning CNN model. Our model has been deployed as a web API using Flask on a different server than our website is hosted on. Now, let’s dive into all the pieces of the model serving system architecture!

1. S3 static web hosting. The ParasiteID website was built with Jekyll, a static site generator written in Ruby. Using a free Jekyll template as the base, we were able to customize and build upon it to create this complete site in the form of HTML, CSS, and JavaScript documents, ready to be hosted on a cloud server. For web hosting, we decided to go with Amazon S3 static web hosting for three reasons. First, because of the small size of this website (less than 6MB at the time of writing) and the low levels of traffic it is currently receiving, S3 hosting is very inexpensive (well under $3 a month, so far). As traffic to our site increases, we will keep evaluating whether this is still the most cost-effective solution. The second reason we are using S3 for hosting is that nearly all other components of this part of our system architecture are also hosted on AWS, which makes integrating with these other services much easier. The final reason is related to the second: we are using Amazon CloudFront (AWS’s content delivery network) to copy the site to Amazon’s data centers all over the world, meaning our website should be fast no matter where the end user is located.

2. Amazon Cognito. For security and practicality reasons, we wanted to ensure that users could only upload an image using our tool if they were registered and logged in on our site. To manage users and user sessions, we went with an Amazon Cognito User Pool. Using this along with the Amazon Cognito Identity software development kit (SDK) for JavaScript allows us to use JavaScript in our website code to sign up and authenticate users. The SDK also allows us to get information about a user session, which we use in two ways: 1) we pass this information to the front end, so that certain functionality can be enabled for logged-in users only (e.g. the ability to upload an image file), and 2) we retrieve a token associated with the user session, which is needed to access the API described in the next step.

3. Amazon API Gateway. When a user uploads an image via our tool, client-side JavaScript code first ensures that the uploaded file is in fact an image file, converts it to JPEG format, and resizes it if either dimension is greater than 1800 pixels. Then, the same script makes an AJAX call (an asynchronous HTTP request) to an API. In this call, data is both sent and received. The data sent includes the image (converted to a binary string); the user’s Cognito authorization token is sent as an Authorization header. This data is transmitted to our application’s back end via a REST API we created using Amazon API Gateway. This API is publicly accessible but secured using our Amazon Cognito user pool. It quite literally is the “gateway” for accessing our back-end code on AWS Lambda.
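
The resizing rule from this step can be sketched in a few lines. This is only an illustration in Python, not the site's client-side JavaScript: the function name `fit_within` is hypothetical, and preserving the aspect ratio is an assumption, since the post only says that an image is resized when either dimension exceeds 1800 pixels.

```python
from typing import Tuple

MAX_DIMENSION = 1800  # pixel limit mentioned in the post


def fit_within(width: int, height: int, limit: int = MAX_DIMENSION) -> Tuple[int, int]:
    """Return (width, height) scaled down so that neither side exceeds `limit`.

    Assumption: the aspect ratio is preserved. Images that already fit
    are returned unchanged.
    """
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    longest = max(width, height)
    if longest <= limit:
        return width, height
    scale = limit / longest
    # Guard against a side collapsing to 0 for extremely narrow images.
    return max(1, round(width * scale)), max(1, round(height * scale))
```

For example, under these assumptions a 3600 x 2400 upload would be reduced to 1800 x 1200, while an 800 x 600 upload would be left as it is.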

4. AWS Lambda. We use the AWS Lambda compute service to run back-end code in a serverless environment. Our system architecture actually uses two separate Lambda functions. The first is a script written in Node.js which receives the image (as a binary string) from the front end, transmitted through the aforementioned API. This Lambda function plays an important role in the system, acting as a control center of sorts: it sends and receives data from multiple sources, writes and in some cases even deletes data, and makes sure that data (or any error message) is returned in the correct way to the front end.

5. S3 bucket. The first action taken by the Lambda function mentioned above is to decode the image from its binary string and send it to an S3 bucket which holds all user-uploaded images. The filename of the image is a timestamp with a random string of bytes appended. Our goal is to add these images submitted by users (who have given us their permission) to our data set in order to improve our model continually.
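
A naming scheme of this kind might look like the sketch below. It is written in Python for illustration, whereas the Lambda in question is a Node.js script. The function name `make_upload_key`, the millisecond timestamp, the hyphen separator, the hex encoding, and the `.jpg` extension are all assumptions, not details taken from our code.

```python
import secrets
import time


def make_upload_key(random_bytes: int = 8, extension: str = ".jpg") -> str:
    """Build an object name from a timestamp plus a random suffix.

    Assumptions: millisecond Unix timestamp, hex-encoded random bytes,
    and a hyphen between the two parts.
    """
    timestamp_ms = int(time.time() * 1000)
    suffix = secrets.token_hex(random_bytes)  # 2 hex characters per byte
    return f"{timestamp_ms}-{suffix}{extension}"
```

The timestamp keeps the names roughly sortable by upload time, and the random suffix makes a collision between two uploads in the same millisecond very unlikely.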

6. AWS Lambda. Next, the Node.js Lambda invokes a second Lambda, this one running on a Python runtime. The filename of the image in S3 is passed from the first Lambda to the second. This second Lambda is really just a “middleman” in the current setup; it is in place not for technical reasons but for practical ones, as it has allowed us to build and test the two parts of our architecture separately. We may well eliminate this component later on.

7. REST API with Python and Flask. Our CNN model for classifying images was deployed as a web API using Python and Flask (see earlier post Serving a Keras Model with Flask) on a SoftLayer virtual server. Our second Lambda calls this API and sends the filename of the image in S3 via a GET request. The image is then downloaded into a temporary directory on the remote server using boto (the AWS SDK for Python) and previously created IAM access credentials. Next, the image is fed into our model (held in memory on that server), and an array of predictions (one for each class) is returned. The Python Lambda then returns this array (as a string) to the Node.js Lambda.
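
To make the middleman role concrete, here is a minimal sketch of what a Python Lambda like this could look like, using only the standard library. It is an illustration under stated assumptions rather than our actual function: the endpoint URL, the `filename` event key and query parameter, and the shape of the returned dictionary are all hypothetical.

```python
import urllib.parse
import urllib.request

# Hypothetical endpoint; the real address of the Flask API is not given in this post.
MODEL_API_URL = "https://model.example.com/predict"


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """Perform a GET request and return the response body as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def lambda_handler(event, context, fetch=fetch_text):
    """Forward the S3 filename to the model API and relay the predictions.

    `fetch` is injectable so the handler can be exercised without a network.
    """
    filename = event["filename"]  # assumed event key
    query = urllib.parse.urlencode({"filename": filename})
    try:
        predictions = fetch(f"{MODEL_API_URL}?{query}")
    except OSError as exc:  # urllib's URLError and socket timeouts are OSErrors
        return {"error": f"model API unreachable: {exc}"}
    # The predictions are passed along as a string, as described above.
    return {"predictions": predictions}
```

Returning the error in the payload, instead of letting the exception escape, is one way to let the calling Lambda decide how to log and report it.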

8. Amazon DynamoDB. Now we’re back in our “control center” Lambda, and we are ready to take action on the data received from the Flask API. First, our code checks that no error was returned (an error almost always indicates that the server was down). If there was an error, it is logged in CloudWatch. Then, we check the value of a variable passed from the front end which we haven’t mentioned yet: a boolean we call agree, which indicates whether or not the user agreed to let us keep their image and data about predictions made on that image. If agree is true, we save the filename of the image in S3 as the primary key in an Amazon DynamoDB table, along with the class predictions from the model (if there were no errors). If, on the other hand, agree is false, nothing is stored in DynamoDB and an extra command deletes the image from S3 altogether. Finally, the predictions (or error messages, if any) are returned to the front end via a callback, through API Gateway.
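
The branching described in this step can be summarized in a short sketch. It is written in Python for illustration, whereas the control center Lambda is a Node.js script. The function name and the three callables `save_record`, `delete_image`, and `log_error` are hypothetical stand-ins for the DynamoDB write, the S3 delete, and the CloudWatch log call.

```python
from typing import Callable, Optional


def handle_model_result(
    filename: str,
    predictions: Optional[str],
    error: Optional[str],
    agree: bool,
    save_record: Callable[[str, Optional[str]], None],
    delete_image: Callable[[str], None],
    log_error: Callable[[str], None],
) -> dict:
    """Apply the post-prediction rules and build the response for the front end."""
    if error is not None:
        log_error(error)  # stand-in for logging to CloudWatch
    if agree:
        # Keep the filename as the key; predictions only if there was no error.
        save_record(filename, predictions if error is None else None)
    else:
        # No consent: store nothing and remove the uploaded image.
        delete_image(filename)
    if error is not None:
        return {"error": error}
    return {"predictions": predictions}
```

Passing the side effects in as callables keeps the decision logic separate from the AWS calls, which makes it easy to check each branch on its own.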

And that is how a web app—which allows users to receive a prediction from a deep-learning model based on an image—is built!

Vicki Foss
Data scientist and cloud engineer.
Located in Mexico City.