System Design Part 1: Setup a Simple Load Balancer using Python
Learn the fundamentals of system design by building a load balancer with Python, FastAPI, and Nginx. Understand redundancy, scaling, and system architecture.
Load balancing is a fundamental component of system design, crucial for distributing network traffic across multiple servers to ensure optimal resource utilization, reduce latency, and prevent any single server from becoming a point of failure. By providing redundancy and scaling capacity, load balancers enhance both the reliability and performance of applications, making them resilient to high traffic and unexpected spikes in demand.
In This Session, We Will:
- Create an initial API
- Clone the first API for a second instance
- Set up an Nginx server
- Run docker compose up
The APIs
For this demonstration, we'll use FastAPI due to its simplicity and Python's robust package ecosystem, which makes it easy to demonstrate these concepts. Start by creating a file named `api1.py`:
```python
from fastapi import FastAPI
import uvicorn

app = FastAPI()

@app.get("/hc")
def healthcheck():
    return 'API-1 Health - OK'

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8001)
```
Here, FastAPI is our web framework, and we'll use Uvicorn as the HTTP server to run the API. Both are listed in the `requirements.txt` file. This example features a simple health-check endpoint; in real-world applications, the implementation could be a far more complex CRUD method.
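The article doesn't show the contents of `requirements.txt`, but given the two dependencies mentioned, a minimal version might look like this (pinning versions is optional and these names are the only safe assumption):

```
fastapi
uvicorn
```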
To avoid configuration and permission issues on your machine, we'll use Docker for a clean setup. Here's the `Dockerfile` for `api1.py`:
```dockerfile
FROM python:3.11
WORKDIR /
COPY ./requirements.txt /requirements.txt
RUN pip install -r requirements.txt
COPY . /
EXPOSE 8001
ENTRYPOINT ["python"]
CMD ["api1.py"]
```
Now, let's create a second API by duplicating everything except the port. This second API, named `api2.py`, will run on port 8002:
```python
from fastapi import FastAPI
import uvicorn

app = FastAPI()

@app.get("/hc")
def healthcheck():
    return 'API-2 Health - OK'

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8002)
```
The `Dockerfile` for `api2.py` is identical except that it runs `api2.py` and exposes port 8002:
```dockerfile
FROM python:3.11
WORKDIR /
COPY ./requirements.txt /requirements.txt
RUN pip install -r requirements.txt
COPY . /
EXPOSE 8002
ENTRYPOINT ["python"]
CMD ["api2.py"]
```
Setting Up the Load Balancer
For this demonstration, we'll use Nginx, a powerful open-source web server that can also handle load balancing. Although there are other options, including AWS's Application Load Balancer, Nginx is sufficient for illustrating the basic concepts.
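The article doesn't include the Nginx configuration itself, but a minimal `nginx.conf` for this setup might look like the following. The upstream hostnames `api1` and `api2` are assumptions; they would need to match the service names you choose in your Docker Compose file:

```nginx
# nginx.conf -- a minimal sketch; hostnames and ports are assumptions
events {}

http {
    upstream backend {
        # Round-robin is Nginx's default balancing strategy,
        # so no explicit directive is needed here.
        server api1:8001;
        server api2:8002;
    }

    server {
        listen 8080;

        location / {
            # Forward every request to the next backend in rotation
            proxy_pass http://backend;
        }
    }
}
```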
The goal is to have two identical APIs taking requests in a round-robin fashion. While this may seem trivial at a small scale, it becomes crucial as the number of users increases. Technologies like AWS Fargate allow you to scale dynamically based on demand, making load balancing essential for modern applications.
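To make the round-robin idea concrete, here is a small Python sketch of the selection logic a balancer applies. This is purely an illustration of the algorithm, not Nginx's actual implementation, and the backend addresses are assumptions matching our two APIs:

```python
from itertools import cycle

# Hypothetical backend addresses for our two API instances
backends = ["http://api1:8001", "http://api2:8002"]

def round_robin(servers):
    """Yield servers one at a time, looping back to the start."""
    return cycle(servers)

picker = round_robin(backends)

# Four consecutive requests alternate between the two backends
first_four = [next(picker) for _ in range(4)]
print(first_four)
```

Each incoming request simply takes the next server in the rotation, which is why two identical instances end up sharing the traffic roughly evenly.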
This setup demonstrates fundamental system design principles:
- Redundancy: Multiple API instances ensure service availability
- Scalability: Easy to add more instances as needed
- Reliability: If one instance fails, others continue serving
- Performance: Distributed load improves response times
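The `docker compose up` step from the checklist assumes a `docker-compose.yml` that wires the two APIs and Nginx together. The article doesn't show it, but a sketch could look like this; the service names, Dockerfile filenames, and the Nginx config mount path are all assumptions:

```yaml
version: "3.8"

services:
  api1:
    build:
      context: .
      dockerfile: Dockerfile   # the api1 Dockerfile shown above
    ports:
      - "8001:8001"

  api2:
    build:
      context: .
      dockerfile: Dockerfile2  # the api2 Dockerfile; filename is an assumption
    ports:
      - "8002:8002"

  nginx:
    image: nginx:latest
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "8080:8080"
    depends_on:
      - api1
      - api2
```

With this in place, requests to `localhost:8080/hc` would be answered alternately by the two API instances.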
Understanding these concepts is crucial for system design interviews and real-world architecture decisions. Load balancing is just the beginning—it opens the door to more advanced concepts like auto-scaling, health checks, and distributed systems.