System Design Part 1: Setup a Simple Load Balancer using Python
Learn the fundamentals of system design by building a load balancer with Python, FastAPI, and Nginx. Understand redundancy, scaling, and system architecture.
Load balancing is a fundamental component of system design, crucial for distributing network traffic across multiple servers to ensure optimal resource utilization, reduce latency, and prevent any single server from becoming a point of failure. By providing redundancy and scaling capacity, load balancers enhance both the reliability and performance of applications, making them resilient to high traffic and unexpected spikes in demand.
In This Session, We Will:
- Create an initial API
- Clone the first API for a second instance
- Set up an Nginx server
- Run docker compose up
The APIs
For this demonstration, we'll use FastAPI due to its simplicity and Python's robust package ecosystem, which makes it easy to demonstrate these concepts. Start by creating a file named `api1.py`:
```python
from fastapi import FastAPI
import uvicorn

app = FastAPI()

@app.get("/hc")
def healthcheck():
    return 'API-1 Health - OK'

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8001)
```
Here, FastAPI is our web framework, and we'll use Uvicorn as the HTTP server to run the API. Both are listed in the `requirements.txt` file. This example features a simple health-check endpoint; in real-world applications, the implementation could be a far more complex CRUD method.
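The article doesn't show the contents of `requirements.txt`, but given the two dependencies mentioned, a minimal version might look like this (pinning versions is optional and these names are the only safe assumption):

```
fastapi
uvicorn
```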
To avoid configuration and permission issues on your machine, we'll use Docker for a clean setup. Here's the `Dockerfile` for `api1.py`:
```dockerfile
FROM python:3.11
WORKDIR /
COPY ./requirements.txt /requirements.txt
RUN pip install -r requirements.txt
COPY . /
EXPOSE 8001
ENTRYPOINT ["python"]
CMD ["api1.py"]
```
Now, let's create a second API by duplicating everything except the port. This second API, named `api2.py`, will run on port 8002:
```python
from fastapi import FastAPI
import uvicorn

app = FastAPI()

@app.get("/hc")
def healthcheck():
    return 'API-2 Health - OK'

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8002)
```
The `Dockerfile` for `api2.py` is identical except that it runs `api2.py` and exposes port 8002:
```dockerfile
FROM python:3.11
WORKDIR /
COPY ./requirements.txt /requirements.txt
RUN pip install -r requirements.txt
COPY . /
EXPOSE 8002
ENTRYPOINT ["python"]
CMD ["api2.py"]
```
Setting Up the Load Balancer
For this demonstration, we'll use Nginx, a powerful open-source web server that can also handle load balancing. Although there are other options, including AWS's Application Load Balancer, Nginx is sufficient for illustrating the basic concepts.
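The article doesn't include the Nginx configuration itself, but a minimal `nginx.conf` for this setup might look like the following. The upstream hostnames `api1` and `api2` are assumptions; they would need to match the service names you choose in your Docker Compose file:

```nginx
# nginx.conf -- a minimal sketch; hostnames and ports are assumptions
events {}

http {
    upstream backend {
        # Round-robin is Nginx's default balancing strategy,
        # so no explicit directive is needed here.
        server api1:8001;
        server api2:8002;
    }

    server {
        listen 8080;

        location / {
            # Forward every request to the next backend in rotation
            proxy_pass http://backend;
        }
    }
}
```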
The goal is to have two identical APIs taking requests in a round-robin fashion. While this may seem trivial at a small scale, it becomes crucial as the number of users increases. Technologies like AWS Fargate allow you to scale dynamically based on demand, making load balancing essential for modern applications.
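To make the round-robin idea concrete, here is a small Python sketch of the selection logic a balancer applies. This is purely an illustration of the algorithm, not Nginx's actual implementation, and the backend addresses are assumptions matching our two APIs:

```python
from itertools import cycle

# Hypothetical backend addresses for our two API instances
backends = ["http://api1:8001", "http://api2:8002"]

def round_robin(servers):
    """Yield servers one at a time, looping back to the start."""
    return cycle(servers)

picker = round_robin(backends)

# Four consecutive requests alternate between the two backends
first_four = [next(picker) for _ in range(4)]
print(first_four)
```

Each incoming request simply takes the next server in the rotation, which is why two identical instances end up sharing the traffic roughly evenly.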
This setup demonstrates fundamental system design principles:
- Redundancy: Multiple API instances ensure service availability
- Scalability: Easy to add more instances as needed
- Reliability: If one instance fails, others continue serving
- Performance: Distributed load improves response times
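The `docker compose up` step from the checklist assumes a `docker-compose.yml` that wires the two APIs and Nginx together. The article doesn't show it, but a sketch could look like this; the service names, Dockerfile filenames, and the Nginx config mount path are all assumptions:

```yaml
version: "3.8"

services:
  api1:
    build:
      context: .
      dockerfile: Dockerfile   # the api1 Dockerfile shown above
    ports:
      - "8001:8001"

  api2:
    build:
      context: .
      dockerfile: Dockerfile2  # the api2 Dockerfile; filename is an assumption
    ports:
      - "8002:8002"

  nginx:
    image: nginx:latest
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    ports:
      - "8080:8080"
    depends_on:
      - api1
      - api2
```

With this in place, requests to `localhost:8080/hc` would be answered alternately by the two API instances.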
Understanding these concepts is crucial for system design interviews and real-world architecture decisions. Load balancing is just the beginning—it opens the door to more advanced concepts like auto-scaling, health checks, and distributed systems.