Docker Swarm
After you've decentralized your database and storage, you can move on to Docker Swarm. If you haven't done so yet, make sure to do that first.
Even though you can use Docker Swarm to deploy stateful containers such as MariaDB, in this tutorial you'll go with a decentralized approach for the volumes and the applications.
Good to know
All the docker-compose.yml and .env files are available in the book repo.
What is Docker Swarm?
The best way to understand Docker Swarm is to think of it as Docker's way of treating a bunch of servers as one giant computer. That means you're no longer creating containers but services. Each service has tasks, which are the actual deployed containers.
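As a quick illustration, here is a minimal, throwaway example (using the public nginx:alpine image, not part of the Appwrite stack):
# Create a service; Swarm schedules 3 tasks (containers) across the nodes
docker service create --name web --replicas 3 nginx:alpine
# List the tasks backing the service and the node each one landed on
docker service ps web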
Docker Swarm has many pieces; here is the must-know terminology:
Terminology | Meaning |
---|---|
Node | A physical (or virtual) server in the Swarm. |
Manager node | A server ranked as a manager. It can change the deployed services and remove other nodes. |
Worker node | A server ranked as a worker. It can't change anything; it just receives new tasks from a manager. |
Stack | A list of services defined inside a docker-compose.yml file. |
Service | A declaration of a desired state for a given container. The state includes the container image version and various deploy options. For example, setting the replicas deploy option to 4 declares that the service's desired state is to have 4 replicas running. |
Task | An instruction sent from the managers to a worker, containing the details of a container the worker needs to run. |
Replica | The number of copies a given service will have in the Swarm. All the replicas can run on a single node, or they can be scattered across the cluster. |
Scaling | Changing the number of replicas for a given service in real time. |
Rolling update | Changing the image version (upgrade or downgrade) of a service in real time. |
Quorum | When there is more than one manager, the majority of the managers must be available before rolling any update. For example, if you have 3 managers and 2 are down, you won't be able to roll any update. The managers need to "consult" each other and reach a verdict before changing settings, and to make sure the decision was made by you, Docker Swarm requires the majority of the managers to be available. |
For example, if a worker node that is in charge of 3 tasks for some service stops responding, the managers will redeploy those 3 tasks to other nodes.
Manage nodes
To get a list of all available nodes and their status in the Swarm, run:
docker node ls
Drain manager
By default, manager nodes get assigned tasks like any other node. In most cases this is the best option. In a big Swarm, you may want to keep tasks off a given manager node by running:
docker node update --availability drain NODE_ID
Running this command with the manager's node ID will move all the node's current tasks to other nodes and block future tasks from being assigned to it.
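When the maintenance is done, you can return the node to active duty:
docker node update --availability active NODE_ID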
Quorum
One of the hard-to-swallow pills when it comes to Docker Swarm is the whole quorum concept.
When you're running Docker in Swarm mode, you're no longer deploying containers but services. When you have more than one manager, a vote takes place between the managers, and the one that gets the majority becomes the leader.
The leader manager node goes over the list of requested services and their desired state, and updates, deletes, or creates whatever is needed using tasks.
For example, say you've deployed a service that uses appwrite:1.3.4, and now you run a command on a manager to change the image for that service to appwrite:1.3.8. Docker Swarm will use the leader node to manage the upgrade process. So far, so good.
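If you prefer the CLI over editing the stack file, here is a sketch of the same rolling update; the service name appwrite_appwrite assumes the stack was deployed under the name appwrite, as done later in this chapter:
docker service update --image appwrite/appwrite:1.3.8 appwrite_appwrite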
In case the leader node becomes unavailable, a new vote must take place. If the number of remaining manager nodes is less than the majority of the total manager nodes, Docker won't be able to update the service. This is because no manager node can get the majority of votes needed to be declared the leader; no leader, no updates.
In this case, you must do whatever you can to restore enough manager nodes to get back the quorum majority. If that's not possible, recover the Swarm.
Here is a good interactive tutorial to understand the Raft algorithm.
Deploying
To deploy Appwrite to a Swarm, you'll need to follow these steps.
- Init the Swarm.
- Join nodes to the Swarm as managers or workers.
- Set the stack .env and docker-compose.yml files.
- Deploy the stack.
- Set up a load balancer. Docker does the rest for you.
For this example, you'll use a small cluster composed of 5 servers:
- Decentralized server (not inside the Swarm) containing the databases and storage volumes.
- Manager node
- Worker 1 node
- Worker 2 node
- Worker 3 node
Make sure all the servers have Docker installed.
Init the Swarm
Log in to your manager node and run this to initialize the Swarm:
docker swarm init --advertise-addr 10.0.0.1
The --advertise-addr flag sets the IP address that other nodes use to connect to the manager. You can replace the 10.0.0.1 IP with either:
- The manager's internal IP, which is accessible to other nodes - recommended for most use cases.
- The manager's external IP - useful when you want to join nodes that don't share the same network as the manager.
The above command will output something like this:
Swarm initialized: current node (j3wvahq4bf3zy05892grcq4fu) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join --token SWMTKN-1-token-token 10.0.0.1:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
The line that starts with docker swarm join is the command you need to run to join nodes to your Swarm as workers.
You can get this command again any time by running:
docker swarm join-token worker
In case you want to add another manager, you can run:
docker swarm join-token manager
Hostname
Run this command to set the server hostname to manager-1:
hostname manager-1
This is necessary because some of the services you'll deploy will be pinned to the node with manager-1 as its hostname.
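Note that hostname alone may not survive a reboot; on systemd-based distributions, you can make the change persistent with:
hostnamectl set-hostname manager-1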
Join nodes
On each of your nodes, run the join command from the previous step.
docker swarm join --token SWMTKN-1-token-token 10.0.0.1:2377
That's all you need to do when you want to join nodes.
Setting the stack
In your root folder, create a folder named appwrite. Inside that folder, create the two files using the following links.
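For example, assuming your root folder is /root, as used later in this chapter:
mkdir -p /root/appwrite
cd /root/appwrite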
Swarm uses stacks to deploy a list of services across all the Swarm nodes. A stack uses the same YAML syntax as Docker Compose; Docker knows which fields to ignore when you're deploying the file to a Swarm, and vice versa.
This docker-compose.yml file has been adapted to be used as a Swarm stack. While most of the file is just like the regular Appwrite docker-compose.yml file, here are the differences.
Removing fields
When using docker-compose syntax for Swarm, there are some attributes you'll need to remove.
The container_name attribute is not relevant when using Swarm, as access is service-based rather than per-container.
The restart attribute is replaced by the restart_policy setting inside the deploy attribute, which is unique to Swarm.
container_name: appwrite
restart: unless-stopped
Setting deploy
Inside each service you can see the deploy attribute, which contains all the information about how Swarm should deploy that service across the node mesh.
There are two types of deployment inside Swarm:
- replicated (default) - the service is replicated across the cluster, with the number of copies set by replicas.
- global - the service is deployed on every node.
For example, the appwrite service:
deploy:
  mode: global
  restart_policy:
    condition: on-failure
You can see the service mode is set to global, meaning each node will run one appwrite container.
On the other hand, other services will have something like this:
deploy:
  mode: replicated
  replicas: 1
  restart_policy:
    condition: on-failure
Both services use the restart_policy to declare that the service should restart on-failure.
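For replicated services, you can also change the number of replicas at runtime without touching the stack file; for example, with SERVICE_NAME as a placeholder:
docker service scale SERVICE_NAME=3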
Adjusting Traefik
Traefik is the reverse proxy service used by Appwrite to route traffic between the appwrite and appwrite-realtime containers.
To use Traefik with Swarm, you'll need to configure it like so:
services:
  traefik:
    image: traefik:2.7
    <<: *x-logging
    command:
      - ...
    deploy:
      mode: global
      restart_policy:
        condition: on-failure
    ports:
      - target: 80
        published: 80
        mode: host
        protocol: tcp
      - target: 443
        published: 443
        mode: host
        protocol: tcp
The deploy mode is set to global, and the port mode is changed to host, so each Traefik container handles requests on its own. Notice that the protocol is set to tcp, as this is necessary for routing the realtime WebSocket traffic.
Using the hostname
In the appwrite-executor and appwrite-worker-functions services, there's another field:
deploy:
  replicas: 1
  placement:
    constraints:
      - "node.hostname==manager-1"
The use of constraints inside the placement attribute makes sure that these two services will be deployed only to the node with manager-1 as its hostname.
The reason is that these two containers can't be replicated without adjustments, and the best way is to have them on the same node.
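To double-check the hostname Swarm has recorded for a node before relying on this constraint, you can run:
docker node inspect NODE_ID --format '{{ .Description.Hostname }}'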
Deploying the stack
Inside /root/appwrite, run this command.
export $(grep -v '^#' .env | xargs) && docker stack config -c docker-compose.yml
This command loads all the .env variables into the local environment; unlike Docker Compose, docker stack deploy doesn't read the .env file on its own. The docker stack config part then renders the final file, so you can verify the variable substitution worked.
Then, run:
docker stack deploy -c docker-compose.yml appwrite
This command will use the docker-compose.yml file to deploy a stack named appwrite.
In case you change any of the values inside these two files, you can simply run these two commands again; Docker Swarm will know you're just updating the settings.
To explore the deployed services, you can use these commands.
# Show all services
docker service ls
# Get service logs
docker service logs SERVICE_NAME
# Get service containers details and node placement
docker service ps SERVICE_NAME
Load balancer
If you access the IP of any node in the Swarm, Traefik will route you to Appwrite.
To take advantage of that, create a load balancer and add all the Swarm nodes as targets.
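Before adding a node as a target, you can sanity-check that it answers on port 80; for example, using the manager IP from earlier:
curl -I http://10.0.0.1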
Upgrade
Upgrading is separated into two steps.
1. Update the image version
Edit the image version inside your docker-compose.yml file, for example:
# Before
image: appwrite/appwrite:1.3.7
# After
image: appwrite/appwrite:1.3.8
Then run the deployment commands.
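That is, the same two commands you used when deploying the stack:
export $(grep -v '^#' .env | xargs) && docker stack config -c docker-compose.yml
docker stack deploy -c docker-compose.yml appwrite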
2. Migrate
In case there's a need to migrate the database, run this command on any of your nodes:
docker ps
This command lists the local containers. Search for the main appwrite container; it should be named something like appwrite_appwrite.1.a3faf3e.
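You can narrow down the list instead of scanning the full output:
docker ps --filter "name=appwrite_appwrite"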
Then run this command, replacing appwrite_appwrite.1.a3faf3e with your local appwrite container name.
docker exec appwrite_appwrite.1.a3faf3e migrate
Backup & Restore
These instructions are for backing up and restoring the Swarm data.
Backup
On any manager node, run:
# Stop Docker to prevent discrepancy
systemctl stop docker
# Backup the swarm folder (using zip)
zip -r swarm.zip /var/lib/docker/swarm
# Start Docker back again
systemctl start docker
Restore
Create a new manager node, then run:
# Stop docker
systemctl stop docker
# Delete the newly created Swarm data
rm -rf /var/lib/docker/swarm
# Restore from previous backup
unzip swarm.zip -d /var/lib/docker/
# Start docker
systemctl start docker
# Force using the restored cluster.
docker swarm init --force-new-cluster
Recover from losing the quorum
In case you can't bring up the majority of the managers, you can run this command from a working manager node.
docker swarm init --force-new-cluster --advertise-addr 10.0.0.1:2377
Replace 10.0.0.1 with the node's IP.
This special command will remove all other managers and make the current node the leader. All services and workers will get attached to that node. As for the managers, you'll need to reconnect them to the new leader.
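To reconnect them, generate a fresh manager join command on the new leader and run it on each of the old managers (you may need to run docker swarm leave --force on them first):
docker swarm join-token manager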
Benchmarks
Go to Benchmarks to see how Appwrite handles requests when scaling horizontally using Swarm.
Ansible
In the book repo, inside the swarm folder, you'll find an Ansible playbook that automates the whole Swarm installation process.
You just need to set the servers' IPs and run:
ansible-playbook appwrite.yml --ask-vault-pass
I love it!
Pocket size instructions
# Step-by-step summary, checklist style.
1. Create a decentralized **server** containing the databases & storage volumes.
2. Create **server**
1. Create swap file
2. Install docker
3. Mount the decentralized server `share` folder
4. Create a *snapshot* and name it `swap_plus_docker`
5. Init Swarm
6. Add the `swarm` and `swarm-manager` tags to the server.
3. Create another 2 **servers** using the `swap_plus_docker` snapshot.
1. Connect to the Swarm as manager
2. Add the `swarm` and `swarm-manager` tags to the server.
4. Create 5 more **servers** using the `swap_plus_docker`.
1. Connect as worker
2. Add the `swarm` and `swarm-worker` tags to the server.
5. Create `docker-compose.yml` using the Swarm `docker-compose.yml` file
6. Create `.env` file using the Swarm `.env` file
7. Update, set, and back up the `.env` environment variables.
8. Run `export $(grep -v '^#' .env | xargs) && docker stack config -c docker-compose.yml`
9. Run `docker stack deploy -c docker-compose.yml appwrite`
10. Create a **Load-balancer** and balance traffic across all the `swarm`-tagged servers.
# 🚀 Your Swarm has been deployed