docker swarm


create swarm (this host is now a manager)

docker swarm init

the output includes the join command to run on the other hosts.
it can be printed again at any time with these commands:

docker swarm join-token worker
docker swarm join-token manager
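
for scripts, a small sketch (assuming a POSIX shell, run on a manager): the -q flag prints only the token, so the join command can be built without copy/paste. the function name and the address in the usage line are made up for the example.

```shell
# build the worker join command from the token alone
# $1: address of the manager the new host should contact
worker_join_command() {
    # -q prints the bare token instead of the full help text
    token="$(docker swarm join-token -q worker)" || return 1
    printf 'docker swarm join --token %s %s:2377\n' "$token" "$1"
}

# usage (hypothetical address):
#   worker_join_command 10.0.0.1
```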

now run the join command on another host (that host becomes a worker)

docker swarm join --token xxxxxxxxxxxxx X.X.X.X:2377

remove tasks from a node (e.g. a manager) or reactivate it; <node> is a hostname or ID from docker node ls:

docker node update --availability drain <node>
docker node update --availability active <node>
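
a minimal sketch to drain a node and confirm the change, assuming a POSIX shell on a manager. docker node update needs the node's hostname or ID; drain_and_check is a name made up for this example.

```shell
# drain a node, then print the availability the swarm has recorded for it
# $1: node hostname or ID
drain_and_check() {
    node="$1"
    docker node update --availability drain "$node" > /dev/null || return 1
    # .Spec.Availability is one of: active, pause, drain
    docker node inspect --format '{{ .Spec.Availability }}' "$node"
}

# usage (hypothetical node name):
#   drain_and_check worker1
```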

list all nodes (run on a manager)

docker node ls

run this on every server to leave the swarm (a manager needs --force):

docker swarm leave

remove a node from the swarm (run on a manager; the node must be down or have left the swarm)

docker node rm fkvwjkdvyg4ymeeq0xak67j4w

see only the running tasks of a stack (here the stack is named swarm)

docker stack ps -f "desired-state=running" swarm

configure swarm to keep only one finished task in its history, so dead containers get cleaned up automatically

docker swarm update --task-history-limit=1

deploy a stack

create an overlay network named public, shared by all stacks

docker network create --driver=overlay public

start or update a stack

docker stack deploy --with-registry-auth -c traefik.yml traefik
docker stack deploy --with-registry-auth -c swarmpit.yml swarmpit

remove a stack

docker stack rm traefik
docker stack rm swarmpit

inspect swarm services (a stack prefixes the services it creates, so a service is addressed as <stack>_<service>).

# for a whole stack
docker stack ps errorpage
# for a specific service
docker service ps errorpage_errorpage
docker service inspect errorpage_errorpage
docker service logs -ft errorpage_errorpage

get the complete error of a task (here the last one listed for the service)

    docker inspect $(docker service ps -q serviceName | tail -n 1) | jq '.[].Status'
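
another way to read errors, as a sketch: docker service ps cuts long messages unless --no-trunc is passed, and --format limits the output to the useful columns. task_errors is a name made up for this example.

```shell
# list every task of a service with its state and its untruncated error
# $1: service name (stack_service)
task_errors() {
    docker service ps --no-trunc \
        --format '{{.Name}}: {{.CurrentState}} {{.Error}}' "$1"
}

# usage:
#   task_errors errorpage_errorpage
```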

docker-compose to swarm

docker stack deploy doesn't read the .env file, so variables defined there are not expanded.
Generate a standalone version of the compose file first, then deploy it:

    docker-compose -f swarm.yml config > generated.yml
    docker stack deploy --with-registry-auth -c generated.yml swarm
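
the two steps can be wrapped in one function. this is only a sketch, assuming a POSIX shell and mktemp; deploy_stack and the COMPOSE variable are made up for the example. it stops if the compose file fails to render, so a broken file never reaches the swarm.

```shell
# render a compose file with docker-compose, then deploy the result
# $1: compose file, $2: stack name
deploy_stack() {
    compose_file="$1"
    stack="$2"
    generated="$(mktemp)" || return 1

    # docker-compose resolves .env and ${VAR} references and prints plain YAML
    if ! ${COMPOSE:-docker-compose} -f "$compose_file" config > "$generated"; then
        rm -f "$generated"
        return 1
    fi

    docker stack deploy --with-registry-auth -c "$generated" "$stack"
    status=$?
    rm -f "$generated"
    return "$status"
}

# usage:
#   deploy_stack swarm.yml swarm
```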

add a deploy section to every service and tune stop_grace_period so graceful shutdowns have time to finish.

    stop_grace_period: 130s
    deploy:
        mode: replicated
        replicas: 1
        update_config:
            failure_action: rollback
            parallelism: 1
            delay: 10s
            order: start-first
        rollback_config:
            parallelism: 1
            delay: 10s
            order: stop-first

remove container_name and restart from every service: docker stack deploy ignores them (use deploy.restart_policy instead of restart)

compose version
use at least version 3.7 to handle all deploy options

since we created an overlay network for the swarm, every service must be connected to it, so add it to every service spec.

    networks:
        - public

define the network at the end of the compose file:

    networks:
        # Use the previously created network "public", shared with other
        # services that need to be publicly available via this Traefik
        public:
            external: true

add an explicit service port to help traefik routing (necessary for traefik's api service)
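
the label itself is not in these notes. as a sketch only, assuming Traefik v2 label names: the port is declared with a traefik.http.services.<name>.loadbalancer.server.port label on the swarm service. in a compose file that label goes under deploy.labels; the same label can be set on a running service from the CLI. set_traefik_port, the traefik service name and the port below are made up for the example.

```shell
# add the traefik port label to a running swarm service
# $1: swarm service (stack_service)
# $2: traefik service name used in the label (assumption: Traefik v2 naming)
# $3: container port traefik should route to
set_traefik_port() {
    docker service update \
        --label-add "traefik.http.services.$2.loadbalancer.server.port=$3" \
        "$1"
}

# usage (hypothetical names):
#   set_traefik_port traefik_traefik api 8080
```

a later docker stack deploy resets the service to what the compose file says, so the label also has to be added there to stay in place.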


specify host mode ports on traefik (long syntax)

    ports:
        - target: 80
          published: 80
          mode: host
        - target: 443
          published: 443
          mode: host

not this (the short syntax publishes through the ingress routing mesh, which hides the client IP)

    ports:
        - "80:80"
        - "443:443"

some services must run only on a manager (e.g. dockercron, dozzle, traefik)

    deploy:
        placement:
            constraints:
                - node.role == manager
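
to check that a node.role == manager constraint is really in effect, a small sketch (POSIX shell, run on a manager): compare the nodes running the service's tasks with the list of managers. runs_only_on_managers is a name made up for this example.

```shell
# succeed only if every running task of a service sits on a manager node
# $1: service name (stack_service)
runs_only_on_managers() {
    service="$1"
    managers="$(docker node ls --filter role=manager --format '{{.Hostname}}')" || return 1
    nodes="$(docker service ps --filter desired-state=running --format '{{.Node}}' "$service")" || return 1

    for node in $nodes; do
        # -x: whole-line match, -F: plain string, so hostnames are not treated as regexes
        if ! printf '%s\n' "$managers" | grep -qxF "$node"; then
            echo "$service has a running task on non-manager node $node"
            return 1
        fi
    done
    return 0
}

# usage (hypothetical service name):
#   runs_only_on_managers traefik_traefik && echo ok
```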