docker swarm

14/07/2020

create swarm (this host is now a manager)

docker swarm init

the output of init includes the join command to run on the other hosts
it can also be printed again later with :

docker swarm join-token worker
docker swarm join-token manager

now run the command to join the swarm on another host (this other host is now a worker)

docker swarm join --token xxxxxxxxxxxxx X.X.X.X:2377
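
As a rough sketch, the join command can also be rebuilt in a script. `docker swarm join-token -q` prints only the token; the function name `join_command` and the address passed to it are hypothetical, and the function assumes it runs on a manager.

```shell
# Sketch: print the full join command for a role ("worker" or "manager").
# Assumes it runs on a manager; the address argument is a placeholder.
join_command() {
    role="$1"
    manager_addr="$2"    # e.g. X.X.X.X:2377
    # -q prints only the token instead of the whole help text
    token="$(docker swarm join-token -q "$role")" || return 1
    printf 'docker swarm join --token %s %s\n' "$token" "$manager_addr"
}
```

For example, `join_command worker X.X.X.X:2377` prints a line that can be pasted on the new host.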

remove tasks from a node (for example a manager) or reactivate it (nodeName is a hostname or ID from docker node ls) :

docker node update --availability drain nodeName
docker node update --availability active nodeName
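
A minimal sketch of draining a node before maintenance, assuming it runs on a manager. `drain_node` is a hypothetical helper; it polls `docker node ps` until no task on the node has the desired state "running". That only reflects the desired state, so containers may still be shutting down for a few seconds afterwards.

```shell
# Sketch: drain a node and wait until the scheduler has moved its tasks away.
drain_node() {
    node="$1"
    docker node update --availability drain "$node" >/dev/null || return 1
    # Loop while some task is still meant to run on this node.
    while [ -n "$(docker node ps "$node" --filter desired-state=running -q)" ]; do
        sleep 2
    done
    echo "$node drained"
}
```

Once maintenance is done, `docker node update --availability active` on the same node lets it receive tasks again.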

show all hosts from manager

docker node ls

run this on every server to stop swarm (a manager needs --force) :

docker swarm leave

remove a node from the swarm, from a manager (the node must be down or have left the swarm first)

docker node rm fkvwjkdvyg4ymeeq0xak67j4w
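
To find removal candidates, a small sketch can list the IDs of nodes that `docker node ls` reports as Down. `down_nodes` is a hypothetical helper and assumes it runs on a manager.

```shell
# Sketch: print the IDs of all nodes whose status is "Down".
down_nodes() {
    docker node ls --format '{{.ID}} {{.Status}}' \
        | awk '$2 == "Down" { print $1 }'
}
```

Each printed ID can then be passed to `docker node rm` after checking that the node really is gone for good.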

see only the running tasks of a stack (here the stack is named swarm)

docker stack ps -f "desired-state=running" swarm

keep only one old task per service replica, so swarm automatically deletes dead containers

docker swarm update --task-history-limit=1

deploy a stack

create a swarm-wide overlay network named public

docker network create --driver=overlay public
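
Running the create command twice fails because the network already exists. A minimal sketch of an idempotent version, assuming it runs on a manager; `ensure_overlay_network` is a hypothetical name, and the check only looks at whether a network with that name exists, not at its driver.

```shell
# Sketch: create the overlay network only if no network with that name exists.
ensure_overlay_network() {
    name="$1"
    if docker network inspect "$name" >/dev/null 2>&1; then
        echo "network $name already exists"
    else
        docker network create --driver=overlay "$name"
    fi
}
```

Calling `ensure_overlay_network public` before each deploy keeps a deploy script safe to re-run.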

start or update a stack

docker stack deploy --with-registry-auth -c traefik.yml traefik
docker stack deploy --with-registry-auth -c swarmpit.yml swarmpit

stop

docker stack rm traefik
docker stack rm swarmpit

inspect swarm services (a stack creates one service per entry in the compose file, named stack_service, so that is the name to use).

# for a whole stack
docker stack ps errorpage
# for a specific service
docker service ps errorpage_errorpage
docker service inspect errorpage_errorpage
docker service logs -ft errorpage_errorpage

get complete error

    docker inspect $(docker service ps serviceName | tail -n 1 | cut -d ' ' -f 1) | jq '.[].Status'
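
An alternative sketch that does not need jq: `docker service ps` has a `--no-trunc` flag and an `{{.Error}}` format field, which together print the untruncated error of every task. `task_errors` is a hypothetical helper name.

```shell
# Sketch: one line per task with its name, current state and full error text.
task_errors() {
    service="$1"
    docker service ps --no-trunc \
        --format '{{.Name}} [{{.CurrentState}}] {{.Error}}' "$service"
}
```

For example, `task_errors errorpage_errorpage` lists the tasks of that service, with an empty error column for the healthy ones.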
    

docker-compose to swarm

run
docker stack deploy doesn't read the .env file, so variables defined there are not expanded.
So you must generate a standalone version of the compose file with the values already substituted.

    docker-compose -f swarm.yml config > generated.yml
    docker stack deploy --with-registry-auth -c generated.yml swarm
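
The two steps can be wrapped in one function so a failed `config` never leads to a deploy. This is only a sketch: `deploy_stack` is a hypothetical name, and depending on the Compose version the generated file may contain fields that `docker stack deploy` rejects, so it is worth reading `generated.yml` once.

```shell
# Sketch: resolve variables with docker-compose, then deploy the result.
deploy_stack() {
    compose_file="$1"                # e.g. swarm.yml
    stack="$2"                       # stack name, e.g. swarm
    generated="${3:-generated.yml}"  # where the resolved file is written
    # Stop here if the compose file is invalid or a variable cannot be resolved.
    docker-compose -f "$compose_file" config > "$generated" || return 1
    docker stack deploy --with-registry-auth -c "$generated" "$stack"
}
```

Usage: `deploy_stack swarm.yml swarm`.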

add
add a deploy section to every service, and set stop_grace_period long enough for the application to shut down gracefully before it is killed.

    stop_grace_period: 130s
    deploy:
        mode: replicated
        replicas: 1
        update_config:
            failure_action: rollback
            parallelism: 1
            delay: 10s
            order: start-first
        rollback_config:
            parallelism: 1
            delay: 10s
            order: stop-first

remove
remove container_name and restart from every service: docker stack deploy ignores them (restarts are handled by the deploy section instead)

compose version
use at least version 3.7 to handle all deploy options (rollback_config was added in 3.7)

network
since we created an overlay network for the swarm, every service must be connected to it, so add this to every service spec :

    networks:
        - public

define network at the end of compose file :

    networks:
        # Use the previously created public network "public", shared with other
        # services that need to be publicly available via this Traefik
        public:
            external: true

ports
add an explicit service port to help traefik routing (necessary for traefik's api service). in swarm mode traefik reads labels from the service, so they go under deploy.

    labels:
        - traefik.http.services.public.loadbalancer.server.port=8080

publish traefik's ports in host mode (the long syntax), which bypasses the ingress routing mesh

    ports:
        - target: 80
          published: 80
          mode: host
        - target: 443
          published: 443
          mode: host

not the short syntax, which publishes through the ingress routing mesh :

    ports:
        - "80:80"
        - "443:443"

constraints
some services must run only on a manager (e.g. dockercron, dozzle, traefik), so add this under deploy :

        placement:
            constraints:
                - node.role == manager
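
To show how the pieces above fit together, here is a sketch that writes a minimal stack file combining the deploy section, the external network and the placement constraint. The service name `app`, the image `example/app:latest` and the function name `write_example_stack` are hypothetical placeholders, not part of any real stack.

```shell
# Sketch: write a minimal stack file that combines the settings from these notes.
write_example_stack() {
    out="${1:-example-stack.yml}"
    cat > "$out" <<'EOF'
version: "3.7"
services:
    app:                              # hypothetical service name
        image: example/app:latest     # hypothetical image
        stop_grace_period: 130s
        networks:
            - public
        deploy:
            mode: replicated
            replicas: 1
            placement:
                constraints:
                    - node.role == manager
            update_config:
                failure_action: rollback
                parallelism: 1
                delay: 10s
                order: start-first
networks:
    public:
        external: true
EOF
}
```

The resulting file can be passed to `docker stack deploy -c` once the image is replaced with a real one and the public network exists.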