name: docker
class: middle, center
template:inverse

# Docker

.center[.imgsm[!["Docker Logo"](/teaching/slides/docker/basics/docker-logo.png)]]

[a container runtime for software deployments]

## Concepts and Introduction

by [João Rocha da Silva](https://silvae86.github.io), based on ['Using Docker: Developing and Deploying Software with Containers'](https://www.oreilly.com/library/view/using-docker/9781491915752/) by *Adrian Mouat* and [other sources](#references).

---
name: agenda
class: middle, center

## Agenda

.index[
.indexpill[[Virtual Machines](#virtual-machines)]
.indexpill[[Virtual Machines vs. 'bare metal'](#virtual-machines-vs-bare-metal)]
.indexpill[[VM Hypervisor](#hypervisor)]
.indexpill[[Why Docker?](#why-docker)]
.indexpill[[VMs vs Containers](#vms-vs-containers)]
.indexpill[[Images](#images)]
.indexpill[[Containers](#containers)]
.indexpill[[Volumes](#volumes)]
.indexpill[[Networking](#networking)]
.indexpill[[Dockerfiles](#dockerfiles)]
.indexpill[[Installation](#installation)]
.indexpill[[Example container (Web Server)](#example)]
.indexpill[[References](#references)]
]

---
name: virtual-machines

## Virtual Machines

### What are they?

- A virtual machine is software that *simulates* a physical computer's hardware and software components

### Host vs. Guest

- Several **virtual** machines can run on a single **physical**, or *bare metal*, machine
    - Virtual machines are typically called "guests"
    - The physical machines where they run are called "hosts"

---
name: virtual-machines-vs-bare-metal

## Virtual machines vs. 'bare-metal' - Advantages

- .good[Portability and hardware-agnosticism]
    - The same VM can run on computers with very different hardware and software configurations
    - The [*hypervisor*](#hypervisor) provides an abstraction layer between the virtual and physical hardware configurations
- .good[Faster disaster recovery]
    - Virtual machines can be backed up and restored simply by copying and pasting their *virtual* hard drives, which are simple files on the host's file system.
- .good[Isolation]
    - Improved security for shared hardware machines: several users can have full administration privileges inside their own VMs, but without any access to the host
    - VMs make it easier to set up a *multi-tenant* environment, where resources are shared among the various VMs, which can even belong to different people.

---
name: virtual-machines-vs-bare-metal-2

## Virtual machines vs. 'bare-metal' - Disadvantages

- .bad[High resource consumption]
    - The physical machine needs to virtualise everything, including the operating system
    - RAM usage is the same as that of a 'bare-metal' machine. The host needs a lot of RAM to run several VMs at the same time.
    - Running many VMs on the same host can slow down even powerful servers, because access to the hard drive has to be split among multiple concurrent, random accesses; this is especially hard on mechanical hard drives, less so on SSDs.
- .bad[Lack of access to some low-level functions]
    - If your application needs direct access to some low-level / hardware capabilities (such as 3D acceleration), those may be unavailable.
- .bad[Backups need to include the entire virtual machine]
    - Very large files, as the "virtual hard drive" of a virtual machine takes as much space as an entire hard drive of a 'bare-metal' machine (~hundreds of GB **each**!).
---
name: hypervisor

## The hypervisor

- Software that powers the virtual machines
- It provides virtual networking and storage layers (virtual network cards and virtual hard drives)
- Controls what share of the host's CPU and RAM is given to each of the guests

<br>
<br>

.cols[
.fifty[
.center[
.imgmd[!["VMWare Logo"](/teaching/slides/docker/basics/vmware-logo.png)]

.tiny[[VMWare Workstation by VMWare](https://www.vmware.com/)]
]
]
.fifty[
.center[
.imgmd[!["VirtualBox Logo"](/teaching/slides/docker/basics/virtualbox-logo.png)]

.tiny[[VirtualBox by Oracle](https://www.virtualbox.org/) - (Open source hypervisor)]
]
]
]

---
name: why-docker

## Why Docker? (1/2)

- .good[**Flexibility**]
    - Practically any application can be containerized
- .good[**Lightweight**]
    - Containers share the host kernel, without virtualizing an OS for every application, saving a LOT of resources when compared to using Virtual Machines
- .good[**Portability**]
    - Ensures portability of the execution environment to any machine
    - Application, pre-requisites and dependencies are packaged together
- .good[**Scalability**]
    - From a single container on your laptop to thousands in the cloud, it is all the same technology
- .good[**Loose coupling**]
    - Containers are highly self-sufficient and encapsulated, and can be updated individually without upsetting others
- .good[**Security**]
    - Containers apply aggressive constraints and isolations to processes without any configuration required on the part of the user.

.tiny[.footnote[Source: "Orientation and setup", by [Docker](https://docs.docker.com/get-started/)]]

---
name: why-docker-2

## Why Docker? (2/2)

- .good[**Elasticity for the cloud**]
    - In the cloud, computational resources should be purchased as needed
        - Too many users for too little computational power → poor system performance
        - Too many resources for too few users → waste of money
    - Docker makes it easier to scale applications up to meet peak loads, and then scale down during quiet periods
    - Spin up more or fewer containers (*replicas* of the application) across a datacenter, to respond to application load
- .good[**Separation of code from state**]
    - Separates code and infrastructure configuration (*application logic*) from data (*application state*)
    - Backups only need to worry about the data, as code and infrastructure can be built on-the-fly
- .good[**Native CPU scheduling**]
    - On Linux, containers are seen by the kernel as independent processes, so they can be efficiently managed by the CPU scheduler

---
name: vms-vs-containers

## VMs vs Containers (Cont'd)

.center[
.cols[
.fifty[
.center[
.imgmd2[![Container Engine Architectural Diagram](/teaching/slides/docker/basics/container-engine.png)]
]
]
.fifty[
.center[
.imgmd2[![Virtual Machines Architectural Diagram](/teaching/slides/docker/basics/vms.png)]
]
]
]
]

- Containers share the host's OS and kernel, and do not need to virtualize the Operating System → saving memory

.footnote[.tiny[Images from [*Using Docker: Developing and Deploying Software with Containers*](#references)]]

---
name: architecture

## Architecture

.center[
.imgfull[!["Docker Architecture"](/teaching/slides/docker/basics/architecture.png)]

.tiny[An overview of the Docker architecture (Image by Docker - [Source](https://docs.docker.com/get-started/overview/))]
]

---
name: images

## Images

.cols[
.fifty[
.center[
.imgmd[!["Docker Images"](/teaching/slides/docker/basics/architecture-images.png)]

.tiny[Docker Images (Based on an image by Docker - [Source](https://docs.docker.com/storage/volumes/))]
]
]
.fifty[
- Read-only **templates** with instructions for creating a Docker container
- Images are built using [Dockerfiles](#dockerfiles), written as a sequence of steps to go from the *base image* to your final image. Every step in a Dockerfile creates a new *layer*.
- You can `pull` them from an image registry, e.g. [Docker Hub](https://hub.docker.com/search?q=&type=image&image_filter=official), or build and `push` your own to Docker Hub to publish your work.
]
]

- Images are often based on other images: e.g., you can start from an `ubuntu` image (*base image*) and install additional libraries, resulting in a new image.
- Steps to go from one image to another are called *layers*, because an image is like an onion: made up of several successive sets of changes.
- When images are rebuilt, only the modified layer and the layers after it are remade; unchanged layers, including the base image, are recovered from *cache*. This makes image building much more efficient than building a VM using, say, [Vagrant](https://www.vagrantup.com/).

---
name: containers

## Containers

.cols[
.fifty[
.center[
.imgmd[!["Docker Containers"](/teaching/slides/docker/basics/architecture-containers.png)]

.tiny[Docker containers (Based on an image by Docker - [Source](https://docs.docker.com/storage/volumes/))]
]
]
.fifty[
- A container is a runnable *instance* of an image.
- Instance because you can start __multiple containers__ from the same image, like baking cookies from the same mold!
- You can start, stop, move, or delete containers using the `docker` command.
]
]

- Containers can be connected to one or more [*networks*](#networking) → separate them for security via isolation, or connect them so they work together
- Attach storage to the container via [**volumes**](#volumes) → like plugging in an external hard drive to keep changed files after the container is shut down.
- You can create a new image from the current state of a container (`docker commit`).

.dangerbox[If you start a container from an image and modify anything inside the container, all changes will be lost when you `stop` and `rm` (remove) it.]
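---

## Containers (Example)

The container lifecycle described on the previous slide can be sketched as a short shell function. This is only an illustration: the function name `container_lifecycle`, the `DOCKER` override variable, the container name `web` and the choice of the `nginx` image are assumptions made for this example, not part of the original material.

.small[
```shell
#!/bin/sh
# Hypothetical helper that walks one container through its lifecycle.
# Set DOCKER=echo to print the subcommands instead of running them.
container_lifecycle() {
  image="$1"   # image to start from, e.g. nginx
  name="$2"    # hypothetical container name, e.g. web
  docker_cmd="${DOCKER:-docker}"

  "$docker_cmd" pull "$image"                        # download the image from the registry
  "$docker_cmd" run -d --name "$name" "$image"       # start a container in the background
  "$docker_cmd" stop "$name"                         # stop it; its changes still exist
  "$docker_cmd" commit "$name" "${name}-snapshot"    # keep the current state as a new image
  "$docker_cmd" rm "$name"                           # remove it; uncommitted changes are gone
}
```
]

Running `DOCKER=echo container_lifecycle nginx web` only prints the five subcommands, which is a safe way to read through the sequence before trying `container_lifecycle nginx web` against a real Docker installation.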
---
name: volumes

## Volumes (1/4) - What are they?

.center[
.imgmd[!["Docker Volumes"](/teaching/slides/docker/basics/types-of-mounts-volume.png)]

.tiny[Docker volumes (Image by Docker - [Source](https://docs.docker.com/storage/volumes/))]
]

- Without volumes, containers contain both application code and state (no separation)
    - When the container is deleted, so are any changes made since its instantiation from its base image
- A volume acts like a *mount point* for a container
    - It "injects" a link to a folder from the host machine into the container's file structure
    - That becomes a shared folder between host and container
- You can also use `tmpfs` in Linux to create a memory-based volume for using RAM as a virtual file structure
    - Fast!! But volatile too, good for caching files, for example

---

## Volumes (2/4) - Advantages and Disadvantages

.center[
.imgmd[!["Docker Volumes"](/teaching/slides/docker/basics/volumes-shared-storage.png)]

.tiny[Docker volumes (Image by Docker - [Source](https://docs.docker.com/storage/volumes/))]
]

- .good[Sharing data across different containers and machines]
    - Good for fault-tolerant applications: if one containerized "clone" of your application crashes, another can take over transparently, because they share the same data, or *state*
- .good[Access control]
    - Volumes are **bidirectional** by default: changes made by the host to files inside the volume folder are also propagated to the corresponding folder in the container (and vice-versa)
    - `readonly` volumes will allow the container to read files in the volume, but not change them

---

## Volumes (3/4) - Advantages and Disadvantages Part 2

- .good[Very useful for backups]
    - You link only the folders within the container that hold your application state (say, the folders where you have database files, uploaded images, and any other data)
    - Ignore the rest of the operating system in each container, because all dependencies are handled by the image: great space savings
    - To back up, instead of making an image of the entire container, you just copy the volume folders on the host
    - To go back in time, just replace the volume folders' content with your backup and start a new container with those volumes mounted.
- .bad[Slow I/O on non-Linux Operating Systems]
    - Reads and writes to volume folders can be quite slow on non-Linux operating systems. Watch out if you need intensive I/O in your app.

---

## Volumes (4/4) - Initialization of containerized applications

.tiny[from "Stuff I learned the hard way, page 3519"]

.dangerbox[Volumes are mounted when a container is booted, **replacing the contents** of the folder in the container with the contents of the volume folder from the host]

.small[
1. If you need to initialize your app automatically on first startup (e.g. create default admin users), you cannot do it during image creation; use a startup script inside the container instead.
    - This is because our application state (e.g. some database files) is saved in a folder somewhere in the container. If you initialize those files during the image building process, everything appears to work well **without volumes**.
2. When that folder is later mounted as a volume, its contents **will be replaced** with the contents of the host's folder that is mounted
    - .red[Down the 🚽 goes your initialization], as the host folder behind the volume is most likely empty when it is mounted in the container on first startup!
3. Possible generic solution: Create a dummy file in your data folder after a successful initialization. Let your initialization code check if the file is present. If it is not, then you need to re-initialize. Changes will be propagated to the volume's folder on the host, so this should only happen on first bootup.
    - Alternatively, some web frameworks also provide support for database *migration* and *seeding*, which you can run on application startup.
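
The dummy-file check from step 3 could look roughly like the sketch below. The function name `init_once`, the marker file name `.initialized` and the `admin.txt` placeholder are assumptions made for illustration; the real initialization step depends on your application.

```shell
#!/bin/sh
# Hypothetical startup helper: initialize the data folder only once.
# $1 is the data folder that is mounted as a volume.
init_once() {
  data_dir="$1"
  marker="$data_dir/.initialized"   # dummy file left behind after a successful init

  if [ -f "$marker" ]; then
    echo "already initialized"
    return 0
  fi

  # Placeholder for the real initialization (e.g. creating a default admin user)
  echo "default-admin" > "$data_dir/admin.txt" || return 1

  # Create the marker only after the initialization succeeded
  touch "$marker" || return 1
  echo "initialized"
}
```

Calling something like `init_once /path/to/data` from the container's startup script runs the initialization on first boot only, because the marker file is stored in the volume's folder on the host and survives container restarts.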
]

---
name: networking

## Networking

- Docker containers can live on the same network as other VMs, bare-metal machines, or containers
    - All types of machines can communicate transparently
- 5 networking modes: `bridge`, `host`, `overlay`, `macvlan` and `none`
    - `none` will disable networking completely: no communication
    - We will cover the two most common ones, `bridge` and `host`.

.footnote[.tiny[Source: [https://docs.docker.com/network/](https://docs.docker.com/network)]]

---
name: networking-bridge

## Networking (`bridge` mode)

.cols[
.fifty[
.center[
.imgmd[!["Docker Networking - Bridge Mode"](/teaching/slides/docker/basics/docker-networks-bridge.png)]

.tiny[Docker Networking ([Bridge mode](https://docs.docker.com/network/bridge/))]
]
]
.fifty[
- Containers running on the Bob host can find other containers also running on Bob by name (running `ping solr` on the `my-app` container will return the IP of `solr`, as assigned by Docker).
- Alice cannot find a container by name (`ping solr` on Alice's machine will return Host Not Found).
- Containers can find Alice by name (`ping alice` will return Alice's IP, as given by the Internet Gateway) and access the Internet.
- Containers running on the Bob host cannot find any containers running on Alice
- Multiple `bridge` networks can be created, and containers can only communicate with others in the same `bridge` network (for separation).
]
]

---
name: networking-host

## Networking (`host` mode)

.cols[
.fifty[
.center[
.imgmd[!["Docker Networking - Host Mode"](/teaching/slides/docker/basics/docker-networks-host.png)]

.tiny[Docker Networking ([Host mode](https://docs.docker.com/network/host/))]
]
]
.fifty[
- Only available on Linux
- Containers on the Bob host can access the host network directly.
- No name resolution among containers
    - `my-app` running on Bob will not get any response if it runs `ping mysql`
    - DNS entries (such as `bob.lan <ip>`) must be added at the name-resolution level of the physical network
- Containers can bind their open ports to ports on the host.
    - Alice can access a container on Bob via Bob's IP + the port of the container they want
    - Only one program (and thus, only one container) can be listening on a given port of each host. Beware of conflicts!
]
]

---
name: dockerfiles

## Dockerfiles

- Dockerfiles are files containing the sequence of steps required to build an image.
- They are typically named `Dockerfile` without any extension.

.small[
````dockerfile
# Start with a base image of Ubuntu 18.04, then:
FROM ubuntu:18.04
# Copy current folder on the host to /app on the container
COPY . /app
# run `make` (compilation, etc) on /app to build the app on the container
RUN make /app
# Set default command when container boots up (runs installed app)
CMD python /app/app.py
````
]

To build an image from a `Dockerfile` in the current directory, you run:

.small[
````shell
docker build .
````
]

- More complicated `Dockerfile` examples can be found [here](https://github.com/silvae86/feup-bdad-corrector/blob/master/Dockerfile) and [here](https://github.com/feup-infolab/dendro/blob/master/Dockerfile) if you are curious.
- [Best practices for writing Dockerfiles](https://docs.docker.com/develop/develop-images/dockerfile_best-practices/)
- [`docker build` command reference](https://docs.docker.com/engine/reference/commandline/build/)

---
name: installation

## Installation

- An installation guide for Docker, including how to use a container for local PHP development, is available [here](/teaching/howto/local_dev_with_docker).

.warningbox[Remember to turn on Virtualization Support (or VT-x) in your BIOS/UEFI (press Delete/F2 before Windows starts) in order to run virtualization apps like Docker or a VM Hypervisor.
See more [here](https://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/sect-Virtualization-Troubleshooting-Enabling_Intel_VT_and_AMD_V_virtualization_hardware_extensions_in_BIOS.html).]

.center[
.imglg[!["Virtualization Off - Docker Error"](/teaching/slides/docker/basics/docker-no-virtualization.png)]

.tiny[Oops, I forgot to [turn on Virtualization](https://docs.fedoraproject.org/en-US/Fedora/13/html/Virtualization_Guide/sect-Virtualization-Troubleshooting-Enabling_Intel_VT_and_AMD_V_virtualization_hardware_extensions_in_BIOS.html)!]
]

---
name: example

## Example Container (Apache Web Server + PHP)

- The command that you use to start your server, now explained:

.small[
```sh
# docker run        run is the command for running a container
# -d                run in detached mode (without this, the container
#                   will stop when you close the command line, instead
#                   of running in the background)
# -p 8080:8080      bind port 8080 of the container, which is running the
#                   Apache+PHP server, to port 8080 of the host. This is
#                   what allows you to type localhost:8080 in the browser
#                   and have the container respond
# -it               allocate a tty for the container process
# --name=php        name of the container to create
# -v "$(pwd)/html:/var/www/html"
#                   create a volume to map [current folder]/html on the
#                   host to /var/www/html (default Apache htdocs
#                   location) on the container
# quay.io/vesica/php73:dev
#                   name of the image to base the container on (has
#                   Apache and PHP pre-installed)
docker run \
  -d \
  -p 8080:8080 \
  -it \
  --name=php \
  -v "$(pwd)/html:/var/www/html" \
  quay.io/vesica/php73:dev
```
]

.footnote[.tiny[
- [`docker run` command reference](https://docs.docker.com/engine/reference/run/)
- [What is a TTY?](https://www.howtogeek.com/428174/what-is-a-tty-on-linux-and-how-to-use-the-tty-command/)
]]

---
name: references

## References

- *Using Docker: Developing and Deploying Software with Containers* Mouat, A. (2016). O'Reilly Media, Inc.
- *Docker Overview* Docker Docs [https://docs.docker.com/get-started/overview/](https://docs.docker.com/get-started/overview/)
- *Docker Networking* Docker Docs [https://docs.docker.com/network/](https://docs.docker.com/network/)
- *Most common Docker commands* [Docker Cheat Sheet](https://github.com/wsargent/docker-cheat-sheet)