Docker Engine with Proxy — 3 of 3
2020-04-15

Goal

Speed up container builds by configuring a locally hosted Squid proxy to intercept internet requests made by the container.

Background

You might say this is a long round trip to return to the point of this exercise. I started with little understanding of all the ways that proxies could be configured, and it feels like I hit every pothole along the way. When we diagnose a system failure we need to test each stage of the process against a known or expected proof of success, which means understanding how the process works, what the log files mean and what they need to say (e.g. decoding Squid 4.8 log keywords). So I decided to work “Outside In” from the initial install of Squid, always aiming for cache hits in the log and never moving on until I got a result that told me I could.

Important note: containers cannot see the loopback IP of the host. Until now 127.0.0.1 was the IP my scripts used to point to the proxy, but containers have their own 127.0.0.1. The loopback address is only reachable locally within a network instance; it always refers to the system’s “self”. Furthermore, my containers can only communicate with machines on the docker-engine’s virtual network.
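A quick way to prove this to yourself, assuming Squid is already listening on port 3128 on the host and using the curlimages/curl image purely as a convenient test client (the exact error text will vary):

$ curl --silent --output /dev/null --write-out '%{http_code}\n' \
     --proxy http://127.0.0.1:3128/ http://deb.debian.org/
200

$ docker run --rm curlimages/curl:latest --silent --show-error \
     --proxy http://127.0.0.1:3128/ http://deb.debian.org/
curl: (7) Failed to connect to 127.0.0.1 port 3128: Connection refused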

Detailed Learning

Listing all the devices with ip address show (or the abbreviation ip a), I can see lo (loopback), enp0s25 (my ethernet), wlp3s0 (my wireless) and, further down the list, docker0. I can filter to it specifically with ip address show dev docker0.

$ ip address show dev docker0
6: docker0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default 
    link/ether 02:42:96:53:1d:81 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
    inet6 fe80::42:96ff:fe53:1d81/64 scope link 
       valid_lft forever preferred_lft forever

Now, significantly, the proxy I’m testing with can be told to listen on specific network devices or on all of them. In my /etc/squid/squid.conf I have specified simply the default port of 3128. This is functionally equivalent to 0.0.0.0:3128, which means Squid will respond on port 3128 on every interface.

$ cat /etc/squid/squid.conf
...
http_port 3128
...
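As an aside, if you wanted Squid to answer only on the docker0 bridge rather than on every interface, the http_port directive also accepts an address. A minimal sketch (not what I’m running here):

$ cat /etc/squid/squid.conf
...
# listen only on the docker0 bridge address
http_port 172.17.0.1:3128
...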

This is important because our containers on docker-engine can only see the host at 172.17.0.1: docker0 performs Network Address Translation (NAT) out to the internet, while internally it is a subnet on which the host is also resident.
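You can see the NAT in question by listing the POSTROUTING chain that dockerd manages; output trimmed here, and the exact columns vary with your iptables version:

$ sudo iptables -t nat -L POSTROUTING -n
Chain POSTROUTING (policy ACCEPT)
target     prot opt source               destination
MASQUERADE  all  --  172.17.0.0/16        0.0.0.0/0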

Hence on the host (not the container) I can use an http_proxy value of any of the device IPs listed by the ip address show command, but within a container I can see the docker0 device corresponding to an IP of 172.17.0.1. Actually I can reach all of the devices from the container, but 172.17.0.1 is what I’d use as I implement more security within my containers.
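Rather than hard-coding 172.17.0.1, you can also ask docker for the bridge gateway address; something like this should return it:

$ docker network inspect bridge \
     --format '{{ (index .IPAM.Config 0).Gateway }}'
172.17.0.1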

Regardless, let’s use this new IP as our proxy value:

$ http_proxy=http://172.17.0.1:3128/ sudo -E \
     apt install --yes neovim

$ sudo tail -n1 /var/log/squid/access.log
1586677965.754      4 192.168.44.1 TCP_HIT/200 1263826 GET http://au.archive.ubuntu.com/ubuntu/pool/universe/n/neovim/neovim_0.3.8-1_amd64.deb - HIER_NONE/- application/x-troff-man

What happened here? Well, I’m calling from the host, so it seems to have translated my virtual address into a network-equivalent address. Regardless, let’s jump into a Debian container and have a look at what we can see from within it.

$ docker run --interactive --tty --rm --name=test2 api:0.1 bash
root@6b1cef9db9ae:/datallama# apt update
...
root@6b1cef9db9ae:/datallama# apt install nmap
...
root@6b1cef9db9ae:/datallama# nmap 172.17.0.1/32
Starting Nmap 7.70 ( https://nmap.org ) at 2020-04-12 07:45 UTC
Nmap scan report for 172.17.0.1
Host is up (0.000020s latency).
Not shown: 998 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
3128/tcp open  squid-http
MAC Address: 02:42:96:53:1D:81 (Unknown)

Nmap done: 1 IP address (1 host up) scanned in 1.63 seconds
root@6b1cef9db9ae:/datallama# nmap 192.168.1.102/32
Starting Nmap 7.70 ( https://nmap.org ) at 2020-04-12 07:45 UTC
Nmap scan report for 192-168-1-102.tpgi.com.au (192.168.1.102)
Host is up (0.000032s latency).
Not shown: 998 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
3128/tcp open  squid-http

Nmap done: 1 IP address (1 host up) scanned in 1.62 seconds
root@6b1cef9db9ae:/datallama# nmap 192.168.44.1/32
Starting Nmap 7.70 ( https://nmap.org ) at 2020-04-12 07:46 UTC
Nmap scan report for 192.168.44.1
Host is up (0.000014s latency).
Not shown: 998 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
3128/tcp open  squid-http

Nmap done: 1 IP address (1 host up) scanned in 1.63 seconds

In this case the docker0 virtual NIC, the wireless device and the ethernet device are all visible, each showing an active 3128 port. I suspect that the way dockerd sets up virtual NICs performs the translation to the hardware card before the request reaches Squid; I’m not sure whether docker will then “smartly” route the packets from Squid back to the container. I’ll need to investigate Software Defined Networking a bit more and answer that question in another blog. BUT, the end result is that we can call this successfully:

docker run --rm --name=test2 api:0.1 bash -c "
  apt -o 'Acquire::HTTP::proxy=http://172.17.0.1:3128/' update && 
  apt -o 'Acquire::HTTP::proxy=http://172.17.0.1:3128/' \
    install neovim --yes"
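The same trick applies at image build time; assuming the standard predefined proxy build-args, a sketch would be:

$ docker build \
     --build-arg http_proxy=http://172.17.0.1:3128/ \
     --build-arg https_proxy=http://172.17.0.1:3128/ \
     --tag api:0.1 .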

Rather than specifying the proxy in every script, you can persist proxy preferences per user within ${HOME}/.docker/config.json:

cat <<EOF | tee ${HOME}/.docker/config.json
{
 "proxies":
 {
   "default":
   {
     "httpProxy": "http://172.17.0.1:3128",
     "httpsProxy": "http://172.17.0.1:3128",
     "noProxy": "localhost,172.17.0.1,127.0.0.1"
   }
 }
}
EOF
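With that file in place the docker client injects matching environment variables into the containers it starts, which apt already honours; you can sanity-check it with something like this (the exact variable casing may differ):

$ docker run --rm api:0.1 env | grep -i proxy
HTTP_PROXY=http://172.17.0.1:3128
http_proxy=http://172.17.0.1:3128
HTTPS_PROXY=http://172.17.0.1:3128
https_proxy=http://172.17.0.1:3128
NO_PROXY=localhost,172.17.0.1,127.0.0.1
no_proxy=localhost,172.17.0.1,127.0.0.1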

$ docker run --rm --name=test2 api:0.1 bash -c "
  apt update && 
  apt install --yes neovim
  apt remove --purge --yes neovim
  apt install --yes neovim"

$ sudo grep 'neovim' /var/log/squid/access.log | tail -n4
1586680013.289     15 172.17.0.3 TCP_HIT/200 3411576 GET http://deb.debian.org/debian/pool/main/n/neovim/neovim-runtime_0.3.4-3_all.deb - HIER_NONE/- application/x-debian-package
1586680013.316     26 172.17.0.3 TCP_REFRESH_UNMODIFIED/200 1287122 GET http://deb.debian.org/debian/pool/main/n/neovim/neovim_0.3.4-3_amd64.deb - HIER_DIRECT/151.101.106.133 application/x-debian-package
1586680013.372      1 172.17.0.3 TCP_MEM_HIT/200 30446 GET http://deb.debian.org/debian/pool/main/p/python-neovim/python-neovim_0.3.0-1_all.deb - HIER_NONE/- application/x-debian-package
1586680013.422     14 172.17.0.3 TCP_REFRESH_UNMODIFIED/200 30482 GET http://deb.debian.org/debian/pool/main/p/python-neovim/python3-neovim_0.3.0-1_all.deb - HIER_DIRECT/151.101.106.133 application/x-debian-package

Outcome

We’re either serving from memory or from disk, or Squid has checked that its cached files are fresh enough and pushed them to the container! So, no matter what container you build, Debian/Fedora/Ubuntu, they can all take advantage of the proxy cache! No more smashing my phone data :)