Package stores
All package repositories are technically similar and unfancy. Technical boredom is a guarantee of stability.
On a website, packages are indexed per release, per version, per architecture (and sometimes per group, like “security” or “backport”).
Lots of technical details are explained in the previous blog post (sorry for my French).
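As an illustration of that indexing, here is a sketch of the Debian repository layout (the package file name is just an example):

```
debian/dists/bookworm/Release                          # signed index of the per-arch indexes
debian/dists/bookworm/main/binary-amd64/Packages.gz    # package names, versions, and hashes
debian/pool/main/c/cowsay/cowsay_3.03+dfsg2-8_all.deb  # the package itself
```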
Faster, closer, and more efficient
If you manage a large fleet of servers or a busy continuous integration service, you have to cache package repositories (to avoid bans, throttling, or bad karma).
Package integrity is handled independently of the distribution channel, with signatures. At least, it should be.
The distribution channel (HTTP, most of the time) is therefore not a security problem.
Old-school providers sign the index of the packages (including their hashes) with a key whose public part is trusted by the client.
The (almost) universal signing tool for open-source code is Sigstore.
With signatures, package integrity is guaranteed, so caching or mirroring is safe.
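The hash half of that chain can be sketched in plain shell: the client recomputes each downloaded file's hash and compares it with the (signed) index, so a lying cache or mirror is caught immediately:

```shell
# fake package and fake index; in a real repository, the index is what gets signed
echo "pretend this is a .deb" > demo.deb
sha256sum demo.deb > SHA256SUMS
# later, after fetching demo.deb through any untrusted cache or mirror:
sha256sum -c SHA256SUMS   # prints "demo.deb: OK", exits non-zero on tampering
```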
Repositories can be mirrored, but beware of the synchronization periodicity.
Some package providers run their own mirrors, explicit (mirror lists) or implicit (GeoDNS and cache proxies). They can broadcast updates to minimize mirror lag.
Bring your own cache
Cache can be fancy, with a lot of features and complexity, or crude, just a plain old HTTP cache proxy.
Fancy caches
Here is a short selection of specialized proxies:
- Some, like Pulp, handle several kinds of packages.
- apt-proxy (which also handles yum and apk) is designed to handle flaky Internet connections and to pick the best mirror behind it.
Old-school caches
Squid is boring; let's use Nginx's cache.
A proxy cache for your repositories
Dinosaurs remember the time when a simple HTTP_PROXY environment variable plugged a proxy in front of the target site.
A proxy can only cache plain HTTP; HTTPS encrypts the connection end to end, so the content can't be cached.
Nowadays, all repositories use HTTPS by default (kudos to SSL Everywhere). The proxy must act as a "man in the middle" and intercept the queries.
Each kind of package must be configured specifically.
Setting a proxy for all package families can be done with a config file. It’s explicit, but not universal.
Developers and CI can use different private caches.
Some package managers can be configured with a few environment variables; for the others, a mix of configuration files and environment variables does the trick.
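For instance, pip and npm both read their settings from environment variables (the cache address below is an assumption):

```shell
# pip: PIP_<OPTION> environment variables mirror pip.conf entries
export PIP_INDEX_URL="http://192.168.1.35:8082/pypi"
export PIP_TRUSTED_HOST="192.168.1.35"
# npm: any npm_config_* variable overrides the npmrc files
export npm_config_proxy="http://192.168.1.35:8082/"
```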
Demo time
One cache to rule them all.
Setting up private DNS and TLS is boring; life is short. The easiest and laziest way is to mix local and containerized development with a bare IP and a containerized cache server.
Please don't do that in a real production environment.
Caching for a few package families is explained later in this blog post.
These examples use Docker to build images.
The adaptation for local development is trivial.
Docker has the ONBUILD
instruction in its Dockerfile, very useful for setting variables in the build process.
The easiest way to build an image that can use the cache is to build it atop the main language image.
The layers of the cake are:
- base image
- cacheable image
- project image
Nginx cache
The official Nginx image doesn't have the subs-filter
module.
Let's build a new image with it.
docker build \
--build-arg ENABLED_MODULES="subs-filter" \
-t nginx-subs \
https://raw.githubusercontent.com/nginx/docker-nginx/refs/heads/master/mainline/alpine/Dockerfile
Each cached package family uses path routing; when that's impossible, hostname routing is used (the client is configured to use the cache as a proxy).
Nginx configuration is minimalistic, tune it if you wish.
Pick your favorite resolver; I use 193.110.81.0 (dns0.eu), 8.8.8.8 (Google) is "déjà vu".
Cache size needs your attention: set a correct value, neither too small nor too huge.
One server, one port, no hostname.
worker_rlimit_nofile 8192;
events {
worker_connections 4096;
}
http {
log_format proxy '$remote_addr '
'"$request" $status $bytes_sent bytes'
' -> "$upstream_addr" "$http_location" '
' "$http_user_agent" ';
error_log /dev/stdout info;
access_log /dev/stdout proxy;
include /etc/nginx/mime.types;
index index.html;
resolver 193.110.81.0; # dns0.eu
default_type application/octet-stream;
tcp_nopush on;
server_names_hash_bucket_size 128;
# keys_zone: ~10 MB of shared memory for cache keys (roughly 80k entries)
# max_size: evict least-recently-used entries past 1 GB on disk
# inactive: drop entries not requested for an hour
proxy_cache_path /data/cache keys_zone=fat_cache:10m max_size=1g inactive=60m use_temp_path=off;
server {
listen 80;
}
}
The server section is empty; some location
blocks will be added later.
Pick a private interface, and use its IP for publishing the cache service. The cache server needs to be reachable from containers and from your workstation.
The IP is stored as SERVER_IP:
ip -f inet addr show docker0 | grep "inet " | sed -E "s#.*inet (.*)/.*#\1#g"
Or, something dirtier on macOS: docker0
is inside the VM, so the laptop's IP is used (check your firewall settings first).
ifconfig -v en0 | grep "inet " | cut -w -f 3
You can now run the cache server (with its SERVER_IP).
mkdir -p data/cache
docker run --rm \
-v `pwd`/data/cache:/data/cache \
-v `pwd`/nginx.conf:/etc/nginx/nginx.conf:ro \
-p ${SERVER_IP}:8082:80 \
nginx-subs
On Linux, you should run the container with your own UID; a cache owned by root is unorthodox.
Debian (and Ubuntu)
Debian uses URLs starting with /debian.
server {
location ~^/debian(.*)$ {
proxy_cache fat_cache;
proxy_cache_background_update on;
proxy_cache_lock on;
proxy_pass http://deb.debian.org/debian$1;
}
}
Cache image:
FROM debian:bookworm-slim
ONBUILD ARG APT_MIRROR=""
ONBUILD RUN if [ -z "$APT_MIRROR" ] ; \
then \
echo 'No mirror'; \
else echo "Acquire::http::Proxy \"$APT_MIRROR\";" \
> /etc/apt/apt.conf.d/cache.conf; \
fi
docker build -f Dockerfile.debian-mirror -t deb-mirror .
Debian demo image:
FROM deb-mirror
RUN apt-get update \
&& apt-get install -y --no-install-suggests --no-install-recommends \
cowsay \
&& rm -rf /var/lib/apt/lists/*
CMD ["cowsay", "Through the Looking-Glass"]
Build it, with a specific cache server.
docker build \
-f Dockerfile.debian \
-t debian-demo \
--build-arg APT_MIRROR=http://192.168.1.35:8082/ \
.
You should see a flow of package URLs in the Nginx cache terminal.
Run it:
docker run --rm debian-demo
The Ubuntu variant only needs a few modifications.
Ubuntu doesn't use prefixed URLs; the hostname must be used.
server {
server_name ~^(.*)\.ubuntu.com$;
listen 80;
location / {
proxy_cache fat_cache;
proxy_cache_background_update on;
proxy_cache_lock on;
proxy_pass https://$1.ubuntu.com;
}
}
Alpine
Nginx conf:
location /alpine/ {
proxy_cache fat_cache;
proxy_cache_background_update on;
proxy_cache_lock on;
proxy_pass https://dl-cdn.alpinelinux.org/alpine/;
}
Cache image:
FROM alpine:latest
ONBUILD ARG HTTP_PROXY=""
ONBUILD RUN if [ -z "$HTTP_PROXY" ] ; \
then \
echo 'No mirror'; \
else \
sed -i 's/https/http/g' \
/etc/apk/repositories; \
fi
There is a trick: $HTTP_PROXY
is a standard variable, and will be used directly by apk
as its proxy.
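What that sed actually does can be shown offline (the repository line is a sample):

```shell
# sample /etc/apk/repositories line
printf 'https://dl-cdn.alpinelinux.org/alpine/v3.20/main\n' > repositories
# downgrade to plain http, so the proxy sees cacheable requests
sed -i 's/https/http/g' repositories
cat repositories   # -> http://dl-cdn.alpinelinux.org/alpine/v3.20/main
```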
Build cache image:
docker build \
-f Dockerfile.alpine-with-cache \
-t alpine-with-cache \
.
Demo image:
FROM alpine-with-cache
RUN apk add --no-cache figlet
CMD ["figlet", "Carpe diem"]
Build demo image:
docker build \
-f Dockerfile.alpine \
-t alpine-demo \
--build-arg HTTP_PROXY=${SERVER_IP}:8082 \
.
Run demo:
docker run --rm alpine-demo
PyPI
Nginx conf for PyPI:
location ~ ^/pypi/(.*)$ {
proxy_cache fat_cache;
proxy_pass https://pypi.org/simple/$1;
proxy_cache_background_update on;
proxy_cache_lock on;
proxy_ssl_protocols TLSv1.2;
proxy_ssl_session_reuse off;
proxy_ssl_server_name on;
proxy_ssl_name pypi.org;
}
pip
can use an index without HTTPS, but the host needs to be declared trusted.
Cache image:
FROM python:3.13-slim
ONBUILD ARG PYPI_CACHE=""
ONBUILD RUN if [ -z "$PYPI_CACHE" ] ; \
then \
echo 'No mirror'; \
else \
echo "[global]\n\
index-url = http://${PYPI_CACHE}/pypi\n\
trusted-host = $(echo ${PYPI_CACHE} | cut -d : -f 1)" \
> /etc/pip.conf ;\
fi
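The echo escape trick above relies on dash's builtin echo interpreting \n; a printf variant of the same pip.conf generation (the cache address is an assumption) is more portable:

```shell
PYPI_CACHE="192.168.1.35:8082"
# same [global] section as the Dockerfile writes, without echo's \n quirks
printf '[global]\nindex-url = http://%s/pypi\ntrusted-host = %s\n' \
    "$PYPI_CACHE" "${PYPI_CACHE%%:*}" > pip.conf
cat pip.conf
```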
Build cache image:
docker build \
-f Dockerfile.python-with-cache \
-t python-with-cache \
.
Demo image:
FROM python-with-cache
RUN python3 -m venv /demo \
&& /demo/bin/pip install cowsay
CMD ["/demo/bin/cowsay", \
"-c", "tux", \
"-t", "\"Welcome to the thunder dome\"" ]
Build example:
docker build \
-f Dockerfile.python \
-t python-demo \
--build-arg PYPI_CACHE=${SERVER_IP}:8082 \
.
Run example:
docker run --rm python-demo
npm
Nginx conf:
server {
server_name registry.npmjs.org;
listen 80;
location / {
proxy_cache fat_cache;
proxy_cache_background_update on;
proxy_cache_lock on;
proxy_pass https://registry.npmjs.org/;
proxy_ssl_protocols TLSv1.2;
proxy_ssl_session_reuse off;
proxy_ssl_server_name on;
proxy_ssl_name registry.npmjs.org;
}
}
Cache image:
FROM node:24-alpine
ONBUILD ARG NPM_CACHE=""
ONBUILD RUN if [ -z "$NPM_CACHE" ] ; \
then \
echo 'No mirror'; \
else \
npm set proxy "http://${NPM_CACHE}/" --location global && \
npm set https-proxy "http://${NPM_CACHE}/" --location global &&\
npm set registry http://registry.npmjs.org/; \
fi
Build cache image:
docker build \
-f Dockerfile.npm-with-cache \
-t node-with-cache \
.
Demo image:
FROM node-with-cache
# npm should be somewhere, not /
RUN mkdir -p /opt/demo \
&& cd /opt/demo \
&& npm --verbose install cowsay
CMD ["/opt/demo/node_modules/.bin/cowsay", \
"-e", "xx", \
"\"With a little help from my friends\"" ]
Build demo image:
docker build \
-f Dockerfile.node \
-t node-demo \
--build-arg NPM_CACHE=${SERVER_IP}:8082 \
.
Run demo:
docker run --rm node-demo
Docker
The Docker daemon can use a mirror.
Docker Hub now has quotas; if you don't want to be banned, use a mirror.
A private registry is mandatory to deploy your own images.
All registries can be used as a proxy cache for another public or private registry.
Harbor, graduated by the CNCF, like all registries, can be a proxy cache.
Using Nginx for caching Docker Hub should not be done in production, but it's fun to cache everything with one server.
Docker daemon configuration.
Add this line in the daemon.json
config file:
"registry-mirrors": ["http://_server_ip_:8082/docker/"]
On Linux the path is /etc/docker/daemon.json
, but before breaking something, RTFM the Docker configuration documentation.
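A complete, minimal daemon.json (the IP is an assumption; a file with only this key is valid):

```json
{
  "registry-mirrors": ["http://192.168.1.35:8082/docker/"]
}
```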
With Docker Desktop (OSX or Windows), the configuration is in the tab “Docker Engine.”
Docker can use a containerized mirror, but pull and build the image BEFORE using it.
Nginx conf:
location /docker/ {
rewrite ^/docker/(.*)$ /$1 break;
proxy_cache fat_cache;
proxy_cache_background_update on;
proxy_cache_lock on;
proxy_pass https://registry.hub.docker.com/;
}
GitHub Demo Project
Copy-pasting is boring; the package-cache repository contains all the files cited in this post, with a useful Makefile.
Split your terminal (with tmux, maybe), then build and run the cache server.
make cache
Run all demos:
make demo
For the Docker cache demo, you have to tweak your Docker daemon configuration and pull a fresh image, not one already cached in your local registry.
Cache all the things
CI can share a private local cache between steps.
Useful, but this cache is not shared between projects (cache poisoning is very dangerous).
The cache proxy, on the other hand, is safe: users can't write (and poison) data, and the cache is shared between all project builds.
Have fun with the Nginx demo, but sooner or later, you will use specific caches.