+++
draft = false
title = "Copying HTML files by hand is for suckers"
date = "2023-02-02"
author = "Nick Dumas"
authorTwitter = ""
cover = ""
tags = ["drone", "gitea", "obsidian", "devops"]
keywords = ["drone", "gitea", "obsidian", "devops"]
description = "How I built a drone instance and pipeline to publish my blog"
showFullContent = false
+++

Attribution

Credit to Jim Sheldon in the Harness Slack server, who pointed me here, which provided much of the starting skeleton of the project.

The Old way

I use hugo to build my blog, and I love it. Static sites are the way to go for most content, and keeping them in git provides strong confidence that I'll never lose my work. I really like working in Markdown, and hosting is cheap and easy. Unfortunately, my current setup is extremely manual; I run hugo myself and copy the files into /var/www.

For a long time, this has been a really uncomfortable process and is part of why I find myself so disinterested in writing with any frequency. When the new year rolled around, I decided it was time to do better.

I want every push to my blog repository to generate a new hugo build and publish my content somewhere. The tools I've chosen are gitea for managed git services, drone for continuous integration/deployment, and hugo to build the site.

Hello Drone

Standing up a working Drone instance involves a few moving pieces:

  1. Configure an OAuth2 application in your hosted git service so your Drone instance can authenticate against it.
  2. Run the drone server itself, which hosts the web UI and database and responds to webhooks.
  3. Run the drone-runner, a separate entity that communicates with drone and actually executes pipelines. There are a few flavors of drone-runner, and I've selected the docker runner.
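
The first item can be scripted rather than clicked through. The sketch below is an assumption-heavy starting point, not what I actually ran: it assumes Gitea's `POST /api/v1/user/applications/oauth2` endpoint, a personal access token in a hypothetical `GITEA_TOKEN` variable, and Drone's `/login` callback path. The function names are made up for illustration, so check everything against your Gitea and Drone versions.

```shell
#!/usr/bin/env bash
# Sketch only: function names and GITEA_TOKEN are hypothetical.

# Build the JSON body for a new OAuth2 application.
oauth_app_payload() {
  printf '{"name":"%s","redirect_uris":["%s"]}\n' "$1" "$2"
}

# POST the payload to Gitea; the response should carry the client ID and
# secret that end up in DRONE_GITEA_CLIENT_ID / DRONE_GITEA_CLIENT_SECRET.
create_oauth_app() {
  curl -sS -X POST "https://code.ndumas.com/api/v1/user/applications/oauth2" \
    -H "Authorization: token ${GITEA_TOKEN}" \
    -H "Content-Type: application/json" \
    -d "$(oauth_app_payload drone https://drone.ndumas.com/login)"
}

# Usage (not run here): create_oauth_app
```

Keeping the payload in its own function makes it easy to eyeball the JSON before anything is sent over the network.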

Step 1 is accomplished manually, or with the gitea admin API. Using docker-compose, I was able to assemble the following configuration files to satisfy points 2 and 3.

docker-compose

version: '3.6'
services:
  drone:
    container_name: drone
    image: drone/drone:${DRONE_VERSION:-1.6.4}
    restart: unless-stopped
    environment:
      # https://docs.drone.io/server/provider/gitea/
      - DRONE_DATABASE_DRIVER=sqlite3
      - DRONE_DATABASE_DATASOURCE=/data/database.sqlite
      - DRONE_GITEA_SERVER=https://code.ndumas.com
      - DRONE_GIT_ALWAYS_AUTH=false
      - DRONE_RPC_SECRET=${DRONE_RPC_SECRET}
      - DRONE_SERVER_PROTO=https
      - DRONE_SERVER_HOST=drone.ndumas.com
      - DRONE_TLS_AUTOCERT=false
      - DRONE_USER_CREATE=${DRONE_USER_CREATE}
      - DRONE_GITEA_CLIENT_ID=${DRONE_GITEA_CLIENT_ID}
      - DRONE_GITEA_CLIENT_SECRET=${DRONE_GITEA_CLIENT_SECRET}
    ports:
      - "3001:80"
      - "3002:443"
    networks:
      - cicd_net
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - ./drone:/data:z

  drone-runner:
    container_name: drone-runner
    image: drone/drone-runner-docker:${DRONE_RUNNER_VERSION:-1}
    restart: unless-stopped
    depends_on:
      - drone
    environment:
      # https://docs.drone.io/runner/docker/installation/linux/
      # https://docs.drone.io/server/metrics/
      - DRONE_RPC_PROTO=https
      - DRONE_RPC_HOST=drone.ndumas.com
      - DRONE_RPC_SECRET=${DRONE_RPC_SECRET}
      - DRONE_RUNNER_NAME=${HOSTNAME}-runner
      - DRONE_RUNNER_CAPACITY=2
      - DRONE_RUNNER_NETWORKS=cicd_net
      - DRONE_DEBUG=false
      - DRONE_TRACE=false
    ports:
      - "3000:3000"
    networks:
      - cicd_net
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

networks:
  cicd_net:
    name: cicd_net

All of the docker-compose files were ripped straight from the documentation, so there's very little surprising going on. The most common pitfall seems to be setting DRONE_SERVER_HOST or DRONE_RPC_HOST to a URL instead of a bare hostname.
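
One way to guard against the URL-versus-hostname pitfall is to normalize the value before handing it to compose. A minimal sketch, assuming a hypothetical helper named `bare_host` that is not part of Drone or of my setup:

```shell
#!/usr/bin/env bash
# Sketch only: bare_host is a hypothetical helper.

# Reduce a URL-ish value to just its host part.
bare_host() {
  local v="$1"
  v="${v#*://}"   # drop a leading scheme such as https://
  v="${v%%/*}"    # drop any path or trailing slash
  printf '%s\n' "$v"
}

# Usage: export DRONE_RPC_HOST="$(bare_host "https://drone.ndumas.com/")"
```

A value that is already a bare hostname passes through unchanged, so the helper is safe to apply unconditionally.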

For me, the biggest hurdle I had to vault was SELinux. Because this is a fresh Fedora install, SELinux hasn't been relaxed in any way.

When dealing with SELinux, your friends are ausearch and audit2{why,allow}. In my case, I needed to grant system_u:system_r:container_t on /var/run/docker.sock so drone and drone-runner can access the host Docker service.
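
A typical loop with those tools looks roughly like the following. This is a sketch under assumptions rather than a record of the exact commands I ran: the function names and the module name `dronesock` are hypothetical, and the generated policy should be read before it gets installed.

```shell
#!/usr/bin/env bash
# Sketch only: the usual SELinux triage commands wrapped in functions.

# Show the current SELinux label on the Docker socket.
show_sock_label() {
  ls -Z /var/run/docker.sock
}

# Explain recent AVC denials in plain language.
explain_denials() {
  ausearch -m avc -ts recent | audit2why
}

# Generate a local policy module (dronesock.te and dronesock.pp)
# from the same recent denials.
build_module() {
  ausearch -m avc -ts recent | audit2allow -M dronesock
}

# Load the compiled module once dronesock.te has been reviewed.
install_module() {
  semodule -i dronesock.pp
}
```

Run as root, the order would be `explain_denials`, then `build_module`, then `install_module`, repeating until the container stops tripping new denials.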

That wasn't the end of my SELinux woes, though. Initially, my Drone instance was crashing with "cannot open database file" errors. The fix is the :z suffix on the following line, which tells docker to automatically apply the SELinux labels necessary to make the directory mountable.

      - ./drone:/data:z

Why didn't this work for docker.sock? I really couldn't say; I did try it. With all the SELinux policies configured, I had a Drone instance that was able to see my Gitea repositories.

caddy config

drone.ndumas.com {
	encode gzip
	reverse_proxy localhost:3001
}

The caddy configuration is a very simple reverse proxy. Caddy has built-in Let's Encrypt support, so it works nicely as the last hop for internet traffic. sudo caddy start will run caddy and detach, and with that Drone has been exposed to the internet under a friendly subdomain.

startup script

#!/usr/bin/env bash

export HOSTNAME=$(hostname)
export DRONE_VERSION=2.16.0
export DRONE_RUNNER_VERSION=1.8.3
export DRONE_ADMIN_USER="admin"
export DRONE_RPC_SECRET="$(echo ${HOSTNAME} | openssl dgst -md5 -hex|cut -d' ' -f2)"
export DRONE_USER_CREATE="username:${DRONE_ADMIN_USER},machine:false,admin:true,token:${DRONE_RPC_SECRET}"

# These are set in ~/.bash_profile
# export DRONE_GITEA_CLIENT_ID=""
# export DRONE_GITEA_CLIENT_SECRET=""
docker-compose -f docker-compose/drone.yml up  -d
caddy start --config caddy/drone --adapter caddyfile

The startup script, drone.sh, injects some environment variables. Most of these are boring, but DRONE_RPC_SECRET and DRONE_USER_CREATE are the two most important. The script makes both deterministic: it creates an admin user whose access token is the md5 of your host machine's hostname (strictly, of the hostname plus the trailing newline that echo appends).

This really saved my bacon when I realized I didn't know how to access the admin user for my drone instance when I needed it. Diving into your Drone instance's database is technically on the table, but I wouldn't advise it.

It's pipeline time

Once I had drone up and running, getting my blog publishing pipeline going was a relatively straightforward process: write a pipeline step, commit, push, check Drone for a green build. After a couple days of iterating, the complete result looks like this:

kind: pipeline
name: default

steps:
- name: submodules
  image: alpine/git
  commands:
  - git submodule update --init --recursive
- name: build
  image: alpine:3
  commands:
  - apk add hugo
  - hugo
- name: publish
  image: drillster/drone-rsync
  settings:
    key:
      from_secret: blog_sync_key
    user: blog
    delete: true
    recursive: true
    hosts: ["blog.ndumas.com"]
    source: ./public/
    target: /var/www/blog.ndumas.com
    include: ["*"]

The steps are pretty simple:

  1. Clone the repository (this is actually handled by Drone itself) and populate submodules, a vehicle for my Hugo theme.
  2. Build the site with Hugo, which is as simple as running hugo. Over time, I'm going to add more flags to the invocation, things like --build{Drafts,Future,Expired}=false, --minify, and so on.
  3. Deploy the static files to the destination server. This did require pulling in a pre-made Drone plugin, but I did vet the source code to make sure it wasn't trying anything funny. This could be relatively easily reproduced on a raw Alpine image if desired.
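
The publish step could be reproduced without the plugin along these lines. This is a sketch under assumptions, not the plugin's actual implementation: `publish` and `DRY_RUN` are hypothetical names, the key path is whatever file the deploy key has been written to, and the image would need rsync and an ssh client installed.

```shell
#!/usr/bin/env bash
# Sketch only: a hypothetical stand-in for the drone-rsync plugin step.

publish() {
  local key="$1" src="${2:-./public/}"
  local dest="blog@blog.ndumas.com:/var/www/blog.ndumas.com"
  # -a: archive mode (implies recursive); --delete: remove remote files
  # that no longer exist in the source directory.
  set -- rsync -a --delete -e "ssh -i $key" "$src" "$dest"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"   # print the command instead of running it
  else
    "$@"
  fi
}

# Usage: DRY_RUN=1 publish /path/to/deploy_key
```

The dry-run switch makes it possible to check the assembled command in a pipeline log before letting --delete loose on the web root.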

Green checkmarks

At this point, I've got a fully automated publishing pipeline. As soon as a commit gets pushed to my blog repository, Drone jumps into action and runs a fresh Hugo build. The process is far from perfect, though.

You might've noticed a lack of screenshots or other media in my posts. At the moment, I'm authoring my blog posts in Obsidian, my preferred note-taking application, because it gives me quick access to...well, my notes. The catch is that Obsidian and Hugo use different conventions for linking between documents and referencing attachments/images.

In the long term, what I want to do is probably write a script and pipeline which can:

  1. Convert Obsidian-style links and frontmatter blocks to their Hugo equivalents, so I can more easily cross-link between posts while drafting.
  2. Find embedded media (images, etc.) and pull them into the blog repository, then commit and push to trigger the blog publish pipeline.
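
The link conversion in the first item could start as a few sed expressions. This is only a sketch: it assumes Obsidian's `[[target]]`, `[[target|label]]`, and `![[file]]` forms, assumes every target maps to a `target.md` page that Hugo's `ref` shortcode can resolve, and ignores frontmatter entirely. `convert_links` is a hypothetical name.

```shell
#!/usr/bin/env bash
# Sketch only: convert_links is a hypothetical filter (stdin to stdout).

convert_links() {
  # 1. Embeds:         ![[file]]          -> ![](file)
  # 2. Labelled links: [[target|label]]   -> [label]({{< ref "target.md" >}})
  # 3. Plain links:    [[target]]         -> [target]({{< ref "target.md" >}})
  sed -E \
    -e 's/!\[\[([^]|]+)\]\]/![](\1)/g' \
    -e 's/\[\[([^]|]+)\|([^]]+)\]\]/[\2]({{< ref "\1.md" >}})/g' \
    -e 's/\[\[([^]|]+)\]\]/[\1]({{< ref "\1.md" >}})/g'
}

# Usage: convert_links < draft.md > content/posts/draft.md
```

Embeds are handled first so the plain-link rule never sees them; sized embeds and heading anchors would need extra rules.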

Unsolved Mysteries

For some reason, audit2allow was emitting invalid output as the result of something in my audit log. I never tracked it down. Whatever was causing this wasn't related to my drone setup, since I got everything running without fixing it.

[root@drone x]# cat /var/log/audit/audit.log|audit2allow -a -M volumefix
compilation failed:
volumefix.te:24:ERROR 'syntax error' at token 'mlsconstrain' on line 24:
mlsconstrain sock_file { write setattr } ((h1 dom h2 -Fail-)  or (t1 != mcs_constrained_type -Fail-) ); Constraint DENIED
#       mlsconstrain sock_file { ioctl read getattr } ((h1 dom h2 -Fail-)  or (t1 != mcs_constrained_type -Fail-) ); Constraint DENIED
/usr/bin/checkmodule:  error(s) encountered while parsing configuration