---
draft : false
title : "Gardening with Quartz"
aliases : ["blogging-with-quartz"]
date : "2023-03-04"
series: ["blogging-with-quartz"]
series_order: 2
author : "Nick Dumas"
authorTwitter : ""
cover : ""
tags : ["drone", "hugo", "devops", "obsidian", "quartz"]
keywords : ["drone", "hugo", "devops", "obsidian", "quartz"]
description : "When you want a container built right, you have to do it yourself."
showFullContent : false
---
## Authoring blog posts in Obsidian
I'm using Gitea, Drone, and Hugo to watch for commits to my Obsidian vault, extract blog posts, and publish them to one of my servers. I run my stuff on Digital Ocean droplets, and I use Caddy for serving static sites.
## Why does it work?
It's cheap, fast, and simple. Self-hosting means I have more control over what gets published. This could all be accomplished with GitHub Actions, but I'd have to maintain separate vaults/repositories for public and private content, or make all my notes public.
## Why doesn't it work?
My original selection of pipeline images and commands was inefficient, incurring unnecessary network traffic and relying on third party package mirrors that suddenly started performing very badly.
Another important detail is media: the directory structure for my Obsidian vault and my site are very different.
I want to write blog posts with screenshots, media files, and more. Obsidian lets you drag and drop attachments, or link them manually with links in the form `![[path/to/attachment.png]]`.
Finally, Hugo is a great static site generator, but there are better options when you're looking to publish content authored in Obsidian. In particular, the graph view is something that I'd love to bring into my blog. Luckily, [Quartz](https://github.com/jackyzha0/quartz) is built directly on top of Hugo and comes with a theme and some helper utilities.
## What are the options?
### The Requirements
- [ ] attachment links must be transformed from `![[attachments/whatever.png]]` to `![[notes/post-name/whatever.png]]`
- [ ] the site must be built with Quartz instead of Hugo
### Transforming links
The first choice is whether I "fix" this during authoring or during the publishing step. For the former, my options look something like this:
1) manually typing the final URL into the note
2) creating a complicated template system for generating Hugo shortcodes. In my head, this would use a prompter to let me select which attachment I want to insert, ask for resizing parameters, etc., and then generate a Hugo shortcode or an `<img>` tag.
Neither of these is satisfactory to me. I'd love to just drag and drop a piece of media into my note inside Obsidian and simply not have to think about it any further.
This leaves implementing something during the publishing pipeline. Now that I've got my [drone pipeline](notes/drone-and-hugo/) working, it's the perfect place to do transformations. This path presents a variety of possibilities falling on a spectrum somewhere between a bash script invoking `sed` and a custom (Go) program that parses frontmatter and Markdown and applies pre-configured transformations.
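To make that spectrum concrete, the custom-program end could start as small as a single regular-expression rewrite. This is a hypothetical sketch, not what I shipped: `rewriteLinks` and the sample strings are invented, and it ignores frontmatter entirely.

```go
package main

import (
	"fmt"
	"regexp"
)

// rewriteLinks rewrites Obsidian-style embeds like ![[attachments/whatever.png]]
// so they point into the post's own directory, e.g. ![[notes/post-name/whatever.png]].
func rewriteLinks(line, noteName string) string {
	pat := regexp.MustCompile(`!\[\[attachments/([^\]]+)\]\]`)
	// $1 re-inserts the captured attachment filename
	return pat.ReplaceAllString(line, fmt.Sprintf("![[notes/%s/$1]]", noteName))
}

func main() {
	fmt.Println(rewriteLinks("a screenshot: ![[attachments/whatever.png]]", "post-name"))
	// → a screenshot: ![[notes/post-name/whatever.png]]
}
```

Everything past that, handling resizing, captions, or shortcode generation, is where the "custom program" end of the spectrum starts earning its complexity.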
### Quartz
The Quartz repo has a few built-in options for turning your notes into a website: a Dockerfile, a Makefile, and instructions on how to build everything from scratch. All of these are great, and I played with them all at different times to figure out which was a good fit.
## Pipelines: More than meets the eye
Unsurprisingly, I opted to extend my existing Drone pipeline with a transformer. This part of the pipeline has been in the back of my mind since the beginning, more or less, but it was much more important to get things stable first.
The pipeline I'm finally satisfied with looks like this, with checked boxes indicating what I had implemented at the start of this phase of the project.
- [x] Create a temporary shared directory, `/tmp/blog`
- [x] Clone the vault repository
- [x] do a `submodule` update and use `git-lfs` to pull down attachments
- [ ] clone my forked Quartz repository into `/tmp/blog`
- [x] Copy posts from `$VAULT/Resources/blog/post-name.md` to `/tmp/blog/content/notes/post-name/index.md`
- [ ] Scan all `index.md` files in `/tmp/blog/content/` for links that look like `![[attachments/whatever.png]]`, find `whatever.png` and copy it into the `/tmp/blog/content/notes/post-name/` directory for that `index.md`.
- [ ] Scan all `index.md` files in `/tmp/blog/content/` for links that look like `![[attachments/whatever.png]]` and edit them to `![[notes/post-name/whatever.png]]`
- [ ] Run the Quartz build command
- [x] Copy the static site to destination web server
## Hours and hours of debugging pipelines later
### Drone Volumes
The linchpin of this whole operation is having a temporary workspace that all these tools can operate on in sequence. To that end, I used Drone's [Temporary Volumes](https://docs.drone.io/pipeline/docker/syntax/volumes/temporary/) to mount `/tmp/blog` in all the relevant pipeline steps.
Creating a temporary volume looks like this. I really couldn't tell you what `temp: {}` is about; it certainly looks strange, but I never had the spare cycles to investigate.
```yaml {title=".drone.yml"}
volumes:
- name: blog
  temp: {}
```
Once you've created the volume, a pipeline step can mount it at a desired path. See below for an example of using your created volume.
### Quartz
Forking Quartz was easy; I'd done so late last year during another attempt to get this blog off the ground.
After a merge to get my fork up to date with upstream, I was able to slot this into the pipeline with the following.
```yaml {title=".drone.yml"}
- name: clone-quartz
  image: alpine/git
  volumes:
  - name: blog
    path: /tmp/blog
  commands:
  - git clone -b hugo https://github.com/therealfakemoot/quartz.git /tmp/blog
```
This sets the stage for building the site, and for a step I implemented previously:
![[Resources/attachments/copy-posts-checkbox-screenshot.png]]
I opted to stop committing content to a blog repository and instead clone the static site skeleton into the pipeline for a few reasons:
1) I already have reproducibility by virtue of building things with docker and having sources of truth in git.
2) It was an unnecessary layer of complexity
3) It was an unnecessary inversion of control flow
Configuring Quartz had its rocky moments. I've had to wrestle with frontmatter a lot; confusing the TOML and YAML syntaxes can break your build, or break certain features like the local graph.
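For illustration only (not my real frontmatter), here is the same field in both syntaxes. Hugo infers the format from the delimiters, so mixing `---` with `=`, or `+++` with `:`, is an easy way to end up with a broken build:

```
---
title: "Gardening with Quartz"    # YAML: --- delimiters, key: value
---

+++
title = "Gardening with Quartz"   # TOML: +++ delimiters, key = value
+++
```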
### Gathering Media
This step ended up being pretty fun to work on. I took the opportunity to write this in Go because I knew I could make it fast and correct.
The process is simple:
1) Walk a target directory and find an `index.md` file
2) When you find an `index.md` file, scan it for links of the form `[[attachments/whatever.png]]`
3) Find `whatever.png` in the vault's attachments directory and copy it adjacent to its respective `index.md` file.
`walkFunc` is what handles step 1. You call `err := filepath.Walk(target, walkFunc(attachments))` and it will call your `walkFunc` for every filesystem object the OS returns.
This piece of code checks if we've found a blog post and then chucks it to `scanReader`.
```go {title="main.go"}
func walkFunc(matchChan matches) filepath.WalkFunc {
	return func(path string, info fs.FileInfo, err error) error {
		if err != nil {
			return nil
		}
		if info.IsDir() {
			return nil
		}
		if !strings.HasSuffix(path, "index.md") {
			return nil
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		defer f.Close()
		scanReader(f, path, matchChan)
		return nil
	}
}
```
`scanReader` iterates line-by-line and uses a regular expression to grab the necessary details from matching links.
```go {title="main.go"}
type Attachment struct {
	Filename string
	Note     string
}

type matches chan Attachment

func scanReader(r io.Reader, path string, matchChan matches) {
	log.Printf("scanning markdown file: %s", path)
	pat := regexp.MustCompile(`\[\[(Resources\/attachments\/.*?)\]\]`)
	s := bufio.NewScanner(r)
	for s.Scan() {
		tok := s.Text()
		matches := pat.FindAllStringSubmatch(tok, -1)
		if len(matches) > 0 {
			log.Printf("media found in %s: %#+v\n", path, matches)
			for _, match := range matches {
				dirs := strings.Split(path, "/")
				noteFilename := dirs[len(dirs)-2]
				log.Println("noteFilename:", noteFilename)
				matchChan <- Attachment{Filename: match[1], Note: noteFilename}
			}
		}
	}
}
```
Finally, `moveAttachment` receives a struct containing context (the location of the `index.md` file and the name of the attachment to copy) and performs a copy.
```go {title="main.go"}
func moveAttachment(att Attachment, dest string) error {
	destPath := filepath.Join(dest, strings.Split(att.Note, ".")[0])
	log.Println("moving files into:", destPath)
	_, err := copy(att.Filename, filepath.Join(destPath, filepath.Base(att.Filename)))
	return err
}

func copy(src, dst string) (int64, error) {
	sourceFileStat, err := os.Stat(src)
	if err != nil {
		return 0, err
	}
	if !sourceFileStat.Mode().IsRegular() {
		return 0, fmt.Errorf("%s is not a regular file", src)
	}
	source, err := os.Open(src)
	if err != nil {
		return 0, err
	}
	defer source.Close()
	destination, err := os.Create(dst)
	if err != nil {
		return 0, err
	}
	defer destination.Close()
	nBytes, err := io.Copy(destination, source)
	return nBytes, err
}
```
This ended up being the most straightforward part of the process by far. I packed this into a `Dockerfile`, using build stages to improve caching.
```docker {title="Dockerfile"}
FROM golang:latest as BUILD
WORKDIR /gather-media
COPY go.mod ./
# COPY go.sum ./
RUN go mod download
COPY *.go ./
RUN go build -o /bin/gather-media
```
Integration into the pipeline is here:
```yaml {title=".drone.yml"}
- name: gather-media
  image: code.ndumas.com/ndumas/gather-media:latest
  volumes:
  - name: blog
    path: /tmp/blog
  commands:
  - gather-media -target /tmp/blog/content/notes
```
Full code can be found [here](https://code.ndumas.com/ndumas/gather-media/src/branch/main/main.go).
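The snippets above never show how `matchChan` gets drained. As a minimal, self-contained sketch (with an invented filename), the producer/consumer shape looks roughly like this:

```go
package main

import "fmt"

type Attachment struct {
	Filename string
	Note     string
}

func main() {
	matchChan := make(chan Attachment)

	// producer: in the real program, filepath.Walk + scanReader send these
	go func() {
		defer close(matchChan) // closing lets the range loop below terminate
		matchChan <- Attachment{Filename: "Resources/attachments/whatever.png", Note: "post-name"}
	}()

	// consumer: in the real program, each Attachment goes to moveAttachment
	for att := range matchChan {
		fmt.Printf("would copy %s into notes/%s/\n", att.Filename, att.Note)
	}
}
```

Keeping the copies in a single consumer loop means the directory walk never blocks on disk writes for longer than one channel send.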
### Transforming Links
Link transformation ended up being pretty trivial, but it took far longer than any of the other steps because of an embarrassing typo in a `find` invocation. Another Docker image, another appearance of the blog volume.
The typo in my `find` was using `contents/` instead of `content/`. My code worked perfectly, but the pipeline wasn't finding any files to run it against.
```yaml {title=".drone.yml"}
- name: sanitize-links
  image: code.ndumas.com/ndumas/sanitize-links:latest
  volumes:
  - name: blog
    path: /tmp/blog
  commands:
  - find /tmp/blog/content/ -type f -name 'index.md' -exec sanitize-links {} \;
```
`sanitize-links` is a bog-standard `sed` invocation. My original implementation tried to loop inside the bash script, but I realized I could refactor this into effectively a `map()` call and simplify things a whole bunch.
The pipeline calls `find`, which produces a list of filenames. Each filename is individually fed as an argument to `sanitize-links`. Clean and simple.
```bash {title="sanitize-links"}
#!/bin/sh
echo "scanning $1 for attachments"
noteName=$(echo "$1" | awk -F'/' '{print $(NF-1)}')
sed -i "s#Resources/attachments#notes/$noteName#w /tmp/changes.txt" "$1"
cat /tmp/changes.txt
```
## Lots of Moving Pieces
If you're reading this post and seeing images embedded, everything is working. I'm pretty happy with how it all came out. Each piece is small and maintainable. Part of me worries that there are too many pieces, though. `gather-media` is written in Go; I could extend it to handle some or all of the other steps.
{{< figure
src="drone-builds-screenshot.png"
alt="Screenshot showing a series of green and red bars indicating a set of mostly successful builds"
caption="This is mostly just a flex"
>}}
## For the future
Things I'd like to keep working on:
- [ ] include shortcodes for images, code snippets, and the like
- [ ] customize the CSS a little bit
- [ ] customize the layout slightly
## Unsolved Mysteries
- What does `temp: {}` do? Why is it necessary?