draft = false
title = "Gitea, git-lfs, and syncing Obsidian Vaults"
date = "2023-01-31"
author = "Nick Dumas"
authorTwitter = ""
cover = ""
tags = ["obsidian", "git", "gitea"]
keywords = ["obsidian", "git", "gitea"]
description = "A brief overview of how I stood up a gitea instance for the purpose of backing up and syncing my Obsidian vault."
showFullContent = false

What am I Doing?

I take notes on a broad spectrum of topics ranging from tabletop roleplaying games to recipes to the last wishes of my loved ones. Because of how valuable these notes are, I need to accomplish two things:

  1. Back up my notes so that no single catastrophe can wipe them out
  2. Make my notes accessible on multiple devices like my phone and various work laptops

For writing and organizing my notes, I use an application called Obsidian, an Electron Markdown reader and editor with an emphasis on plaintext, local-only files to represent your notes. This has a lot of interesting implications which are well beyond the scope of this post, but this is the one that's germane: your notes are a textbook use-case for version control.

Markdown files are plain-text, human-readable content that every modern Version Control System is supremely optimized for handling. In this arena, there's a lot of options ( mercurial, bzr, git, svn, fossil, and more ) but I'm partial to git.

Life with git

nick@DESKTOP-D6H8V4O MINGW64 ~/Desktop/general-notes (main)
$ git log $(!!)
git log $(git rev-list --max-parents=0 HEAD)
commit 18de1f967d7d9c667ec42f0cb41ede868d6bdd31
Author: unknown <>
Date:   Tue May 31 09:44:49 2022 -0400

    adding gitignore

I've kept my vault under git for all but the first 2 months of my vault's lifetime and I cannot count the number of times it's saved me from a mistake or a bug.

A few times a day, I'll commit changes to my notes, plugins, or snippets and push them up. This is a manual process, but by reviewing all my changes as they're committed I kill a few birds with one stone:

  1. I get a crude form of spaced repetition by forcing myself to review notes as they change
  2. I verify that templates and other code/plugins are working correctly and if they aren't, I can revert to a known-good copy trivially
  3. Reorganizations become much easier ( see point 2, reverting to known-good copies )
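
A minimal sketch of that review-then-commit routine, assuming it is run from inside the vault's working tree. The `sync_vault` name and the commit message format are made up for illustration; nothing about git or Obsidian requires them.

```shell
#!/bin/sh
# Hypothetical helper: show pending changes for review, then commit and push.
# Assumes the current directory is inside the vault's git working tree.
sync_vault() {
    # Nothing to do if the working tree is clean.
    if [ -z "$(git status --porcelain)" ]; then
        echo "vault is clean"
        return 0
    fi

    # Stage everything and print the full staged diff: this is the review step.
    git add --all
    git diff --cached

    # Commit with a timestamped message (an arbitrary convention).
    git commit --quiet -m "vault sync $(date +%Y-%m-%dT%H:%M)"

    # Push only when a remote named "origin" is configured.
    if git remote get-url origin >/dev/null 2>&1; then
        git push origin HEAD
    fi
}
```

Swapping the `-m` message for a plain `git commit` would open an editor instead, which leaves room to abort if the diff shows something unexpected.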

For convenience, I started off with Github as my provider. I set up a private repository because my notes contain sensitive information of various flavors. Apart from attachments, I had no problems with it: Github is a fast, reliable provider and meets both of the requirements I laid out above.

The catch

There is no free lunch. On Github, free repositories have restrictions:

  1. Github will warn you if you push files larger than 50MB and ask you to consider removing them or using git-lfs
  2. Github will reject any push that contains a file larger than 100MB
  3. You're allowed a limited number of private repositories, depending on the type and tier of your account.

My vault does not exclusively consist of plaintext files, though; there's PDFs, PNGs, PSDs, and more hanging out, taking up space and refusing to diff efficiently. I've got a lot of PDFs of TTRPG content, screenshots of important parts of software I care about for work or my personal life, and a lot of backup copies of configuration files.

In theory, this is sustainable. None of my attachments currently exceed 100MB, and the median size is well under 1MB.

$ pwd
~/evac/obsidian-vaults/bonk/Resources/attachments
$ ls -lah|awk '{print $5}'|sort -hr|head -n5
62M
36M
8.4M
3.1M
2.9M
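
The listing above only covers one directory. A rough way to check a whole vault against the 50MB and 100MB thresholds is sketched below; the `files_over_mb` name is hypothetical, and the `M` size suffix assumes GNU or BSD `find`.

```shell
#!/bin/sh
# Hypothetical helper: list files above a size threshold, skipping .git internals.
# Usage: files_over_mb <directory> <megabytes>
files_over_mb() {
    # With GNU find, "+50M" matches files strictly larger than 50 MiB.
    find "$1" -type f ! -path '*/.git/*' -size +"$2"M
}
```

Running `files_over_mb . 50` from the vault root lists anything that would trigger the warning, and `files_over_mb . 100` lists anything that would be rejected outright.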

I'm not satisfied with theoretical sustainability, though. For something this important and sensitive, I'd like to have total confidence that my system will work as expected for the foreseeable future.

What are the options?

  1. Use Github's own LFS service, whose free tier is capped at 2gb of storage
  2. Pay for a higher tier of Github's LFS
  3. Use a managed Gitlab (or similar) instance
  4. Host my own

Options 1 and 2 are the lowest-effort solutions and rely the most on third parties. I've opted not to go with either because Github may change its private repository or git-lfs policies at any time.

Option 3 is better; a managed git hosting service splits the difference nicely. Using Gitlab would give me built-in CI/CD.

I've opted out of this mostly for price and partly because I know for a fact that I can implement option 4.

Option 4

I chose to use what I'm already familiar with: Gitea. Gitea is a fork of Gogs, a self-hosted git service written in Go. It's lightweight, and its simplest deployment runs off an SQLite database, so I don't even need a PostgreSQL service running.

I've been using Gogs and Gitea for years and they've been extremely reliable and performant. Gitea also integrates tightly with Drone, a CI/CD system which will help me automate my blog, publish my notes, and more that I haven't had the energy to plan.

docker-compose and gitea

For my first implementation, I'm going to host gitea using docker-compose. This will give me a simple, reproducible setup that I can move between providers if necessary.

Hosting will be done on my DigitalOcean droplet running a comically old version of Fedora for now. This droplet is really old and up until now I've had very poor reproducibility on my setups. I'm working on fixing that with caddy, and using gitea for code management is next.

Below you'll see the docker-compose.yaml for my gitea instance. This is ripped directly from the gitea documentation so there's very little to comment on. The host-side values in the ports field (3069 and 222) are arbitrary and need to be adjusted based on your hosting situation.

version: "3"

networks:
  gitea:
    external: false

services:
  server:
    image: gitea/gitea:1.18.0
    container_name: gitea
    environment:
      - USER_UID=1000
      - USER_GID=1000
    restart: always
    networks:
      - gitea
    volumes:
      - ./gitea:/data
      - /etc/timezone:/etc/timezone:ro
      - /etc/localtime:/etc/localtime:ro
    ports:
      # host:container; 3000 is the web UI and 22 is git over SSH inside the container
      - "3069:3000"
      - "222:22"

Starting it up is similarly uninteresting. I use detached mode for "production" work because I'm not super interested in watching all the logs. If something breaks, I can start it back up again without detaching and see whatever error output is getting kicked up.

$ docker-compose up -d
Starting gitea ... done
$
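
Restarting in the foreground is not the only way to see what a detached container is doing. As a small sketch, assuming the compose file above (where the service is named `server`) and commands run from the directory that holds it, these two wrappers check on the instance without stopping it. The function names are hypothetical.

```shell
#!/bin/sh
# Run these from the directory containing docker-compose.yaml.

# Show whether the "server" service from the compose file is up.
gitea_status() {
    docker-compose ps
}

# Print the last 50 log lines from the detached container without restarting it.
gitea_logs() {
    docker-compose logs --tail=50 server
}
```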

Once this is done, you've got a gitea instance waiting to be configured with an admin user and a few other bootstrap settings. Navigate to the URL you chose for your gitea instance, follow the docs through the initial setup, and you're ready to create a repository for your vault.

The web UI will guide you from there.
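
Once the repository exists, the vault needs a remote that points at it. The sketch below builds an SSH remote URL from the compose file's `222:22` mapping. The host name, user, and repository name are placeholders, not values from my setup, and the `git` SSH user is what the gitea Docker image uses by default.

```shell
#!/bin/sh
# Hypothetical values: replace with your own host, gitea user, and repository.
GITEA_HOST="git.example.com"
GITEA_USER="nick"
GITEA_REPO="vault"
# 222 is the host side of the "222:22" mapping in the compose file.
GITEA_SSH_PORT=222

# Build an ssh:// URL; the explicit form is needed because the port is not 22.
vault_remote_url() {
    echo "ssh://git@${GITEA_HOST}:${GITEA_SSH_PORT}/${GITEA_USER}/${GITEA_REPO}.git"
}

# Point an existing vault at the new instance (run inside the vault):
#   git remote add gitea "$(vault_remote_url)"
#   git push --set-upstream gitea main
```

Adding gitea as a second remote rather than replacing `origin` keeps the old provider around as a fallback until the new instance has proven itself.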

Success Story???

This solution is only a week or two old so it has not been put under a lot of load yet, but gitea has a good reputation and supports a lot of very high profile projects, and DigitalOcean has been an extremely reliable provider for years.

Migrating my attachments into git-lfs was trivial, but it did rewrite every commit, which is something to be mindful of if you're collaborating between people or devices.
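
For reference, a migration along these lines can be done with `git lfs migrate import`. This is a sketch rather than a record of the exact command I ran: it assumes git-lfs is installed, and the pattern list is an illustrative guess based on the attachment types mentioned earlier.

```shell
#!/bin/sh
# Assumed pattern list; adjust it to match the attachments in your own vault.
LFS_PATTERNS="*.pdf,*.png,*.psd"

# Hypothetical wrapper: rewrite history so matching files become LFS pointers.
# Run it from inside the vault, on a clone you can afford to rewrite.
migrate_attachments_to_lfs() {
    git lfs migrate import --everything --include="$LFS_PATTERNS"
}
```

Because every rewritten commit gets a new hash, any other clone of the vault has to be re-cloned or reset to the rewritten history afterwards.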

I don't intend to get more aggressive with adding large media attachments to my vault; I prefer plaintext when it's an option. Backing up my notes was only one item on a list of reasons I stood gitea up. In the coming weeks I'm going to work on using Drone to automate blog posts and use that as a springboard into more automation.