author Wynn Wolf Arbor 2020-05-30 14:03:17 +0200
committer Wynn Wolf Arbor 2020-05-31 20:54:17 +0200
commit c6d0582dad4b47bb01605b10518c5d38d260636b
tree c81b3904e877c675d6de1f85f5f8db64f49bcfa3 /posts
parent a93e625ad207720c68b9e45e04c368ca8dfc1c36
posts: Add a new post: "The Long Journey to cgit"
Diffstat (limited to 'posts')
-rw-r--r-- posts/img/cgit-log.png bin 0 -> 129086 bytes
-rw-r--r-- posts/img/gitea-unnecessary.png bin 0 -> 32955 bytes
-rw-r--r-- posts/img/oh-no-github-search.png bin 0 -> 21434 bytes
-rw-r--r-- posts/the-long-journey-to-cgit.md 463
4 files changed, 463 insertions, 0 deletions
diff --git a/posts/img/cgit-log.png b/posts/img/cgit-log.png
new file mode 100644
index 0000000..3457a43
--- /dev/null
+++ b/posts/img/cgit-log.png
Binary files differ
diff --git a/posts/img/gitea-unnecessary.png b/posts/img/gitea-unnecessary.png
new file mode 100644
index 0000000..fd26cf9
--- /dev/null
+++ b/posts/img/gitea-unnecessary.png
Binary files differ
diff --git a/posts/img/oh-no-github-search.png b/posts/img/oh-no-github-search.png
new file mode 100644
index 0000000..9e244b0
--- /dev/null
+++ b/posts/img/oh-no-github-search.png
Binary files differ
diff --git a/posts/the-long-journey-to-cgit.md b/posts/the-long-journey-to-cgit.md
new file mode 100644
index 0000000..a567f16
--- /dev/null
+++ b/posts/the-long-journey-to-cgit.md
@@ -0,0 +1,463 @@
+title: The Long Journey to cgit
+date: 2020-05-31
+author: Wynn Wolf Arbor
+
+As more and more FOSS projects leave the confines of dusty arcane
+mailing lists or thrice-cursed Bugzilla instances for the seemingly
+green pastures of the likes of GitHub or GitLab, there has been an ever
+greater need as a proactive user to engage and deal with these
+platforms. And, to be perfectly frank, the general experience sucks!
+
+The search interfaces especially leave a lot to be desired.
+Usually I end up capitulating after a minute, clone the whole repository
+instead, and fire up [`rg(1)`](https://github.com/BurntSushi/ripgrep).
+Why struggle to perform a task in your browser if you have tools that
+have been perfected for it right in your terminal?
+
+<figure>
+ <img class="round" src="img/oh-no-github-search.png" alt="GitHub's search bar"/>
+ <figcaption>The most dreaded place in all of GitHub.</figcaption>
+</figure>
+
+Another aspect GitHub has been working on is its review interface. From
+my personal experience, patch reviews have been perfected on mailing
+lists like `git@vger.kernel.org`, where people discuss proposed patches
+by replying to them with in-line comments. After this review phase, an
+improved version of the patch is sent in, and another review phase
+begins, until there are no more points to discuss, and the patch is
+either accepted or deferred.
+
+It took GitHub a very long time to achieve relative feature-parity with
+patch reviews by mail. Now, the review interface exists and it works,
+but it is very crowded and needlessly convoluted. Finding old versions
+of a specific pull-request could be a lot nicer, as could be comparing
+subsequent versions to the original. Just about the best feature of
+having all this readily available in your browser is the ability to
+easily reference other bug reports, leading to improved inter-project
+communication. Of course a good mail archive frontend would solve this
+problem for mailing lists too.
+
+Regarding repository landing pages, I feel that a Git web interface
+should show the state of the Git repository, not introduce me to a
+project. Chances are I've discovered a project's repository through its
+website or project page, which hopefully already achieved the
+introductory part. Once I've found the repository, I'm ready to look at
+the code or browse a few commits; I don't want to waste time reading a
+(possibly slightly different) project outline.
+
+I want a system that augments my already existing CLI workflow in a
+practical way - something that enhances the experience in certain ways,
+and doesn't completely recontextualize it. Which brings me to the
+present state... and a question.
+
+## The status quo
+
+I have been hosting my Git repositories on my dedicated server for more
+than half a decade now, using a very simple (but effective) system:
+
+1) Initialize a bare Git repository in a well-known directory
+2) Tweak directory permissions (`0700` if it's private)
+3) Enable the default `post-update` hook that runs
+`git-update-server-info(1)`
+4) Push my work to the server via `ssh(1)`
+
+Together with any old web server that can serve static files and a few
+symbolic links in the right places, this setup enables most (if not all)
+of what I need from a Git hosting platform. I can push and pull my work
+using `ssh(1)`, and the few public repositories I have are accessible
+via pull-only HTTPS using Git's "dumb protocol".
+
+I can even collaborate directly with other users on my server by
+allowing specific people write access to the bare repositories. No `git`
+user or group needed - just good old POSIX Access Control Lists. And for
+people without a shell account on my server (the vast majority as it
+turns out) there still was the possibility of using
+[git-send-email(1)](https://git-scm.com/docs/git-send-email) to
+contribute, of course[^1].
+
+## No web interface?
+
+Over the last decade or so one thing has become very apparent:
+fully-featured and well-integrated collaborative platforms for
+development are here to stay, and users' expectations have risen with
+them. Given these expectations, I've been asked more than a couple of
+times recently why I do not have a Git web interface for my projects.
+The answer's always been the same, **"why browse a repository with
+anything other than CLI tooling?"**, but the question stuck with me...
+
+It was trivial to find out why I didn't set up a web interface
+initially, back when I set up the system I described above: I simply did
+not need one. This was a deliberate decision as I had recently migrated
+away from GitHub, which, back then, already contained most of the
+interactive web components it has now. My projects did not see a great
+deal of external activity, there were next to no bug reports, and most
+of the projects were too small to warrant integration into GitHub's
+whole feature set. They were frequently starred[^2], but it did not seem
+that activity would pick up any time soon.
+
+*A web interface therefore was not even part of the equation.*
+
+## Yes web interface!
+
+These days, the question is harder to answer as I admittedly very much
+recognize the ease of use and comfort of using a web interface to give a
+project only a cursory glance, link a patch on IRC, or do a quick and
+dirty blame on some broken code. As repository size decreases,
+explorability on decent web interfaces increases, and, for the smallest
+projects, a good web frontend can arguably fully replace `git-clone(1)`.
+
+So, do I need a web interface? *No, not really.* Do I want one? *I think
+so.* Given a decent enough candidate, I am confident that it will provide
+features other people find useful. Perhaps it will also boost visibility
+of what I am working on, and give me a nice stage on which to present
+projects still lacking their own post on this site. Plus, it is
+something to keep me occupied for a few days as I go about setting it
+up.
+
+About a month back I set out to compile a full list of requirements:
+
+- Self-hosted, without depending on popular containerized deployments
+- A minimal and well-designed user interface
+- No misguided social networking features (GitHub stars, I am looking at you)
+- Active development, good documentation
+- Easy integration with my preferred web server, [Caddy](https://caddyserver.com/)
+- Light on dependencies: no Ruby, no Perl, no PHP
+- Not strictly necessary, but a bonus: no JavaScript
+
+## Searching for candidates
+
+The search was on. I quickly found a very helpful
+[list](https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools#Web_Interfaces)
+and started digging through it...
+
+### klaus
+
+First up, [klaus](https://github.com/jonashaag/klaus). The
+[demo](http://klausdemo.lophus.org/klaus/) looks promising enough,
+except for the fact that one of the badges gets blocked by the
+`X-Frame-Options` policy, but that is not relevant to the project
+itself. It is built with Python, making hosting relatively easy with my
+server setup, and also does not contain any features beyond simple
+repository browsing.
+
+My issues come with the overall design. I just... don't like the way it
+looks. Lots of the core elements are very blocky, positioned too close
+together, and not contrasted enough. The diffs themselves look good,
+but the commit messages suffer from the same blocky and grey fate,
+drowning in the big wall of text below.
+
+The design of the summary page feels strange to me as well, with most of
+the page taken over by the rendered `README` file and no quick way to
+see the last few commits other than scrolling all the way down. All of
+this would need a great deal of work to fix, more than I am willing to
+commit to right now.
+
+### sorcia
+
+[sorcia](https://git.mysticmode.org/r/sorcia) is a relatively new
+frontend written in Go. It's still in alpha, and the "federation" part
+seems to be missing entirely, but it is very pleasing to look at and use
+already. The [homepage](https://sorcia.org/) has the following to say:
+
+> Initially I've started building this project for myself as I've grown
+> kinda frustrated with the noisy interface that I've been noticing on
+> Github or other similar softwares. I started my career as a UI
+> designer and then moved on to web programming, so I thought I could
+> build and design something that would only have the necessary features
+> which I need in order to host and develop my projects.
+
+Sounds like what I am looking for. Sadly, a cursory look at the
+installation guide reveals a bit of a snag. For now, sorcia seems to use
+its own SSH server, and cannot be integrated to use an already running
+one. Bummer.
+
+Additionally, without JavaScript,
+[some](https://git.mysticmode.org/r/sorcia/tree/master/config/app.ini.sample)
+[things](https://git.mysticmode.org/r/sorcia/commit/master/96d462d33ee80853bb29bfddf6c910675708fa52)
+are not as nice yet as they could be. Picking this as a long-term
+solution right now is definitely a no-go. Maybe very well worth a look
+in a year or so - if the project is still around by then. That said, I
+definitely appreciate the effort to create a nice user experience.
+
+### Gogs & Gitea
+
+Now on to the big ones, the behemoths of self-hosted Git services: Gogs,
+and its more or less recent fork Gitea. I'll focus on the latter only,
+seeing as the user experience is largely the same.
+
+Gitea fulfills most of the requirements, with one huge caveat: The
+interface and feature set is meant to match GitHub's or GitLab's. I
+will not only get a Git browser, but also an issue tracker, support for
+pull requests, a full-fledged wiki system, an activity tab, and the
+ability to watch repositories, star, and fork them.
+
+<figure>
+ <img src="img/gitea-unnecessary.png" alt="The Gitea
+ navigation."/>
+ <figcaption>The Gitea navigation with unnecessary functionality
+ highlighted in red.</figcaption>
+</figure>
+
+Gitea's not meant to have one instance per user, it's meant to have one
+instance per company or community. And that community better be big!
+This puts me in a weird position where I am forced to go all in, but get
+nothing in return: If I were to consider enabling the issue tracker
+properly, I would also have to open up registrations; that means
+enabling authentication, allowing outbound mail, and fighting spam. All
+for the end result of people not wanting to register with yet another
+random instance on the internet.
+
+On the other hand, if I disable[^3] all the features I do not need, the
+10% that remains of the whole of Gitea is not even that great. Barely
+worth putting in the effort of going through its one thousand line
+[sample configuration](https://github.com/go-gitea/gitea/blob/master/custom/conf/app.ini.sample)
+file.
+
+### cgit
+
+[cgit](https://git.zx2c4.com/cgit/about/) is probably one of the oldest
+web frontends for Git, next to [gitweb](https://git-scm.com/docs/gitweb)
+(which comes bundled with Git itself.) As such, it is one of the most
+mature interfaces out there. Written entirely in C, the list of
+dependencies is expectedly short: cgit pulls in a tagged release of Git
+in the build stage, but other than that only depends[^4] on zlib.
+
+cgit is a Git web browser **only**, but has a bit of a different
+approach compared to klaus and sorcia. While the latter two put more
+emphasis on project presentation and present a fully-rendered `README`
+file first and foremost, cgit focuses more on giving the user a
+summarized view of their repository - showing a selection of the latest
+active branches, tags, and commits.
+
+<figure>
+ <img src="img/cgit-log.png" alt="The commit log in
+ cgit."/>
+ <figcaption>The commit log on cgit's summary page.</figcaption>
+</figure>
+
+In that regard, cgit might scare off users who are unfamiliar with Git
+internals, but for me this is a welcome change. I can quickly scan
+through the summary view and find out what has been worked on recently,
+all without having to click through to another page. Of course cgit also
+supports markdown or manpage rendering, but it's only a secondary
+concern.
+
+All in all, I liked cgit the most and set out to deploy it. But things
+are never this easy...
+
+## Not-so Common Gateway Interface
+
+There is an item on my list of requirements that calls for easy
+integration with Caddy, a web server I've been using for a couple of
+years now. Nowadays, most web applications actually run an HTTP server
+themselves and can easily be set up with Caddy acting as a reverse
+proxy. cgit, however, as the subtle name might reveal, runs via
+[CGI](https://tools.ietf.org/html/rfc3875), an ancient interface that
+relies on the web server executing the application for every incoming
+request.
+
+This is a problem because Caddy does not natively support CGI. While
+there exists an abandoned
+[plugin](https://github.com/jung-kurt/caddy-cgi) for Caddy 1, the
+recently released Caddy 2.0 does not share the same plugin interface and
+lacks CGI support completely. I deem it unlikely that it will ever get
+it.
+
+Thankfully, both versions support FastCGI, an attempt to overcome the
+overhead of launching a process for every single request (most CGI
+applications are actually scripts run by interpreters, so this overhead
+can become quite noticeable.) But how does one plug a CGI application
+into a FastCGI interface?
+
+The answer is to use a FastCGI wrapper, a long-running process that acts
+as a bridge between the web server and the CGI application. Problem is
+that there are not many good ones around that are still maintained. All
+I could find after a few hours of search was a cursed Perl script, and
+an implementation in C that was more than 8 years old. **Yikes.**
+
+## slowcgi(8) to the rescue
+
+A look on the OpenBSD side of things, though, revealed something very
+promising: [`slowcgi(8)`](https://man.openbsd.org/slowcgi.8). The
+OpenBSD project is never really vocal about any of its programs, so this
+slipped through the cracks easily - even though it's been a part of the
+operating system since version 5.4, released seven years ago.
+
+slowcgi clocks in at around 1300 lines of C, is actively
+maintained, simple, secure by default[^5], and seemed very easy to port
+to Linux. In fact, it's been ported
+[already](https://github.com/adaugherity/slowcgi-portable) by someone
+else, but that version hasn't seen any updates in about a year now - so
+I decided to roll my own thing (and will keep maintaining it for the
+time being). Instead of pulling in
+[libbsd](https://libbsd.freedesktop.org/wiki/), I decided to just copy
+in the required parts of the OpenBSD source. In the future I might
+migrate over to using Kristaps' excellent
+[oconfigure](https://github.com/kristapsdz/oconfigure).
+
+Once ported, slowcgi works right out of the box. By default it uses
+a chroot to keep CGI applications jailed under a specific root
+directory, and runs as a separate `www` user. A chroot for cgit is
+technically not needed, but comes highly recommended. Even if it means
+having to set up the entire chroot tree with cgit's runtime
+dependencies, defense in depth is important and makes the setup more
+secure and safe in the long run.
+
+## Putting everything together
+
+With slowcgi up and running, there's only the matter of putting
+together all the pieces: Caddy needs to be set up to talk to the
+slowcgi instance controlling cgit, and the chroot needs to be set
+up properly. At this stage, considerations for backwards compatibility
+come into play also. Particularly, I did not want to have to move the
+already existing repositories to a new location. Everything that worked
+before should continue to work like before.
+
+Since cgit will exist in a chroot, I cannot use symbolic links to any
+paths outside of it. Putting cgit into the same directory tree as the
+repositories seemed suboptimal also, as I want to keep the web interface
+and the actual Git data fully independent of each other for more
+flexibility in the future.
+
+### The general framework
+
+I found my solution in a feature called "bind mounts". People who have
+needed to `chroot(1)` into a Linux installation from a Live CD might be
+familiar with this - to give the installation access to devices mapped
+by the host running from the CD, the whole `/dev` directory is bound
+onto something like `/mnt/dev`. Thus, the same files are accessible
+through multiple distinct mount points, with one of those contained
+within the chroot.
+
+This is the same concept I used in
+[`skein(7)`](https://git.oriole.systems/skein/about/skein.7), a small
+framework facilitating a flexible and modular approach to multi-user
+cgit hosting. Given a simple directory structure, a helper script sets
+up all necessary devices[^6] and bind mounts for every user wanting to
+give the cgit CGI application access to their repositories. If there is
+a need to have multiple cgit hosts set up, this framework also allows
+fine-grained control over which repositories show up on which host. This
+way, users on my server who would like to opt in to having a cgit
+frontend can freely determine how it is set up.
+
+The following is part of the `skein(7)` framework, showing my home
+directory in the cgit chroot. Symbolic links in each instance's `repos/`
+directory point to the actual Git repositories under the `repos.avail`
+bind mount.
+
+```
+wolf
+├── instances
+│   └── git.oriole.systems
+│       ├── config
+│       ├── repos
+│       │   ├── slowcgi.git@ -> ../../../repos.avail/slowcgi.git
+│       │   └── [...]
+│       └── site
+│           ├── cgit.css
+│           ├── custom.css
+│           ├── favicon.svg
+│           ├── logo.svg
+│           └── robots.txt
+└── repos.avail
+```
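+
+As a rough illustration of the underlying mechanism (not of what
+`skein(7)` does, since it relies on a helper script instead), a single
+bind mount of this kind could be declared statically in `/etc/fstab`.
+The source directory `/home/wolf/git` is a made-up placeholder, and the
+mount point assumes a chroot rooted at `/srv/cgit`, as the paths in the
+Caddy configuration further down suggest:
+
+```
+# <source>       <mount point>                    <type>  <options>  <dump>  <pass>
+/home/wolf/git   /srv/cgit/home/wolf/repos.avail  none    bind       0       0
+```
+
+The same directory is then reachable under both paths, and only the
+second one is visible from inside the chroot.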
+
+As for cgit's runtime dependencies... I've decided to keep these to an
+absolute minimum and use only statically compiled binaries to reduce the
+amount of work needed to maintain the chroot setup. Whilst this means
+that cgit will **not** have access to a shell (or, for that matter, a
+Python interpreter), a very decent chunk of its filter capabilities can
+still be used in conjunction with the wonderful
+[lowdown](https://kristaps.bsd.lv/lowdown/) and
+[mandoc](https://mandoc.bsd.lv/) projects and a tiny custom C
+[program](https://git.oriole.systems/skein/tree/cgit-about-filter.c)
+invoking them.
+
+Finally, the directives needed for Caddy are as simple as:
+
+```
+git.oriole.systems {
+    import shared
+    root /srv/cgit/home/wolf/instances/git.oriole.systems/site/
+
+    fastcgi / /run/slowcgi.cgit.sock {
+        env SCRIPT_FILENAME /bin/cgit
+        env CGIT_CONFIG /home/wolf/instances/git.oriole.systems/config
+
+        except /cgit.css /custom.css /logo.svg /favicon.svg /robots.txt
+    }
+}
+```
+
+### Configuring cgit
+
+All that remains now is configuring cgit itself to work with this
+framework. There's not much to be done in that regard, it simply needs
+to be pointed to my custom
+[`cgit-about-filter`](https://git.oriole.systems/skein/tree/cgit-about-filter.c)
+program, and the repository location within the chroot:
+
+```
+about-filter=/bin/cgit-about-filter
+scan-path=/home/wolf/instances/git.oriole.systems/repos
+```
+
+For projects with a `README` file formatted in markdown, `lowdown(1)`
+will take care of HTML conversion. Manuals are formatted by `mandoc(1)`.
+Given the lack of a Python interpreter there is no syntax highlighting,
+but I find excessive syntax highlighting unappealing anyway. A couple
+more changes to cgit's default `cgit.css`... and we're done!
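+
+For reference, a slightly fuller configuration might look like the
+following sketch. The `readme` and `remove-suffix` lines are assumptions
+added for illustration and are not taken from the actual setup; both
+are standard `cgitrc(5)` options. Since `cgitrc(5)` asks for
+`remove-suffix` to be defined before `scan-path`, the sketch keeps all
+global settings above it:
+
+```
+# Hypothetical additions: look for a README on the default branch
+readme=:README.md
+readme=:README
+# Hypothetical addition: list "slowcgi" instead of "slowcgi.git"
+remove-suffix=1
+
+about-filter=/bin/cgit-about-filter
+scan-path=/home/wolf/instances/git.oriole.systems/repos
+```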
+
+## git.oriole.systems
+
+Finally, after about a week's worth of research, experimentation, and
+setup work, my new Git web interface is online under
+[git.oriole.systems](https://git.oriole.systems)! A great deal of
+care has gone into setting it up *just right*, and I dearly hope this
+will be useful for people, enjoyable to use, and interesting to just
+browse around in.
+
+### Future work
+
+A few things remain to be done in the next few weeks and months. For
+one, I'll have to look at changing my `Caddyfile` to be compatible with
+the recent release of Caddy 2, before I fully switch over to it. That
+means having to relearn most of what Caddy does, and may be a bit
+time-consuming. Of course I'll also have to remain backwards compatible
+with all the things I have already set up.
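+
+As a starting point, the Caddy 1 block shown earlier might translate to
+Caddy 2 roughly as follows. This is an untested sketch based on Caddy
+2's documented `reverse_proxy` directive and its `fastcgi` transport;
+the `@static` matcher name is arbitrary, and `import shared` is carried
+over on the assumption that the snippet itself gets ported as well:
+
+```
+git.oriole.systems {
+    import shared
+    root * /srv/cgit/home/wolf/instances/git.oriole.systems/site/
+
+    # Static assets that Caddy 1 excluded via "except"
+    @static path /cgit.css /custom.css /logo.svg /favicon.svg /robots.txt
+    handle @static {
+        file_server
+    }
+
+    # Everything else goes to slowcgi over its Unix socket
+    handle {
+        reverse_proxy unix//run/slowcgi.cgit.sock {
+            transport fastcgi {
+                env SCRIPT_FILENAME /bin/cgit
+                env CGIT_CONFIG /home/wolf/instances/git.oriole.systems/config
+            }
+        }
+    }
+}
+```
+
+Whether cgit receives `PATH_INFO` exactly as it did under Caddy 1 would
+need to be verified before switching over.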
+
+There are a few things to be done on the Git web interface front, too:
+Right now I still serve Git repositories using the ["Dumb
+HTTP"](https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols#_dumb_http)
+protocol. Integration with Git's own
+[`git-http-backend(1)`](https://git-scm.com/docs/git-http-backend) would
+be a good addition. Furthermore, I have been planning for a very long
+time to set up a [public inbox](https://public-inbox.org/) for issue
+tracking and bug reports. Once that is set up, I'll have to link to the
+proper addresses from within each Git repository.
+
+All this will be part of a future post. For now, enjoy.
+
+[^1]: The chances of that happening are tremendously low, but
+ [hope](https://git-send-email.io/) dies last, as they say.
+
+[^2]: To my eternal chagrin, the project with the highest amount of
+ stars was a hastily thrown-together shell script that fired up a
+ now-playing notification for mpd...
+
+[^3]: Disabling these is not even supported yet, but is
+ [introduced](https://github.com/go-gitea/gitea/pull/8788) with the
+ upcoming `1.12.0`.
+
+[^4]: The README [mentions](https://git.zx2c4.com/cgit/tree/README#n52)
+ libcrypto and libssl, but as far as I can tell these are not needed.
+ Git switched to [its own](https://github.com/git/git/commit/e6b07da2780f349c29809bd75d3eca6ad3c35d19)
+ SHA1 implementation a few years back, and libssl is only needed for
+ `git-imap-send(1)`.
+
+[^5]: Sadly there's no good way to do
+ [`pledge(2)`](https://man.openbsd.org/pledge.2) nicely on Linux, so
+ those parts are ignored. And no, I'm not going to pivot to
+ `seccomp(2)`.
+
+[^6]: For now that is just /dev/null.