author | Wynn Wolf Arbor | 2020-05-30 14:03:17 +0200 |
---|---|---|
committer | Wynn Wolf Arbor | 2020-05-31 20:54:17 +0200 |
commit | c6d0582dad4b47bb01605b10518c5d38d260636b (patch) | |
tree | c81b3904e877c675d6de1f85f5f8db64f49bcfa3 | |
parent | a93e625ad207720c68b9e45e04c368ca8dfc1c36 (diff) | |
download | site-c6d0582dad4b47bb01605b10518c5d38d260636b.tar.gz |
posts: Add a new post: "The Long Journey to cgit"
-rw-r--r-- | posts/img/cgit-log.png | bin | 0 -> 129086 bytes | |||
-rw-r--r-- | posts/img/gitea-unnecessary.png | bin | 0 -> 32955 bytes | |||
-rw-r--r-- | posts/img/oh-no-github-search.png | bin | 0 -> 21434 bytes | |||
-rw-r--r-- | posts/the-long-journey-to-cgit.md | 463 |
4 files changed, 463 insertions, 0 deletions
diff --git a/posts/img/cgit-log.png b/posts/img/cgit-log.png
Binary files differ
new file mode 100644
index 0000000..3457a43
--- /dev/null
+++ b/posts/img/cgit-log.png
diff --git a/posts/img/gitea-unnecessary.png b/posts/img/gitea-unnecessary.png
Binary files differ
new file mode 100644
index 0000000..fd26cf9
--- /dev/null
+++ b/posts/img/gitea-unnecessary.png
diff --git a/posts/img/oh-no-github-search.png b/posts/img/oh-no-github-search.png
Binary files differ
new file mode 100644
index 0000000..9e244b0
--- /dev/null
+++ b/posts/img/oh-no-github-search.png
diff --git a/posts/the-long-journey-to-cgit.md b/posts/the-long-journey-to-cgit.md
new file mode 100644
index 0000000..a567f16
--- /dev/null
+++ b/posts/the-long-journey-to-cgit.md
@@ -0,0 +1,463 @@
+title: The Long Journey to cgit
+date: 2020-05-31
+author: Wynn Wolf Arbor
+
+As more and more FOSS projects leave the confines of dusty arcane
+mailing lists or thrice-cursed Bugzilla instances for the seemingly
+green pastures of the likes of GitHub or GitLab, there has been an ever
+greater need as a proactive user to engage and deal with these
+platforms. And, to be perfectly frank, the general experience sucks!
+
+The search interfaces especially leave a lot of things to be desired.
+Usually I end up capitulating after a minute, clone the whole repository
+instead, and fire up [`rg(1)`](https://github.com/BurntSushi/ripgrep).
+Why struggle to perform a task in your browser if you have tools that
+have been perfected for it right in your terminal?
+
+<figure>
+    <img class="round" src="img/oh-no-github-search.png" alt="GitHub's search bar"/>
+    <figcaption>The most dreaded place in all of GitHub.</figcaption>
+</figure>
+
+Another aspect GitHub has been working on is its review interface. From
+my personal experience, patch reviews have been perfected on mailing
+lists like `git@vger.kernel.org`, where people discuss proposed patches
+by replying to them with in-line comments.
+After this review phase, an
+improved version of the patch is sent in, and another review phase
+begins, until there are no more points to discuss, and the patch is
+either accepted or deferred.
+
+It took GitHub a very long time to achieve relative feature-parity with
+patch reviews by mail. Now, the review interface exists and it works,
+but it is very crowded and needlessly convoluted. Finding old versions
+of a specific pull-request could be a lot nicer, as could be comparing
+subsequent versions to the original. Just about the best feature of
+having all this readily available in your browser is the ability to
+easily reference other bug reports, leading to improved inter-project
+communication. Of course a good mail archive frontend would solve this
+problem for mailing lists too.
+
+Regarding repository landing pages, I feel that a Git web interface
+should show the state of the Git repository, not introduce me to a
+project. Chances are I've discovered a project's repository through its
+website or project page, which hopefully already achieved the
+introductory part. Once I've found the repository, I'm ready to look at
+the code or browse a few commits; I don't want to waste time reading a
+(possibly slightly different) project outline.
+
+I want a system that augments my already existing CLI workflow in a
+practical way - something that enhances the experience in certain ways,
+and doesn't completely recontextualize it. Which brings me to the
+present state... and a question.
+
+## The status quo
+
+I have been hosting my Git repositories on my dedicated server for more
+than half a decade now, using a very simple (but effective) system:
+
+1) Initialize a bare Git repository in a well-known directory
+2) Tweak directory permissions (`0700` if it's private)
+3) Enable the default `post-update` hook that runs
+`git-update-server-info(1)`
+4) Push my work to the server via `ssh(1)`
+
+Together with any old web server that can serve static files and a few
+symbolic links in the right places, this setup enables most (if not all)
+of what I need from a Git hosting platform. I can push and pull my work
+using `ssh(1)`, and the few public repositories I have are accessible
+via pull-only HTTPS using Git's "dumb protocol".
+
+I can even collaborate directly with other users on my server by
+allowing specific people write access to the bare repositories. No `git`
+user or group needed - just good old POSIX Access Control Lists. And for
+people without a shell account on my server (the vast majority as it
+turns out) there still was the possibility of using
+[git-send-email(1)](https://git-scm.com/docs/git-send-email) to
+contribute, of course[^1].
+
+## No web interface?
+
+Over the last decade or so one thing has become very apparent:
+fully-featured and well-integrated collaborative platforms for
+development are here to stay, and users' expectations have risen with
+them. Given these expectations, I've been asked more than a couple of
+times recently why I do not have a Git web interface for my projects.
+The answer's always been the same, **"why browse a repository with
+anything other than CLI tooling?"**, but the question stuck with me...
+
+It was trivial to find out why I didn't set up a web interface
+initially, back when I set up the system I described above: I simply did
+not need one.
+This was a deliberate decision as I had recently migrated
+away from GitHub, which, back then, already contained most of the
+interactive web components it has now. My projects did not see a great
+deal of external activity, there were next to no bug reports, and most
+of the projects were too small to warrant integration into GitHub's
+whole feature set. They were frequently starred[^2], but it did not seem
+that activity would pick up any time soon.
+
+*A web interface therefore was not even part of the equation.*
+
+## Yes web interface!
+
+These days, the question is harder to answer as I admittedly very much
+recognize the ease of use and comfort of using a web interface to give a
+project only a cursory glance, link a patch on IRC, or do a quick and
+dirty blame on some broken code. As repository size decreases,
+explorability on decent web interfaces increases, and, for the smallest
+projects, a good web frontend can arguably fully replace `git-clone(1)`.
+
+So, do I need a web interface? *No, not really.* Do I want one? *I think
+so.* Given a decent enough candidate, I am confident that it will provide
+features other people find useful. Perhaps it will also boost visibility
+of what I am working on, and give me a nice stage on which to present
+projects still lacking their own post on this site. Plus, it is
+something to keep me occupied for a few days as I go about setting it
+up.
+
+About a month back I set out to compile a full list of requirements:
+
+- Self-hosted, without depending on popular containerized deployments
+- A minimal and well-designed user interface
+- No misguided social networking features (GitHub stars, I am looking at you)
+- Active development, good documentation
+- Easy integration with my preferred web server, [Caddy](https://caddyserver.com/)
+- Light on dependencies: no Ruby, no Perl, no PHP
+- Not strictly necessary, but a bonus: no JavaScript
+
+## Searching for candidates
+
+The search was on.
+I quickly found a very helpful
+[list](https://git.wiki.kernel.org/index.php/Interfaces,_frontends,_and_tools#Web_Interfaces)
+and started digging through it...
+
+### klaus
+
+First up, [klaus](https://github.com/jonashaag/klaus). The
+[demo](http://klausdemo.lophus.org/klaus/) looks promising enough,
+except for the fact that one of the badges gets blocked by the
+`X-Frame-Options` policy, but that is not relevant to the project
+itself. It is built with Python, making hosting relatively easy with my
+server setup, and also does not contain any features beyond simple
+repository browsing.
+
+My issues come with the overall design. I just... don't like the way it
+looks. Lots of the core elements are very blocky, positioned too close
+together, and not contrasted enough. The diffs themselves look good,
+but the commit messages suffer from the same blocky and grey fate,
+drowning in the big wall of text below.
+
+The design of the summary page feels strange to me as well, with most of
+the page taken over by the rendered `README` file and no quick way to
+see the last few commits other than scrolling all the way down. All of
+this would need a great deal of work to fix, more than I am willing to
+commit to right now.
+
+### sorcia
+
+[sorcia](https://git.mysticmode.org/r/sorcia) is a relatively new
+frontend written in Go. It's still in alpha, and the "federation" part
+seems to be missing entirely, but is very pleasing to look at and use
+already. The [homepage](https://sorcia.org/) has the following to say:
+
+> Initially I've started building this project for myself as I've grown
+> kinda frustrated with the noisy interface that I've been noticing on
+> Github or other similar softwares. I started my career as a UI
+> designer and then moved on to web programming, so I thought I could
+> build and design something that would only have the necessary features
+> which I need in order to host and develop my projects.
+
+Sounds like what I am looking for.
+Sadly, a cursory look at the
+installation guide reveals a bit of a snag. For now, sorcia seems to use
+its own SSH server, and cannot be integrated to use an already running
+one. Bummer.
+
+Additionally, without JavaScript,
+[some](https://git.mysticmode.org/r/sorcia/tree/master/config/app.ini.sample)
+[things](https://git.mysticmode.org/r/sorcia/commit/master/96d462d33ee80853bb29bfddf6c910675708fa52)
+are not as nice yet as they could be. Picking this as a long-term
+solution right now is definitely a no-go. Maybe very well worth a look
+in a year or so - if the project is still around by then. That said, I
+definitely appreciate the effort to create a nice user experience.
+
+### Gogs & Gitea
+
+Now on to the big ones, the behemoths of self-hosted Git services: Gogs,
+and its more or less recent fork Gitea. I'll focus on the latter only,
+seeing as the user experience is largely the same.
+
+Gitea fulfills most of the requirements, with one huge caveat: The
+interface and feature set are meant to match GitHub's or GitLab's. I
+will not only get a Git browser, but also an issue tracker, support for
+pull requests, a full-fledged wiki system, an activity tab, and the
+ability to watch repositories, star, and fork them.
+
+<figure>
+    <img src="img/gitea-unnecessary.png" alt="The Gitea
+    navigation."/>
+    <figcaption>The Gitea navigation with unnecessary functionality
+    highlighted in red.</figcaption>
+</figure>
+
+Gitea's not meant to have one instance per user, it's meant to have one
+instance per company or community. And that community better be big!
+This puts me in a weird position where I am forced to go all in, but get
+nothing in return: If I were to consider enabling the issue tracker
+properly, I would also have to open up registrations; that means
+enabling authentication, allowing outbound mail, and fighting spam. All
+for the end result of people not wanting to register with yet another
+random instance on the internet.
+
+On the other hand, if I disable[^3] all the features I do not need, the
+10% that remains of the whole of Gitea is not even that great. Barely
+worth putting in the effort of going through its one-thousand-line
+[sample configuration](https://github.com/go-gitea/gitea/blob/master/custom/conf/app.ini.sample)
+file.
+
+### cgit
+
+[cgit](https://git.zx2c4.com/cgit/about/) is probably one of the oldest
+web frontends for Git, next to [gitweb](https://git-scm.com/docs/gitweb)
+(which comes bundled with Git itself). As such, it is one of the most
+mature interfaces out there. Written entirely in C, the list of
+dependencies is expectedly short: cgit pulls in a tagged release of Git
+in the build stage, but other than that only depends[^4] on zlib.
+
+cgit is a Git web browser **only**, but has a bit of a different
+approach compared to klaus and sorcia. While the latter two put more
+emphasis on project presentation and present a fully-rendered `README`
+file first and foremost, cgit focuses more on giving the user a
+summarized view of their repository - showing a selection of the latest
+active branches, tags, and commits.
+
+<figure>
+    <img src="img/cgit-log.png" alt="The commit log in
+    cgit."/>
+    <figcaption>The commit log on cgit's summary page.</figcaption>
+</figure>
+
+In that regard, cgit might scare off users who are unfamiliar with Git
+internals, but for me this is a welcome change. I can quickly scan
+through the summary view and find out what has been worked on recently,
+all without having to click through to another page. Of course cgit also
+supports markdown or manpage rendering, but it's only a secondary
+concern.
+
+All in all, I liked cgit the most and set out to deploy it. But things
+are never this easy...
+
+## Not-so Common Gateway Interface
+
+There is an item on my list of requirements that calls for easy
+integration with Caddy, a web server I've been using for a couple of
+years now.
+Nowadays, most web applications actually run an HTTP server
+themselves and can easily be set up with Caddy acting as a reverse
+proxy. cgit, however, as the subtle name might reveal, runs via
+[CGI](https://tools.ietf.org/html/rfc3875), an ancient interface that
+relies on the web server executing the application for every incoming
+request.
+
+This is a problem because Caddy does not natively support CGI. While
+there exists an abandoned
+[plugin](https://github.com/jung-kurt/caddy-cgi) for Caddy 1, the
+recently released Caddy 2.0 does not share the same plugin interface and
+lacks CGI support completely. I deem it unlikely that it will ever get
+it.
+
+Thankfully, both versions support FastCGI, an attempt to overcome the
+overhead of launching a process for every single request (most CGI
+applications are actually scripts run by interpreters, so this overhead
+can become quite noticeable.) But how does one plug a CGI application
+into a FastCGI interface?
+
+The answer is to use a FastCGI wrapper, a long-running process that acts
+as a bridge between the web server and the CGI application. Problem is
+that there are not many good ones around that are still maintained. All
+I could find after a few hours of searching was a cursed Perl script, and
+an implementation in C that was more than 8 years old. **Yikes.**
+
+## slowcgi(8) to the rescue
+
+A look on the OpenBSD side of things, though, revealed something very
+promising: [`slowcgi(8)`](https://man.openbsd.org/slowcgi.8). The
+OpenBSD project is never really vocal about any of its programs, so this
+slipped through the cracks easily - even though it's been a part of the
+operating system since version 5.4, released seven years ago.
+
+slowcgi clocks in at around 1300 lines of C, is actively
+maintained, simple, secure by default[^5], and seemed very easy to port
+to Linux.
+In fact, it's been ported
+[already](https://github.com/adaugherity/slowcgi-portable) by someone
+else, but that version hasn't seen any updates in about a year now - so
+I decided to roll my own thing (and will keep maintaining it for the
+time being). Instead of pulling in
+[libbsd](https://libbsd.freedesktop.org/wiki/), I decided to just copy
+in the required parts of the OpenBSD source. In the future I might
+migrate over to using Kristaps' excellent
+[oconfigure](https://github.com/kristapsdz/oconfigure).
+
+Once ported, slowcgi works right out of the box. By default it uses
+a chroot to keep CGI applications jailed under a specific root
+directory, and runs as a separate `www` user. A chroot for cgit is
+technically not needed, but comes highly recommended. Even if it means
+having to set up the entire chroot tree with cgit's runtime
+dependencies, defense in depth is important and makes the setup more
+secure and safe in the long run.
+
+## Putting everything together
+
+With slowcgi up and running, there's only the matter of putting
+together all the pieces: Caddy needs to be set up to talk to the
+slowcgi instance controlling cgit, and the chroot needs to be set
+up properly. At this stage, considerations for backwards compatibility
+come into play also. Particularly, I did not want to have to move the
+already existing repositories to a new location. Everything that worked
+before should continue to work like before.
+
+Since cgit will exist in a chroot, I cannot use symbolic links to any
+paths outside of it. Putting cgit into the same directory tree as the
+repositories seemed suboptimal also, as I want to keep the web interface
+and the actual Git data fully independent of each other for more
+flexibility in the future.
+
+### The general framework
+
+I found my solution in a feature called "bind mounts".
+People who have
+needed to `chroot(1)` into a Linux installation from a Live CD might be
+familiar with this - to give the installation access to devices mapped
+by the host running from the CD, the whole `/dev` directory is bound
+onto something like `/mnt/dev`. Thus, the same files are accessible
+through multiple distinct mount points, with one of those contained
+within the chroot.
+
+This is the same concept I used in
+[`skein(7)`](https://git.oriole.systems/skein/about/skein.7), a small
+framework facilitating a flexible and modular approach to multi-user
+cgit hosting. Given a simple directory structure, a helper script sets
+up all necessary devices[^6] and bind mounts for every user wanting to
+give the cgit CGI application access to their repositories. If there is
+a need to have multiple cgit hosts set up, this framework also allows
+fine-grained control over which repositories show up on which host. This
+way, users on my server who would like to opt in to having a cgit
+frontend can freely determine how it is set up.
+
+The following is part of the `skein(7)` framework, showing my home
+directory in the cgit chroot. Symbolic links in each instance's `repos/`
+directory point to the actual Git repositories under the `repos.avail`
+bind mount.
+
+```
+wolf
+├── instances
+│   └── git.oriole.systems
+│       ├── config
+│       ├── repos
+│       │   ├── slowcgi.git@ -> ../../../repos.avail/slowcgi.git
+│       │   └── [...]
+│       └── site
+│           ├── cgit.css
+│           ├── custom.css
+│           ├── favicon.svg
+│           ├── logo.svg
+│           └── robots.txt
+└── repos.avail
+```
+
+As for cgit's runtime dependencies... I've decided to keep these to an
+absolute minimum and use only statically compiled binaries to reduce the
+amount of work needed to maintain the chroot setup.
+Whilst this means
+that cgit will **not** have access to a shell (or, for that matter, a
+Python interpreter), a very decent chunk of its filter capabilities can
+still be used in conjunction with the wonderful
+[lowdown](https://kristaps.bsd.lv/lowdown/) and
+[mandoc](https://mandoc.bsd.lv/) projects and a tiny custom C
+[program](https://git.oriole.systems/skein/tree/cgit-about-filter.c)
+invoking them.
+
+Finally, the directives needed for Caddy are as simple as:
+
+```
+git.oriole.systems {
+    import shared
+    root /srv/cgit/home/wolf/instances/git.oriole.systems/site/
+
+    fastcgi / /run/slowcgi.cgit.sock {
+        env SCRIPT_FILENAME /bin/cgit
+        env CGIT_CONFIG /home/wolf/instances/git.oriole.systems/config
+
+        except /cgit.css /custom.css /logo.svg /favicon.svg /robots.txt
+    }
+}
+```
+
+### Configuring cgit
+
+All that remains now is configuring cgit itself to work with this
+framework. There's not much to be done in that regard, it simply needs
+to be pointed to my custom
+[`cgit-about-filter`](https://git.oriole.systems/skein/tree/cgit-about-filter.c)
+program, and the repository location within the chroot:
+
+```
+about-filter=/bin/cgit-about-filter
+scan-path=/home/wolf/instances/git.oriole.systems/repos
+```
+
+For projects with a `README` file formatted in markdown, `lowdown(1)`
+will take care of HTML conversion. Manuals are formatted by `mandoc(1)`.
+Given the lack of a Python interpreter there is no syntax highlighting,
+but I find excessive syntax highlighting unappealing anyway. A couple
+more changes to cgit's default `cgit.css`... and we're done!
+
+## git.oriole.systems
+
+Finally, after about a week's worth of research, experimentation, and
+setup work, my new Git web interface is online under
+[git.oriole.systems](https://git.oriole.systems)! A great deal of
+care has gone into setting it up *just right*, and I dearly hope this
+will be useful for people, enjoyable to use, and interesting to just
+browse around in.
+
+### Future work
+
+A few things remain to be done in the next few weeks and months. For
+one, I'll have to look at changing my `Caddyfile` to be compatible with
+the recent release of Caddy 2, before I fully switch over to it. That
+means having to relearn most of what Caddy does, and may be a bit
+time-consuming. Of course I'll also have to remain backwards compatible
+with all the things I have already set up.
+
+There's a few things to be done on the Git web interface front, too:
+Right now I still serve Git repositories using the ["Dumb
+HTTP"](https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols#_dumb_http)
+protocol. Integration with Git's own
+[`git-http-backend(1)`](https://git-scm.com/docs/git-http-backend) would
+be a good addition. Furthermore, I have been planning for a very long
+time to set up a [public inbox](https://public-inbox.org/) for issue
+tracking and bug reports. Once that is set up, I'll have to link to the
+proper addresses from within each Git repository.
+
+All this will be part of a future post. For now, enjoy.
+
+[^1]: The chances of that happening are tremendously low, but
+    [hope](https://git-send-email.io/) dies last, as they say.
+
+[^2]: To my eternal chagrin, the project with the highest amount of
+    stars was a hastily thrown-together shell script that fired up a
+    now-playing notification for mpd...
+
+[^3]: Not even supported yet, but is
+    [introduced](https://github.com/go-gitea/gitea/pull/8788) with the
+    upcoming `1.12.0`.
+
+[^4]: The README [mentions](https://git.zx2c4.com/cgit/tree/README#n52)
+    libcrypto and libssl, but as far as I can tell these are not needed.
+    Git switched to [its own](https://github.com/git/git/commit/e6b07da2780f349c29809bd75d3eca6ad3c35d19)
+    SHA1 implementation a few years back, and libssl is only needed for
+    `git-imap-send(1)`.
+
+[^5]: Sadly there's no good way to do
+    [`pledge(2)`](https://man.openbsd.org/pledge.2) nicely on Linux, so
+    those parts are ignored.
+    And no, I'm not going to pivot to
+    `seccomp(2)`.
+
+[^6]: For now that is just /dev/null.