I'm not saying this project isn't cool, but whenever you have ANY software that's designed to be hosted A-style, and you host it B-style, the obvious question is "Why not host it the A way?"
I've always wanted to write something like this. The problem with GitLab/Gitea etc. is their reliance on disk storage, which means self-hosting them requires that I get the backup story just right. Whereas with this, I could just handle it as part of the database backup process.
Having no web UI, not even a rudimentary one, is kinda a bummer though.
I've struggled with this decision myself, but I came to the opposite conclusion from you:
- Gitea's (I use Forgejo) reliance on disk storage is perfect for me because files are well understood as a concept by most people.
Every battle-hardened Linux tool knows how to back up files. Plain old `rsync` can back up and restore files. I have heard of people putting their `.git` on something like Dropbox (I've never tried it myself).
You can run checksums on files and ensure they are exactly how you expect them to be.
There are multiple well-tested, well-understood options to reliably back up, snapshot, and restore files.
Also, remote/cloud storage for files is really cheap. In most cases, if it's less than 10GB, you likely don't have to pay anything at all, as in $0 every month for having a backup on servers that won't go up in flames even if your laptop or house did.
- OTOH, PostgreSQL backup and restore feel less familiar and less accessible to the general population than file backup and restore.
In fact, for non-DBA folks who don't necessarily understand PostgreSQL WAL, backup snapshotting, what asynchronous and synchronous WAL replication mean, and how they affect RTO and RPO, there are multiple non-obvious ways to get things wrong and lose your data. That's something you wouldn't have to worry about when using file backup and restore.
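To make the checksum point concrete, here is a minimal Python sketch of verifying a file-level backup against its source. The function names (`file_sha256`, `build_manifest`, `find_mismatches`) are made up for illustration and don't come from any tool mentioned in this thread; it only assumes both copies are plain directories you can read.

```python
import hashlib
from pathlib import Path


def file_sha256(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large pack files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    return {
        path.relative_to(root).as_posix(): file_sha256(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def find_mismatches(source: dict[str, str], backup: dict[str, str]) -> list[str]:
    """Return paths that are missing from the backup or differ in content."""
    return sorted(p for p, d in source.items() if backup.get(p) != d)
```

You would call `build_manifest` on the live repository directory and on the restored copy, then pass both manifests to `find_mismatches`; an empty list means every source file is present in the backup with identical content.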
> Whereas with this, I could just handle it as part of the database backup process
What's the database backup and restore process you follow right now and what are the tools you use?
So, it's a git server with an interesting storage layer? Don't get me wrong, that part sounds like it might have been a ton of work to implement, but I think the web UI (pull requests, etc.) is a lot of what GitHub has won on historically.
Basically, I don't feel qualified to judge the product itself, but I think positioning it against GitHub, while popular given the recent hard times, isn't quite correct.
This is really cool. PG compresses TOASTed values (pglz by default, with LZ4 available since PostgreSQL 14), so this should still be okay even if you are not storing pack files. I am curious about your choice to hand-roll the pkt-line, upload-pack, and receive-pack implementations, including rev-walking. Any particular reason you did not want to use libgit2 or something like the gitoxide implementation of pkt-line? Was it performance, or did you want it to be in pure Rust? Did you try running this on a slightly heavier repository with a lot of commits, refs, and objects?
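For anyone wondering what hand-rolling pkt-line involves, here is a rough Python sketch of the framing as described in Git's protocol documentation: four hex digits giving the total length (prefix included), then the payload, with `0000` as the flush packet. The helper names are my own and this is not the project's code; it also deliberately rejects the protocol v2 special packets (`0001`, `0002`) rather than handling them.

```python
FLUSH_PKT = b"0000"
MAX_PAYLOAD = 65516  # 65520-byte maximum packet minus the 4-byte length prefix


def encode_pkt_line(payload: bytes) -> bytes:
    """Prefix the payload with its total length as four lowercase hex digits."""
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload too large for a single pkt-line")
    return f"{len(payload) + 4:04x}".encode("ascii") + payload


def decode_pkt_lines(data: bytes) -> list:
    """Split a byte stream into payloads; None marks a flush packet."""
    packets = []
    pos = 0
    while pos < len(data):
        header = data[pos:pos + 4]
        if len(header) < 4:
            raise ValueError("truncated length prefix")
        length = int(header.decode("ascii"), 16)
        if length == 0:
            packets.append(None)  # flush packet carries no payload
            pos += 4
            continue
        if length < 4:
            # 0001/0002 are protocol v2 special packets, out of scope here.
            raise ValueError(f"unsupported special packet {header!r}")
        end = pos + length
        if end > len(data):
            raise ValueError("truncated payload")
        packets.append(data[pos + 4:end])
        pos = end
    return packets
```

The framing itself is small; most of the work in a real server is in the upload-pack and receive-pack negotiation built on top of it.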
Interesting approach using Postgres as the storage layer. Curious how you're handling the object model since Git's content-addressable storage maps pretty differently to relational tables. Are you storing blobs as bytea or going with something like a JSONB tree structure for the commit graph?
Git internally uses a pretty loose system for connecting different model concepts, but that has always seemed more like a concession to the storage medium than a deliberate design choice. If git existed on an already ACID-compliant system instead of trying to build one out of the filesystem itself, I don't see a reason to keep all the references as loose as they are. If you can cascade changes with confidence, you can likely just switch to using standard surrogate keys for linkages and allow the data to normalize more fully.
The core model objects in git are all pretty straightforward and their interactions well defined.
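As a rough illustration of how small the core object model is, here is a Python sketch that computes a git object id (SHA-1 over a `type size\0` header plus the content) and stores objects in a single table keyed by that id. Everything about the storage side is an assumption for illustration: it uses an in-memory SQLite database as a stand-in for Postgres, and the `objects` table and its columns are hypothetical, not the project's actual schema.

```python
import hashlib
import sqlite3


def git_object_id(obj_type: str, content: bytes) -> str:
    """Compute the SHA-1 object id the way git does for loose objects."""
    header = f"{obj_type} {len(content)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + content).hexdigest()


def open_store() -> sqlite3.Connection:
    """Create an in-memory store; SQLite stands in for Postgres here."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE objects ("
        " oid TEXT PRIMARY KEY,"   # hypothetical column names
        " type TEXT NOT NULL,"
        " content BLOB NOT NULL)"
    )
    return conn


def put_object(conn: sqlite3.Connection, obj_type: str, content: bytes) -> str:
    """Insert an object; identical content maps to the same row."""
    oid = git_object_id(obj_type, content)
    conn.execute(
        "INSERT OR IGNORE INTO objects (oid, type, content) VALUES (?, ?, ?)",
        (oid, obj_type, content),
    )
    return oid


def get_object(conn: sqlite3.Connection, oid: str):
    """Return (type, content) for an object id, or None if it is absent."""
    return conn.execute(
        "SELECT type, content FROM objects WHERE oid = ?", (oid,)
    ).fetchone()
```

Because the primary key is derived from the content, writing the same blob twice is a no-op, which is the content-addressable property carried over to a relational table. A more normalized design with surrogate keys, as suggested above, would add separate tables for commit parents and tree entries.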
"Doesn't" doesn't mean "can't". Someone just needs to do the work (with no thanks or pay expected).