(This is a stripped-down version of the newsletter sent out to members of Codeberg e.V. The version for members contains more details about e.V.-internal affairs and the annual assembly. If you are a member, please check that part in your email. Not a member of the non-profit association yet? Consider joining it!)

Dear Codeberg e.V. members and supporters!

It's time to share some more updates about what has happened in recent months in, on and around the Codeberg mountain. We are especially proud of the work we have done on the infrastructure side, and will share some of it with you.

The most important and actionable announcement first: If you set up a custom domain for Codeberg Pages before April 19 this year, please check that your A/AAAA records (if you use such records) are up to date. The relevant details are laid out further down in this article.

Highlights

  • Codeberg Pages has received new IP addresses.
  • More news about our plans for storage quota.
  • More steps towards a clustered infrastructure.

Codeberg Pages: New IP addresses and more

In short, what you need to know:

  • Websites using a *.codeberg.page domain are not affected. Their IP addresses have been updated automatically.
  • Websites that use a CNAME record pointing to codeberg.page are also not affected.
  • Websites that were set up using A/AAAA records are affected. If you have not updated them since April 19, please ensure that they are up to date.

On Unix-like hosts, you can usually run host yoursite.example.com (or dig; on Microsoft Windows, nslookup) to see your domain's current records.

Your domain should point to the IP addresses listed below:

A    (IPv4): 217.197.84.141
AAAA (IPv6): 2a0a:4580:103f:c0de::2
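
If you prefer to check this from a script, the comparison could look roughly like the Python sketch below. It is a minimal illustration, not an official Codeberg tool: the function names and the yoursite.example.com placeholder are assumptions, and only the two addresses are taken from the list above. The result reflects whatever your local resolver returns, so cached records may lag behind a recent change.

```python
import ipaddress
import socket

# Addresses announced above for Codeberg Pages.
EXPECTED = {
    ipaddress.ip_address("217.197.84.141"),
    ipaddress.ip_address("2a0a:4580:103f:c0de::2"),
}


def stale_addresses(resolved, expected=EXPECTED):
    """Return the resolved addresses that are not in the expected set."""
    parsed = {ipaddress.ip_address(addr) for addr in resolved}
    return parsed - set(expected)


def resolve(hostname):
    """Collect the A/AAAA addresses the local resolver returns for hostname."""
    infos = socket.getaddrinfo(hostname, 443, proto=socket.IPPROTO_TCP)
    return {info[4][0] for info in infos}


def check(hostname):
    """Print whether hostname resolves only to the expected addresses."""
    stale = stale_addresses(resolve(hostname))
    if stale:
        print(f"{hostname}: outdated records:", ", ".join(sorted(map(str, stale))))
    else:
        print(f"{hostname}: all resolved addresses match")
```

Calling check("yoursite.example.com") with your own domain prints either a confirmation or the addresses that still need updating. Addresses are compared as parsed values, so different spellings of the same IPv6 address are treated as equal.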

The longer story: In December 2021, we launched our new Pages Service featuring custom domains. Many users have made use of that feature since then. We only had a single physical server and used the IP addresses assigned to it. The same IPv4 and IPv6 addresses were shared by Codeberg.org and all other services, so we did some tricks in the reverse proxy to decide where to send the encrypted traffic based on SNI headers. Unfortunately, the setup had three problems.

First, the shared IP address was causing some issues from time to time. We discovered that when the SNI header is delayed for whatever reason (e.g. connection issues for the user), inbound traffic got routed to the wrong service. Instead of a connection failure, users would see a wrong certificate, triggering confusing security warnings in their browser.

The second issue is that the IP address was tied to our physical server. Whenever that server goes down, Codeberg Pages is offline, too. Going forward, we want to allow for maintenance on one machine without taking down your Pages (actually migrating Codeberg Pages across servers will require a little more effort, but we now have good preparations for this on the network level).

Last but not least: The IPv6 address of the machine was assigned to us in error, and we have no choice but to return it sooner or later. There is no definitive date and we will keep it working for a while, but it is better to move as soon as possible.

We have recently deployed virtual (floating) IP addresses (read more in the infrastructure update below) to address these issues. One of them is dedicated to Codeberg Pages, so no SNI routing hacks are required any more.
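
The first problem can be illustrated with a small Python model of name-based routing on a shared IP address. This is a hypothetical sketch, not the actual reverse proxy configuration: the backend names, the routing table, the fallback rule and the placeholder IP addresses are all assumptions chosen to show the failure mode.

```python
from typing import Optional

# Hypothetical routing table: server name from the TLS handshake -> backend.
ROUTES = {
    "codeberg.org": "forgejo",
    "codeberg.page": "pages",
}
DEFAULT_BACKEND = "forgejo"  # assumption: fallback when no name is available


def pick_backend(sni: Optional[str]) -> str:
    """Choose a backend for a connection arriving on a shared IP address."""
    if sni is None:
        # The name has not arrived (yet), so the proxy can only guess.
        return DEFAULT_BACKEND
    for name, backend in ROUTES.items():
        if sni == name or sni.endswith("." + name):
            return backend
    # Unknown names are assumed to be custom domains served by Pages.
    return "pages"


# Placeholder documentation addresses, one per service (not real ones).
BACKEND_BY_IP = {"198.51.100.1": "forgejo", "198.51.100.2": "pages"}


def pick_backend_dedicated(local_ip: str) -> str:
    """With one IP address per service, no name inspection is needed."""
    return BACKEND_BY_IP[local_ip]
```

In this model, pick_backend(None) returns "forgejo" even if the visitor wanted a Pages site, which corresponds to the wrong-certificate warning described above. With a dedicated address, the destination IP alone identifies the service.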

Storage quota

We have recently announced and enabled storage quota for users on Codeberg. However, the limit is still set to "unlimited". What's going on?

The urgency of limiting storage on Codeberg is growing, as we have seen more storage abuse recently. Because we don't want to surprise our users, we decided to announce the changes as early as possible.

However, after early feedback, we have learned that many accounts and organizations use much more storage than they (and we) expected, especially because the container and package registries use a lot of storage.

Our goal is to reduce friction for legitimate (including small!) free/libre software projects, so we have decided to proceed carefully and postpone some changes we had in mind. We apologize for the confusion this might have caused.

We have settled on some "funny rule names": For code, we are using snow-related names (from small to large: snowflake, snowball, snowwoman, snowpile and snowscape), and for all packed binary types (packages, attachments, releases, LFS etc.) we are using ice-related names (from small to large: griesel, graupel, hail, icicle, icefloe, iceberg). You can find all the definitions in the script creating them. Unfortunately, the purpose of the names is not yet reflected in the UI very well, and people started to wonder what they are supposed to mean.

For the next steps, we will need to improve our review and detection workflows so that we can efficiently process your requests and detect and investigate whether users and organizations are reaching their limits. This will give us a better idea of the impact of our changes.

Status of our infrastructure

When it comes to our infrastructure, we are somewhat opinionated. Given our desire to remain independent from Big Tech, we pour a lot of effort into deploying services which are fully under our control, even if that requires more administration and maintenance effort than big "cloud" solutions.

We mean "under our control" quite literally, as Codeberg's services are hosted on our own servers and (encrypted) disks. This means we can actually hold the data in our hands. Nowadays, this independence is not a given, as large parts of the Internet rely on the "cloud" offers of Big Tech or use resellers.

While using one of those hosting strategies might have been easier maintenance-wise, that would directly contradict our goals of offering alternatives to the dominant tech models. Even worse, your donations and membership fees to Codeberg would directly fund some of the very Big Tech players that we want to offer an alternative to.

Of course, our independence comes with responsibilities, and we can't do everything alone. Thankfully, there are many others that help us along the way!

Individual Network Berlin e.V. provides us with the necessary rack space and network infrastructure. The setup of our infrastructure is also supported by some of our members, who are helping us with internal network routing, the firewall and our Ceph setup. And of course, our stack is based on free & open source tools created by countless developers from around the world: Our servers run Debian GNU/Linux, and more specialized software like Ceph for our storage cluster is similarly maintained by large developer teams.

Recently, we had to deal with repeated hardware issues in our primary server (Kampenwand). Some months ago, we had an issue where one of our root SSDs disappeared after a reboot, but it reappeared shortly after and we shrugged it off as a transient issue. However, the problem occurred again, and this time it stayed, so we had to degrade the RAID setup to bring Codeberg back online.

We replaced the disk at the next opportunity, but the new drive (from a different vendor) also disappeared from the running system a few days later. Since some not-yet-clustered services were only running on that machine, we migrated some of them in a rush (finally also doing the internal gitea to forgejo renaming), in order to give us peace of mind while doing extended maintenance on that host. The search for the root cause of the issue is still ongoing.

Recently: Floating IP addresses, tool changes, clustering

In the past months, we have started to migrate to virtual (floating) IP addresses. We now have four separate addresses, which are used as follows:

  1. Codeberg.org (Forgejo)
  2. Codeberg Pages
  3. All other services (e.g. Woodpecker CI, Weblate (Codeberg Translate), Blog, Docs etc)
  4. (Reserved for future use.)

We also swapped some tools in our infrastructure: For the firewall, we moved from ufw to nftables, giving us much more control over the internal network than was possible with ufw. For example, our LXC containers have used NAT for internal connections, which made it hard to figure out which service traffic was coming from.

Our first host (Kampenwand) used the BTRFS filesystem (RAID 1 mode) for the NVMe root drives. Because we were unhappy with the performance for our use case, we chose "traditional" ext4 + MD software RAID for the other servers. Since we had to deal with hardware failure in Kampenwand, we have learned that BTRFS is not the most convenient tool for disaster recovery either, so we decided that it is simply not the right tool for root filesystems (for us). We have now unified the root setup of all servers to ext4, and are now only using BTRFS for some non-critical workloads and offsite backup machines.

Last but not least, we moved our reverse proxies from HAProxy to Caddy (excluding raw TCP traffic, which remains behind HAProxy), as we hoped that our team would have an easier time configuring and managing Caddy. Overall, this was indeed the case, but in several places we were surprised by unintuitive behaviour and had to apply some (arguably ugly) workarounds. For example, we are so far using workarounds for interface binding and HTTP redirect support, and we are not sure whether they are the best solution. If you have some knowledge of Caddy and would like to optimize the config with us, please let us know.

There has been progress in clustering our services. We now have a working Galera cluster for our database, distributed across the three servers, allowing writes to happen anywhere. We also have a Redict Sentinel setup with one main and two replication nodes.

The next steps need to happen in our Forgejo setup, to distribute load and add failover procedures. While the basic conditions now exist, we still need to finalize some details and go through a lot of careful testing to see how multiple instances of Forgejo can safely operate together. If you want to follow up on details, you can read the discussion issue within the Forgejo community.

Other things

What else has happened in the past months? A quick summary of good and bad.

FOSDEM 2025: We were at FOSDEM with our stand, and it was a pleasure to meet so many of you. For our Codeberg and Forgejo teams, with contributors scattered around the world, it is also a nice opportunity to meet many people from Europe and discuss ideas and motivation face-to-face.

Spam attacks: In addition to the SEO and advertisement spam that happens on a daily basis on Codeberg, we have been hit with targeted spam. In February, we saw racial slurs being sent to projects fighting for human rights on our platform, to which we published a clear response on our blog. Unfortunately, we have since been under attack as well, with spam striking back harder than before. Some things (like our mailboxes getting mass-subscribed to hundreds of newsletters) are easy to deal with, others are a little more annoying.

Spam has returned from time to time, including AI-generated fake issues in repositories on the platform (but since GitHub declared AI-generated junk to be an official feature that you cannot get rid of, we suppose it might just be someone trying to bring this to Codeberg. And no, we don't want it).

Most of the spam campaigns were successful because all our past protection focused on advertisements and was only triggered by hyperlinks, for example. We have since extended our protection to "even more useless" spam, and are waiting for the abuse reporting feature coming in the next version of Forgejo - thanks to generous sponsoring by NLnet.

New members: We are very happy about the huge increase in new members we have seen early this year. The main causes for the increase were people joining after FOSDEM, as well as people supporting our mission in response to the spam attacks we have faced. Thank you so much, we really appreciate your trust and support, and we are looking forward to your involvement within Codeberg e.V.

Forgejo development: There is so much going on, and we encourage everyone to read the monthly updates on the website. One thing we want to highlight: Codeberg is funding review and maintenance of federation code. This ensures that contributions don't get stuck due to lack of simultaneous contributors, which has been a problem in the past. Progress is still slower than we all hope - but at least it's steady!

AI crawling protection: The threat of gold-rush-style AI crawling of the web is not a new problem, but the impact on our infrastructure kept increasing. In the past, blocking certain IP ranges was enough to keep us protected for days. However, more and more AI companies seem to hide behind residential proxies, abusing consumer IP addresses to distribute traffic. Per-IP rate limiting is no longer effective, because any given IP address might only make about two requests per day.

As a consequence, we have decided to protect certain expensive routes of Codeberg.org and most of Weblate (Codeberg Translate) behind Anubis, which requires the browser to do some computation via JavaScript to prove that it has capacity to spare. This causes a slight increase in energy demand, but legitimate users only have to solve the challenge rarely, whereas the large amount of energy otherwise spent on serving heavy crawling is saved. Most current crawlers do not seem to execute the code, so their requests don't reach our backends.
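
The general idea behind such a challenge can be sketched in Python as a hash-based proof of work. This is an illustration under assumptions, not Anubis's actual implementation: the SHA-256 leading-zero scheme, the function names and the difficulty value are all chosen for the example.

```python
import hashlib
from itertools import count


def solve(challenge: str, difficulty: int) -> int:
    """Find the first nonce so that sha256(challenge + nonce) starts with
    `difficulty` zero hex digits. This is the costly part (client side)."""
    prefix = "0" * difficulty
    for nonce in count():
        digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
        if digest.startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Check a submitted nonce with a single hash (server side)."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)
```

In this scheme the client needs about 16 to the power of the difficulty attempts on average, while the server verifies with one hash. That asymmetry is what makes the cost negligible for an occasional visitor but noticeable for a crawler issuing requests from many fresh sessions.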

Community Spotlight

Liberating communities: At Codeberg, we believe in the power of free/libre software and healthy and transparent governance. More than two years ago, this was the reason we forked Forgejo from Gitea in order to fully liberate the software development. In retrospect, we are happy with this decision.

We are happy to stand behind those that choose to do the same within their communities, and send a warm welcome to the maintainers of CoMaps on Codeberg, who have decided to fork the popular offline and privacy-friendly map and routing application. Their first binary can already be downloaded from Codeberg, and the website looks pretty to us. All the best to CoMaps, please give it a try!

Libre operating systems: We are happy that two projects have chosen to rely on our platform for their development recently. The GNU Guix project has recently moved to Codeberg. It is a functional package manager based on Nix, which can either be used to manage packages on an existing system, or be used as a standalone GNU operating system. By the way, it also allows choosing the GNU Hurd kernel.

Last year, the Fedora project chose Forgejo as its new development platform, entering a close collaboration with the Forgejo developers and the Codeberg community. Although they will be hosting Forgejo on their own infrastructure, some of their infrastructure projects can also be found on Codeberg.

Development tooling: Running tests, and running them often, is one of the major cost factors in software development: it consumes time and energy. Why not focus only on the tests that actually matter to your change? There are surely many approaches to this, but most of them require manually defining run conditions. Testtrim takes a different approach: It flips data from your coverage analysis to decide which tests to run to cover a change. It has early support for Rust, Go and .NET, and has seen many improvements over the past months.
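
The underlying idea of flipping coverage data can be sketched in a few lines of Python. This is a simplified illustration, not Testtrim's actual implementation or data model: the file-level granularity, the function names and the sample data are assumptions.

```python
from collections import defaultdict


def invert_coverage(coverage):
    """Flip {test: files it executed} into {file: tests that executed it}."""
    tests_by_file = defaultdict(set)
    for test, files in coverage.items():
        for path in files:
            tests_by_file[path].add(test)
    return tests_by_file


def select_tests(coverage, changed_files):
    """Return the tests whose recorded coverage touches a changed file."""
    tests_by_file = invert_coverage(coverage)
    selected = set()
    for path in changed_files:
        selected |= tests_by_file.get(path, set())
    return selected


# Made-up coverage data from a previous full test run.
coverage = {
    "test_login": {"auth.py", "db.py"},
    "test_render": {"templates.py"},
    "test_signup": {"auth.py", "mail.py"},
}
```

With this sample data, a change to auth.py selects test_login and test_signup, while test_render is skipped. A change to a file that no test executed selects nothing here, so a real tool has to decide how to handle new or untracked files.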

Another common problem in software development is dealing with merge conflicts. Did you know that Git supports custom merge drivers? Take a look at Mergiraf, which knows about the syntax of many common programming languages and can deal with conflicts differently than the built-in Git merge drivers. Please give it a try and share your experiences.


Thank you for your trust and support, and for reading our updates.

We are looking forward to seeing you soon during our annual assembly!

Thank you for reading! Your Codeberg Public Relations team

--

Codeberg.org
Codeberg e.V. – Arminiusstraße 2 - 4 – 10551 Berlin – Germany
Registered at registration court Amtsgericht Charlottenburg VR36929.