I'm in a state that's affected by the OS age verification laws going into place in 2027. I don't give my date of birth to any website except my bank and for tax purposes. My phone doesn't have my actual date of birth, nor does Google or Facebook. I don't trust any of them with my data. But with the law taking effect, I think I might lose all access to the internet.
I have decided to make backups of key websites containing either knowledge or entertainment that I might miss in a world without internet.
1. {Next Day Edit: In my panic I forgot about copyrights. Let's keep it to websites that allow copying/mirroring or otherwise encourage page backups.}
2. Wikipedia, all text and images (the early 2025 version is about 120 GB compressed)
3. Have backed up several of my favorite Linux ISO install files.
4. Currently fetching all of gutenberg.org
5. GitHub sources of GCC, the Linux kernel, BusyBox, and SDL. I should probably do the same for any open source program that I use daily.
What websites do you recommend backing up?
Extra points if you think it might be "at risk" some time in the near future.
Also points if it's less than 15GB in final size; this project is very tedious with sub-1Mb connection speeds. Some quick "wins" would be very welcome.
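For a rough sense of what that connection speed means, here is a back-of-the-envelope estimate. It's only a sketch: I'm assuming "1Mb" means 1 megabit per second, decimal units (1 GB = 1000 MB), and no protocol overhead or stalls, so real transfers will take longer. The function name download_hours is made up for this example.

```python
def download_hours(size_gb: float, speed_mbit_per_s: float) -> float:
    """Hours needed to transfer size_gb (decimal gigabytes) at a steady
    speed_mbit_per_s (megabits per second). Ignores overhead and retries."""
    megabits = size_gb * 1000 * 8  # 1 GB = 1000 MB, 1 byte = 8 bits
    seconds = megabits / speed_mbit_per_s
    return seconds / 3600


# Assumed speed: exactly 1 megabit per second (the post only says "sub 1Mb").
for label, size_gb in [("15 GB target", 15), ("Wikipedia dump (~120 GB)", 120)]:
    print(f"{label}: {download_hours(size_gb, 1.0):.1f} hours")
```

Under those assumptions a 15 GB site is roughly a day and a half of continuous downloading, and the 120 GB Wikipedia dump is about 11 days, which is why small targets count as quick wins.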
Write to or contact your local representatives. For those in states that haven't been affected by this, look into what bills are being introduced into your local legislatures. If we just let this level of surveillance and control happen without a fight, they're only going to take more away from us. Hopefully people will also sue once the law goes into effect if it causes a real, quantifiable harm or loss (not sure if you're allowed to prematurely sue, not a lawyer).
Same thing with trying to get rid of AI surveillance devices like Flock. If representatives know that people won't stand for this crap, maybe they'll reverse it.
> 1. cplusplus.com's reference pages (using wget)
Surely you mean cppreference? cplusplus.com is a bit out of date these days.
The Kiwix project is looking like it's got a good chunk of what I'm after.
They have a collection of downloadable websites packaged as compressed files with a .zim extension: https://download.kiwix.org/zim/
I'm seeing a lot of Stack Exchange/Stack Overflow and all the little sub-categories for those, as well as many of Wikipedia's offshoots like Wiktionary and Wikiversity. There are also other education websites in the list. The cppreference docs can be found in the "devdocs" folder and are a very reasonable 7 MB download.
The "other" category has some strange and silly extras. Bulbagarden, really?
The Kiwix browser can cruise through the zim files just like a web browser. Links generally point back to content in the same zim file, and the browser shows when a link goes external, at which point it opens in Firefox or your default browser.
Sadly, I already came across some zim files where the content is downloaded but the links all still send you to external sources. As a quick fix on Linux, there are the zim-tools, which let you search, decompress, and create content in a zim file.
More about kiwix: I just made my first web backup to zim file.
I used zimwriterfs, found in Ubuntu's zim-tools package.
While you can use wget to first mirror a website, it is nice when a site gives you a good alternative.
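For the wget route, a minimal sketch of a mirroring command might look like the following. The wget options are standard ones, but the helper name wget_mirror_argv, the example URL, and the "mirror" output folder are placeholders I made up. Only do this on sites that allow mirroring, and keep the wait so you don't hammer their server.

```python
import shlex
import subprocess


def wget_mirror_argv(url: str, dest_dir: str, wait_seconds: int = 1) -> list[str]:
    """Build (but do not run) a wget command line that mirrors one site."""
    return [
        "wget",
        "--mirror",            # recursive download with timestamping
        "--convert-links",     # rewrite links so pages work offline
        "--adjust-extension",  # save HTML pages with an .html suffix
        "--page-requisites",   # also fetch the images/CSS each page needs
        "--no-parent",         # stay below the starting directory
        f"--wait={wait_seconds}",          # pause between requests
        f"--directory-prefix={dest_dir}",  # where the mirror is written
        url,
    ]


def run_mirror(url: str, dest_dir: str) -> None:
    """Actually run wget; needs wget installed and a network connection."""
    subprocess.run(wget_mirror_argv(url, dest_dir), check=True)


# Hypothetical URL, printed only so you can inspect the command first.
print(shlex.join(wget_mirror_argv("https://example.org/docs/", "mirror")))
```

Printing the command before running it is deliberate: on a slow connection you want to double check the options before committing hours to a download.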
I decided to back up SDL's documentation wiki, and was happy to find a little link in the top right corner of their page (called "offline html") that lets you download all documentation in a single zip. (It is currently 21MB zipped.)
zimwriterfs is a bit finicky in that there are a lot of required options you need to fill in before it will run.
After unzipping the docs, this is how I got zimwriterfs to build a zim file from the console:
$ zimwriterfs --welcome FrontPage.html -I sdl3Logo.png --language eng --title "SDL3 Dev Wiki" -L "A snapshot of the SDL3 documentation reference wiki" -d "The SDL3 documentation wiki" -c "Maintainers of the SDL libraries" -n SDL3_Wiki -p Newbieg SDL3/ ../SDL3_wiki.zim
sdl3Logo.png is a simple png that I drew in Krita and placed in the SDL3 folder. The initial download does not appear to have a png, and it's a required field of zimwriterfs, so you'll have to make it yourself and pop it into the site's root folder.
Put your own name or whatever you want after the -p (--publisher) option; it's meant to be you, the person publishing the zim file.
Also, while $ man zimwriterfs says that the -n (--name) option is optional, I got errors saying that it must be at least one character long if I skipped it.
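Since zimwriterfs only complains once you run it, here is a small sketch that checks the things that tripped me up (empty name, missing PNG or welcome page) before building the command. It uses only the flags from the command above. The function name zimwriterfs_argv and its parameter names are my own invention, and I'm assuming the welcome page and PNG are given relative to the site folder, as in my run.

```python
from pathlib import Path


def zimwriterfs_argv(site_dir, out_zim, *, welcome, illustration, language,
                     title, description, long_description, creator,
                     publisher, name):
    """Validate the inputs, then return the zimwriterfs command as a list.

    Nothing is executed here; pass the result to subprocess.run() yourself.
    """
    site = Path(site_dir)
    # The man page calls -n optional, but skipping it produced an error for me.
    if not name:
        raise ValueError("name (-n) must be at least one character long")
    # Assumption: both files live inside the site folder.
    for rel in (welcome, illustration):
        if not (site / rel).is_file():
            raise FileNotFoundError(f"{rel} not found inside {site}")
    return [
        "zimwriterfs",
        "--welcome", welcome,
        "-I", illustration,
        "--language", language,
        "--title", title,
        "-L", long_description,
        "-d", description,
        "-c", creator,
        "-n", name,
        "-p", publisher,
        str(site),
        str(out_zim),
    ]
```

With the SDL3 folder from above you would call it with welcome="FrontPage.html", illustration="sdl3Logo.png", name="SDL3_Wiki" and so on, then hand the returned list to subprocess.run(argv, check=True).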