this post was submitted on 04 Jul 2023
25 points (100.0% liked)

Right now, I have around 20TB of data in redundant ZFS mirrors, so I am somewhat protected against any single drive failing. Critical data is backed up at various cloud providers, but that's only a few gigs of all my data.

Looking at S3 pricing, it seems rather infeasible to back up my data there or on the other "big" cloud providers, as it would cost me around $180 with AWS or half of that with Backblaze.
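For comparison purposes, here is a rough sketch that turns the totals quoted above into implied per-TB rates. It uses only the figures from the post (20TB, about $180 for AWS, half of that for Backblaze); the billing period is not stated in the post, so the sketch does not assume one, and none of these numbers are current price-list values.

```python
# Implied per-TB rates from the totals quoted in the post.
# These are the poster's own estimates, not official prices.
DATA_TB = 20

quoted_totals = {
    "AWS S3": 180.0,         # "around $180 with AWS"
    "Backblaze": 180.0 / 2,  # "half of that"
}


def implied_rate_per_tb(total: float, size_tb: float) -> float:
    """Return the per-TB rate implied by a quoted total for a given data size."""
    return total / size_tb


rates = {
    name: implied_rate_per_tb(total, DATA_TB)
    for name, total in quoted_totals.items()
}

for name, rate in rates.items():
    print(f"{name}: ~${rate:.2f} per TB")
```

Swapping in a different size for `DATA_TB` (for example, only the irreplaceable part of the data) shows quickly how the bill shrinks when less is backed up offsite.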

How and where do you guys back up your data?

top 26 comments
[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

The real question is how much of that data is irreplaceable. While I hoard like most of you, I only back up offsite the handful of TBs I couldn't live without if there was a full system failure. It's not the perfect solution, but most of my hoarded data isn't mission critical.

EDIT: to answer your question, though, I use AWS Glacier storage.

[–] [email protected] 4 points 1 year ago (3 children)

Apart from the few gigs of really private and self-made data, most of it would probably be replaceable; it's just a matter of how much work that would be. On the other hand, I wonder how much of my media collection I would actually miss were it to get lost.

I will look into AWS Glacier, thank you.

[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

Sounds like we have similar setups, and knowing that I'd highly recommend investing in automating as much of your current setup as possible so you can quickly get things back up and running with little to no interaction. Backing up configurations and library metadata might prove to be pretty useful in resurrecting a dead server.

Also, rclone is going to be your best friend if you don't already have it set up.

[–] [email protected] 1 points 1 year ago

The egress fees from Glacier are astronomical, so if you ever need the data back you might just decide it’s worth re-downloading it from the original sources instead. Last I checked, Wasabi seemed a better option, but it's higher priced per month, of course.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

My line of thinking is that Radarr and Sonarr are my backups. If the drives went boom, I'd just have those two sync my library again. It may take a couple of weeks, but I can live with that.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago) (1 children)

The amount of data I back up offsite is significantly less than 20 TB. Therefore, my answer to your question will probably not help you.

I store my offsite backups at rsync.net and in one of Hetzner's Storage Boxes. For backups in general, I use Borg.

[–] [email protected] 2 points 1 year ago (1 children)

Hetzner Storage Boxes seem much more affordable than AWS, thanks for the suggestion!

[–] [email protected] 3 points 1 year ago

However, I would not use the storage boxes as the only backup. The offer has two disadvantages.

  • The boxes are regularly unavailable for some time due to maintenance work. But these maintenance times are announced in advance.

  • Hetzner does not specify what kind of RAID is used.

I therefore only use my box as an additional offsite backup and to swap out less important files.

[–] [email protected] 3 points 1 year ago (1 children)

You could look into AWS Glacier or the S3 Glacier Deep Archive tier. If you have 20TB stored, that’s about $20/month (YMMV), which isn’t wonderful, but that’s a lot of data, so it’s understandable.

Being a cheapskate, if I can get something back or it’s not crucial, it’s on a RAID array with snapshots. Everything else is either encrypted Duplicati backups to Google Drive (Windows) or encrypted Borg backups to BorgBase (Linux).

BorgBase is very reasonably priced, and if you have a large storage space in Google Drive due to having one of their other services, it’s a good use of it.

[–] [email protected] 2 points 1 year ago

Hadn't heard of BorgBase before, I'll check it out. Thanks.

[–] [email protected] 3 points 1 year ago

I use Scaleway Glacier since I could actually afford to pull the data out, unlike S3.

[–] [email protected] 3 points 1 year ago

I just have a Synology NAS running a Hyper Backup task to an external USB HDD that I physically drive to my parents' house whenever I go there, at which point the new data gets copied with rsync to another external drive I keep there.

Not the most ideal solution, but it works for now. Eventually it would be nice to get another NAS to keep at my parents' house and have nightly backups going over the internet.

[–] [email protected] 2 points 1 year ago
  • Backblaze B2 for my most important stuff. Encrypted, of course.
  • Backup HDDs and M-Discs stored at a friend’s house.
[–] [email protected] 2 points 1 year ago* (last edited 1 year ago)

It's not really enabled right now, but my offsite backup is a combination of a Raspberry Pi 4B, a QNAP TL-D800C and a Tasmota WiFi power plug at a family member's place.

I SSH in to the always-on Pi over a VPN connection, send a command to the Tasmota to turn on the QNAP disk shelf, run a zfs send, and once it finishes, shut the disk shelf down again.
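A minimal sketch of that sequence, written as a dry run that only builds and prints the steps rather than executing anything. The host names, plug address, dataset names and snapshot names are all made up for illustration, and the incremental `zfs send -i` form is an assumption, since the comment does not say which kind of send is used. The plug URL follows Tasmota's HTTP command format (`/cm?cmnd=...`).

```python
import shlex
from urllib.parse import quote


def tasmota_power_url(plug_host: str, state: str) -> str:
    """Build the Tasmota HTTP command URL that switches the plug on or off."""
    if state not in ("On", "Off"):
        raise ValueError("state must be 'On' or 'Off'")
    return f"http://{plug_host}/cm?cmnd={quote('Power ' + state)}"


def zfs_send_pipeline(dataset: str, prev_snap: str, new_snap: str,
                      ssh_target: str, remote_dataset: str) -> str:
    """Build an incremental `zfs send | ssh ... zfs receive` shell pipeline."""
    send = ["zfs", "send", "-i", f"{dataset}@{prev_snap}", f"{dataset}@{new_snap}"]
    recv = ["ssh", ssh_target, "zfs", "receive", remote_dataset]
    return " | ".join(shlex.join(cmd) for cmd in (send, recv))


# Hypothetical values: none of these names come from the comment above.
steps = [
    ("power on disk shelf", tasmota_power_url("192.0.2.10", "On")),
    ("replicate snapshot", zfs_send_pipeline(
        "tank/data", "snap1", "snap2", "pi@offsite-pi", "backup/data")),
    ("power off disk shelf", tasmota_power_url("192.0.2.10", "Off")),
]

for label, action in steps:
    print(f"{label}: {action}")
```

A real script would also need to wait for the disks to spin up and for the pool to be available before receiving, and should only power the shelf off once the receive has finished cleanly.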

[–] [email protected] 1 points 1 year ago

How about good old offsite HDDs, tapes, etc.? I guess it depends on the target. If it's family photos, you probably want something like this. After all, if you get hit by a bus and stop paying the hosting bill for a couple of months, all that stuff could be gone.

Variations on the scheme include rotating media into safe deposit boxes at a bank, etc.

[–] [email protected] 0 points 1 year ago (1 children)

I used to store a bunch of hard drives with ZFS snapshots of my stuff in the garage. Not ideal, but better than nothing, and it’s technically a separate building lol

I only have roughly 5TB of data though.

Definitely looking to improve the situation, cause at the moment I have no offsite backups at all :/

[–] [email protected] 2 points 1 year ago (1 children)

It would be marginally risky, but considering how many people have large storage arrays, having a “mutual backup compact” between two folks, where each can run backups to the other's array, would help get you an affordable offsite backup for catastrophes.

I see a bunch of people with 10TB of data and 30TB arrays, and if two of them got together they would both be reasonably safe from a total array failure.

[–] [email protected] 0 points 1 year ago (6 children)

This does sound interesting! Would need some tooling to lay my paranoia to rest though, and some trust towards the other person.

[–] [email protected] 2 points 1 year ago (1 children)

I can imagine a containerized service that only runs, say, ssh, which in turn only runs a forced command, like https://borgbackup.readthedocs.io/en/stable/usage/serve.html

And set up the container with the storage-opt option to limit space usage. It would make it harder to misuse the space or cpu, or break out into the hosting server.
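As a rough illustration of the forced-command idea, this sketch builds an `authorized_keys` entry that pins one SSH key to `borg serve`, in the style shown on the linked Borg page. The key, path and quota below are placeholders. Note that `--storage-quota` is Borg's own per-repository limit, which is a different mechanism from the container's storage-opt setting mentioned above; it is included here only as an assumption about what might be useful alongside it.

```python
from typing import Optional


def borg_serve_key_line(public_key: str, repo_path: str,
                        quota: Optional[str] = None) -> str:
    """Build an authorized_keys line that only allows `borg serve` for one repo."""
    # Keep the sketch simple: refuse paths that would need quoting
    # inside the command="..." option.
    if any(ch.isspace() or ch == '"' for ch in repo_path):
        raise ValueError("repo_path must not contain whitespace or double quotes")
    command = f"borg serve --restrict-to-repository {repo_path}"
    if quota is not None:
        command += f" --storage-quota {quota}"
    # `restrict` turns off port forwarding, PTY allocation and similar features.
    return f'command="{command}",restrict {public_key.strip()}'


# Placeholder key and path, purely for illustration.
line = borg_serve_key_line(
    "ssh-ed25519 AAAAC3Example friend@laptop", "/backups/friend", quota="5T")
print(line)
```

With a line like this in place, the other person's key can talk to Borg for that one repository and nothing else, regardless of what command their client asks for.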

You could go one step further and set up something like a Tailscale/Headscale network and only allow access over that, and limit the ACLs on the tailnet to only the ssh port. That should shield it from the Internet at large and also apply an absolute minimum of access to the other side.

I wonder if you could run the tailscale client within the container? Having it all together would make it actually usable.

I’m also looking at some of the distributed file systems out there. If one supports “m of n” connections to get the data, you could possibly use that to have the encrypted backups stored on multiple machines at once with more resilience.

[–] [email protected] 1 points 1 year ago

Tbh the idea does sound interesting, especially if there’s a way to do Shamir’s secret sharing on top of the encrypted snapshot or something. I’m not too worried about exposing my stuff to the internet, as I at least partially do that for a living, but I'd rather make sure I don't end up sending all my family’s documents in plaintext to some stranger on the internet.
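For anyone curious what Shamir's secret sharing looks like in practice, here is a toy sketch over a prime field. It assumes you would split a small secret such as the backup's encryption key (encoded as an integer) rather than the snapshot itself; that is an assumption about how one might apply it, not something stated above. It is unaudited and meant only to show the mechanics, so don't use it for real keys.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_secret(secret: int, threshold: int, shares: int):
    """Split `secret` into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range")
    if not 1 <= threshold <= shares:
        raise ValueError("need 1 <= threshold <= shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]

    def eval_at(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule, mod PRIME
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, eval_at(x)) for x in range(1, shares + 1)]


def recover_secret(points) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to get the secret back."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


key = 123456789  # stand-in for a real key encoded as an integer
parts = split_secret(key, threshold=3, shares=5)
assert recover_secret(parts[:3]) == key
assert recover_secret(parts[2:]) == key
```

With a 3-of-5 split like this, no single backup partner holds enough to reconstruct the key, but any three shares together do.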
