this post was submitted on 20 Nov 2023

Hi everyone. I was considering backup options to Glacier Deep Archive, and wanted to know:

  1. Which software do you use to encrypt client-side, obfuscate, compress and deduplicate the data before you send it to S3?
  2. What is the difference between Restore Requests (bulk) and Outbound data transfer and which one will I be using when I want to pull my data from AWS?

I'll be storing approximately 8TB or so of data, which is why I was looking at inexpensive ways to back it up other than buying an HDD outright.

Thanks!

top 48 comments
[–] [email protected] 9 points 11 months ago* (last edited 11 months ago) (1 children)

Recently I looked into the same thing, since AWS caught my eye with their apparently ridiculously low prices. Then I found this (presumably independent) review that changed my view on things: https://b3n.org/b2-vs-s3-nas-backup/

After reading that, I won't go with AWS. I'm currently considering abusing the OneDrive Office Family plan, which costs 99 $ a year for 6 TB of storage (split across 6 accounts), which comes down to about 1,40 $ per month per TB. That is a price I have not seen beaten by other storage / backup providers.

[–] [email protected] 3 points 11 months ago* (last edited 11 months ago) (3 children)

The problem with AWS is that one needs to bring all storage from Glacier to regular S3 (which is quite expensive), and then their egress costs are massive. I read more about it and completely agree that AWS is not worth it at all once you have to pull your data back.

However, BackBlaze is quite expensive. I'd be paying $50 for my storage, which is simply not how much I'd like to pay. At this rate, I'd get a 10TB Ironwolf drive in an enclosure from Amazon for my cold storage.

TBH, I was looking at other providers, and Dropbox looks much less expensive at $20 for 9TB. It's a lot more than the 8TB I thought I'd be able to get away with for Glacier storage, but at least it's not $50. I will take a look at Scaleway, but I simply trust the bigger players more than the smaller companies to be around and keep my files safe.

Edit: Scaleway seems less expensive than Dropbox with their glacier option. I really hope I can trust this company because this is an excellent price and I might even be willing to pay as much if it can be kept secure and safe. Now, I just need to read more about the value proposition of Dropbox vs Scaleway Glacier.

[–] [email protected] 1 points 11 months ago (1 children)

How are you getting a 10TB HDD for $50?

[–] [email protected] 1 points 11 months ago (1 children)

If I were to pay $50 a month for 8TB, I'd have paid about $200 in 4 months. I can get a 10TB-12TB external drive (with Ironwolf/WD Red Pro CMR inside) for about $200.

I should have worded that properly, apologies.

[–] [email protected] 2 points 11 months ago

No worries, I should have realized it's a recurring payment. Yes, at this rate you'll make up the cost of the HDD in 2-3 months.

[–] [email protected] 1 points 11 months ago (1 children)

Regarding Dropbox: Where are you seeing 9 TB for 20 $? I'm in the EU so my pricing may vary, but all I can see is the Business plan for 16 € per month per user, with a minimum of 3 users in the plan, making it cost 48 € instead. Do you have access to something else?

Regarding Backblaze: Agreed that 50 $ a month is a rough bill to pay, and it adds up very fast if you're counting across the years. But their storage is also a lot more reliable than one single hard drive stored in a bank locker, since they are always checking their arrays and replacing aged drives.

Regarding Scaleway: If I'm reading their pricing chart right, it would cost roughly 2 € / month / TB for glacier storage, and 9 € / TB when restoring from glacier to standard storage? A big question mark for me is how ingress works. If I'm using this for backups in case of total system failure, I'll want to upload differential backups (borg/duplicati) every couple of days. How is that going to work with pricing: is that all running through standard storage, driving up the monthly cost? Do I have to manually manage file history and deletion of older stuff, or does my backup software handle that? Plus, you lose out on the instant file access that you get with Backblaze, or with something hacky like Dropbox / OneDrive. I'm still undecided which I value more, money or fast access.

[–] [email protected] 1 points 11 months ago

My apologies, I seem to have missed the per user part. I'm looking at $20 a user a year, which obviously makes it insanely expensive.

[–] [email protected] 1 points 11 months ago (1 children)

Hello ! Just adding my two cents for Scaleway. I’ve used them personally for some services (and probably will add s3 storage in the near future)

It seems pretty reliable in my opinion.

[–] [email protected] 2 points 11 months ago

Does seem reliable from what I read about that. Thanks, I'm considering them as an option.

[–] [email protected] 7 points 11 months ago (2 children)

That class of storage is very expensive to get your data back. Buying a drive will be cheaper.

[–] [email protected] 2 points 11 months ago (1 children)

Wait, I'm looking at the data retrieval cost (bulk request) and it says it's priced at $0.0025 per GB? That comes out to about $21 for a full retrieval of my 8TB! Am I missing something important?

[–] [email protected] 4 points 11 months ago (1 children)

Take a look at the calculations here https://www.arqbackup.com/aws-glacier-pricing.html

It explains it a bit better. You have to factor in how many requests you need too. So both file sizes and amount of files.
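
To make the point about request counts concrete, here is a rough Python sketch of a restore estimate. Only the $0.0025 per GB bulk retrieval rate comes from this thread. The per-request and egress rates are placeholder assumptions that should be replaced with the numbers from the current AWS pricing page, and the function name and parameters are made up for illustration.

```python
def estimate_restore_cost(
    total_gb: float,
    object_count: int,
    retrieval_per_gb: float = 0.0025,   # bulk rate quoted in this thread
    per_1000_requests: float = 0.025,   # ASSUMED placeholder, check the pricing page
    egress_per_gb: float = 0.09,        # ASSUMED placeholder, check the pricing page
) -> dict:
    """Rough cost of restoring every object once and downloading it once."""
    retrieval = total_gb * retrieval_per_gb
    # Restore requests are billed per object, so many small files cost more
    # than a few large archives holding the same amount of data.
    requests = (object_count / 1000) * per_1000_requests
    egress = total_gb * egress_per_gb
    return {
        "retrieval": round(retrieval, 2),
        "requests": round(requests, 2),
        "egress": round(egress, 2),
        "total": round(retrieval + requests + egress, 2),
    }


if __name__ == "__main__":
    # 8 TB stored as one million objects versus the same data in 100 archives.
    print(estimate_restore_cost(8 * 1024, 1_000_000))
    print(estimate_restore_cost(8 * 1024, 100))
```

Under these assumed rates, the retrieval line matches the roughly $21 figure above, while the request and egress lines are what the simple per-GB number leaves out.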

[–] [email protected] 1 points 11 months ago

Thanks, that makes it clear

[–] [email protected] 2 points 11 months ago (1 children)

I think that people would be using the service as a last resort, like when all other local or physical offsite backups fail.

In that sense, the cost to recover shouldn't be the main factor when considering it.

[–] [email protected] 2 points 11 months ago (2 children)

Is there a less expensive alternative for Cloud storage with a decent SLA? I don't want to go for the smaller companies, and BackBlaze is quite expensive too!

[–] [email protected] 1 points 11 months ago (2 children)

With my Synology NAS, I use iDrive e2 for cloud storage. Reasonably priced, and it integrates with Synology's Hyper Backup software.

But my needs are relatively small, sending < 5TB to my cloud backup. A few more TB and I may start looking at other options.

[–] [email protected] 2 points 11 months ago (1 children)

I plugged my numbers into AWS, and I'm looking at $9 a month for storage with $21 for a bulk retrieval. That's quite inexpensive, which is why I'm starting to think that I'm missing something important.

[–] [email protected] 1 points 11 months ago

Scaleway also offers glacier storage class. ~€0.002/GB/month. €0.009/GB retrieval. €0.01/GB transfer.
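
As a quick sanity check, the three rates above can be applied to the 8TB from the original post. This is only a sketch: the rates are the approximate ones quoted in this comment, the function name is made up, and billing in decimal terabytes (1 TB = 1000 GB) is an assumption.

```python
GB_PER_TB = 1000  # ASSUMPTION: the provider bills in decimal terabytes


def scaleway_glacier_estimate(
    tb: float,
    storage_per_gb_month: float = 0.002,  # approximate rate quoted above
    retrieval_per_gb: float = 0.009,      # rate quoted above
    transfer_per_gb: float = 0.01,        # rate quoted above
) -> dict:
    """Monthly storage cost and the cost of one full restore, in euros."""
    gb = tb * GB_PER_TB
    return {
        "monthly_storage_eur": round(gb * storage_per_gb_month, 2),
        # A full restore pays retrieval (glacier to standard) plus transfer out.
        "full_restore_eur": round(gb * (retrieval_per_gb + transfer_per_gb), 2),
    }


if __name__ == "__main__":
    print(scaleway_glacier_estimate(8))
```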

[–] [email protected] 1 points 11 months ago (1 children)

What other options? I was looking at the Hetzner Storage Box and it seems pretty reasonable for storage, about $13 for 5 TB.

[–] [email protected] 1 points 11 months ago

I'm not at the "other options" stage yet, as my iDrive will renew for another year in a week or so.

At some point, it may be cheaper if I set up a small NAS at a family member's house and stick an 8TB or 12TB drive in there.

Really, the cloud backup for me is the last resort, and I have other redundancies available well before I'd need to use a cloud backup.

[–] [email protected] 1 points 11 months ago

idrive was good when I used them.

[–] [email protected] 5 points 11 months ago (1 children)

Lots of answers in the comments about this particular storage type/vendor. Regardless, to answer your original question: rclone. Hands down. If you spend 30-60 minutes actually reading their documentation, you are set and understand so much more of what’s going on under the hood.

[–] [email protected] 1 points 11 months ago (1 children)

Thanks, I do know of rclone and intend to study it. I was just wondering if the likes of borg, duplicati etc could be used.

[–] [email protected] 1 points 11 months ago (1 children)

Can’t speak for those but I tried Kopia and it did the job okay. Ultimately tho I landed on rclone.

[–] [email protected] 1 points 11 months ago (1 children)

Which cloud provider do you use?

[–] [email protected] 1 points 11 months ago (1 children)

I’m currently using Backblaze. I also researched Wasabi and AWS.

[–] [email protected] 1 points 11 months ago (1 children)

How much storage do you use on B2? Does it not feel quite expensive to you? Even Wasabi is quite expensive, although it's not as bad as AWS.

I was recommended iDrive e2 by another commenter, and now that I look at it, it is likely the best product I have come across, reliability aside. I have never heard of this company before, and considering that this is very important data to me, I'd like to have a reliable company behind it.

[–] [email protected] 2 points 11 months ago (1 children)

Well here’s my very abbreviated conclusion (provided I remember the details appropriately) when I did the research about 3 months ago.

Wasabi - okay pricing, reliable, s3 compatible, no charges to retrieve my data, pay for 1tb blocks (wasn’t a fan of this one), penalty for data retrieval prior to a “vesting” period (if I remember correctly, you had to leave the data there for 90 days before you could retrieve it at no cost. Also not a big fan of this one)

AWS - I’m very familiar with it due to my job, pricing is largely influenced by access requirements (how often and how fast do I want to retrieve my data), very reliable, s3, charges for everything (list, read, retrieve, etc). This is the real killer and largely unaccounted cost of AWS.

Backblaze - okay pricing, reliable, s3 compliant, free retrieval of data up to the same amount that you store with them (read below), pay by the gig (much more flexible than Wasabi). My heartburn with Backblaze was that retrieval stipulation. However, they have recently increased it to free up to 3x of what you store with them which is super awesome and made my heartburn go away really quickly.

I actually chose Backblaze before the retrieval policy change and it has been rock solid from the start. Works seamlessly with the vast majority of utilities that can leverage s3 compliant storage. Pricing wise, I honestly don’t think it’s that bad

Hope this helps

[–] [email protected] 1 points 11 months ago* (last edited 11 months ago) (1 children)

Here's my situation; I anticipate about 8TB that I will need to store reliably.

That's $50 with BackBlaze B2 a month.

I can get 2 12TB drives for $500 total, and keep one/both of them in remote locations (may or may not be connected to the internet, so I suppose the convenience just isn't there like the Cloud).

The supposed value of the cloud is becoming a bit difficult for me to justify TBH. No doubt B2 is reliable, but if I have 2 drives acting as cold storage in different locations (I will be encrypting the contents), is that a better idea than Cloud storage/BackBlaze specifically? I have been assured that the remote locations should be fine for the most part, other than for natural calamities.
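
For what it's worth, the break-even point between the two options is simple arithmetic. A minimal sketch using only the figures from this comment ($500 for two drives, $50 a month for B2); the function name is made up, and it ignores electricity, drive failures and the time spent rotating drives.

```python
import math


def break_even_months(upfront_cost: float, monthly_fee: float) -> int:
    """Number of months after which cumulative cloud fees reach the upfront cost."""
    return math.ceil(upfront_cost / monthly_fee)


if __name__ == "__main__":
    # Two 12TB drives for $500 total versus $50 a month for cloud storage.
    print(break_even_months(500, 50))  # prints 10
```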

[–] [email protected] 2 points 11 months ago (1 children)

Honestly what really matters (imo) is that you do offsite storage. Cloud, a friends house, your parents, your buddy’s NAS, whatever. Just get your data away from your “production/main” site.

For me, I chose cloud for two main reasons. First, convenience. I could use a tool to automate the process of moving data offsite in a reliable manner, thus keeping my offsite backups almost identical to my main array, with easy retrieval should I need it. Second, I don’t really have family or friends nearby and/or with the hardware to support my need for offsite storage.

There are lots of pros and cons of each, let alone add your specific needs and circumstances on top of it.

If you can use the additional drives later on in your main array, some other server or for a different purpose, then it may be worthwhile exploring the drives (my concern would be the ease of keeping offsite data up to par with main data). If you don’t like it for one reason or another, you can always repurpose the drives and give cloud storage a try. Again, the important thing is to do it in the first place (and encrypt it client side).

[–] [email protected] 1 points 11 months ago

There are 3 main reasons (in my particular scenario) that might prompt me to go for the cloud:

  1. Reliability of infrastructure.
  2. Convenience.
  3. (Supposed) Bitrot protection (I won't have the protection, just the detection, since I'll be using standalone drives with ZFS).

I need to think a bit more. Thanks!

[–] [email protected] 3 points 11 months ago (1 children)

  1. I don't encrypt before I push to S3. Probably bad practice on my part. I just rely on AWS encryption to secure my data. My backups are low-risk (imo). That said, I lock down the bucket so that only my account can access the objects. For compression I use tar cjf (bzip2). Protip: Once the tar file is made, run tar tjf $archiveFile > archiveFile-ls.txt and store the resulting file along with the tar file in standard storage. That way you know what is in the archive.

  2. Both. A Restore Request copies the data out of Glacier into Standard storage. Note that I said copy. When you perform a restore, your original object stays in Glacier and AWS creates a copy somewhere in S3 that you specify. Once the restore is complete, you can then download the copied object like any S3 object, triggering the Outbound data transfer fee.
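
The archive-plus-listing tip from point 1 can also be sketched in Python with the standard `tarfile` module. This is an illustration, not the commenter's actual script: the function name is made up, and the `-ls.txt` suffix just mirrors the file name used above.

```python
import tarfile
from pathlib import Path


def archive_with_listing(source_dir: str, archive_path: str) -> Path:
    """Create a bzip2-compressed tar of source_dir and write a listing next to it."""
    source = Path(source_dir)
    archive = Path(archive_path)

    # Build the compressed archive (same idea as creating it with tar and bzip2).
    with tarfile.open(str(archive), "w:bz2") as tar:
        tar.add(str(source), arcname=source.name)

    # Re-open the finished archive and record every member name, one per line.
    listing = archive.with_name(archive.name + "-ls.txt")
    with tarfile.open(str(archive), "r:bz2") as tar, open(
        listing, "w", encoding="utf-8"
    ) as out:
        for member in tar.getmembers():
            out.write(member.name + "\n")
    return listing
```

The small listing file can stay in standard storage next to the archive, so you can see what an archive contains without paying to restore it first.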

[–] [email protected] 1 points 11 months ago

Thanks, I'll keep that in mind. I'd encrypt everything client-side since I don't want anyone to know what I'm storing, including the Cloud provider.

[–] [email protected] 3 points 11 months ago (1 children)
[–] [email protected] 2 points 11 months ago (1 children)

Thanks! Which cloud provider do you use?

[–] [email protected] 2 points 11 months ago (1 children)

I've been using S3 but I'm considering Cloudflare R2 as it might be a bit cheaper

[–] [email protected] 1 points 11 months ago (1 children)

Really? Their pricing is even more expensive than AWS' S3 Glacier Archive! I'd much rather use BackBlaze B2 than pay that much!

[–] [email protected] 2 points 11 months ago* (last edited 11 months ago) (1 children)

Tbh I don't really bother with Glacier. It is a lot more expensive than it seems especially when you want to restore anything.

I generally just use intelligent tiering and it kind of balances out.

You might think "oh well I'm probably never going to restore from here anyway"

I am here to tell you that's a very foolish attitude.

If you aren't testing your backups you might as well not have them.

My honest advice if you must insist on using Glacier is to start off in a normal tier, and keep it there long enough to have tested the backups before transferring it as-is into Glacier.

It's not perfect as there's really no guarantee that data remains safe but at least it mitigates the possibility and reduces the cost to initially use standard tiers before retiring it to Glacier.

[–] [email protected] 1 points 11 months ago (1 children)

You're right about testing backups. I will have 2 different backups, one for my config and the second for the irreplaceable media. Indeed, restoration from Glacier is too expensive for the data that I plan to back-up.

I was looking at Scaleway's Glacier offering, B2 and iDrive. How do you propose I test my backups? I could certainly pull in my config and test it on a VM, but how do I check that I have backed up my media? I plan to encrypt, compress, deduplicate and then ship it off.

[–] [email protected] 1 points 11 months ago (1 children)

I guess it depends on how you do it.

I use Kopia so I can easily mount a snapshot like a removable disk or restore a snapshot so I typically test my backups by simply restoring them
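
For media, one way to confirm that a restore really matches the originals is to compare checksums of the two directory trees once the snapshot has been restored or mounted. A minimal sketch using only the Python standard library; the function names are made up, and it assumes both the original and the restored files are reachable as local paths.

```python
import hashlib
from pathlib import Path


def tree_hashes(root: str) -> dict:
    """Map each file's path (relative to root) to its SHA-256 digest."""
    base = Path(root)
    hashes = {}
    for path in sorted(base.rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with open(path, "rb") as fh:
                # Read in 1 MiB chunks so large media files do not fill memory.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
            hashes[path.relative_to(base).as_posix()] = digest.hexdigest()
    return hashes


def compare_trees(original: str, restored: str) -> dict:
    """Report files that are missing, unexpected, or different after a restore."""
    a, b = tree_hashes(original), tree_hashes(restored)
    return {
        "missing": sorted(set(a) - set(b)),
        "extra": sorted(set(b) - set(a)),
        "changed": sorted(k for k in a.keys() & b.keys() if a[k] != b[k]),
    }
```

An empty result in all three lists means every file came back byte for byte identical.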

[–] [email protected] 1 points 11 months ago
[–] [email protected] 2 points 11 months ago* (last edited 10 months ago) (2 children)
[–] [email protected] 1 points 11 months ago* (last edited 10 months ago) (1 children)
[–] [email protected] 1 points 11 months ago

I'm using it too and even the current prices are reasonable (especially if you consider there's no other fees, no transfer, no ingress, no egress, ...). If you put it in S3 glacier and you ever have to restore a relevant chunk of your data (or god forbid, want to do periodic testing of the backed up data) then you'll be paying quite a bit of fees.

[–] [email protected] 1 points 11 months ago (1 children)

I have been recommended iDrive a lot under this post. How reliable are they?

[–] [email protected] 2 points 11 months ago* (last edited 10 months ago) (2 children)
[–] [email protected] 1 points 11 months ago* (last edited 10 months ago)
[–] [email protected] 1 points 11 months ago

Thanks. I was wondering about the reliability of data storage/infrastructure of iDrive specifically. For example, I'm fairly sure that I can keep my data in AWS Glacier/B2 for 10 years or so and nothing much would happen (assuming Backblaze doesn't just die). Can I assume that for iDrive? Is this an old company with many years in the business? Their offerings seem amazing, it's just the perceived risk from lack of information that is holding me back.