this post was submitted on 29 Sep 2023
Asklemmy
I'm referring to BIG storage: private clouds, data lakes, etc. For example, for my primary customer we've grown the object storage footprint by 100 petabytes in three years. The rest of the global footprint across 110 sites is another 95PB. Commodity services do not scale, and global data transmission is typically custom-tailored to the user requirements. Think of things like a first pass at the edge in 15 remote test sites, each crunching 100TB of raw data down to 10TB for transmission back to core, and that process happens on a clock. Other binary distribution use cases involve transmitting 50GB jobs from other continents back to core for analysis. It's all still custom. Then there's all the API back-end work to build out the customer-accessible storage APIs, with numerous challenges there.
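To put rough numbers on the edge-to-core example, here is a back-of-the-envelope sketch. The site count and data sizes come from the comment above; the 24-hour transfer window, the 1-hour window for a 50GB job, and the `required_gbps` helper are assumptions made purely for illustration, since the actual schedule isn't stated.

```python
TB = 10**12  # decimal terabyte, in bytes
GB = 10**9   # decimal gigabyte, in bytes


def required_gbps(payload_bytes: float, window_hours: float) -> float:
    """Sustained throughput (Gbit/s) needed to move payload_bytes within window_hours."""
    return payload_bytes * 8 / (window_hours * 3600) / 1e9


# Figures taken from the comment.
sites = 15
raw_per_site_tb = 100
reduced_per_site_tb = 10

# Assumption: the "clock" is a 24-hour cycle. The real window is not given.
window_hours = 24

total_raw_tb = sites * raw_per_site_tb          # raw data processed at the edge
total_reduced_tb = sites * reduced_per_site_tb  # data actually sent back to core

per_site_gbps = required_gbps(reduced_per_site_tb * TB, window_hours)
core_ingest_gbps = required_gbps(total_reduced_tb * TB, window_hours)

# Assumption: a 50GB job should land at core within one hour.
job_gbps = required_gbps(50 * GB, 1)

print(f"raw at the edge: {total_raw_tb} TB, sent to core: {total_reduced_tb} TB")
print(f"per-site uplink: {per_site_gbps:.2f} Gbit/s sustained")
print(f"core ingest:     {core_ingest_gbps:.2f} Gbit/s sustained")
print(f"50GB job in 1h:  {job_gbps:.2f} Gbit/s sustained")
```

Under those assumed windows, each site needs a bit under 1 Gbit/s sustained and the core has to absorb roughly 14 Gbit/s, before any protocol overhead, retries, or long-haul latency effects. A tighter clock scales those figures up proportionally.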
I'm trying to wrap my head around this. I've been stuck in the Mickey Mouse line-of-business world, where a company may have a few TB of transactional data in a decade, and I kind of want out into the real world. A few questions, if you don't mind: what kind of customer needs this amount of storage, what kind of data is it, and are you mostly building on top of S3?