this post was submitted on 15 Jun 2023
169 points (95.2% liked)
Programming
You could, but if you want to do it very efficiently and at scale, you would probably need to specialize your data access layer (see the sketch after this list):

- That efficiency is paid for in the logical organization that is enforced at write time (or during a maintenance task like rebuilding indices or recomputing statistics), where millisecond responsiveness is not as important.
- Expect lots of duplication across the different layers to support different access patterns and to reuse work between data retrieval tasks. You need to be able to efficiently access frequently requested data, ingest new data, synchronize data between the layers, and provide a reasonable minimum efficiency for arbitrary requests.
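Here's a minimal, in-memory sketch of what that kind of split can look like. All of the names (`MessageStore`, `by_author`, `recent_by_channel`) are made up for illustration; a real system would put each layer on separate infrastructure (cache cluster, primary store, search index), but the trade-off has the same shape: writes do extra work so that the common reads stay cheap.

```rust
use std::collections::{HashMap, VecDeque};

#[derive(Clone)]
struct Message {
    id: u64,
    channel_id: u64,
    author_id: u64,
    body: String,
}

struct MessageStore {
    // Primary layer: the source of truth, keyed for point lookups.
    primary: HashMap<u64, Message>,
    // Secondary index (duplicated data), maintained at write time so that
    // "messages by author" never has to scan the primary layer.
    by_author: HashMap<u64, Vec<u64>>,
    // Hot layer: a bounded window of recent messages per channel, for the
    // common "load the latest page of a channel" request.
    recent_by_channel: HashMap<u64, VecDeque<Message>>,
}

impl MessageStore {
    const RECENT_LIMIT: usize = 50;

    fn new() -> Self {
        Self {
            primary: HashMap::new(),
            by_author: HashMap::new(),
            recent_by_channel: HashMap::new(),
        }
    }

    /// Writes pay for the duplication: every layer is updated here, where
    /// millisecond responsiveness matters less than on the read path.
    fn insert(&mut self, msg: Message) {
        self.by_author.entry(msg.author_id).or_default().push(msg.id);

        let recent = self.recent_by_channel.entry(msg.channel_id).or_default();
        recent.push_back(msg.clone());
        if recent.len() > Self::RECENT_LIMIT {
            recent.pop_front();
        }

        self.primary.insert(msg.id, msg);
    }

    /// Frequently requested data, served entirely from the hot layer.
    fn recent_messages(&self, channel_id: u64) -> Vec<&Message> {
        self.recent_by_channel
            .get(&channel_id)
            .map(|window| window.iter().collect())
            .unwrap_or_default()
    }

    /// An "arbitrary" request with a reasonable minimum efficiency: one index
    /// lookup plus point reads instead of a full scan.
    fn messages_by_author(&self, author_id: u64) -> Vec<&Message> {
        self.by_author
            .get(&author_id)
            .map(|ids| ids.iter().filter_map(|id| self.primary.get(id)).collect())
            .unwrap_or_default()
    }
}

fn main() {
    let mut store = MessageStore::new();
    store.insert(Message { id: 1, channel_id: 7, author_id: 99, body: "hi".to_string() });
    store.insert(Message { id: 2, channel_id: 7, author_id: 99, body: "hi again".to_string() });

    for msg in store.recent_messages(7) {
        println!("recent in channel 7: {}", msg.body);
    }
    println!("author 99 wrote {} messages", store.messages_by_author(99).len());
}
```

In a real deployment the layers drift apart (cache invalidation, index rebuilds, late-arriving data), which is where the synchronization and maintenance work mentioned above comes in.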
Semi-related, here's a story about how Discord does it.
Great link, thanks!
Looks like Discord was running Cassandra (Java) on 177 nodes with 4 TB of disk each, and in 2022 migrated to ScyllaDB (C++) on 72 nodes with 9 TB each. Switching to a C++ database and writing their services in Rust let them finally end the latency spikes caused by Java garbage collection. A few other details that stood out:

- Messages are stored in buckets assigned by channel and time window. Each bucket is replicated across 3 nodes and accessed with "quorum consistency".
- They were still having trouble with "hot partitions", where many users all want the same bucket at once, driving up latencies. They solved it by putting a data service in front of the database that detects multiple identical incoming queries and pools them into a single database request (rough sketch of the idea below).
- The nodes still spend a lot of time periodically "compacting" their tables to keep disk reads fast.
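The hot-partition fix is essentially request coalescing: while one query for a bucket is in flight, every later request for the same bucket waits for that result instead of hitting the database again. Here's a rough sketch of that idea (assuming the tokio crate); `BucketKey`, `Coalescer`, and `fetch_bucket_from_db` are hypothetical stand-ins, not Discord's actual code.

```rust
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::{broadcast, Mutex};

/// Hypothetical bucket key: one bucket per (channel, time window).
#[derive(Clone, Eq, Hash, PartialEq)]
struct BucketKey {
    channel_id: u64,
    time_window: u64,
}

#[derive(Clone)]
struct Coalescer {
    // Buckets that already have a database query in flight.
    in_flight: Arc<Mutex<HashMap<BucketKey, broadcast::Sender<Arc<Vec<String>>>>>>,
}

impl Coalescer {
    fn new() -> Self {
        Self { in_flight: Arc::new(Mutex::new(HashMap::new())) }
    }

    /// Concurrent callers asking for the same bucket share one database query.
    async fn get_bucket(&self, key: BucketKey) -> Arc<Vec<String>> {
        let mut map = self.in_flight.lock().await;

        // A query for this bucket is already running: wait for its result.
        if let Some(tx) = map.get(&key) {
            let mut rx = tx.subscribe();
            drop(map); // don't hold the lock across the await
            // In this sketch, fall back to an empty result if the leader vanished.
            return rx.recv().await.unwrap_or_else(|_| Arc::new(Vec::new()));
        }

        // We are the first caller: register a channel, then run the query.
        let (tx, _initial_rx) = broadcast::channel(1);
        map.insert(key.clone(), tx.clone());
        drop(map);

        let result = Arc::new(fetch_bucket_from_db(&key).await);

        // Publish the result to everyone who piled up behind us, then clean up.
        let mut map = self.in_flight.lock().await;
        map.remove(&key);
        let _ = tx.send(result.clone()); // Err just means nobody else was waiting.
        result
    }
}

/// Stand-in for the real database query; hypothetical.
async fn fetch_bucket_from_db(_key: &BucketKey) -> Vec<String> {
    // Simulate a slow read from the storage cluster.
    tokio::time::sleep(std::time::Duration::from_millis(50)).await;
    vec!["message 1".to_string(), "message 2".to_string()]
}

#[tokio::main]
async fn main() {
    let coalescer = Coalescer::new();
    let key = BucketKey { channel_id: 42, time_window: 1950 };
    // Two readers hitting the same "hot" bucket; only one query reaches the database.
    let (a, b) = tokio::join!(coalescer.get_bucket(key.clone()), coalescer.get_bucket(key));
    println!("both callers got {} and {} messages", a.len(), b.len());
}
```

The waiter subscribes and the leader publishes while holding the same lock, so a request either joins an in-flight query or becomes the leader for a new one; there is no window where a result can be missed.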