this post was submitted on 05 Jul 2023
7 points (88.9% liked)

Lemmy Support

Support / questions about Lemmy.

Matrix Space: #lemmy-space

founded 5 years ago

I'm just trying to gauge if the performance gain will be worth the additional effort and have some questions. Was directed here from asklemmy.

I've read that back end communication is relatively cheap compared to end user content presentation in Lemmy. So, that leads me to believe that if I host my own instance, even without any communities, it would present content from other instances to me faster and more reliably. Are these assumptions correct?

Does an instance do any content caching for other instances? I.e., if I browse [email protected] and someone else does the same, will my instance need to make new requests to lemmy.ml?

Are images cached from other instances?

Obviously if my instance goes down, there's no service. Is there some sort of high availability or clustering supported?

Are updates relatively straightforward on Docker? I assume just pull the new image and you're good to go, or are there usually database migrations to complete outside of that?

Thanks for reading!

[–] [email protected] 8 points 1 year ago* (last edited 1 year ago) (1 children)

... if I host my own instance, even without any communities, it would present content from other instances to me faster and more reliably.

Yes, provided federated replication works reliably.

Does an instance do any content caching for other instances?

Yes, federated replication occurs at post/comment/vote time. The on-demand reads for a user-account browsing lemmy occur out of the db for the instance that hosts their account, not the instance that hosts the community.

Are images cached from other instances?

Thumbnails yes, fullsize, no. Fullsize images are the main exception to replication/caching.

Obviously if my instance goes down, there’s no service. Is there some sort of high availability or clustering supported?

This is an advanced topic. Lemmy is multiple processes working in concert: some of them are easy to cluster, others are trickier, and others are impossible. Federated replication will handle standard 10-minute server restarts through retries. If your instance is down for many hours or days, some messages will never get delivered to it and you'll have a gap in your timeline.
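To make the "retries cover short restarts, not long outages" point concrete, here is a minimal sketch of an exponential backoff schedule. The base delay and retry count are made-up illustration values, not Lemmy's actual settings, and `retry_schedule` / `survives_outage` are hypothetical helper names.

```python
def retry_schedule(base_delay_s=60, max_retries=8):
    """Cumulative seconds after the first failed delivery at which each retry fires.

    Assumption: delays double on every attempt (60s, 120s, 240s, ...).
    These numbers are illustrative only, not Lemmy's real configuration.
    """
    times, elapsed = [], 0
    for attempt in range(max_retries):
        elapsed += base_delay_s * 2 ** attempt
        times.append(elapsed)
    return times


def survives_outage(outage_s, base_delay_s=60, max_retries=8):
    """True if at least one retry fires after the receiving instance is back up."""
    return any(t >= outage_s for t in retry_schedule(base_delay_s, max_retries))


ten_minute_restart = survives_outage(10 * 60)       # a retry lands after the restart
two_day_outage = survives_outage(2 * 24 * 60 * 60)  # all retries exhausted first
```

Under these assumed values the last retry fires a little over four hours after the first failure, so a 10-minute restart is covered while a multi-day outage leaves a gap.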

A single-user instance will present data that has been replicated to it faster than a big overloaded instance. It's getting less clear that lots of single-user instances are good for the health of the overall network, though. If a big instance federates with 1000 other instances, it has to copy every post, comment, and vote 1k times. If each of those copies is serving hundreds or thousands of users... that's a pretty good deal. If each is serving 1 user who only reads 10% of the stuff... it looks a lot less efficient. And we're starting to see more common lag and breakage in replication; whether that's v0.18 compatibility issues or federated replication performance issues, I don't know. But federated networks do struggle as the size of the network becomes very large, and this is the biggest the lemmyverse has ever been. Large instances may start to struggle to deliver federated messages for a while, and more single-user instances would exacerbate that if it's happening.
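A rough back-of-the-envelope version of that fan-out argument, as a sketch: the 1000 federated instances and the 10% read rate come from the numbers above, while the 10,000 activities and the 500 users per instance are assumed figures picked only for illustration.

```python
def total_deliveries(activities: int, federated_instances: int) -> int:
    """Each post/comment/vote is copied once to every federated instance."""
    return activities * federated_instances


def reads_per_delivery(users_per_instance: int, read_fraction: float) -> float:
    """How many user reads one delivered copy ends up serving, on average."""
    return users_per_instance * read_fraction


# Cost side: identical whether the receivers are big or tiny.
deliveries = total_deliveries(activities=10_000, federated_instances=1_000)

# Benefit side: depends entirely on who sits behind each receiving instance.
big_instance = reads_per_delivery(users_per_instance=500, read_fraction=0.10)
single_user = reads_per_delivery(users_per_instance=1, read_fraction=0.10)
```

The sending instance does the same amount of work in both cases; only the number of reads each copy serves changes, which is why the single-user case looks so much less efficient.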

[–] [email protected] 2 points 1 year ago (1 children)

Thank you for this write-up, extremely helpful. I'd probably have like 5 people on the instance, but network health is a very real concern. Probably want 100 users minimum or something per instance to be efficient.

[–] [email protected] 1 points 1 year ago (1 children)

I dunno where the break even point is. 5 active users who read 90% of the posts in their subscribed feed could easily be a win. But on the flip side, the network may just struggle to deliver federation messages beyond a certain instance count irrespective of how many users are on each instance. We're kind of all learning as we go.

If you want to stand up an instance, go for it. I'm just highlighting possible tradeoffs, but they're changing quickly and I don't think anyone knows what the right course is beyond "a medium number of medium sized instances is probably optimal".

[–] [email protected] 2 points 1 year ago (1 children)

That makes sense. Is there a place to monitor the general health of the entire network, or would it end up being greater replication time, and misses?

[–] [email protected] 0 points 1 year ago* (last edited 1 year ago)

I[s] there a place to monitor the general health of the entire network...

Not really. I just watch a lot of support communities, admin communities, and big instance announcement channels to keep up with the technical vibe across the lemmyverse... but even spending all that time reading posts/comments, I have no idea what is an individual admin not knowing how to tune their instance and what are fundamental limits that matter.

You kind of just have to make a call with incomplete info and live with it. If you want to run an instance, do it... there's no clear consensus that we're at the network size limit. Just be aware that the performance tradeoffs are complicated and I wouldn't recommend telling everyone that the "solution" is single-user instances, because that way lies a different set of performance problems that are at least equally serious.