this post was submitted on 22 Aug 2023
47 points (98.0% liked)

Lemmy Support

Support / questions about Lemmy.

I was getting close to hitting the end of my free object storage so there was time pressure involved haha.

Seems to work but I haven't tested it too much. Currently running on my instance.

top 21 comments
[–] [email protected] 21 points 1 year ago (1 children)

An option to set a cache limit would be nice. Caching is okay, as long as you can clear the cache. Something like Mastodon's caching, where you can prune all remote media via tootctl.

Turning it off completely will increase the traffic to all other instances -> longer loading times 🤔
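
For illustration, here is a rough sketch of what a size-based cache limit could look like. This is not Lemmy or pict-rs code; `CachedFile` and `files_to_evict` are made-up names, and it assumes the cache can report a size and a last-used time for each file. It picks the least recently used files until the total fits under the limit.

```python
from typing import NamedTuple


class CachedFile(NamedTuple):
    """Hypothetical record for one cached remote image."""
    name: str
    size_bytes: int
    last_used: float  # Unix timestamp of the last access


def files_to_evict(files: list[CachedFile], limit_bytes: int) -> list[str]:
    """Return names of the least recently used files to delete so that
    the remaining cache fits within limit_bytes."""
    total = sum(f.size_bytes for f in files)
    evict: list[str] = []
    # Walk from oldest to newest access time and stop once we fit.
    for f in sorted(files, key=lambda f: f.last_used):
        if total <= limit_bytes:
            break
        evict.append(f.name)
        total -= f.size_bytes
    return evict
```

A real implementation would also have to delete the files and keep the media database in sync, which is the complex part.
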

[–] [email protected] 8 points 1 year ago

100% agreed. I don't have the time to make a change that complex right now, so I did a fairly blunt approach with the hope that larger instances will keep caching on to reduce load.

[–] [email protected] 7 points 1 year ago (2 children)

I think the local caching was intentional to reduce load on remote instances, should we disable it?

[–] [email protected] 16 points 1 year ago (1 children)

What I'd rather see is cleanup of cached data and more granular control of that cleanup, rather than completely disabling it.

[–] [email protected] 3 points 1 year ago (1 children)

Agreed, I sadly don't have the time to implement that.

[–] [email protected] 1 points 1 year ago

Fair enough.

[–] [email protected] 6 points 1 year ago* (last edited 1 year ago) (1 children)

That's why I made it a config option that defaults to true (defaults to caching on).

I think big instances should cache, but for smaller instances with less funding and resources it makes sense to skip the caching.

[–] [email protected] 2 points 1 year ago

This also helps mitigate the risk of people posting CSAM to attack other communities which your instance is subscribed to, right? If your instance never cached the image, there's no clean-up you have to do on your end, provided the original instance removes the image from their server.

As you've mentioned, it makes sense for larger instances to have a cache, but smaller instances (especially single-user instances) may actually be better off not caching at all and just hosting their own images. As a more long-term solution which can add to this patch, it would be good if Lemmy did 2 things:

  1. Separated the image cache for images from other instances so it can be cleared automatically on a schedule. E.g. Images which are a local cache are deleted after X days. Yes there are proper caching algorithms used in filesystems which would be better long-term, but a quick solution for this is probably better than no solution.
  2. Periodically check for images which were uploaded by your own users to see if they are being referenced by any posts or comments. If not, delete them. I would imagine this could be a fairly intense operation so limiting this more fine-grained approach to images uploaded by your own users and taking the more liberal approach with cached images may help performance.
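
The two ideas above can be sketched roughly as follows. This is only an illustration under loud assumptions: it pretends cached images are plain files in a directory and that you can list upload identifiers and referenced identifiers as sets. Neither assumption reflects how pict-rs or Lemmy actually store things, and the function names are made up.

```python
import time
from pathlib import Path
from typing import Optional


def expired_cache_files(
    cache_dir: Path, max_age_days: float, now: Optional[float] = None
) -> list[Path]:
    """Idea 1: cached files not modified in the last max_age_days days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    return sorted(
        p for p in cache_dir.rglob("*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )


def orphaned_uploads(uploaded: set[str], referenced: set[str]) -> set[str]:
    """Idea 2: local uploads that no post or comment references."""
    return uploaded - referenced
```

Both functions only report candidates; actually deleting them (and deciding how often to run the check) is left out on purpose.
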
[–] [email protected] 6 points 1 year ago* (last edited 1 year ago) (1 children)

Great work!

This got me thinking, does Lemmy clear orphan pictrs files? Say a user uploads an image but never submits the comment/post? That file is still on your pictrs and publicly linkable. And what if the post or comment is removed by moderator or deleted by the author? Is Lemmy cleaning these up?

[–] [email protected] 4 points 1 year ago

I don't think anything in lemmy is currently clearing that. There are community scripts around that do some clearing but I have not tried them.

[–] [email protected] 6 points 1 year ago (1 children)

Interesting. Personally, one of my concerns with self-hosting an individual Lemmy instance (after losing my first account to the Vlemmy.net shutdown) was the threat of being held legally responsible for things cached on the server. Say someone uploads something illegal and my server caches it. Seeing an option to turn that off is nice, and for an instance only meant to be used by one person I'd imagine skipping the cache won't have a huge impact on load times.

[–] [email protected] 2 points 1 year ago (1 children)

Designate your DMCA contact, pay your $6, set up a clear infringement policy, and rest easy. Full details in this EFF Fediverse Legal Primer.

[–] [email protected] 2 points 1 year ago

Is there a general infringement policy that can be copy-pasted so people don’t have to figure out the legalese themselves?

[–] [email protected] 3 points 1 year ago

Nice! I would like to use it in a stable version.

[–] [email protected] 3 points 1 year ago (2 children)

What's the use case for this? Reducing storage usage?

[–] [email protected] 9 points 1 year ago (1 children)

Yep! There's a pretty rapid growth of pictrs data that's never going to go away from all the images being cached for thumbnails on my instance.

It's starting to get to ~1GB per week at this point.

[–] [email protected] 3 points 1 year ago (1 children)

Holy shit, alright, I think I might be interested in this. I also have a growing storage problem that I need to solve. What does a patched instance look like? Do you still see thumbnails? Can you share a screenshot?

[–] [email protected] 3 points 1 year ago (1 children)

Go to https://campfyre.nickwebster.dev and sort by "new".

Thumbnails still seem to work.

[–] [email protected] 2 points 1 year ago

Thanks for sharing! I see it takes a few ms to load the thumbnail but it's a worthy tradeoff if it saves that much in storage. I'll save this post and come back to your PR once it gets merged in, too lazy to apply it myself. :)

[–] [email protected] 6 points 1 year ago

Exactly. Currently Lemmy copies thumbnails and images from remote instances locally.

[–] [email protected] 2 points 1 year ago

I set this up on my instance about a week ago and it works perfectly, thank you!