this post was submitted on 31 Aug 2023
345 points (92.2% liked)


This shouldn't come as a huge surprise. Meta is moving forward with their plans for Threads and the Fediverse, and their adjusted terms reflect a new impending reality for Fediverse users.

[–] [email protected] 55 points 1 year ago (4 children)

Looks like there's a lot of FUD around this, so I decided to jump into the ActivityPub spec and see exactly what they can and can't get with the spec as is.

First off, they cannot get a user's individual IP unless the instance owner publishes it in the profile data as part of a "public" activity stream. I don't know of any instance that does this currently (feel free to correct me if I'm wrong).

It looks like what Meta is looking to do is scrape the information in the "public" tagged activity streams:

In addition to [ActivityStreams] collections and objects, Activities may additionally be addressed to the special "public" collection, with the identifier https://www.w3.org/ns/activitystreams#Public.

Activities addressed to this special URI shall be accessible to all users, without authentication.
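
To make the quoted addressing rule concrete, here is a minimal sketch of how a server might test whether an activity is addressed to the public collection. The function name `is_public` and the helper `_as_list` are made up for illustration. The short forms `Public` and `as:Public` are accepted because the spec warns that JSON-LD compaction can produce them. How any particular instance actually implements this check is an assumption on my part.

```python
# Identifier of the special "public" collection, plus the compacted forms
# that plain-JSON implementations are advised to accept as well.
PUBLIC_IDS = {
    "https://www.w3.org/ns/activitystreams#Public",
    "as:Public",
    "Public",
}

# Addressing properties defined by ActivityStreams.
ADDRESSING_FIELDS = ("to", "cc", "bto", "bcc", "audience")


def _as_list(value):
    """Normalize a missing, single, or multi-valued property to a list."""
    if value is None:
        return []
    if isinstance(value, (list, tuple)):
        return list(value)
    return [value]


def is_public(activity: dict) -> bool:
    """Return True if any addressing field names the public collection."""
    for field in ADDRESSING_FIELDS:
        for target in _as_list(activity.get(field)):
            # A target may be a bare URI string or an embedded object with an "id".
            target_id = target.get("id") if isinstance(target, dict) else target
            if target_id in PUBLIC_IDS:
                return True
    return False
```

Anything for which this returns True is, per the spec, fair game for unauthenticated requests from any server, Threads included.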

This is similar to what most instances do to show the posts of a user or community - they send a request to get "public" tagged data to publish to their end users. Within this data is all the activity information on that post - who upvoted what, and who commented. Again, this is the same way federation works now - your server has an activity stream of everyone you follow and everyone who follows you, which it can make available to view by tagging their activity as "public". Many instances have this information tagged as "public" by default.

Now, this system works fine if you're dealing with small actors that don't have nefarious designs on the network, or the resources to dominate it.

When you have a digital behemoth with grand AI designs that's already embroiled in lawsuits where it was grabbing your medical data and regularly allows law enforcement to stroll through its records, it's an entirely different situation. Meta has the power and capacity to not only engage in an "embrace, extend, extinguish" campaign against the Fediverse, but also to seriously threaten the privacy and well-being of Fediverse users in a way no single instance owner can.

I think the solution here will be for individual instance owners to harden their security. If they don't outright defederate from Threads, they should ensure that posts are private by default and that their users are made well aware in the TOS that following a Threads user will result in sharing data about their profile that could (and most likely will) be matched back to their Facebook account.

Instances that don't allow visibility control on posts, like Kbin and Lemmy, should look at adding an option to post only to the local server, or have the capacity to block threads.net outgoing publication based on user profile settings.
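
A rough sketch of what "block threads.net outgoing publication" could look like at delivery time. The function names and the idea of a per-user set of blocked domains are hypothetical, not taken from Kbin or Lemmy. Note that this only stops the server from pushing activities to those inboxes. It does nothing about public objects being fetched directly.

```python
from urllib.parse import urlsplit


def is_blocked(url: str, blocked_domains: set) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in blocked_domains)


def deliverable_inboxes(inboxes: list, blocked_domains: set) -> list:
    """Drop inbox URLs on blocked domains, keeping the original order."""
    blocked = {d.lower() for d in blocked_domains}
    return [url for url in inboxes if not is_blocked(url, blocked)]


# Hypothetical per-user setting and recipient list.
user_blocked = {"threads.net"}
recipients = [
    "https://threads.net/inbox",
    "https://www.threads.net/users/someone/inbox",
    "https://example.social/inbox",
]
print(deliverable_inboxes(recipients, user_blocked))
```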

Instances that don't allow follow request filtering probably should look at adding it (Mastodon has it implemented - Kbin and I think Lemmy would need to catch up) - otherwise users could be unaware that they're sending their data to threads.net when someone from that service follows them.
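
For illustration only, follow request filtering could boil down to a decision like the one below when a `Follow` activity arrives. The function name, the three return values, and the `manually_approves` flag are assumptions for the sketch and not how Mastodon, Kbin, or Lemmy actually implement it.

```python
from urllib.parse import urlsplit


def follow_decision(follow: dict, manually_approves: bool, blocked_domains: set) -> str:
    """Decide what to do with an incoming Follow: 'reject', 'pending', or 'accept'."""
    actor = follow.get("actor")
    # The actor may be a bare URI string or an embedded object with an "id".
    actor_id = actor.get("id") if isinstance(actor, dict) else actor
    host = (urlsplit(actor_id or "").hostname or "").lower()

    # Followers from blocked domains never get through.
    if any(host == d or host.endswith("." + d) for d in blocked_domains):
        return "reject"

    # Otherwise hold the request for the user if they review followers by hand.
    return "pending" if manually_approves else "accept"
```

With something like this in place, a follow from threads.net would either be rejected outright or at least sit in a queue until the user has seen it.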

I think it goes without saying that any data Meta gets will get the AI treatment - both to identify users and to sell your activity to marketers. That activity is the real goldmine for them - that's a stream of revenue for marketing that rivals what Meta tracks on its own platform.

As such, it may be worthwhile for instance owners to look at removing voting and boosting counts from the "public" activity feed. This would mean more fragmentation for communities whose populations span instances (vote counts would be more off than they are now), but it would prevent bad actors from easily scraping that data for behavioral analysis.
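
As a sketch of that idea: ActivityPub defines `likes` and `shares` collections on objects, so a server could drop those properties from the copy it serves publicly. Whether a given instance exposes counts through these exact fields is an assumption here, and `strip_engagement` is a made-up name.

```python
import copy

# Properties that can reveal who liked or boosted an object, or how many did.
ENGAGEMENT_FIELDS = ("likes", "shares")


def strip_engagement(obj: dict) -> dict:
    """Return a copy of the object without like/share collections.

    The stored original is left untouched so the home instance can still
    show counts to its own authenticated users.
    """
    cleaned = copy.deepcopy(obj)
    for field in ENGAGEMENT_FIELDS:
        cleaned.pop(field, None)
    return cleaned
```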

All in all, though, I don't believe it's going to be a positive event when Threads does start federating. One of the nice things about the Fediverse is that the learning curve is high enough to keep the idiot count down, and I don't really see our content or commentary here improving once Meta's audience enters the space.

[–] [email protected] 8 points 1 year ago (1 children)

We don't know what they'll do yet as there's nothing in the article about what they do with the data or how they protect it.

Setting everything to private by default pretty much breaks the fediverse. Imagine if everyone on Twitter was only private. It severely limits everything.

A "public" instance is just one that publishes to other instances, if I understand correctly, which most instances actually do. So they would get the IP of the server instance.

[–] [email protected] 3 points 1 year ago

The instance owner determines what's on their "public" tagged activity feeds. If they remove the "public" tag from a post or user account, it's restricted from non-authenticated requests from outside servers. You're correct that this shouldn't grab user IP addresses, but they could if an instance owner is including that information in what they mark as "public" profile feed data. I should reiterate that I know of no instance that does this, but the capability is there in theory (and I do know that certain forum software packages outside the Fediverse collect and publish this level of information, although it's a dying practice).

I'm not advocating instance owners turn everything private, but it's clear they're going to have to examine what they're providing through their feeds to Threads if they're serious about their users' security and privacy. The safest bet is to defederate from Threads until it's clear what Meta's intentions are (aside from their rhetoric, which is always deceitful when it comes to user privacy).

As to what Meta will do, they absolutely will scrape that activity data for marketing use, if they aren't already. It's what their entire business model on Facebook is built around - targeted ads based on user activity. Anything they say about protecting that data is lip service at best given their past performance and lawsuits. It's also very likely that they'll merge it with their existing data hoards, and do their best to de-anonymize accounts so that they can increase their data accuracy and thus their profit margin.

[–] [email protected] 5 points 1 year ago (1 children)

Pretty much wanted to say something similar. Your IP address isn't known beyond your local instance (and any retention time and purposes should be stated in their privacy policy).

The rest is standard data any federation app will collect upon seeing content from a user.

It's also worth noting that the user URL (which provides this user data) is generally public as well. So if you know the user URL you can get this too.
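
To show how little is needed, here is a small sketch using only the Python standard library. It asks for the ActivityPub representation of a user URL via the `Accept` header. The function names are made up, and some servers require signed requests and will refuse an anonymous fetch like this one.

```python
import json
import urllib.request


def build_actor_request(actor_url: str) -> urllib.request.Request:
    """Build a GET request that asks for the ActivityPub JSON form of a user URL."""
    return urllib.request.Request(
        actor_url,
        headers={"Accept": "application/activity+json"},
    )


def fetch_actor(actor_url: str) -> dict:
    """Fetch and decode the actor document (performs a real network call)."""
    with urllib.request.urlopen(build_actor_request(actor_url)) as response:
        return json.load(response)
```

Calling `fetch_actor` with a user URL you already know returns whatever profile data that instance chooses to publish for the account.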

Having said that, I do wonder how much they can monetize third-party data about people who have not agreed to the privacy policy that grants such uses. It'll be interesting to see.

[–] [email protected] 4 points 1 year ago

Can’t speak for kbin but Lemmy doesn’t collect or store IP addresses at all.