this post was submitted on 01 Aug 2024
87 points (100.0% liked)

hexbear

Now that the old Hexbear fork has been officially abandoned, this community will be used as a space for meta-discussion on the site itself.

founded 3 years ago
Link trackers (hexbear.net)
submitted 1 month ago* (last edited 1 month ago) by [email protected] to c/[email protected]
 

Hi folx

Not much has changed since we last brought this up half a year ago. That was probably a mistake: link trackers have become more ubiquitous, and the corporations that know our names and addresses have built up shadow profiles on us. Better late than never.

Anyway, cutting to the chase. This bot will warn you in DMs when you share a tracking link. That's it. Post over.

Read on if you want to see my unhinged tracking link rants.

What are link trackers?

When you share a YouTube link you may notice an ?si=(random gibberish) at the end. You may notice the same with Instagram, except here it's ?igshid. On Twitter, it's ?t. On TikTok and Reddit you have URLs that end in gibberish like vm.tiktok․com/blahblah or reddit․com/r/blahblah/s/blahblah.

These URLs are artisanal. They are made only for you.
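To make the pattern concrete, here is a minimal sketch of spotting those parameters in a link. The parameter names come from the list above; the table, the function names, and the example video IDs are made up for illustration, and this is not how the bot itself is implemented.

```python
from urllib.parse import urlsplit, parse_qs

# Assumption: only the parameters named in the post, nowhere near exhaustive.
TRACKING_PARAMS = {
    "youtube.com": {"si"},
    "youtu.be": {"si"},
    "instagram.com": {"igshid"},
    "twitter.com": {"t"},
}


def bare_host(url: str) -> str:
    """Lower-cased hostname with a leading 'www.' or 'm.' removed."""
    host = (urlsplit(url).hostname or "").lower()
    return host.removeprefix("www.").removeprefix("m.")


def find_trackers(url: str) -> set[str]:
    """Return the known tracking parameter names present in the URL."""
    known = TRACKING_PARAMS.get(bare_host(url), set())
    present = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return known & set(present)


print(find_trackers("https://youtu.be/abc123?si=XyZ"))
print(find_trackers("https://www.youtube.com/watch?v=abc123"))
```

The table is keyed by host because the same letter means different things on different sites: ?t is a share tracker on Twitter but a timestamp on YouTube.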

Other sites' URLs can also be "high entropy": in one case, for example, they contain the time down to the millisecond.
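One crude way to put a number on "gibberish" is a character-level Shannon entropy estimate of each parameter value. This is only a heuristic sketch: the function names and the 40-bit threshold are arbitrary choices for illustration, and a millisecond timestamp made of plain digits can score low here while still being unique to you.

```python
import math
from collections import Counter
from urllib.parse import urlsplit, parse_qsl


def shannon_bits(value: str) -> float:
    """Estimate the total bits in a string from its own character frequencies."""
    if not value:
        return 0.0
    n = len(value)
    counts = Counter(value)
    per_char = sum((c / n) * math.log2(n / c) for c in counts.values())
    return per_char * n


def suspicious_params(url: str, threshold: float = 40.0) -> list[str]:
    """Names of query parameters whose values score at or above the threshold."""
    query = urlsplit(url).query
    return [name for name, value in parse_qsl(query) if shannon_bits(value) >= threshold]


print(suspicious_params("https://example.com/watch?v=abc&x=Zk3Qp9LmT2xRv8Wc"))
```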

When you share these URLs to the world wide web, you broadcast to this service (to YouTube, to Google, to TikTok, to Reddit etc.) that "Hey! This previously-anonymous account is actually me!". When you share this link to your friend halfway across the world who only talks to you on Discord, and they click it, you broadcast to this service that actually you two are buddies. Same here on Hexbear. This sharing helps these sites build a social graph on us.

The threat is two-fold. Google has a powerful search crawler, and also runs a massive ad network. They could sift through the pages they indexed on Hexbear and link the exact Hexbear account to your real name. People who have clicked on your shared link will also be exposed as having been on that exact page to which you shared the link. This kind of metadata leak can be dangerous, as law enforcement has previously asked Google to reveal people who watched so-and-so YouTube video at so-and-so time.

This bot also handles TikTok, Yandex, Snapchat, and Meta/Facebook trackers, all of which carry this same ad-related threat.

What can mods on Hexbear do?

If you're a mod and you think this is important, you can @ mention this bot in a community you moderate. The bot should reply to you with some cringe, and then you can appoint it as a mod. When given mod powers, it will remove any comment/post that contains tracking links if the user has not fixed it after a day.
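The "warn first, remove after a day" rule boils down to a small check like the one below. This is a sketch of the policy as described, not the bot's actual code; should_remove and its arguments are hypothetical names.

```python
from datetime import datetime, timedelta, timezone

GRACE_PERIOD = timedelta(days=1)  # "after a day", per the post


def should_remove(warned_at: datetime, now: datetime, still_has_tracker: bool) -> bool:
    """Remove only if the link is still unfixed once the grace period has passed."""
    return still_has_tracker and (now - warned_at) >= GRACE_PERIOD


warned = datetime(2024, 8, 1, 12, 0, tzinfo=timezone.utc)
print(should_remove(warned, warned + timedelta(hours=2), True))
print(should_remove(warned, warned + timedelta(hours=25), True))
```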

I will probably add functionality to sift through old comments that have dangerous trackers (like TikTok, which exposes your name and picture to anyone who clicks it) and remove/report them soon.

How to protect yourself on other sites and on your phone

Install the ClearURLs extension on desktop (if you're on Chrome... please switch, that is another privacy issue entirely). ClearURLs will cut down on most of your worries.

Be on the lookout for the high-entropy parameters when you share things on your phone as well. Parameters in the URL like ?si=blahblah or ?igshid (which look like they stand for "share ID" and "Instagram share ID"), as well as obfuscated TikTok links like vm.tiktok․com/blahblah, will all track you and your social circle.
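If you'd rather not trim links by eye, the manual cleanup can be sketched in a few lines of Python. SHARE_PARAMS only covers the two parameters named above, and strip_share_params is a made-up helper, not part of any extension or of the bot.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: just the share-ID parameters mentioned in the post.
SHARE_PARAMS = {"si", "igshid"}


def strip_share_params(url: str) -> str:
    """Rebuild the URL without the share-ID parameters, keeping everything else."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in SHARE_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


print(strip_share_params("https://www.youtube.com/watch?v=abc123&si=XyZ"))
```

Parameters like v= stay put because the link stops working without them; only the names in SHARE_PARAMS are dropped. Short links like the vm.tiktok ones can't be fixed this way, since the tracking is in the path itself.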

How to protect your identity from leakage if you accidentally click on a tracking URL

If you're browsing a sensitive website, like Hexbear, and you happen to click a tracking URL that goes to YouTube, Google/YouTube can correlate your click with the appearance of this URL on Hexbear, associating your identity with this site.

To avoid this, you may use Firefox Multi-Account Containers, and make Hexbear use its own container that you keep separate from everything else. Although this solution is not perfect, it will prevent one facet of your identity leaking and make it harder for other sites to correlate your digital footprint.

What other threats are hidden in URLs?

The biggest threat is TikTok, which basically doxxes you when you share a link with someone.

When someone clicks your TikTok link, a big banner on top of their screen shows your profile picture and your name. If you used your real name and picture... well. Uh-oh.

Other "light doxxing hazards" exist on other sites. After looking through Hexbear comments using the search function, you can find comments that link to *****, comments that link to ****, etc. that may include the user's general location down to the city, their preferred language, their screen width and height (in the URL!!! for some reason???), and some very high-entropy parameters that look like a long string of gibberish.

If I sat down today looking to dox someone by going through their profile, and they shared links willy-nilly, I'd have some pretty good leads.

What can the maintainer of HexReplyBot do?

HexReplyBot does not handle YouTube tracking parameters properly. The maintainer can check this RegExr post I made with the modified regex. I bodged it real quick, but it should remove the ?si at least. It will still keep the ?pp parameter, but I got lazy and it's not as common. Please consider swapping the regex out, thank you.
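For anyone who can't open the RegExr link, here is a rough sketch of the kind of substitution being described. It is not the regex from that post: the pattern and the drop_si name are invented for illustration, and like the bodge above it only targets ?si and leaves ?pp alone.

```python
import re

# Hypothetical pattern: an "si=" parameter directly after "?" or "&",
# plus the "&" that follows it, if any.
SI_PARAM = re.compile(r"(?<=[?&])si=[^&#\s]*&?")
# A "?" or "&" left dangling at the end of the URL or right before a fragment.
DANGLING = re.compile(r"[?&](?=#|$)")


def drop_si(url: str) -> str:
    """Remove the si parameter and tidy up any leftover separator."""
    return DANGLING.sub("", SI_PARAM.sub("", url))


print(drop_si("https://youtu.be/abc123?si=XyZ"))
print(drop_si("https://www.youtube.com/watch?si=XyZ&v=abc123"))
```

The lookbehind keeps the pattern from eating parameters that merely end in "si", but it also means this won't work in regex engines that lack lookaround support.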

Some links

https://archive.ph/8c80m - law enforcement using metadata provided by YouTube to find the real name of a suspect
https://hexbear.net/comment/4439859 - someone mentioning that they keep getting a Hexbear user recommended to them on TikTok because they clicked that user's TikTok link months ago
https://archive.is/WD7ke - "We kill people based on metadata"

Can't be bothered to find it, but Ross Ulbricht got busted on some metadata links between his email and Stack Overflow. Now imagine if they had tracking links back then to triangulate his Stack Overflow identity (which now has tracking links) with some other offsite identity.

Share any feedback or thoughts and I'll take them into consideration.

all 35 comments
[–] [email protected] 23 points 1 month ago* (last edited 1 month ago) (2 children)
[–] [email protected] 7 points 1 month ago (1 children)

Regex is scary. I view it like the dark side of the force. I did a bunch of work taking "the length of the phrase after the comma until the next ' ' in the string" instead of trying to decipher regex.

[–] [email protected] 8 points 1 month ago (1 children)

regex is awesome ngl.

regexr.com is a great resource

[–] [email protected] 4 points 1 month ago

I've never seen it spelled out like that. It might change everything for me on a rainy day. Thank you!

[–] [email protected] 4 points 1 month ago

Still to this day I have not bothered to learn regex

[–] [email protected] 19 points 1 month ago (1 children)

And what if i like posting tracker links, as a hobby? fedposting

[–] [email protected] 16 points 1 month ago

Thank you for your service rat-salute

[–] [email protected] 15 points 1 month ago (1 children)

Good bot, even if it's a little annoying. It'd be cooler if Lemmy had an integrated auto-replace tool for this.

[–] [email protected] 13 points 1 month ago (1 children)

There is an issue open for this exact thing: https://github.com/LemmyNet/lemmy/issues/4905

ClearURLs provides a repo of rules that can be used for this task: https://github.com/ClearURLs/Rules

I'm working my way through the Rust book to learn more about how Rust works so I can add this into the server backend of Lemmy once I feel confident.

[–] [email protected] 10 points 1 month ago

Will be following that issue closely! Thanks for sharing

[–] [email protected] 15 points 1 month ago (1 children)

Tracking TikTok links should be completely banned, huge security risk. Admin(s?) can you please consider it?

[–] [email protected] 8 points 1 month ago

if they can be detected using a simple regex (no lookaheads/behinds iirc) I think the slur filter can remove them.

[–] [email protected] 13 points 1 month ago* (last edited 1 month ago) (1 children)

Damn, this was on my list of things to do, lol. I have to ask, is there a Hexbear Coding Collective, maybe a publicly hosted Gitea server or even just a Github Organization? I would love to contribute to these projects, and help give back to the Hexbear tech infrastructure in some practical way.

Also, I just want to point out that there is a repo of tracking rules, separate from the ClearURLs extension itself, that can be used for this task.

https://github.com/ClearURLs/Rules

That way, you do not have to come up with these regexes yourself.

[–] [email protected] 11 points 1 month ago

The robots will save us

[–] [email protected] 11 points 1 month ago

Curious to hear input from people who have mod powers on this. What makes the most sense? To mod this bot, or to have the bot point out posts and comments, maybe have the bot reply to tracking links publicly... or something else entirely?

[–] [email protected] 8 points 1 month ago (1 children)

Superb post, ty. Does InsidiousTrackers not provide a comment removal reason though? Just checked the modlog and I think it should show "reason: tracking" or smth.

[–] [email protected] 5 points 1 month ago (1 children)
[–] [email protected] 6 points 1 month ago

Just thought it was weird seeing a buncha removed comments without a "reason" field =)

[–] [email protected] 7 points 1 month ago

Can hexbear be made to automatically strip tracking from links?

[–] [email protected] 7 points 1 month ago (2 children)

So if I give someone a tiktok link even if I delete the ? and everything past it, do I still get doxxed? I just have this feeling like I do.

[–] [email protected] 10 points 1 month ago (1 children)

You just want to avoid sharing the URLs that are formatted www.tiktok.com/t/abcd12rfd, as well as the ones that have a ? in them. See my post: On Sharing TikTok Videos.

[–] [email protected] 5 points 1 month ago (1 children)

I will check this out. That makes sense and is crystal clear. Thank you!

[–] [email protected] 5 points 1 month ago

Anytime comrade!

[–] [email protected] 10 points 1 month ago

if the link is just www.tiktok.com/@username/video/numbers without a ? the url doesn’t have tracking. if it’s one of the vm.tiktok ones, you can either open it in a browser to get the www. version or use a site like urlex.org to get the www. link, then remove the ? and everything after. (also works on the tracking reddit links)
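The rule of thumb from this thread (long www. link with no ? is fine, vm. and /t/ short links are not) can be written down as a quick check. This is a sketch based only on the URL shapes mentioned here; classify_tiktok and its labels are made-up names, and TikTok may use other link formats this doesn't know about.

```python
import re
from urllib.parse import urlsplit

# The "www.tiktok.com/@username/video/numbers" shape described above.
CANONICAL_PATH = re.compile(r"^/@[^/]+/video/\d+/?$")


def classify_tiktok(url: str) -> str:
    """Label a link as 'clean', 'has-query', 'short-link', 'unknown' or 'not-tiktok'."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host != "tiktok.com" and not host.endswith(".tiktok.com"):
        return "not-tiktok"
    if host == "vm.tiktok.com" or parts.path.startswith("/t/"):
        return "short-link"  # expand it first, then strip the ? part
    if CANONICAL_PATH.match(parts.path):
        return "has-query" if parts.query else "clean"
    return "unknown"


print(classify_tiktok("https://www.tiktok.com/@someuser/video/1234567890"))
print(classify_tiktok("https://vm.tiktok.com/blahblah"))
```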

[–] [email protected] 4 points 1 month ago

I encourage everyone to check out this Android app that helps clean links

https://f-droid.org/packages/com.trianguloy.urlchecker/

Some people here are terrible with link hygiene, but people irl are so much worse

[–] [email protected] 4 points 1 month ago* (last edited 1 month ago) (1 children)
[–] [email protected] 2 points 1 month ago

I found a YouTube link in your comment. Here are links to the same video on alternative frontends that protect your privacy:

[–] [email protected] 2 points 1 month ago* (last edited 1 month ago) (3 children)

Please don't have a bot DM me. That's very annoying.

I get that there's some technical challenges that would need to be solved for those tracking links to be stripped out by lemmy itself but I'm just annoyed by a nerd saying uhhhh excuse me you used a youtube link instead of some dodgy YouTube frontend that will disappear in a year, or how dare you use a twitter link instead of some twitter proxy that will shut down in 3 months.

Like why not spend the coding time actually fixing the issue instead of just annoying people

[–] [email protected] 16 points 1 month ago (1 children)

It's not about a frontend, we have that already in HexReplyBot. It's about removing a parameter tacked on at the end in your YouTube/Instagram/TikTok links.

[–] [email protected] 3 points 1 month ago* (last edited 1 month ago) (1 children)

It's the same technique. A bot that just replies with 'oh sweaty you should have done something different'

One complains about YouTube or X links and the other complains about tracking info in URLs.

It's annoying and instead there should be code in Lemmy that does the URL sanitizing

[–] [email protected] 1 points 1 month ago* (last edited 1 month ago)

I agree with you. Seems like there's an active issue in Lemmy for this, so it'll get implemented eventually. Then it'll take a while for Hexbear to update.

[–] [email protected] 13 points 1 month ago* (last edited 1 month ago)

Sharing the wrong TikTok link can dox users, and has before, and some users would end up in unsafe living situations if they got doxed.

uhhhh excuse me you used a youtube link instead of some dodgy YouTube frontend that will disappear in a year

You fundamentally do not understand what this post is talking about.

[–] [email protected] 12 points 1 month ago

Comrade, operational security is what this post is about, not anticapitalist frontends for media services. TikTok will dox you if you are not careful.