this post was submitted on 05 Jul 2023
193 points (99.5% liked)

OpenAI has disabled the Browse with Bing feature in ChatGPT to prevent users from bypassing paywalls and accessing website content without subscribing first.

top 13 comments
[–] [email protected] 77 points 1 year ago

"We have noticed that by accident we provided a user-friendly functionality without trying to extract money out of you. We apologize for the convenience and promise that we will make sure it never happens again"

[–] [email protected] 34 points 1 year ago (4 children)

I'm wondering why websites keep using fake paywalls when they can use a real one where the content isn't available until user verification.

[–] [email protected] 38 points 1 year ago (1 children)

They do that to let search engines index their articles. Then they switch on the paywall an hour or so later but still get a lot of traffic (which is good for advertising) when people click through from Google etc.

[–] [email protected] 12 points 1 year ago (1 children)

The crawler identifies itself as a "robot", and the site lets it past the paywall. When you browse using Chrome, the site behaves differently. That's why it's so easy to get past by pasting the link into archive.ph.

[–] [email protected] 5 points 1 year ago* (last edited 1 year ago)

Or by using a browser addon that changes the way the browser identifies itself to pretend it's a search engine crawler.
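
For anyone wondering why that works, here's a rough sketch of the kind of check being described. It's an assumption about how such a "fake" paywall might be wired up, not any particular site's code; `CRAWLER_TOKENS`, `looks_like_crawler` and `serve_article` are made-up names for illustration.

```python
# Hypothetical sketch of a "fake" paywall: the server trusts the User-Agent header.
CRAWLER_TOKENS = ("googlebot", "bingbot")  # assumed substrings, for illustration only


def looks_like_crawler(user_agent: str) -> bool:
    """Return True if the self-reported User-Agent mentions a known crawler."""
    ua = user_agent.lower()
    return any(token in ua for token in CRAWLER_TOKENS)


def serve_article(user_agent: str, is_subscriber: bool, full_text: str) -> str:
    """Serve the full text to subscribers and to anything claiming to be a crawler."""
    if is_subscriber or looks_like_crawler(user_agent):
        return full_text
    return "Subscribe to keep reading."


article = "Full article text."
chrome_ua = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/114.0 Safari/537.36"
spoofed_ua = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

print(serve_article(chrome_ua, False, article))   # paywall message
print(serve_article(spoofed_ua, False, article))  # full text: the header was never verified
```

The header is just a string the client sends, so an addon (or an archiving service) that sends a crawler-like string gets the crawler's view of the page.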

[–] [email protected] 15 points 1 year ago* (last edited 1 year ago)

They'd like to allow search engines and block (non-paying) visitors, but they took a lazy approach to it.

The correct approach would indeed be to properly identify paying visitors (user + password) and search engines (secret key); then they could reliably shut out everyone else.

But that would require Google to cooperate, and I suspect they don't want to set a precedent where they let a website dictate how they get content. They like to deal from an all-or-nothing position.

Of course there are other methods, such as making just enough of the article public to be relevant in searches, but I don't know why they don't do that. It probably lowers their SEO effectiveness, if I had to guess.
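
The "public just enough" option could look something like this. It's only a sketch under my own assumptions (paragraphs separated by blank lines, a one-paragraph teaser), and `public_teaser` and `render` are invented names.

```python
def public_teaser(full_text: str, max_paragraphs: int = 1) -> str:
    """Return only the first few paragraphs; crawlers and visitors get the same thing."""
    paragraphs = [p for p in full_text.split("\n\n") if p.strip()]
    return "\n\n".join(paragraphs[:max_paragraphs])


def render(full_text: str, is_subscriber: bool) -> str:
    """Subscribers get the whole article; everyone else gets the teaser plus a prompt."""
    if is_subscriber:
        return full_text
    return public_teaser(full_text) + "\n\n[Subscribe to read the rest]"


text = "Intro paragraph.\n\nSecond paragraph.\n\nThird."
print(render(text, is_subscriber=False))
```

Since the rest of the text is never sent to non-subscribers, there's nothing for a spoofed crawler or an archive site to pick up beyond the teaser.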

[–] [email protected] 14 points 1 year ago
[–] [email protected] 3 points 1 year ago

America's Test Kitchen used to substitute their article text with a bunch of Lorem Ipsum, but I can't tell whether they are still doing it without a laptop in front of me.

[–] [email protected] 10 points 1 year ago (1 children)
[–] [email protected] 15 points 1 year ago
[–] [email protected] 9 points 1 year ago

I guess that only applies to paywalls that can also be bypassed by 12ft.io, right? I.e. paywalls that open themselves up for indexing by search engines like Bing or Google.

[–] [email protected] 4 points 1 year ago

The Bing integration was unusably slow anyway.

[–] [email protected] 4 points 1 year ago

Lol. Just use Bypass Paywalls Clean.
