this post was submitted on 23 Aug 2023
110 points (94.4% liked)

It looks like NYT has updated its robots.txt file to disallow the OpenAI bot from scraping its data. Pretty interested to see if they just update their user agent string or if they'll respect it.

[–] [email protected] 36 points 1 year ago (1 children)

Updating the user agent doesn't matter unless NYT is actively blocking that, too. Updating robots.txt is purely a "gentleman's agreement" that OpenAI will respect it. That all said, OpenAI would be dumb to ignore it, because it'd set off the lawyer shenanigans.
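
To make the distinction concrete, here's a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content is made up for illustration (it is not copied from the NYT's actual file), and `is_allowed` and `server_blocks` are hypothetical helper names. `GPTBot` is the user agent token OpenAI publishes for its crawler. The first function is the check a well-behaved crawler runs voluntarily; the second stands in for server-side blocking, which is the only part that enforces anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """What a polite crawler asks itself before fetching. Purely advisory."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def server_blocks(user_agent: str, blocked_tokens=("gptbot",)) -> bool:
    """Hypothetical server-side filter keyed on the User-Agent header."""
    ua = user_agent.lower()
    return any(token in ua for token in blocked_tokens)


if __name__ == "__main__":
    url = "https://example.com/some-article"
    print(is_allowed("GPTBot", url))        # crawler named in robots.txt
    print(is_allowed("SomeOtherBot", url))  # falls through to the * rule
    print(server_blocks("Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(server_blocks("Mozilla/5.0 (renamed crawler)"))
```

Nothing forces a crawler to call `is_allowed` at all, and a renamed user agent slips past `server_blocks`, which is why the robots.txt change only matters if the crawler chooses to honor it.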

[–] [email protected] 12 points 1 year ago (1 children)

NYT is already considering a lawsuit against OpenAI. So, not just dumb but arrogantly stupid when the lawyers are already in the room.

[–] [email protected] 9 points 1 year ago

The burden of proof will fall upon the NYT, and it will be extremely difficult to prove OpenAI is culpable for any infringement that its end users perform.

It's new territory and will be expensive, but NYT is old money and has the liquidity to burn cash all day.