this post was submitted on 22 Nov 2024

The Aniwave(9anime) comments currently have a few problems that I will fix later:

Some glitched/merged comment threads are currently missing

Imgur images didn't download properly

Some images were downloaded twice (while the scraper was running, I changed how images were named and ran it again)

Most-commented page on each site, sorted from most (Aniwave) to fewest (Anitaku) comments:

Aniwave(9anime): Attack on Titan The Final Season Part 3 Episode 1

Gogoanime Old comments: Yuri on Ice Category page

Anitaku(Gogoanime): Kimetsu no Yaiba Yuukaku Hen Episode 10

Folders were compressed into tarballs with zstd level 9 compression:

Aniwave(9anime): TOTAL UNCOMPRESSED: 69.2 GiB, TOTAL COMPRESSED: 17.4 GiB

Gogoanime: TOTAL UNCOMPRESSED: 84.8 GiB, TOTAL COMPRESSED: 48.2 GiB

Anitaku(Gogoanime): TOTAL UNCOMPRESSED: 16.6 GiB, TOTAL COMPRESSED: 1 GiB
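
For anyone curious how well each archive compressed, here is a quick sketch that works out the ratio and space saved from the sizes listed above. The dictionary and function names are made up for illustration, and the Anitaku figure is only as precise as the "1 GiB" stated, so treat its ratio as approximate.

```python
# Sizes in GiB as listed in the post: (uncompressed, compressed).
sizes_gib = {
    "Aniwave(9anime)": (69.2, 17.4),
    "Gogoanime": (84.8, 48.2),
    "Anitaku(Gogoanime)": (16.6, 1.0),  # compressed size is likely rounded
}


def summarize(sizes):
    """Return {name: (compression ratio, percent of space saved)}."""
    rows = {}
    for name, (raw, packed) in sizes.items():
        ratio = raw / packed
        saved_percent = 100 * (1 - packed / raw)
        rows[name] = (ratio, saved_percent)
    return rows


for name, (ratio, saved) in summarize(sizes_gib).items():
    print(f"{name}: about {ratio:.1f}x smaller, {saved:.0f}% saved")
```

The older Gogoanime archive shrinks the least, which is why it is the one that is awkward to host.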

Inside each of the anime folders, you will find 3 types of files that end with 'part X.json,' 'full.json,' and 'simple.json':

Part files - downloaded from Disqus and left unmodified; each contains a maximum of 100 comments

Full - all part files concatenated

Simple - the full file with extra info stripped out to make it more readable for humans
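
If you want to rebuild the full or simple files from the part files yourself, something like the sketch below should be close. It is not the script used for this archive. It assumes each part file is either a Disqus-style object with the comments in a "response" list or a bare list of comments, and the fields kept for the simple version ("id", "parent", "createdAt", "raw_message", author name) are a guess at what is useful, not the actual layout of simple.json. All function names are hypothetical.

```python
import json
import re
from pathlib import Path

# Matches file names ending in "part X.json" and captures X.
PART_RE = re.compile(r"part (\d+)\.json$")

# Assumed selection of fields; the real simple.json may keep different keys.
SIMPLE_KEYS = ("id", "parent", "createdAt", "raw_message")


def find_parts(folder):
    """Return the part files in a folder, ordered by part number."""
    found = []
    for path in Path(folder).iterdir():
        match = PART_RE.search(path.name)
        if match:
            found.append((int(match.group(1)), path))
    # Numeric sort so that "part 10" comes after "part 2".
    return [path for _, path in sorted(found)]


def load_comments(part_path):
    """Load the comment list from one part file."""
    data = json.loads(Path(part_path).read_text(encoding="utf-8"))
    # Assumption: comments sit under "response" unless the file is a bare list.
    return data["response"] if isinstance(data, dict) else data


def build_full(folder):
    """Concatenate the comments of every part file, in part order."""
    full = []
    for path in find_parts(folder):
        full.extend(load_comments(path))
    return full


def build_simple(full):
    """Keep only a few readable fields from each comment."""
    simple = []
    for comment in full:
        entry = {key: comment.get(key) for key in SIMPLE_KEYS}
        entry["author"] = (comment.get("author") or {}).get("name")
        simple.append(entry)
    return simple
```

Calling build_full on one anime folder and passing the result to build_simple gives you both lists in memory; json.dump can then write them wherever you like.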

DOWNLOADS:

Aniwave(9anime) Comments: https://mega.nz/file/RfgliKJR#kV9MXkEYC-5tqS9A4ZenOMoQKKxpj_ujNadzKeu--qs

Anitaku(Gogoanime) March 2024: https://mega.nz/file/FDBngTQB#p3GMrhPpBY893GLBUJfBePwDOYsKFWmpRyarFlGWCZs

Gogoanime Comments Before 2021: Unfortunately, the compressed file size for Gogoanime is 48.2 GiB, and I don't know how to share it since I ran out of free storage space. I will make another post when I figure out how to set up a torrent, and I will also add the link here.

top 2 comments
[email protected] 1 point 4 days ago

I'm not 100% following, but if this content is no longer publicly viewable, uploading it to archive.org seems appropriate (but don't take my word for it). Though if you do upload, it sounds like just the "part files" or just the "full files" would be enough. You could then also seed the torrent there.

You might try uploading the Imgur links to ArchiveTeam's imgur-grab project. Looking at the tracker, the archival currently seems half dead, but I'm sure they'll get around to saving what's left. (Unless it's dead because Imgur made it nigh impossible, which I can definitely imagine.)

[email protected] 1 point 3 days ago (last edited 3 days ago)

I will upload them to archive.org later on. Also, regarding the Imgur images, sorry for the misunderstanding: they are still available, but the Disqus downloader I built didn't download them because I didn't know they existed and didn't account for them. I will fix it later. Edit: I checked the upload troubleshooting page, and I think I might not need a torrent, since they have a max size limit of around 500 GB.