Topic: How do I archive my favorites offline in the event of censorship?

Posted under General

I'm sure by now everyone is aware of the current attempts at censoring NSFW content on the internet. There's now a non-zero chance that a good portion of furry art is at risk of being permanently lost if it isn't archived offline. Even SFW content is probably not safe, since the people responsible will see anything furry as fetish content.
Some of you may argue that there's no way they'll censor furry content to the point that e621 is taken down, and under normal circumstances I'd agree with you, but we don't seem to be living in normal circumstances anymore. Censorship is getting worse with the recent UK law (and similar proposals in the US and EU) and the Steam/Itch.io situation. The only thing I can do now is archive some of it before it's lost.

Is there a proper way to archive my 12k+ favorites (without being tagged as a bot / IP banned) or do I just recursively curl posts?page={page}&tags=fav:{user} with a custom script?

Donovan DMC

Former Staff

You can fetch up to 320 posts in a single page, so that's just under 40 requests for ~12k favorites.
You should be able to fetch from the static server as fast as reasonably possible.

In other words, don't worry about it; your usage is a drop in the bucket.

nyaaaaa said:
I'm sure by now everyone is aware of the current attempts at censoring NSFW content on the internet. There's now a non-zero chance that a good portion of furry art is at risk of being permanently lost if it isn't archived offline. Even SFW content is probably not safe, since the people responsible will see anything furry as fetish content.
Some of you may argue that there's no way they'll censor furry content to the point that e621 is taken down, and under normal circumstances I'd agree with you, but we don't seem to be living in normal circumstances anymore. Censorship is getting worse with the recent UK law (and similar proposals in the US and EU) and the Steam/Itch.io situation. The only thing I can do now is archive some of it before it's lost.

Is there a proper way to archive my 12k+ favorites (without being tagged as a bot / IP banned) or do I just recursively curl posts?page={page}&tags=fav:{user} with a custom script?

Use Hydrus Network.

Aacafah

Moderator

nyaaaaa said:
Is there a proper way to archive my 12k+ favorites (without being tagged as a bot / IP banned) or do I just recursively curl posts?page={page}&tags=fav:{user} with a custom script?

As Donovan said, I wouldn't be too worried about it if you're using the max posts per page (I'm not sure you'd even get rate limited until you exceed 60 requests). You can override the default/user-defined posts per page by adding the limit query parameter; in this case, you could use posts?limit=320&page={page}&tags=fav:{user}. Since you have your favs hidden, you'll need to add your login data to your requests (see the API docs for details). I believe you'd also need to do this if any posts in your favorites match the global blacklist (I think the post data would still be there, but the static file's URL would be removed).
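
Something like this is roughly what that loop looks like in practice. It's an untested sketch, not an official recipe: the .json endpoint, the response field names, the User-Agent string, and the one-second pause are my assumptions based on the API docs, so double-check them against the actual responses.

```python
import time
import requests

USER = "your_username"      # placeholder: the account whose favorites you're pulling
API_KEY = "your_api_key"    # placeholder: generated from your account's API access page
HEADERS = {"User-Agent": f"fav-archiver/1.0 (by {USER})"}  # a descriptive User-Agent is expected

page = 1
while True:
    resp = requests.get(
        "https://e621.net/posts.json",
        params={"limit": 320, "page": page, "tags": f"fav:{USER}"},
        headers=HEADERS,
        auth=(USER, API_KEY),  # needed since the favorites are hidden
        timeout=30,
    )
    resp.raise_for_status()
    posts = resp.json().get("posts", [])
    if not posts:
        break

    for post in posts:
        url = post.get("file", {}).get("url")
        if not url:  # e.g. hidden by the global blacklist when not logged in
            continue
        filename = f"{post['file']['md5']}.{post['file']['ext']}"
        with open(filename, "wb") as out:
            out.write(requests.get(url, headers=HEADERS, timeout=60).content)

    page += 1
    time.sleep(1)  # stay comfortably under the rate limit
```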

Also, totally inconsequential nitpick, but I'd imagine this would be iterative, not recursive.

I personally recommend gallery-dl. Get an API key for it and set it in your config file (optional, but without it you may be limited by the global blacklist), then run gallery-dl <url>. It follows e6's rules for scraping, so you won't get any warnings or the like. If you want to download each post's metadata JSON as well, add --write-metadata.

You can also just download the JSON versions of the results, page by page (the same queries against the .json endpoint).

There are plenty of tools for processing JSON files. The most useful data is the tags and the hashes (contained in the URLs or as their own fields).
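
For example, here's a quick sketch of pulling those out of saved pages. The page_*.json filenames are just an assumption about how you saved the results, and the field names are from memory, so check them against your files.

```python
import json
from pathlib import Path

# Assumes each page of results was saved as page_1.json, page_2.json, ...
for path in sorted(Path(".").glob("page_*.json")):
    data = json.loads(path.read_text())
    for post in data.get("posts", []):
        md5 = post.get("file", {}).get("md5")
        # Tags come grouped by category (general, artist, species, ...)
        tags = [t for group in post.get("tags", {}).values() for t in group]
        print(md5, " ".join(tags))
```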

What software would you guys recommend for downloading a post's metadata (sources, description, relationships, etc.) in addition to the post itself, or for downloading that information for an existing file and storing it somewhere? Hydrus Network doesn't seem well suited to downloading information for existing images. I've been using Imgbrd-Grabber, but it only handles tags.

Aacafah

Moderator

eightoflakes said:
What software would you guys recommend for downloading a post's metadata (sources, description, relationships, etc.) in addition to the post itself, or for downloading that information for an existing file and storing it somewhere? Hydrus Network doesn't seem well suited to downloading information for existing images. I've been using Imgbrd-Grabber, but it only handles tags.

  • I've heard good things about gallery-dl
  • For mass metadata, the db export is preferred (rough sketch after this list)
  • For smaller sets of data, using the API should be fine (though you need to do some work to use it raw, obviously)
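
If you go the db export route, the rough idea is to hash the files you already have and look those hashes up in the posts dump. A loose sketch, with the caveat that the header row and the column names like md5, tag_string, description, and source are assumptions on my part, so check the actual CSV before relying on them:

```python
import csv
import hashlib
import sys
from pathlib import Path

# Usage (hypothetical): python lookup.py posts-YYYY-MM-DD.csv ./my_files/
posts_csv, files_dir = sys.argv[1], sys.argv[2]

# The md5 of the file contents should match the site's md5 for unmodified downloads.
wanted = {
    hashlib.md5(p.read_bytes()).hexdigest(): p.name
    for p in Path(files_dir).iterdir() if p.is_file()
}

csv.field_size_limit(10**8)  # descriptions can be long
with open(posts_csv, newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):  # assumes the dump has a header row
        md5 = row.get("md5")
        if md5 in wanted:
            # Column names are assumptions; check the header of the dump you downloaded.
            print(wanted[md5], row.get("tag_string"), row.get("description"), row.get("source"))
```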

Original page: https://e621.net/forum_topics/58706