⚒️Thor, the Norseman⚒️ is a user on snabeltann.no. You can follow them or interact with them if you have an account anywhere in the fediverse.
⚒️Thor, the Norseman⚒️

It would be handy if there was a way of slurping content off Instagram, Facebook or Tumblr feeds without actually needing to have accounts on there. Some very good artists can only be followed on such sites. Maybe we should infiltrate them with a decentralised swarm of bots that scrape the content for us... It would do much to undermine the network effect, and would also be rather hard for them to counteract.

@thor Tumblr has RSS feeds for external following (though I'm not sure whether that applies when the blog isn't set to public visibility). A quick glance suggests the feed may truncate posts, though.
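
For anyone who wants to try the RSS route, here is a minimal sketch in Python using only the standard library. It assumes the blog is public and serves its feed at the usual `https://<blog>.tumblr.com/rss` address; the function names and the `example` blog name are made up for illustration. Whether the feed truncates posts, or works for non-public blogs, is not something this sketch checks.

```python
import urllib.request
import xml.etree.ElementTree as ET


def tumblr_feed_url(blog_name):
    """Build the feed URL, assuming the conventional /rss path."""
    return f"https://{blog_name}.tumblr.com/rss"


def parse_rss_items(xml_data):
    """Extract title, link and description from each <item> in an RSS 2.0 feed.

    Accepts str or bytes. Missing fields come back as empty strings.
    """
    root = ET.fromstring(xml_data)
    items = []
    for item in root.iter("item"):
        items.append({
            "title": item.findtext("title", default=""),
            "link": item.findtext("link", default=""),
            "description": item.findtext("description", default=""),
        })
    return items


def fetch_posts(blog_name, timeout=10.0):
    """Download and parse a blog's feed (hypothetical helper, needs network)."""
    with urllib.request.urlopen(tumblr_feed_url(blog_name), timeout=timeout) as resp:
        return parse_rss_items(resp.read())
```

Calling `fetch_posts("example")` would return a list of dicts, one per post in the feed. The `description` field holds whatever HTML the feed provides, so a truncated feed would show up there as a shortened body.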

@thor yeah I would totally contribute accounts and, perhaps, code.

I think the biggest issue is that these pages are mostly JavaScript. JavaScript-driven pages are harder to scrape than static sites (you not only need to execute the scripts, there may also be a lot of XHR/Ajax requests or other stuff loading content afterwards).

@saxnot The path of least resistance is probably to use their APIs and various open source API libraries where possible. One would only resort to literal screen scraping if they start shutting down their APIs. Getting API keys isn't exactly hard. If they start fighting it, the battle will never end. New bots and API keys cropping up faster than they can shut them down, and no single point of origin. Like trying to kill a swarm of mosquitoes in a crowd of people.

@saxnot If this swarm of bots isn't run by anyone in particular, and works a bit like Tor exit nodes...

@saxnot Think Tor exit nodes but they log into and crawl social networks on behalf of anonymous people.

@thor Twitter harshly limits their API but not the website. Thus scraping is inefficient, but still faster.

@thor idk how others deal with it (Instagram and stuff)

@thor

Automated slurping might be harder, but for small amounts of content or certain pages it's possible to use uBlock Origin to remove the login nag screen and then cut and paste:

>FOSTER HOME NEEDED FOR MAMA & PUPPIES

Lovely momma dog + her 9 cute pups need a temporary home asap! Do you have a safe, clean space where they can live while puppies grow up and are old enough to be rehomed?

Location: 70, Jalan Seri Utara 1, Taman Wahyu, KL

Please contact Ravin:

Please SHARE this post!

@thor I swiped that straight from the page of SPCA Selangor, a suburb of Kuala Lumpur, Malaysia, and I've never had a FB account in my life.

(The doggo and puppers might already have been rehomed; an Indian family has offered to foster them!)