Web scraping
Turns out we can get information off the internet
2016-06-16 — 2023-01-24
Wherein web pages are parsed for structured data, parsed outputs are converted into RSS feeds by configured parsers, and deployments are orchestrated across cloud services to run the extraction at scale.
browser, computers are awful together, confidentiality, diy, doing internet, faster pussycat
Services to extract information from web pages.
Some of these use browser automation, although that is kind of its own thing.
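To make "extracting information" concrete, here is a minimal sketch using only the Python standard library. It pulls link titles and URLs out of an HTML fragment. The names `LinkExtractor`, `extract_links` and `SAMPLE_HTML` are made up for illustration, and the sample markup stands in for a page you would actually download. Real scrapers usually want something more forgiving than this.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect a {title, url} record for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []      # finished records
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside a link we are tracking.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace in the link text.
            title = " ".join("".join(self._text).split())
            self.links.append({"title": title, "url": self._href})
            self._href = None


def extract_links(html):
    """Return a list of {title, url} dicts found in the given HTML string."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page fragment, standing in for a downloaded document.
SAMPLE_HTML = """
<ul>
  <li><a href="/posts/1">First <em>post</em></a></li>
  <li><a href="/posts/2">Second post</a></li>
  <li><a name="anchor-only">No link here</a></li>
</ul>
"""

if __name__ == "__main__":
    for record in extract_links(SAMPLE_HTML):
        print(record)
```

The anchor without an `href` is skipped, so only two records come out. Everything else in the scraping toolchain is about doing this more robustly and on more pages.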
1 Scrapy
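Scrapy is a Python framework for crawling websites and extracting structured data from their pages. The companion project scrapy-rss converts the scraped items into RSS feeds.
There is also a hosted cloud service (Scrapinghub, since renamed Zyte) that will deploy your spiders for you at massive scale if you want.
Scrapoxy automates deployment of a pool of cloud-hosted proxies for this purpose, so that requests are spread over many IP addresses.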
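The idea of turning scraped items into an RSS feed can be sketched with the standard library alone. This is not the scrapy-rss API, just an illustration of the shape of the output. The function name `items_to_rss` and the item keys `title`, `url` and `published` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def items_to_rss(channel_title, channel_link, channel_description, items):
    """Serialise a list of scraped items as a minimal RSS 2.0 document.

    Each item is a dict with "title" and "url" keys, plus an optional
    timezone-aware datetime under "published" (hypothetical field names).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    # RSS 2.0 requires these three channel elements.
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = channel_description

    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["url"]
        if "published" in item:
            # RSS dates use the RFC 822 style that email headers also use.
            ET.SubElement(node, "pubDate").text = format_datetime(item["published"])

    # ElementTree escapes special characters such as "&" for us.
    return ET.tostring(rss, encoding="unicode")


if __name__ == "__main__":
    # Made-up items standing in for scraper output.
    scraped = [
        {
            "title": "Cats & dogs",
            "url": "https://example.com/posts/1",
            "published": datetime(2016, 6, 16, tzinfo=timezone.utc),
        },
        {"title": "Second post", "url": "https://example.com/posts/2"},
    ]
    print(items_to_rss("Example feed", "https://example.com/", "Scraped posts", scraped))
```

Building the feed through an XML library rather than string formatting means titles containing `&` or `<` cannot break the document.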
