Tools and services for extracting information from web pages.
Some of these use browser automation, although that is really a topic of its own.
1 Scrapy
Scrapy is a Python framework for crawling websites and extracting structured data from them. The companion project scrapy-rss converts the scraped items into RSS feeds.
There is also a hosted cloud service (Scrapinghub, since renamed Zyte) that will deploy Scrapy spiders for you at large scale if you want.
Scrapoxy automates the deployment of a distributed pool of cloud-hosted proxies to route scraping traffic through.