What's all this, then?
It's an ongoing concern (and something of a meme) on Hacker News that large corporate sites are taking up more and more of the front page. This site lists the latest news items but filters out posts from those larger sites.
What sites are being filtered out?
Right now, the list contains the domains of the following organizations:
Amazon, Apple, Blogspot, Bloomberg, CNBC, CNN, Foxnews, Github, Google, Googleblog, Hawaiigentech, HBR, Lever, Medium, Microsoft, Nytimes, Politico, Reuters, SEC, SKY, Substack, Theverge, Twitter, VOX, Wired, Wordpress, WSJ, Youtube.
That list will be periodically updated. It's not a perfect list, and YMMV.
Are you scraping the Hacker News front page?
No. The news items are taken from HN's RSS feed, then filtered against the list of ignored domains above.
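For the curious, here is a rough sketch of how that kind of filtering could work. It is not the site's actual code: the function names, the three-entry ignore list, and the sample feed are all made up for illustration, and it assumes each feed item carries a plain `<title>` and `<link>` as in ordinary RSS 2.0.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlparse

# Hypothetical subset of an ignore list; the site's real domain list is not shown here.
IGNORED_DOMAINS = {"medium.com", "nytimes.com", "github.com"}


def is_ignored(url, ignored=IGNORED_DOMAINS):
    """Return True if the URL's host is an ignored domain or a subdomain of one."""
    host = (urlparse(url).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ignored)


def filter_feed(rss_xml, ignored=IGNORED_DOMAINS):
    """Parse an RSS document and return (title, link) pairs whose link is not ignored."""
    root = ET.fromstring(rss_xml)
    kept = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if not is_ignored(link, ignored):
            kept.append((title, link))
    return kept


# Made-up sample feed so the sketch runs without any network access.
SAMPLE = """<rss version="2.0"><channel>
<item><title>A small blog post</title><link>https://example.org/post</link></item>
<item><title>Big site story</title><link>https://www.nytimes.com/story</link></item>
<item><title>Gist</title><link>https://gist.github.com/x</link></item>
</channel></rss>"""

if __name__ == "__main__":
    for title, link in filter_feed(SAMPLE):
        print(title, link)
```

Matching on the host name (rather than searching the whole URL for a string) means subdomains such as gist.github.com are caught, while an unrelated site that merely contains an ignored name in its URL is not.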