Show HN: Feed Creator 2.0 – Generate RSS feeds from web page elements

k1m | 24 points

Cool project!

I've recently created something similar for personal use. There are many websites (mainly webshops) whose changes I want to be notified about, but they don't have RSS feeds, subscriptions or APIs that you can use.

I set up a cron job that runs daily, scrapes websites according to some XPaths, and saves the results to a DB. If any new elements have appeared, an email is sent out. The biggest challenge is handling false positives: being able to distinguish between a new element and, e.g., a previously seen element with an updated title, description, etc. For websites that directly expose what seem to be unique, server-side identifiers in their HTML, using those as a primary key seems to work well. If that's not available, the href of the HTML element seems to be fairly static.
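
Roughly, the key selection and "have I seen this before" check look like the sketch below. It's a simplified illustration rather than my actual code: it assumes each scraped element has already been reduced to a dict with an optional `id` and an `href`, and the function names, the `seen` table and its columns are made up for the example.

```python
import sqlite3


def stable_key(item):
    """Prefer a server-side id if the page exposes one; otherwise fall back to the href."""
    if item.get("id") is not None:
        return "id:" + str(item["id"])
    return "href:" + item["href"]


def find_new_items(conn, site, items):
    """Record each item's key and return only the items not seen before for this site."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS seen ("
        "site TEXT, key TEXT, PRIMARY KEY (site, key))"
    )
    new_items = []
    for item in items:
        # INSERT OR IGNORE is a no-op when the (site, key) pair already exists,
        # so rowcount is 1 only for keys we have never stored.
        cur = conn.execute(
            "INSERT OR IGNORE INTO seen (site, key) VALUES (?, ?)",
            (site, stable_key(item)),
        )
        if cur.rowcount == 1:
            new_items.append(item)
    conn.commit()
    return new_items
```

Because the title and description are not part of the key, an element whose text changes between runs is not reported again; only a new id (or a new href, when no id is available) triggers a notification.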

Do you have any thoughts on the issue of false positives and unique identifiers?

stekern | 4 years ago

Happy to get feedback and answer any questions about this here.

Here are two feeds we made earlier to give you an idea of what Feed Creator is supposed to do. The links below will pre-fill the form with the parameters you'd enter and produce a preview of the RSS:

* Chomsky.info articles: https://createfeed.fivefilters.org/index.php?url=https%3A%2F...

* Latest articles by John Pilger: https://createfeed.fivefilters.org/index.php?url=http%3A%2F%...

k1m | 4 years ago

I’d be worried about copyright issues.

In many cases, the client doesn’t know or care about content copyright and is just crawling.

(P.S.: I used to develop a blogging platform and would find RSS links to seed content)

docuru | 4 years ago

This seems like a halfway point on the way to a full web scraping solution. Adding support for automation integrations with Zapier or IFTTT could help close the loop. Nice project.

jslakro | 4 years ago