Unable to set up feed for Reddit search #4

Open
opened 2018-12-15 00:08:53 +01:00 by kmlucy · 5 comments

I'm trying to use HRSS for a Reddit search like this:

https://www.reddit.com/search?q=(%2Bgiveaway%2BOR%2Bgiveaway%2B)%2B(%2Bsubreddit%253ADataHoarder%2BOR%2Bsubreddit%253Ahomelab%2B)

I can select the element, title, etc., but the resulting feed only contains the one post I used when selecting them. Am I doing something wrong, or will HRSS just not work for this application?

Owner

The provided URL does not return any results on my end.

Author

I'm not sure why the link isn't working for you. It is just this search:

( giveaway OR giveaway ) ( subreddit:DataHoarder OR subreddit:homelab )
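
For reference, the link above appears to be percent-encoded twice (`%253A` is an encoded `%3A`, which is itself an encoded `:`). A minimal sketch using only Python's standard `urllib.parse` shows how the `q` parameter decodes to the search text; whether the double encoding is the reason the link returned nothing is only a guess.

```python
from urllib.parse import parse_qs, unquote_plus, urlsplit

# The search URL exactly as posted in this issue.
URL = (
    "https://www.reddit.com/search?q=(%2Bgiveaway%2BOR%2Bgiveaway%2B)"
    "%2B(%2Bsubreddit%253ADataHoarder%2BOR%2Bsubreddit%253Ahomelab%2B)"
)


def decode_query(url):
    """Return the q parameter after one and after two rounds of decoding."""
    # parse_qs performs the first round of percent-decoding.
    once = parse_qs(urlsplit(url).query)["q"][0]
    # A second round turns the remaining '+' into spaces and '%3A' into ':'.
    twice = unquote_plus(once)
    return once, twice


once, twice = decode_query(URL)
print(once)
print(twice)
```

The second printed line is the human-readable search; pasting that text into Reddit's search box avoids the encoding question entirely.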
Owner

HRSS uses HTML tags, ids, and classes to find content in HTML pages and to build the resulting RSS feed. Some websites, like Reddit, generate random classes for otherwise identical elements, so HRSS treats those elements as different and fails to build the RSS feed correctly.

You can confirm this by creating a superuser in HRSS (`python3 manage.py createsuperuser`), logging in to the admin panel, and looking at the classes HRSS uses to generate the feed: `YOUR-BASEURL/admin/web/feed/`. There you will find all your feeds; when you click on one, you can see and modify the HTML classes used by HRSS. A screenshot of the feed created for Reddit (with only one post in the result) is attached.

The manual workaround is to remove the randomly generated classes and/or adjust the classes until your feed looks good.
The real fix would be for HRSS to autodetect randomly generated classes and ignore them, but I don't know how to do that.
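
One possible starting point for such autodetection is a simple heuristic filter. This is only a sketch, not HRSS code: the function names `looks_generated` and `stable_classes`, the `min_len` threshold, and the sample class names are all hypothetical. It assumes generated classes tend to be long tokens that mix letters and digits, which will not hold for every site.

```python
import re


def looks_generated(class_name, min_len=6):
    """Guess whether a CSS class name was machine-generated.

    Hypothetical heuristic: split the name on '-' and '_' and flag it
    if any piece of at least min_len characters mixes letters and digits.
    """
    for token in re.split(r"[-_]+", class_name):
        if len(token) < min_len:
            continue  # short pieces such as "col", "md" or "6" are ignored
        has_digit = any(ch.isdigit() for ch in token)
        has_alpha = any(ch.isalpha() for ch in token)
        if has_digit and has_alpha:
            return True
    return False


def stable_classes(classes):
    """Keep only the classes that do not look machine-generated."""
    return [c for c in classes if not looks_generated(c)]


# Made-up class names in the style of a generated stylesheet.
sample = ["Post", "_1poyrkZ7g36PawDueRza-J", "scrollerItem", "s1okktje-0", "col-md-6"]
print(stable_classes(sample))
```

A filter like this can misjudge names in both directions, so it would probably work best as a default that the user can still override by hand.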

Author

Got it. One suggestion might be to allow manual editing of the classes during creation. As in, you pick an element, it gets the class, then you can edit the class before you create the feed.

I don't know if you use an adblocker, but that is how uBlock Origin (and, I assume, others) lets you block elements.

Owner

That would be a solution. I wanted to make HRSS autodetect random classes, but that could work too. I suck at web design, so I need to find a way to integrate it easily into the setup page...

Reference: hipstercat/hrss#4