I'm fetching RSS items from several RSS channels, and I'd like to sort them correctly by publication time, from latest to oldest, taking time zones into account. So far, I have the following code:
import feedparser
import dateutil.parser

rss_channels = [
    "https://www.novinky.cz/rss",
    "https://news.ycombinator.com/rss",
    "https://unix.stackexchange.com/feeds",
    "https://www.lupa.cz/rss/clanky/",
    "https://www.lupa.cz/rss/n/digizone/",
    "https://www.zive.cz/rss/sc-47/",
    "https://bitcoin.stackexchange.com/feeds",
    "https://vi.stackexchange.com/feeds",
    "https://askubuntu.com/feeds",
]

latest_items = []

for url in rss_channels:
    feed = feedparser.parse(url)
    for entry in feed.entries:
        pub_date_str = entry.published
        try:
            pub_date = dateutil.parser.parse(pub_date_str, ignoretz=True, fuzzy=True)
            if pub_date.tzinfo is None:
                pub_date = pub_date.replace(tzinfo=dateutil.tz.tzutc())
            latest_items.append((entry.title, pub_date, entry.link))
        except Exception as e:
            print(str(e))

latest_items.sort(key=lambda x: x[1], reverse=True)

for title, pub_date, url in latest_items:
    print(f"{pub_date.strftime('%Y-%m-%d %H:%M:%S %z')} - {title} - {url}")
I'm not sure whether the code is correct. Could you confirm that it is, or show me what's wrong? The code is also very slow, so if it can be made faster, that would be great.
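For comparison, here is a minimal sketch of one direction that might address both concerns. It rests on two observations: `ignoretz=True` makes `dateutil.parser.parse` discard the offset in the date string, so the code above labels every local time as UTC; and the feeds are downloaded one after another. The sketch instead relies on feedparser's `published_parsed` / `updated_parsed` fields, which hold the date already normalized to UTC as a `time.struct_time`, and fetches the feeds in a thread pool. The helper names (`struct_to_utc`, `fetch_entries`, `collect_latest`) and the `max_workers=8` default are hypothetical choices made for illustration, not part of any library.

```python
import calendar
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime, timezone


def struct_to_utc(parsed):
    """Convert a UTC time.struct_time into a timezone-aware datetime.

    Assumes `parsed` is expressed in UTC, as feedparser's *_parsed fields are.
    """
    return datetime.fromtimestamp(calendar.timegm(parsed), tz=timezone.utc)


def fetch_entries(url):
    """Download one feed and return (title, utc_datetime, link) tuples."""
    # Third-party dependency, imported lazily so the helpers above
    # can be used (and tested) without it.
    import feedparser

    items = []
    for entry in feedparser.parse(url).entries:
        # Atom feeds may only carry an "updated" date, so fall back to it.
        parsed = entry.get("published_parsed") or entry.get("updated_parsed")
        if parsed is None:
            continue  # skip entries without a usable date
        items.append(
            (entry.get("title", ""), struct_to_utc(parsed), entry.get("link", ""))
        )
    return items


def collect_latest(urls, fetch=fetch_entries, max_workers=8):
    """Fetch all feeds concurrently and return the items sorted newest first."""
    # Downloads are I/O-bound, so threads let them overlap.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        batches = list(pool.map(fetch, urls))
    items = [item for batch in batches for item in batch]
    # All datetimes are aware and in UTC, so they compare correctly.
    items.sort(key=lambda item: item[1], reverse=True)
    return items
```

With the `rss_channels` list from the question, this could be used as `for title, pub_date, link in collect_latest(rss_channels): print(pub_date.isoformat(), title, link)`. The `fetch` parameter exists only so the download step can be swapped out, for example with a stub when testing the sorting logic offline.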