
Retrieve info between paragraph tags with feedparser

I've been reading through the documentation for feedparser and haven't been able to find a solution to this: I would like to retrieve only the text between the <p> and </p> tags. Here is an excerpt from a feed entry I'd like to extract it from:

<img alt="Dawsons" height="259" src="http://i.cbc.ca/1.2703554.1405073659!/fileImage/httpImage/image.jpg_gen/derivatives/16x9_460/dawsons.jpg" title="Kathy Dawson and her daughter Emily Dawson, 18, now have a complaint before the Alberta Human Rights Commission over a sexual education course Emily had to take last year. " width="460" /> <p>The Edmonton Public School Board has said it will tell teachers not to use an anti-abortion centre to teach part of its sexual education curriculum, after a McNally high school student filed a human rights complaint over what she was taught.</p>

Note: this is from the RSS feed at http://www.cbc.ca/cmlink/rss-topstories

which I retrieved with

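    import feedparser

    # cbc is assumed to be the parsed result for the feed URL given above
    cbc = feedparser.parse("http://www.cbc.ca/cmlink/rss-topstories")
    for item in cbc.entries:
        print(item.summary)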

I know I could easily write something to parse the summary manually and return only what I want, but is there a way feedparser can do it for me?
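For reference, here is a minimal sketch of the manual route, assuming the summary arrives as a plain HTML string like the excerpt above. It uses only the standard library's html.parser module rather than any feedparser feature. The names ParagraphExtractor and paragraphs_from_summary are hypothetical helpers made up for this illustration, and the sample string is a shortened stand-in for a real summary, not actual feed output.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside each <p>...</p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []    # finished paragraph strings
        self._in_p = False      # True while we are inside a <p>
        self._chunks = []       # text pieces of the current paragraph

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._in_p = False
            self.paragraphs.append("".join(self._chunks).strip())

    def handle_data(self, data):
        # Attribute values (such as the img title) never reach this method,
        # so only text nodes inside a <p> are collected.
        if self._in_p:
            self._chunks.append(data)


def paragraphs_from_summary(summary_html):
    """Hypothetical helper: return a list of the <p> texts in summary_html."""
    parser = ParagraphExtractor()
    parser.feed(summary_html)
    parser.close()
    return parser.paragraphs


# Shortened stand-in for an entry summary (illustrative only).
sample = (
    '<img alt="Dawsons" title="Not part of the paragraph" width="460" /> '
    "<p>The school board has said it will change its curriculum.</p>"
)
print(paragraphs_from_summary(sample))
```

With a parsed feed, the same helper could be called as paragraphs_from_summary(item.summary) inside the loop over cbc.entries. A summary without any <p> tags yields an empty list.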

