I've been reading through the documentation for feedparser and haven't been able to find a solution to this: I would like to retrieve only the string between <p> and </p>. An example of a feed excerpt I'd like to extract this from is:
<img alt="Dawsons" height="259" src="http://i.cbc.ca/1.2703554.1405073659!/fileImage/httpImage/image.jpg_gen/derivatives/16x9_460/dawsons.jpg" title="Kathy Dawson and her daughter Emily Dawson, 18, now have a complaint before the Alberta Human Rights Commission over a sexual education course Emily had to take last year. " width="460" /> <p>The Edmonton Public School Board has said it will tell teachers not to use an anti-abortion centre to teach part of its sexual education curriculum, after a McNally high school student filed a human rights complaint over what she was taught.</p>
Note: this is from the RSS feed at http://www.cbc.ca/cmlink/rss-topstories, which I retrieved with:

    import feedparser
    cbc = feedparser.parse("http://www.cbc.ca/cmlink/rss-topstories")
    for item in cbc.entries:
        print(item.summary)
I know I could easily write something to parse the summary manually and return only what I want, but is there a way feedparser can do it for me?
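To make clear what I mean by parsing manually, here is a minimal sketch using only Python 3's standard-library html.parser. The names ParagraphExtractor and paragraphs_from_summary are my own and are not part of feedparser, and the sample string is a shortened, made-up stand-in for what item.summary contains.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text found inside <p>...</p> elements (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self._in_p = False     # True while the parser is inside a <p> element
        self._chunks = []      # text pieces of the paragraph currently being read
        self.paragraphs = []   # completed paragraphs, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            self._in_p = False
            self.paragraphs.append("".join(self._chunks).strip())
            self._chunks = []

    def handle_data(self, data):
        # Text outside <p> (for example whitespace after <img />) is ignored.
        if self._in_p:
            self._chunks.append(data)


def paragraphs_from_summary(summary):
    """Return a list with the text of every <p> element in an HTML fragment."""
    parser = ParagraphExtractor()
    parser.feed(summary)
    parser.close()
    return parser.paragraphs


# Shortened, made-up stand-in for item.summary; not the real feed content.
sample = '<img alt="Dawsons" height="259" width="460" /> <p>Sample story text.</p>'
paragraphs = paragraphs_from_summary(sample)
print(paragraphs)
```

With the feed loop above, this would be called as paragraphs_from_summary(item.summary) for each item. It works, but I'd rather not maintain it if feedparser already offers something equivalent.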