The problem is that there's nothing in RSS to say if the various blocks of text are allowed to contain markup, and if so which. Apparently (see here):
"Userland's RSS reader—generally considered as the reference implementation—did not originally filter out HTML markup from feeds. As a result, publishers began placing HTML markup into the titles and descriptions of items in their RSS feeds. This behavior has become expected of readers, to the point of becoming a de facto standard".
This isn't just difficult, it's unresolvable. If you find something like <b>Boo!</b> in feed data you simply can't know if the author intended it as an example of HTML markup, in which case you should escape the brackets before including them in your page, or as a bold 'Boo!', in which case you are probably expected to include the data as it stands.
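To make the ambiguity concrete, here is a minimal Python sketch of the two readings. The value of feed_text is a hypothetical example, not taken from any real feed; html.escape is from the standard library.

```python
from html import escape

# Hypothetical item text as it arrives from a feed.
feed_text = "<b>Boo!</b>"

# Reading 1: the author is showing markup as an example,
# so the brackets must be escaped before going into the page.
as_literal = escape(feed_text)

# Reading 2: the author wants a bold "Boo!",
# so the text goes into the page as it stands.
as_markup = feed_text

print(as_literal)  # &lt;b&gt;Boo!&lt;/b&gt;
print(as_markup)   # <b>Boo!</b>
```

Both outputs are plausible, and nothing in the feed itself says which one the author meant.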
However, given how things are, unless you know from agreements or documentation that a feed will only ever contain plain text, you are going to have to assume that the content includes HTML. Stripping out all the tags would be fairly easy, but probably isn't going to be useful because it can turn the text into nonsense: think of a post that includes a list.
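As a rough illustration of why blanket stripping hurts, the sketch below removes every tag using Python's standard html.parser and keeps only the text. The strip_tags helper and the sample post are made up for this example.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects text nodes and silently drops every tag."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags(fragment):
    # Hypothetical helper: returns the fragment with all markup removed.
    parser = TextOnly()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.parts)


post = "<p>Things to buy:</p><ul><li>eggs</li><li>milk</li><li>bread</li></ul>"
print(strip_tags(post))  # Things to buy:eggsmilkbread
```

The list structure was carrying the meaning, and once the tags are gone the items run together.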
What should you let through? Well, that's hard to say. Most of the in-line elements, like <b>, <strong>, <a> (carefully), etc. will probably be needed. Also at least some block-level stuff: <p>, <div>, <ul>, <ol>, etc. And note that you will have to think carefully about the character encoding both of the RSS feed and of the page you are substituting it into, otherwise you might not realise that +ADw-script+AD4- could be dangerous (hint: take a look at UTF-7).
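For illustration only, here is a small Python sketch of the allowlist idea, followed by the UTF-7 point. The tag list, the sanitize function and the rule of keeping only http(s) links are assumptions made for this example; it does not balance tags or handle every edge case, and it is not a substitute for a maintained sanitising library.

```python
from html import escape
from html.parser import HTMLParser

# Assumed allowlist, based on the elements mentioned above (plus <li>).
ALLOWED_TAGS = {"b", "strong", "a", "p", "div", "ul", "ol", "li"}
SAFE_SCHEMES = ("http://", "https://")


class AllowlistSanitizer(HTMLParser):
    """Re-emits allowed tags without attributes; escapes all text."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED_TAGS:
            return  # drop the tag itself; any text inside is still escaped
        if tag == "a":
            # "Carefully": keep href only when it is a plain http(s) URL.
            href = dict(attrs).get("href") or ""
            if href.lower().startswith(SAFE_SCHEMES):
                self.out.append('<a href="%s">' % escape(href, quote=True))
            else:
                self.out.append("<a>")
        else:
            # All other attributes (onclick, style, ...) are discarded.
            self.out.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in ALLOWED_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        self.out.append(escape(data, quote=False))


def sanitize(fragment):
    parser = AllowlistSanitizer()
    parser.feed(fragment)
    parser.close()
    return "".join(parser.out)


print(sanitize('<p onclick="x()">Hi <img src="x" onerror="y()"><b>there</b></p>'))
print(sanitize('<a href="javascript:alert(1)">click</a>'))

# The encoding trap: these bytes contain no angle brackets at all,
# yet a consumer that interprets them as UTF-7 sees a script tag.
payload = b"+ADw-script+AD4-"
print(payload.decode("ascii"))  # +ADw-script+AD4-
print(payload.decode("utf-7"))  # <script>
```

The filter only sees brackets once the bytes have been decoded, so the feed has to be decoded with its declared encoding first, and the page the result goes into should declare its own encoding explicitly.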
If at all possible I'd try to avoid doing this yourself and use a reputable library for the purpose. Selecting such a library is left as an exercise for the reader.
See also Doing RSS right (3) - character encoding.