I have stripped the HTML tags out of a web page. I need to strip out the character entities as well. For example, &quot; should be replaced with a space. I've tried using word boundaries with no luck. It seems like it should be a pattern-matching problem, but I can't get it to work for anything.
I've tried variations of
with +*-. in between but can't get anywhere. Given the number of character entities in HTML, I don't want to write a string.replace statement for each character entity. Help!
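
One possible approach, sketched here in Python because the post does not say which language is in use: match every entity with a single regular expression instead of one replace call per entity. The pattern assumes entities are well formed, meaning an ampersand, then either a name, a decimal number (#34), or a hex number (#x22), then a semicolon. Word boundaries are unlikely to help because & and ; are not word characters, so \b does not match between a space and the ampersand. The names ENTITY_RE and strip_entities are made up for this illustration.

```python
import re

# Matches named (&quot;), decimal (&#34;), and hexadecimal (&#x22;) entities.
# Assumption: every entity ends with a semicolon; bare ampersands are left alone.
ENTITY_RE = re.compile(r"&(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#[xX][0-9A-Fa-f]+);")


def strip_entities(text: str, replacement: str = " ") -> str:
    """Replace every character entity in text with the replacement string."""
    return ENTITY_RE.sub(replacement, text)


if __name__ == "__main__":
    sample = "Tom&nbsp;said &quot;hi&quot; &#38; left &#x2014; AT&T stayed"
    print(strip_entities(sample))
```

The same pattern should carry over to most regex engines with little or no change. If the goal is to turn entities back into the characters they stand for rather than blank them out, Python's standard html.unescape does that without any regex.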