I can't seem to get the regex I need for this match. I have an HTML document that has multiple
<DIV id="Emag_Header"> tags in it. It's an HTML render of a magazine article from a web page. What I need to do is get all the HTML up to and including the first occurrence of the div. I then need all the HTML up to the next occurrence of the div. I then
want to omit that div but get any HTML after it, omitting every later occurrence of the div as well (only the first one needs to stay in). I tried a capture/replace method like this:
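string regex = @"(<DIV\s+Id=""Emag_Header"">.*?</DIV>.*)<DIV\s+Id=""Emag_Header"">.*?</DIV>(.*)";
// IgnoreCase so that Id also matches id; Singleline so that . also matches newlines
Regex reg = new Regex(regex, RegexOptions.IgnoreCase | RegexOptions.Singleline);
content = reg.Replace(content, "$1$2");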
The idea was to grab the first occurrence and all the HTML up to the second occurrence, then grab all the HTML after that. Of course it works fine for the first two occurrences, but the .* at the end matches everything (as it's supposed to), so a third occurrence
of the div will still be in there.
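
Here is a stripped-down repro of what I'm seeing. The sample markup and the names HeaderRepro, StripOnce and CountHeaders are made up for illustration, and I'm assuming the IgnoreCase and Singleline options; the real document is much longer, but the pattern has the same shape as the one in my attempt.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HeaderRepro
{
    // Same shape as my attempt: keep the first header div, drop one later header div.
    const string Pattern =
        @"(<DIV\s+id=""Emag_Header"">.*?</DIV>.*)<DIV\s+id=""Emag_Header"">.*?</DIV>(.*)";

    // Assumed options: case-insensitive tag matching, and . matching newlines.
    const RegexOptions Options = RegexOptions.IgnoreCase | RegexOptions.Singleline;

    // Applies the capture/replace once over the whole document.
    public static string StripOnce(string content)
    {
        return Regex.Replace(content, Pattern, "$1$2", Options);
    }

    // Counts opening header div tags so the before/after can be compared.
    public static int CountHeaders(string content)
    {
        return Regex.Matches(content, @"<DIV\s+id=""Emag_Header"">", Options).Count;
    }

    public static void Main()
    {
        // Hypothetical three-page article, each page starting with the header div.
        string sample =
            "<DIV id=\"Emag_Header\">header</DIV><p>page one</p>\n" +
            "<DIV id=\"Emag_Header\">header</DIV><p>page two</p>\n" +
            "<DIV id=\"Emag_Header\">header</DIV><p>page three</p>";

        string result = StripOnce(sample);

        Console.WriteLine(CountHeaders(sample));  // 3
        Console.WriteLine(CountHeaders(result));  // 2, but I want 1
    }
}
```

The single match runs from the first header div to the end of the input, so only one of the extra header divs is removed no matter how many there are.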