.NET Tutorials, Forums, Interview Questions And Answers

Is Regular Expression (RegEx) able to extract sentences from paragraphs?

Posted By:      Posted Date: October 09, 2010    Points: 0   Category :.NET Framework


I am new to regular expressions, and I have been trying to find out whether Regex can extract sentences from any paragraph (text) in a PDF document based on particular words (a pattern).



Can you please outline this with an example in C#?


Thank you in advance  


More Related Resource Links

How to extract sentences from paragraphs with a Regular Expression



Hello there

I hope that you can provide me with a code example of extracting a sentence from a paragraph. I have inserted a paragraph below as an example; however, you can outline this for me with a different example if you wish. Thank you in advance.

paragraph example

The goal of the information extraction system is to understand and transfer different structures of natural language text, structuring the extracted units of information into an understandable form in a data structure. Information extraction makes it possible to produce a semantic representation of the text; the extracted semantic units of information can be utilized for information analysis. The information extraction system is unable to understand the actual text, but it attempts to find and extract pieces of information that are relevant to a predefined pattern. Machine learning tools analyze these and produce facts from them.
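
One possible approach, offered as a minimal sketch rather than a definitive answer: split the paragraph into sentences with a simple pattern, then keep only the sentences that contain the keyword. A regular expression works on a string, so the text has to be pulled out of the PDF first with a separate library (not shown). The class name `SentenceExtractor`, the method name and the sample text are hypothetical, and the sentence rule is deliberately naive (abbreviations such as "e.g." would be split).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SentenceExtractor
{
    // Naive sentence rule (an assumption): a run of characters that are not
    // '.', '!' or '?', followed by one or more of those terminators.
    private static readonly Regex SentencePattern = new Regex(@"[^.!?]+[.!?]+");

    // Returns every sentence of 'text' that contains 'keyword' as a whole word.
    public static List<string> SentencesContaining(string text, string keyword)
    {
        // Regex.Escape makes sure the keyword is treated as literal text.
        var keywordPattern = new Regex(
            @"\b" + Regex.Escape(keyword) + @"\b", RegexOptions.IgnoreCase);

        var result = new List<string>();
        foreach (Match m in SentencePattern.Matches(text))
        {
            string sentence = m.Value.Trim();
            if (keywordPattern.IsMatch(sentence))
            {
                result.Add(sentence);
            }
        }
        return result;
    }

    public static void Main()
    {
        string text = "The system extracts facts. Machine learning is a tool. "
                    + "It finds a pattern in text.";
        foreach (string s in SentencesContaining(text, "pattern"))
        {
            Console.WriteLine(s);
        }
    }
}
```

Doing the split and the keyword test in two steps keeps each pattern short and easy to change independently.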


Security Briefs: Regular Expression Denial of Service Attacks and Defenses


Microsoft security expert Bryan Sullivan believes denial-of-service blackmail attacks will become more common as privilege escalation attacks become more difficult to execute. He demonstrates how to protect your apps against regular expression DoS threats.

Bryan Sullivan

MSDN Magazine May 2010

Help with regular expression


I am using this regular expression: /.*-lyrics-.*$

and I need the expression to find URLs like this:

and it really does that!

The problem is that it also finds this URL:

What regular expression should I use to exclude URLs that end with /lyrics?

Thanks :]



Regular Expression to Match


Here is the kind of text I want to match via a regular Expression

The id="dgSchedule" is always present in the TAG but its location may differ

The table spans multiple lines and contains white space, tabs, and so on.

I have the regular expression to match the start and end tag respectively over a single line

\<table .*\>


The problem is to match the whole table when it spans multiple lines

<table cellspacing="0" rules="all" border="1" id="dgSchedule" style="border-style:None;height:100%;width:100%;border-collapse:collapse;">
	<tr class="blackbar" align="center" style="background-color:#7FB4DE;font-family:Verdana;font-weight:bold;height:20px;">
		<td>From Place</td><td>To Place</td><td>Time</td><td>Bus Type</td>

	</tr><tr style="background-color:#DEECF5;">
	</tr><tr style="background-color:#EFF5FA;">
	</tr><tr style="background-color:#DEECF5;">
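
A hedged sketch of one way to match the whole table: `RegexOptions.Singleline` lets `.` match newlines, and a lazy `.*?` stops at the first closing tag. The class and method names are hypothetical, and the pattern assumes the table does not contain nested tables; for arbitrary HTML a real HTML parser is usually the more robust choice.

```csharp
using System;
using System.Text.RegularExpressions;

public static class TableMatcher
{
    // [^>]* before and after the id attribute allows it to appear anywhere
    // inside the opening tag. Assumption: no nested <table> elements.
    private static readonly Regex TablePattern = new Regex(
        @"<table\b[^>]*\bid=""dgSchedule""[^>]*>.*?</table>",
        RegexOptions.Singleline | RegexOptions.IgnoreCase);

    // Returns the full table markup, or null when no such table is found.
    public static string FindScheduleTable(string html)
    {
        Match m = TablePattern.Match(html);
        return m.Success ? m.Value : null;
    }

    public static void Main()
    {
        string html = "<p>before</p>\n"
                    + "<table border=\"1\" id=\"dgSchedule\" style=\"width:100%;\">\n"
                    + "\t<tr>\n\t\t<td>From Place</td><td>To Place</td>\n\t</tr>\n"
                    + "</table>\n<p>after</p>";
        Console.WriteLine(FindScheduleTable(html));
    }
}
```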


Need a regular expression


I have a required field validator for a textbox, but I just found out that the textbox can have a space in it (for a delimiter, let's say).

I checked the .net and regexlib.com, but couldn't find what I was looking for.

I simply need a regular expression that basically accepts any character string, including one with spaces in it or even a space by itself

Can someone help me out?

regular expression

Hi, can anybody help me write a regular expression for a textbox? The textbox should accept only numbers; otherwise an alert message should be displayed saying that you cannot enter that character in the textbox. I want this in JavaScript.

Matching set of characters via regular expression

I need to be able to match sets of quote and space characters in a string and replace these accordingly. I currently have this done in 2 lines of code, but would like to use a regular expression so that I can use only 1 line of code. See below:

var Qt = unescape("%22"); //quote char
var Sp = unescape("%20"); //space
var Cr = unescape("%0d"); //carriage return
var Lf = unescape("%0a"); //line feed
var CrLf = Cr + Lf;  //carriage return & line feed
stringOut = stringIn.replace(Qt+Qt+Sp+Qt,Qt+Qt+CrLf+Qt);
stringOut = stringOut.replace(Qt+Qt+Sp+Qt+Qt,Qt+Qt+CrLf+Qt+Qt);

I want to replace the space in a string matching the pattern below with a carriage return and line feed: (in hex) 22 22 20 22 or 22 22 20 22 22. This needs to return 22 22 0D 0A 22 or 22 22 0D 0A 22 22

The first replace call above will only match and replace 22 22 20 22, but I want a regular expression in 1 line of code which will match and replace either 22 22 20 22 or 22 22 20 22 22

Regular Expression for Digits, comma & space combination

Hi,

Can you suggest a regular expression for the following category? A text box which should contain:

Max size: 53
0-9 digits + a comma + a space
The 0-9 numbers should be repeated 5 times and separated by a comma and a space
E.g.: 123456789, 123456789, 123456789, 123456789, 123456789

Thanks, David.
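
A minimal sketch, assuming each group is exactly 9 digits (5 groups of 9 digits plus 4 separators of 2 characters gives the stated maximum of 53). The class and method names are hypothetical. `[0-9]` is used instead of `\d` because `\d` in .NET also matches non-ASCII digits, and `\z` is used instead of `$` so that a trailing newline is rejected.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitGroupValidator
{
    // One 9-digit group, then exactly four more groups each preceded by ", ".
    private static readonly Regex Pattern =
        new Regex(@"^[0-9]{9}(?:, [0-9]{9}){4}\z");

    public static bool IsValid(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid(
            "123456789, 123456789, 123456789, 123456789, 123456789"));
        Console.WriteLine(IsValid("123456789,123456789"));
    }
}
```

If the groups may be shorter than 9 digits, `[0-9]{9}` could be changed to something like `[0-9]{1,9}`.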

I need a regex for the following sentence

I have a line of text which is as follows:

This should occur before variable1 = 1 variable2 =2 variable3=6 variable4=9

I want a regex which checks whether the text "This should occur before" comes before the string of variables (variable1, variable2, variable3, variable4), and which should also pass if the variables occur anywhere in the sentence. Also, the regex should pass only if variable4 has a value greater than 9.

Need Regular Expression for m/dd/yyyy hh:mm:ss

Hi everybody, I am stuck generating a regular expression for this type: m/dd/yyyy hh:mm:ss

I am dealing with some date values. For example, I have date values like below:

3/29/2007 13:28:34
3/27/2007 17:12:36

So I need to find a regular expression to validate this. I already have a regular expression string like this:

^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)[0-9]{2}$

But this works only for 03/29/2007. If I use 3/29/2007 it won't work, and if I use 03/27/2007 17:12:36 it still won't work. So I really need your help to generate my regular expression string.

Thanks, regards,
sukavi
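
A hedged sketch of two options. The first relaxes the quoted pattern so the leading zero is optional and a time part may follow; it assumes `/` is the only date separator and that the time is optional. The second uses `DateTime.TryParseExact`, which also rejects impossible dates such as 2/30/2007 that a regular expression alone accepts. The class and method names are hypothetical.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class DateTimeValidator
{
    // Month and day with an optional leading zero, a year in 1900-2099,
    // then an optional " H:mm:ss" part in 24-hour time.
    private static readonly Regex Shape = new Regex(
        @"^(0?[1-9]|1[0-2])/(0?[1-9]|[12][0-9]|3[01])/(19|20)[0-9]{2}" +
        @"( ([01]?[0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9])?\z");

    // Single-letter specifiers accept one or two digits when parsing.
    private static readonly string[] Formats = { "M/d/yyyy H:mm:ss", "M/d/yyyy" };

    // Checks only the textual shape of the value.
    public static bool HasValidShape(string input)
    {
        return Shape.IsMatch(input);
    }

    // Checks that the value is a real calendar date and time.
    public static bool IsRealDate(string input)
    {
        DateTime parsed;
        return DateTime.TryParseExact(input, Formats,
            CultureInfo.InvariantCulture, DateTimeStyles.None, out parsed);
    }

    public static void Main()
    {
        Console.WriteLine(HasValidShape("3/29/2007 13:28:34"));
        Console.WriteLine(HasValidShape("03/27/2007 17:12:36"));
        Console.WriteLine(IsRealDate("2/30/2007 10:00:00"));
    }
}
```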

need regular expression to get message body

I have text like this: <tr class="NormalRow_Small"> <td colspan=2>&nbsp;</td> <td colspan=3> </td> </tr> <tr class="NormalRow_Small"> <!--description--> <td colspan="5"> <font size="2">Hi All,<div><!br /></div><div>I'm glad to introduce ...</div><div><!br /></div><div>Actually we already have ... <font size=1>(see link for full text)</font></font><br> <font size="1"> </td> </tr> <tr> <td colspan=5><hr /></td> </tr>   I want to get bold text, please help me.   Thanks, Alex.

Need a Specific Regular Expression

I am clueless when it comes to building regular expressions. I know what they are, but haven't been able to master them. I found this regular expression which will validate a Sprint/Nextel Direct Connect number - all numbers, only asterisks allowed. ^\d+\*\d+\*\d+$ http://www.regexlib.com/REDetails.aspx?regexp_id=1730 Can someone modify it for me so that it checks for a minimum and maximum number of numbers in the string and checks that there are (2) asterisks? If you can use a minimum of (6) and a maximum of (12), I should be able to figure out how to modify it myself after I call Sprint on Monday.  David H
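
One way this might be done, as a sketch under the assumption that the minimum of 6 and maximum of 12 refer to the total count of digits in the whole string: a lookahead counts the digits, and the original pattern still enforces exactly two asterisks with digits around them. The class and method names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class DirectConnectValidator
{
    // (?=(?:\*?[0-9]){6,12}\z) counts 6 to 12 digits, each optionally
    // preceded by an asterisk; the rest is the original digits*digits*digits.
    private static readonly Regex Pattern = new Regex(
        @"^(?=(?:\*?[0-9]){6,12}\z)[0-9]+\*[0-9]+\*[0-9]+\z");

    public static bool IsValid(string input)
    {
        return Pattern.IsMatch(input);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid("123*456*78"));
        Console.WriteLine(IsValid("12*3*45"));
    }
}
```

To change the limits, only the `{6,12}` quantifier in the lookahead needs to be edited.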

regular expression: Read a multi line paragraph

I am trying to build a regular expression that will capture a paragraph of any length that starts with "Ordinance Summary:" and ends with "<=>" but doesn't actually include them in the selection. And I need it to stop at the first instance of <=>. Here is an example of what I might encounter:

"Ordinance Summary: Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Maecenas porttitor congue massa. Fusce posuere, magna sed pulvinar ultricies, purus lectus malesuada libero, sit amet commodo magna eros quis urna. Nunc viverra imperdiet enim. Fusce est. Vivamus a tellus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Proin pharetra nonummy pede. Mauris et orci. Aenean nec lorem.<=> Ordinance Sponsor:Rhuarch<=>"

And here is what I would need it to actually capture:

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Maecenas porttitor congue massa. Fusce posuere, magna sed pulvinar ultricies, purus lectus malesuada libero, sit amet commodo magna eros quis urna. Nunc viverra imperdiet enim. Fusce est. Vivamus a tellus. Pellentesque habitant morbi tristique senectus et netus et malesuada fames ac turpis egestas. Proin pharetra nonummy pede. Mauris et orci. Aenean nec lorem.

And here is the expression I tried which didn't work: (?<=Ordinance Titl
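
A possible sketch, not the poster's own expression (which is cut off above): a named capture group takes the text between the two markers, a lazy `.*?` stops at the first `<=>`, and `RegexOptions.Singleline` lets the paragraph span several lines. The class, method and group names are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class OrdinanceParser
{
    // \s* swallows the white space after the label so it is not captured.
    // The lazy .*? ends the capture at the first "<=>".
    private static readonly Regex SummaryPattern = new Regex(
        @"Ordinance Summary:\s*(?<summary>.*?)<=>",
        RegexOptions.Singleline);

    // Returns the summary text, or null when the markers are not found.
    public static string ExtractSummary(string text)
    {
        Match m = SummaryPattern.Match(text);
        return m.Success ? m.Groups["summary"].Value : null;
    }

    public static void Main()
    {
        string text = "Ordinance Summary: Lorem ipsum dolor sit amet.\n"
                    + "Aenean nec lorem.<=> Ordinance Sponsor:Rhuarch<=>";
        Console.WriteLine(ExtractSummary(text));
    }
}
```

A capture group is used here instead of lookbehind and lookahead because it keeps the markers out of the result while leaving the pattern easier to read.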

How to extract subjects and objects from a sentence

Hi all, I am supposed to do these few things in my program but I got stuck.

Step 1: To split and count the number of words in the sentence input by users.
Step 2: To extract subjects and objects (e.g. nsubj, pobj, etc.) from the sentence input by users. Or to remove the bad words (stop words) from the sentence using the database of stop words.
Step 3: To check words that match the clues and patterns. E.g. Jones was born in 1975; the matching words are Jones, born, 1975. The clues and patterns are all in a SQL database.
Step 4: To check proper nouns (a capital letter at the start of each word).

Here is my half-finished code. I hope someone can advise me on how to continue it.

Imports System
Imports System.IO
Imports System.Text
Imports System.Text.RegularExpressions

Module Module1
    'To count the words
    Public Function getWordCount(ByVal InputString As String) As Integer
        Return Split(System.Text.RegularExpressions.Regex.Replace(InputString, "\s+", Space(1))).Length
    End Function

    'To eliminate stop words and replace with a space
    Public Function StripStopWords(ByVal s As String) As String
        Dim StopWords As String = ReadFile("C:\Users\jaimiechin\Desktop\stopwords.txt").Trim
        Dim StopWordsRegex As String = Regex.Replace(StopWords, "\s+", "|") ' about|after|all|also etc.

RegEx Request - Extract an <img> tag from string...

Hello, Can anyone give me a regular expression that will extract the <img> tag from a string.  I am receiving descriptions from an rss feed and the description contains an <img> tag in the text.  I want to extract the <img> so that I can display it in a different location on my page. I have fooled with some regular expressions, but my knowledge of them is not very good. Thanks in advance for any help!!
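
A minimal sketch of one approach, assuming the `<img>` tag contains no `>` character inside its attribute values (an HTML parser would be safer for arbitrary markup). The class and method names and the sample description are hypothetical.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ImgTagExtractor
{
    // "<img", a word boundary, then everything up to the next ">".
    private static readonly Regex ImgPattern =
        new Regex(@"<img\b[^>]*>", RegexOptions.IgnoreCase);

    // Returns the first <img> tag in the description, or null if none exists.
    public static string FirstImgTag(string description)
    {
        Match m = ImgPattern.Match(description);
        return m.Success ? m.Value : null;
    }

    // Returns the description with every <img> tag removed.
    public static string WithoutImgTags(string description)
    {
        return ImgPattern.Replace(description, string.Empty);
    }

    public static void Main()
    {
        string description = "Intro <img src=\"a.png\" alt=\"x\" /> more text";
        Console.WriteLine(FirstImgTag(description));
        Console.WriteLine(WithoutImgTags(description));
    }
}
```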

regular expression for file upload box

I have a regular expression that checks filenames in the upload file dialog box. The main aim is to disallow any special characters in filenames.

ValidationExpression="^(([a-zA-Z]:)|(\\{2}\w+)\$?)(\\(\w[\w\-\. ]*))+\.([a-zA-Z]*)$"

But this one will not allow filenames with forward slashes. I want to modify the regex to accept both of the following types of filenames:

c:\temp\abcd.txt

and also the way filenames are displayed in the Firefox browser (with forward slashes):

file:///c:/BIDS2/abcd.txt

I need help with modifying the regex so that it accepts both. Thank you

How to extract City State Zip using Regex match

Hi there - I am parsing a file which contains customer addresses in the following 2 formats:

Format #1:
12345 Melrose Place
New York NY USA 12987

Format #2:
12345 Melrose Place
New York NY 12987

I need to put the data into Address, City, State and Zip fields. I am able to parse and put the data (specifically line 2) in the fields for format #1 but am having issues doing the same for format #2 because format #2 doesn't have USA as a reference point. Below is my code; if any expert can help, that will be appreciated.

Dim AddressChunk As String = tokenizer.NextToken()
If AddressChunk.Contains("USA") Then
    _State = AddressChunk.Substring(AddressChunk.IndexOf("USA") - 4, 2).Trim
    _City = AddressChunk.Substring(0, AddressChunk.IndexOf("USA") - 4).Trim
    _Zip = Regex.Match(AddressChunk, "\d{5}").Value
Else
    _Zip = Regex.Match(AddressChunk, "\d{5}").Value
    _State = AddressChunk.Substring(Regex.Match(AddressChunk, "\s[a-zA-Z]{2}\s\d{5}").Value - 5).Trim
    _City =
End If
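
A hedged sketch of how a single pattern with named groups could handle both formats by making " USA" optional. It assumes the state is always two uppercase letters and the zip is always five digits at the end of the line; the class, method and group names are hypothetical, and it is written in C# rather than the poster's VB.NET.

```csharp
using System;
using System.Text.RegularExpressions;

public static class CityStateZipParser
{
    // city: lazy, so it ends before the first two-letter uppercase state
    // that is followed by an optional " USA" and a 5-digit zip.
    private static readonly Regex LinePattern = new Regex(
        @"^(?<city>.+?)\s+(?<state>[A-Z]{2})(?:\s+USA)?\s+(?<zip>[0-9]{5})\s*$");

    public static bool TryParse(string line,
        out string city, out string state, out string zip)
    {
        Match m = LinePattern.Match(line);
        city = m.Success ? m.Groups["city"].Value : null;
        state = m.Success ? m.Groups["state"].Value : null;
        zip = m.Success ? m.Groups["zip"].Value : null;
        return m.Success;
    }

    public static void Main()
    {
        string city, state, zip;
        foreach (string line in new[] { "New York NY USA 12987", "New York NY 12987" })
        {
            if (TryParse(line, out city, out state, out zip))
            {
                Console.WriteLine(city + " | " + state + " | " + zip);
            }
        }
    }
}
```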