Remove Duplicate Words in C#

Posted By: Dhiraj Ranka    Posted Date: October 06, 2010    Category: ASP.NET    URL: http://www.dotnetspark.com

This article shows how to remove duplicate words from a string in C#.
 

While developing an application, we often take input from users, other applications, or other sources. That input may contain junk or unnecessary values. Even with every kind of validation in place, we cannot always prevent words from being entered twice by mistake. Removing duplicate words from a string can simplify later processing or improve performance. Using a Dictionary instance, we can remove duplicate words from a string in C#. The following is a sample input and result:
Input:  This is a test string for this blog
Output: This is a test string for blog
Note:   [The second 'this' was removed.]

Using a Dictionary of words

We need a data structure that provides constant-time lookups by key, such as Dictionary. The logic is straightforward: loop over all the words and check each one against the words already encountered. If we kept the encountered words in a list instead, every check would scan the whole list, so the total work would grow quadratically with the number of words and the program would become impractically slow on long inputs.
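For contrast, here is a minimal sketch of the list-based approach that the paragraph above warns against. It is not part of the original program: the class name ListBasedSketch and the method name RemoveDuplicateWordsWithList are hypothetical, and for brevity it splits on spaces only. It gives the same kind of result, but each Contains call scans the list from the start.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration only; not part of the article's program.
static class ListBasedSketch
{
    public static string RemoveDuplicateWordsWithList(string input)
    {
        var seen = new List<string>();  // lowercased words encountered so far
        var kept = new List<string>();  // words to keep, in their original casing

        // Assumption: words are separated by spaces only.
        string[] words = input.Split(new[] { ' ' },
            StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            string lower = word.ToLowerInvariant();

            // List<T>.Contains is a linear search, so this check costs
            // O(n) per word and O(n^2) over the whole input.
            if (!seen.Contains(lower))
            {
                seen.Add(lower);
                kept.Add(word);
            }
        }

        return string.Join(" ", kept);
    }
}
```

For short strings the difference is negligible. It only starts to matter when the input contains many distinct words, which is where a Dictionary lookup pays off.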
/*
=== Example program that removes duplicate words (C#) ===
*/
using System;
using System.Collections.Generic;
using System.Text;

class Program
{
    static void Main()
    {
        string s = "This is a test string for this blog";
        Console.WriteLine(s);
        Console.WriteLine(RemoveDuplicateWords(s));

        s = "We use C# for development and share what we learn";
        Console.WriteLine(s);
        Console.WriteLine(RemoveDuplicateWords(s));
    }

    public static string RemoveDuplicateWords(string v)
    {
        // 1
        // Keep track of words found in this Dictionary.
        // Keys are the lowercased words; the bool value is unused.
        var d = new Dictionary<string, bool>();

        // 2
        // Build up string into this StringBuilder.
        StringBuilder b = new StringBuilder();

        // 3
        // Split the input and handle spaces and punctuation.
        string[] a = v.Split(new char[] { ' ', ',', ';', '.' },
            StringSplitOptions.RemoveEmptyEntries);

        // 4
        // Loop over each word
        foreach (string current in a)
        {
            // 5
            // Lowercase each word
            string lower = current.ToLower();

            // 6
            // If we haven't already encountered the word,
            // append it to the result.
            if (!d.ContainsKey(lower))
            {
                b.Append(current).Append(' ');
                d.Add(lower, true);
            }
        }
        // 7
        // Return the string with duplicate words removed,
        // trimming the trailing space left by the last Append.
        return b.ToString().Trim();
    }
}

/*
=== Output of the example program ===
This is a test string for this blog
    This is a test string for blog
We use C# for development and share what we learn
    We use C# for development and share what learn
*/
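
Since the bool values in the Dictionary are never read, the same idea can probably be expressed with a HashSet. The following is a sketch under stated assumptions rather than the article's method: the class name HashSetSketch and the method name RemoveDuplicateWordsWithSet are hypothetical, and it uses StringComparer.OrdinalIgnoreCase in place of the explicit ToLower call, which can differ from culture-sensitive lowercasing for some non-English text.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical alternative; not part of the article's program.
static class HashSetSketch
{
    public static string RemoveDuplicateWordsWithSet(string input)
    {
        // Assumption: an ordinal, case-insensitive comparison is acceptable,
        // so "This" and "this" count as the same word.
        var seen = new HashSet<string>(StringComparer.OrdinalIgnoreCase);
        var kept = new List<string>();

        // Same separators as the article's example.
        string[] words = input.Split(new[] { ' ', ',', ';', '.' },
            StringSplitOptions.RemoveEmptyEntries);

        foreach (string word in words)
        {
            // HashSet<T>.Add returns false when the item is already present,
            // so one call both tests for and records the word.
            if (seen.Add(word))
            {
                kept.Add(word);
            }
        }

        // Joining with a single space avoids the trailing space and the Trim.
        return string.Join(" ", kept);
    }
}
```

Calling HashSetSketch.RemoveDuplicateWordsWithSet("This is a test string for this blog") returns "This is a test string for blog", matching the first result in the output above.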