Remove duplicate items from Generic List

Posted By: Dhiraj Ranka    Posted Date: October 06, 2010    Category: LINQ    URL: http://www.dotnetspark.com

This article shows how to remove duplicate items from a generic list of a custom class.
 

Most of the time we work with generic collections such as IList<T> and IEnumerable<T>, together with the LINQ extension methods that come with .NET Framework 3.5. The other day, while using a generic list with my custom class, I came across one of their shortcomings. LINQ offers methods such as Select(), Where() and GroupBy() that take lambda expressions as parameters. A few query operators, such as Distinct(), do not take lambdas as parameters. As a result, they are easy to call.

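The parameterless Distinct() returns the items of a list without duplicates, using the default equality comparer (Equals() and GetHashCode()) to decide which items are the same. This all works fine when the generic list holds predefined data types. Trouble starts when you use your own class with a generic list. Unless the class overrides Equals() and GetHashCode(), the default comparer compares object references, so two separate objects with identical property values are not treated as duplicates and Distinct() appears not to work. So we need something that actually removes duplicate entries from the list.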
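The difference can be seen in a minimal sketch. The Point class and the DistinctDemo name are hypothetical and exist only for this illustration; Point deliberately does not override Equals() or GetHashCode().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class used only for this illustration.
// It does not override Equals() or GetHashCode().
public class Point
{
    public int X { get; set; }
    public int Y { get; set; }
}

public static class DistinctDemo
{
    // Parameterless Distinct() on a built-in type: duplicates are removed.
    public static int CountDistinctInts()
    {
        var numbers = new List<int> { 1, 2, 2, 3, 3, 3 };
        return numbers.Distinct().Count();
    }

    // Parameterless Distinct() on the custom class: both objects are kept,
    // because the default comparer compares references, not property values.
    public static int CountDistinctPoints()
    {
        var points = new List<Point>
        {
            new Point { X = 1, Y = 2 },
            new Point { X = 1, Y = 2 } // same values, different object
        };
        return points.Distinct().Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinctInts());   // 3
        Console.WriteLine(CountDistinctPoints()); // 2
    }
}
```

Note that Distinct() returns a new sequence in both cases; the original list is left unchanged.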

To solve this, we can use IEqualityComparer<T>, which one of the Distinct() overloads accepts as a parameter. Distinct() with an IEqualityComparer<T> parameter returns an unordered sequence that contains no duplicate values. If the comparer is null, the default equality comparer, EqualityComparer<T>.Default, is used to compare values. With this overload you can avoid lambda expressions altogether: you do not need to understand lambdas to appreciate the power of LINQ.
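Before writing a custom comparer, the overload can be tried with a comparer that already ships with the framework. The sketch below uses StringComparer.OrdinalIgnoreCase, which implements IEqualityComparer<string>; the word list and the ComparerOverloadDemo name are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComparerOverloadDemo
{
    // Illustrative data: three spellings of "apple" that differ only in case.
    static readonly List<string> Words =
        new List<string> { "apple", "Apple", "APPLE", "pear" };

    // A null comparer falls back to EqualityComparer<string>.Default,
    // which is case-sensitive, so all four strings are kept.
    public static int CountWithNullComparer()
    {
        IEqualityComparer<string> comparer = null;
        return Words.Distinct(comparer).Count();
    }

    // A case-insensitive comparer treats the three spellings as one item.
    public static int CountIgnoringCase()
    {
        return Words.Distinct(StringComparer.OrdinalIgnoreCase).Count();
    }

    public static void Main()
    {
        Console.WriteLine(CountWithNullComparer()); // 4
        Console.WriteLine(CountIgnoringCase());     // 2
    }
}
```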

Let's see an example that will make this very clear.

First, we create our class (here, a class representing a car):

public class Car
{
    public string ModelName { get; set; }
    public int Price { get; set; }
    public string Type { get; set; }
}
Then we have to create a class that implements IEqualityComparer<Car> and provides two of its methods, namely Equals() and GetHashCode():
// Custom comparer for the Car class.
// Requires: using System.Collections.Generic;
class CarComparer : IEqualityComparer<Car>
{
    // Cars are equal if their model names and car price are equal.
    public bool Equals(Car x, Car y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        // Check whether the cars' properties are equal.
        return x.ModelName == y.ModelName && x.Price == y.Price;
    }

    // If Equals() returns true for a pair of objects,
    // GetHashCode must return the same value for these objects.

    public int GetHashCode(Car car)
    {
        // Check whether the object is null.
        if (Object.ReferenceEquals(car, null)) return 0;

        // Get the hash code for the ModelName field if it is not null.
        int hashModelName = car.ModelName == null ? 0 : car.ModelName.GetHashCode();

        // Get the hash code for the price field.
        int hashCarPrice = car.Price.GetHashCode();

        // Calculate the hash code for the car.
        return hashModelName ^ hashCarPrice;
    }
}
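
The rule stated in the comments (objects that are equal must produce the same hash code) can be sanity-checked with a short sketch. It repeats the Car and CarComparer definitions so that it compiles on its own; the ContractCheck class and its method name are hypothetical helpers added for illustration.

```csharp
using System;
using System.Collections.Generic;

public class Car
{
    public string ModelName { get; set; }
    public int Price { get; set; }
    public string Type { get; set; }
}

// Same comparison rule as above: model name and price decide equality.
public class CarComparer : IEqualityComparer<Car>
{
    public bool Equals(Car x, Car y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        return x.ModelName == y.ModelName && x.Price == y.Price;
    }

    public int GetHashCode(Car car)
    {
        if (Object.ReferenceEquals(car, null)) return 0;
        int hashModelName = car.ModelName == null ? 0 : car.ModelName.GetHashCode();
        return hashModelName ^ car.Price.GetHashCode();
    }
}

// Hypothetical helper, not part of the original example.
public static class ContractCheck
{
    // Two cars that differ only in Type must be equal AND share a hash code.
    public static bool ContractHolds()
    {
        var comparer = new CarComparer();
        var a = new Car { ModelName = "Merd", Price = 15000, Type = "Basic" };
        var b = new Car { ModelName = "Merd", Price = 15000, Type = "Modern" };
        return comparer.Equals(a, b)
            && comparer.GetHashCode(a) == comparer.GetHashCode(b);
    }

    public static void Main()
    {
        Console.WriteLine(ContractHolds()); // True
    }
}
```

Both halves matter because Distinct() compares hash codes before it calls Equals(); a comparer whose hash codes disagree for equal objects would let duplicates through.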
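Once we have defined the comparer, we can pass it to the Distinct() method on an IEnumerable<Car> sequence of Car objects, as shown in the following example:
// Requires: using System; using System.Collections.Generic; using System.Linq;
List<Car> cars = new List<Car> {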
        new Car { ModelName = "Merd", Price = 15000, Type = "Basic" },
        new Car { ModelName = "Ferrari", Price = 25000, Type = "Pro" },
        new Car { ModelName = "Porsche", Price = 50000, Type = "Adv" },
        new Car { ModelName = "Merd", Price = 15000, Type = "Modern" },
        new Car { ModelName = "Range Rover", Price = 77000, Type = "Luxurious" } };

// Exclude duplicates.

IEnumerable<Car> noduplicates = cars.Distinct(new CarComparer());

foreach (var car in noduplicates)
    Console.WriteLine(car.ModelName + " " + car.Price);

/*
    This code produces the following output:
    Merd 15000
    Ferrari 25000
    Porsche 50000
    Range Rover 77000
*/

Notice that although the Type property differs between the two cars with ModelName "Merd", only the first one is shown, because the comparer considers only two of the properties. If we include the third property in the comparer as well, the output will show a second "Merd" with its different type.
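
A sketch of that variant follows. CarWithTypeComparer and TypeComparerDemo are hypothetical names introduced here; the comparer simply extends the equality test and the hash code with the Type property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Car
{
    public string ModelName { get; set; }
    public int Price { get; set; }
    public string Type { get; set; }
}

// Hypothetical variant of CarComparer that also compares Type.
public class CarWithTypeComparer : IEqualityComparer<Car>
{
    public bool Equals(Car x, Car y)
    {
        if (Object.ReferenceEquals(x, y)) return true;
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;
        return x.ModelName == y.ModelName
            && x.Price == y.Price
            && x.Type == y.Type;
    }

    public int GetHashCode(Car car)
    {
        if (Object.ReferenceEquals(car, null)) return 0;
        int hashModelName = car.ModelName == null ? 0 : car.ModelName.GetHashCode();
        int hashType = car.Type == null ? 0 : car.Type.GetHashCode();
        // Every property used in Equals() also contributes to the hash code.
        return hashModelName ^ car.Price.GetHashCode() ^ hashType;
    }
}

public static class TypeComparerDemo
{
    public static List<Car> DistinctCars()
    {
        var cars = new List<Car>
        {
            new Car { ModelName = "Merd", Price = 15000, Type = "Basic" },
            new Car { ModelName = "Ferrari", Price = 25000, Type = "Pro" },
            new Car { ModelName = "Porsche", Price = 50000, Type = "Adv" },
            new Car { ModelName = "Merd", Price = 15000, Type = "Modern" },
            new Car { ModelName = "Range Rover", Price = 77000, Type = "Luxurious" }
        };
        return cars.Distinct(new CarWithTypeComparer()).ToList();
    }

    public static void Main()
    {
        // All five cars are printed, including both "Merd" entries.
        foreach (var car in DistinctCars())
            Console.WriteLine(car.ModelName + " " + car.Price + " " + car.Type);
    }
}
```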

My personal experience with IEqualityComparer is that I had a parser which gave me around 2300+ records, for which I used a generic list with my custom class. Finding duplicates was hell for me, because the parameterless Distinct() didn't actually solve my problem. That's why I had to create a custom comparer, which solved my problem and amazingly reduced the looping and the number of duplicate records, which dropped to around 300. Awesome, isn't it?

Find more at http://msdn.microsoft.com/en-us/library/bb338049%28v=VS.90%29.aspx

