IEnumerable interface in .NET and LSP

I often see developers saying that in most cases, use of IEnumerable breaks LSP. Does it? Let’s find out.

This is the continuation of my article Read-Only Collections and LSP. In this post, I’d like to discuss the IEnumerable interface from a Liskov Substitution Principle (LSP) perspective.

Liskov Substitution Principle and IEnumerable interface

To answer the question whether or not use of IEnumerable breaks LSP, we should step back and see what it means to break LSP.

We can say that LSP is violated if one of the following conditions is met:

  • A subclass of a class (or, in our case, an implementation of an interface) doesn’t preserve its parent’s invariants

  • A subclass weakens the parent’s postconditions

  • A subclass strengthens the parent’s preconditions
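
To make the last condition concrete, here is a minimal sketch of a strengthened precondition. The Logger and ShortMessageLogger types are hypothetical and exist only for this illustration; they are not part of the BCL.

```csharp
using System;

// Hypothetical types, for illustration only.
public class Logger
{
    // Contract: accepts any non-null message.
    public virtual void Write(string message)
    {
        if (message == null)
            throw new ArgumentNullException(nameof(message));
        Console.WriteLine(message);
    }
}

public class ShortMessageLogger : Logger
{
    // Strengthened precondition: rejects messages that the parent accepts.
    public override void Write(string message)
    {
        if (message == null)
            throw new ArgumentNullException(nameof(message));
        if (message.Length > 10)
            throw new ArgumentException("Message is too long.", nameof(message));
        base.Write(message);
    }
}
```

Client code written against Logger can legitimately pass a long message, yet it fails as soon as a ShortMessageLogger is substituted for it. That is the kind of violation the rest of this post looks for in IEnumerable implementations.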

The problem with IEnumerable is that its preconditions and postconditions are not defined explicitly and are often misinterpreted. The official contracts for IEnumerable and IEnumerator don’t tell us much either. What’s more, different implementations of IEnumerable often contradict each other.

Implementations of IEnumerable interface

Before we dive into implementations, let’s look at the interface itself. Here’s the code for IEnumerable<T>, IEnumerator<T> and IEnumerator. The non-generic IEnumerable interface is essentially the same as IEnumerable<T>.

public interface IEnumerable<out T> : IEnumerable
{
    IEnumerator<T> GetEnumerator();
}
 
public interface IEnumerator<out T> : IDisposable, IEnumerator
{
    T Current { get; }
}
 
public interface IEnumerator
{
    object Current { get; }
    bool MoveNext();
    void Reset();
}

They are quite simple and don’t do much. Nevertheless, different BCL classes implement them differently. Perhaps the most striking example of a contradiction in IEnumerable implementations is the List<T> class:

public class List<T>
{
    public struct Enumerator : IEnumerator<T>
    {
        private List<T> list;
        private int index;
        private T current;
 
        public T Current
        {
            get { return this.current; }
        }
 
        object IEnumerator.Current
        {
            get
            {
                if (this.index == 0 || this.index == this.list._size + 1)
                    throw new InvalidOperationException();
                return (object)this.Current;
            }
        }
    }
}

'Current' property of type T does not require you to call MoveNext(), whereas 'Current' property of type object does:

public void Test()
{
    List<int>.Enumerator enumerator = new List<int>().GetEnumerator();
    int current = enumerator.Current; // Returns 0
    object current2 = ((IEnumerator)enumerator).Current; // Throws exception
}

Reset() method is also implemented differently. While List<T>.Enumerator.Reset() conscientiously moves to the beginning of the list, iterators don’t implement it at all, so the following code fails:

public void Test()
{
    Test2().Reset(); // Throws NotSupportedException
}
 
private IEnumerator<int> Test2()
{
    yield return 1;
}

It turns out that the only thing we can be sure of is that the IEnumerable<T>.GetEnumerator() method returns a non-null enumerator. The other methods guarantee nothing. A class implementing the IEnumerable interface can be an empty set:

private IEnumerable<int> Test2()
{
    yield break;
}

As well as an infinite sequence of elements:

private IEnumerable<int> Test2()
{
    Random random = new Random();
    while (true)
    {
        yield return random.Next();
    }
}

And that is not a made-up example. BlockingCollection implements IEnumerator in such a way that the calling thread is blocked on the MoveNext() method until some other thread places an element into the collection:

public void Test()
{
    BlockingCollection<int> collection = new BlockingCollection<int>();
    IEnumerator<int> enumerator = collection.GetConsumingEnumerable().GetEnumerator();
    bool moveNext = enumerator.MoveNext(); // The calling thread is blocked
}

In other words, the IEnumerable interface makes no promises about the underlying element set; it doesn’t even guarantee that the set is finite. All it tells us is that we can somehow step through the elements (enumerate them).
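
A consumer that can’t assume finiteness has to bound its own work. The sketch below shows one way to do that with Take; the SequenceDemo class and its members are names I made up for this example.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, for illustration only.
public static class SequenceDemo
{
    // An endless sequence: 0, 1, 2, ...
    public static IEnumerable<int> Naturals()
    {
        int i = 0;
        while (true)
        {
            yield return i++;
        }
    }

    // Safe on an endless sequence: Take stops pulling elements
    // once it has handed out the first three.
    public static List<int> FirstThree()
    {
        return Naturals().Take(3).ToList();
    }

    // By contrast, Naturals().ToList() would never produce a result,
    // because it tries to read the sequence to the end.
}
```

Calling SequenceDemo.FirstThree() returns a list containing 0, 1 and 2, because the iterator body only runs as far as the consumer asks it to.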

Use of IEnumerable and LSP

So, does use of IEnumerable interface break LSP? No. It is actually an invalid question because you can’t break LSP by using an interface, whatever this interface is.

Every interface has some essential preconditions, postconditions and invariants (although they may not be specified explicitly). Client code using an interface can violate one of the preconditions, but it can’t override them and thus cannot violate LSP. Only classes that implement the interface can break this principle.

That brings us to the next question: do implementations of IEnumerable break LSP? Consider the following code:

public void Process(IEnumerable<Order> orders)
{
    foreach (Order order in orders)
    {
        // Do something
    }
}

When the underlying type of orders is, say, List<Order>, everything is fine: the orders can be easily iterated. But what if orders is actually an endless generator that produces a new object on every MoveNext()?

internal class OrderCollection : IEnumerable<Order>
{
    public IEnumerator<Order> GetEnumerator()
    {
        while (true)
        {
            yield return new Order();
        }
    }
}

The Process method will obviously never finish. But is that because the OrderCollection class breaks LSP? No. The OrderCollection class faithfully follows the IEnumerable contract: it provides an Order instance every time it is asked to.

The problem is that the Process method expects more than the IEnumerable interface promises. There’s no guarantee that the underlying orders class is a finite collection. As I mentioned earlier, 'orders' can be an instance of BlockingCollection class, which makes trying to get all the elements out of it completely useless.

To avoid the problem, you can simply change the incoming parameter’s type to ICollection<T>. Unlike the IEnumerable interface, ICollection<T> provides a Count property, which guarantees that the underlying collection is finite.
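
Here is a rough sketch of what that change could look like. The empty Order class is a placeholder, and I made Process return the number of processed orders only so that the example has something observable; both are assumptions of this sketch.

```csharp
using System.Collections.Generic;

// Placeholder entity, for illustration only.
public class Order
{
}

public static class OrderProcessor
{
    // ICollection<Order> exposes Count, so the caller has to pass
    // something that knows how many elements it holds.
    public static int Process(ICollection<Order> orders)
    {
        int processed = 0;
        foreach (Order order in orders)
        {
            // Do something with the order
            processed++;
        }
        return processed;
    }
}
```

With this signature, an iterator-based endless generator no longer compiles as an argument, because it implements IEnumerable<Order> but not ICollection<Order>. The mistake moves from run time to compile time.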

IEnumerable and read-only collections

Use of ICollection has some other drawbacks, though. ICollection allows changing its elements, which is often undesirable when you want to introduce a read-only collection. Before version 4.5 of .NET, IEnumerable was often used for that purpose.

While it might seem like a good decision, it puts too many restrictions on the interface’s consumers.

public int GetTheTenthElement(IEnumerable<int> collection)
{
    return collection.Skip(9).Take(1).SingleOrDefault();
}

The code above shows one of the most common approaches developers use: using LINQ to work around IEnumerable limitations. Although this code is simple, it has an obvious drawback: it has to step through the first ten elements of the sequence one by one, whereas the same result can be achieved by just accessing an element by index.
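
A quick way to see how much work the LINQ version does is to count how many elements it pulls from the source. The CountingSource and TenthElement types below are hypothetical helpers written for this experiment.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that counts how many elements consumers pull from it.
public sealed class CountingSource
{
    public int Pulled { get; private set; }

    public IEnumerable<int> Numbers()
    {
        for (int i = 1; i <= 100; i++)
        {
            Pulled++; // one more element requested by the consumer
            yield return i;
        }
    }
}

public static class TenthElement
{
    // Same approach as GetTheTenthElement above.
    public static int Get(IEnumerable<int> collection)
    {
        return collection.Skip(9).Take(1).SingleOrDefault();
    }
}
```

If you create a CountingSource, pass source.Numbers() to TenthElement.Get and then inspect source.Pulled, I’d expect it to report that ten elements were read to produce a single value. With an indexer, none of that walking would be needed.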

The solution is obvious - use IReadOnlyList instead:

public int GetTheTenthElement(IReadOnlyList<int> collection)
{
    if (collection.Count < 10)
        return 0;
    return collection[9];
}

There’s no reason to continue using the IEnumerable interface in places where you expect the element set to be countable (and you do expect it in most cases). The IReadOnlyCollection<T> and IReadOnlyList<T> interfaces introduced in .NET 4.5 make it a lot easier.
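
On the producing side, this can be as simple as the sketch below. Customer and Order are placeholder types invented for the example.

```csharp
using System.Collections.Generic;

// Placeholder entity, for illustration only.
public class Order
{
}

// Hypothetical class that owns a list of orders.
public class Customer
{
    private readonly List<Order> _orders = new List<Order>();

    // List<T> implements IReadOnlyList<T>, so the list can be exposed
    // through the read-only interface without copying it.
    public IReadOnlyList<Order> Orders
    {
        get { return _orders; }
    }

    public void AddOrder(Order order)
    {
        _orders.Add(order);
    }
}
```

Consumers get Count and an indexer but no Add or Remove. Keep in mind that a determined caller could still cast Orders back to List<Order>; returning _orders.AsReadOnly() instead is one way to close that gap.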

Implementations of IEnumerable and LSP

What about the implementations of IEnumerable that do break LSP? Let’s look at an example where IEnumerable’s underlying type is DbQuery<T>. We could get it in the following way:

private IEnumerable<Order> FindByName(string name)
{
    using (MyContext db = new MyContext())
    {
        return db.Orders.Where(x => x.Name == name);
    }
}

There’s an obvious problem with this code: the database call is being postponed until you start iterating through the results. But, as the database context is already closed, the call yields an exception:

public void Process(IEnumerable<Order> orders)
{
    foreach (Order order in orders) // Exception: DB connection is closed
    {
        // Do something
    }
}

This implementation violates LSP because IEnumerable itself doesn’t have any preconditions that require you to keep a database connection open. You should be able to iterate through IEnumerable regardless of whether or not there is one. As we can see, the DbQuery class strengthens IEnumerable’s preconditions and thus breaks LSP.

I must say that it isn’t necessarily a sign of a bad design. Lazy evaluation is a common approach while dealing with database calls. It allows you to execute several calls in a single roundtrip and thus increase your overall system’s performance. Of course, that comes at a price which in this case is breaking one of the design principles.

What we see here is a trade-off made by architects. They consciously decided to sacrifice some readability for performance benefits. And, of course, the problem can be easily avoided by forcing DbQuery to evaluate the database call:

private IEnumerable<Order> FindByName(string name)
{
    using (MyContext db = new MyContext())
    {
        return db.Orders
            .Where(x => x.Name == name)
            .ToList(); // Forces EF to evaluate the query and put the results into memory
    }
}
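
If you want to try the lazy and the eager variants without a database, the behavior can be imitated with plain LINQ to Objects. FakeContext below is a hypothetical stand-in that I wrote for this sketch; it is not an Entity Framework type and only mimics the "resource must still be open" requirement.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for a database context; not an Entity Framework type.
public sealed class FakeContext : IDisposable
{
    private bool _disposed;

    // Iterator method: its body runs only when someone enumerates the result.
    public IEnumerable<string> Names()
    {
        if (_disposed)
            throw new ObjectDisposedException(nameof(FakeContext));
        yield return "alpha";
        yield return "beta";
    }

    public void Dispose()
    {
        _disposed = true;
    }
}

public static class Finder
{
    // Lazy: the query is returned unevaluated and the context is disposed
    // before anyone iterates the result.
    public static IEnumerable<string> FindLazy(string prefix)
    {
        using (FakeContext db = new FakeContext())
        {
            return db.Names().Where(x => x.StartsWith(prefix, StringComparison.Ordinal));
        }
    }

    // Eager: ToList() evaluates the query while the context is still alive.
    public static IEnumerable<string> FindEager(string prefix)
    {
        using (FakeContext db = new FakeContext())
        {
            return db.Names()
                .Where(x => x.StartsWith(prefix, StringComparison.Ordinal))
                .ToList();
        }
    }
}
```

Iterating the result of Finder.FindLazy("a") throws ObjectDisposedException, whereas Finder.FindEager("a") returns a list that can be iterated at any time.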

Summary

Use of the IEnumerable interface was and still is a common way to deal with collections. But be conscious: in most cases, IReadOnlyCollection and IReadOnlyList will suit you much better.
