Friday, April 20, 2007

Under the Hood of Iterators and the yield Statement

Took a brief look at the mechanics of C# iterators and the yield statement, more in terms of the MSIL-level implementation than how to use them.

  • The compiler generates a state machine implementation of the IEnumerator interface internally.
  • Each yield return statement produces a separate state, while each yield break moves the machine to its terminal state.
For example, take this method:
        private string[] _names = { "Foo", "Bar", "Poo"};

        public IEnumerator GetEnumerator() {
            for (int i = 0; i < _names.Length; ++i) {
                yield return _names[i];
            }
        }
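Written by hand, the state machine for this loop might look roughly like the sketch below. This is my own approximation, not the compiler's actual output: the class name NamesLoopEnumerator, the field names, and the state numbering are all made up for illustration, and the real generated class uses mangled names and extra bookkeeping.

```csharp
using System;
using System.Collections;

// Hypothetical hand-written stand-in for the compiler-generated class.
sealed class NamesLoopEnumerator : IEnumerator
{
    private readonly string[] _names;
    private int _state;       // 0 = not started, 1 = suspended at the yield return, -1 = finished
    private int _i;           // the loop variable i, kept in a field so it survives between calls
    private object _current;

    public NamesLoopEnumerator(string[] names) { _names = names; }

    public object Current { get { return _current; } }

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:           // first call: run the loop initializer
                _i = 0;
                break;
            case 1:           // resuming after the yield return: run ++i
                ++_i;
                break;
            default:          // already finished
                return false;
        }

        if (_i < _names.Length)   // loop condition
        {
            _current = _names[_i];
            _state = 1;           // the single working state
            return true;
        }

        _state = -1;
        return false;
    }

    public void Reset() { throw new NotSupportedException(); }
}

static class LoopDemo
{
    static void Main()
    {
        IEnumerator e = new NamesLoopEnumerator(new string[] { "Foo", "Bar", "Poo" });
        while (e.MoveNext())
            Console.WriteLine(e.Current);   // prints Foo, Bar, Poo on separate lines
    }
}
```

However many elements _names holds, the machine only ever suspends in state 1; the position in the array lives entirely in _i.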
The compiler produces a state machine with one working state and a member variable (the local i, hoisted into a field) that keeps track of the current index into _names. But for the following code:
        public IEnumerator GetEnumerator() {
            yield return "Foo";
            yield return "Bar";
            yield return "Poo";
        }
The compiler generates a state machine with three working states, one per yield return. Reflector seems to choke on compiler-generated iterator code. I don't expect it to reverse-engineer the state machine back into yield statements, but the C# it produces is badly garbled. I ended up having to read the MSIL to understand what was going on.
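For comparison, here is a hand-written sketch of how the three-yield version could be expressed as an explicit state machine. Again this is only an approximation under my own assumptions: ThreeYieldEnumerator, its fields, and the state numbers are invented for illustration and do not match the names or exact layout in the generated MSIL.

```csharp
using System;
using System.Collections;

// Hypothetical hand-written stand-in for the compiler-generated class.
sealed class ThreeYieldEnumerator : IEnumerator
{
    // 0 = not started; 1, 2, 3 = suspended after the 1st, 2nd, 3rd yield return; -1 = finished
    private int _state;
    private object _current;

    public object Current { get { return _current; } }

    public bool MoveNext()
    {
        switch (_state)
        {
            case 0:                       // run up to the first yield return
                _current = "Foo";
                _state = 1;
                return true;
            case 1:                       // resume after the first, run up to the second
                _current = "Bar";
                _state = 2;
                return true;
            case 2:                       // resume after the second, run up to the third
                _current = "Poo";
                _state = 3;
                return true;
            default:                      // fell off the end of the method, or already finished
                _state = -1;
                return false;
        }
    }

    public void Reset() { throw new NotSupportedException(); }
}

static class ThreeYieldDemo
{
    static void Main()
    {
        IEnumerator e = new ThreeYieldEnumerator();
        while (e.MoveNext())
            Console.WriteLine(e.Current);   // prints Foo, Bar, Poo on separate lines
    }
}
```

No index field is needed here, because the state number alone records how far the method has progressed.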