Archive for April, 2010

.NET Generics Implementation

April 21, 2010

A really interesting question on Stack Overflow led me to research how exactly the Common Language Runtime (CLR) implements parametric polymorphism (generics). My initial thought was that specialized code must be generated somewhere to support a concrete generic instance, and that this code would need to be different for each generic type argument. Let’s look at two common approaches to implementing generics:

The Java Way: Type Erasure

Java generics exist at compile time only. In essence they are syntactic sugar on top of a runtime environment that has no notion of generic types. Therefore, no runtime operation can depend on the type of a generic argument. For every generic type, the Java compiler generates a raw type. Inside a generic class or method, the compiler substitutes variables of the generic type with the closest matching type: if an extends constraint is specified, the type that the generic argument is constrained to is used, otherwise Object. This process is called type erasure: no information about the generic type argument is preserved in the raw type. As a result, List<String> and List<Object> both map to the same raw type, because type erasure yields an identical representation for both. The compiler inserts runtime casts at call sites to ensure the runtime types are correct.

The advantage of this approach is binary compatibility with previous versions of the JVM. As always, the price for backwards compatibility is high. A serious disadvantage is that introspecting the type of a generic argument at runtime is impossible. It’s also easy to break the type system with a few casts. Performance for primitive types (Java’s counterpart to value types) is bad because they cannot be used as generic arguments directly, so boxing operations are needed everywhere. Since Java generics don’t avoid the burden of runtime casting, there are no performance gains over traditional object collections.

The .NET Way : Reified Generic Arguments

.NET is said to reify parametrized types, which means it has a notion of generic types at the IL level. When a generic type is declared, the compiler generates IL that contains placeholders for the generic type and metadata about the constraints. When the generic class is used, the compiler substitutes the concrete generic type argument for the placeholders at each use and relies on the supplied metadata to enforce the type system. It is important to note that the compiler only requests an instantiation of the generic class or method, that is, the actual code to be executed. It does not generate that code. The compiler is only responsible for passing the right generic type arguments at each use. Let me explain what the somewhat ambiguous term “instantiation” means here.

A given generic type might require different instantiations for each different generic type argument it is used with. You can think of it as “code instantiation”, not “object instantiation”. “Instantiation” refers to the native image, the executable code here.
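The placeholders and constraint metadata mentioned earlier can be observed through reflection. What follows is only a small sketch, not something taken from the paper: it declares its own copy of SomeType<T> to stay self-contained, and MetadataDemo and ConstraintsOf are names I made up for illustration. It uses the standard reflection calls Type.GetGenericArguments() and Type.GenericParameterAttributes.

```csharp
using System;
using System.Reflection;

// Same shape as the example type used in this post, repeated so the sketch compiles on its own.
public class SomeType<T> where T : new()
{
    public T Field = new T();
}

// Hypothetical helper class, for illustration only.
public static class MetadataDemo
{
    // Returns the special-constraint flags stored in metadata for the
    // first type parameter of an open generic type such as SomeType<>.
    public static GenericParameterAttributes ConstraintsOf(Type openGeneric)
    {
        Type placeholder = openGeneric.GetGenericArguments()[0];
        return placeholder.GenericParameterAttributes
             & GenericParameterAttributes.SpecialConstraintMask;
    }

    public static void Main()
    {
        Type open = typeof(SomeType<>);

        // The open type is a generic type definition whose argument is still a placeholder.
        Console.WriteLine(open.IsGenericTypeDefinition);
        Console.WriteLine(open.GetGenericArguments()[0].IsGenericParameter);
        Console.WriteLine(open.GetGenericArguments()[0].Name);

        // The "new()" constraint is recorded as a flag on that placeholder.
        Console.WriteLine(ConstraintsOf(open));
    }
}
```

The point is simply that the open type SomeType<> exists as an entity of its own in the compiled assembly, with T still unresolved and the new() constraint attached to it.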

Whenever a parametrized class or method is invoked, the corresponding instantiation is generated by the JIT compiler at runtime. Looking at the generic type argument, the loader checks whether an instantiation for a compatible type has already been generated; if so, it returns that instantiation, otherwise it generates a new one. Compatible means that the code executed for a given parametrized class or method can be shared among different generic arguments. This is the case for all reference types, because the pointers used for their storage have a fixed size regardless of the concrete reference type. For value types, the JIT compiler generates a specialized instantiation for each generic type argument used. Let’s look at a short example:

    using System;

    // T must have a public parameterless constructor so that "new T()" compiles.
    public class SomeType<T> where T : new()
    {
        T field;

        public SomeType()
        {
            field = new T();
        }
    }

    class Program
    {
        static void Main(string[] args)
        {
            new SomeType<int>();
            new SomeType<float>();
            new SomeType<object>();
            new SomeType<Exception>();
        }
    }

Due to the first and second line in Main(), two different, specialized instantiations of SomeType will be generated by the JIT compiler, because int and float are value types. For the third and fourth line, only a single instantiation will be generated, because the representation of reference types is binary compatible. That instantiation is shared. The details that make this sharing possible while still preserving the runtime type are very low level. In short: while the executable code (the instantiation, the native image) is shared among all reference types, the vtable associated with an instance of the instantiation is unique to the concrete parameter type. The details can be found in a paper by Andrew Kennedy and Don Syme from Microsoft Research titled “Design and Implementation of Generics for the .NET Common Language Runtime”, which is also where most of the information in this blog post comes from.
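That the runtime type survives code sharing is easy to check from C# itself. The following is a minimal sketch under my own assumptions: the ArgumentType and Value members and the ReificationDemo class are additions made for illustration and are not part of the example above. It demonstrates the observable behavior only, not how the CLR implements it internally.

```csharp
using System;

// The example type from this post, extended with two illustrative members.
public class SomeType<T> where T : new()
{
    private readonly T field = new T();

    // typeof(T) is answered at runtime, even if the native code is shared.
    public Type ArgumentType => typeof(T);

    public T Value => field;
}

// Hypothetical driver class, for illustration only.
public static class ReificationDemo
{
    public static void Main()
    {
        var a = new SomeType<object>();
        var b = new SomeType<Exception>();

        Console.WriteLine(a.ArgumentType);              // System.Object
        Console.WriteLine(b.ArgumentType);              // System.Exception

        // Both are reference-type instantiations, yet they remain distinct runtime types.
        Console.WriteLine(a.GetType() == b.GetType());  // False

        // A value-type instantiation stores the int directly, initialized to its default.
        Console.WriteLine(new SomeType<int>().Value);   // 0
    }
}
```

Whether SomeType<object> and SomeType<Exception> actually share native code is not visible at this level; what the sketch shows is that the type argument can always be recovered at runtime, which is exactly what type erasure cannot offer.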

What are the advantages of this approach? Well, instantiation sharing is clever because it reduces code size and the amount of just-in-time compilation, which is usually expensive. Not sharing instantiations for value types means that boxing operations are unnecessary. The performance of a true generic collection of value types will therefore be superior to that of a collection implementation that uses the “object-idiom”.
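To make the difference to the “object-idiom” concrete, here is a small sketch that compares the non-generic ArrayList with List<int>. BoxingDemo, SumBoxed and SumGeneric are hypothetical names of mine; the sketch shows where boxing and casts appear in the code and makes no attempt to measure performance.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

// Hypothetical demo class, for illustration only.
public static class BoxingDemo
{
    // Object-idiom: every int was boxed when it was added and must be unboxed with a cast.
    public static int SumBoxed(ArrayList items)
    {
        int sum = 0;
        foreach (object item in items)
        {
            sum += (int)item; // throws InvalidCastException if item is not a boxed int
        }
        return sum;
    }

    // Generic collection: elements are stored as ints, no boxing and no casts.
    public static int SumGeneric(List<int> items)
    {
        int sum = 0;
        foreach (int item in items)
        {
            sum += item;
        }
        return sum;
    }

    public static void Main()
    {
        var boxed = new ArrayList { 1, 2, 3 };
        var generic = new List<int> { 1, 2, 3 };

        Console.WriteLine(SumBoxed(boxed));     // 6
        Console.WriteLine(SumGeneric(generic)); // 6

        // The object-idiom accepts any element type; the mistake only surfaces at runtime.
        boxed.Add("four");
        try
        {
            SumBoxed(boxed);
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("InvalidCastException");
        }
        // generic.Add("four") would be rejected by the compiler instead.
    }
}
```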

Compared to Java’s generics implementation, the .NET implementation has many advantages in terms of performance and in enforcing a consistent type system, because the complete type information is available at runtime. I won’t repeat the arguments for why this is advantageous; instead I’ll leave you with the recommendation to read Jonathan Pryor’s excellent blog post on this topic.

Here’s another good link: an interview with Anders Hejlsberg in which he talks about the C#, Java and C++ implementations of generics (or templates, respectively).

Categories: .NET, CLR

Version Control Tooling List

April 20, 2010

A short list of the VCS tools I’m currently using. Maybe it’s useful for somebody other than me who is looking for a hint.

Mercurial:

  • TortoiseHg (Windows), Murky (Mac OS), command line only on Linux. For more complex operations (e.g. using Mercurial Queues) I use the command line on every platform.
  • Hosting: Google Code, Bitbucket. Bitbucket looks nicer and has a great community, Google Code is more reliable.

Git:

  • Command line only in most cases, gitk for repository browsing. GitX is great for Mac OS. Will check QGit soon.
  • Hosting: GitHub. (There’s no more to say.)

Subversion:

  • SmartSVN, IDE integration (Xcode, Visual Studio (AnkhSVN), Eclipse, MonoDevelop).
  • Hosting: Supported on almost every hosting platform, Google Code wins for me with its clear UI and reliable service.

For Windows, the Tortoise(X) explorer shell integrations are a good choice. Right now, however, I’m only using TortoiseHg, as there may be clashes when I use e.g. hg (for private commits) and svn on the same project. As a diff/merge tool I’m using the excellent DiffMerge from SourceGear.

Categories: Source Control, Tools

Subversion Client Evaluation

April 19, 2010

Even though I am a huge believer in DVCS, for some projects I am still bound to use Subversion. I have development environments set up under Windows (primarily .NET), Mac OS (Mono and iPhone) and Linux (Mono, Haskell), so I’d like my tools to be nearly identical on all three platforms to reduce any friction not directly related to writing code.

I use version control on any project I do, some of them being cross-platform projects (e.g. mono ports), so having frictionless access to my VCS on any platform is of primary concern to me. All version control systems I happen to use at the moment run on all platforms and have good (svn, git) or excellent (hg) command line utilities, but I like the comfort of a GUI. Especially for browsing history and diffing or merging, graphical tools are irreplaceable.

Finding a good multi-platform GUI client for Subversion is not very easy. I have evaluated quite a few of them but haven’t reached a final conclusion. Here are my notes on each:

eSVN:
Qt based, only very basic repository operations, no stable version available, development stalled in 2007. No recent releases (for SVN 1.6.x) available.

QSvn:
Qt based, lightweight and solid. Actively developed, recent releases available. Easy project set-up and configuration. The GUI is very streamlined and sufficient for day to day use. Very intuitive. No merge support.

RapidSVN:
wxWidgets based, development stalled in late 2009. There might be future maintenance, but so far it doesn’t look very promising. Ugly GUI. I haven’t tried it any further.

Subcommander:
Qt based, rich feature set and actively developed. The user interface is not very intuitive but you’ll get used to it. I found refreshing the repository status to be very slow. So far the most complete OSS choice (including merge support). I’d prefer QSvn for the standard tasks.

SmartSVN:
Java based, sadly enough it’s commercial. You get a 30-day trial of the pro version; after that, only the features of the “Foundation” version are enabled. I have found it to be very intuitive, stable and fast. SmartSVN has some features that make life with SVN easier, such as prepared commits (think of a private patch queue, unfortunately not in the free edition) and a graphical revision graph. Very good merge support.

In principle I am a fan of stand-alone utilities, as I think of coding and version control as sequential tasks. Having to collect my changes manually forces me to review them once more before finally committing, which is a good thing. Subversion is a very mature VCS and has been around for a long time, so I expected to find some decent OSS clients for it. To be honest, the existing projects seem to be far behind here. The commercial SmartSVN is the best stand-alone client I have seen.

Given this situation, I am heavily leaning towards using IDE-integrated tools for Subversion. In contrast to a DVCS, a centralized system takes away more than enough freedom anyway, so using IDE-integrated tools won’t hurt any further. Most of the IDEs I use support Subversion either natively (Xcode, MonoDevelop, Eclipse) or via a plugin (AnkhSVN for Visual Studio). The supported features are equivalent to what a simple tool like QSvn has to offer; for everything else I’ll probably use SmartSVN.
