The online Analytics service for Rowing in Motion has some intense data processing requirements. A typical logfile that users may want to work with for a 90-minute training session is about 5 megabytes in compressed size. The in-memory models we need for data analysis encompass millions of data points and can easily exceed 30 MB of memory when fully unfolded.
It’s pretty clear that we could not offer a good user experience when processing all this data locally on a device, so we decided to build the data analysis software as an online service. There are some other benefits to this model too, especially for data archival and historical comparisons. F# excels at expressing our calculation models concisely and makes parallelizing these calculations easy, which is crucial for achieving acceptable response times in our scenario.
Deciding to use F# was easy, but it turned out that I faced some problems integrating with AppHarbor, our cloud hosting platform of choice. This post explains what needs to be done to get F# code to compile on AppHarbor and also how to run unit tests there.
Compiling F# 3.0 code on AppHarbor
Visual Studio 2012 installs the F# “SDK” (there is no official one for F# 3.0) into C:\Program Files\Microsoft SDKs\F#, and that’s where the default F# project templates point:
<Import Project="$(MSBuildExtensionsPath32)\..\Microsoft SDKs\F#\3.0\Framework\v4.0\Microsoft.FSharp.Targets" Condition=" Exists('$(MSBuildExtensionsPath32)\..\Microsoft SDKs\F#\3.0\Framework\v4.0\Microsoft.FSharp.Targets')" />
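That folder does not exist on the AppHarbor build machines. We will fix this (and another issue) by copying the whole “SDK” folder into our source repository at tools/F# (yes, everything). Next, we create a custom targets file (I call it Custom.FSharp.Targets in this post; the one in our repository is named RowingInMotion.FSharp.Targets) and reference it instead. Replace the import line above with: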
<Import Project="$(SolutionDir)\build\RowingInMotion.FSharp.Targets" />
We also have to delete the FSharp.Core reference from the fsproj file. Since the AppHarbor build machines don’t have FSharp.Core 4.3.0 in the GAC (or in a Reference Assemblies location), we have to include it in the repository too. I copied mine from C:\Program Files (x86)\Reference Assemblies\Microsoft\FSharp to lib\FSharp.
The Custom.FSharp.Targets we created earlier will take care of including the correct reference, as well as pointing the Microsoft.FSharp.Targets to the correct F# compiler directory in our source tree.
<?xml version="1.0" encoding="utf-8"?>
<Project xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!--Include a default reference to the correct FSharp.Core assembly-->
  <ItemGroup>
    <Reference Include="FSharp.Core, Version=4.3.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a">
      <HintPath>$(SolutionDir)\lib\FSharp\3.0\Runtime\v4.0\FSharp.Core.dll</HintPath>
    </Reference>
  </ItemGroup>
  <!--Override the Path to the FSharp Compiler to point to our tool dir-->
  <PropertyGroup>
    <FscToolPath>$(SolutionDir)\tools\F#\3.0\Framework\v4.0</FscToolPath>
  </PropertyGroup>
  <Import Project="$(SolutionDir)\tools\F#\3.0\Framework\v4.0\Microsoft.FSharp.Targets" />
</Project>
One last thing that needs to be fixed is that the F# compiler (itself written in F#) also needs a copy of FSharp.Core, so I simply dropped one right next to it. That’s it: now you should be able to compile F# 3.0 projects on AppHarbor. It’s nice that F# is “standalone” enough from the rest of the .NET Framework that it can be pulled apart this easily, but it would be even better if Microsoft offered an F# SDK that the guys at AppHarbor could install on their build servers.
Running F# xUnit tests on AppHarbor
AppHarbor uses Gallio to run unit tests. Unfortunately, Gallio is not able to detect static test methods. That means you cannot write tests as modules. Instead you have to resort to declaring normal types with members, which is a bit heavier on the syntax and feels considerably less idiomatic (and it’s more typing…). I have filed a bug with the Gallio team, which can be tracked here: http://code.google.com/p/mb-unit/issues/detail?id=902. It should be noted that the xUnit Visual Studio runner can run F# xUnit tests just fine. We’ll see if I need to switch to a more F#-specific testing framework in the future.
SubSpec is finally available as a NuGet package. See http://nuget.org/ on how to get started with NuGet. Once you have NuGet installed, it’s a simple matter of running
Install-Package SubSpec or
Install-Package SubSpec.Silverlight from the Package Manager console to get SubSpec integrated into your project.
Integrated into your project, you said? You mean “get the dll and reference it”? No, in fact, deployment as a separate dll is a thing of the past for SubSpec. SubSpec is an extremely streamlined extension of xUnit, and as such it fits into less than 500 lines of C# (excluding xmlDocs), so the package adds it to your project as source. This approach has several advantages:
- Faster builds: compiling 500 lines of C# is faster than resolving and linking against a library
- It fosters the creation of extensions (which is extremely common, at least in my usage of it)
- No need to get the source separately, you already have it!
- Experimental extensions can be easily shared as single files too, such as Thesis, AutoFixture integration…
I hope you like the new packages, please feel free to upvote SubSpec and SubSpec.Silverlight on the NuGet gallery and feel encouraged to write a review.
SubSpec is built for .NET as well as for Silverlight. For the .NET test suite, we use the xUnit MSBuild task to execute the tests; for Silverlight, we use a combination of StatLight and xunitContrib. Whenever you run a suite of tests, it’s usually desirable to have a failing test break the build. Under all circumstances, however, the complete suite should be run to give you accurate feedback.
Our build script looks something like this:
<Target Name="Test" DependsOnTargets="Build">
  <MSBuild Projects="SubSpec.tests.msbuild" Properties="Configuration=$(Configuration)" />
</Target>

SilverlightTests"/>
<Target Name="xUnitTests">
  <xunit Assemblies="@(TestAssemblies)"/>
</Target>
<Target Name="SilverlightTests">
  <Exec Command="&quot;tools\StatLight\StatLight.exe&quot; @(SilverlightTestXaps -> '-x=&quot;%(Identity)&quot;', ' ') --teamcity" />
</Target>
When using each of the build runners (xUnit MSBuild task, StatLight) in isolation with multiple assemblies, they do the right thing: run all tests, fail if at least one test failed, succeed otherwise. Now imagine we have a test succeeding under .NET but failing under Silverlight. When we run xUnit first, we get the desired result. But if StatLight were to run before xUnit, we would never know whether the .NET suite would actually succeed, because StatLight stops the build.
The first (and most intuitive) idea was to move the test targets into a separate MSBuild project and call MSBuild on that project with ContinueOnError="true":
<Project DefaultTargets="Build" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Build">
    <MSBuild Projects="test.msbuild" Targets="Test" ContinueOnError="true"/>
  </Target>
  <Target Name="Test" DependsOnTargets="Foo;Bar">
  </Target>
  <Target Name="Foo">
    <Error Text="Foo"/>
  </Target>
  <Target Name="Bar">
    <Error Text="Bar"/>
  </Target>
</Project>
But this yields only Foo as the error (I wanted to see error: Foo and error: Bar).
MSDN says about ContinueOnError:
Optional attribute. A Boolean attribute that defaults to false if not specified. If ContinueOnError is false and a task fails, the remaining tasks in the Target element are not executed and the entire Target element is considered to have failed.
This is probably why it doesn’t help on the MSBuild task: it only allows another task after the MSBuild task in “Build” to execute. We can confirm this with:
<Target Name="Build">
  <MSBuild Projects="test.msbuild" Targets="Test" ContinueOnError="true"/>
  <Message Text="Some Message"/>
</Target>
And we see Foo as well as Some Message. At this point, it was clear to me that I wanted a target that executes all of its tasks but fails if any of them failed.
In MSDN, we discover StopOnFirstFailure:
true if the task should stop building the remaining projects as soon as any one of them may not work; otherwise, false.
If we specified separate projects, it would work, but we’re in the same project, so unfortunately this won’t help.
The next idea was to use CallTarget with ContinueOnError=”true”, like:
<Target Name="Build">
  <MSBuild Projects="test.msbuild" Targets="Test" ContinueOnError="false"/>
  <Message Text="I should not be executed"/>
</Target>
<ItemGroup>
  <TestTargets Include="Foo;Bar" />
</ItemGroup>
<Target Name="Test">
  <CallTarget Targets="%(TestTargets.Identity)" ContinueOnError="true"/>
</Target>
<Target Name="Foo">
  <Error Text="Foo"/>
</Target>
<Target Name="Bar">
  <Error Text="Bar"/>
</Target>
However, “I should not be executed” appears in the output log. What happened? Build called MSBuild with ContinueOnError=false (the default). Because all tasks in Test ran with ContinueOnError=true, no error bubbled up to the MSBuild task, and it completed without error. This is dangerous, because it makes our build appear to have succeeded when it has not.
The next option I tried was using RunEachTargetSeparately:
Gets or sets a Boolean value that specifies whether the MSBuild task invokes each target in the list passed to MSBuild one at a time, instead of at the same time. Setting this property to true guarantees that subsequent targets are invoked even if previously invoked targets failed. Otherwise, a build error would stop invocation of all subsequent targets. The default value is false.
<Target Name="Build">
  <MSBuild Projects="test.msbuild" Targets="Foo;Bar" RunEachTargetSeparately="true"/>
</Target>
<Target Name="Test" DependsOnTargets="Foo;Bar">
  <Error Text="Foo"/>
</Target>
<Target Name="Foo">
  <Error Text="Foo"/>
</Target>
<Target Name="Bar">
  <Error Text="Bar"/>
</Target>
This gives us exactly what we want, but it doesn’t allow test runs to be parallelized. To achieve that, we need to put each test target in a separate project file. It turns out that with this strategy we don’t need to worry about controlling failure behavior: both projects get built, and the MSBuild task reports an error when any of the projects has failed:
<Target Name="Build">
</Target>
<Target Name="Test">
  <MSBuild Projects="SubSpec.test.msbuild;SubSpec.Silverlight.test.msbuild"/>
</Target>
What’s the alternative? Capturing the exit codes of the runners, as described in http://stackoverflow.com/questions/1059230/trapping-error-status-in-msbuild/1059672#1059672, but I don’t like that approach since it’s a bit messy. The only thing we give up by using multiple projects is that it’s harder to get an overview of what happens where, but I think in this case the split might also aid a proper separation of concerns.
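For comparison, the exit-code approach might look roughly like the sketch below. It assumes the runners are started through the Exec task. The `cmd /c exit` commands are placeholders for the real runner command lines, and the property names are made up for illustration.

```xml
<Project DefaultTargets="Test" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <Target Name="Test">
    <!-- Placeholder for the first runner (hypothetical command).
         IgnoreExitCode keeps the build going even if the command fails. -->
    <Exec Command="cmd /c exit 1" IgnoreExitCode="true">
      <!-- Store the runner's exit code in a property (name is an assumption) -->
      <Output TaskParameter="ExitCode" PropertyName="FirstRunnerExitCode" />
    </Exec>
    <!-- Placeholder for the second runner; it runs regardless of the first result -->
    <Exec Command="cmd /c exit 0" IgnoreExitCode="true">
      <Output TaskParameter="ExitCode" PropertyName="SecondRunnerExitCode" />
    </Exec>
    <!-- Fail the build only after both runners have had their turn -->
    <Error Text="At least one test runner reported a failure"
           Condition="'$(FirstRunnerExitCode)' != '0' Or '$(SecondRunnerExitCode)' != '0'" />
  </Target>
</Project>
```

Every additional runner needs its own property and its own clause in the condition, which is the bookkeeping I would rather avoid.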
Disclaimer: I would have loved to migrate to a different framework (and I would strongly advise you to do so if you’re not a full-stack TeamSystem shop). However, I have a couple of consultants on that project who are not very experienced with testing, and having MSTest built in has compelling advantages. Having said that, I know that Gallio has nice VS integration that you can use to run any framework’s tests inside Visual Studio’s test windows, but that would require each developer to install Gallio on their machine (which is bad too).
Without reiterating the tirades of hate Microsoft has earned for making it impossible to run MSTest on a build server without installing Visual Studio, I want to present what I have compiled from several sources to get it working for me:
- See this post on Stackoverflow for an overview of the issue and possible solutions
- Mark Kharitonov has compiled a basic set of instructions that allow installing MSTest on a Build Server
My setup consists of a Teamcity Build Agent running on Windows Server 2008R2 x64, so I needed to change all registry keys in the reg file to point at
HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\ instead of
Next, I am using Gallio to run the tests instead of executing them directly with MSTest. Even though Gallio is considerably slower than native MSTest, which you can also use with a built-in Teamcity build step, there are a couple of advantages:
- Pretty Reports
- No need to deal with test run configurations and test metadata (I’ve got no idea what they are and why I would need them)
- Teamcity picks up the test results properly
- I can use an MSBuild script to pick up my test dlls via wildcards, with no need for extra MSTest build tasks.
As a reference, here’s my MSBuild script for running the tests using Gallio:
<Project DefaultTargets="Test" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <!-- This is needed by MSBuild to locate the Gallio task -->
  <UsingTask AssemblyFile="tools\Gallio\Gallio.MSBuildTasks.dll" TaskName="Gallio" />
  <!-- Specify the tests assemblies -->
  <ItemGroup>
    <TestAssemblies Include="src\test\**\bin\$(Configuration)\*Tests.dll" />
  </ItemGroup>
  <Target Name="Test">
    <Gallio Files="@(TestAssemblies)" IgnoreFailures="true" ReportDirectory="build\" ReportTypes="html">
      <!-- This tells MSBuild to store the output value of the task's ExitCode property into the project's ExitCode property -->
      <Output TaskParameter="ExitCode" PropertyName="ExitCode"/>
    </Gallio>
    <Error Text="Tests execution failed" Condition="'$(ExitCode)' != 0" />
  </Target>
</Project>
One of the interesting differences between .NET and Java I ran into while reading Joshua Bloch’s excellent “Effective Java” book is the handling of finalizers.
In both .NET and Java, the Garbage Collector implicitly places objects that implement a finalizer and are eligible for collection on a finalizer queue. This queue is processed by a separate finalizer thread, and no guarantees are made as to when an object will be finalized, or whether it will be finalized at all. Finalization cannot be guaranteed because the process may crash, a finalizer may throw an exception, or a similar mishap may occur.
Due to its nature, finalization comes with significant overhead. To avoid this overhead, .NET provides the IDisposable pattern. The Dispose method takes care of releasing all allocated resources and then tells the Garbage Collector, with a call to GC.SuppressFinalize(this), that the object no longer needs to be finalized.
Clients of objects implementing IDisposable are responsible for (though not forced into) calling IDisposable.Dispose whenever they are done with an instance. To support this scenario, many .NET languages feature a using keyword, which is syntactic sugar for providing a scope for disposable objects. Whenever this scope is left, the appropriate Dispose method is called.
Java, on the other hand, does not provide a general-purpose equivalent to IDisposable; for IO-related resources, the Closeable interface exists. As of Java 6 there is no equivalent to the using keyword, although “Automatic Resource Block Management” using the try keyword has been announced for Java 7. Until then, you’re left with manually implementing scoping with a try/finally block. Java also doesn’t provide an equivalent of GC.SuppressFinalize, which means that finalizable objects always incur a significant performance cost.
A further difference is the way base class finalization is handled. Although the .NET CLI spec does not require that base class finalizers be called from a derived class’ finalizer, C# and C++/CLI enforce this with their destructor syntax. In Java, it is up to the implementor not to forget to call the base class finalizer.
All these aspects, combined with the significantly better support for code issue warnings around undisposed Disposables in the .NET stack, make .NET’s handling look superior to Java’s.
A common issue when modifying .NET assemblies by using IL round-trip compiling or a library like Mono.Cecil is preserving debug information across modifications. You will need to take great care not to lose your PDBs along the way.
Building for .NET 3.5
If you want to use Cecil in a 3.5 project, you need to define the NET_3_5 symbol and change the target framework to 3.5 in Mono.Cecil.csproj.
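As a rough sketch, the two changes could be expressed with the standard MSBuild properties below. This is an illustrative fragment, not a copy of the actual Mono.Cecil.csproj: where these properties already live in that file, and how its configurations are organized, is an assumption on my part.

```xml
<!-- Illustrative fragment for Mono.Cecil.csproj; surrounding elements omitted.
     Place it after the existing configuration-specific PropertyGroups so the
     symbol is appended to whatever they define. -->
<PropertyGroup>
  <!-- Target .NET 3.5 instead of the default framework version -->
  <TargetFrameworkVersion>v3.5</TargetFrameworkVersion>
  <!-- Add the NET_3_5 conditional compilation symbol to the existing ones -->
  <DefineConstants>$(DefineConstants);NET_3_5</DefineConstants>
</PropertyGroup>
```

Both settings can also be changed through the project properties pages in Visual Studio, which edits the same properties.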
Running the test suite:
Cecil will build fine after checkout from jbevain’s GitHub repo (http://github.com/jbevain/cecil); the test suite, however, will not.
The following steps are necessary to successfully build and run the Cecil test suite:
- Add the Framework SDK (PEVerify, ILDasm) and the Framework install directory (ILAsm) to your PATH variable (we need the 4.0 tool set because the tests run over a few 4.0 assemblies): C:\Program Files (x86)\Microsoft SDKs\Windows\v7.0A\Bin\NETFX 4.0 Tools;C:\Windows\Microsoft.NET\Framework\v4.0.30319
- Install NUnit 2.4.8 (it’s a little outdated but Mono compatible): http://www.nunit.org/index.php?p=download
- The tests can’t be run ad hoc with TD.Net but must be run using the NUnit runner. There’s an NUnit GUI project in the project’s root.