Microservices and Docker containers: Architecture, Patterns and Development guidance

This post is part of the series announced in the initial blog post (.NET Application Architecture Guidance), which explores each of the architecture areas currently covered by our team. This one focuses on “Microservices and Docker containers: Architecture, Patterns and Development guidance”.

Just as a reminder, the four introductory blog posts of this series will be the following:

The microservices architecture is emerging as an important approach for distributed mission-critical applications. In a microservice-based architecture, the application is built on a collection of services that can be developed, tested, deployed, and versioned independently. In addition, enterprises are increasingly realizing cost savings, solving deployment problems, and improving DevOps and production operations by using containers (with the Docker engine as the de facto standard).

Microsoft has been releasing container innovations for Windows and Linux by creating products like Azure Container Service and Azure Service Fabric, and by partnering with industry leaders like Docker, Mesosphere, and Kubernetes. These products deliver container solutions that help companies build and deploy applications at cloud speed and scale, whatever their choice of platform or tools…


Java Vs .Net


It is very difficult to say which is better, Java or dotNet. Both have points in their favour. Java’s tag line, “Write once; Run anywhere.”, says: follow the path I provide and I assure you that you can run it anywhere and get the same result. On the other side, dotNet’s tag line, “Write in any language; Run on Windows.”, says: come and use any language you are comfortable with and I assure you that you will get the same result.

Java web applications can run on a variety of web servers, while dotNet relies on IIS to host ASP.NET applications. Not many other options are available in the case of dotNet.

Java is a programming language, while dotNet is a framework that supports multiple languages running on the Windows platform.


Anything Over Anything – Tunneling Software

http://AoA.codeplex.com – tunneling software written using the pre-release version of the Rx framework. Currently the implementation only supports the HTTP protocol: data is exchanged by posting it in an HTTP request and then returning the data accumulated on the server. This is done several times a second and requires only normal one-directional HTTP access.

The abstraction of the tunnel makes it easy to tunnel any protocol’s traffic over another protocol, thus the name – Anything Over Anything.

I would be very glad if someone could add to this project.


The JIT compiler decides for itself which methods to inline. But sometimes we know better than it does. With AggressiveInlining, we give the compiler a hint that the method should be inlined. In fact, the only hint we give the compiler is to ignore the size restriction on the method or property you want to inline. Using this attribute does not guarantee that the method will be inlined. There are a thousand and one reasons why it might not be (being virtual, for one).


This example benchmarks a method with no attribute, and with AggressiveInlining. The method body contains several lines of useless code. This makes the method large in bytes, so the JIT compiler may decide not to inline it.

And: We apply the MethodImplOptions.AggressiveInlining option to Method2. This is an enum.

using System;
using System.Diagnostics;
using System.Runtime.CompilerServices;

class Program
{
    const int _max = 10000000;

    static void Main()
    {
        // ... Compile the methods (call each once so JIT time is not measured)
        Method1();
        Method2();
        int sum = 0;

        var s1 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            sum += Method1();
        }
        s1.Stop(); // stop the first timer before the second loop starts

        var s2 = Stopwatch.StartNew();
        for (int i = 0; i < _max; i++)
        {
            sum += Method2();
        }
        s2.Stop();

        // Convert total milliseconds to nanoseconds per call
        Console.WriteLine(((double)(s1.Elapsed.TotalMilliseconds * 1000000) /
            _max).ToString("0.00 ns"));
        Console.WriteLine(((double)(s2.Elapsed.TotalMilliseconds * 1000000) /
            _max).ToString("0.00 ns"));
    }

    static int Method1()
    {
        // ... No inlining suggestion
        return "one".Length + "two".Length + "three".Length +
            "four".Length + "five".Length + "six".Length +
            "seven".Length + "eight".Length + "nine".Length;
    }

    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    static int Method2()
    {
        // ... Aggressive inlining
        return "one".Length + "two".Length + "three".Length +
            "four".Length + "five".Length + "six".Length +
            "seven".Length + "eight".Length + "nine".Length;
    }
}

7.34 ns    No options
0.32 ns    MethodImplOptions.AggressiveInlining

We see that with no options, the method calls required seven nanoseconds each. But with inlining specified (with AggressiveInlining), the calls required less than one nanosecond each.

Tip: Consider for a moment all the things you could do with those seven nanoseconds.

Tip 2: If you are scheduling your life based on nanoseconds, please consider reducing your coffee intake.

Cache Consideration in Multi-Threaded Code

In parallel programs it is very important to consider cache size and hit rates on a single CPU, but it’s even more important to consider how the caches of multiple processors/cores interact. Let’s consider a single representative example, which demonstrates an important cache optimisation and emphasises the value of good tools when it comes to performance optimisation in general.

Let’s first examine the sequential method. It performs the rudimentary task of summing all the elements in a two-dimensional array of integers and returns the result:

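public static int MatrixSumSequential(int[,] matrix)
{
    int sum = 0;
    // GetLength returns the element count per dimension;
    // GetUpperBound would return the last index and skip a row and a column.
    int rows = matrix.GetLength(0);
    int cols = matrix.GetLength(1);
    for (int i = 0; i < rows; i++)
        for (int j = 0; j < cols; j++)
            sum += matrix[i, j];
    return sum;
}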

We could have used TPL but let’s ignore the huge arsenal of tools TPL provides in our simple example. The following attempt at parallelisation may appear sufficiently reasonable to harvest the fruits of multi-core execution, and even implements a crude aggregation to avoid synchronisation on the shared sum variable:

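// Requires: using System.Linq; using System.Threading;
public static int MatrixSumParallel(int[,] matrix)
{
    int rows = matrix.GetLength(0);
    int cols = matrix.GetLength(1);
    const int THREADS = 4;
    int chunk = rows / THREADS;
    int[] localSums = new int[THREADS];
    Thread[] threads = new Thread[THREADS];
    for (int i = 0; i < THREADS; i++)
    {
        int start = chunk * i;
        // The last thread also takes any rows left over by the integer division.
        int end = (i == THREADS - 1) ? rows : chunk * (i + 1);
        int threadNum = i; // copy the loop variable so each lambda captures its own value
        threads[i] = new Thread(() =>
        {
            for (int row = start; row < end; row++)
                for (int col = 0; col < cols; col++)
                    localSums[threadNum] += matrix[row, col];
        });
        threads[i].Start();
    }
    foreach (var thread in threads)
        thread.Join(); // wait for every worker before aggregating
    return localSums.Sum();
}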

Executing each of the two methods several times on an i7 machine with 6 cores produced the following results for a 2,000 x 2,000 matrix of integers:

* 325ms average for sequential method
* 935ms for the parallel method. Three times as slow as the sequential method!

The obvious question is why?
This is not an example of too fine-grained parallelism, because the number of threads is only 4. However, if you accept the premise that the problem is somehow cache related, it would make sense to measure the number of cache misses introduced by the two methods above.

When sampling the execution of each method with a 2,000 x 2,000 matrix, the Visual Studio profiler reported 963 exclusive samples in the parallel version and only 659 exclusive samples in the sequential version, with the vast majority of samples falling on the inner-loop line that reads from the matrix.

Why would a line of code writing to localSums introduce so many more cache misses than writing to the local variable sum? The answer is that the writes to the shared array invalidate cache lines at other processors/cores, causing every += operation to be a cache miss. The four elements of localSums occupy only 16 bytes, so they almost certainly sit on a single cache line (typically 64 bytes) that all four threads keep writing to. This effect is commonly called false sharing.
When a processor writes to a memory location that is also held in the cache of another processor/core, the hardware performs a cache invalidation that marks that cache line as invalid. The next access to that line results in a cache miss.
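One common way to avoid this kind of invalidation traffic is to let each thread accumulate into a variable of its own and touch the shared array only once, at the end. The following is a minimal sketch of that idea, not a measured fix: the class name MatrixSum, the method name SumParallelLocal and the threadCount parameter are made up for illustration, and whether it actually beats the sequential version on a given machine still has to be measured.

```csharp
using System.Linq;
using System.Threading;

public static class MatrixSum
{
    // Hypothetical variant of the parallel sum: each thread sums into a
    // local variable and writes to the shared array exactly once.
    public static int SumParallelLocal(int[,] matrix, int threadCount = 4)
    {
        int rows = matrix.GetLength(0);
        int cols = matrix.GetLength(1);
        int chunk = rows / threadCount;
        int[] localSums = new int[threadCount];
        Thread[] threads = new Thread[threadCount];

        for (int i = 0; i < threadCount; i++)
        {
            int start = chunk * i;
            // The last thread also takes the rows left over by the division.
            int end = (i == threadCount - 1) ? rows : chunk * (i + 1);
            int threadNum = i;
            threads[i] = new Thread(() =>
            {
                int local = 0; // private to this thread, not shared with other threads
                for (int row = start; row < end; row++)
                    for (int col = 0; col < cols; col++)
                        local += matrix[row, col];
                localSums[threadNum] = local; // single write to the shared array
            });
            threads[i].Start();
        }

        foreach (var thread in threads)
            thread.Join();
        return localSums.Sum();
    }
}
```

The shared array is still there, but it is now written four times in total instead of once per matrix element, so the cache line holding it should be invalidated far less often.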

The moral of the story: do not blindly introduce parallelisation in the hope that it will also result in a performance increase. Always test both versions; you might be surprised at the results!

How IEnumerable might get you in trouble

Quite often I see a bug where a certain factory method creates an IEnumerable of objects.

You might not notice it when first looking at the code, but that sort of code can lead to excessive creation of instances: each additional enumeration of the sequence runs the factory method again, so it creates more objects than you expect.
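To make the problem concrete, here is a small hypothetical sketch. It is not the example from the original post; Widget, WidgetFactory and EnumerationDemo are names invented for illustration. The factory is a lazy iterator, so every pass over the returned sequence constructs the objects again, while materialising it once with ToList avoids the repeat work.

```csharp
using System.Collections.Generic;
using System.Linq;

public sealed class Widget
{
    public static int Created;   // counts constructor calls, for the demo only
    public int Id { get; }
    public Widget(int id) { Id = id; Created++; }
}

public static class WidgetFactory
{
    // Lazy iterator: nothing is created until the sequence is enumerated,
    // and each enumeration runs this method body from the start again.
    public static IEnumerable<Widget> CreateWidgets(int count)
    {
        for (int i = 0; i < count; i++)
            yield return new Widget(i);
    }
}

public static class EnumerationDemo
{
    public static int CreatedWhenEnumeratedTwice()
    {
        Widget.Created = 0;
        IEnumerable<Widget> widgets = WidgetFactory.CreateWidgets(3);
        int count = widgets.Count();          // first pass over the sequence
        int idSum = widgets.Sum(w => w.Id);   // second pass creates the widgets again
        return Widget.Created;
    }

    public static int CreatedWhenMaterialised()
    {
        Widget.Created = 0;
        List<Widget> widgets = WidgetFactory.CreateWidgets(3).ToList(); // single pass
        int count = widgets.Count;
        int idSum = widgets.Sum(w => w.Id);   // iterates the list, no new widgets
        return Widget.Created;
    }
}
```

In this sketch the first method ends up constructing six widgets for a sequence of three, and the second constructs three.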

For example, let’s examine this bit of innocent-looking code:
Continue reading “How IEnumerable might get you in trouble”

Scheduling With Quartz.Net

Sacha's Blog

The other day I had a requirement to schedule something in my app to run at certain times, and at fixed intervals thereafter. Typically I would just solve this using either a simple Timer, or turn to my friend Reactive Extensions by way of Observable.Timer(..).

Thing is, I decided to have a quick look at something I have always known about but never really used for scheduling, which is Quartz.net, which actually does have some pretty good documentation up already:


For me I just wanted to get something very basic up and running, so I gave it a blast.

Step 1 : Install Quartz.net

This is as easy as installing the following NuGet package: “Quartz”

Step 2 : Create A Job Class

This again is fairly easy thanks to Quartz’s nice API. Here is my job class

That is all you need for a job really. The…


Run Java 8 Code on .NET with IKVM

IKVM is a JVM built on top of the CLR that is working towards full compatibility. It runs on both .NET and Mono and, as of this release candidate, supports Java through version 8. For class libraries, it uses OpenJDK 8.

IKVM offers two modes. In dynamic mode, it runs Java applications directly just like any other virtual machine. In static mode, Java byte code is recompiled into .NET libraries and executables.

When working with Java code that is intended for running on IKVM, you can import .NET classes by prefixing the namespace with “cli.”. In order to satisfy the Java compiler, this requires generating the appropriate Java stubs using the ikvmstub utility.

Automatic Non-Deterministic Finalisation

The automatic mechanism cannot be deterministic, because it must rely on the GC to discover whether the object is referenced or not. At times this behaviour is a show stopper, because temporary “resource leaks” or holding a shared resource locked for slightly longer than necessary might be unacceptable in an application. At other times, it’s perfectly acceptable. I will focus on the scenarios where it is.

Any type can override the protected Finalize method defined by System.Object to indicate that it requires automatic finalisation. However, the C# syntax for requesting automatic finalisation on a class A is to implement the method ~A(). This method is called the finalizer, and it must be invoked when the object is destroyed.

Incidentally, at the CLR level any type can have a finalizer, even value types (although C# itself does not allow a struct to declare one). However, the finalizer on a value type object will never be invoked.
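As a rough illustration of the ~A() syntax and of why the timing is not deterministic, consider the sketch below. Tracked and FinalizerDemo are hypothetical names, and the forced GC.Collect call is there only to make the demo observable; it is not something the original post recommends. Even after a forced collection, the runtime makes no promise that every finalizer has already run.

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading;

public class Tracked
{
    private static int _finalized;

    // Number of Tracked instances whose finalizer has run so far.
    public static int FinalizedCount => Volatile.Read(ref _finalized);

    // C# finalizer syntax; the compiler emits an override of Object.Finalize.
    ~Tracked()
    {
        Interlocked.Increment(ref _finalized);
    }
}

public static class FinalizerDemo
{
    // Kept in its own non-inlined method so that no reference to the
    // objects is still live in the caller when the collection happens.
    [MethodImpl(MethodImplOptions.NoInlining)]
    private static void Allocate(int count)
    {
        for (int i = 0; i < count; i++)
            new Tracked();
    }

    // Returns how many finalizers were observed to have run. The result can
    // be anywhere from 0 to count: that uncertainty is the point of the demo.
    public static int Run(int count)
    {
        Allocate(count);
        GC.Collect();
        GC.WaitForPendingFinalizers();
        return Tracked.FinalizedCount;
    }
}
```

Calling FinalizerDemo.Run(10) will usually report that the finalizers have run, but code should never depend on that count or on when it is reached.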

Continue reading “Automatic Non-Deterministic Finalisation”
