November 2007

Spider's Web: Spawning threads in .NET (Part 1 of 5), The Basics

Now, with our multi-core CPUs and high expectations of application responsiveness, executing an application along multiple paths has become a requirement rather than an option. Running code in application servers also requires that we know a thing or two about threads. So I have decided to write a few posts about threads and their use in the .NET environment. This is the first part of a five-part series.

History

A process in traditional programming languages pursued a single execution path. The Unix platform provided the fork system call, which lets a process split its execution into different paths. A spawned process has an independent execution path, but that requires quite a lot of resources. A thread is a simpler and cheaper way of having different execution paths running at the same time, with the benefit of a shared memory space. Each process has at least one thread. With the invention of threads, code blocks of the same process could run virtually simultaneously. With today's multi-core CPUs the 'virtually' can be taken out: threads can physically run simultaneously. Threads are implemented differently on different operating systems. For more historical information on threads, please refer to the Wikipedia article.

The Basics

The System.Threading namespace contains the classes required for threading. This is the basic syntax for spawning a thread.

Thread thread1 = new Thread(new ThreadStart(function));

A thread takes a ThreadStart delegate as its constructor parameter. A delegate is a type-safe function pointer. (For more information on delegates, see the MSDN delegate reference and the MSDN article "An Introduction to Delegates" by Jeffrey Richter.)

After creating the thread instance, call the Start method to start the thread.

thread1.Start();

The code sample below creates a sample thread and runs it.

using System;
using System.Threading;

class Program
{
    static void Main(string[] args)
    {
        WorkerObject wo = new WorkerObject();
        // Create the worker thread and start it; Start() returns immediately.
        Thread thread1 = new Thread(new ThreadStart(wo.Count));
        thread1.Start();
        // The main thread keeps counting while the worker runs.
        for (int i = 0; i < 100; i++)
        {
            Console.WriteLine("Main says, I am counting ...{0}", i);
        }
    }
}

class WorkerObject
{
    public void Count()
    {
        for (int i = 0; i < 50; i++)
        {
            Console.WriteLine("Worker object says, I am counting ...{0}", i);
        }
    }
}

This is a very simple example of threading: the main application thread prints 100 numbers to the console while a second thread prints 50 numbers of its own at the same time.

The application output looks like this

Main says, I am counting ...0
Main says, I am counting ...1
Worker object says, I am counting ...0
Worker object says, I am counting ...1
Worker object says, I am counting ...2
Worker object says, I am counting ...3
Main says, I am counting ...2
Main says, I am counting ...3
Main says, I am counting ...4
Worker object says, I am counting ...4
Worker object says, I am counting ...5
Worker object says, I am counting ...6
Worker object says, I am counting ...7
Main says, I am counting ...5

If we observe the output, we can see that both execution paths are running at the same time. There are several points to note here.

  • There is no time-deterministic way of scheduling code executing in a thread. When there are multiple threads, the OS decides which thread gets which time slice and when. Also, the time slices are not equal.
  • After a thread is started from another thread, the Start call returns control immediately; the new thread will start in its own time, which can be immediately or later.

Waiting using Thread.Sleep()

Thread.Sleep() is a static function that causes the executing thread to pause for a specified number of milliseconds. It is quite useful when you need to pause processing for a while. Note that Thread.Sleep() pauses only the thread that called it.
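To illustrate that Sleep() affects only the calling thread, here is a minimal sketch. The class name SleepDemo, the helper TicksWhileCallerSleeps and the timings are made up for this illustration. While the calling thread sleeps for 500 ms, the worker keeps incrementing a counter.

```csharp
using System;
using System.Threading;

class SleepDemo
{
    static int ticks;

    // Worker: bumps the counter five times, pausing 20 ms between bumps.
    static void Tick()
    {
        for (int i = 0; i < 5; i++)
        {
            Interlocked.Increment(ref ticks);
            Thread.Sleep(20);   // pauses only the worker thread
        }
    }

    // Hypothetical helper: returns how many ticks the worker managed
    // while the calling thread was asleep.
    public static int TicksWhileCallerSleeps()
    {
        ticks = 0;
        Thread worker = new Thread(new ThreadStart(Tick));
        worker.Start();
        Thread.Sleep(500);      // pauses only the calling thread
        int seen = Interlocked.CompareExchange(ref ticks, 0, 0); // atomic read
        worker.Join();
        return seen;
    }

    static void Main()
    {
        Console.WriteLine("Worker ticked {0} times while Main slept.",
            TicksWhileCallerSleeps());
    }
}
```

The exact count depends on how the OS schedules the two threads, so treat the number as indicative rather than guaranteed.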

Wait for a Thread to finish by using thread.Join()

Join() is an instance function available to a thread. This blocks the calling thread until the target thread has finished.

class Program
{
    static void Main(string[] args)
    {
        WorkerObject wo = new WorkerObject();
        Thread thread1 = new Thread(new ThreadStart(wo.CountSlowly));
        Console.WriteLine("Before starting the thread.");
        thread1.Start();
        // Block the main thread until thread1 has finished.
        thread1.Join();
        Console.WriteLine("After finishing the thread.");
    }
}

class WorkerObject
{
    public void CountSlowly()
    {
        for (int i = 0; i < 5; i++)
        {
            Console.WriteLine("Worker object counts...{0}", i);
            // Pause this worker thread for one second.
            Thread.Sleep(1000);
        }
    }
}

This code sample demonstrates the use of the static function Thread.Sleep() and the instance function Join(). The application starts a thread and waits for it to finish by calling thread1.Join(). The function CountSlowly prints a number and then puts thread1 to sleep for 1000 ms before printing the next one. The application output looks like this ...

Before starting the thread.
Worker object counts...0
Worker object counts...1
Worker object counts...2
Worker object counts...3
Worker object counts...4
After finishing the thread.

Passing parameters to the thread function

Before .NET 2.0 the function passed to the thread had to be a void function without any parameters. The basic idea was to pass the parameters to the object and keep them in private variables. However, that creates a problem for static functions or for instances of objects stored in static variables, which I will discuss in a later part. So we can modify our WorkerObject class to store a parameter. The code should look like this

class WorkerObject
{
    string _name = string.Empty;
    public WorkerObject(string name)
    {
        _name = name;
    }
    public void CountSlowly()
    {
        for (int i = 0; i < 5; i++)
        {
            Console.WriteLine("{0} says, I am counting ...{1}", _name, i);
            Thread.Sleep(1000);
        }
    }
}

Now we can write code that creates the WorkerObject class with the constructor parameter, which may look like this

  WorkerObject wo = new WorkerObject("Bumble Bee");
  Thread thread1 = new Thread(new ThreadStart(wo.CountSlowly));
  thread1.Start();

This way we keep the function signature the same while passing the parameter to the object. This is a dangerous practice if the object instance is shared among different threads at the same time; otherwise it is harmless. .NET 2.0 introduced a new way of passing parameters: we use the ParameterizedThreadStart delegate to pass a parameter to the thread function. The signature of the function should look like this

  void function (object)

Only one parameter can be passed to the function this way. So if we need to pass multiple parameters, we need to create a custom object that holds them, or we can use a dictionary. Also, inside the function we need to unwrap the object to a strong type before working with it. In the next example we are going to pass two parameters to the function.

// Custom parameter type
class CountParams
{
    string _name = string.Empty;
    int _max = 0;
    public CountParams(string name, int max)
    {
        _name = name;
        _max = max;
    }
    public string Name { get { return _name; } }
    public int Max { get { return _max; } }
}

class WorkerObject
{
    public void CountSlowly(object data)
    {
        // Unwrap the parameters
        CountParams parms = (CountParams)data;
        for (int i = 0; i < parms.Max; i++)
        {
            Console.WriteLine("{0} says, I am counting ...{1}", parms.Name, i);
            Thread.Sleep(1000);
        }
    }
}

As we can see, we need to pass two parameters to the function: the name and how many times to count. So we have created a class called CountParams to hold the parameters. Then inside the function we unwrap the object to our parameter class and use it. Please note that if we had only one parameter to pass to the function, such as an integer, we would not need this parameter class and could cast the object directly to int. Let's observe how we are calling the function ...

WorkerObject wo = new WorkerObject();
Thread thread1 = new Thread(new ParameterizedThreadStart(wo.CountSlowly));
thread1.Start(new CountParams("Bumble Bee", 10));

The example above shows that we construct the thread with a ParameterizedThreadStart delegate and then pass our custom parameter object to the thread's Start() function.
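For the single-parameter case mentioned earlier, a minimal sketch could look like the following. The class SingleParameterDemo, the helper Run and the static field counted are names invented for this illustration; only ParameterizedThreadStart, Start(object) and Join() come from the framework.

```csharp
using System;
using System.Threading;

class SingleParameterDemo
{
    static int counted;   // written by the worker, read after Join()

    // Matches the ParameterizedThreadStart signature: void (object).
    static void CountTo(object data)
    {
        int max = (int)data;   // unbox directly; no wrapper class needed
        for (int i = 0; i < max; i++)
        {
            Console.WriteLine("Worker says, I am counting ...{0}", i);
            counted = i + 1;
        }
    }

    // Hypothetical helper: runs the worker and reports how many numbers it counted.
    public static int Run(int max)
    {
        counted = 0;
        Thread thread1 = new Thread(new ParameterizedThreadStart(CountTo));
        thread1.Start(max);    // the int is boxed into an object here
        thread1.Join();
        return counted;
    }

    static void Main()
    {
        Console.WriteLine("Counted {0} numbers.", Run(3));
    }
}
```

Because the parameter travels as object, the compiler cannot check its type. Passing anything other than an int to Start() here would fail with an InvalidCastException inside the thread function.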

There is something called Thread Local Storage (TLS) which I will cover in a future part in the thread post series.

Multicasting

One more thing to remember is that the ThreadStart delegate is a normal multicast delegate. So we can assign more than one function to the delegate. That will not call the functions simultaneously, though; it calls them one by one on the same thread. The example below shows sample multicasting ...

WorkerObject wo = new WorkerObject();
WorkerObject wo2 = new WorkerObject();
ThreadStart ts = new ThreadStart(wo.CountSlowly);
ts += new ThreadStart(wo2.CountSlowly);
Thread thread1 = new Thread(ts);
thread1.Start();

In the sample above the thread will call wo.CountSlowly and, when that call finishes, it will then call wo2.CountSlowly. This is how we can use multicasting while threading.
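To check the claim that a multicast ThreadStart runs its functions one after another on the same thread, here is a small sketch. MulticastDemo, First, Second, Record and Run are illustrative names, not part of the samples above. Each function records its name together with the managed thread ID it ran on.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

class MulticastDemo
{
    // Each entry is "name:managedThreadId", appended in call order.
    static readonly List<string> calls = new List<string>();

    static void First()  { Record("First"); }
    static void Second() { Record("Second"); }

    static void Record(string name)
    {
        calls.Add(name + ":" + Thread.CurrentThread.ManagedThreadId);
    }

    // Hypothetical helper: runs both functions through one multicast
    // delegate on a single new thread and returns what was recorded.
    public static string[] Run()
    {
        calls.Clear();
        ThreadStart ts = new ThreadStart(First);
        ts += new ThreadStart(Second);
        Thread thread1 = new Thread(ts);
        thread1.Start();
        thread1.Join();   // wait, so the list is complete before we read it
        return calls.ToArray();
    }

    static void Main()
    {
        foreach (string call in Run())
        {
            Console.WriteLine(call);
        }
    }
}
```

Both entries should carry the same thread ID, with First recorded before Second. The actual ID value varies from run to run.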

With this I will close this post. The next post on threading will contain information on DataSlots, locking and some synchronization.



Profile your performance worries with DotTrace 3.0 profiler

We have a large database for the website back at work, along with a high amount of disk I/O, and we were using the built-in ASP.NET membership provider. We replaced the ASP.NET provider because it is not suited to a large-scale application and has quite a lot of performance issues. For example, it creates an anonymous user whenever someone just hits the site ... which is true even for bots. Think of the number of garbage users you can accumulate with ASP.NET membership.

After deploying the performance improvement to the site, we saw that CPU usage became 3 times higher than usual ... not much of an improvement, eh! However, judging from the changes made to the code, CPU usage should have gone down, not up.

I downloaded a version of the DotTrace profiler from JetBrains. It's an amazing profiling tool. We have several web servers, so I took one of them out of the load balancer. When you profile an ASP.NET app, the profiler restarts IIS. During that time we do not want users to experience errors or missing pages, so it is best to take the web server out of the load balancer. I turned on the profiler and then put the server back into the load balancer so that it could serve pages and encounter real-life load. After a few minutes I took a snapshot of the server. All this time the CPU was high.

DotTrace shows the time spent, the number of calls, and the percentage for every method in your application and every method it calls.

Note: Some of the method names in the images are renamed or hidden for security and copyright reasons.

Img_1

As one can see, processing the request takes 8.8% of the time, which is normal, but Application_AuthenticateRequest takes 3.3%, which is very strange.

Investigation showed that a method used to identify cookies by domain was taking a huge amount of time (see below). The code uses regular expressions.

Img_2

Our website calls itself internally from the server side for various requests. Generating a cookie for such a server-side call would be wrong, so this method checks that the request is not a localhost call.

 

Then I investigated the code and found the following

private static bool IsValidDomainName(string inputString)
{
    System.Text.RegularExpressions.Regex regex =
    new System.Text.RegularExpressions.Regex(@"^[a-zA-Z0-9\-\.]+\.(com|org|net|mil|edu|COM|ORG|NET|MIL|EDU)$",
    System.Text.RegularExpressions.RegexOptions.IgnoreCase |
    System.Text.RegularExpressions.RegexOptions.Compiled);

    return regex.IsMatch(inputString);
}


There are 2 problems with this code. First, this is a static function that validates against a fixed expression, so the regex object can be static and created only once instead of being constructed on every call.

Second, regex.IsMatch is an expensive function when it sits in code that runs on every request.

We just replaced this code with simple string matching and our CPU usage went right down; see the web server CPU usage.

Bad_good_cpu

Click on the image to see the high CPU usage versus the usage after applying the patch.
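The two fixes discussed above (a static regex object, or plain string matching) could be sketched roughly as follows. This is not the code we actually deployed; the class name DomainCheck, both method names and the suffix list are assumptions made for illustration, and the string version is only meant to be roughly equivalent to the original pattern.

```csharp
using System;
using System.Text.RegularExpressions;

static class DomainCheck
{
    // Option 1: build the compiled Regex once and reuse it on every call.
    private static readonly Regex DomainRegex = new Regex(
        @"^[a-zA-Z0-9\-\.]+\.(com|org|net|mil|edu)$",
        RegexOptions.IgnoreCase | RegexOptions.Compiled);

    public static bool IsValidDomainNameCached(string input)
    {
        return DomainRegex.IsMatch(input);
    }

    // Option 2: plain string matching, no regular expression at all.
    private static readonly string[] Suffixes =
        { ".com", ".org", ".net", ".mil", ".edu" };

    public static bool IsValidDomainNameSimple(string input)
    {
        if (string.IsNullOrEmpty(input)) return false;
        foreach (string suffix in Suffixes)
        {
            // Require at least one character before the suffix.
            if (input.Length > suffix.Length &&
                input.EndsWith(suffix, StringComparison.OrdinalIgnoreCase))
            {
                // Everything before the suffix must be a letter, digit, '-' or '.'.
                for (int i = 0; i < input.Length - suffix.Length; i++)
                {
                    char c = input[i];
                    bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                              (c >= '0' && c <= '9') || c == '-' || c == '.';
                    if (!ok) return false;
                }
                return true;
            }
        }
        return false;
    }

    static void Main()
    {
        Console.WriteLine(IsValidDomainNameCached("example.com"));
        Console.WriteLine(IsValidDomainNameSimple("localhost"));
    }
}
```

One behavioural difference to keep in mind: the regex version throws on a null input, while the string version in this sketch returns false.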

DotTrace can profile both the CPU and the memory usage of an application, and it also has some cool views that help you find bugs fast.

It is evident that the entry point of a request must be properly optimized, as it is called so many times. There is no room for mistakes in code that runs as often as that.
