Jun Meng's blog: June 2006

.NET Guidance Explorer

Although I have disks about .NET patterns & practices, a desktop Guidance Explorer is quite handy.

The only problem is that the explorer does not include enough topics at this point. The development team is adding and releasing more content nearly every week.

Is there an auto update feature in the explorer? Unfortunately, no.

Using I/O Completion Port (IOCP) to implement our own thread pool

at 12:55 PM

Here I talk about another old but important technology for .NET server applications.

Before we discuss thread pools, one concept should be clear: you can create as many threads as you want in .NET code, but the default thread pool has at most 25 worker threads per processor.

A server application normally needs a thread pool to handle client requests, even on a single-CPU computer.

1. Thread pool background knowledge

Creating a thread is expensive: to create a thread, a kernel object is allocated and initialized, the thread's stack memory is allocated and initialized, and Windows sends every DLL in the process a DLL_THREAD_ATTACH notification, causing pages from disk to be faulted into memory so that code can execute. When a thread dies, every DLL is sent a DLL_THREAD_DETACH notification, the thread's stack memory is freed, and the kernel object is freed (if its usage count goes to 0).

A thread pool keeps idle threads waiting instead of destroying them, which saves that cost. At first, there is no thread in the pool. When the first request comes, a thread is created to process it. If processing takes a long time (for example, because the request calls a web service), the thread blocks at some point, and the thread pool can create another thread to serve queued requests. When the web service response comes back, the blocked thread is woken up to continue processing. After the request is processed, the thread does not simply die; it goes on to serve other requests.

If we do not use a thread pool, we either end up with one single thread serving all client requests sequentially (for long-running requests, the performance is terrible), or we create and destroy a thread for each client request, which is time consuming when there are many requests.

2. Why do we need our own thread pool

By default, the thread pool of a .NET application has up to 25 worker threads per processor. Normally, the .NET thread pool is fine if there are not many services running in the application and each request is short. But if the application has many tasks to do (several background threads, a scheduler, many long-running requests), it is necessary to build our own thread pool.

3. How to build a thread pool

A Windows I/O Completion Port (IOCP) is a kernel queue that can be used to build a thread pool. The basic idea is to have a number of threads wait on the queue. Each request posted to the queue carries the address of its request data, which identifies a delegate, so that the thread that picks up the request can run the delegated method.
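The idea of worker threads waiting on a queue of delegates can be sketched in managed code. This is only an illustration, not an IOCP implementation: a BlockingCollection (available since .NET 4, so newer than this post) stands in for the kernel queue, and the class name SimpleThreadPool is made up for the example.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;

// Hypothetical sketch: a fixed set of worker threads blocking on a shared queue.
// A managed BlockingCollection stands in for the kernel IOCP queue.
public sealed class SimpleThreadPool : IDisposable
{
    private readonly BlockingCollection<Action> _queue = new BlockingCollection<Action>();
    private readonly Thread[] _workers;

    public SimpleThreadPool(int threadCount)
    {
        _workers = new Thread[threadCount];
        for (int i = 0; i < threadCount; i++)
        {
            _workers[i] = new Thread(WorkerLoop);
            _workers[i].IsBackground = true;
            _workers[i].Start();
        }
    }

    // Post a request; the delegate plays the role of the request data.
    public void Post(Action work)
    {
        _queue.Add(work);
    }

    private void WorkerLoop()
    {
        // Blocks while the queue is empty; the loop ends after
        // CompleteAdding() has been called and the queue is drained.
        foreach (Action work in _queue.GetConsumingEnumerable())
        {
            try
            {
                work();
            }
            catch (Exception)
            {
                // A real pool would log here; the worker must stay alive.
            }
        }
    }

    // Stop accepting work, let queued items finish, then wait for the workers.
    public void Dispose()
    {
        _queue.CompleteAdding();
        foreach (Thread t in _workers)
        {
            t.Join();
        }
        _queue.Dispose();
    }
}
```

A caller would create the pool once, for example new SimpleThreadPool(8), call Post for each incoming request, and call Dispose at shutdown. The thread count is a tuning decision that depends on how long requests block.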

You can read more about implementing an IOCP thread pool in C# in Part I and Part II.

The problem with the article is that the author did not show how to get the request data back. You should use the Target property of GCHandle to recover the real .NET object.
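A minimal sketch of that round trip, assuming the request object is passed through the queue as a pointer-sized value. The class and method names (RequestData, ToToken, FromToken) are hypothetical; only the GCHandle calls are real framework API.

```csharp
using System;
using System.Runtime.InteropServices;

public static class RequestData
{
    // Posting side: pin down a reference to the managed object with a GCHandle
    // and turn the handle into an IntPtr that can travel through a native queue.
    public static IntPtr ToToken(object request)
    {
        GCHandle handle = GCHandle.Alloc(request);
        return GCHandle.ToIntPtr(handle);
    }

    // Worker side: rebuild the handle from the IntPtr, read Target to get the
    // original object back, and free the handle so the object can be collected.
    public static object FromToken(IntPtr token)
    {
        GCHandle handle = GCHandle.FromIntPtr(token);
        try
        {
            return handle.Target;
        }
        finally
        {
            handle.Free();
        }
    }
}
```

Each token must be converted back exactly once, because FromToken frees the handle. If a request is never dequeued, its handle is never freed and the object stays alive.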

Microsoft Identity Integration Server

at 9:07 AM

Last night, I attended the MICSUG user group meeting. The topic was using Microsoft Identity Integration Server (MIIS) to integrate user identities from different data sources. MIIS is a good fit for a big company where legacy systems or departments need to share user identities.

In a typical MIIS use case, a developer defines attribute mappings for data sources (SQL Server, flat files, Active Directory, etc.) to import and export identity data.

The current version has limited ability to manage passwords, though. That is understandable: many systems store only a hash of the password, so MIIS cannot get the original password out. MS is developing password management for the next version.

The next version of MIIS will also support extranets for B2B/B2C identity integration. InfoCard (CardSpace) will also be supported. It sounds like MIIS will become good middleware for a big company.

Web Service Host Application Implementation

at 2:27 PM

I have been curious for a long time about how a Web Service host application is implemented. After I build a .NET web service assembly, how does the host application receive a request and call the correct method?

From the research I did on Web Service topics these days, I have a concrete idea about how to implement a Web Service host environment. How the ASP.NET platform processes a Web Service request is beyond my current ability. Here I only talk about the main steps for a normal Web Service host application to process a Web Service request.

Reflection is the key that makes a Web Service host application work.

1. We define a Web Service class and Web Methods by using the corresponding attributes. When we deploy the assembly to the host application, the application uses reflection to load the assembly and initialize properties using those attributes:


// Requires: using System.Reflection; using System.Web.Services;
Assembly assem = Assembly.LoadFrom(assemblyPath);
foreach (Type t in assem.GetExportedTypes())
{
  foreach (MethodInfo info in
    t.GetMethods(BindingFlags.Public | BindingFlags.Instance))
  {
    // Only methods marked [WebMethod] are exposed by the host
    if (info.IsDefined(typeof(WebMethodAttribute), false))
    {
      // Set according properties
    }
  }
}

Reflection is also used here to generate the WSDL file by walking the methods marked with WebMethodAttribute.
2. When a SOAP request comes, the host application parses the SOAP packet to get the class name, method name and parameter list.
3. The host application obtains an instance for the class name, either from an instance pool or as a singleton.
4. The host application finds the method by its name and calls it:


// type and instance come from step 3; parameters is the object[] parsed in step 2
MethodInfo info = type.GetMethod(methodName);
object result = info.Invoke(instance, parameters);

5. Return the result in a SOAP packet if there is no error; otherwise, return the exception.

The steps above are my simplest design of a Web Service host. I will consider WSE later.
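Steps 3 to 5 can be tried out without any SOAP plumbing. The sketch below is an assumption-laden toy: DemoWebMethodAttribute is a hypothetical stand-in for WebMethodAttribute (so the code has no ASP.NET dependency), CalculatorService and MiniHost are made-up names, and the class name, method name and parameters are assumed to have been parsed out of the request already.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for WebMethodAttribute.
[AttributeUsage(AttributeTargets.Method)]
public sealed class DemoWebMethodAttribute : Attribute { }

// Hypothetical service class.
public class CalculatorService
{
    [DemoWebMethod]
    public int Add(int a, int b) { return a + b; }

    // Public but not marked, so the host refuses to call it.
    public int Secret() { return 42; }
}

public static class MiniHost
{
    // Create an instance, find the marked method by name, and invoke it.
    public static object Dispatch(Type serviceType, string methodName, object[] parameters)
    {
        MethodInfo info = serviceType.GetMethod(
            methodName, BindingFlags.Public | BindingFlags.Instance);
        if (info == null || !info.IsDefined(typeof(DemoWebMethodAttribute), false))
        {
            throw new MissingMethodException(serviceType.Name, methodName);
        }

        // A real host might take the instance from a pool instead.
        object instance = Activator.CreateInstance(serviceType);
        try
        {
            return info.Invoke(instance, parameters);
        }
        catch (TargetInvocationException ex)
        {
            // Invoke wraps exceptions thrown by the target method; unwrap
            // so the caller sees the service's own exception.
            throw ex.InnerException ?? ex;
        }
    }
}
```

For example, MiniHost.Dispatch(typeof(CalculatorService), "Add", new object[] { 2, 3 }) returns the boxed sum. Overloaded web methods would need extra handling, because GetMethod with only a name throws when several methods match.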

Dynamic Code Compilation

at 9:47 AM

Maybe you already know this old technology, dynamic code compilation, but I didn't touch it until today.

Suppose we put code on a production box and later need to trace a bug in the production environment. Of course there is no Visual Studio installed on the production box. What can we do? Sometimes it is too complex to build a test project and upload it to the production box. What about building a generic editor with code compilation capability, so that we can paste test code into it and compile the source dynamically, without complex command-line typing?

Another example is an application server. Being able to compile test code inside the server environment is a good feature for stress-testing a module.

Inside the editor, we can define our own format for DLL references. For example:

//@ref "DLL file name"
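One possible way for the editor to pick up such lines is a small regular expression. This is a sketch only: the class name RefDirectives is made up, and the exact rules (one directive per line, file name in double quotes) are assumptions based on the example above.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class RefDirectives
{
    // Matches whole lines of the form: //@ref "Some.dll"
    private static readonly Regex RefLine =
        new Regex("^\\s*//@ref\\s+\"([^\"]+)\"\\s*$", RegexOptions.Multiline);

    // Returns the DLL names listed in the source, in order of appearance.
    public static List<string> Parse(string source)
    {
        List<string> references = new List<string>();
        foreach (Match m in RefLine.Matches(source))
        {
            references.Add(m.Groups[1].Value);
        }
        return references;
    }
}
```

Because the directive is an ordinary C# comment, the same source text can be handed to the compiler unchanged.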

The editor compiles source code in these steps:

1. Create an instance of a CodeDomProvider: CSharpCodeProvider (VBCodeProvider for Visual Basic)
2. Provide CompilerParameters for compiler options, such as adding DLL references
3. Compile source code using the CompileAssemblyFromSource method of the CodeDomProvider
4. Check CompilerResults
5. Execute the generated application if there were no errors
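The steps above can be sketched as follows. The sketch assumes the .NET Framework (or Mono); on .NET Core and later, CompileAssemblyFromSource throws PlatformNotSupportedException. The class name DynamicCompiler is made up, and building an in-memory library rather than an executable is a choice made for the example.

```csharp
using System;
using System.CodeDom.Compiler;
using System.Reflection;
using System.Text;
using Microsoft.CSharp;

public static class DynamicCompiler
{
    // Compiles C# source and returns the loaded assembly.
    // Throws InvalidOperationException if the compiler reports errors.
    public static Assembly Compile(string source, params string[] referencedDlls)
    {
        // Create the provider
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            // Set compiler options, including DLL references
            CompilerParameters options = new CompilerParameters();
            options.GenerateInMemory = true;     // load the result into this process
            options.GenerateExecutable = false;  // build a library, not an .exe
            options.ReferencedAssemblies.Add("System.dll");
            options.ReferencedAssemblies.AddRange(referencedDlls);

            // Compile
            CompilerResults results =
                provider.CompileAssemblyFromSource(options, source);

            // Check the results
            if (results.Errors.HasErrors)
            {
                StringBuilder message = new StringBuilder();
                foreach (CompilerError error in results.Errors)
                {
                    message.AppendLine(error.ToString());
                }
                throw new InvalidOperationException(message.ToString());
            }

            // The caller runs the code through reflection
            return results.CompiledAssembly;
        }
    }
}
```

The returned assembly can then be used through reflection, for instance asm.GetType("Probe").GetMethod("Answer").Invoke(null, null) for a static method Answer on a class Probe defined in the compiled source.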

SoapHttpClientProtocol and XmlSerializer

at 11:48 AM

I took it for granted that because a .NET web service client sends a SOAP request to invoke a web service method, the client side would use SoapFormatter to serialize the SOAP request parameters. Today I learned I was wrong.

Actually the client side uses XmlSerializer (instead of SoapFormatter) to serialize SOAP request content.

Why? Because SoapFormatter (or BinaryFormatter) serializes the complete state of an object, including all of its public and private fields, to a stream, while XmlSerializer only serializes public fields and public read/write properties. A web service is supposed to integrate separate applications on different platforms (.NET, Java, etc.), so it does not make sense to send private members to the other side.

If you want to send the whole object to the other side, you should use .NET remoting, where you can make use of SoapFormatter or BinaryFormatter.
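The XmlSerializer half of this is easy to check with a small class. The Customer and XmlDemo types below are hypothetical and exist only for the demonstration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical data class with one public and one private field.
public class Customer
{
    public string Name;        // public field: written to the XML
    private string _password;  // private field: ignored by XmlSerializer

    // XmlSerializer needs a public parameterless constructor.
    public Customer() { }

    public Customer(string name, string password)
    {
        Name = name;
        _password = password;
    }

    public bool CheckPassword(string candidate)
    {
        return _password == candidate;
    }
}

public static class XmlDemo
{
    // Serializes a customer to an XML string.
    public static string ToXml(Customer customer)
    {
        XmlSerializer serializer = new XmlSerializer(typeof(Customer));
        using (StringWriter writer = new StringWriter())
        {
            serializer.Serialize(writer, customer);
            return writer.ToString();
        }
    }
}
```

Calling XmlDemo.ToXml(new Customer("Ann", "secret")) produces XML with a Name element and no trace of the password, which is what a receiver on another platform would expect to see.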

What I learned at the Mid-Atlantic Code Camp in Reston

at 10:45 AM

Yesterday I attended the .NET Code Camp in Reston, VA. Before I went there, I had wondered how good it would be, because many good speakers had gone to TechEd 2006 in Boston. If this code camp had been scheduled a few days ahead of TechEd, maybe some TechEd speakers would have attended the code camp to practice their topics. "Anyway, I will go there to have a look on this beautiful Saturday", so I went there in the morning.

According to the session schedule, some good speakers were not there of course, but I did find several good speakers on the list! :)

The schedule included five tracks: Web track, Data track, Smart Client track, Miscellaneous track, and Security track. There was not much on WinFX (.NET 3.0). Below are the sessions I attended:

1) "Enterprise Library and Data Security": Gary Blatt was as humorous as ever. It's a pleasure to listen to him speak. He did not talk about Enterprise Library 2.0 though.

2) "Secure Click Once Smart Client Deployment": the talk by MS Regional Director Brian Noyes was full of technology that was exciting to me! :) He showed his broad knowledge of the .NET platform. Actually, it was the first time for me to see a real "Click Once" (or Click Twice) deployment.

In his speech, Brian showed the deployment and application manifest files for an assembly. Whenever an assembly is deployed, the new manifest files include a hash of their XML content, and the application manifest file also includes hashes of the DLL files. In this way, it is quite difficult for hackers to replace DLL files or to change manifest content.

On the user side, a Smart Client application is identified by its deployment server URL, so a user can run the same application deployed from a QA server and from a production server side by side.

It is impossible for me to write down everything I learned from this session. I will wait for his upcoming MSDN Online article about Secure Click Once Deployment.

3) "Refactoring: Why? When? How?": C# MVP Jonathan Cogley is also one of my favorite speakers (I attended two of his sessions :>). He did not prepare PowerPoint slides. Instead, he showed in Visual Studio how to make existing code better for maintenance and performance purposes using refactoring techniques (Rename, Extract Method, Move Method, Introduce Explaining Variable, etc.) and tools (e.g. ReSharper, NUnit). Watching a smart guy change code step by step is really a good learning experience! :)

4) "Web Applications Security: Greatest Hits": Jonathan Cogley demonstrated SQL Injection and Cross-Site Scripting attacks against an ASP.NET application, and how to change the code to prevent them. Some concepts were not new to me, but I still got some good hints.

For example, we may separate input pages that contain an HTML editor from other input pages. An HTML editor submits markup (possibly including JavaScript), which ASP.NET Request Validation would reject, so we have to disable Request Validation for that page. For all other input pages, we should keep ASP.NET Request Validation enabled to block script attacks.

5) "SQL Server Integration Services with Team Systems": Andy Leonard had planned to show Team System, but unfortunately his VPC died at that time. Instead, he showed us more exciting SSIS features. What surprised me in SSIS was its step-by-step debugging capability inside Visual Studio.

Andy mentioned that DBAs should be involved in the Software Development Life Cycle to make systems better. I totally agree with him. Nowadays, many systems are designed without DBAs, testers, or even developers being involved. How can they develop the system without misunderstandings?

6) "Building Ajax Style Applications using ASP.NET 2.0 and Atlas": MS Regional Director Vishwas Lele showed some cool features of Atlas. He made it clear that Atlas does not use the ASP.NET 2.0 Callback feature; instead, it uses a special HTTP handler to process JSON requests directly, without going through the whole ASP.NET page life cycle.

Overall, although I did not see topics about the latest .NET 3.0, I still learned a lot from this code camp and went back home happily! :)

.NET Application Server

at 9:10 AM

"Application Server" has been a common buzzword in the Java world for many years. A J2EE Application Server hosts Java applications and provides an environment for deployment, configuration, transactions, logging, session management, instance management, reporting, exception handling, load balancing, etc. What the application developer cares about is mainly the application's business logic.

An application server makes perfect sense in the Java world, because Java runs on different operating systems. It is necessary to build a common environment to host applications.

But in the .NET world, "Application Server" is not a well-known term. We do not need an application server in .NET, right? Yes and no.

Yes. The main reason is that .NET is built upon the Windows system, so .NET can use features provided by Windows (e.g. transactions, logging) directly.

No. We still need an integrated environment for enterprise software. That is the reason why MS Enterprise Library exists. But setting up and using that library is still complex for a normal developer. Compared with a Java Application Server environment, the .NET framework is not an integrated product for developers and project managers. The only application server I know of is Interactive Server. From my research on that product these days, it has some limitations.

So, can we make an application server product to host various .NET applications? Technically yes, although it is not easy. Maybe MS will build an application server later. I heard a rumor that IIS 7 is a kind of application server, but I doubt it. :)

Social Engineering, the USB Way

at 8:52 AM

The article "Social Engineering, the USB Way" amused and scared me a lot: when a credit union's employees happened to find USB keys (loaded with Trojan software), they happily picked them up and plugged them into company computers. Everybody likes free stuff! :) The Trojan software ran secretly on those computers and emailed the users' important data to the hackers. It is so easy to hack a system!

From that story and the discussions, I realized two important security problems people ignore:

1) Most people use an Administrator account for daily work.

For a financial company with customer SSNs, birth dates, and addresses in its database, it is quite important to train employees to use non-admin accounts, which limits to some degree what viruses and Trojan software can do. However, in reality, many people (even IT people!) in financial companies still do not feel that danger. They love the convenience of an Administrator account!

2) The Auto-Run feature may run malicious software automatically.

The Auto-Run feature is useful for playing music CDs (not from Sony!), but if hackers use it to install viruses or Trojan software, it becomes a nightmare for users.
