The Voyage Beyond the Cloud: October 2013

Sunday, October 6, 2013

The Design Strategy of Distributed System Based on HTTP World

Today's distributed systems are built on the internet and HTTP connections, which means private protocols other than TCP/IP are less supported and are not open standards. Above Layer 4 of the network model, HTTP and SOAP dominate the way data is transmitted among those distributed systems. Here I want to discuss several aspects of the technology design strategy in such systems. Before we expand the discussion, here is some terminology that lays out the system development.

Three-Tier System Architecture: In a traditional distributed system, a three-tier system has a UI Layer, an AP Layer and a Data Layer. In a Web-like distributed system, the UI Layer involves the browser and the web page container. The AP Layer might involve a lot of SOAP-like services. If necessary, we sometimes have to dig into TCP/IP connections to communicate with a legacy system whose OS has no HTTP support. The Data Layer is the database or file system on the legacy system.

Development Stack: The technology we use to develop the distributed system. Corresponding to the three-tier system, the web page container and SOAP-like service management rely on the server we choose, such as IIS, Tomcat, Jetty etc. You can see that the application server can determine the development framework and language - .NET or Java. Other popular application frameworks such as Rails, PHP and Django are often deployed behind the Apache server. These application frameworks each have a different Web UI rendering strategy, SOAP or RESTful management model and data persistence framework for you to adopt in your project.

The web socket server implementation on Python

Recently, the Java Applet sandbox was breached by hackers, which caused a lot of security incidents on the web. Therefore, the last sandbox left is the browser itself. (If there is no trustworthy sandbox, I don't know how to build a reliable infrastructure on the WWW.) I used to write TCP/IP sockets in applets for real-time applications on the web, so I have turned to a new standard which is still in draft at the W3C - Web Socket.

I have learned that there is a JavaScript client implementation for web sockets - socket.io. There are also several web socket server implementations, including one in Python. However, I found this implementation quite experimental. First, its base library is gevent-socketio, a web socket server implementation that depends on gevent, greenlet and libevent. Those three libraries are pretty nasty when I try to build a package environment that includes them. I first have to teach Python the compiler settings to build libevent so that gevent and greenlet can be installed successfully. But I gave up on this approach because my Windows machine is too clumsy and has no spare 10 GB for Visual Studio as a compiler. So I chose the web site http://www.lfd.uci.edu/~gohlke/pythonlibs/ which provides prebuilt binaries for Windows installation.

After installing gevent and greenlet, I can easily use pip to install gevent-websocket. (Of course you should install the easy_install tool first, which is still a binary installation from https://pypi.python.org/pypi/setuptools. After all, we are dealing with Windows, without apt-get.) The gevent-websocket package is the protocol implementation of the web socket standard and is pretty easy to install once you have easy_install set up.

So my Python interpreter can run the script without error now. It's time to try the Python web socket server. Since I can't afford a Mac or even a Chromebook, I think this is the only way for me to dive into Python's world.

Javascript Object-Oriented coding style and Binding technique for non-blocking context

Javascript is a powerful language. Since learning node.js, I have found that non-blocking Javascript coding brings some advantages to an interactive and agile development process. However, behind those advantages there are some tricks we should pay attention to while writing code.

Object-Oriented Javascript

Assume we have a data model called ClassA, with the code below:

var ClassA = function (client_name){
    this.name = client_name;
    //private: a local variable of the constructor, reachable only through closures
    var property1 = "this is the property 1 value of " + this.name;
    this.selfIntroduction = function (){
        return "Hello World! My name is " + this.name;
    };
    this._getProperty1 = function(){
        return property1;//resolves to the private variable above
    };
    this.getProperty1 = function(){
        return this.property1;//no such property on the object
    };

    //public: a property of the object itself
    this.property2 = "this is the property 2 value of " + client_name;
    this._getProperty2 = function(){
        return property2;//no local property2, so the outer variable is used
    };
    this.getProperty2 = function(){
        return this.property2;
    };
};

var property1 = "Are you sure this is the property1 you want?";//global variable
var property2 = "Are you sure this is the property2 you want?";//global variable
// allocate the object of ClassA
var obj = new ClassA("user1");

console.log("client name: " + obj.name);
console.log("client var p1: " + obj.property1);
console.log("client private _get p1: " + obj._getProperty1());
console.log("client public   get p1: " + obj.getProperty1());
console.log("client this p2: " + obj.property2);
console.log("client private _get p2: " + obj._getProperty2());
console.log("client public   get p2: " + obj.getProperty2());

Running the script gives this result:

client name: user1
client var p1: undefined
client private _get p1: this is the property 1 value of user1
client public   get p1: undefined
client this p2: this is the property 2 value of user1
client private _get p2: Are you sure this is the property2 you want?
client public   get p2: this is the property 2 value of user1

Here are some points that require our attention:
1. In ClassA, a "var" variable is a private variable of "obj". Only a function defined inside the constructor, such as the private _get method, can reach property1, and it must refer to it without the "this." prefix.
"obj.getProperty1()" is the wrong way to address property1. "this.property1" and "var property1" are completely different things.

2. Property2 is declared with the "this." prefix, which exposes property2 as a public variable.
We can use "obj.property2", or a public get method that uses the "this." prefix inside the function, to address property2.

From these two observations, we can clearly see the strict distinction between private and public variables in Javascript.

3. The trickiest part, which also shows the odd behavior of Object-Oriented Javascript, is "obj._getProperty2()" with its unexpected result from the global variable.
We get a response that would never happen while working in C++, Java or C#.

This brings us to two terms in Javascript - anonymous function and bubble-up scoping.
First, "this._getProperty2" is a public reference just like "this.property2". However, this public reference points to an anonymous function,
and when we pass "obj._getProperty2()" to "console.log", it behaves like an inline function such as:

console.log("client private _get p2: " + (function(){return property2;})());
//"(function(){return property2;})()" executes the anonymous function immediately after it has been defined.
//this technique is quite often used as a constructor for Javascript objects.

Apparently, no one has declared property2 in this line, neither the function nor "console.log". Only one place has declared property2 and given it memory space - the outermost scope of the script, i.e. the top level of the file that node runs.
Javascript behaves like HTML events here: the lookup bubbles up through the enclosing scopes searching for the variable reference.
Therefore, we never get "undefined" but instead silently get a totally wrong value, which is a nightmare for debugging.

To protect ourselves from this situation, we should always use "this." to refer to a public variable inside the class's public methods.
Even though class construction or encapsulation may require some private variables in our class, a well-documented class layout and "undefined" detection can avoid the
confusion from bubble-up scoping. For example, the public accesses "obj.property1" and "obj.getProperty1()" both show "undefined" consistently,
and they won't resolve "property1" in the global scope. What I mean is to prefer declaring variables with "var" in the constructor - "function ClassA" - wherever possible and only
expose a variable publicly through "this." when you really know what you are doing. On the other hand, access variables through "this." in the class's public methods wherever possible.
Once you want to access a private variable in a publicly referenced function, you had better know what you are doing.
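
The two kinds of lookup can be put side by side in a smaller listing. This is only a sketch: Widget, label and secret are hypothetical names chosen for illustration, not part of JavascriptOO.js. A bare name is resolved by walking outward through the enclosing scopes, while "this.something" is looked up on the object only.

```javascript
var label = "outer label"; // declared in the enclosing (outermost) scope

function Widget(name) {
    var secret = "secret of " + name; // private: visible only to closures created here
    this.label = "label of " + name;  // public: a property of the new object

    // Bare name: Widget has no local "label", so the lookup
    // falls through to the outer variable.
    this.bareLabel = function () { return label; };

    // Property access: resolved on the object, never on the scope chain.
    this.ownLabel = function () { return this.label; };

    // Closure over the private variable.
    this.getSecret = function () { return secret; };
}

var w = new Widget("w1");
console.log(w.bareLabel()); // outer label
console.log(w.ownLabel());  // label of w1
console.log(w.getSecret()); // secret of w1
console.log(w.secret);      // undefined
```

The first line of output is the surprising one: nothing warns us that the bare name was satisfied by a variable outside the class.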

Non-blocking Callback and Variable Resolution Issue

So far, we have covered some necessary knowledge of Object-Oriented Javascript. Next, we take the content further into multi-context management.
Suppose we have a function "main" which uses a non-blocking API for interactive and dynamic linking in Javascript.

function main(){
    var obj2 = new ClassA("user2");
    ClassA.prototype.callbackHandler = nonblock_callback;//dynamically link a function to obj2
    var socket = new fake_io();
    socket.send(obj2.callbackHandler);
    console.log("Has issued a request to server");
}
function nonblock_callback(event){
    console.log("callback has been triggered by " + event);
    console.log("this is the callback event handler of " + this.name);
}
function fake_io(){
    this.send = function(callback){
        setTimeout(function(){callback("my event from fake_io");}, 2000);
    };
}
main();//execute

The result is below; pay attention to the last line of the output:

Has issued a request to server
callback has been triggered by my event from fake_io
this is the callback event handler of undefined

Here we have applied the "prototype" technique of Javascript. It is the basic infrastructure for inheritance and overriding in Javascript.
(Sorry! I don't know how to do overloading. If you do, I would appreciate your sharing.)
Although we construct obj2 first, "ClassA.prototype" still gives us a way to change the class layout of obj2 by dynamically attaching the handler function "nonblock_callback".
(We could simply use "obj2.callbackHandler = nonblock_callback" to add a function to obj2 alone, but my purpose is to emphasize the ability of "prototype".
We must use prototype with discretion, especially in this scenario, where the user1 object "obj" is also affected by "ClassA.prototype".)
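
The difference between attaching to the prototype and attaching to a single object can be checked with a short sketch. Person, greet and farewell are hypothetical names used only for this illustration.

```javascript
function Person(name) {
    this.name = name;
}

var first = new Person("user1");
var second = new Person("user2");

// Attaching to the prototype affects every instance,
// including the ones constructed earlier.
Person.prototype.greet = function () {
    return "hello from " + this.name;
};

// Attaching to one object affects only that object.
second.farewell = function () {
    return "goodbye from " + this.name;
};

console.log(first.greet());         // hello from user1
console.log(second.greet());        // hello from user2
console.log(typeof first.farewell); // undefined
console.log(second.farewell());     // goodbye from user2
```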

After the delightful usage of "prototype", let us review the last line of the output. What happened to the handler of user2?
OK, let me rephrase the question: who owns your context when the callback is triggered? (Actually these are two different questions, but the last one is the real key.)
I will put this question aside and do a small experiment first. Put an extra line inside the "main()" function that declares a public property "this.name".
Execute main() again.

function main(){
    this.name = "Ha Ha! I am the Javascript devil.";
    ......
 ......
}

Well, you have seen what I mean: the last line now reports the name we set inside main(), not "user2". Since we are combining Object-Oriented Javascript with the non-blocking style,
this issue should be understood thoroughly by all programmers before implementation.
First, consider the inline function that setTimeout invokes after two seconds.

main(){
    this.name = "Ha Ha! I am the Javascript devil.";//without this line "this.name" would be undefined
    (function(){nonblock_callback("my......"){console.log......; ...... + this.name);}})();
}

Yes, in this whole context either this.name is undefined or it has a value whose owner is totally different from the one we expect.
Javascript has two APIs named "call" and "apply". They serve the same purpose: they let us supply the object reference that the callback function sees as "this".
For example, in fake_io, we change callback("my event from fake_io") into callback.apply(obj, ["my event from fake_io"]).
Therefore, the source code of fake_io becomes:

function fake_io(){
    this.send = function(callback){
        setTimeout(function(){
            callback.apply(obj,["my event from fake_io"]);
        }, 2000);
    };
}

Now "this.name" resolves to "user1" in the callback function, because "obj" is the user1 object declared at the top of the script. Hence, we can see the power of "call" and "apply".
(Please google them to learn the difference between these two APIs.)
But once we change the callback into callback.apply(obj2,["my event from fake_io"]), you might anticipate what problem exists in this code.
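
In short, the two differ only in how the arguments are handed over. A minimal sketch follows; report and receiver are hypothetical names for this illustration.

```javascript
function report(event, source) {
    return this.name + " got " + event + " from " + source;
}

var receiver = { name: "user1" };

// call: the arguments follow the receiver one by one.
var viaCall = report.call(receiver, "my event", "fake_io");

// apply: the arguments are passed as a single array (or array-like object).
var viaApply = report.apply(receiver, ["my event", "fake_io"]);

console.log(viaCall);  // user1 got my event from fake_io
console.log(viaApply); // user1 got my event from fake_io
```

"apply" is the convenient one when the arguments are already collected in an array or in the "arguments" object.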

/home/brianko/JavascriptOO.js:54
     callback.apply(obj2,["my event from fake_io"]);
     ^
ReferenceError: obj2 is not defined
    at Object._onTimeout (/home/brianko/JavascriptOO.js:54:6)
    at Timer.ontimeout (timers.js:85:39)

Yes, obj2 is a local variable of main(), so the function that setTimeout invokes inside fake_io has no visibility of obj2: the name is simply not in scope there (this is a scoping matter, not garbage collection).
We need a smarter way to do that, and this technique is the knack called Javascript binding.
First, at the very beginning of the code, we create a binding wrapper which is actually a Javascript function closure.

var callback_binding = function(obj, handler){
    var _self = obj;
    var _funcptr = handler;
    return function(){
        return _funcptr.apply(_self, arguments);
    };
};

Then, we change how fake_io is used in main().

function main(){
    .........
    ClassA.prototype.callbackHandler = nonblock_callback;//dynamically link a function to obj2
    var callback = callback_binding(obj2, obj2.callbackHandler);//binding; "new" is not needed, the wrapper just returns a function
    var socket = new fake_io();
    socket.send(callback);
    .........
}
function fake_io(){
    this.send = function(callback){
        setTimeout(function(){callback("my event from fake_io");}, 2000);
    };
}

The fake_io is the same as the original one. Frankly speaking, we don't have to dig into fake_io, which might be a component you bought from a third party.
From the documentation, we know that fake_io.send() is non-blocking, and we use the binding technique to wrap our callback function in main(), where we use fake_io.
Then execute the main() function, and your callback function will correctly resolve "this.name" as "user2".
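
ECMAScript 5 also ships a built-in, Function.prototype.bind, that produces the same kind of wrapper as callback_binding. The sketch below assumes an ES5 runtime such as node.js. Client, deliver and bindTo are hypothetical stand-ins for ClassA, fake_io.send and callback_binding, and deliver calls back synchronously only to keep the example short.

```javascript
function Client(name) {
    this.name = name;
}
Client.prototype.handler = function (event) {
    return event + " handled by " + this.name;
};

// Hypothetical stand-in for a third-party API that only knows
// it was given a function to call.
function deliver(callback) {
    return callback("my event");
}

// Hand-written wrapper, same shape as callback_binding in this post.
function bindTo(obj, fn) {
    return function () {
        return fn.apply(obj, arguments);
    };
}

var client = new Client("user2");

var manual = deliver(bindTo(client, client.handler));
var builtin = deliver(client.handler.bind(client));

console.log(manual);  // my event handled by user2
console.log(builtin); // my event handled by user2
```

Either form keeps the third-party component untouched: the receiver is fixed at the call site where the object is still in scope.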

While developing a massive project with object-oriented Javascript techniques, this is the knack for unleashing the non-blocking power of Javascript.
Have fun!
Below is our code in JavascriptOO.js for running on node.js.

var callback_binding = function(obj, handler){
    var _self = obj;
    var _funcptr = handler;
    return function(){
        return _funcptr.apply(_self, arguments);
    };
};

var property1 = "Are you sure this is the property1 you want?";
var property2 = "Are you sure this is the property2 you want?";

var ClassA = function (client_name){
    this.name = client_name;
    var property1 = "this is the property 1 value of " + this.name;
    this.selfIntroduction = function (){
        return "Hello World! My name is " + this.name;
    };
    this._getProperty1 = function(){
        return property1;
    };
    this.getProperty1 = function(){
        return this.property1;
    };

    this.property2 = "this is the property 2 value of " + client_name;
    this._getProperty2 = function(){
        return property2;
    };
    this.getProperty2 = function(){
        return this.property2;
    };
};

var obj = new ClassA("user1");
/*
console.log("client name: " + obj.name);
console.log("client var p1: " + obj.property1);
console.log("client private _get p1: " + obj._getProperty1());
console.log("client public   get p1: " + obj.getProperty1());
console.log("client this p2: " + obj.property2);
console.log("client private _get p2: " + obj._getProperty2());
console.log("client public   get p2: " + obj.getProperty2());

console.log("anonymous p2: " + (function(){return property2;})());
*/

function main(){
    this.name = "Ha Ha! I am the Javascript devil.";
    var obj2 = new ClassA("user2");
    ClassA.prototype.callbackHandler = nonblock_callback;//dynamically link a function to obj2
    var callback = callback_binding(obj2, obj2.callbackHandler);
    var socket = new fake_io();
    socket.send(callback);
    //socket.send(obj2.callbackHandler);
    console.log("Has issued a request to server");
}
function nonblock_callback(event){
    console.log("callback has been triggered by " + event);
    console.log("this is the callback event handler of " + this.name);
}
function fake_io(){
    this.send = function(callback){
        setTimeout(function(){
            callback("my event from fake_io");
        }, 2000);
    };
}
/*function fake_io(){
    this.name = "You better think about it";
    this.event = "my event from fake_io";
    this.send = function(callback){
        setTimeout(function(){callback(this.event);}, 2000);
 //callback(this.event);
    };
}*/
main();//execute

Composite Pattern on Web Service

In recent days, I have seen requirements for systems to be adapted to multiple devices other than the web, such as mobile phones, tablets etc. However, I found that some systems have their business logic embedded in web page rendering code. For an Enterprise Distributed System developer, there is always a question that other software programmers might not have to face - should I implement an Interface on a Web service?

People like you might think: what's the difference? Since you have decoupled the modules through an Interface, you could just reference the module from the web page or from any other project that consumes it as a client. But here is the thing. First, this Interface contains a composite pattern that provides many operations we want hidden from the web UI, which means the UI team doesn't have to understand the operation details. Second, and this is the crucial reason, the operations' specs change all the time, and we don't want publishing or delivering this module to impact the UI (they might have to recompile to adopt a new feature of business logic that has just been redesigned and reimplemented).
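
To make the idea concrete, here is a minimal sketch of a composite hidden behind one service entry point. Every name in it (Step, CompositeOperation, placeOrder, handleRequest) and the order-pricing logic are hypothetical and chosen only for illustration; a real service would sit behind an HTTP or SOAP endpoint rather than a plain function.

```javascript
// Leaf: one business operation.
function Step(name, fn) {
    this.name = name;
    this.run = fn; // fn(context) updates and returns the context
}

// Composite: exposes the same run(context) method as a leaf and
// delegates to its children in order.
function CompositeOperation(name) {
    this.name = name;
    this.children = [];
}
CompositeOperation.prototype.add = function (child) {
    this.children.push(child);
    return this; // allow chaining
};
CompositeOperation.prototype.run = function (context) {
    return this.children.reduce(function (ctx, child) {
        return child.run(ctx);
    }, context);
};

// The tree of operations stays on the service side.
var placeOrder = new CompositeOperation("placeOrder")
    .add(new Step("validate", function (ctx) {
        ctx.valid = ctx.quantity > 0;
        return ctx;
    }))
    .add(new CompositeOperation("pricing")
        .add(new Step("subtotal", function (ctx) {
            ctx.total = ctx.quantity * ctx.unitPrice;
            return ctx;
        }))
        .add(new Step("shipping", function (ctx) {
            ctx.total = ctx.total + 5;
            return ctx;
        })));

// The only thing a client sees: a request in, a plain result out.
function handleRequest(body) {
    var result = placeOrder.run({ quantity: body.quantity, unitPrice: body.unitPrice });
    return { valid: result.valid, total: result.total };
}

console.log(handleRequest({ quantity: 2, unitPrice: 10 }));
```

Steps can be added, removed or reordered inside placeOrder without the caller of handleRequest changing, as long as the request and result shapes stay the same.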

Hence, the Web Service-ized Composite Pattern gives us the benefit that we can expose a whole bunch of powerful services within the Enterprise Distributed System, and clients from other departments or business domains don't have to learn the details. Even more, as long as the interface hasn't been reorganized or restructured, the business logic inside can adopt new government regulations, company policies or business management methodologies without impacting the UI. However, things are not as simple as just moving the composite class module onto a web service, as I had thought. There are some points we have to consider:

1. Are all the operations thread-safe? Suppose your composite class can execute a complex business context. Then it might handle multiple transactions across databases, or use a vendor's library that manages some file system or a TCP/IP connection to an external legacy system. There are always surprises inside a black-box library, especially when the module deals with more and more business contexts as the system's features advance. Usually I use ajax to simulate simultaneous requests to a composite class that has just been implemented behind a web service. To trace any resource contention or transaction lock inside the class, we need a developer with substantial business knowledge, who is hard to recruit. You could wrap the composite class in a singleton pattern to manage the potential trouble of intrinsically multi-threaded web requests. However, this superficial solution brings us to the next question.

2. Would a laborious operation severely drag down the performance and stability of the Web Application Server? The web server cannot do everything for us when the operation is really laborious and time consuming. When you have a resident daemon in the OS that works on a laborious job, a web interface on this operation for monitoring, event dispatching and notification might not be a good design, because object management and scope would be handed over to the web server, not your daemon. The web server has its limits, and we shouldn't treat it like a dummy daemon with all the benefits of web access.

3. Do you have a good solution for Stateful Transaction Management over Stateless Connections? A business context might be long and require synchronization. If you use your composite class within an application that provides state management for the data model, then you should be prepared for the context cutting and state synchronization effort that will emerge in the future.

4. How do you handle Authentication and Authorization logic that relies on a Domain or on other client certificate techniques? These crucial features are resident throughout the whole business context. Separating out the composite class forces you to verify authentication and authorization again between the client library and the web service. For some enterprise systems, principal-deputy relationships, surrogate operations and audit regulations might be your major burden when you want to pull out some business context for flexibility and accessibility.
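
For the first question, the singleton idea can be sketched as one shared queue that lets only one request into the composite class at a time. This is an illustration under assumptions, not a recipe: SerialQueue and the two fake requests are hypothetical, and node.js itself is single-threaded, so what the queue prevents here is the interleaving of asynchronous operations. On a truly multi-threaded server such as IIS or Tomcat the same role would be played by a lock.

```javascript
// A single shared queue that runs one task at a time. Each task receives a
// "done" callback, and the next task starts only after done() is called.
var SerialQueue = (function () {
    var instance = null;

    function Queue() {
        this.pending = [];
        this.busy = false;
    }
    Queue.prototype.push = function (task) {
        this.pending.push(task);
        this._next();
    };
    Queue.prototype._next = function () {
        if (this.busy || this.pending.length === 0) {
            return;
        }
        this.busy = true;
        var self = this;
        var task = this.pending.shift();
        task(function done() {
            self.busy = false;
            self._next();
        });
    };

    return {
        // Singleton accessor: every caller shares the same queue.
        getInstance: function () {
            if (instance === null) {
                instance = new Queue();
            }
            return instance;
        }
    };
})();

// Two "requests" arrive close together; the second waits for the first.
var log = [];
var queue = SerialQueue.getInstance();
queue.push(function (done) {
    log.push("A start");
    setTimeout(function () { log.push("A end"); done(); }, 20);
});
queue.push(function (done) {
    log.push("B start");
    setTimeout(function () {
        log.push("B end");
        done();
        console.log(log.join(", ")); // A start, A end, B start, B end
    }, 10);
});
```

The cost is visible in the output: request B does not even start until A has finished, which is exactly why the second question about performance follows.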

Those issues should be considered at the beginning of system and architecture design. Unfortunately, many projects are implemented before anyone realizes where the composite patterns are. Even worse, the UI code and business logic have become so entangled that a lot of refactoring work is required before we can extract the composite class to a web service.

Cloud Foundry

IaaS has proved to be a workable service model. I have seen OpenStack build my Ubuntu virtual machine in a few minutes. Therefore I can easily create more resources for system development and testing.

SaaS is an emerging model for software vendors, who can charge their customers based on usage of the software rather than an installation licence. However, even with the support of IaaS, SaaS still faces huge maintenance costs for system configuration and backup strategy. Although IaaS brought us a better way to manage virtualization, we still have to install, configure and troubleshoot the system's essential utilities and frameworks such as the JVM, HTTP server, database etc. Hence, PaaS can fill this gap and bring SaaS vendors a more efficient development cycle, especially those vendors whose client base requires more customized functions.

Cloud Foundry is an open source PaaS provided by the most famous virtualization solution provider - VMware. Although Google has the Google App Engine for SaaS vendors, VMware's solution is really attractive to a company that owns its private cloud. Most companies that have applied virtualization technology can upgrade their infrastructure into IaaS smoothly with the assistance of adequate tools. However, it is really difficult for most companies to define or create a PaaS on their private cloud. Hence, most developers are still struggling with designing the platform building blocks, essential utility configuration, backup, deployment strategy and maintenance. The shortcoming of PaaS is the consolidation of the platform, which might not fit certain software really well. Under this circumstance, we might have to customize some configuration or assemble the essential utilities for special requirements. Nonetheless, the customized utilities can still be part of our PaaS, working together with the other standardized frameworks and utilities.

Dependency Management

In system development, it is quite usual for multiple members to join their projects together to work out a specific function for an upcoming demand on the system. In the past paradigm, which focused on the software and package aspects of the system, all dependencies were considered at the early design stage, and we already have a whole bunch of tools that help us control those "static binding dependencies" between different binary library files. The compiler is the first stop as the static binding dependency checker for our software and packages. Programming languages like C++ introduced the namespace concept into our practice and gave developers and IDEs a good foundation for managing static binding dependencies. A good static dependency design benefits the system's flexibility in the future. Therefore, there are so many useful design patterns about how to depict the hierarchy of classes. A good design pattern anticipates future change and coordinates libraries in a modest way.
However, as noted at the beginning of this post, static binding only solves the internal coupling issue of a set of binary libraries. There are more issues when we combine all those projects into a system. We can categorize those issues into two genres. One is the issue at early binding. It's pretty common that our program runs on a runtime like .NET or the JVM and in a container like IIS, Tomcat, WPF or even a Windows service. The runtime and container provide a variety of useful core libraries for our application. Hence, we quite often write configuration files for deploying our binary files. Those configuration files are the prime way of communicating with the runtime and container, telling them what kind of basic services we want those foundations to provide, such as authentication method, session control, type inclusion or resource location. Once the configuration conflicts with the binary files we've deployed or with the abilities of the runtime and container, the early binding activity raises an error at the initiation of the executable, when the operating system loads the whole application into memory. Executables such as w3wp and java.exe are the most common containers and runtimes, and they require strong background knowledge of their configuration or command line parameters. Early binding issues require a modern IDE or some troubleshooting skill to dig out when we try to deploy the application on a new machine.
The second genre of dependency management issue beyond static binding is late binding, which is the hardest part to handle without a reliable tool when we deploy a project we aren't familiar with. Late binding means that we might never be aware of a missing type or class implementation at deployment until an actual request or the running application goes through that particular section of our source code. The missing library or class is only initialized when the application asks for it. Those libraries or classes are dynamically loaded or reflected by the program when necessary, which means there might be no adequate audit or document that records these weak points of the deployment activity. The dynamic loading mechanism is a pretty advanced feature in modern system development. Although the blind spot in the deployment activity can jeopardize our online application, the flexibility and the replaceable binaries at runtime are still really attractive, and many developers adopt this mechanism in their system design.
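
One way to shrink that blind spot is a preflight check that resolves every dynamically loaded module at deployment time instead of waiting for the first request that needs it. The sketch below assumes node.js with its CommonJS require; the list lazyModules and the plugin name in it are hypothetical.

```javascript
// Hypothetical list of modules the application only loads on demand.
var lazyModules = ["fs", "path", "some-plugin-that-is-not-installed"];

// Try to resolve every name without executing the module.
// require.resolve throws when the module cannot be found.
function findMissingModules(names) {
    return names.filter(function (name) {
        try {
            require.resolve(name);
            return false;
        } catch (err) {
            return true;
        }
    });
}

var missing = findMissingModules(lazyModules);
if (missing.length > 0) {
    console.log("missing at deploy time: " + missing.join(", "));
}
```

The check only proves that a module can be located, not that it behaves correctly, so it complements rather than replaces exercising the late-bound code paths.
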
The three dependency management aspects mentioned above all concern dependencies inside the same process image, in a specific memory section that the operating system has allocated. But there is another dependency issue that is quite common in distributed system projects: the remote service dependency, such as a web service or client-server dependency. In the usual circumstance, the consumer (i.e. the client) requires a specific service from a non-existent or accidentally missing server. The situation arises from an oversight in the deployment activity. Like missing dynamically loaded libraries, sending a request to a vacant service entry raises the risk of an exception that damages our online application. However, this is due to the neglect of one specific service for the system, and the remedy is pretty easy if we have handled those exceptions neatly: just start the remote service. The trade-off between flexibility and integrity of deployment should not be skipped during system analysis. The balance of flexibility and integrity can also affect, and be affected by, the design of static binding. But this big question belongs to the domain knowledge of how we isolate the most frequently changed parts of certain applications. This philosophical perspective is beyond the scope of this article.
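
Handling the exception "neatly" for a remote dependency can be sketched as follows. The names callRemote, runningServer and missingServer are hypothetical, and the transports are fakes that call back immediately; a real transport would issue an HTTP or socket request and report errors through the same callback(err, response) convention that node.js uses.

```javascript
// transport(request, callback) is a hypothetical function that talks to the
// remote service and calls callback(err, response), node.js style.
function callRemote(transport, request, onResult) {
    transport(request, function (err, response) {
        if (err) {
            // The service entry is vacant: report it instead of crashing.
            onResult({ ok: false, reason: "service unavailable: " + err.message });
            return;
        }
        onResult({ ok: true, data: response });
    });
}

// Fake transports standing in for a running and a missing server.
function runningServer(request, callback) {
    callback(null, "echo " + request);
}
function missingServer(request, callback) {
    callback(new Error("connection refused"));
}

callRemote(runningServer, "ping", function (result) {
    console.log(result.ok, result.data);   // true echo ping
});
callRemote(missingServer, "ping", function (result) {
    console.log(result.ok, result.reason); // false service unavailable: connection refused
});
```

With the failure reduced to an ordinary result, the application stays up and the remedy remains what the paragraph above describes: start the missing service.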