Thursday, June 6, 2013

Dumping Tomcat packets using tshark

Tshark (the command-line version of Wireshark) is a wonderful tool for dumping packets, and I recently used it on my Mac when I couldn't easily get Tomcat to log the HTTP traffic coming in on port 8080. Having used it in the past for lots of other reasons, I felt compelled to find a generic solution to this problem, where you would otherwise have to rely on application-level logging to determine why something does or doesn't work.

Here is the command I used (lo0 is the loopback interface, since I was running both the client and the server on the same machine):

tshark -f "tcp port 8080" -i lo0 -V

Here is a very good page on tshark that I am sure I will come back to again and again to get more juice out of this tool.

Getting Postman to test Restlet

Postman is a wonderful Chrome application for testing REST calls. However, it was a bit hard to get it to work against my Restlet server. Here are the gotchas I faced; I haven't yet figured them all out completely, but I love to record them so that when I come back much later they are still here ;)

1. Postman does not add the Content-Type header by itself. If you select an HTTP method that allows a body (e.g. POST), it lets you create the body and select the format (JSON/XML etc.), but you must remember to add the Content-Type header yourself.
2. If your application requires authentication, you can add that too. Postman supports Basic and Digest authentication as well as OAuth (which I would love to test out next).
3. The biggest problem I ran into was when I sent an XML or JSON body and the Restlet server replied with 415 - Unsupported Media Type. The request did not even reach my application code! If you write a client application using the Restlet framework, choose the media type MediaType.APPLICATION_XML, and the server-side method is annotated with @Post("txt:xml"), it works. However, when you set the Content-Type header in Postman to application/xml, it does not work. To debug this further, I installed Wireshark and dumped the packet contents. I was surprised to find that the client built with the Restlet framework was actually sending a Content-Type header of text/plain. This had to be some issue on my end. Interestingly, once I made the corresponding changes in Postman, the requests coming out of that application also started to work. These are the two headers I inserted. Note that a Content-Type of just text/plain still does not work; you must indicate the charset as well to make it work.

Content-Type: text/plain; charset=UTF-8
Accept: application/xml
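For reference, the same pair of headers can be set programmatically. Here is a minimal sketch in plain Java (the URL and path are hypothetical placeholders, and nothing is actually sent over the wire until the connection is opened):

```java
import java.net.HttpURLConnection;
import java.net.URL;

public class HeaderDemo {
    // Prepare a POST request carrying the two headers the Restlet server accepted.
    // openConnection() does not contact the server; no traffic is generated here.
    static HttpURLConnection buildRequest(String url) throws Exception {
        HttpURLConnection conn = (HttpURLConnection) new URL(url).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        // The charset suffix is the crucial part; a bare "text/plain" was rejected.
        conn.setRequestProperty("Content-Type", "text/plain; charset=UTF-8");
        conn.setRequestProperty("Accept", "application/xml");
        return conn;
    }

    public static void main(String[] args) throws Exception {
        HttpURLConnection conn = buildRequest("http://localhost:8080/myapp/resource");
        System.out.println(conn.getRequestProperty("Content-Type"));
        System.out.println(conn.getRequestProperty("Accept"));
    }
}
```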

Friday, May 24, 2013

Comparing Oracle NoSQL and MongoDB features

I recently had a chance to go through Oracle's documentation on its NoSQL solution in an effort to evaluate it against MongoDB. The comments below are based on Oracle's 11g Release 2 documentation (4/23/2013), while Mongo is at Release 2.4.3.

Oracle's deployment architecture for its NoSQL solution bears a strong resemblance to MongoDB's, with some restrictions that I feel will go away in subsequent releases. The whole system is deployed using a pre-decided number of replica sets, where each replica set may have a master and several slaves. The durability and consistency model also seems similar to what MongoDB offers, although the explanation and the controls seemed a lot more complex. One of the restrictions is that the user must decide up front how many partitions the whole system will have, and these partitions are then allocated amongst the shards. Mongo's concept of "chunks" seemed quite similar, but easier to use and understand.

One of the biggest issues is security: the system has almost no security at the user-access level, and there is no command-line shell to interact with it. The only API that can be used is Java. This is clearly not developer friendly right now.

Perhaps the most confusing part was the concept of JSON schemas used to save and retrieve values in the key-value database. Every time you save or retrieve a value, you have to specify a schema to serialize and de-serialize the data, and these schemas may have different versions that you need to track. Multiple schemas can be in use concurrently (e.g. each table could be using its own schema). What was confusing was why Oracle took this approach, and even granting their reasons, why it was not hidden under the hood so users don't have to deal with it. The boilerplate code that has to be written to constantly handle this simply looks unreadable, and no developer would find it fun to use this system.

I also noticed the absence of ad-hoc indexing in the system, something I have begun to appreciate in Mongo now.

Another odd feature is that when an update is made to a set of records sharing the same "major" key, the update can be made as a transaction. This is because records with the same major key always go to the same partition, which lives on a single replica set and hence on the one physical node acting as its master. This is one more thing that must be carefully considered by the developer before the schema is designed.

I found Mongo to be much more mature in terms of offering a truly flexible schema: the user sees records as full JSON documents, where each record in the same collection can potentially have different fields. In Oracle, if a field is in the schema but absent from a record, it must have a default value (very much like a relational database).

What did seem well thought through was the deployment architecture and the durability and consistency model, along with how they are controlled by the user on a per-operation basis. I am not aware of any big deployments using Oracle NoSQL yet, so it would be good to hear if there are any. I am also expecting a big revamp in the next release from Oracle to make it easier to use from a development standpoint.

Tuesday, April 9, 2013

AD Authentication with SVN

Frequently with SVN, you will want to integrate with Active Directory so that users can use their Windows login and password with SVN. Here is how you do it:

The Apache module to use in this case is mod_auth_sspi. Once you have enabled it, set up the SSPI configuration section in subversion.conf. The SSPIDomain directive should be set to the name of the domain you want to authenticate against.
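As a sketch, a typical SSPI section in subversion.conf might look like the following (the location, repository path, and domain name are placeholders for your own setup; SSPIOfferBasic lets non-browser clients such as the svn command line send Basic credentials that the module then checks against the domain):

```apache
<Location /svn>
  DAV svn
  SVNParentPath /var/svn

  AuthName "Subversion Repository"
  AuthType SSPI
  SSPIAuth On
  SSPIAuthoritative On
  SSPIDomain MYDOMAIN
  SSPIOfferBasic On
  Require valid-user
</Location>
```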

When a user logs into SVN, the user ID they type into the SVN authentication prompt is the Windows account name (without the domain), and the password is the domain password.

This may not be enough if you are integrating Bugzilla with SVN using SCMBug. Bugzilla has its own AD integration (which I personally have not used) with plenty of documentation around it. If you use it, just change the SCMBug configuration file to pass the user ID through from SVN to Bugzilla. I think that should work, but if you find some tweaking is necessary in SCMBug, please post it back in the comments section for this post. I will definitely appreciate it in my next gig!

Wednesday, April 3, 2013

Interesting case for var -> val conversion in Scala

While Scala supports both imperative and functional styles of programming, it recommends the latter. One of the challenges I faced was getting rid of a counter in the following BFS (breadth-first) style recursion (a mutable-to-immutable variable conversion). I won't go into specifics but rather look at the pattern to be used here:

def BFSCounter(money: Int, coins: List[Int]): Int = {
    var n = 0
    // termination checks
    for (/* loop dependent on the input parameters */) {
        n += BFSCounter( ... )
    }
    n
}

The above code launches multiple recursive calls in each invocation of BFSCounter, and the variable n, which maintains the count, must be incremented and returned.

The strategy for getting rid of the mutable variable is to pass the counter from one call of BFSCounter to the next. To do that we must also make the number of calls to BFSCounter a constant (independent of variables or input parameters). This means understanding the traversal pattern, the termination checks, etc. The final code looks like the following:

def BFSCounter2(money: Int, coins: List[Int], count: Int): Int = {
    // termination checks: return count
    val count1 = BFSCounter2( /* first sub-call, seeded with count */ )
    BFSCounter2( /* last sub-call, seeded with count1 */ )
}

Note that you may be able to reduce the code to just one call to BFSCounter2, or maybe you will need three calls. The point is that the number of calls must be fixed, and that the counter is passed into BFSCounter2 and passed back out as the return value. The last call's return value is the final return value of the function.
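To make the pattern concrete, here is a small sketch (in Java for brevity) of a hypothetical problem with this signature: counting the ways to make change for money from a list of coin denominations. The running count is threaded through the calls instead of living in a mutable variable, and the last call's return value is the answer:

```java
public class ChangeCounter {
    // Count the ways to make `money` using denominations coins[i..].
    // The counter is passed in, updated by each sub-call, and passed back out.
    static int bfsCounter2(int money, int[] coins, int i, int count) {
        if (money == 0) return count + 1;                  // one way found
        if (money < 0 || i == coins.length) return count;  // dead end
        int count1 = bfsCounter2(money - coins[i], coins, i, count); // use coin i
        return bfsCounter2(money, coins, i + 1, count1);             // skip coin i
    }

    public static void main(String[] args) {
        // 4 = 1+1+1+1 = 1+1+2 = 2+2, so there are three ways
        System.out.println(bfsCounter2(4, new int[]{1, 2}, 0, 0));
    }
}
```

Exactly two recursive calls per invocation, independent of the inputs, which is what makes the counter-threading possible.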