Sunday, November 22, 2009

First experience with cloud computing (sort of)

Like many of you, I am sure, I have been bombarded with articles on cloud computing.
I start to believe there is something to a technology, beyond pure marketing hype, when IEEE Internet Computing dedicates a complete issue to it, as it did in October 2009.
So I decided to take a look and started using Google App Engine (GAE), both because it is free below certain quotas and because I had some early experience with the Google Web Toolkit, which GAE supports well.
GAE lets you run Java applications, essentially classic J2EE servlets, with a number of restrictions (for instance, you cannot write to the file system or open a socket).
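For those who have never seen one, the kind of application GAE hosts is just a plain servlet; here is a minimal sketch (the class name is illustrative):

    import java.io.IOException;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    // Minimal servlet of the kind GAE hosts; the class name is illustrative.
    public class Cob2XsdServlet extends HttpServlet {
        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            // No file writes, no sockets: the GAE sandbox forbids those.
            resp.setContentType("text/plain");
            resp.getWriter().println("COBOL structure to XML Schema translator is up");
        }
    }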
Since we recently restarted the COBOL structure to XML Schema project, it seemed like a good idea to deploy it as a service on the cloud. This way, developers who want a sense of what the product does can try it without spending time downloading and installing anything.
The result is now available and I almost immediately started to get hits... and problems.
The first problem, of course, is that the translator is not a validating COBOL parser; it is not a complete syntax checker. It is meant to process COBOL fragments that are expected to compile cleanly in the first place. And yet it is tempting to type COBOL statements into the input textarea of the HTML page starting at column 1 instead of column 8. Today you often end up with an empty XML Schema because the parser silently dropped everything it did not recognize.
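In the meantime, one crude server-side workaround would be to shift free-form input so that it starts at column 8; a simplistic sketch (it ignores the sequence area and the column-7 indicator):

    // Simplistic sketch: left-pad pasted source with 7 blanks so statements
    // start at column 8, as fixed-format COBOL expects. Real reference format
    // also uses columns 1-6 (sequence numbers) and column 7 (indicator),
    // which this ignores.
    public final class FreeFormPadder {
        static String padToColumn8(String source) {
            StringBuilder sb = new StringBuilder();
            for (String line : source.split("\r?\n")) {
                sb.append("       ").append(line).append('\n'); // 7 leading blanks
            }
            return sb.toString();
        }
    }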
I guess we'll have to add more syntax checks after all.
The second problem has to do with GAE itself and Java; it has been discussed extensively on the Google App Engine group. When GAE receives a request, it picks a JVM instance somewhere on the cloud to service it. Chances are that this instance was last used for something totally different from your application, so Google cold starts it, and that cold start consumes a lot of CPU... CPU that counts against your quota!
Hmm. That first experience has changed my view of cloud computing!

Wednesday, November 18, 2009

COBOL is weakly typed

Yes, whatever its proponents think, COBOL is far from an ideal programming language.
Try to push an integer into a Java String and you will be stopped at compile time. Now try something like "MOVE 15 TO A" where A is defined as PIC X(2): not only does the compiler let it through, it actually works at runtime. You end up with the EBCDIC representation of the characters '1' and '5' as the content of A.
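You can check what ends up in A from Java (Cp1047 is one common EBCDIC codepage; this assumes the JDK's extended charsets are available):

    import java.nio.charset.Charset;

    public class EbcdicDemo {
        public static void main(String[] args) {
            // After MOVE 15 TO A (A PIC X(2)), A holds the EBCDIC characters
            // '1' and '5', i.e. bytes 0xF1 and 0xF5 in codepage 1047.
            byte[] a = "15".getBytes(Charset.forName("Cp1047"));
            System.out.printf("0x%02X 0x%02X%n", a[0], a[1]); // prints 0xF1 0xF5
        }
    }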
Actually, you can move pretty much anything into a COBOL PIC X. You commonly do things like "MOVE LOW-VALUES TO A", which leaves A filled with binary zeros. This is particularly useful in CICS/BMS applications, where a field filled with low-values results in no data being sent to the screen, while a field filled with space characters increases the volume of data sent (and therefore uses up the once-precious network bandwidth).
This is a perennial source of problems for integration solutions, because the most natural mapping for a COBOL PIC X is a Java String. So you end up mapping a COBOL type that is not limited to characters onto a Java type that accepts only characters.
It is important that an alternative mapping be available: it is sometimes necessary to map a PIC X to a Java byte[] because the content cannot reliably be converted to characters.
When mapping to a Java String is mandatory (it usually is), it is also important that the low-level conversion routines strip non-character content coming from the COBOL program before the Java String is populated.
Conversely, non-character content sometimes needs to be inserted into the COBOL data. For instance, if a COBOL field must be filled with low-values (as opposed to space characters), the conversion routines should provide an option to do so. Both directions are sketched below.
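Here is a minimal sketch of both directions; this is the idea, not LegStar's actual code:

    import java.nio.charset.Charset;
    import java.util.Arrays;

    // Minimal sketch of both conversions (not LegStar's actual code),
    // using Cp1047 as the EBCDIC codepage.
    public final class PicXConverter {

        private static final Charset EBCDIC = Charset.forName("Cp1047");

        /** Host PIC X bytes to a Java String, dropping low-values (binary zeros). */
        static String toJavaString(byte[] hostBytes) {
            StringBuilder sb = new StringBuilder(hostBytes.length);
            for (byte b : hostBytes) {
                if (b != 0x00) {
                    sb.append(new String(new byte[] { b }, EBCDIC));
                }
            }
            return sb.toString();
        }

        /** Java String to fixed-length PIC X bytes, optionally padded with low-values. */
        static byte[] toHostBytes(String s, int picLength, boolean padLowValues) {
            byte[] out = new byte[picLength]; // Java arrays start out zero-filled
            if (!padLowValues) {
                Arrays.fill(out, (byte) 0x40); // 0x40 is the EBCDIC space
            }
            byte[] data = s.getBytes(EBCDIC);
            System.arraycopy(data, 0, out, 0, Math.min(data.length, picLength));
            return out;
        }
    }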
Keep in mind that the mainframe program might react differently to a field filled with low-values than to one filled with space characters!

Friday, November 13, 2009

A new COBOL structure to XML schema translator

I have posted an initial release of a new project called legstar-cob2xsd, which does COBOL structure to XML Schema translation (a quick usage sketch follows the list below).
There were several factors that led to this project:
  • The legstar-schemagen module, which does the job in LegStar today, was written in C with hand-coded parsing logic. I think this might have driven some users away. legstar-cob2xsd is pure Java and uses ANTLR grammars for COBOL structure parsing.
  • There is a clear need for a standalone open source COBOL to XML Schema utility, so it makes sense to isolate that feature in its own project. People who need just that functionality won't have to figure out how to extract it from the other LegStar modules.
  • legstar-schemagen systematically added LegStar JAXB-style annotations to the XML Schema it produced. These are still needed if you want to use LegStar for runtime transformations, but they are no longer the default. This means people can use the resulting XML Schema entirely outside of LegStar if they want.
  • The clear separation of the COBOL parsing logic from the XML Schema generation makes it much easier to add other targets, JSON schemas for instance.
  • Finally, the fact that legstar-cob2xsd is in Java allows the JUnit tests to be much more comprehensive (and they are!).
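Here is roughly what usage looks like; the class and method names below are simplified, not necessarily the exact API, so check the project site:

    // Illustrative only: names are simplified, not necessarily the exact
    // legstar-cob2xsd API. The idea: COBOL structure in, XML Schema out.
    String cobol =
          "       01 CUSTOMER-RECORD.\n"
        + "          05 CUSTOMER-ID    PIC 9(6).\n"
        + "          05 CUSTOMER-NAME  PIC X(30).\n";
    CobolStructureToXsd translator = new CobolStructureToXsd();
    String xmlSchema = translator.translate(cobol);
    System.out.println(xmlSchema);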

Tuesday, November 10, 2009

What does CICS equate to in the Java world?

It is always tough to explain exactly what CICS is to someone with a Java background.
One misconception I often hear is that it is similar to JBoss TS (Arjuna), the transaction manager in JBoss.
I think the issue is with the "transaction" moniker. In the Java world, a transaction is generally understood as a database transaction, something that has to do with coordinating resource updates.
In CICS, the meaning is different. If you monitor an active CICS system, what you see is a number of active transactions. CICS transactions are primarily scheduling units: a CICS transaction is associated with an initial executable program, which can in turn call other programs. CICS transactions get allotted chunks of memory for their programs to use, get CPU cycles when CICS decides, are authorized for a given user or not, and so on. That is why CICS is called a Transaction Monitor: what it does is monitor CICS transactions.
A CICS transaction might very well never update any resource; it would still be a CICS transaction. A CICS transaction can send a form to a screen or receive data entered at a keyboard. CICS transactions can save conversation contexts, pretty much like servlets do.
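To push the servlet analogy, saving context between screens in a pseudo-conversational CICS transaction looks a lot like this sketch:

    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;
    import javax.servlet.http.HttpSession;

    // Rough analogy: a servlet saving conversation state across requests,
    // much like a pseudo-conversational CICS transaction saving its context
    // (a COMMAREA, say) between screen interactions.
    public class InquiryServlet extends HttpServlet {
        @Override
        protected void doPost(HttpServletRequest req, HttpServletResponse resp) {
            HttpSession session = req.getSession(true);
            session.setAttribute("customerId", req.getParameter("customerId"));
            // ... build and send the next "screen" (HTML page) here ...
        }
    }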
Now, if a CICS transaction (or, to be more precise, some program it is associated with) does update a resource (say, a VSAM file), then the CICS transaction is also the default boundary from a database-transaction standpoint. This means that when the transaction ends normally, all updates are committed, but if the transaction fails (abends, in CICS jargon), all updates made since the start of the CICS transaction are backed out.
The CICS API also has explicit commit/rollback verbs (EXEC CICS SYNCPOINT and EXEC CICS SYNCPOINT ROLLBACK) to give programs finer control over the database transaction. From this standpoint, CICS is a local transaction manager. This is true for resources such as VSAM, but more complex systems, such as DB2 or IMS-DB, have transaction managers of their own. So who does the distributed transaction coordination that is needed when a CICS transaction updates both a VSAM file and a DB2 table?
Well, that would be RRS (Resource Recovery Services), a separate address space (process) in z/OS. That is the closest thing I can think of to JBoss TS in the z/OS world.
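For the Java-minded, here is a rough sketch of the equivalent picture, assuming a JTA-capable container (JBoss, say) exposing a UserTransaction in JNDI:

    import javax.naming.InitialContext;
    import javax.transaction.UserTransaction;

    // Rough Java-world equivalent: JTA (here driven by JBoss TS) coordinates
    // updates across resource managers, much like RRS coordinates VSAM and
    // DB2 updates made under a single CICS transaction.
    public class TwoResourceUpdate {
        public void update() throws Exception {
            UserTransaction tx = (UserTransaction) new InitialContext()
                    .lookup("java:comp/UserTransaction");
            tx.begin();
            try {
                // ... update resource 1 (a JDBC DataSource, say) ...
                // ... update resource 2 (a JMS queue, say) ...
                tx.commit();   // two-phase commit across both resource managers
            } catch (Exception e) {
                tx.rollback(); // the analog of a CICS abend backing out updates
                throw e;
            }
        }
    }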

Tuesday, November 3, 2009

When asked what the competition to LegStar is, I generally reply: no one, because to the best of my knowledge there are no other open source integration solutions for mainframes.
There are several commercial competitors, though. They all seem to gather in the same places; DataDirect and HostBridge are the only ones missing from this list.
Most of these vendors (Seagull perhaps to a lesser extent) are mainframe-centric. What I mean by that is that their products run natively on the mainframe, are priced and licensed accordingly, and are sold to mainframe-minded project leads. This is a closed circle with no place for open source: customers are not asking for it and vendors don't want to hear about it.
But out there, a growing population of Java-minded project leads and developers is taking on ever greater responsibilities in large IT departments. They often have to deal with COBOL, PL/I, CICS, or IMS applications as part of their projects.
This younger generation is a lot more responsive to open source. We are working for them.