September 27, 2015

RESTful software requirements specification (part 4 of 4)


State diagram: Defining the REST service interface


In this stage we design the server-side service contracts, based on the relations between our business models as discovered in the class diagram. We do so by defining the state transition that happens to each business model when one of the supported CRUD methods is executed. Remember that in REST wording, the association is:
Create: PUT
Read:   GET
Update: POST
Delete: DELETE
According to the microformats standard which we adopt here, there’s typically a GET with an id to fetch the one entity with the id provided, and a GET without an id to fetch all the entities.

There is a 1 : 1 relation between business model and service: One service per business model. Hence we also have to define a unique service URL for each business model service: Can it be reached directly or only via a relation on another business model? The entity relations modeled in the class diagram can help us discover this.
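For the example models, these URL decisions could be written down as a simple mapping. This is a sketch with hypothetical names, not code from the project:

```java
import java.util.Map;

public class ServiceUrls {
    // One service URL per business model: books are reachable directly,
    // while orders are only reachable via their owning user (see the
    // composition relation in the class diagram).
    static final Map<String, String> SERVICE_URLS = Map.of(
            "book",  "/books",
            "order", "/users/{userId}/orders");
}
```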

Here is a service definition for the example use cases:

We purposefully use a pseudo-RESTful syntax in these diagrams: HTTP verbs with RESTful URL patterns, response status codes, response content and metadata. This will greatly facilitate implementation, and it also demonstrates the high degree of self-documentation that REST offers. Also note that even though we limit ourselves here to the few state transitions implied by the use cases, we can already see similarities between all the resources. It’s of course best practice to search for such similarities and unify service contracts wherever possible; this is yet another thing we can directly translate into the actual implementation in order to stay DRY.

As you can see above, we covered the “get a book” functionality through the book resource itself. There is actually no need for an artificial “catalogue” pseudo-resource, as there is only one catalogue. Always try to think in independent “resources” when modeling state transfer, and ask yourself: “In which resource’s responsibility does this state transfer lie?” This will uncover any state machine inconsistencies.

Also, there is no “independent” /orders resource, as an order is always bound to a user. Our class diagram actually revealed this when we saw that the multiplicity on the “user” side of the “user to order” relation is at least 1; an order without a user cannot exist. In the diagram, we illustrated this intention by using a composition relation between these two entities. Thinking of relations as potential “navigation paths” helps you prevent implementing illogical or inconsistent service interfaces later on.

Again note that we try to unify the service interface for every resource whenever possible, using the same “syntax” for every resource to handle these CRUD methods.


We can translate this state diagram directly into a Java EE JAX-RS RESTful web service just by applying the appropriate Java annotations to a resource POJO.

Because we created a unified service contract for every resource, we can stay DRY (don’t repeat yourself) compliant in the code as well by defining the service contracts in one base resource only:
public abstract class BaseResource<T extends BaseHalModel> {

    @GET
    public List<T> findAll(@Context UriInfo uri) {
        return new ArrayList<>(getService().findAll());
    }

    @GET
    @Path("{id}")
    public T findById(@PathParam("id") Long id) {
        return getService().findById(id);
    }

    @PUT
    public Response save(@Context UriInfo uri, T entity) {
        entity.removeLinks(); // cleanup HAL metadata
        try {
            entity = getService().save(entity);
            entity.addLink("edit", uri.getPath() + entity.getId()); // add HAL metadata
            return Response.status(Response.Status.OK).entity(entity).build();
        } catch (ConstraintViolationException ex) {
            return ex.getResponse();
        }
    }

    @DELETE
    @Path("{id}")
    public Response delete(@PathParam("id") Long id) {
        try {
            getService().delete(id);
            return Response.status(Response.Status.NO_CONTENT).build();
        } catch (ConstraintViolationException ex) {
            return ex.getResponse();
        }
    }

    protected abstract BaseService<T> getService();
}
The actual implementation of the CRUD operations is not of interest from our REST service perspective. It would typically be implemented using an EntityManager which connects to a DB. For the demo implementation, I have mocked DB access with an application-scoped Map.
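Such a Map-backed mock could look roughly like this. This is a hypothetical sketch; the project’s actual service classes may differ:

```java
import java.util.Collection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// A minimal in-memory "DB": entities live in an application-scoped Map,
// with ids handed out by a simple sequence.
public class InMemoryStore<T> {
    private final Map<Long, T> entities = new ConcurrentHashMap<>();
    private final AtomicLong sequence = new AtomicLong();

    public Long save(T entity) {
        Long id = sequence.incrementAndGet();
        entities.put(id, entity);
        return id;
    }

    public T findById(Long id) {
        return entities.get(id);
    }

    public Collection<T> findAll() {
        return entities.values();
    }

    public void delete(Long id) {
        entities.remove(id);
    }
}
```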

For the @PUT operation, I use the HAL format to add “link” metadata to the HTTP response. A validation constraint violation will trigger a 400 response with the validation constraint messages in the JSON body. Note that in a real world application, you would rather send a message bundle key than a localized validation error String so that the client side can localize the message.

Apart from the individual root paths of each concrete resource, their implementation is literally empty:
public class OrderResource extends BaseResource<Order> {
    // implement getService() by dependency injection
}

Feel free to take a look at all of the resource implementation classes at the project’s GitHub repository.

Finally, we can now implement the REST consumer on the client side, i.e. we implement the actions triggered by button / link clicks or other user interactions using AngularJS with Restangular for REST interface consumption.

This, for instance, is the implementation of OrdersEditCtrl, which backs view no. 3:
app.controller("OrdersEditCtrl", function ($scope, $route, $routeParams, $location, Restangular) {
    $scope.initEntity = function() {
        if ($routeParams.id === "new") {
            // restangularize a new, empty order entity under its RESTful URL
            $scope.entity = Restangular.restangularizeElement(null, {}, "users/0/orders/");
            if (typeof $routeParams.bookId !== 'undefined') {
                $scope.entity.deliveryDate = moment().add(1, 'days').toDate();
                $scope.entity.book = Restangular.one("books", $routeParams.bookId).get().$object;
            }
        }
    };
    $scope.back = function() {
        $location.path("/books/" + $scope.entity.book.id);
    };
    $scope.execute = function() {
        $scope.entity.put().then(function(response) {
            $scope.errors = null;
            $location.path("/users/" + $routeParams.userId + "/orders/").search({created: response.id});
        }, function (response) {
            $scope.errors = response.data;
        });
    };
});
In initEntity(), the view is initialized with the information from the RESTful URL.

back() simply navigates back according to the information from the activity diagram and the UI mock-ups.

execute() actually does a PUT request to the server, triggering a navigation on success or rendering the error received from a non-2xx response. In the example use case for instance, this would be triggered by a Bean Validation constraint violation, e.g. an order delivery date which is not in the future.

Note that because retrieving a user is not actually a use case covered by this example application, I always work with a statically defined user with id 0.

I’ve created a blog post about AngularJS + Restangular interplay for REST service consumption, in case you’re interested in the technical details. Of course, the source code for all the other controllers is also part of the accompanying GitHub repository of this article.

That’s it. We have now inspected every important part of the specification and implementation of the example project.


With common enterprise software engineering best practices and architecture patterns such as RESTful contracts, object orientation, I18N and DRY built in right from the earliest specification stage, the resulting software product is clearly more likely to adhere to those principles as well. Rather than expecting the programmer to disentangle complex business requirements into a well-thought-out implementation under the pressure of everyday software development, programmer and business analyst work as a team to find the best possible technical solution for any business requirement in an early project phase.

Not only is overall software and documentation quality increased, but collaboration within the team will also tremendously profit. This of course is especially true in an agile development environment.

I never understood why requirements engineering and implementation should work separately as is often the case in “traditional” project structures. In this article, I wanted to show a way to incorporate some techniques which in my experience work well in software architecture and design into the software specification process as well. Of course, there may be many variations of this process other than the one I’ve illustrated here.

On a more technical note, I especially enjoyed the AngularJS / Restangular / JAX-RS interplay when working on the example project. With the basic architecture and specification laid out so clearly before implementation, it really felt like rapid application development with the UI building and styling, not technical or architectural problems, as the main time-consuming factor.

Please feel free to comment what you think of this article below and to share any experience in applying software development best practices to the requirements specification process.


RESTful software requirements specification (part 3 of 4)


UI mock-ups: Designing the UI


With all the information available now from the business model and the activity diagrams, we can easily model the UI with a mockup design tool.

Here are example mockups for all the UIs of this web application. Each mockup is accompanied by a short information section.

1. /books
Main entity: A List<Book> of all Books in the system

Label / Link / Button | Text EN               | Target URL    | Parameters
book.list.title       | Books                 |               |
[0]                   | (dynamic; see mockup) | 2. /books/:id | id:

2. /books/:id
Main entity: The Book with the id provided

Link / Button | Text EN  | Target URL | Parameters
book.title    | Book {0} |            |
order         | Order    | 3. /users/ | the currently logged in; the provided
back          | Back     | 2. /books  |

3. /users/
Main entity: A new instance of Order with book = the Book with the id provided

Link / Button | Text EN  | Target URL    | Parameters
purchase      | Purchase | 4. /users/    | the currently logged in; the of the newly created order
back          | Back     | 2. /books/:id | the provided

4. /users/
Main entity: A List<Order> of all Orders in the system for the User with the id provided

Link / Button    | Text EN                             | Target URL | Parameters
order.list.title | Orders for user {0}                 |            |
order.created    | New order {0} created successfully. |            |
delete           | Delete                              | 5. /users/ | the provided
X                | X                                   | 5. /users/ | the provided

5. /users/
Same as UI mask 4.

As you can see, we make heavy use of the information from the domain model as well as of I18N.

Because the business model’s properties’ labels have already been defined, we don’t need to repeat this information on the UI mockups. Thus, a simple “label” is enough. However, additional labels as well as links / buttons need to be internationalized as well. This is done in the accompanying details table. Labels which contain variables follow a simple bracket syntax.
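Incidentally, this {0} bracket syntax matches what java.text.MessageFormat expects on the server side, so such labels could be resolved with the standard library alone. This is a sketch, not the project’s actual I18N mechanism:

```java
import java.text.MessageFormat;

public class Labels {
    // Resolve a label pattern containing {0}-style variables,
    // as used in the mockup detail tables.
    static String resolve(String pattern, Object... args) {
        return MessageFormat.format(pattern, args);
    }
}
```

For example, resolving the pattern "New order {0} created successfully." with the argument 42 yields "New order 42 created successfully.".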

Having the labeling information defined once and once only diminishes ambiguity and makes it easier to apply changes later on. There is no more information duplication.

The additional tables also create the link between the masks and as such between the UI mockups and the activity diagram.

It’s important to have unique identifiers for the links / buttons. I have used a number ([0]) for the one link without a unique id.

Also note that there’s probably more UI design related information you should document. Most importantly, the overall UI design (including e.g. global navigation menus), the detailed default design for every input component (e.g. date pickers) and how error / information messages / input prompts are displayed. Having these things defined in a central place, rather than for each individual UI mask, greatly improves overall readability and matches the way DRY compliant software is built, namely based on reusable components. I haven’t defined these things explicitly here to keep the example brief.


With the information from these mockups and tables, it’s easy to implement the actual UIs using AngularJS-enhanced HTML markup. For instance, here is the implementation for mask no. 4.
    <h1>{{ 'order.list.title' | translate:user }}</h1>
    <div ng-show="errors != null" class="alert alert-danger" role="alert">
        <ul>
            <li ng-repeat="error in errors">{{error}}</li>
        </ul>
    </div>
    <div ng-show="created != null" class="alert alert-success" role="alert">
        <span data-translate="order.created" data-translate-value-id="{{created}}"></span>
        <a href="" ng-click="delete(created)" class="pull-right">{{ 'general.delete' | translate }}</a>
    </div>
    <ul>
        <li ng-repeat="entity in entities">
            {{entity.id}}. {{entity.book.title}} ({{entity.deliveryDate | date:'yyyy-MM-dd'}})
            <a href="" ng-click="delete(entity.id)">X</a>
        </li>
    </ul>
Note that the overall design is not defined for each individual page, but rather in a global template.

In order to keep things unified, the main entity of each mask is always initialized as entities (in case of a List) or entity (otherwise).

This will eventually be rendered like so:
Note that the RESTful URL is built as defined in the routing config earlier on.

Go to the project’s GitHub repository to see the source code of the other HTML pages.

Finally, don’t forget to add any additional I18N message keys to the bundle, for instance:
'order.list.title': 'Orders for user {{name}}',
'order.created': 'New order {{id}} created successfully.',
Note how AngularJS’ I18N facility supports variable replacement.


RESTful software requirements specification (part 2 of 4)


Class diagram: Designing the business models


The business models are the main entities in an object-oriented / REST-based architecture. Identifying them helps shift the “action” / “verb” centric use case view to an “object” / “noun” centric view. Typically, all we have to do is literally identify the nouns in the use case descriptions: these are candidates for business models.

Here is a possible class diagram:

We have also extracted the properties of each business model and modeled their relations. Note that it’s perfectly valid to start with a very raw model and flesh it out even during the implementation phase. At least, this is true for a truly agile approach, which is clearly to be preferred. For instance, during implementation we may find that we don’t actually need a “catalogue” object, or that it would be nice to have a “name” property for a “user”.

Also note that relationships between entities are labeled here. This will help us identify the resource URLs later on.

Based on the class diagram which really should serve as an overview, we can specify each business model and its properties in detail. For each property, we should at least specify:
  • its name as a unique identifier (in camelCase)
  • its real-world name, i.e. how it should be labeled and referred to in the UI (this is I18N built in)
  • its type
  • its default value, i.e. how it is presented to the user in a new mask
  • its validation constraints, i.e. what are valid values
For this example class diagram, we may come up with this definition:
Object | Property     | Text EN       | Type        | Default  | Constraints
User   | -            | user          | -           | -        | -
User   | name         | name          | String      | -        | not empty
User   | orders       | orders        | List<Order> | -        | -
Order  | -            | order         | -           | -        | -
Order  | deliveryDate | delivery date | Date        | tomorrow | not empty; must be in the future
Order  | book         | book          | Book        | -        | not empty
Book   | -            | book          | -           | -        | -
Book   | isbn         | ISBN          | String      | -        | not empty
Book   | title        | title         | String      | -        | not empty

This diagram is typically built in close collaboration between business analysts and programmers.

It contains all the information needed to implement the business model.


With the information from the class diagram and the business objects property details, we can build the server-side business models as POJO Java classes. Here, for instance, is the Order model:
public class Order extends BaseHalModel {

    @NotNull
    private Book book;

    @NotNull
    @Future
    private Date deliveryDate;

    // getters / setters
}
The source code for the other model classes is available on the project’s GitHub repository.

Note that validation constraints are included declaratively into the model using Bean Validation constraints.

On the client side, the business model only exists as a JSON object which gets automatically serialized / deserialized to / from the server.

We can, however, already define the I18N key mapping for localization, here using an AngularJS I18N module:
var translations = {
    '': 'Id',
    'book': 'Book',
    'book.isbn': 'ISBN',
    'book.title': 'Title',
    'order': 'Order',
    'order.deliveryDate': 'Delivery date',
A reasonable naming scheme, as shown in this example, is object.property.

Activity diagram: Designing the page-flow


Based on the information we got from the use cases and the business models, we can now split each use case into actual UI masks and define the page-flow between these masks.

We can actually design page flow and the respective UI masks in parallel, with the information of one fostering understanding of the other.

Here’s an activity diagram for the three example use cases:
This clearly is a borderline UML-compliant version of an activity diagram, but it contains a lot of schematized information which will help us build the UI.
  • The URLs are actual RESTful URLs of the actual UI masks. Because these URLs are RESTful, they already contain the actual status information required to initialize themselves. I have here assigned an arbitrary unique id number for each URL to make it easier to refer to it in subsequent specification steps.
  • The arrows symbolize links or buttons. I have marked links which actually trigger server-side write operations (PUT, POST, DELETE) with their respective HTTP verb. Other links are assumed to be read-only (GET). This information is optional, but it helps shaping the overall picture.
  • Variables are marked with a colon (:). Where reasonable, we use a naming scheme based on the business model, e.g. :book.id makes it clear that this variable is the value of a book’s id.
  • Apart from RESTful URLs, we can also use request parameters such as ?bookId=1.
At this stage, we do not define where the variable values come from. Although the simple yet precise notation makes it almost obvious for most variables, we will specify this later in the UI mockups.
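As an aside, this colon notation is also trivially machine-processable. A hypothetical helper (illustration only, not part of the project) could expand such URL templates:

```java
import java.util.Map;

public class UrlTemplates {
    // Replace every :variable in a colon-style URL template with its value.
    static String expand(String template, Map<String, ?> values) {
        String result = template;
        for (Map.Entry<String, ?> e : values.entrySet()) {
            result = result.replace(":" + e.getKey(), String.valueOf(e.getValue()));
        }
        return result;
    }
}
```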

Note that to keep this example brief, I have neglected error handling behavior, e.g. what navigation steps are triggered when a validation constraint violation occurs. Of course, this would be part of a real world page flow design as well.

Of course, it also makes sense to unify URL building wherever possible: Stick to a common schema for every resource. Here, I’m using the microformats standard definition.


At this stage, only client side work is required. Based on the information in the activity diagram, we can already build the routing (mapping URLs to actual HTML file resources). In AngularJS, this is done by configuring the $routeProvider:
app.config(function ($routeProvider) {
    $routeProvider
            .when('/books', {
                templateUrl: 'views/books/list.html',
                controller: 'BooksListCtrl'
            })
            .when('/books/:id', {
                templateUrl: 'views/books/show.html',
                controller: 'BooksShowCtrl'
            })
            .when('/users/:userId/orders', {
                templateUrl: 'views/orders/list.html',
                controller: 'OrdersListCtrl'
            })
            .when('/users/:userId/orders/:id', {
                templateUrl: 'views/orders/edit.html',
                controller: 'OrdersEditCtrl'
            })
            .otherwise({
                redirectTo: '/books'
            });
});
Conveniently, AngularJS’ routing URL syntax matches the colon style from our diagram; thus we can easily derive one from the other.


RESTful software requirements specification (part 1 of 4)

The quality of a software product largely depends on the quality of its underlying specification. In this article, I’d like to show how to build a software requirements specification with software engineering design and architecture best practices such as REST, object-orientation and DRY built in.

Finding a common engineering language

As a software programmer, I typically depend on receiving a high quality software requirements specification in a project where we have dedicated business analysts. However, experience shows that there are typically huge discrepancies between how requirements specifications are designed and what programmers expect as a working ground to build solid software.

In some of the more extreme cases, all I receive is a collection of GUI-mockups alongside some plaintext explanations regarding e.g. mask-to-mask navigation.

Let me make it very clear that this is not simply the requirements engineer’s fault but rather the result of failing to find a common language which both programmers and requirements engineers understand.

In this article, which is clearly written from a programmer’s perspective, I’d like to show how to apply software engineering design principles already at the requirements specification stage, which, as a result, produces a high-quality software specification written in the language programmers understand: precise, detailed, well-defined, and in accordance with engineering best practices such as component reuse and DRY (don’t repeat yourself).

The building blocks

The main underlying architectural principles for this kind of common software design I’d like to suggest are object-orientation and REST (and I18N principles), for these reasons:
  • Object-orientation means thinking in business objects, actors, “nouns”. This is also known as “domain-driven design”. It mimics real-world thinking and is most typically the main underlying principle of the programming language in use, thus easy to implement.
  • REST is an architecture principle which reduces service interfaces (as in e.g. web client – backend server communication) to CRUD (create / read / update / delete) operations on business objects, actors, “nouns”. That’s right, it’s a perfect match for object orientation, and it is widely adopted in many software frameworks, thus easy to implement. I’ve written an entire blog post about REST basics earlier this year.
  • I18N, or internationalization / localization, typically means preparing the software to work with different UI languages. Instead of “hard-coding” output texts, we use text ids which are mapped to the actual translation, based on the language chosen at runtime. This greatly increases software maintainability, and it is built into most software frameworks, thus easy to implement.
These architecture principles match any typical web client/server application well.

Even if the software product is not based on these principles, applying them at the specification stage offers many of the advantages discussed presently.

To apply these principles, we build the actual software specification by using four types of commonly used UML diagrams plus UI mockups. Ideally, there is only little need for plain texts. For each aspect of the system (e.g. for each use case), we create the artifacts presented below.

In order to specify the server side implementation:
  • First of all, using the class diagram, we identify the participant business objects, their properties and their relation to each other. This diagram should be accompanied by a table with a definition of every property of every business model. We can later on basically generate our source code model classes (and, subsequently, our DB tables) from the diagram and the business properties table.
  • After that, we draw a state diagram for each business model identified in the previous diagram. Using these state diagrams, we identify the state transitions that occur within each of our classes, now viewed as “resources”. We can later on basically generate our RESTful service implementation from these diagrams.
In order to specify the client side implementation:
  • At first, we draw a use case diagram for the actual use case (a.k.a. user story). Although drawing a use case diagram is optional, it helps visualizing the information written down in the actual use case. The use case diagram is different from the other diagrams in that typically, we only draw one diagram for all use cases of the project / sub-system.
  • The class diagram (see above) is also needed for the client-side implementation.
  • Then, we draw an activity diagram illustrating the use case previously identified in more detail. We can later on use these diagrams to determine the detailed page flow for every user interaction in the UI.
  • Finally, we design the actual UI mockups, typically one for each activity previously identified.
It’s important to stick to the order of these design steps for server and client specification, respectively. However, depending on manpower or other factors, server and client implementations may start either in parallel or one after the other.

In an agile development environment (e.g. Scrum), these design steps will typically be followed immediately by the actual software implementation which are both part of the same sprint. This reduces the risks a loss of focus would bring and enables fast feedback cycles. Whilst this approach is clearly preferable for many reasons, the method explained in this article also works in “traditional”, more waterfall-like projects.

We use well-known UML diagram types only to facilitate common understanding. However, we will use them in a very pragmatic way. It’s more important to convey the important information through the diagram than trying to artificially stick closely to a diagram type’s formal definition.

I do also advocate pragmatic tool usage: We really want to use tools which are lightweight and easy to use with a minimum amount of functionality. Hence, I use the open source tool UMLet to draw UML diagrams and the open source tool Pencil to draw UI mock-ups.

Let’s build it

Enough now of the theoretical part! Let’s create an example application from scratch using this specification methodology. In the following sections, I will run through each stage of software specification, as explained above, for a bunch of example use cases, and for each stage, immediately show how its design would be reflected in the actual software source code. Note that in reality, implementing the source code would rather be the last step for each use case, after the design is done.

For the understanding of this article, it’s not important to actually understand the source code examples. Feel free to skip them if you’re only interested in the software specification view.

For the sake of this example, I use this very pragmatic web application tech stack:
  • AngularJS with Restangular for the client because there’s hardly any easier way to build a modern REST-consuming web client.
  • Java EE 7 JAX-RS for the REST server because there’s hardly any easier way to build a modern REST-producing web server.
The entire application source code is available on the accompanying GitHub repository.

In the following sections, specification work is marked in yellow; client implementation is marked in light blue and server implementation is marked in darker blue.

Actually, the example application will use REST on two layers:
  • The AngularJS single page application will feature RESTful URLs for user-friendly navigation.
  • The Java EE 7 web-service is implemented as a true RESTful server.
The example application should implement these three use cases based on an artificial “online book store”:
  1. The user can open a page with detail information (ISBN, title) about a book in the catalogue.
  2. After choosing a book, the user can order it, specifying the desired delivery date.
  3. The user can cancel his order at any time before delivery date.

For your reference, this is what the final application will look like (screenshots of the four masks, identified by their URLs):
  • #/books
  • #/books/1?bookId=1
  • #/users/0/orders/new?bookId=1
  • #/users/0/orders?created=1

Use case diagram: Visualizing the information


Creating a use case diagram usually means identifying the “actors” and assigning them the use cases. Typically, there’s a single actor “user”, or one actor per user type (admin, normal user,…).

Here is a possible outcome after drawing all the use cases of the example application into a single diagram:

We only have one user here. We use “extends” to show that ordering a book implies having opened the book details in a previous step.


There is no implementation at this stage.


September 20, 2015

Groovy by example: XML / HTML transformation (part 2 of 2)


Transforming XML (continued)

Example: Flip an HTML table by 90°

I have built the example transformation function in two parts:
protected static String flipTable(def html) {
    // 1. Implement row / column flip
    // 2. Build new table
}
For simpler transformation tasks, combining these two steps into the actual builder may be preferable.

Transforming the structure

Here is the first part of the method:
Writer writer = new StringWriter()
MarkupBuilder xmlBuilder = new MarkupBuilder(writer)

Map rows = [:].withDefault{[]}

// 1. Implement row / column flip
html.body.table.tr.eachWithIndex { _tr, _tr_i ->
    // for each row
    _tr.'*'.eachWithIndex { cell, cellNo ->
        // for each cell in row
        if (rows[cellNo][_tr_i] == null) {
            rows[cellNo][_tr_i] = cell
        }
        else {
            // if its place is already occupied that means that it was filled by a previous cell
            // with colspan/rowspan > 1. Search for the new free place in the same row
            rows[searchNextFreeSpace(rows, cellNo, _tr_i)][_tr_i] = cell
        }
        int rowspan = (cell.@rowspan ?: '1').toInteger()
        if (rowspan > 1) {
            (rowspan-1).times { i ->
                // mark all cells consumed by the ROWspan as "occupied" by inserting a value there
                rows[cellNo] << false
            }
        }
        int colspan = (cell.@colspan ?: '1').toInteger()
        if (colspan > 1) {
            (colspan-1).times { i ->
                // mark all cells consumed by the COLspan as "occupied" by inserting a value there
                rows[cellNo+i+1] << false
                if (rowspan > 1) {
                    (rowspan-1).times {
                        // mark all cells consumed by the ROWspan as "occupied"
                        // by inserting a value there
                        rows[cellNo+i+1] << false
                    }
                }
            }
        }
    }
}
The first part is the actual “row to column flip” implementation which actually isn’t that interesting from our XML-centric point of view. The important thing is that it works with the originally parsed node tree and returns a data structure which still contains the original nodes, although newly arranged.

The code builds a map of all output rows, with a row number mapped to the list of cells it will contain. It walks through the input table, row by row, and cell by cell, and inserts these into the new data structure at their appropriate position. The tricky part which takes up most of the code lines is handling cells with colspan > 1 and rowspan > 1. These will mark occupied neighbor cells with a boolean flag, and in subsequent runs, whenever a target cell is already occupied, the next free cell is checked and used if empty, jumping to the next row / column (this is implemented in the searchNextFreeSpace(…) method).
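Stripped of the colspan/rowspan bookkeeping, the core idea is a plain transpose of a possibly ragged grid. A minimal Java sketch of just that idea (hypothetical, and much simpler than the Groovy original):

```java
import java.util.ArrayList;
import java.util.List;

public class GridFlip {
    // Transpose a possibly ragged grid; missing cells become null,
    // mirroring the "insert an empty cell" handling described above.
    static <T> List<List<T>> transpose(List<List<T>> grid) {
        int width = grid.stream().mapToInt(r -> r.size()).max().orElse(0);
        List<List<T>> result = new ArrayList<>();
        for (int col = 0; col < width; col++) {
            List<T> newRow = new ArrayList<>();
            for (List<T> row : grid) {
                newRow.add(col < row.size() ? row.get(col) : null);
            }
            result.add(newRow);
        }
        return result;
    }
}
```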

Rebuilding the structure

Here is the second part of the method:
// 2. Build new table
xmlBuilder.table(html.body.table[0].attributes()) {
    tbody {
        rows.each { rowIndex, tuples ->
            tr {
                tuples.each { tuple ->
                    if (tuple == null) {
                        // this cell was originally not present. Insert an empty one
                        td()
                    }
                    else if (tuple in Boolean) {
                        // this cell is marked as "jump over" due to previous colspan/rowspan > 1
                    }
                    else {
                        // insert the whole XML tree
                        copy(flipCell(tuple, rowIndex), xmlBuilder)
                    }
                }
            }
        }
    }
}

return writer.toString()
  • Using the HTML builder, build a table just as in the previous example (no need for any parent structures), preserving the original table’s attributes.
  • Inside the table, place a tbody node (statically).
  • The next code line is a Groovy loop, not a builder invocation: iterate over the previously created rows, and then, for each row: build a tr structure.
  • Then iterate over the tuples of the row and build the td / th element accordingly.
  • Because of the way HTML tables work, a row may actually contain fewer cells than it’s supposed to. Browser rendering will then just jump to the next row. In the example table, this is the case in the “Indonesia” row. But because we switched rows and columns, this would not work anymore, as omitting a cell would cause a column shift. So we detect null cells and just insert an empty cell.
  • If the cell is a Boolean, it’s simply a leftover from the cells previously marked as shifted by colspan / rowspan. Simply jump over them.
  • Otherwise, apply additional local transformations to the cell and insert its entire sub-tree into the structure. The cell transformation is done in the flipCell(…) method we will inspect presently; recursively copying the node plus all its sub-nodes into the tree of course is effectuated using our previously prepared copy(…) method.
flipCell(…) is a very use-case-specific modification of the node: we have to turn former colspan attributes into rowspan attributes and vice versa, and do other similar cleanup work:
private static Node flipCell(originalNode, int index) {
    def attributes = originalNode.attributes().collectEntries { attr ->
        if (attr.key == 'colspan') {
            return [(attr.key): originalNode.attributes().rowspan]
        }
        else if (attr.key == 'rowspan') {
            return [(attr.key): originalNode.attributes().colspan]
        }
        else {
            return attr
        }
    }
    // change td nodes to th nodes if they are in the first row
    QName name = new QName(originalNode.name().namespaceURI, index == 0 ? "th" : originalNode.name().localPart)
    return new Node(originalNode.parent(), name, attributes, originalNode.value())
}
Again, the internals of this method are not really interesting from an XML transformation point of view. The method returns a new Node which copies the information from the original node provided (including all of its children), except that it swaps the colspan and rowspan attributes, plus it changes cells into header cells if they’re now in the first row. Note that this latter adjustment also means that the entire operation is not strictly reversible (because there’s no th to td transformation).

Even for the main use case of flipping a table, there are many ways to implement the actual logic, and the code listings shown here certainly do not represent an optimal solution; they are really just an illustrative example.

Printing XML

As mentioned previously, with a new MarkupBuilder(StringWriter), you get pretty-printing of the resulting XML markup for free.

Printing HTML

However, when using HTML as webpage markup, there’s a well-known problem which especially applies to pretty-printed markup: because inner text may collide with the formatting whitespace, the browser may interpret that additional whitespace as significant, to-be-rendered whitespace. This problem is discussed at length in this stackoverflow thread.

I have also incorporated one of the solutions offered by this stackoverflow thread, namely, inserting comments at the critical areas of the markup. I do so with a plain String regex in the formatHtml(String) method:
protected static String formatHtml(String html) {
    return html.replaceAll($/([^>])\n(\s*)</$, { all, lineEnd, space ->
        // text, followed by EOL, followed by a tag:
        // insert a comment between EOL and tag so the browser ignores the whitespace
        "${lineEnd}<!--\n${space}--><"
    })
}


XML processing is actually one of my favorite applications for the Groovy programming language. It really shows the versatility of Groovy, being both a full-fledged general-purpose programming language and a quick and handy tool for everyday tasks. It serves me, as a Java programmer, as a powerful tool to tackle the otherwise cumbersome task of XML processing in a very concise way on well-known Java terrain.

A major hurdle for this example implementation, in my view, was the scarcity of information available about Groovy’s XML parsing / building facilities. Although there is quite a lot of basic information on the official website alone, these examples don’t really convey the information necessary to tackle non-trivial real-world XML transformation tasks. I really hope that this article will help you if you’re stuck with Groovy XML / HTML processing and gives you new ideas of how to handle a problem in the process.

Again, the complete source code of this example implementation is available on GitHub.

Please let me know in the comments section whether this article was helpful for you or if it lacks any important information.


Groovy by example: XML / HTML transformation (part 1 of 2)

In this blog post, I present a complete example of using Groovy for XML / HTML transformation, because I think that other, simpler examples available on the web don’t show best practices and common pitfalls as clearly as a more in-depth example does.

Note: This article does not cover XML namespace awareness. Please refer to other online resources if you are interested in this topic.

By example!

As an example use case, we will actually transform an HTML <table>.

We will flip it by 90 degrees, i.e. turn its columns into rows and vice versa, which I think is both an interesting and a generally handy example. I’ve randomly chosen the OPEC members table from Wikipedia for this.

We will transform this:

into this:
(Without surrounding elements.)

Note that the entire source code of this example is available at the accompanying GitHub repository. You may want to open its code in a separate browser window for quick reference.

Let me start off with some Groovy XML processing basics.

Parsing XML

Parsing XML is nicely explained in Groovy’s official documentation, hence I will not repeat it here.

It’s important to note that there are two facilities for XML parsing, groovy.util.XmlParser and groovy.util.XmlSlurper, and their APIs to access information on parsed XML nodes differ. This is nicely shown in this code example by mrhaki. For the example code, I will use XmlParser, which fits HTML parsing better than XmlSlurper because it has better support for text child nodes. We also don’t need the lazy evaluation feature provided by XmlSlurper in this example.

Note that the example will not work with XmlSlurper!

Parsing HTML

XmlParser expects well-formed XML, thus we must transform the (potentially not well-formed) HTML into well-formed XML first. This issue is addressed by this stackoverflow question.

Bottom line: You have to use an additional 3rd party parser. I’ve always used the tagsoup parser for this, and it has always done a good job.
Note: The example code may not work with any other HTML sanitizer.

The complete code to parse HTML text into a Groovy node structure is then:
import org.ccil.cowan.tagsoup.Parser

Parser parser = new Parser()
def html = new XmlParser(parser).parseText(HTML)
It’s important to note that in the case of “sanitized” HTML, the content is implicitly wrapped in an <html><body> structure; hence, the root node returned by the XmlParser is the <html> node.

Writing XML

In Groovy, you can build XML (as many other tree-like structures) from scratch by so-called builders. Groovy ships with two XML builder implementations: groovy.xml.MarkupBuilder and groovy.xml.StreamingMarkupBuilder; they are both briefly covered by the official documentation. The latter has better support for namespaces, but as we ignore them in this article, we will use MarkupBuilder here. Note that annoyingly, they differ in their API, which is again nicely illustrated on mrhaki’s blog.

Note that the example will not work with StreamingMarkupBuilder!

As every Groovy builder extends BuilderSupport, we can use the same techniques and best practices for every builder, whether it builds an XML structure or any other nested structure. I have already discussed many best practices in an earlier article on custom builders.

So, to build basic HTML, you build the structure literally:
private static String buildSomeHtml() {
    Writer writer = new StringWriter()
    MarkupBuilder xmlBuilder = new MarkupBuilder(writer)
    xmlBuilder.html {
        body {
            div(id: 'myDiv') {
                mkp.yield('My Text')
            }
        }
    }
    return writer.toString()
}
A few notes about working with MarkupBuilder:
  • To print the output, initialize the builder with a StringWriter and use its #toString() function after building the structure. This will in fact pretty-print the XML outcome. (Note that the printing API for StreamingMarkupBuilder works completely differently!)
  • MarkupBuilder has an implicit reference mkp to MarkupBuilderHelper which you can use to e.g. insert XML comments and inline text.

Manipulating XML

Both Groovy XmlParser and XmlSlurper provide methods to directly manipulate existing XML on-the-fly as you would do with a DOM or a SAX parser. This is covered by the official documentation.

However, the code might get messy quite easily if you try to restructure major parts of the original document. You can then literally get lost in your document structure!

For our example where we really want to flip an entire table structure over, I will here present a different approach, combining XML parsing and writing:

Transforming XML

This is realized here as the combination of parsing the original document in a node structure and then building a new node structure from these original parts.

This is really straightforward. For instance, let’s assume that we want to build an HTML table with the same attributes as an existing HTML table node.

Remember that the general syntax for building a tree node is (in pseudocode)
builder.nodeName(attributesMap) {
    child1(…) {
        …
    }
    child2(…) {
        …
    }
}
In the following simple example, let’s assume xmlBuilder is our instance of MarkupBuilder, and html is the root node of an HTML document parsed with an XmlParser:
xmlBuilder.table(html.body.table[0].attributes())
This will build an HTML table with the attributes copied from the previously parsed original HTML table. Because we can use arbitrary Groovy code within the builder, we can manipulate the original nodes in whatever fashion we wish before using them in the new builder.

However, adding a node and all of its child nodes into a new builder is a bit tricky. This calls for a dynamic, recursive solution. Let’s build a general-purpose deep node-to-node identity transformation function!

A dynamic XML builder

This method will dynamically insert the node provided, including its complete sub-structure, into the builder provided:
protected static copy(node, builder) {
    if (node in String) {
        // inner text: simply yield it
        builder.mkp.yield(node)
    }
    else {
        builder."${node.name().localPart}"(node.attributes()) {
            node.children().each { child ->
                copy(child, builder)
            }
        }
    }
}
Here’s how it works:
  • The node can be a simple String. This is the case for HTML inner texts. In this case, simply yield the text.
  • Else, add a node with the name dynamically built from the original node’s local name, i.e. the name without the namespace part.
  • Copy its attributes.
  • For all its children, call this function recursively.
This really is the identity transformation of a node tree. When called with the originally parsed node tree, it will simply reproduce that node tree, and is admittedly useless as such.

Still, it will serve as a key ingredient of our example tree transformation function which is explained on the next page.


September 6, 2015

The JavaScript / TypeScript / CoffeeScript / Dart choice 2015 (part 2 of 2)



Based on this comparison, I’d like to present here the results of my personal evaluation for a JavaScript language implementation choice. Please note that I’m not a senior JavaScript developer, and I don’t have much experience with these languages. My opinion really mainly comes from reasoning about their features and the opinions of other users of the languages.

I encourage you to make your own evaluation and come to your own conclusions.

ECMAScript 5

Of course, ECMAScript 5 is just ECMAScript 5. It’s the fallback for whenever a more useful, efficient, concise, higher-level abstraction / language is not supported by the execution platform, which nowadays is the case for client-side programming on the browser for all the other languages discussed here.

But it’s no more than that. It is antiquated, tedious, sketchy and overall just a language you don’t want to work in anymore. Never use this language again, always use one of the higher-level alternatives presented here, and automatically transcompile if needed. This will ensure your code is expressive, concise, optimized and maintainable.

Of course, for very simple tasks such as small JavaScript snippets within an HTML page, doing some DOM manipulation with jQuery, it would be overkill to introduce another language for features you most likely don’t even need. This simplistic case is not addressed by this article.

Bottom line: No.


CoffeeScript

I’d like to start off the actual comparison with CoffeeScript. Of all the languages discussed here, CoffeeScript deviates the most from original JavaScript. It also provides the most syntactic sugar and language enhancements of all these languages.

It hardly covers, however, any high-level / static programming constructs such as enums or generics.

If you compare it with outdated ES5, it is clearly superior in almost every aspect. However, when comparing it with recent ES6, additions seem rather marginal; some things are even worse.

From what I understand, this situation arose partly due to the history of CoffeeScript itself: ES6 (luckily) took many concepts and constructs which were first introduced in CoffeeScript and included them in its own syntax, overruling its original inspiration. Or as put by the decaffeinate npm module currently built to port CoffeeScript to ES6: “JavaScript is the future, in part thanks to CoffeeScript. Now that it has served its purpose, it's time to move on.”
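A few hypothetical snippets (not taken from the decaffeinate project) illustrate why the gap has shrunk: several constructs that CoffeeScript pioneered now have direct ES6 counterparts.

```javascript
// CoffeeScript:  square = (x) -> x * x
// ES6 arrow function:
const square = (x) => x * x;

// CoffeeScript:  greet = (name) -> "Hello, #{name}"
// ES6 template literal:
const greet = (name) => `Hello, ${name}`;

// CoffeeScript:
//   class Dog
//     bark: -> "Woof"
// ES6 class syntax:
class Dog {
  bark() { return "Woof"; }
}
```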

Also, it apparently has the worst IDE support of all the 3rd party languages.

I wouldn’t recommend at this point, however, forcing a switch from CoffeeScript to ES6 if you have a mature project and an experienced team working on it. CoffeeScript is not bad, quite the contrary.

On the other hand, I wouldn’t recommend starting a new CoffeeScript-based project now. I hate to say it, but its syntax and some of its concepts really are merely interesting from an academic point of view now. It makes dramatic syntax changes compared to traditional JavaScript, which are highly opinionated (and personally, I have to say, I actually like them). Still, now that we have a reasonable standard, we should go with it, for the sake of long-term maintainability.

Bottom line: No.


Dart

When it comes to JavaScript similarity, Google’s Dart sits between TypeScript and CoffeeScript (but very close to the TypeScript end).

As far as syntax and components are concerned, the language is pretty similar to TypeScript as well. One deviation from all the other languages (which may be considered “major” by some developers) is the enforced semicolon at the end of each statement.

IDE support is good.

The biggest difference, however, lies under the hood: much in contrast to TypeScript, Dart is technically not an extension, a superset, of ECMAScript, but a rewrite from scratch. This clearly diminishes interoperability and is a potential issue for migration paths. This is what Microsoft criticizes when they explain their commitment to the ECMAScript evolution: “Some examples, like Dart, portend that JavaScript has fundamental flaws and to support these scenarios requires a “clean break” from JavaScript in both syntax and runtime. We disagree with this point of view.”

When deciding between ECMAScript 6, Dart, and TypeScript, it’s this question of corporate politics which matters most in my opinion: Do you want to go for the “standard” way, or take the risk of a proprietary path with Google? For what I’ve seen so far, Dart doesn’t offer anything which in my opinion would be worth taking this risk.

Again, I would recommend not switching existing projects here, but also not starting new projects based on Dart. Sorry, Google.

Bottom line: No.


TypeScript

It seems that so far, Microsoft’s TypeScript is our best contestant to win the race. As mentioned earlier, TypeScript delivers about the same set of additions to ES5 as Dart does. But most importantly, TypeScript really shines when it comes to (as its name suggests) static typing.

Nowadays, TypeScript is trying hard to keep up with the features rolled out in the new ECMAScript 6, but on top of that, it enables static typing and type enhancements as they are known and loved in traditional high-level languages such as Java (and by their developers), with concepts like Interfaces, Generics, and even Mixins. (Note that except for the strict static type checking part, these concepts are available in Dart as well.)
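As a taste of what this adds on top of plain ES6, here is a made-up example (the Book / Repository names are purely illustrative): interfaces and generics let the compiler verify object shapes before the code ever runs.

```typescript
// An interface describes a shape; it is checked at compile time and erased at runtime.
interface Book {
  title: string;
  inStock: number;
}

// A generic container: T is bound per instantiation and statically checked.
class Repository<T extends { title: string }> {
  private items: T[] = [];
  add(item: T): void {
    this.items.push(item);
  }
  findByTitle(title: string): T | null {
    for (let i = 0; i < this.items.length; i++) {
      if (this.items[i].title === title) return this.items[i];
    }
    return null;
  }
}

const books = new Repository<Book>();
books.add({ title: "Dune", inStock: 3 });
// books.add({ name: "oops" });  // rejected by the compiler: wrong shape
```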

Most importantly, again, TypeScript is conceptually different from Dart (and CoffeeScript) in that it’s not written from scratch but instead based on original ECMAScript 6 proposals and with respect (and backwards-compatibility) to ECMAScript 5. This makes it a true ECMAScript 5 super-set and allows you to write any valid ES5 code within TypeScript. This really is a huge plus when it comes to standardization / long-term maintainability considerations, which should be a major part of any technology choice, and it also enables a swift ES5 migration path.

IDE support is apparently the best out of all the options. If you don’t like Visual Studio, there’s still an Eclipse plugin.

For the time being, you might consider TypeScript roughly equivalent to the new ECMAScript 6, but with static typing as a bonus. And I mean this quite literally: no major browser supports ES6 completely, so you might as well use TypeScript and transcompile everything to ES5. Then, at least, you get full IDE support (I’ll cover the ES6 side presently).

Okay, let me round this up. I think the choice here primarily depends on the situation you are in:
  • If you’re programming client-side (browser), the choice is all yours, and it really depends on whether you like (or need) static typing. Remember that it adds safety at the expense of rapid development. If your team comes from a statically typed world like Java, you may try it. If, however, you’re used to a “soft fail” world like e.g. AngularJS angular expressions, adding static typing may feel like an additional impediment.
  • On the server side, I think, the decision depends on the project size: For small projects (e.g. just a simple CRUD REST server), introducing static typing may be overkill or even counter-productive. If, however, you write huge parts of your core business logic in JavaScript, distributed in dozens of modules and classes, you should seriously consider adding that additional static safety net.
It’s absolutely mandatory that this be a team choice, based on the team’s experience. Introducing static typing or not is a highly disputed topic (and, by the way, so is TypeScript’s Scala-like static typing syntax), and this choice may be a deciding factor in which people will join your team in the future.

Bottom line: Depending on the situation, yes.

ECMAScript 6

ECMAScript 6 is the future of JavaScript, and it will shape the future of the web. It’s that simple. When compared with ES5, it adds a myriad of syntactic elements and concepts, surpassing even the functionality of many well-established high-level programming languages such as Java. It also delivers several features out-of-the-box which up to now were covered by 3rd party JavaScript libraries only, most notably, module import / export and promises.
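For instance, promises are now built in; no third-party library such as Q is required anymore. A minimal sketch (the fetchAnswer function is made up for illustration):

```javascript
// A built-in ES6 Promise wrapping an asynchronous computation.
function fetchAnswer() {
  return new Promise((resolve) => {
    setTimeout(() => resolve(42), 10); // simulate async work
  });
}

// Consumers chain .then(...) instead of passing callbacks around.
fetchAnswer()
  .then((value) => value + 1)
  .then((value) => console.log(value)); // eventually logs 43
```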

There’s that one well-known problem though: ES6 support by major browsers is just not there yet. However, just as the 3rd party languages presented above rely on transcompilation to ES5, there are vanilla ES6-to-ES5 transpilers such as Babel and Traceur. You must thus put an additional automatic transcompilation step into your deployment pipeline to compile ES6 to ES5 until it is fully supported by your target browser / runtime (such as e.g. Node.js).

I have made my evaluation in the TypeScript section already, but to sum up:
  • Use TypeScript for big scale projects
  • Use ECMAScript 6 otherwise
Please, do also consider the lack of mature IDE support in your evaluation: Of all the major open source IDEs, only one Eclipse plugin currently seems to support ES6 whereas TypeScript’s IDE support is apparently very good.

Bottom line: Yes.


Conclusion

Within a few years, the JavaScript landscape has completely changed, and it really changed for the better. But with the new choices comes new responsibility.

I really believe in standards. And I believe that in terms of responsibility, one should stick with the standard unless it makes you miss great opportunities. Considering the excellent work put into ECMAScript 6, and how it clearly moves towards more progress in the near future, I do clearly consider this the way to go.

For certain situations, as explained above, TypeScript may also be an excellent option, but I would then really treat it as a separate, specialized language, and maybe even consider a mix of both dialects, with TypeScript in use for specialized cases only.

Other than that, I will concentrate my future studies of the JavaScript landscape on the ECMAScript standard, keeping competitive dialects in mind, but without seeking dependency on them.

I really hope this overview post and my insights have helped other developers overwhelmed by the breadth of the JavaScript language landscape. Please let me know your thoughts, comments or critiques in the comments section. I would also be highly interested in any experience of working with these new technologies in big-scale projects.



The JavaScript / TypeScript / CoffeeScript / Dart choice 2015 (part 1 of 2)

I created an overview of the “big four” JavaScript language dialects: ECMAScript, TypeScript, CoffeeScript and Dart, in order to study their respective advantages and disadvantages, and to evaluate the best choice for future projects.

TL;DR: Big JavaScript language family overview / comparison ahead!

JavaScript is not the same as it was ten years ago. Having recently gained momentum and attention both for server-side programming (mainly due to the Node.js platform) and client-side programming (thanks to sophisticated frameworks such as AngularJS), JavaScript is now a viable alternative to “traditional” languages and platforms such as Java or C# for any (web-based) project of arbitrary scale, especially as it unifies client and server code base through a single main language.

However, as a JavaScript newbie, you’ll discover that there’s also not just the one JavaScript. There is actually a variety of different versions and dialects, which emerged throughout the language’s chequered history.

Also, JavaScript isn’t actually “JavaScript” anymore at all. When JavaScript developers refer to their language, they typically mean the official standardization through ECMA International called “ECMAScript”. Due to various shortcomings and lack of updates to the old ECMA standards (ECMAScript 5 (2009) and earlier) in recent years, a number of independent script languages have emerged, the most wide-spread of which are nowadays TypeScript, CoffeeScript, and Dart. These are designed to be “transcompiled” to optimized JavaScript, which is typically effectuated through an additional automated step in the application deployment process.

However, in June 2015, ECMA released their newest standard specification (ECMAScript 6 (2015), called “Harmony”), which not only addresses most shortcomings of earlier versions, but really enhances the language with important concepts known from other high-level languages, such as classes, module imports and promises; it also includes concepts still lacking in many other major languages (such as String interpolation or parameter default values, both lacking in Java).
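Two of those concepts in action, in plain ES6 (an illustrative snippet, not from the specification): string interpolation via template literals, and parameter default values.

```javascript
// Default parameter value: `greeting` may be omitted by the caller.
// Template literal: expressions are interpolated with ${...}.
function welcome(name, greeting = "Hello") {
  return `${greeting}, ${name}!`;
}
```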

Hence, nowadays, after you’ve chosen to build a JavaScript-based solution, but prior to making any tech stack decisions (Node.js, AngularJS, …), you have to choose an actual language implementation. The main question here is: Will you go for the official standard, and if so, for which one? Or does your situation require a specialized, 3rd party language implementation?

Here I present an overview of the currently most wide-spread members of the JavaScript family:

JavaScript language family overview

Feature | ECMAScript 5 (2009) | ECMAScript 6 (2015) (“Harmony”) | TypeScript 1.5 | Dart 1.12.0 | CoffeeScript 1.9.3
by | Ecma International | Ecma International | Microsoft | Google | Jeremy Ashkenas
Open source license | BSD | BSD | Apache | BSD | MIT
Compile to ECMAScript 5 (using e.g. Node.js) | N/A | YES (through e.g. Babel or Traceur) | YES | YES | YES
Run headless / natively (with own VM) | N/A | N/A | NO | YES | NO
Native browser support | YES | NO (< 100% depending on engine) | NO | NO (just “Dartium” experimental browser) | NO
Compile to JavaScript on-the-fly in browser | N/A | N/A | YES (not officially supported / recommended) | NO | YES (not officially supported / recommended)
; optional | YES | YES | YES | NO | YES
{ } optional | NO | NO | NO | NO | YES
Definition keywords | var | var, let, const | var, let, const | var, (static types) | (none)
Additional special operators | (none) | (none) | (none) | ?. null-safe, .. cascade | ? existential, ?. null-safe, @ this, (more)
“Everything is an expression” | NO | NO | NO | NO | YES
“Everything is an Object” | NO | NO | YES | YES | NO
Optional explicit type checking | NO | NO | YES | YES (warnings only) | NO

Control flow
Array / List comprehensions | NO | NO | NO | NO | YES
String interpolation | NO | YES | YES | YES | YES
Multiline Strings | NO | YES | YES | YES | YES
Default value for optional parameters | NO | YES | YES | YES | YES
Array spread | NO | YES | YES | NO | YES
Destructuring | NO | YES | YES | NO | YES
Generator functions | NO | YES | NO | YES | YES
Promises | NO | YES | NO | YES (called “Future”) | NO

Data structures
Accessors (implicit getters / setters) | YES | YES | YES | YES | NO
Interfaces | NO | NO | YES | YES (classes are implicitly also interfaces) | NO
Import / export | NO | YES | YES | YES | NO

ECMAScript 5 interop
Embed ECMAScript 5 | N/A | N/A | N/A | YES | YES
Dedicated IDE | N/A | N/A | Visual Studio | - | -
Plugins for open source IDE | (every major IDE) | Eclipse (through plugin) | Eclipse | IntelliJ IDEA Community Edition | Eclipse (but support seems dead)
In-browser debugging | YES | YES (with transcompilation or natively depending on engine) | YES (depending on IDE / browser) | YES (uses Source Maps) | YES (uses Source Maps)

Some considerations


Open source license

All standards and implementations are open source. The license should thus not be a deciding factor.

Compile to ECMAScript 5

All the 3rd party language implementations are actually “transcompiled” to ECMAScript 5 or another JavaScript-compliant language (which is quite remarkable as ECMAScript is a high-level programming language by itself). This is important for two usage scenarios:
  • Server: There is no dedicated VM for the language, so you have to compile it to a language your server environment can run (e.g. ECMAScript 5 / through Node.js).
  • Client: You’re targeting a browser without native support for the language. Apart from the 3rd party dialects, this is currently also true for the official ECMAScript 6 standard.
In either case, you’re going to add an additional “transcompile” step to your deployment process. Although the concrete implementations differ for each language, this process can and must be automated using e.g. Node.js / Grunt.

However, note that this automated “transcompiling” step actually provides an additional advantage:
  • The resulting JavaScript is typically “optimized”, more so than you could probably write by hand (of course, this depends on the language implementation)

Additional functionality and libraries

When comparing the various language implementations, it’s important to remember that because of the to-Javascript-transcompilation, you can use any 3rd party JavaScript extension library to include additional functionality not covered by the core distribution of the language. For instance:
  • Module import / export, as included in e.g. ECMAScript 6, is also included in Node.js.
  • Promises, as included in e.g. ECMAScript 6, are also provided by the Q library.
Thus, if a language lacks a desired feature, you’re most likely to find a JavaScript extension library for just that. Of course, this does not apply to features which require extra syntax, such as list comprehensions.

ECMAScript 5 interop

A language’s similarity to ECMAScript 5 defines the ease of a migration path. Here, we have two types of languages: ECMAScript 6 and TypeScript, which are fully backward-compatible with ECMAScript 5 and are thus technically a super-set of ES5 functionality, and Dart and CoffeeScript, which have built their own grammar from scratch. However similar to original JavaScript they may be, their parsers don’t recognize JavaScript syntax wherever it deviates from their own. However, these two languages come with a built-in means to evaluate native ECMAScript 5 code.


IDE support

Some languages come with their own IDE; for others, there are plugins available for major open source IDEs. I have neglected proprietary IDEs here. Of course, pure JavaScript writing is possible with any text editor, and the most important tools of a JavaScript deployment pipeline really are just command line tools, which may diminish the need for sophisticated IDE support in smaller projects.

Thanks to the so-called Source Maps facility, debugging on the original source code is possible even after transcompilation into ECMAScript 5.

At this point, I encourage you to study above comparison chart and draw your own conclusions, or read my own evaluation and conclusions on the next page.
