Monthly Archives: January 2010

Retrieve Line Numbers and Column Numbers in your ANTLR AST

You can easily get line and column information from a lexer token through its properties: line and pos.

In an Abstract Syntax Tree (AST), however, rewrite rules may put the tree node tokens in a different order from the source text. How do you retrieve the line numbers then?

This is very important in language execution, because users will definitely need information about which line caused the issue.

The answer is simple: retrieve the information from the underlying lexer tokens. A tree grammar rule may match multiple tokens, so it is up to you to decide which token to look at. A naive method is to look at the first token of a tree node, using the start property.
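The idea can be sketched without ANTLR itself. The following is a minimal, self-contained model, not ANTLR's API: SourceToken, AstNode, and earliestToken are hypothetical names invented for illustration. It assumes every tree node keeps a reference to the token it was built from, and shows how the root token of a rewritten tree can differ from the first token in the source. (In the ANTLR 3 Java runtime, the corresponding token accessors are getLine() and getCharPositionInLine().)

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a lexer token: text, 1-based line, 0-based column. */
final class SourceToken {
    final String text;
    final int line;
    final int pos;

    SourceToken(String text, int line, int pos) {
        this.text = text;
        this.line = line;
        this.pos = pos;
    }
}

/** Hypothetical AST node that keeps the token it was built from. */
final class AstNode {
    final SourceToken token;
    final List<AstNode> children = new ArrayList<AstNode>();

    AstNode(SourceToken token, AstNode... kids) {
        this.token = token;
        for (AstNode kid : kids) {
            children.add(kid);
        }
    }
}

class LineInfoDemo {
    /** Returns the token in the subtree that appears earliest in the source text. */
    static SourceToken earliestToken(AstNode node) {
        SourceToken best = node.token;
        for (AstNode child : node.children) {
            SourceToken candidate = earliestToken(child);
            if (comesBefore(candidate, best)) {
                best = candidate;
            }
        }
        return best;
    }

    /** Orders tokens by line first, then by column. */
    static boolean comesBefore(SourceToken a, SourceToken b) {
        return a.line < b.line || (a.line == b.line && a.pos < b.pos);
    }

    public static void main(String[] args) {
        // Source text "x = 42" on line 3, rewritten into a tree rooted at "=".
        AstNode assign = new AstNode(new SourceToken("=", 3, 2),
                new AstNode(new SourceToken("x", 3, 0)),
                new AstNode(new SourceToken("42", 3, 4)));

        SourceToken root = assign.token;
        SourceToken first = earliestToken(assign);
        System.out.println("root token:  line " + root.line + ", column " + root.pos);
        System.out.println("first token: line " + first.line + ", column " + first.pos);
    }
}
```

For the assignment above, the root token sits at column 2 while the earliest token sits at column 0. Which of the two to report in an error message is a design choice that depends on the language being implemented.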

There is another method. You can check it out at Recovering line and column numbers in your Antlr AST.

When software engineering is still complicated

While writing code, programmers run into issue after issue. As examples, I will describe the issues I encountered, in the hope that this helps others who run into similar ones. I wrote a program to monitor certain web pages. It has to download the pages with the correct encoding in a multi-threaded environment and save them to a back-end database. The issues I encountered:

1. Threading issue

HttpClient from Apache is used as the web client. Their explanation page shows no issue in the Multithreading section, and their multithreading example runs without problems. However, you will run into problems if the HttpClient instance was created in a thread other than the current thread or its parent thread. This is not documented there and is very confusing to new users.

2. Implicit Dependency

I use Hibernate to persist data and Maven 2 for dependency management. Maven 2 is awesome. However, what if some dependencies are missing? Antlr v2.7.5H3 is needed for HQL parsing, but the Hibernate Maven guide doesn’t state this clearly. The error only pops up when you start to run HQL. Additionally, the antlr dependency is not available in the official Maven 2 repository; you will need to use the JBoss repository.

3. Session cache

After the data is saved to the database, another thread takes it out for further processing. However, if you call getCurrentSession() on the SessionFactory instance, you will get nothing. If you stop the program and run it again, you will get the data from the last run, but not the data from the new run. The issue is caused by the Session cache, which is designed to improve performance. The solution is to call openSession() on the SessionFactory instance (or openStatelessSession() if you hate caching).
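The general effect of a per-session cache can be illustrated with a small model. This is only a sketch under stated assumptions, not Hibernate code: FakeDatabase and CachingSession are hypothetical classes invented here, and the cache is reduced to a plain map keyed by id. It shows a long-lived session that keeps returning its cached copy of a row after another writer has changed it, while a freshly opened session reads the current value.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a database table (id -> value) shared by all sessions. */
final class FakeDatabase {
    private final Map<Integer, String> rows = new HashMap<Integer, String>();

    synchronized void write(int id, String value) {
        rows.put(id, value);
    }

    synchronized String read(int id) {
        return rows.get(id);
    }
}

/** Hypothetical session with a private cache, loosely modelled on a per-session cache. */
final class CachingSession {
    private final FakeDatabase db;
    private final Map<Integer, String> cache = new HashMap<Integer, String>();

    CachingSession(FakeDatabase db) {
        this.db = db;
    }

    /** Returns the cached value if this session has already loaded the row. */
    String get(int id) {
        if (cache.containsKey(id)) {
            return cache.get(id);
        }
        String value = db.read(id);
        if (value != null) {
            cache.put(id, value);
        }
        return value;
    }
}

class SessionCacheDemo {
    public static void main(String[] args) {
        FakeDatabase db = new FakeDatabase();
        db.write(1, "page v1");

        CachingSession longLived = new CachingSession(db);
        System.out.println(longLived.get(1)); // loads the row and caches it

        db.write(1, "page v2"); // another writer updates the row

        System.out.println(longLived.get(1)); // still the cached copy
        System.out.println(new CachingSession(db).get(1)); // a fresh session reads the update
    }
}
```

In this model, opening a new session per unit of work plays the role that openSession() plays in the fix described above.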

I encountered other issues as well, such as encoding issues. However, the preceding ones are typical of the confusing kind.

Why is software engineering so complicated?

It’s because people try to ask software to address the whole world, which is much more complicated.

Tire production can use an assembly line, because the requirement is simple: I need a product for my Toyota car.

Car production can use an assembly line, because the requirement is simple as well: I need a means of transportation from city A to city B, on land.

What if clients ask for a vehicle that can travel between any two points in three-dimensional space? What if clients add the dimension of time?

People want software to handle situations more intelligently so that they can be lazier. There is nothing wrong with that. But feasibility becomes an issue if the requirements have no boundary.

Set the source level to Java 6 for maven 2 compiler

By default, the source level is 1.3 in Maven 2. We need to configure the Maven compiler plugin to use Java 1.6. Add the following snippet to your pom: