Here is a good article explaining how to read the Cassandra source code:
Check it out at http://skorage.org/2009/03/08/simple-thrift-tutorial/
Some of my thoughts: Unit tests are very important in TDD. They make it easy to refactor code, adhere to customer requirements, and verify logic at the earliest point. They also allow Continuous Integration to work brilliantly.
While writing code, programmers run into issue after issue. As examples, I will describe the issues I encountered, in the hope that this helps others who run into similar ones. I wrote a program to monitor certain web pages. It therefore has to download web pages in the correct encoding, in a multi-threaded environment, and save each page to a back-end database. The issues I encountered:
1. Threading issue
HttpClient from Apache is used as the web client. The Multithreading section of their documentation mentions no issue, and their multithreading example runs without problems. However, you will run into problems if the HttpClient instance was created in a thread other than the current thread or its parent thread. This issue is not documented there, and it is very confusing to new users.
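One way to sidestep this kind of problem is to let each worker thread create and use its own client. The sketch below is not Apache HttpClient code: `PageClient` is a hypothetical stand-in for the real client and `PerThreadClients` is a hypothetical helper, both introduced only for illustration so that the example runs without extra dependencies. The assumption is that the real client would be constructed where `new PageClient()` happens here.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for the real HTTP client; it only records
// which thread constructed it.
class PageClient {
    final String ownerThread = Thread.currentThread().getName();

    String fetch(String url) {
        // A real client would perform the HTTP request here.
        return "fetched " + url + " on " + Thread.currentThread().getName();
    }
}

// Hypothetical helper: hands each thread its own lazily created client,
// so a client is never used outside the thread that created it.
class PerThreadClients {
    private static final ThreadLocal<PageClient> CLIENT =
            ThreadLocal.withInitial(PageClient::new);

    static PageClient get() {
        return CLIENT.get();
    }
}

class PerThreadClientDemo {
    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            for (int i = 0; i < 4; i++) {
                final String url = "http://example.com/page" + i;
                // The client is looked up inside the task, i.e. on the worker thread.
                Future<String> result = pool.submit(() -> PerThreadClients.get().fetch(url));
                System.out.println(result.get());
            }
        } finally {
            pool.shutdown();
        }
    }
}
```

The important detail is that `PerThreadClients.get()` is called inside the task rather than before submitting it, so the client is always created on the thread that ends up using it.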
2. Implicit dependency
I use Hibernate to persist data, and Maven 2 for dependency management. Maven 2 is awesome. However, what if some dependencies are missing? Antlr v2.7.5H3 is needed for HQL parsing, but the Hibernate Maven guide doesn’t state this clearly. The error only pops up when you start to run HQL. Additionally, the Antlr dependency is not available in the official Maven 2 repository; you will need to use the JBoss repository.
3. Session cache
After the data is saved to the database, another thread reads it back for further processing. However, if you call getCurrentSession() on the SessionFactory instance, you will get nothing. If you stop the program and run it again, you will get the data from the last run, but not the data from the new run. The issue is caused by the Session cache, which was designed to improve performance. The solution is to call openSession() on the SessionFactory instance (or openStatelessSession() if you hate caches).
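Separately from the three issues above, the requirement to download pages in the correct encoding can be sketched as follows. This is a minimal illustration, not the code of the program described here: `PageDecoder` and its methods are hypothetical names, it assumes the encoding is announced in the HTTP `Content-Type` header, and falling back to UTF-8 when no usable charset is given is also an assumption.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper that decodes a downloaded page body using the
// charset named in the Content-Type header.
class PageDecoder {

    // Extracts the charset parameter from a value such as
    // "text/html; charset=ISO-8859-1". Falls back to UTF-8 (an assumption)
    // when the parameter is missing, malformed, or unsupported.
    static Charset charsetOf(String contentType) {
        if (contentType != null) {
            for (String part : contentType.split(";")) {
                String p = part.trim();
                if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                    String name = p.substring("charset=".length()).trim().replace("\"", "");
                    try {
                        return Charset.forName(name);
                    } catch (IllegalArgumentException e) {
                        // Illegal or unsupported charset name: use the default below.
                        break;
                    }
                }
            }
        }
        return StandardCharsets.UTF_8;
    }

    // Turns the raw bytes of a response body into text.
    static String decode(byte[] body, String contentType) {
        return new String(body, charsetOf(contentType));
    }

    public static void main(String[] args) {
        byte[] latin1 = "caf\u00e9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(decode(latin1, "text/html; charset=ISO-8859-1"));
    }
}
```

Pages that declare their encoding only inside the HTML itself are not covered by this sketch.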
I encountered other issues as well, such as encoding issues. However, the preceding ones are typical of the confusing kind.
Why is software engineering so complicated?
It’s because people ask software to address the whole world, which is far more complicated.
Tire production can use an assembly line, because the requirement is simple: I need a product for my Toyota car.
Car production can use an assembly line, because the requirement is simple as well: I need a means of transportation from city A to city B, over land.
What if clients ask for a vehicle that can travel between any two points in a 3-dimensional space? What if clients add the dimension of time?
People want software to handle situations more intelligently, so that they can be lazier. There is nothing wrong with that. But it becomes an issue of feasibility if the requirements have no boundary.