Apr 08

I am playing with DSLs. As a developer without much language/compiler experience, I did not find it easy to get started. Fortunately, my friend Yu has a lot of experience, and the Internet is another source of material, so my evening learning has become a bit easier.

Based on my recent reading, I would recommend the following resources as a starting point. If you are a DSL beginner who doesn't have much experience (like me), they may be helpful:

1. There are two papers
a. On the Specification of Textual Syntaxes for Models
b. TCS: a DSL for the Specification of Textual Concrete Syntaxes in Model Engineering
The papers are written by committers of the Eclipse TCS project. They give you a feel for the subject from an academic perspective. I think it is good to know the basic framework of the methodology and a few of the terms used in implementing a DSL.

2. ANTLR, get your hands dirty
Knowing ANTLR will help on your language definition journey. It is quite easy to pick up ANTLR thanks to its well-prepared documentation:
a. ANTLR Getting Started
b. Five-minute introduction to ANTLR v3
c. ANTLR v3 by Mark Volkmann
d. Some DSL posts in the Article section.

3. Some Links for Domain Specific Languages
This is a post I wrote earlier. It contains a few useful links:
http://www.frankdu.com/weblog/archives/21

Feb 27

Part 1 of the notes is located at http://www.frankdu.com/weblog/archives/46

The related presentation slides are available at openArchitectureWare.org.

23. In Xtext, you start by defining the concrete syntax.

24. For an existing meta model, use the importMetamodel directive. Use preventMMGeneration to prevent any meta model generation.

25. Simple Editor Customization in Xtext
- Xtend, expression language used throughout oAW
- Constraint checks: oAW check language, based on Xtend
- OutlineView customization: override label(…) and image(…) for meta types
- Content Assist
- Font style customization (keywords only)

26. Xtext instantiates Ecore metamodels, which means that the resulting models can be processed with any EMF tool.
- Within an oAW workflow: the only Xtext-specific aspect is using the generated parser. The Xpand template language is a powerful code generation tool, and you can easily traverse the model/meta model using the Xtend language.
- EMF way: EMF’s native resource mechanism (what are the details?)
- Your own code: use the generated parser.

27. NodeUtil with generated parser
- Typically you only work with the AST (ecore file)
- It helps to obtain information from the parse tree: element location, element text, and the parse tree node at a certain offset.
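To make the three lookups concrete, here is a minimal sketch of a parse tree whose nodes remember where they sit in the source text. This is not the NodeUtil API: the Node class, node_text and node_at are hypothetical names I made up for illustration, and the "entity Person" input is an assumed example.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical parse tree node covering source[offset:offset + length]."""
    kind: str
    offset: int
    length: int
    children: List["Node"] = field(default_factory=list)

    def contains(self, position: int) -> bool:
        return self.offset <= position < self.offset + self.length


def node_text(node: Node, source: str) -> str:
    """Element text: the slice of the source that the node covers."""
    return source[node.offset:node.offset + node.length]


def node_at(node: Node, position: int) -> Optional[Node]:
    """Return the deepest node that covers the given offset, or None."""
    if not node.contains(position):
        return None
    for child in node.children:
        found = node_at(child, position)
        if found is not None:
            return found
    return node


source = "entity Person"
tree = Node("Entity", 0, 13, [
    Node("Keyword", 0, 6),   # "entity"
    Node("Name", 7, 6),      # "Person"
])

hit = node_at(tree, 9)
print(hit.kind, hit.offset, node_text(hit, source))
```

An editor feature such as hover or content assist would start from a lookup like node_at with the cursor offset.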

28. Two phases for doing your DSL:
- designing your language
- building language tools
Xtext focuses on the second phase. Apart from these phases, it is also important to provide a framework that runs the tasks defined in your DSL.

29. oAW Xtext has become part of the TMF project. The first release is expected in the second half of 2009.

30. An Xtext parser limitation: it is impossible to add custom action code to the parser. Sometimes this results in ugly meta models, especially when building expression languages.

Feb 26

The related presentation slides are available at openArchitectureWare.org.

Part 2 of the notes is located at http://www.frankdu.com/weblog/archives/52

Below are my reading notes on textual DSLs and text modeling in Eclipse. I haven't finished the slides yet, so this is only part one.

1. EMF serves as the foundation. It provides the Ecore metamodel and framework tools like:
- editing
- transactions
- validation
- query
- distribution/persistence

2. GMF is used for building custom graphical editors based on EMF meta models. It is an industry-proven technology, based on GEF.

3. TMF is used for building custom textual editors. It is in the incubation phase. There are two implementations: Xtext and TCS.

4. M2M (Model-to-Model) delivers an extensible framework for M2M transformation languages. ATL is an M2M language from INRIA. QVT, the OMG standard, also has an implementation there.

5. M2T (Model-to-Text) focuses on transforming models into text (code generation, model serialization). For example, you may want to convert in-memory models into XML files for persistence or transport, and later use a parser to convert the XML files back into models. There are 2 so-called frameworks:

- JET is the code generation tool used by EMF.
- Xpand is a code generation tool that is part of the M2T releases.
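Both flavours of model-to-text can be sketched in a few lines. This is only an illustration in plain Python, not JET or Xpand: the Entity/Attribute model and the to_xml, from_xml and to_java helpers are assumptions of mine. The first pair serializes a model to XML and parses it back; the last one generates code from the same model.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List


@dataclass
class Attribute:
    name: str
    type: str


@dataclass
class Entity:
    name: str
    attributes: List[Attribute]


def to_xml(entity: Entity) -> str:
    """Model serialization: in-memory model to XML text."""
    root = ET.Element("entity", name=entity.name)
    for attr in entity.attributes:
        ET.SubElement(root, "attribute", name=attr.name, type=attr.type)
    return ET.tostring(root, encoding="unicode")


def from_xml(text: str) -> Entity:
    """The way back: parse the XML text into a model again."""
    root = ET.fromstring(text)
    attrs = [Attribute(a.get("name"), a.get("type"))
             for a in root.findall("attribute")]
    return Entity(root.get("name"), attrs)


def to_java(entity: Entity) -> str:
    """Code generation: emit a Java class skeleton from the model."""
    lines = [f"public class {entity.name} {{"]
    for attr in entity.attributes:
        lines.append(f"    private {attr.type} {attr.name};")
    lines.append("}")
    return "\n".join(lines)


person = Entity("Person", [Attribute("name", "String"), Attribute("age", "int")])
print(to_xml(person))
print(to_java(person))
```

A real template language keeps the output text and the model traversal in a template file instead of string concatenation, but the input (a model) and the output (text) are the same.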

6. Xtext originally comes from openArchitectureWare.org. It integrates well with Eclipse: oAW uses EMF as its basis, bases its graphical editors on GMF, and builds all of its tooling on Eclipse. Since Xtext has become part of Eclipse TMF, there are two versions of Xtext: oAW Xtext and TMF Xtext. The former is relatively mature. The latter is under active development, with a first release expected sometime this year (2009).

7. A DSL is a focused, processable language for addressing specific concerns in a specific domain. It is meant to be a simple tool for a relatively complex domain. Therefore, in most cases, DSLs are readable by domain experts without any training. Popular DSL examples are SQL and Excel.

8. DSLs can be classified in many ways:
- configuration vs. customization
- internal vs. external
- graphical vs. textual

9. Xtext is a so-called framework tool for building external textual customization DSLs.

10. Is it possible to edit same model with both textual and graphical editing interfaces?
It might be possible. Consider one of the following:
a. Visualize a subset of the model using Graphviz or prefuse. This is typically read-only.
b. Use different perspectives, some of which use a graphical editor. This requires cross-references between the textual and graphical models.
c. Edit the same model textually and graphically, with the textual format used as the serialization format of the graphical model. This requires both models to be writable and kept in sync!

11. Typically textual DSLs leverage one of many parser generators (ANTLR, JavaCC, Lex/Yacc). These generate a parser from a grammar definition. The generated parser then tries to match the input text and build a parse tree.
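To see what a generated parser does for you, here is a hand-written sketch of the same job for a toy grammar. Everything in it is an assumption for illustration: the grammar (expr: term ('+' term)*; term: INT ('*' INT)*), the token names, and the nested-tuple parse tree. A generator such as ANTLR would produce the equivalent matching code from the grammar itself.

```python
import re
from typing import List, Tuple

Token = Tuple[str, str]
TOKEN = re.compile(r"(?P<INT>\d+)|(?P<OP>[+*])|(?P<WS>\s+)")


def tokenize(text: str) -> List[Token]:
    """Lexer: split the text into (kind, value) tokens, dropping whitespace."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        if match.lastgroup != "WS":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


class Parser:
    """Recursive descent: one method per grammar rule."""

    def __init__(self, tokens: List[Token]):
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Token:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else ("EOF", "")

    def expect(self, kind: str, value: str = None) -> Token:
        token = self.peek()
        if token[0] != kind or (value is not None and token[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {token[1]!r}")
        self.pos += 1
        return token

    def expr(self):
        # expr: term ('+' term)*
        children = [self.term()]
        while self.peek() == ("OP", "+"):
            children.append(self.expect("OP", "+"))
            children.append(self.term())
        return ("expr", children)

    def term(self):
        # term: INT ('*' INT)*
        children = [self.expect("INT")]
        while self.peek() == ("OP", "*"):
            children.append(self.expect("OP", "*"))
            children.append(self.expect("INT"))
        return ("term", children)


def parse(text: str):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    parser.expect("EOF")  # the whole input must be matched
    return tree


print(parse("1 + 2*3"))
```

The result is a parse tree that still contains every token, operators included. Turning it into something easier to process is a separate step.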

12. Typically, textual DSLs are transformed into an Abstract Syntax Tree (AST). For expressions built from binary operators, it is often a binary tree. For example, the AST for 1 + 2*3:

Literal[1] AddExpression ( Literal[2] MultiplyExpression Literal[3])

AddExpression and MultiplyExpression are binary nodes; the Literals are their leaves.
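The tree for 1 + 2*3 can be written out directly as objects. This is a minimal sketch: the class names follow the notes, while the left/right/value fields and the evaluate function are my own assumptions about what such nodes would carry.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Literal:
    value: int


@dataclass
class AddExpression:
    left: "Expr"
    right: "Expr"


@dataclass
class MultiplyExpression:
    left: "Expr"
    right: "Expr"


Expr = Union[Literal, AddExpression, MultiplyExpression]


def evaluate(node: Expr) -> int:
    """Walk the tree bottom-up; precedence is already encoded in its shape."""
    if isinstance(node, Literal):
        return node.value
    if isinstance(node, AddExpression):
        return evaluate(node.left) + evaluate(node.right)
    if isinstance(node, MultiplyExpression):
        return evaluate(node.left) * evaluate(node.right)
    raise TypeError(f"unknown node type: {type(node).__name__}")


# 1 + 2*3: the multiplication sits below the addition
tree = AddExpression(Literal(1), MultiplyExpression(Literal(2), Literal(3)))
print(evaluate(tree))
```

Note that parentheses and operator tokens are gone: the nesting of the nodes alone says that 2*3 is computed first.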

13. The AST can be treated as a model. But textual DSLs are written without regard to the AST, and the structure of the text may even run against that of the AST.

14. Challenges in implementing a textual DSL:
- Writing a parser is non-trivial.
- A parser generator makes life easier, but it is still not a complete solution.
- A parser generator only creates a matcher and/or a simplistic AST. You still need to transform the model further into an easily processable form, and create an editor with syntax highlighting, code completion, etc.

Xtext is designed to ease that kind of burden.

15. Xtext is based on an EBNF grammar (what’s that? Why will it make a difference?). Xtext will generate:
- ANTLR-based parser
- EMF-based metamodel
- An Eclipse editor that provides, or can be extended with: syntax highlighting, code completion, code folding, constraint checking, and so on.

16. Different Kinds of Xtext Rules:
- Type Rule
- String Rule
- Enum Rule
- Native Rule

17. Built-in Lexer Types in Xtext:
- ID
- STRING
- INT
- Comments (single-line and multi-line)
- Whitespace
The content of those rules is not transformed into the meta model. (Why does that matter?)

18. Built-in Reference Types in Xtext:
- Reference
- File Reference/Import

19. Abstract Type Rules are implicitly declared as a collection of OR-ed alternatives: R1 | R2 | R3. They are mapped to an abstract metaclass. The alternatives become its subclasses, and their common properties are lifted into the abstract superclass.
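The mapping is easier to picture with classes. The sketch below assumes a hypothetical rule with two alternatives, Entity and DataType, that both have a name. It is plain Python and only mimics the shape of the metamodel I would expect Xtext to derive; it is not generated Ecore code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Element:
    """Abstract metaclass for the rule with alternatives Entity | DataType."""
    name: str  # common to both alternatives, so lifted into the superclass

    def __post_init__(self):
        # Emulate abstractness: only the alternatives can be instantiated.
        if type(self) is Element:
            raise TypeError("Element is abstract; use Entity or DataType")


@dataclass
class DataType(Element):
    """First alternative: has nothing beyond the lifted name."""


@dataclass
class Entity(Element):
    """Second alternative: keeps the property that only it has."""
    attributes: List[str] = field(default_factory=list)


elements = [DataType("String"), Entity("Person", ["name", "age"])]
print([e.name for e in elements])
```

Code that only needs the name can work against Element and does not care which alternative was matched.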

20. String Rules are declared in the format: String [rule_name]: [rule_definition];

21. Enum Rule is mapped to Enum in metamodel. Its format: Enum [rule_name]: [token_name="string"]+;
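As a rough picture of that mapping, here is a sketch for an assumed enum rule named Visibility with two tokens, PUBLIC for the string "public" and PRIVATE for the string "private". The rule and the parse_visibility helper are hypothetical; the point is only that each token name becomes a literal and each string is what appears in the DSL text.

```python
from enum import Enum


class Visibility(Enum):
    # token name = string as written in the DSL text
    PUBLIC = "public"
    PRIVATE = "private"


def parse_visibility(text: str) -> Visibility:
    """Map a matched string to its enum literal; unknown strings raise ValueError."""
    return Visibility(text)


print(parse_visibility("public"))
```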

22. Native Rule. Example:
Native SL_COMMENT:
"'#' ~('\n'|'\r')* '\r'? '\n'";
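Read as a lexer pattern, the SL_COMMENT rule says: a '#', then any characters except newline or carriage return, then an optional carriage return and a newline. A Python regular expression that matches the same text, as a sketch (the sample input is made up):

```python
import re

# '#'  ~('\n'|'\r')*  '\r'?  '\n'
SL_COMMENT = re.compile(r"#[^\n\r]*\r?\n")

text = "a = 1 # first\n# whole line\r\nb = 2\n"
comments = SL_COMMENT.findall(text)
print(comments)
```

One consequence of the rule as written: a comment on the last line of a file is only matched if the file ends with a newline.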

Terms:
1. EMF - Eclipse Modeling Framework
2. GMF - Graphical Modeling Framework
3. GEF - Graphical Editing Framework
4. TMF - Textual Modeling Framework
5. DSL - Domain Specific Language
6. JET - Java Emitter Templates
7. AST - Abstract Syntax Tree

Continue to read: Part 2 of the notes is located at http://www.frankdu.com/weblog/archives/52
