Traditionally, in the UNIX world, there are two complementary compiler construction tools available:
a lexical analyzer generator - traditionally lex, with flex as the familiar free implementation
a parser generator - traditionally yacc, with bison as the free counterpart
Descendants of these tools now exist for many implementation languages; JFlex and CUP are Java incarnations of the lexer generator and the parser generator, respectively.
Later in this series of articles I hope to introduce two of these tools, JFlex and CUP, in a tutorial setting. I will enlist the help of several students in my compiler design course as coauthors. Ultimately, I hope to persuade those unfamiliar with these tools that they are very practical. I've chosen JFlex and CUP because they produce Java code, and it is high time for me to learn some Java. This unfamiliarity with Java will also provide me with a scapegoat - when I do something stupid, I can blame it on that unfamiliarity.
This first article will provide background for the series. The next article, which should follow fairly soon, will show how/where to get these tools, give a very specific installation scenario, and produce a simple application as a development example. A third article will give a practical real world example (to be described below). If the third article becomes unwieldy, it may be broken into parts.
A utility such as JFlex builds a lexer from a specification file the programmer writes to define the 'words' (lexical tokens) of the desired language. Let's say the programmer defines a new language called pronto and writes a file 'pronto.flex' which defines the valid lexical tokens for pronto. Then the command line operation 'JFlex pronto.flex' will produce a Java version of a lexical analyzer, say, 'Lexer.java'.
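To make this concrete, here is a minimal sketch of what 'pronto.flex' might look like. Everything here is hypothetical - in particular, the token constants (sym.ASSIGN, sym.NUMBER, etc.) assume a companion CUP specification, since CUP generates the 'sym' class that defines them:

    /* pronto.flex - a minimal, hypothetical JFlex specification */
    import java_cup.runtime.Symbol;

    %%

    %class Lexer
    %cup

    DIGIT = [0-9]
    ID    = [a-zA-Z][a-zA-Z0-9]*

    %%

    "="        { return new Symbol(sym.ASSIGN); }
    ";"        { return new Symbol(sym.SEMI); }
    {DIGIT}+   { return new Symbol(sym.NUMBER, Integer.valueOf(yytext())); }
    {ID}       { return new Symbol(sym.ID, yytext()); }
    [ \t\r\n]+ { /* silently discard whitespace */ }
    [^]        { throw new Error("illegal character '" + yytext() + "'"); }

The '%cup' directive makes the generated Lexer class implement the scanner interface that CUP-generated parsers expect.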
In its most primitive deployment, the lexer merely indicates whether or not the source file consists entirely of valid lexical tokens - a boolean yes or no. The family of lexers under discussion are built to do more and, in particular, can cooperate with the parser (to be discussed under the next heading). In the typical application the lexer is, in fact, called by the parser and the lexer can do these jobs:
return the next token, on demand, to the parser
attach a value to a token where appropriate (e.g. the numeric value of an integer literal)
silently discard whitespace and comments
track line and column position for use in error messages
[ Note: The compilerati will realize that the lexer is actually checking the validity of the source code against a very simple 'regular grammar' specification and that the parser is checking the source code against a less simple 'context-free grammar' specification - as defined in the Chomsky hierarchy. Typical compiler design books discuss the theory at length. ]
A utility such as CUP builds a parser from a specification file the programmer writes to define the syntactic structure of the desired language. For the fictitious new language called pronto, the programmer might write the specification file as 'pronto.cup'. Then the command line operation 'java java_cup.Main < pronto.cup' will produce several files, one of which is a Java version of the desired syntactic analyzer, 'parser.java'.
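Again, a minimal sketch - a hypothetical 'pronto.cup' whose token names match the JFlex sketch above. A grammar rule may carry Java action code between '{:' and ':}'; more on that hook below:

    /* pronto.cup - a minimal, hypothetical CUP specification */
    import java_cup.runtime.*;

    terminal          ASSIGN, SEMI;
    terminal String   ID;
    terminal Integer  NUMBER;
    non terminal      stmt_list, stmt;

    stmt_list ::= stmt_list stmt
                | stmt
                ;

    stmt ::= ID:name ASSIGN NUMBER:val SEMI
             {: System.out.println(name + " = " + val); :}
           ;

Besides 'parser.java', the run also produces 'sym.java', the class of token constants that the lexer specification above relies on.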
In its most primitive deployment, the parser indicates that the source file is either grammatically correct or not - again, a boolean yes or no. The family of parsers under discussion can do an additional task - whenever a valid grammatical construct is found, execute any code that the programmer cares to attach. This 'hook' is typically used to support target code generation in two execution styles:
generated code written to a file to be executed later
generated code executed while the parser operates (sketched below)
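As a sketch of the second style, a hypothetical driver can wire the two generated classes together; the grammar actions then execute on the fly as each construct is recognized:

    // Main.java - hypothetical driver joining the generated Lexer and parser.
    // CUP names its generated class 'parser' (lower case) by default.
    import java.io.FileReader;

    public class Main {
        public static void main(String[] args) throws Exception {
            Lexer lexer = new Lexer(new FileReader(args[0]));
            parser p = new parser(lexer);
            p.parse();   // grammar actions run as the parse proceeds
        }
    }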
Example 1 - A generalized industrial control language
Let's say Fred works for a company that produces computer-driven industrial controllers. When Fred is hired, the company has already developed and deployed a powerful, general purpose control language based on generalized, parallel state machines. However, customers must become programmers to use the controller - customers formally trained as chemical engineers, mechanical engineers, technicians, etc., with little desire or time to learn a new general purpose programming language. The product is very general purpose, useful in many niche industries - automotive, petroleum, logging mills, satellite control, etc.
Fred has been hired to put a front end on the language for every one of the exploitable niche markets. In each case, the front end is to be tailored to the terminology used by the niche market customer and to be easy to use. The front end might be of the 'fill in the blanks' variety, a GUI, whatever. The front ends are really new languages, all with the same target language (the general purpose control language). Each front end exemplifies an ASL (application-specific language). By using the compiler construction tools (e.g. JFlex and CUP), Fred capitalizes on several benefits, including:
rapid development of each new front end
easy modification of a front end as its niche language evolves
automatic, reliable rejection of lexically or syntactically invalid programs
a single, well-tested target language shared by all the front ends
Example 2 - A Fuzzy Logic decision maker
Fuzzy Logic has proved useful, not only in its traditional role in industrial control, but also in a decision-making role. It can be used to play the stock market (and lose money more elegantly?), to choose from among a corporation's research or marketing strategies, to aid in avalanche prediction, etc.
Let's suppose that Fred's significant other, Sally, has programmed a general purpose Fuzzy Logic decision maker. To apply it to different problems, it is initialized from appropriately different initialization data files. Sally markets this product to various niches, but finds former customers a constant burden on her time. The problem is really inherent in the way Fuzzy Logic works. The customer is the expert in his/her particular problem space. A Fuzzy Logic model is initialized by incorporating the expertise of the user. The user gains more expertise as the model is used and will constantly want to tweak the model. The crux of Sally's problem is that the initialization file must be prepared with great programming care. The customers are not programmers and easily make mistakes (most likely syntactic) in preparing such a data file. They always run into problems, call on Sally for help, and expect her to do it at a fairly low margin. She must respond to maintain her reputation and, hence, her financial success.
Her solution is to make an 'idiot-proof' front end that will automate writing the initialization data file. The front end is tailored to the niche customer's terminology and problem space. The front end is an ASL with the initialization data file as the target language. The translator can be written with the help of the compiler construction tools, just as Fred did for the industrial control scenario. The translator guarantees that the lexical and syntactic content of the data file is correct.
If the front end is cleverly defined the customer will find it useful. Note that the customer is an expert on the problem semantics, where programmer Sally would be weakest. The customer will solve the semantic problems, and the ASL translator will avoid the lexical and syntactic problems related to the initialization data file.
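As a sketch (with entirely hypothetical names), a fragment of Sally's CUP specification might emit one correctly formatted line of the initialization data file each time a rule stated in the customer's own terminology is recognized - the first execution style mentioned earlier:

    /* Hypothetical fragment of Sally's CUP specification. */
    action code {:
        // Destination for the generated data file; printing to standard
        // output keeps the sketch self-contained.
        java.io.PrintWriter out = new java.io.PrintWriter(System.out, true);
    :};

    terminal         RULEKW, ARROW, SEMI;
    terminal String  TERM;
    terminal Double  WEIGHT;
    non terminal     fuzzy_rule;

    fuzzy_rule ::= RULEKW TERM:in ARROW TERM:out WEIGHT:w SEMI
                   {: out.println(in + "," + out + "," + w); :}
                 ;

However badly the customer mangles the input, the data file that reaches Sally's decision maker is always lexically and syntactically well formed.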
Upshot
The two preceding examples have an obvious common thread. It should be emphasized that in designing the front-end ASLs, Fred and Sally must interact strongly with the niche customers. After all, the ASL is to be useful to a programming novice who, at the same time, has expertise in the problem space. If this interaction fails, the ASL will not be well received and may fail in its intended market.
The Fuzzy Logic example is, in fact, the course project for this quarter's compiler design class - assuming Sally doesn't beat us to it.