Parsing something similar to C++



I want to parse an input file that has syntax similar to C++ source. The file will have components such as these:

//It will have comments.
//It will be able to recursively open other files.
include OtherInputFile.txt
//It will resolve scope
ObjectName::MemberVariable = 0.0;
  MemberVariable1 = 1.0;
  MemberVariable2 = 2.0;

The trouble is, I have no idea what I'm doing. I suppose what I need is a textbook chapter on parsing to orient myself to what technologies or algorithms are available.

2012-04-03 22:16
by 2NinerRomeo
You should look up DFAs (deterministic finite automata) - K Mehta 2012-04-03 22:18
I hope that the syntax is substantially simpler than C++. Otherwise, you are in for many years' worth of fun - James McNellis 2012-04-03 22:19
Wow! C++ is a hard row to hoe if you're not used to parsing stuff - Hoons 2012-04-03 22:19
You might like this book in that case: - Stuart Golodetz 2012-04-03 22:19
Have a look at Lex and Yacc - Amardeep AC9MF 2012-04-03 22:20
C++ is a true monster to parse correctly... are you sure you are aiming at something similar? Consider that for several years different compiler vendors (and we are talking about experts in the field) have been arguing about what the C++ parsing rules really mean. There are also cases in C++ in which an arbitrary number of tokens needs to be read and analyzed just to decide the meaning of the very first one. You don't want to go there - 6502 2012-04-03 22:23
Perhaps I overstated the problem. The text in the example is roughly the scope of what I am looking at doing. I've made a decent living working in C++ while parsing almost nothing. This is new ground for me. Thank you for your helpful direction - 2NinerRomeo 2012-04-03 22:30
To clarify, I don't want to parse C++. The example is what I want to parse, and I thought to myself: hmm... that reminds me of C++ - 2NinerRomeo 2012-04-03 22:32
Actually, this is very easy to parse, however, you may wish to have a look at because it is quite similar and has good tools to work with it - std''OrgnlDave 2012-04-03 22:55


Lots of tools exist to build parsers:

2012-04-03 22:50
by Chris Dodd


I want to parse an input file that has syntax similar to C++ source

Pray it doesn't have templates, a preprocessor, operator overloading and multiple inheritance. Otherwise you're in trouble.

I have no idea what I'm doing

Investigate Lex/Yacc. Read a book about parsing or google the subject ("how to make a language"). Some of those tools have tutorials and documentation links. I could swear I saw a bison, yacc or lex tutorial that mentioned a book called "how to write a compiler" or something like that, but it was so long ago that I don't remember which tool it was or what the book was called.

The principle is basically the same: you define the language grammar (the C++ standard includes a grammar summary in one of its annexes), split the input file into tokens (throwing errors if the tokens don't match the grammar), then classify the tokens (what is it? an opening bracket, an identifier, a function name?) and build a tree out of those tokens, which is then converted into the corresponding language objects, function calls, etc. Depending on the complexity of your language, you might skip most of these steps and wrestle the input file into submission using a bunch of regexps.
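For a format as small as the one in the question, the tokenize-then-parse steps can be sketched by hand, without a generator. The following is only a rough illustration under stated assumptions: the names `ParsedFile`, `tokenize` and `parse` are made up for this example, `include` only records the file name instead of opening it, and an unqualified name is assumed to belong to the most recently seen `Object::` scope (the question does not say how scope should actually resolve).

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical result type: qualified name -> value, plus the files named by
// include statements (recorded only, not opened, in this sketch).
struct ParsedFile {
    std::map<std::string, double> values;
    std::vector<std::string> includes;
};

// A "word" character: letters, digits, '_' and '.'. This covers names,
// unsigned numbers and simple file names.
static bool isWordChar(char ch) {
    unsigned char c = static_cast<unsigned char>(ch);
    return std::isalnum(c) || c == '_' || c == '.';
}

// Step 1: split the text into tokens. A token is a word or one of the
// symbols "::", "=", ";". Whitespace and // comments are skipped.
std::vector<std::string> tokenize(const std::string& text) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < text.size()) {
        char c = text[i];
        if (std::isspace(static_cast<unsigned char>(c))) {
            ++i;
        } else if (text.compare(i, 2, "//") == 0) {
            while (i < text.size() && text[i] != '\n') ++i;  // skip comment
        } else if (text.compare(i, 2, "::") == 0) {
            tokens.push_back("::");
            i += 2;
        } else if (c == '=' || c == ';') {
            tokens.push_back(std::string(1, c));
            ++i;
        } else if (isWordChar(c)) {
            std::size_t start = i;
            while (i < text.size() && isWordChar(text[i])) ++i;
            tokens.push_back(text.substr(start, i - start));
        } else {
            throw std::runtime_error(std::string("unexpected character: ") + c);
        }
    }
    return tokens;
}

// Step 2: walk the tokens and recognize two statement forms:
//   include <file>
//   [Object ::] Member = <number> ;
ParsedFile parse(const std::vector<std::string>& t) {
    ParsedFile out;
    std::string scope;  // most recent "Object::" prefix (assumed semantics)
    std::size_t i = 0;
    auto next = [&]() -> const std::string& {
        if (i >= t.size()) throw std::runtime_error("unexpected end of input");
        return t[i++];
    };
    auto word = [&]() -> const std::string& {
        const std::string& w = next();
        if (w == "::" || w == "=" || w == ";")
            throw std::runtime_error("expected a name, got '" + w + "'");
        return w;
    };
    auto expect = [&](const std::string& want) {
        if (next() != want) throw std::runtime_error("expected '" + want + "'");
    };
    while (i < t.size()) {
        std::string name = word();
        if (name == "include") {
            out.includes.push_back(word());
            continue;
        }
        if (i < t.size() && t[i] == "::") {  // qualified name: remember scope
            ++i;
            scope = name;
            name = word();
        }
        expect("=");
        double value = std::stod(word());  // throws if the word is not a number
        expect(";");
        out.values[scope.empty() ? name : scope + "::" + name] = value;
    }
    return out;
}
```

Calling `parse(tokenize(text))` on the contents of a file yields the map of values and the list of include names; following the includes recursively would mean reading each named file and merging its result. The sketch does not handle signed numbers or file paths containing characters such as `/`, and a grammar that grows much beyond this is probably better served by the Lex/Yacc style tools mentioned above.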

2012-04-03 22:50
by SigTerm
Multiple inheritance is easy to parse -- it's just tough to implement correctly and efficiently - Chris Dodd 2012-04-03 22:55