XML single node parsing with peg.js

Given the input <outer> Content <inner> Inner <single> </inner> </outer>.

How would I write a grammar that parses the <single> node, which has no closing tag, along with the nodes that do have a matching closing tag?

Here's my current grammar, which was taken from here:


Content =
  (Element / Text)*

Element =
  startTag:StartTag content:Content endTag:EndTag {
    if (startTag != endTag) {
      throw new Error(
        "Expected </" + startTag + "> but </" + endTag + "> found."
      );
    }

    return {
      name:    startTag,
      content: content
    };
  }

StartTag =
  "<" name:TagName ">" { return name; }

EndTag =
  "</" name:TagName ">" { return name; }

TagName = chars:[a-z]+ { return chars.join(""); }
Text    = chars:[^<]+  { return chars.join(""); }

This only works for nodes that have a closing tag.

I think the problem lies with the Text rule, so I've been experimenting with altering it to include a negative lookahead, like this:


Text    = chars:(!EndTag .)* EndTag { return chars.join(""); }

But that hasn't yielded anything successful yet.

Any ideas?
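
One direction that might work, offered as a sketch rather than a tested answer: the Text rule may not be the culprit. When <single> is parsed as an Element, the next closing tag found is </inner>, and as far as I can tell the throw inside the action aborts the whole parse instead of letting the parser backtrack. Moving the name check into a semantic predicate (&{ ... }) turns a mismatch into an ordinary match failure, so a fallback alternative can treat the tag as standalone. The rule names PairedElement and SingleElement and the content: null marker are made up for illustration; StartTag, EndTag, TagName and Text stay as in the grammar above.

Content =
  (Element / Text)*

// Try a properly closed element first; fall back to a lone start tag.
Element =
  PairedElement / SingleElement

// The predicate fails (instead of throwing) when the tag names differ,
// so the parser can backtrack and try SingleElement.
PairedElement =
  startTag:StartTag content:Content endTag:EndTag
  &{ return startTag === endTag; } {
    return {
      name:    startTag,
      content: content
    };
  }

// Hypothetical shape for a tag without a closing tag: content is null.
SingleElement =
  name:StartTag { return { name: name, content: null }; }

With the sample input, this should yield an outer element containing inner, whose content includes { name: "single", content: null } between its text pieces. The trade-off is that a mismatched closing tag no longer produces the custom "Expected </...>" message; it shows up as a generic syntax error, or the opening tag is silently treated as standalone. Unclosed tags also make the parser re-read their content after backtracking, so for deeply nested input it may be worth enabling the cache option when generating the parser.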
