XML single node parsing with peg.js



Given the input <outer> Content <inner> Inner <single> </inner> </outer>.


How would I write a grammar that parses the <single> node along with the nodes that have a matching closing node?


Here's my current grammar, which was taken from here:



Content =
  (Element / Text)*

Element =
  startTag:StartTag content:Content endTag:EndTag {
    if (startTag != endTag) {
      throw new Error(
        "Expected </" + startTag + "> but </" + endTag + "> found."
      );
    }

    return {
      name: startTag,
      content: content
    };
  }

StartTag =
  "<" name:TagName ">" { return name; }

EndTag =
  "</" name:TagName ">" { return name; }

TagName = chars:[a-z]+ { return chars.join(""); }
Text    = chars:[^<]+  { return chars.join(""); }


This only works with nodes that have a closing node.


I think the problem lies with the Text rule, so I've been experimenting with altering it to include a negative lookahead, like this:



Text = chars:(!EndTag .)* EndTag { return chars.join(""); }


But that hasn't yielded anything successful yet.


Any ideas?
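
One possible direction, offered as a sketch rather than a definitive fix: the Text rule may not be the culprit. The Element rule insists on an EndTag and throws when the names differ, which aborts the whole parse instead of letting the parser backtrack. An alternative is to make the name check a failing condition and add a second, ordered alternative that accepts a lone StartTag. In PEG.js notation that would look roughly like the rule in the comment below, assuming a PEG.js version in which a `&{ ... }` semantic predicate can see the preceding labels. The JavaScript underneath mirrors the same rules by hand so it runs without the peg.js package; `parseXmlish` and its helpers are hypothetical names used only for illustration.

```javascript
// Grammar idea in PEG.js notation (for reference only, not executed here):
//
//   Element
//     = startTag:StartTag content:Content endTag:EndTag
//       &{ return startTag === endTag; }
//       { return { name: startTag, content: content }; }
//     / startTag:StartTag { return { name: startTag, content: [] }; }
//
// Hand-written equivalent. Each helper takes a position and returns
// { value, pos } on success or null on failure, so callers can backtrack.

function parseXmlish(input) {
  function tag(pos, open) {
    // StartTag = "<" [a-z]+ ">"    EndTag = "</" [a-z]+ ">"
    const re = open ? /<([a-z]+)>/y : /<\/([a-z]+)>/y;
    re.lastIndex = pos; // sticky flag: match only at exactly this offset
    const m = re.exec(input);
    return m ? { value: m[1], pos: re.lastIndex } : null;
  }

  function text(pos) {
    // Text = [^<]+
    const re = /[^<]+/y;
    re.lastIndex = pos;
    const m = re.exec(input);
    return m ? { value: m[0], pos: re.lastIndex } : null;
  }

  function content(pos) {
    // Content = (Element / Text)*
    const items = [];
    for (;;) {
      const r = element(pos) || text(pos);
      if (!r) return { value: items, pos: pos };
      items.push(r.value);
      pos = r.pos;
    }
  }

  function element(pos) {
    const start = tag(pos, true);
    if (!start) return null;
    // First alternative: a paired element whose end tag name matches.
    const body = content(start.pos);
    const end = tag(body.pos, false);
    if (end && end.value === start.value) {
      return { value: { name: start.value, content: body.value }, pos: end.pos };
    }
    // Second alternative: backtrack and treat the start tag as a single node.
    return { value: { name: start.value, content: [] }, pos: start.pos };
  }

  const result = content(0);
  if (result.pos !== input.length) {
    throw new Error("Unexpected input at offset " + result.pos);
  }
  return result.value;
}

const tree = parseXmlish("<outer> Content <inner> Inner <single> </inner> </outer>");
console.log(JSON.stringify(tree, null, 2));
```

With the input from the question, <single> ends up as a node with empty content inside <inner>, while <outer> and <inner> keep their nested content. The trade-off of this approach is that a genuinely mismatched or missing closing tag is no longer reported as an error: the opening tag is silently treated as a single node, and deeply nested unclosed tags can cause a lot of backtracking.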

