Markdown Parser
The Markdown parser converts Markdown-formatted text into sequences of Element objects, which can then be converted to HTML or processed in other ways.
All functions and types in this module are defined in the ply::markdown namespace.
Parser
Parser is the main class for parsing Markdown. It's designed for incremental use: create a parser with createParser(), feed it lines of input with parseLine(), and call flush() when input is complete. parseLine() and flush() each return an Element when a top-level block has ended; parseLine() returns nullptr while the current block is still being built.
| Creation and Destruction | |
Owned<Parser> | createParser() |
void | destroy(Parser* parser) |
| Parsing | |
Owned<Element> | parseLine(Parser* parser, StringView line) |
Owned<Element> | flush(Parser* parser) |
Array<Owned<Element>> | parseWholeDocument(StringView markdown) |
| Converting to HTML | |
void | convertToHtml(Stream* outs, const Element* element, const HTML_Options& options) |
String | convertToHtml(StringView src) |
Creation and Destruction
Owned<Parser> createParser()
Creates and returns a new Markdown parser. The parser maintains state across multiple calls to parseLine().
void destroy(Parser* parser)
Destroys a parser and frees its resources. This is typically handled automatically when using Owned<Parser>.
Parsing
Owned<Element> parseLine(Parser* parser, StringView line)
Parses a single line of Markdown input. Returns an Element representing a completed top-level block (such as a paragraph or list) if one has ended, or nullptr if the current block is still being built.
Owned<Element> flush(Parser* parser)
Terminates the current top-level block and returns it. Call this after all input lines have been processed to retrieve any remaining content.
Array<Owned<Element>> parseWholeDocument(StringView markdown)
Parses an entire Markdown document and returns all top-level elements. This is a convenience function equivalent to calling parseLine() for each line followed by flush().
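The feed-lines-then-flush protocol described above can be sketched with a small stand-in that does not depend on the library. Everything in this block (ToyBlock, ToyParser, toyParseLine, toyFlush, toyParseWholeDocument) is hypothetical and only recognizes paragraphs separated by blank lines; it mirrors the shape of the calls, not the real parser's behavior.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for a completed top-level block (not ply::markdown::Element).
struct ToyBlock {
    std::vector<std::string> rawLines;
};

// Hypothetical stand-in parser: it only knows paragraphs separated by blank lines.
struct ToyParser {
    std::unique_ptr<ToyBlock> current; // block under construction, or null
};

// True if the line contains only spaces and tabs.
inline bool isBlank(std::string_view line) {
    return line.find_first_not_of(" \t") == std::string_view::npos;
}

// Feeds one line. Returns the finished block if this line ended one,
// or nullptr while the current block is still being built.
inline std::unique_ptr<ToyBlock> toyParseLine(ToyParser& parser, std::string_view line) {
    if (isBlank(line)) {
        return std::move(parser.current); // leaves parser.current null
    }
    if (!parser.current) {
        parser.current = std::make_unique<ToyBlock>();
    }
    parser.current->rawLines.emplace_back(line);
    return nullptr;
}

// Ends the block under construction, if any, and returns it.
inline std::unique_ptr<ToyBlock> toyFlush(ToyParser& parser) {
    return std::move(parser.current);
}

// Convenience wrapper: one toyParseLine() call per line, then a final toyFlush().
inline std::vector<std::unique_ptr<ToyBlock>> toyParseWholeDocument(std::string_view markdown) {
    ToyParser parser;
    std::vector<std::unique_ptr<ToyBlock>> blocks;
    std::size_t pos = 0;
    while (pos < markdown.size()) {
        std::size_t eol = markdown.find('\n', pos);
        if (eol == std::string_view::npos) {
            eol = markdown.size();
        }
        if (auto block = toyParseLine(parser, markdown.substr(pos, eol - pos))) {
            blocks.push_back(std::move(block));
        }
        pos = eol + 1;
    }
    if (auto block = toyFlush(parser)) {
        blocks.push_back(std::move(block));
    }
    return blocks;
}
```

The point of the sketch is the null check on every call: a caller that streams input has to handle both "nothing finished yet" and "a block just ended", and must still call the flush function once at the end to collect the last block.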
Converting to HTML
void convertToHtml(Stream* outs, const Element* element, const HTML_Options& options)
Converts an Element and all its children to HTML, writing the output to the provided stream. The HTML_Options struct controls conversion behavior:
HTML_Options members
bool | childAnchors | If true, generates anchor elements for headings |
String convertToHtml(StringView src)
Convenience function that parses an entire Markdown document and converts it directly to HTML. This is equivalent to parsing all lines, collecting the elements, and converting each to HTML.
Parser State
The Parser struct exposes three members that represent the top-level element currently being built:
Element | rootElement | The top-level element being constructed; returned by parseLine() or flush() when complete |
Array<Element*> | elementStack | Ancestor elements (BlockQuote or ListItem) containing the current parsing location |
Element* | leafElement | The innermost block (Paragraph or CodeBlock) receiving text, or nullptr if none is active |
These members let you inspect the parser's current state. All elements in elementStack and leafElement are owned by rootElement (as descendants in its children tree). When a top-level block is complete, it is detached from rootElement and returned.
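As a rough illustration of the ownership relationship described above, the following sketch builds a hand-made state using stand-in types. SketchElement, SketchParserState, buildExampleState, describeLocation, and isOwnedBy are all hypothetical names, and the particular stack contents shown are an assumption based on the member descriptions, not output captured from the real parser.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class SketchType { None, List, ListItem, BlockQuote, Paragraph, CodeBlock };

// Hypothetical stand-in for Element: the parent owns its children,
// and each child keeps a non-owning pointer back to its parent.
struct SketchElement {
    SketchType type = SketchType::None;
    SketchElement* parent = nullptr;
    std::vector<std::unique_ptr<SketchElement>> children;
};

// Hypothetical stand-in for the three Parser members.
struct SketchParserState {
    SketchElement rootElement;                 // owns the whole tree
    std::vector<SketchElement*> elementStack;  // non-owning ancestors
    SketchElement* leafElement = nullptr;      // non-owning innermost block
};

// Creates a child of the given type under `parent` and returns a non-owning pointer to it.
inline SketchElement* appendChild(SketchElement& parent, SketchType type) {
    auto child = std::make_unique<SketchElement>();
    child->type = type;
    child->parent = &parent;
    parent.children.push_back(std::move(child));
    return parent.children.back().get();
}

// Fills `state` in place as if parsing were inside a paragraph within a
// list item within a block quote (assumed layout, built by hand).
inline void buildExampleState(SketchParserState& state) {
    SketchElement* quote = appendChild(state.rootElement, SketchType::BlockQuote);
    SketchElement* list = appendChild(*quote, SketchType::List);
    SketchElement* item = appendChild(*list, SketchType::ListItem);
    SketchElement* para = appendChild(*item, SketchType::Paragraph);
    state.elementStack = {quote, item};
    state.leafElement = para;
}

inline const char* typeName(SketchType type) {
    switch (type) {
        case SketchType::None: return "None";
        case SketchType::List: return "List";
        case SketchType::ListItem: return "ListItem";
        case SketchType::BlockQuote: return "BlockQuote";
        case SketchType::Paragraph: return "Paragraph";
        case SketchType::CodeBlock: return "CodeBlock";
    }
    return "";
}

// Joins the stack entries and the leaf into a readable path.
inline std::string describeLocation(const SketchParserState& state) {
    std::string path;
    for (const SketchElement* element : state.elementStack) {
        if (!path.empty()) path += " > ";
        path += typeName(element->type);
    }
    if (state.leafElement) {
        if (!path.empty()) path += " > ";
        path += typeName(state.leafElement->type);
    }
    return path;
}

// True if `element` is a descendant of `root`, found by following parent pointers.
inline bool isOwnedBy(const SketchElement* element, const SketchElement& root) {
    for (const SketchElement* e = element->parent; e; e = e->parent) {
        if (e == &root) return true;
    }
    return false;
}
```

Because the stack and leaf pointers are non-owning, they are only meaningful while the root still holds the tree; the sketch fills the state in place for that reason rather than returning it by value.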
Element
The parser produces a tree of Element objects with the following member variables:
Type | type | The element type |
u32 | indentOrLevel | Indentation (for list items) or heading level (1-6) |
s32 | listStartNumber | Starting number for ordered lists; -1 for unordered |
bool | isLoose | Whether a list has blank lines between items |
char | listPunc | List marker character (-, *, +, or .) |
Array<Owned<Element>> | children | Child elements |
Element* | parent | Parent element (or nullptr for root) |
Array<String> | rawLines | Raw text lines for leaf blocks |
String | text | Text content for Text, CodeSpan, or link destination |
String | id | HTML id attribute for headings |
Each element has a type indicating what kind of Markdown element it represents, and may contain child elements or text content depending on its type.
| Container Blocks | |
Element::None | |
Element::List | |
Element::ListItem | |
Element::BlockQuote | |
| Leaf Blocks | |
Element::Heading | |
Element::Paragraph | |
Element::CodeBlock | |
| Inline Elements | |
Element::Text | |
Element::Link | |
Element::CodeSpan | |
Element::SoftBreak | |
Element::Emphasis | |
Element::Strong | |
Element::None
Default element type, typically used for the root of the document.
Element::List
An ordered or unordered list. Contains ListItem children. Use listStartNumber to determine if the list is ordered (>= 0) or unordered (-1). The listPunc member indicates the list marker character (e.g., -, *, or .).
Element::ListItem
An individual item within a list. The indentOrLevel member indicates the indentation level.
Element::BlockQuote
A block quote. Contains other block-level elements as children.
Element::Heading
A heading (H1-H6). The indentOrLevel member indicates the heading level (1-6). The id member can be used to set an HTML id attribute. Text content is stored in rawLines.
Element::Paragraph
A paragraph of text. Text content is stored in rawLines.
Element::CodeBlock
A fenced or indented code block. The raw code is stored in rawLines.
Element::Text
Plain text content within an inline context. The text is stored in the text member.
Element::Link
A hyperlink. The link destination URL is stored in the text member. Child elements contain the link text.
Element::CodeSpan
Inline code (backtick-delimited). The code content is stored in the text member.
Element::SoftBreak
A soft line break within a paragraph.
Element::Emphasis
Emphasized text (typically rendered as italic). Child elements contain the emphasized content.
Element::Strong
Strongly emphasized text (typically rendered as bold). Child elements contain the content.
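To make the split between text-carrying and child-carrying inline types concrete, here is a minimal recursive renderer over a stand-in tree. InlineNode, makeNode, escapeHtml, and renderInline are hypothetical names, and the HTML tags chosen are the conventional ones; this is a sketch of the idea, not the library's convertToHtml().

```cpp
#include <memory>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

enum class InlineType { Text, Link, CodeSpan, SoftBreak, Emphasis, Strong };

// Hypothetical stand-in for an inline Element: `text` is used by Text,
// CodeSpan, and Link (as the destination); `children` by Link, Emphasis, Strong.
struct InlineNode {
    InlineType type = InlineType::Text;
    std::string text;
    std::vector<std::unique_ptr<InlineNode>> children;
};

inline std::unique_ptr<InlineNode> makeNode(InlineType type, std::string text = {}) {
    auto node = std::make_unique<InlineNode>();
    node->type = type;
    node->text = std::move(text);
    return node;
}

// Escapes the characters that are unsafe in HTML text and attribute values.
inline std::string escapeHtml(std::string_view input) {
    std::string out;
    for (char c : input) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default: out += c;
        }
    }
    return out;
}

// Renders a node and its descendants to an HTML fragment.
inline std::string renderInline(const InlineNode& node) {
    // Render children first; types that carry only text ignore this.
    std::string inner;
    for (const auto& child : node.children) {
        inner += renderInline(*child);
    }
    switch (node.type) {
        case InlineType::Text: return escapeHtml(node.text);
        case InlineType::CodeSpan: return "<code>" + escapeHtml(node.text) + "</code>";
        case InlineType::SoftBreak: return "\n";
        case InlineType::Link:
            return "<a href=\"" + escapeHtml(node.text) + "\">" + inner + "</a>";
        case InlineType::Emphasis: return "<em>" + inner + "</em>";
        case InlineType::Strong: return "<strong>" + inner + "</strong>";
    }
    return inner;
}
```

Note how Link is the one type that uses both fields: the destination comes from the text member while the visible link text comes from the children.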
Element Member Functions
bool | isContainerBlock() const |
bool | isLeafBlock() const |
bool | isInlineElement() const |
bool | isOrderedList() const |
void | addChildren(ArrayView<Owned<Element>> newChildren) |
bool isContainerBlock() const
Returns true if the element is a container block (None, List, ListItem, or BlockQuote) that can have child blocks.
bool isLeafBlock() const
Returns true if the element is a leaf block (Heading, Paragraph, or CodeBlock) that contains text but not child blocks.
bool isInlineElement() const
Returns true if the element is an inline element (Text, Link, CodeSpan, SoftBreak, Emphasis, or Strong).
bool isOrderedList() const
Returns true if the element is an ordered list (type is List and listStartNumber >= 0).
void addChildren(ArrayView<Owned<Element>> newChildren)
Adds child elements to this element and sets their parent pointers.
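The classification rules and the parent-pointer bookkeeping described above can be restated as a small stand-in type. ToyElement here is hypothetical and uses standard-library containers in place of Array and Owned; it follows the groupings listed in this section but is not the library's implementation.

```cpp
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for Element, reduced to the members these functions touch.
struct ToyElement {
    enum Type {
        None, List, ListItem, BlockQuote,                 // container blocks
        Heading, Paragraph, CodeBlock,                    // leaf blocks
        Text, Link, CodeSpan, SoftBreak, Emphasis, Strong // inline elements
    };

    Type type = None;
    std::int32_t listStartNumber = -1; // -1 means unordered
    ToyElement* parent = nullptr;
    std::vector<std::unique_ptr<ToyElement>> children;

    bool isContainerBlock() const {
        return type == None || type == List || type == ListItem || type == BlockQuote;
    }

    bool isLeafBlock() const {
        return type == Heading || type == Paragraph || type == CodeBlock;
    }

    // Every type that is neither a container block nor a leaf block is inline.
    bool isInlineElement() const {
        return !isContainerBlock() && !isLeafBlock();
    }

    bool isOrderedList() const {
        return type == List && listStartNumber >= 0;
    }

    // Takes ownership of the new children and points each one back at this element.
    void addChildren(std::vector<std::unique_ptr<ToyElement>> newChildren) {
        for (auto& child : newChildren) {
            child->parent = this;
            children.push_back(std::move(child));
        }
    }
};
```

Keeping the parent assignment inside addChildren means callers cannot attach a child without also fixing its back-pointer, which is the invariant the tree-walking code relies on.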