Spec
pepper_chico — 2014-09-03T19:05:18-04:00 — #1
Hi, I've quickly looked through the spec, and it seems that 6.8 Raw HTML finally addresses something that currently isn't specifically addressed by any parser, but which I consider a must.
I've mentioned it in this discussion and in other places I can't recall.
The general situation is the need to tell the markdown parser to stop parsing,
so that anything can be entered safely without the risk of being accidentally
transformed by the parser, and also, be able to tell the parser that the
no-markdown island has finished and the parser can go on with its job.
It would be nice to also have such a thing working inline with the surrounding markdown.
This would make it a standard way to support, for example, MathJax. I had
several issues with it when using Redcarpet and other parsers, until I settled
on Kramdown, which claims to work with MathJax specifically but is not meant to
provide a general solution like the one in 6.8.
With Redcarpet, for example, when trying to enter LaTeX inline with the surrounding markdown
text, I tried wrapping the LaTeX in SPAN tags, but the parser still transformed it
somehow.
So, are the current mechanisms in the spec really able to provide such a
"no-markdown here, don't parse" region, and is that validated in the test suite?
Regards
pepper_chico — 2014-09-03T23:21:11-04:00 — #3
I thought the requirements in Raw HTML were enough for this. Anyway, if not, to tell the truth, IMO it would be much better to have a convention for no-markdown input. Just as there's __ for bold, there could be something else for no-markdown, like \( and \), or something in the markdown sense.
(notice, this was in response to a deleted post)
mightymax — 2014-09-04T10:09:27-04:00 — #4
This does seem like a gaping hole in the current spec. The only thing I can think of currently would be to use an HTML tag to trick the parser into ignoring it but I would definitely prefer an explicit mention in the spec. Also, parsing Markdown inside HTML is a popular option in kramdown, so the trick isn't that reliable.
poke — 2014-09-05T06:27:29-04:00 — #5
MediaWiki has a special <nowiki> … </nowiki> tag to allow adding arbitrary stuff that isn’t passed through the parser. I think something like this would work fine for Markdown too.
pepper_chico — 2014-09-06T00:00:40-04:00 — #6
This feature really just calls for support for a kind of literal in CommonMark. C++ has a nice mechanism for providing raw string literals, which I believe could serve as inspiration:
R"("foo""bar")" → "foo""bar"
R"baz("(foo)""(bar)")baz" → "(foo)""(bar)"
Where R"?( and )?" are the delimiters, with ? standing for an optional, more complex delimiter sequence, chosen to avoid clashes between the delimiters and the contents of the raw string literal.
That said, I can see that a simpler, special-purpose, markdown-like syntax for this would probably provide functionality equivalent to what Raw HTML could provide.
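To make the delimiter idea concrete, here is a minimal sketch in Python that mimics the shape of the C++ raw string literal. The helper name `parse_raw_literal` and the 16-character delimiter limit used here are assumptions for illustration only, not anything proposed for the spec.

```python
import re

# Hypothetical pattern mimicking R"delim( ... )delim".
# Group 1 is the optional custom delimiter, group 2 the verbatim contents.
_RAW = re.compile(r'R"([^()\\\s"]{0,16})\((.*?)\)\1"', re.DOTALL)


def parse_raw_literal(source):
    """Return the verbatim contents of the first raw literal, or None."""
    match = _RAW.search(source)
    return match.group(2) if match else None


print(parse_raw_literal('R"("foo""bar")"'))
print(parse_raw_literal('R"baz("(foo)""(bar)")baz"'))
```

The backreference to the opening delimiter is what lets the contents include `)"` without ending the literal early.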
mb21 — 2014-09-07T08:40:04-04:00 — #7
Text in code blocks and inline code is not interpreted as markdown, which I think is a valid use if you want to show the reader LaTeX code, e.g. 2*7*\pi. If you want the LaTeX to be interpreted for math rendering, usually $...$ is used. Do you have further use cases for no-markdown islands?
pepper_chico — 2014-09-07T09:27:10-04:00 — #8
@mb21 hi, it's not valid because code blocks are, AFAIK, rendered as <pre>...</pre>, which would make the LaTeX code itself be shown verbatim.
mb21 — 2014-09-07T09:45:38-04:00 — #9
I'm not following. Do you want raw TeX like in Pandoc?
pepper_chico — 2014-09-07T09:53:42-04:00 — #10
@mb21 No, I'm suggesting support for something simpler that would also help with MathJax right away, but not limited to it.
pepper_chico — 2014-09-07T10:01:02-04:00 — #11
By the way, Rust is another programming language that offers raw string literals using clever delimiters. For example, r##"foo #"# bar"## gives foo #"# bar. The number of #s in the delimiters can vary, so as to avoid clashes with the contents.
Markdown uses _ and * for italic, bold, and bold-italic. A run of more than 3 such characters could mean no-markdown (or some other character could be used, counting from 1). For example, ****this won't be parsed****.
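As a rough sketch of how such run-length delimiters might be recognized, the Python below treats a run of four or more asterisks as opening an island that is closed by a run of exactly the same length. The matching-length rule is borrowed from the Rust example and is an assumption, as is the function name `find_islands`.

```python
import re

# Assumption: 4+ asterisks open an island; the same-length run closes it.
# The lookarounds stop a run from being matched as part of a longer run.
_ISLAND = re.compile(r'(?<!\*)(\*{4,})(?!\*)(.*?)(?<!\*)\1(?!\*)', re.DOTALL)


def find_islands(text):
    """Return the verbatim contents of every no-markdown island in text."""
    return [match.group(2) for match in _ISLAND.finditer(text)]


print(find_islands("a ****this won't be parsed**** b"))
print(find_islands("*****keeps **** inside*****"))
```

Using a longer run on the outside allows a shorter run to appear inside the island untouched.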
mb21 — 2014-09-07T10:07:18-04:00 — #12
I guess what I'm asking is: do we really need no-markdown islands? To me they seem like kind of a hack, since they don't correspond to any semantic entity like "code" or "math". I don't see them as analogous to programming languages, since there you always have quotes for strings, like x = "foo", and sometimes you want a delimiter other than quotes to avoid having to escape them. But in markdown and other markup languages like HTML, you don't have that.
As for your MathJax use case: e.g. Pandoc seems to do quite a good job by using $...$. Or are you saying there are problems when using that markdown in implementations that don't support math? A basic test turned out not too bad, though.
pepper_chico — 2014-09-07T10:23:20-04:00 — #13
@mb21, I think there's some confusion. I'm not talking about support for parsing and rendering LaTeX, or any other language. When using MathJax in a web page, for example, it is the job of the MathJax script to take the LaTeX and render it appropriately. MathJax will recognize its delimiters, $...$ or \(...\) IIRC, and will take the stuff inside to produce rendered LaTeX. But what's inside is just verbatim LaTeX that did not get pre-processed by some markdown parser, and it should not be.
The job that MathJax does using JavaScript could also be done in other contexts. I don't have another example, but one can infer that input intended to be consumed by some other script would likewise need to be left unparsed.
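To illustrate the problem and one possible workaround, here is a minimal sketch in Python: the delimited spans are stashed before the markdown pass and restored verbatim afterwards. `toy_markdown` is a hypothetical stand-in for a real parser (it only handles `_x_` emphasis), and the delimiter pattern is an assumption rather than MathJax's actual configuration.

```python
import re

# Assumed inline delimiters: \( ... \) and $ ... $.
_MATH = re.compile(r'\\\(.+?\\\)|\$[^$\n]+\$', re.DOTALL)


def toy_markdown(text):
    """Stand-in for a real Markdown parser: only turns _x_ into <em>x</em>."""
    return re.sub(r'_([^_\n]+)_', r'<em>\1</em>', text)


def render_with_islands(text, transform=toy_markdown):
    """Run transform on text while leaving delimited spans untouched."""
    islands = []

    def stash(match):
        # Swap each island for a NUL-wrapped index the transform ignores.
        islands.append(match.group(0))
        return "\x00{}\x00".format(len(islands) - 1)

    protected = _MATH.sub(stash, text)
    rendered = transform(protected)
    # A function replacement inserts the island literally, backslashes intact.
    return re.sub(r'\x00(\d+)\x00', lambda m: islands[int(m.group(1))], rendered)


source = r"_note_: \(a_1 + b_2\)"
print(toy_markdown(source))         # the underscores in the LaTeX get mangled
print(render_with_islands(source))  # the LaTeX survives for the script to consume
```

A built-in no-markdown syntax would make this kind of pre- and post-processing unnecessary.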