Lightweight Markup

text style | grouping and alignment | images and video | tables, code, math, and graphs | in search of lightweight markup

mediawiki wikidot markdown asciidoc confluence
autolinks CamelCase? no no no no no
autolinks url? yes yes no yes yes
can use html entities? yes yes if quoted: @<&amp;>@ yes yes yes
link offsite [http://website.com Website] [http://website.com Website] [Website](http://website.com) http://website.com[Website] [Website|http://website.com]
link onsite [[page|Pipe Trick]] [[[page|Pipe Trick]]] <a href="page">Pipe Trick</a> link:page.html[Pipe Trick] [Pipe Trick|page]
define anchor Headers automatically get anchors. Also:
<div id="foo"></div>
[[# foo]] <a name="foo"/> anchor:foo[] {anchor:foo}
link to anchor [[#foo|Foo]] [#foo Foo] <a href="#foo">Foo</a> <<anchor:foo,Foo>> [Foo|#foo]
comment <!-- comment --> [!-- comment --] <!-- comment --> // comment {HTMLcomment}in HTML source{HTMLcomment}
{HTMLcomment:hidden}not in HTML source{HTMLcomment}
HTML some HTML tags are permitted [[html]]
<h1>My Heading</h1>
[[/html]]
set off block level html tags with blank lines:

<h1>My Heading</h1>
 
none
text style
mediawiki wikidot markdown asciidoc confluence
italic text two single quotes:
''italic text''
//italic text// *italic text* 'italic text'
_italic text_
_italic text_
bold text three single quotes:
'''bold text'''
**bold text** **bold text** *bold text* *bold text*
fixed width text <tt>fixed width text</tt> {{fixed width text}} `fixed width text` +fixed width text+ {{fixed width text}}
underlined text <u>underlined text</u> __underlined text__ <span style="text-decoration: underline">underlined text</span> [underline]#underlined text# +underlined text+
literal text <nowiki>''not italics''</nowiki> @@//not italics//@@ \*not italics\*
`*not italics*`
+++_not italics_+++ \_not italics\_
superscript2 superscript<sup>2</sup> superscript^^2^^ superscript<sup>2</sup> superscript^2^ superscript{^}2^
subscript2 subscript<sub>2</sub> subscript,,2,, subscript<sub>2</sub> subscript~2~ subscript{~}2~
font color <font color="red">font color</font> ##red|font color## <span style="color: red">font color</span> [red]#font color# {color:red}font color{color}
font size <font size="16">foo</font> [[size 16px]]foo[[/size]]
[[size 120%]]foo[[/size]]
<font size=16>foo</font> {span:style=font-size: 16pt}foo{span}
span w/ class <span class="foo">
text
</span>
[[span class="foo"]]
text
[[/span]]
<span class="foo">
text
</span>
grouping and alignment
mediawiki wikidot markdown asciidoc confluence
line break <br/> space, underscore, newline:
 _
space, space, newline space, plus, newline
 +
newline or \\
horizontal rule four or more hyphens:
----
four or more hyphens:
----
three or more asterisks, hyphens, or underscores:
***
---
___
three or more single quotes
'''
----
align left none [[<]]
Left aligned
[[/<]]
none {align:left}Left aligned{align}
align right none [[>]]
Right aligned
[[/>]]
none {align:right}Right aligned{align}
align center none [[=]]
Centered
[[/=]]
none {center}Centered{center}
justify none [[==]]
Justified
[[/==]]
none {align:justify}Justified{align}
top level heading =top level heading= + top level heading # top level heading also used for the document title, hence permitted only once per document:
= top level heading =
h1. top level heading
next level heading ==next level heading== ++ next level heading ## next level heading == next level heading == h2. next level heading
headings automatically anchored? yes no no yes, with id attribute on header element yes
list element * list element
** sublist element
* list element
 * sublist element
* list element * list element
** sublist element
* list element
** sublist element
numbered list element # numbered list element # numbered list element values in markup are ignored:
1. numbered list element
error if values in markup are not sequential starting from one:
1. numbered list element
# numbered list element
definition list ; one : the 1st cardinal
; two : the 2nd cardinal
: one : the 1st cardinal
: two : the 2nd cardinal
<dl>
<dt>one
<dd>the 1st cardinal
<dt>two
<dd>the 2nd cardinal
</dl>
one::
  the 1st cardinal
two::
  the 2nd cardinal
none
block quote <blockquote>
Four score and twenty
years ago…
</blockquote>
> Four score and twenty
> years ago…
> Four score and twenty
> years ago…
****
Four score and twenty
years ago…
****
bq. Four score and twenty years ago…
collapsible and expandable section none [[collapsible show="+" hide="-"]]
text that can be hidden
[[/collapsible]]
none none
div w/ class <div class="foo">
markup
</div>
[[div class="foo"]]
markup
[[/div]]
<div class="foo">
markup
</div>
images and video
mediawiki wikidot markdown asciidoc confluence
image [[File:foo.jpg]] [[image foo.jpg]] image:foo.jpg[] !foo.jpg!
image link [[File:foo.jpg|link=http://foo.com]] [[image foo.jpg link="http://foo.com"]] <a href="http://foo.com">![](foo.jpg)</a> image:foo.jpg[link="http://foo.com"] none
image with alt [[File:foo.jpg|alt=Foo]] [[image foo.jpg alt="Foo"]] ![Foo](foo.jpg) image:foo.jpg[Foo] none
image size specify width; height will be proportionate:
[[File:foo.jpg|300px]]
height can also be specified; height proportionate if not:
[[image foo.jpg width="300px"]]
height can also be specified; height proportionate if not:
<img width="300px" src="foo.jpg">
specify width; height will be proportionate:
image:foo.jpg[width="300px"]
none
embedded youtube video none [[html]]
copy-and-paste youtube <object>
[[/html]]
copy-and-paste youtube <object> none
tables, code, math, and graphs
mediawiki wikidot markdown asciidoc confluence
table {|border="1"
!A!!B
|-
|1||2
|}
||~ A||~ B||
||1||2||
<table>
<tr><th>A<th>B
<tr><td>1<td>2
</table>
[width="20%", options="header"]
|===
|A|B
|1|2
|===
||A||B||
|1|2|
multiple column cell ||||~ title||
||1||2||
<table>
<tr><th colspan=2>title
<tr><td>1<td>2
</table>
[width="20%", options="header"]
|===
2+|title
|1|2
|===
pre-formatted fixed-width block with no need to escape markup or < and & <pre>
int add(int a, int b) {
  return (a+b);
}
</pre>
[[code]]
int add(int a, int b) {
  return (a+b);
}
[[/code]]
set off from surrounding blocks with blank lines and indent each line at least 4 spaces:

    int add(int a, int b) {
      return (a+b);
    }
 
----
int add(int a, int b) {
  return (a+b);
}
----
{noformat}
int add(int a, int b) {
  return (a+b);
}
{noformat}
highlighted code <source lang=c>
int add(int a, int b) {
  return (a+b);
}
</source>
[[code lang="cpp"]]
int add(int a, int b) {
  return (a+b);
}
[[/code]]
none source-highlight must be installed:
[source,c]
----
int add(int a, int b) {
  return (a+b);
}
----
{code:java}
public class Foo {
  public static void main(String[] args) {
    System.out.println("foo");
  }
}
{code}
languages which can be highlighted 100+ languages php, html, cpp, css, diff, dtd, java, javascript, perl, python, ruby, xml none 150+ languages java, javascript, actionscript, html, xml, sql
inline LaTeX <math>
\int_0^\infty \frac{1}{x^2} dx
</math>
[[$ \int_0^\infty \frac{1}{x^2} dx $]] none none
block LaTeX [[math]]
\int_0^\infty \frac{1}{x^2} dx
[[/math]]
none none
graphviz none none none {graphviz}
digraph {
A -> B
A -> C
C -> D
}
{graphviz}

MediaWiki

MediaWiki Syntax

MediaWiki powers Wikipedia and the wikifarm Wikia. The source code is freely available.

Wikipedia was launched in January 2001. The site initially used wiki software implemented in Perl called UseModWiki. In January 2002 the site switched to custom software written in PHP. The PHP code was rewritten for scalability in July 2002. It was given the name MediaWiki in 2003 and was eventually open sourced.

UseModWiki had a spare set of markup which did not expand much on the markup used by Wiki Base, the original wiki software used by WikiWikiWeb.

Wiki Base (1995) UseModWiki (1999) MediaWiki (2002)
link CamelCase CamelCase
[[Double Bracket]]
[[Double Bracket]]
italic ''italic'' ''italic'' ''italic''
bold '''bold''' '''bold''' '''bold'''
horizontal rule ---- ---- ----
top level heading none none =top level heading=
next level heading none none ==next level heading==
bullet list item * list item * list item * list item
numbered list item none # list item # list item
image can be URL:
foo.jpg
can be URL:
foo.jpg
[[File:foo.jpg]]
MediaWiki permits raw URL images like its predecessors but the feature is turned off on Wikipedia
table none added in 2003:
||A||B||
||1||2||
{|
!A!!B
|-
|1||2
|}

For those who like to experiment, here are sandboxes for Wiki Base and UseModWiki.

In Wiki Base the way to prevent a camel case word from becoming a link was to insert six single quotes into the word like this: C''''''amelCase. The odd markup was not intentional. The six single quotes actually parse as a bold empty string.
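The quirk is easy to reproduce. Here is a minimal Python sketch, not the Wiki Base code itself; the two regular expressions and the names BOLD, CAMEL, and render are assumptions made for illustration.

```python
import re

# Assumed patterns, for illustration only; the real Wiki Base rules may differ.
BOLD = re.compile(r"'''(.*?)'''")                       # three quotes open, three close
CAMEL = re.compile(r"\b[A-Z][a-z]+(?:[A-Z][a-z]+)+\b")  # e.g. CamelCase


def render(text):
    """Turn CamelCase words into links, then convert bold markup."""
    text = CAMEL.sub(
        lambda m: '<a href="%s">%s</a>' % (m.group(0), m.group(0)), text
    )
    return BOLD.sub(r"<b>\1</b>", text)
```

With these rules render("C''''''amelCase") yields C<b></b>amelCase: the six quotes become an empty bold element, and the word no longer matches the camel case pattern.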

UseModWiki had a feature called "free links" which was attractive to the founders of Wikipedia. It permitted page titles to have spaces, commas, periods, or hyphens in them. These pages could be linked with double brackets: e.g. [[Los Angeles]]. In the URL the spaces are replaced with underscores. UseModWiki is case insensitive but MediaWiki is case sensitive except for the first letter.
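The title handling described above can be sketched in Python. This is a simplification under stated assumptions: the function name free_link_to_path is hypothetical, and whatever else MediaWiki does to a title (such as encoding special characters) is left out.

```python
def free_link_to_path(title):
    """Map a free link title such as 'Los Angeles' to a URL path segment."""
    # Spaces in the title become underscores in the URL.
    path = title.strip().replace(" ", "_")
    # Only the first letter is case insensitive, so normalize just that one.
    return path[:1].upper() + path[1:]
```

So [[los Angeles]] and [[Los Angeles]] lead to the same page, but [[Los angeles]] does not.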

Wikidot

Wikidot Syntax

A wikifarm with its own markup which is edited directly instead of via a WYSIWYG editor.

Supports a personal domain and custom CSS. Free and ad-supported or paid and ad-free.

Markdown

Markdown

Markdown was developed in 2004. The first implementation was in Perl. It is used by Tumblr, Stack Overflow, and Reddit. Here is a sandbox.

On Ubuntu you can install a script which converts Markdown to HTML with the following command:

sudo apt-get install markdown

It is a Perl script. If you copy it over to a Mac it will probably work as long as the necessary Perl module (Text::Markdown) is installed.

The command line script converts Markdown to HTML. Invoke it like this:

$ markdown foo.md > foo.html

AsciiDoc

AsciiDoc User Guide
HTML Slidy: Slide Shows in HTML and XHTML

AsciiDoc can be installed by a variety of package managers. On Ubuntu:

sudo apt-get install asciidoc

On a Mac with MacPorts:

sudo port install asciidoc

AsciiDoc documents have a .txt suffix by default. The following command will create a file called foo.html:

$ asciidoc foo.txt

slideshows

$ asciidoc -b slidy foo.txt

man pages

Confluence

Confluence Wiki Markup

A proprietary wiki integrated with JIRA, the issue tracking software also made by Atlassian.

In Search of Lightweight Markup

Blogs

A simple markup strategy is to use HTML markup but not the whitespace treatment. A scheme can be devised for using whitespace to indicate paragraphs and line breaks, making those tags unnecessary. We see this in blogging software. Blogger is simplistic: it converts each newline to a line break, <br>, and puts the entire post in a <div>.

Wordpress converts a string of whitespace characters containing a single newline to a line break: <br>. If the string of whitespace characters contains two or more newlines it is converted to a paragraph break: </p><p>. The beginning and end of the post get <p> and </p> tags so that the paragraphs are all closed. With Wordpress it isn't possible to put more than one line of space between paragraphs.
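The two whitespace schemes can be sketched in a few lines of Python. This follows the descriptions above and is not taken from the Blogger or Wordpress source; the function names are hypothetical.

```python
import re


def blogger_style(post):
    """Every newline becomes <br>; the whole post goes in a <div>."""
    return "<div>" + post.replace("\n", "<br>") + "</div>"


def wordpress_style(post):
    """One newline in a whitespace run is a line break, two or more a paragraph break."""
    def convert(match):
        newlines = match.group(0).count("\n")
        if newlines >= 2:
            return "</p><p>"
        if newlines == 1:
            return "<br>"
        return match.group(0)  # plain spaces and tabs are left alone

    body = re.sub(r"\s+", convert, post.strip())
    return "<p>" + body + "</p>"
```

Note that wordpress_style("a\nb\n\n\n\nc") gives <p>a<br>b</p><p>c</p>: the four newlines collapse to one paragraph break, which is why extra vertical space can't be added.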

A difficulty with blog markup is source code. Both Blogger and Wordpress whitelist the <pre> element which can be used to preserve whitespace. The <code> element can be used to encourage a fixed width font rendering. If this is done the <code> element goes inside the <pre> element as <pre> is a block level element and only inline content is permitted inside the <code> element. Note that every <, >, and & in the code must be replaced with the HTML entities &lt; &gt; and &amp; because the <![CDATA[ mechanism described below doesn't work. I see a lot of blogs with code snippets and I don't know how they do it. A Wordpress plugin, perhaps?
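Doing the entity replacement by hand is tedious, so it is worth scripting. A minimal sketch using the escape function from the Python standard library; the wrapper name code_to_pre is made up for this example.

```python
from html import escape


def code_to_pre(source):
    """Wrap source code for pasting into a blog post, escaping <, > and &."""
    # quote=False leaves single and double quotes alone; they are safe in element content.
    return "<pre><code>" + escape(source, quote=False) + "</code></pre>"
```

For example, code_to_pre('if x < 3 && y > 0') produces <pre><code>if x &lt; 3 &amp;&amp; y &gt; 0</code></pre>.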

Blogging markup observes the first rule of lightweight markup: (1) use whitespace to define paragraphs. It disregards the second rule: (2) make it possible to copy-and-paste source code into the markup.

HTML

How light can one be with HTML?

It is worthwhile to read what the W3C has to say about paragraphs and line breaks. Paragraphs can only contain inline elements, though they are not themselves inline elements. Thus paragraphs cannot be nested. It is not required that text be inside a paragraph. The end tag is optional. It will be inferred by a subsequent paragraph tag or the end of the parent tag.

The line break, by contrast, is an inline element and the end tag is prohibited.

The rules regarding <p> and <br> are a conspicuous difference between HTML and XHTML. XHTML like XML requires a closing tag for all elements. Plain HTML is better for being lightweight.
If an HTML end tag is optional or forbidden, don't use it. If an HTML document does not have a head, then the html, head, and body tags can be completely eliminated. Here are the common HTML elements with optional (O) or forbidden (F) tags:

element start tag end tag
body O O
br - F
head O O
hr - F
html O O
img - F
li - O
p - O
td - O
th - O
tr - O
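To get a feel for how much weight the optional end tags add, here is a rough Python sketch that strips some of them from existing markup. It assumes well-formed lists and tables; the name lighten is hypothetical. The p end tag is deliberately left out, because text that directly follows a paragraph would be absorbed into it once the end tag is gone.

```python
import re

# End tags from the table above that can be dropped from well-formed lists and tables.
OPTIONAL_END = re.compile(r"</(?:li|td|th|tr)\s*>", re.IGNORECASE)


def lighten(html):
    """Remove the optional end tags for li, td, th and tr."""
    return OPTIONAL_END.sub("", html)
```

Applied to <tr><td>1</td><td>2</td></tr> it returns <tr><td>1<td>2.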

If you want to display source code in HTML, here is the best way to do it:

<pre><![CDATA[
if x < 3 && y > 0 then
  print "hello"
endif
]]></pre>

Putting <code> tags inside the <pre> is usually unnecessary if you have control of the CSS. If you don't and you aren't getting fixed width font with the above then add them and hope they fix it.

Here is the third rule of lightweight markup: (3) be lighter than intelligent use of HTML (i.e. omitting optional and forbidden tags). In particular, have a lighter solution than <pre><![CDATA[ for source code.

Tables

There is a lot of information I like to store in tables. I used to edit HTML tables directly, but it was too slow and too hard to edit the tables when they got big. Part of the problem was I didn't know the closing tags were optional so my field separator was nine characters: </td><td>. Also I think that mixing punctuation and letters in the field separator makes the markup hard to read. We can make a rule out of it: (4) don't mix letters and punctuation in the field separators for tables. The hope is to make it easy to look at the markup and see how many columns are in a row.

Mediawiki table syntax lacks a property that most of the other lightweight markups have and that seems desirable: (5) lines of table markup should be one-to-one with rows in the table. Mediawiki uses extra lines to start and terminate the table as well as extra lines to separate rows. The one-to-one mapping of lines to rows makes it easier to move rows around in the table. I would like it if it were easier to move columns around as well, but I don't see how that can be achieved with markup. Spreadsheets can do it and emacs in org-mode can do it. It might be worthwhile to develop the ability to export and import the data to and from a spreadsheet or emacs, maybe with the help of awk, for this feature and the fact that they align the columns.
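When lines are one-to-one with rows, moving a column is at least scriptable. Here is a Python sketch for the simple ||-delimited row syntax shown in the Wikidot column of the table above; the function name swap_columns is hypothetical, and cells spanning multiple columns are not handled.

```python
def swap_columns(table, i, j):
    """Swap columns i and j (zero-based) in a table with one ||-delimited row per line."""
    out = []
    for line in table.splitlines():
        line = line.strip()
        if not (line.startswith("||") and line.endswith("||")):
            raise ValueError("not a table row: %r" % line)
        # Drop the leading and trailing delimiters, then split into cells.
        cells = line[2:-2].split("||")
        cells[i], cells[j] = cells[j], cells[i]
        out.append("||" + "||".join(cells) + "||")
    return "\n".join(out)
```

The same approach would not work on Mediawiki tables without first working out which lines are rows.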

Wikidot and Confluence have the undesirable property that if you need to add attributes to elements in the table you have to switch to a completely different and more verbose syntax: (6) the table markup should permit setting attributes so appearance can be fine-tuned. We can't have two different types of syntax because if you go with the simple syntax and later realize you need to switch you are in for a painful editing session.

I would like to be able to copy-and-paste data from an HTML page to a spreadsheet. It almost works because someone (the browser?) uses tabs to indicate the field separators. It fails if there are line breaks in any of the cells.
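If the table data is available to a script, one possible workaround is to write the tab-separated text yourself and quote any cell that contains a line break. A sketch using the csv module from the Python standard library; the function names are made up, and whether a given spreadsheet honors the quoting on paste or import is an assumption I have not verified.

```python
import csv
import io


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text, quoting cells that contain line breaks."""
    buf = io.StringIO(newline="")
    csv.writer(buf, delimiter="\t").writerows(rows)
    return buf.getvalue()


def tsv_to_rows(text):
    """Parse tab-separated text back into rows, honoring quoted line breaks."""
    return list(csv.reader(io.StringIO(text, newline=""), delimiter="\t"))
```

A cell such as "line one\nline two" survives the round trip through rows_to_tsv and tsv_to_rows as a single cell.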

A Few More Rules

(7) the markup should be complete. This can be achieved by providing an escape mechanism to enter HTML. A whitelist can be provided for applications that need to limit the allowed HTML.
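A whitelist can be sketched with the HTML parser in the Python standard library. This is an illustration only, not a vetted sanitizer: the class and function names are hypothetical, attributes on allowed tags are dropped, and tags not on the whitelist are removed while their text is kept.

```python
from html import escape
from html.parser import HTMLParser


class WhitelistFilter(HTMLParser):
    """Keep only whitelisted tags; all attributes are dropped."""

    def __init__(self, allowed):
        super().__init__(convert_charrefs=True)
        self.allowed = set(allowed)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag in self.allowed:
            self.parts.append("<%s>" % tag)

    def handle_endtag(self, tag):
        if tag in self.allowed:
            self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        # Text arrives with entities decoded, so escape it again on the way out.
        self.parts.append(escape(data, quote=False))


def filter_html(markup, allowed):
    """Return markup with every tag outside the whitelist removed."""
    parser = WhitelistFilter(allowed)
    parser.feed(markup)
    parser.close()
    return "".join(parser.parts)
```

For example, filter_html("<b>bold</b> and <blink>blinking</blink>", {"b", "pre"}) keeps the <b> element and reduces the <blink> element to its text.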

(8) the markup should be as easy to read as plain text.

(9) it should be possible to distinguish the markup from the plain text with a few simple rules. Wikidot is pretty good about this because most of the markup contains adjacent punctuation (often the same punctuation character duplicated). These are the exceptions: links, lists, block quotes, and H1 headings.

(10) all plain text is valid markup.

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License