# Entity and numeric character references

Valid HTML entity references and numeric character references can be used in place of the corresponding Unicode character, with the following exceptions:

  • Entity and character references are not recognized in code blocks and code spans.

  • Entity and character references cannot stand in place of special characters that define structural elements in CommonMark. For example, although &#42; can be used in place of a literal * character, &#42; cannot replace * in emphasis delimiters, bullet list markers, or thematic breaks.

Conforming CommonMark parsers need not store information about whether a particular character was represented in the source using a Unicode character or an entity reference.

Entity references consist of & + any of the valid HTML5 entity names + ;. The document https://html.spec.whatwg.org/multipage/entities.json is used as an authoritative source for the valid entity references and their corresponding code points.

Example 321

&nbsp; &amp; &copy; &AElig; &Dcaron;
&frac34; &HilbertSpace; &DifferentialD;
&ClockwiseContourIntegral; &ngE;

<p>  &amp; © Æ Ď
¾ ℋ ⅆ
∲ ≧̸</p>
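As a rough illustration of the lookup described above, the sketch below uses Python's standard `html.entities.html5` table, which mirrors the HTML5 named character references. The helper name `decode_named` is hypothetical, and the regular expression is an assumption about what a candidate entity name looks like; neither comes from the specification.

```python
import re
from html.entities import html5  # maps names such as "copy;" to "©"

# Candidate reference: "&", an alphanumeric name, then a mandatory ";".
# (The shape of the name is an assumption made for this sketch.)
NAMED_REF = re.compile(r"&([A-Za-z][A-Za-z0-9]*);")

def decode_named(text):
    """Replace valid HTML5 entity references; leave everything else as is."""
    def repl(match):
        key = match.group(1) + ";"  # keys in html5 include the semicolon
        return html5.get(key, match.group(0))
    return NAMED_REF.sub(repl, text)
```

Under these assumptions, `decode_named("&copy; &AElig;")` yields `© Æ`, while a name that is not in the table is returned unchanged.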

Decimal numeric character references consist of &# + a string of 1–7 arabic digits + ;. A numeric character reference is parsed as the corresponding Unicode character. Invalid Unicode code points will be replaced by the REPLACEMENT CHARACTER (U+FFFD). For security reasons, the code point U+0000 will also be replaced by U+FFFD.

Example 322

&#35; &#1234; &#992; &#0;

<p># Ӓ Ϡ �</p>
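The decimal rule can be sketched in a few lines of Python. The function name `decode_decimal` is hypothetical, and treating surrogates (U+D800 to U+DFFF) and values above U+10FFFF as the "invalid" code points is an assumption of this sketch; the text above only states that invalid code points and U+0000 become U+FFFD.

```python
import re

# "&#", then 1 to 7 ASCII digits, then ";".
DECIMAL_REF = re.compile(r"&#([0-9]{1,7});")
REPLACEMENT = "\uFFFD"

def decode_decimal(text):
    """Decode decimal numeric character references in text."""
    def repl(match):
        cp = int(match.group(1))
        # Assumption: U+0000, surrogates, and out-of-range values are invalid.
        if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return REPLACEMENT
        return chr(cp)
    return DECIMAL_REF.sub(repl, text)
```

A reference with more than seven digits does not match the pattern at all, so it is left in the text as written.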

Hexadecimal numeric character references consist of &# + either X or x + a string of 1–6 hexadecimal digits + ;. They too are parsed as the corresponding Unicode character (this time specified with a hexadecimal numeral instead of decimal).

Example 323

&#X22; &#XD06; &#xcab;

<p>&quot; ആ ಫ</p>
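The hexadecimal form differs only in its prefix and digit set. The sketch below uses a hypothetical helper `decode_hex` and assumes that invalid code points are replaced by U+FFFD in the same way as for decimal references.

```python
import re

# "&#", then "X" or "x", then 1 to 6 hexadecimal digits, then ";".
HEX_REF = re.compile(r"&#[Xx]([0-9A-Fa-f]{1,6});")

def decode_hex(text):
    """Decode hexadecimal numeric character references in text."""
    def repl(match):
        cp = int(match.group(1), 16)
        # Assumption: same invalid-code-point handling as the decimal form.
        if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return "\uFFFD"
        return chr(cp)
    return HEX_REF.sub(repl, text)
```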

Here are some nonentities:

Example 324

&nbsp &x; &#; &#x;
&#87654321;
&#abcdef0;
&ThisIsNotDefined; &hi?;

<p>&amp;nbsp &amp;x; &amp;#; &amp;#x;
&amp;#87654321;
&amp;#abcdef0;
&amp;ThisIsNotDefined; &amp;hi?;</p>

Although HTML5 does accept some entity references without a trailing semicolon (such as &copy), these are not recognized here, because accepting them would make the grammar too ambiguous:

Example 325

&copy

<p>&amp;copy</p>

Strings that are not on the list of HTML5 named entities are not recognized as entity references either:

Example 326

&MadeUpEntity;

<p>&amp;MadeUpEntity;</p>
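To see why the strings above are rejected, it can help to put the three accepted forms into a single pattern. The following single-pass sketch is one possible arrangement, not the reference implementation: `decode_references` is a hypothetical name, the name pattern is an assumption, and the entity table is Python's `html.entities.html5` rather than entities.json itself.

```python
import re
from html.entities import html5

# Decimal, hexadecimal, or named form; the trailing ";" is always required.
REFERENCE = re.compile(
    r"&(?:#([0-9]{1,7})|#[Xx]([0-9A-Fa-f]{1,6})|([A-Za-z][A-Za-z0-9]*));"
)

def decode_references(text):
    """Decode recognized references; anything else stays literal."""
    def repl(match):
        dec, hexa, name = match.groups()
        if name is not None:
            # Unknown names fall through unchanged.
            return html5.get(name + ";", match.group(0))
        cp = int(dec) if dec is not None else int(hexa, 16)
        # Assumption: U+0000, surrogates, and out-of-range values are invalid.
        if cp == 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
            return "\uFFFD"
        return chr(cp)
    return REFERENCE.sub(repl, text)
```

Each nonentity fails for a visible reason: a missing semicolon, an empty or overlong digit string, a non-digit after &#, or a name that is absent from the table.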

Entity and numeric character references are recognized in any context besides code spans or code blocks, including URLs, link titles, and fenced code block info strings:

Example 327

<a href="&ouml;&ouml;.html">

<a href="&ouml;&ouml;.html">

Example 328

[foo](/f&ouml;&ouml; "f&ouml;&ouml;")

<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>

Example 329

[foo]

[foo]: /f&ouml;&ouml; "f&ouml;&ouml;"

<p><a href="/f%C3%B6%C3%B6" title="föö">foo</a></p>

Example 330

``` f&ouml;&ouml;
foo
```

<pre><code class="language-föö">foo
</code></pre>
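In the link examples above, the destination ends up percent-encoded while the title keeps the decoded character. A minimal sketch of that difference follows, with two caveats: `normalize_destination` is a hypothetical helper, and Python's `html.unescape` follows HTML5 parsing rules (it also accepts some references without a semicolon), so it only approximates the behavior described in this section.

```python
import html
from urllib.parse import quote

def normalize_destination(raw):
    """Decode references, then percent-encode non-ASCII characters as UTF-8."""
    decoded = html.unescape(raw)  # "&ouml;" becomes "ö"
    # Keep common URL punctuation and existing "%" escapes untouched.
    return quote(decoded, safe="/:?#[]@!$&'()*+,;=%~-._")

destination = normalize_destination("/f&ouml;&ouml;")
title = html.unescape("f&ouml;&ouml;")  # titles are decoded but not encoded
```

With this input, `destination` is `/f%C3%B6%C3%B6` and `title` is `föö`.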

Entity and numeric character references are treated as literal text in code spans and code blocks:

Example 331

`f&ouml;&ouml;`

<p><code>f&amp;ouml;&amp;ouml;</code></p>

Example 332

    f&ouml;f&ouml;

<pre><code>f&amp;ouml;f&amp;ouml;
</code></pre>

Entity and numeric character references cannot be used in place of symbols indicating structure in CommonMark documents.

Example 333

&#42;foo&#42;
*foo*

<p>*foo*
<em>foo</em></p>

Example 334

&#42; foo

* foo

<p>* foo</p>
<ul>
<li>foo</li>
</ul>

Example 335

foo&#10;&#10;bar

<p>foo

bar</p>

Example 336

&#9;foo

<p>→foo</p>

Example 337

[a](url &quot;tit&quot;)

<p>[a](url &quot;tit&quot;)</p>
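One way to picture this rule is as an ordering constraint: structure is read from the raw source first, and references are decoded only afterwards, inside text content. The toy function below handles a single line and is not a parser; `render_line` and its bullet pattern are hypothetical, and `html.unescape` again stands in for CommonMark's stricter decoding.

```python
import html
import re

# Toy bullet list marker: "-", "+", or "*" followed by at least one space.
BULLET = re.compile(r"^[-+*] +(.*)$")

def render_line(line):
    """Render one line as a list item or a paragraph (toy model only)."""
    match = BULLET.match(line)  # structure is decided on the raw source
    if match:
        body = html.escape(html.unescape(match.group(1)), quote=False)
        return "<ul>\n<li>" + body + "</li>\n</ul>"
    # References are decoded only here, after structure has been settled.
    body = html.escape(html.unescape(line), quote=False)
    return "<p>" + body + "</p>"
```

Because the marker check runs before decoding, `&#42; foo` is rendered as a paragraph containing a literal asterisk, while `* foo` becomes a list item.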