ToolBook

HTML Entity Encoder / Decoder

How to encode and decode HTML entities online

Encode special characters to safe HTML entities or decode HTML entities back to readable text.

  1. Choose Encode or Decode mode

    Select Encode to convert characters like <, >, & and " into HTML entities safe for markup. Select Decode to reverse HTML entities back to their original characters.

  2. Paste your text or markup

    Paste the HTML snippet, plain text, or entity string you want to process into the input box.

  3. Select entity format

    When encoding, pick Named (e.g. &amp;) for readability, Numeric (e.g. &#38;) for decimal code points, or Hex (e.g. &#x26;) to match Unicode code charts.

  4. Copy the result

    Click Copy to send the encoded or decoded output to your clipboard and paste it into your code.
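The steps above can be sketched in Python. This is an illustration under stated assumptions, not the tool's actual implementation: the names `encode`, `decode`, `NAMED`, and the `fmt` parameter are hypothetical, and the sketch only encodes the five reserved characters. The hand-written mapping exists only to show the three output formats; in production code a standard-library escaper is the safer choice.

```python
import html

# Hypothetical mapping for the five reserved characters (named format).
NAMED = {"&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;", "'": "&#39;"}

def encode(text: str, fmt: str = "named") -> str:
    """Encode the five reserved characters as named, numeric, or hex entities."""
    if fmt not in ("named", "numeric", "hex"):
        raise ValueError(f"unknown format: {fmt}")
    out = []
    for ch in text:
        if ch not in NAMED:
            out.append(ch)                    # leave ordinary characters alone
        elif fmt == "named":
            out.append(NAMED[ch])
        elif fmt == "numeric":
            out.append(f"&#{ord(ch)};")       # decimal code point
        else:
            out.append(f"&#x{ord(ch):X};")    # hexadecimal code point
    return "".join(out)

def decode(text: str) -> str:
    """Decode named, decimal, and hex entities with the standard library."""
    return html.unescape(text)

print(encode('<a href="x">', "named"))
print(encode("a & b", "numeric"))
print(encode("a & b", "hex"))
print(decode("&lt;b&gt;"))
```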

Frequently asked questions

What are HTML entities?

HTML entities are special codes used to represent characters that have a reserved meaning in HTML (&, <, >, ") or characters that are difficult to type directly. They start with & and end with ;, such as &amp; for & or &lt; for <.

When should I encode HTML entities?

Always encode user-supplied text before inserting it into HTML. Failing to do so exposes you to Cross-Site Scripting (XSS) attacks: a malicious user could inject <script> tags or other harmful markup through any unescaped text. Any backend template or front-end innerHTML call should escape &, <, >, ", and ' first.

What is the difference between named and numeric entities?

Named entities use a human-readable name like &amp; or &copy;. Numeric entities give the character's Unicode code point in either decimal (&#38;) or hex (&#x26;). All three formats render identically in a browser. Decimal and hex entities cover every Unicode character; named entities only cover a defined subset.
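As a quick check that the three formats are equivalent, here is a small Python sketch using the standard library's `html.unescape`. The variable names are only illustrative.

```python
import html

# Named, decimal, and hex forms of the same character.
forms = ["&amp;", "&#38;", "&#x26;"]

# Each form decodes to the same single character.
decoded = [html.unescape(form) for form in forms]
print(decoded)

assert decoded == ["&", "&", "&"]
```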

What is hex format for HTML entities?

Hex entities use the format &#xNN; where NN is the character's Unicode code point written in hexadecimal. For example, &#x26; represents & and &#x3C; represents <. Hex notation maps directly to Unicode charts, which many developers find easier to cross-reference when working with non-Latin scripts or symbols.
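A minimal Python sketch of how a hex entity lines up with a Unicode chart entry. The helper name `hex_entity` is hypothetical; `unicodedata.name` is from the standard library and is used here only to show the chart name next to the code point.

```python
import unicodedata

def hex_entity(ch: str) -> str:
    """Return the hex entity for a single character, e.g. '&' -> '&#x26;'."""
    return f"&#x{ord(ch):X};"

for ch in "&<©":
    # Print the character, its hex entity, its U+ notation, and its chart name.
    print(ch, hex_entity(ch), f"U+{ord(ch):04X}", unicodedata.name(ch))
```

The digits after `&#x` are the same digits that follow `U+` in a code chart, which is what makes the cross-reference straightforward.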

Which characters must always be encoded in HTML?

Five characters should always be encoded when inserting user-supplied text into a page: & encodes to &amp;, < encodes to &lt;, > encodes to &gt;, " encodes to &quot;, and ' encodes to &#39;. Strictly, & and < matter everywhere, while " and ' matter inside attribute values quoted with that same character, but escaping all five is the safe default because these characters define HTML structure. All other characters are optional to encode but can improve compatibility with older parsers.

Does HTML encoding protect against all XSS attacks?

HTML entity encoding protects against HTML-context injection. It does not protect against JavaScript context injection (inside event handlers or <script> tags) or CSS injection. Those require additional escaping appropriate to their context, so always encode for the specific context where the text will appear.

What is the difference between HTML encoding and URL encoding?

HTML encoding replaces characters with HTML entities (e.g. &amp; for &) and is used when embedding text inside HTML markup. URL encoding replaces characters with percent-encoded bytes (e.g. %26 for &) and is used inside URL query strings and path segments. They serve different contexts and are not interchangeable.
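To make the contrast concrete, here is a short Python sketch using two standard-library functions, `html.escape` and `urllib.parse.quote`. The sample value and the `/search?q=` path are made up for illustration.

```python
import html
from urllib.parse import quote

value = "Tom & Jerry"

# HTML encoding: for text that will sit inside markup.
html_encoded = html.escape(value)

# URL encoding: for a value placed inside a query string or path segment.
url_encoded = quote(value, safe="")

print(html_encoded)  # Tom &amp; Jerry
print(url_encoded)   # Tom%20%26%20Jerry

# A link needs both: URL-encode the query value, HTML-encode the visible text.
link = f'<a href="/search?q={url_encoded}">{html_encoded}</a>'
print(link)
```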

How do I encode HTML entities in JavaScript or Python?

In JavaScript, most template engines and frameworks handle encoding automatically. When building DOM nodes by hand, set element.textContent instead of innerHTML: the browser treats the value as plain text, so any markup in it is displayed rather than parsed. In Python, the built-in html.escape() function encodes the five core HTML characters by default. Relying on your language's standard library is safer than writing a custom replacement.
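For the Python side, a minimal sketch of `html.escape` and its `quote` parameter. Note that the standard library emits the hex form `&#x27;` for the apostrophe, which is equivalent to `&#39;`.

```python
import html

raw = "&<>\"'"

# Default (quote=True): all five characters are encoded.
escaped_all = html.escape(raw)
print(escaped_all)       # &amp;&lt;&gt;&quot;&#x27;

# quote=False: only &, < and > are encoded; quotes are left as-is.
escaped_basic = html.escape(raw, quote=False)
print(escaped_basic)     # &amp;&lt;&gt;"'

# html.unescape reverses either form.
assert html.unescape(escaped_all) == raw
```

Leaving `quote` at its default is the safer choice whenever the result might end up inside an attribute value.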

What is &nbsp; and when should I use it?

&nbsp; is a non-breaking space. Unlike a regular space, it prevents a line break at its position and is not merged with neighbouring spaces where the browser would normally collapse whitespace. Use it for units attached to numbers (e.g. 25&nbsp;°C) or to keep two words together on the same line.

HTML entities: XSS prevention, special characters, and the full entity list

Why you must escape HTML before rendering user content, and the difference between named and numeric entities.

Why HTML entities exist

HTML uses < and > as tag delimiters, & as the start of entity references, and " inside attribute values. If your content contains these characters, the parser will misinterpret them as markup.

The solution is to replace them with their entity equivalents before inserting into HTML:

| Character | Named entity | Numeric entity | Why |
|---|---|---|---|
| & | &amp; | &#38; | Starts an entity reference |
| < | &lt; | &#60; | Opens a tag |
| > | &gt; | &#62; | Closes a tag |
| " | &quot; | &#34; | Ends a double-quoted attribute value |
| ' | &apos; (HTML5 and XML only) | &#39; | Ends a single-quoted attribute value |

Encoding and XSS prevention

Failing to encode user-supplied content before inserting it into HTML is the root cause of Cross-Site Scripting (XSS) — one of the most prevalent web vulnerabilities. If a user submits <script>alert(1)</script> and you render it verbatim, the browser executes it.

Encoding < to &lt; and > to &gt; turns executable markup into inert text that displays correctly but never runs.
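A minimal Python sketch of that effect. The function `render_comment` is a hypothetical stand-in for whatever template code builds the page; only `html.escape` is a real library call.

```python
import html

def render_comment(user_text: str) -> str:
    """Wrap untrusted text in a paragraph, escaping it first."""
    return f"<p>{html.escape(user_text)}</p>"

payload = "<script>alert(1)</script>"
rendered = render_comment(payload)
print(rendered)  # <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>

# No literal <script> tag survives, so the browser shows the text instead of running it.
assert "<script>" not in rendered
```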

Context matters. HTML entity encoding protects you in HTML context (between tags and inside quoted attribute values). It does not protect you in:

  • JavaScript context (<script> tags and inline event handlers such as onclick): requires JavaScript string escaping
  • CSS context (<style> tags and style attributes): requires CSS escaping
  • URL attribute context (href, src): requires URL encoding, plus a check that the scheme is not javascript:
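As one example of context-specific handling, here is a hedged Python sketch for the URL attribute case. The helper `safe_href` and the `ALLOWED_SCHEMES` allow-list are hypothetical, and the policy (rejecting everything without an allowed scheme, including relative URLs) is a deliberately strict assumption rather than a complete defence.

```python
import html
from urllib.parse import urlparse

# Hypothetical allow-list; adjust to what your application actually needs.
ALLOWED_SCHEMES = {"http", "https", "mailto"}

def safe_href(url: str) -> str:
    """Return a value safe to place in href="...", or "#" if the scheme is not allowed."""
    url = url.strip()
    scheme = urlparse(url).scheme.lower()
    if scheme not in ALLOWED_SCHEMES:
        return "#"                         # blocks javascript:, data:, and relative URLs
    return html.escape(url, quote=True)    # then HTML-encode for the attribute

print(safe_href("https://example.com/?a=1&b=2"))
print(safe_href("javascript:alert(1)"))
```

HTML-encoding alone would leave `javascript:alert(1)` unchanged, which is why the scheme check comes first.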

Named vs numeric entities

Named entities like &copy; or &mdash; are human-readable and covered by the HTML5 specification. There are over 2,000 named entities.

Numeric entities come in two forms:

  • Decimal: &#169; for ©
  • Hex: &#xa9; for © (note the x prefix)

Numeric entities cover every Unicode code point. They are more portable since they work in XHTML and XML without an entity declaration, while named entities outside XML's five predefined ones (&amp;, &lt;, &gt;, &quot;, &apos;) require a DOCTYPE or XML entity declaration.
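A short Python sketch of "named where a name exists, numeric otherwise". The helper `to_entity` is hypothetical; `html.entities.codepoint2name` is a standard-library table that covers only the older HTML 4 entity names, so it is a smaller set than the full HTML5 list.

```python
from html.entities import codepoint2name

def to_entity(ch: str) -> str:
    """Prefer a named entity; fall back to a decimal numeric entity."""
    name = codepoint2name.get(ord(ch))
    return f"&{name};" if name else f"&#{ord(ch)};"

print(to_entity("©"))   # &copy;
print(to_entity("—"))   # &mdash;
print(to_entity("😀"))  # &#128512; (no named entity, so numeric)
```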

Non-breaking space (&nbsp;)

&nbsp; is the most misunderstood entity. It looks like a space but has two special behaviours:

  1. The browser will not wrap a line at an &nbsp; position.
  2. It does not collapse: consecutive &nbsp; entities create multiple spaces, unlike regular spaces, which the browser collapses to one when rendering.

Good uses: 25&nbsp;°C, Dr.&nbsp;Smith, phone number groups. Bad use: as an indent or paragraph spacer (use CSS for that).
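A small Python sketch showing that &nbsp; is a distinct character (U+00A0) rather than an ordinary space (U+0020), which is why it behaves differently. Only `html.unescape` is used; the sample string is taken from the example above.

```python
import html

decoded = html.unescape("25&nbsp;°C")

# The third character is U+00A0 NO-BREAK SPACE, not a regular space.
print(f"U+{ord(decoded[2]):04X}")

assert decoded[2] == "\u00a0"
assert decoded != "25 °C"   # a regular space would compare unequal
```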

Typography entities

HTML ships rich typographic entities that save you from inserting literal Unicode characters:

  • &mdash; (—) em dash — for parenthetical interruptions — like this.
  • &ndash; (–) en dash — for ranges: 2020–2024.
  • &hellip; (…) horizontal ellipsis — better than three separate dots.
  • &ldquo; &rdquo; (“ ”) curly double quotes.
  • &lsquo; &rsquo; (‘ ’) curly single quotes.

Many Markdown processors and content management systems insert these automatically via smart-quotes processing, but when hand-authoring HTML it is worth knowing the entity names.