Common HTML Entities Reference

Character	Entity Name	Entity Number
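<	&lt;	&#60;
>	&gt;	&#62;
&	&amp;	&#38;
"	&quot;	&#34;
'	&apos;	&#39;
©	&copy;	&#169;
®	&reg;	&#174;
™	&trade;	&#8482;
€	&euro;	&#8364;
£	&pound;	&#163;
¢	&cent;	&#162;
—	&mdash;	&#8212;
–	&ndash;	&#8211;
(non-breaking space)	&nbsp;	&#160;
×	&times;	&#215;
÷	&divide;	&#247;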

What Are HTML Entities?

HTML entities are special character sequences that represent characters which have special meaning in HTML syntax or which cannot easily be typed on a standard keyboard. An entity starts with an ampersand (&) and ends with a semicolon (;), with a name or numeric code in between. For example, the less-than sign (<) is written as &lt; in HTML. Without this encoding, the browser would interpret it as the start of an HTML tag.

There are two types of HTML entity references: named entities, which use human-readable names like &copy; for the copyright symbol ©, and numeric character references, which use the decimal (&#169;) or hexadecimal (&#xA9;) Unicode code point. Named entities are easier to remember and read; numeric references work for any Unicode character, even one with no named entity defined.
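As a small illustration of the two reference types, the sketch below decodes the named, decimal, and hexadecimal spellings of the copyright sign with `html.unescape` from Python's standard library. The variable names are only illustrative.

```python
import html

# Three spellings of the copyright sign: named, decimal, hexadecimal.
references = ["&copy;", "&#169;", "&#xA9;"]

# html.unescape resolves each reference to the character it stands for.
decoded = [html.unescape(ref) for ref in references]

# Numeric references work for any code point; U+1F600 is written here in hex.
emoji = html.unescape("&#x1F600;")

print(decoded, emoji)
```

All three entries of `decoded` are the same one-character string, which is the sense in which the spellings are interchangeable.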

When to Use HTML Entities

HTML entity encoding is required in specific situations:

Characters with syntactic meaning: Five characters have special meaning in HTML: < (less-than), > (greater-than), & (ampersand), " (double quote), and ' (apostrophe). Encode them as &lt;, &gt;, &amp;, &quot;, and &#39; (or &apos;) whenever they are meant as literal text rather than markup. In text content the critical ones are < and &; inside a quoted attribute value, the matching quote character must also be encoded. Failing to encode them can break your HTML structure or create cross-site scripting (XSS) vulnerabilities.
Security and XSS prevention: User-generated content (names, comments, form inputs) that is displayed in HTML must have special characters encoded to prevent XSS attacks. If a user enters <script>alert('xss')</script> as their name and you display it unencoded, the browser will execute the script. Encoding the tag delimiters, so that the name starts with &lt;script&gt;, makes the browser display it safely as text.
Special symbols and punctuation: Typographic characters like em dash (—), en dash (–), non-breaking space, copyright symbol, registered trademark, and currency symbols can either be typed directly as UTF-8 characters (safe in modern HTML5) or encoded as entities. Using entities is useful when the character cannot be typed easily or when working in ASCII-only environments.
Mathematical and scientific notation: Symbols like ×, ÷, ±, ∞, ∑, and π have named entities that make their use in HTML self-documenting and portable.
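The XSS case in the list above can be sketched with `html.escape` from Python's standard library. The `render_comment` helper and its markup are hypothetical, chosen only to show user input being encoded before it is placed in HTML text content.

```python
import html


def render_comment(author: str, body: str) -> str:
    """Build a comment snippet, encoding both user-supplied values first."""
    safe_author = html.escape(author)  # encodes & < > " and '
    safe_body = html.escape(body)
    return f"<p><strong>{safe_author}</strong>: {safe_body}</p>"


# A hostile "name" is rendered as visible text instead of being executed.
markup = render_comment("<script>alert('xss')</script>", "Tom & Jerry")
print(markup)
```

Note that `html.escape` writes the apostrophe as `&#x27;`, the hexadecimal form of `&#39;`.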

UTF-8 vs HTML Entities

In modern HTML5 documents with a UTF-8 charset declaration (<meta charset="UTF-8">), you can use most Unicode characters directly in your HTML without encoding them as entities. A copyright symbol can be typed or pasted as © directly, and the browser renders it correctly. The entity for that character (&copy;) is functionally equivalent; such entities exist for compatibility with older encodings and ASCII-only environments.

However, the five HTML-significant characters (<, >, &, ", ') must still be encoded when they are meant as literal text, regardless of charset. This is not a charset issue; it is a structural HTML parsing issue. Forgetting this is one of the most common causes of HTML rendering bugs and XSS vulnerabilities in web applications.
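For the ASCII-only case mentioned above, one possible approach in Python is to encode the structural characters first and then let the `xmlcharrefreplace` error handler turn every remaining non-ASCII character into a decimal numeric reference. The function name `to_ascii_html` is an assumption made for this sketch.

```python
import html


def to_ascii_html(text: str) -> str:
    """Return text-content-safe, pure-ASCII HTML for the given string."""
    # Step 1: structural characters (& < >) become named entities.
    escaped = html.escape(text, quote=False)
    # Step 2: anything outside ASCII becomes a decimal numeric reference.
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_html("© 2024 <Café & Co>"))
```

In a UTF-8 document only step 1 is needed; step 2 matters when the output has to survive a channel that cannot carry non-ASCII bytes.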

HTML Entities and Accessibility

Screen readers and other assistive technologies handle HTML entities correctly: the browser decodes each entity while parsing, so they read the character the entity represents, not the entity code itself. A screen reader encountering &copy; will announce "copyright" or the copyright symbol, not "ampersand-c-o-p-y-semicolon". Choosing an entity over the literal character therefore does not harm accessibility.

Frequently Asked Questions

Do I need to encode all non-ASCII characters?

No. In HTML5 documents with a UTF-8 declaration, you can use any Unicode character directly without encoding. The only characters you must always watch for are the five HTML-significant characters: < > & " and ' (each in specific contexts). Non-ASCII characters like accented letters (é, ñ, ü), currency symbols (€, £), and emoji can be included directly in UTF-8 HTML without entity encoding.

What is the difference between &lt; and &#60;?

Both represent the same character, the less-than sign (<). &lt; is a named entity reference that uses a human-readable name. &#60; is a decimal numeric character reference using the Unicode/ASCII code point 60. &#x3C; is the hexadecimal equivalent. Named entities are easier to read; numeric references are more universal and can represent any Unicode character, even one with no named entity.
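To make the relationship between the forms concrete, the sketch below derives all of them for a given character. The `reference_forms` helper is hypothetical; it relies on `html.entities.codepoint2name`, which only covers the older HTML 4 entity names, so a missing name here does not prove that HTML5 defines none.

```python
from html.entities import codepoint2name


def reference_forms(char: str) -> dict:
    """Return the named, decimal, and hexadecimal references for one character."""
    code_point = ord(char)
    name = codepoint2name.get(code_point)  # None when no HTML 4 name is known
    return {
        "named": f"&{name};" if name else None,
        "decimal": f"&#{code_point};",
        "hex": f"&#x{code_point:X};",
    }


print(reference_forms("<"))
print(reference_forms("\U0001F600"))  # an emoji: numeric forms only
```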

Does HTML entity encoding prevent XSS?

HTML entity encoding is the primary defence against reflected and stored XSS in HTML contexts: it prevents user-supplied characters like < and > from being interpreted as HTML tags. However, encoding must be context-appropriate. Different encoding is required for HTML attributes, JavaScript strings, URLs, and CSS values. A single encoding function applied universally is not sufficient; proper XSS prevention requires context-aware output encoding at each insertion point.
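A minimal sketch of what context-aware encoding can look like, using only Python's standard library. The four helper names are hypothetical, CSS is left out, and a real application would normally lean on its template engine's auto-escaping rather than hand-written helpers like these.

```python
import html
import json
from urllib.parse import quote


def for_html_text(value: str) -> str:
    # Text content: encode & < > so nothing is parsed as markup.
    return html.escape(value, quote=False)


def for_html_attribute(value: str) -> str:
    # Quoted attribute value: also encode " and '.
    return html.escape(value, quote=True)


def for_url_component(value: str) -> str:
    # Query-string value: percent-encode every reserved character.
    return quote(value, safe="")


def for_js_string(value: str) -> str:
    # JavaScript string literal; < > & are written as \u escapes so that
    # the literal cannot close an enclosing <script> element.
    literal = json.dumps(value)
    return literal.replace("<", "\\u003c").replace(">", "\\u003e").replace("&", "\\u0026")


sample = 'a < "b"'
print(for_html_text(sample), for_html_attribute(sample))
print(for_url_component("a&b=c d"), for_js_string("</script>"))
```

The same input yields a different output in each context, which is why one universal encoder cannot cover all insertion points.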

Why is the non-breaking space entity used?

A non-breaking space (&nbsp;) is a space character that prevents a line break at that point. It is used to keep words that should stay together on the same line, such as "10 kg", "Dr. Smith", or any number and its unit. It is also used for fine-grained typographic spacing in HTML. However, it should not be used for indentation or layout; use CSS margins and padding instead.
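As one possible way to apply this automatically, the sketch below replaces the ordinary space between a number and a unit with &nbsp;. The `bind_units` helper and its short unit list are assumptions for illustration, and the input is assumed to be HTML text that has already been entity-encoded.

```python
import re

# A digit, one ordinary space, then one of a few example unit abbreviations.
UNIT_PATTERN = re.compile(r"(\d) (kg|g|km|m|cm|mm)\b")


def bind_units(html_text: str) -> str:
    """Keep each number on the same line as its unit."""
    return UNIT_PATTERN.sub(r"\1&nbsp;\2", html_text)


print(bind_units("Weighs 10 kg and is 5 cm wide"))
```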

What is the & character called in HTML entities?

The ampersand (&) is the entity delimiter character in HTML: it starts every entity reference. Because of this, a literal ampersand in HTML content must be encoded as &amp;. Forgetting to encode bare ampersands is a common bug, especially in URLs used in href attributes (e.g., in <a href="page?a=1&b=2"> the value should be written page?a=1&amp;b=2). Browsers typically handle unencoded ampersands tolerantly, but validators may flag them as errors.
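The href case can be sketched as follows: build the query string first, then entity-encode the whole URL when writing it into the attribute. The `link` helper is hypothetical.

```python
import html
from urllib.parse import urlencode


def link(path: str, params: dict, label: str) -> str:
    """Return an anchor element whose href has its ampersands encoded."""
    url = f"{path}?{urlencode(params)}"    # e.g. page?a=1&b=2
    href = html.escape(url, quote=True)    # & becomes &amp; inside the attribute
    return f'<a href="{href}">{html.escape(label)}</a>'


anchor = link("page", {"a": 1, "b": 2}, "Next")
print(anchor)
```

The browser decodes &amp; back to & when it reads the attribute, so the request it sends still contains the plain ampersand.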

Can HTML entities be used in XML?

XML defines only five built-in entities: &lt;, &gt;, &amp;, &apos;, and &quot;. HTML-specific named entities like &copy;, &mdash;, and &nbsp; are not valid in XML unless they are declared in the DTD. In XHTML (XML-based HTML), only these five built-in entities plus numeric character references work universally. For other characters in XML, use Unicode numeric references (&#169; for ©) or declare the entities in your DTD.
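The difference is easy to observe with Python's built-in XML parser, which rejects an undeclared HTML entity but accepts the numeric reference and the five built-in names. The `parses_as_xml` helper is a hypothetical wrapper written for this sketch.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def parses_as_xml(document: str) -> bool:
    """Report whether the string is well-formed XML for this parser."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


print(parses_as_xml("<p>&copy; 2024</p>"))   # HTML-only name, not declared
print(parses_as_xml("<p>&#169; 2024</p>"))   # numeric reference
print(parses_as_xml("<p>&lt;&amp;&gt;&apos;&quot;</p>"))  # the five built-ins

# saxutils.escape encodes &, < and > for use in XML text content.
print(escape("a < b & c"))
```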

Free HTML Entity Encoder & Decoder

Do I need to encode all non-ASCII characters?

What is the difference between &lt; and &#60;?

Does HTML entity encoding prevent XSS?

Why is the non-breaking space entity used?

What is the & character called in HTML entities?

Can HTML entities be used in XML?