What is Punycode?

Definition

Punycode is a method defined in RFC 3492 for converting Unicode strings into ASCII labels, enabling Internationalized Domain Names (IDNs) to be used in the DNS.

Punycode is an encoding scheme that represents Unicode characters (including non-ASCII scripts such as Arabic, Chinese, Cyrillic, and Hebrew) using only the limited ASCII character set permitted in Domain Name System (DNS) labels. It was standardized in RFC 3492 in 2003. Hostnames in the DNS were traditionally restricted to ASCII letters, digits, and hyphens (the LDH rule). To internationalize domain names without changing the underlying infrastructure, Punycode was adopted as the encoding used by the Internationalizing Domain Names in Applications (IDNA) protocol.

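Punycode takes a string of Unicode characters and produces an ASCII string made up of letters, digits, and hyphens, which is compared case-insensitively. The algorithm is an instance of the general scheme that RFC 3492 calls Bootstring: it first copies any ASCII characters from the input to the output, appends a hyphen as a delimiter if any were copied, and then encodes the values and positions of the remaining non-ASCII characters as a compact sequence of letters and digits. IDNA then prepends the prefix 'xn--' to mark the label as encoded; the prefix belongs to IDNA, not to the Punycode algorithm itself. The encoded label is not guaranteed to fit within the 63-octet limit for a single DNS label, so a name whose encoded form is longer than that cannot be used. For example, the Japanese domain '例え.テスト' is converted to 'xn--r8jz45g.xn--zckzah'.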
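The encoding step can be tried with Python's built-in `punycode` codec. The following is a minimal sketch, not a full IDNA implementation: the helper name `to_ace_label` is hypothetical, and the input is assumed to be a single label that has already been normalized and validated.

```python
ACE_PREFIX = "xn--"  # marker added by IDNA, not by the Punycode algorithm


def to_ace_label(label: str) -> str:
    """Encode one label (hypothetical helper); ASCII labels pass through."""
    if label.isascii():
        return label
    # The "punycode" codec implements only the RFC 3492 algorithm.
    # It performs no normalization, case folding, or validation.
    return ACE_PREFIX + label.encode("punycode").decode("ascii")


# ASCII letters are copied first, then "-", then the encoded non-ASCII part.
print(to_ace_label("münchen"))
# No ASCII characters in the input, so no delimiter appears.
print(to_ace_label("例え"))
```

The output for "münchen" keeps the letters "mnchen" readable at the front, which shows how the ASCII characters are separated from the encoded remainder.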

The algorithm is reversible, allowing applications to decode Punycode back to the original Unicode form for display. Punycode itself is only one component of IDNA: the full framework (originally RFC 3490, revised as IDNA2008 in RFC 5890 through RFC 5894) specifies the preparation and validation rules, such as normalization, case handling, and permitted code points, that must be applied before Punycode encoding. Today, all major web browsers and many email clients support IDNs, and DNS resolvers need no changes because they only ever see the ASCII form. Correct handling is critical for security: homograph attacks exploit visually similar Unicode characters to create deceptive domain names, and displaying the 'xn--' form instead of the decoded Unicode form makes a suspicious name easier to spot.
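As a rough illustration of decoding and of spotting a possible homograph, the sketch below decodes an 'xn--' label and lists the scripts it contains. The helper names are hypothetical, the spoofed label is invented for the example, and the script check (first word of each character's Unicode name) is a simple heuristic, not the policy any particular browser applies.

```python
import unicodedata


def decode_ace_label(label: str) -> str:
    """Decode an 'xn--' label back to Unicode; other labels pass through."""
    if label.lower().startswith("xn--"):
        return label[4:].encode("ascii").decode("punycode")
    return label


def scripts_in_label(label: str) -> set[str]:
    """Heuristic: take the first word of each letter's Unicode name."""
    return {unicodedata.name(ch).split()[0] for ch in label if ch.isalpha()}


# Invented spoof: Latin letters with a Cyrillic "а" (U+0430) as second letter.
spoof = "p\u0430ypal"
ace = "xn--" + spoof.encode("punycode").decode("ascii")
decoded = decode_ace_label(ace)

# A label mixing more than one script is a candidate for closer inspection.
print(ace, sorted(scripts_in_label(decoded)))
```

A label that mixes Latin and Cyrillic letters is not necessarily malicious, so a check like this is best treated as a signal for showing the encoded form rather than as a verdict.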

Key facts

  • RFC 3492 defines the Punycode algorithm for IDN encoding.
  • IDNA marks Punycode-encoded labels with the ASCII prefix 'xn--'; the prefix is not produced by the Punycode algorithm itself.
  • It maps a Unicode string to an ASCII-only string that can be decoded back to the original.
  • The DNS label length limit of 63 octets remains enforced after Punycode conversion.
  • Punycode enables scripts such as Cyrillic, Arabic, and Han characters in domain names.
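The length rule in the list above can be checked before a name is submitted anywhere. This is a minimal sketch using Python's built-in `punycode` codec; the names `encoded_length` and `fits_dns_label` are hypothetical, and the label is assumed to be already normalized.

```python
MAX_LABEL_OCTETS = 63  # DNS limit for a single label


def encoded_length(label: str) -> int:
    """Length in octets of the label as it would appear in the DNS."""
    if label.isascii():
        return len(label)
    # Non-ASCII labels are measured in their encoded form, prefix included.
    return len("xn--") + len(label.encode("punycode"))


def fits_dns_label(label: str) -> bool:
    """True if the label is non-empty and within the 63-octet limit."""
    return 0 < encoded_length(label) <= MAX_LABEL_OCTETS


print(fits_dns_label("münchen"))     # short label
print(fits_dns_label("例" * 100))     # far too long once encoded
```

Note that the limit applies to the encoded form, so a label that looks short in Unicode can still be rejected.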

How it works in practice

The Unicode domain 'münchen.de' (with an umlaut) is encoded as 'xn--mnchen-3ya.de'. A browser receiving this label decodes it for display as 'münchen.de'. Similarly, the Russian domain 'россия.рф' becomes 'xn--h1alffa9f.xn--p1ai'.
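Whole domain names can be converted in both directions with Python's built-in `idna` codec, which splits the name into labels and adds or removes the 'xn--' prefix. This is a sketch under one important assumption: the built-in codec follows the older IDNA2003 rules (RFC 3490), so results for some characters may differ from IDNA2008. The function names are hypothetical.

```python
def domain_to_ascii(domain: str) -> str:
    """Convert a Unicode domain to its ASCII form (IDNA2003 rules)."""
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(domain: str) -> str:
    """Convert an ASCII domain with 'xn--' labels back to Unicode."""
    return domain.encode("ascii").decode("idna")


ascii_form = domain_to_ascii("münchen.de")
print(ascii_form)                     # form sent to the DNS
print(domain_to_unicode(ascii_form))  # form shown to the user
```

Labels that are already plain ASCII, such as 'de', are left as they are, so only the internationalized parts of a name change.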

Related terms

  • IDN (Internationalized Domain Name)
  • IDNA (Internationalizing Domain Names in Applications)
  • DNS label
  • ASCII
  • Unicode
  • Homograph attack
  • RFC 5891

More in Domains

Auth Code

A unique, per-domain secret code that the losing (current) registrar must provide so the gaining (new) registrar can authorize a domain transfer.

ccTLD

A ccTLD is a two-letter top-level domain assigned to a country or territory based on the ISO 3166-1 alpha-2 code, such as .us for the United States or .jp for Japan.

Domain Lock

A registrar-level status that prevents unauthorized domain transfers, modifications, or deletions until the registrant explicitly removes the lock.

Domain Privacy

An optional service that replaces the domain registrant's personal contact information in WHOIS records with the registrar's proxy details to shield the owner from spam and unwanted disclosure.

EPP

EPP (Extensible Provisioning Protocol) is an XML-based application protocol used by domain name registries and registrars to provision domain names, manage contacts, and transfer registrations.

Grace Period

The grace period is a window after a domain expires during which the registrant can renew at the standard renewal fee, without incurring additional redemption costs.

IDN

An Internationalized Domain Name (IDN) is a domain name that includes characters outside the ASCII set, encoded as Punycode for compatibility with the DNS.

RDAP

RDAP (Registration Data Access Protocol) is a modern RESTful protocol for querying domain name and IP address registration data, replacing the older WHOIS protocol with structured JSON responses and role-based access controls.

Registrant

The registrant is the legal holder of a domain name, listed as the owner in the registry database and responsible for the domain's renewal and administration.

Registrar

A domain registrar is an ICANN-accredited company that sells domain name registrations to individuals and organizations, managing the reservation of domain names within the DNS.
