ASCII Escaping of Unicode Characters
RFC 5137, “ASCII Escaping of Unicode Characters”, is a Best Current Practice document published in February 2008 by J. Klensin. The canonical text is published by the RFC Editor.
Abstract
There are a number of circumstances in which an escape mechanism is needed in conjunction with a protocol to encode characters that cannot be represented or transmitted directly. With ASCII coding, the traditional escape has been either the decimal or hexadecimal numeric value of the character, written in a variety of different ways. The move to Unicode, where characters occupy two or more octets and may be coded in several different forms, has further complicated the question of escapes. This document discusses some options now in use and discusses considerations for selecting one for use in new IETF protocols, and protocols that are now being internationalized.
This document specifies an Internet Best Current Practices for the Internet Community, and requests discussion and suggestions for improvements.
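To make the abstract's point concrete, the following minimal Python sketch shows why one Unicode character can have several octet-level representations, and how a numeric escape based on the code point sidesteps that ambiguity. The function names `encoded_forms` and `escape_non_ascii` are hypothetical, chosen for illustration. The delimited `\u'XXXX'` syntax is assumed here to match the form the RFC recommends; that detail does not appear on this page, so confirm it against the canonical text before relying on it.

```python
def encoded_forms(ch: str) -> dict:
    """Show one character's code point and two of its encoded forms as hex."""
    return {
        "code_point": "U+{:04X}".format(ord(ch)),
        "utf8_octets": ch.encode("utf-8").hex(),
        "utf16_be_octets": ch.encode("utf-16-be").hex(),
    }


def escape_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with an escape of its code point.

    Assumption: the escape is written as \\u'XXXX' with at least four
    uppercase hexadecimal digits. This sketch does not escape literal
    backslashes already present in the input.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if cp < 0x80:
            out.append(ch)  # ASCII passes through unchanged
        else:
            out.append("\\u'{:04X}'".format(cp))  # escape the code point itself
    return "".join(out)


if __name__ == "__main__":
    print(encoded_forms("\u00e9"))
    print(escape_non_ascii("caf\u00e9"))
```

The escape is derived from the code point rather than from UTF-8 octets or UTF-16 code units, so the escaped text is the same whichever encoding form the surrounding protocol uses.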
What “Best Current Practice” means
A Best Current Practice (BCP) document records the IETF community's recommended operational or procedural practice rather than specifying a protocol.
The canonical text of RFC 5137 is hosted at rfc-editor.org and is available in TXT and HTML formats.
- RFC 5136 Defining Network Capacity
- RFC 5138 A Uniform Resource Name Namespace for the Commission for the Management and Application of Geoscience Information
- RFC 5135 IP Multicast Requirements for a Network Address Translator and a Network Address Port Translator
- RFC 5139 Revised Civic Location Format for Presence Information Data Format Location Object
- RFC 5134 A Uniform Resource Name Namespace for the EPCglobal Electronic Product Code and Related Standards
- RFC 5140 A Telephony Gateway REgistration Protocol
- RFC 5141 A Uniform Resource Name Namespace for the International Organization for Standardization
- RFC 5142 Mobility Header Home Agent Switch Message