About UTF-16

UTF-16: English Wikipedia

UTF-16

UTF-16 (16-bit Unicode Transformation Format) is a character encoding capable of encoding all 1,112,064 valid code points of Unicode. The encoding is variable-length: each code point is encoded with one or two 16-bit ''code units''. (See also Comparison of Unicode encodings for a comparison of UTF-8, UTF-16, and UTF-32.)
UTF-16 developed from an earlier fixed-width 16-bit encoding known as UCS-2 (for 2-byte Universal Character Set) once it became clear that a fixed-width 2-byte encoding could not encode enough characters to be truly universal.
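The variable-length property can be observed directly. The following is a minimal sketch, assuming Python's built-in `utf-16-le` codec (which omits the byte order mark); the helper name `utf16_code_units` is hypothetical and chosen only for illustration.

```python
def utf16_code_units(text: str) -> int:
    """Count the 16-bit code units needed to encode text in UTF-16."""
    # Each UTF-16 code unit occupies exactly 2 bytes, and the
    # "utf-16-le" codec does not prepend a byte order mark.
    return len(text.encode("utf-16-le")) // 2


# U+0041 fits in a single 16-bit code unit.
print(utf16_code_units("A"))
# U+1F600 lies above the 16-bit range and needs two code units.
print(utf16_code_units("\U0001F600"))
```

A fixed-width 2-byte encoding such as UCS-2 has no way to represent the second character at all, which is the limitation that motivated UTF-16.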

== History ==

In the late 1980s, work began on developing a uniform encoding for a "Universal Character Set" (UCS) that would replace earlier language-specific encodings with one coordinated system. The goal was to include all required characters from most of the world's languages, as well as symbols from technical domains such as science, mathematics, and music. The original idea was to replace the typical 256-character encodings, which require 1 byte per character, with an encoding using 2¹⁶ = 65,536 values, which requires 2 bytes per character. Two groups worked on this in parallel: ISO/IEC JTC 1/SC 2 and the Unicode Consortium, the latter representing mostly manufacturers of computing equipment. The two groups attempted to synchronize their character assignments so that the developing encodings would be mutually compatible. The early 2-byte encoding was usually called "Unicode", but is now called "UCS-2".
Early in this process, however, it became increasingly clear that 2¹⁶ characters would not suffice, and ISO/IEC JTC 1/SC 2 introduced a larger 31-bit space with an encoding (UCS-4) that would require 4 bytes per character. This was resisted by the Unicode Consortium, both because 4 bytes per character wasted a lot of disk space and memory, and because some manufacturers were already heavily invested in 2-byte-per-character technology. The UTF-16 encoding scheme was developed as a compromise to resolve this impasse and was introduced in version 2.0 of the Unicode Standard in July 1996.〔"Questions about encoding forms".〕 It is fully specified in RFC 2781, published in 2000 by the IETF.〔ISO/IEC 10646:2014, "Information technology — Universal Coded Character Set (UCS)", sections 9 and 10.〕〔''The Unicode Standard'', version 7.0 (2014), section 2.5.〕〔RFC 2781, February 2000.〕
In UTF-16, code points greater than or equal to 2¹⁶ are encoded using ''two'' 16-bit code units. The standards organizations chose the largest available block of unallocated 16-bit code points to use as these code units (the surrogate range, U+D800 to U+DFFF). Code points from this range are not individually encodable in UTF-16, and are not legally encodable in any UTF encoding.
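The two-code-unit scheme can be sketched as follows. This is a minimal illustration, assuming the surrogate-pair arithmetic described in RFC 2781 (high surrogates in 0xD800 to 0xDBFF, low surrogates in 0xDC00 to 0xDFFF). The function names `encode_utf16` and `decode_surrogate_pair` are hypothetical and do not belong to any library.

```python
from typing import List


def encode_utf16(code_point: int) -> List[int]:
    """Return the UTF-16 code units (as integers) for one code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        # The reserved block is not individually encodable.
        raise ValueError("surrogate code points cannot be encoded")
    if code_point < 0x10000:
        # Below 2**16: one code unit, equal to the code point itself.
        return [code_point]
    # At or above 2**16: subtract 0x10000 to get a 20-bit value,
    # then split it into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits
    return [high, low]


def decode_surrogate_pair(high: int, low: int) -> int:
    """Recombine a high/low surrogate pair into the original code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


units = encode_utf16(0x1F600)
print([hex(u) for u in units])  # ['0xd83d', '0xde00']
print(hex(decode_surrogate_pair(*units)))  # 0x1f600
```

Because the reserved block supplies 1,024 high and 1,024 low values, the pairs cover 1,024 × 1,024 = 1,048,576 code points above the 16-bit range. Adding the 65,536 − 2,048 = 63,488 directly encodable values gives the 1,112,064 total.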
UTF-16 is specified in the latest versions of both the international standard ISO/IEC 10646 and the Unicode Standard.

Excerpt source: the free encyclopedia Wikipedia