ISO/IEC 8859-2: Difference between revisions
Adding short description: "8-bit character set for Central and Eastern European languages in Latin script" (Shortdesc helper) |
m v2.05b - Bot T20 CW#61 - Fix errors for CW project (Reference before punctuation) |
||
(25 intermediate revisions by 12 users not shown) | |||
Line 12: | Line 12: | ||
| basedon = [[ISO-8859-1]] |
| basedon = [[ISO-8859-1]] |
||
| next = |
| next = |
||
| otherrelated = [[Windows-1250]] |
| otherrelated = [[Windows-1250]], [[Mac OS Croatian encoding|MacCroatian]] |
||
| classification = [[Extended ASCII]], [[ISO 8859]] |
| classification = [[Extended ASCII]], [[ISO/IEC 8859]] |
||
}} |
}} |
||
'''ISO/IEC 8859-2:1999''', ''Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2'', is part of the [[ISO/IEC 8859]] series of ASCII-based standard [[character encoding]]s, first edition published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central<ref>{{cite web|title=Microsoft Outlook Message Encodings|url=https://technet.microsoft.com/en-us/library/cc179149(v=office.12).aspx}}</ref> or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from [[code page 852]] (MS-DOS Latin 2, PC Latin 2) which is also referred to as "Latin-2" in Czech and Slovak regions.<ref> |
'''ISO/IEC 8859-2:1999''', ''Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2'', is part of the [[ISO/IEC 8859]] series of ASCII-based standard [[character encoding]]s, first edition published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central<ref>{{cite web|title=Microsoft Outlook Message Encodings| date=10 January 2017 |url=https://technet.microsoft.com/en-us/library/cc179149(v=office.12).aspx}}</ref> or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from [[code page 852]] (MS-DOS Latin 2, PC Latin 2) which is also referred to as "Latin-2" in Czech and Slovak regions.<ref>{{Cite web |title=The Czech and Slovak Character Encoding Mess Explained |url=http://luki.sdf-eu.org/txt/cs-encodings-faq.html#pcl2 |access-date=2022-02-27 |website=luki.sdf-eu.org}}</ref> Almost half the use of the encoding is for Polish, and it's the main legacy encoding for Polish, while virtually all use of it has been replaced by UTF-8 (on the web). |
||
'''ISO-8859-2''' is the [[Internet Assigned Numbers Authority|IANA]] preferred charset name for this standard when supplemented with the [[C0 and C1 control codes]] from [[ISO/IEC 6429]]. 0. |
'''ISO-8859-2''' is the [[Internet Assigned Numbers Authority|IANA]] preferred charset name for this standard when supplemented with the [[C0 and C1 control codes]] from [[ISO/IEC 6429]]. Less than 0.04% of all web pages use ISO-8859-2 as of October 2022.<ref>{{Cite web |title=Usage Statistics and Market Share of ISO-8859-2 for Websites, October 2022 |url=https://w3techs.com/technologies/details/en-iso885902 |access-date=2022-10-23 |website=w3techs.com}}</ref><ref>{{Cite web|url=https://w3techs.com/technologies/history_overview/character_encoding|title = Historical trends in the usage statistics of character encodings for websites, February 2022}}</ref> Microsoft has assigned '''code page 28592''' a.k.a. '''Windows-28592''' to ISO-8859-2 in Windows. IBM assigned [[code page 912]] to ISO 8859-2,<ref>{{cite web |url=https://github.com/unicode-org/icu-data/blob/main/charset/data/xml/ibm-912_P100-1995.xml |title=Icu-data/Charset/Data/XML/Ibm-912_P100-1995.XML at main · unicode-org/Icu-data |website=[[GitHub]] }}</ref> until that code page was extended in 1999.<ref>{{cite web |url=https://github.com/unicode-org/icu-data/blob/main/charset/data/ucm/ibm-912_P100-1999.ucm |title=Icu-data/Charset/Data/Ucm/Ibm-912_P100-1999.ucm at main · unicode-org/Icu-data |website=[[GitHub]] }}</ref> '''Code page 1111''' is similar, but replaces byte B0 ° (degree sign) with U+02DA ˚ (ring above). |
||
[[Windows-1250]] is similar to ISO-8859-2 and has all the printable characters it has and more. However a few of them are rearranged (unlike [[Windows-1252]], which keeps all printable characters from [[ISO-8859-1]] in the same place). |
[[Windows-1250]] is similar to ISO-8859-2 and has all the printable characters it has and more. However a few of them are rearranged (unlike [[Windows-1252]], which keeps all printable characters from [[ISO-8859-1]] in the same place). |
||
== |
==Language coverage== |
||
These code values can be used for the following languages: |
These code values can be used for the following languages: |
||
Line 29: | Line 29: | ||
* [[Croatian language|Croatian]] |
* [[Croatian language|Croatian]] |
||
* [[Czech language|Czech]] |
* [[Czech language|Czech]] |
||
* [[Finnish language|Finnish]]{{efn|The missing letter [[Å]] is officially a part of the [[Finnish alphabet]], however it has no native use and its usage is limited to foreign names only.}} |
|||
* [[German language|German]]{{efn|Fully compatible with [[ISO/IEC 8859-1]] for German texts.}} |
|||
* [[German language|German]]{{efn|In 2017, the [[Council for German Orthography]] officially added a capital [[ẞ]], but is not actually required as SS can be used instead.}} |
|||
* [[Hungarian language|Hungarian]] |
* [[Hungarian language|Hungarian]] |
||
* [[Polish language|Polish]] |
* [[Polish language|Polish]] |
||
⚫ | * [[Romanian language|Romanian]]{{efn|This character set unifies [[Ș]] and [[Ț]] (S,T with commas below) with [[Ş]] and [[Ţ]] (S, T with [[cedilla]]s), as did virtually all other character sets including Microsoft's [[Windows-1250]] and the first version of [[Unicode]]. Unicode subsequently disunified them however Unicode notes as of 2014{{cn|date=April 2018}} that disunifying the letters with comma below was a mistake, causing corruptions of Romanian data: pre-existing data and input methods would still contain the older cedilla codepoints, complicating text searching.}} |
||
* [[Rotokas alphabet|Rotokas]] |
|||
* [[Serbian language|Serbian Latin]] |
* [[Serbian language|Serbian Latin]] |
||
* [[Slovak language|Slovak]] |
* [[Slovak language|Slovak]] |
||
Line 39: | Line 42: | ||
* [[Turkmen language|Turkmen]]<!-- All letters found in [[Turkmen alphabet]] are available here--> |
* [[Turkmen language|Turkmen]]<!-- All letters found in [[Turkmen alphabet]] are available here--> |
||
}}{{notelist}} |
}}{{notelist}} |
||
It can also be used for [[Romanian language|Romanian]], but it is not well suited for that language, due to lacking letters s and t with commas below, although it provides s and t with similar-looking [[cedilla]]s. These letters were unified in the first versions of the [[Unicode]] standard, meaning that the appearance with cedilla or with a comma was treated as a glyph choice rather than as separate characters; fonts intended for use with Romanian should therefore, in theory, have characters with a comma below at those code points. |
|||
⚫ | |||
==Code page layout== |
==Code page layout== |
||
Differences from [[ISO-8859-1]] have the Unicode code point number underneath. |
|||
In the following table characters are shown together with their corresponding [[Unicode]] code points. Differences from [[ISO-8859-1]] are shown with darker shading on top of their legend colours. |
|||
{|{{chset- |
{|{{chset-table-header1|ISO/IEC 8859-2 (Latin-2)}} |
||
{{chset-table-header|ISO/IEC 8859-2 (Latin-2)}} |
|||
|- |
|- |
||
|{{chset-left1|0x}} |
|||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|- |
|- |
||
|{{chset-left1|1x}} |
|||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|- |
|- |
||
|{{chset-left1|2x}} |
|||
|{{chset- |
|{{chset-ctrl1|U+0020 SPACE| [[space character|SP]] }} |
||
|{{chset- |
|{{chset-cell1|U+0021 EXCLAMATION MARK|[[Exclamation mark|!]]}} |
||
|{{chset- |
|{{chset-cell1|U+0022 QUOTATION MARK|[[Quotation mark|"]]}} |
||
|{{chset- |
|{{chset-cell1|U+0023 NUMBER SIGN|[[Number sign|#]]}} |
||
|{{chset- |
|{{chset-cell1|U+0024 DOLLAR SIGN|[[Dollar sign|$]]}} |
||
|{{chset- |
|{{chset-cell1|U+0025 PERCENT SIGN|[[Percent sign|%]]}} |
||
|{{chset- |
|{{chset-cell1|U+0026 AMPERSAND|[[Ampersand|&]]}} |
||
|{{chset- |
|{{chset-cell1|U+0027 APOSTROPHE|[[Apostrophe|']]}} |
||
|{{chset- |
|{{chset-cell1|U+0028 LEFT PARENTHESIS|[[Parenthesis|(]]}} |
||
|{{chset- |
|{{chset-cell1|U+0029 RIGHT PARENTHESIS|[[Parenthesis|)]]}} |
||
|{{chset- |
|{{chset-cell1|U+002A ASTERISK|[[Asterisk|*]]}} |
||
|{{chset- |
|{{chset-cell1|U+002B PLUS SIGN|[[Plus sign|+]]}} |
||
|{{chset- |
|{{chset-cell1|U+002C COMMA|[[Comma (punctuation)|,]]}} |
||
|{{chset- |
|{{chset-cell1|U+002D HYPHEN-MINUS|[[Hyphen-minus|-]]}} |
||
|{{chset- |
|{{chset-cell1|U+002E FULL STOP|[[Full stop|.]]}} |
||
|{{chset- |
|{{chset-cell1|U+002F SOLIDUS|[[Slash (punctuation)|/]]}} |
||
|- |
|- |
||
|{{chset-left1|3x}} |
|||
|{{chset- |
|{{chset-cell1|U+0030 DIGIT ZERO|[[0]]}} |
||
|{{chset- |
|{{chset-cell1|U+0031 DIGIT ONE|[[1]]}} |
||
|{{chset- |
|{{chset-cell1|U+0032 DIGIT TWO|[[2]]}} |
||
|{{chset- |
|{{chset-cell1|U+0033 DIGIT THREE|[[3]]}} |
||
|{{chset- |
|{{chset-cell1|U+0034 DIGIT FOUR|[[4]]}} |
||
|{{chset- |
|{{chset-cell1|U+0035 DIGIT FIVE|[[5]]}} |
||
|{{chset- |
|{{chset-cell1|U+0036 DIGIT SIX|[[6]]}} |
||
|{{chset- |
|{{chset-cell1|U+0037 DIGIT SEVEN|[[7]]}} |
||
|{{chset- |
|{{chset-cell1|U+0038 DIGIT EIGHT|[[8]]}} |
||
|{{chset- |
|{{chset-cell1|U+0039 DIGIT NINE|[[9]]}} |
||
|{{chset- |
|{{chset-cell1|U+003A COLON|[[colon (punctuation)|:]]}} |
||
|{{chset- |
|{{chset-cell1|U+003B SEMICOLON|[[semicolon|;]]}} |
||
|{{chset- |
|{{chset-cell1|U+003C LESS-THAN SIGN|[[less-than sign|<]]}} |
||
|{{chset- |
|{{chset-cell1|U+003D EQUALS SIGN|[[equals sign|{{=}}]]}} |
||
|{{chset- |
|{{chset-cell1|U+003E GREATER-THAN SIGN|[[greater-than sign|>]]}} |
||
|{{chset- |
|{{chset-cell1|U+003F QUESTION MARK|[[question mark|?]]}} |
||
|- |
|- |
||
|{{chset-left1|4x}} |
|||
|{{chset- |
|{{chset-cell1|U+0040 COMMERCIAL AT|[[@]]}} |
||
|{{chset- |
|{{chset-cell1|U+0041 LATIN CAPITAL LETTER A|[[A]]}} |
||
|{{chset- |
|{{chset-cell1|U+0042 LATIN CAPITAL LETTER B|[[B]]}} |
||
|{{chset- |
|{{chset-cell1|U+0043 LATIN CAPITAL LETTER C|[[C]]}} |
||
|{{chset- |
|{{chset-cell1|U+0044 LATIN CAPITAL LETTER D|[[D]]}} |
||
|{{chset- |
|{{chset-cell1|U+0045 LATIN CAPITAL LETTER E|[[E]]}} |
||
|{{chset- |
|{{chset-cell1|U+0046 LATIN CAPITAL LETTER F|[[F]]}} |
||
|{{chset- |
|{{chset-cell1|U+0047 LATIN CAPITAL LETTER G|[[G]]}} |
||
|{{chset- |
|{{chset-cell1|U+0048 LATIN CAPITAL LETTER H|[[H]]}} |
||
|{{chset- |
|{{chset-cell1|U+0049 LATIN CAPITAL LETTER I|[[I]]}} |
||
|{{chset- |
|{{chset-cell1|U+004A LATIN CAPITAL LETTER J|[[J]]}} |
||
|{{chset- |
|{{chset-cell1|U+004B LATIN CAPITAL LETTER K|[[K]]}} |
||
|{{chset- |
|{{chset-cell1|U+004C LATIN CAPITAL LETTER L|[[L]]}} |
||
|{{chset- |
|{{chset-cell1|U+004D LATIN CAPITAL LETTER M|[[M]]}} |
||
|{{chset- |
|{{chset-cell1|U+004E LATIN CAPITAL LETTER N|[[N]]}} |
||
|{{chset- |
|{{chset-cell1|U+004F LATIN CAPITAL LETTER O|[[O]]}} |
||
|- |
|- |
||
|{{chset-left1|5x}} |
|||
|{{chset- |
|{{chset-cell1|U+0050 LATIN CAPITAL LETTER P|[[P]]}} |
||
|{{chset- |
|{{chset-cell1|U+0051 LATIN CAPITAL LETTER Q|[[Q]]}} |
||
|{{chset- |
|{{chset-cell1|U+0052 LATIN CAPITAL LETTER R|[[R]]}} |
||
|{{chset- |
|{{chset-cell1|U+0053 LATIN CAPITAL LETTER S|[[S]]}} |
||
|{{chset- |
|{{chset-cell1|U+0054 LATIN CAPITAL LETTER T|[[T]]}} |
||
|{{chset- |
|{{chset-cell1|U+0055 LATIN CAPITAL LETTER U|[[U]]}} |
||
|{{chset- |
|{{chset-cell1|U+0056 LATIN CAPITAL LETTER V|[[V]]}} |
||
|{{chset- |
|{{chset-cell1|U+0057 LATIN CAPITAL LETTER W|[[W]]}} |
||
|{{chset- |
|{{chset-cell1|U+0058 LATIN CAPITAL LETTER X|[[X]]}} |
||
|{{chset- |
|{{chset-cell1|U+0059 LATIN CAPITAL LETTER Y|[[Y]]}} |
||
|{{chset- |
|{{chset-cell1|U+005A LATIN CAPITAL LETTER Z|[[Z]]}} |
||
|{{chset- |
|{{chset-cell1|U+005B LEFT SQUARE BRACKET|[[Square brackets|[]]}} |
||
|{{chset- |
|{{chset-cell1|U+005C REVERSE SOLIDUS|[[Backslash|\]]}} |
||
|{{chset- |
|{{chset-cell1|U+005D RIGHT SQUARE BRACKET|[[Square brackets|]]]}} |
||
|{{chset- |
|{{chset-cell1|U+005E CIRCUMFLEX ACCENT|[[Caret|^]]}} |
||
|{{chset- |
|{{chset-cell1|U+005F LOW LINE|[[Underscore|_]]}} |
||
|- |
|- |
||
|{{chset-left1|6x}} |
|||
|{{chset- |
|{{chset-cell1|U+0060 GRAVE ACCENT|[[`]]}} |
||
|{{chset- |
|{{chset-cell1|U+0061 LATIN SMALL LETTER A|[[a]]}} |
||
|{{chset- |
|{{chset-cell1|U+0062 LATIN SMALL LETTER B|[[b]]}} |
||
|{{chset- |
|{{chset-cell1|U+0063 LATIN SMALL LETTER C|[[c]]}} |
||
|{{chset- |
|{{chset-cell1|U+0064 LATIN SMALL LETTER D|[[d]]}} |
||
|{{chset- |
|{{chset-cell1|U+0065 LATIN SMALL LETTER E|[[e]]}} |
||
|{{chset- |
|{{chset-cell1|U+0066 LATIN SMALL LETTER F|[[f]]}} |
||
|{{chset- |
|{{chset-cell1|U+0067 LATIN SMALL LETTER G|[[g]]}} |
||
|{{chset- |
|{{chset-cell1|U+0068 LATIN SMALL LETTER H|[[h]]}} |
||
|{{chset- |
|{{chset-cell1|U+0069 LATIN SMALL LETTER I|[[i]]}} |
||
|{{chset- |
|{{chset-cell1|U+006A LATIN SMALL LETTER J|[[j]]}} |
||
|{{chset- |
|{{chset-cell1|U+006B LATIN SMALL LETTER K|[[k]]}} |
||
|{{chset- |
|{{chset-cell1|U+006C LATIN SMALL LETTER L|[[l]]}} |
||
|{{chset- |
|{{chset-cell1|U+006D LATIN SMALL LETTER M|[[m]]}} |
||
|{{chset- |
|{{chset-cell1|U+006E LATIN SMALL LETTER N|[[n]]}} |
||
|{{chset- |
|{{chset-cell1|U+006F LATIN SMALL LETTER O|[[o]]}} |
||
|- |
|- |
||
|{{chset-left1|7x}} |
|||
|{{chset- |
|{{chset-cell1|U+0070 LATIN SMALL LETTER P|[[p]]}} |
||
|{{chset- |
|{{chset-cell1|U+0071 LATIN SMALL LETTER Q|[[q]]}} |
||
|{{chset- |
|{{chset-cell1|U+0072 LATIN SMALL LETTER R|[[r]]}} |
||
|{{chset- |
|{{chset-cell1|U+0073 LATIN SMALL LETTER S|[[s]]}} |
||
|{{chset- |
|{{chset-cell1|U+0074 LATIN SMALL LETTER T|[[t]]}} |
||
|{{chset- |
|{{chset-cell1|U+0075 LATIN SMALL LETTER U|[[u]]}} |
||
|{{chset- |
|{{chset-cell1|U+0076 LATIN SMALL LETTER V|[[v]]}} |
||
|{{chset- |
|{{chset-cell1|U+0077 LATIN SMALL LETTER W|[[w]]}} |
||
|{{chset- |
|{{chset-cell1|U+0078 LATIN SMALL LETTER X|[[x]]}} |
||
|{{chset- |
|{{chset-cell1|U+0079 LATIN SMALL LETTER Y|[[y]]}} |
||
|{{chset- |
|{{chset-cell1|U+007A LATIN SMALL LETTER Z|[[z]]}} |
||
|{{chset- |
|{{chset-cell1|U+007B LEFT CURLY BRACKET|[[Curly brackets|{]]}} |
||
|{{chset- |
|{{chset-cell1|U+007C VERTICAL LINE|[[Vertical bar|{{pipe}}]]}} |
||
|{{chset- |
|{{chset-cell1|U+007D RIGHT CURLY BRACKET|[[Curly brackets|}]]}} |
||
|{{chset- |
|{{chset-cell1|U+007E TILDE|[[Tilde|~]]}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|- |
|- |
||
|{{chset-left1|8x}} |
|||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|- |
|- |
||
|{{chset-left1|9x}} |
|||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|{{chset- |
|{{chset-cell1|||style=background:#DDD}} |
||
|- |
|- |
||
|{{chset-left1|Ax}} |
|||
|{{chset- |
|{{chset-ctrl1|U+00A0 NO-BREAK SPACE|[[Non-breaking space|NBSP]]}} |
||
|{{chset- |
|{{chset-cell1|u=0104|U+0104 LATIN CAPITAL LETTER A WITH OGONEK|[[Ą]]}} |
||
|{{chset- |
|{{chset-cell1|u=02D8|U+02D8 BREVE|[[Breve|˘]]}} |
||
|{{chset- |
|{{chset-cell1|u=0141|U+0141 LATIN CAPITAL LETTER L WITH STROKE|[[Ł]]}} |
||
|{{chset- |
|{{chset-cell1|U+00A4 CURRENCY SIGN|[[currency sign (generic)|¤]]}} |
||
|{{chset- |
|{{chset-cell1|u=013D|U+013D LATIN CAPITAL LETTER L WITH CARON|[[Ľ]]}} |
||
|{{chset- |
|{{chset-cell1|u=015A|U+015A LATIN CAPITAL LETTER S WITH ACUTE|[[Ś]]}} |
||
|{{chset- |
|{{chset-cell1|U+00A7 SECTION SIGN|[[section sign|§]]}} |
||
|{{chset- |
|{{chset-cell1|U+00A8 DIAERESIS|[[Diaeresis (diacritic)|¨]]}} |
||
|{{chset- |
|{{chset-cell1|u=0160|U+0160 LATIN CAPITAL LETTER S WITH CARON|[[Š]]}} |
||
|{{chset- |
|{{chset-cell1|u=015E|U+015E LATIN CAPITAL LETTER S WITH CEDILLA|[[Ş]]}} |
||
|{{chset- |
|{{chset-cell1|u=0164|U+0164 LATIN CAPITAL LETTER T WITH CARON|[[Ť]]}} |
||
|{{chset- |
|{{chset-cell1|u=0179|U+0179 LATIN CAPITAL LETTER Z WITH ACUTE|[[Ź]]}} |
||
|{{chset- |
|{{chset-ctrl1|U+00AD SOFT HYPHEN|[[soft hyphen|SHY]]}} |
||
|{{chset- |
|{{chset-cell1|u=017D|U+017D LATIN CAPITAL LETTER Z WITH CARON|[[Ž]]}} |
||
|{{chset- |
|{{chset-cell1|u=017B|U+017B LATIN CAPITAL LETTER Z WITH DOT ABOVE|[[Ż]]}} |
||
|- |
|- |
||
|{{chset-left1|Bx}} |
|||
|{{chset- |
|{{chset-cell1|U+00B0 DEGREE SIGN|[[degree symbol|°]]}} |
||
|{{chset- |
|{{chset-cell1|u=0105|U+0105 LATIN SMALL LETTER A WITH OGONEK|[[ą]]}} |
||
|{{chset- |
|{{chset-cell1|u=02DB|U+02DB OGONEK|[[Ogonek|˛]]}} |
||
|{{chset- |
|{{chset-cell1|u=0142|U+0142 LATIN SMALL LETTER L WITH STROKE|[[ł]]}} |
||
|{{chset- |
|{{chset-cell1|U+00B4 ACUTE ACCENT|[[acute accent|´]]}} |
||
|{{chset- |
|{{chset-cell1|u=013E|U+013E LATIN SMALL LETTER L WITH CARON|[[ľ]]}} |
||
|{{chset- |
|{{chset-cell1|u=015B|U+015B LATIN SMALL LETTER S WITH ACUTE|[[ś]]}} |
||
|{{chset- |
|{{chset-cell1|u=02C7|U+02C7 CARON|[[Caron|ˇ]]}} |
||
|{{chset- |
|{{chset-cell1|U+00B8 CEDILLA|[[cedilla|¸]]}} |
||
|{{chset- |
|{{chset-cell1|u=0161|U+0161 LATIN SMALL LETTER S WITH CARON|[[š]]}} |
||
|{{chset- |
|{{chset-cell1|u=015F|U+015F LATIN SMALL LETTER S WITH CEDILLA|[[ş]]}} |
||
|{{chset- |
|{{chset-cell1|u=0165|U+0165 LATIN SMALL LETTER T WITH CARON|[[ť]]}} |
||
|{{chset- |
|{{chset-cell1|u=017A|U+017A LATIN SMALL LETTER Z WITH ACUTE|[[ź]]}} |
||
|{{chset- |
|{{chset-cell1|u=02DD|U+02DD DOUBLE ACUTE ACCENT|[[Double acute accent|˝]]}} |
||
|{{chset- |
|{{chset-cell1|u=017E|U+017E LATIN SMALL LETTER Z WITH CARON|[[ž]]}} |
||
|{{chset- |
|{{chset-cell1|u=017C|U+017C LATIN SMALL LETTER Z WITH DOT ABOVE|[[ż]]}} |
||
|- |
|- |
||
|{{chset-left1|Cx}} |
|||
|{{chset- |
|{{chset-cell1|u=0154|U+0154 LATIN CAPITAL LETTER R WITH ACUTE|[[Ŕ]]}} |
||
|{{chset- |
|{{chset-cell1|U+00C1 LATIN CAPITAL LETTER A WITH ACUTE|[[Á]]}} |
||
|{{chset- |
|{{chset-cell1|U+00C2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX|[[Â]]}} |
||
|{{chset- |
|{{chset-cell1|u=0102|U+0102 LATIN CAPITAL LETTER A WITH BREVE|[[Ă]]}} |
||
|{{chset- |
|{{chset-cell1|U+00C4 LATIN CAPITAL LETTER A WITH DIAERESIS|[[Ä]]}} |
||
|{{chset- |
|{{chset-cell1|u=0139|U+0139 LATIN CAPITAL LETTER L WITH ACUTE|[[Ĺ]]}} |
||
|{{chset- |
|{{chset-cell1|u=0106|U+0106 LATIN CAPITAL LETTER C WITH ACUTE|[[Ć]]}} |
||
|{{chset- |
|{{chset-cell1|U+00C7 LATIN CAPITAL LETTER C WITH CEDILLA|[[Ç]]}} |
||
|{{chset- |
|{{chset-cell1|u=010C|U+010C LATIN CAPITAL LETTER C WITH CARON|[[Č]]}} |
||
|{{chset- |
|{{chset-cell1|U+00C9 LATIN CAPITAL LETTER E WITH ACUTE|[[É]]}} |
||
|{{chset- |
|{{chset-cell1|u=0118|U+0118 LATIN CAPITAL LETTER E WITH OGONEK|[[Ę]]}} |
||
|{{chset- |
|{{chset-cell1|U+00CB LATIN CAPITAL LETTER E WITH DIAERESIS|[[Ë]]}} |
||
|{{chset- |
|{{chset-cell1|u=011A|U+011A LATIN CAPITAL LETTER E WITH CARON|[[Ě]]}} |
||
|{{chset- |
|{{chset-cell1|U+00CD LATIN CAPITAL LETTER I WITH ACUTE|[[Í]]}} |
||
|{{chset- |
|{{chset-cell1|U+00CE LATIN CAPITAL LETTER I WITH CIRCUMFLEX|[[Î]]}} |
||
|{{chset- |
|{{chset-cell1|u=010E|U+010E LATIN CAPITAL LETTER D WITH CARON|[[Ď]]}} |
||
|- |
|- |
||
|{{chset-left1|Dx}} |
|||
|{{chset- |
|{{chset-cell1|u=0110|U+0110 LATIN CAPITAL LETTER D WITH STROKE|[[Đ]]}} |
||
|{{chset- |
|{{chset-cell1|u=0143|U+0143 LATIN CAPITAL LETTER N WITH ACUTE|[[Ń]]}} |
||
|{{chset- |
|{{chset-cell1|u=0147|U+0147 LATIN CAPITAL LETTER N WITH CARON|[[Ň]]}} |
||
|{{chset- |
|{{chset-cell1|U+00D3 LATIN CAPITAL LETTER O WITH ACUTE|[[Ó]]}} |
||
|{{chset- |
|{{chset-cell1|U+00D4 LATIN CAPITAL LETTER O WITH CIRCUMFLEX|[[Ô]]}} |
||
|{{chset- |
|{{chset-cell1|u=0150|U+0150 LATIN CAPITAL LETTER O WITH DOUBLE ACUTE|[[Ő]]}} |
||
|{{chset- |
|{{chset-cell1|U+00D6 LATIN CAPITAL LETTER O WITH DIAERESIS|[[Ö]]}} |
||
|{{chset- |
|{{chset-cell1|U+00D7 MULTIPLICATION SIGN|[[Multiplication sign|×]]}} |
||
|{{chset- |
|{{chset-cell1|u=0158|U+0158 LATIN CAPITAL LETTER R WITH CARON|[[Ř]]}} |
||
|{{chset- |
|{{chset-cell1|u=016E|U+016E LATIN CAPITAL LETTER U WITH RING ABOVE|[[Ů]]}} |
||
|{{chset- |
|{{chset-cell1|U+00DA LATIN CAPITAL LETTER U WITH ACUTE|[[Ú]]}} |
||
|{{chset- |
|{{chset-cell1|u=0170|U+0170 LATIN CAPITAL LETTER U WITH DOUBLE ACUTE|[[Ű]]}} |
||
|{{chset- |
|{{chset-cell1|U+00DC LATIN CAPITAL LETTER U WITH DIAERESIS|[[Ü]]}} |
||
|{{chset- |
|{{chset-cell1|U+00DD LATIN CAPITAL LETTER Y WITH ACUTE|[[Ý]]}} |
||
|{{chset- |
|{{chset-cell1|u=0162|U+0162 LATIN CAPITAL LETTER T WITH CEDILLA|[[Ţ]]}} |
||
|{{chset- |
|{{chset-cell1|U+00DF LATIN SMALL LETTER SHARP S|[[ß]]}} |
||
|- |
|- |
||
|{{chset-left1|Ex}} |
|||
|{{chset- |
|{{chset-cell1|u=0155|U+0155 LATIN SMALL LETTER R WITH ACUTE|[[ŕ]]}} |
||
|{{chset- |
|{{chset-cell1|U+00E1 LATIN SMALL LETTER A WITH ACUTE|[[á]]}} |
||
|{{chset- |
|{{chset-cell1|U+00E2 LATIN SMALL LETTER A WITH CIRCUMFLEX|[[â]]}} |
||
|{{chset- |
|{{chset-cell1|u=0103|U+0103 LATIN SMALL LETTER A WITH BREVE|[[ă]]}} |
||
|{{chset- |
|{{chset-cell1|U+00E4 LATIN SMALL LETTER A WITH DIAERESIS|[[ä]]}} |
||
|{{chset- |
|{{chset-cell1|u=013A|U+013A LATIN SMALL LETTER L WITH ACUTE|[[ĺ]]}} |
||
|{{chset- |
|{{chset-cell1|u=0107|U+0107 LATIN SMALL LETTER C WITH ACUTE|[[ć]]}} |
||
|{{chset- |
|{{chset-cell1|U+00E7 LATIN SMALL LETTER C WITH CEDILLA|[[ç]]}} |
||
|{{chset- |
|{{chset-cell1|u=010D|U+010D LATIN SMALL LETTER C WITH CARON|[[č]]}} |
||
|{{chset- |
|{{chset-cell1|U+00E9 LATIN SMALL LETTER E WITH ACUTE|[[é]]}} |
||
|{{chset- |
|{{chset-cell1|u=0119|U+0119 LATIN SMALL LETTER E WITH OGONEK|[[ę]]}} |
||
|{{chset- |
|{{chset-cell1|U+00EB LATIN SMALL LETTER E WITH DIAERESIS|[[ë]]}} |
||
|{{chset- |
|{{chset-cell1|u=011B|U+011B LATIN SMALL LETTER E WITH CARON|[[ě]]}} |
||
|{{chset- |
|{{chset-cell1|U+00ED LATIN SMALL LETTER I WITH ACUTE|[[í]]}} |
||
|{{chset- |
|{{chset-cell1|U+00EE LATIN SMALL LETTER I WITH CIRCUMFLEX|[[î]]}} |
||
|{{chset- |
|{{chset-cell1|u=010F|U+010F LATIN SMALL LETTER D WITH CARON|[[ď]]}} |
||
|- |
|- |
||
|{{chset-left1|Fx}} |
|||
|{{chset- |
|{{chset-cell1|u=0111|U+0111 LATIN SMALL LETTER D WITH STROKE|[[đ]]}} |
||
|{{chset- |
|{{chset-cell1|u=0144|U+0144 LATIN SMALL LETTER N WITH ACUTE|[[ń]]}} |
||
|{{chset- |
|{{chset-cell1|u=0148|U+0148 LATIN SMALL LETTER N WITH CARON|[[ň]]}} |
||
|{{chset- |
|{{chset-cell1|U+00F3 LATIN SMALL LETTER O WITH ACUTE|[[ó]]}} |
||
|{{chset- |
|{{chset-cell1|U+00F4 LATIN SMALL LETTER O WITH CIRCUMFLEX|[[ô]]}} |
||
|{{chset- |
|{{chset-cell1|u=0151|U+0151 LATIN SMALL LETTER O WITH DOUBLE ACUTE|[[ő]]}} |
||
|{{chset- |
|{{chset-cell1|U+00F6 LATIN SMALL LETTER O WITH DIAERESIS|[[ö]]}} |
||
|{{chset- |
|{{chset-cell1|U+00F7 DIVISION SIGN|[[Obelus|÷]]}} |
||
|{{chset- |
|{{chset-cell1|u=0159|U+0159 LATIN SMALL LETTER R WITH CARON|[[ř]]}} |
||
|{{chset- |
|{{chset-cell1|u=016F|U+016F LATIN SMALL LETTER U WITH RING ABOVE|[[ů]]}} |
||
|{{chset- |
|{{chset-cell1|U+00FA LATIN SMALL LETTER U WITH ACUTE|[[ú]]}} |
||
|{{chset- |
|{{chset-cell1|u=0171|U+0171 LATIN SMALL LETTER U WITH DOUBLE ACUTE|[[ű]]}} |
||
|{{chset- |
|{{chset-cell1|U+00FC LATIN SMALL LETTER U WITH DIAERESIS|[[ü]]}} |
||
|{{chset- |
|{{chset-cell1|U+00FD LATIN SMALL LETTER Y WITH ACUTE|[[ý]]}} |
||
|{{chset- |
|{{chset-cell1|u=0163|U+0163 LATIN SMALL LETTER T WITH CEDILLA|[[ţ]]}} |
||
|{{chset- |
|{{chset-cell1|u=02D9|U+02D9 DOT ABOVE|[[Dot (diacritic)|˙]]}} |
||
|} |
|} |
||
{{chset-legend}} |
|||
==See also== |
==See also== |
||
Line 348: | Line 345: | ||
==External links== |
==External links== |
||
*[https://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=28246&ICS1=35&ICS2=40&ICS3= ISO 8859-2:1999] |
*[https://www.iso.org/iso/en/CatalogueDetailPage.CatalogueDetail?CSNUMBER=28246&ICS1=35&ICS2=40&ICS3= ISO/IEC 8859-2:1999] |
||
*[https:// |
*[https://ecma-international.org/publications-and-standards/standards/ecma-94 Standard ECMA-94]: 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 ''2nd edition (June 1986)'' |
||
*[https:// |
*[https://itscj.ipsj.or.jp/ir/101.pdf ISO-IR 101] Right-Hand Part of Latin Alphabet No.2 ''(February 1, 1986)'' |
||
*[https://web.archive.org/web/20120214140410/https://nl.ijs.si/gnusl/cee/iso8859-2.html ISO 8859-2 (Latin 2) Resources] |
*[https://web.archive.org/web/20120214140410/https://nl.ijs.si/gnusl/cee/iso8859-2.html ISO 8859-2 (Latin 2) Resources] |
||
Revision as of 05:20, 28 May 2024
MIME / IANA | ISO-8859-2 |
---|---|
Alias(es) | iso-ir-101, csISOLatin2, latin2, l2, IBM1111 |
Language(s) | (see below) |
Standard | ECMA-94:1986, ISO/IEC 8859 |
Classification | Extended ASCII, ISO/IEC 8859 |
Extends | US-ASCII |
Based on | ISO-8859-1 |
Other related encoding(s) | Windows-1250, MacCroatian |
ISO/IEC 8859-2:1999, Information technology — 8-bit single-byte coded graphic character sets — Part 2: Latin alphabet No. 2, is part of the ISO/IEC 8859 series of ASCII-based standard character encodings, first edition published in 1987. It is informally referred to as "Latin-2". It is generally intended for Central[1] or "Eastern European" languages that are written in the Latin script. Note that ISO/IEC 8859-2 is very different from code page 852 (MS-DOS Latin 2, PC Latin 2), which is also referred to as "Latin-2" in Czech and Slovak regions.[2] Almost half of the use of the encoding is for Polish, for which it is the main legacy encoding, although on the web virtually all use of it has been replaced by UTF-8.
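As an illustration of how far apart the two "Latin-2" encodings are, the following minimal Python sketch encodes the same Polish word with both and then reads the code page 852 bytes as if they were ISO-8859-2. It assumes the codec names `iso8859_2` and `cp852` from Python's standard library; the helper `reinterpret` and the sample word are hypothetical choices made only for this example.

```python
def reinterpret(text, written_as, read_as):
    """Encode text with one codec and decode the resulting bytes with another."""
    return text.encode(written_as).decode(read_as, errors="replace")


word = "żółć"  # sample Polish word (illustrative choice)

iso_bytes = word.encode("iso8859_2")  # ISO/IEC 8859-2
dos_bytes = word.encode("cp852")      # code page 852 (MS-DOS Latin 2)

# The same four letters map to different byte values in the two encodings.
print(iso_bytes.hex(" "))
print(dos_bytes.hex(" "))

# Reading code page 852 bytes as ISO-8859-2 gives mojibake, not the original word.
print(repr(reinterpret(word, "cp852", "iso8859_2")))
```

A label of "Latin-2" alone is therefore not enough to pick a decoder for Czech, Slovak, or Polish legacy data; the actual code page has to be known.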
ISO-8859-2 is the IANA preferred charset name for this standard when supplemented with the C0 and C1 control codes from ISO/IEC 6429. Less than 0.04% of all web pages use ISO-8859-2 as of October 2022.[3][4] Microsoft has assigned code page 28592 a.k.a. Windows-28592 to ISO-8859-2 in Windows. IBM assigned code page 912 to ISO 8859-2,[5] until that code page was extended in 1999.[6] Code page 1111 is similar, but replaces byte B0 ° (degree sign) with U+02DA ˚ (ring above).
Windows-1250 is similar to ISO-8859-2 and contains all of its printable characters, plus others. However, a few of them are at different byte positions (unlike Windows-1252, which keeps all printable characters from ISO-8859-1 in the same place).
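The rearrangement can be listed mechanically. The sketch below is a minimal example that assumes Python's built-in `iso8859_2` and `cp1250` codecs reflect the two encodings; the function name `moved_characters` is hypothetical. It decodes each byte from 0xA0 to 0xFF as ISO-8859-2 and checks which byte Windows-1250 uses for the same character.

```python
def moved_characters():
    """Map each character to (ISO-8859-2 byte, Windows-1250 byte) when the two differ."""
    moved = {}
    for iso_byte in range(0xA0, 0x100):
        ch = bytes([iso_byte]).decode("iso8859_2")
        # encode() would raise UnicodeEncodeError if Windows-1250 lacked the character
        win_byte = ch.encode("cp1250")[0]
        if win_byte != iso_byte:
            moved[ch] = (iso_byte, win_byte)
    return moved


for ch, (iso_byte, win_byte) in moved_characters().items():
    print(f"{ch!r}: ISO-8859-2 0x{iso_byte:02X} -> Windows-1250 0x{win_byte:02X}")
```

Because some characters sit at different positions, text in one encoding that is labelled as the other decodes without errors but shows wrong letters for exactly these characters.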
Language coverage
These code values can be used for the following languages:
- ^ The missing letter Å is officially a part of the Finnish alphabet; however, it has no native use and is limited to foreign names.
- ^ In 2017, the Council for German Orthography officially added a capital ẞ, but it is not actually required, as SS can be used instead.
- ^ This character set unifies Ș and Ț (S and T with comma below) with Ş and Ţ (S and T with cedilla), as did virtually all other character sets, including Microsoft's Windows-1250 and the first version of Unicode. Unicode subsequently disunified them; however, Unicode notes as of 2014[citation needed] that disunifying the letters with comma below was a mistake that caused corruption of Romanian data: pre-existing data and input methods still contain the older cedilla code points, which complicates text searching.
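The practical effect of the Romanian note can be sketched in Python. The example assumes the standard-library codec `iso8859_2`; the helper names and the policy of swapping comma-below and cedilla letters around the codec are hypothetical, one possible workaround rather than anything prescribed by the standard. The decoding direction is only appropriate for text known to be Romanian, since the cedilla forms are legitimate letters in other languages.

```python
# Comma-below letters (Romanian) and their cedilla look-alikes in ISO-8859-2.
COMMA_TO_CEDILLA = str.maketrans({
    "\u0218": "\u015E",  # Ș -> Ş
    "\u0219": "\u015F",  # ș -> ş
    "\u021A": "\u0162",  # Ț -> Ţ
    "\u021B": "\u0163",  # ț -> ţ
})
CEDILLA_TO_COMMA = str.maketrans({
    "\u015E": "\u0218",
    "\u015F": "\u0219",
    "\u0162": "\u021A",
    "\u0163": "\u021B",
})


def encode_romanian_latin2(text):
    """Encode Romanian text as ISO-8859-2, substituting the cedilla forms."""
    return text.translate(COMMA_TO_CEDILLA).encode("iso8859_2")


def decode_romanian_latin2(data):
    """Decode ISO-8859-2 bytes and restore comma-below letters (Romanian text only)."""
    return data.decode("iso8859_2").translate(CEDILLA_TO_COMMA)


try:
    "\u0219".encode("iso8859_2")  # ș with comma below has no byte in ISO-8859-2
except UnicodeEncodeError as exc:
    print("not encodable directly:", exc.reason)

data = encode_romanian_latin2("știință")
print(data.hex(" "))
print(decode_romanian_latin2(data))
```

The same mapping applied to both sides of a search (query and stored text) is one way to sidestep the mismatch between older cedilla data and newer comma-below input.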
Code page layout
Differences from ISO-8859-1 have the Unicode code point number underneath.
0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F | |
0x | ||||||||||||||||
1x | ||||||||||||||||
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | | | } | ~ | |
8x | ||||||||||||||||
9x | ||||||||||||||||
Ax | NBSP | Ą 0104 | ˘ 02D8 | Ł 0141 | ¤ | Ľ 013D | Ś 015A | § | ¨ | Š 0160 | Ş 015E | Ť 0164 | Ź 0179 | SHY | Ž 017D | Ż 017B |
Bx | ° | ą 0105 | ˛ 02DB | ł 0142 | ´ | ľ 013E | ś 015B | ˇ 02C7 | ¸ | š 0161 | ş 015F | ť 0165 | ź 017A | ˝ 02DD | ž 017E | ż 017C |
Cx | Ŕ 0154 | Á | Â | Ă 0102 | Ä | Ĺ 0139 | Ć 0106 | Ç | Č 010C | É | Ę 0118 | Ë | Ě 011A | Í | Î | Ď 010E |
Dx | Đ 0110 | Ń 0143 | Ň 0147 | Ó | Ô | Ő 0150 | Ö | × | Ř 0158 | Ů 016E | Ú | Ű 0170 | Ü | Ý | Ţ 0162 | ß |
Ex | ŕ 0155 | á | â | ă 0103 | ä | ĺ 013A | ć 0107 | ç | č 010D | é | ę 0119 | ë | ě 011B | í | î | ď 010F |
Fx | đ 0111 | ń 0144 | ň 0148 | ó | ô | ő 0151 | ö | ÷ | ř 0159 | ů 016F | ú | ű 0171 | ü | ý | ţ 0163 | ˙ 02D9 |
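The cells that carry a code point number, that is, the positions that differ from ISO-8859-1, can be recomputed rather than read off by eye. The following is a minimal sketch, assuming Python's built-in `iso8859_2` and `latin_1` codecs match the two standards; the function name `differences_from_latin1` is hypothetical.

```python
def differences_from_latin1():
    """Return {byte: character} for bytes that ISO-8859-2 and ISO-8859-1 decode differently."""
    diffs = {}
    for byte in range(0xA0, 0x100):
        raw = bytes([byte])
        latin2_char = raw.decode("iso8859_2")
        if latin2_char != raw.decode("latin_1"):
            diffs[byte] = latin2_char
    return diffs


# Print each differing position in the same "character code point" form as the table.
for byte, ch in differences_from_latin1().items():
    print(f"0x{byte:02X}: {ch} {ord(ch):04X}")
```

Positions not printed by the loop, such as ¤, §, and ß, hold the same character in both parts of ISO/IEC 8859.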
See also
References
- ^ "Microsoft Outlook Message Encodings". 10 January 2017.
- ^ "The Czech and Slovak Character Encoding Mess Explained". luki.sdf-eu.org. Retrieved 2022-02-27.
- ^ "Usage Statistics and Market Share of ISO-8859-2 for Websites, October 2022". w3techs.com. Retrieved 2022-10-23.
- ^ "Historical trends in the usage statistics of character encodings for websites, February 2022".
- ^ "Icu-data/Charset/Data/XML/Ibm-912_P100-1995.XML at main · unicode-org/Icu-data". GitHub.
- ^ "Icu-data/Charset/Data/Ucm/Ibm-912_P100-1999.ucm at main · unicode-org/Icu-data". GitHub.
External links
- ISO/IEC 8859-2:1999
- Standard ECMA-94: 8-Bit Single Byte Coded Graphic Character Sets - Latin Alphabets No. 1 to No. 4 2nd edition (June 1986)
- ISO-IR 101 Right-Hand Part of Latin Alphabet No.2 (February 1, 1986)
- ISO 8859-2 (Latin 2) Resources