ASCII: Difference between revisions

Unicode does not have millions of code points anymore. It is limited to 17 planes of 65536 characters each, about 1.1 million.
Line 2:
{{hatnote group|
{{other uses}}
{{Distinguish|text=MS [[Windows-1252]] or other types of [[extended ASCII]]}}
}}
{{Use mdy dates|date=June 2013|cs1-dates=y}}
Line 23:
| classification = [[ISO/IEC 646|ISO/IEC 646 series]]
}}
'''ASCII''' ({{IPAc-en|audio=En-us-ASCII.ogg|ˈ|æ|s|k|iː}} {{respell|ASS|kee}}),<ref name="Mackenzie_1980">{{cite book |url=https://textfiles.meulie.net/bitsaved/Books/Mackenzie_CodedCharSets.pdf |title=Coded Character Sets, History and Development |series=The Systems Programming Series |author-last=Mackenzie |author-first=Charles E. |date=1980 |edition=1 |publisher=[[Addison-Wesley Publishing Company, Inc.]] |isbn=978-0-201-14460-4 |lccn=77-90165 |pages=6, 66, 211, 215, 217, 220, 223, 228, 236–238, 243–245, 247–253, 423, 425–428, 435–439 |access-date=2019-08-25 |archive-url=https://web.archive.org/web/20160526172151/https://textfiles.meulie.net/bitsaved/Books/Mackenzie_CodedCharSets.pdf |archive-date=May 26, 2016 |url-status=live |df=mdy-all }}</ref>{{rp|6}} an acronym for '''American Standard Code for Information Interchange''', is a [[character encoding]] standard for electronic communication. ASCII codes represent text in computers, [[telecommunications equipment]], and other devices. Because of technical limitations of computer systems at the time it was invented, ASCII has just 128 [[code point]]s, of which only 95 are {{Pslink|printable characters}}, which severely limited its scope. Modern computer systems have evolved to use [[Unicode]], which has over a million code points, but the first 128 of these are the same as the ASCII set.
 
The [[Internet Assigned Numbers Authority]] (IANA) prefers the name '''US-ASCII''' for this character encoding.<ref name="IANA_2007">{{cite web|website=Internet Assigned Numbers Authority (IANA)|date=May 14, 2007|url=https://www.iana.org/assignments/character-sets|title=Character Sets|access-date=2019-08-25}}</ref>
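
The figures quoted above (128 code points, 95 printable characters, and a Unicode code space of 17 planes of 65,536 code points) can be checked with a short Python sketch. This is an illustration only; the variable names are made up for the example, and the lookup of <code>us-ascii</code> relies on Python's standard <code>codecs</code> registry rather than on anything defined by IANA.

```python
import codecs

# All 128 ASCII code points, 0x00 through 0x7F.
ascii_chars = [chr(cp) for cp in range(128)]

# Printable characters: space (0x20) through tilde (0x7E).
printable = [c for c in ascii_chars if 0x20 <= ord(c) <= 0x7E]

# Control characters: 0x00-0x1F plus DEL (0x7F).
controls = [c for c in ascii_chars if c not in printable]

# Unicode code space: 17 planes of 65,536 code points each.
unicode_code_points = 17 * 0x10000

# Text made only of ASCII characters yields the same bytes
# whether it is encoded as ASCII or as UTF-8.
text = "".join(ascii_chars)
same_bytes = text.encode("ascii") == text.encode("utf-8")

# Python registers "us-ascii" as an alias of its "ascii" codec.
canonical_name = codecs.lookup("us-ascii").name

print(len(printable), len(controls), unicode_code_points, same_bytes, canonical_name)
```

The byte-for-byte comparison is the practical consequence of the first 128 Unicode code points matching ASCII: a pure-ASCII file is already valid UTF-8.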
Line 98:
==<span class="anchor" id="Code chart"></span><span class="anchor" id="ASCII printable code chart"></span><span class="anchor" id="ASCII printable characters"></span>Character set==
 
[[File:ASCII Table (suitable for printing).svg|thumb]]
{|{{chset-table-header1|ASCII (1977/1986)}}
|-
Line 658 ⟶ 659:
{{Main|Extended ASCII}}{{See also|ISO/IEC 8859|UTF-8}}
<!-- to be mentioned [[USASCII-8]] -->
Eventually, as 8-, [[16-bit computing|16-]], and [[32-bit computing|32-bit]] (and later [[64-bit computing|64-bit]]) computers began to replace [[12-bit computing|12-]], [[18-bit computing|18-]], and [[36-bit computing|36-bit]] computers as the norm, it became common to use an 8-bit byte to store each character in memory, providing an opportunity for extended, 8-bit relatives of ASCII. In most cases these developed as true extensions of ASCII, leaving the original character-mapping intact, but adding character definitions after the first 128 (i.e., 7-bit) characters. ASCII itself remained a seven-bit code: the term "extended ASCII" has no official status.
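
As a rough illustration of what a "true extension" means, the following Python sketch decodes the same bytes with several 8-bit codecs from Python's standard library. The choice of codecs and the sample byte 0xE9 are assumptions made for the example. Python's <code>cp437</code> codec maps the low 32 positions to control codes, not to the graphic symbols shown on IBM PC hardware, so all four codecs agree on the low half here.

```python
# Bytes 0x00-0x7F: the range covered by 7-bit ASCII.
low_half = bytes(range(128))

# Standard-library codec names for some 8-bit encodings (example selection).
codecs_to_check = ["latin-1", "cp1252", "cp437", "mac_roman"]

reference = low_half.decode("ascii")

# A true extension decodes the low half exactly as ASCII does.
preserves_ascii = {
    name: low_half.decode(name) == reference for name in codecs_to_check
}

# The upper half is where the encodings diverge: one byte, several meanings.
high_byte = bytes([0xE9])
high_decodings = {name: high_byte.decode(name) for name in codecs_to_check}

for name in codecs_to_check:
    print(name, preserves_ascii[name], repr(high_decodings[name]))
```

Because the upper 128 positions differ between encodings, a byte above 0x7F cannot be interpreted without knowing which extension produced it.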
 
For some countries, 8-bit extensions of ASCII were developed that included support for characters used in local languages; for example, [[ISCII]] for India and [[VISCII]] for Vietnam. [[Kaypro]] [[CP/M]] computers used the "upper" 128 characters for the Greek alphabet.{{citation needed|date=November 2023}}
Line 668 ⟶ 669:
IBM defined [[code page 437]] for the [[IBM PC]], replacing the control characters with graphic symbols such as [[Emoticon|smiley faces]], and mapping additional graphic characters to the upper 128 positions.<ref>{{cite book |url=http://www.bitsavers.org/pdf/ibm/pc/pc/6025008_PC_Technical_Reference_Aug81.pdf |title=Technical Reference |at=Appendix C. Of Characters Keystrokes and Color |edition=First |date=August 1981 |series=Personal Computer Hardware Reference Library |publisher=IBM}}</ref> [[Digital Equipment Corporation]] developed the [[Multinational Character Set]] (DEC-MCS) for use in the popular [[VT220]] [[computer terminal|terminal]] as one of the first extensions designed more for international languages than for block graphics. [[Apple Inc.|Apple]] defined [[Mac OS Roman]] for the Macintosh and [[Adobe Inc.|Adobe]] defined the [[PostScript Standard Encoding]] for [[PostScript]]; both sets contained "international" letters, typographic symbols and punctuation marks instead of graphics, more like modern character sets.
 
The [[ISO/IEC 8859]] standard (derived from the DEC-MCS) defined encodings that most systems either copied exactly or used as a basis. A popular further extension designed by Microsoft, [[Windows-1252]] (often mislabeled as [[ISO-8859-1]]), added the typographic punctuation marks needed for traditional text printing. ISO-8859-1, Windows-1252, and the original 7-bit ASCII were the most common character encoding methods on the [[World Wide Web]] until 2008, when [[UTF-8]] overtook them.<ref name="UTF-8_2008"/>
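
The effect of mislabeling Windows-1252 text as ISO-8859-1 can be seen in a minimal Python sketch. It uses Python's <code>cp1252</code> and <code>latin-1</code> codecs; the sample bytes are chosen for illustration.

```python
# Bytes in the 0x80-0x9F range, where the two encodings disagree.
raw = b"\x93quoted\x94 \x96 \x80"

# Windows-1252 assigns typographic punctuation and the euro sign here.
as_cp1252 = raw.decode("cp1252")

# ISO-8859-1 (Python's "latin-1") maps the same bytes to C1 control codes.
as_latin1 = raw.decode("latin-1")

print(repr(as_cp1252))
print(repr(as_latin1))
```

Decoded as Windows-1252, the bytes give curly quotation marks, an en dash, and a euro sign; decoded as ISO-8859-1, they give invisible control characters, which is why the mislabeling matters in practice.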
 
[[ISO/IEC 4873]] introduced 32 additional control codes defined in the 80–9F [[hexadecimal]] range, as part of extending the 7-bit ASCII encoding to become an 8-bit system.<ref name="Unicode-5.0_2006">{{cite book |author=The Unicode Consortium |editor-first=Julie D. |editor-last=Allen |title=The Unicode standard, Version 5.0 |date=2006-10-27 |publisher=[[Addison-Wesley Professional]] |location=Upper Saddle River, New Jersey, US |isbn=978-0-321-48091-0 |chapter-url=http://unicode.org/book/ch13.pdf |archive-url=https://ghostarchive.org/archive/20221009/http://unicode.org/book/ch13.pdf |archive-date=2022-10-09 |url-status=live |access-date=2015-03-13 |chapter=Chapter 13: Special Areas and Format Characters |page=314}}</ref>
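
The size of the 80–9F range can be confirmed with a short Python sketch. It checks the corresponding Unicode code points through the standard <code>unicodedata</code> module, on the assumption that Unicode's classification of U+0080 to U+009F as control characters mirrors the 8-bit control range described above. It does not consult ISO/IEC 4873 itself.

```python
import unicodedata

# Hexadecimal 80 through 9F inclusive.
c1_range = range(0x80, 0xA0)
c1_count = len(c1_range)

# Unicode general category "Cc" marks control characters.
all_controls = all(
    unicodedata.category(chr(cp)) == "Cc" for cp in c1_range
)

print(c1_count, all_controls)
```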