Jump to content

Examine individual changes

This page allows you to examine the variables generated by the Edit Filter for an individual change.

Variables generated for this change

VariableValue
Edit count of the user (user_editcount)
null
Name of the user account (user_name)
'86.55.42.171'
Age of the user account (user_age)
0
Groups (including implicit) the user is in (user_groups)
[ 0 => '*' ]
Rights that the user has (user_rights)
[ 0 => 'createaccount', 1 => 'read', 2 => 'edit', 3 => 'createtalk', 4 => 'writeapi', 5 => 'viewmywatchlist', 6 => 'editmywatchlist', 7 => 'viewmyprivateinfo', 8 => 'editmyprivateinfo', 9 => 'editmyoptions', 10 => 'abusefilter-log-detail', 11 => 'urlshortener-create-url', 12 => 'centralauth-merge', 13 => 'abusefilter-view', 14 => 'abusefilter-log', 15 => 'vipsscaler-test' ]
Whether the user is editing from mobile app (user_app)
false
Whether or not a user is editing through the mobile interface (user_mobile)
true
Page ID (page_id)
43026
Page namespace (page_namespace)
0
Page title without namespace (page_title)
'Endianness'
Full page title (page_prefixedtitle)
'Endianness'
Edit protection level of the page (page_restrictions_edit)
[]
Last ten users to contribute to the page (page_recent_contributors)
[ 0 => 'Comp.arch', 1 => 'R. S. Shaw', 2 => 'Guy Harris', 3 => 'Gah4', 4 => 'Nomen4Omen', 5 => 'Vincent Lefèvre', 6 => '82.152.109.221', 7 => 'PabloStraub', 8 => 'Vt320', 9 => 'EWLwiki' ]
Page age in seconds (page_age)
636256393
Action (action)
'edit'
Edit summary/reason (summary)
''
Old content model (old_content_model)
'wikitext'
New content model (new_content_model)
'wikitext'
Old page wikitext, before the edit (old_wikitext)
'{{Short description|Order of bytes in a computer word}} {{Redirect2|Big-endian|Little-endian|the conflicting ideologies in ''Gulliver’s Travels''|Lilliput and Blefuscu#History and politics}} {{Multiple issues| {{refimprove|date=July 2020}} {{Condense|date=August 2020}} }} In [[computing]], '''endianness''' is the order or sequence of [[byte]]s of a [[word (data type)|word]] of digital data in [[computer memory]]. Endianness is primarily expressed as '''big-endian''' ('''BE''') or '''little-endian''' ('''LE'''). A big-endian system stores the [[most significant byte]] of a word at the smallest [[memory address]] and the [[least significant byte]] at the largest. A little-endian system, in contrast, stores the least-significant byte at the smallest address.<ref>[https://betterexplained.com/articles/understanding-big-and-little-endian-byte-order/ Understanding big and little endian byte order]</ref><ref>[https://developer.apple.com/library/archive/documentation/CoreFoundation/Conceptual/CFMemoryMgmt/Concepts/ByteOrdering.html#//apple_ref/doc/uid/20001150-CJBEJBHH Byte Ordering PPC]</ref><ref>[https://developer.ibm.com/articles/au-endianc/ Writing endian-independent code in C]</ref> '''Bi-endianness''' is a feature supported by numerous computer architectures that feature switchable endianness in data fetches and stores or for instruction fetches. Other orderings are generically called '''middle-endian''' or '''mixed-endian'''.<ref>{{cite web |title = Internet Hall of Fame Pioneer |url = http://internethalloffame.org/inductees/danny-cohen |website = [[Internet Hall of Fame]] |publisher = [[The Internet Society]] }}</ref><ref>{{cite web |first = David |last = Cary |title = Endian FAQ |url = http://david.carybros.com/html/endian_faq.html |access-date = 2010-10-11 }}</ref><ref>{{cite journal |last = James |first = David V. |title = Multiplexed buses: the endian wars continue |journal = [[IEEE Micro]] |date = June 1990 |volume = 10 |issue = 3 |pages = 9–21 |doi = 10.1109/40.56322 |s2cid = 24291134 |issn = 0272-1732 }}</ref><ref>{{cite journal |last1 = Blanc |first1 = Bertrand |last2 = Maaraoui |first2 = Bob |title = Endianness or Where is Byte 0? |date = December 2005 |url = http://3bc.bertrand-blanc.com/endianness05.pdf |access-date = 2008-12-21 }}</ref> Endianness may also be used to describe the order in which the [[bit]]s are transmitted over a communication channel, e.g., big-endian in a communications channel transmits the most significant bits first.<ref>{{cite web |title=RFC 1700 |url=https://tools.ietf.org/html/rfc1700}}</ref> Bit-endianness is seldom used in other contexts. == Etymology == [[File:Gullivers_travels.jpg|thumb|The adjective ''endian'' comes from the 1726 novel ''[[Gulliver's Travels]]'' by [[Jonathan Swift]] where characters known as Lilliputians are divided into those breaking the shell of a [[boiled egg]] from the big end (''Big-Endians'') or from the little end (''Little-Endians'')]] [[Danny Cohen (computer scientist)|Danny Cohen]] introduced the terms ''big-endian'' and ''little-endian'' into computer science for data ordering in an [[Internet Experiment Note]] published in 1980.<ref name="HOLY">{{cite IETF |title = On Holy Wars and a Plea for Peace |ien = 137 |last = Cohen |first = Danny |author-link = Danny Cohen (computer scientist) |date = 1980-04-01 |url = http://www.ietf.org/rfc/ien/ien137.txt |quote = ...which bit should travel first, the bit from the little end of the word, or the bit from the big end of the word? The followers of the former approach are called the Little-Endians, and the followers of the latter are called the Big-Endians. |publisher = [[Internet Engineering Task Force|IETF]] }} Also published at ''[[IEEE Computer]]'', [https://ieeexplore.ieee.org/document/1667115 October 1981 issue].</ref> The adjective ''endian'' has its origin in the writings of 18th century Anglo-Irish writer [[Jonathan Swift]]. In the 1726 novel ''[[Gulliver's Travels]]'', he portrays the conflict between sects of Lilliputians divided into those breaking the shell of a [[boiled egg]] from the big end or from the little end. He called them the ''Big-Endians'' and the ''Little-Endians''.<ref>{{cite book |first = Jonathan |last =Swift |title = Gulliver's Travels |year = 1726 |url = http://en.wikisource.org/wiki/Gulliver%27s_Travels/Part_I/Chapter_IV }}</ref><ref>{{Citation |last1 = Bryant |first1 = Randal E. |author-link = Randal Bryant |last2 = David |first2 = O'Hallaron |title = Computer Systems: A Programmer's Perspective |publisher = Pearson Education |year = 2016 |edition = 3 |isbn = 978-1-488-67207-1 |page = 79 }}</ref> Cohen makes the connection to ''Gulliver's Travels'' explicit in the appendix to his 1980 note. ==Overview== Computers store information in various-sized groups of binary bits. Each group is assigned a number, called its ''address'', that the computer uses to access that data. On most modern computers, the smallest data group with an address is eight bits long and is called a byte. Larger groups comprise two or more bytes, for example, a [[32-bit computing|32-bit]] word contains four bytes. There are two possible ways a computer could number the individual bytes in a larger group, starting at either end. Both types of endianness are in widespread use in digital electronic engineering. The initial choice of endianness of a new design is often arbitrary, but later technology revisions and updates perpetuate the existing endianness to maintain [[backward compatibility]]. Internally, any given computer will work equally well regardless of what endianness it uses since its hardware will consistently use the same endianness to both store and load its data. For this reason, programmers and computer users normally ignore the endianness of the computer they are working with. However, endianness can become an issue when moving data external to the computer – as when transmitting data between different computers, or a programmer investigating internal computer bytes of data from a [[memory dump]] – and the endianness used differs from expectation. In these cases, the endianness of the data must be understood and accounted for. {{multiple image | header = Endian example | image1 = Big-Endian.svg | caption1 = Big-endian | image2 = Little-Endian.svg | width2 = <!-- displayed width of image; overridden by "width" above --> | caption2 = Little-endian }} These two diagrams show how two computers using different endianness store a 32-bit (four byte) integer with the value of {{mono|[[Hexadecimal|0x]]0A0B0C0D}}. In both cases, the integer is broken into four bytes, {{mono|0x0A}}, {{mono|0x0B}}, {{mono|0x0C}}, and {{mono|0x0D}}, and the bytes are stored in four sequential byte locations in memory, starting with the memory location with address ''a'', then ''a + 1'', ''a + 2'', and ''a + 3''. The difference between big and little endian is the order of the four bytes of the integer being stored. The left-side diagram shows a computer using big-endian. This starts the storing of the integer with the ''most''-significant byte, {{mono|0x0A}}, at address ''a'', and ends with the ''least''-significant byte, {{mono|0x0D}}, at address ''a + 3''. The right-side diagram shows a computer using little-endian. This starts the storing of the integer with the ''least''-significant byte, {{mono|0x0D}}, at address ''a'', and ends with the ''most''-significant byte, {{mono|0x0A}}, at address ''a + 3''. Since each computer uses its same endianness to both store and retrieve the integer, the results will be the same for both computers. Issues may arise when memory is addressed by bytes instead of integers, or when memory contents are transmitted between computers with different endianness. Big-endianness is the dominant ordering in networking protocols, such as in the [[internet protocol suite]], where it is referred to as '''network order''', transmitting the most significant byte first. Conversely, little-endianness is the dominant ordering for processor architectures ([[x86]], most [[ARM architecture|ARM]] implementations, base [[RISC-V]] implementations) and their associated memory. [[File format]]s can use either ordering; some formats use a mixture of both or contain an indicator of which ordering is used throughout the file.<ref>{{Cite web|date=April 1992|title=RFC 1314 – A File Format for the Exchange of Images in the Internet|url=https://datatracker.ietf.org/doc/html/rfc1314#page-7|access-date=2021-08-16|website=datatracker.ietf.org|publisher=[[Internet Engineering Task Force]]|quote=TIFF files start with a file header which specifies the byte order used in the file (i.e., Big or Little Endian)}}</ref> The styles of little- and big-endian may also be used more generally to characterize the ordering of any representation, e.g. the digits in a [[numeral system]] or the sections of a [[date format by country|date]]. Numbers in [[positional notation]] are generally written with their digits in left-to-right big-endian order, even in [[Writing system#Directionality|right-to-left scripts]]. Similarly, programming languages use big-endian digit ordering for numeric [[Literal (computer programming)|literals]]. == Basics == [[Computer memory]] consists of a sequence of storage cells (smallest [[address space|addressable]] units); in machines that support [[byte addressing]], those units are called ''[[byte]]s''. Each byte is identified and accessed in hardware and software by its [[memory address]]. If the total number of bytes in memory is ''n'', then addresses are enumerated from 0 to ''n''&nbsp;−&nbsp;1. Computer programs often use data structures or [[Field (computer science)|fields]] that may consist of more data than can be stored in one byte. In the context of this article where its type cannot be arbitrarily complicated, a "field" consists of a consecutive sequence of bytes and represents a "simple data value" which – at least potentially – can be manipulated by ''one'' single [[Instruction set architecture|hardware instruction]]. On most systems, the address of a multi-byte simple data value is the address of its first byte (the byte with the lowest address).{{NoteTag|An exception to this rule is e.g. the Add instruction of the [[IBM 1401]] which addresses variable-length fields at their low-order (highest-addressed) position with their lengths being defined by a [[Word mark (computer hardware)|word mark]] set at their high-order (lowest-addressed) position. When an operation such as addition is performed, the processor begins at the low-order positions at the high addresses of the two fields and works its way down to the high-order.}} Another important attribute of a byte being part of a "field" is its "significance". These attributes of the parts of a field play an important role in the sequence the bytes are accessed by the computer hardware, more precisely: by the low-level algorithms contributing to the results of a computer instruction. === Numbers === [[Positional notation|Positional number systems]] (mostly base 10, base 2, or base 256 in the case of 8-bit bytes) are the predominant way of representing and particularly of manipulating [[Integer (computer science)|integer data]] by computers. In pure form this is valid for moderate sized non-negative integers, e.g. of C data type <code>[[Data type#Numeric types|unsigned]]</code>. In such a number system, the ''value'' of a digit which it contributes to the whole number is determined not only by its value as a single digit, but also by the position it holds in the complete number, called its significance. These positions can be mapped to memory mainly in two ways:<ref name="TanenbaumAustin2012">{{cite book |first1 = Andrew S. |last1 = Tanenbaum |first2 = Todd M. |last2 = Austin |title = Structured Computer Organization |url = https://books.google.com/books?id=m0HHygAACAAJ |access-date = 18 May 2013 |date = 4 August 2012 |publisher = Prentice Hall PTR |isbn = 978-0-13-291652-3 }}</ref> * decreasing numeric significance with increasing memory addresses (or increasing time), known as ''big-endian'' and * increasing numeric significance with increasing memory addresses (or increasing time), known as ''little-endian''.{{NoteTag|Note that, in these expressions, the term "end" is meant as the extremity where the ''big'' resp. ''little'' significance is written ''first'', namely where the field ''starts''.}} The integer data that are directly supported by the [[Arithmetic logic unit|computer hardware]] have a fixed width of a low power of 2, e.g. 8 bits ≙ 1 byte, 16 bits ≙ 2 bytes, 32 bits ≙ 4 bytes, 64 bits ≙ 8 bytes, 128 bits ≙ 16 bytes. The low-level access sequence to the bytes of such a field depends on the operation to be performed. The least-significant byte is accessed first for [[addition]], [[subtraction]] and [[multiplication]]. The most-significant byte is accessed first for [[Division (mathematics)|division]] and [[Natural number#Order|comparison]]. See {{section link||Calculation order}}. For [[Floating-point arithmetic|floating-point]] numbers, see {{section link||Floating point}}. === Text === When character (text) strings are to be compared with one another, e.g. in order to support some mechanism like [[Sorting algorithm|sorting]], this is very frequently done [[Lexicographical order|lexicographically]] where a single positional element (character) also has a positional value. Lexicographical comparison means almost everywhere: first character ranks highest – as in the telephone book.{{NoteTag|Almost all machines which can do this using ''one'' instruction only (see {{section link||Variable-length data}}) are anyhow of type big-endian or at least mixed-endian.}} Integer numbers written as text are always represented most significant digit first in memory, which is similar to big-endian, independently of [[text direction]]. == Hardware == Many historical and extant processors use a big-endian memory representation, either exclusively or as a design option. Other processor types use little-endian memory representation; others use yet another scheme called ''[[Endianness#Middle|middle-endian]]'', ''mixed-endian'' or ''[[PDP-11]]-endian''. Some instruction sets feature a setting which allows for switchable endianness in data fetches and stores, instruction fetches, or both. This feature can improve performance or simplify the logic of networking devices and software. The word ''bi-endian'', when said of hardware, denotes the capability of the machine to compute or pass data in either endian format. Dealing with data of different endianness is sometimes termed the ''NUXI problem''.<ref>{{cite web |title = NUXI problem |work = The [[Jargon File]] |url = http://catb.org/jargon/html/N/NUXI-problem.html |access-date = 2008-12-20 }}</ref> This terminology alludes to the byte order conflicts encountered while [[Porting|adapting]] [[UNIX]], which ran on the mixed-endian PDP-11,{{NoteTag|The PDP-11 architecture is little-endian within its native 16-bit words, but stores 32-bit data as an unusual '''big'''-endian word pairs.}} to a big-endian IBM Series/1 computer. Unix was one of the first systems to allow the same code to be compiled for platforms with different internal representations. One of the first programs converted was supposed to print out {{code|Unix}}, but on the Series/1 it printed {{code|nUxi}} instead.<ref>{{cite journal |last1=Jalics|first1=Paul J. |last2=Heines|first2=Thomas S. |title = Transporting a portable operating system: UNIX to an IBM minicomputer |journal=Communications of the ACM|date=1 December 1983|volume=26|issue=12|pages=1066–1072|doi=10.1145/358476.358504|s2cid=15558835 }}</ref> The [[IBM System/360]] uses big-endian byte order, as do its successors [[System/370]], [[ESA/390]], and [[z/Architecture]]. The [[PDP-10]] uses big-endian addressing for byte-oriented instructions. The [[IBM Series/1]] minicomputer uses big-endian byte order. The [[Datapoint 2200]] used simple bit-serial logic with little-endian to facilitate [[carry propagation]]. When Intel developed the [[Intel 8008|8008]] microprocessor for Datapoint, they used little-endian for compatibility. However, as Intel was unable to deliver the 8008 in time, Datapoint used a [[medium-scale integration]] equivalent, but the little-endianness was retained in most Intel designs, including the [[MCS-48]] and the [[Intel 8086|8086]] and its [[x86]] successors.<ref>{{cite web|last=House|first=David|title=Oral History Panel on the Development and Promotion of the Intel 8008 Microprocessor |url = http://archive.computerhistory.org/resources/text/Oral_History/Intel_8008/Intel_8008_1.oral_history.2006.102657982.pdf#page=5 |publisher=[[Computer History Museum]] |access-date=23 April 2014 |author2=Faggin, Federico |author3=Feeney, Hal |author4=Gelbach, Ed |author5=Hoff, Ted |author6=Mazor, Stan |author7= Smith, Hank |page =b5 |date=2006-09-21 |quote = Mazor: And lastly, the original design for Datapoint ... what they wanted was a [bit] serial machine. And if you think about a serial machine, you have to process all the addresses and data one-bit at a time, and the rational way to do that is: low-bit to high-bit because that’s the way that carry would propagate. So it means that [in] the jump instruction itself, the way the 14-bit address would be put in a serial machine is bit-backwards, as you look at it, because that’s the way you’d want to process it. Well, we were gonna built a byte-parallel machine, not bit-serial and our compromise (in the spirit of the customer and just for him), we put the bytes in backwards. We put the low-byte [first] and then the high-byte. This has since been dubbed “Little Endian” format and it’s sort of contrary to what you’d think would be natural. Well, we did it for Datapoint. As you’ll see, they never did use the [8008] chip and so it was in some sense “a mistake”, but that [Little Endian format] has lived on to the 8080 and 8086 and [is] one of the marks of this family.}}</ref><ref name="Lunde2009">{{cite book |first = Ken |last = Lunde |title = CJKV Information Processing |url = https://books.google.com/books?id=SA92uQqTB-AC&pg=PA29 |access-date=21 May 2013 |date = 13 January 2009 |publisher = O'Reilly Media, Inc. |isbn = 978-0-596-51447-1 |page = 29 }}</ref> The [[DEC Alpha]], [[Atmel AVR]], [[VAX]], the [[MOS Technology 6502]] family (including [[Western Design Center]] [[65802]] and [[65C816]]), the Zilog [[Z80]] (including [[Z180]] and [[eZ80]]), the [[Altera]] [[Nios II]], and many other processors and processor families are also little-endian. The Motorola [[Motorola 6800|6800]] / 6801, the [[6809]] and the [[Motorola 68000 series|68000 series]] of processors used the big-endian format. The Intel [[8051]], contrary to other Intel processors, expects 16-bit addresses for LJMP and LCALL in big-endian format; however, xCALL instructions store the return address onto the stack in little-endian format.<ref>{{cite web|url=http://www.keil.com/support/man/docs/c51/c51_xe.htm|title=Cx51 User's Guide: E. Byte Ordering|website=keil.com}}</ref> [[SPARC]] historically used big-endian until version 9, which is [[#Bi-endianness|bi-endian]]. Similarly early IBM POWER processors were big-endian, but the [[PowerPC]] and [[Power ISA]] descendants are now bi-endian. The [[ARM architecture]] was little-endian before version 3 when it became bi-endian. === Newer architectures === The [[IA-32]] and [[x86-64]] instruction set architectures use the little-endian format. Other instruction set architectures that follow this convention, allowing only little-endian mode, include [[Nios II]], [[Andes Technology]] NDS32, and [[Qualcomm Hexagon]]. Solely big-endian architectures include the IBM [[z/Architecture]] and [[OpenRISC]]. Some instruction set architectures are "bi-endian" and allow running software of either endianness; these include [[Power ISA]], [[SPARC]], ARM [[AArch64]], [[C-Sky]], and [[RISC-V]]. [[IBM AIX]] and [[IBM i]] run in big-endian mode on bi-endian Power ISA; [[Linux]] originally ran in big-endian mode, but by 2019, IBM had transitioned to little-endian mode for Linux to ease the porting of Linux software from x86 to Power.<ref>{{cite web |title=Little endian and Linux on IBM Power Systems |url=https://developer.ibm.com/articles/l-power-little-endian-faq-trs/ |date=2016-06-16 |website=IBM |author=Jeff Scheel |access-date=2022-03-27}}</ref><ref>{{cite web |last1=Timothy Prickett Morgan |title=The Transition To RHEL 8 Begins On Power Systems |url=https://www.itjungle.com/2019/06/10/the-transition-to-rhel-8-begins-on-power-systems/ |website=ITJungle |publisher=ITJungle |access-date=26 March 2022}}</ref> SPARC has no relevant little-endian deployment, as both [[Oracle Solaris]] and Linux run in big-endian mode on bi-endian SPARC systems, and can be considered big-endian in practice. ARM, C-Sky, and RISC-V have no relevant big-endian deployments, and can be considered little-endian in practice. === Bi-endianness<span class="anchor" id="Bi-endian hardware"></span> ===<!-- bi, not big --> Some architectures (including [[ARM architecture|ARM]] versions 3 and above, [[PowerPC]], [[DEC Alpha|Alpha]], [[SPARC]] V9, [[MIPS architecture|MIPS]], [[Intel i860]], [[PA-RISC]], [[SuperH|SuperH SH-4]] and [[IA-64]]) feature a setting which allows for switchable endianness in data fetches and stores, instruction fetches, or both. This feature can improve performance or simplify the logic of networking devices and software. The word ''bi-endian'', when said of hardware, denotes the capability of the machine to compute or pass data in either endian format. Many of these architectures can be switched via software to default to a specific endian format (usually done when the computer starts up); however, on some systems, the default endianness is selected by hardware on the motherboard and cannot be changed via software (e.g. the Alpha, which runs only in big-endian mode on the [[Cray T3E]]). Note that the term ''bi-endian'' refers primarily to how a processor treats data accesses. Instruction accesses (fetches of instruction words) on a given processor may still assume a fixed endianness, even if data accesses are fully bi-endian, though this is not always the case, such as on Intel's [[IA-64]]-based Itanium CPU, which allows both. Note, too, that some nominally bi-endian CPUs require motherboard help to fully switch endianness. For instance, the 32-bit desktop-oriented [[PowerPC]] processors in little-endian mode act as little-endian from the point of view of the executing programs, but they require the motherboard to perform a 64-bit swap across all 8 byte lanes to ensure that the little-endian view of things will apply to [[Input/Output|I/O]] devices. In the absence of this unusual motherboard hardware, device driver software must write to different addresses to undo the incomplete transformation and also must perform a normal byte swap. Some CPUs, such as many PowerPC processors intended for embedded use and almost all SPARC processors, allow per-page choice of endianness. SPARC processors since the late 1990s (SPARC v9 compliant processors) allow data endianness to be chosen with each individual instruction that loads from or stores to memory. The [[ARM architecture]] supports two big-endian modes, called ''BE-8'' and ''BE-32''.<ref>{{cite web|title=Differences between BE-32 and BE-8 buses|url=http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0290g/ch06s05s01.html}}</ref> CPUs up to ARMv5 only support BE-32 or word-invariant mode. Here any naturally aligned 32-bit access works like in little-endian mode, but access to a byte or 16-bit word is redirected to the corresponding address and unaligned access is not allowed. ARMv6 introduces BE-8 or byte-invariant mode, where access to a single byte works as in little-endian mode, but accessing a 16-bit, 32-bit or (starting with ARMv8) 64-bit word results in a byte swap of the data. This simplifies unaligned memory access as well as memory-mapped access to registers other than 32 bit. Many processors have instructions to convert a word in a register to the opposite endianness, that is, they swap the order of the bytes in a 16-, 32- or 64-bit word. All the individual bits are not reversed though. Recent Intel x86 and x86-64 architecture CPUs have a MOVBE instruction ([[Intel Core]] since generation 4, after [[Intel Atom|Atom]]),<ref>{{cite web |title = How to detect New Instruction support in the 4th generation Intel® Core™ processor family |url = https://software.intel.com/sites/default/files/article/405250/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family.pdf |access-date = 2 May 2017 }}</ref> which fetches a big-endian format word from memory or writes a word into memory in big-endian format. These processors are otherwise thoroughly little-endian. ===Floating point=== {{anchor|Floating-point and endianness}}<!-- This section is transcluded on the [[double-precision floating point]] article --> Although many processors use little-endian storage for all types of data (integer, floating point), there are a number of hardware architectures where [[floating-point]] numbers are represented in big-endian form while integers are represented in little-endian form.<ref>{{citation |title=Floating-Point Formats |author-first=John J. G. |author-last=Savard |date=2018 |orig-year=2005 |work=quadibloc |url=http://www.quadibloc.com/comp/cp0201.htm |access-date=2018-07-16 |url-status=live |archive-url=https://web.archive.org/web/20180703001709/http://www.quadibloc.com/comp/cp0201.htm |archive-date=2018-07-03}}</ref> There are [[ARM architecture|ARM]] processors that have half little-endian, half big-endian floating-point representation for double-precision numbers; both 32-bit words are stored in little-endian like integer registers, but the most significant one first. [[VAX]] floating point stores little-endian 16-bit words in big-endian order. Because there have been many floating-point formats with no network standard representation for them, the [[External Data Representation|XDR]] standard uses big-endian IEEE 754 as its representation. It may therefore appear strange that the widespread [[IEEE 754]] floating-point standard does not specify endianness.<ref>{{cite web |title = pack – convert a list into a binary representation |url = http://www.perl.com/doc/manual/html/pod/perlfunc/pack.html }}</ref> Theoretically, this means that even standard IEEE floating-point data written by one machine might not be readable by another. However, on modern standard computers (i.e., implementing IEEE 754), one may safely assume that the endianness is the same for floating-point numbers as for integers, making the conversion straightforward regardless of data type. Small [[embedded system]]s using special floating-point formats may be another matter, however. ===Variable-length data=== Most instructions considered so far contain the size (lengths) of their [[operand]]s within the [[operation code]]. Frequently available operand lengths are 1, 2, 4, 8, or 16 bytes. But there are also architectures where the length of an operand may be held in a separate field of the instruction or with the operand itself, e.g. by means of a [[word mark (computer hardware)|word mark]]. Such an approach allows operand lengths up to 256 bytes or larger. The data types of such operands are [[string (computer science)|character strings]] or [[binary-coded decimal|BCD]]. Machines able to manipulate such data with one instruction (e.g. compare, add) include the [[IBM 1401]], [[IBM 1410|1410]], [[IBM 1620|1620]], [[IBM System/360|System/360]], [[IBM System/370|System/370]], [[ESA/390]], and [[z/Architecture]], all of them of type big-endian.<!--[[User:Kvng/RTH]]--> ===Simplified access to part of a field=== On most systems, the address of a multi-byte value is the address of its first byte (the byte with the lowest address); little-endian systems of that type have the property that, for sufficiently low values, the same value can be read from memory at different lengths without using different addresses (even when [[byte alignment|alignment]] restrictions are imposed). For example, a 32-bit memory location with content {{code|4A 00 00 00|class=nowrap}} can be read at the same address as either [[8-bit computing|8-bit]] (value = 4A), [[16-bit computing|16-bit]] (004A), [[24-bit computing|24-bit]] (00004A), or [[32-bit computing|32-bit]] (0000004A), all of which retain the same numeric value. Although this little-endian property is rarely used directly by high-level programmers, it is occasionally employed by code optimizers as well as by [[assembly language]] programmers.{{Example needed|s|date=April 2022}} In more concrete terms, identities like this are the equivalent of the following [[C (programming language)|C code]] returning "true" on most little-endian systems: <syntaxhighlight lang="C"> union { uint8_t u8; uint16_t u16; uint32_t u32; uint64_t u64; } u = { .u64 = 0x4A }; puts(u.u8 == u.u16 && u.u8 == u.u32 && u.u8 == u.u64 ? "true" : "false"); </syntaxhighlight> While not allowed by C++, such [[type punning]] code is allowed as "implementation-defined" by the C11 standard<ref>{{cite web |title = C11 standard |url = https://www.iso.org/standard/57853.html |publisher = ISO |access-date=15 August 2018 |at = Section 6.5.2.3 "Structure and Union members", §3 and footnote 95 |quote = 95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called “type punning”).}}</ref> and commonly used<ref>{{cite web |title=3.10 Options That Control Optimization: -fstrict-aliasing |url=https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning |website=GNU Compiler Collection (GCC) |publisher=Free Software Foundation |access-date=15 August 2018}}</ref> in code interacting with hardware.<ref>{{cite mailing list |first=Linus |last=Torvalds |title=[GIT PULL] Device properties framework update for v4.18-rc1 |url=https://lkml.org/lkml/2018/6/5/769 |mailing-list=Linux Kernel|access-date=15 August 2018 | date=5 Jun 2018 |quote=The fact is, using a union to do type punning is the traditional AND STANDARD way to do type punning in gcc. In fact, it is the *documented* way to do it for gcc, when you are a f*cking moron and use "-fstrict-aliasing" ...}}</ref> On the other hand, in some situations it may be useful to obtain an approximation of a multi-byte or multi-word value by reading only its most significant portion instead of the complete representation; a big-endian processor may read such an approximation using the same base-address that would be used for the full value. Simplifications of this kind are of course not portable across systems of different endianness. ===Calculation order=== Some operations in [[positional number system]]s have a natural or preferred order in which the elementary steps are to be executed. This order may affect their performance on small-scale byte-addressable processors and [[microcontroller]]s. However, high-performance processors usually fetch typical multi-byte operands from memory in the same amount of time they would have fetched a single byte, so the complexity of the hardware is not affected by the byte ordering. Addition, subtraction, and multiplication start at the least significant digit position and [[Adder (electronics)|propagate the carry]] to the subsequent more significant position. On most systems, the address of a multi-byte value is the address of its first byte (the byte with the lowest address); when this first byte contains the least significant digit – which is equivalent to ''little''-endianness, then the implementation of these operations is marginally simpler. Comparison and division start at the most significant digit and propagate a possible carry to the subsequent less significant digits. For fixed-length numerical values (typically of length 1,2,4,8,16), the implementation of these operations is marginally simpler on big-endian machines. Some big-endian processors (e.g. the IBM System/360 and its successors) contain hardware instructions for lexicographically comparing varying length character strings. The normal data transport by an [[Assignment (computer science)|assignment]] statement is in principle independent of the endianness of the processor. ==={{anchor|Middle|Mixed|Medium}}Middle-endian=== Numerous other orderings, generically called ''middle-endian'' or ''mixed-endian'', are possible.{{Citation needed|reason=Are there any examples in computer architecture?|date=June 2021}} The [[PDP-11]] is in principle a 16-bit little-endian system. The instructions to convert between floating-point and integer values in the optional floating-point processor of the PDP-11/45, PDP-11/70, and in some later processors, stored 32-bit "double precision integer long" values with the 16-bit halves swapped from the expected little-endian order. The [[UNIX]] [[C (programming language)|C]] compiler used the same format for 32-bit long integers. This ordering is known as ''PDP-endian''.<ref>{{cite book|url=http://bitsavers.org/pdf/dec/pdp11/handbooks/PDP1145_Handbook_1973.pdf|title=PDP-11/45 Processor Handbook|page=165|year=1973|publisher=[[Digital Equipment Corporation]]}}</ref> A way to interpret this endianness is that it stores a 32-bit integer as two 16-bit words in big-endian, but the words themselves are little-endian (E.g. "jag cog sin" would be "gaj goc nis"): {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |+Storage of a 32-bit integer, {{mono|0x0A0B0C0D}}, on a PDP-11 |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|0B<sub>h</sub>}} |style="border: 1px solid;" | {{mono|0A<sub>h</sub>}} |style="border: 1px solid;" | {{mono|0D<sub>h</sub>}} |style="border: 1px solid;" | {{mono|0C<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" colspan="2" | {{mono|0A0B<sub>h</sub>}} |style="border: 1px solid;" colspan="2" | {{mono|0C0D<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} The 16-bit values here refer to their numerical values, not their actual layout. [[Segment descriptors]] of [[IA-32]] and compatible processors keep a 32-bit base address of the segment stored in little-endian order, but in four nonconsecutive bytes, at relative positions 2, 3, 4 and 7 of the descriptor start. In [[date and time notation in the United States]], dates are middle-endian and differ from [[date format by country|date formats worldwide]]. ==Endian dates== Dates can be represented with different endianness by the ordering of the year, month and day. For example, September 12, 2002 can be represented as: * little-endian date (day, month, year), {{mono|12-09-2002}} * middle-endian dates (month, day, year), {{mono|09-12-2002}} * big-endian date (year, month, day), {{mono|2002-09-12}} as with [[ISO 8601]] ==Byte addressing== {{Main|Byte addressing}} When memory bytes are printed sequentially from left to right (e.g. in a [[hex dump]]), little-endian representation of integers has the significance increasing from left to right. In other words, it appears backwards when visualized, which can be counter-intuitive. This behavior arises, for example, in [[FourCC]] or similar techniques that involve packing characters into an integer, so that it becomes a sequences of specific characters in memory. Let's define the notation {{code|'John'}} as simply the result of writing the characters in hexadecimal [[ASCII]] and appending {{code|0x}} to the front, and analogously for shorter sequences (a [[C syntax#Character constants|C multicharacter literal]], in Unix/MacOS style): ' J o h n ' hex 4A 6F 68 6E ---------------- -> 0x4A6F686E On big-endian machines, the value appears left-to-right, coinciding with the correct string order for reading the result: {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|4A<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6F<sub>h</sub>}} |style="border: 1px solid;" | {{mono|68<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6E<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|'J'}} |style="border: 1px solid;" | {{mono|'o'}} |style="border: 1px solid;" | {{mono|'h'}} |style="border: 1px solid;" | {{mono|'n'}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} But on a little-endian machine, one would see: {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|6E<sub>h</sub>}} |style="border: 1px solid;" | {{mono|68<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6F<sub>h</sub>}} |style="border: 1px solid;" | {{mono|4A<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|'n'}} |style="border: 1px solid;" | {{mono|'h'}} |style="border: 1px solid;" | {{mono|'o'}} |style="border: 1px solid;" | {{mono|'J'}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} Middle-endian machines complicate this even further; for example, on the [[PDP-11]], the 32-bit value is stored as two 16-bit words {{mono|'Jo'}} {{mono|'hn'}} in big-endian, with the characters in the 16-bit words being stored in little-endian: {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|6F<sub>h</sub>}} |style="border: 1px solid;" | {{mono|4A<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6E<sub>h</sub>}} |style="border: 1px solid;" | {{mono|68<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|'o'}} |style="border: 1px solid;" | {{mono|'J'}} |style="border: 1px solid;" | {{mono|'n'}} |style="border: 1px solid;" | {{mono|'h'}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} ==Byte swapping== Byte-swapping consists of masking each byte and shifting them to the correct location. Many compilers provide [[Intrinsic function|built-ins]] that are likely to be compiled into native processor instructions ({{code|bswap}}/{{code|movbe}}), such as {{code|__builtin_bswap32}}. Software interfaces for swapping include: * Standard [[#Networking|network endianness]] functions (from/to BE, up to 32-bit).<ref>{{man|3|byteorder|Linux}}</ref> Windows has a 64-bit extension in {{code|winsock2.h}}. * BSD and Glibc {{code|endian.h}} functions (from/to BE and LE, up to 64-bit).<ref>{{man|3|endian|Linux}}</ref> * [[macOS]] {{code|OSByteOrder.h}} macros (from/to BE and LE, up to 64-bit). ==Files and filesystems== The recognition of endianness is important when reading a file or filesystem that was created on a computer with different endianness. Some [[CPU]] instruction sets provide native support for endian byte swapping, such as {{code|bswap}}<ref>{{cite web|url=http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf|title=Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z|at=p. 3–112|publisher=Intel|date=September 2016|access-date=2017-02-05}}</ref> ([[x86]] - [[Intel 80486|486]] and later), and {{code|rev}}<ref>{{cite web|url=http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487a.k_10775/index.html|title=ARMv8-A Reference Manual|publisher=[[ARM Holdings]]}}</ref> ([[ARM architecture|ARMv6]] and later). Some [[compiler]]s have built-in facilities for byte swapping. For example, the [[Intel]] [[Fortran]] compiler supports the non-standard {{code|CONVERT}} specifier when opening a file, e.g.: {{code|1=OPEN(unit, CONVERT='BIG_ENDIAN',...)|2=fortran|class=nowrap}}. Some compilers have options for generating code that globally enable the conversion for all file IO operations. This permits the reuse of code on a system with the opposite endianness without code modification. Fortran sequential unformatted files created with one endianness usually cannot be read on a system using the other endianness because Fortran usually implements a [[storage record|record]] (defined as the data written by a single Fortran statement) as data preceded and succeeded by count fields, which are integers equal to the number of bytes in the data. An attempt to read such a file using Fortran on a system of the other endianness then results in a run-time error, because the count fields are incorrect. This problem can be avoided by writing out sequential binary files as opposed to sequential unformatted. Note however that it is relatively simple to write a program in another language (such as [[C (programming language)|C]] or [[Python (programming language)|Python]]) that parses Fortran sequential unformatted files of "foreign" endianness and converts them to "native" endianness, by converting from the "foreign" endianness when reading the Fortran records and data. [[Unicode]] text can optionally start with a [[byte order mark]] (BOM) to signal the endianness of the file or stream. Its code point is U+FEFF. In [[UTF-32]] for example, a big-endian file should start with {{code|00 00 FE FF|class=nowrap}}; a little-endian should start with {{code|FF FE 00 00|class=nowrap}}. Application binary data formats, such as for example [[MATLAB]] ''.mat'' files, or the ''.bil'' data format, used in topography, are usually endianness-independent. This is achieved by storing the data always in one fixed endianness, or carrying with the data a switch to indicate the endianness. An example of the first case is the binary [[XLS file]] format that is portable between Windows and Mac systems and always little-endian, leaving the Mac application to swap the bytes on load and save when running on a big-endian Motorola 68K or PowerPC processor.<ref>{{cite web |url=http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/Excel97-2007BinaryFileFormat(xls)Specification.xps |title=Microsoft Office Excel 97 - 2007 Binary File Format Specification (*.xls 97-2007 format) |year=2007 |publisher=Microsoft Corporation }}</ref> [[TIFF]] image files are an example of the second strategy, whose header instructs the application about endianness of their internal binary integers. If a file starts with the signature {{code|MM}} it means that integers are represented as big-endian, while {{code|II}} means little-endian. Those signatures need a single 16-bit word each, and they are [[palindrome]]s (that is, they read the same forwards and backwards), so they are endianness independent. {{code|I}} stands for [[Intel]] and {{code|M}} stands for [[Motorola]], the respective [[CPU]] providers of the [[IBM PC]] compatibles (Intel) and [[Apple Macintosh]] platforms (Motorola) in the 1980s. Intel CPUs are little-endian, while Motorola 680x0 CPUs are big-endian. This explicit signature allows a TIFF reader program to swap bytes if necessary when a given file was generated by a TIFF writer program running on a computer with a different endianness. As a consequence of its original implementation on the Intel 8080 platform, the operating system-independent [[File Allocation Table]] (FAT) file system is defined with little-endian byte ordering, even on platforms using another endianness natively, necessitating byte-swap operations for maintaining the FAT. [[ZFS]], which combines a [[filesystem]] and a [[Logical volume management|logical volume manager]], is known to provide adaptive endianness and to work with both big-endian and little-endian systems.<ref>{{cite AV media |url=http://open-zfs.org/wiki/Documentation/Read_Write_Lecture |title=FreeBSD Kernel Internals: An Intensive Code Walkthrough |author=Matt Ahrens |year=2016 |publisher=OpenZFS Documentation/Read Write Lecture }}</ref> ==Networking== Many [[IETF RFC]]s use the term ''network order'', meaning the order of transmission for bits and bytes ''over the wire'' in [[network protocols]]. Among others, the historic RFC 1700 (also known as [[Internet standard]] STD 2) has defined the network order for protocols in the [[Internet protocol suite]] to be big-endian, hence the use of the term "network byte order" for big-endian byte order.<ref> {{cite IETF | title = Assigned Numbers | rfc = 1700 | std = 2 | sectionname = Data Notations | page = 3 | last1 = Reynolds | first1 = J. | author-link1 = Joyce K. Reynolds | last2 = Postel | first2 = J. | author-link2 = Jon Postel |date=October 1994 | publisher = [[Internet Engineering Task Force|IETF]] | access-date = 2012-03-02 }} </ref> However, not all protocols use big-endian byte order as the network order. The [[Server Message Block]] (SMB) protocol uses little-endian byte order. In [[CANopen]], multi-byte parameters are always sent [[least significant byte]] first (little-endian). The same is true for [[Ethernet Powerlink]].<ref>Ethernet POWERLINK Standardisation Group (2012), ''EPSG Working Draft Proposal 301: Ethernet POWERLINK Communication Profile Specification Version 1.1.4'', chapter 6.1.1.</ref> The [[Berkeley sockets]] [[application programming interface|API]] defines a set of functions to convert 16-bit and 32-bit integers to and from network byte order: the {{code|htons}} (host-to-network-short) and {{code|htonl}} (host-to-network-long) functions convert 16-bit and 32-bit values respectively from machine (''host'') to network order; the {{code|ntohs}} and {{code|ntohl}} functions convert from network to host order.<ref> {{cite book | title = The Open Group Base Specifications Issue 7 | author = IEEE and The Open Group | date = 2018 | volume = 2 | chapter = 3. System Interfaces | page = 1120 | url = https://pubs.opengroup.org/onlinepubs/9699919799/functions/htonl.html | access-date = 2021-04-09 }} </ref><ref>{{Cite web|title=htonl(3) - Linux man page|url=https://linux.die.net/man/3/htonl|access-date=2021-04-09|website=linux.die.net}}</ref> These functions may be a [[no-op]] on a big-endian system. While the high-level network protocols usually consider the byte (mostly meant as ''[[octet (computing)|octet]]'') as their atomic unit, the lowest network protocols may deal with ordering of bits within a byte. ==Bit endianness== [[Bit numbering]] is a concept similar to endianness, but on a level of bits, not bytes. '''Bit endianness''' or '''bit-level endianness''' refers to the transmission order of bits over a serial medium. The bit-level analogue of little-endian (least significant bit goes first) is used in [[RS-232]], [[HDLC]], [[Ethernet]], and [[USB]]. Some protocols use the opposite ordering (e.g. [[Teletext]], [[I²C|I<sup>2</sup>C]], [[SMBus]], [[PMBus]], and [[synchronous optical networking|SONET and SDH]]<ref>Cf. Sec. 2.1 Bit Transmission of [http://tools.ietf.org/html/draft-ietf-pppext-sonet-as-00 draft-ietf-pppext-sonet-as-00 "Applicability Statement for PPP over SONET/SDH"]</ref>), and [[ARINC_429#Bit_numbering,_Transmission_Order,_and_Bit_Significance|ARINC 429]] uses one ordering for its label field and the other ordering for the remainder of the frame. Usually, there exists a consistent view to the bits irrespective of their order in the byte, such that the latter becomes relevant only on a very low level. One exception is caused by the feature of some [[cyclic redundancy check]]s to detect ''all'' [[burst error]]s up to a known length, which would be spoiled if the bit order is different from the byte order on serial transmission. Apart from serialization, the terms ''bit endianness'' and ''bit-level endianness'' are seldom used, as computer architectures where each individual bit has a unique address are rare. Individual bits or [[bit field]]s are accessed via their numerical value or, in high-level programming languages, assigned names, the effects of which, however, may be machine dependent or lack [[software portability]]. == Notes == {{NoteFoot}} == References == {{Reflist}} [[Category:Computer memory]] [[Category:Data transmission]] [[Category:Metaphors]] [[Category:Software wars]]'
New page wikitext, after the edit (new_wikitext)
'== Etymology == [[File:Gullivers_travels.jpg|thumb|The adjective ''endian'' comes from the 1726 novel ''[[Gulliver's Travels]]'' by [[Jonathan Swift]] where characters known as Lilliputians are divided into those breaking the shell of a [[boiled egg]] from the big end (''Big-Endians'') or from the little end (''Little-Endians'')]] [[Danny Cohen (computer scientist)|Danny Cohen]] introduced the terms ''big-endian'' and ''little-endian'' into computer science for data ordering in an [[Internet Experiment Note]] published in 1980.<ref name="HOLY">{{cite IETF |title = On Holy Wars and a Plea for Peace |ien = 137 |last = Cohen |first = Danny |author-link = Danny Cohen (computer scientist) |date = 1980-04-01 |url = http://www.ietf.org/rfc/ien/ien137.txt |quote = ...which bit should travel first, the bit from the little end of the word, or the bit from the big end of the word? The followers of the former approach are called the Little-Endians, and the followers of the latter are called the Big-Endians. |publisher = [[Internet Engineering Task Force|IETF]] }} Also published at ''[[IEEE Computer]]'', [https://ieeexplore.ieee.org/document/1667115 October 1981 issue].</ref> The adjective ''endian'' has its origin in the writings of 18th century Anglo-Irish writer [[Jonathan Swift]]. In the 1726 novel ''[[Gulliver's Travels]]'', he portrays the conflict between sects of Lilliputians divided into those breaking the shell of a [[boiled egg]] from the big end or from the little end. He called them the ''Big-Endians'' and the ''Little-Endians''.<ref>{{cite book |first = Jonathan |last =Swift |title = Gulliver's Travels |year = 1726 |url = http://en.wikisource.org/wiki/Gulliver%27s_Travels/Part_I/Chapter_IV }}</ref><ref>{{Citation |last1 = Bryant |first1 = Randal E. |author-link = Randal Bryant |last2 = David |first2 = O'Hallaron |title = Computer Systems: A Programmer's Perspective |publisher = Pearson Education |year = 2016 |edition = 3 |isbn = 978-1-488-67207-1 |page = 79 }}</ref> Cohen makes the connection to ''Gulliver's Travels'' explicit in the appendix to his 1980 note. ==Overview== Computers store information in various-sized groups of binary bits. Each group is assigned a number, called its ''address'', that the computer uses to access that data. On most modern computers, the smallest data group with an address is eight bits long and is called a byte. Larger groups comprise two or more bytes, for example, a [[32-bit computing|32-bit]] word contains four bytes. There are two possible ways a computer could number the individual bytes in a larger group, starting at either end. Both types of endianness are in widespread use in digital electronic engineering. The initial choice of endianness of a new design is often arbitrary, but later technology revisions and updates perpetuate the existing endianness to maintain [[backward compatibility]]. Internally, any given computer will work equally well regardless of what endianness it uses since its hardware will consistently use the same endianness to both store and load its data. For this reason, programmers and computer users normally ignore the endianness of the computer they are working with. However, endianness can become an issue when moving data external to the computer – as when transmitting data between different computers, or a programmer investigating internal computer bytes of data from a [[memory dump]] – and the endianness used differs from expectation. In these cases, the endianness of the data must be understood and accounted for. {{multiple image | header = Endian example | image1 = Big-Endian.svg | caption1 = Big-endian | image2 = Little-Endian.svg | width2 = <!-- displayed width of image; overridden by "width" above --> | caption2 = Little-endian }} These two diagrams show how two computers using different endianness store a 32-bit (four byte) integer with the value of {{mono|[[Hexadecimal|0x]]0A0B0C0D}}. In both cases, the integer is broken into four bytes, {{mono|0x0A}}, {{mono|0x0B}}, {{mono|0x0C}}, and {{mono|0x0D}}, and the bytes are stored in four sequential byte locations in memory, starting with the memory location with address ''a'', then ''a + 1'', ''a + 2'', and ''a + 3''. The difference between big and little endian is the order of the four bytes of the integer being stored. The left-side diagram shows a computer using big-endian. This starts the storing of the integer with the ''most''-significant byte, {{mono|0x0A}}, at address ''a'', and ends with the ''least''-significant byte, {{mono|0x0D}}, at address ''a + 3''. The right-side diagram shows a computer using little-endian. This starts the storing of the integer with the ''least''-significant byte, {{mono|0x0D}}, at address ''a'', and ends with the ''most''-significant byte, {{mono|0x0A}}, at address ''a + 3''. Since each computer uses its same endianness to both store and retrieve the integer, the results will be the same for both computers. Issues may arise when memory is addressed by bytes instead of integers, or when memory contents are transmitted between computers with different endianness. Big-endianness is the dominant ordering in networking protocols, such as in the [[internet protocol suite]], where it is referred to as '''network order''', transmitting the most significant byte first. Conversely, little-endianness is the dominant ordering for processor architectures ([[x86]], most [[ARM architecture|ARM]] implementations, base [[RISC-V]] implementations) and their associated memory. [[File format]]s can use either ordering; some formats use a mixture of both or contain an indicator of which ordering is used throughout the file.<ref>{{Cite web|date=April 1992|title=RFC 1314 – A File Format for the Exchange of Images in the Internet|url=https://datatracker.ietf.org/doc/html/rfc1314#page-7|access-date=2021-08-16|website=datatracker.ietf.org|publisher=[[Internet Engineering Task Force]]|quote=TIFF files start with a file header which specifies the byte order used in the file (i.e., Big or Little Endian)}}</ref> The styles of little- and big-endian may also be used more generally to characterize the ordering of any representation, e.g. the digits in a [[numeral system]] or the sections of a [[date format by country|date]]. Numbers in [[positional notation]] are generally written with their digits in left-to-right big-endian order, even in [[Writing system#Directionality|right-to-left scripts]]. Similarly, programming languages use big-endian digit ordering for numeric [[Literal (computer programming)|literals]]. == Basics == [[Computer memory]] consists of a sequence of storage cells (smallest [[address space|addressable]] units); in machines that support [[byte addressing]], those units are called ''[[byte]]s''. Each byte is identified and accessed in hardware and software by its [[memory address]]. If the total number of bytes in memory is ''n'', then addresses are enumerated from 0 to ''n''&nbsp;−&nbsp;1. Computer programs often use data structures or [[Field (computer science)|fields]] that may consist of more data than can be stored in one byte. In the context of this article where its type cannot be arbitrarily complicated, a "field" consists of a consecutive sequence of bytes and represents a "simple data value" which – at least potentially – can be manipulated by ''one'' single [[Instruction set architecture|hardware instruction]]. On most systems, the address of a multi-byte simple data value is the address of its first byte (the byte with the lowest address).{{NoteTag|An exception to this rule is e.g. the Add instruction of the [[IBM 1401]] which addresses variable-length fields at their low-order (highest-addressed) position with their lengths being defined by a [[Word mark (computer hardware)|word mark]] set at their high-order (lowest-addressed) position. When an operation such as addition is performed, the processor begins at the low-order positions at the high addresses of the two fields and works its way down to the high-order.}} Another important attribute of a byte being part of a "field" is its "significance". These attributes of the parts of a field play an important role in the sequence the bytes are accessed by the computer hardware, more precisely: by the low-level algorithms contributing to the results of a computer instruction. === Numbers === [[Positional notation|Positional number systems]] (mostly base 10, base 2, or base 256 in the case of 8-bit bytes) are the predominant way of representing and particularly of manipulating [[Integer (computer science)|integer data]] by computers. In pure form this is valid for moderate sized non-negative integers, e.g. of C data type <code>[[Data type#Numeric types|unsigned]]</code>. In such a number system, the ''value'' of a digit which it contributes to the whole number is determined not only by its value as a single digit, but also by the position it holds in the complete number, called its significance. These positions can be mapped to memory mainly in two ways:<ref name="TanenbaumAustin2012">{{cite book |first1 = Andrew S. |last1 = Tanenbaum |first2 = Todd M. |last2 = Austin |title = Structured Computer Organization |url = https://books.google.com/books?id=m0HHygAACAAJ |access-date = 18 May 2013 |date = 4 August 2012 |publisher = Prentice Hall PTR |isbn = 978-0-13-291652-3 }}</ref> * decreasing numeric significance with increasing memory addresses (or increasing time), known as ''big-endian'' and * increasing numeric significance with increasing memory addresses (or increasing time), known as ''little-endian''.{{NoteTag|Note that, in these expressions, the term "end" is meant as the extremity where the ''big'' resp. ''little'' significance is written ''first'', namely where the field ''starts''.}} The integer data that are directly supported by the [[Arithmetic logic unit|computer hardware]] have a fixed width of a low power of 2, e.g. 8 bits ≙ 1 byte, 16 bits ≙ 2 bytes, 32 bits ≙ 4 bytes, 64 bits ≙ 8 bytes, 128 bits ≙ 16 bytes. The low-level access sequence to the bytes of such a field depends on the operation to be performed. The least-significant byte is accessed first for [[addition]], [[subtraction]] and [[multiplication]]. The most-significant byte is accessed first for [[Division (mathematics)|division]] and [[Natural number#Order|comparison]]. See {{section link||Calculation order}}. For [[Floating-point arithmetic|floating-point]] numbers, see {{section link||Floating point}}. === Text === When character (text) strings are to be compared with one another, e.g. in order to support some mechanism like [[Sorting algorithm|sorting]], this is very frequently done [[Lexicographical order|lexicographically]] where a single positional element (character) also has a positional value. Lexicographical comparison means almost everywhere: first character ranks highest – as in the telephone book.{{NoteTag|Almost all machines which can do this using ''one'' instruction only (see {{section link||Variable-length data}}) are anyhow of type big-endian or at least mixed-endian.}} Integer numbers written as text are always represented most significant digit first in memory, which is similar to big-endian, independently of [[text direction]]. == Hardware == Many historical and extant processors use a big-endian memory representation, either exclusively or as a design option. Other processor types use little-endian memory representation; others use yet another scheme called ''[[Endianness#Middle|middle-endian]]'', ''mixed-endian'' or ''[[PDP-11]]-endian''. Some instruction sets feature a setting which allows for switchable endianness in data fetches and stores, instruction fetches, or both. This feature can improve performance or simplify the logic of networking devices and software. The word ''bi-endian'', when said of hardware, denotes the capability of the machine to compute or pass data in either endian format. Dealing with data of different endianness is sometimes termed the ''NUXI problem''.<ref>{{cite web |title = NUXI problem |work = The [[Jargon File]] |url = http://catb.org/jargon/html/N/NUXI-problem.html |access-date = 2008-12-20 }}</ref> This terminology alludes to the byte order conflicts encountered while [[Porting|adapting]] [[UNIX]], which ran on the mixed-endian PDP-11,{{NoteTag|The PDP-11 architecture is little-endian within its native 16-bit words, but stores 32-bit data as an unusual '''big'''-endian word pairs.}} to a big-endian IBM Series/1 computer. Unix was one of the first systems to allow the same code to be compiled for platforms with different internal representations. One of the first programs converted was supposed to print out {{code|Unix}}, but on the Series/1 it printed {{code|nUxi}} instead.<ref>{{cite journal |last1=Jalics|first1=Paul J. |last2=Heines|first2=Thomas S. |title = Transporting a portable operating system: UNIX to an IBM minicomputer |journal=Communications of the ACM|date=1 December 1983|volume=26|issue=12|pages=1066–1072|doi=10.1145/358476.358504|s2cid=15558835 }}</ref> The [[IBM System/360]] uses big-endian byte order, as do its successors [[System/370]], [[ESA/390]], and [[z/Architecture]]. The [[PDP-10]] uses big-endian addressing for byte-oriented instructions. The [[IBM Series/1]] minicomputer uses big-endian byte order. The [[Datapoint 2200]] used simple bit-serial logic with little-endian to facilitate [[carry propagation]]. When Intel developed the [[Intel 8008|8008]] microprocessor for Datapoint, they used little-endian for compatibility. However, as Intel was unable to deliver the 8008 in time, Datapoint used a [[medium-scale integration]] equivalent, but the little-endianness was retained in most Intel designs, including the [[MCS-48]] and the [[Intel 8086|8086]] and its [[x86]] successors.<ref>{{cite web|last=House|first=David|title=Oral History Panel on the Development and Promotion of the Intel 8008 Microprocessor |url = http://archive.computerhistory.org/resources/text/Oral_History/Intel_8008/Intel_8008_1.oral_history.2006.102657982.pdf#page=5 |publisher=[[Computer History Museum]] |access-date=23 April 2014 |author2=Faggin, Federico |author3=Feeney, Hal |author4=Gelbach, Ed |author5=Hoff, Ted |author6=Mazor, Stan |author7= Smith, Hank |page =b5 |date=2006-09-21 |quote = Mazor: And lastly, the original design for Datapoint ... what they wanted was a [bit] serial machine. And if you think about a serial machine, you have to process all the addresses and data one-bit at a time, and the rational way to do that is: low-bit to high-bit because that’s the way that carry would propagate. So it means that [in] the jump instruction itself, the way the 14-bit address would be put in a serial machine is bit-backwards, as you look at it, because that’s the way you’d want to process it. Well, we were gonna built a byte-parallel machine, not bit-serial and our compromise (in the spirit of the customer and just for him), we put the bytes in backwards. We put the low-byte [first] and then the high-byte. This has since been dubbed “Little Endian” format and it’s sort of contrary to what you’d think would be natural. Well, we did it for Datapoint. As you’ll see, they never did use the [8008] chip and so it was in some sense “a mistake”, but that [Little Endian format] has lived on to the 8080 and 8086 and [is] one of the marks of this family.}}</ref><ref name="Lunde2009">{{cite book |first = Ken |last = Lunde |title = CJKV Information Processing |url = https://books.google.com/books?id=SA92uQqTB-AC&pg=PA29 |access-date=21 May 2013 |date = 13 January 2009 |publisher = O'Reilly Media, Inc. |isbn = 978-0-596-51447-1 |page = 29 }}</ref> The [[DEC Alpha]], [[Atmel AVR]], [[VAX]], the [[MOS Technology 6502]] family (including [[Western Design Center]] [[65802]] and [[65C816]]), the Zilog [[Z80]] (including [[Z180]] and [[eZ80]]), the [[Altera]] [[Nios II]], and many other processors and processor families are also little-endian. The Motorola [[Motorola 6800|6800]] / 6801, the [[6809]] and the [[Motorola 68000 series|68000 series]] of processors used the big-endian format. The Intel [[8051]], contrary to other Intel processors, expects 16-bit addresses for LJMP and LCALL in big-endian format; however, xCALL instructions store the return address onto the stack in little-endian format.<ref>{{cite web|url=http://www.keil.com/support/man/docs/c51/c51_xe.htm|title=Cx51 User's Guide: E. Byte Ordering|website=keil.com}}</ref> [[SPARC]] historically used big-endian until version 9, which is [[#Bi-endianness|bi-endian]]. Similarly early IBM POWER processors were big-endian, but the [[PowerPC]] and [[Power ISA]] descendants are now bi-endian. The [[ARM architecture]] was little-endian before version 3 when it became bi-endian. === Newer architectures === The [[IA-32]] and [[x86-64]] instruction set architectures use the little-endian format. Other instruction set architectures that follow this convention, allowing only little-endian mode, include [[Nios II]], [[Andes Technology]] NDS32, and [[Qualcomm Hexagon]]. Solely big-endian architectures include the IBM [[z/Architecture]] and [[OpenRISC]]. Some instruction set architectures are "bi-endian" and allow running software of either endianness; these include [[Power ISA]], [[SPARC]], ARM [[AArch64]], [[C-Sky]], and [[RISC-V]]. [[IBM AIX]] and [[IBM i]] run in big-endian mode on bi-endian Power ISA; [[Linux]] originally ran in big-endian mode, but by 2019, IBM had transitioned to little-endian mode for Linux to ease the porting of Linux software from x86 to Power.<ref>{{cite web |title=Little endian and Linux on IBM Power Systems |url=https://developer.ibm.com/articles/l-power-little-endian-faq-trs/ |date=2016-06-16 |website=IBM |author=Jeff Scheel |access-date=2022-03-27}}</ref><ref>{{cite web |last1=Timothy Prickett Morgan |title=The Transition To RHEL 8 Begins On Power Systems |url=https://www.itjungle.com/2019/06/10/the-transition-to-rhel-8-begins-on-power-systems/ |website=ITJungle |publisher=ITJungle |access-date=26 March 2022}}</ref> SPARC has no relevant little-endian deployment, as both [[Oracle Solaris]] and Linux run in big-endian mode on bi-endian SPARC systems, and can be considered big-endian in practice. ARM, C-Sky, and RISC-V have no relevant big-endian deployments, and can be considered little-endian in practice. === Bi-endianness<span class="anchor" id="Bi-endian hardware"></span> ===<!-- bi, not big --> Some architectures (including [[ARM architecture|ARM]] versions 3 and above, [[PowerPC]], [[DEC Alpha|Alpha]], [[SPARC]] V9, [[MIPS architecture|MIPS]], [[Intel i860]], [[PA-RISC]], [[SuperH|SuperH SH-4]] and [[IA-64]]) feature a setting which allows for switchable endianness in data fetches and stores, instruction fetches, or both. This feature can improve performance or simplify the logic of networking devices and software. The word ''bi-endian'', when said of hardware, denotes the capability of the machine to compute or pass data in either endian format. Many of these architectures can be switched via software to default to a specific endian format (usually done when the computer starts up); however, on some systems, the default endianness is selected by hardware on the motherboard and cannot be changed via software (e.g. the Alpha, which runs only in big-endian mode on the [[Cray T3E]]). Note that the term ''bi-endian'' refers primarily to how a processor treats data accesses. Instruction accesses (fetches of instruction words) on a given processor may still assume a fixed endianness, even if data accesses are fully bi-endian, though this is not always the case, such as on Intel's [[IA-64]]-based Itanium CPU, which allows both. Note, too, that some nominally bi-endian CPUs require motherboard help to fully switch endianness. For instance, the 32-bit desktop-oriented [[PowerPC]] processors in little-endian mode act as little-endian from the point of view of the executing programs, but they require the motherboard to perform a 64-bit swap across all 8 byte lanes to ensure that the little-endian view of things will apply to [[Input/Output|I/O]] devices. In the absence of this unusual motherboard hardware, device driver software must write to different addresses to undo the incomplete transformation and also must perform a normal byte swap. Some CPUs, such as many PowerPC processors intended for embedded use and almost all SPARC processors, allow per-page choice of endianness. SPARC processors since the late 1990s (SPARC v9 compliant processors) allow data endianness to be chosen with each individual instruction that loads from or stores to memory. The [[ARM architecture]] supports two big-endian modes, called ''BE-8'' and ''BE-32''.<ref>{{cite web|title=Differences between BE-32 and BE-8 buses|url=http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0290g/ch06s05s01.html}}</ref> CPUs up to ARMv5 only support BE-32 or word-invariant mode. Here any naturally aligned 32-bit access works like in little-endian mode, but access to a byte or 16-bit word is redirected to the corresponding address and unaligned access is not allowed. ARMv6 introduces BE-8 or byte-invariant mode, where access to a single byte works as in little-endian mode, but accessing a 16-bit, 32-bit or (starting with ARMv8) 64-bit word results in a byte swap of the data. This simplifies unaligned memory access as well as memory-mapped access to registers other than 32 bit. Many processors have instructions to convert a word in a register to the opposite endianness, that is, they swap the order of the bytes in a 16-, 32- or 64-bit word. All the individual bits are not reversed though. Recent Intel x86 and x86-64 architecture CPUs have a MOVBE instruction ([[Intel Core]] since generation 4, after [[Intel Atom|Atom]]),<ref>{{cite web |title = How to detect New Instruction support in the 4th generation Intel® Core™ processor family |url = https://software.intel.com/sites/default/files/article/405250/how-to-detect-new-instruction-support-in-the-4th-generation-intel-core-processor-family.pdf |access-date = 2 May 2017 }}</ref> which fetches a big-endian format word from memory or writes a word into memory in big-endian format. These processors are otherwise thoroughly little-endian. ===Floating point=== {{anchor|Floating-point and endianness}}<!-- This section is transcluded on the [[double-precision floating point]] article --> Although many processors use little-endian storage for all types of data (integer, floating point), there are a number of hardware architectures where [[floating-point]] numbers are represented in big-endian form while integers are represented in little-endian form.<ref>{{citation |title=Floating-Point Formats |author-first=John J. G. |author-last=Savard |date=2018 |orig-year=2005 |work=quadibloc |url=http://www.quadibloc.com/comp/cp0201.htm |access-date=2018-07-16 |url-status=live |archive-url=https://web.archive.org/web/20180703001709/http://www.quadibloc.com/comp/cp0201.htm |archive-date=2018-07-03}}</ref> There are [[ARM architecture|ARM]] processors that have half little-endian, half big-endian floating-point representation for double-precision numbers; both 32-bit words are stored in little-endian like integer registers, but the most significant one first. [[VAX]] floating point stores little-endian 16-bit words in big-endian order. Because there have been many floating-point formats with no network standard representation for them, the [[External Data Representation|XDR]] standard uses big-endian IEEE 754 as its representation. It may therefore appear strange that the widespread [[IEEE 754]] floating-point standard does not specify endianness.<ref>{{cite web |title = pack – convert a list into a binary representation |url = http://www.perl.com/doc/manual/html/pod/perlfunc/pack.html }}</ref> Theoretically, this means that even standard IEEE floating-point data written by one machine might not be readable by another. However, on modern standard computers (i.e., implementing IEEE 754), one may safely assume that the endianness is the same for floating-point numbers as for integers, making the conversion straightforward regardless of data type. Small [[embedded system]]s using special floating-point formats may be another matter, however. ===Variable-length data=== Most instructions considered so far contain the size (lengths) of their [[operand]]s within the [[operation code]]. Frequently available operand lengths are 1, 2, 4, 8, or 16 bytes. But there are also architectures where the length of an operand may be held in a separate field of the instruction or with the operand itself, e.g. by means of a [[word mark (computer hardware)|word mark]]. Such an approach allows operand lengths up to 256 bytes or larger. The data types of such operands are [[string (computer science)|character strings]] or [[binary-coded decimal|BCD]]. Machines able to manipulate such data with one instruction (e.g. compare, add) include the [[IBM 1401]], [[IBM 1410|1410]], [[IBM 1620|1620]], [[IBM System/360|System/360]], [[IBM System/370|System/370]], [[ESA/390]], and [[z/Architecture]], all of them of type big-endian.<!--[[User:Kvng/RTH]]--> ===Simplified access to part of a field=== On most systems, the address of a multi-byte value is the address of its first byte (the byte with the lowest address); little-endian systems of that type have the property that, for sufficiently low values, the same value can be read from memory at different lengths without using different addresses (even when [[byte alignment|alignment]] restrictions are imposed). For example, a 32-bit memory location with content {{code|4A 00 00 00|class=nowrap}} can be read at the same address as either [[8-bit computing|8-bit]] (value = 4A), [[16-bit computing|16-bit]] (004A), [[24-bit computing|24-bit]] (00004A), or [[32-bit computing|32-bit]] (0000004A), all of which retain the same numeric value. Although this little-endian property is rarely used directly by high-level programmers, it is occasionally employed by code optimizers as well as by [[assembly language]] programmers.{{Example needed|s|date=April 2022}} In more concrete terms, identities like this are the equivalent of the following [[C (programming language)|C code]] returning "true" on most little-endian systems: <syntaxhighlight lang="C"> union { uint8_t u8; uint16_t u16; uint32_t u32; uint64_t u64; } u = { .u64 = 0x4A }; puts(u.u8 == u.u16 && u.u8 == u.u32 && u.u8 == u.u64 ? "true" : "false"); </syntaxhighlight> While not allowed by C++, such [[type punning]] code is allowed as "implementation-defined" by the C11 standard<ref>{{cite web |title = C11 standard |url = https://www.iso.org/standard/57853.html |publisher = ISO |access-date=15 August 2018 |at = Section 6.5.2.3 "Structure and Union members", §3 and footnote 95 |quote = 95) If the member used to read the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type as described in 6.2.6 (a process sometimes called “type punning”).}}</ref> and commonly used<ref>{{cite web |title=3.10 Options That Control Optimization: -fstrict-aliasing |url=https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html#Type-punning |website=GNU Compiler Collection (GCC) |publisher=Free Software Foundation |access-date=15 August 2018}}</ref> in code interacting with hardware.<ref>{{cite mailing list |first=Linus |last=Torvalds |title=[GIT PULL] Device properties framework update for v4.18-rc1 |url=https://lkml.org/lkml/2018/6/5/769 |mailing-list=Linux Kernel|access-date=15 August 2018 | date=5 Jun 2018 |quote=The fact is, using a union to do type punning is the traditional AND STANDARD way to do type punning in gcc. In fact, it is the *documented* way to do it for gcc, when you are a f*cking moron and use "-fstrict-aliasing" ...}}</ref> On the other hand, in some situations it may be useful to obtain an approximation of a multi-byte or multi-word value by reading only its most significant portion instead of the complete representation; a big-endian processor may read such an approximation using the same base-address that would be used for the full value. Simplifications of this kind are of course not portable across systems of different endianness. ===Calculation order=== Some operations in [[positional number system]]s have a natural or preferred order in which the elementary steps are to be executed. This order may affect their performance on small-scale byte-addressable processors and [[microcontroller]]s. However, high-performance processors usually fetch typical multi-byte operands from memory in the same amount of time they would have fetched a single byte, so the complexity of the hardware is not affected by the byte ordering. Addition, subtraction, and multiplication start at the least significant digit position and [[Adder (electronics)|propagate the carry]] to the subsequent more significant position. On most systems, the address of a multi-byte value is the address of its first byte (the byte with the lowest address); when this first byte contains the least significant digit – which is equivalent to ''little''-endianness, then the implementation of these operations is marginally simpler. Comparison and division start at the most significant digit and propagate a possible carry to the subsequent less significant digits. For fixed-length numerical values (typically of length 1,2,4,8,16), the implementation of these operations is marginally simpler on big-endian machines. Some big-endian processors (e.g. the IBM System/360 and its successors) contain hardware instructions for lexicographically comparing varying length character strings. The normal data transport by an [[Assignment (computer science)|assignment]] statement is in principle independent of the endianness of the processor. ==={{anchor|Middle|Mixed|Medium}}Middle-endian=== Numerous other orderings, generically called ''middle-endian'' or ''mixed-endian'', are possible.{{Citation needed|reason=Are there any examples in computer architecture?|date=June 2021}} The [[PDP-11]] is in principle a 16-bit little-endian system. The instructions to convert between floating-point and integer values in the optional floating-point processor of the PDP-11/45, PDP-11/70, and in some later processors, stored 32-bit "double precision integer long" values with the 16-bit halves swapped from the expected little-endian order. The [[UNIX]] [[C (programming language)|C]] compiler used the same format for 32-bit long integers. This ordering is known as ''PDP-endian''.<ref>{{cite book|url=http://bitsavers.org/pdf/dec/pdp11/handbooks/PDP1145_Handbook_1973.pdf|title=PDP-11/45 Processor Handbook|page=165|year=1973|publisher=[[Digital Equipment Corporation]]}}</ref> A way to interpret this endianness is that it stores a 32-bit integer as two 16-bit words in big-endian, but the words themselves are little-endian (E.g. "jag cog sin" would be "gaj goc nis"): {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |+Storage of a 32-bit integer, {{mono|0x0A0B0C0D}}, on a PDP-11 |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|0B<sub>h</sub>}} |style="border: 1px solid;" | {{mono|0A<sub>h</sub>}} |style="border: 1px solid;" | {{mono|0D<sub>h</sub>}} |style="border: 1px solid;" | {{mono|0C<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" colspan="2" | {{mono|0A0B<sub>h</sub>}} |style="border: 1px solid;" colspan="2" | {{mono|0C0D<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} The 16-bit values here refer to their numerical values, not their actual layout. [[Segment descriptors]] of [[IA-32]] and compatible processors keep a 32-bit base address of the segment stored in little-endian order, but in four nonconsecutive bytes, at relative positions 2, 3, 4 and 7 of the descriptor start. In [[date and time notation in the United States]], dates are middle-endian and differ from [[date format by country|date formats worldwide]]. ==Endian dates== Dates can be represented with different endianness by the ordering of the year, month and day. For example, September 12, 2002 can be represented as: * little-endian date (day, month, year), {{mono|12-09-2002}} * middle-endian dates (month, day, year), {{mono|09-12-2002}} * big-endian date (year, month, day), {{mono|2002-09-12}} as with [[ISO 8601]] ==Byte addressing== {{Main|Byte addressing}} When memory bytes are printed sequentially from left to right (e.g. in a [[hex dump]]), little-endian representation of integers has the significance increasing from left to right. In other words, it appears backwards when visualized, which can be counter-intuitive. This behavior arises, for example, in [[FourCC]] or similar techniques that involve packing characters into an integer, so that it becomes a sequences of specific characters in memory. Let's define the notation {{code|'John'}} as simply the result of writing the characters in hexadecimal [[ASCII]] and appending {{code|0x}} to the front, and analogously for shorter sequences (a [[C syntax#Character constants|C multicharacter literal]], in Unix/MacOS style): ' J o h n ' hex 4A 6F 68 6E ---------------- -> 0x4A6F686E On big-endian machines, the value appears left-to-right, coinciding with the correct string order for reading the result: {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|4A<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6F<sub>h</sub>}} |style="border: 1px solid;" | {{mono|68<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6E<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|'J'}} |style="border: 1px solid;" | {{mono|'o'}} |style="border: 1px solid;" | {{mono|'h'}} |style="border: 1px solid;" | {{mono|'n'}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} But on a little-endian machine, one would see: {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|6E<sub>h</sub>}} |style="border: 1px solid;" | {{mono|68<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6F<sub>h</sub>}} |style="border: 1px solid;" | {{mono|4A<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|'n'}} |style="border: 1px solid;" | {{mono|'h'}} |style="border: 1px solid;" | {{mono|'o'}} |style="border: 1px solid;" | {{mono|'J'}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} Middle-endian machines complicate this even further; for example, on the [[PDP-11]], the 32-bit value is stored as two 16-bit words {{mono|'Jo'}} {{mono|'hn'}} in big-endian, with the characters in the 16-bit words being stored in little-endian: {| cellpadding="4" style="border-collapse: collapse; margin: 0.4em 0.4em; text-align: center;" |- | colspan="6" |''increasing addresses''&nbsp;&nbsp;→ |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|6F<sub>h</sub>}} |style="border: 1px solid;" | {{mono|4A<sub>h</sub>}} |style="border: 1px solid;" | {{mono|6E<sub>h</sub>}} |style="border: 1px solid;" | {{mono|68<sub>h</sub>}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |- |style="border: 1px solid; border-left: hidden;" | {{mono|...}} |style="border: 1px solid;" | {{mono|'o'}} |style="border: 1px solid;" | {{mono|'J'}} |style="border: 1px solid;" | {{mono|'n'}} |style="border: 1px solid;" | {{mono|'h'}} |style="border: 1px solid; border-right: hidden;" | {{mono|...}} |} ==Byte swapping== Byte-swapping consists of masking each byte and shifting them to the correct location. Many compilers provide [[Intrinsic function|built-ins]] that are likely to be compiled into native processor instructions ({{code|bswap}}/{{code|movbe}}), such as {{code|__builtin_bswap32}}. Software interfaces for swapping include: * Standard [[#Networking|network endianness]] functions (from/to BE, up to 32-bit).<ref>{{man|3|byteorder|Linux}}</ref> Windows has a 64-bit extension in {{code|winsock2.h}}. * BSD and Glibc {{code|endian.h}} functions (from/to BE and LE, up to 64-bit).<ref>{{man|3|endian|Linux}}</ref> * [[macOS]] {{code|OSByteOrder.h}} macros (from/to BE and LE, up to 64-bit). ==Files and filesystems== The recognition of endianness is important when reading a file or filesystem that was created on a computer with different endianness. Some [[CPU]] instruction sets provide native support for endian byte swapping, such as {{code|bswap}}<ref>{{cite web|url=http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf|title=Intel 64 and IA-32 Architectures Software Developer's Manual Volume 2 (2A, 2B & 2C): Instruction Set Reference, A-Z|at=p. 3–112|publisher=Intel|date=September 2016|access-date=2017-02-05}}</ref> ([[x86]] - [[Intel 80486|486]] and later), and {{code|rev}}<ref>{{cite web|url=http://infocenter.arm.com/help/topic/com.arm.doc.ddi0487a.k_10775/index.html|title=ARMv8-A Reference Manual|publisher=[[ARM Holdings]]}}</ref> ([[ARM architecture|ARMv6]] and later). Some [[compiler]]s have built-in facilities for byte swapping. For example, the [[Intel]] [[Fortran]] compiler supports the non-standard {{code|CONVERT}} specifier when opening a file, e.g.: {{code|1=OPEN(unit, CONVERT='BIG_ENDIAN',...)|2=fortran|class=nowrap}}. Some compilers have options for generating code that globally enable the conversion for all file IO operations. This permits the reuse of code on a system with the opposite endianness without code modification. Fortran sequential unformatted files created with one endianness usually cannot be read on a system using the other endianness because Fortran usually implements a [[storage record|record]] (defined as the data written by a single Fortran statement) as data preceded and succeeded by count fields, which are integers equal to the number of bytes in the data. An attempt to read such a file using Fortran on a system of the other endianness then results in a run-time error, because the count fields are incorrect. This problem can be avoided by writing out sequential binary files as opposed to sequential unformatted. Note however that it is relatively simple to write a program in another language (such as [[C (programming language)|C]] or [[Python (programming language)|Python]]) that parses Fortran sequential unformatted files of "foreign" endianness and converts them to "native" endianness, by converting from the "foreign" endianness when reading the Fortran records and data. [[Unicode]] text can optionally start with a [[byte order mark]] (BOM) to signal the endianness of the file or stream. Its code point is U+FEFF. In [[UTF-32]] for example, a big-endian file should start with {{code|00 00 FE FF|class=nowrap}}; a little-endian should start with {{code|FF FE 00 00|class=nowrap}}. Application binary data formats, such as for example [[MATLAB]] ''.mat'' files, or the ''.bil'' data format, used in topography, are usually endianness-independent. This is achieved by storing the data always in one fixed endianness, or carrying with the data a switch to indicate the endianness. An example of the first case is the binary [[XLS file]] format that is portable between Windows and Mac systems and always little-endian, leaving the Mac application to swap the bytes on load and save when running on a big-endian Motorola 68K or PowerPC processor.<ref>{{cite web |url=http://download.microsoft.com/download/0/B/E/0BE8BDD7-E5E8-422A-ABFD-4342ED7AD886/Excel97-2007BinaryFileFormat(xls)Specification.xps |title=Microsoft Office Excel 97 - 2007 Binary File Format Specification (*.xls 97-2007 format) |year=2007 |publisher=Microsoft Corporation }}</ref> [[TIFF]] image files are an example of the second strategy, whose header instructs the application about endianness of their internal binary integers. If a file starts with the signature {{code|MM}} it means that integers are represented as big-endian, while {{code|II}} means little-endian. Those signatures need a single 16-bit word each, and they are [[palindrome]]s (that is, they read the same forwards and backwards), so they are endianness independent. {{code|I}} stands for [[Intel]] and {{code|M}} stands for [[Motorola]], the respective [[CPU]] providers of the [[IBM PC]] compatibles (Intel) and [[Apple Macintosh]] platforms (Motorola) in the 1980s. Intel CPUs are little-endian, while Motorola 680x0 CPUs are big-endian. This explicit signature allows a TIFF reader program to swap bytes if necessary when a given file was generated by a TIFF writer program running on a computer with a different endianness. As a consequence of its original implementation on the Intel 8080 platform, the operating system-independent [[File Allocation Table]] (FAT) file system is defined with little-endian byte ordering, even on platforms using another endianness natively, necessitating byte-swap operations for maintaining the FAT. [[ZFS]], which combines a [[filesystem]] and a [[Logical volume management|logical volume manager]], is known to provide adaptive endianness and to work with both big-endian and little-endian systems.<ref>{{cite AV media |url=http://open-zfs.org/wiki/Documentation/Read_Write_Lecture |title=FreeBSD Kernel Internals: An Intensive Code Walkthrough |author=Matt Ahrens |year=2016 |publisher=OpenZFS Documentation/Read Write Lecture }}</ref> ==Networking== Many [[IETF RFC]]s use the term ''network order'', meaning the order of transmission for bits and bytes ''over the wire'' in [[network protocols]]. Among others, the historic RFC 1700 (also known as [[Internet standard]] STD 2) has defined the network order for protocols in the [[Internet protocol suite]] to be big-endian, hence the use of the term "network byte order" for big-endian byte order.<ref> {{cite IETF | title = Assigned Numbers | rfc = 1700 | std = 2 | sectionname = Data Notations | page = 3 | last1 = Reynolds | first1 = J. | author-link1 = Joyce K. Reynolds | last2 = Postel | first2 = J. | author-link2 = Jon Postel |date=October 1994 | publisher = [[Internet Engineering Task Force|IETF]] | access-date = 2012-03-02 }} </ref> However, not all protocols use big-endian byte order as the network order. The [[Server Message Block]] (SMB) protocol uses little-endian byte order. In [[CANopen]], multi-byte parameters are always sent [[least significant byte]] first (little-endian). The same is true for [[Ethernet Powerlink]].<ref>Ethernet POWERLINK Standardisation Group (2012), ''EPSG Working Draft Proposal 301: Ethernet POWERLINK Communication Profile Specification Version 1.1.4'', chapter 6.1.1.</ref> The [[Berkeley sockets]] [[application programming interface|API]] defines a set of functions to convert 16-bit and 32-bit integers to and from network byte order: the {{code|htons}} (host-to-network-short) and {{code|htonl}} (host-to-network-long) functions convert 16-bit and 32-bit values respectively from machine (''host'') to network order; the {{code|ntohs}} and {{code|ntohl}} functions convert from network to host order.<ref> {{cite book | title = The Open Group Base Specifications Issue 7 | author = IEEE and The Open Group | date = 2018 | volume = 2 | chapter = 3. System Interfaces | page = 1120 | url = https://pubs.opengroup.org/onlinepubs/9699919799/functions/htonl.html | access-date = 2021-04-09 }} </ref><ref>{{Cite web|title=htonl(3) - Linux man page|url=https://linux.die.net/man/3/htonl|access-date=2021-04-09|website=linux.die.net}}</ref> These functions may be a [[no-op]] on a big-endian system. While the high-level network protocols usually consider the byte (mostly meant as ''[[octet (computing)|octet]]'') as their atomic unit, the lowest network protocols may deal with ordering of bits within a byte. ==Bit endianness== [[Bit numbering]] is a concept similar to endianness, but on a level of bits, not bytes. '''Bit endianness''' or '''bit-level endianness''' refers to the transmission order of bits over a serial medium. The bit-level analogue of little-endian (least significant bit goes first) is used in [[RS-232]], [[HDLC]], [[Ethernet]], and [[USB]]. Some protocols use the opposite ordering (e.g. [[Teletext]], [[I²C|I<sup>2</sup>C]], [[SMBus]], [[PMBus]], and [[synchronous optical networking|SONET and SDH]]<ref>Cf. Sec. 2.1 Bit Transmission of [http://tools.ietf.org/html/draft-ietf-pppext-sonet-as-00 draft-ietf-pppext-sonet-as-00 "Applicability Statement for PPP over SONET/SDH"]</ref>), and [[ARINC_429#Bit_numbering,_Transmission_Order,_and_Bit_Significance|ARINC 429]] uses one ordering for its label field and the other ordering for the remainder of the frame. Usually, there exists a consistent view to the bits irrespective of their order in the byte, such that the latter becomes relevant only on a very low level. One exception is caused by the feature of some [[cyclic redundancy check]]s to detect ''all'' [[burst error]]s up to a known length, which would be spoiled if the bit order is different from the byte order on serial transmission. Apart from serialization, the terms ''bit endianness'' and ''bit-level endianness'' are seldom used, as computer architectures where each individual bit has a unique address are rare. Individual bits or [[bit field]]s are accessed via their numerical value or, in high-level programming languages, assigned names, the effects of which, however, may be machine dependent or lack [[software portability]]. == Notes == {{NoteFoot}} == References == {{Reflist}} [[Category:Computer memory]] [[Category:Data transmission]] [[Category:Metaphors]] [[Category:Software wars]]'
Unified diff of changes made by edit (edit_diff)
'@@ -1,38 +1,2 @@ -{{Short description|Order of bytes in a computer word}} -{{Redirect2|Big-endian|Little-endian|the conflicting ideologies in ''Gulliver’s Travels''|Lilliput and Blefuscu#History and politics}} -{{Multiple issues| -{{refimprove|date=July 2020}} -{{Condense|date=August 2020}} -}} - -In [[computing]], '''endianness''' is the order or sequence of [[byte]]s of a [[word (data type)|word]] of digital data in [[computer memory]]. Endianness is primarily expressed as '''big-endian''' ('''BE''') or '''little-endian''' ('''LE'''). A big-endian system stores the [[most significant byte]] of a word at the smallest [[memory address]] and the [[least significant byte]] at the largest. -A little-endian system, in contrast, stores the least-significant byte at the smallest address.<ref>[https://betterexplained.com/articles/understanding-big-and-little-endian-byte-order/ Understanding big and little endian byte order]</ref><ref>[https://developer.apple.com/library/archive/documentation/CoreFoundation/Conceptual/CFMemoryMgmt/Concepts/ByteOrdering.html#//apple_ref/doc/uid/20001150-CJBEJBHH Byte Ordering PPC]</ref><ref>[https://developer.ibm.com/articles/au-endianc/ Writing endian-independent code in C]</ref> '''Bi-endianness''' is a feature supported by numerous computer architectures that feature switchable endianness in data fetches and stores or for instruction fetches. -Other orderings are generically called '''middle-endian''' or '''mixed-endian'''.<ref>{{cite web |title = Internet Hall of Fame Pioneer |url = http://internethalloffame.org/inductees/danny-cohen |website = [[Internet Hall of Fame]] |publisher = [[The Internet Society]] }}</ref><ref>{{cite web -|first = David |last = Cary -|title = Endian FAQ -|url = http://david.carybros.com/html/endian_faq.html -|access-date = 2010-10-11 -}}</ref><ref>{{cite journal - |last = James |first = David V. - |title = Multiplexed buses: the endian wars continue - |journal = [[IEEE Micro]] - |date = June 1990 - |volume = 10 - |issue = 3 - |pages = 9–21 - |doi = 10.1109/40.56322 - |s2cid = 24291134 - |issn = 0272-1732 - }}</ref><ref>{{cite journal - |last1 = Blanc |first1 = Bertrand - |last2 = Maaraoui |first2 = Bob - |title = Endianness or Where is Byte 0? - |date = December 2005 - |url = http://3bc.bertrand-blanc.com/endianness05.pdf - |access-date = 2008-12-21 - }}</ref> - -Endianness may also be used to describe the order in which the [[bit]]s are transmitted over a communication channel, e.g., big-endian in a communications channel transmits the most significant bits first.<ref>{{cite web |title=RFC 1700 |url=https://tools.ietf.org/html/rfc1700}}</ref> Bit-endianness is seldom used in other contexts. - == Etymology == '
New page size (new_size)
46958
Old page size (old_size)
49608
Size change in edit (edit_delta)
-2650
Lines added in edit (added_lines)
[]
Lines removed in edit (removed_lines)
[ 0 => '{{Short description|Order of bytes in a computer word}}', 1 => '{{Redirect2|Big-endian|Little-endian|the conflicting ideologies in ''Gulliver’s Travels''|Lilliput and Blefuscu#History and politics}}', 2 => '{{Multiple issues|', 3 => '{{refimprove|date=July 2020}}', 4 => '{{Condense|date=August 2020}}', 5 => '}}', 6 => '', 7 => 'In [[computing]], '''endianness''' is the order or sequence of [[byte]]s of a [[word (data type)|word]] of digital data in [[computer memory]]. Endianness is primarily expressed as '''big-endian''' ('''BE''') or '''little-endian''' ('''LE'''). A big-endian system stores the [[most significant byte]] of a word at the smallest [[memory address]] and the [[least significant byte]] at the largest. ', 8 => 'A little-endian system, in contrast, stores the least-significant byte at the smallest address.<ref>[https://betterexplained.com/articles/understanding-big-and-little-endian-byte-order/ Understanding big and little endian byte order]</ref><ref>[https://developer.apple.com/library/archive/documentation/CoreFoundation/Conceptual/CFMemoryMgmt/Concepts/ByteOrdering.html#//apple_ref/doc/uid/20001150-CJBEJBHH Byte Ordering PPC]</ref><ref>[https://developer.ibm.com/articles/au-endianc/ Writing endian-independent code in C]</ref> '''Bi-endianness''' is a feature supported by numerous computer architectures that feature switchable endianness in data fetches and stores or for instruction fetches.', 9 => 'Other orderings are generically called '''middle-endian''' or '''mixed-endian'''.<ref>{{cite web |title = Internet Hall of Fame Pioneer |url = http://internethalloffame.org/inductees/danny-cohen |website = [[Internet Hall of Fame]] |publisher = [[The Internet Society]] }}</ref><ref>{{cite web', 10 => '|first = David |last = Cary', 11 => '|title = Endian FAQ', 12 => '|url = http://david.carybros.com/html/endian_faq.html', 13 => '|access-date = 2010-10-11', 14 => '}}</ref><ref>{{cite journal', 15 => ' |last = James |first = David V.', 16 => ' |title = Multiplexed buses: the endian wars continue', 17 => ' |journal = [[IEEE Micro]]', 18 => ' |date = June 1990', 19 => ' |volume = 10', 20 => ' |issue = 3', 21 => ' |pages = 9–21', 22 => ' |doi = 10.1109/40.56322', 23 => ' |s2cid = 24291134', 24 => ' |issn = 0272-1732', 25 => ' }}</ref><ref>{{cite journal', 26 => ' |last1 = Blanc |first1 = Bertrand', 27 => ' |last2 = Maaraoui |first2 = Bob', 28 => ' |title = Endianness or Where is Byte 0?', 29 => ' |date = December 2005', 30 => ' |url = http://3bc.bertrand-blanc.com/endianness05.pdf', 31 => ' |access-date = 2008-12-21', 32 => ' }}</ref>', 33 => '', 34 => 'Endianness may also be used to describe the order in which the [[bit]]s are transmitted over a communication channel, e.g., big-endian in a communications channel transmits the most significant bits first.<ref>{{cite web |title=RFC 1700 |url=https://tools.ietf.org/html/rfc1700}}</ref> Bit-endianness is seldom used in other contexts. ', 35 => '' ]
Whether or not the change was made through a Tor exit node (tor_exit_node)
false
Unix timestamp of change (timestamp)
1651657981