Citation Needed

I may revisit this later. Consider this a late draft. I’m calling this done.

“Should array indices start at 0 or 1? My compromise of 0.5 was rejected without, I thought, proper consideration.” — Stan Kelly-Bootle

Sometimes somebody says something to me, like a whisper of a hint of an echo of something half-forgotten, and it lands on me like an invocation. The mania sets in, and it isn’t enough to believe; I have to know.

I’ve spent far more effort than is sensible this month crawling down a rabbit hole disguised, as they often are, as a straightforward question: why do programmers start counting at zero?

Now: stop right there. By now your peripheral vision should have convinced you that this is a long article, and I’m not here to waste your time. But if you’re gearing up to tell me about efficient pointer arithmetic or binary addition or something, you’re wrong. You don’t think you’re wrong and that’s part of a much larger problem, but you’re still wrong.

For some backstory, on the off chance anyone still reading by this paragraph isn’t an IT professional of some stripe: most computer languages – including C/C++, Perl, Python, some (but not all!) versions of Lisp, and many others – are “zero-origin” or “zero-indexed”. That is to say, in an array A with 8 elements in it, the first element is A[0], and the last is A[7]. This isn’t universally true, though, and other languages from the same (and earlier!) eras are sometimes one-indexed, going from A[1] to A[8].
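(If you’ve never seen the difference in the flesh, here’s a minimal illustration in C, which is zero-indexed – the array contents are made up, obviously:)

    #include <stdio.h>

    int main(void) {
        int a[8] = {10, 11, 12, 13, 14, 15, 16, 17};
        /* zero-origin: the first element is a[0], the last is a[7] */
        printf("first: %d, last: %d\n", a[0], a[7]);
        return 0;
    }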

While it’s a relatively rare practice in modern languages, one-origin arrays certainly aren’t dead; there’s a lot of blood pumping through Lua these days, not to mention MATLAB, Mathematica and a handful of others. If you’re feeling particularly adventurous, Haskell apparently lets you pick your poison at startup, and in what has to be the most lunatic thing I’ve seen on a piece of silicon since I found out the MIPS architecture had runtime-mutable endianness, Visual Basic (up to v6.0) featured the OPTION BASE flag, letting you flip that coin on a per-module basis. Zero- and one-origin arrays in different corners of the same program! It’s just software, why not?

All that is to say that starting at 1 is not an unreasonable position at all; to a typical human, thinking about the zeroth element of an array doesn’t make any more sense than trying to catch the zeroth bus that comes by, but we’ve clearly ended up here somehow. So what’s the story there?

The usual arguments involving pointer arithmetic and incrementing by sizeof(struct) and so forth describe features that are nice enough once you’ve got the hang of them, but they’re also post-facto justifications. This is obvious if you take the most cursory look at the history of programming languages; C inherited its array semantics from B, which inherited them in turn from BCPL, and though BCPL arrays are zero-origin, the language doesn’t support pointer arithmetic, much less data structures. On top of that, other languages that antedate BCPL and C aren’t zero-indexed. Algol 60 uses one-indexed arrays, and arrays in Fortran are arbitrarily indexed – they’re just a range from X to Y, and X and Y don’t even need to be positive integers.

So by the early 1960s, there are three different approaches to the data structure we now call an array.

  • Zero-indexed, in which the array index carries no particular semantics beyond its implementation in machine code.
  • One-indexed, identical to the matrix notation people have been using for quite some time. It comes at the cost of a CPU instruction, or a disused word of memory, to manage the offset; usability isn’t free.
  • Arbitrary indices, in which the range is significant with regards to the problem you’re up against.

So if your answer started with “because in C…”, you’ve been repeating a good story you heard one time, without ever asking yourself if it’s true. It’s not about i = a + n*sizeof(x), because pointers and structs didn’t exist. And that’s the most coherent argument I can find; there are dozens of other arguments for zero-indexing involving “natural numbers” or “elegance” or some other unresearched hippie voodoo nonsense that are either wrong or too dumb to rise to the level of wrong.
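(For anyone who hasn’t seen it, here’s the shape of that arithmetic in modern C – a minimal sketch of what the pointer-arithmetic argument describes, with a made-up struct for illustration:)

    #include <stdio.h>

    struct x { double payload[4]; };

    int main(void) {
        struct x a[3];
        struct x *p = a;
        /* p + 1 advances by sizeof(struct x) bytes, not by one byte;
           likewise a[n] is defined as *(a + n) */
        printf("%zu %zu\n", sizeof(struct x),
               (size_t)((char *)(p + 1) - (char *)p));
        return 0;
    }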

The fact of it is this: before pointers, structs, C and Unix existed, at a time when other languages with a lot of resources and (by the standard of the day) user populations behind them were one- or arbitrarily-indexed, somebody decided that the right thing was for arrays to start at zero.

So I found that person and asked him.

His name is Dr. Martin Richards; he’s the creator of BCPL, now almost 7 years into retirement; you’ve probably heard of one of his doctoral students, Eben Upton, creator of the Raspberry Pi. I emailed him to ask why he decided to start counting arrays from zero, way back then. He replied that…

As for BCPL and C subscripts starting at zero: BCPL was essentially designed as a typeless language close to machine code. Just as in machine code, registers are typically all the same size and contain values that represent almost anything, such as integers, machine addresses, truth values, characters, etc. BCPL has typeless variables, just like machine registers, capable of representing anything. If a BCPL variable represents a pointer, it points to one or more consecutive words of memory. These words are the same size as BCPL variables.

Just as machine code allows address arithmetic, so does BCPL: if p is a pointer, p+1 is a pointer to the next word after the one p points to. Naturally p+0 has the same value as p. The monodic indirection operator ! takes a pointer as its argument and returns the contents of the word pointed to. If v is a pointer, !(v+i) will access the word pointed to by v+i. As i varies from zero upwards we access consecutive locations starting at the one pointed to by v when i is zero.

The dyadic version of ! is defined so that v!i = !(v+i). v!i behaves like a subscripted expression, with v being a one-dimensional array and i being an integer subscript. It is entirely natural for the first element of the array to have subscript zero. C copied BCPL’s approach using * for monodic ! and [ ] for array subscription. Note that, in BCPL, v!5 = !(v+5) = !(5+v) = 5!v. The same happens in C: v[5] = 5[v]. I can see no sensible reason why the first element of a BCPL array should have subscript one. Note that 5!v is rather like a field selector accessing a field in a structure pointed to by v.

This is interesting for a number of reasons, though I’ll leave their enumeration to your discretion. The one that I find most striking, though, is that this is the earliest example I can find of the understanding that a programming language is a user interface, and that there are difficult, subtle tradeoffs to make between resources and usability. Remember, all this was at a time when everything about the future of human-computer interaction was up in the air, from the shape of the keyboard and the glyphs on the switches and keycaps right down to how the ones and zeros were manifested in paper ribbon and bare metal; this note by the late Dennis Ritchie might give you a taste of the situation, where he mentions that five years later one of the primary reasons they went with C’s square-bracket array notation was that it was getting steadily easier to reliably find square brackets on the world’s keyboards.
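(By the way, that v!5 = 5!v equivalence Richards mentions isn’t a typo, and its C descendant still compiles today; a quick sketch you can check yourself:)

    #include <stdio.h>

    int main(void) {
        int v[6] = {10, 11, 12, 13, 14, 15};
        /* v[5] is defined as *(v + 5), and addition commutes, so 5[v] is legal too */
        printf("%d %d %d\n", v[5], *(v + 5), 5[v]); /* prints: 15 15 15 */
        return 0;
    }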

“Now just a second, Hoye”, I can hear you muttering. “I’ve looked at the BCPL manual and read Dr. Richards’ explanation and you’re not fooling anyone. That looks a lot like the efficient-pointer-arithmetic argument you were frothing about, except with exclamation points.” And you’d be very close to right. That’s exactly what it is – the distinction is where those efficiencies take place, and why.

BCPL was first compiled on an IBM 7094 (here’s a picture of the console, though the entire computer took up a large room) running CTSS – the Compatible Time Sharing System – that antedates Unix much as BCPL antedates C. There’s no malloc() in that context, because there’s nobody to share the memory core with. You get the entire machine and the clock starts ticking, and when your wall-clock time block runs out that’s it. But here’s the thing: in that context none of the offset-calculations we’re supposedly economizing are calculated at execution time. All that work is done ahead of time by the compiler.

You read that right. That sheet-metal, “wibble-wibble-wibble” noise your brain is making is exactly the right reaction.

Whatever justifications or advantages came along later – and it’s true, you do save a few processor cycles here and there and that’s nice – the reason we started using zero-indexed arrays was because it shaved a couple of processor cycles off of a program’s compilation time. Not execution time; compile time.
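(To be concrete about what’s being shaved: a one-origin array makes every subscript an offset-minus-one. Here’s the difference between the two address calculations, sketched in C for familiarity – an illustration of the principle, not a claim about the 7094’s actual code generation:)

    /* address of element i in an array at base, with word size w */
    unsigned long addr_zero_origin(unsigned long base, unsigned long i, unsigned long w) {
        return base + i * w;       /* zero-origin: no adjustment */
    }
    unsigned long addr_one_origin(unsigned long base, unsigned long i, unsigned long w) {
        return base + (i - 1) * w; /* one-origin: the extra subtract */
    }
    /* when i is a constant, the compiler can fold (i - 1) away – once, at compile time */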

Does it get better? Oh, it gets better:

IBM had been very generous to MIT in the fifties and sixties, donating or discounting its biggest scientific computers. When a new top of the line 36-bit scientific machine came out, MIT expected to get one. In the early sixties, the deal was that MIT got one 8-hour shift, all the other New England colleges and universities got a shift, and the third shift was available to IBM for its own use. One use IBM made of its share was yacht handicapping: the President of IBM raced big yachts on Long Island Sound, and these boats were assigned handicap points by a complicated formula. There was a special job deck kept at the MIT Computation Center, and if a request came in to run it, operators were to stop whatever was running on the machine and do the yacht handicapping job immediately.

Jobs on the IBM 7090, one generation behind the 7094, were batch-processed, not timeshared; you queued up your job along with a wall-clock estimate of how long it would take, and if it didn’t finish it was pulled off the machine, the next job in the queue went in and you got to try again whenever your next block of allocated time happened to be. As in any economy, there is a social context as well as a technical context, and it isn’t just about managing cost, it’s also about managing risk. A programmer isn’t just racing the clock, they’re also racing the possibility that somebody will come along and bump their job and everyone else’s out of the queue.

I asked Tom Van Vleck, author of the above paragraph and also now retired, how that worked. He replied in part that on the 7090…

“User jobs were submitted on cards to the system operator, stacked up in a big tray, and a rudimentary system read, loaded, and ran jobs in sequence. Typical batch systems had accounting systems that read an ID card at the beginning of a user deck and punched a usage card at end of job. User jobs usually specified a time estimate on the ID card, and would be terminated if they ran over. Users who ran too many jobs or too long would use up their allocated time. A user could arrange for a long computation to checkpoint its state and storage to tape, and to subsequently restore the checkpoint and start up again.

The yacht handicapping job pertained to batch processing on the MIT 7090 at MIT. It was rare — a few times a year.”

So: the technical reason we started counting arrays at zero is that in the mid-1960s, you could shave a few cycles off of a program’s compilation time on an IBM 7094. The social reason is that we had to save every cycle we could, because if the job didn’t finish fast it might not finish at all and you never know when you’re getting bumped off the hardware because the President of IBM just called and fuck your thesis, it’s yacht-racing time.

There are a few points I want to make here.

The first thing is that as far as I can tell nobody has ever actually looked this up.

Whatever programmers think about themselves and these towering logic-engines we’ve erected, we’re a lot more superstitious than we realize. We tell and retell this collection of unsourced, inaccurate stories about the nature of the world without ever doing the research ourselves, and there’s no other word for that but “mythology”. Worse, by obscuring the technical and social conditions that led humans to make these technical and social decisions, by talking about the nature of computing as we find it today as though it’s an inevitable consequence of an immutable set of physical laws, we’re effectively denying any responsibility for how we got here. And worse than that, by refusing to dig into our history and understand the social and technical motivations for those choices, by steadfastly refusing to investigate the difference between a motive and a justification, we’re disavowing any agency we might have over the shape of the future. We just keep mouthing platitudes and pretending the way things are is nobody’s fault, and the more history you learn and the more you look at the sad state of modern computing the more pathetic and irresponsible that sounds.

Part of the problem is access to the historical record, of course. I was in favor of Open Access publication before, but writing this up has cemented it: if you’re on the outside edge of academia, $20/paper for any research that doesn’t have a business case and a deep-pocketed backer is completely untenable, and speculative or historic research that might require reading dozens of papers to shed some light on longstanding questions is basically impossible. There might have been a time when this was OK and everyone who had access to or cared about computers was already an IEEE/ACM member, but right now the IEEE – both as a knowledge repository and a social network – is a single point of a lot of silent failure. “$20 for a forty-year-old research paper” is functionally indistinguishable from “gone”, and I’m reduced to emailing retirees to ask them what they remember from a lifetime ago because I can’t afford to read the source material.

The second thing is how profoundly resistant to change or growth this field is, and apparently has always been. If you haven’t seen Bret Victor’s talk about The Future Of Programming as seen from 1975 you should, because it’s exactly on point. Over and over again as I’ve dredged through this stuff, I kept finding programming constructs, ideas and approaches we call part of “modern” programming if we attempt them at all, sitting abandoned in 45-year-old demo code for dead languages. And to be clear: that was always a choice. Over and over again tools meant to make it easier for humans to approach big problems are discarded in favor of tools that are easier to teach to computers, and that decision is described as an inevitability.

This isn’t just Worse Is Better, this is “Worse Is All You Get Forever”. How many off-by-one disasters could we have avoided if the “foreach” construct that existed in BCPL had made it into C? How much more insight would all of us have into our code if we’d put the time into making Michael Chastain’s nearly-omniscient debugging framework – PTRACE_SINGLESTEP_BACKWARDS! – work in 1995? When I found this article by John Backus wondering if we can get away from Von Neumann architecture completely, I wondered where that ambition to rethink our underpinnings went. But the fact of it is that it didn’t go anywhere. Changing how you think is hard and the payoff is uncertain, so by and large we decided not to. Nobody wanted to learn how to play, much less build, Engelbart’s Violin, and instead everyone gets a box of broken kazoos.

In truth maybe somebody tried – maybe even succeeded! – but it would cost me hundreds of dollars to even start looking for an informed guess, so that’s the end of that.

It’s hard for me to believe that the IEEE’s membership isn’t going off a demographic cliff these days as their membership ages, and it must be awful knowing they’ve got decades of delicious, piping-hot research cooked up that nobody is ordering while the world’s coders are lining up to slurp watery gruel out of a Stack-Overflow-shaped trough and pretend they’re well-fed. You might not be surprised to hear that I’ve got a proposal to address both those problems; I’ll let you work out what it might be.

108 Comments

  1. Zeynep
    Posted October 22, 2013 at 6:29 pm | Permalink

    You’ve knocked it out of the park again.

    I knew nuffink about the relevant history and only a little about that “because pointers” thing, so I can’t comment to that. But IEEE does have a captive audience, if a limited one, for new members as long as they can keep offering conference registration discounts and grad students need to go to said conferences.

    I thought they had a history museum sort of thing publicly accessible, but I haven’t been there in years so I don’t know what’s available, or if it is indeed open.

  2. Posted October 22, 2013 at 7:25 pm | Permalink

    If you have an 8-bit register, it can contain any of 256 distinct values between 0 and 255.

    If you have a 256-element array indexed by that 8-bit register, the potential array subscripts are from 0 to 255. That’s why it’s more natural to start from zero, because 256 can’t be represented in the index register.

  3. mhoye
    Posted October 22, 2013 at 7:41 pm | Permalink

    There’s no “natural” here. It’s just as “natural” for your eight-bit register to denote a range from -128 to 127 as it is to denote a range from 0 to 255.

    Seriously, the whole point of this whole article I just put a whole pile of effort into writing is a direct rebuttal to the completely arbitrary, made up thing you just said.

  4. Posted October 22, 2013 at 8:10 pm | Permalink

    I always thought it was because early programmers were mathematicians (sure, the EEs built the hardware and had some input, but engineers aren’t any less respected now than they ever were no matter how we like to pretend otherwise), and their silly fixation on radices and what “10” means led to a lot of how languages were laid out.

  5. mhoye
    Posted October 22, 2013 at 8:20 pm | Permalink

    That’s how Fortran happened, but not C.

  6. Mike Kozlowski
    Posted October 22, 2013 at 10:09 pm | Permalink

    Great article.

    FYI, Perl is a bit more complicated than you let on. Look up $[.

  7. Alex G
    Posted October 23, 2013 at 12:36 am | Permalink

    This was fascinating, but I noticed you’d said it’s “final” but also this typo: “thought the entire computer took up a large room” should be “though…”

  8. mhoye
    Posted October 23, 2013 at 6:56 am | Permalink

    I’m always happy to fix typos. Thanks!

  9. Oration
    Posted October 23, 2013 at 7:22 am | Permalink

    : There’s no “natural” here. It’s just as “natural” for your eight-bit
    : register to denote a range from -128 to 127 as it is to denote a
    : range from 0 to 255.

    I think that depends on where your nature comes from. As someone that learned assembler first (I learned m68k assembly before learning C), it felt perfectly natural for me to start at zero.

    In any assembler memory range I worked with, I always had to think of the range as “Address of first byte + size of range” — which meant that any byte in the range was always “First byte + N”, where N=0 for the first byte.

    For the old greybeards that also learned assembly languages before learning anything compiled, it probably also felt perfectly natural for them to start from zero.

    I’d love to see Dr. Richards comment — have you let him know that this went up?

  10. Konrad
    Posted October 23, 2013 at 7:33 am | Permalink

    Concerning access to the classics of computing, I fully agree with what you say.

    Concerning the issue of indexing, you are simplifying the question way too much for my taste. History is one aspect, whose exploration requires a bit more than asking a question to a single person and then re-interpreting his answer to make it fit your argument.

    But history isn’t everything. Today, as you say, we see both zero-based and one-based indexing. Why did both survive over many decades? I’d say because both make sense, depending on how you look at the question. I don’t buy your “efficiency always wins” argument: Python has zero-based indexing, and yet its indexing operations are so expensive that an integer decrement wouldn’t make a difference. I don’t believe either that Guido van Rossum chose base zero because he was brainwashed by C: he made many radically different choices, not the least of which is indentation-based syntax. I am willing to believe that he chose base zero because that’s what seemed more natural to him.

  11. mhoye
    Posted October 23, 2013 at 8:30 am | Permalink

    I think that depends on where your nature comes from.

    ARGH.

  12. Posted October 23, 2013 at 8:55 am | Permalink

    This article was an interesting read, but it certainly misses a few of the bigger picture items. I understand your frustration with access to the academic literature, and I agree that we need to work on fixing it.

    The implication that we use 0-based arrays because of the IBM President’s penchant for yachting, however, is simply ridiculous.

    > Whatever justifications or advantages came along later – and it’s true, you do save a few processor cycles here and there and that’s nice – the reason we started using zero-indexed arrays was because it shaved a couple of processor cycles off of a program’s compilation time. Not execution time; compile time.

    No, no, no. Find a source for this, instead of stating this as fact after you’ve just spent several paragraphs complaining about unsourced superstition.

    We use 0-based indexing in many programming languages because it makes both pointer arithmetic and slicing more natural. See Dijkstra’s remarks about it here: https://www.cs.utexas.edu/~EWD/transcriptions/EWD08xx/EWD831.html

    You note that 1-based indexing is natural for many mathematical programming languages (Lua had a strong early connection to Fortran), and that’s true. What you failed to note is that 1-based indexing was much more common in older programming languages, and that there has been a trend to 0-based indexing due to what many consider its inherent advantages over 1-based indexing from a programming context.

    See the nearly endless discussion about this on the Julia programming thread on Reddit from last year: http://www.reddit.com/r/programming/comments/pv3k9/why_we_created_julia_a_new_programming_language/c3sikhi

  13. mhoye
    Posted October 23, 2013 at 9:18 am | Permalink

    Maybe you should read the whole thing again? Because there’s a link to Dijkstra’s article right there in the text already, as well as references to one-indexing in older languages. It’s all in the first third of the article.

  14. Posted October 23, 2013 at 9:30 am | Permalink

    I missed your Dijkstra reference, because it was in this sentence from the article:

    > there are dozens of other arguments for zero-indexing involving “natural numbers” or “elegance” or some other unresearched hippie voodoo nonsense that are either wrong or too dumb to rise to the level of wrong.

    I think a reasonable reading of that sentence is that Prof. Dijkstra is wrong or too dumb to rise to the level of wrong.

    It’s pretty clear that we have radically different views of how programming languages and their communities have evolved, as well as the reliability of sources. I think your point is lost, however, in the vitriolic language of this post.

  15. mhoye
    Posted October 23, 2013 at 9:41 am | Permalink

    I think a reasonable reading of that sentence is that Prof. Dijkstra is wrong or too dumb to rise to the level of wrong.

    Dijkstra’s a very smart, eminently respectable fellow, and I have no quarrel with him. People citing an only-tangentially-related paper written by Dijkstra twenty years after the fact is where the “too dumb to rise to the level of wrong” part comes into play.

  16. Posted October 23, 2013 at 10:01 am | Permalink

    I understand your point, but I feel that your article is losing its way between “why did we start doing this” and “why are we still doing it now”. My interpretation of history is that Martin Richards chose to use 0-indexing because it was natural to him when he designed the language and its implementation, not for any perceived performance improvements in the compiler. I think we disagree on that point, so we should probably leave it at that.

  17. mhoye
    Posted October 23, 2013 at 12:40 pm | Permalink

    The entire article I just wrote is, at length, a rebuttal to your position. Listen to yourself. “Felt natural”, “perceived improvements”. I don’t care what you “feel” or think someone “perceived”. You disagree with me, go do the research yourself and bring back the evidence.

  18. Posted October 23, 2013 at 1:06 pm | Permalink

    I find it odd that you don’t mention the most obvious reason why we’re still doing it today: Length calculations. If you have a start and end index, end-start gives you the length. Always. In languages with one-based indexes, there’s always a +1 or -1 to be strewn in there, and branching instructions you have to remember to add to go from start and length to actual end offset. Which is why maths generally uses zero-based indices.

  19. mhoye
    Posted October 23, 2013 at 1:39 pm | Permalink

    In languages with one-based indexes, there’s always a +1 or -1 to be strewn in there

    Um… you might want to work through that on a whiteboard or something. You’re not actually making the point you think you’re making, and when you say “maths generally uses zero-based indices”, you should perhaps note that two of the languages I specifically mention above as being one-indexed are “MATLAB” and “Mathematica”.

  20. Posted October 23, 2013 at 1:41 pm | Permalink

    Great depth on an overlooked piece of programming history. Appreciate the research and occasional, but welcome sarcasm.

  21. Justin
    Posted October 23, 2013 at 2:46 pm | Permalink

    Do you know anything more about the history of that debugger you mentioned? Because my eyes widened a bit at the sight of that email.

  22. mhoye
    Posted October 23, 2013 at 3:39 pm | Permalink

    I know, right? The last thing I heard about it was in this slashdot comment, years later.

  23. Posted October 23, 2013 at 4:34 pm | Permalink

    I should have expected that all the comments on here would miss the forest for the trees.

    As someone who has a liberal-arts background and now finds themselves working full-time in the tech world, I’ve always been hungry to learn about the history of our craft and the theoretical underpinnings of what we do. I had assumed that everyone else already knew this stuff — had learned it in school or something — and was sort of shocked that few other people in the field care. Other disciplines force you to read your history and your forebears, but we seem to leave that for the weirdos who care about writing compilers while everyone else just muddles through writing for-loops.

  24. Walt French
    Posted October 23, 2013 at 4:56 pm | Permalink

    From recollection and a smidge of Googling, FORTRAN was strictly 1-based thru FORTRAN 77, and only acquired arbitrary indexes in FORTRAN 90. The first language I encountered with arbitrary indexing was Wirth’s Pascal, designed in the late 60’s, with an obvious eye on ALGOL, which acquired arbitrary lower bounds in ALGOL 68.

    My own guess for the monstrosity of forcing needless mental gymnastics on programmers was the mentality that C was a structured assembly language, which tried to reflect hardware as opposed to helping the coder express himself. Or perhaps to assert an American independence from the effete Europeans.

  25. Walt French
    Posted October 23, 2013 at 5:07 pm | Permalink

    Woops, looks like FORTRAN 66 (IV) was the last with a fixed lower bound of 1.

    So by the time C was being fashioned, there were still popular languages that didn’t yet allow arbitrary bounds.

  26. Joe
    Posted October 23, 2013 at 5:19 pm | Permalink

    I think this blog post could be summed up entirely with “I don’t like zero-based indexing and the origin of it refutes your rational reasons for using it today. So there.”

    Understanding that a pointer refers to the first element of an array, pointer math is a perfectly logical reason for zero-based indexing to exist today. You saying otherwise is wrong. You don’t think you’re wrong and that’s part of a much larger problem, but you’re still wrong.

  27. mhoye
    Posted October 23, 2013 at 5:37 pm | Permalink

    Hi, Joe! Thanks for leaving your comment. I’m sure it’s made you feel very clever. Unfortunately, you’ve missed every last one of the points I’ve made. Don’t let that stop you from feeling smug! But perhaps in addition to that you could try reading for comprehension next time.

    Best,

    – mhoye

  28. Joe
    Posted October 23, 2013 at 5:56 pm | Permalink

    Wow. Why are you being so nasty? By quoting you I was hoping you’d realize how dogmatic you sound. I was hoping you’d elaborate more on why it’s your preference and less on why you think everyone else is wrong. But I’ve since lost interest.

  29. Posted October 23, 2013 at 8:23 pm | Permalink

    The 0/1 base issue predates computer programming by millennia.
    When discussing intervals, there are N intervals between N+1 points – does one count the intervals at their start point or their end point?
    The easiest case for understanding the question is around the turn of the millennium. Is the year 2000 the 1st year of the new millennium or the last year of the last millennium (only at midnight on 31st Dec. is the year 2000 ‘complete’ and thus ‘countable’)? Ditto when looking at decades. Does the ‘noughties’ span 2000-2009 or 2001-2010?
    For people’s ages, we normally say a child is 1 year old when they are more than 12 months old (and less than 24 months of age) – we count completed years. So to talk about the start point, we need to refer to day/month/year ‘zero’.
    On the other hand, when counting atomic things (like fingers) we ‘naturally’ start at 1!

  30. Posted October 23, 2013 at 8:55 pm | Permalink

    This is great. One of the best things I’ve read in a while. Thank you for writing it.

    Also, let us all fling lucre and fame at Michael Elizabeth Chastain!

  31. Anonymous
    Posted October 23, 2013 at 9:13 pm | Permalink

    A typo in your article: Dennis Ritchie, not Richie.

  32. mhoye
    Posted October 23, 2013 at 9:47 pm | Permalink

    God, misspelling DMR’s name, that’s embarrassing. Fixed, thanks very much.

  33. Posted October 23, 2013 at 11:07 pm | Permalink

    Well done, interesting, but!

    The citations you have mention that it looked more -natural- to think of pointers that way and there was no reason to do something else. So basically yes, it is the pointer argument…

    But you seem to be keen to make a different statement, so even if your facts, if I’m not missing anything, just say that it is natural for pointers, you make a cool elaborate argument about it being an optimization at compile time… Out of thin air…

    Yes, it is an optimization, but the BCPL author didn’t say optimization was the reason; that seems to be your fiction…

    Also, a guy in the comments mentions that if you start from 1 you basically waste one value… To which you reply that it’s not the point, because you could interpret a register to range from 1 to 256… But if you do so, then the compiler doesn’t need different logic and 1-based would not be slower to compile, so you destroy your own fiction to disprove another guy… :)

  34. Posted October 24, 2013 at 12:29 am | Permalink

    Guido van Rossum chimes in on why he chose 0-based indexing for Python at https://plus.google.com/115212051037621986145/posts/YTUxbXYZyfi

  35. Posted October 24, 2013 at 3:15 am | Permalink

    This is all about the “beginning point vs. median point” decision.
    If you identify every interval by its beginning point (and end), you go for 0-based indexing.
    If you identify every interval by its median point (and length), you go for 1-based indexing.

    The second one is more natural to our brain (like what you say). That’s why mathematicians chose it.
    BUT, remember, our brain is NOT DIGITAL. The design of common computers is much different from the design of our brain. That’s a much bigger and fundamental incompatibility. Perhaps this is gonna get solved by “Singularity” (and DNA computers or stuff like that) in the future. But we have to sacrifice a lot of our brain until we get there (not just about this issue).

  36. Posted October 24, 2013 at 3:31 am | Permalink

    Our brain is fuzzy. And in fuzzy logic, there are no precise boundaries, so beginning and end are meaningless.
    For example, the particles of every physical object are distributed across the whole world. But there is a point, like the center of gravity (similar to the median point), where the density of the object is probably (not precisely) at its maximum. So instead of defining a 3d boundary for the object, we define a density function from R^3 to R.
    This logic is more like median-point logic and 1-based indexing.

  37. Posted October 24, 2013 at 3:32 am | Permalink

    Also read about:
    http://en.wikipedia.org/wiki/Normal_distribution

  38. Posted October 24, 2013 at 3:42 am | Permalink

    Like Tim Peters said in The Zen of Python:
    Special cases aren’t special enough to break the rules.
    Although practicality beats purity.

  39. Tomalak Geret'kal
    Posted October 24, 2013 at 4:10 am | Permalink

    It’s not “the zeroth element” but the element “zero away from the start of the array”, which makes perfect sense to anyone with a logical mind. The remainder of your rant is irrelevant after you miss this point.

  40. Posted October 24, 2013 at 4:25 am | Permalink

    I may have misread the BCPL manual ( http://cm.bell-labs.com/cm/cs/who/dmr/bcpl.html ), but I don’t think that just because there isn’t dynamically allocated memory you can easily precompute all the -1 pointer offsets in the code.

    Functions (any BCPL block actually) can declare and allocate arrays on the stack (page 22 of the manual), and recursive functions are allowed (page 23), thus you can’t always predict during compilation where the array will be (and decrement all occurrences of this predicted position before writing it in the compiled code).

    Moreover, both sides of the vector application operator can be any arbitrary expression (page 11); you can’t expect to always have a constant one to statically decrement.

    Because of those two points, I think that even in the context of BCPL/CTSS, 1-indexed arrays would cost execution time as well as compilation time.

  41. Jack Jansen
    Posted October 24, 2013 at 5:10 am | Permalink

    Thought-provoking article, it really made me think hard about the issues (as well as bring back memories of BCPL, Pascal, and what-have-you:-).

    It is clear that 0-based addressing causes headaches and non-elegant code in all sorts of situations, but I clearly remember that 1-based addressing does the same in other situations.

    You now have me wondering whether it has something to do with the indices being truly discrete, or discrete points over a continuous domain…. To a passenger the first bus of the day is clearly bus number 1, but to the guy creating the schedule it may be more logical to think of it as bus 0….

    Thanks!

  42. Posted October 24, 2013 at 7:05 am | Permalink

    Maybe I’m missing something, but I don’t see anything that suggests that in the BCPL expression v!i, either v or i must be a constant – which is the only way you could optimize out the math required for 1-based indexing at compile time.

    I wonder if there is a processor that has a non-zero based indirect addressing scheme in its instruction set.

  43. Jalex
    Posted October 24, 2013 at 7:30 am | Permalink

    0, 1, who cares. Either one makes sense to me because 0 is a valid non-negative number, and 1 helps people who count with their fingers. When you use 1, it’s likely treated as 0 behind the scenes anyway.

  44. Greg
    Posted October 24, 2013 at 11:02 am | Permalink

    While the historical context you provide is interesting, I think it’s disingenuous to suggest that many of us have simply “been repeating a good story.” Ritchie himself wrote, “Because pointers in BCPL and B are merely integer indices in the memory array, arithmetic on them is meaningful: if p is the address of a cell, then p+1 is the address of the next cell. This convention is the basis for the semantics of arrays in both languages.” [http://cm.bell-labs.com/cm/cs/who/dmr/chist.html] Sure, maybe we don’t all know the citation off the tops of our heads (I had to look it up), but that doesn’t mean it can be described as a myth.

    Additionally, your conclusion that programmers start counting at 0 because of something that was at the root of BCPL assumes that subsequently developed languages simply copied its array semantics. The quote above from Ritchie suggests that C shares those semantics not just because it was convenient to copy from BCPL, but because it was a good idea with regard to pointer arithmetic.

    That said, this is some interesting historical context, and I thank you for sharing.

  45. Posted October 24, 2013 at 11:57 am | Permalink

    I like the yacht story, but it doesn’t sit quite right with me. Having now learned more about BCPL and the IBM 7090 architecture in the last 24 hours than anyone who wasn’t alive to use either should know, I’m skeptical of the connection. More than likely it comes down to the economics of the underlying hardware.

    The use of 0 to denote the first element in a vector in BCPL seems tied to its nature as a mid-level systems language with pointer access to memory. As Dr. Richards points out, BCPL did support offset addressing:
    LET V = VEC 10
    W := V!1

    W now contains the value of the second element of vector V. But you’re asking (rhetorically), why not the first element?

    I don’t buy the compile-time efficiency argument – not entirely. The amount of additional compile time required to make V!1 point to the first element of V is one “subtract 1” operation; while not nothing, this is at least amortized over every time that dereference is made at runtime. The real problem of a non-zero vector base comes in at runtime when the offset is non-constant.

    Recall that BCPL is a typeless language. V and W are just values; how they’re interpreted is entirely dependent on the operators applied to them (the exact opposite of C++, in a way). Assuming you want your regular math to be correct, then X had better equal 11 in the following example:
    V := 5
    W := 6
    X := V + W

    In BCPL’s notation, !V is the value stored at memory cell 5, and !W is the value stored at memory cell 6. Because BCPL is typeless, whether you add V to W or W to V doesn’t matter: !X points to the right location. (This also implies that having “VEC 10” just return a location one less than the actual start of the vector, so that !V = V!1, won’t work.)

    You could certainly make the case that the ! operator should make the necessary offset adjustment. This could be done with an extra “subtract 1” operation, but as luck would have it, the 7094 hardware supports indexed memory accesses[0], so the resulting machine code would actually be very efficient. But indexing requires the right offset, in this case a 1, be kept in one of the seven index registers. (The alternative, preloading one as needed for each dereference, would slow execution and bloat the executable.) Either way, some resources that (presumably) could be put to better use in a $23M (2012 dollars) machine with only 32K of 36-bit words would be tied up solely to deal with the syntactic sugar of 1-based vectors.

    So BCPL is 0-based because that’s the most efficient way to use the 7094’s 0-based hardware, and efficiency would be key on a very expensive piece of kit that (also presumably) you would want to keep busy 24/7. In that sense, yacht handicapping becomes just another resource constraint (it’s effectively a non-maskable interrupt that reduces the MIPS/day you have available).

    As for why the 7094 is 0-based and not 1-based, presumably this keeps the 2s-complement indexing adder (which is separate from the 1s-complement adder in the main ALU) simpler, an important consideration given that the machine was built using discrete transistors. Fewer transistors means a simpler design, simpler testing, and a longer MTBF, all of which ought to increase IBM’s profit.

    Footnote: my favourite piece of trivia that I learned while looking into this is that the 7094’s ferrite core memory needed to be kept within a certain temperature range – typically 106°F. To do this, early 7094s bathed the core memory in oil, which was heated and circulated with a pump. Sometimes, the circulatory system would fail.

    0. http://www.frobenius.com/indirect-addressing.htm

  46. Kyle Butt
    Posted October 24, 2013 at 12:18 pm | Permalink

    Haskell is arbitrarily indexed. Just so you know, at least since the Haskell ’98 standard:
    http://www.haskell.org/onlinereport/array.html

  47. mhoye
    Posted October 24, 2013 at 12:55 pm | Permalink

    I think it’s disingenuous to suggest that many of us have simply “been repeating a good story.”

    I didn’t say many, I said “most”, and a week spent googling my way through the Stack Overflow echo chamber overwhelmingly supports that assertion. If you have better evidence, I’m all ears! I’d love to find out that more people are better-informed than I currently suspect, that’d be great.

    But every time somebody says something about “natural numbers” – and there’s a lot of them around, many on this very page! – my case gets stronger.

  48. Posted October 24, 2013 at 1:12 pm | Permalink

    Thanks for a really great article. You’ve given me great material to add depth and richness to the discussion around 0-indexing, tech standards (and where they come from), and why programming languages generally suck to learn for non-programmers.

  49. Posted October 24, 2013 at 1:19 pm | Permalink

    A thing that seems really hard for people to come to grips with sometimes is that we live in the world we make; that is, that practically nothing about the circumstances of human life in 2013 is “natural” or inevitable[1]. Including, obviously, programming language UI decisions like how to index arrays.

    This is endlessly troublesome, because if you don’t recognize the fact that most everything is the result of some kind of path-dependent series of human decisions, you’re not going to be likely to recognize that it is all subject to change and improvement. Both in programming and in the world in general.

    [1] Though obviously if you believe in a strictly mechanical and deterministic universe, _everything_ is both natural and inevitable. But that’s not a helpful mode of thinking, usually.

  50. Quintesse
    Posted October 24, 2013 at 4:44 pm | Permalink

    > People citing an only-tangentially-related paper written by Dijkstra twenty years after the fact

    20 years after the fact? I think *you* need to do some fact-checking: he wrote that in ’82, after more than 30 years in IT.

    He’s also the author of the visionary “A Case against the GO TO Statement” paper. I’ll take his opinion over yours any day of the week :)

    Besides, I’m Dutch, he’s our country’s greatest contribution to IT, he can’t be wrong ;)

  51. Fidel Orozco
    Posted October 25, 2013 at 3:29 pm | Permalink

    I like your effort to move past preconceptions. But the zero-indexed array was not a bad decision by programmers. Not BCPL or C or anything. The low level only reflects the bare-metal design; it was a decision by hardware engineers. A memory chip with all the bits/signals on its address bus turned off (zero) is a valid memory location where you can store the bits from the data bus. So you can store at position 0. It was a stupid hardware design.

  52. Posted October 25, 2013 at 3:39 pm | Permalink

    Like Angelo, I’m curious about the evidence for the optimization motivation. It seems just as likely that the compiler authors were motivated by simplicity of implementation.

  53. Jed Wesley-Smith
    Posted October 25, 2013 at 5:05 pm | Permalink

    Great article. One minor quibble though: to say that the famous Backus article “didn’t go anywhere” is to somewhat ignore functional programming, and languages such as Haskell that are based around these ideas. I’d simply contend that they are in fact going somewhere, and that place is very interesting, relevant, and active!

  54. Anonymous
    Posted October 26, 2013 at 5:10 am | Permalink

    Excellent post! I’m glad you mentioned Fortran. Those arrays were great — what a fun language to learn!

    Cheers.

  55. ceil
    Posted October 26, 2013 at 10:41 pm | Permalink

    I had a rather interesting conversation today with someone on this very subject. I, being the philosopher, argued in support of $array[1] being the first element. Guy I was discussing it with, being the programmer, liked his zero indices.

    His arguments were mostly along the lines of ‘pointers’ and ‘counting using an offset’. When I suggested that high-level languages ought to be _human_ readable and that the compiler could make the needed changes to achieve the desired effect, he then brought up the argument about efficiency in the compiler.

    As I understand it, high-level programming languages are designed to be as easy to use for the _human_ as possible. Humans start counting at one. If I have a box of items, I start counting the items at ‘one’, not ‘zero from the start of my count’. I conceded that he’d have a point if we were talking about ASM, but the discussion revolved around C, perl, and Lua – languages separated enough from metal to allow the interpreter/compiler to put $array[1] at spot 0000 if it needed, and $array[10] at 1001, etc.

    Of course, after our conversation, he mentioned that the smaller unit should go first when noting dates (DD/MM/YY[YY]). He didn’t respond when I asked if he also writes time as S:M:H :-)

  56. Marc
    Posted October 27, 2013 at 9:33 am | Permalink

    You are right: “natural” languages count from 1. That’s because the Romans failed to invent the zero; it was too late by then for zero to override one as the first digit.

    However it’s more “natural” to count from the more recent zero “invention” not just for bare metal computers but also for abstract, “inhuman” maths. That’s because p + 0 = p and other useful *interval* properties; see Guido’s post and others above for maths details.

    Designers of programming languages hesitate between natural languages and maths. I tend to prefer, and more importantly trust, maths-based languages, but it’s a question of education and taste – either option seems to do the job fine.

    And that’s it: I just summarized the entire debate. Except for the last, missing part: why do so many people care so much that they can’t even listen to the other side and prefer to write extremely long and boring rants instead? Maybe because counting is learned at such a very young age.

    PS: interestingly, many clocks count time from 0 to 23. This avoids *interval* problems like “what year did the 21st century start?”

  57. Posted October 28, 2013 at 5:54 am | Permalink

    So much SMOKE, so little FIRE. I will share one word only: CONVENTION.
    It is all about convention with historical reasoning, which you apparently CHOOSE to disregard as voodoo magic :)))

    In fact I urge you to create your own language where you start counting at .5 :))))
    see how far you will go and how happy you will be at the end.

    Otherwise, why 0? Although you will most likely disregard this as voodoo too, anyway:
    computers only understand binary, so at the point you want to represent any other number (the way humans do, what we know as math) you hit convention. If you can go left or right, one person will choose left, the other right :) simple!
    Why we start counting from 0 has to do with the “what do we do about 0” question first:
    is it positive, is it negative? (I know even math people don’t think about it much)
    start counting from -1?
    In a binary range 2^n we certainly want to represent those. You mentioned symmetry and elegance, but certainly that was not the convention taken; e.g. with 2^8 we have a range of -128 to +127, where 0 is considered part of the positive half. Certainly better than having a symmetric range and two zeros, +0 and -0.
    Now you are free to play with the combinations of what else is possible and fall back to the same convention used today :)

    Another point: DOES IT MATTER?
    I hadn’t even realized Lua is 1-based, since 99% of the time all you want is to iterate over all the elements in a collection!

  58. Randy
    Posted October 28, 2013 at 12:21 pm | Permalink

    0-based arrays make sense when the task accesses simple lists. As van Rossum said, a [closed, open) range notation is an elegant way to control iteration boundaries. But 1-based arrays make more sense when operating on matrix data (especially multidimensional). Expressing linear algebra is considerably more elegant with an origin of 1.

    It’s hardly a surprise then that low level pointer-centric languages like C/Modula chose the former notation while math-centric languages like Fortran/Matlab/Mathematica chose the latter.

    The real issue is, which coding idiom is appropriate for your task? It’s only languages like Python and C++, which attempt to serve all masters and thus span multiple coding idioms, where this debate and confusion will never die.

  59. Posted October 28, 2013 at 12:59 pm | Permalink

    Mike, you’re saying that foreach existed in BCPL, but skimming through the manual, I can’t find it (Google doesn’t help either). Can you comment on how foreach was used in BCPL?

  60. Jegbert
    Posted October 28, 2013 at 2:14 pm | Permalink

    So, you’re saying that every person who indexes from 0 does so because. In other words, there’s no such thing as independent invention, and people choose conventions solely based on familiarity, and are never exposed to multiple implementations and forced to pick which one they prefer for other reasons

    Colour me unimpressed by those assertions.

  61. Jegbert
    Posted October 28, 2013 at 2:15 pm | Permalink

    So, you’re saying that every person who indexes from 0 does so solely because they saw it in an earlier language. In other words, there’s no such thing as independent invention, and people choose conventions solely based on familiarity, and are never exposed to multiple implementations and forced to pick which one they prefer for other reasons

    Colour me unimpressed by those underlying assumptions.

  62. Tom
    Posted October 28, 2013 at 5:28 pm | Permalink

    Wow. Just wow. I can’t believe they teach so little in computer science these days that you actually believe any of this.

    You’ve totally got your cart before the horse. And just like that expression, over time it gets all mixed up and people start saying “you’ve got your horse before the cart.”

    The simple answer you started with IS the answer. The fact that the efficiency of zero indexing might reveal itself through these (or any of a thousand other) cute little stories about when computer efficiency actually mattered DOES NOT MEAN that these cute little stories were the reason. These cute little stories were side-effects of the ONLY REASONABLE SOLUTION (and also pretty much blindingly obvious solution) to any programmer working in that era. And it was the only obvious and reasonable solution, because, DUH, pointers and adding zero.

    The only thing really interesting about this article is just how far removed modern programming practices are from any concept of caring about what’s going on inside the computer.

    Grumble grumble grumble,
    curmudgeonly yours,

    Tom

  63. Iridian Kiiskinen
    Posted October 28, 2013 at 7:49 pm | Permalink

    For the trivial cases, the differences between 0- and 1-based indexing can be said to be subjective. But for trivial cases, one shouldn’t be using integer indexing in the first place but proper abstractions for dealing with collections: map, for_each and the like. All trivial cases are moot for this discussion, imho.

    So, you’re using integer indexing for its integral properties. Say you have an array of items from an over-the-wire low-level library. You have an implicit segmentation, where each segment is 4 consecutive items. You don’t have multiple-level arrays.
    int segmentIndexFromItem0Based(int i) { return i / 4; }
    int segmentIndexFromItem1Based(int i) { return (i - 1) / 4; }
    int firstSegmentIndexFromItem0Based(int i) { return i & (~3); }
    int firstSegmentIndexFromItem1Based(int i) { return ((i - 1) & (~3)) + 1; }

    Iterating over the segments and interleaving data:
    0-based:
    for (int i = 0; i != isize; ++i) for (int j = 0; j != jsize; ++j)
    array[i * width + j] = data1[i] + data2[j];
    vs.
    1-based:
    for (int i = 1; i <= isize; ++i) for (int j = 1; j <= jsize; ++j)
    array[(i - 1) * width + j] = data1[i] + data2[j];

    With 1-based indexing you need to be dealing with the -1/+1's. And you will end up with off-by-ones unless you're careful, something you have to deal with far less with 0-based indexing. This is not a subjective difference and is the reason programming languages correctly choose to go against natural intuition in my honest opinion.

    My main critique towards this article is, that while it is an interesting take on the history and I appreciate it, it's rather short-sighted to assume that the main reason we're using 0-based indexing is that some granddaddy chose to use them because of the reasons of his day.
    No: the continuous propagation of 0-based indexing with new languages is due to the arithmetically superior properties when combating off-by-1 errors with more complex problems, especially when dealing with lower level abstractions. Which is stuff that matters for any serious programmer. In my less-than-humble opinion.

  64. Iridian Kiiskinen
    Posted October 28, 2013 at 7:50 pm | Permalink

    Exercise: spot the completely unintentional off-by-1 bug in my above code.

  65. Iridian Kiiskinen
    Posted October 28, 2013 at 8:15 pm | Permalink

    Tom, I’ll have to ask, how exactly are Python and C++ somehow serving 1-based indices? “bar”[1] == ‘a’ for both and it doesn’t stop there. With Python’s dict-basedness, you can pretend to do 1-based, but with C++ what you have is as 0-based a language as they come.

  66. Martin Fredriksson
    Posted October 29, 2013 at 3:44 pm | Permalink

    “One reason to count from 0 is that doing so encourages us to use asymmetric ranges to express intervals.”
    Koenig, Moo, in “Accelerated C++”, section 2.6, page 31.

    There are real and strong reasons for counting from 0, but they apply to specific situations and not to all situations, as several people have commented on (for example the excellent comment by Iridian Kiiskinen).

    You make several mistakes in your article. First, you seem to assume that there is *one right way* of counting, when the truth is that different problems have different preferred solutions. Second, I find it strange that you have researched this issue without finding at least the very simple and very commonly noted reason I quote above. If you haven’t got time to read any software development books that mention this issue, you could perhaps at least browse through the Wikipedia article on “Zero-based numbering”.

  67. Posted October 30, 2013 at 6:20 pm | Permalink

    MY EYES BURN!!

    Apart from that, I read your write-up again and again… and feel a little uncomfortable with the conclusion: it seems that you’re suggesting convention should be abandoned in favor of (potentially better) new systems. Well, yes and no: it’s strictly about mental disposition. You cannot ask a smith to stop using a hammer and use a brand-new super-efficient tool he has never seen in his life, but you can train his apprentice to experiment with it. Since, more likely, the apprentice is being taught by the smith himself, you can very well see how the hammer is going to survive a long time, even if the efficiency of the new tool has been proven beyond doubt.

    That said, I really enjoyed reading this article so….thanks!

    But seriously, consider changing the background/text color combination…..MY EYES BURN! :)

  68. Posted October 31, 2013 at 3:28 am | Permalink

    Thank you for getting me to revive memories about reading up on 0-based and 1-based indexing!

    As for your argumentation, I liked getting some more background, but in the end, the 0-based or 1-based decision is just a decision of user-interface.

    “123”[0] == “1” feels strange.
    “123”[0:1] == “1” feels about right.
    “123”[:1] + “123”[1:] == “123” feels really elegant: “till 1 plus from 1”
    “123”[0:0+3] == “123” feels good: Length 3.
    “123”[0:0+0] == “” feels right: Length 0 is the empty string.

    “123”[1] == “1” feels simple. If I had to teach a non-programmer how to create code in the shortest possible time, that would be my way. But then this
    “123”[1:1] == “1” (fortran) feels really strange: “It has a length of 0! Oh, wait, no, the second number is a different kind of number than the first.”
    “123”[1:1+3] is an error!

    “123”[+0] == “1” feels OK. But inelegant: Why do I have to think of this as offset?
    “123”[+0:+1] == “1” feels OK.
    “123”[+0:+3] == “123” feels OK.

    The biggest difference I see is in this:

    0-based (python):

    “123”[:1] + “123”[1:] (1 is the dividing point) and
    “123”[n:n+length]

    1-based (fortran):

    “123”[:1] + “123”[2:] and
    “123”[n:n+(length-1)]

    0-based indexing has different meaning to the parts of the interval (closed vs. open) while 1-based indexing requires different values for the numbers you put in. And in my opinion, number-based errors are much harder to spot than index-based ones (“you always use 0-indexed arrays” vs. “but in this field you have to increment or decrement the number by 1”).

    In the end, I would decide to use 0-based indexing, because of fuzzy preference: The number 0 was an extremely important step in the development of *counting*, and this argument is much stronger than the practicability arguments I just presented (because adding up all the factors, the approaches are very similar: All of them have some inelegant parts).

  69. Posted October 31, 2013 at 3:30 am | Permalink

    PS: Note that I intentionally chose an example which is hard for 0-based indexing: Numbers starting from “1”.

  70. Andrew
    Posted October 31, 2013 at 3:40 am | Permalink

    I learned off-by-one mathematics by writing line editors for GW-Basic and Pascal. While I didn’t have the theoretical background at the time, I could feel there was something wrong with always having to add or subtract one, and having to check and double check loops over intervals.

    If people do programming for any length of time, they will learn the offset problem, and make a solution. Turbo Pascal’s solution was to support multi-dimensional arrays, in which the compiler does the off-by-one adjustments for you — instead of x[(i-base)*blah+j], it’s x[i,j]. This works until the dimensions of the array are not known in advance (e.g. during string manipulation).
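
    A sketch of the bookkeeping Turbo Pascal hid (the names here are made up): for an array declared over [ibase..imax, jbase..jmax], the compiler lowers every x[i,j] to something like

    x[(i - ibase) * width + (j - jbase)]   /* width = jmax - jbase + 1 */

    so the programmer writes x[i,j] and the -1 adjustments happen once, in the compiler, instead of at every use site.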

  71. Posted October 31, 2013 at 12:14 pm | Permalink

    Hi Mike: I don’t really care if you’re 100% correct (or even 100% wrong) about this – telling people that “yacht-racing influenced zero-based indexing code optimization” makes a great story, and is a terrific teaching aid when introducing people to the idea of counting from zero. I wish I’d known this a couple of weeks ago. At the time I just hand-waved away zero-indexing by saying “just because programmers do things like that”. Yacht-racing would have made for a better lecture.

    Thanx for adding another anecdote to our mythology!

    –Bob.

  72. Posted October 31, 2013 at 3:55 pm | Permalink

    Great article! However, I get that it’s plausible that Martin Richards was shaving compile time off his jobs, but you don’t exactly cite him giving that as a reason. You connect a few dots that seem to imply this might be the case, but it’s not quite a smoking gun. Did you ask him whether saving compile time was a factor when he made this design decision?

  73. James
    Posted November 1, 2013 at 3:43 am | Permalink

    I don’t think it at all reasonable to paraphrase:

    “It is entirely natural for the first element of the array to have subscript zero…I can see no sensible reason why the first element of a BCPL array should have subscript one.”

    to

    “the reason we started using zero-indexed arrays was because it shaved a couple of processor cycles off of a program’s compilation time”.

    He doesn’t even mention performance. You could perhaps take it as implied that he is talking about performance, but I don’t take that reading.

    That said, though 0-indexing makes most sense when dealing with pointer arithmetic, very little programming nowadays uses pointer arithmetic, so if everyone wants to switch over to 1-indexing that’s fine by me.

  74. Igor Andronov
    Posted November 1, 2013 at 9:09 am | Permalink

    In the late 60’s I learnt to program in Algol; it was implemented on an Elliot 803B computer where array indices were arbitrary, which is what I came to expect from a ‘high level’ language. Why should I start from one or zero? My real-world problem could map to an array with any negative or positive starting point in any dimension. The only stipulation was that the lower bound had to be less than the upper bound (actually, and perversely, they could also be equal). Later I moved on to program in assembler, where the notion of a subscript was whatever you wanted to make it: get the storage address of the list’s starting point, visualise it as whatever you like, and index along by incrementing the address depending on what you were doing. So to me the idea of the first element of an array being either zero or one does not matter a bit; I can live with either, it’s just the convention in a particular language. As for all this talk of performance differences: to someone who used to choose instruction types based on their execution time, I don’t think it really makes a blind bit of difference…

  75. Posted November 1, 2013 at 11:06 am | Permalink

    I’m surprised you didn’t mention CFML (ColdFusion). It’s not as old as some of those languages, but it’s been doing 1-based arrays since 1995. I often see people complaining that it’s dumb for doing so, but it’s a high level web scripting language originally aimed at HTML designers and I always thought it made a great deal of sense.

  76. Craig Hughes
    Posted November 1, 2013 at 11:45 am | Permalink

    Try doing:

    a[i++ % n]

    if you’re using base-1 indices…
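
    Spelled out (a sketch): with 0-based indices the modulus and the index range line up; with 1-based ones they don’t.

    item = a[i++ % n];           /* 0-based: indices 0..n-1, done */

    item = a[(i - 1) % n + 1];   /* 1-based: shift down, wrap, shift back up */
    i++;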

  77. H. Peter Anvin
    Posted November 1, 2013 at 1:00 pm | Permalink

    I had run into exclusive use of zero-based indexing in languages that had no connection or shared history with CPL, BCPL, B, or C, long before I had heard of any of those languages. I suspect that the combination of zero-based addressing being fundamental for assembly programmers and the tradition in some branches of mathematics of using zero-based indexing accounts for that, but I don’t know for sure. I do find it somewhat frustrating that arbitrary-base indexing isn’t more universal… in languages which have it, I have sometimes used some really odd array bases in order to make the code cleaner.

  78. Russ
    Posted November 1, 2013 at 2:16 pm | Permalink

    A fun exploration, but you managed to invalidate your opening argument (that pointer arithmetic is not the reason). As you established, it works that way because the creator of BCPL felt that it was “entirely natural”. His explanation is entirely based on _arithmetic_ operations on what he calls _pointers_.

    So pointer arithmetic is exactly the reason why arrays in languages descended from BCPL are zero-indexed, and I wasn’t wrong at all! Hooray for me.

  79. Sergey
    Posted November 2, 2013 at 12:35 am | Permalink

    I have not seen this mentioned yet, but 1-based indexing has one huge advantage: an ‘invalid’ index (0). Consider writing a ‘find’ function for an element in an array, and think about how to handle the situation when the element is not found; in a 1-based world we can just return 0 in this case. (Using negative numbers is not always possible if your index must be an unsigned type.)
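
    A sketch of the idea in C (my code; the underlying array is still 0-based here, only the returned position is 1-based):

    #include <stddef.h>

    /* Returns the 1-based position of key in a[0..n-1], or 0 if absent. */
    size_t find1(const int *a, size_t n, int key) {
        for (size_t i = 0; i < n; ++i)
            if (a[i] == key) return i + 1;
        return 0;   /* 0 doubles as the "invalid" index */
    }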

  80. Posted November 2, 2013 at 6:29 am | Permalink

    Very nice work!

    I’ve written a book-length treatment of essentially this argument, that programmers’ insistence on an ahistorical perspective is responsible for the persistence of harmful myths that hinder progress.

    This is the book: http://leanpub.com/leprechauns

    Would you allow me to use some of the above material in one way or another for the book? If that’s of some interest, please shoot me an email.

  81. Kaleberg
    Posted November 2, 2013 at 11:33 pm | Permalink

    I always gathered that the first zero indexed arrays were in assembly language, at least since the introduction of index registers in the 1950s. If you wanted to load the n-th element of an array into the accumulator, you’d write: LOAD ARRAY(N) where ARRAY was the symbol defined at the start of your array and N was the index register number. The hardware added ARRAY, which was embedded in your instruction by the assembler and loader, to the value in the index register and fetched that word. If you had one based arrays, you’d have to write: LOAD ARRAY-1(N) which is a pain in the ass.

    Interestingly, data structures in assembly language were considered “reverse indexing”. I ran into a guy who wrote a paper about the idea of using the index registers to store pointers rather than indices. I’ll see if I can find a citation. So, to create a data structure, you just defined offsets in the assembler language (e.g. NEXTPTR = 0; KEY = 1; DATUM = 2), then you could put the pointer to the data structure in an index register, say register P, and just “reverse index” load: LOAD VALUE(P) and voila, a whole new way of accessing data. Notice how it turned array indexing on its head. If you look back at programming in the 1950s, this was a radical new approach.
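
    That “reverse indexing” trick translates directly into C (a sketch, using the field names from the comment above):

    enum { NEXTPTR, KEY, DATUM };   /* fixed offsets: 0, 1, 2 */
    long node[3];
    long *p = node;                 /* p plays the index register */
    long k = p[KEY];                /* like LOAD KEY(P): offset fixed, base varies */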

  82. Posted November 2, 2013 at 11:54 pm | Permalink

    I don’t think the reference to BCPL goes back far enough to pin down the origin of zero-indexed arrays. The original version of the JOVIAL language had zero-indexed arrays in 1961. The earliest citation I can readily find online is “A specification of JOVIAL” from the December 1963 CACM (unfortunately, the text is only available if you have ACM Digital Library access):
    —-
    In designating an individual value from an n-dimensional array, the array item name must be subscripted by an n-component index list of numeric formulas; and where the size of a dimension is k items, the integral value (truncated, if necessary) of the corresponding component of the index list can only range from 0 thru k-1.
    —-
    Given that JOVIAL was intended for use in developing “embedded” military systems, it had a number of other features which clearly influenced C and its predecessors. I’d say it is fairly likely Dr. Richards knew of JOVIAL in 1967 when he came up with BCPL; it was a well-known language when I started programming in 1969….

  83. Kaleberg
    Posted November 3, 2013 at 12:03 am | Permalink

    The citation may be “A generalized technique for symbol manipulation and numerical calculation”, Douglas Ross, CACM 1961. Read on to the second page of the article.

  84. Kaleberg
    Posted November 3, 2013 at 11:56 am | Permalink

    I’m not sure the issue was saving a cycle or two during compilation. Even with a 10ms cycle time, you might be talking about a second or two for a program with hundreds of array definitions. By the time you got to 1ms cycles, this was irrelevant.

    The real issue was mathematical convention. The first element of an array had index one. The top left corner of a matrix was element one comma one. If you were doing general relativity, the indices were 1, 2, 3, 4 – one for each dimension of space-time. (Look at an old GR book, not Gravitation.) If you look at the algorithms and related notes for the ENIAC, the first element was element one.

    The first computer languages tended to respect this convention. You had one based arrays in COBOL and FORTRAN. The former because it was supposed to resemble English and the latter because it was supposed to resemble mathematics. Zero based indexing became more popular in later languages or languages that added arrays later (like LISP).

    I do appreciate your efforts to look at programming history. It is obvious that some people do. For example, Apple has been developing what they used to call a “capability” based system architecture for its IOS and MacOS. The idea is that each program or component has just the access rights it needs rather than some high level of general access. They are moving towards what they used to call a “domain” based access control architecture, though, for now at least, each application runs in its own domain rather than a more general approach.

    Even back in the 1970s, it was fun to hit the thesis archives at MIT’s Barker Library. They were chock full of good ideas. In the 1980s when software patents were allowed, I knew that the thesis archives would be a great source of novel, non-obvious patents. They might be prior art, but since no one ever read them, who would call you out?

  85. Joker_vD
    Posted November 3, 2013 at 8:39 pm | Permalink

    As one of my teachers said, “2/3 of all CS stuff was invented by the end of the seventies. We haven’t implemented much of it yet, thirty-something years later.” And much criticism from that era still applies. Remember Dijkstra’s essay where he described an air-traffic-control program with ~1500 conditional branches and then complained that it’s impossible to test every possible execution path, so we really need formal verification systems? Well, people somehow managed to cope without them, but today… enter threads, and now we *really* need automated formal reasoning. Where is it? Haven’t seen it.

    And as for “mythology”… well, that’s how it is in the other sciences as well. Over here, the History of Mathematics course is usually offered to graduate math students only, since they have to sit the History of Science exam to qualify for a PhD program; undergrads are usually satisfied with the myths they hear during lectures on Calculus and Algebra. Not an entirely satisfying state of affairs, I admit, but hey: there are so many other things from the science itself to teach that there is no time to re-tell the history.

  86. Posted November 4, 2013 at 6:12 pm | Permalink

    Your quote from Dr. Martin Richards doesn’t say anything about efficient pointer arithmetic, at compile time or otherwise. He says “It is entirely natural” and “I can see no sensible reason why the first element of a BCPL array should have subscript one” and “BCPL was essentially designed as typeless language close to machine code”.

    It sounds to me like it was an aesthetic decision about keeping BCPL minimal and close to machine code. Given the x!y notation BCPL used, it also seems clear that he wasn’t very concerned about replicating math notation where indexing from 1 is the standard. So I think you also have to ask why the authors of C decided to keep indexing from 0 with the probably-math-inspired x[y] notation.

    So, you’re telling a story here that you inferred from some suggestive but not 100% clear accounts of how BCPL was designed and used. How is this better than the pattern of inferring mythologies that you’re railing against?

    The question really is, how much evidence is enough evidence? I looked at C and felt I understood well enough why arrays were 0-indexed. The reasoning that I suspect led to 0-indexed arrays seems to me to be a clear theme in the language. I didn’t feel the need to dig further, because the answer seemed fairly clear and the cost of being wrong seemed fairly low. You might think, well, but you can misinterpret the intentions in a language, so you should find the person who did it and ask him. But, how do you know you won’t misinterpret what he tells you, or that he even remembers it right? And at what point does your responsibility to gain confidence in what you believe end?

  87. Anon
    Posted November 10, 2013 at 4:12 pm | Permalink

    A very long, roundabout argument, which arrives at the wrong conclusion.

    The real reason is simple, very simple. Almost all CPUs contain an addressing mode within their instruction sets, usually denoted “base plus index” (or some variant).

    These addressing modes work by taking in the contents of two CPU registers (one containing the “base” address, the other containing the “index” value), adding them together in hardware, and using the result to retrieve a value from memory at the resultant address value.

    At the hardware level, the design looks very much like this:

    (base register) + (index register) => final address

    Now, because two values always enter the adder, and the final address is always a sum of two inputs, in order to reference the very first item (the item that lives at the address pointed to by “base”), one has to supply the adder with a zero as the “index” in order to output the “base” address value unmodified.

    And so began the zero-based array indexing concept. It was simply the way the hardware was designed. And it propagated through to the high level languages because the folks writing the high level languages were already accustomed to thinking in zero-based indexing from all the assembly language they were writing up to that point.
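
    In C clothing, the same adder (a sketch):

    int base[8];
    int i = 0;
    int *addr = base + i;   /* (base register) + (index register) => final address */
    /* With i == 0 the sum is base itself: the first element. */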

  88. John S. Gruber
    Posted November 10, 2013 at 8:22 pm | Permalink

    The story in your post reminded me of the 7040 computer I used to start to learn to program. The decks of cards were delimited by $JOB cards.

    Your history lesson doesn’t mention another big factor of years ago: memory was very tight and it was desirable for the object code to be as compact as possible for the sake of large applications. Many jobs were quite sizeable because the libraries weren’t shared. They had to fit in your allotment, too. It was worth wondering whether you wanted to “waste” several bytes or a word for an extra computation.

    If you were on an old enough system you had all (and only) the machine’s limited memory. For a 1401 that was perhaps a couple of K of storage units, something like a later minicomputer. Third-generation mainframe computers might have a couple of megabytes, but you would have to fit your program into perhaps 100KB.

    Larger programs might be runnable, but from less-favored job queues – and at one time that included an optimizing compiler. A normal job might be limited to half an hour of compute time and 128K of memory around 1970. One queue running jobs overnight might take 256K jobs running up to an hour or two of CPU time. Jobs that waited excessively in memory might be considered stalled and cancelled.

    I’m not trying to support the argument that any of these constraints caused all programmers to think in terms of 0 based indexes, but it would be incorrect to think that there was just one resource limit in play if you are going to consider constraints.

    I still take the time to delete the inter-word space when I split a long line into two short ones — it’s an old habit I don’t have the will to break.

  89. Justin
    Posted November 12, 2013 at 5:08 pm | Permalink

    Thank you! So glad I decided to read this. Informational, inspiring, thought provoking, all of it.

  90. Posted November 14, 2013 at 3:36 am | Permalink

    OK, I must not be understanding your argument at all. It seems like it’s, “you uninformed people think that arrays are zero-based because of pointer arithmetic, but in fact, it’s because of… pointer arithmetic”.

    First off, you state, “[BCPL] doesn’t support pointer arithmetic”. Then, you quote Dr. Richards, creator of BCPL, as saying, “Just as machine code allows address arithmetic so does BCPL…” Who’s wrong, you, or Dr. Richards?

    The truth is, C (and B before it, and BCPL before it), while high-level languages, were designed to allow low-level memory access, providing constructs that map efficiently to the underlying machine instructions. Portable machine language, if you like.

    In C, array notation uses pointer arithmetic to access elements. ISO/IEC 9899 states (section 6.5.2.1, Array Subscripting, paragraph 2):
    “The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).”

    In memory, if an array A is located at memory location 0, A[0] is equivalent to *(A + 0). Physical memory addresses are zero-indexed, and C (and its precursors) followed suit. End of story.
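
    The identity is easy to check directly (a sketch):

    #include <assert.h>

    int main(void) {
        int a[3] = {10, 20, 30};
        assert(a[1] == *(a + 1));   /* E1[E2] is (*((E1)+(E2))) */
        assert(1[a] == a[1]);       /* + commutes, so E2[E1] works too */
        return 0;
    }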

    As much as I enjoy your stories about old IBMs and yacht racing, I don’t understand how they’re relevant. What am I missing here?

  91. jonesy
    Posted November 15, 2013 at 6:50 am | Permalink

    Thanks, Mike, for a fine thought-provoking piece of writing, and thanks all for some most interesting, informative, comments.

    A quibble, to all those having difficulty with counting: zero was not added by the Hindus to help count; it was invented to help do arithmetic and higher maths.

    A calendar at its most sophisticated is a counting – there is no zero anything. There is in some calendars an arbitrary event from which one may count forwards and backwards – with no need for any zero anything; you’re either in year 1 or -1. A child of less than one year’s age is not zero years old; he’s four months, seven months, etc.

    As for measuring length, “zero” is the metal thingy at the end of my tape measure to hang on the end of a board. If I’m doing time, it’s a null until I start the timer and then I’m in a fraction of the first second. But it’s true that a zero makes a convenient place-holder for showing that the timer hasn’t started yet.

    If one wants to start an index with zero for some reason, go ahead; there are handy uses. But please don’t misrepresent it as something it’s not. I’ll continue to start a matrix with 1,1.

    Which reminds me: GFA Basic (3.5u, anyway) let one use zero or one – but it had to be declared first off, and maintained throughout. One could do offsets on their own, however, something I found handy in some functions and subroutines. So long as I clearly commented it, it worked OK.

    Fortran IV and Fast Fortran were so long ago that I’ve clean forgotten them. BCPL and such were what the priests played with in the computer center; I was but a lowly engineering student turning in card decks hoping to get a clean run.

  92. Posted November 15, 2013 at 9:07 am | Permalink

    Pascal defaulted to 1-based index arrays but allowed you to choose anything that you wanted.

    You could define an array a[-50..100] if you wanted to. For that matter I believe that ALGOL allowed the same thing.

    I thought this was a nice bit of syntax sugar that allowed the array to follow your problem in a natural way. The compiler would then do the conversion for you.
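
    One plausible way a compiler lowers a[-50..100] (a sketch; the names are mine):

    int storage[151];        /* 100 - (-50) + 1 elements */
    int *a = storage + 50;   /* bias the base once: a[-50] is storage[0] */
    a[-50] = 1;              /* *(storage + 0)   */
    a[100] = 2;              /* *(storage + 150) */

    The offset is folded into the base address up front, so indexing costs nothing extra at run time.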

    It probably wasn’t used enough to make it into later languages.

    I blame C for this and the terrible == for comparison and = for assignment construct.

  93. Lynn Dobbsq
    Posted November 15, 2013 at 11:23 am | Permalink

    Wow. That was wonderful. A friend of mine is forever talking about what we knew in the 60s and 70s that we don’t know now. He speaks with awe about Xerox PARC, for instance.

    I feel your pain about some of the comments to your post.

    Human beings, in the main, are superstitious creatures who do not like change or controversy. We quickly shift the arbitrary and capricious rules of our childhood into our personal reality. We seek out “our own kind” so that we have a more or less comfortable fit of personal realities.

    There is no reason to assume that computer professionals at every level aren’t the same way. I think it is nigh on to miraculous that human beings can ever innovate.

    It is rare and we often celebrate those who can (or those who can make some personal fame out of stealing another’s innovation).

    Off my hobby horse, now. :)

    Thank you for this article.

  94. malix
    Posted November 15, 2013 at 11:57 am | Permalink

    WOW! Thank you for that! I laughed, I cried, and the comments were almost more entertaining than the article!
    PLEASE keep rebutting all the idiots, I mean commenters…

  95. David
    Posted November 15, 2013 at 1:30 pm | Permalink

    What everyone, including the author, fails to acknowledge is that there is no Array; it doesn’t exist, as it’s an abstract data type.

    Having a 0 index simply acknowledges this.

  96. mhoye
    Posted November 15, 2013 at 3:28 pm | Permalink

    Well, Slashdot comments are pretty much exactly what I remember them being.

  97. slashdotter
    Posted November 15, 2013 at 4:44 pm | Permalink

    thanks for leaving slashdot, please don’t stay in touch.

  98. Posted November 15, 2013 at 8:09 pm | Permalink

    First post, you insensitive clod!

  99. Posted November 17, 2013 at 5:50 am | Permalink

    @David, technically, arrays are a derived type.

    The standard says:
    “An array type describes a contiguously allocated nonempty set of objects with a particular member object type, called the element type.36) Array types are characterized by their element type and by the number of elements in the array. An array type is said to be derived from its element type, and if its element type is T , the array type is sometimes called ‘‘array of T’’. The construction of an array type from an element type is called ‘‘array type derivation’’.”

  100. mhoye
    Posted November 18, 2013 at 10:12 am | Permalink

    frist psot

  101. Michael Anthony
    Posted November 18, 2013 at 4:42 pm | Permalink

    I’m ambivalent on this; well, I’m the sort of person that would give his right arm to be ambidextrous, but would I still be?

    People get lost in tall buildings because of zero indexing of the floors / levels / storeys.
    During the town planning stages the ground floor is usually (in most countries) exempt from rateable floor area, because it is there anyway without the building. So it gets to be floor 0.

    At construction (analogous to computer technicians) they just roll with whatever. I doubt the boffins would have seen any advantage in starting at index 1 when the yacht interrupt came in. But the builder has to build the 0 floor, and it has costs to pass on to the client. A 3-storey building under the above designation has 4 levels of cost. He sees the ground floor as Floor 1.

    The lift manufacturer gets handed the specification for marking the floors, depending on who that spec came from: the architect or the builder.

    So the end user, Joe Public, who is quite separate from the development (compiler) process, could either “naturally” see First Floor as the first floor in the air, or as the floor behind the first door she encounters. Pizza deliveries meant for the ninth floor get delivered to the eighth. It all depends on whether his or her perception is history-based (book learnin’) or present-analytical (free-thinker, fly-by-the-seat-of-m’-pants rebel type).

    I’m done

  102. dr2chase
    Posted November 20, 2013 at 10:36 pm | Permalink

    I think the comments about “close to the machine” and mentioning the number of indexing modes are on point; that is surely the impression I got of BCPL, way back when.

    A couple of other notes: it is my distinct recollection that LET A = VEC N
    would allocate memory for cells at addresses A+0 through A+N; that is VEC N allocates N+1 cells.
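
    In C terms, roughly (a sketch; “word” stands in for whatever a BCPL cell was on that machine):

    #include <stdlib.h>
    typedef long word;
    word *A = malloc((n + 1) * sizeof *A);   /* LET A = VEC n: A!0 .. A!n all valid */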

    My memory of string addressing in BCPL is a little hazy, because cells were typically larger than bytes, but I have a distinct recollection that string lengths did not exceed 255, because byte zero contained the length.

    And though I recall a desire to be “close to the machine”, there was also INTCODE, for interpreting BCPL, and at one time students used a machine whose entire operating system (OS6) was executed as INTCODE (for a 15x performance penalty — _The Computer Journal_, 15:2, p.118). My copies of other Historical Documents are at work, not here, but you might be amused to know that one summer I worked at IBM San Jose Research Labs, and hacked up an interpreter for Backus’s FP84 — in BCPL, on VM/CMS. The BCPL port to OS/360 and VM/CMS was quite nice, and gave clean access to many features of the OS.

    And lo, some of the Historical Documents are now available as PDF (sadly, not indexed for search — yet) http://www.cs.ox.ac.uk/files/3231/PRG09.pdf http://www.cs.ox.ac.uk/files/4632/PRG-9%28c%29.pdf

  103. Claude
    Posted November 21, 2013 at 1:30 am | Permalink

    Thanks for digging this up! From the comments it is certainly polarizing, as in 0 against 1.

    I think of 0-based indexes to be pointing to the boundary, between the array elements, and 1-based indexes to be pointing to the center of the array elements.

  104. Ashley Yakeley
    Posted November 21, 2013 at 3:15 am | Permalink

    Clearly bytes should be one-based: they have 256 possible values, and they are most easily counted 1,2,3…256. I wonder who first came up with the idea of zero-based bytes?

    The Romans used one-based addition, for example, the day after tomorrow is three days away (those three days are today, tomorrow, the day after), while going up the diatonic musical scale an octave is, like its name suggests, an interval of eight notes. However we seem to have switched to a less intuitive zero-based system since…

  105. Martin Roller
    Posted November 21, 2013 at 4:50 am | Permalink

    Ah, I was waiting for the elevator argument. I remember in Siemens (who invented the electric elevator) office buildings they start numbering floors with 2 at the ground floor, even if there are no floors below. For Siemens employees this seemed very natural.

  106. foobar
    Posted November 21, 2013 at 5:24 am | Permalink

    As for IEEE and ACM … yes, it’s highly ironic how this field completely fails to leverage the technology it has created for advancing itself.

    As for your array index rant, though, it seems to me like you missed the point.

    First, let me point out a myth that you are perpetuating: “first” does not mean “the element with index 1”; you are mixing up ordinal and cardinal numbers there.

    “First” means “the element that has no predecessor”, which would be the element with index 20 for an array that is indexed starting at 20, the element with index 1 for an array that is indexed starting at 1, the element with index -42 for an array that is indexed starting at -42, and the element with index 0 for an array that is indexed starting at 0. Or to give an even more illustrative example: “A” is not the “Ath letter of the alphabet”, it’s the “first letter of the alphabet” – indices are simply names, possibly with some defined order, and there is no fixed relationship between those names and the ordinal numbers specifying their position in the respective order.

    Thus, there also is no such thing as “zeroth”, other than in some colloquial or highly confused use.

    Your confusion in that regard probably stems from the fact that the most common everyday use of ordinal numbers is for referring to elements of some collection of (material) objects, and it so happens that in a collection consisting of a single object, that object is the first object (its ordinal number), and the corresponding size of the collection is 1 (a cardinal number), if you add a second object, the size of the collection grows to 2, and so on. So, “1” and “first”, “2” and “second”, […] tend to coincide in this manner, but that really is not much more than a coincidence.

    Now, it also so happens that most programming languages do not distinguish explicitly between ordinal and cardinal numbers, but rather just use some subset type of the integers both for arithmetic and for naming elements of arrays. As such, they need to define some mapping between the two uses – which usually happens by simply designating some cardinal number to be the index of the first element of an array and using the same order for both. From that it may seem like any choice would really be as good as any other, it’s just an ordered set of names after all, isn’t it? Well, yes, and no.

    Many commenters refer to matrix algebra’s use of 1-based indices. If you think about it, that really is just an exercise in naming and ordering, you might as well use letters as indices, any sufficiently large ordered set would do. For this kind of use, it really doesn’t matter. It’s just one step up from a dictionary, in that the keys do have a defined order.

    But there is another use case for array indexing: Algorithms where you do arithmetic with indices – and algebraically, 0 and 1 are very different beasts indeed, and they are both very special, too, being the neutral elements of addition and multiplication and the absorbing element of multiplication. Whether indexing starting at 0 or starting at 1 is easier depends a bit on the data structure you are implementing on top of the array, though, depending on whether common relations between indices are multiplicative (trees) or additive (non-trees ;-).

    However, consider building an array on top of another array, which is a very common thing to do, even as part of tree implementations. This happens by specifying an offset of the abstract array relative to the underlying array, and you will notice that the underlying computation is an addition, so the neutral element of that abstraction is 0. That, I would assume, is what Mr. Richards really means. It just so happens that building a PL array on top of RAM is also essentially “building an array on top of an array”. It’s not that 0-based indexing is natural because the hardware works this way; rather, the hardware works this way because it is natural for this very common abstraction, common both in hardware and in software. And if it’s good for “hardware arrays” to work this way, why shouldn’t the same reason be a good reason for “software arrays” to work the same way? (Well, there might be reasons, of course, but my point is that the hardware is the way it is for a reason, not purely by accident.)

    And yes, this abstraction is essentially “pointer arithmetic”; however, you don’t need any kind of dynamic memory allocation in your language to do “pointer arithmetic” – any array index is a pointer, just relative to an array at the language level instead of to “the process address space” or “the physical RAM” (the former also being just an array built upon the latter, more or less). As soon as you have one array, you can build other arrays upon it by adding offsets, and you can do index (pointer) arithmetic for element access.
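
    Concretely (a sketch): a sub-array is just a biased base, and offsets compose by addition, with 0 as the neutral element.

    int big[100];
    int *sub  = big + 10;   /* sub[i]  is big[10 + i] */
    int *sub2 = sub + 5;    /* sub2[i] is big[15 + i]: offsets add */
    /* sub2[0] is sub2's first element: offset 0 changes nothing. */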

    That the folklore around this is commonly wildly misleading and not very well understood may still be true, though ;-)

  107. clord
    Posted November 21, 2013 at 12:04 pm | Permalink

    I work on compilers (C++ and Fortran) for my day job (yes, people are still (barely) paid to write them), and I have a bit of perspective on the issue. It’s not just a compilation-time issue. Let’s tackle the reasons first.

    Zero-based indexing has some non-historical reasons. In my view it is a consequence of:

    1) how we assigned numbers to bit patterns,
    2) the desire to fully utilize all of our machine,
    3) simplicity of implementation at all levels

    In binary, zero is the *first* pattern. There is the origin of the off-by-one error, right there, in my opinion. Everything else flows out of that choice. And it’s pretty much unavoidable, since zero needs some representation in the machine. There are two domains here: the integers, and the bit patterns representing them. Decimal digits have this problem too, but no one indexes with decimal digits the way we index with bit patterns: 0 is the first digit, 1 is the second, 2 is the third. When we index arrays, we’re indexing with bit patterns, not with integers.

    So why not just change the patterns to match what we expect? Registers hold bit patterns, not numbers. And the number of bit patterns a register can hold is a precious resource; using all of the bits in that register is important to people. If we 1-index (by doing a decrement on the index before dereference), then we throw out one whole bit pattern as meaningless during address computations (the zero). Because of the decrement, index 2^n is not representable in an n-bit register, which means not all of memory can be reached by pointer arithmetic. Some people say “use the zero pattern for that last element”.

    That’s where #3 comes into play. The CPU designer says to that: “Why should my CPU have special indexing hardware? Aren’t we claiming to be RISC? Just use the math operators.” And then the language designer says: “Well, the CPU provides these math operators, and I don’t want to impose a performance penalty on my users just for natural indexing, so I’ll pass the buck to the user.”

    If you want to design a language that fixes this, the Fortran way is correct IMO. Let the user pick their bounds in the domain of integers, and provide sensible defaults. The language can flatten out to the domain of bit patterns in the machine; this way the compiler can constant-fold the transformation down to the underlying zero-based indexing. This approach also generalizes to multi-dimensional matrices quite nicely (perhaps with annotations about how to map the indexing domain to the bit-pattern domain for cache-line reasons, an optimization our compiler does routinely for Fortran code but which is impossible in C/C++ code; another reason Fortran is still the fastest language we ship).

  108. bmm6o
    Posted November 21, 2013 at 1:02 pm | Permalink

    Fascinating article. Count me among the crowd that would like to see further clarification of this:

    “But here’s the thing: in that context none of the offset-calculations we’re supposedly economizing are calculated at execution time. All that work is done ahead of time by the compiler”

    Are you saying that the compiler converts x!y to something other than the machine equivalent of “load x+y”? Surely it can’t always pre-compute the sum. Unless your point is that inserting the “subtract 1” instruction would slow down compile time more than run time? I feel like I’m missing some point that you think is obvious.