If you have started to learn any of the mainstream web programming languages (JavaScript, Ruby, PHP, or Python), you have no doubt been introduced to something called an array.

An array is a collection of values, organized in a specific order. The values can be text, numbers, or booleans (true or false), and each value has a number (its subscript, also called its index) that corresponds to its position in the array.

Let’s look at this sample array of my favorite colors:

`['blue', 'yellow', 'red', 'purple', 'white']`

So if I were to ask you what numerical position in the array ‘blue’ has, what would you say?

“1 silly! It’s the first number!”

In life, you would be correct. In JavaScript, PHP, Python or Ruby, you would not. The right answer would be zero, because arrays are counted not from 1 but from zero.

In other words, another way to explain how the array works is this:

`['blue'0, 'yellow'1, 'red'2, 'purple'3, 'white'4]`

I have 5 favorite colors in my array, but their respective subscripts run from 0 through 4.
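This is easy to check in Python, one of the languages mentioned above. The sketch below is only an illustration; the variable names (`colors`, `first`, `pairs`, `last_index`) are made up for this example.

```python
# The favorite-colors array, written as a Python list.
colors = ['blue', 'yellow', 'red', 'purple', 'white']

# The first value lives at subscript 0, not 1.
first = colors[0]

# enumerate() pairs each value with its subscript, counting from 0.
pairs = list(enumerate(colors))

# The last valid subscript is always one less than the length.
last_index = len(colors) - 1

print(first)       # blue
print(pairs)       # [(0, 'blue'), (1, 'yellow'), (2, 'red'), (3, 'purple'), (4, 'white')]
print(last_index)  # 4
```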

Think about it like this: let’s say instead of asking you what number ‘blue’ is, I asked you how many positions you would need to move in order to get to ‘blue’?

Since ‘blue’ is the first value in the array, and since you are starting at the beginning, you have to move zero times to get to ‘blue’, right?

Ah ha!
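The "how many moves" idea can be written out as a small Python sketch. The helper `steps_to` is hypothetical, invented here only to count moves from the start of the array; Python's built-in `list.index` gives the same number.

```python
def steps_to(values, target):
    """Count how many moves from the start are needed to reach target."""
    steps = 0
    for value in values:
        if value == target:
            return steps
        steps += 1  # move one position to the right
    raise ValueError(f"{target!r} is not in the array")


colors = ['blue', 'yellow', 'red', 'purple', 'white']

print(steps_to(colors, 'blue'))    # 0
print(steps_to(colors, 'purple'))  # 3
print(steps_to(colors, 'purple') == colors.index('purple'))  # True
```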

**Why do programmers count from zero, when everyone else starts at 1?**

Computer programming is all about efficiency, and even small improvements in efficiency can make big differences at scale.

And yes, counting from zero is slightly more efficient than starting at 1.

Let’s explore a simple mathematical inequality to understand why:

If we count from zero, the subscript of every value in an array of length **N** satisfies the following inequality, where **i** represents the numerical position of the value:

`0 ≤ i < N`

Our color array from before has 5 total values. Let’s take each value’s subscript (its numerical position in the array) and slot it into this inequality:

`'blue'`: `0 ≤ 0 < 5 // true!`

`'yellow'`: `0 ≤ 1 < 5 // also true!`

`'white'`: `0 ≤ 4 < 5 // yep, yep, true too!`

Can we all agree that each of these statements is true?
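The same check can be run for every subscript at once. This is a minimal Python sketch using the color list from earlier; the names `colors`, `N`, and `checks` are illustrative.

```python
colors = ['blue', 'yellow', 'red', 'purple', 'white']
N = len(colors)

# Test 0 <= i < N for every subscript that range(N) produces.
checks = {colors[i]: 0 <= i < N for i in range(N)}
print(checks)
# {'blue': True, 'yellow': True, 'red': True, 'purple': True, 'white': True}

# range(N) yields exactly the integers that satisfy the condition.
print(list(range(N)))  # [0, 1, 2, 3, 4]

# N itself falls outside the range, so it is not a valid subscript.
print(0 <= N < N)      # False
```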

Now, if we were to count from 1, the subscript of every value in an array of length **N** would satisfy the following inequality, where **i** again represents the numerical position of the value:

`1 ≤ i < N + 1`

So for a moment, let’s consider this alternative array of colors I don’t like, indexed from 1:

`['beige'1, 'orange'2, 'green'3]`

The inequalities now look like this:

`'beige'`: `1 ≤ 1 < 4 // true!`

`'orange'`: `1 ≤ 2 < 4 // huh, also true!`

`'green'`: `1 ≤ 3 < 4 // too true!`

Those are also true, right? So what’s the problem?

The problem is found in the **N + 1** part of the inequality.

What that means is that, in order to process the inequality, the computer has to find the length of the array and then add 1. Adding 1 is not a hard task, but it is extra work that the computer doesn’t have to do when processing the zero-based version, and therefore starting the count at zero wins!
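To see the two sets of bounds side by side, here is a small Python sketch. It only shows where the extra `+ 1` (and the matching `- 1` on lookup) appears; it is not a benchmark and says nothing about measured speed. The variable names are illustrative.

```python
colors = ['beige', 'orange', 'green']
N = len(colors)

# Zero-based: the upper bound is the length itself.
zero_based = list(range(0, N))      # [0, 1, 2]

# One-based, written as 1 <= i < N + 1: the upper bound needs an addition.
one_based = list(range(1, N + 1))   # [1, 2, 3]

# Python lists are zero-based, so a one-based subscript has to be
# shifted down by 1 before it can be used to look up a value.
by_zero_based = [colors[i] for i in zero_based]
by_one_based = [colors[i - 1] for i in one_based]

print(by_zero_based == by_one_based)  # True
```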

*This article is heavily indebted to and inspired by Edsger W. Dijkstra, Avi Flombaum, and Emily Davis.*


You could use <= in place of < to begin from 1.

Then you would test two conditions instead of one.

If you write 0 ≤ i < N for subscripts starting from 0, then for the ones starting from 1 an equivalent equation would be 0 < i ≤ N and not 1 ≤ i < N + 1.
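For integer subscripts, the forms discussed in these comments all pick out the same values, which a short Python sketch can confirm. The length `N = 5` and the candidate range are arbitrary choices for illustration.

```python
N = 5  # arbitrary length chosen for illustration
candidates = range(-2, N + 3)  # a few integers on either side of the valid range

strict_lower = [i for i in candidates if 0 < i <= N]     # 0 < i <= N
plus_one = [i for i in candidates if 1 <= i < N + 1]     # 1 <= i < N + 1
inclusive = [i for i in candidates if 1 <= i <= N]       # 1 <= i <= N

print(strict_lower)                           # [1, 2, 3, 4, 5]
print(strict_lower == plus_one == inclusive)  # True
```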

Just as you put a less-than-or-equal-to sign on the left side, why couldn’t you put one on the right too? Then you wouldn’t need N+1… invalid point.

This makes no sense….

Why would length be represented as N+1…

Let’s say I have an array with 1 value; then its length should be one, not two ((1)+1).

I guess you didn’t explain why N+1…

You just said since we count from zero therefore we count from zero.

I guess this is nitpicking, but this code snippet:

[‘blue’0, ‘yellow’1, ‘red’2, ‘purple’3, ‘white’4]

Perhaps that would be better represented as:

[0‘blue’, 1‘yellow’, 2‘red’, 3‘purple’, 4‘white’]

For the same reason you go on to explain next: the array index isn’t really referring to the nth element of the array, but to the position it’s at.

I’m not sure the explanation of this being for efficiency concerns is necessarily a valid one, as presented.

In lower level languages and systems programming languages such as C and C++ the convention of starting indexes at 0 arose precisely for the reason presented — thinking in terms of how many positions a given element was offset from the starting position.

This can be seen in the fact that in C the following expressions are equivalent:

array[i];

*(array + i); /* reads the element at array_memory_location + (array_element_size * i) */

When this kind of code is executed extremely often, there certainly is a potential performance benefit to not having to compute i - 1 first, and it keeps array subscript notation consistent with other ways of writing the same code.
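The offset arithmetic can be imitated in Python with the standard `struct` module, which packs values back to back in a byte string much like a C array. This is only a sketch of the idea: the helpers `read_zero_based` and `read_one_based` are hypothetical names, and the element size depends on the platform's C `int`.

```python
import struct

values = [10, 20, 30, 40, 50]

# Size in bytes of one C int on this platform (commonly 4).
element_size = struct.calcsize('i')

# Pack the values contiguously, one after another.
memory = struct.pack(f'{len(values)}i', *values)


def read_zero_based(i):
    # Offset is element_size * i; no adjustment is needed.
    return struct.unpack_from('i', memory, element_size * i)[0]


def read_one_based(i):
    # A one-based subscript needs the extra "i - 1" before scaling.
    return struct.unpack_from('i', memory, element_size * (i - 1))[0]


print(read_zero_based(0))  # 10
print(read_one_based(1))   # 10
print(read_zero_based(4))  # 50
```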

However, in the languages listed in this article (PHP, Python, Ruby, JavaScript, and many other higher-level languages) those expressions are not equivalent, and there is likely no means of accessing an element in an array by calculating its offset from the memory location of the first element.

Indeed, in both PHP and JavaScript, arrays behave as associative arrays (PHP’s documentation describes its array type as an ordered map, and JavaScript arrays are objects with integer-like keys), which makes them conceptually closer to Python’s dictionaries than to C’s arrays. While you may access elements in them using a numeric key, their layout in memory need not have any relation to the order of the numbers.
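As a loose analogy only, a Python dictionary with numeric keys shows how lookup by number can work without the numbers dictating storage order. This sketch says nothing about how PHP or JavaScript engines lay out arrays internally; the names `by_key` and `by_position` are illustrative.

```python
# Keys are inserted out of numeric order on purpose.
by_key = {2: 'red', 0: 'blue', 1: 'yellow'}

print(by_key[0])     # blue  (lookup by numeric key still works)
print(list(by_key))  # [2, 0, 1]  (order follows insertion, not the numbers)

# A list, by contrast, is addressed purely by position.
by_position = ['blue', 'yellow', 'red']
print(by_position[0])  # blue
```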

In all of the listed languages, however, the implementation of array element access is vastly more complex than in lower-level languages, to the point where a single subtraction operation per element accessed would not have any impact on the performance of that code.

Instead, these languages retain the 0-based indexes entirely as a tradition/convention that a great many established programmers are familiar with and expect. A few languages, such as Lua and MATLAB, have chosen to use 1-based indexes, but overwhelmingly language designers stick with the established convention and prevent cognitive disruption when people switch between languages.

So, in the end it might just be easier to simplify this article and just say “Programming languages use 0 to indicate the first thing in an array/list because it was a convention established decades ago by older languages where that distinction had real importance.”

Thanks for the thoughtful response :) Yes, I totally agree. Also, there is the whole hardware piece of this. I just wanted to explain the original reasoning, but I think the distinction you are making is on point!