10

Possible Duplicate:
Arrays, What’s the point?

I tried to ask this question before in What is the difference between an array and a list? but my question was closed before reaching a conclusive answer (more about that).

I'm trying to understand what is really meant by the word "array" in computer science. I am trying to reach an answer, not have a discussion, as per the spirit of this website. What I'm asking is language agnostic, but you may draw on your knowledge of what arrays are/do in various languages that you've used.

Ways of thinking about this question:

  • Imagine you're designing a new programming language and you decide to implement arrays in it; what does that mean they do? What will the properties and capabilities of those things be? If it depends on the type of language, how so?
  • What makes an array an array?
  • When is an array not an array? When it is, for example, a list, vector, table, map, or collection?

It's possible there isn't one precise definition of what an array is. If that is the case, are there any standard or near-standard assumptions about what an array is? Are there any common areas at least? Maybe there are several definitions; if so, I'm looking for the most precision in each of them.

Language examples:

(Correct me if I'm wrong on any of these).

  • C arrays are contiguous blocks of memory of a single type that can be traversed using pointer arithmetic or accessed at a specific offset point. They have a fixed size.
  • Arrays in JavaScript, Ruby, and PHP have a variable size and can store an object/scalar of any type. They can also grow or have elements removed from them.
  • PHP arrays come in two types: numeric and associative. Associative arrays have elements that are stored and retrieved with string keys. Numeric arrays have elements that are stored and retrieved with integers. Interestingly if you have: $eg = array('a', 'b', 'c') and you unset($eg[1]) you still retrieve 'c' with $eg[2], only now $eg[1] is undefined. (You can call array_values() to re-index the array). You can also mix string and integer keys.
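
The PHP behaviour described in the last bullet can be imitated with a rough sketch in Python. This is only an analogy: it models a PHP array as an insertion-ordered map from keys to values, and the variable names (`eg`, `after_unset`, `reindexed`, `as_list`) are made up for illustration. It says nothing about how PHP stores arrays internally.

```python
# Model a PHP-style array as an insertion-ordered map from keys to values.
# This is an analogy only, not a description of PHP's internals.
eg = {0: "a", 1: "b", 2: "c"}    # roughly: $eg = array('a', 'b', 'c')

del eg[1]                         # roughly: unset($eg[1])
after_unset = dict(eg)            # keys 0 and 2 remain; 'c' is still at key 2

# Roughly array_values(): keep the values in order and renumber from 0.
reindexed = dict(enumerate(eg.values()))

# Contrast: a Python list closes the gap as soon as an element is removed.
as_list = ["a", "b", "c"]
del as_list[1]

print(after_unset)   # {0: 'a', 2: 'c'}
print(reindexed)     # {0: 'a', 1: 'c'}
print(as_list)       # ['a', 'c']
```

The contrast between the map and the list is the point: in the map, removing an element leaves a hole in the key space, while in the list the positions of later elements shift down.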

At this stage I sort of suspect that C arrays are the only true arrays here and that, strictly speaking, for an array to be an array it has to have all the characteristics I mention in that first bullet point. If that's the case then (again, these are suspicions that I'm looking to have confirmed or rejected) arrays in JS and Ruby are actually vectors, and PHP arrays are probably tables of some kind.

Final note: I've made this community wiki so if answers need to be edited a few times in lieu of comments, go ahead and do that. Consensus is in order here.

Community
Ollie Saunders
  • PHP arrays are actually a combination of lists and arrays in other languages. So it's best not to mention them here; it might cause confusion. – mauris Oct 15 '09 at 04:10
  • Oh god, that makes me laugh. Mostly because, yes, it is confusing! I think I might have to ask separate questions for lists and all the other words as well. – Ollie Saunders Oct 15 '09 at 04:15
  • 6
    Human language is almost never a formal language. You are barking up the wrong tree when you try to find the 'actual' or 'real' definition. There is none. All we have is tradition and consensus, and the consensus frequently breaks down on the margins. In so far as C, Javascript, Ruby, and PHP are formal languages, they can define what an array means in the context of the language, but there is no obligation for those definitions to converge on a common model, and they are each 'true' and 'real' within the context of the language. – Charles E. Grant Oct 15 '09 at 04:32
  • Great, Charles. So I can create a new programming language that refers to whole numbers as "strings" instead of "integers", and no one can tell me my definition is wrong! – Nicholas Knight Oct 15 '09 at 04:54
  • 1
    @Nicholas Knight - as long as you can get enough people to agree with you... consensus is the key here. – Reuben Oct 15 '09 at 05:12
  • @Nicholas - You can't dispute this. Perl, PHP, Ruby, and many other languages use the term "Array" for their ordered collection types, but classic "array" (in C) is quite different (pop/push/shift functions are impossible for arrays on the stack, and exceedingly difficult and inefficient for heap-allocated arrays). And PHP "arrays" are really hashtables that just use numbers. – Chris Lutz Oct 15 '09 at 05:13
  • 1
    @Nicholas: correct. A definition is only "right" or "wrong" in the context of a particular framework of concepts. If you choose to define your terms in a bizarre way, that is "ok" so long as you are logically consistent about it. Just don't expect other people to understand you ... or even bother trying :-) – Stephen C Oct 15 '09 at 05:17
  • @Nicholas: there is quite a strong and widespread consensus about what strings are, and what whole numbers are. You are indeed free to design a language that flouts those conventions, but it is unlikely to be popular. Furthermore, even though there is a general consensus about what a string is, and how it should behave, it still gets murky around the edges. Compare the output of 'print "2" + "3"' in Perl and in Python. – Charles E. Grant Oct 15 '09 at 07:15
  • 1
    What exactly is the point of closing these things? Do you just like to ruin people's days? That "duplicate" is a completely different question! – Ollie Saunders Oct 15 '09 at 15:48
  • An array is simply a collection of data where you can easily access each element by referring to an index. For example, 5 people standing in a row would be equivalent to an array since you can refer to each individual in the line as #1, #2 etc. An array is pretty similar to that in computer terms. – Gajendra K Chauhan Jun 24 '13 at 07:32

8 Answers

5

array |əˈrā|

noun

1 an impressive display or range of a particular type of thing : there is a vast array of literature on the topic | a bewildering array of choices.

2 an ordered arrangement, in particular

  • an arrangement of troops.
  • Mathematics: an arrangement of quantities or symbols in rows and columns; a matrix.
  • Computing: an ordered set of related elements.
  • Law: a list of jurors empaneled.

3 poetic/literary elaborate or beautiful clothing : he was clothed in fine array.

verb

  1. [ trans. ] (usu. be arrayed) display or arrange (things) in a particular way : arrayed across the table was a buffet | the forces arrayed against him.
  2. [ trans. ] (usu. be arrayed in) dress someone in (the clothes specified) : they were arrayed in Hungarian national dress.
  3. [ trans. ] Law empanel (a jury).

ORIGIN Middle English (in the senses [preparedness] and [place in readiness] ): from Old French arei (noun), areer (verb), based on Latin ad- ‘toward’ + a Germanic base meaning ‘prepare.’

Ollie Saunders
Palo Verde
5

It is, or should be, all about abstraction

There is actually a good question hidden in there, a really good one, and it brings up a language pet peeve I have had for a long time.

And it's getting worse, not better.

OK: there is something that lowly and widely disrespected Fortran got right and that my favorite languages like Ruby still get wrong: they use different syntax for function calls, arrays, and attributes. Exactly how abstract is that? In Fortran, function(1) has the same syntax as array(1), so you can change one to the other without altering the program. (I know, not for assignments, and in the case of Fortran it was probably an accident of goofy punch card character sets and not anything deliberate.)

The point is, I'm really not sure that x.y, x[y], and x(y) should have different syntax. What is the benefit of attaching a particular abstraction to a specific syntax? To make more jobs for IDE programmers working on refactoring transformations?

Having said all that, it's easy to define array. In its first normal form, it's a contiguous sequence of elements in memory accessed via a numeric offset and using a language-specific syntax. In higher normal forms it is an attribute of an object that responds to a typically-numeric message.
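
To make the unified-syntax idea concrete, here is a minimal Python sketch. The `Uniform` class is hypothetical and exists only for this illustration; it shows that once indexing and calling route to the same method, the caller can no longer tell stored data from computed data.

```python
from typing import Callable, Sequence, Union


class Uniform:
    """Hypothetical wrapper for which x[i] and x(i) mean the same thing."""

    def __init__(self, source: Union[Sequence, Callable[[int], object]]):
        self._source = source

    def __call__(self, i):
        # Computed data is called; stored data is indexed.
        if callable(self._source):
            return self._source(i)
        return self._source[i]

    # Indexing and calling share one implementation.
    __getitem__ = __call__


stored = Uniform([0, 1, 4, 9])         # backed by a list
computed = Uniform(lambda i: i * i)    # backed by a function

print(stored[2], stored(2))       # 4 4
print(computed[3], computed(3))   # 9 9
```

Swapping the list for the function changes the implementation but not a single call site, which is the decoupling argued for above.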

DigitalRoss
  • I think your point is being lost on me, here. Which is a shame because it sounds interesting. What do you think is the good question that is hidden? Also I don't completely see how what you're saying about syntax relates (I'm sure it does! But I'm not seeing it right now). Your definition of array has been noted, thanks for answering. – Ollie Saunders Oct 15 '09 at 04:19
  • 1
    Well, I'm arguing for a unified syntax for sending messages to objects, in order to decouple use from implementation, and to make the objects polymorphic in a different sense. This relates to the definition of array: it started out as a simple mapping of vector and matrix abstractions to lvalues, as C calls them, but now the `[]` operator can sometimes be redefined (perhaps to implement a sparse matrix, or whatever), and so now I just don't see a difference between calling a function, accessing an attribute, or indexing an array. They are all about sending messages to objects. – DigitalRoss Oct 15 '09 at 04:36
  • 1
    +1 because you've got me rethinking myself with my "ideal language" design. – Chris Lutz Oct 15 '09 at 05:39
  • I want to point out that in Clojure, Vectors are actually functions of their indices and Maps are functions of their keys. – Jörg W Mittag Oct 15 '09 at 11:37
3

If you ignore how programming languages model arrays and lists, and ignore the implementation details (and consequent performance characteristics) of the abstractions, then the concepts of array and list are indistinguishable.

If you introduce implementation details (still independent of programming language) you can compare data structures like linked lists, array lists, regular arrays, sparse arrays and so on. But then you are no longer comparing arrays and lists per se.

The way I see it, you can only talk about a distinction between arrays and lists in the context of a programming language. And of course you are then talking about arrays and lists as supported by that language. You cannot generalize to any other language.

In short, I think this question is based on a false premise, and has no useful answer.

EDIT: in response to Ollie's comments:

I'm not saying that it is not useful to use the words "array" and "list". What I'm saying is the words do not and cannot have precise and distinct definitions ... except in the context of a specific programming language. While you would like the two words to have distinct meaning, it is a fact that they don't. Just take a look at the way the words are actually used. Furthermore, trying to impose a new set of definitions on the world is doomed to fail.

My point about implementation is that when we compare and contrast the different implementations of arrays and lists, we are doing just that. I'm not saying that it is not a useful thing to do. What I am saying is that when we compare and contrast the various implementations we should not get all hung up about whether we call them arrays or lists or whatever. Rather we should use terms that we can agree on ... or not use terms at all.

To me, "array" means "ordered collection of things that is probably efficiently indexable" and "list" means "ordered collection of things that may be efficiently indexable". But there are examples of both arrays and lists that go against the trend; e.g. PHP arrays on the one hand, and Java ArrayLists on the other hand. So if I want to be precise ... in a language-agnostic context, I have to talk about "C-like arrays" or "linked lists" or some other terminology that makes it clear what data structure I really mean. The terms "array" and "list" are of no use if I want to be clear.

Stephen C
  • "There is no answer" is a legitimate and useful answer to me. But I don't agree with a lot of what you say before that, so I'm down-voting you, sorry. – Ollie Saunders Oct 15 '09 at 04:52
  • @Ollie: specifically what do you disagree with ... and justify. – Stephen C Oct 15 '09 at 05:04
  • These words "array" and "list" exist, or should exist, to denote the characteristics and limitations of different data structures. You're correct to say that arrays and lists are the same when you remove the implementation details, but I think that's missing the point. It's the implementation details that I care about. After all, the motivation for having different data structures is almost solely performance; otherwise we'd just use hash tables for everything. – Ollie Saunders Oct 15 '09 at 05:24
  • @Ollie - The implementation details are _not_ language-agnostic. You're hung up on semantics. – Chris Lutz Oct 15 '09 at 05:41
  • I beg to differ. The implementation details kind of are language-agnostic. If something takes constant time in one structure and linear time in another it's not going to matter what the syntax looks like. That's a simplification but not oversimplification. – Ollie Saunders Oct 15 '09 at 05:45
3

From FOLDOC:

array

1. <programming> A collection of identically typed data items distinguished by their indices (or "subscripts"). The number of dimensions an array can have depends on the language but is usually unlimited.

An array is a kind of aggregate data type. A single ordinary variable (a "scalar") could be considered as a zero-dimensional array. A one-dimensional array is also known as a "vector".

A reference to an array element is written something like A[i,j,k] where A is the array name and i, j and k are the indices. The C language is peculiar in that each index is written in separate brackets, e.g. A[i][j][k]. This expresses the fact that, in C, an N-dimensional array is actually a vector, each of whose elements is an N-1 dimensional array.

Elements of an array are usually stored contiguously. Languages differ as to whether the leftmost or rightmost index varies most rapidly, i.e. whether each row is stored contiguously or each column (for a 2D array).
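
As a small worked illustration of the two storage orders (not part of the FOLDOC entry), the sketch below flattens a 2×3 matrix both ways and computes where element (i, j) lands. The function names are invented for this example. C uses the row-major layout and Fortran the column-major one.

```python
def row_major_offset(i, j, n_rows, n_cols):
    # Rows are contiguous: skip i whole rows, then move j along the row.
    return i * n_cols + j


def column_major_offset(i, j, n_rows, n_cols):
    # Columns are contiguous: skip j whole columns, then move i down.
    return j * n_rows + i


matrix = [[1, 2, 3],
          [4, 5, 6]]
n_rows, n_cols = 2, 3

# Flatten the same matrix in each storage order.
row_major = [matrix[i][j] for i in range(n_rows) for j in range(n_cols)]
column_major = [matrix[i][j] for j in range(n_cols) for i in range(n_rows)]

print(row_major)      # [1, 2, 3, 4, 5, 6]
print(column_major)   # [1, 4, 2, 5, 3, 6]

# Element (0, 1) is 2 under either layout, but it lives at a different offset.
print(row_major_offset(0, 1, n_rows, n_cols))      # 1
print(column_major_offset(0, 1, n_rows, n_cols))   # 2
```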

Arrays are appropriate for storing data which must be accessed in an unpredictable order, in contrast to lists which are best when accessed sequentially. Array indices are integers, usually natural numbers, whereas the elements of an associative array are identified by strings.

2. <architecture> A processor array, not to be confused with an array processor.

Also note that in some languages, when they say "array" they actually mean "associative array":

associative array

<programming> (Or "hash", "map", "dictionary") An array where the indices are not just integers but may be arbitrary strings.

awk and its descendants (e.g. Perl) have associative arrays which are implemented using hash coding for faster look-up.

Laurence Gonsalves
  • I can show you counter-examples to both examples of arrays and lists used in the cited definition. For example, sparse arrays are not stored contiguously, and Java ArrayLists (or equivalent) are designed for random access. Besides these are properties of array/list IMPLEMENTATIONS, not the underlying concepts. – Stephen C Oct 15 '09 at 05:08
  • 1
    Sparse arrays aren't very common though. – Ollie Saunders Oct 15 '09 at 05:31
  • Stephen: It does say "elements of an array are *usually* stored contiguously". Java's List interface says "[positional] operations may execute in time proportional to the index value for some implementations ... [thus] iterating over the elements in a list is typically preferable to indexing through it" which is pretty much what the above says. The fact that a particular implementation of List (which has "array" in its name!) has O(1) random-access doesn't invalidate the claim that, in general, "list" means something that's meant to be accessed sequentially. – Laurence Gonsalves Oct 16 '09 at 00:01
3

An array is an ordered collection of data items indexed by integer. It is not possible to be certain of anything more. Vote for this answer if you believe this is the only reasonable outcome of this question.

Ollie Saunders
  • 2
    This is technically correct, but it doesn't get my vote because most people would have a different mental model. Unfortunately, there are many (conflicting) mental models. No definition would get my vote because I don't think there can be a consensus. At the root of it, definitions are about consensus/common understanding, not popularity. – Stephen C Oct 15 '09 at 06:17
  • 2
    We can be certain that an array is an ordered collection, particularly in the mathematical sense that it's indexed by integers. – outis Oct 15 '09 at 06:34
  • hmm, but in PHP, an array can have a string index. So I guess the consensus stops before this attribute. – Claudiu Creanga Jan 10 '19 at 11:36
2

An array:

  1. is a finite collection of elements
  2. has ordered elements, and this order is their only structure
  3. has elements of the same type
  4. supports efficient random access
  5. has no expectation of efficient insertions
  6. may or may not support append

(1) differentiates arrays from things like iterators or generators. (2) differentiates arrays from sets. (3) differentiates arrays from things like tuples where you get an int and a string. (4) differentiates arrays from other types of lists. Maybe it's not always true, but a programmer's expectation is that random access is constant time. (5) and (6) are just there to deny additional requirements.
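
As one concrete instance of these properties, Python's standard `array` module provides a sequence that is finite, ordered, single-typed, and indexable, and that happens to support append. The sketch below is just an illustration of points (1) to (4) and (6); the variable names are arbitrary.

```python
from array import array

# Type code "i" means every element is a signed C int (property 3).
ints = array("i", [2, 3, 5, 7])

third = ints[2]     # random access by integer index (property 4)
ints.append(11)     # this particular array type supports append (property 6)

# An element of another type is refused.
try:
    ints.append("eleven")
    rejected = False
except TypeError:
    rejected = True

# Contrast: a tuple may mix an int and a string.
pair = (7, "seven")

print(third)        # 5
print(list(ints))   # [2, 3, 5, 7, 11]
print(rejected)     # True
```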

Kevin Peterson
1

I would argue that a real array stores values in contiguous memory. Anything else is only called an array because it can be used like one, but it isn't really one ("arrays" in PHP, even the non-associative ones, are definitely not actual arrays). Vectors and the like are extensions of arrays that add functionality.

mpen
0

An array is a container whose objects have no relationship to one another except their order. The objects are stored in a contiguous space, at least abstractly (that is, at the high level; the low-level storage may of course be contiguous too), so you can access them by slot[x,y,z...]. For example, given array[2,3,5,7,1], you get 5 using slot[2] (slot[3] in some languages).

A list is a container too, but each object (or, more exactly, each object holder, such as a slot or node) has indicators that "point" to other objects, and this is the main relationship. In general the space is not contiguous at either the high or the low level, although it may be, so accessing by slot[x,y,z...] is not recommended. For example, given |-2-3-5-7-1-|, you need to travel from the first object to the third one to get 5.
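
The difference can be sketched in Python using the same values. The `Node`, `build`, and `nth` names are hypothetical helpers written for this illustration; `nth` also reports how many links it had to follow.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: int
    next: Optional["Node"] = None   # the "indicator" pointing to the next node


def build(values):
    """Build a singly linked list and return its first node."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def nth(head, index):
    """Walk from the first node; return (value, number of links followed)."""
    node = head
    hops = 0
    while node is not None and hops < index:
        node = node.next
        hops += 1
    if node is None:
        raise IndexError(index)
    return node.value, hops


slots = [2, 3, 5, 7, 1]             # array-like: go straight to the slot
linked = build([2, 3, 5, 7, 1])     # list-like: follow the links

print(slots[2])         # 5
print(nth(linked, 2))   # (5, 2)
```

Both containers yield 5 for the third element, but the linked version has to pass through the first two nodes to reach it.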

Test