14.2. Choosing between Lists, Maps, and Arrays

The different kinds of collections objects are useful for different purposes. The List class can be used to create easy-to-use containers that make it relatively simple to manage iterative chores. Use a Swarm List when you intend to have all objects processed in order, for example. (Lists can also be processed in a randomized order). The Map and Array classes are intended for more structured maintenance of collections.

Because they serve these specialized objectives, there are some commands uniquely available to each of the Swarm container classes. For example, as we saw in an earlier chapter, the List class can respond to methods like addFirst:, addLast:, removeFirst, and removeLast. A List object can be used in a flexible way, objects can be thrown onto the end or the beginning of the list with these methods. These are not available in Map or Array, because Map and Array objects have more intricate internal structure.

A Swarm Array object is used when it is necessary to store objects in a specific order. The Swarm Array is somewhat similar to a C array, in the sense that objects can be inserted at a particular position and their values can be retrieved from that position. A Swarm Array can be processed iteratively, as a List can.

A Swarm Map is used when objects are not stored according to their numerical position in a list, but rather according to the value of some object. For example, a Map can store objects that have rankings of favorite foods for each of several people. If each person is an object, then the person's identity works as a "key" that controls the insertion and removal of the object from the Map.

Enhancement and streamlining of the Swarm Collections library is an ongoing chore, but at the current time the user's choice of List, Map, or Array is partly driven by the way these classes are implented in Swarm. The List and Map classes are comparatively slow. If one needs to make repeated accesses to a List or Map from randomly selected positions, the program will run comparatively slowly. Here's why:

  1. Suppose you have a Map and you have entered objects that represent food tastes for each of 500 people.

  2. If you then tell your Map object to retrieve the food preference of the person "Bart", for example, then the Map will be processed from the beginning (the first inserted object) and each will be checked to see if its key (its "owner," as it were) is "Bart."

  3. If Bart's object happens to be at the end of the Map, then a lot of objects will be checked.

  4. Then, when you ask for the object of person "Fred", it begins at the start of the Map and checks, one-by-one, until it finds the object whose key is Fred.

At the time of writing, the Map object, has no way to go straight to the one you want[1], so it goes through this repetitive checking process. The same is true of the List class. As a result, if there are many objects, programs will run slowly when they try to insert and retrieve data for specific objects when using Swarm Lists or Maps.

In contrast, the processing of a Swarm Array can be quite fast because the elements are entered with integer keys. A Swarm Array can quickly retrieve item number ten. Unlike a Map, it does not start at the beginning and go through a sequence of checks until it comes to the tenth item. Because of the internal structuring of the Swarm Array, the tenth item is retrieved without checking the first nine.

As a result of the aforementioned issue about the speed of the program, there is going to be a judgment call. A Map will work fine and quickly if there are just a few items stored, but the time wasted looking for a specific item increases with the length of the list. An Array might be a good choice, except allocating space for an array may waste memory. For example, suppose we are preparing to survey 20 people out of a population of 100,000. If each person is assigned a number, and then numbers are chosen at random, we might end up with people in our sample that are numbered {44, 63, 555, 4432, 6689, 21001, 44934, 78343, 99921}. If we use a Map, we could just add the 10 objects. On the other hand, if we wanted to use an Array with the person's number serving as the Array index, we would have to allocate an Array with 100,000 elements in order to store these ten items. This wastes memory, but objects can indeed be retrieved quickly. Most people would prefer a Map for this purpose. If there were 10,000 people being sampled, however, the Array might work best.

Notes

[1]

a potential enhancement to the collections library is an option that would allow the user to select a hash-table implementation of the Map protocol, which would effectively allow this kind of random-access