Previous ... Next

Collection: list, map, set, optional and their operations

(1/5)lists manipulation

As we have seen before, lists can be defined using «», as in the example below.

As you can see above, many of the most common «» classes have a nested class called «» as a convenience feature, to avoid having to define your own in most programs.
Lists can be created with square brackets; they are born mutable but can become immutable if they are directly assigned to an immutable local binding or parameter, or by other forms of promotion; for example, a method without «» parameters returning a «» reference can be used to initialize an immutable binding. You need to specify the type of the local binding to force the promotion.
For example:
Immutable lists can be combined with operators. The general idea is that operators «» work on one sequences and one element, while the corresponding doubled-up operators «» work on two sequences. You can see the details of this below.
In addition of operators, immutable lists also support a plethora of methods:
As you notice, there are different kind of actions: replace an element («»), insert an element («») and skipping/filtering elements out («»). Then, elements can be specified by index, by being the leftmost or the rightmost. To filter elements out, you can also just provide the element.

Immutable collections (and also mutable ones, as we will see later) can be accessed with the following methods:

Mutate sequences

Mutable sequences can be more efficient that immutable ones, and are more general, since they can store mutable objects.
Now we show some methods over a mutable list «»; consider each following line independently:

(2/5) Iterations on lists and views

We now show various pattens to iterate on lists. First some usual foreach:

In 42 foreach allows to iterate on multiple collections at once, and also to update the collections:
In the example above, a dynamic error would be raised if «», «» and «» have different length. We believe this is the right default behaviour. To allow, for example, «» to be longer then «» and «»,the programmer can use some variants of «»; a method producing an iterator on a subsequence of the original sequence. The following variants are available: «», «», «», «», «» and «». In the example below we use them to control iteration:
The class «» provides flexible iterators and sub-sequences. Consider the following code examples:
ys.size()
  for x in NView(xs), y in NView(ys) ( .. )
    //will iterate for as long as both xs and ys have elements.
    //similar to a functional zip
  }
NViewM = Collection.View(Num.List).more()
MainMore = {
  xs= Num.List[..]
  ys= Num.List[..]
  for x in xs, y in NViewM(ys, more=30Num) ( .. )
    //will iterate as long as xs, even if ys is shorter.
    //y = 30Num when the iteration get over ys.size()
  for x in NViewM(xs, more=10Num), y in NViewM(ys, more=30Num) ( .. )
    //will iterate for as long as either of xs and ys have elements, and values
    //x = 10Num, y = 30Num are used when the collections exhausted their elements.
  for x in NView(xs), y in NViewM(ys, more=30Num) ( .. )
    //behaves as in the "x in xs" case: a 'cut' view will consume 'more' elements
    //if they are available
  }
]]>

The power of the \

There are various methods taking advantage of the «» syntactic sugar. They provide an expressive power similar to what list-comprehensions provide in python and streams in Java, but by just using simple control flow like for/if:

2Num \add(a) )
X[ bs1==Num.List[3\;4\] ]

//flatmapping
bs2 = Num.List()( for a in as for b in bs0 \add(a+b) )
X[ bs0==Num.List[11\;21\;31\;41\;12\;22\;32\;42\;13\;23\;33\;43\;14\;24\;34\;44\;] ]

//reduce to string
str0 = S"the content is: ".builder()( for a in as \add(a) )
X[ str0 == S"the content is: 1234" ]
str1 = S"[%as.left()".builder()( for n in as.vals(1I) \add(S", %n") )++S"]"
X[ str1 == S"[1, 2, 3, 4]" ]

//reduce/fold
acc  = ( var x = 0Num, for a in as ( x+=a ), x )
acc1  = 0Num.acc()( for a in as \add(a) ) //also \addIf, \times, \val ..
X[ acc==acc1; acc1 == 10Num ]

//checks a property; great in an if or an X[]
ok0 = Match.Some()( for a in as \add(a>3Num) )
X[ ok0 ]

X[ !Match.All()( for a in as \add(a>3Num) ) ]

X[ !Match.None()( for a in as \add(a>3Num) ) ]

X[ 0I.acc()( for a in as \addIf(a>3Num) )==2I ]

asContainsAllBs = Match.All()( for b in bs \add(b in as) )

asIntersectBs = Match.Some()( for b in bs \add(b in as) )

asDisjointBs = Match.None()( for b in bs \add(b in as) )
//Note: b in as == as.contains(b)
]]>
The language 42 is expression based. Expressions that look like statements are just expressions with the «» return type. Those methods that take advantage of the «» are simply methods with a single parameter «». The «» method in the «» examples short circuit when appropriate, so that the for can terminate as soon as the result is known.

(3/5) Lists with mutable and immutable elements

Mutable sequences can contain mutable objects. While this can be useful in special circumstances, it can create aliasing issues similar to the ones of the animals example of before. To warn against such issues, methods «», «» and «» return «» references to mutable objects. In order to obtain a «» reference, the user needs to use the methods «», «» and «».

Up to now we focused on lists of «», but all instances of «» are immutable; we now discuss what happens where mutable lists contains a mixture of mutable and immutable elements. Consider the following code:

As you can see, to insert a mutable point we need to use «» and to take the point out we have to add the «» to the method. When iterating on a list, if we expect a mixture of «» and «» values we must add «» to avoid a runtime error. If we expect all values to be «», we can write «» instead. When a «» collection is promoted to «», it still remembers what values were originally inserted as «». To make it so that all values can be read as «», we can use the method «». In addition of normalizing the collection, it also marks all values accessible as «», as shown in the code below:

(4/5) Map, set, opt..

«» also support maps, sets, optional and enumerations. We will add more kinds of collections in the future.

Optional

In 42 there is no concept of null, and all the values are always intentionally initialized before they can be read. There are two main reasons programmers rely on nulls: optional values and circular/delayed initialization. Circular initialization can be solved with a «» types, an advanced typing feature that we do not discuss here. Optional values are a staple of functional programming and are logically equivalent to a collection of zero or one element, or, if you prefer, a box that may or may not contain an element of a certain type. Optionals values can be obtained with «» as shown below. Optionals are also optimized so that they do not require the creation of any new objects at run time.


  //we can just check if a box is not empty as if it was a boolean
  p01Box.#val().x(50\)//updates the x value of the point
  Debug(S"printing %p01")//printing Point(x=50, y=1)
  p00Box:= OPoint()//updates the local variables with empty boxes
  p01Box:= OPoint()
  if !p00Box ( Debug(S"printing %p00Box") )//printing <>
  X[
    !(p00 in p00Box);
    !p00Box;//using isPresent() or not is just a matter of style
    !p00Box.isPresent();
    ]  
  )
]]>
At this point in the tutorial, some readers will be confused that we can update the local variable binding «» even if it is immutable. Other readers instead will remember that immutability is a property of the reference and not of the binding/field: a local binding and fields declared «» can be updated. The updated value needs to respect the modifier of the field/binding type: if it is «» it needs to be updated with another «»; if it is «» then it can be updated with either «», «» or «». Oh, yes, another reader will realize ... and a «» reference can be assigned to any of «», «», «» or «». Note how both local bindings are updated using the same exact expression: «» In 42 «» can be either «» or «» (or «» indeed) On the other side, consider «» and «»: the first one is immutable since «» is «», while the second one is mutable since «» is «».

Optionals can be combined with the short circuting operators «» and «». Many interesting patterns emerge from this:

Map

Thanks to normalization 42 can have very fast and most reliable hash sets and hash maps. The values of sets and the keys of maps must be immutable, and are normalized just before being inserted in the collection. Then, the value of the normalized pointer is used to check for equality and hashcode. This has various positive effects:

  • The user does not need to write equality and hashing behaviour
  • There is no risk of mistake in the equality and hashing behaviour
  • The intrinsic invariants of the hashmap/hashset are never violated/corrupted.
  • The equality is a perfect structural equality, but is as fast as pointer equality; for maps with large keys this can make a massive performance difference.
Maps and sets have less methods than lists, but they can still be iterated upon, as shown in the following code:
%val") )
  //we can use (..) to extract the key/val fields from PointToString.Entry
  //this iteration is straightforward since all values are imm
  roads = Roads[
    key=S"Kelburn Parade", val=\(x=0\ y=0\); //immutable to immutable
    key=S"The Terrace", mutVal=\(x=0\ y=0\);//immutable to mutable
    key=S"Cuba Street", mutVal=\(x=0\ y=0\);//immutable to mutable
    ]
  for read (key, val) in roads ( Debug(S"%key->%val") )
  //we add 'read' in front to be able to read mixed imm/mut values
  //if all the values were mutable, we could just add 'mut' in front
  )
  mut Roads.OVal optPoint = roads.#val(key=S"Cuba Street"))
  optPoint.#val().x(50\)//update the field of the object inside the map
  ]]>
As you can see, when objects are retried from the map, we obtain an optional value; this is because statically we can not know if a key is mapped to a value or not. We can use «» or the «» pattern to provide a default value if the key is not contained in the map.
In addition to conventional «» and «», maps offers the following methods:
  • To extract a value using the key: «», «» and «»; to extract an optional «», «» or a «» reference, respectively. As for lists, it is always safe to extract a «» reference. An empty optional will be produced when attempting to extract as «» a value that was inserted as «» instead, so to reliably ask if a key is contained in the map we should write «».
  • Mutable maps can be modified by inserting immutable values with «» and mutable values with «». Finally, an association can be removed using «».
  • «» creates a class remembering the insertion order. This is needed to make the iteration deterministic. The keys can be retrieved with their order using «» passing the desired «», from zero to «» The corresponding value can be retrieved by methods «», «» and «» to extract a «», «» or a «» (not optional) reference to the value, respectively.

Set

Sets behave a lot like maps where the values are irrelevant, and have differently named methods. In particular, in addition to conventional «» and «», sets offer methods «» and «» to add and remove an element, and elements can be extracted in the insertion order by using method «» We are considering adding operators «» to sets, as supported by lists. On the other side, boolean methods like «» «» and «» can already be easily emulated with «» as we shown for lists.

(5/5) Collection summary

  • There are tons of methods and operators to know, but since most code works around collections, it is worth the effort to memorize them.
  • Immutable collections are easy to play with, using operators and «» methods.
  • Mutable collections can be more efficient and flexible, but they come with additional difficulties.
  • Most methods have a general version that works with an index, and specialized «» and «» variants.
  • «» can help remove a lot of boilerplate, but is a concept unique to 42, and require some effort to get used to.
  • «» is very useful and flexible. It is common to find methods composed from just a large «» statement plus a little pre and post processing around it.

Digressions / Expansions

Collections support iteration with the «» syntax. Iteration in 42 is way more flexible than in most other languages, and it is delegated on method calls. Iteration in 42 is designed to support two main iteration strategy: explicit indexes and iterator objects. For example, the code

would expand to
Since «» is declared «», «» is used instead of «». Since «» is declared «» «» is used instead of «».

Iteration methods in detail

  • «» and «»
    They return an object able to provide the elements of the list. The second variant returns a «» object, this is needed to provide the mutable version of those elements, and to update those elements in the list. The second variant is used if the local binding is explicitly declared either «», «»,«» or «». For complex bindings, like «», the second variant is used if any binding component would require it.
  • «» and «»
    The initial iteration hint is produced by «» and moved forward by «».
  • «», «» and «»
    The iterator checks if it has more elements to provide by calling «». Since iterations in 42 can work on multiple collections at the same time, «» results can be combined with «». This design offers a lot of flexibility; special kinds of collections may return special data types to coordinate termination in some way. The AdamsTowel collections opt for an efficient solution where there are four logical results for «»: «», «», «» and «». The iteration will stop if all the «» return «» or «», and an error is raised if some «» returns «» and some other returns «». With this design, the result of the single «» method coordinates the various iterators to check if more elements are possibly available, and to decide how strongly to insist for those elements to be visited.

Possible implementations of Iteration methods

The iterator methods allows for a range of possible options. The simplest one, and possibly the most efficient, is to delegate everything to the collection object: in this case, there is no need to create a new object to serve as an iterator, since the collection itself is able to simply access/update its elements by index, thus «» and «» may simply return «». «» returns the «» and «» returns «» if «» and «» otherwise. Finally, «» will throw error if «» would return «».

Another option would be the one of a linked list, where it would be inefficient to rely on the index to access the elements. In this case, the «» and «» may simply return a singleton object delegating the behaviour to the index object. «» can simply expose the first internal node, and «» can produce the next node. The singleton object may be able to see private methods of those internal nodes, thus even if we expose the internal nodes, they will be just unusable black boxes to the library user, and all the interactions can be mediated, and checked by the singleton iterator object.

Iterators in Java and other languages will throw an error if the collection is somehow modified during the iteration. This could be supported by providing a specialized sublist object that can remember its version; but it is unclear if this is a good idea in 42, where most collections would be iterated while they are immutable. The few cases of iteration on mutable collections may be the ones where there are good reasons to perform mutation during iteration. While adding elements at the start of a collection under iteration is most likely a bug, other operations have very reasonable use cases. For example, appending elements at the end of a list while computing a fixpoint, removing tasks from the end of a list if they are now unneeded, or replacing already visited elements with new ones.

      Previous ... Next