Data structures in Scala
Table of Contents
fold,foldLeft and foldRight in list
Understanding fold in Parallel Scala Collections
In Scala, the fold method aggregates the elements of a collection using an initial seed value and a binary operation. When applied to a parallel collection, the folding process can execute in parallel, which introduces an important distinction in how the function behaves compared to a sequential fold.
Specifically, the fold operation in a parallel collection splits the input into chunks, applies the folding operation on each chunk independently (starting with the seed value), and then combines the intermediate results. Because of this, the combining function must be both associative and commutative to ensure correctness regardless of the order in which results are combined.
1val parallelNumSeq = List(1, 2, 3, 4, 5).par
2val foldResult = parallelNumSeq.fold(0) { (acc1, acc2) =>
3 val sum = acc1 + acc2
4 println(s"Fold: acc1($acc1) + acc2($acc2) = $sum")
5 sum
6}
7println(foldResult)
8
9/* Example Output (order may vary due to parallelism):
10Fold: acc1(0) + acc2(1) = 1
11Fold: acc1(0) + acc2(4) = 4
12Fold: acc1(0) + acc2(2) = 2
13Fold: acc1(0) + acc2(5) = 5
14Fold: acc1(0) + acc2(3) = 3
15Fold: acc1(4) + acc2(5) = 9
16Fold: acc1(1) + acc2(2) = 3
17Fold: acc1(3) + acc2(9) = 12
18Fold: acc1(3) + acc2(12) = 15
1915 */In the output shown above, you can observe how acc1 and acc2 can both be intermediate accumulators, rather than raw elements from the original list. This is a result of the internal structure of parallel folds, where sub-results are merged after parallel computation. As a result, the print statements may appear out of the expected sequential order.
If order of operations is important, or if your operation is not associative, it is safer to use foldLeft on sequential collections instead.
Understanding foldLeft in list
In Scala, foldLeft is a powerful method available on collections (like List) that allows you to reduce or accumulate a result by iterating from the beginning (left) to the end (right) of the collection.
1list.foldLeft(initialValue)(operation)- initialValue: The starting value for the accumulation.
- operation: A function that takes two arguments — the current accumulator and the current element — and returns an updated accumulator.
Here's a example where we use foldLeft to compute both the sum and product of a list of integers in a single pass:
1val numbers = List(1, 2, 3, 4, 5)
2
3// Compute both sum and product using foldLeft
4val (sum, product) = numbers.foldLeft((0, 1)) {
5 case ((accSum, accProd), num) =>
6 (accSum + num, accProd * num)
7}
8
9println(s"Sum: $sum, Product: $product") // Output: Sum: 15, Product: 120This demonstrates how foldLeft can be used for more advanced use cases, such as computing multiple results in one pass by using a tuple as the accumulator.
Understanding foldRight in List
In Scala, foldRight is similar to foldLeft, but it processes the elements from the end (right) to the beginning (left) of the collection.
1list.foldRight(initialValue)(operation)• initialValue: The starting value for the accumulation.• operation: A function that takes two arguments — the current element and the current accumulator — and returns the new accumulator. (Note: the argument order is reversed compared to foldLeft.)
Here's an example where we use foldRight to reverse a list:
1val numbers = List(1, 2, 3, 4, 5)
2
3// Reversing the list using foldRight
4val reversed = numbers.foldRight(List.empty[Int]) { (num, acc) =>
5 acc :+ num
6}
7
8println(reversed) // Output: List(5, 4, 3, 2, 1)In this example, each element is appended to the end of the accumulator, effectively reversing the list. While foldRight can be more intuitive in certain recursive structures, it is generally less efficient than foldLeft on large collections due to stack usage.
The foreach Method in Scala
In Scala, foreach is a method available on most collections (such as List, Set, Map,Array) as well as on iterators. It is used to apply a given function to each element in the collection, one at a time. Unlike methods such as map or filter, which return a new collection, foreach is specifically designed for side-effecting operations. This makes it useful when you want to print values, update external state, write to a file, or interact with an external system, rather than transform data.
Using foreach with a List
When applied to a List, foreach allows you to run an operation on every element. For example, you might want to print a greeting for each name in a list:
1val names = List("Alice", "Bob", "Charlie")
2names.foreach(name => println(s"Hello, $name!"))Using foreach with a Map
Maps store key–value pairs, and foreach lets you operate on both at the same time. In this example, we iterate over each entry in the map and print both the name and the associated age:
1val ages = Map("Alice" -> 30, "Bob" -> 25)
2ages.foreach { case (name, age) =>
3 println(s"$name is $age years old.")
4}Notice that the syntax uses pattern matching inside the foreach call to destructure the key–value pairs from the map, making the code concise and expressive.
Overall, foreach is a powerful way to iterate over collections when your goal is to perform actions with side effects rather than transform data into new collections. When you do want to transform data, methods like map or flatMap are usually more appropriate.
Reduce Method
In Scala, the reduce method is used to combine the elements of a collection into a single result by repeatedly applying a binary operation. The method takes a function that accepts two elements of the collection's type and produces a single value of the same type. This function is then applied across the sequence, element by element, until only one value remains.
The process starts by combining the first two elements of the sequence. The result of that operation is then combined with the next element, and so on, until all elements are processed. The final outcome is returned as a single value. Because reduce does not take an initial value, it requires the collection to be non-empty; otherwise, it will throw an exception. When you need to handle empty collections safely, methods like reduceOption or fold are better choices.
A key property of the binary operation passed to reduce is associativity. The function should yield the same result regardless of how elements are grouped, since collections can sometimes be processed in parallel. For example, addition and multiplication are associative, while subtraction is not.
1object SequenceReduction {
2 def main(args: Array[String]): Unit = {
3 val numbers: Seq[Int] = Seq(1, 2, 3, 4, 5)
4
5 // Sum of elements using reduce
6 val sum = numbers.reduce((a, b) => a + b)
7 println(s"Sum of elements: $sum") // Output: 15
8
9 // Product of elements using reduce
10 val product = numbers.reduce((a, b) => a * b)
11 println(s"Product of elements: $product") // Output: 120
12
13 // Concatenating strings using reduce
14 val words: Seq[String] = Seq("Scala", "is", "fun")
15 val sentence = words.reduce((s1, s2) => s"$s1 $s2")
16 println(s"Concatenated sentence: $sentence") // Output: Scala is fun
17 }
18}Appending and Prepending Elements to a List
Prepending
In Scala, the most efficient way to add an element to a list is to prepend it using the :: operator. This operation simply creates a new node pointing to the existing list, which runs in constant time O(1).
1val originalList = List(2, 3)
2val newList = 1 :: originalList // List(1, 2, 3)Appending
Adding an element to the end of a list uses the :+ operator. Unlike prepending, appending requires traversing the entire list to reach its end, making the complexity linear in the list size O(n). This is less efficient for large immutable lists.
1val originalList = List(1, 2)
2val newList = originalList :+ 3 // List(1, 2, 3)Concatenating
To merge two lists, use the ::: operator. The new list is built by prepending all elements of the first list onto the second. The time complexity is proportional to the size of the first list O(n).
1val list1 = List(1, 2)
2val list2 = List(3, 4)
3val combinedList = list1 ::: list2 // List(1, 2, 3, 4)In summary, prepending is generally preferred when building lists because it is constant-time. Appending and concatenating are more costly, but still commonly used when constructing lists in a readable way.
scanLeft and scanRight in Scala
The scanLeft and scanRight methods are similar to foldLeft and foldRight, but instead of returning a single accumulated result, they return a new collection that contains all intermediate results of the accumulation — including the initial seed value.
Example: scanLeft
scanLeft processes elements from left to right, applying a binary operation and collecting partial results.
1val list = List(1, 2, 3, 4, 5)
2val result = list.scanLeft(0)(_ + _)
3// result: List(0, 1, 3, 6, 10, 15)Here, the computation starts with an initial seed value 0, and at each step, the function adds the next element to the running total:
Example: scanRight
Conversely, scanRightworks from right to left, applying the operation in reverse order.
1val list = List(1, 2, 3, 4, 5)
2val result = list.scanRight(0)(_ + _)
3// result: List(15, 14, 12, 9, 5, 0)The final list shows how the running total decreases as we move from right to left.
Parameters
Both scanLeft andscanRight take two parameters:
These methods are useful when you want not just the final aggregated result, but also the intermediate steps — for example, when tracking running totals, cumulative sums, or incremental transformations.
collect in Scala
The collect method is a powerful way to filter and transform elements of a collection in a single step. It takes a PartialFunction[A, B] as its argument — this defines both:
During execution, collect iterates over each element in the collection. For every element:
1val numbers = List(1, 2, "three", 4, "five", 6)
2
3// Define a partial function to extract and convert Strings to their length
4val extractStringLength: PartialFunction[Any, Int] = {
5 case s: String => s.length
6}
7
8// Use collect to get the lengths of the String elements
9val stringLengths = numbers.collect(extractStringLength)
10
11println(stringLengths) // Output: List(5, 4)In this example, collect goes through each element in numbers. Only elements that match the pattern case s: String — the string values "three" and "five" — are processed.
For each string, the partial function returns its length, producing a new list List(5, 4). Non-string elements (like integers) are skipped automatically.
collect combines filter and map into one concise operation.groupBy in Scala
The groupBy method in Scala is a powerful higher-order function that allows you to categorize elements in a collection based on a specified criterion. It takes a function as an argument that determines the key used for grouping.
The function you provide takes an element from the collection and returns a "key" — and all elements that share the same key are collected together into a group. The method then returns an immutable.Map, where:
Listor Seq) containing all elements that belong to each group.1val words = List("apple", "banana", "cat", "dog", "elephant")
2val groupedByLength = words.groupBy(_.length) // or words.groupBy(word => word.length)
3
4// Result: Map(5 -> List(apple), 6 -> List(banana), 3 -> List(cat, dog), 8 -> List(elephant))In this example, each word is grouped according to its length. Words with the same length are placed in the same group, producing a map where the key is the length and the value is the list of words of that length.
groupBy(identity). This is equivalent to writing groupBy(x => x).1val numbers = List(1, 2, 2, 3, 4, 5, 5, 5, 6)
2val groupedByIdentity = numbers.groupBy(identity)
3
4// Result: Map(5 -> List(5, 5, 5), 1 -> List(1), 6 -> List(6), 2 -> List(2, 2), 3 -> List(3), 4 -> List(4))This example groups the numbers by their value, so all identical numbers are collected together in the resulting map.
map Method
Applying .map to an Immutable Map
In Scala, calling .map on an immutable.Map transforms each key-value pair into a new pair, producing a new immutable map. The original map remains unchanged because it is immutable — meaning no modifications can be made directly to it.
The .map method on Map[K, V] accepts a function of type((K, V)) ⇒ (K2, W). This function is applied to each entry (key, value) in the original map.
1val originalMap = Map("a" -> 1, "b" -> 2, "c" -> 3)
2// originalMap: Map(a -> 1, b -> 2, c -> 3)
3
4val transformedMap = originalMap.map { case (key, value) =>
5 (key.toUpperCase, value * 2)
6}
7// transformedMap: Map(A -> 2, B -> 4, C -> 6)
8
9// The original map remains unchanged
10println(originalMap) // Output: Map(a -> 1, b -> 2, c -> 3)In this example, each key is converted to uppercase, and each value is doubled. A new map is returned with the transformed pairs, while originalMap stays exactly the same — demonstrating the immutability of Scala's Map.