Friday, July 1, 2016

Part 1: Writing a custom Collector with an initial Collection capacity in Java 8

Java 8 provided a large set of classes to support functional-style operations on streams of elements. We can now write more expressive, more concise and more readable code. What does that mean in practice?

Pre Java 8 code to count even numbers:
     int[] a = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
     long count = 0;
     for (int i = 0; i < a.length; i++) {
           if (a[i] % 2 == 0) {
                count++;
           }
     }
     return count;

Java 8 code to count even numbers:

     int[] a = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 };
     long totalEvens = Arrays.stream(a)
                             .filter(val -> val % 2 == 0)
                             .count();
     return totalEvens;

This is a simple example showing that Java 8 code is more readable and more expressive, which helps us avoid writing boilerplate code and concentrate on the use case.

The Stream API provides two sets of operations: intermediate operations (which return a Stream) and terminal operations (which return a non-stream result). We are going to look at a terminal operation of the category Mutable Reduction.

Mutable Reduction is an operation that accumulates input elements into a mutable result container, such as a Collection, using the method Stream.collect().
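To make this concrete, here is a minimal sketch of a mutable reduction using the three-argument form of Stream.collect(), which takes the supplier, accumulator and combiner directly. The class and method names (MutableReductionSketch, collectEvens) are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

class MutableReductionSketch {

    // Collects the even numbers of the stream into a mutable ArrayList.
    static List<Integer> collectEvens(Stream<Integer> numbers) {
        return numbers
                .filter(n -> n % 2 == 0)
                .collect(
                        ArrayList::new,      // supplier: creates the result container
                        ArrayList::add,      // accumulator: adds one element to it
                        ArrayList::addAll);  // combiner: merges two partial containers
    }

    public static void main(String[] args) {
        System.out.println(collectEvens(Stream.of(1, 2, 3, 4, 5, 6)));
    }
}
```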

Common mutable reductions, or Collectors, include the following (this is not a complete set of Collectors):
1.       Stream.collect(Collectors.toCollection())
2.       Stream.collect(Collectors.toList())
3.       Stream.collect(Collectors.toSet())

These Collectors do not take an initial capacity argument, unlike the initialCapacity constructors provided by several Collection classes, for example:

List<String> list = new ArrayList<String>(50);
           
NOTE: For ArrayList, the initial capacity constructor is used when you know in advance what the size of the ArrayList is going to be. This can avoid the internal array being repeatedly copied over to a newly allocated array.

Let us now look at the Collector interface and leverage its concepts to write a Collector that can create a collection with an initial capacity.

Below is the Collector interface with its 3 type parameters:
public interface Collector<T, A, R>


1.       T – type of input elements to reduction operation.
2.       A – the mutable accumulation type of reduction operation
3.       R – result type of reduction operation

Below are abstract methods in Collector interface:

1.  Supplier<A> supplier() – Creates and returns a new mutable container.
2.  BiConsumer<A, T> accumulator() – Adds a single value to the container.
3.  BinaryOperator<A> combiner() – Joins two accumulators into one.
4.  Function<A, R> finisher() – Performs the final transformation from A, i.e. the mutable accumulation type of the reduction operation, to R, i.e. the result type of the reduction operation.
5.  Set<Characteristics> characteristics() – Returns a Set that indicates the characteristics of this Collector.
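To see how these methods fit together, the sketch below drives a Collector by hand, roughly the way a sequential stream would. The helper collectManually is hypothetical and only meant as an illustration; it is not how the JDK implements Stream.collect().

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collectors;

class CollectorStepsSketch {

    // Runs a collector sequentially by calling its functions one by one.
    static <T, A, R> R collectManually(Collector<T, A, R> collector, List<? extends T> input) {
        A container = collector.supplier().get();                // 1. create the container
        for (T element : input) {
            collector.accumulator().accept(container, element);  // 2. add each element
        }
        // 3. combiner() is only needed when partial containers must be merged,
        //    for example in a parallel stream, so it is not called here.
        return collector.finisher().apply(container);            // 4. final transformation
    }

    public static void main(String[] args) {
        List<String> letters = Arrays.asList("a", "b", "c");
        System.out.println(collectManually(Collectors.joining(", "), letters));
    }
}
```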

JavaCodeGeeks has an article where Collectors are explained beautifully.

The Collector interface also has a static method called Collector.of() that accepts a Supplier, a BiConsumer, a BinaryOperator and Characteristics as parameters.

We will use this method to create a collector for a Collection with initial capacity.
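Before applying it to collections, here is a small sketch of Collector.of() on its own, building a collector that concatenates strings into a StringBuilder. The names CollectorOfSketch, concatenating and concat are assumptions made for this example.

```java
import java.util.stream.Collector;
import java.util.stream.Stream;

class CollectorOfSketch {

    // A collector that concatenates character sequences into one StringBuilder.
    static Collector<CharSequence, ?, StringBuilder> concatenating() {
        return Collector.of(
                StringBuilder::new,                       // Supplier: new empty builder
                (builder, text) -> builder.append(text),  // BiConsumer: add one element
                (left, right) -> left.append(right));     // BinaryOperator: merge two builders
    }

    static String concat(Stream<String> parts) {
        return parts.collect(concatenating()).toString();
    }

    public static void main(String[] args) {
        System.out.println(concat(Stream.of("a", "b", "c")));
    }
}
```

No Characteristics are passed here, since that parameter is a varargs and may be left empty.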

We are now one step closer to creating our Collector. Before that we need to know one functional interface called java.util.function.IntFunction.

IntFunction is a functional interface that represents a function that accepts an int-valued argument and produces a result. And why do we care about it? Because it accepts an int-valued argument, which in our case will be the initial capacity, and returns a result, which in our case will be, for example, new ArrayList<>(initialCapacity).
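As a quick illustration, the sketch below assigns a constructor reference to an IntFunction. The class, field and method names (IntFunctionSketch, SIZED_LIST, newList) are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

class IntFunctionSketch {

    // ArrayList::new resolves to the ArrayList(int) constructor here,
    // because IntFunction.apply takes a single int argument.
    static final IntFunction<List<String>> SIZED_LIST = ArrayList::new;

    // Creates an empty list whose backing array is pre-sized.
    static List<String> newList(int initialCapacity) {
        return SIZED_LIST.apply(initialCapacity);
    }

    public static void main(String[] args) {
        List<String> list = newList(50);
        list.add("first");
        System.out.println(list);
    }
}
```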

The toArrayList method of the ToListCollectors class accepts initialCapacity and Characteristics as parameters. The initialCapacity parameter is passed to the int-valued constructor of the ArrayList class.

import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;

/**
 * {@link ToListCollectors} is used to create a Collection
 * {@link Collector} that accepts initial capacity.
 */
public final class ToListCollectors {

     /**
      * @param <T> The type of input elements for the new
      * collector
      */
     public static <T> Collector<T, ?, List<T>> toArrayList(
                final int initialCapacity,
                final Characteristics... characteristics) {

           return
                     CollectionCollector.toCollection(
                                initialCapacity,
                                ArrayList::new,
                                characteristics);
     }
}

The CollectionCollector class is used to create the Collector. Below is the entire code for it.

import java.util.ArrayList;
import java.util.Collection;
import java.util.function.IntFunction;
import java.util.function.Supplier;
import java.util.stream.Collector;
import java.util.stream.Collector.Characteristics;

public final class CollectionCollector {

     /**
      * Creates a {@link Collector} of specified {@link Supplier}
      * and {@link Characteristics}.
      *
      * @param <T> The type of input elements for the new collector
      * @param <C> The type of {@link Collection}
      */
     public static <T, C extends Collection<T>>
     Collector<T, ?, C> toCollection(
                final Supplier<C> supplier,
                final Characteristics... characteristics) {

           return
                     Collector.of(
                           supplier,
                           C::add,
                           (t, u) -> {
                                t.addAll(u);
                                return t;
                           },
                           characteristics);
     }

     /**
      * This method accepts initial capacity as int-valued
      * argument.
      * We use this argument to create new {@link ArrayList} using
      * {@link ArrayList#ArrayList(int)} constructor.
      *
      * @param <T> The type of input elements for the new collector
      * @param <C> The type of {@link Collection}
      */
     public static <T, C extends Collection<T>>
     Collector<T, ?, C> toCollection(
                final int initialCapacity,
                final IntFunction<C> initialSizedCollection,
                final Characteristics... characteristics) {

           return toCollection(() -> initialSizedCollection.apply(initialCapacity), characteristics);
     }
}
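Here is a sketch of how such a collector might be used in a stream pipeline. To keep the example self-contained, it re-declares compact stand-ins for the two helper methods inside one hypothetical class (InitialCapacityUsageSketch); the upperCase method is also made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.function.IntFunction;
import java.util.stream.Collector;

class InitialCapacityUsageSketch {

    // Compact stand-in for CollectionCollector.toCollection(int, IntFunction, Characteristics...).
    static <T, C extends Collection<T>> Collector<T, ?, C> toCollection(
            int initialCapacity,
            IntFunction<C> sizedCollection,
            Collector.Characteristics... characteristics) {
        return Collector.of(
                () -> sizedCollection.apply(initialCapacity),
                Collection::add,
                (left, right) -> {
                    left.addAll(right);
                    return left;
                },
                characteristics);
    }

    // Compact stand-in for ToListCollectors.toArrayList, without extra characteristics.
    static <T> Collector<T, ?, List<T>> toArrayList(int initialCapacity) {
        return toCollection(initialCapacity, ArrayList::new);
    }

    // The result has as many elements as the input, so the input size
    // is a natural choice for the initial capacity.
    static List<String> upperCase(List<String> words) {
        return words.stream()
                .map(String::toUpperCase)
                .collect(toArrayList(words.size()));
    }

    public static void main(String[] args) {
        System.out.println(upperCase(Arrays.asList("a", "b", "c")));
    }
}
```

Note that a parallel stream may call the supplier once per partial result, so each partial container would be allocated with the full initial capacity.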

OK, now let us understand the code of the CollectionCollector class. Let us first look at the toCollection method with 3 parameters: initialCapacity, initialSizedCollection and characteristics.

public static <T, C extends Collection<T>>
Collector<T, ?, C> toCollection(
           final int initialCapacity,
           final IntFunction<C> initialSizedCollection,
           final Characteristics... characteristics) {
          
           return toCollection(() -> initialSizedCollection.apply(initialCapacity), characteristics);
}

This method creates a new Supplier of an ArrayList with initial capacity, but the creation is hidden behind the lambda expression. Let us demystify it with the code below.
The line
() -> initialSizedCollection.apply(initialCapacity)

is equivalent to
new Supplier<C>() {
           @Override
           public C get() {
                return initialSizedCollection.apply(initialCapacity);
           }
     }


Now we have just created a Supplier that supplies an ArrayList created through its initial capacity constructor.

Now let us understand the toCollection method with two arguments, i.e. Supplier and Characteristics.
public static <T, C extends Collection<T>>
Collector<T, ?, C> toCollection(
           final Supplier<C> supplier,
           final Characteristics... characteristics) {

           return
                Collector.of(
                     supplier,
                     C::add,
                     (t, u) -> {
                           t.addAll(u);
                           return t;
                     },
                     characteristics);
}

This method calls the static method Collector.of (the one we discussed before writing the code).
Collector.of(
          supplier,       //Supplier – creates and returns new ArrayList
          C::add,         //BiConsumer – adds one element to ArrayList
          (t, u) -> {     //BinaryOperator – merges two partial results.
                t.addAll(u);
                return t;
          },
          characteristics); //Collector characteristics



Now you can go ahead and create Collectors with initial capacity for several different Collections.
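For instance, the same idea could be applied to ArrayDeque, which also has an int-valued constructor. This is only a sketch: the class SizedDequeCollectorSketch and the methods sized, toArrayDeque and collectWords are hypothetical names, and sized is a compact stand-in for the toCollection method shown earlier.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Deque;
import java.util.List;
import java.util.function.IntFunction;
import java.util.stream.Collector;

class SizedDequeCollectorSketch {

    // Compact stand-in for the toCollection method that takes an initial capacity.
    static <T, C extends Collection<T>> Collector<T, ?, C> sized(
            int initialCapacity, IntFunction<C> factory) {
        return Collector.of(
                () -> factory.apply(initialCapacity),
                Collection::add,
                (left, right) -> {
                    left.addAll(right);
                    return left;
                });
    }

    // ArrayDeque::new resolves to the ArrayDeque(int) constructor here.
    static <T> Collector<T, ?, Deque<T>> toArrayDeque(int initialCapacity) {
        return sized(initialCapacity, ArrayDeque::new);
    }

    static Deque<String> collectWords(List<String> words) {
        return words.stream().collect(toArrayDeque(words.size()));
    }
}
```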

In the next post we will see custom collectors for JDK implementations of the Set interface like HashSet (with initial capacity), LinkedHashSet (with initial capacity) and TreeSet (with Comparator).

Part 2 Writing custom collectors for Set implementations is published here.

If there are any errors or mistakes, or a better way to do it, please mention them in the comments.
