Friday, August 27, 2010

Old blog: Map in Java

(Found in my years old defunct blog. Less relevant now that Java is getting lambdas.)

Scheme
(map fn lst ...)

;;  ==> list
      

map is a higher order function that applies the function fn of n arguments to each element in the lists lst ...

Java has no map. Enhanced for isn't adequate for the following reasons:

  • No return value: Since for is a control structure, not a function, it doesn't return anything useful. This means a data structure has to be created beforehand, and added to at each iteration of the for loop.
  • No function arguments: This is not for's problem. Functions (methods) are not first class in Java.
  • One iterable only: The enhanced for loop can only iterate over one Iterable data structure. map can operate on any number of lists.

Examples

Simple one list mapping

Scheme
(define squares (list 1 4 9 16))

(define (sqrts nums)
  (map sqrt nums))

(sqrts squares)

;;  ==> (1 2 3 4)
      

Java
/* No generics for clarity.  Assume also the existance of a static
   method `toList' that takes an Array and returns an ArrayList. */

List squares = toList(new Integer[] {1, 4, 9, 16});

List sqrts(List nums) {
    List out = new ArrayList();
    for (Integer i : nums)
        out.add(Math.sqrt(i));
    return out;
}

sqrts(squares);

//  ==> [1, 2, 3, 4]
      
Multiple list mapping

Scheme
(define even (list 2 4 6 8))
(define odd (list 1 3 5 7))

(define (sums . lists)
  (apply map + lists))

(sums even odd)

;;  ==> (3 7 11 15)
      

Java
/* Assume the existance of a static method `isSameLength' that tests
   whether all Lists in an array are the same length. */

List even = toList(new Integer[] {2, 4, 6, 8});
List odd = toList(new Integer[] {1, 3, 5, 7});

List sums(List... lists) {
    List out = new ArrayList();
    if (!isSameLength(lists))
        throw new RuntimeException("Lists must have same length.");
    int numargs = lists.length;
    int length = lists[0].size();
    for (int i = 0; i < length; i++) {
        int val = 0;
        for (int k = 0; k < numargs; k++)
            val += lists[k].get(i);
        out.add(val);
    }
    return out;
}

sums(even, odd);

//  ==> [3, 7, 11, 15]
      

The problem with this is that it isn't generalised.

Map in Java

First we'll need a something that acts as a first-class function.

Java
public interface Proc {

    public Object apply(Object... args);

}
      

Next, a static map method.

Java
public static List map(Proc proc, List... lists) {
    List out = new ArrayList();
    if (!isSameLength(lists))
        throw new RuntimeException("`map' requires lists of identical length");
    int numargs = lists.length;
    int length = lists[0].size();
    for (int i = 0; i < length; i++) {
        Obj[] args = new Obj[numargs];
        for (int k = 0; k < numargs; k++)
            args[k] = lists[k].get(i);
        out.add(proc.apply(args));
    }
    return out;
}
      

Now, the multiple lists example again.

Java
List even = toList(new Integer[] {2, 4, 6, 8});
List odd = toList(new Integer[] {1, 3, 5, 7});

Proc sum = new Proc() {
    public Object apply(Object... args) {
        return (Integer)args[0] + (Integer)args[1];
    }
};

List sums(List... lists) {
    return map(sum, lists);
}

sums(even, odd);

//  ==> [3, 7, 11, 15]
      

Much shorter, but with some limitations:

  • It doesn't use generics, and therefore
  • requires casting in Proc implementations; and
  • it is hardcoded to return an ArrayList.
Parametrized map

First, Proc is changed.

Java
public interface Proc1<R, A> {

    public R apply(A arg);

}

public interface Proc2<R, A, B> {

    public R apply(A argA, B argB);

}

public interface ProcN<R, A, B,..., N> {

    public R apply(A argA, B argB,..., N argN);

}
      

And a parameterized map.

Java
public static <R, A, X extends List<R>> X map(Proc1<R, A> proc, X out, List<A> list) {
    for (A a : list)
        out.add(proc.apply(a));
    return out;
}

public static <R, A, B, X extends List<R>> X map(Proc2<R, A, B> proc,
        X out, List<A> listA, List<B> listB) {
    if (!isSameLength(listA, listB))
        throw new RuntimeException("`map' requires lists of identical length.");
    int length = listA.size();
    for (int i = 0; i < length; i ++)
        out.add(proc.apply(listA.get(i), listB.get(i));
    return out;

// And so on up to `map<R, A, B,..., N, X>.
      

The example would now be:

Java
List even = toList(new Integer[] {2, 4, 6, 8});
List odd = toList(new Integer[] {1, 3, 5, 7});

Proc2<Integer, Integer, Integer> sum = 
    new Proc<Integer, Integer, Integer>() {
        public Integer apply(Integer argA, Integer argB) {
            return argA + argB;
        }
    };

List sums(List listA, List listB) {
    return map(sum, new ArrayList(), listA, listB);
}

sums(even, odd);

//  ==> [3, 7, 11, 15]
      
The cost of parameterizing

The obvious limitation of the generic implementation is that arbitrary numbers of arguments are no longer supported in ProcN or map.

Reduce

reduce is easily written on top of Proc2.

Java
public static <R, A> R reduce(Proc2<R, R, A> proc, List<A> list, R initial) {
    int size = list.size();
    R out = initial;
    for (A a : list)
        out = fun.apply(out, a);
    return out;
}

List nums = toList(new Integer[] {1, 2, 3, 4});

Integer squareSum(List list) {
    return reduce(new Proc2<Integer, Integer, Integer>() {
        public Integer apply(Integer a, Integer b) {
            return a + (b * b);
        }
    }, list, 0);
}

squareSum(nums);

//  ==> 30
      

No comments:

Post a Comment