Handling maps in Groovy vs Java

Discover the differences in map handling between Groovy and Java with this hands-on demo.
1 reader likes this.
women programming

WOCinTech Chat. Modified by Opensource.com. CC BY-SA 4.0

Java is a great programming language, but sometimes I want a Java-like language that's just a bit more flexible and compact. That's when I opt for Groovy.

In a recent article, I reviewed some of the differences between creating and initializing maps in Groovy and doing the same thing in Java. In brief, Groovy has a concise syntax for setting up maps and accessing map entries compared to the effort necessary in Java.

This article will delve into more differences in map handling between Groovy and Java. For that purpose, I will use the sample table of employees used for demoing the JavaScript DataTables library. To follow along, start by making sure you have recent versions of Groovy and Java installed on your computer.

Install Java and Groovy

Groovy is based on Java and requires a Java installation as well. A recent and/or decent version of Java and Groovy might already be in your Linux distribution's repositories, or you can download and install Groovy from the Apache Groovy website. A good option for Linux users is SDKMan, which can be used to get multiple versions of Java, Groovy, and many other related tools. For this article, I'm using SDK's releases of:

  • Java: version 11.0.12-open of OpenJDK 11
  • Groovy: version 3.0.8.

Back to the problem: maps

First, in my experience, maps and lists (or at least arrays) often end up in the same program. For example, processing an input file is very similar to passing over a list; often, I do that when I want to categorize data encountered in the input file (or list), storing some kind of value in lookup tables, which are just maps.

Second, Java 8 introduced the whole Streams functionality and lambdas (or anonymous functions). In my experience, converting input data (or lists) into maps often involves using Java Streams. Moreover, Java Streams are at their most flexible when dealing with streams of typed objects, providing grouping and accumulation facilities out of the box.

Employee list processing in Java

Here's a concrete example based on those fictitious employee records. Below is a Java program that defines an Employee class to hold the employee information, builds a list of Employee instances, and processes that list in a few different ways:

     1	import java.lang.*;
     2	import java.util.Arrays;
       
     3	import java.util.Locale;
     4	import java.time.format.DateTimeFormatter;
     5	import java.time.LocalDate;
     6	import java.time.format.DateTimeParseException;
     7	import java.text.NumberFormat;
     8	import java.text.ParseException;
       
     9	import java.util.stream.Collectors;
       
    10	public class Test31 {
       
    11	    static public void main(String args[]) {
       
    12	        var employeeList = Arrays.asList(
    13	            new Employee("Tiger Nixon", "System Architect",
    14	                "Edinburgh", "5421", "2011/04/25", "$320,800"),
    15	            new Employee("Garrett Winters", "Accountant",
    16	                "Tokyo", "8422", "2011/07/25", "$170,750"),
                                                        ...
    
    81	            new Employee("Martena Mccray", "Post-Sales support",
    82	                "Edinburgh", "8240", "2011/03/09", "$324,050"),
    83	            new Employee("Unity Butler", "Marketing Designer",
    84	                "San Francisco", "5384", "2009/12/09", "$85,675")
    85	        );
       
    86	        // calculate the average salary across the entire company
       
    87	        var companyAvgSal = employeeList.
    88	            stream().
    89	            collect(Collectors.averagingDouble(Employee::getSalary));
    90	        System.out.println("company avg salary = " + companyAvgSal);
       
    91	        // calculate the average salary for each location,
    92	        //     compare to the company average
       
    93	        var locationAvgSal = employeeList.
    94	            stream().
    95	            collect(Collectors.groupingBy((Employee e) ->
    96	                e.getLocation(),
    97	                    Collectors.averagingDouble(Employee::getSalary)));
    98	        locationAvgSal.forEach((k,v) ->
    99	            System.out.println(k + " avg salary = " + v +
   100	                "; diff from avg company salary = " +
   101	                (v - companyAvgSal)));
       
   102	        // show the employees in Edinburgh approach #1
       
   103	        System.out.print("employee(s) in Edinburgh (approach #1):");
   104	        var employeesInEdinburgh = employeeList.
   105	            stream().
   106	            filter(e -> e.getLocation().equals("Edinburgh")).
   107	            collect(Collectors.toList());
   108	        employeesInEdinburgh.
   109	            forEach(e ->
   110	                System.out.print(" " + e.getSurname() + "," +
   111	                    e.getGivenName()));
   112	        System.out.println();
       
       
   113	        // group employees by location
       
   114	        var employeesByLocation = employeeList.
   115	            stream().
   116	            collect(Collectors.groupingBy(Employee::getLocation));
       
   117	        // show the employees in Edinburgh approach #2
       
   118	        System.out.print("employee(s) in Edinburgh (approach #2):");
   119	        employeesByLocation.get("Edinburgh").
   120	            forEach(e ->
   121	                System.out.print(" " + e.getSurname() + "," +
   122	                    e.getGivenName()));
   123	        System.out.println();
       
   124	    }
   125	}
       
   126	class Employee {
   127	    private String surname;
   128	    private String givenName;
   129	    private String role;
   130	    private String location;
   131	    private int extension;
   132	    private LocalDate hired;
   133	    private double salary;
       
   134	    public Employee(String fullName, String role, String location,
   135	        String extension, String hired, String salary) {
   136	        var nn = fullName.split(" ");
   137	        if (nn.length > 1) {
   138	            this.surname = nn[1];
   139	            this.givenName = nn[0];
   140	        } else {
   141	            this.surname = nn[0];
   142	            this.givenName = "";
   143	        }
   144	        this.role = role;
   145	        this.location = location;
   146	        try {
   147	            this.extension = Integer.parseInt(extension);
   148	        } catch (NumberFormatException nfe) {
   149	            this.extension = 0;
   150	        }
   151	        try {
   152	            this.hired = LocalDate.parse(hired,
   153	                DateTimeFormatter.ofPattern("yyyy/MM/dd"));
   154	        } catch (DateTimeParseException dtpe) {
   155	            this.hired = LocalDate.EPOCH;
   156	        }
   157	        try {
   158	            this.salary = NumberFormat.getCurrencyInstance(Locale.US).
   159	                parse(salary).doubleValue();
   160	        } catch (ParseException pe) {
   161	            this.salary = 0d;
   162	        }
   163	    }
       
   164	    public String getSurname() { return this.surname; }
   165	    public String getGivenName() { return this.givenName; }
   166	    public String getLocation() { return this.location; }
   167	    public int getExtension() { return this.extension; }
   168	    public LocalDate getHired() { return this.hired; }
   169	    public double getSalary() { return this.salary; }
   170	}

Wow, that's a lot of code for a simple demo program! I'll go through it in chunks first.

Starting at the end, lines 126 through 170 define the Employee class used to store employee data. The most important thing to mention here is that the fields of the employee record are of different types, and in Java that generally leads to defining this type of class. You could make this code a bit more compact by using Project Lombok's @Data annotation to automatically generate the getters (and setters) for the Employee class. In more recent versions of Java, I can declare these sorts of things as a record rather than a class, since the whole point is to store data. Storing the data as a list of Employee instances facilitates the use of Java streams.

Lines 12 through 85 create the list of Employee instances, so now you've already dealt with 119 of 170 lines.

There are nine lines of import statements up front. Interestingly, there are no map-related imports! This is partly because I'm using stream methods that yield maps as their results, and partly because I'm using the var keyword to declare variables, so the type is inferred by the compiler.

The interesting parts of the above code happen in lines 86 through 123.

In lines 87-90, I convert employeeList into a stream (line 88) and then use collect() to apply the Collectors.averagingDouble() method to the Employee::getSalary (line 89) method to calculate the average salary across the whole company. This is pure functional list processing; no maps are involved.

In lines 93-101, I convert employeeList into a stream again. I then use the Collectors.groupingBy() method to create a map whose keys are employee locations, returned by e.getLocation(), and whose values are the average salary for each location, returned by Collectors.averagingDouble() again applied to the Employee::getSalary method applied to each employee in the location subset, rather than the entire company. That is, the groupingBy() method creates subsets by location, which are then averaged. Lines 98-101 use forEach() to step through the map entries printing location, average salary, and the difference between the location averages and company average.

Now, suppose you wanted to look at just those employees located in Edinburgh. One way to accomplish this is shown in lines 103-112, where I use the stream filter() method to create a list of only those employees based in Edinburgh and the forEach() method to print their names. No maps here, either.

Another way to solve this problem is shown in lines 113-123. In this method, I create a map where each entry holds a list of employees by location. First, in lines 113-116, I use the groupingBy() method to produce the map I want with keys of employee locations whose values are sublists of employees at that location. Then, in lines 117-123, I use the forEach() method to print out the sublist of names of employees at the Edinburgh location.

When we compile and run the above, the output is:

company avg salary = 292082.5
San Francisco avg salary = 284703.125; diff from avg company salary = -7379.375
New York avg salary = 410158.3333333333; diff from avg company salary = 118075.83333333331
Singapore avg salary = 357650.0; diff from avg company salary = 65567.5
Tokyo avg salary = 206087.5; diff from avg company salary = -85995.0
London avg salary = 322476.25; diff from avg company salary = 30393.75
Edinburgh avg salary = 261940.7142857143; diff from avg company salary = -30141.78571428571
Sydney avg salary = 90500.0; diff from avg company salary = -201582.5
employee(s) in Edinburgh (approach #1): Nixon,Tiger Kelly,Cedric Frost,Sonya Flynn,Quinn Rios,Dai Joyce,Gavin Mccray,Martena
employee(s) in Edinburgh (approach #2): Nixon,Tiger Kelly,Cedric Frost,Sonya Flynn,Quinn Rios,Dai Joyce,Gavin Mccray,Martena

Employee list processing in Groovy

Groovy has always provided enhanced facilities for processing lists and maps, partly by extending the Java Collections library and partly by providing closures, which are somewhat like lambdas.

One outcome of this is that maps in Groovy can easily be used with different types of values. As a result, you can't be pushed into making the auxiliary Employee class; instead, you can just use a map. Let's examine a Groovy version of the same functionality:

     1	import java.util.Locale
     2	import java.time.format.DateTimeFormatter
     3	import java.time.LocalDate
     4	import java.time.format.DateTimeParseException
     5	import java.text.NumberFormat
     6	import java.text.ParseException
       
     7	def employeeList = [
     8	    ["Tiger Nixon", "System Architect", "Edinburgh",
     9	        "5421", "2011/04/25", "\$320,800"],
    10	    ["Garrett Winters", "Accountant", "Tokyo",
    11	        "8422", "2011/07/25", "\$170,750"],

                           ...

    76	    ["Martena Mccray", "Post-Sales support", "Edinburgh",
    77	        "8240", "2011/03/09", "\$324,050"],
    78	    ["Unity Butler", "Marketing Designer", "San Francisco",
    79	        "5384", "2009/12/09", "\$85,675"]
    80	].collect { ef ->
    81	    def surname, givenName, role, location, extension, hired, salary
    82	    def nn = ef[0].split(" ")
    83	    if (nn.length > 1) {
    84	        surname = nn[1]
    85	        givenName = nn[0]
    86	    } else {
    87	        surname = nn[0]
    88	        givenName = ""
    89	    }
    90	    role = ef[1]
    91	    location = ef[2]
    92	    try {
    93	        extension = Integer.parseInt(ef[3]);
    94	    } catch (NumberFormatException nfe) {
    95	        extension = 0;
    96	    }
    97	    try {
    98	        hired = LocalDate.parse(ef[4],
    99	            DateTimeFormatter.ofPattern("yyyy/MM/dd"));
   100	    } catch (DateTimeParseException dtpe) {
   101	        hired = LocalDate.EPOCH;
   102	    }
   103	    try {
   104	        salary = NumberFormat.getCurrencyInstance(Locale.US).
   105	            parse(ef[5]).doubleValue();
   106	    } catch (ParseException pe) {
   107	        salary = 0d;
   108	    }
   109	    [surname: surname, givenName: givenName, role: role,
   110	        location: location, extension: extension, hired: hired, salary: salary]
   111	}
       
   112	// calculate the average salary across the entire company
       
   113	def companyAvgSal = employeeList.average { e -> e.salary }
   114	println "company avg salary = " + companyAvgSal
       
   115	// calculate the average salary for each location,
   116	//     compare to the company average
       
   117	def locationAvgSal = employeeList.groupBy { e ->
   118	    e.location
   119	}.collectEntries { l, el ->
   120	    [l, el.average { e -> e.salary }]
   121	}
   122	locationAvgSal.each { l, a ->
   123	    println l + " avg salary = " + a +
   124	        "; diff from avg company salary = " + (a - companyAvgSal)
   125	}
       
   126	// show the employees in Edinburgh approach #1
       
   127	print "employee(s) in Edinburgh (approach #1):"
   128	def employeesInEdinburgh = employeeList.findAll { e ->
   129	    e.location == "Edinburgh"
   130	}
   131	employeesInEdinburgh.each { e ->
   132	    print " " + e.surname + "," + e.givenName
   133	}
   134	println()
       
   135	// group employees by location
       
   136	def employeesByLocation = employeeList.groupBy { e ->
   137	    e.location
   138	}
       
   139	// show the employees in Edinburgh approach #2
       
   140	print "employee(s) in Edinburgh (approach #1):"
   141	employeesByLocation["Edinburgh"].each { e ->
   142	    print " " + e.surname + "," + e.givenName
   143	}
   144	println()

Because I am just writing a script here, I don't need to put the program body inside a method inside a class; Groovy handles that for us.

In lines 1-6, I still need to import the classes needed for the data parsing. Groovy imports quite a bit of useful stuff by default, including java.lang.* and java.util.*.

In lines 7-90, I use Groovy's syntactic support for lists as comma-separated values bracketed by [ and ]. In this case, there is a list of lists; each sublist is the employee data. Notice that you need the \ in front of the $ in the salary field. This is because a $ occurring inside a string surrounded by double quotes indicates the presence of a field whose value is to be interpolated into the string. An alternative would be to use single quotes.

But I don't want to work with a list of lists; I would rather have a list of maps analogous to the list of Employee class instances in the Java version. I use the Groovy Collection .collect() method in lines 90-111 to take apart each sublist of employee data and convert it into a map. The collect method takes a Groovy Closure argument, and the syntax for creating a closure surrounds the code with { and } and lists the parameters as a, b, c -> in a manner similar to Java's lambdas. Most of the code looks quite similar to the constructor method in the Java Employee class, except that there are items in the sublist rather than arguments to the constructor. However, the last two lines—

[surname: surname, givenName: givenName, role: role,

    location: location, extension: extension, hired: hired, salary: salary]

—create a map with keys surname, givenName, role, location, extension, hired, and salary. And, since this is the last line of the closure, the value returned to the caller is this map. No need for a return statement. No need to quote these key values; Groovy assumes they are strings. In fact, if they were variables, you would need to put them in parentheses to indicate the need to evaluate them. The value assigned to each key appears on its right side. Note that this is a map whose values are of different types: The first four are String, then int, LocalDate, and double. It would have been possible to define the sublists with elements of those different types, but I chose to take this approach because the data would often be read in as string values from a text file.

The interesting bits appear in lines 112-144. I've kept the same kind of processing steps as in the Java version.

In lines 112-114, I use the Groovy Collection average() method, which like collect() takes a Closure argument, here iterating over the list of employee maps and picking out the salary value. Note that using these methods on the Collection class means you don't have to learn how to transform lists, maps, or some other element to streams and then learn the stream methods to handle your calculations, as in Java. For those who like Java Streams, they are available in newer Groovy versions.

In lines 115-125, I calculate the average salary by location. First, in lines 117-119, I transform employeeList, which is a list of maps, into a map, using the Collection groupBy() method, whose keys are the location values and whose values are linked sublists of the employee maps pertaining to that location. Then I process those map entries with the collectEntries() method, using the average() method to compute the average salary for each location.

Note that collectEntries() passes each key (location) and value (employee sublist at that location) into the closure (the l, el -> string) and expects a two-element list of key (location) and value (average salary at that location) to be returned, converting those into map entries. Once I have the map of average salaries by location, locationAvgSal, I can print it out using the Collection each() method, which also takes a closure. When each() is applied to a map, it passes in the key (location) and value (average salary) in the same way as collectEntries().

In lines 126-134, I filter the employeeList to get a sublist of employeesInEdinburgh, using the findAll() method, which is analogous to the Java Streams filter() method. And again, I use the each() method to print out the sublist of employees in Edinburgh.

In lines 135-144, I take the alternative approach of grouping the employeeList into a map of employee sublists at each location, employeesByLocation. Then in lines 139-144, I select the employee sublist at Edinburgh, using the expression employeesByLocation[“Edinburgh”] and the each() method to print out the sublist of employee names at that location.

Why I often prefer Groovy

Maybe it's just my familiarity with Groovy, built up over the last 12 years or so, but I feel more comfortable with the Groovy approach to enhancing Collection with all these methods that take a closure as an argument, rather than the Java approach of converting the list, map, or whatever is at hand to a stream and then using streams, lambdas, and data classes to handle the processing steps. I seem to spend a lot more time with the Java equivalents before I get something working.

I'm also a huge fan of strong static typing and parameterized types, such as Map as found in Java. However, on a day-to-day basis, I find that the more relaxed approach of lists and maps accommodating different types does a better job of supporting me in the real world of data without requiring a lot of extra code. Dynamic typing can definitely come back to bite the programmer. Still, even knowing that I can turn static type checking on in Groovy, I bet I haven't done so more than a handful of times. Maybe my appreciation for Groovy comes from my work, which usually involves bashing a bunch of data into shape and then analyzing it; I'm certainly not your average developer. So is Groovy really a more Pythonic Java? Food for thought.

I would love to see in both Java and Groovy a few more facilities like average()and averagingDouble(). Two-argument versions to produce weighted averages and statistical methods beyond averaging—like median, standard deviation, and so forth—would also be helpful. Tabnine offers interesting suggestions on implementing some of these.

Groovy resources

The Apache Groovy site has a lot of great documentation. Other good sources include the reference page for Groovy enhancements to the Java Collection class, the more tutorial-like introduction to working with collections, and Mr. Haki. The Baeldung site provides a lot of helpful how-tos in Java and Groovy. And a really great reason to learn Groovy is to learn Grails, a wonderfully productive full-stack web framework built on top of excellent components like Hibernate, Spring Boot, and Micronaut.

What to read next
Chris Hermansen portrait Temuco Chile
Seldom without a computer of some sort since graduating from the University of British Columbia in 1978, I have been a full-time Linux user since 2005, a full-time Solaris and SunOS user from 1986 through 2005, and UNIX System V user before that.

Comments are closed.

Creative Commons LicenseThis work is licensed under a Creative Commons Attribution-Share Alike 4.0 International License.