Java Records — Etched in Finality

This article uses a story format to show the concept of records in Java. It shows the different concepts and parts that make up the records, including a restricted identifier, java.lang.Record, Components, Canonical, Compact, Normal constructors, and more.

The Minimalist — A Resolution

Working from home has taught some of us to think about being minimalistic. It was no different for our protagonist, “Dev”. He decided to be minimalistic about his clothes, so he wanted to keep track of what he actually wore and donate the items that stayed unused. To keep it simple, he decided to track just two items, shirts and shoes, and wrote the following class in Java:

final class Attire {  
  private final String shirt;
  private final String shoe;  
  
  public Attire(String shirt, String shoe) {
    this.shirt = shirt;
    this.shoe = shoe;
  }
}

Impatient as he was, Dev now wanted to populate an instance and print it out as a basic sanity test:

Attire at1 = new Attire("formal", "black");
System.out.println(at1);

“Woah,” he cried when he saw “Attire@251a69d7”: the toString() inherited from Object prints the class name and a hash code, not the field values. Undeterred, he quickly wrote a short method to print the values (please spare the opinions on the code):

@Override
public String toString() {
  StringBuilder sb = new StringBuilder(this.getClass().getName());
  sb.append("[");
  sb.append("shirt=");
  sb.append(this.shirt);
  sb.append(",");
  sb.append("shoe=");
  sb.append(this.shoe);
  sb.append("]");
  return sb.toString();
}

And he was satisfied with the output: “Attire[shirt=formal,shoe=black]”. Then he realized that he was filling in the values once and had no way to read them back. The fields are private final, since Dev doesn’t like anyone tampering with his shirts and shoes and this is, after all, a record to keep, so he needed getters/accessors. He wanted the accessors to have the same names as the fields, so here’s what the code looked like:

public String shirt() {
  return this.shirt;
}

public String shoe() {
  return this.shoe;
}

Before closing up the class, he wanted to make sure that he got a true when he compared records of identical shirt-shoe values and tried out:

Attire at1 = new Attire("formal", "black");
Attire at2 = new Attire("formal", "black");
System.out.println(at1.equals(at2));

But we know the answer: “false”, because the equals() inherited from Object compares references. So, being a determined person, he decided to write the equals() method himself:

@Override
public boolean equals(Object obj) {
  if (!(obj instanceof Attire))
    return false;  
  Attire other = (Attire) obj;
  return this.shirt.equals(other.shirt()) && this.shoe.equals(other.shoe());
}

Luckily, Dev had read Effective Java by Joshua Bloch and knew he had to override hashCode(). Anyway, with that also done, now Dev can comfortably compare and hash the records and decide what items are used and what are not. Let us look at the complete class definition code which Dev wrote:

final class Attire {
  private final String shirt;
  private final String shoe;

  // Constructor
  public Attire(String shirt, String shoe) {
    this.shirt = shirt;
    this.shoe = shoe;
  }  

  // Accessors
  public String shirt() {
    return this.shirt;
  }
  public String shoe() {
    return this.shoe;
  }  

  // Equals and HashCode Definitions
  @Override
  public boolean equals(Object obj) {
    if (!(obj instanceof Attire))
      return false;
    Attire other = (Attire) obj;
    return this.shirt.equals(other.shirt()) &&
      this.shoe.equals(other.shoe());
  }
  @Override
  public int hashCode() { // can do better here
    return this.shirt.hashCode() + this.shoe.hashCode();
  }

  // Print out 
  @Override
  public String toString() { // oh please cleanup this code
    StringBuilder sb = new StringBuilder(this.getClass().getName());
    sb.append("[");
    sb.append("shirt=");
    sb.append(this.shirt);
    sb.append(",");
    sb.append("shoe=");
    sb.append(this.shoe);
    sb.append("]");
    return sb.toString();
  }
}

Along with us, Dev looked at his class definition with amusement. His idea to start with was to be minimalistic, and he would probably end up being minimalistic in his clothes; from the code point of view, however, this was far from minimalistic. He decided to call the Java folks, explain the problem, and ask if they had a solution… And he called.

The Response

“What did you smoke last night?” No, that was not the response he received. And to disappoint you, I am not answering that question either, in case you were about to ask me :). The response he received was, “Use Java 16, and the entire code you wrote is just one line now”:

record MyAttire(String shirt, String shoe) {}

Not believing this, he used the disassembler of the Eclipse IDE and saw that the fields shirt and shoe were indeed there, private final as he wanted, and so were the accessors shirt() and shoe(), along with the rest of the methods. We will see the disassembled code a little later. For now, Dev understood that the compiler does this magic under the hood to provide the feature to users.

The Feature — Records

Records provide the Cartesian Product part of the story of Algebraic Data Types — It gives the notion of logical AND. Okay, hold on before you yawn and sign off; let me explain. Every day we do cartesian product operations — Take the example of Dev; he selects one shirt out of the set of shirts and chooses a shoe from a set of shoes — whether the set of shoes is finite or not is a very sensitive topic for some, so let us not go there at all :). Suffice it to say that he does this operation and creates a pair of shirts and shoes — this indeed is equivalent to the cartesian product of two sets and choosing one of the outcomes — and each of these outcomes is an instance of a record.
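
To make the Cartesian product idea concrete, here is a small sketch. The class name CartesianProductDemo, the helper allAttires, and the sample values are made up for this illustration; only the MyAttire record comes from the story above.

```java
import java.util.ArrayList;
import java.util.List;

class CartesianProductDemo {
    // The record from the story: one (shirt, shoe) pair.
    record MyAttire(String shirt, String shoe) {}

    // Builds every (shirt, shoe) combination, i.e. the Cartesian product of the two lists.
    static List<MyAttire> allAttires(List<String> shirts, List<String> shoes) {
        List<MyAttire> result = new ArrayList<>();
        for (String shirt : shirts) {
            for (String shoe : shoes) {
                result.add(new MyAttire(shirt, shoe));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<MyAttire> all = allAttires(List.of("formal", "casual"), List.of("black", "brown"));
        all.forEach(System.out::println);
        System.out.println(all.size()); // 2 shirts x 2 shoes = 4 combinations
    }
}
```

Each element of the resulting list is one outcome of the product, and on any given day Dev picks exactly one of them.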

Records also provide the notion of constant-ness. The private final fields are just one part; hashCode() and equals() add to the story: when the values of the components are the same for two records, the records are considered equal, as in:

MyAttire m = new MyAttire("formal", "black");
MyAttire m2 = new MyAttire("formal", "black");
System.out.println(m.equals(m2));

This prints “true” since the values are the same. And as the name implies, once it is recorded, we cannot tamper with its values.
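
A slightly longer sketch of the same comparison, assuming a wrapper class named RecordEqualityDemo (a name chosen only for this example), shows how value-based equality differs from reference comparison and what it means for hash-based collections:

```java
import java.util.HashSet;
import java.util.Set;

class RecordEqualityDemo {
    record MyAttire(String shirt, String shoe) {}

    public static void main(String[] args) {
        MyAttire m = new MyAttire("formal", "black");
        MyAttire m2 = new MyAttire("formal", "black");

        System.out.println(m == m2);      // false: two distinct objects
        System.out.println(m.equals(m2)); // true: all component values match
        System.out.println(m.hashCode() == m2.hashCode()); // true: equal records hash alike

        Set<MyAttire> worn = new HashSet<>();
        worn.add(m);
        worn.add(m2);                     // rejected as a duplicate of m
        System.out.println(worn.size());  // 1

        // There are no setters and the fields are final:
        // m.shirt = "casual";            // would not compile
    }
}
```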

Why do we want such a construct? The answer is it’s a convenient way to provide the notion of data classes — some call it tuples [Python world]. Records can be thought of as providing itemized tuples with named accessors. And data classes blend themselves nicely into the Pattern Matching [see here for an informal primer] constructs Java is providing now and shortly. Since this article is about records in detail, let us delve deeper into records.

“Record” — Restricted Identifier

First things first: how do we define a record in source code? Using the record “restricted identifier.” What is a “restricted identifier”? If you have used var or yield, you will immediately recognize that record falls into the same bucket. It can be used just like an identifier in almost all places, but with a restriction: you cannot have a type named record. It acts like a keyword only in a record declaration. This is quite a context-sensitive phenomenon, and the complexity is taken care of by the compiler. As users, we just need to know that record is the “text” used to define a record.
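
The following sketch shows the context sensitivity in action. The names RestrictedIdentifierDemo, Entry, and describe are hypothetical and exist only for this illustration.

```java
class RestrictedIdentifierDemo {
    // Here "record" acts as a keyword because it starts a record declaration.
    record Entry(String shirt) {}

    // Here "record" is an ordinary identifier: a parameter name.
    static String describe(Entry record) {
        return "worn: " + record.shirt();
    }

    public static void main(String[] args) {
        // ... and here it is the name of a local variable.
        Entry record = new Entry("formal");
        System.out.println(describe(record));

        // A type named "record" is rejected by the compiler:
        // class record {}   // would not compile
    }
}
```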

The Parent — Java.lang.Record

All of us know that when we define class X {}, the compiler internally interprets this as class X extends java.lang.Object {}. In the case of a record definition, a record R implicitly extends java.lang.Record, a new class provided by the JDK; as a consequence, a record cannot extend any other class. Does this Record class provide the internal methods? Actually, they are provided by a combination of compiler-generated methods working in sync with a bootstrap mechanism akin to that of lambdas; we will see more of this when we discuss the byte code or disassembly. Also, please note that a record can, of course, implement interfaces.
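
A quick way to check the parent at runtime is sketched below. The Wearable interface and the summary() method are assumptions invented for this example; they are not part of the JDK or of Dev's original code.

```java
class RecordParentDemo {
    // A hypothetical interface, introduced only for this illustration.
    interface Wearable {
        String summary();
    }

    // A record has no extends clause, but it can implement interfaces.
    record MyAttire(String shirt, String shoe) implements Wearable {
        @Override
        public String summary() {
            return shirt + " shirt with " + shoe + " shoes";
        }
    }

    public static void main(String[] args) {
        MyAttire m = new MyAttire("formal", "black");
        System.out.println(m.getClass().getSuperclass()); // class java.lang.Record
        System.out.println(MyAttire.class.isRecord());    // true
        System.out.println(m.summary());                  // formal shirt with black shoes
    }
}
```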

Components

Those “things” that look like the formal parameters of records — String shirt and String shoe in our example — are called Components in the record parlance. We now know that internally they get converted into private final fields. These components also result in the definition of accessors of these fields. These components can be individually annotated, and there indeed is a new annotation element type, ElementType.RECORD_COMPONENT. The good thing is that the annotations percolate to the fields and the accessors as applicable.
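
As a sketch of how an annotation on a component percolates, consider the example below. The @Label annotation and the helper methods are hypothetical and written only for this illustration; the annotation declares RECORD_COMPONENT, FIELD, and METHOD as targets, so it is expected to show up on the component, the generated field, and the generated accessor.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class ComponentAnnotationDemo {
    // Hypothetical annotation, made up for this sketch.
    @Retention(RetentionPolicy.RUNTIME)
    @Target({ElementType.RECORD_COMPONENT, ElementType.FIELD, ElementType.METHOD})
    @interface Label {
        String value();
    }

    record MyAttire(@Label("top") String shirt, String shoe) {}

    // Is the annotation visible on the record component itself?
    static boolean onComponent() {
        return MyAttire.class.getRecordComponents()[0].isAnnotationPresent(Label.class);
    }

    // Is it visible on the private final field generated for the component?
    static boolean onField() {
        try {
            return MyAttire.class.getDeclaredField("shirt").isAnnotationPresent(Label.class);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Is it visible on the generated accessor method shirt()?
    static boolean onAccessor() {
        try {
            return MyAttire.class.getDeclaredMethod("shirt").isAnnotationPresent(Label.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(onComponent());
        System.out.println(onField());
        System.out.println(onAccessor());
    }
}
```

If a target is removed from @Target, the annotation is no longer propagated to that kind of element, which is what “as applicable” means here.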

Canonical Constructor

Records minimize the source code by blending the class definition with constructor syntax. Behind this concise definition, the compiler defines a “canonical” constructor, as opposed to the “default” constructor of classes with which we are familiar. While the default constructor doesn’t take any arguments, the canonical constructor takes one parameter per record component, with the same types and in the same order:

MyAttire(String shirt, String shoe){ /* init fields */}
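
If Dev wanted to write the canonical constructor himself, a minimal sketch could look like this. The null check and the helper rejectsNull are assumptions added for illustration, not something the record feature requires.

```java
class CanonicalConstructorDemo {
    record MyAttire(String shirt, String shoe) {
        // Explicit canonical constructor: same parameter names, types, and order as the components.
        MyAttire(String shirt, String shoe) {
            if (shirt == null || shoe == null) {
                throw new IllegalArgumentException("shirt and shoe are required");
            }
            // In this full form, every field must be assigned explicitly.
            this.shirt = shirt;
            this.shoe = shoe;
        }
    }

    // Returns true if constructing a record with a null shirt is rejected.
    static boolean rejectsNull() {
        try {
            new MyAttire(null, "black");
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new MyAttire("formal", "black").shirt());
        System.out.println(rejectsNull());
    }
}
```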

Compact Constructor

A compact constructor is just another, simpler way to define the canonical constructor, where the parameter list is omitted:

MyAttire { /* Hey, I am the compact constructor */}

Since records are all about minimizing, this is just a convenient syntax: the canonical constructor’s parameters must match the record definition anyway, so it’s easier and safer to omit them and use the compact constructor syntax if you want to define the canonical constructor yourself. In fact, there is a specific characteristic of the compact constructor which we will see while discussing errors.
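
Here is a sketch of the compact form used for a sanity check. The choice of Objects.requireNonNull as the check and the helper rejectsNull are assumptions for this example.

```java
import java.util.Objects;

class CompactConstructorDemo {
    record MyAttire(String shirt, String shoe) {
        // Compact form: no parameter list and no field assignments.
        MyAttire {
            Objects.requireNonNull(shirt, "shirt");
            Objects.requireNonNull(shoe, "shoe");
            // The compiler adds the assignments to this.shirt and this.shoe after this point.
        }
    }

    // Returns true if constructing a record with a null shoe is rejected.
    static boolean rejectsNull() {
        try {
            new MyAttire("formal", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(new MyAttire("formal", "black").shoe());
        System.out.println(rejectsNull());
    }
}
```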

What About “Normal” Constructors?

This brings us to the question of whether “normal” constructors are allowed. Let us agree that anything other than a canonical constructor is a normal constructor, i.e., any constructor whose parameters do not match the record components in number, type, and order. Of course, they are allowed, but with a catch: each such constructor must begin by invoking another constructor of the record with this(...), so that every construction path ends up in the canonical constructor.
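
A minimal sketch of such a normal constructor follows. The single-argument constructor and its default shoe color are assumptions made up for this example.

```java
class NonCanonicalConstructorDemo {
    record MyAttire(String shirt, String shoe) {
        // A "normal" (non-canonical) constructor: it has fewer parameters than
        // there are components, so its first statement must delegate via this(...).
        MyAttire(String shirt) {
            this(shirt, "black"); // "black" as a default is an assumption for this sketch
        }
    }

    public static void main(String[] args) {
        MyAttire viaNormal = new MyAttire("formal");
        MyAttire viaCanonical = new MyAttire("formal", "black");
        System.out.println(viaNormal.equals(viaCanonical)); // true
    }
}
```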

Record — Expanding the Right Way

We just glanced through the different concepts and parts that make up records: the “record” restricted identifier, java.lang.Record, components, and canonical, compact, and normal constructors. The idea is that it is very easy to create a record using the default definitions. If it is a minimalistic definition, you are ready to go as soon as you write that small definition. However, as you expand and write your own specifics (say the constructors, your own methods, more fields, etc.), more rules come into play to ensure that the “constant-ness” of a record is not tampered with. And errors are part of program analysis. When we implemented records in the Java Compiler in Eclipse (ecj), we ended up with more than thirty new errors (thirty-five?) for records alone. Let us briefly sample these in the following section.

Errors — A Part of Programs

Let us pick a few scenarios where errors are reported and understand the reasoning behind them:

  • User-declared non-static fields are not permitted in a record. Suppose, for a moment, that instance fields were permitted; we can immediately see that this goes against the definition of records. All components are defined and filled in at the creation of a record instance. Also, think of equals and hashCode: they are based on the components, and a user-defined non-static field would torpedo their effort, removing all constant-ness guarantees. We are, of course, at liberty to define static fields.
  • Multiple canonical constructors are not allowed. First of all, how can we even define multiple canonical constructors? In the source code, we can write both a compact constructor and a canonical constructor with the components in the same order. Such a source is incoherent: internally, both have the same parameters, so they clash and end up as duplicate constructors.
  • Illegal, explicit assignment of a final field shirt in compact constructor. This one is interesting; consider:
record MyAttire(String shirt, String shoe) {
  MyAttire { // Compact Constructor
    this.shirt = "";
  }
}

The compact constructor is supposed to be really compact and is expected to be added just for making sanity checks — it’s not allowed to assign fields explicitly in the source code. The compiler internally adds the assignments while generating the code. And hence the error when we have such an assignment in the source code.
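
What is allowed instead is sketched below: a compact constructor may reassign its parameters, and the compiler-added assignments then store the adjusted values in the fields. The same sketch also shows a permitted static field. The UNKNOWN constant and the trimming rule are assumptions for illustration only.

```java
class CompactNormalizationDemo {
    record MyAttire(String shirt, String shoe) {
        // Static fields are allowed in a record; instance fields are not.
        static final String UNKNOWN = "unknown";

        MyAttire {
            // Reassigning the parameters is fine; the fields are assigned from them afterwards.
            shirt = (shirt == null) ? UNKNOWN : shirt.trim();
            shoe = (shoe == null) ? UNKNOWN : shoe.trim();
            // this.shirt = shirt;   // would not compile in a compact constructor
        }
    }

    public static void main(String[] args) {
        MyAttire m = new MyAttire("  formal ", null);
        System.out.println(m.shirt()); // formal
        System.out.println(m.shoe());  // unknown
    }
}
```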

We have touched less than ten percent of the errors, but this gives an idea of the care taken to keep the definition of a record sacrosanct, and of the amount of error checking and reporting done to ensure that all possible errors are caught in the source. We will now take a look at the byte code and see how our little code gets expanded.

A Disassembled View

The Source code we have is:

record MyAttire(String shirt, String shoe) {}

In the disassembled version, this expands to:

final class MyAttire extends java.lang.Record
...
  private final java.lang.String shirt;  // fields
  private final java.lang.String shoe;   // fields
...
MyAttire(java.lang.String, java.lang.String); // Constructor
...
  // Accessors
  public java.lang.String shirt();
  public java.lang.String shoe();
...
  // 
  public final java.lang.String toString();
  public final int hashCode();
  public final boolean equals(java.lang.Object);
...
 

Our old friend Dev can immediately relate to this and see that the expanded code looks very similar to what he wrote in the first place. Examining further, he sees that in the byte code the record is, in fact, just a class. Being a learner, he was inquisitive, so he delved deeper and took a look at one of the methods, the shirt accessor:

public java.lang.String shirt();
0: aload_0
1: getfield      #14   // Field shirt:Ljava/lang/String;
4: areturn

This self-explanatory code just returns the shirt field, as expected. Nothing surprising here, but then he looks at hashCode and sees something strange:

public final int hashCode();
0: aload_0
1: invokedynamic #30,  0 // InvokeDynamic #0:hashCode:(LMyAttire;)I
6: ireturn

He sees an invokedynamic instruction (he remembers something similar from when he used lambdas), and he doesn’t see any calculation inside the hash function. So what happens here? Similar to lambdas, a bootstrapping mechanism comes into play. At the end of the disassembly, there is this code:

Bootstrap methods:
0 : # 47 invokestatic java/lang/runtime/ObjectMethods.bootstrap:
...
Method arguments:
#1 MyAttire
#48 shirt;shoe
#50 REF_getField shirt:Ljava/lang/String;
#51 REF_getField shoe:Ljava/lang/String;

The End of the Beginning

The exact mechanics are beyond the scope of this article, but for now, we can assume that there is this ObjectMethods class in the JDK with a method bootstrap, which takes the record class, the names of the components as a single String, and method handles for the field getters as arguments (along with the usual lookup, method name, and type), and magically returns a call site whose method handle calculates the hash code; that handle is then called to get the hash code. A similar mechanism is used for equals and toString as well. In a nutshell, the code for the calculation comes from the JDK itself, with inputs from the record. If there is sufficient interest, a separate article can explore this in detail. Let’s conclude the analysis of the disassembled code with just one more detail.
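
For the curious, the bootstrap call can be imitated by hand. The sketch below calls java.lang.runtime.ObjectMethods.bootstrap directly, roughly the way the invokedynamic linkage would. It is only an approximation: as an assumption of this example, handles to the public accessors stand in for the REF_getField handles seen in the class file, and the class and helper names are made up.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.runtime.ObjectMethods;

class BootstrapDemo {
    record MyAttire(String shirt, String shoe) {}

    // Asks ObjectMethods.bootstrap for a hashCode implementation and
    // returns it as a method handle of type (MyAttire)int.
    static MethodHandle hashCodeHandle() throws Throwable {
        MethodHandles.Lookup lookup = MethodHandles.lookup();
        // Assumption: accessor handles are used here instead of field getter handles.
        MethodHandle shirt = lookup.findVirtual(MyAttire.class, "shirt", MethodType.methodType(String.class));
        MethodHandle shoe = lookup.findVirtual(MyAttire.class, "shoe", MethodType.methodType(String.class));
        Object linked = ObjectMethods.bootstrap(
                lookup,
                "hashCode",
                MethodType.methodType(int.class, MyAttire.class),
                MyAttire.class,
                "shirt;shoe",          // component names, separated by semicolons
                shirt, shoe);
        // With a MethodType as the type argument, the result is a call site.
        return ((CallSite) linked).dynamicInvoker();
    }

    // Computes the hash code through the hand-made handle.
    static int hashViaBootstrap(MyAttire attire) {
        try {
            return (int) hashCodeHandle().invoke(attire);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        MyAttire m = new MyAttire("formal", "black");
        System.out.println(hashViaBootstrap(m));
        System.out.println(m.hashCode()); // compare with the compiler-wired version
    }
}
```

The point of the sketch is that no hashing logic lives in the record class itself; the JDK manufactures it on demand from the component getters.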

Record: #RecordComponents:
// Component descriptor #6 Ljava/lang/String;
java.lang.String shirt;
// Component descriptor #6 Ljava/lang/String;
java.lang.String shoe;

A new structure to capture the record components, the Record attribute, is defined in the virtual machine specification, and the disassembler provides the human-readable view above. It is added here for the sake of completeness.
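
The same component information is available at runtime through reflection. The sketch below, with a hypothetical helper named describe, walks the components of any record and reads each value through its accessor.

```java
import java.lang.reflect.RecordComponent;
import java.util.ArrayList;
import java.util.List;

class RecordComponentsDemo {
    record MyAttire(String shirt, String shoe) {}

    // Lists "type name = value" for every component, in declaration order.
    static List<String> describe(Record r) {
        List<String> lines = new ArrayList<>();
        for (RecordComponent c : r.getClass().getRecordComponents()) {
            try {
                Object value = c.getAccessor().invoke(r);
                lines.add(c.getType().getSimpleName() + " " + c.getName() + " = " + value);
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException(e);
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        describe(new MyAttire("formal", "black")).forEach(System.out::println);
        // String shirt = formal
        // String shoe = black
    }
}
```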

The Beginning…

With this, we come to the end of the discussion on records. The changes indeed require detailed attention and bigger article(s) if we go deeper into specifics, but for now, we have done a broad coverage of the concept of records. Please read the Primer Article for an overview of Pattern matching concepts — the top-level feature of which record is just one building block. Records work together with other features like sealed types, pattern switches, etc., to complete the story. To make the story short, this is not the end but just the beginning…


