
Previous in the Series: Introduction to Garbage Collection

Garbage Collection in Java

 

In the previous section we learned that Java uses a garbage collector for memory management. But how does a garbage collector actually work? We will take a closer look at that in this section.

Types of Garbage Collectors

Within the HotSpot JVM, the Garbage Collector isn't a single unified concept, but has multiple implementations. Which garbage collector implementation to use will depend upon the hardware resources available and the performance requirements of your application.

  • Serial Garbage Collector - Performs all garbage collection on a single thread. Has higher pause times, but lower resource usage. Best used on systems with only a single processor.
  • Parallel Garbage Collector - Similar to the serial garbage collector, but reduces pause times by performing work using multiple threads.
  • Concurrent Mark Sweep (CMS) Garbage Collector (Deprecated in JDK 9, Removed in JDK 14) - Reduces garbage collection pause times by having background threads trace object usage while the application runs.
  • Garbage First (G1) Garbage Collector (Default since JDK 9) - Improves upon and replaces the CMS GC. G1 is best suited to multi-processor machines with access to large amounts of memory.
  • Z GC (Experimental in JDK 11, Production in JDK 15) - Ultra-low latency GC that also scales to applications with multi-terabyte heaps. The internal implementation and behavior of Z GC are distinctly different from the other garbage collectors listed, and a description of its behavior will be handled in a separate article.
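
To see which collector a given JVM is actually running, the standard java.lang.management API can list the active collector beans. The following is a minimal sketch: the class name ActiveCollectors is made up for illustration, and the names it prints vary by collector and JDK version. A specific collector can be requested at startup with HotSpot flags such as -XX:+UseSerialGC, -XX:+UseParallelGC, -XX:+UseG1GC, or -XX:+UseZGC.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
class ActiveCollectors {

    // Returns the names of the collector beans exposed by the running JVM.
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The output depends on the collector in use, so no fixed output is shown here.
        collectorNames().forEach(System.out::println);
    }
}
```

Running the same program with different collector flags is a quick way to confirm that a flag took effect.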

 

Heap Memory

Heap memory is an allocation of memory that is controlled by the JVM. The size of the heap available to the JVM is primarily controlled with the -Xms<value> and -Xmx<value> JVM arguments, which set the initial heap size and the maximum heap size respectively.
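
As a rough illustration of how these two settings show up at runtime, the sketch below reads the heap limits through java.lang.Runtime. The class name HeapSizes and its helper methods are assumptions made for this example.

```java
// Hypothetical helper class, not part of the JDK.
class HeapSizes {

    static final long MIB = 1024L * 1024L;

    // Upper bound the JVM will try to use for the heap (influenced by -Xmx).
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Heap memory currently reserved by the JVM (starts near -Xms and can grow).
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    // Approximate heap memory currently occupied by objects, live or not yet collected.
    static long usedHeapBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.println("max heap:       " + maxHeapBytes() / MIB + " MiB");
        System.out.println("committed heap: " + committedHeapBytes() / MIB + " MiB");
        System.out.println("used heap:      " + usedHeapBytes() / MIB + " MiB");
    }
}
```

Running it as java -Xms256m -Xmx512m HeapSizes.java should report a maximum close to 512 MiB, though the exact figure can differ slightly between collectors.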

When any thread in the JVM creates an object, the object is stored in the heap. Because the heap is shared by all threads, objects stored there are not inherently thread safe. This is in contrast to local variables, which are allocated in each thread's own stack memory. Stack memory is thread safe and is automatically reclaimed when a method returns and its stack frame is popped.
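
The difference between shared heap state and per-thread stack state can be sketched as follows. This is an illustrative example only: the class SharedHeapDemo and the method sumWithThreads are hypothetical names, and AtomicLong is used here as one of several ways to make updates to a shared heap object safe.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical demo class, not part of the JDK.
class SharedHeapDemo {

    // Each worker sums 1..n in a local variable, then adds its result to a shared total.
    static long sumWithThreads(int threads, int n) {
        AtomicLong total = new AtomicLong();   // heap object, reachable from every worker
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                long local = 0;                // local variable, private to this thread's stack
                for (int i = 1; i <= n; i++) {
                    local += i;
                }
                total.addAndGet(local);        // atomic update of the shared heap object
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return total.get();
    }

    public static void main(String[] args) {
        System.out.println(sumWithThreads(4, 1000)); // prints 2002000 (4 * 500500)
    }
}
```

No synchronization is needed for local because no other thread can reach it. The shared total needs an atomic update because every worker holds a reference to the same heap object.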

If heap memory becomes full and the garbage collector cannot reclaim enough space, the JVM throws a java.lang.OutOfMemoryError when it attempts to allocate space for new objects. For most implementations of garbage collectors in Java, heap memory is divided into multiple regions based on the "age" of an object. The number and types of regions will vary depending on the specific implementation of the garbage collector.

Generational Garbage Collection

Most garbage collectors in Java are implemented as generational garbage collectors (every garbage collector except Z GC). The idea behind a generational garbage collector is that most objects are short lived and become eligible for removal soon after creation. Conversely, as an object increases in age, it becomes less and less likely to become a candidate for removal. Generational garbage collectors divide the heap into multiple regions, with new objects in a more frequently checked young region and long-lived objects in a less frequently checked old region.

Dividing the heap into multiple regions reduces the system pause time associated with garbage collections, improving throughput and responsiveness of applications running on the JVM. Garbage collectors can be further tuned to favor specific characteristics, such as throughput, responsiveness, or resource usage, depending upon the needs of the application.

Heap Regions

As mentioned earlier, the memory heap in generational garbage collectors is divided into multiple regions. Let's look at these regions in more detail.

  • Young Region - The Young Region, as the name suggests, is the heap region that contains recently created objects. The Young Region is itself subdivided into more regions.

    • Eden Space - On initial creation, an object is stored in the Eden region of the heap until its first garbage collection.
    • Survivor Spaces - Objects that have survived a garbage collection are promoted to a survivor region. Generational collectors have multiple survivor regions, whose purpose is to improve garbage collector efficiency. During a garbage collection, objects that are still referenced in an Eden space or an occupied survivor space are copied or moved to an empty survivor space.
  • Old Region - If an object gains enough "age", by surviving garbage collections, it will be promoted to the old region.

  • Permanent/Metaspace Region - The final region is the permanent or metaspace region. Data stored here is typically JVM metadata, core system classes, and other data that exists for nearly the entire life of the JVM. In JDK 8 and later, the permanent generation has been replaced by Metaspace, which is allocated from native memory rather than from the Java heap. This region is checked by the garbage collector, often only when the heap has reached a critical consumed memory threshold.
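
The regions a particular collector uses can be inspected at runtime through the memory pool beans in java.lang.management. The sketch below is illustrative: the class name HeapPools is hypothetical, and the pool names it prints (for example "G1 Eden Space" or "Metaspace" under G1) depend on the collector and JDK version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class, not part of the JDK.
class HeapPools {

    // Builds one line per memory pool: its type (HEAP or NON_HEAP), name, and used size.
    static List<String> describePools() {
        List<String> lines = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // null if the pool is no longer valid
            long usedKiB = (usage == null) ? 0 : usage.getUsed() / 1024;
            lines.add(String.format("%-8s %-32s %d KiB used",
                    pool.getType().name(), pool.getName(), usedKiB));
        }
        return lines;
    }

    public static void main(String[] args) {
        describePools().forEach(System.out::println);
    }
}
```

Pools reported as HEAP correspond to the young and old regions, while class metadata pools are reported as NON_HEAP.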

 

Garbage Collection Process

At a high level, a garbage collection has three phases: mark, sweep, and compaction. Each of these phases has distinct responsibilities. Note that, depending on the garbage collector implementation, there might be additional sub-phases within each phase that are not covered here.

Mark

On object creation, the VM gives every object a 1-bit marking value, initially set to false (0). This value is used by the garbage collector to mark whether an object is reachable. At the start of a garbage collection, the garbage collector traverses the object graph and marks any object it can reach as true (1).

The garbage collector doesn't scan each object individually, but instead starts from "root" objects. Examples of root objects are local variables, static class fields, active Java threads, and JNI references. The below animation visualizes what the object mark phase looks like:

Sweep

During the sweep phase all objects that are unreachable, those whose marking bit is currently false (0), are removed.
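
The mark and sweep phases described above can be sketched with a toy object graph. This is a simplified model, not how HotSpot is implemented: the ToyMarkSweep and Node classes are hypothetical, the "heap" is a plain list, and resetting the mark bit during the sweep is a simplification chosen for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical toy model, not part of the JDK.
class ToyMarkSweep {

    // Stand-in for a heap object: a mark bit plus outgoing references.
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        boolean marked; // false (0) until the mark phase reaches this node

        Node(String name) {
            this.name = name;
        }
    }

    // Mark: traverse the graph from the roots and set the mark bit on everything reachable.
    static void mark(List<Node> roots) {
        Deque<Node> pending = new ArrayDeque<>(roots);
        while (!pending.isEmpty()) {
            Node n = pending.pop();
            if (n.marked) {
                continue; // already visited, which also stops cycles from looping forever
            }
            n.marked = true;
            pending.addAll(n.refs);
        }
    }

    // Sweep: remove unmarked nodes from the toy heap and reset the bit on the survivors.
    static List<String> sweep(List<Node> heap) {
        List<String> removed = new ArrayList<>();
        Iterator<Node> it = heap.iterator();
        while (it.hasNext()) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;
            } else {
                removed.add(n.name);
                it.remove();
            }
        }
        return removed;
    }

    // a -> b -> c is reachable from root a; d and e reference each other but nothing reaches them.
    static List<String> demo() {
        Node a = new Node("a"), b = new Node("b"), c = new Node("c");
        Node d = new Node("d"), e = new Node("e");
        a.refs.add(b);
        b.refs.add(c);
        d.refs.add(e);
        e.refs.add(d);
        List<Node> heap = new ArrayList<>(List.of(a, b, c, d, e));
        mark(List.of(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [d, e]
    }
}
```

The nodes d and e reference each other, yet both are removed. Reachability from a root, not the mere existence of a reference, decides whether an object survives.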

Compaction

The final phase of a garbage collection is the compaction phase. Live objects in the Eden region or an occupied survivor region are moved or copied to an empty survivor region. If an object in a survivor region has gained enough age, it is moved or copied to the old region.
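
The movement between Eden, the survivor spaces, and the old region can be sketched with a small model. Everything here is an assumption made for illustration: the ToyYoungCollection class, the live flag that stands in for the result of the mark phase, and a tenure threshold of 2. In HotSpot the real threshold is governed by the -XX:MaxTenuringThreshold flag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model, not part of the JDK.
class ToyYoungCollection {

    static final int TENURE_THRESHOLD = 2; // illustrative value only

    static final class Obj {
        final String name;
        boolean live; // stands in for "was marked reachable"
        int age;      // number of collections survived

        Obj(String name, boolean live) {
            this.name = name;
            this.live = live;
        }
    }

    final List<Obj> eden = new ArrayList<>();
    List<Obj> fromSurvivor = new ArrayList<>(); // occupied survivor space
    List<Obj> toSurvivor = new ArrayList<>();   // empty survivor space
    final List<Obj> old = new ArrayList<>();

    void allocate(String name, boolean live) {
        eden.add(new Obj(name, live));
    }

    // Copies live objects out of Eden and the occupied survivor space, then swaps the survivors.
    void minorCollect() {
        evacuate(eden);
        evacuate(fromSurvivor);
        List<Obj> tmp = fromSurvivor;
        fromSurvivor = toSurvivor;
        toSurvivor = tmp;
    }

    private void evacuate(List<Obj> space) {
        for (Obj o : space) {
            if (!o.live) {
                continue; // dead objects are simply not copied
            }
            o.age++;
            if (o.age >= TENURE_THRESHOLD) {
                old.add(o);        // old enough: promote
            } else {
                toSurvivor.add(o); // still young: copy to the empty survivor space
            }
        }
        space.clear(); // the whole space is free again, with no fragmentation
    }

    static List<String> names(List<Obj> space) {
        List<String> result = new ArrayList<>();
        for (Obj o : space) {
            result.add(o.name);
        }
        return result;
    }

    public static void main(String[] args) {
        ToyYoungCollection heap = new ToyYoungCollection();
        heap.allocate("keep", true);
        heap.allocate("temp", false);
        heap.minorCollect();
        System.out.println(names(heap.fromSurvivor)); // prints [keep]
        heap.minorCollect();
        System.out.println(names(heap.old));          // prints [keep]
    }
}
```

Because live objects are copied out and the source space is cleared as a whole, the space is left contiguous and free.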

Garbage Collection Pause

During a garbage collection there might be periods where some, or even all, processing within the JVM is paused. These are called Stop-the-World events. As mentioned in the introduction of the Heap Memory section, objects stored in heap memory are not thread safe. This in turn means that during a garbage collection, part or all of the JVM must be paused for a period while the garbage collector works, to prevent errors from occurring as objects are checked for usage, deleted, and moved or copied.

Tools like JDK Flight Recorder (JFR) and VisualVM can be used to monitor the frequency and duration of pauses occurring from garbage collection. How to tune a garbage collector is outside the scope of this tutorial, but monitoring garbage collector behavior, and subsequently tuning it through JVM arguments, can be a key way to improve the performance of an application.
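
For a quick look at collection frequency from inside an application, without a full profiling tool, the collector beans in java.lang.management expose running totals. The sketch below is a minimal example under a few assumptions: the class name GcStats and the allocation loop are made up for illustration, and the reported time is the approximate accumulated collection time, which is not necessarily the same as pause time for collectors that do part of their work concurrently.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper class, not part of the JDK.
class GcStats {

    // Total number of collections so far; a bean reports -1 when the value is undefined.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds across all collectors.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        // Create short-lived garbage so the collector has something to do.
        byte[][] sink = new byte[16][];
        for (int i = 0; i < 2_000_000; i++) {
            sink[i % sink.length] = new byte[1024];
        }

        System.out.println("collections during loop: " + (totalCollections() - countBefore));
        System.out.println("collection time (ms):    " + (totalCollectionMillis() - millisBefore));
    }
}
```

The numbers depend on heap size and collector, so the program is best used to compare runs with different JVM arguments rather than as an absolute measurement.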

Types of Garbage Collections

Just like there are different regions of heap memory, there are also different types of garbage collections.

  • Minor - Minor garbage collections only scan the Young Regions of heap memory. Minor garbage collections occur very frequently and often have very low pause times associated with them.
  • Major - Major garbage collections scan both the Young and Old regions of heap memory. Major garbage collections occur much less frequently than minor garbage collections, often being triggered by specific conditions within the VM, for example when a threshold of heap memory has been used. Significantly longer pause times are associated with major garbage collections as a much larger portion of the heap is being scanned.
  • Full - A full garbage collection is when the entire heap is scanned: the Young, Old, and Permanent/Metaspace regions. Like major garbage collections, full garbage collections are often condition-based, for example when a very high percentage of heap memory is being consumed, or when one is triggered manually by a system administrator. Also like major garbage collections, very long pause times are associated with full garbage collections.

The below animation visualizes what a garbage collection looks like:


Last update: September 14, 2021

