J2ME CDC 1.0 Porting Guide
This document outlines the memory system architecture of CVM. It also provides a reference for its APIs and acts as a guide to a GC (garbage collector) author to assist in writing a new memory allocator and GC for CVM.
Note: This document assumes basic knowledge of tracing GC algorithms such as mark-and-sweep and copying collection, and related GC concepts such as read and write barriers.
CVM greatly improves on its predecessor PersonalJava Java Virtual Machines (JVMs) in the memory system area. The CVM memory system possesses the following new features:
CVM is built with the goal of exactness in mind. An exact VM knows about all pointers to the Java heap, both from the VM code and native methods. Exactness has numerous advantages:
CVM implements exactness by using GC-safe points (or GC points, for short). The idea is that GC cannot occur at arbitrary points in the execution of a program, but only when all threads can tolerate GC. In other words, threads only make their GC state explicit at well-known intervals, but not all the time.
Each thread in CVM can be in a GC-unsafe state or a GC-safe state. Threads in a GC-unsafe state are free to point to heap objects directly, and can do any GC-unsafe pointer manipulations they like. However, they cannot tolerate GC, as the collector cannot obtain their precise set of pointers. Threads in a GC-safe state must make all their pointers known to GC. In order to prevent pointers from becoming invisible to GC through C compiler optimizations on VM code, they are not allowed to point to objects directly, but only through an extra level of indirection. Also, they register with GC any pointers that GC needs to scan. Therefore they can tolerate GC.
GC can only proceed if all threads are GC-safe. CVM makes precise pointer information available to GC when all threads are GC-safe.
For a guide to writing GC-safe code in CVM, and details of the CVM internal APIs with regard to GC-safe and GC-unsafe modes, please refer to the document How to be GC-safe in CVM.
A typical exact GC cycle in CVM is initiated by a thread requesting a GC. At this point, CVM must bring all other threads to GC points before GC can proceed:
The byte-code interpreter in CVM runs GC-unsafe most of the time for maximum efficiency, and offers GC points on certain byte-codes to ensure that each thread can suspend itself in bounded time on a global GC request.
GC points in the CVM interpreter are:
The choice of method invocation points as GC points is due to the state of the interpreter stack when a GC occurs. Since each frame on the stack refers to a method that has stopped, naturally, at a method invocation point, it makes sense to make invocation sites GC points. So when GC walks the interpreter stack frames looking for roots, it can readily find frame pointers into the heap.
The choice of backwards branches as additional GC points in the interpreter ensures that the currently executing method of each thread is guaranteed to become GC-safe within a bounded amount of time: each method will either loop by a backwards branch, or hit a method invocation within a bounded amount of time.
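The polling scheme described above can be sketched with a toy interpreter loop. This is a minimal illustration, not CVM code: the opcodes, the Op structure, and run_until_gc_point() are all hypothetical names, and a pending GC request is modelled as a plain flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy interpreter; these are not CVM byte-codes. */
typedef enum { OP_NOP, OP_INVOKE, OP_GOTO, OP_HALT } OpCode;

typedef struct {
    OpCode code;
    int    target;   /* branch target for OP_GOTO, unused otherwise */
} Op;

/*
 * Run from pc until the thread reaches a GC point while a GC request is
 * pending, or until the program halts. Returns the pc where it stopped.
 * Only method invocations and backward branches are GC points.
 */
int run_until_gc_point(const Op *prog, int pc, int gc_requested, int *steps)
{
    assert(prog != NULL && steps != NULL);
    for (;;) {
        const Op *op = &prog[pc];
        (*steps)++;
        switch (op->code) {
        case OP_HALT:
            return pc;
        case OP_INVOKE:
            if (gc_requested) return pc;   /* GC point: invocation */
            pc++;
            break;
        case OP_GOTO:
            if (op->target <= pc && gc_requested)
                return pc;                 /* GC point: backward branch */
            pc = op->target;
            break;
        default:
            pc++;                          /* GC-unsafe: no polling here */
            break;
        }
    }
}

/* A loop with no invocations at all: NOP, NOP, GOTO 0. */
int example_loop_stop_pc(void)
{
    static const Op loop[] = {
        { OP_NOP, 0 }, { OP_NOP, 0 }, { OP_GOTO, 0 }
    };
    int steps = 0;
    return run_until_gc_point(loop, 0, 1, &steps);
}
```

Even though the loop never invokes a method, the thread stops at the backward branch (pc 2) after a bounded number of steps, which is the property the paragraph above relies on.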
CVM code outside of the interpreter such as the system class loader and verifier runs mostly GC-safe, as opposed to the byte-code interpreter. So GC can occur alongside classloading, for example.
And finally, CVM's implementation of the Java Native Interface (JNI) allows
all native methods to run GC-safe, except when they access the Java heap. So
native methods can tolerate GC until they call JNI functions that access the
heap. At such heap access points the CVM JNI implementation makes the caller
thread temporarily GC-unsafe while it accesses the heap.
CVM is designed to allow a GC author to write a new GC without changing a
line of the VM code itself. This is achieved by careful engineering of the VM
code to separate memory system related functions from the rest of the VM, and to
identify clear interfaces between the memory system and the VM.
Entry points to the memory system from the VM are clearly defined. Also,
GC-algorithm-independent code is separated from the GC-algorithm-dependent code
very carefully. The former abstracts away the details of many common GC tasks
from the GC author, and is available as a set of routines for the GC author to
use. The latter is designed to be an interface that needs to be implemented by a
GC author to provide GC functionality.
The separation of the VM from the memory interface is achieved by extensive
use of internal interfaces that are built hierarchically.
The VM needs to access the heap:
Direct heap access is achieved by using the direct
memory interface. Indirect heap access is achieved using the indirect memory interface, built on top of the
direct memory interface. And native method heap access is achieved through a JNI
implementation built on top of the indirect memory interface. Therefore all heap
access in the system is guaranteed to eventually go through the direct memory
interface.
In order to achieve a memory interface that can accommodate as many GC
algorithms as possible, it is necessary to allow the implementation of read-
and write-barriers. A read or write barrier of a certain data type
<T> is a GC-supplied callback to be invoked on every read or
write of a heap location of type <T>. Barrier use varies widely
between GC algorithms; the main idea is that barriers are used to ensure
consistency between a running program and the garbage collector, especially when
the collector does not handle the whole heap on every GC call. Examples of such
GC's are generational, incremental and concurrent collectors.
CVM implements support for read and write barriers below the direct memory
interface implementation, so that they are not visible to the VM author.
They are called implicitly and automatically by the implementation of the direct
memory interface, and are therefore incurred on all heap access in the system.
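As a rough sketch of this layering, the fragment below shows a direct-memory-style accessor that runs a GC-supplied write barrier right before the store. Everything here is an assumption made for illustration: the Obj layout, the remembered set, and the DIRECT_WRITE_REF macro are hypothetical and do not reproduce the CVM direct memory interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object layout; real CVM objects differ. */
typedef struct Obj {
    struct Obj *field[2];
} Obj;

/* Hypothetical remembered set, as a generational GC might keep. */
#define MAX_REMEMBERED 16
static Obj  **remembered[MAX_REMEMBERED];
static size_t remembered_count = 0;

/* GC-supplied write barrier: record the slot that is about to be written. */
static void write_barrier_ref(Obj *obj, Obj **slot, Obj *rhs)
{
    (void)obj;
    assert(slot != NULL);
    if (rhs != NULL && remembered_count < MAX_REMEMBERED)
        remembered[remembered_count++] = slot;
}

/*
 * Accessor in the style of a direct memory layer: the barrier is called
 * implicitly, so code using the accessor never sees it.
 */
#define DIRECT_WRITE_REF(obj, idx, rhs)                         \
    do {                                                        \
        write_barrier_ref((obj), &(obj)->field[(idx)], (rhs));  \
        (obj)->field[(idx)] = (rhs);                            \
    } while (0)

/* Usage: one non-NULL store and one NULL store; one slot gets recorded. */
size_t barrier_demo(void)
{
    static Obj parent, child;
    remembered_count = 0;
    DIRECT_WRITE_REF(&parent, 0, &child);
    DIRECT_WRITE_REF(&parent, 1, NULL);
    return remembered_count;
}
```

Because the barrier lives inside the accessor, swapping in a different GC only changes write_barrier_ref(); callers of the accessor stay untouched.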
The VM also needs to:
Heap initialization/teardown and object allocation are the main entry points
to GC code from the VM. All allocation and GC activity in the system is
triggered by a call from the VM or a native method into the object allocator.
The object allocator encapsulates GC policy, and is responsible for initiating
GC when it is required.
There are other entry points that the VM uses to cross over to the GC side.
However, these are usually triggered by a matching native call to request GC
action:
There are certain activities that all GC's will have to perform, regardless
of algorithm. CVM separates those routines into a shared GC interface called
gc_common. Such common GC activities include:
There are certain activities that are GC-algorithm specific. CVM separates
those routines into a GC-implementation specific GC interface called
gcimpl. The routines and macros in this interface need to be implemented
by the GC author. Such GC calls are responsible for:
Object allocation, and any subsequent GC action, is initiated by the VM
calling into the gcimpl object allocation routine, and is performed by
going back and forth between shared and GC-implementation specific code.
So a GC cycle is initiated by a call into CVMgcimplAllocObject(),
and is completed through calls back and forth between the shared and
algorithm-specific interfaces.
Writing a new GC for CVM is accomplished by implementing the gcimpl
interface. This section outlines GC and relevant CVM source organization, talks
about the CVM data types that the GC author needs to know about, and outlines in
detail the gcimpl routines that need to be implemented. It also talks
about shared GC routines available to the GC author to make a CVM GC possible.
Pluggable GC
Separating Memory System from VM
Shared Memory System Code
The details of such activities are
abstracted in the implementation of the gc_common interface, and are
available as GC services for the GC author to use. These routines and
macros will be outlined in detail below, in section Writing
a New GC.
GC-specific Memory System Code
The gcimpl routines
will be called by the VM at appropriate points to ensure the correct GC
execution flow. These routines and macros will be outlined in great detail
below, in section Writing a New GC.
GC Execution Flow
Writing a new GC
Source Organization
Array type | Element type |
CVMArrayOfByte | CVMJavaByte |
CVMArrayOfShort | CVMJavaShort |
CVMArrayOfChar | CVMJavaChar |
CVMArrayOfBoolean | CVMJavaBoolean |
CVMArrayOfInt | CVMJavaInt |
CVMArrayOfRef | CVMObjectICell |
CVMArrayOfFloat | CVMJavaFloat |
CVMArrayOfLong | CVMTwoJavaWords |
CVMArrayOfDouble | CVMTwoJavaWords |
The element type of CVMTwoJavaWords for the long and double cases is defined as:
typedef CVMJavaVal32 CVMTwoJavaWords[2];
Any array can be cast to CVMArrayOfAnyType if the aim is to access array header elements only.
Since GC can assume single-threaded execution, it is free to override the second word of an object header, assuming that it reconstructs it before threads are resumed. The second header word frequently has a trivial, well known default value -- it can be tested for triviality to determine if an overriding GC routine needs to save away the original contents of the word:
/* The default trivial contents of the various32 word */
constant CVMUint32 CVM_OBJECT_DEFAULT_VARIOUS_WORD
/* Is a various32 word trivial?
* (i.e. can just be set to CVM_OBJECT_DEFAULT_VARIOUS_WORD after
* GC) */
CVMBool CVMobjectTrivialClassWord(CVMUint32 word)
The complete set of operations on an object the GC author can call are given below in section 3.4.4.
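The save-only-if-non-trivial idea for the second header word can be sketched as follows. This is an illustration under stated assumptions: the Header layout, the side table, and the value chosen for DEFAULT_VARIOUS_WORD are hypothetical, and the real constant and test are the CVM ones named above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value, for illustration only; CVM defines the real default. */
#define DEFAULT_VARIOUS_WORD 0x2u

typedef struct {
    uint32_t class_word;
    uint32_t various32;
} Header;

/* Hypothetical side table for header words that are not trivial. */
typedef struct { Header *obj; uint32_t saved; } SavedWord;
#define MAX_SAVED 8
static SavedWord saved_words[MAX_SAVED];
static size_t    saved_count = 0;

static int word_is_trivial(uint32_t w) { return w == DEFAULT_VARIOUS_WORD; }

/* Overwrite the second header word with GC data, saving the old contents
 * only when they cannot be rebuilt from the default. Returns 0 if the
 * side table is full, 1 otherwise. */
int gc_claim_header(Header *h, uint32_t gc_data)
{
    assert(h != NULL);
    if (!word_is_trivial(h->various32)) {
        if (saved_count == MAX_SAVED) return 0;
        saved_words[saved_count].obj   = h;
        saved_words[saved_count].saved = h->various32;
        saved_count++;
    }
    h->various32 = gc_data;
    return 1;
}

/* Reconstruct every header before threads are resumed. */
void gc_restore_headers(Header *all, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) all[i].various32 = DEFAULT_VARIOUS_WORD;
    for (i = 0; i < saved_count; i++)
        saved_words[i].obj->various32 = saved_words[i].saved;
    saved_count = 0;
}

/* Usage: two objects, only one of which has a non-trivial word. */
size_t header_demo(void)
{
    Header objs[2] = { { 1u, DEFAULT_VARIOUS_WORD }, { 1u, 0xABCDu } };
    size_t saved;
    saved_count = 0;
    gc_claim_header(&objs[0], 111u);
    gc_claim_header(&objs[1], 222u);
    saved = saved_count;
    gc_restore_headers(objs, 2);
    return (objs[0].various32 == DEFAULT_VARIOUS_WORD &&
            objs[1].various32 == 0xABCDu) ? saved : 99;
}
```

Only the object with the non-trivial word costs a side-table entry, which is why the triviality test is worth making.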
A CVM garbage collector is written by implementing a set of gcimpl
functions. This section outlines in detail the functions and macros that
must be implemented, and also the set of barriers, any subset of
which may be implemented.
For basic GC execution, and for interfacing with the VM, the GC
implementation must implement the following data types and functions.
To start out with, define struct CVMGCGlobalState in
src/share/javavm/include/gc/<gcname>/gc_config.h.
This should include any global state the GC would like to maintain which
non-GC code might wish to access. In the current state of CVM, no such
details are communicated through CVMGCGlobalState. This may
change in the future.
Now the GC author should implement the following functions.
CVM allows a GC author to define read and write barriers as required by a
given GC algorithm. This is done by including a series of #define's in
src/share/javavm/include/gc/<gcname>/gc_config.h. The
implementation of the barrier for a data type <T> is called
implicitly by the appropriate direct memory layer macro corresponding to
<T>; the barriers are not visible to VM authors.
Note that in all the barriers listed below, the type Ref refers to
any reference type, including objects of all classes and arrays.
Appropriate type checking of assignments is done by the rest of the VM; all
reference types are equal by the time they trickle down to the barrier layer.
The default implementation of a barrier is empty. Therefore, the GC
author should only #define the barriers that he/she needs.
The read and write barriers are separated according to data type. The names
are self explanatory. The code for a read or write barrier is executed right
before the actual read or write takes place.
All barriers take as arguments a pointer to the head of the object being
read or written, as well as the address of the slot being accessed. Write
barriers take an additional argument that is the value being written.
The read barrier for reference typed array or object slots:
The list of read barriers for non-reference types, of size 32 bits or less:
The read barrier for 64-bit slots, for Java long and double:
The write barrier for reference typed array or object slots:
The list of write barriers for non-reference types, of size 32 bits or less:
The write barrier for 64-bit slots, for Java long and double:
Important notes:
The barriers above are defined on a slot by slot basis. On some GC's, this
may prove to be inefficient when large blocks of data need to be moved at once.
For block moves, optional block readers and block writers may be
defined. These would have to perform the read or write, and batch the barriers.
If a GC chooses not to override these, the memory system invokes the
element-wise barriers for each element of the array move.
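To make the trade-off concrete, here is a small sketch comparing an element-wise write path with a block writer that batches its barrier work. It assumes a card-marking barrier purely for illustration: the IntArray layout, the card table, and the function names are hypothetical and are not part of the gcimpl interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define CARD_SHIFT 2                     /* hypothetical: 4 elements per card */
#define NUM_ELEMS  16
#define NUM_CARDS  (NUM_ELEMS >> CARD_SHIFT)

typedef struct {
    int32_t  elems[NUM_ELEMS];
    uint8_t  cards[NUM_CARDS];           /* hypothetical per-array card table */
    unsigned barrier_calls;              /* counts barrier work, for the demo */
} IntArray;

/* Element-wise barrier: dirty the card that covers one slot. */
static void write_barrier_int(IntArray *arr, size_t idx)
{
    arr->cards[idx >> CARD_SHIFT] = 1;
    arr->barrier_calls++;
}

/* Default path: barrier, then store, for every element. */
void array_write_elementwise(const int32_t *buf, IntArray *arr,
                             size_t start, size_t len)
{
    size_t i;
    assert(start + len <= NUM_ELEMS);
    for (i = 0; i < len; i++) {
        write_barrier_int(arr, start + i);
        arr->elems[start + i] = buf[i];
    }
}

/* Block writer: one memmove, then one pass over the affected cards. */
void array_write_block(const int32_t *buf, IntArray *arr,
                       size_t start, size_t len)
{
    size_t card, last;
    assert(start + len <= NUM_ELEMS);
    if (len == 0) return;
    memmove(&arr->elems[start], buf, len * sizeof(int32_t));
    last = (start + len - 1) >> CARD_SHIFT;
    for (card = start >> CARD_SHIFT; card <= last; card++) {
        arr->cards[card] = 1;
        arr->barrier_calls++;
    }
}

/* Usage: write 8 elements at index 2 and report the barrier work done. */
unsigned block_demo_calls(int use_block)
{
    IntArray arr;
    int32_t  buf[8] = { 1, 2, 3, 4, 5, 6, 7, 8 };
    memset(&arr, 0, sizeof arr);
    if (use_block) array_write_block(buf, &arr, 2, 8);
    else           array_write_elementwise(buf, &arr, 2, 8);
    return arr.barrier_calls;
}
```

Both paths dirty the same three cards, but the element-wise path performs eight barrier invocations where the block writer performs three card updates.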
Here's a list of various block operations, any subset of which may be
overridden.
where jType is the appropriate Java type (e.g. CVMJavaInt
or CVMJavaLong).
In the section 2.2 Shared Memory System Code we have
mentioned the gc_common interface. This section outlines the various
components of that interface available to the GC author. You can find the
interface in src/share/javavm/include/gc_common.h.
When an object allocator decides to GC (most probably due to an allocation
failure), it has to make sure that the system is stopped in a GC-safe way. In
CVM, this is accomplished by using CVMgcStopTheWorldAndGC():
When all threads are stopped at GC-safe points, the GC will need to scan all
GC roots. This is accomplished by using CVMgcScanRoots(), and a
GC-specific callback function:
So the GC author defines a callback function that takes the address of a
reference-containing slot as argument, along with an opaque data
argument. CVMgcScanRoots(ee, gcOpts, refCallback, refCallbackData)
calls (*refCallback)(refPtr, refCallbackData) on every discovered root
address refPtr.
The memory system guarantees that both the following conditions hold
when the callback routine is called:
The roots scanned are JNI local and global references, Java stack locations,
Java local variables, Java static variables, and CVM-internal data structures.
The details are abstracted from the GC author.
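The callback contract can be illustrated with a stand-in root scanner. This is only a sketch: scan_roots() and the Obj type are hypothetical, and the real CVMgcScanRoots() walks VM data structures rather than a flat array. The callback type has the same shape as CVMRefCallbackFunc.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj { int marked; } Obj;   /* hypothetical object type */

/* Same shape as CVMRefCallbackFunc, with a stand-in object type. */
typedef void (*RefCallback)(Obj **refAddr, void *data);

/*
 * Stand-in root scan: visit every root slot, skipping slots that hold
 * NULL, so the callback can rely on refAddr != NULL and *refAddr != NULL.
 */
void scan_roots(Obj **roots, size_t n, RefCallback cb, void *data)
{
    size_t i;
    assert(cb != NULL);
    for (i = 0; i < n; i++)
        if (roots[i] != NULL)
            cb(&roots[i], data);
}

/* Example callback: mark the object and count it via the opaque data. */
static void mark_root(Obj **refAddr, void *data)
{
    size_t *visited = data;
    (*refAddr)->marked = 1;
    (*visited)++;
}

/* Usage: three root slots, one of them empty. */
size_t root_scan_demo(void)
{
    Obj    a = { 0 }, b = { 0 };
    Obj   *roots[3];
    size_t visited = 0;
    roots[0] = &a; roots[1] = NULL; roots[2] = &b;
    scan_roots(roots, 3, mark_root, &visited);
    return (a.marked && b.marked) ? visited : 0;
}
```

The opaque data pointer is what lets one callback serve several purposes; here it carries a counter, while a real collector might pass its own state structure.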
Weak references are only discovered and queued up if
gcOpts.discoverWeakReferences is CVM_TRUE before a call to
CVMgcScanRoots(). So the GC author typically makes the first root scan
with weak reference discovery enabled, and then disables weak reference
discovery by setting gcOpts.discoverWeakReferences to
CVM_FALSE.
The root scanning operation may be performed multiple times on each GC cycle.
However, note that each root scan cycle, including the first one, should be
preceded by a call to CVMgcClearClassMarks():
This function is responsible for clearing the mark bits on dynamically loaded
classes. The mark bits are used to prevent infinite recursion and redundant work
on class scanning. If they are not cleared between successive root scans, the GC
might end up skipping important class roots like Java statics.
Note that CVMgcClearClassMarks() is a separate function since a root
scan cycle may include more than just calling CVMgcScanRoots(). For
example, a generational GC may call CVMgcScanRoots() to discover system
roots, CVMgcScanSpecial() to discover special object roots, and then
scan pointers recorded by a write barrier. So CVMgcClearClassMarks()
should be called before each such list of root scans, during which class
scanning state should be kept.
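The effect of stale class marks can be shown with a small model. The Class structure and the scan functions below are hypothetical stand-ins; they only mimic the behaviour described above, where a marked class is skipped together with its statics.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    int marked;         /* set once the class has been scanned */
    int static_roots;   /* number of static reference slots (stand-in) */
} Class;

/* Scan a class at most once per root-scan cycle; returns roots reported. */
static int scan_class(Class *c)
{
    if (c->marked) return 0;    /* already scanned: skip its statics */
    c->marked = 1;
    return c->static_roots;
}

void clear_class_marks(Class *classes, size_t n)
{
    size_t i;
    assert(classes != NULL);
    for (i = 0; i < n; i++) classes[i].marked = 0;
}

/* One root-scan cycle over objects that refer to their class by index. */
int scan_cycle(Class *classes, const int *obj_class, size_t n_objs)
{
    size_t i;
    int roots = 0;
    for (i = 0; i < n_objs; i++)
        roots += scan_class(&classes[obj_class[i]]);
    return roots;
}

/* Usage: returns the number of static roots found by a second cycle. */
int class_marks_demo(int clear_between)
{
    Class classes[2]   = { { 0, 3 }, { 0, 2 } };
    int   obj_class[4] = { 0, 0, 1, 0 };
    (void)scan_cycle(classes, obj_class, 4);          /* first cycle */
    if (clear_between) clear_class_marks(classes, 2);
    return scan_cycle(classes, obj_class, 4);         /* second cycle */
}
```

With the marks cleared, the second cycle finds all five static roots again; without clearing, it finds none, which is the failure mode the text warns about.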
Java 2 language and libraries have features that require special scanning
support from CVM. In particular, the following requires special handling:
The CVM memory system hides the GC-scanning details of these special features
from the GC author by the use of a narrow interface consisting of two functions.
If the GC author calls these two functions at the right points in GC code,
all special scanning is performed automatically.
The idea with special scanning is to garbage collect entries from out-of-heap
tables pointing into the heap. If we declared the tables as GC roots, they would
automatically be kept alive and their entries would never be collected. Instead,
we do the special objects scanning in conjunction with information from the
garbage collector to figure out which entries of the special objects may be
discarded, and which entries need to be kept.
So from the CVM point of view, GC involves two conceptual points where
special scan APIs should be inserted:
For this point, the appropriate special scan routine is
CVMgcProcessSpecialWithLivenessInfo().
For this point, the appropriate special scan routine is
CVMgcScanSpecial().
The full API for point #1:
The liveness test is done using a predicate isLive of type:
The full API for point #2:
Basic Execution
struct CVMGCGlobalState {
....
};
/* Initialize GC global state if required */
CVMUint32 CVMgcimplInitGlobalState(CVMGCGlobalState* globalState)
/* Initialize the heap, with a given minimum and maximum heap size
in bytes. Return CVM_TRUE on success, CVM_FALSE otherwise. */
CVMBool CVMgcimplInitHeap(CVMGCGlobalState* globalState,
CVMUint32 minBytes,
CVMUint32 maxBytes)
/*
* Allocate uninitialized heap object of size numBytes
* This is called by the VM code on every allocation.
*/
CVMObject* CVMgcimplAllocObject(CVMExecEnv* ee, CVMUint32 numBytes)
/*
* Perform GC.
*
* This routine is called by the common GC code after all locks are
* obtained, and threads are stopped at GC-safe points. It's the GC
* routine that needs a snapshot of the world while all threads are
* stopped (typically at least a root scan).
*
* The goal to free is 'numBytes' bytes.
*/
void CVMgcimplDoGC(CVMExecEnv* ee, CVMUint32 numBytes)
/* Teardown routines */
/*
* Destroy GC global state
*/
void CVMgcimplDestroyGlobalState(CVMGCGlobalState* globalState);
/*
* Destroy heap. CVM_TRUE on success, CVM_FALSE otherwise.
*/
CVMBool CVMgcimplDestroyHeap(CVMGCGlobalState* globalState);
/*
* Return the number of bytes free in the heap.
*/
CVMUint32 CVMgcimplFreeMemory(CVMExecEnv* ee)
/*
* Return the amount of total memory in the heap, in bytes.
*/
CVMUint32 CVMgcimplTotalMemory(CVMExecEnv* ee)
/* The time-stamp of the last full GC of the heap, in order to
* support the implementation of sun.misc.GC.maxObjectInspectionAge().
* This should return the value of CVMtimeMillis() obtained on the last
* GC performed.
*/
CVMInt64 CVMgcimplTimeOfLastMajorGC();
/* Heap iteration support */
/*
* Per-object callback to call during iteration
*/
typedef void (*CVMObjectCallbackFunc)(CVMObject* obj, CVMClassBlock* cb,
CVMUint32 objSize, void* data);
/*
* CVMgcimplIterateHeap should traverse all objects on the heap and
* call 'cback' on each object, with its class, size and generic
* 'data'.
*
* If the heap consists of contiguous range(s), use CVMgcScanObjectRange()
*/
void CVMgcimplIterateHeap(CVMExecEnv* ee,
CVMObjectCallbackFunc cback, void* data)
/*
* A per-object callback function, to be called during heap dumps
*/
typedef void (*CVMObjectCallbackFunc)(CVMObject* obj, CVMClassBlock* cb,
CVMUint32 objSize, void* data);
/*
* Heap dump support: Iterate over a contiguous-allocated
* range of objects.
*/
void CVMgcScanObjectRange(CVMExecEnv* ee,
CVMUint32* base, CVMUint32* top,
CVMObjectCallbackFunc callback,
void* callbackData);
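Iterating a contiguous range amounts to stepping from one object to the next by its size. The sketch below assumes a toy layout in which the first word of each object holds its size in words; that layout, and the names scan_object_range() and ObjCallback, are assumptions for illustration only, since CVM obtains object sizes through its own macros.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy layout: word 0 of each object is its size in words, header included. */
typedef void (*ObjCallback)(uint32_t *obj, uint32_t sizeWords, void *data);

/* Walk [base, top) object by object, calling cb on each one. */
void scan_object_range(uint32_t *base, uint32_t *top,
                       ObjCallback cb, void *data)
{
    uint32_t *cur = base;
    assert(base <= top);
    while (cur < top) {
        uint32_t size = cur[0];
        if (size == 0) break;   /* malformed range: avoid looping forever */
        cb(cur, size, data);
        cur += size;
    }
}

/* Example callback: count objects and total words. */
static void count_obj(uint32_t *obj, uint32_t sizeWords, void *data)
{
    uint32_t *stats = data;
    (void)obj;
    stats[0] += 1;
    stats[1] += sizeWords;
}

/* Usage: three objects of 2, 4 and 3 words packed into 9 words. */
uint32_t range_demo(void)
{
    uint32_t heap[9]  = { 2, 0, 4, 0, 0, 0, 3, 0, 0 };
    uint32_t stats[2] = { 0, 0 };
    scan_object_range(heap, heap + 9, count_obj, stats);
    return stats[0] * 100 + stats[1];
}
```

A heap made of several contiguous ranges would simply call the range scanner once per range from its heap iteration routine.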
Barriers
void CVMgcimplReadBarrierRef(
CVMObject* objRef, CVMJavaObject** fieldLoc)
void CVMgcimplReadBarrierByte(
CVMObject* objRef, CVMJavaByte* fieldLoc)
void CVMgcimplReadBarrierBoolean(
CVMObject* objRef, CVMJavaBoolean* fieldLoc)
void CVMgcimplReadBarrierShort(
CVMObject* objRef, CVMJavaShort* fieldLoc)
void CVMgcimplReadBarrierChar(
CVMObject* objRef, CVMJavaChar* fieldLoc)
void CVMgcimplReadBarrierInt(
CVMObject* objRef, CVMJavaInt* fieldLoc)
void CVMgcimplReadBarrierFloat(
CVMObject* objRef, CVMJavaFloat* fieldLoc)
void CVMgcimplReadBarrier64(
CVMObject* objRef, CVMJava32* fieldLoc)
void CVMgcimplWriteBarrierRef(
CVMObject* objRef, CVMObject** fieldLoc, CVMObject* rhs)
void CVMgcimplWriteBarrierByte(
CVMObject* objRef, CVMJavaByte* fieldLoc, CVMJavaByte rhs)
void CVMgcimplWriteBarrierBoolean(
CVMObject* objRef, CVMJavaBoolean* fieldLoc, CVMJavaBoolean rhs)
void CVMgcimplWriteBarrierShort(
CVMObject* objRef, CVMJavaShort* fieldLoc, CVMJavaShort rhs)
void CVMgcimplWriteBarrierChar(
CVMObject* objRef, CVMJavaChar* fieldLoc, CVMJavaChar rhs)
void CVMgcimplWriteBarrierInt(
CVMObject* objRef, CVMJavaInt* fieldLoc, CVMJavaInt rhs)
void CVMgcimplWriteBarrierFloat(
CVMObject* objRef, CVMJavaFloat* fieldLoc, CVMJavaFloat rhs)
void CVMgcimplWriteBarrier64(
CVMObject* objRef, CVMJava64* fieldLoc, CVMJava64 rhsPtr)
Moving Array Data
memmove(&buf[0], &arr->elems[start],
len * sizeof(<jType>))
void CVMgcimplArrayReadBodyByte(
CVMJavaByte* buf, CVMArrayOfByte* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyBoolean(
CVMJavaBoolean* buf, CVMArrayOfBoolean* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyShort(
CVMJavaShort* buf, CVMArrayOfShort* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyChar(
CVMJavaChar* buf, CVMArrayOfChar* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyInt(
CVMJavaInt* buf, CVMArrayOfInt* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyFloat(
CVMJavaFloat* buf, CVMArrayOfFloat* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyRef(
CVMJavaObject** buf, CVMArrayOfRef* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyLong(
CVMJavaVal32* buf, CVMArrayOfLong* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayReadBodyDouble(
CVMJavaVal32* buf, CVMArrayOfDouble* arr,
CVMUint32 start, CVMUint32 len)
memmove(&arr->elems[start], &buf[0], len * sizeof(<jType>))
void CVMgcimplArrayWriteBodyByte(
CVMJavaByte* buf, CVMArrayOfByte* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyBoolean(
CVMJavaBoolean* buf, CVMArrayOfBoolean* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyShort(
CVMJavaShort* buf, CVMArrayOfShort* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyChar(
CVMJavaChar* buf, CVMArrayOfChar* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyInt(
CVMJavaInt* buf, CVMArrayOfInt* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyFloat(
CVMJavaFloat* buf, CVMArrayOfFloat* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyRef(
CVMJavaObject** buf, CVMArrayOfRef* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyLong(
CVMJavaVal32* buf, CVMArrayOfLong* arr,
CVMUint32 start, CVMUint32 len)
void CVMgcimplArrayWriteBodyDouble(
CVMJavaVal32* buf, CVMArrayOfDouble* arr,
CVMUint32 start, CVMUint32 len)
memmove(&dstArr->elems[dstIdx], &srcArr->elems[srcIdx],
len * sizeof(<jType>))
void CVMgcimplArrayCopyByte(
CVMArrayOfByte* srcArr, CVMUint32 srcIdx,
CVMArrayOfByte* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyBoolean(
CVMArrayOfBoolean* srcArr, CVMUint32 srcIdx,
CVMArrayOfBoolean* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyShort(
CVMArrayOfShort* srcArr, CVMUint32 srcIdx,
CVMArrayOfShort* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyChar(
CVMArrayOfChar* srcArr, CVMUint32 srcIdx,
CVMArrayOfChar* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyInt(
CVMArrayOfInt* srcArr, CVMUint32 srcIdx,
CVMArrayOfInt* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyFloat(
CVMArrayOfFloat* srcArr, CVMUint32 srcIdx,
CVMArrayOfFloat* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyRef(
CVMArrayOfRef* srcArr, CVMUint32 srcIdx,
CVMArrayOfRef* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyLong(
CVMArrayOfLong* srcArr, CVMUint32 srcIdx,
CVMArrayOfLong* dstArr, CVMUint32 dstIdx, CVMUint32 len)
void CVMgcimplArrayCopyDouble(
CVMArrayOfDouble* srcArr, CVMUint32 srcIdx,
CVMArrayOfDouble* dstArr, CVMUint32 dstIdx, CVMUint32 len)
What to Call?
Initiating a GC
/*
* Initiate a GC. Acquire all GC locks, stop all threads, and then
* call back to the particular GC to do the work. When the particular
* GC is done, resume.
*
* Returns CVM_TRUE on success, CVM_FALSE if GC could not be run.
*/
CVMBool CVMgcStopTheWorldAndGC(CVMExecEnv* ee, CVMUint32 numBytes)
This function stops the system in a GC-consistent way by acquiring all
system locks, and bringing all threads to GC-safe points. Then it calls the
entry point to the GC implementation, CVMgcimplDoGC() to do the actual
work. If GC work could not be performed, for example due to an out of memory
situation, CVMgcStopTheWorldAndGC() returns CVM_FALSE.
Root Scans
/*
* Scan the root set of collection
*/
void CVMgcScanRoots(CVMExecEnv* ee, CVMGCOptions* gcOpts,
CVMRefCallbackFunc callback, void* data)
where the callback function type is defined as:
/*
* A 'ref callback' called on each *non-NULL* discovered root
*/
typedef void (*CVMRefCallbackFunc)(CVMObject** refAddr, void* data)
refPtr != NULL
*refPtr != NULL
/*
* Clear the class marks for dynamically loaded classes.
*/
void CVMgcClearClassMarks(CVMExecEnv* ee, CVMGCOptions* gcOpts);
Special Root Scans
/*
* Process special objects with liveness info from a particular GC
* implementation. This covers special scans like string intern table,
* weak references and monitor structures.
*
* isLive - a predicate that returns true if an object is strongly
* referenced
*
* transitiveScanner - a callback that marks an object
* and all its children
*/
void CVMgcProcessSpecialWithLivenessInfo(CVMExecEnv* ee,
CVMGCOptions* gcOpts,
CVMRefLivenessQueryFunc isLive,
void* isLiveData,
CVMRefCallbackFunc transitiveScanner,
void* transitiveScannerData);
The transitiveScanner callback function type
CVMRefCallbackFunc is defined in 3.4.2 Root
Scans. The transitiveScanner should mark its parameter object
reference and all its children. The non-NULL CVMRefCallbackFunc
argument semantics hold for transitiveScanner (see guarantee)
/*
* A predicate to test liveness of a given reference
*/
typedef CVMBool (*CVMRefLivenessQueryFunc)(CVMObject** refAddr, void* data)
Here also, the non-NULL argument semantics hold for isLive (see
guarantee)
/*
* Scan and optionally update special objects. This covers special
* scans like string intern table, weak references and monitor
* structures.
*/
void CVMgcScanSpecial(CVMExecEnv* ee, CVMGCOptions* gcOpts,
CVMRefCallbackFunc updateRefCallback, void* data);
The updateRefCallback function type CVMRefCallbackFunc
is defined in 3.4.2 Root Scans. It's called for each
location that needs to be updated to a new address. The non-NULL
CVMRefCallbackFunc argument semantics hold for
updateRefCallback (see guarantee)
In summary, when GC calls CVMgcProcessSpecialWithLivenessInfo(),
each special object slot will be checked by (*isLive)() to see if GC
determined it to be live in the preceding reachability scan. Dead entries will
automatically be removed from the special object tables. Later on, GC will call
CVMgcScanSpecial(), which will cause all remaining live entries in the
special object tables to be updated with new pointers.
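The two-step protocol summarized above can be modelled on a single out-of-heap table. This is a sketch under assumptions: SpecialTable, process_special() and scan_special() are hypothetical stand-ins for the CVM routines, and forwarding pointers stand in for whatever address translation a moving GC keeps.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Obj { int marked; struct Obj *forward; } Obj;

typedef int  (*LivenessFunc)(Obj **refAddr, void *data);
typedef void (*RefCallback)(Obj **refAddr, void *data);

/* Stand-in for an out-of-heap table such as a string intern table. */
typedef struct { Obj *entry[4]; size_t count; } SpecialTable;

/* Step 1: drop entries whose referents the GC did not find live. */
void process_special(SpecialTable *t, LivenessFunc isLive, void *data)
{
    size_t i, kept = 0;
    assert(isLive != NULL);
    for (i = 0; i < t->count; i++)
        if (isLive(&t->entry[i], data))
            t->entry[kept++] = t->entry[i];
    t->count = kept;
}

/* Step 2: let the GC update each surviving entry to its new address. */
void scan_special(SpecialTable *t, RefCallback update, void *data)
{
    size_t i;
    for (i = 0; i < t->count; i++)
        update(&t->entry[i], data);
}

static int is_marked(Obj **refAddr, void *data)
{
    (void)data;
    return (*refAddr)->marked;
}

static void forward_ref(Obj **refAddr, void *data)
{
    (void)data;
    *refAddr = (*refAddr)->forward;
}

/* Usage: one dead and one live entry; the live one has been moved. */
int special_demo(void)
{
    Obj moved = { 0, NULL }, live = { 1, NULL }, dead = { 0, NULL };
    SpecialTable t;
    live.forward = &moved;
    t.entry[0] = &dead; t.entry[1] = &live; t.count = 2;
    process_special(&t, is_marked, NULL);
    scan_special(&t, forward_ref, NULL);
    return t.count == 1 && t.entry[0] == &moved;
}
```

Keeping the table out of the root set is what allows the dead entry to disappear; the second step then keeps the surviving entry valid after objects move.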
Given an object reference, GC must be able to find all pointers embedded in
the object, and perform an action on each pointer. This operation is very common
in all tracing GC's.
In CVM, object walking is performed using a macro, for maximum efficiency.
The object walker uniformly and automatically handles arrays of references and
objects. When it encounters an object of class C, it scans class C
as well. It also discovers weak references, and acts accordingly.
CVMobjectWalkRefsWithSpecialHandling() also calls
(*callback)(refPtr, data) on the address refPtr of every
reference-typed slot in the class data. The non-NULL CVMRefCallbackFunc
argument semantics hold for callback (see guarantee)
Note that refAction is a macro, whereas callback is a
function. This asymmetry is intentional for efficiency reasons:
refAction is going to be called for each slot of each object, whereas
callback is going to be called for each slot of each class. The former
is typically orders of magnitude more frequent than the latter.
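The macro-versus-function asymmetry is easier to see in a reduced form. The WALK_REFS macro and Obj type below are hypothetical and far simpler than the CVM walker; they only show how a statement argument is substituted inline with refPtr bound to each slot.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object with a small, fixed set of reference slots. */
typedef struct Obj {
    size_t      nrefs;
    struct Obj *refs[3];
} Obj;

/*
 * Macro walker: refAction is substituted inline for each reference slot,
 * with refPtr bound to the slot address. Slots may hold NULL, so the
 * action has to check for it.
 */
#define WALK_REFS(obj, refAction)                     \
    do {                                              \
        size_t i_;                                    \
        for (i_ = 0; i_ < (obj)->nrefs; i_++) {       \
            Obj **refPtr = &(obj)->refs[i_];          \
            refAction;                                \
        }                                             \
    } while (0)

/* Count the non-NULL children of an object without any function calls. */
size_t count_children(Obj *obj)
{
    size_t n = 0;
    assert(obj != NULL);
    WALK_REFS(obj, {
        if (*refPtr != NULL) n++;
    });
    return n;
}

/* Usage: an object with three slots, one of them NULL. */
size_t walk_demo(void)
{
    Obj a = { 0, { NULL, NULL, NULL } };
    Obj b;
    b.nrefs = 3;
    b.refs[0] = &a; b.refs[1] = NULL; b.refs[2] = &a;
    return count_children(&b);
}
```

One practical caveat with this style: a comma that is not enclosed in parentheses inside the statement argument is treated as a macro argument separator, so the action should keep its commas inside parenthesized expressions such as function calls.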
This set of macros is responsible for accessing the information on an object
given a direct object reference. The set of operations are separated into a few
distinct categories. Arrays and objects automatically receive separate
treatment.
This section outlines a very simple allocator and collector written for CVM.
Let's call it markcompact, for a mark-sweep-compact collector. The
outline here only pertains to implementation for the GC interface, so there is
no detail about the actual compaction process.
First off, there should be a gc_config.h file describing the GC:
src/share/javavm/include/gc/markcompact/gc_config.h:
And then, there should be a gc_impl.c file implementing the GC. Note
that setting the build option CVM_GCCHOICE=markcompact would build this
GC automatically.
src/share/javavm/runtime/gc/markcompact/gc_impl.c:
Object Walking
void CVMobjectWalkRefsWithSpecialHandling(
CVMExecEnv* ee,
CVMGCOptions* gcOpts,
CVMObject* obj,
CVMUint32 firstHeaderWord,
C-statement refAction,
CVMRefCallbackFunc callback,
void* data)
So given an ee and a gcOpts,
CVMobjectWalkRefsWithSpecialHandling() scans object obj with
the first header word firstHeaderWord. It executes (more like
"substitutes", given that this is a macro) the statement refAction on
every embedded object reference. refPtr is a special variable within
the body of refAction, pointing to the object slot being scanned. The
object slot may contain NULL, so refAction must be prepared to
deal with that.
Per-object data
/* Get the size of an object or array instance */
CVMUint32 CVMobjectSize(CVMObject* obj)
/* Get the size of an object or array instance given its class*/
CVMUint32 CVMobjectSizeGivenClass(CVMObject* obj, CVMClassBlock* cb)
/* Get the first word of the object header as integer */
CVMUint32 CVMobjectGetClassWord(CVMObject* obj)
/* Set the first word of the object header as integer */
void CVMobjectSetClassWord(CVMObject* obj, CVMUint32 word)
/* Get the class of a given object */
CVMClassBlock* CVMobjectGetClass(CVMObject* obj)
/* CVM_TRUE iff an object is read-only and not on the heap */
CVMBool CVMobjectIsInROM(CVMObject* obj)
/* CVM_TRUE iff an object is "marked". */
CVMBool CVMobjectMarked(CVMObject* obj)
/* Clear mark on an object */
void CVMobjectClearMarked(CVMObject* obj)
/* Set mark on an object */
void CVMobjectSetMarked(CVMObject* obj)
Example Collector
#ifndef _INCLUDED_MARKCOMPACT_GC_CONFIG_H
#define _INCLUDED_MARKCOMPACT_GC_CONFIG_H
#include "javavm/include/gc/gc_impl.h"
/*
* The following header could include any mark-compact specific
* declarations.
*/
#include "javavm/include/gc/markcompact/markcompact.h"
/*
* Barriers in this implementation
*/
#define CVMgcimplWriteBarrierRef(directObj, slotAddr, rhsValue) \
/* \
* Do nothing. Just an example. \
*/
/*
* Global state specific to GC
*/
struct CVMGCGlobalState {
/* Nothing here */
};
#endif /* _INCLUDED_MARKCOMPACT_GC_CONFIG_H */
#include "javavm/include/defs.h"
#include "javavm/include/objects.h"
#include "javavm/include/classes.h"
#include "javavm/include/directmem.h"
/*
* This file is generated from the GC choice given at build time.
*/
#include "generated/javavm/include/gc_config.h"
/* The shared GC interface */
#include "javavm/include/gc_common.h"
/* And the specific GC interface */
#include "javavm/include/gc/gc_impl.h"
/*
* The main allocation entry point
*/
/*
* Allocate uninitialized heap object of size numBytes.
* GC "policy" encapsulated here.
*/
CVMObject*
CVMgcimplAllocObject(CVMExecEnv* ee, CVMUint32 numBytes)
{
CVMObject* allocatedObj;
/* Actual allocation detail hidden here */
allocatedObj = tryAlloc(numBytes);
if (allocatedObj == NULL) {
/* GC and re-try allocation */
if (CVMgcStopTheWorldAndGC(ee, numBytes)) {
/* re-try if GC occurred */
allocatedObj = tryAlloc(numBytes);
}
}
return allocatedObj;
}
/*
* The main GC point, which CVM calls after ensuring GC-safety of all
* threads.
*
* This is a mark-sweep-compact GC, with most details of the sweep and
* compaction hidden.
*
* The GC uses three callback functions. These are detailed below,
* after CVMgcimplDoGC().
*/
void
CVMgcimplDoGC(CVMExecEnv* ee, CVMUint32 numBytes)
{
CVMGCOptions gcOpts;
/* Set default GC options */
gcOpts.fullGC = CVM_TRUE;
gcOpts.allClassesAreRoots = CVM_FALSE;
/* The mark phase includes discovering weak references */
gcOpts.discoverWeakReferences = CVM_TRUE;
/*
* Clear the class marks, as required before every root scan cycle,
* then scan all roots. markTransitively will mark a root and all its
* "children". Its 'data' argument is the GC options. A more
* complicated callback could pass a pointer to a struct into
* the callback function.
*/
CVMgcClearClassMarks(ee, &gcOpts);
CVMgcScanRoots(ee, &gcOpts, markTransitively, &gcOpts);
/*
* Don't discover any more weak references.
*/
gcOpts.discoverWeakReferences = CVM_FALSE;
/*
* At this point, we know which objects are live and which are not.
* Do the special objects processing.
*/
CVMgcProcessSpecialWithLivenessInfo(ee, &gcOpts, isLive, NULL,
markTransitively, &gcOpts);
/*
* Now that all the unreferenced special objects have been deleted,
* add the rest to our root set.
*/
CVMgcScanSpecial(ee, &gcOpts, markTransitively, &gcOpts);
/* The sweep phase. This phase computes the new addresses of each
object and writes them on the side. Details hidden. */
sweep();
/* Update the roots again, by looking up old -> new address
translations and writing them out. Class marks are cleared first,
as before every root scan cycle. */
CVMgcClearClassMarks(ee, &gcOpts);
CVMgcScanRoots(ee, &gcOpts, updateRoot, NULL);
CVMgcScanSpecial(ee, &gcOpts, updateRoot, NULL);
/* And update all interior pointers. Details hidden */
scanObjectsInHeap(ee, &gcOpts, updateRoot, NULL);
/* Finally we can move objects, and reset object marks. Compaction
details hidden. */
compact();
}
/*
* The liveness predicate, for use in special objects scanning.
*/
static CVMBool
isLive(CVMObject** refPtr, void* data)
{
CVMObject* ref;
CVMassert(refPtr != NULL);
ref = *refPtr;
CVMassert(ref != NULL);
/*
* ROM objects are always live
*/
if (CVMobjectIsInROM(ref)) {
return CVM_TRUE;
}
/* Is object marked? It's live then. */
return CVMobjectMarked(ref);
}
/*
* The transitive object marker. Marks a given object and all its
* "children".
* This is recursive, for simplicity. A production GC should really
* not be recursive.
*/
static void
markTransitively(CVMObject** refPointer, void* data)
{
CVMGCOptions* gcOpts = (CVMGCOptions*)data;
CVMObject* ref = *refPointer;
CVMUint32 classWord = CVMobjectGetClassWord(ref);
/*
* ROM objects are always live
*/
if (CVMobjectIsInROM(ref)) {
return;
}
CVMobjectSetMarked(ref);
/*
* Now handle all the "children". The walker takes the first header
* word of the object.
*/
CVMobjectWalkRefsWithSpecialHandling(CVMgetEE(), gcOpts, ref, classWord, {
CVMObject* thisRef = *refPtr;
if (thisRef != NULL) {
if (!CVMobjectMarked(thisRef)) {
markTransitively(refPtr, data);
}
}
}, markTransitively, data);
}
/*
* Update a root with the new address of an object
*/
static void
updateRoot(CVMObject** refPtr, void* data)
{
CVMObject* ref = *refPtr;
/*
* ROM objects are not on the heap
*/
if (CVMobjectIsInROM(ref)) {
return;
}
*refPtr = lookupNewAddress(ref); /* Details hidden. Update root. */
}
Copyright © 2000
Sun Microsystems, Inc.
All Rights Reserved.
Please send comments to: jcp-cdc-comments@eng.sun.com |