How to be GC-safe in CVM

How to be GC-safe in CVM

Table of Contents

Introduction

In CVM, garbage collection may be running at any time during program execution, searching for program state or changing object addresses behind your back. Therefore, you should be very careful whenever you want to access and change Java objects from native code.

The indirect memory interface allows you to safely manipulate objects from native code. The calls that make up the indirect memory interface operate on pointers to ICells (indirection cells), which are non-moving locations in memory holding object references. ICells must be registered with GC, so they can be found and updated when a GC occurs. Registered ICells may be local roots, or global roots.

The implementation of the indirect memory interface makes use of a per-thread GC-safety flag. Each indirect memory call on an ICell marks the caller thread as GC-unsafe, manipulates the Java object encapsulated by the ICell, and marks the thread as GC-safe again. Threads that are marked GC-unsafe cannot tolerate GC until they are marked GC-safe again. GC is only allowed to proceed if all threads are GC-safe. Use of the indirect interface in conjunction with registered ICells makes your C code safe from garbage collection, and makes the garbage collector aware of your Java object use.

Please refer to the direct memory interface and the the indirect memory interface for reference.

Living with ICells

ICells and the indirect memory interface form the foundation of the exactness architecture of CVM. Therefore it is critical to understand the various ways these calls can be used to ensure GC-safety.

Working on an exact system is different than working on a conservative system. In a conservative system, the garbage collector scans the native stacks and native registers, searching for values that look like pointers. So in order to keep a heap object alive, it is sufficient to keep around references to it in registers or stack locations, and GC will find them.

In an exact system, all locations holding pointers to heap objects must be known to the garbage collector. There are two types of such known locations:

  1. Implicitly registered locations: The garbage collector has a default root scan, and certain well-known VM data structures are considered to be default roots. Among this group are class statics, string intern tables, JNI tables holding global Ref's, and others. All the references in these data structures are considered implicitly registered with GC.
  2. Explicitly registered locations: Any ICells that are not included in the default root scan of GC must be explicitly registered, either by the local roots mechanism or the global roots mechanism.

ICell types

ICells encapsulate direct heap object references. Heap objects may be regular Java objects or arrays. There are different ICell types to express each. ICells for all non-array object types are declared as CVMObjectICell. Arrays of different Java types have corresponding ICell types. For basic type <T>, the right ICell type is CVMArrayOf<T>ICell.

Here's the list:

    CVMObjectICell         ocell;   /* for CVMObject references         */

    CVMArrayOfByteICell    acellb;  /* for CVMArrayOfByte references    */
    CVMArrayOfShortICell   acells;  /* for CVMArrayOfShort references   */
    CVMArrayOfCharICell    acellc;  /* for CVMArrayOfChar references    */
    CVMArrayOfBooleanICell acellz;  /* for CVMArrayOfBoolean references */
    CVMArrayOfIntICell     acelli;  /* for CVMArrayOfInt references     */
    CVMArrayOfRefICell     acellr;  /* for CVMArrayOfRef references     */
    CVMArrayOfFloatICell   acellf;  /* for CVMArrayOfFloat references   */
    CVMArrayOfLongICell    acelll;  /* for CVMArrayOfLong references    */
    CVMArrayOfDoubleICell  acelld;  /* for CVMArrayOfDouble references  */

Since ICells contain references that may be manipulated by GC, their referents should be set, nulled, and assigned to one another using calls from the indirect memory interface (see ICell manipulations from the indirect memory interface reference). Their values should only be passed around as ICell* to ensure GC-safety. So given ocell1 and ocell2 of type CVMObjectICell:

    CVMObjectICell* ocell1;
    CVMObjectICell* ocell2;
    CVMExecEnv*     ee = CVMgetEE();
    CVMBool res;

    <... make sure ocell1 and ocell2 point to registered ICells. They could be
    local roots or global roots, for example. See below ...>

    CVMID_icellSetNull(ee, ocell1);
    CVMID_icellSetNull(ee, ocell2);

    CVMassignDirectReferenceTo(ee, ocell1);
    CVMID_icellIsNull(ee, ocell1, res);
    if (!res) {
        /* Assign the referent of ocell1 to the referent of ocell2 */
        CVMID_icellAssign(ee, ocell2, ocell1); 
    }
In the example above, the only values passed around are pointers to ICells. Any assignment to the encapsulated direct object reference of an ICell (as assignDirectReferenceTo() does) must happen in a GC-unsafe region, created in the body of the implementation of CVMID_icellAssign(). GC-unsafe regions are to be explained in the section on GC-safety of threads.

Explicitly Registered Roots

Heap object references that are not part of the default root scan of garbage collection need to be explicitly registered with the collector. There are two separate mechanisms for explicit registration:
  1. Local roots: These are short-lived values, like local variables. They are allocated and deallocated in a stack-like fashion. The interface for using them is geared towards fast allocation and deallocation, and does not allow out-of-order deallocation.
  2. Global roots: These are long-lived values, like global variables. The program can obtain registered global root locations through the global roots API. Global roots may be created and destroyed out-of-order.

Declaring and Using Local Roots

Local roots are an efficient way of declaring, registering and unregistering ICells of local scope. They are typically used to hold relatively short-lived values; think of them as GC-registered local variables. Also note that local roots are thread-local; they are created, used and discarded in the same thread.

The use pattern is the following:

    //
    // Start a local root block, passing in the current 'ee'
    // (execution environment), which contains per-thread information.
    //
    CVMID_localrootBegin(ee); {
        CVMID_localrootDeclare(Type1ICell, var1);
        CVMID_localrootDeclare(Type2ICell, var2);
	//
        // use var1 and var2 as Type1ICell* and Type2ICell*
        // respectively
	//
        // do NOT leave the block without executing
        // CVMID_localrootEnd()!
	//	
    } CVMID_localrootEnd();

Since local roots occur more often (dynamically) than global roots, the interface for using local roots is optimized for allowing stack-like fast allocation and deallocation. Conceptually:

  1. CVMID_localrootBegin() marks the beginning of a scope containing a list of CVMID_localrootDeclare() calls.
  2. The implementation keeps track of CVMID_localrootDeclare() calls. For each, it allocates and registers a local ICell and declares an ICell pointer to that registered ICell.
  3. When the programmer is done with the local roots in the scope, he/she calls CVMID_localrootEnd(), which discards all allocated local roots in that scope.

Note that it is important to call CVMID_localrootEnd() when leaving a local root scope; this call discards all registered local roots declared since the last CVMID_localrootBegin(). Also note that CVMID_localrootBegin() and CVMID_localrootEnd() may nest arbitrarily.

Here is a good example to illustrate local root use:

Example:

You want to call an allocating operation that is possibly a few functions deep. Therefore you want the caller to declare a local root, and pass its corresponding ICell* as a result argument to the operation. This keeps the allocated object safe from garbage collection the moment it is stored in the result argument. When the operation is complete, the caller can unregister the local root.

The following creates a Java string from a Utf8 string. It is an inlined (fast) version of the String constructor. It uses two local roots for temporary values, and discards them after a String has been successfully created and assigned to a result ICell.

void CVMmakeStringFromUtf8(CVMUtf8* chars, CVMObjectICell* result) {
    CVMID_localrootBegin(); {
	// Two local roots to be used as temporaries
	CVMID_localRootDeclare(CVMObjectICell,      string);
	CVMID_localRootDeclare(CVMArrayOfCharICell, theChars);

	CVMJavaInt length;

	// Make the string object
	CVMID_objectNewInstance   (CVMjavaLangStringClassblock, string);

	// .. . and the chars array
	// Pass the local root in to receive the resulting char[]
	CVMmkArrayOfCharFromUtf8  (chars, theChars);
	
	CVMID_arrayGetLength      (theChars, length);

	//
	// Assign the values of the string
        //
	CVMID_fieldWriteRef(string,
	                    CVM_offsetOf_java_lang_String_value,  
	                    theChars);

	CVMID_fieldWriteInt(string, 
	                    CVM_offsetOf_java_lang_String_length, 
	                    length);

	CVMID_fieldWriteInt(string, 
	                    CVM_offsetOf_java_lang_String_offset, 
	                    0);

	// We write the result back to the result ICell. 
	CVMID_icellAssign  (result, string);

	// We can now discard the local roots, assuming 'result' was
	// a pointer to a registered ICell.
    } CVMID_localrootEnd();
A possible caller of this may be the constant resolution code, resolving a constant pool entry of type CONSTANT_String. The result ICell may be the actual constant pool slot, which is updatable by GC when it scans class information. (In other words, the constant pool slot for a String constant is an implicitly registered ICell).

So the call would be something like:

    void CVMresolveStringConstant(CVMConstantPool* cp,
                                  CVMJavaShort strIdx,
			          CVMJavaShort utf8Idx)
    {
        CVM_CLASS_RESOLUTION_LOCK();
            CVMID_icellSetNull(&cp.entries[strIdx].str);
            //
            // Mark it as being resolved. This way, no thread can
            // yet use this c.p. entry; however GC can scan it
            // if necessary.
            //
            CVMcpSetBeingResolved(cp, strIdx);
            CVMmakeStringFromUtf8(cp.entries[utf8Idx],
	                          &cp.entries[strIdx].str);
            CVMcpSetResolved(cp, strIdx);
        CVM_CLASS_RESOLUTION_UNLOCK();
    }

Declaring and Using Global Roots

Global root registration allows for declaring, registering and unregistering ICells of global scope. Global roots are typically used to hold long-lived values that are to be included in the GC root scan; think of them as GC-registered global variables.

The use pattern is the following:

    //
    // Part of CVMglobals
    //
    struct CVMGlobalState {
        ....
        CVMObjectICell*      globalRoot1;
        CVMObjectICell*      globalRoot2;
	....
    }
    ...
    void CVMinitThisModule()
    {
        CVMglobals.globalRoot1 = CVMID_getGlobalRoot();
	CVMglobals.globalRoot2 = CVMID_getGlobalRoot();
	...
    }
    ...
    void CVMuseThisModule()
    {
	// globalRoot1 and globalRoot2 may safely be used as
	// ICell* arguments to CVMID_ operations.
	CVMID_objectNewInstance(CVMclassJavaLangStringClassblock,
	                        CVMglobals.globalRoot2);
	CVMID_icellAssign(CVMglobals.globalRoot1, CVMglobals.globalRoot2);
    }
    ...
    void CVMexitThisModule()
    {
        CVMID_freeGlobalRoot(CVMglobals.globalRoot1);
        CVMID_freeGlobalRoot(CVMglobals.globalRoot2);
    }
Any long-lived ICell declaration should be registered as a global root. These include C structure fields and global variables.

Here are some examples:

Example 1:

Registering an ICell referred to by a C struct. There may be long-lived C structures in the system with heap object references, like a C hash table with a Java array as the list of values. Such a table existed in JavaOS. The declaration for that hash table in CVM would be:

typedef struct CVMStrIDhash {
    < ... Other hashtable fields ...>
    CVMArrayOfRefICell* params;	/* param table, if needed */
} CVMStrIDhash;
where the params array is declared as an ICell* holding an array of references. We would allocate these StrIDhash nodes as follows:
/* Create a hash table of the specified size */
static CVMStrIDhash *
CVMcreateHash(int sizeInBytes)
{
    CVMStrIDhash *h;
    h = (StrIDhash *)CVMCcalloc(1, sizeInBytes); 
    if (h != NULL) { 
        CVMinitNode(h);
	//
	// Register and null out the value
	//
	h->params = CVMID_getGlobalRoot();
    }
    return h;
}
After registration, h->params may be used as a registered ICell* parameter to other CVMID_ operations. So to allocate the params array:
CVMBool
CVMmkParams(CVMStrIDhash* hash, int size)
{
    CVMArrayOfRefICell* params = hash->params;
    CVMID_newArrayOfRef(CVMjavaLangObjectClassblock, size, params);
    if (CVMID_icellIsNull(params)) {
        return CVM_FALSE; // Allocation failed
    } else {
        return CVM_TRUE;
    }
}

Example 2:

Here is another example of a C structure that contains a Java pointer. This declaration is from the CVMClassblock structure. Whenever a new CVMClassblock is allocated, the ICell* typed fields are initialized to point to fresh global roots:
    struct CVMClassblock {
        ...
        CVMObjectICell* classLoader;
        ...
    };
    typedef struct CVMClassblock CVMClassblock;

    ...
    CVMClassblock*  class    = (CVMClassblock*)CVMCcalloc(1, sizeof(CVMClassblock));

    //
    // Get a new, nulled global root to hold a classloader reference
    //
    class->classLoader       = CVMID_getGlobalRoot();

    //
    // Make a new ClassLoader instance, and assign it to its location
    // in class 'class'.
    //
    CVMID_objectNewInstance(CVMglobals.javaLangClassLoaderClassblock, class->classLoader);

When the CVMClassblock is freed, all its registered global roots must be freed first:

void CVMclassUnload(CVMClassblock* class)
{
    //
    // Free all class-related data structures
    //  
    ...
    //
    // Now get rid of the global roots
    //
    CVMID_freeGlobalRoot(class->classLoader);

    // And finally the Classblock itself
    CVMCfree(class);
}

Example 3:

Registering well-known values. Let's assume that we want to have global instances of the java.lang.Class versions of some Java classes. Note that CVM would not necessarily do this, since Classblocks are not allocated on the heap, but it is a good example for global roots.
    //
    // Part of CVMglobals
    //
    struct CVMGlobalState {
        ....
        CVMObjectICell* classJavaLangObject;
        CVMObjectICell* classJavaLangString;
	....
    }
    ...
    CVMinitVM()
    {
        /* Allocate and null out global roots */
        CVMglobals.classJavaLangObject = CVMID_getGlobalRoot();
        CVMglobals.classJavaLangString = CVMID_getGlobalRoot();

	// ... and they lived happily ever after
	CVMfindSystemClass("java/lang/Object", CVMglobals.classJavaLangObject);
	CVMfindSystemClass("java/lang/String", CVMglobals.classJavaLangString);
    }

GC-safety of threads

Each thread in CVM has a flag called the GC-safety flag. Whenever a thread performs an operation that manipulates heap objects directly, it is marked as GC-unsafe. If another thread initiates a GC at this time, all threads must be rolled to GC-safe points in order for GC to proceed safely.

The byte-code interpreter typically works in a GC-unsafe manner to allow for efficient direct access to heap objects. To bound the time the thread remains GC-unsafe, backwards branches, method calls, and method returns are designated as GC-safe points. At those points each thread polls for GC. If there is a GC request, the thread suspends itself to rendezvous with all the other threads rolling forward to their GC points. Execution continues after GC.

The implementation of the indirect memory interface marks the caller thread GC-unsafe while it is manipulating object references directly. These are typically very simple operations, and result in only a small window of GC-unsafety.

CVM also allows arbitrary sets of operations to proceed in GC-unsafe regions. These operations should be bounded in execution time, and are not allowed to block.

To see the full set of GC-safety operations, see GC-safety of threads from the direct memory interface reference.

GC-atomic blocks

If you want GC disallowed while executing a certain set of operations, use:

    CVMD_gcUnsafeExec(ee,
	<... gc-unsafe code ...>
    )
where ee is a pointer to the execution environment (CVMExecEnv) of the current thread.

The GC-unsafe code may not block, perform I/O, or otherwise take too long to execute, in order to keep the time GC is diabled to a minimum.

When writing GC-unsafe code, extreme care must be taken to avoid calls to arbitrary library routines. These may take too long to execute, or grab platform locks that might end up blocking. The example of malloc() comes to mind. So make sure you become GC-safe before making such a call (see Offering a GC-safe point below).

Direct pointers to objects may be used within the unsafe block; however, you should make sure that all direct values are written back into registered ICells before exiting the gcunsafe block.

Use this model rarely and with great care. In some cases, if you really feel it must be used, then maybe it is because it would be useful to add new functions to the CVMID_ interface. Then we might choose to do that, and avoid CVMD_gcUnsafeExec().

Example 1:

Let's say that you want to use two direct memory accesses consecutively without the overhead of being GC-unsafe around each. Here's how you would do that:

CVMObjectICell* cell1;
CVMObjectICell* cell2;
CVMJavaInt      val1, val2;

...
< Assume cell1 and cell2 point to registered ICells >
...
CVMD_gcUnsafeExec(ee, {
    CVMObject* o1 = CVMID_icellDirect(cell1);
    CVMObject* o2 = CVMID_icellDirect(cell2);

    // Read the third integer field of
    // each object.
    CVMD_fieldReadInt(o1, 2, val1);
    CVMD_fieldReadInt(o2, 2, val2);
} ); // End of gcunsafe region

Example 2:

Setting the referent of an ICell. The example below takes an ICell* for a char[] array, allocates a string, starts a GC-unsafe region, and calls out to initialize the string's fields. The ICell pointed to by resultString gets a reference to the allocated string before the GC-unsafe region is exited.

Note that this sort of long GC-unsafe region is intended as an example only; this style should only be used in performance critical points, where direct accesses help make the code faster.

void CVMmakeString(CVMArrayOfCharICell* theChars,
                   CVMObjectICell* resultString) {
    CVMID_localrootBegin(ee); {
        CVMID_localrootDeclare(CVMObjectICell, tempString);
        // Allocate a String
	CVMID_objectNewInstance(CVMjavaLangStringClassblock, tempString);

        CVMD_gcUnsafeExec(ee, {
	    CVMObject*      str   = CVMID_icellDirect(tempString);
	    CVMArrayOfChar* chars = CVMID_icellDirect(theChars);
	    CVMJavaInt      a_len;  

	    CVMD_arrayGetLength(chars, a_len);
	    CVMinitializeDirectStringRef(str, chars, a_len, 0);

	    // Set the referent of the ICell that resultString points to
	    CVMID_icellSetDirect(resultString, str);
	} );
    } CVMID_localrootEnd();
}

Example 3:

Performing an operation that modifies a data structure that is a default GC root. In the example below CVMicellList[] is a thread-local list of free ICells, and is a default GC root. CVMgetICell() allocates ICells from the list. All assigned ICells from the list are scanned by GC during the root scan. The operation needs to disable GC; otherwise a GC scan might find the ICell list in an inconsistent state.

CVMObjectICell CVMicellList[];
CVMUint32      CVMicellListPtr;

CVMObjectICell* CVMgetICell()
{
    CVMObjectICell* ret;
    CVMD_gcUnsafeExec(ee, {
        ret = &CVMicellList[icellListPtr++];
	*((CVMUint32*)ret) = 0;
    } );
    return ret;
}
Without CVMD_gcUnsafeExec(), garbage values might be scanned as live as soon as icellListPtr is incremented, and before the returned ICell is initialized to 0.

Offering a GC-safe point

Code that is GC-unsafe for long segments must periodically offer a GC-safe point. For example, the interpreter runs in a GC-unsafe way, manipulating direct pointers, etc., but at backwards branches, at method calls, and maybe other points, it must offer to be GC-safe to bound the time from a GC request to a GC cycle start. Also, long running operations like data structure expansions or lengthy computations must offer to be GC-safe occasionally. And finally, the VM must offer GC-safe points before doing potentially blocking OS calls like dynamic linker resolution for natives, acquiring locks, or I/O in order to make sure there are no blocked, GC-unsafe threads in the system.

The standard pattern of doing this is to use the CVMD_gcSafeExec() or the CVMD_gcSafeCheckPoint() macros. For details, refer to GC-safety of threads from the direct memory interface reference.

CVMD_gcSafeCheckPoint() is used to offer a GC-safe point for operations that will definitely not block:

    CVMD_gcSafeCheckPoint(
        ee,
        {
            <Save your state for possible GC>
        },
        {
            <Restore your state after possible GC>
        }
    );

CVMD_gcSafeExec() is used to offer a GC-safe point for operations that might block: Here, the state saving operation and the potentially blocking operation are separated:

    CVMD_gcSafeExec(
        ee,
        {
            <Save your state for possible GC>
        },
        {
            <Do potentially blocking operation>
	},
        {
            <Restore your state after possible GC>
        }
    );

Note that in the case of CVMD_gcSafeExecMayBlock(), the state saving and restoration are executed unconditionally. They are included in the definition of CVMD_gcSafeExecMayBlock() to associate the state saving and restoration operations with ensuring the GC-safety of the blocking operation.

At the end of one of these macros, the executing thread is once again GC-unsafe.

Example 1:

The interpreter becomes GC-safe on a backwards branch.

    <...>
    case opc_goto: {
        CVMJavaShort skip = CVMgetJavaShort(currentPc + 1);
        if (skip <= 0) {
	    CVMD_gcSafeCheckPoint(ee,
                {
		    CVM_DECACHE_INTERPRETER_STATE(currentFrame,
					          currentPc, 
					          currentSp);
		},
                {
                    // No reconstruction needed since Java code
		    // will not execute in this thread
		    // between DECACHE_... and GC.
                }
	    );
        }
        CVMexecuteNextInstructionAtOffset(skip);
    }
    <...>
In effect, the CVMD_gcSafeCheckPoint() operation polls a global variable to see if a GC is requested. If it is requested, then the state save operation occurs, GC is run, and the state is reconstructed. If no GC was requested, we go on. This is a very cheap way to create polling-based GC-safe points.

Example 2:

Blocking operations need to become GC-safe. Here's a tricky example: a two-part monitorenter operation. The first one gets access to the object, and checks to see if blocking is needed. If no blocking needed, there is no need to become GC-safe. If blocking is needed, we save our state, become GC-safe and block.

    ....
    case opc_monitorenter: {
        CVMObject* lockedObject;
	vmResult   result;

        lockedObject = STACK_OBJECT(-1);
	CHECK_NULL(lockedObject);
	result = CVMobjectTryLock(lockedObject); // Try to lock
	if (result == VM_LOCKED_WITHOUT_BLOCKING) {
	    //
	    // The uncontended case
	    // We have already succeeded locking
	    //
        } else {
	    //
	    // May now block.
	    // Save interpreter state, stash away locked object
	    // as a GC-root, and use monitorEnterMayblock() with
	    // the ICell version.
	    //
	    CVM_DECACHE_INTERPRETER_STATE(currentFrame,
					  currentPc, 
					  currentSpCached);
            CVMD_gcSafeExec(
	        ee,
	        {
                    CVM_DECACHE_INTERPRETER_STATE(currentFrame,
		                                  currentPc, 
					          currentSpCached);
	        },
		{
		    // Pass in the stack slot as the ICell*
		    CVMmonitorEnterMayblock(&STACK_OBJECT(-1));
		},
                {
                    // No reconstruction needed since Java code
		    // will not execute in this thread
		    // between CVM_DECACHE_... and GC.
                }
            )
        }
        if (CVMcheckForException() == VM_NO_POSTED_EXCEPTION) {
	    UPDATE_PC_AND_TOS_AND_CONTINUE(1, -1);
        } else {
	    CVMhandleException();
        }
    }

Example 3:

VM becomes GC-safe before making a call that might allocate from the C heap. This routine may end up taking a long time execute, or even worse, block on an OS lock. Therefore the VM needs to become GC-safe before making the call, in a way that allows blocking.

Here's an excerpt from the stack expansion code that becomes GC-safe before attempting to allocate from the C heap.

    CVMStackVal32*
    CVMexpandStack(CVMStack* obj, CVMUint32 capacity)
    {
        ....
	 * Must allocate new chunk. Take excess capacity into account.
	 */
	size = sizeof(CVMStackChunk) + 
	    sizeof(CVMStackVal32) * (capacity - CVM_MIN_STACKCHUNK_SIZE);

	newStackSize = s->stackSize + capacity;
	if (newStackSize > CVM_MAX_STACK_SIZE) {
	   ... throw exception ...
	}
	CVMD_gcSafeExec(ee,
			{}, /* save */
			{next = (CVMStackChunk*)CVMCmalloc(size);},
			{}); /* restore */
	....
    }