GC Performance Issue of VBA Class

Background

When I was processing some data using VBA, I noticed that if ~100k class objects were added into a Dictionary (Microsoft Scripting Runtime), the script would stuck for several seconds at the end. If the objects quantity were increased to ~160k, it will stuck for 10+ sec.

Cause

After a few tests and searches on the Internet, I found the issue was most probably caused by GC (garbage collection) of VBA. The GC mechanism of VBA is that whenever the reference count of an object drops to zero, the object is destroyed. However, due to unknown reason, such mechanism seems to exhibit a time complexity of O(n²). That is, if the object quantity increases, the GC time will increase drastically.

Relationship between Time & Object #
Relationship between Time & Object #

To verify the issue, I wrote a small piece of code to cram certain quantity of myClass objects into a Collection (result as above; code at the end of article). From the chart, it is obvious that while the creating time is linear to object # (blue line), the destroying time is quadratic to object #. When object # reaches 300k, creating only takes 0.8 sec, but destroying takes near 12 sec.

Other findings

Per further test, other findings include:

  • This issue only exists for class object, but not for data type (e.g. integer, string, UDT, etc.). This is because data type are not objects and the GC does not depend on "zero reference count".
  • When the container is an object (e.g. Dictionary, Collection), the issue can be stably reproduced. When the container is an array, the reproduction of issue is not stable.
  • The issue only exists if the class objects are initialized. It doesn’t exist if the object is only declared. This can be easily understood: if the object is not initialized, no GC is involved.

Workaround

As the issue is too foundamental, I could not find any good solutions. Workarounds include:

  • Avoid using Class. Use UDT (user-defined type) instead. However, there are too many demands not achievable with UDT.
  • Only initialize the objects where necessary (i.e. avoid Cartesian product of dimensions with high correlation).
  • Use other languages (such as Power Query M).

Appendix

Test code as follows.

Option Explicit
 
Sub TestScrrunClassSpeed()
    Dim c As New Collection     'Scripting.Collection
    Dim rgIO As Range           'Range for input/output
    Dim arrIO As Variant        'Temporary array for input/output
    Dim i&                      'Iterator
    Dim testnr%, qty&           'The # and object quantity of each test
    Dim t!(0 To 2)              'The start time, create time and destroy time
     
    'Read parameter from sheet: column A defines object quantity of each test
    Set rgIO = Sheets("Result").Range("A2:C31")
    arrIO = rgIO.Value
     
    'Do each test
    For testnr = LBound(arrIO) To UBound(arrIO)
        'Read object quantity and prepare
        qty = CLng(arrIO(testnr, 1))
        t(0) = Timer
         
        'Create class objects
        For i = 1 To qty
            c.Add New myClass
        Next i
        t(1) = Timer
         
        'Clear class objects
        Set c = Nothing
        t(2) = Timer
         
        'Calculate the time
        arrIO(testnr, 2) = (t(1) - t(0)) * 1000
        arrIO(testnr, 3) = (t(2) - t(1)) * 1000
    Next testnr
     
    'Write result to sheet
    rgIO.Value = arrIO
End Sub

Leave a Comment

Your email address will not be published. Required fields are marked *