Background
When I was processing some data using VBA, I noticed that if ~100k class objects were added into a Dictionary (Microsoft Scripting Runtime), the script would stuck for several seconds at the end. If the objects quantity were increased to ~160k, it will stuck for 10+ sec.
Cause
After a few tests and searches on the Internet, I found the issue was most probably caused by GC (garbage collection) of VBA. The GC mechanism of VBA is that whenever the reference count of an object drops to zero, the object is destroyed. However, due to unknown reason, such mechanism seems to exhibit a time complexity of O(n²). That is, if the object quantity increases, the GC time will increase drastically.
Relationship between Time & Object #
To verify the issue, I wrote a small piece of code to cram certain quantity of myClass objects into a Collection (result as above; code at the end of article). From the chart, it is obvious that while the creating time is linear to object # (blue line), the destroying time is quadratic to object #. When object # reaches 300k, creating only takes 0.8 sec, but destroying takes near 12 sec.
Other findings
Per further test, other findings include:
- This issue only exists for class object, but not for data type (e.g. integer, string, UDT, etc.). This is because data type are not objects and the GC does not depend on "zero reference count".
- When the container is an object (e.g. Dictionary, Collection), the issue can be stably reproduced. When the container is an array, the reproduction of issue is not stable.
- The issue only exists if the class objects are initialized. It doesn’t exist if the object is only declared. This can be easily understood: if the object is not initialized, no GC is involved.
Workaround
As the issue is too foundamental, I could not find any good solutions. Workarounds include:
- Avoid using Class. Use UDT (user-defined type) instead. However, there are too many demands not achievable with UDT.
- Only initialize the objects where necessary (i.e. avoid Cartesian product of dimensions with high correlation).
- Use other languages (such as Power Query M).
Appendix
Test code as follows.
Option Explicit
Sub TestScrrunClassSpeed()
Dim c As New Collection 'Scripting.Collection
Dim rgIO As Range 'Range for input/output
Dim arrIO As Variant 'Temporary array for input/output
Dim i& 'Iterator
Dim testnr%, qty& 'The # and object quantity of each test
Dim t!(0 To 2) 'The start time, create time and destroy time
'Read parameter from sheet: column A defines object quantity of each test
Set rgIO = Sheets("Result").Range("A2:C31")
arrIO = rgIO.Value
'Do each test
For testnr = LBound(arrIO) To UBound(arrIO)
'Read object quantity and prepare
qty = CLng(arrIO(testnr, 1))
t(0) = Timer
'Create class objects
For i = 1 To qty
c.Add New myClass
Next i
t(1) = Timer
'Clear class objects
Set c = Nothing
t(2) = Timer
'Calculate the time
arrIO(testnr, 2) = (t(1) - t(0)) * 1000
arrIO(testnr, 3) = (t(2) - t(1)) * 1000
Next testnr
'Write result to sheet
rgIO.Value = arrIO
End Sub