OpenFL + Starling Master Performance Issues

I’ve been running some performance tests of Starling vs OpenFL to confirm that I can get a comparable number of objects on screen with both engines. The demo runs fine, with the Windows build batching correctly across multiple draw calls. But when running the master, integrating just the source into the current version of OpenFL, it’s locked to a single draw call when using a Sprite.

I’m using a single texture, created at runtime from BitmapData, that is referenced by all generated Images. As far as I remember, that should be enough to get the performance boost and have the draw calls batched automatically, even if the container is a Sprite. Is there anything I should be paying attention to? I’ve scoured the demo to try and find differences, but I’m at a loss. (Though it might have been compiled with a previous version of OpenFL; I will need to look into this later.)
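For reference, here is a minimal sketch of the setup described above, not the actual test code. The class name SharedTextureScene, the 32x32 fill colour and the 320x480 spawn area are made up for illustration, and it assumes the class is used as the Starling root so that the context already exists when the texture is created.

```haxe
import openfl.display.BitmapData;
import starling.display.Image;
import starling.display.Sprite;
import starling.textures.Texture;

// Hypothetical root class: every Image shares one runtime-generated texture.
class SharedTextureScene extends Sprite {
    public function new(count:Int = 1000) {
        super();

        // One BitmapData, uploaded once as a single Starling texture.
        var data = new BitmapData(32, 32, false, 0xFF3366);
        var texture = Texture.fromBitmapData(data, false); // false = no mipmaps

        for (i in 0...count) {
            // Same texture reference for every Image, all in one container.
            var image = new Image(texture);
            image.x = Math.random() * 320;
            image.y = Math.random() * 480;
            addChild(image);
        }
    }
}
```

With neighbouring Images that share a texture and have no other state differences between them, my understanding is that Starling should batch them without needing an explicit MeshBatch.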

I have read about the MeshBatch, but I think it’s overkill to manage for what I’m trying to do, and it wasn’t used for the demo either, if I’m not mistaken.

Any thoughts? Is there something I’m missing, or has some functionality been deprecated?

I’m not sure I understand.

The Starling demo runs fine, but when running the master (same version of Starling, not the demo?) it’s locked to a single draw call (the other calls don’t render? it’s too slow?)

Previously compiled versions of the Starling demo (Starling 2.5.1) run fine. The last version I tested that worked was with OpenFL 8.9.1 and Lime 7.5.0, where I had over 63000 objects.

I just recompiled Windows / Debug with OpenFL 8.9.6 and Lime 7.7.0, and it maxes out at 5200. This appears to be specific to the Windows Debug build, though: with Windows / Final it batches correctly. Is this a known behaviour of the Windows Debug compilation? Are there any workarounds? Any ideas as to the possible cause?

What’s interesting is that OpenFL Tilemaps don’t experience the same issues.

I think Starling with -Ddebug may enable error checking on Stage3D, which is known to be slow.

If it was just the slowdown, I would understand, but it’s also not batching, which appears to be the root cause of the slowdown. In the Windows Final build, as it approaches 5200 objects, it increments the batch and doesn’t get a frame-rate drop, but in debug mode it maxes out the CPU and just stops without incrementing the batch. Even in separate stress tests, which don’t stop testing when the CPU drops, it doesn’t look like it’s batching correctly.

Take a look at these search results here:

This should match how Starling works in AS3, but it should help shed light on the differences between release and debug builds.

After trying to isolate the issue, I’ve been able to understand it better. Maybe someone will have some additional insight, so I will add my findings here.

I compared the Debug and Final builds. The size of the render area (340x480 and 800x60) and the target application FPS (building the application to run at 60 or 30 fps) seem to have no effect; performance stayed consistent.

Debug Build:

  • Larger HXCPP Executable 34.5 MB
  • Only Batches sprites to a single draw call (this is an observation)
  • Max Spawned Objects 3.7k

Final Build:

  • Smaller HXCPP Executable 9.6 MB
  • Batches Sprites in a single Container to a maximum of 3 draw calls
  • Max Spawned Objects 34.5k

What I did see in the compilation folder is a haxe/release.hxml in addition to the final.hxml and debug.hxml. Is this different, and how does one compile a release build?

Also, taking out the debug lines or setting them to false had no effect either:
_starling.enableErrorChecking = false;
_assets.verbose = false;
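
In case it helps anyone reproduce this, below is a rough sketch of how the two builds could be compared with the same settings. BuildProbe and configure are hypothetical names; enableErrorChecking and showStats are the existing Starling properties, and the #if debug block is plain Haxe conditional compilation, which I am assuming reflects the -debug flag of the build.

```haxe
import starling.core.Starling;

// Hypothetical helper: call once, right after the Starling instance is created.
class BuildProbe {
    public static function configure(instance:Starling):Void {
        // Same flag as in the lines above, forced off for every build type.
        instance.enableErrorChecking = false;

        // The stats overlay shows the draw-call count next to FPS and memory,
        // so batching can be compared between builds while objects spawn.
        instance.showStats = true;

        // Log which build type is actually running.
        #if debug
        trace("BuildProbe: compiled with -debug");
        #else
        trace("BuildProbe: compiled without -debug");
        #end
    }
}
```

Calling BuildProbe.configure(_starling) in both the Debug and the Final build should at least rule out the two flags above as the difference, leaving the build type itself as the variable.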