HTML5 target performance issue

openfl 3.1.2
lime 2.4.8
platform: iPad 4 / safari

I’m getting some weird behavior in HTML5 target regarding CPU usage.

First of all, if I simply load up a blank openfl project (screen is black) it continuously burns 30% CPU. If I do the same test with just some basic JS that loads and shows an image, it uses 0%.

If I run my entire game, I get the following (odd) results:

  • I have a line of code that loads a list of images (png). All images are small on their own but there are over 100 of them.
  • I have another line of code that simply calls addChild( myBitmap ), which shows a full-screen image to the screen. This bitmap is not an image in the above-mentioned list of images that gets loaded (it is loaded before this step).
  • I have a small audio library that loads, which seems to permanently add 5% on top of all that (but I am neglecting this item going forward).

So here’s the curious part: if both of those lines are enabled, I get a continuous additional 50% CPU burn on top of that original 30%, so my CPU usage stays in the low 80s. If I comment out ONE of those lines (i.e., don’t load my list of images OR don’t render a single bitmap image), I get the 50% back! So if either one or both are commented out, I get 50% back. If both are present, it eats 50% CPU on what is a static splash screen consisting of a single background image and a single button image.

I methodically tested my own code in various systems to rule them out, and I’ve been trying to find the causal link here. Using the browser profiler, I see there is a lot of activity in render (which eventually triggers context.drawImage() ). It seems there is something that causes the entire canvas to constantly be invalidated and need to constantly update. I’m not seeing any early termination conditions in render to optimize performance.

To reiterate, if I show a bitmap and don’t load these files, there is a 0% CPU hit. If I load these files and don’t show a bitmap, there is also a 0% CPU hit. If I both load and show there is a 50% hit.

I noticed in the master branch on openfl’s github there have been some changes to BitmapData.draw and Bitmap, which are arguably the two classes triggering these performance issues. I also see calls from CairoTextField and CairoBitmap that are causing some performance hits.

So my question here is: is loading somehow displaying all my loaded materials offscreen, or could it be causing the canvas to be flagged as dirty? Why is the entire canvas redrawing with every update on a static screen?

Please let me know if I can be more helpful with testing or info. Maybe I can export the browser profile so you guys can load it up and inspect it. I’d love to help you all figure this out. I know I’m a little stumped since so much is happening in the background.

Thanks so much!!!

With OpenFL 3, we have been moving to combine code across all targets.

As we approach stable and broad feature support in the new codebase, I want to look deeper into performance on HTML5 again.

When I first wrote the new OpenFL renderer (for HTML5), I did have an optimization to render only if the stage was dirty. Perhaps this was lost. If you would like to use less CPU for a project that might remain idle, you might also want to consider using the -Ddom define when building, which does not blit to a fullscreen canvas by default.

Thanks, I’ll try the -Ddom define and see what happens.

It does seem to only render if the stage is dirty, however it appears the dirty flag (at least in my case) is sticking to true. If I don’t load the mentioned png files (but show my image) then it seems to work as one would expect.

Update: After poking around more in the code for DisplayObject and the canvas rendering package, it doesn’t seem that the __renderDirty flag is actually being used in any draw checks (it is set, but never read in any logic), so things are always redrawing on every update. requestAnimationFrame() gets triggered nonstop.

Ideally, we want requestAnimationFrame to trigger always, even if we will not render. This gives us a heartbeat, so we know that if we were to render, that would be our time to do it. Also, if you are running at full speed, it gives us updates to dispatch ENTER_FRAME and update any logic.
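To make the heartbeat idea concrete, here is a minimal sketch of a loop that runs its update step on every beat but only redraws when something has been flagged as changed. All names (FrameLoop, invalidate, tick) are hypothetical and are not OpenFL's internals. In a browser, tick() would be called from a requestAnimationFrame callback; here it is called by hand so the behavior is easy to follow.

```typescript
// Hypothetical sketch, not OpenFL's renderer: update every beat, draw only when dirty.
class FrameLoop {
  private dirty = true; // start dirty so the first frame is drawn
  private readonly update: () => void;
  private readonly render: () => void;
  updateCount = 0;
  renderCount = 0;

  constructor(update: () => void, render: () => void) {
    this.update = update;
    this.render = render;
  }

  // Call whenever something visible changes.
  invalidate(): void {
    this.dirty = true;
  }

  // One heartbeat. In a browser this would be the requestAnimationFrame callback.
  tick(): void {
    this.update(); // ENTER_FRAME-style logic runs on every beat
    this.updateCount++;
    if (!this.dirty) return; // nothing changed: skip the expensive redraw
    this.render();
    this.renderCount++;
    this.dirty = false;
  }
}

// Usage: three beats with one visible change in the middle.
const frameLoop = new FrameLoop(() => {}, () => {});
frameLoop.tick(); // first frame draws
frameLoop.tick(); // idle beat, update only
frameLoop.invalidate();
frameLoop.tick(); // draws again because of the change
```

The point of the split is that an idle splash screen still gets its heartbeat, but pays for a draw only after invalidate() has been called.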

Have you tried different browsers? Some don’t support requestAnimationFrame, so they have to fall back to a timer, which does burn CPU.

We can have requestAnimationFrame trigger all the time (or a timer), but the issue seems to be more that the entire canvas redraws every time it does trigger. We were trying to mess with __renderDirty in some logic to see if we can get the canvas to only redraw when it does need to update, and were successful on one object, however objects in front of that object did not draw. So our thoughts are that __renderDirty may need to be implemented in some various draw checks to get them to draw when necessary and not all the time. We’ll continue testing stuff and reporting what we find.
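One possible reading of the "objects in front did not draw" result is that skipping per object does not work well on a single canvas: if only the dirty object is redrawn, whatever sits above it is missed. A common alternative is to let a change bubble up to the root and then either redraw everything in order or skip the frame entirely. The sketch below illustrates that approach with a hypothetical SceneNode type; it is an assumption about how this could be structured, not a description of how OpenFL does it.

```typescript
// Hypothetical minimal display tree; names are illustrative, not OpenFL's.
class SceneNode {
  parent: SceneNode | null = null;
  children: SceneNode[] = [];
  dirty = true;
  label: string;

  constructor(label: string) {
    this.label = label;
  }

  add(child: SceneNode): void {
    child.parent = this;
    this.children.push(child);
    child.markDirty();
  }

  // A change anywhere bubbles up so the root knows a redraw is needed.
  markDirty(): void {
    for (let n: SceneNode | null = this; n !== null; n = n.parent) {
      n.dirty = true;
    }
  }
}

// If the root is dirty, redraw the whole tree back to front and clear the flags.
// Returns the draw order so the behavior can be inspected.
function renderIfDirty(root: SceneNode): string[] {
  const drawn: string[] = [];
  if (!root.dirty) return drawn; // static screen: no work at all
  const visit = (n: SceneNode): void => {
    drawn.push(n.label); // stand-in for the real draw call
    n.dirty = false;
    n.children.forEach(visit);
  };
  visit(root);
  return drawn;
}

// Usage: a splash screen with a background and a button.
const rootNode = new SceneNode("stage");
const backgroundNode = new SceneNode("background");
const buttonNode = new SceneNode("button");
rootNode.add(backgroundNode);
rootNode.add(buttonNode);

const firstPass = renderIfDirty(rootNode);  // everything draws once
const idlePass = renderIfDirty(rootNode);   // nothing changed, nothing draws
backgroundNode.markDirty();
const changedPass = renderIfDirty(rootNode); // background and button are both redrawn
```

This trades finer-grained savings for correctness: any change costs a full redraw, but a truly static screen costs nothing and layering stays intact.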

Thanks for replying so quick, Sing! I know you guys always have a lot on your plate.

User @pheres requested any info regarding progress made here, so just adding some helpful suggestions for better performance on the user side.

I believe the later OpenFL/Lime versions performed a little better than what I had originally reported, but to my knowledge the core issue mentioned above has not been resolved.

I did notice, however, that the larger the browser window, the lower the FPS. How noticeable this is depends on how powerful your computer is and how large your monitor is. So if you are seeing lag while playing your game, try making the browser window a little smaller (or even the game size if you can). You can also hit F11 to put your browser in fullscreen mode, which will also get rid of any lag. The lag only happens when you are running in a windowed browser at a large resolution (although each frame is still taking too long in render).

Try any other methods for optimizations - reduce the amount of new objects created (watch out for Bitmaps and Sounds!), recycle old ones, general performance principles will help out a bit. I’ve been able to get some HTML projects running at the set FPS on mobile devices, but the games weren’t that intensive anyway.

If anything else springs to mind, I’ll update this post.

A couple of other options would be a locked canvas size (allowing CSS to scale it up), or using -Dwebgl (which I’d like to make the default soon).
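As a rough sketch of the locked-canvas idea: the canvas keeps a fixed backing size, and only its CSS display size is computed from the window. The function and type names below are made up for illustration, and the numbers are just an example. In a page, the fixed size would go on canvas.width / canvas.height and the computed size on canvas.style.width / canvas.style.height.

```typescript
interface Size {
  width: number;
  height: number;
}

// Returns the CSS display size that fits `view` while keeping the aspect ratio
// of the fixed-size canvas. The backing store itself stays at `canvasSize`.
function fitLockedCanvas(canvasSize: Size, view: Size): { css: Size; scaleFactor: number } {
  const scaleFactor = Math.min(
    view.width / canvasSize.width,
    view.height / canvasSize.height,
  );
  return {
    css: {
      width: Math.round(canvasSize.width * scaleFactor),
      height: Math.round(canvasSize.height * scaleFactor),
    },
    scaleFactor,
  };
}

// Example: a 1024x768 game shown in a 2560x1440 window.
const lockedSize: Size = { width: 1024, height: 768 };
const windowSize: Size = { width: 2560, height: 1440 };
const fitted = fitLockedCanvas(lockedSize, windowSize);

// Pixels the canvas renderer has to fill per frame in each setup.
const lockedPixels = lockedSize.width * lockedSize.height;
const windowPixels = windowSize.width * windowSize.height;
const pixelRatio = windowPixels / lockedPixels;
```

With these example numbers the window-sized canvas has several times as many pixels to fill per frame as the locked one, which is consistent with FPS dropping as the window grows. The trade-off is that upscaled output looks softer.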

openfl 3.3.5
lime 2.6.4

I have a new question regarding HTML5 performance. It has come to my attention that there are hundreds of Canvas elements being created, which is consuming large amounts of RAM (~300 MB). What is the breakdown of the canvas process? I thought one of the big changes a while back was to not generate so many canvas elements and instead combine them effectively. Is there anything I can do on the implementation side (how things are parented, what types of objects are okay to use, etc.) to get openfl to generate fewer of them or combine them?

When you launch a very basic project (show one bitmap to the screen), I have previously seen ~200+ MB consumed in total. If I show that same bitmap via HTML image tag without openfl being involved, that drops to 0 MB.

It seems that this excess canvas count is the single largest memory hit and if we can resolve it, we can get openfl to run on par with pure JS games. Should I be investigating Lime more over OpenFL?

I also want to note that I am using sprite sheets for animations (currently testing whether these have an effect on canvas elements) and am creating lots of sprites (in which I hope each one is NOT its own canvas).

Thanks for any info here! @singmajesty

Currently, canvases are created…

  • When you use or
  • When you modify BitmapData
  • When you use a TextField

This behaves differently if you use -Ddom mode, which uses DIV for text

Are you using lots of graphics calls, or custom BitmapData objects?

Thanks for the response! Good to know.

When you say ‘use’, do you mean a new Canvas is constructed every time one is modified? So for instance, does messing with graphics draw calls or text field .text create lots of canvases, or is each object mapped to one Canvas?

I do have a large number of BitmapData sets triggered via the spritesheet (1.2.0) library and am utilizing spritesheets both for animation purposes and also as a sprite atlas (so basically for all of our games’ assets).

In Chrome dev tools, I see a ton of objects that are listed under “detached DOM Tree”, each one is reported to be a HTMLCanvasElement or a HTMLImageElement. I can map most of these to the Spritesheet library.

In my own source code I have been able to limit some of the stored Bitmaps and gain about 100 MB back (out of 500 MB used), so my team and I are looking into further modifications, possibly inside the Spritesheet library classes. But obviously we want to learn as much as we can about how and when canvases are utilized and find out how we can get the memory down under 100 MB overall. If we do resolve anything of importance, we will upstream the changes : )

I will do an audit for TextField and Graphics use as well, but those aren’t as heavy as BitmapData currently is.

Yep, when I say “use” in this case, I mean per-instance. It could be reduced by making one larger BitmapData, and perhaps we could modify the spritesheet library to draw sub-rectangles of that larger image rather than cutting into separate images (which is a fast way to do it on Flash, but maybe not ideal or needed elsewhere)
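Here is a rough sketch of the sub-rectangle approach, kept free of DOM types so it can run anywhere. Frames are stored as rectangles into one shared atlas, and drawing copies a region straight from that atlas instead of from a pre-cut image. The Blitter interface is a hypothetical stand-in that mirrors the 9-argument form of the canvas 2D context's drawImage; gridFrames and drawFrame are illustrative names and are not part of the spritesheet library.

```typescript
interface Rect {
  x: number;
  y: number;
  w: number;
  h: number;
}

// Minimal stand-in for the 9-argument CanvasRenderingContext2D.drawImage call.
interface Blitter<Img> {
  drawImage(
    img: Img,
    sx: number, sy: number, sw: number, sh: number,
    dx: number, dy: number, dw: number, dh: number,
  ): void;
}

// Describes a uniform-grid sheet as frame rectangles. No pixels are copied,
// so no per-frame image or canvas is needed.
function gridFrames(sheetW: number, sheetH: number, frameW: number, frameH: number): Rect[] {
  const frames: Rect[] = [];
  for (let y = 0; y + frameH <= sheetH; y += frameH) {
    for (let x = 0; x + frameW <= sheetW; x += frameW) {
      frames.push({ x, y, w: frameW, h: frameH });
    }
  }
  return frames;
}

// Draws one frame directly from the shared atlas at (dx, dy), unscaled.
function drawFrame<Img>(ctx: Blitter<Img>, atlas: Img, f: Rect, dx: number, dy: number): void {
  ctx.drawImage(atlas, f.x, f.y, f.w, f.h, dx, dy, f.w, f.h);
}

// Usage with a recording blitter instead of a real canvas context.
const blitCalls: number[][] = [];
const recorder: Blitter<string> = {
  drawImage(_img, sx, sy, sw, sh, dx, dy, dw, dh) {
    blitCalls.push([sx, sy, sw, sh, dx, dy, dw, dh]);
  },
};

const atlasFrames = gridFrames(256, 128, 64, 64); // 4 columns x 2 rows
const sixthFrame = atlasFrames[5];
if (sixthFrame) drawFrame(recorder, "atlas.png", sixthFrame, 10, 20);
```

In this layout the only large pixel buffer is the atlas itself; each frame is just four numbers.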

I think we can run a test on spritesheet and see what happens. Might not be that hard to try.

In the case of display objects in general, is there a better method available to combine them? Maybe have some finite number of Canvases and/or scenarios where a canvas is used and then group/compile display children to it rather than each have their own. I wonder if this would even have an impact or not. And if there’s a logical need to have every one of these elements be its own canvas.

Different BitmapData or Graphics instances (with content) will create separate canvas objects, but you can have multiple Bitmap instances with the same BitmapData and draw to only one canvas, so it’s not connected to how many Sprites or DisplayObjects you have, but just how it’s rendered :slight_smile:
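For anyone auditing their own project, a small sketch of that rule: count distinct pixel sources rather than display objects. PixelSource and BitmapNode are hypothetical stand-ins for BitmapData and Bitmap, and the count is only an approximation of canvas usage under the per-instance behavior described above.

```typescript
// Hypothetical stand-ins for BitmapData and Bitmap.
class PixelSource {
  label: string;
  constructor(label: string) {
    this.label = label;
  }
}

class BitmapNode {
  source: PixelSource;
  constructor(source: PixelSource) {
    this.source = source;
  }
}

// Counts distinct pixel sources by reference. Many bitmaps pointing at the
// same source count once.
function countDistinctSources(bitmaps: BitmapNode[]): number {
  return new Set(bitmaps.map((b) => b.source)).size;
}

// 100 bitmaps sharing one source vs. 100 bitmaps with their own source each.
const sharedSource = new PixelSource("coin");
const sharedBitmaps = Array.from({ length: 100 }, () => new BitmapNode(sharedSource));
const separateBitmaps = Array.from(
  { length: 100 },
  (_, i) => new BitmapNode(new PixelSource("coin" + i)),
);

const sharedCount = countDistinctSources(sharedBitmaps);
const separateCount = countDistinctSources(separateBitmaps);
```

Both lists contain the same number of display objects; only the number of distinct sources differs.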

Updating with some results and performance improvements here:

We saw a drastic difference in idle FPS and render performance just by flattening the render tree a little. These tests were conducted on an iPad 2 via HTML5 target using canvas renderer.

Case 1:
Bitmap (x100) -> Sprite (x100) -> Sprite (x1) -> Stage (x1)
We had approximately 100 Bitmaps drawn to the screen at once. Each Bitmap is added to its own Sprite, which is added to another (shared) Sprite, which is in turn added to the Stage. We saw a consistent 15 FPS during idle. The exact same can be said if you replace Bitmap with a Sprite (with graphics) in this case.

Case 2:
Bitmap (x100) -> Sprite (x1) -> Stage (x1)
If we remove the individual Sprite layers here, so each Bitmap has the same parent, we jumped from ~15 FPS to ~52 FPS. Again, the same can be said for a Sprite (with graphics) here instead of Bitmap.

It is a convenience to map a Bitmap to a Sprite and gain lots of Sprite effects, as well as allow the image to have an origin point other than 0, 0. However, it seems doing this is a huge hit to render performance. Hoping there are optimizations or other fixes that can be eventually applied here. Wondering if the shared parent causes each Bitmap to share a single Canvas, or if there is just an optimization case taking effect.
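If the wrapper Sprite is only there to provide a custom origin, one workaround that keeps the tree flat is to offset the Bitmap's position by the origin directly. The sketch below shows the arithmetic for the simple case of a uniform scale and no rotation; the function name and parameters are made up for illustration, and rotation around the origin would need a full transform instead.

```typescript
interface Point {
  x: number;
  y: number;
}

// Top-left position for a bitmap whose anchor `pivot` (in unscaled image pixels)
// should land on `target`, assuming a uniform scale and no rotation.
function topLeftForPivot(target: Point, pivot: Point, scaleBy: number = 1): Point {
  return {
    x: target.x - pivot.x * scaleBy,
    y: target.y - pivot.y * scaleBy,
  };
}

// A 64x64 image anchored at its center, placed at (200, 150).
const centeredPos = topLeftForPivot({ x: 200, y: 150 }, { x: 32, y: 32 });
// The same image drawn at double size.
const doubledPos = topLeftForPivot({ x: 200, y: 150 }, { x: 32, y: 32 }, 2);
```

The resulting point would be assigned to the Bitmap's own x and y, so each Bitmap can sit directly under the shared parent as in Case 2.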

I can share more info if anyone is investigating further, but for now I thought this would be helpful for folks to know if they find themselves in the same boat.

Is this canvas or WebGL? I wonder if this is due to hit testing

Hi Sing, it is using the canvas renderer. I wasn’t listening for input or anything. I just simply render a bunch of graphics to the screen and do nothing. I have a simple app that allows you to toggle variables such as Bitmap vs Sprite and Nested vs Non that was used for testing if you would like to see it in action.

Sure, that would be interesting to test against :smile:

I am guessing this is a regression, performance-wise, since we do real shape flag hit testing on shapes now, rather than the old bounds check only

Isn’t that optional in the API? I recall seeing a shapeFlag on hitTestPoint (also hitTestObject?), so I wonder if that’s flagged everywhere instead of just in these functions.

I’m going to clean up my test and will post it on Monday. Thanks.

Here is a code sample along with compiled tests.

More testing info can be found in Main.hx. If you have any questions, let me know.

Did you get to any conclusion on this topic?