Harald: Could you please share a screenshot of your Project Settings - Player settings? You could also check whether using the IL2CPP scripting backend improves the situation.
Certainly! (Sorry the image is so long, the spoiler tag doesn't shorten the empty space the image takes up on these forums, or I'd have hidden it in a spoiler.)
After setting our scripting backend to IL2CPP (which I thought I had already done, but must have done only for iOS) and disabling the outline shader effect, we still get an unstable 30 FPS with all three skeletons (though it might be more stable than without IL2CPP). As far as I can tell, most of the time is still being spent waiting for the GPU (Gfx.WaitForPresentOnGfxThread).
Some other testing has shown that using URP, with the default Spine URP Example 2D Scene from Spine, I only get about 15 FPS on the Android TV (while profiling), with most of the time being spent waiting for the GPU in the script (Gfx.WaitForPresentOnGfxThread).
After disabling all the point and directional lights in that scene I get around 57-59 FPS (while profiling), but with spikes down to 30 FPS that seem to recur on a regular cycle. Again, these spikes also seem to be due to time spent waiting for the GPU (Gfx.WaitForPresentOnGfxThread).
I'll report back here once I test all three skeletons rendering with URP next.
Edit: Welp. Trying URP with the three skeletons (Player, Opponent, and Crowd) does not bode well so far.
Using URP with the default Spine Skeleton shaders only gives us about 15 FPS now, whereas the default renderer with default shaders got us around 30 FPS (albeit unstably). Using URP with the Universal Render Pipeline/2D/Spine/Sprite shader gives us a whopping 6 FPS 😳 Then finally, using URP with the Universal Render Pipeline/2D/Spine/SkeletonLit shader gives us 10 FPS. GPU Instancing doesn't seem to make a difference here either.
In all three of these tests there were no normal maps, emission maps, fragment shaders (outside of any you've used for your shaders in Spine's Unity Package), no light sources, etc. It was quite literally three skeletons, a camera, a Global Light 2D, and a single canvas with a single TextMeshPro on it to display the FPS averaged over the last 60 frames. About as simple a scene as we could make for this test.
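For anyone who wants to reproduce the measurement, here is a rough sketch of how an FPS value averaged over the last 60 frames could be computed. This is not the exact script from our project: the class name RollingFpsCounter and its members are made up for illustration, and it is written as plain C# with no Unity dependencies so the averaging logic can be checked on its own. It divides the number of frames in the window by their total duration, rather than averaging per-frame FPS values, so a few very short frames don't inflate the result.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of Unity or spine-unity):
// keeps the durations of the last N frames and reports the average FPS over them.
public sealed class RollingFpsCounter
{
    private readonly Queue<float> frameTimes = new Queue<float>();
    private readonly int windowSize;
    private float totalSeconds;

    public RollingFpsCounter(int windowSize = 60)
    {
        if (windowSize <= 0)
            throw new ArgumentOutOfRangeException(nameof(windowSize));
        this.windowSize = windowSize;
    }

    // Record the duration of one frame, in seconds.
    public void AddFrame(float deltaSeconds)
    {
        frameTimes.Enqueue(deltaSeconds);
        totalSeconds += deltaSeconds;

        // Drop the oldest frame once the window is full.
        if (frameTimes.Count > windowSize)
            totalSeconds -= frameTimes.Dequeue();
    }

    // Frames in the window divided by the time they took; 0 before any frame is recorded.
    public float AverageFps =>
        totalSeconds > 0f ? frameTimes.Count / totalSeconds : 0f;
}
```

In a Unity scene, the idea would be to call AddFrame(Time.unscaledDeltaTime) once per Update() from a MonoBehaviour and write AverageFps into the TextMeshPro component's text property.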