Yeah, we built a custom chatbot that uses the GPT API in conjunction with our content (docs, forum, etc.) to hopefully give good answers. It's a custom retrieval-augmented generation system; we could plug any LLM into it.
As for your concern: yes, the GPU has to unpack those bytes back into floats, depending on your shader of course. But there's specialized circuitry for exactly that on the die, since packed colors are a pretty common thing. And for good reason: we have to submit new vertex buffers to the GPU each frame, and sending any kind of data to the GPU is usually the bottleneck, so we try to save as much space as possible.
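To make the size argument concrete, here's a sketch (not our actual API, just an illustration) of packing an RGBA color into a single 32-bit value instead of four 32-bit floats, plus what the GPU's normalized-attribute unpacking effectively does on the way back:

```java
public class PackedColor {
    // Pack four floats in [0,1] into one int, 8 bits per channel (RGBA8888).
    // 4 bytes per vertex color instead of 16 -- a 4x saving on that attribute.
    static int pack(float r, float g, float b, float a) {
        int ri = (int) (r * 255f) & 0xFF;
        int gi = (int) (g * 255f) & 0xFF;
        int bi = (int) (b * 255f) & 0xFF;
        int ai = (int) (a * 255f) & 0xFF;
        return (ri << 24) | (gi << 16) | (bi << 8) | ai;
    }

    // Roughly what the GPU does for a normalized unsigned-byte attribute:
    // byte / 255 -> float in [0,1]. Shown here for the red channel only.
    static float unpackRed(int packed) {
        return ((packed >>> 24) & 0xFF) / 255f;
    }

    public static void main(String[] args) {
        int packed = pack(1f, 0.5f, 0f, 1f);
        System.out.println(Integer.toHexString(packed)); // ff7f00ff
        System.out.println(unpackRed(packed));           // 1.0
    }
}
```

The round trip loses a little precision (8 bits per channel instead of 32), which is almost always fine for colors but would not be for, say, positions.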
You'll find that many other APIs that abstract the GPU actually use packed colors, for similar reasons.