weird ways of using OpenGL ES on the iPhone
This post explains my 3D engine and the way I use OpenGL. I’m open to suggestions; call me an amateur for using it in this strange way, but it seems to be the right thing for my purpose 🙂
In a thread on the iDevGames forums I realized that my way of using OpenGL is probably a bit unconventional, so I thought I’d lay out my reasoning here in the open.
Full Blown OpenGL:
The way I learned to use full-blown OpenGL was to issue draw commands for each geometry and to translate, rotate and scale between objects with glTranslate, glRotate and glScale. You also had the glBegin and glEnd commands to mark which vertices in your code belong to one primitive.
OpenGL ES has no glBegin, so each time you issue a draw command a lot of data is sent from the CPU to the GPU, and each OpenGL library call seems to weigh on performance.
In the lecture “Optimizing OpenGL on the iPhone”, given by Tim Omernick of the well-known app developer ngmoco, there were some tips on how to improve the drawing speed of your OpenGL views. Texture atlases are an obvious improvement, so I won’t talk about them here.
The main point seems to be to use interleaved geometry arrays, as described by Jeff Lamarche in this post.
Both argue that it is faster to use fewer glDraw calls: put parts of the full scene into one big interleaved array and then issue larger draw calls for that array.
I can definitely agree with that claim. Knowing only basic OpenGL when I started my project, I drew each hexagon with its own draw command. When I tested my hexagon painting engine, I realized that it didn’t really scale. Even though one hexagon is built from only 16 triangles, painting a map of the size shown in the next picture got me down to less than 5 fps.
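A minimal sketch of what such an interleaved array could look like in C. The Vertex struct, its field order and all names here are assumptions for illustration, not the layout from Jeff Lamarche’s post or from my engine. The OpenGL ES 1.1 calls are only shown as a comment, since they need a GL context to run.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved vertex: all attributes of one vertex sit
   next to each other in memory instead of in separate arrays. */
typedef struct {
    float position[3];
    float normal[3];
    unsigned char color[4];
    float texCoord[2];
} Vertex;

enum { MAX_VERTICES = 1024 };
static Vertex sceneVertices[MAX_VERTICES];
static size_t sceneVertexCount = 0;

/* Appends one triangle to the shared scene array and returns the
   index of its first vertex. */
size_t appendTriangle(const Vertex tri[3]) {
    assert(sceneVertexCount + 3 <= MAX_VERTICES);
    size_t first = sceneVertexCount;
    for (int i = 0; i < 3; i++) {
        sceneVertices[sceneVertexCount++] = tri[i];
    }
    return first;
}

/* With the matching client states enabled (glEnableClientState), the
 * whole scene could then be drawn with one call, roughly like this:
 *
 *   glVertexPointer(3, GL_FLOAT, sizeof(Vertex), sceneVertices[0].position);
 *   glNormalPointer(GL_FLOAT, sizeof(Vertex), sceneVertices[0].normal);
 *   glColorPointer(4, GL_UNSIGNED_BYTE, sizeof(Vertex), sceneVertices[0].color);
 *   glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), sceneVertices[0].texCoord);
 *   glDrawArrays(GL_TRIANGLES, 0, (GLsizei)sceneVertexCount);
 */
```

The stride argument is sizeof(Vertex) for every attribute, because each attribute repeats once per struct.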
What I did:
Putting all hexagons in one interleaved array meant rotating and scaling their geometries on the CPU before handing them over to OpenGL. But the interleaved array in the example Jeff Lamarche gave was rebuilt for every frame. That meant a lot of floating-point operations for the CPU, work for which the GPU would be more suitable, and it slowed the whole app down.
Here’s a typical line where I add a vertex to my array:
_addVertex( xc+scale*(co*marineData[k].v.x+si*marineData[k].v.y), yc+scale*(co*marineData[k].v.y-si*marineData[k].v.x),
scale*marineData[k].v.z, 0.99, 0.99, shouldercolor, co*marineData[k].n.x+si*marineData[k].n.y, co*marineData[k].n.y-si*marineData[k].n.x, marineData[k].n.z);
(This code is part of the drawing of a marine, because the hexagons don’t have to be rotated.)
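The arithmetic in that line can be pulled apart into a small standalone sketch. It rotates each vertex about the z axis with precomputed co (cosine) and si (sine), scales it, and moves it to (xc, yc); normals are only rotated. The struct and function names below are made up for this example and are not the ones from my engine.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Vec3;

/* Hypothetical model vertex: position v and normal n, like marineData[k]. */
typedef struct { Vec3 v; Vec3 n; } ModelVertex;

/* Transforms `count` model vertices on the CPU and writes them to `out`.
   co and si are the cosine and sine of the rotation angle, computed once
   by the caller so they are not recomputed per vertex. */
void placeModel(const ModelVertex *model, size_t count, ModelVertex *out,
                float xc, float yc, float scale, float co, float si) {
    assert(model != NULL && out != NULL);
    for (size_t k = 0; k < count; k++) {
        /* Position: rotate in the xy-plane, then scale and translate. */
        out[k].v.x = xc + scale * (co * model[k].v.x + si * model[k].v.y);
        out[k].v.y = yc + scale * (co * model[k].v.y - si * model[k].v.x);
        out[k].v.z = scale * model[k].v.z;
        /* Normal: rotate only, so its length stays unchanged. */
        out[k].n.x = co * model[k].n.x + si * model[k].n.y;
        out[k].n.y = co * model[k].n.y - si * model[k].n.x;
        out[k].n.z = model[k].n.z;
    }
}
```

This is exactly the work glRotate, glScale and glTranslate would otherwise do per object, which is why it only pays off if the result is kept around instead of being recomputed every frame.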
So I just removed the line in Jeff Lamarche’s code that deletes the interleaved array after sending it to the GPU. Now the CPU had nearly no work at all for each frame (just game logic) and I was at maximum fps (i.e. 90, but I now cap it at 30 to reduce battery consumption).
Remember that the actual rotating, translating and scaling the user can do with multitouch gestures is still done by glScale, glRotate and glTranslate. This gives a nice sensation of having a firm grip on the whole scene, which consists of very detailed models with tens of thousands of triangles. User interaction is really smooth since it runs at top fps.
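The idea of keeping the array between frames can be sketched with a simple dirty flag. This is an assumed structure for illustration, not my actual render loop; rebuildScene stands in for the expensive CPU pass and only counts how often it runs.

```c
#include <assert.h>
#include <stdbool.h>

static bool sceneDirty = true;   /* true until the array has been built once */
static int rebuildCount = 0;     /* only here to make the effect visible */

/* Called by the game logic whenever the game state changes. */
void markSceneChanged(void) {
    sceneDirty = true;
}

/* Placeholder for the expensive CPU pass that refills the interleaved array. */
static void rebuildScene(void) {
    rebuildCount++;
}

/* Called once per frame: rebuild only if something changed, then draw. */
void drawFrame(void) {
    if (sceneDirty) {
        rebuildScene();
        sceneDirty = false;
    }
    assert(!sceneDirty);  /* the cached array is up to date at this point */
    /* ...here the cached array would be handed to the draw call... */
}
```

Calling drawFrame many times in a row triggers a single rebuild; another one only happens after markSceneChanged.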
But I had to restack the array on the CPU each time the game state changed, which can take up to 500 ms for a big map, something the user could feel. The only thing that got recalculated for each frame was a tiny part at the top of the array, which holds all the constantly moving things like the little indicator hovering over a unit to show that it is currently activated. So now I keep track of where in the array each geometry is and only recalculate the parts of the array that actually change.
The video in the last post still updated the whole array each time something changed; you can clearly see the delays. In the current version of TACTICA the player doesn’t notice the redrawing delay anymore. It is still measurable, going up to 100 ms, but it always happens after the user has built a unit or moved something somewhere, and the 100 ms vanish in the moment you lift your finger up again 😉
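Keeping track of where each geometry lives in the array could look roughly like the sketch below. Each object remembers the slice it occupies, and an update rewrites only that slice. The Slice type, the simplified position-only vertex and the function names are assumptions for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y, z; } Position;  /* simplified vertex */

/* Where one object's vertices live inside the shared scene array. */
typedef struct { size_t first; size_t count; } Slice;

enum { MAX_VERTICES = 4096 };
static Position sceneVertices[MAX_VERTICES];
static size_t sceneVertexCount = 0;

/* Reserves room for one object at the end of the array. */
Slice reserveSlice(size_t count) {
    assert(sceneVertexCount + count <= MAX_VERTICES);
    Slice s = { sceneVertexCount, count };
    sceneVertexCount += count;
    return s;
}

/* Overwrites only the vertices of one object; the rest of the array
   (and therefore the rest of the scene) is left untouched. */
void rewriteSlice(Slice s, const Position *newVertices) {
    assert(s.first + s.count <= sceneVertexCount);
    for (size_t i = 0; i < s.count; i++) {
        sceneVertices[s.first + i] = newVertices[i];
    }
}
```

This only works as long as an object keeps its vertex count. If a unit is removed or replaced by a bigger model, the slices behind it have to move, which is where the accounting gets complicated.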
Let’s start with the cons of this approach:
- some CPU work when changing the game-state (not noticed by the player since it always happens after a touch is finished)
- keeping track of the parts of the interleaved array could become very complex
- the biggest problem is the dynamically animated parts of the scene. Things like rotating wheels, flashing lights etc. need to have very few triangles because their positions have to be recalculated for each frame
Now for the advantages:
- it’s possible to draw a scene consisting of up to 100k triangles with extremely detailed unit models (e.g. one hooded paladin consists of 1573 triangles)
- even though the scene itself is mostly static, the user may rotate/translate/scale it freely at top FPS (>80fps if necessary)
- capping the fps at 30 increases battery life, as does the fact that I only recalculate something once it really changes (which happens only when the player has touched something). Imagine the player thinking about a move for one minute: the whole time the iPhone is practically slacking off
So in conclusion this approach is of no interest for 2D jump-and-run games, since they’ll probably want a lot of moving and flashing parts and don’t need the complexity of >50k triangles.
For first-person shooters or real-time strategy games, having units that detailed may be nice, but then again the game state would change so often that the fps would be constantly very low.
But for a game that is essentially a board game, I honestly consider it to be optimal. It’s like having a board of beautifully carved miniatures that stand perfectly still, while one is still able to take the whole board and move or turn it around like in real life.
Here is an example of a game with size 9, one player vs. 5 AI-controlled sides and many units. The AI is currently moving its units, so one can see the “thinking iPhone” icon in the place where the credits are usually displayed: (click to see the iPhone-sized version)