This is probably not what you need as it seems quite simple, but I don't understand your problem, so we can try. For the isometric view, you can just draw all your walls starting from the ones which are furthest from the point of view (i.e. top of the screen) and ending on those which are closest. This can be done easily by changing the sequence of walls generated by your FOR loops or whatever you use. This way you can be sure that a wall or monster which is closer never obscures one which is more distant. I don't think this would need optimizing... [OK, I am not sure whether you are even working in the isometric view, so this might be even less useful]
Maybe your problem is that you are working in OpenGL? This is designed for rendering 3D scenes and seems very strange for a 2D game. For each floor and wall and monster and item type, create a ready-to-use bitmap which can be directly copied without any modifications ("blitted") to the screen, and then just put these bitmaps at correct locations.
The game isn't isometric. I'm using OpenGL out of necessity. Since I'm working on the Android platform, I'm limited by memory, and I have a lot of graphics to load. 16MB on older models is the limit for an application. However, with OpenGL, I can access the VRAM, which opens up nearly 256MB of memory. Also, speed is a critical issue, since it's in realtime, and OpenGL is much faster than using the 2D rendering. With OpenGL, I can load my tilesets as textures, and just directly blit the parts of the textures I need.
The Z-ordering issue I'm less concerned with, as I could find a solution for that. However, because the wall tiles won't be contained in a grid, but rather will have direct pixel coordinates, I can't simply loop through a 2D array. Instead, I'll have to loop through an array containing *all* wall tiles, and check to see whether or not they're in viewing range. On any sizable map, this will take a while, when repeated dozens of times a second.
The easy solution would be to limit the map size, and just keep using the vector array system I'm using now, but I really don't want to restrict gameplay due to implementation problems.
Since I seem to be having trouble describing the problem properly, I'll give an indepth explanation:
So, I have a 2D array of floor tiles. These can simply be for looped, no problem. Wall tiles, however, don't conform to any sort of grid, since each wall tile has different dimensions. Thus, it is necessary to specify direct pixel coordinates for each wall tile, as opposed to grid coordinates, like those used for the floor tiles. So, whereas a floor tile might have coordinate (5, 7), which would correspond to pixel coordinates (32 * 5, 32 * 7), if the floor tiles were 32x32, a wall tile would have no grid coordinate, just a pixel coordinate, like (517, 892).
There's no way to align variable sized tiles to a grid. Trying to do so would create overlap, visible seams, and other visual artifacts. So right now, I have a big array containing the wall tiles, and scan each individual one, testing it to see if it falls within the camera's coordinates. This is way too expensive. I've been researching 3D methods of solving these problems, but most methods seem unnecessarily complex.
BSP trees, for example, is the typical 3D method of doing this, but that's usually used to speed up the processing of millions of polygons, as opposed to maybe a couple hundred of my wall tiles. Due to the smaller set size, I'm unsure implementing the rather complex BSP trees would actually gain me a significant speed boost.
Quadtrees are also a consideration, but, again, seem awfully complex for a 2D application.
I'm really at a loss here, and am not up for the challenge of implementing these difficult data structures, just to possibly find out it was a waste of time. I really need a solution for this damned problem, it's really doing my head in.