Idea for efficient AO type raytracing applications
12/Aug 2011
Quick idea that I had a few weeks ago regarding raytracing things like AO and potentially GI/FG. The basic observation is that for AO ray tracing you have far more rays than you have triangles, so it really doesn’t make sense that we only focus on spatial paritioning for the triangles. I suspect there’s some bias here because we’re so used to thinking of our rays as travelling around a static scene, so we just kind spatially subdivide the scene without even thinking about it. For AO the rays are just as static as the scene so we really should be looking into doing something smarter with them.
So the first implication of this is the following: organize your rays into some sort of spatial subdivision structure (e.g. an R-tree). Efficiency of construction is probably a concern here since we have a boatload of rays. Then, send your triangles down this structure one at a time finding which rays it intersects, jotting down the distance for each (so you can keep track of the minimum). You may want something more clever like some kind of cone-based tree to gather up clusters of rays facing roughly in the same direction from roughly the same location, but AO rays in particular tend to have a short maximum distance and are therefore pretty “stubby”, so something as simple as an AABB wouldn’t be terribly inefficient, I suspect.
Then of course the next step is to put the scene geometry in a spatial sudivision structure as well. Now you have two trees that sudivides the scene and rays respectively, so you can simply join them spatially to very efficiently prune out areas of your rays that can not touch areas of the scene. You can think of this as amortizing the cost of traversing the scene structure over all rays, since most of the scene traversal will be done for large chunks of your rays at the same time rather than having to repeat it for each ray. Basic CPU-pseudo-code would be:
raySceneJoin( rayNode, sceneNode ){ > > if ( rayNode and sceneNode are both leafs){ > > > > intersectRayTriangle(rayNode,sceneNode) > > > > return > > > > } > > > > // We have at least one non-leaf. Figure out what children we have. > > > > rayChildren = rayNode.isLeaf ? [rayNode] : rayNode.children > > > > sceneChildren = sceneNode.isLeaf ? [sceneNode] : sceneNode.children > > > > foreach rayChild in rayChildren { > > > > foreach nodeChild in sceneChildren{ > > > > // Any pairs of nodes that intersect needs to be refined further > > > > if ( intersection( rayChild, nodeChild) ){ > > > > raySceneJoin( rayChild, nodeChild) > > > > } > > > > } > > > > } > > }
This would give you something like ~O( log m log n ) complexity where m and n is the number of rays and triangles respectively (as opposed to O(m log n) which is the usual “foreach ray, trace against scene structure” type of AO computation).
This can obvoiusly be done by the GPU in a variety of ways, e.g. by traversing the trees depth first (keep a global queue of ray/scene node pairs, and just keep adding intersecting pairs to it until you hit a child, read from the queue until it’s empty), although there are further things you may want to do to improve performance and dynamic space utilization (e.g. use a short local stack for parts of the traversal).