Dataflow programming in roguelikes: Field of view calculation
This is one of those probably-YAGNI yet fun, off-the-wall learning experiences/sidetracks in roguelike programming. The idea is not only to revisit the Fov calc, but to do it with an eye towards uncovering any hidden opportunities for producing friendly, simple, easy-to-maintain, and parallel-processing-friendly code using a seemingly unrelated (to roguelikes) technology: dataflow programming.
In a Fov calc, the input is a set of cells containing information about visual conditions, and the output is a set of visibility values, one for each input cell. Input cells conceptually contain an additional value, their location (x, y). The bounds of a Fov calc's input is a square of dimensions 2r+1, where r is the maximum observation distance. With regard to distance, there are three distinct sets: the outer zone [outside the bounds], the inner zone [inside the square described by r' = r / sqrt(2)], and the contested zone [the intersection of the bounds with not(inner zone)]. Regardless of the actual Fov algorithm used, the conceptual Fov calculation can be described simply as a set of rays emanating from the center (viewpoint) cell and radiating outwards, ideally intersecting each and every cell along the path to each target cell.
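To make the zones concrete, a quick sketch of the classification (assuming integer cell offsets from the viewpoint; the inner test is just the inscribed square of half-side r/sqrt(2)):

```python
import math

def zone(dx, dy, r):
    """Classify the cell at offset (dx, dy) from the viewpoint for radius r.

    'outer'     - outside the (2r+1)-sided bounding square, never visible
    'inner'     - inside the inscribed square of half-side r/sqrt(2), so always
                  within Euclidean range r regardless of angle
    'contested' - inside the bounds but outside the inner square; needs an
                  explicit distance check
    """
    if max(abs(dx), abs(dy)) > r:
        return 'outer'
    if max(abs(dx), abs(dy)) <= int(r / math.sqrt(2)):
        return 'inner'
    return 'contested'
```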
At the origin cell (the viewpoint), the output depends entirely upon the values contained in that cell. For every other cell, the output value depends not only on its own data, but on that of each and every cell along the line from the origin (inclusive) to itself. A naive implementation has an algorithmic complexity of O(n^3) and a data dependency of 1 + 2*d (where d <= r) input cells per output cell. If we encapsulate the entire Fov calculation in a single transform, we have a simple dataflow model of: in => transform => out. Call this model A.
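A rough sketch of model A in its naive form, assuming a hypothetical blocks_sight(x, y) query on the map and a plain Bresenham line as the per-cell ray (not necessarily how you'd actually write your transform):

```python
def line(x0, y0, x1, y1):
    """Plain Bresenham line from (x0, y0) to (x1, y1), endpoints included."""
    points = []
    dx, dy = abs(x1 - x0), abs(y1 - y0)
    sx = 1 if x1 >= x0 else -1
    sy = 1 if y1 >= y0 else -1
    err = dx - dy
    x, y = x0, y0
    while True:
        points.append((x, y))
        if x == x1 and y == y1:
            break
        e2 = 2 * err
        if e2 > -dy:
            err -= dy
            x += sx
        if e2 < dx:
            err += dx
            y += sy
    return points

def fov_model_a(blocks_sight, origin, r):
    """Model A: in => transform => out, the whole Fov calc in one transform.
    Naive version: walk a ray from the origin to every cell within range and
    mark the cell visible if nothing earlier on the ray blocks sight."""
    ox, oy = origin
    visible = {}
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if dx * dx + dy * dy > r * r:
                continue
            target = (ox + dx, oy + dy)
            ray = line(ox, oy, target[0], target[1])
            visible[target] = all(not blocks_sight(x, y) for x, y in ray[:-1])
    return visible
```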
The nature of the dependencies noted in the naive implementation suggests splitting the input set into n dependency sets and using one transform block per output cell. This yields the dataflow: in => decompose => transform(xN) => compose => out, where decompose selects between 1 and 1 + 2*d input cells per transform, there is one transform per cell, and the compose operation simply maps each result into a single cell in the output. Call this model B.
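Under the same assumptions (and reusing the line helper above), model B's plumbing might look like this, with a trivial transform per cell and a compose step that just maps each result into its output cell:

```python
def fov_model_b(blocks_sight, origin, r):
    """Model B: in => decompose => transform(xN) => compose => out."""
    ox, oy = origin

    def decompose():
        # One dependency set (the ray from the origin) per output cell,
        # at most 1 + 2*d cells long for a cell at distance d.
        for dy in range(-r, r + 1):
            for dx in range(-r, r + 1):
                if dx * dx + dy * dy <= r * r:
                    target = (ox + dx, oy + dy)
                    yield target, line(ox, oy, target[0], target[1])

    def transform(deps):
        # Visible if no cell strictly before the target blocks sight.
        return all(not blocks_sight(x, y) for x, y in deps[:-1])

    # Compose: each transform result lands in exactly one output cell.
    return {target: transform(deps) for target, deps in decompose()}
```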
Another way of splitting up the input data is to consider the nature of the Cartesian plane with the viewpoint cell at the origin (0, 0). This immediately suggests splitting into quadrants and, upon further examination, octants. The dataflow in this model is: in => decompose => transform(x8) => compose => out. In this model, compose is the inverse of decompose and there are eight transform blocks. Call this model C.
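Model C's plumbing is mostly the octant split; here's a sketch where octant_fov is a stand-in for whatever per-octant algorithm you already have (a recursive shadowcaster, say):

```python
# The eight octant transforms: each tuple (xx, xy, yx, yy) maps local octant
# (col, row) coordinates, 0 <= row <= col, into map-space offsets from the
# viewpoint, i.e. x = ox + col*xx + row*xy, y = oy + col*yx + row*yy.
OCTANTS = [(1, 0, 0, 1), (0, 1, 1, 0), (0, -1, 1, 0), (-1, 0, 0, 1),
           (-1, 0, 0, -1), (0, -1, -1, 0), (0, 1, -1, 0), (1, 0, 0, -1)]

def fov_model_c(blocks_sight, origin, r, octant_fov):
    """Model C: in => decompose => transform(x8) => compose => out.
    Decompose is the octant split, octant_fov runs once per octant (the eight
    runs are independent and could go to eight workers), and compose merges
    the eight partial result sets back into one."""
    out = {origin: True}
    for matrix in OCTANTS:
        out.update(octant_fov(blocks_sight, origin, r, matrix))
    return out
```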
So far, we have only considered transforms whose input is directly derived from the input set. For very small values of n, model A has the advantage of simplicity. For larger values of n, model C has the advantage of exploiting the inherent parallelism of the Cartesian plane mathematics and can be implemented easily using practically any existing Fov calculation algorithm. Model B seems to be nothing more than a useless observation; its only advantage is that its transform operation is inherently simple and presumably fast. If we introduce the idea of feedback, however, the picture changes.
It is the nature of the Fov problem space that a vision-blocking cell close to the origin blocks (or shadows) all the cells 'behind' it. Indeed, one very popular Fov algorithm (shadowcasting, which incidentally maps simply onto model C above) is based upon exploiting this fact. In model B, if we organized the selection of the 'next transform to run', we could potentially use the output of each transform to prune the available inputs. This new model can be described as: in => decompose => select => transform(x?) => compose => out, where transform's output also loops back to be an input to select. Call this model D.
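As a skeleton only, model D's control flow might look like the following; decompose, make_line and shadowed_by are assumptions that the rest of this post fills in (ordered candidate delivery, the supercover line, and the precomputed shadow mapping respectively):

```python
def fov_model_d(blocks_sight, origin, r, decompose, make_line, shadowed_by):
    """Model D: in => decompose => select => transform(x?) => compose => out,
    with each transform's output fed back into select."""
    blocked = set()                        # cells already known to be shadowed
    out = {origin: True}
    for cell in decompose(origin, r):
        if cell in blocked:                # select: prune shadowed cells outright
            out[cell] = False
            continue
        deps = make_line(origin, cell)     # dependency set for this transform
        out[cell] = all(not blocks_sight(x, y) for x, y in deps[:-1])
        if blocks_sight(*cell):            # feedback: a blocker shadows cells behind it
            blocked |= shadowed_by(cell, origin, r)
    return out
```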
Implementing model D.
Why? Because A is too simple, C is too easy, and B in its natural form is too naive. More seriously, model D has the potential not only to adapt well to parallel processing, but also to be expanded relatively easily to include more sophisticated visual effects such as obscurement and at least the first stage of a lighting model. Plus it's a neat way to learn more about dataflow programming.
Our input is a radius value, a viewpoint coordinate, and a map structure which can be queried for characteristics at a given coordinate. Our decompose block might as well deliver its results in an ordered fashion and select only those points which fall within the given radius of the viewpoint. Since we want to take advantage of feedback, we want to deliver the cell coordinates in an order that reflects each cell's importance, that is, its potential to shadow off a number of other cells.
Let's start off by creating three bounding box rectangles, all centered on the viewpoint. The outer bounding box contains all the points that might be seen. The inner bounding box contains all the points that would be seen if there were no sight-blocking impediments. The third box is the current vision scope bounding box; initially it will contain only one point, the viewpoint. We will require a PerimeterMarch function that, given a rectangle, will deliver a sequence of unique coordinates that completely describe the perimeter. After each call, we will inflate the size of the vision scope bounding box by one (in each direction).
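A minimal PerimeterMarch sketch, assuming inclusive rectangle bounds (the degenerate one-cell rectangle is the initial vision scope box):

```python
def perimeter_march(left, top, right, bottom):
    """Yield each coordinate on the rectangle's perimeter exactly once
    (bounds inclusive). Handles degenerate one-row / one-column rectangles."""
    if left == right:
        for y in range(top, bottom + 1):
            yield (left, y)
        return
    if top == bottom:
        for x in range(left, right + 1):
            yield (x, top)
        return
    for x in range(left, right + 1):          # top edge, left to right
        yield (x, top)
    for y in range(top + 1, bottom + 1):      # right edge, downwards
        yield (right, y)
    for x in range(right - 1, left - 1, -1):  # bottom edge, right to left
        yield (x, bottom)
    for y in range(bottom - 1, top, -1):      # left edge, back up
        yield (left, y)
```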
As long as the vision scope bounding box is still contained within the inner bounding box, we can ignore any distance calculations. For the portion of the decomposition where the vision scope is larger than the inner box but no larger than the outer, we will do a distance check for each coordinate before passing it on or discarding it. That pretty much takes care of the major points of the decompose block.
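Putting those two paragraphs together, the decompose block might look something like this (reusing perimeter_march above; the inner half-side is just the r/sqrt(2) inner box):

```python
import math

def decompose(viewpoint, r):
    """Decompose block sketch: grow the vision scope box one step at a time,
    marching its perimeter and yielding candidate cells roughly nearest-first.
    While the scope still fits inside the inner box no distance check is
    needed; in the contested ring each coordinate is checked before passing."""
    vx, vy = viewpoint
    inner = int(r / math.sqrt(2))          # half-side of the inner bounding box
    for k in range(0, r + 1):              # k = current scope half-side
        for (x, y) in perimeter_march(vx - k, vy - k, vx + k, vy + k):
            if k <= inner:
                yield (x, y)               # guaranteed within range
            elif (x - vx) ** 2 + (y - vy) ** 2 <= r * r:
                yield (x, y)               # contested zone: distance check
```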
The selector block is a filter. Initially the 'thy shall not pass!' set is empty, and it just passes everything along to the next available transform block. Things get more interesting as each transform block completes: the output is copied and fed back from the transform block to the selector, then the selector does some voodoo and the 'thy shall not pass!' set is modified (or not) appropriately. Pretty simple, huh?
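In sketch form, assuming a shadow_table(blocker) lookup that returns the set of cells a given blocker shadows (that's the precomputed mapping discussed next):

```python
class Selector:
    """Selector block sketch: a filter wrapping the 'thy shall not pass!' set.
    shadow_table(blocker) is an assumed lookup returning every cell that the
    given blocker shadows from the current viewpoint."""

    def __init__(self, shadow_table):
        self.blocked = set()           # the 'thy shall not pass!' set
        self.shadow_table = shadow_table

    def admit(self, cell):
        # Pass the cell along to a transform only if nothing has shadowed it yet.
        return cell not in self.blocked

    def feedback(self, cell, blocks_sight):
        # Called with each transform's output; grow the blocked set if the
        # cell turned out to be a sight blocker.
        if blocks_sight:
            self.blocked |= self.shadow_table(cell)
```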
Ok, seriously, computing the necessary modifications to the 'thy shall not pass!' set is a non-trivial calculation. We really don't want to do it at run time, at least not for each and every blocking output from a transform. The mapping can be precomputed, folded, spindled, mutated and stored as a binary asset. Indeed, it can even be a 'one size fits all' mapping that works for any reasonable radius. The size required, assuming we take the easy way out and store quadrant one with a fixed-size result set, is R^2 * (sizeof(coordinate) + sizeof(R^2-bit bitmap)), where R is the maximum radius you'll ever use. For R = 10, with allowance for overhead, it's a trivial 2KB and change; but whoa, around 320KB for R = 40.
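For what it's worth, that back-of-envelope estimate as a couple of lines (2-byte coordinates assumed):

```python
def shadow_table_size_bytes(R, coord_bytes=2):
    """Rough size of the quadrant-one shadow table from the estimate above:
    R^2 entries, each holding a coordinate plus an R^2-bit shadow bitmap."""
    bitmap_bytes = (R * R + 7) // 8
    return R * R * (coord_bytes + bitmap_bytes)

# shadow_table_size_bytes(10) -> 1500 bytes (~1.5KB) and
# shadow_table_size_bytes(40) -> 323200 bytes (~320KB), before any overhead.
```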
For larger sizes, you'd probably want to go to octant mapping and use variable-size, offset bitmaps. I kinda skipped a step: at some point we need to generate a partial supercover line from the viewpoint to each point coming out of the selector. Exactly when we do this depends upon what data model we're using. It's the same line that model B delivers straight from the decomposer. The choice is whether to deliver just an array of points, or to combine the coordinates with the map data and deliver that to the transform block.
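For reference, a supercover line sketch adapted from the well-known Bresenham-based variant; it yields every cell the ideal segment between the two cell centres touches, endpoints included:

```python
def supercover_line(x0, y0, x1, y1):
    """Supercover line from (x0, y0) to (x1, y1): every cell the ideal segment
    between the two cell centres passes through, endpoints included."""
    points = [(x0, y0)]
    dx, dy = x1 - x0, y1 - y0
    xstep = 1 if dx >= 0 else -1
    ystep = 1 if dy >= 0 else -1
    dx, dy = abs(dx), abs(dy)
    ddx, ddy = 2 * dx, 2 * dy
    x, y = x0, y0
    if ddx >= ddy:                            # mostly horizontal
        errorprev = error = dx
        for _ in range(dx):
            x += xstep
            error += ddy
            if error > ddx:                   # the line also moved one row
                y += ystep
                error -= ddx
                if error + errorprev < ddx:   # segment also passes through (x, y - ystep)
                    points.append((x, y - ystep))
                elif error + errorprev > ddx: # segment also passes through (x - xstep, y)
                    points.append((x - xstep, y))
                else:                         # exactly through the corner: include both
                    points.append((x, y - ystep))
                    points.append((x - xstep, y))
            points.append((x, y))
            errorprev = error
    else:                                     # mostly vertical: roles swapped
        errorprev = error = dy
        for _ in range(dy):
            y += ystep
            error += ddx
            if error > ddy:
                x += xstep
                error -= ddy
                if error + errorprev < ddy:
                    points.append((x - xstep, y))
                elif error + errorprev > ddy:
                    points.append((x, y - ystep))
                else:
                    points.append((x - xstep, y))
                    points.append((x, y - ystep))
            points.append((x, y))
            errorprev = error
    return points
```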
The exact form of the data given to the transform block may have an impact on performance, depending upon processor cache size and other hardware details, as well as the size and nature of your map, the exact nature of your transform function, and, of course, your language of choice. Simplest and safest would be to be lazy about line generation and do it in a separate 'preprocessor' block sitting between the selector and transform blocks. As for the map data, the best solution is probably to create an immutable copy of the relevant map data early in the decompose block and share that with all the transform block instances.
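And a tiny sketch of that lazy preprocessor block, assuming the supercover_line above and a map snapshot built once during decomposition:

```python
def preprocessor(candidates, viewpoint, map_snapshot):
    """Preprocessor block sketch: sits between the selector and the transforms.
    Lazily attaches the dependency line and the shared, immutable map snapshot
    (e.g. a frozen coordinate -> blocks-sight mapping) to each candidate cell."""
    vx, vy = viewpoint
    for (cx, cy) in candidates:
        deps = supercover_line(vx, vy, cx, cy)
        yield (cx, cy), deps, map_snapshot   # every transform shares one snapshot
```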
Hope this is of interest, and I'm looking for feedback,
Brian