Author Topic: Dataflow programming in roguelikes, exploring Field of view calculation (Read 31758 times)

Omnivore · « **on:** March 14, 2015, 03:13:39 PM »

Dataflow programming in roguelikes, Field of view calculation

This is one of those, probably YAGNI yet fun, off the wall learning experience/sidetracks in roguelike programming. Not only revisiting Fov calc, but doing it with an eye towards uncovering any hidden opportunities for producing friendly, simple, easy to maintain, and parallel processing friendly, code using a seeming unrelated (to roguelikes) technology: data flow programming.

In a Fov calc, the input is a set of cells containing information about visual conditions, the output is a set of visibility values, one for each input cell. Input cells conceptually contain an additional value, their location (x, y). The bounds of a Fov calc's input is a square of 2r+1 dimensions where r is the maximum observation distance. In regards to the distance, there are three distinct sets, the outer zone [outside the bounds], the inner zone [inside the square described as r' = r * 1/sqr(2)], and the contested zone described as the intersection of bounds with not(inner zone). Regardless of actual Fov algorithm used, the conceptual Fov calculation can be simply described as a set of rays emanating from the center, viewpoint, cell and radiating outwards to intersect each and every cell along the path to each cell (ideally).

At the origin cell, viewpoint, the output is dependent entirely upon the values contained in that cell. For other cells, the output value is dependent upon not only its own data, but that of each and every cell from, and including, the origin to itself along the line between them. A naive implementation has an algorithmic complexity of O(n^3) and a data dependency of 1 + 2*d (where d <= r) input cells per output cell. If we encapsulate the entire Fov calculation in a single transform, we have a simple dataflow model of: in => transform => out. Call this model A.

The nature of the dependencies noted in examination of the naive implementation suggests splitting the input set into n dependency sets and using one transform block per output cell this yields a dataflow: in => decompose => transform(xN) => compose => out, where decompose selects between 1 and 2*d input cells per transform, there is one transform per cell, and the compose operation simply maps each result into a single cell in the output. Call this model B.

Another way of splitting up the input data is to consider the nature of the Cartesian plane with the viewpoint cell at the origin (0,0). This immediately suggests spitting into quadrants, and upon further examination, octants. The dataflow in this model is: in => decompose => transform(x8) => compose => out. In this model, compose is the inverse of decompose and there are eight transform blocks. Call this model C.

So far, we have only considered transforms whose input is directly derived from the input set. For very small values of n, model A has the advantage of simplicity. For larger values of n, model C has the advantage of exploiting the inherent parallelism of the Cartesian plane mathematics and can be easily implemented using practically any existing Fov calculation algorithm. Model B seems to be nothing more than a useless observation. Its only advantage is that its transform operation is inherently simple and presumably fast. If we introduce the idea of feedback however, the picture changes.

It is the nature of the Fov problem space that a vision blocking cell close to the origin blocks (or shadows) all the cells 'behind' it. Indeed one very popular Fov algorithm (shadowcasting which incidentally can be simply mapped to model C above) is based upon exploiting this fact. In model B, if we organized the selection of the 'next transform to run', we could potentially use the output of each transform to prune the available inputs. This new model can be described as: in => decompose => select => transform(x?) => compose => out; where transform's output also loops back to be an input to select. Call this model D.

Implementing model D.

Why? Because A is too simple, C is too easy, and B in its natural form is too naive. More seriously, the model D has the potential to not only adapt well to parallel processing, but to be also relatively easily expanded to include more sophisticated visual effects such as obscurement and at least the first stage of a lighting model. Plus its a neat way to learn more about data flow programming.

Our input is a radius value, a viewpoint coordinate, and a map structure which can be queried for characteristics at a given coordinate. Our decompose block might as well deliver its results in an ordered fashion and select only those points which fall within the given radius of the viewpoint. Since we want to take advantage of feedback, we want to deliver the cell coordinates in an order that reflects the importance of the cell in its potential to shadow off a number of cells.

Let's start off by creating three bounding box rectangles, all centered on the viewpoint. The outer bounding box contains all the points that might be seen. The inner bounding box contains all the points that would be seen if there was no sight blocking impediments. The third box is the current vision scope bounding box, initially it will contain only one point, the viewpoint. We will require a PerimeterMarch function that given a rectangle will deliver a sequence of unique coordinates that completely describe the perimeter. After each call, we will inflate the size of the vision scope bounding box by one (in each direction).

Until the vision scope bounding box is no longer contained within the inner bounding box, we can ignore any distance calculations. For the portion of the decomposition where the vision scope is larger than the inner but less than or equal to the outer, we will do a distance check for each coordinate before passing it on or discarding it. That pretty much takes care of the major points of the decompose block.

The selector block is a filter. Initially the 'thy shall not pass!' set is empty, and it just passes everything along to the next available transform block. Things get more interesting as each transform block completes. The output is copied and fed back from the transform block to the selector, then the selector does some voodoo and the 'thy shall not pass!' set is modified (or not) appropriately. Pretty simple huh?

Ok, seriously, computing the necessary modifications to the 'thy shall not pass!' set is a non-trivial calculation. We really don't want to do it at run time, at least not for each and every blocking output from a transform. The mapping can be precomputed, folded, spindled, mutated and stored as a binary asset. Indeed it can even be a 'one size fits all' mapping that works for any reasonable radius. The size required, assuming we take the easy way out storing quadrant one and a fixed sized result set, is R^2 * (sizeof(coordinate) + sizeof((R^2) bitmap)) where R is the maximum radius you'll ever use. For R = 10, with allowance for overhead it's a trivial 2KB+change, but woah, around 320KB for R = 40.

For larger sizes, you'd probably want to go to octant mapping and use variable size, offset, bitmaps. I kinda skipped a step, at some point we need to generate a partial supercover line from viewpoint to each point coming out of the selector. Exactly when we do this is dependent upon what data model we're using. Its the same line that model B delivers straight from the decomposer. The choices are whether ornot to deliver just an array of points or to combine the coordinates with the map data and delivering that to the transform block.

The exact form of the data given to the transform block may have an impact on performance depending upon processor cache size and other hardware details as well as the size and nature of both your map, the exact nature of your transform function, and, of course, your language of choice. Simplest and safest, would be to be lazy about line generation and do it in a separate 'preprocessor' block sitting between the selector and transform blocks. As for the map data, best solution is probably to create an immutable copy of the relevant map data early in the decomposition block and share that with all the transform block instances.

Hope this is of interest and looking for feedback,
Brian

Trystan · « **Reply #1 on:** March 14, 2015, 04:20:23 PM »

What?

Is this implemented somewhere?

I've also thought about other ways to do FOV. One way that I think sounds like what your describing is to precalculate what are 'behind' each tile (where the viewer is at (0,0)) then do something like your PerimeterMarch function. If a tile is blocked, then skip it, otherwise add the tiles 'behind' it to the new perimeter tiles. Repeat until your set of perimeter tiles is empty.

So if (1,0) is blocked, then don't bother with (2,0) but if it's not blocked, add (2,0) to your perimeter set.

I don't remember why I haven't tried it yet though. I think because the naive way has always worked well enough for me.

Omnivore · « **Reply #2 on:** March 14, 2015, 04:51:13 PM »

Not implemented as a data flow system, no. Working on that at the moment, in fact just wrote the blocking pre-processor. Next up, wrapping my head around TPL DataFlow (C#) enough to begin implementing the the fun parts.

On the other hand, I've done two previous Fov calculators using similar ideas based on combining ray casting to perimeter with blocking information to skip rays, and in the last, to start the next ray at some place other than the origin if possible (furthest point of divergence from ray that blocked). The first was last year in Python, the second a month or two ago in C#.

The biggest difference between those two approaches and the one detailed above, is that they were perimeter raycasters and thus suffered from many of the same artifact problems common to all basic bresenham line based perimeter ray caster implementations. Jice (of libtcod fame), did a very thorough and interesting comparison of various Fov approaches in a paper that has good examples of the artifacts.

The approach I suggest in model D above is *not* a perimeter bresenham line ray caster. It is a 'draw partial supercover line to every cell' algorithm. The naive, non-pipe-lined, single threaded, no feedback, implementation is an order of magnitude slower than a perimeter caster even though it doesn't need the artifact cleanup phase. Adding pipelining and multiple processors along with the feedback loop should (hopefully) result in performance similar to a single threaded 'Basic' style ray caster, just without artifacts and having guaranteed full symmetry.

Wow, I'm talkative (typeative?) today

Brian

mushroom patch · « **Reply #3 on:** March 14, 2015, 11:07:56 PM »

I only skimmed the OP, but whenever I've thought very hard about FOV/LOS computations in roguelikes and how to make them fast/scalable, I always come back to two approaches: Precomputation of all lines of sight (not as bad as it sounds in many realistic situations) and non-grid based computations based on polygonal representations of maps and binary space partitioning. The second, in my view, is by far the best option. Unsurprisingly, you rarely see discussion of it in connection with roguelikes.

Krice · « **Reply #4 on:** March 15, 2015, 09:08:40 AM »

Quote from: mushroom patch on March 14, 2015, 11:07:56 PM

and non-grid based computations based on polygonal representations of maps and binary space partitioning. The second, in my view, is by far the best option. Unsurprisingly, you rarely see discussion of it in connection with roguelikes.

We are only roguelike programmers, not professors of mathematics.

Omnivore · « **Reply #5 on:** March 15, 2015, 09:46:56 AM »

Quote from: Krice on March 15, 2015, 09:08:40 AM

Quote from: mushroom patch on March 14, 2015, 11:07:56 PM
and non-grid based computations based on polygonal representations of maps and binary space partitioning. The second, in my view, is by far the best option. Unsurprisingly, you rarely see discussion of it in connection with roguelikes.

We are only roguelike programmers, not professors of mathematics.

In proof of that, I totally fubar'd the feedback portion of the proposed algorithm. The suggested feedback mechanism was based on a flawed understanding of the math, and (I must plead lack of sleep or something) would've cost more to apply the feedback than would be saved in pruned calculation costs.

The (more) correct equation for the number of blocked points in quadrant I for all possible blocking points, required in such a cache has an upper bound of (R^3)/3 where R is the maximum radius. This is a much smaller number than I was anticipating and given the overall problem and the nature of hardware caches/cache miss costs, its simpler, easier, and faster to just, for each point in the cache, store an array of the points it would block - uncompressed. The worst case array size is bounded by the size of R^2 elements (actually less but then add in overhead), which for any reasonable value of R will comfortably fit in modern processor caches. Of course, again given the nature of modern processors, you probably wouldn't lose anything by using a pair of bytes for the coordinate storage format in the arrays.

With regards to using binary space partitioning in fov calculation, I'd be interested in hearing more. I'm primarily interested in applying parallel data flow techniques to various problems in roguelikes but I'm still very interested in applying good algorithms. One point against precalculated field of view algorithms is the problems they have with mutable terrain, if bsp trees can be used to mitigate or bypass those problems, it would be very interesting.

Thanks,
Brian

[Update] Parallel processing is a wonderful thing, overdoing it is not. My model D fails on the overdoing it criterion. The use of fine grained parallelism causes the cost of applying corrective feedback to skyrocket as that process must also be thread safe. However, it may still be that a similar approach, using a quadrant variant of model C might work since the feedback process can be internalized to the transform block and thus remain single threaded and simple. In retrospect, programming on the Intertec Superbrain II wasn't all that bad,

[Update2] After switching to a quadrant variant of model C, but keeping the same corrective feedback algorithm, in the midst of writing the line scan function I realized that if I dropped the 'fuzzy' visibility measurement for a straight boolean visible or no, I could skip the line scan entirely. Its rather obvious in retrospect but, this change reduces the map cell examination calls to one per cell visible in the output (assuming lit walls). Now every coordinate in the potential field of view is still checked against the blocked set so don't expect miraculous efficiency, also map checks do result in fairly expensive blocked set updates at blocking cells. Still, it should have full symmetry and zero wall artifacts so it is worth pursuing further.

mushroom patch · « **Reply #6 on:** March 15, 2015, 07:56:46 PM »

Quote from: Krice on March 15, 2015, 09:08:40 AM

Quote from: mushroom patch on March 14, 2015, 11:07:56 PM
and non-grid based computations based on polygonal representations of maps and binary space partitioning. The second, in my view, is by far the best option. Unsurprisingly, you rarely see discussion of it in connection with roguelikes.

We are only roguelike programmers, not professors of mathematics.

I recognize that, but the idea of binary space partitioning in commercial video games goes back at least to DOOM. It's not some kind of cutting edge technology.

My impression is that whatever performance issues exist in roguelike LOS/FOV computation (to the extent there really are any) are an artifact of sticking to a needlessly naive map representation.

Krice · « **Reply #7 on:** March 16, 2015, 11:20:43 AM »

Quote from: mushroom patch on March 15, 2015, 07:56:46 PM

My impression is that whatever performance issues exist in roguelike LOS/FOV computation

My experience tells that fast routines are also simple. There is no need trying to use something that however brilliant from technical point of view is just going to be slower than simple, brute force methods readily available.

mushroom patch · « **Reply #8 on:** March 16, 2015, 06:20:12 PM »

Quote from: Krice on March 16, 2015, 11:20:43 AM

Quote from: mushroom patch on March 15, 2015, 07:56:46 PM
My impression is that whatever performance issues exist in roguelike LOS/FOV computation

My experience tells that fast routines are also simple. There is no need trying to use something that however brilliant from technical point of view is just going to be slower than simple, brute force methods readily available.

There's a show-stopper. Krice's experience tells him something.

Would you be surprised to hear that my experience tells me this is often not true?

Krice · « **Reply #9 on:** March 17, 2015, 09:26:15 AM »

Show us it's faster. That will solve it.

Omnivore · « **Reply #10 on:** March 17, 2015, 10:33:19 AM »

In the mutable terrain case, map representation has next to no impact on the fov algorithm. If it has, you have way too much coupling. All you should need for fov purposes is a vector describing viewpoint location, a vision radius, and a simple callback function that answers the question "Can I see through the cell at position p?".

The data flow approach is not just about speed gain through parallelizing. Breaking an algorithm up into a pipeline also simplifies and decouples each stage yielding simpler, cleaner, and easier to maintain code. The performance characteristics give you more freedom in choice of algorithm as well as opening up the door to ideas like composable field of views which may yield a better result for things like obscuration, lighting, and mob 'AI'.

Trying to apply algorithms used in immutable terrain, free movement, real time 3d to mutable terrain, cell movement, turn based 2d seems to be a lost cause due to too many changes in not only model but criteria. I am willing to be proven wrong.

mushroom patch · « **Reply #11 on:** March 17, 2015, 06:42:52 PM »

@Omnivore, it is my understanding that the bsp recalculation can be done in realistic (3-dimensional) game situations in realtime with mutable environments (there's a paper on the subject, if memory serves, but I don't have a reference). I don't know of any instances of this being used in actual games. Obviously roguelikes have much lower performance requirements and only two-dimensional environments. In general, problems of line of sight, lighting, etc. are more naturally framed in a polygonal context and bsp techniques can solve such problems quickly in a way that scales logarithmically with map complexity, rather than with length of LoS, as tends to be the case in grid based situations.

@Krice, lol.

I'm glad you asked. Here's a simple example of where reality differs from your hard-won, "experience-based" maxims. Compare the following videos:

https://www.youtube.com/watch?v=ywWBy6J5gz8

https://www.youtube.com/watch?v=Ns4TPTC8whw

Two guesses for which algorithm is simpler...

Omnivore · « **Reply #12 on:** March 17, 2015, 09:02:59 PM »

Quote from: mushroom patch on March 17, 2015, 06:42:52 PM

@Omnivore, it is my understanding that the bsp recalculation can be done in realistic (3-dimensional) game situations in realtime with mutable environments (there's a paper on the subject, if memory serves, but I don't have a reference). I don't know of any instances of this being used in actual games. Obviously roguelikes have much lower performance requirements and only two-dimensional environments. In general, problems of line of sight, lighting, etc. are more naturally framed in a polygonal context and bsp techniques can solve such problems quickly in a way that scales logarithmically with map complexity, rather than with length of LoS, as tends to be the case in grid based situations.

Yeah after I wrote that I started wondering what the voxel engines are using. Sometimes my memory lags the date by a decade.

The algorithm I'm currently using is, in effect, a polygonal one, the cached 'shadow' vectors describe a polygonal area, though they describe it in detail not in a concise equation. It builds a customized lookup table from a precalculated set of those shadow areas naturally during the evaluation. The cost is a hash lookup and a series of vector translations for each blocking cell, though only an array dereference for a non-blocking cell.

If modeled as polygons, each shadow would be three or four points and you would build up a series of occlusion volumes If you check for overlap and appropriately combine polygons, you could reduce the number of polygons to check against. At some value of radius, I expect that this approach would begin to be faster than the lookup table approach. I suspect you could restrict the polygons to rectangles and use a bsp tree to store them.

PS: still LOL'ing at the vids

Krice · « **Reply #13 on:** March 18, 2015, 07:47:38 AM »

Quote from: mushroom patch on March 17, 2015, 06:42:52 PM

I'm glad you asked. Here's a simple example of where reality differs from your

So you can't prove it. How surprising.

mushroom patch · « **Reply #14 on:** March 18, 2015, 06:02:17 PM »

Krice --

My standard of evidence in arguing with idiots on the internet, yourself included, is a preponderance of evidence standard. If 51% of the evidence brought to bear supports my position, I consider my job done. I don't feel the need to substantiate my positions more rigorously than that in the face of commentary of the quality you bring. In this thread, you've made two related claims: One (1), that simple algorithms are usually or even always better (which I have disproven conclusively and which no reasonably educated person would've made in the first place) and, two (2), that my position that bsp based LOS/FOV methods offer performance improvements in roguelike games (if such improvement is actually needed for some reason) requires "proof" -- something it's obvious you would be unable to appreciate were it on offer -- on the basis of your first claim.

Again, as your first claim is so ridiculous on the face of it, it compels no such proof. Moreover, I have offered some argument in favor of my claims re: bsp methods, which I doubt you are qualified to meet, which in any case you don't, and which people who are qualified to comment seem to at least believe are plausible. This, to me, closes the matter, unless someone else has something intelligent to say about it, which would be very welcome indeed.

News: Read the RULES before posting.

Author Topic: Dataflow programming in roguelikes, exploring Field of view calculation (Read 31758 times)

Omnivore

Dataflow programming in roguelikes, exploring Field of view calculation

Trystan

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Omnivore

Re: Dataflow programming in roguelikes, exploring Field of view calculation

mushroom patch

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Krice

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Omnivore

Re: Dataflow programming in roguelikes, exploring Field of view calculation

mushroom patch

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Krice

Re: Dataflow programming in roguelikes, exploring Field of view calculation

mushroom patch

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Krice

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Omnivore

Re: Dataflow programming in roguelikes, exploring Field of view calculation

mushroom patch

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Omnivore

Re: Dataflow programming in roguelikes, exploring Field of view calculation

Krice

Re: Dataflow programming in roguelikes, exploring Field of view calculation

mushroom patch

Re: Dataflow programming in roguelikes, exploring Field of view calculation