Author Topic: Smart AI vs Expensive AI  (Read 54748 times)

Bear

  • Rogueliker
  • ***
  • Posts: 308
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #45 on: July 01, 2014, 08:22:35 PM »
And, thinking some more (a dangerous pastime, I know)... 

Suppose we were to store at each tile a bitmap of which other tiles within sight radius of it are visible.  With a sight radius of 15, that would be about 500 bits per tile: since visibility is mutual, each pair only needs a bit at the leftmost tile of the pair (or at the uppermost, if they're in the same column), so you only need bits for half of the 31x31 tiles under consideration. 

Now this is a factor of about 8 bigger than what I'm using for FastLOS, but it's a lot more straightforward to calculate and has an additional advantage of requiring storage that scales linearly with increasing area rather than with the square of increasing area. 
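
To make the indexing concrete, here is a rough C++ sketch of that per-tile half-window bitmap (illustrative names of my own, not actual FastLOS code). It rounds the 496 used bits up to eight 64-bit words per tile and assumes callers only ever ask about pairs within the sight radius:

Code: [Select]
// Per-tile visibility bitmap covering half of the 31x31 window around each tile.
// Each pair is stored exactly once, at its leftmost tile (uppermost if same column).
#include <cstdint>
#include <utility>
#include <vector>

struct HalfWindowVis {
    static const int R = 15;          // sight radius
    static const int WORDS = 8;       // 8 * 64 = 512 bits >= 16 * 31 = 496 bits used
    int w, h;
    std::vector<uint64_t> bits;       // WORDS words per tile

    HalfWindowVis(int w_, int h_) : w(w_), h(h_), bits(uint64_t(w_) * h_ * WORDS, 0) {}

    // Canonicalize so the pair is owned by its leftmost (then uppermost) tile.
    static void canon(int& x1, int& y1, int& x2, int& y2) {
        if (x2 < x1 || (x2 == x1 && y2 < y1)) { std::swap(x1, x2); std::swap(y1, y2); }
    }

    // Bit position inside a tile's window: dx in [0, R], dy in [-R, R].
    static int local_index(int dx, int dy) { return dx * (2 * R + 1) + (dy + R); }

    void set(int x1, int y1, int x2, int y2, bool vis) {
        canon(x1, y1, x2, y2);
        int li = local_index(x2 - x1, y2 - y1);
        uint64_t& word = bits[(uint64_t(y1) * w + x1) * WORDS + (li >> 6)];
        if (vis) word |=  (uint64_t(1) << (li & 63));
        else     word &= ~(uint64_t(1) << (li & 63));
    }

    bool visible(int x1, int y1, int x2, int y2) const {
        canon(x1, y1, x2, y2);
        int li = local_index(x2 - x1, y2 - y1);
        return (bits[(uint64_t(y1) * w + x1) * WORDS + (li >> 6)] >> (li & 63)) & 1;
    }
};

That works out to 64 bytes (512 bits) per tile versus 8 bytes per tile for a 64-bit mask, which is where the "factor of about 8" comes from.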


mushroom patch

  • Rogueliker
  • ***
  • Posts: 554
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #46 on: July 01, 2014, 11:54:42 PM »
Maybe this is a stupid question, but it seems to me that if you're willing to precompute FOV for every square (which doesn't seem crazy to me), why not just throw all mutually viewable pairs into some constant time access collection? In a reasonable dungeon layout (like, one with tunnels and walls and stuff), this shouldn't turn out to be all that much stuff. If you don't care about improbable false positives (which I think is a reasonable compromise), you can use a Bloom filter to keep the memory use under control.

No, it isn't stupid.  It's much simpler and easier to get right than what I wrote. I didn't really consider it because the storage requirements are O(n^2) rather than O(n).  I don't think it really matters though at the scale I'm considering in practice. 

Let me think...  My 'mapchunks' (one level per mapchunk, in most cases) are 100 x 100, so 10000 tiles per level.  So the storage requirement per mapchunk, using one bit per tile pair, is (10000x10001)/2 = 50005000 bits, or about 6 megabytes.  That's too chunky for a phone, but not outrageous on a desktop machine. The way I'm doing it, with a 64 bit mask per tile, is 640000 bits, or about 80 kilobytes. 

So you'd get much simpler code, easier to debug, and about the same speed modulo cache issues, and I'd get a savings of about 99% in memory devoted to speeding up line of sight checks.



I think it's hard to analyze what the asymptotic storage requirement really is because it depends on local geometric properties of the map that might be hard to get a handle on. What I meant by "reasonable" maps is pretty much exemplified by the maps reaver uses: rooms not much bigger than 15x15 connected by narrower corridors. I assume your 'n' here is the number of non-wall tiles. In that case, I think for "reasonable maps" the storage requirement is closer to O(n). (Obviously, for a totally open map, it will be O(n^2).)

I mean, the silliest thing you could do is just take an array A[x_1, y_1, x_2, y_2] with four axes representing (x_1, y_1) and (x_2, y_2) for all pairs of points and record a boolean value in each element for line of sight. This would surely be faster than anything involving computing hashes (like a Bloom filter), but the memory cost is like O(d^4), where d is the diameter of the map. But then you say, wait, all the stuff we care about is in the part of this thing where (x_1, y_1) is close to (x_2, y_2). Now say s is the average number of squares viewable from any given square -- this is a geometric quantity associated to a "reasonable" map generation algorithm. Then your storage cost with just throwing all the mutually viewable pairs into a constant time access collection depends on your storage cost for the collection -- this should be O(p) where p is the number of pairs. Obviously, p = sn, so what we have is O(sn).

I guess my core claim, which obviously I'm not going to carefully formalize or prove, but I think it should be intuitively clear, is that s does not depend on a particular map, but only on the generating algorithm if the algorithm is "reasonable." In particular, making a bigger or smaller map with the same algorithm will result in roughly the same s. In this sense, your storage cost is actually O(n) for a given algorithm.
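
To make that concrete, here is a minimal sketch of the "throw all mutually viewable pairs into a constant-time collection" idea (my own illustrative code, assuming coordinates fit in 16 bits): canonicalize each pair, pack it into a single 64-bit key, and keep the keys in a hash set, so storage grows with the number of mutually viewable pairs (roughly s*n) rather than with all n^2 pairs.

Code: [Select]
#include <cstdint>
#include <unordered_set>
#include <utility>

// Pack a canonicalized pair of tile coordinates into one 64-bit key.
inline uint64_t pair_key(int x1, int y1, int x2, int y2) {
    if (x2 < x1 || (x2 == x1 && y2 < y1)) { std::swap(x1, x2); std::swap(y1, y2); }
    return (uint64_t(uint16_t(x1)) << 48) | (uint64_t(uint16_t(y1)) << 32) |
           (uint64_t(uint16_t(x2)) << 16) |  uint64_t(uint16_t(y2));
}

struct VisiblePairs {
    std::unordered_set<uint64_t> pairs;   // one entry per mutually viewable pair

    void add(int x1, int y1, int x2, int y2) { pairs.insert(pair_key(x1, y1, x2, y2)); }

    bool visible(int x1, int y1, int x2, int y2) const {
        return pairs.count(pair_key(x1, y1, x2, y2)) != 0;   // average O(1) lookup
    }
};

Note that a general-purpose hash set costs considerably more than one bit per stored pair (buckets, hashing, load factor), which is part of what the Bloom filter suggestion is meant to address.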

mushroom patch

  • Rogueliker
  • ***
  • Posts: 554
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #47 on: July 02, 2014, 03:28:43 AM »
Suppose we were to store at each tile a bitmap of which other tiles within sight radius of it are visible.  With a sight radius of 15, that would be about 500 bits per tile: since visibility is mutual, each pair only needs a bit at the leftmost tile of the pair (or at the uppermost, if they're in the same column), so you only need bits for half of the 31x31 tiles under consideration. 

Now this is a factor of about 8 bigger than what I'm using for FastLOS, but it's a lot more straightforward to calculate and has an additional advantage of requiring storage that scales linearly with increasing area rather than with the square of increasing area. 

Yeah, my feeling is that this would be less storage efficient than using a Bloom filter on mutually viewable pairs with .1% false positives (~16 bits per mutually viewable pair, does not actually store the pairs), but much faster on look-ups. With a maximum radius of 15, I think most of the pairs that will be represented in your data structure will not be mutually viewable (again, for a "reasonable" map).
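
For reference, the usual Bloom filter sizing formula is consistent with that figure: bits per stored element = -ln(p) / (ln 2)^2, roughly 1.44 * log2(1/p), so a 0.1% false-positive rate (p = 0.001) works out to about 14.4 bits per mutually viewable pair with roughly k = 10 hash functions; call it ~16 bits with a little slack.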
« Last Edit: July 02, 2014, 03:41:34 AM by mushroom patch »

reaver

  • Rogueliker
  • ***
  • Posts: 207
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #48 on: July 02, 2014, 07:59:27 AM »
Bear, thanks for the extra explanations of FastLOS, I'll check it out when I have a bit more time; at the moment I'm trying to finish testing the bit-fiddled brute force method we were talking about.

 
Let me think...  My 'mapchunks' (one level per mapchunk, in most cases) are 100 x 100, so 10000 tiles per level.  So the storage requirement per mapchunk, using one bit per tile pair, is (10000x10001)/2 = 50005000 bits, or about 6 megabytes.  That's too chunky for a phone, but not outrageous on a desktop machine. The way I'm doing it, with a 64 bit mask per tile, is 640000 bits, or about 80 kilobytes. 

So you'd get much simpler code, easier to debug, and about the same speed modulo cache issues, and I'd get a savings of about 99% in memory devoted to speeding up line of sight checks.

Suppose we were to store at each tile a bitmap of which other tiles within sight radius of it are visible.  With a sight radius of 15, that would be about 500 bits per tile: since visibility is mutual, each pair only needs a bit at the leftmost tile of the pair (or at the uppermost, if they're in the same column), so you only need bits for half of the 31x31 tiles under consideration. 

Now this is a factor of about 8 bigger than what I'm using for FastLOS, but it's a lot more straightforward to calculate and has an additional advantage of requiring storage that scales linearly with increasing area rather than with the square of increasing area. 

Wait, wait, there might have been some confusion: the costs of the tile pair storage would not be as you suggest. For a 100x100 map with a 15 max sight radius, the costs would be 2^N bits, where:

N = mapIndexBits + 2*sightRadBits + 1              (1)

mapIndexBits = the number of bits needed to represent the linear map index. For 100x100 = 10000, and 10000 < 16384, so:
mapIndexBits = 14

sightRadBits = the number of bits needed to represent the sight radius. For 15, 15 < 16, so:
sightRadBits = 4

So, if you plug the numbers above into eq. (1), you'll need 23 bits, which means 2^23 bits = 1 MB of storage.  With this 1 MB you can store up to a 128x128 map with a 15 max sight radius. By now, you might be asking where eq. (1) comes from.

Now, as I said in one of the previous posts, when you want to store a pair of points, you can do the following (see the sketch below):
* sort the points by the X axis
* store the first point's linear index in the mapIndexBits (the linear index is "point.x + map.width*point.y")
* store the second point's offset from the first point in the remaining bits. The offset in X is ALWAYS positive or 0, so sightRadBits will suffice. The offset in Y needs one more bit for the sign. And that's why it's "2*sightRadBits + 1" in eq. (1).
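
A rough C++ sketch of that packing (my own illustrative names, hard-coded for sightRadBits = 4, i.e. a max sight radius of 15, and not reaver's actual code) could look like this:

Code: [Select]
#include <cstdint>
#include <utility>
#include <vector>

// Dense precalculated-LoS bit buffer: one bit per (point, offset) pair.
// Bit index = linear(p1) * 2^9 + dx * 2^5 + (dy + 15), where p1 is the leftmost
// point of the pair, dx = p2.x - p1.x in [0, 15] (sightRadBits bits) and
// dy = p2.y - p1.y in [-15, 15] (sightRadBits + 1 bits), as in eq. (1).
struct DenseLosCache {
    static const int R = 15;
    int w, h;
    std::vector<uint64_t> bits;

    DenseLosCache(int w_, int h_)
        : w(w_), h(h_), bits(((uint64_t(w_) * h_) << 9) / 64, 0) {}

    // Caller must ensure the two points are within the max sight radius of each other.
    uint64_t bit_index(int x1, int y1, int x2, int y2) const {
        if (x2 < x1) { std::swap(x1, x2); std::swap(y1, y2); }   // sort the points by X
        uint64_t linear = uint64_t(y1) * w + x1;                 // map index part
        uint64_t dx = uint64_t(x2 - x1);                         // 0..15, always >= 0
        uint64_t dy = uint64_t(y2 - y1 + R);                     // 0..30, sign folded in
        return (linear << 9) | (dx << 5) | dy;
    }

    void set(int x1, int y1, int x2, int y2, bool vis) {
        uint64_t i = bit_index(x1, y1, x2, y2);
        if (vis) bits[i >> 6] |=  (uint64_t(1) << (i & 63));
        else     bits[i >> 6] &= ~(uint64_t(1) << (i & 63));
    }

    bool visible(int x1, int y1, int x2, int y2) const {   // just muls/adds and bit ops
        uint64_t i = bit_index(x1, y1, x2, y2);
        return (bits[i >> 6] >> (i & 63)) & 1;
    }
};

This version sizes the buffer by w*h; the 2^23-bit (1 MB) figure above corresponds to rounding the map index up to a full 14 bits, i.e. a 128x128 map.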

The costs for updates are also very nice: if you update a few tiles, you would just need to run the FoV algorithm in these tiles:
Code: [Select]
for each tile in updated_tiles:
    view_points  = points_in_radius(tile, max_sight_radius)     // get all points in the given radius, centered on tile
    for each vpoint in view_points:
        is_visible = LineOfSight(tile, vpoint)              // Calculate line of sight with your fav fov
        set_precalc_los_bit( tile, vpoint, is_visible) // update our dense precalculated-LoS map
« Last Edit: July 02, 2014, 08:14:53 AM by reaver »

reaver

  • Rogueliker
  • ***
  • Posts: 207
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #49 on: July 02, 2014, 09:28:08 AM »
Ok, some results. The test is as follows: I generate a standard dungeon map with the specified dimensions, place an actor somewhere at random, set his LoS range, and let him autoexplore. The test is over when all tiles have been explored. I use two methods to calculate FoV:

Old: run recursive shadowcasting
New: for each tile in the FoV radius, test visibility using the precalculated map.

Code is in C++
Laptop is an i7 @ 2.5 GHz, Desktop is a Xeon @ 2.6 GHz.
Los is the LoS radius
There are three numbers:
    1. total number of milliseconds for the visibility calculation
    2. total number of turns
    3. milliseconds per visibility calculation (average)

------------------------------------------------------------
Laptop
Los: 7
80x25
New: 6.44531 - 342 : 0.0188459
Old: 13.7655 - 346 : 0.0397846

160x50
New: 27.361 - 1303 : 0.0209984
Old: 50.5674 - 1304 : 0.0387787

320x100
New: 138.124 - 5267 : 0.0262245
Old: 218.997 - 5356 : 0.0408881


--------------------------------
Desktop
Los: 7
80x25
New: 2.26241 - 342 : 0.00661523
Old: 5.72091 - 346 : 0.0165344

160x50
New: 9.13756 - 1303 : 0.00701271
Old: 22.3549 - 1304 : 0.0171433

320x100
New: 44.7344 - 5267 : 0.00849333
Old: 94.1606 - 5356 : 0.0175804

----------------------
Desktop
Los : 15
160x50
New: 26.1644 - 1188 : 0.0220239
Old: 25.9351 - 1178 : 0.0220162

Los : 4
160x50
New: 5.26167 - 1835 : 0.0028674
Old: 22.4202 - 1836 : 0.0122114

--------------------------------------------------------
Interpretation:

Well, for FoV calculation where you do an LoS check against each tile, the benefits shrink as the bottleneck shifts to the "for each tile" processing.
If you have LoS calculations as described earlier in the thread (for each interesting entity in a given radius, run LoS), it can be very, very beneficial.
What annoys me is that I have a bug somewhere, as the bot takes a different number of turns to complete the autoexplore even when I start with the same seed. Anyway, I don't have time to check that now.

mushroom patch

  • Rogueliker
  • ***
  • Posts: 554
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #50 on: July 02, 2014, 03:46:19 PM »
Hm, I'm not sure I get your test. So you're checking field of view by iterating over all squares within the given radius and checking line of sight to the current location and using this data to autoexplore? It seems like a practical use case, but I'm not so sure about calculating the full field of view on each move. Do I have this right?

What I had in mind was more of a pure test of the speed of the LoS check: Just pick a non-wall square and another non-wall square within the LoS radius (randomly in both cases) and check whether they're mutually viewable in the usual way vs. in the precomputed/cached way. It sounds like you're saying caching gives ~2x improvement over more obvious line of sight checks (which is not as good as I was hoping for, actually, but not bad) in terms of raw speed. Is that right?

reaver

  • Rogueliker
  • ***
  • Posts: 207
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #51 on: July 02, 2014, 04:21:47 PM »
Hm, I'm not sure I get your test. So you're checking field of view by iterating over all squares within the given radius and checking line of sight to the current location and using this data to autoexplore? It seems like a practical use case, but I'm not so sure about calculating the full field of view on each move. Do I have this right?

Yes. I have not implemented a very smart/progressive FoV calculation, but my dumb FoV uses LoS a lot, which is why it made sense to fit it into the test.

What I had in mind was more of a pure test of the speed of the LoS check: Just pick a non-wall square and another non-wall square within the LoS radius (randomly in both cases) and check whether they're mutually viewable in the usual way vs. in the precomputed/cached way.

The test you suggest is quite easy to implement; I'll try it. I would expect a massive difference in speed. I hadn't tried it because it's not going to be a real test case, but I guess it'll serve its purpose.

It sounds like you're saying caching gives ~2x improvement over more obvious line of sight checks (which is not as good as I was hoping for, actually, but not bad) in terms of raw speed. Is that right?

Yes, when you calculate FoV using "for each tile within radius, calculate LoS".

mushroom patch

  • Rogueliker
  • ***
  • Posts: 554
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #52 on: July 02, 2014, 04:43:52 PM »
I guess my question is about the comparison of this precomputation business versus a line of sight algorithm (something like: compute a discretized version of the line connecting two points and check whether there's an obstruction in any of the squares in that approximation). It's not at all clear to me which of these two should be faster on average -- I guess we should expect to see more LoS computations between points near the maximum line of sight distance, which works against the conventional LoS algorithm, but I don't have much intuition for how fast C++ maps are.
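
For concreteness, a conventional discretized-line check of the kind being described might look like the sketch below (a plain Bresenham walk; is_opaque is an assumed map query, not anything from the thread). Each query visits every square between the endpoints, so it costs O(d) map reads per check, versus a single lookup for the precomputed schemes.

Code: [Select]
#include <cstdlib>

bool is_opaque(int x, int y);   // assumed to be provided by the map representation

// Walk the Bresenham line from (x0, y0) to (x1, y1); sight is blocked if any
// intermediate square is opaque (the endpoints themselves are not tested).
bool line_of_sight(int x0, int y0, int x1, int y1) {
    int dx = std::abs(x1 - x0), sx = x0 < x1 ? 1 : -1;
    int dy = -std::abs(y1 - y0), sy = y0 < y1 ? 1 : -1;
    int err = dx + dy;
    while (x0 != x1 || y0 != y1) {
        int e2 = 2 * err;
        if (e2 >= dy) { err += dy; x0 += sx; }
        if (e2 <= dx) { err += dx; y0 += sy; }
        if ((x0 != x1 || y0 != y1) && is_opaque(x0, y0)) return false;
    }
    return true;
}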

Bear

  • Rogueliker
  • ***
  • Posts: 308
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #53 on: July 02, 2014, 05:15:44 PM »
There is an important difference between line of sight and field of view, for most algorithms. 

If you're working directly with map geometry, it's always more efficient per tile to calculate field of view (i.e., to derive at once the whole set of tiles to which line of sight exists) than line of sight.

This is because when you're doing field of view using one of the best-in-class algorithms like spiral-path or recursive-shadowcasting, it takes O(1) to add each tile to the set, working from the geometry of the map.  But if you're computing line-of-sight for a single square, the map-based algorithms are O(n) where n is distance, because you have to visit every square between the origin and the target to make sure your view isn't obstructed by map geometry along the way. 

So if you're doing field-of-view well (one of the best in class algorithms) FastLOS should change the time required for field-of-view by a constant factor - probably less than 1 (meaning faster) but it depends on your implementation. 

If you're doing field of view badly, by calculating line-of-sight to every tile iteratively, you should expect switching to FastLOS (or spiral-path, or recursive-shadowcasting) to give you a speedup that scales proportional to your sight radius times a constant.  But no matter what map-based algorithm you're using for line-of-sight, switching to FastLOS (or any constant-time access to precomputed LOS information) should give you that same proportional speedup on that task.

What a method such as FastLOS (or any of the variants we've been discussing here) does, is give you access to precomputed LOS information in O(1), which is the same as the best map-based algorithms can get it when they're doing field-of-view.  If you're doing Field of view, FastLOS is therefore on the same order of complexity (but likely a constant factor faster depending on implementation details) as Spiral-Path or Recursive-Shadowcasting.   But if you're doing Line of Sight, you're getting O(1) where a classic LoS algorithm would give you, at best, O(n) where n is the number of tiles separating origin and target. 

« Last Edit: July 02, 2014, 05:24:34 PM by Bear »

mushroom patch

  • Rogueliker
  • ***
  • Posts: 554
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #54 on: July 02, 2014, 05:49:59 PM »
Yeah, that's not really what my question is about. If what you want to do is calculate LoS here and there for deciding if something can hit something else with a spell or missile weapon, then obviously an LoS computation will be faster than FoV. Yes, LoS is O(d), but FoV for the same purpose is O(d^2) in the worst case. Reaver is talking about using C++ map data structures (which I think are implemented as red-black trees -- so O(log n) access time) to store mutually viewable pairs for LoS testing. My question is whether, in practice, reaver's map based implementation is actually faster than LoS and what the difference is. I don't think this is immediately clear. We're not talking large values of d here (maybe large values of n though). It depends on the map data structure, which is something I don't know the specifics of performance-wise.


Bear

  • Rogueliker
  • ***
  • Posts: 308
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #55 on: July 02, 2014, 05:54:39 PM »
As far as I know there is no performance guarantee for the C++ map data structure.  Given its ordering constraints, and assuming the people who wrote your stdlib are either sane or interested in efficiency, you're probably right about it being implemented as a red-black tree and thus having O(log n) access. 

But the ordering constraints are useless in this case so you should be using an unordered map.  Unordered maps are implemented as hash tables, thus O(1) access.
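
Concretely (a trivial sketch, assuming the pairs are packed into 64-bit keys as discussed earlier in the thread), the difference is just:

Code: [Select]
#include <cstdint>
#include <map>
#include <unordered_set>

std::map<uint64_t, bool> los_tree;        // ordered map: balanced tree, O(log n) lookups
std::unordered_set<uint64_t> los_pairs;   // hash table: average O(1) lookups;
                                          // membership alone can mean "mutually visible"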


reaver

  • Rogueliker
  • ***
  • Posts: 207
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #56 on: July 02, 2014, 09:30:41 PM »
There is an important difference between line of sight and field of view, for most algorithms. 

If you're working directly with map geometry, it's always more efficient per tile to calculate field of view (i.e., to derive at once the whole set of tiles to which line of sight exists) than line of sight. ...

I know my FoV might be suboptimal, but I wasn't trying to demonstrate my FoV algorithm. I was just showing that even for FoV, using the precalculated LoS for every square is faster. I expect that comparing Bresenham LoS to the precalculated one presented here will be night and day.

Reaver is talking about using C++ map data structures (which I think are implemented as red-black trees -- so O(log n) access time) to store mutually viewable pairs for LoS testing. My question is whether, in practice, reaver's map based implementation is actually faster than LoS and what the difference is.

No, no, does nobody read? It's DENSE: no map data structure, just a biiiiig bit buffer where you encode/decode the pair location/offset as an index, and the visibility is the bit value. It's guaranteed to be faster than anything else you throw at it, unless that something is faster than a few muls/adds and bit ops, which I kinda doubt. Damn it, I'll do that LoS test to show you the speed diff.

Bear

  • Rogueliker
  • ***
  • Posts: 308
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #57 on: July 02, 2014, 09:46:19 PM »
No, no, does nobody read? It's DENSE: no map data structure, just a biiiiig bit buffer where you encode/decode the pair location/offset as an index, and the visibility is the bit value. It's guaranteed to be faster than anything else you throw at it, unless that something is faster than a few muls/adds and bit ops, which I kinda doubt. Damn it, I'll do that LoS test to show you the speed diff.

Eggs Ackley.  Reaver and I have been talking about bitbuffers.  Others have been talking about more complex constructed data structures such as maps, lists, Bloom filters, etc. 

There are good reasons, as Reaver just pointed out, to use bitbuffers if you can make them work for your needs.  If you can get them packed appropriately, there is hardly any way to get anything faster.  The FastLOS sight mask is a specialized little bitbuffer; I work hard during map generation to get the sightlines boiled down to where I can check them using binary AND. 

Omnivore

  • Rogueliker
  • ***
  • Posts: 154
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #58 on: July 02, 2014, 10:30:34 PM »
Questions from the peanut gallery:

1) What impact does opening/closing a door (removing or adding an obstacle) have?
2) Can these structures be lazily initialized, or is precalculation a requirement?
3) Assuming full exploration is the best-case usage scenario, what is the worst-case usage scenario, and what performance gain over standard FoV algos can be expected there?


reaver

  • Rogueliker
  • ***
  • Posts: 207
  • Karma: +0/-0
    • View Profile
Re: Smart AI vs Expensive AI
« Reply #59 on: July 02, 2014, 10:41:29 PM »
Questions from the peanut gallery:

1) What impact does opening/closing a door (removing or adding an obstacle) have?
2) Can these structures be lazily initialized, or is precalculation a requirement?
3) Assuming full exploration is the best-case usage scenario, what is the worst-case usage scenario, and what performance gain over standard FoV algos can be expected there?

1) In my case: you calculate the FoV from that tile with your favorite algo and set the FoV bits to visible/not visible. So around 50 bit buffer accesses.
2) They could be lazily initialized, but precalculation makes more sense. In my case, the precalculation amounts to running the FoV algo from every single point.
3) Full exploration is not the best-case scenario, it's just one that I chose! Again, as Bear said, you won't see the massive differences in FoV calcs, but in LoS checks.

Anyway, some stats:
values are the total time in milliseconds for the following:
   for LoS values from 3 to 15, do 1000000 random pair checks, where the pairs have distance <= los

Bresenham is LoS with... Bresenham.
Precalc samples from the bitbuffer.
null calculates the time to read the pre-generated random pair lists, and is used to calculate the "base cost" before doing any algorithm work.

80x25 map
Bresenham: 3790.5
Precalc: 570.626
null: 179.297
----------------------------------------------------
160x50 map
Bresenham: 4229.73
Precalc: 608.415
null: 131.031
----------------------------------------------------
320x100 map
Bresenham: 4676
Precalc: 714.562
null: 129.874

---------------------------

So, the above says that using the precalculated bitbuffer is about 6.6x faster than calculating LoS using Bresenham (closer to 8-9x once you subtract the null base cost from both). For what it's worth, I think it's now time to move on to coding something more useful.
« Last Edit: July 02, 2014, 10:43:25 PM by reaver »