Author Topic: Generating a list of unique random numbers (Read 47790 times)

Z · « **Reply #15 on:** April 13, 2014, 02:47:48 PM »

Quote from: sokol815 on April 12, 2014, 04:01:37 AM

ah, the experiences of lists of random numbers. Javascript makes something like this super easy:

Code: [Select]
var numArr = Array(20); for(var i = 0; i < numArr.length;i++){ numArr[i] = i; } numArr.sort(function(){return Math.random()-.5;}); console.log(numArr);

It is impossible to randomly shuffle three values in a bounded number of coin tosses. Thus, this method either makes some permutations more probable than others or it uses an unbonded number of comparisons, depending on how "sort" works. For example, on my browser,

Code: [Select]

ile = Array(); for(t=0; t<1000000; t++) { arr = Array(1,2,3); arr.sort(function(){return Math.random()-.5;}); s=arr[0]+","+arr[1]+","+arr[2]; if(ile[s]) ile[s]++; else ile[s] = 1; } ile

returns 1,2,3 and 2,1,3 twice as frequently as other permutations.

EDIT: With 4 elements some permutations turn out to appear 180 times more frequently than others. Even if you look just at the first element: 1 and 2 appear with probability 28%, 3 with 18% and 4 with 25%. The probabilities of choosing each element seem out to be more and more different when the number of elements increased (with 17 elements, 1..17, they range from 3.5% for 6 to 15.8% for 1). If you wrote a game using this, it could be completely different on a different browser.

Jaxian · « **Reply #16 on:** April 13, 2014, 03:59:32 PM »

Quote from: chooseusername on April 13, 2014, 05:51:46 AM

Quote from: Jaxian on April 12, 2014, 04:30:53 PM
However, the line "val in value_list" requires the code to compare "val" to every single value in the list. As the list grows in size, this could take a longer time. For this reason, it is better to use a Set instead of a List. A Set uses hashing so that the same operation can be completed much more quickly:

Code: [Select]
used_values = set() for x in range(0, 100): val = random() while val in used_values: val = random() used_values.add(val) // Do something with val, maybe append to a result list or call a function to build part of your map
Note that for every number generated, val is looked up in used_values twice. Once to check if it is in there already, and once when it gets added (because that's how sets work). If you take this into account, you can reduce this example to what I wrote earlier. Of course, you are not talking about Python's random.random() here, that would be bad.

A good point. It did not occur to me to me to use the length of the set to determine whether or not a value was added uniquely, and I do think that avoiding a second hash is worthwhile. As for which version of random() I am referring to, I don't know Python, so I can't say. If there is a certain version of random() that should be avoided, then hopefully someone can explain that here.

chooseusername · « **Reply #17 on:** April 13, 2014, 08:29:01 PM »

Quote from: Jaxian on April 13, 2014, 03:59:32 PM

If there is a certain version of random() that should be avoided, then hopefully someone can explain that here.

random() is the name of the standard library function which returns a float between 0.0-1.0. There would of course be no point in putting that as the key in a dictionary or set.

chooseusername · « **Reply #18 on:** April 13, 2014, 08:51:47 PM »

Quote from: mushroom patch on April 13, 2014, 01:46:26 PM

Edit: Perhaps a more topical example: Suppose you want to generate treasures to populate a vault, but you don't want a lot of repeated treasures or treasure types and you want to do it in a flexible, data-oriented way (e.g. you want it to depend on dungeon levels, maybe some kind of theming for different vault types, etc.). You can write a routine that puts together a large list of item types (or better item builder objects), then use random.sample to draw from it.

Do you really believe this is a realistic example? Or is it one contrived to employ random.sample()? To me, it seems like the latter. Is there anyone out there other than me that's written loot dropping code, who would write it this way or use random.sample()?

requerent · « **Reply #19 on:** April 13, 2014, 09:08:53 PM »

Quote from: chooseusername on April 13, 2014, 05:51:46 AM

random.sample() is a limited function, and it smacks to me of something someone put in to handle their own special case. The Python standard library has many of these sorts of things. There are arbitrary modules in there that almost do what I want, but are just unusable because whomever wrote them wanted them to work in a way that was unusable for me. Back to sample(), I've never had a sequence I needed to pick out a unique list of entries from. If I had to do this, it would be clearer IMO to write the code yourself than to call this function. Most of my random number usage was picking from weighted lists, and repicking was desirable. After all, you want 2x50 arrows sometimes, 1x50 arrows some other times, and so forth.

This is ludicrous. Python standard libraries are written in C- many of which have undergone pretty severe optimizations. I don't even like python for games, but if you're going to use it, you want to leverage the standard libraries as much as possible. Period. Even if a python library doesn't have exactly what you need, it's going to give you better performance to modify the results afterwards than to try and implement it entirely in python. Or you can go shopping for other modules that fit your purpose more precisely. At all costs, avoid coding in actual python as much as possible. That is the more pythonic approach.

Quote

Do you really believe this is a realistic example? Or is it one contrived to employ random.sample()? To me, it seems like the latter. Is there anyone out there other than me that's written loot dropping code, who would write it this way or use random.sample()?

If the OP is asking how to do it, that is sufficiently good enough reason to implement it in that matter or discuss such an implementation. The problem domain is unique to each individual's design goals, it doesn't matter if nobody has done it yet. There are a LOT of ways to utilize sampling for loot. Arguing against its use while ignorant of the problem domain is just silly. The OP isn't asking for advice on how to generate loot, he's asking for a specific algorithm which may be perfect for the game he's working on.

mushroom patch · « **Reply #20 on:** April 13, 2014, 09:29:46 PM »

Quote from: chooseusername on April 13, 2014, 08:51:47 PM

Quote from: mushroom patch on April 13, 2014, 01:46:26 PM
Edit: Perhaps a more topical example: Suppose you want to generate treasures to populate a vault, but you don't want a lot of repeated treasures or treasure types and you want to do it in a flexible, data-oriented way (e.g. you want it to depend on dungeon levels, maybe some kind of theming for different vault types, etc.). You can write a routine that puts together a large list of item types (or better item builder objects), then use random.sample to draw from it.
Do you really believe this is a realistic example? Or is it one contrived to employ random.sample()? To me, it seems like the latter. Is there anyone out there other than me that's written loot dropping code, who would write it this way or use random.sample()?

Absolutely. It's a lot cleaner than the ridiculous hacks you see in major roguelikes for tasks like monster and item generation, e.g. run through a list, add up a bunch of weights assigned to each possible item, make a list of threshold values, generate a random number, etc.

I mean, if your argument is that many roguelike developers don't use/know what sampling without replacement and/or random permutations are, I'll totally grant you that. I don't think that really means anything though, except perhaps that there's a lot of scope for improvement on the state of the art.

Also:

Quote from: chooseusername on April 13, 2014, 08:29:01 PM

Quote from: Jaxian on April 13, 2014, 03:59:32 PM
If there is a certain version of random() that should be avoided, then hopefully someone can explain that here.
random() is the name of the standard library function which returns a float between 0.0-1.0. There would of course be no point in putting that as the key in a dictionary or set.

Actually, a reasonable way to do what the OP asks (if you don't know about random.sample) would something like sort(l, key=lambda x: random.random()) -- you risk key value collisions (which just results in slightly less random behavior than desired), but the odds of that are low enough that they have almost no effect statistically. Python only evaluates the key function once per item and stores the result, so there's no problem doing it this way.

chooseusername · « **Reply #21 on:** April 13, 2014, 10:19:20 PM »

Quote from: requerent on April 13, 2014, 09:08:53 PM

This is ludicrous. Python standard libraries are written in C- many of which have undergone pretty severe optimizations. I don't even like python for games, but if you're going to use it, you want to leverage the standard libraries as much as possible. Period. Even if a python library doesn't have exactly what you need, it's going to give you better performance to modify the results afterwards than to try and implement it entirely in python. Or you can go shopping for other modules that fit your purpose more precisely. At all costs, avoid coding in actual python as much as possible. That is the more pythonic approach.

Have you ever looked at the standard library? Very little of it is written in C. And have you ever subscribed to the python-dev list for a reasonable period of time? The actual process of getting something into the standard library comes mainly from attrition, or it has for most of the lifetime of Python development. All you have to do is have something you want in there, and the staying power to argue away all detractors, and to offer to maintain it once it is in there. So what happens is that what gets into the standard library, is not the best or most useful, it's just the most pushed by those who can be bothered.

You seem to ascribe some magical quality to the standard library. Or to the mystical quality of being Pythonic. It is not uncommon to have arguments where people where each argues different perspectives of how what they want to push on the other is more Pythonic. It's just a worthless appeal to authority, an empty point foisted on those who do not know better.

Avoid coding in Python if possible? Wow, we really do have different approaches. How can you be confident that your code works right, and is easy to maintain, if you do not know how it works and are not afraid to rewrite it as needed? Too many people just use libraries and frameworks and treat them like a magical black box that just work the way they want.

Quote from: requerent on April 13, 2014, 09:08:53 PM

If the OP is asking how to do it, that is sufficiently good enough reason to implement it in that matter or discuss such an implementation. The problem domain is unique to each individual's design goals, it doesn't matter if nobody has done it yet. There are a LOT of ways to utilize sampling for loot. Arguing against its use while ignorant of the problem domain is just silly. The OP isn't asking for advice on how to generate loot, he's asking for a specific algorithm which may be perfect for the game he's working on.

*sigh* You're talking in circles trying to justify it's usefulness. I never said I only used random numbers in loot dropping. I just used it as an example, for how things are generally more complicated and how random.sample() is only useful in theory, rather than in practice.

I can't argue with overly emotional irrational arguments made solely to push something unjustified on someone else, so I'm going to choose to opt out of responding to your posts from now on. Good luck with your personal opinions.

chooseusername · « **Reply #22 on:** April 13, 2014, 10:26:31 PM »

Quote from: mushroom patch on April 13, 2014, 09:29:46 PM

Absolutely. It's a lot cleaner than the ridiculous hacks you see in major roguelikes for tasks like monster and item generation, e.g. run through a list, add up a bunch of weights assigned to each possible item, make a list of threshold values, generate a random number, etc.

I mean, if your argument is that many roguelike developers don't use/know what sampling without replacement and/or random permutations are, I'll totally grant you that. I don't think that really means anything though, except perhaps that there's a lot of scope for improvement on the state of the art.

We're going to have to disagree. Unless you care to link to any of these ridiculous hacks and describe how you would do it better (without simplifying things to the point where value is removed).

sokol815 · « **Reply #23 on:** April 14, 2014, 12:20:08 AM »

Quote from: Z on April 13, 2014, 02:47:48 PM

Quote from: sokol815 on April 12, 2014, 04:01:37 AM
ah, the experiences of lists of random numbers. Javascript makes something like this super easy:

Code: [Select]
var numArr = Array(20); for(var i = 0; i < numArr.length;i++){ numArr[i] = i; } numArr.sort(function(){return Math.random()-.5;}); console.log(numArr);

It is impossible to randomly shuffle three values in a bounded number of coin tosses. Thus, this method either makes some permutations more probable than others or it uses an unbonded number of comparisons, depending on how "sort" works. For example, on my browser,
Code: [Select]
ile = Array(); for(t=0; t<1000000; t++) { arr = Array(1,2,3); arr.sort(function(){return Math.random()-.5;}); s=arr[0]+","+arr[1]+","+arr[2]; if(ile[s]) ile[s]++; else ile[s] = 1; } ilereturns 1,2,3 and 2,1,3 twice as frequently as other permutations.

EDIT: With 4 elements some permutations turn out to appear 180 times more frequently than others. Even if you look just at the first element: 1 and 2 appear with probability 28%, 3 with 18% and 4 with 25%. The probabilities of choosing each element seem out to be more and more different when the number of elements increased (with 17 elements, 1..17, they range from 3.5% for 6 to 15.8% for 1). If you wrote a game using this, it could be completely different on a different browser.

Yeah, that first code block is really not useful. maybe for prototyping something... If I needed to randomize an array, I would use the second method that I mentioned. Here is a test harnessed up to that 2nd method:

Code: [Select]

var ile = {};
for(var t = 0; t < 1000000; t++){
	var arr = [1,2,3];
	var arr2 = [];
	var s;
	while(arr.length > 0){
		var index = (Math.random() * arr.length) >> 0;
		arr2.push(arr[index]);
		if(arr2.length == 1){
			s = arr2[0];
		} else {
			s = s + "," + arr2[arr2.length-1];
		}
		arr.splice(index,1);
	}
	if(!ile[s]){
		ile[s] = 1;
	} else {
		ile[s]++;
	}
}
ile;

this method returns all 6 orders in similar proportions.

Code: [Select]

1,2,3: 166352
1,3,2: 166225
2,1,3: 166572
2,3,1: 166947
3,1,2: 166894
3,2,1: 167010

I use an implementation of a RC4 psuedo-random algorithm in my javascript games. it is called "Seedrandom.js". Seed random provides a seedable RNG. V8's built-in RNG is not seedable, so I use this instead.

mushroom patch · « **Reply #24 on:** April 14, 2014, 02:17:33 AM »

Quote

Have you ever looked at the standard library? Very little of it is written in C.

True fact. The standard library method in question is implemented here (in python):

http://svn.python.org/projects/python/branches/py3k/Lib/random.py

Quote from: chooseusername on April 13, 2014, 10:26:31 PM

Quote from: mushroom patch on April 13, 2014, 09:29:46 PM
Absolutely. It's a lot cleaner than the ridiculous hacks you see in major roguelikes for tasks like monster and item generation, e.g. run through a list, add up a bunch of weights assigned to each possible item, make a list of threshold values, generate a random number, etc.

I mean, if your argument is that many roguelike developers don't use/know what sampling without replacement and/or random permutations are, I'll totally grant you that. I don't think that really means anything though, except perhaps that there's a lot of scope for improvement on the state of the art.
We're going to have to disagree. Unless you care to link to any of these ridiculous hacks and describe how you would do it better (without simplifying things to the point where value is removed).

I checked out the angband source, which is the last place I looked for this kind of thing. I decided I'm willing to settle with disagreement because it's too much of a pain in the ass locating exactly which if-else ladder/switch statement I propose to replace. Moreover, angband doesn't even have the kind of finesse to consider the issue of whether or not you really want to select different item types independently -- it just does it.

requerent · « **Reply #25 on:** April 14, 2014, 02:40:27 AM »

Whoops. Python is disappoint.

Jaxian · « **Reply #26 on:** April 14, 2014, 03:18:12 AM »

Quote from: chooseusername on April 13, 2014, 08:29:01 PM

random() is the name of the standard library function which returns a float between 0.0-1.0. There would of course be no point in putting that as the key in a dictionary or set.

Why do you say that? Do I not know something about the interaction between floats and sets?

Trystan · « **Reply #27 on:** April 14, 2014, 05:16:06 AM »

Quote from: Jaxian on April 14, 2014, 03:18:12 AM

Quote from: chooseusername on April 13, 2014, 08:29:01 PM
random() is the name of the standard library function which returns a float between 0.0-1.0. There would of course be no point in putting that as the key in a dictionary or set.

Why do you say that? Do I not know something about the interaction between floats and sets?

Because you could end up with something like 0.000000001, 0.000000002, 0.000000003, 0.000000004, 0.000000005. All unique numbers but all very close. It's pretty much impossible to call random a few times and get duplicate numbers. You have to be careful when comparing or doing math with floats.

Numeron · « **Reply #28 on:** April 14, 2014, 06:25:38 AM »

I don't know much python, but if you want to create a really large set of unique randoms ints, it might be faster to use a hashset rather than a list so you don't have to O(n) your way through the list on every number generated. Putting duplicate values into a set will automatically check if the number already exists in the set.

In java, primitive ints are auto-boxed into Integer objects for the standard library HashSets (which makes the above awful memorywise), so if this is the same with python you would probably want to go look for some library that has hashsets specifically catered to primitives (like Trove in java)

Jaxian · « **Reply #29 on:** April 14, 2014, 03:44:50 PM »

Quote from: Trystan on April 14, 2014, 05:16:06 AM

Because you could end up with something like 0.000000001, 0.000000002, 0.000000003, 0.000000004, 0.000000005. All unique numbers but all very close. It's pretty much impossible to call random a few times and get duplicate numbers. You have to be careful when comparing or doing math with floats.

Sure, the numbers may be very close, but they are still unique. Why would their closeness be a concern? Perhaps in reality the goal is not a list of unique floats but a list of unique integers within a range, say 0 - 5000. In this case we would of course need to convert the random value to an integer first. But is that all that was being said? It sounded more like "a float between 0 and 1 should never be used as a key in a set or dictionary."

As a side-note, I do think it should be possible to get duplicate values. If we think of an integer as 32 bits, a random positive integer should have its sign bit set to 0, and the remaining 31 bits should be randomized. Conversely, a random float between 0 and 1 should have its sign bit set to 0, as well as its exponent's sign bit set to 1, and the remaining 30 bits randomized. This means that a random float should be more likely to contain a duplicate than a random integer (having 2^30 possible values instead of 2^31). It's true that one must be careful comparing floats when crossing architectures or after performing math. However, two floats with the same bits will compare as equal.

News: Read the RULES before posting.

Author Topic: Generating a list of unique random numbers (Read 47790 times)

Z

Re: Generating a list of unique random numbers

Jaxian

Re: Generating a list of unique random numbers

chooseusername

Re: Generating a list of unique random numbers

chooseusername

Re: Generating a list of unique random numbers

requerent

Re: Generating a list of unique random numbers

mushroom patch

Re: Generating a list of unique random numbers

chooseusername

Re: Generating a list of unique random numbers

chooseusername

Re: Generating a list of unique random numbers

sokol815

Re: Generating a list of unique random numbers

mushroom patch

Re: Generating a list of unique random numbers

requerent

Re: Generating a list of unique random numbers

Jaxian

Re: Generating a list of unique random numbers

Trystan

Re: Generating a list of unique random numbers

Numeron

Re: Generating a list of unique random numbers

Jaxian

Re: Generating a list of unique random numbers