I'm wondering whether this is a sound algorithm for picking a random value from a weighted set. Is there anything I could add to make it better?

In this example I would like the probability of $object->get() returning test4 to be 4 times greater than the probability of it returning test1.

class weightCalculator { 
    var $data = array();
    var $universe = 0; 

    function add( $data, $probability ){ 
        $this->data[ $x = sizeof( $this->data ) ] = new stdClass; 
        $this->data[ $x ]->value = $data; 
        $this->universe += $this->data[ $x ]->probability = abs( $probability ); 
    } 

    function get(){ 
        if( !$this->universe ){
            return null; 
        }
        $x = round( mt_rand( 0, $this->universe ) ); 
        $max = 0;
        $i = 0; 

        while( $x > $max ){
            $max += $this->data[ $i++ ]->probability;
        }
        $val = -1;
        if( $this->universe == 1 ){
            $val = $this->data[ $i ]->value;
        } else {
            $val = $this->data[ $i - 1 ]->value;
        }
        return $val;
    } 
}

$object = new weightCalculator; 
$object->add( 'test1', 10 );
$object->add( 'test2', 20 ); 
$object->add( 'test3', 30 ); 
$object->add( 'test4', 40 ); 
Petrogad
  • What do you mean by "best value"? The assigned value for the random variable $x, based on the ordered weights of all data items in this->universe? – taserian Jul 09 '09 at 17:13
  • I guess I'm wondering if the method I'm using is the best way to calculate the weighted value of a result. – Petrogad Jul 09 '09 at 17:16
  • Ask a silly question, but what are you trying to do here? – Meep3D Jul 09 '09 at 18:11
  • Pass in values with weights, and have it return one result that factors in the weights of the values I've passed in. Cheers – Petrogad Jul 09 '09 at 18:18
  • So in this example you'd like the probability of $object->get() returning test4 to be 4 times greater than the probability of it returning test1? – acrosman Jul 09 '09 at 19:50
  • In your example, you get into trouble if mt_rand returns 100, don't you? I think the max there needs to be $this->universe - 1. I think that also eliminates the need for the $this->universe == 1 special case, if I understand correctly why it's there. As for improvements -- one question I have is if you want to optimize for memory or speed? This will be slow for a large number of objects. – Richard Dunlap Jul 09 '09 at 20:38
  • Looking to optimize for speed, mostly. Yeah, I was worried it could be costly for larger data sets. – Petrogad Jul 09 '09 at 21:01
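
The off-by-one raised in the comments can be sketched as follows. This is only one possible reading of that suggestion, not the asker's code: the class name WeightedPicker and the helper pickAt() are hypothetical, and integer weights are assumed. The draw is taken from 0 to universe - 1, and the loop condition becomes >= so that a draw of 0 still selects the first item, which removes the need for the universe == 1 special case.

```php
<?php
// Hypothetical rewrite of the linear scan; WeightedPicker and pickAt()
// are illustrative names, and integer weights are an assumption.
class WeightedPicker
{
    private array $data = [];
    private int $universe = 0;

    public function add($value, int $weight): void
    {
        $weight = abs($weight);
        $this->data[] = ['value' => $value, 'weight' => $weight];
        $this->universe += $weight;
    }

    // Deterministic core: map a draw $x in [0, universe - 1] to a value.
    public function pickAt(int $x)
    {
        $max = 0;
        $i = 0;
        // Advance until the running total exceeds the draw.
        while ($x >= $max) {
            $max += $this->data[$i++]['weight'];
        }
        return $this->data[$i - 1]['value'];
    }

    public function get()
    {
        if ($this->universe === 0) {
            return null;
        }
        // mt_rand() is inclusive at both ends, hence universe - 1.
        return $this->pickAt(mt_rand(0, $this->universe - 1));
    }
}
```

With the weights from the question (10, 20, 30, 40), draws 0 to 9 map to test1 and draws 60 to 99 map to test4, so test4 is four times as likely as test1.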

2 Answers

Seems fair enough; it depends on the use. If you need better performance from the get() method, you could build your range values in the add() method and use a dichotomic (binary) search.

instanceof me

To elaborate on streetpc's answer: alongside your data array, use add() to maintain a same-sized keys array where you store the upper bound of each item's range; for your example, that array would look like {10, 30, 60, 100}. (Or use one array of objects or structs that hold both the data and the key.) Your get() method then simply searches a sorted list for the first element greater than $x; a binary search can do that in O(log N), compared to O(N) for your method. You'll use a little extra memory, but for most applications that looks like a good tradeoff.
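
A minimal sketch of that idea, under a few assumptions: the class name CumulativePicker and the methods bounds() and pickAt() are hypothetical, weights are integers, the draw is taken from 0 to universe - 1, and zero-weight items are skipped so the bounds stay strictly increasing.

```php
<?php
// Hypothetical sketch: cumulative upper bounds plus binary search.
class CumulativePicker
{
    private array $values = [];
    private array $bounds = [];   // running totals, e.g. [10, 30, 60, 100]
    private int $universe = 0;

    public function add($value, int $weight): void
    {
        $weight = abs($weight);
        if ($weight === 0) {
            return;               // assumption: a zero weight can never be picked
        }
        $this->universe += $weight;
        $this->values[] = $value;
        $this->bounds[] = $this->universe;
    }

    public function bounds(): array
    {
        return $this->bounds;
    }

    // Find the first bound strictly greater than $x, for $x in [0, universe - 1].
    public function pickAt(int $x)
    {
        $lo = 0;
        $hi = count($this->bounds) - 1;
        while ($lo < $hi) {
            $mid = intdiv($lo + $hi, 2);
            if ($this->bounds[$mid] > $x) {
                $hi = $mid;       // answer is at $mid or to its left
            } else {
                $lo = $mid + 1;   // answer is to the right of $mid
            }
        }
        return $this->values[$lo];
    }

    public function get()
    {
        if ($this->universe === 0) {
            return null;
        }
        return $this->pickAt(mt_rand(0, $this->universe - 1));
    }
}
```

Each get() call then costs O(log N) comparisons instead of a linear scan, at the price of one extra integer per item.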

Richard Dunlap