Let’s say I have an array of items, each with a value. I’d like to create a new array where the items are clustered by their relative distance to each other. When two items are a distance of 1 apart, they belong together.
$input = [
    'item-a' => 1,
    'item-b' => 2,
    'item-c' => 3,
    'item-d' => 5,
];

$output = [
    ['item-a', 'item-b'],
    ['item-b', 'item-c'],
    ['item-d'],
];
This creates an output of overlapping arrays. What I want instead: because item-a and item-b are related, and item-b is also related to item-c, all three of item-a, item-b, and item-c should end up in one group. The distance between item-c and item-d is greater than 1, so item-d forms a cluster of its own.
$output = [
    ['item-a', 'item-b', 'item-c'],
    ['item-d'],
];
How do I even start coding this?
Thanks in advance and have a nice day!
Answer
This can only be tested in your environment, but here is what it does:
- It computes each item’s distance relative to the hash of the first item (array index 0).
- It re-sorts the input array by those distances (assuming that at this stage some will be positive and some negative), which gives us the order for the hash array.
- It takes this new array and puts the hashes back in.
- It builds a final output array by measuring the distance between consecutive items and starting a new level of the output array whenever that distance exceeds a threshold.
I put in a couple of dummy functions to return distances; obviously, replace them with your own. This might need tweaking, but at this point it’s in your hands.
<?php
// example code
$input = [
    'item-a' => 'a234234d',
    'item-f' => 'h234234e',
    'item-h' => 'e234234f',
    'item-b' => 'f234234g',
    'item-m' => 'd234234j',
    'item-d' => 'm234234s',
    'item-e' => 'n234234d',
    'item-r' => 's234234g',
    'item-g' => 'f234234f',
];

// dummy distance functions, replace with your own
function getDistanceFrom($from, $to) { return rand(-3, 3); }
function getDistanceFrom2($from, $to) { return rand(0, 7); }

// first sort by relative distance from the first one
$first = reset($input);
$tmp = [];
$ctr = 0;
foreach ($input as $item => $hash) {
    // the first item is the reference point: distance 0, so it is not dropped
    $tmp[$item] = ($ctr++ === 0) ? 0 : getDistanceFrom($first, $hash);
}
uasort($tmp, function ($a, $b) { return $a <=> $b; });

// now they're in order, ditch the relative distance and put the hash back in
$sortedinput = [];
foreach ($tmp as $item => $d) {
    $sortedinput[$item] = $input[$item];
}

$output = [];
$last = null;
$level = 0;
$thresh = 3; // if item is within 3 of the previous, group
foreach ($sortedinput as $v => $i) {
    // the first item has no predecessor, so it always opens the first group
    $distance = ($last === null) ? 0 : getDistanceFrom2($last, $i);
    if (abs($distance) > $thresh) {
        $level++;
    }
    $output[$level][] = ["item" => $v, "distance" => $distance, "hash" => $i];
    $last = $i;
}
print_r($output);
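For the numeric values in the question, the same "sort, then start a new group when the gap exceeds a threshold" idea can be sketched more directly. This is only a minimal sketch, not the code above: it assumes every value is a plain number, that the distance between two items is the difference of their values, and that a gap of at most 1 chains items together. The function name clusterByDistance and the parameter $maxGap are made up for illustration.

```php
<?php
// Hypothetical helper (not part of the answer above): groups items whose
// numeric values form a chain in which neighbouring values differ by at
// most $maxGap. Assumes all values are ints or floats.
function clusterByDistance(array $input, $maxGap = 1): array
{
    asort($input); // sort by value, keeping the item names as keys

    $clusters = [];
    $previous = null;
    foreach ($input as $item => $value) {
        // Open a new cluster for the first item, or when the gap to the
        // previous (smaller) value is larger than the allowed maximum.
        if ($previous === null || $value - $previous > $maxGap) {
            $clusters[] = [];
        }
        $clusters[count($clusters) - 1][] = $item;
        $previous = $value;
    }

    return $clusters;
}

$input = [
    'item-a' => 1,
    'item-b' => 2,
    'item-c' => 3,
    'item-d' => 5,
];
print_r(clusterByDistance($input));
```

With the input from the question this puts item-a, item-b, and item-c into one group and item-d into a group of its own. Because the values are sorted first, each item only has to be compared with its predecessor, so overlapping pairs never appear. If the distance cannot be expressed as a difference of sortable values (for example, with hashes), this shortcut would not apply as written.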