Remove duplicate subarrays based on an identifying value, retaining the last occurrence of each duplicate

I get the following data from an API query and need to remove the rows with duplicate employee id values, keeping only the last occurring row for each id.

$holiday_array = [
    [
        'employee' => [
            'id' => 456062
        ],
        'reviewed_by' => [
            'id' => 260700
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '11.0',
        'id' => 11505539,
        'start_date' => '2021-03-19',
        'end_date' => '2021-04-02',
        'action' => 'request',
        'status' => 'approved',
        'created_at' => '2021-02-22T09:19:57+00:00',
        'updated_at' => '2021-02-23T13:28:41+00:00',
    ],
    [
        'employee' => [
            'id' => 522010
        ],
        'reviewed_by' => [
            'id' => 260760
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '2.0',
        'id' => 11730818,
        'start_date' => '2021-03-19',
        'end_date' => '2021-03-22',
        'action' => 'request',
        'status'=> 'approved',
        'created_at' => '2021-03-10T14:14:48+00:00',
        'updated_at' => '2021-03-15T08:04:36+00:00',
    ],
    [
        'employee' => [
            'id' => 638070
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11861461,
        'start_date' => '2021-03-22',
        'action' => 'request',
        'status' => 'approved',
        'notes' => 'test',
        'created_at' => '2021-03-22T14:30:33+00:00',
        'updated_at' => '2021-03-22T14:31:39+00:00'
    ],
    [
        'employee' => [
            'id' => 638070
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11861498,
        'start_date' => '2021-03-22',
        'action' => 'cancel',
        'status' => 'approved',
        'created_at' => '2021-03-22T14:31:55+00:00',
        'updated_at' => '2021-03-22T14:32:26+00:00'
    ],
    [
        'employee' => [
            'id' => 351779
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11863071,
        'start_date' => '2021-03-22',
        'action' => 'request',
        'status' => 'approved',
        'notes' => 'Test',
        'created_at' => '2021-03-22T15:28:48+00:00',
        'updated_at' => '2021-03-23T14:41:13+00:00'
    ],
    [
        'employee' => [
            'id' => 638070
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11864185,
        'start_date' => '2021-03-22',
        'action' => 'request',
        'status' => 'approved',
        'notes' => 'test',
        'created_at' => '2021-03-22T16:14:15+00:00',
        'updated_at' => '2021-03-22T16:41:18+00:00'
    ],
    [
        'employee' => [
            'id' => 638070
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11877400,
        'start_date' => '2021-03-22',
        'action' => 'cancel',
        'status' => 'approved',
        'created_at' => '2021-03-23T14:24:54+00:00',
        'updated_at' => '2021-03-23T14:32:35+00:00'
    ],
    [
        'employee' => [
            'id' => 351779
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11878419,
        'start_date' => '2021-03-22',
        'action' => 'cancel',
        'status' => 'approved',
        'created_at' => '2021-03-23T15:10:22+00:00'
    ],
    [
        'employee' => [
            'id' => 351779
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11878445,
        'start_date' => '2021-03-22',
        'action' => 'cancel',
        'status' => 'approved',
        'created_at' => '2021-03-23T15:11:47+00:00'
    ],
    [
        'employee' => [
            'id' => 351779
        ],
        'reviewed_by' => [
            'id' => 578193
        ],
        'reason' => null,
        'type' => 'Holiday',
        'deducted' => '1.0',
        'id' => 11878450,
        'start_date' => '2021-03-22',
        'action' => 'cancel',
        'status' => 'approved',
        'created_at' => '2021-03-23T15:11:53+00:00'
    ]
];

The 10 rows contain only 4 unique employee ids, so I need the following output:

Array
(
    [0] => Array
        (
            [employee] => Array
                (
                    [id] => 456062
                )

            [reviewed_by] => Array
                (
                    [id] => 260700
                )

            [reason] => 
            [type] => Holiday
            [deducted] => 11.0
            [id] => 11505539
            [start_date] => 2021-03-19
            [end_date] => 2021-04-02
            [action] => request
            [status] => approved
            [created_at] => 2021-02-22T09:19:57+00:00
            [updated_at] => 2021-02-23T13:28:41+00:00
        )
    [1] => Array
        (
            [employee] => Array
                (
                    [id] => 522010
                )

            [reviewed_by] => Array
                (
                    [id] => 260760
                )

            [reason] => 
            [type] => Holiday
            [deducted] => 2.0
            [id] => 11730818
            [start_date] => 2021-03-19
            [end_date] => 2021-03-22
            [action] => request
            [status] => approved
            [created_at] => 2021-03-10T14:14:48+00:00
            [updated_at] => 2021-03-15T08:04:36+00:00
        )
    [6] => Array
        (
            [employee] => Array
                (
                    [id] => 638070
                )

            [reviewed_by] => Array
                (
                    [id] => 578193
                )

            [reason] => 
            [type] => Holiday
            [deducted] => 1.0
            [id] => 11877400
            [start_date] => 2021-03-22
            [action] => cancel
            [status] => approved
            [created_at] => 2021-03-23T14:24:54+00:00
            [updated_at] => 2021-03-23T14:32:35+00:00
        )
    [9] => Array
        (
            [employee] => Array
                (
                    [id] => 351779
                )

            [reviewed_by] => Array
                (
                    [id] => 578193
                )

            [reason] => 
            [type] => Holiday
            [deducted] => 1.0
            [id] => 11878450
            [start_date] => 2021-03-22
            [action] => cancel
            [status] => approved
            [created_at] => 2021-03-23T15:11:53+00:00
        )

)

The rows must be grouped by the value of [employee][id]. If an [employee][id] occurs only once, output that single row; if it occurs several times, for example in rows 2, 3, 5 and 6, output only the last of those rows (6).

I wrote the following loop, but it only gives me the employee ids, whereas I need the complete last row in which each id appears.

for($i = 0; $i < count($holiday_array); $i++) {
    $holiday_arrays[] = $holiday_array[$i]["employee"];
    $array[] = array_unique($holiday_arrays[$i], SORT_REGULAR);
}
return $array;

Answer

Because PHP does not allow duplicate keys at any level of an array, you can exploit this rule: assign temporary first-level keys in the result array based on the nested employee id, so that each later row overwrites any earlier row with the same id. When finished, just re-index the result array.

I am assuming there is no value in retaining the original first level keys.
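
To see the overwrite rule in isolation, here is a minimal sketch. The variable name `$seen` and the string values are placeholders made up for illustration; only the two employee ids come from the question.

```php
<?php
// Hypothetical placeholder values; only the ids are taken from the question.
$seen = [];
$seen[638070] = 'first';  // first time this id is used as a key
$seen[351779] = 'other';  // a different id gets its own key
$seen[638070] = 'last';   // same key again: the earlier value is overwritten

var_export($seen);
// array (
//   638070 => 'last',
//   351779 => 'other',
// )
```

Note that an overwritten key keeps the position of its first assignment, so the result follows the order in which each id first appears, not the order of the last occurrences.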

Your sample input was too verbose, so I have reduced it to its meaningful parts to demonstrate that the last occurring entries are retained.

Code:

$holiday_array = [
    ['employee' => ['id' => 456062], 'num' => 1],
    ['employee' => ['id' => 522010], 'num' => 1],
    ['employee' => ['id' => 638070], 'num' => 1],
    ['employee' => ['id' => 638070], 'num' => 2],
    ['employee' => ['id' => 351779], 'num' => 1],
    ['employee' => ['id' => 638070], 'num' => 3],
    ['employee' => ['id' => 638070], 'num' => 4],
    ['employee' => ['id' => 351779], 'num' => 2],
    ['employee' => ['id' => 351779], 'num' => 3],
    ['employee' => ['id' => 351779], 'num' => 4],
];

$result = [];
foreach ($holiday_array as $row) {
    $result[$row['employee']['id']] = $row;
}
var_export(array_values($result));

Output:

array (
  0 => 
  array (
    'employee' => 
    array (
      'id' => 456062,
    ),
    'num' => 1,
  ),
  1 => 
  array (
    'employee' => 
    array (
      'id' => 522010,
    ),
    'num' => 1,
  ),
  2 => 
  array (
    'employee' => 
    array (
      'id' => 638070,
    ),
    'num' => 4,
  ),
  3 => 
  array (
    'employee' => 
    array (
      'id' => 351779,
    ),
    'num' => 4,
  ),
)
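
The expected output in the question keeps the original first-level keys (0, 1, 6, 9). If those keys do matter, one possible variation, offered as a sketch rather than as part of the approach above, is to record the last key seen for each employee id and then filter the input by those keys. The name `$lastKeyById` is made up for illustration, and the input is the same reduced sample.

```php
<?php
// Reduced sample input: 'num' counts the occurrences of each employee id.
$holiday_array = [
    ['employee' => ['id' => 456062], 'num' => 1],
    ['employee' => ['id' => 522010], 'num' => 1],
    ['employee' => ['id' => 638070], 'num' => 1],
    ['employee' => ['id' => 638070], 'num' => 2],
    ['employee' => ['id' => 351779], 'num' => 1],
    ['employee' => ['id' => 638070], 'num' => 3],
    ['employee' => ['id' => 638070], 'num' => 4],
    ['employee' => ['id' => 351779], 'num' => 2],
    ['employee' => ['id' => 351779], 'num' => 3],
    ['employee' => ['id' => 351779], 'num' => 4],
];

// Map each employee id to the first-level key of its last occurrence.
// ($lastKeyById is a hypothetical name, not from the original answer.)
$lastKeyById = [];
foreach ($holiday_array as $key => $row) {
    $lastKeyById[$row['employee']['id']] = $key;
}

// Keep only the rows whose original key was recorded as a "last" key.
$result = array_intersect_key($holiday_array, array_flip($lastKeyById));

var_export(array_keys($result));
// array (
//   0 => 0,
//   1 => 1,
//   2 => 6,
//   3 => 9,
// )
```

Because `array_intersect_key()` preserves the keys and order of its first argument, the surviving rows stay in their original input positions.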
User contributions licensed under: CC BY-SA