Skip to content
Advertisement

Extracting a pack size from a column with mixed formats

I have a column pack_size in a table called product_master_test. The problem that I am facing is that the pack_size is in mixed formats, there is no uniformity to it.

For example:

  • 4 x 2kg (pack size should be 4)
  • 48-43GM (pack size should be 48)
  • 12 x 1BTL (pack size should be 12)
  • 1 x 24EA (pack size should be 24)

I’ve been thinking about different approaches, but I can’t think of anything that would work without having a lot of IF statements in the query/PHP code. Is there a solution that I am missing?

I do have the file in Excel, if there is an easier way to process it using PHP.

I am not including any code, as I’m not entirely sure where to start with this problem.

Advertisement

Answer

Using a regex to split the pack size could at least give you the various components which you can then (possibly) infer more from…

$packs = ["4 x 2kg","48-43GM","12 x 1BTL","1 x 24EA", "12 X 1 EA"];

foreach ( $packs as $size ) {
if ( preg_match("/(d*)(?:s+)?[xX-](?:s+)?(d+)(?:s+)?(w*)/", $size, $match) == 1 )  {
        print_r($match);
    }
    else {
        echo "cannot determine - ".$size.PHP_EOL;
    }
}

(regex can probably be optimised, not my area of expertise). It basically splits it to be a number, some space with either a x or a - and then another number followed by the units (some text). The above with the test cases gives…

Array
(
    [0] => 4 x 2kg
    [1] => 4
    [2] => 2
    [3] => kg
)
Array
(
    [0] => 48-43GM
    [1] => 48
    [2] => 43
    [3] => GM
)
Array
(
    [0] => 12 x 1BTL
    [1] => 12
    [2] => 1
    [3] => BTL
)
Array
(
    [0] => 1 x 24EA
    [1] => 1
    [2] => 24
    [3] => EA
)
Array
(
    [0] => 12 X 1 EA
    [1] => 12
    [2] => 1
    [3] => EA
)

With the else part it should also give you the ones it cannot determine and perhaps allow you to change it accordingly.

User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement