
preg_match: validating a string with an unlimited number of occurrences

I am trying to write a preg_match call with a pattern that validates a string containing an unlimited number of occurrences. This is my code:

if(! preg_match_all("#^([a-zA-Z0-9_-]+)$#", $arg, $matches, PREG_OFFSET_CAPTURE)){
    var_dump($matches);
    throw new Exception('The simple pattern "'.$arg.'" is not valid !');
}

Each occurrence must follow this format: any characters between two parentheses, e.g. (mystring123/). The whole string ($arg) is a sequence of these occurrences.
For example:
1- This string is valid: (AAA/)(BBB/)(cc)
2- This string is not valid: (AAA/)xxxx(BBB/)(cc)

The function runs correctly, but the pattern I wrote does not accept more than one occurrence.

On my second try I changed the pattern, but a problem is triggered when the preg_match function is executed:

#[^([a-zA-Z0-9_-]+)$]+#

How can I resolve this issue, and how can I add the following characters to the pattern: “\” and “/”?


Answer

I’ve toiled at this task for a while, trying to devise a method that combines your full-string validation with an indefinite number of captured groups. After trying many combinations of \G and lookarounds, I am afraid I could not do it in one pass. If PHP allowed variable-width lookbehinds, I think I could, but alas they are not available.

What I can offer is a process with the unnecessary “stuff” removed.

Code:

$strings = ["(AAA/)(BBB/)(cc)", "(AAA/)xxxx(BBB/)(cc)"];

foreach ($strings as $string) {
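    // Validate the whole string: one or more parenthesised groups and nothing else.
    if (!preg_match('~^(?:\([\w\\\/-]+\))+$~', $string)) {
        echo "The simple pattern $string is not valid!";
        // throw new Exception("The simple pattern $string is not valid!");
    } else {
        // Split at the zero-width position that follows each closing parenthesis.
        var_export(preg_split('~\)\K~', $string, 0, PREG_SPLIT_NO_EMPTY));
    }
    echo "\n";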
}

Output:

array (
  0 => '(AAA/)',
  1 => '(BBB/)',
  2 => '(cc)',
)
The simple pattern (AAA/)xxxx(BBB/)(cc) is not valid!

Pattern #1 Breakdown:

~              #pattern delimiter
^              #start of string anchor
(?:            #start of non-capturing group
  \(           #match one opening parenthesis
  [\w\\\/-]+   #greedily match one or more of the following characters: a-z, A-Z, 0-9, underscores, backslashes, slashes, and hyphens
  \)           #match one closing parenthesis
)              #end of non-capturing group
+              #allow one or more occurrences of the non-capturing group
$              #end of string anchor
~              #pattern delimiter
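
To make the breakdown concrete, here is a minimal sketch that wraps Pattern #1 in a helper and runs it against a few inputs. The function name isValidSimplePattern() is hypothetical (it is not from the question), and the last two inputs are my own illustrations.

```php
<?php
// Hypothetical helper name; the pattern is Pattern #1 from the breakdown above.
function isValidSimplePattern(string $arg): bool
{
    // preg_match() returns 1 on a match, 0 on no match, false on error.
    return preg_match('~^(?:\([\w\\\/-]+\))+$~', $arg) === 1;
}

var_dump(isValidSimplePattern('(AAA/)(BBB/)(cc)'));     // valid: only parenthesised groups
var_dump(isValidSimplePattern('(AAA/)xxxx(BBB/)(cc)')); // invalid: "xxxx" sits outside parentheses
var_dump(isValidSimplePattern('(a\\b)(c-d)'));          // valid: a literal backslash is allowed
var_dump(isValidSimplePattern('()'));                   // invalid: a group needs at least one character
```

Note that '(a\\b)' in single quotes is the string (a\b), so this also checks the backslash requirement from the question.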

Pattern #2 Breakdown:

~              #pattern delimiter
\)             #match one closing parenthesis
\K             #restart the fullstring match (forget/release previously matched character(s))
~              #pattern delimiter
~              #pattern delimiter

Pattern #2’s effect is to locate every closing parenthesis and “explode” the string at the zero-width position that follows it. \K ensures that no characters become casualties in the explosions.
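
A small sketch of what \K buys you, comparing the same split with and without it. The variable names are only for illustration.

```php
<?php
$string = '(AAA/)(BBB/)(cc)';

// Without \K, the closing parenthesis itself is the delimiter, so it is consumed.
$withoutK = preg_split('~\)~', $string, 0, PREG_SPLIT_NO_EMPTY);

// With \K, the reported match is zero-width (just after the parenthesis), so nothing is consumed.
$withK = preg_split('~\)\K~', $string, 0, PREG_SPLIT_NO_EMPTY);

var_export($withoutK); // elements lose their closing parenthesis: '(AAA/', '(BBB/', '(cc'
echo "\n";
var_export($withK);    // elements stay intact: '(AAA/)', '(BBB/)', '(cc)'
echo "\n";
```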

The if condition does not need to call preg_match_all(), since there can only ever be one match when you are validating from ^ to $. Declaring a variable to hold the “match” is pointless (as is PREG_OFFSET_CAPTURE): if there is a match, it will be the entire input string, so just use that value if you want it.
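
If you want to convince yourself of this, the following sketch captures the match anyway, purely for illustration, and compares it with the input. The variable names are my own.

```php
<?php
$string  = '(AAA/)(BBB/)(cc)';
$pattern = '~^(?:\([\w\\\/-]+\))+$~';

// $m is captured here only to show that it adds nothing new.
$found = preg_match($pattern, $string, $m);

var_dump($found);            // int(1)
var_dump($m[0] === $string); // bool(true): the match is the whole input

// preg_match_all() cannot find more than this single anchored match.
$count = preg_match_all($pattern, $string);
var_dump($count);            // int(1)
```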

preg_split() is a suitable substitute for a preg_match_all() call because it returns exactly the output you are after, in a lean single-dimensional array, AND uses a very small, readable pattern. The 3rd and 4th parameters, 0 and PREG_SPLIT_NO_EMPTY, respectively tell the function that there is “no limit” to the number of explosions and that any empty elements should be discarded (so no empty element is created after the ) that trails cc).
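
Here is a short sketch of what those two parameters change, using the same input. The limit of 2 in the last call is my own example value, not something the question needs.

```php
<?php
$string = '(AAA/)(BBB/)(cc)';

// Default flags: the split after the final ")" leaves an empty trailing element.
$withEmpty = preg_split('~\)\K~', $string);

// Limit 0 means "no limit"; PREG_SPLIT_NO_EMPTY drops the empty trailing element.
$noEmpty = preg_split('~\)\K~', $string, 0, PREG_SPLIT_NO_EMPTY);

// A limit of 2 (illustrative) stops after one split; the rest stays in the last element.
$limited = preg_split('~\)\K~', $string, 2, PREG_SPLIT_NO_EMPTY);

var_export($withEmpty); // '(AAA/)', '(BBB/)', '(cc)', ''
echo "\n";
var_export($noEmpty);   // '(AAA/)', '(BBB/)', '(cc)'
echo "\n";
var_export($limited);   // '(AAA/)', '(BBB/)(cc)'
echo "\n";
```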

User contributions licensed under: CC BY-SA