Split string on dots not preceded by a digit without losing digit in split

Given the following sentence:

This is 10. way of doing this. And this is 43. street.

I want preg_split() to give this:

Array (
 [0] => "This is 10. way of doing this"
 [1] => "And this is 43. street"
)

I am using:

preg_split("/[^\d+]\./i", $sentence)

But this gives me:

Array (
 [0] => "This is 10. way of doing thi"
 [1] => "And this is 43. stree"
)

As you can see, the last character of each sentence is removed. I know why this happens, but I don’t know how to prevent it from happening. Any ideas? Can lookaheads and lookbehinds help here? I am not really familiar with those.

Answer

You want to use a negative lookbehind assertion for that:

preg_split("/(?<!\d)\./i", $sentence)

The difference is that [^\d+] consumes a character, so that character becomes part of the delimiter match and preg_split() removes it along with the dot. The (?<!\d) assertion is also checked at that position, but it is “zero-width”, meaning it does not become part of the delimiter match, and thus the character before the dot won’t be thrown away.
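
To make the difference concrete, here is a minimal sketch that runs both patterns on the sentence from the question. The trailing \s* and the PREG_SPLIT_NO_EMPTY flag are additions of this example, not part of the answer above; they are assumed here only to drop the leading space and the empty last element so the result matches the array the question asks for.

```php
<?php
// Example input, taken from the question.
$sentence = "This is 10. way of doing this. And this is 43. street.";

// Consuming version: [^\d+] matches one real character before the dot,
// so that character is part of the delimiter and is discarded.
$consumed = preg_split('/[^\d+]\./', $sentence);

// Lookbehind version:
//   (?<!\d)  the dot must not be preceded by a digit (zero-width, consumes nothing)
//   \.       a literal dot, the only non-whitespace character removed
//   \s*      also swallow whitespace after the dot (assumption of this sketch)
$parts = preg_split('/(?<!\d)\.\s*/', $sentence, -1, PREG_SPLIT_NO_EMPTY);

print_r($consumed);
print_r($parts);
```

With the lookbehind, "10." and "43." are left alone because the dot there follows a digit, while the dots after "this" and "street" still act as delimiters.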

User contributions licensed under: CC BY-SA