I am trying to extract the last name from an attendance report so that I can sort it alphabetically by last name. The attendance report (should be a .csv) looks like this:
Artur Testme Left 27.1.2021, 10:34:15
(Tab was extracted for the post, so here:
Artur Testme [Tab] Left [Tab] 27.1.2021, 10:34:15)
I open it via fgetcsv and find the word between space and a tab:
if (($handle = fopen($_FILES["file"]["test.cvs"], "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ";")) !== FALSE) {
        preg_match('/(\s)(.*)(\t)/', $data[0], $matches);
        echo $matches[0]."<br>";
    }
    fclose($handle);
}
?>
Output looks like this: Testme Left
I don’t understand why it also takes the second word. My understanding is that it should take the word between the space and the tab. I hope someone can help me with this.
Btw: if you know a nice and fast way to sort all the data alphabetically and also throw out the duplicates in the attendance report, that would help me a lot and save me from too much Google time. 🙂
Thank you! Kind regards Daniel
Answer
About the regex question: the .* is greedy. It first matches the rest of the line (including spaces and tabs) and then backtracks until the final \t in the pattern can match, which happens at the last tab. That is why the second group also contains the second word.
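To make the backtracking visible, here is a minimal sketch that runs the pattern from the question next to a lazy variant on the sample line. The variable names ($line, $greedy, $lazy) are only illustrative, and the sample string assumes the fields are separated by real tab characters.

```php
<?php
// Sample line from the question, written with real tab characters.
$line = "Artur Testme\tLeft\t27.1.2021, 10:34:15";

// Greedy: .* runs to the end of the line, then backtracks to the LAST tab.
preg_match('/(\s)(.*)(\t)/', $line, $greedy);

// Lazy: .*? grows one character at a time and stops at the FIRST tab.
preg_match('/(\s)(.*?)(\t)/', $line, $lazy);

var_dump($greedy[2]); // "Testme", a tab, then "Left"
var_dump($lazy[2]);   // string(6) "Testme"
```

The lazy quantifier is shown only to illustrate the difference; a more specific pattern that does not rely on .* at all is usually easier to reason about.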
Using \s also matches a newline or a tab. You might use \h to match a horizontal whitespace char, and exclude the tab from it using a negated character class [^\H\t]
You can use \w+ to match 1 or more word characters, or use \S+ to match 1 or more non-whitespace characters.
$pattern = '/[^\H\t](\w+)\t/';
// Double quotes so that \t becomes a real tab character in the sample line.
$s = "Artur Testme\tLeft\t27.1.2021, 10:34:15";

if (preg_match($pattern, $s, $matches)) {
    var_dump($matches[1]);
}
Output
string(6) "Testme"
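As for the side question about sorting and removing duplicates, one possible approach is sketched below. It assumes that a "double" means an exactly identical line, that PHP 7.4 or later is available (for the arrow function), and that the lines are already in an array. The helper lastName() and the sample rows (including the second name and the "Joined" status) are hypothetical and only there for illustration.

```php
<?php
// Hypothetical helper: returns the word directly before the first tab,
// using the same pattern as above, or '' if the line does not match.
function lastName(string $line): string
{
    return preg_match('/[^\H\t](\w+)\t/', $line, $m) ? $m[1] : '';
}

// Hypothetical sample rows; real ones could be read with
// file($path, FILE_IGNORE_NEW_LINES) instead.
$lines = [
    "Artur Testme\tLeft\t27.1.2021, 10:34:15",
    "Berta Example\tJoined\t27.1.2021, 10:01:02",
    "Artur Testme\tLeft\t27.1.2021, 10:34:15", // exact duplicate
];

// Drop exactly identical lines, then reindex the array.
$unique = array_values(array_unique($lines));

// Sort case-insensitively by the extracted last name.
usort($unique, fn($a, $b) => strcasecmp(lastName($a), lastName($b)));

foreach ($unique as $line) {
    echo lastName($line), PHP_EOL;
}
```

If two entries for the same person with different timestamps should also count as duplicates, the deduplication would have to be keyed on the name instead of the whole line.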