I need to extract the house number from addresses in all the different constellations used in Austria:
| Street name                | housenumber | stairs | floor | door |
| -------------------------- | ----------- | ------ | ----- | ---- |
| Lilienstr. 12a             | 12a         |        |       |      |
| Leibnizstraße 36/28/2      | 36          | 28     |       | 2    |
| Prager Straße 14/3/1/4     | 14          | 3      | 1     | 4    |
| Guentherstr. 43 B          | 43 B        |        |       |      |
| Eberhard-Leibnitz Str. 1/7 | 1           |        |       | 7    |
| Schießstätte 7/7           | 7           |        |       | 7    |
I’ve already found this question: Regex to extract (german) street number.
The regex from that question (below) works as long as no stairs/floor/door is entered. Can you help?
`^[ \-0-9a-zA-ZäöüÄÖÜß.]+?\s+(\d+(\s?[a-zA-Z])?)\s*(?:$|\(|[A-Z]{2})`
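To make the limitation concrete, here is a minimal check. It assumes the pattern is wrapped in `/` delimiters with the `u` (UTF-8) flag, neither of which is stated above, and the variable names are only illustrative.

```php
<?php
// Pattern from the linked question; delimiters and the 'u' flag are assumptions.
$original = '/^[ \-0-9a-zA-ZäöüÄÖÜß.]+?\s+(\d+(\s?[a-zA-Z])?)\s*(?:$|\(|[A-Z]{2})/u';

// Plain house number: the pattern matches and group 1 holds the number.
$plain = preg_match( $original, 'Lilienstr. 12a', $m );

// House number followed by /stairs/door: "/" is neither in the character
// class nor allowed after the number, so there is no match at all.
$withDoor = preg_match( $original, 'Leibnizstraße 36/28/2' );

var_dump( $plain );    // int(1)
var_dump( $m[1] );     // string(3) "12a"
var_dump( $withDoor ); // int(0)
```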
Answer
I don't know Austrian address formats, so I can't say for certain whether this is correct, but please see the regex below.
`^(.*)\s+(\d+(?:\s*[a-zA-Z])?)(?:\/(\d+))?(?:\/(\d+))?(?:\/(\d+))?\s*(?:$|\(|[A-Z]{2})`
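To see what the pattern captures on its own, here is a small sketch that runs it against two of the sample addresses and prints the raw groups. The `/` delimiters and the variable names are assumptions made for illustration.

```php
<?php
// Same pattern as above, wrapped in '/' delimiters (hence the escaped slashes).
$pattern = '/^(.*)\s+(\d+(?:\s*[a-zA-Z])?)(?:\/(\d+))?(?:\/(\d+))?(?:\/(\d+))?\s*(?:$|\(|[A-Z]{2})/';

preg_match( $pattern, 'Leibnizstraße 36/28/2', $threeParts );
preg_match( $pattern, 'Eberhard-Leibnitz Str. 1/7', $twoParts );

// Drop the full match (index 0) and keep only the capture groups.
$threeGroups = array_slice( $threeParts, 1 );
$twoGroups   = array_slice( $twoParts, 1 );

// The optional groups fill from left to right, so the door lands in a
// different group depending on how many parts were entered.
print_r( $threeGroups ); // Leibnizstraße, 36, 28, 2
print_r( $twoGroups );   // Eberhard-Leibnitz Str., 1, 7
```

The last value is the door in both cases, but it sits at a different index each time, so the groups cannot be mapped to stairs/floor/door by position alone.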
This expression captures up to four number groups (1/2/3/4), but the last three are optional and are filled from left to right, so you will need some additional processing to determine whether an address has a housenumber, stairs, floor and door, or only a housenumber and a door.
For example:

    <?php

    $pattern = '^(.*)\s+(\d+(?:\s*[a-zA-Z])?)(?:\/(\d+))?(?:\/(\d+))?(?:\/(\d+))?\s*(?:$|\(|[A-Z]{2})$';

    $addresses = [
        'Lilienstr. 12a',
        'Leibnizstraße 36/28/2',
        'Prager Straße 14/3/1/4',
        'Guentherstr. 43 B',
        'Eberhard-Leibnitz Str. 1/7',
        'Schießstätte 7/7'
    ];

    $results = [];

    foreach ( $addresses as $address ) {
        // 0. Full match
        // 1. Streetname
        // 2. Housenumber
        // 3. Stairs
        // 4. Floor
        // 5. Door
        // Skip addresses the pattern does not match at all
        if ( ! preg_match( '/' . $pattern . '/', $address, $matches ) ) {
            continue;
        }

        // Remove full match from the array
        array_shift( $matches );

        // Set up default values
        $streetname  = array_shift( $matches );
        $housenumber = null;
        $stairs      = null;
        $floor       = null;
        $door        = null;

        // Count total values given ('strlen' keeps a literal "0", which
        // array_filter would otherwise drop as falsy)
        $total = count( array_filter( array_map( 'trim', $matches ), 'strlen' ) );

        switch ( $total ) {
            // Has all 4 parts
            case 4:
                $housenumber = $matches[ 0 ];
                $stairs      = $matches[ 1 ];
                $floor       = $matches[ 2 ];
                $door        = $matches[ 3 ];
                break;

            // Only has 3 parts
            case 3:
                $housenumber = $matches[ 0 ];
                $stairs      = $matches[ 1 ];
                $door        = $matches[ 2 ];
                break;

            // Only has 2 parts
            case 2:
                $housenumber = $matches[ 0 ];
                $door        = $matches[ 1 ];
                break;

            // Has 1 part
            default:
                $housenumber = $matches[ 0 ];
                break;
        }

        // Add to results array
        $results[] = [
            'address'     => $address,
            'streetname'  => $streetname,
            'housenumber' => $housenumber,
            'stairs'      => $stairs,
            'floor'       => $floor,
            'door'        => $door
        ];
    }

    print_r( $results );
Output

    Array
    (
        [0] => Array
            (
                [address] => Lilienstr. 12a
                [streetname] => Lilienstr.
                [housenumber] => 12a
                [stairs] =>
                [floor] =>
                [door] =>
            )

        [1] => Array
            (
                [address] => Leibnizstraße 36/28/2
                [streetname] => Leibnizstraße
                [housenumber] => 36
                [stairs] => 28
                [floor] =>
                [door] => 2
            )

        [2] => Array
            (
                [address] => Prager Straße 14/3/1/4
                [streetname] => Prager Straße
                [housenumber] => 14
                [stairs] => 3
                [floor] => 1
                [door] => 4
            )

        [3] => Array
            (
                [address] => Guentherstr. 43 B
                [streetname] => Guentherstr.
                [housenumber] => 43 B
                [stairs] =>
                [floor] =>
                [door] =>
            )

        [4] => Array
            (
                [address] => Eberhard-Leibnitz Str. 1/7
                [streetname] => Eberhard-Leibnitz Str.
                [housenumber] => 1
                [stairs] =>
                [floor] =>
                [door] => 7
            )

        [5] => Array
            (
                [address] => Schießstätte 7/7
                [streetname] => Schießstätte
                [housenumber] => 7
                [stairs] =>
                [floor] =>
                [door] => 7
            )

    )
See here: http://sandbox.onlinephpfunctions.com/code/3952b2f3cab251e7137bcd9d55e42d8c8bcdd723