I need to extract the house number from Austrian addresses, covering all the different combinations of stairs, floor and door:
| Street name | housenumber | stairs | floor | door |
| --------------------------------------- | ----------- | ------ | ----- | ---- |
| Lilienstr. 12a | 12a | | | |
| Leibnizstraße 36/28/2 | 36 | 28 | | 2 |
| Prager Straße 14/3/1/4 | 14 | 3 | 1 | 4 |
| Guentherstr. 43 B | 43 B | | | |
| Eberhard-Leibnitz Str. 1/7 | 1 | | | 7 |
| Schießstätte 7/7 | 7 | | | 7 |
I’ve already found this question: Regex to extract (german) street number.
The regex from that question works as long as no stairs/floor/door is entered. Can you help?
^[ \-0-9a-zA-ZäöüÄÖÜß.]+?\s+(\d+(\s?[a-zA-Z])?)\s*(?:$|\(|[A-Z]{2})
Answer
Not knowing Austrian address formats, it's hard for me to say whether this is fully correct, but please see the regex below.
^(.*)\s+(\d+(?:\s*[a-zA-Z])?)(?:\/(\d+))?(?:\/(\d+))?(?:\/(\d+))?\s*(?:$|\(|[A-Z]{2})
The expression captures up to four number groups (1/2/3/4) by position, so you will need some additional processing to determine what each number means: an address with four numbers has a house number, stairs, floor and door, whereas an address with two numbers only has a house number and a door.
For example:
<?php
// Escapes (\s, \d, \/, \() are required; the pattern is wrapped in "/" delimiters below.
$pattern = '^(.*)\s+(\d+(?:\s*[a-zA-Z])?)(?:\/(\d+))?(?:\/(\d+))?(?:\/(\d+))?\s*(?:$|\(|[A-Z]{2})$';

$addresses = [
    'Lilienstr. 12a',
    'Leibnizstraße 36/28/2',
    'Prager Straße 14/3/1/4',
    'Guentherstr. 43 B',
    'Eberhard-Leibnitz Str. 1/7',
    'Schießstätte 7/7'
];

$results = [];

foreach ( $addresses as $address ) {
    // 0. Full match
    // 1. Streetname
    // 2. Housenumber
    // 3. Stairs
    // 4. Floor
    // 5. Door
    preg_match( '/' . $pattern . '/', $address, $matches );

    // Remove the full match from the array
    array_shift( $matches );

    // Set up default values
    $streetname  = array_shift( $matches );
    $housenumber = null;
    $stairs      = null;
    $floor       = null;
    $door        = null;

    // Count how many number groups were actually captured
    $total = count( array_filter( array_map( 'trim', $matches ) ) );

    switch ( $total ) {
        // Has all 4 parts
        case 4:
            $housenumber = $matches[ 0 ];
            $stairs      = $matches[ 1 ];
            $floor       = $matches[ 2 ];
            $door        = $matches[ 3 ];
            break;

        // Only has 3 parts
        case 3:
            $housenumber = $matches[ 0 ];
            $stairs      = $matches[ 1 ];
            $door        = $matches[ 2 ];
            break;

        // Only has 2 parts
        case 2:
            $housenumber = $matches[ 0 ];
            $door        = $matches[ 1 ];
            break;

        // Has 1 part
        default:
            $housenumber = $matches[ 0 ];
            break;
    }

    // Add to results array
    $results[] = [
        'address'     => $address,
        'streetname'  => $streetname,
        'housenumber' => $housenumber,
        'stairs'      => $stairs,
        'floor'       => $floor,
        'door'        => $door
    ];
}

print_r( $results );
Output
Array
(
    [0] => Array
        (
            [address] => Lilienstr. 12a
            [streetname] => Lilienstr.
            [housenumber] => 12a
            [stairs] =>
            [floor] =>
            [door] =>
        )

    [1] => Array
        (
            [address] => Leibnizstraße 36/28/2
            [streetname] => Leibnizstraße
            [housenumber] => 36
            [stairs] => 28
            [floor] =>
            [door] => 2
        )

    [2] => Array
        (
            [address] => Prager Straße 14/3/1/4
            [streetname] => Prager Straße
            [housenumber] => 14
            [stairs] => 3
            [floor] => 1
            [door] => 4
        )

    [3] => Array
        (
            [address] => Guentherstr. 43 B
            [streetname] => Guentherstr.
            [housenumber] => 43 B
            [stairs] =>
            [floor] =>
            [door] =>
        )

    [4] => Array
        (
            [address] => Eberhard-Leibnitz Str. 1/7
            [streetname] => Eberhard-Leibnitz Str.
            [housenumber] => 1
            [stairs] =>
            [floor] =>
            [door] => 7
        )

    [5] => Array
        (
            [address] => Schießstätte 7/7
            [streetname] => Schießstätte
            [housenumber] => 7
            [stairs] =>
            [floor] =>
            [door] => 7
        )

)
See here: http://sandbox.onlinephpfunctions.com/code/3952b2f3cab251e7137bcd9d55e42d8c8bcdd723
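If you want this as a reusable function, the same idea can be sketched more compactly with a lookup table instead of the switch. Note that this is only a sketch under a few assumptions: `parseAddress()` and `$layouts` are hypothetical names, the pattern is simplified to anchor at the end of the string (the `\(` and `[A-Z]{2}` alternatives are dropped), the `u` modifier assumes UTF-8 input, and the meaning of 2 or 3 numbers is taken from the table in the question. It counts the captured groups directly rather than passing them through `array_filter`, so a door or floor of `0` should not be discarded.

```php
<?php
// Hypothetical helper, not part of the original answer: maps the positional
// number groups to named fields depending on how many were captured.
function parseAddress(string $address): ?array
{
    // Simplified variant of the pattern above (assumption: nothing follows the numbers).
    $pattern = '/^(.*)\s+(\d+(?:\s*[a-zA-Z])?)(?:\/(\d+))?(?:\/(\d+))?(?:\/(\d+))?\s*$/u';

    if (!preg_match($pattern, $address, $m)) {
        return null; // no house number found
    }

    // $m[0] is the full match, $m[1] the street name; the rest are 1 to 4 numbers.
    // preg_match() omits trailing groups that did not participate in the match.
    $numbers = array_slice($m, 2);

    // Which fields the numbers stand for, keyed by how many were given.
    $layouts = [
        1 => ['housenumber'],
        2 => ['housenumber', 'door'],
        3 => ['housenumber', 'stairs', 'door'],
        4 => ['housenumber', 'stairs', 'floor', 'door'],
    ];

    $result = [
        'streetname'  => $m[1],
        'housenumber' => null,
        'stairs'      => null,
        'floor'       => null,
        'door'        => null,
    ];

    foreach ($layouts[count($numbers)] as $i => $field) {
        $result[$field] = $numbers[$i];
    }

    return $result;
}

// Example call
print_r(parseAddress('Prager Straße 14/3/1/4'));
```

The function returns `null` when the address does not end in a recognisable house number, so the caller can decide how to handle such input.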