Please help with a regex. Any language is fine, since I’ll translate it to Python later.
I’m trying to build a regex to capture the tag below:
#Facilitator:"Full Name <mail@mail.domain>"
- The full name can contain accented characters, as in José or Pâmela, or any other character available in the extended ASCII table.
- The full name can have 1, 2 or n family names, and may or may not end with a ‘(company name)’, like:
#Facilitator:"Name1 Name2 Name3 (Company Inc) <mail@domain>"
- The tag can appear 0, 1 or n times in strings.
- The tag can appear in any place of the string.
So far I have tried this (Python), without success:
import re
notes = 'Verbal confirmation #Facilitator:"Fernas P. Loyola (YARDA LTDA) <ope@yahoo.com>"from ATUX with Melanie. Waiting for scheduling#Facilitator:"Fernandes <v-rrlo@stttr.de>" #Facilitator:"Pablito Ferdinandes <papa@gmail.com>"'
facilitator_regex = '^.*((#Facilitator:".*"){1,}).*$'
regex_replace = '\1'
print(re.sub(facilitator_regex, regex_replace, notes))
The output I expect is a list of 0, 1 or more #tags separated by spaces.
Can anyone help, in any language? I mostly need help with the regex itself. Thanks so much.
Answer
You can find all the facilitators using re.findall with this regex:
'#Facilitator:"[^"]*"'
e.g.
import re  # `notes` is the sample string from the question
facilitator_regex = r'#Facilitator:"[^"]*"'
facilitators = re.findall(facilitator_regex, notes)
For your sample data this gives
[
'#Facilitator:"Fernas P. Loyola (YARDA LTDA) <ope@yahoo.com>"',
'#Facilitator:"Fernandes <v-rrlo@stttr.de>"',
'#Facilitator:"Pablito Ferdinandes <papa@gmail.com>"'
]
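The negated class `[^"]*` is what keeps each match inside its own pair of quotes. As a rough illustration, here is a small comparison against a greedy `.*` on a shortened, made-up string (the `sample` text and the variable names are assumptions for this sketch, not the question's data):

```python
import re

# Made-up sample with two tags, shorter than the string in the question.
sample = 'a #Facilitator:"Ann <a@x.com>" b #Facilitator:"Bob <b@y.org>" c'

# Greedy `.*` runs to the LAST double quote, so both tags merge into one match.
greedy = re.findall(r'#Facilitator:".*"', sample)

# `[^"]*` cannot cross a double quote, so each tag is matched separately.
negated = re.findall(r'#Facilitator:"[^"]*"', sample)

# With no tag in the text, findall simply returns an empty list.
none_found = re.findall(r'#Facilitator:"[^"]*"', 'no tags here')

print(len(greedy))   # 1
print(len(negated))  # 2
print(none_found)    # []
```

This also covers the "0, 1 or n times" requirement, since an empty list is a valid result.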
You could then use str.join to make a space-separated list:
print(' '.join(facilitators))
Output:
#Facilitator:"Fernas P. Loyola (YARDA LTDA) <ope@yahoo.com>" #Facilitator:"Fernandes <v-rrlo@stttr.de>" #Facilitator:"Pablito Ferdinandes <papa@gmail.com>"
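Putting the pieces together, a self-contained sketch might look like the following. The helper names `join_facilitator_tags` and `parse_facilitators`, and the finer-grained `PARTS_RE` pattern, are hypothetical additions for illustration. `PARTS_RE` assumes the name itself contains no `(`, `<` or `"`, which may not hold for all real data.

```python
import re

# Matches one whole tag: the literal prefix, then everything up to the next double quote.
TAG_RE = re.compile(r'#Facilitator:"[^"]*"')

# Hypothetical pattern that splits a tag into name, optional "(company)" and address.
# Assumption: the name contains no '(', '<' or '"'.
PARTS_RE = re.compile(
    r'#Facilitator:"'
    r'(?P<name>[^"(<]+?)\s*'
    r'(?:\((?P<company>[^")]*)\)\s*)?'
    r'<(?P<email>[^">]*)>"'
)


def join_facilitator_tags(text):
    """Return every tag found in `text`, joined by single spaces ('' if none)."""
    return ' '.join(TAG_RE.findall(text))


def parse_facilitators(text):
    """Return one dict per tag with 'name', 'company' (or None) and 'email'."""
    return [m.groupdict() for m in PARTS_RE.finditer(text)]


notes = ('Verbal confirmation #Facilitator:"Fernas P. Loyola (YARDA LTDA) <ope@yahoo.com>"'
         'from ATUX with Melanie. Waiting for scheduling'
         '#Facilitator:"Fernandes <v-rrlo@stttr.de>" '
         '#Facilitator:"Pablito Ferdinandes <papa@gmail.com>"')

print(join_facilitator_tags(notes))
for person in parse_facilitators(notes):
    print(person)
```

Compiling the patterns once is only a convenience here. Calling `re.findall` directly with the pattern string, as shown earlier, works just as well.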