Solved

Rules on Names


Hello, I am trying to create a validity rule for names. The rule should match names that meet these conditions:

  • Title case: the first letter is uppercase, followed by all lowercase letters
  • May be accented
  • May contain special characters
  • Mandatory
  • May contain a hyphen (-)

I have this regex ^[a-zA-Z '-]+$ 

^[A-Z][a-z]*(?:[ '-][A-Z][a-z]*)*$

(?<=^|\s)[a-z]*[A-Z][a-z]*[A-Z]*[a-z]*

 

The regex above captures and validates names like

John Doe

Mary Anne, which is fine. But it also validates names in all lowercase, for example john doe, which is not okay. What do I need to change so that the regex does not validate all-lowercase names?
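One way to see which expression lets all-lowercase names through is to run the first two patterns from the post against the sample names. The sketch below uses Python's re module purely for illustration; the rule engine may use a different regex flavor, and the helper name is_valid is hypothetical. The third pattern is left out because it uses a variable-width lookbehind, which Python's re does not accept.

```python
import re

# The first two patterns from the post, copied unchanged.
PERMISSIVE = r"^[a-zA-Z '-]+$"
TITLE_CASE = r"^[A-Z][a-z]*(?:[ '-][A-Z][a-z]*)*$"


def is_valid(pattern: str, name: str) -> bool:
    """Return True when the whole name matches the pattern (hypothetical helper)."""
    return re.fullmatch(pattern, name) is not None


# Print, for each sample name, the result under each pattern.
for name in ["John Doe", "Mary Anne", "john doe"]:
    print(name, is_valid(PERMISSIVE, name), is_valid(TITLE_CASE, name))
```

The first pattern accepts john doe because [a-zA-Z] allows a lowercase letter in any position, including the first. The second pattern requires [A-Z] at the start of every word, so it rejects john doe.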

 


13 replies


Hi @olayinkadaramola,

 

Please try this regular expression to get your results: \b(?:[A-Z][a-z' -]*|[A-Z][a-z'-]*(?:[ -][A-Z][a-z' -]*)*)\b

 

Hope this works.

Regards,

Srija Piratla


Hello Srija, thanks for the response. The regex above did not work. Is there something I'm not doing right? See the attached picture. Names like Ola, Mary, and O’brien should be valid, but they all show as invalid. I could see from the screenshot you sent earlier that they were valid on your side. Please advise on what I did wrong.

 



Hi @olayinkadaramola,

I see that you have a ‘-’ at the start of the expression. Can you remove it and use only this:

\b(?:[A-Z][a-z' -]*|[A-Z][a-z'-]*(?:[ -][A-Z][a-z' -]*)*)\b

Please try this and let me know if you are still facing any issues.

 

Hope this helps!

 

Regards,

Srija Piratla


Yes, the regex works. However, I need a small modification: is it possible for both O’brien and O’Brien to show as valid? If it is not possible to accept both while maintaining the other conditions, the preferred spelling is O’Brien. I also noticed that it does not validate accented names, as seen in the picture below, for example

Stéphane-Henri and

Jean-Pierre O'Brien

 



Hi @olayinkadaramola ,

Please try this expression:

\b(?:[A-Z][a-z'à-öø-ÿ' -]*)(?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*(?:'[A-Z][a-z'à-öø-ÿ' -]*)?\b

I checked it against all the use cases you mentioned. Let me know if something is still missing.
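The effect of the accented ranges can be checked with a short script. This is a minimal sketch in Python's re module, assuming the accented ranges in the expression are à-ö and ø-ÿ and that the rule is applied to the whole value; the variable names are illustrative only.

```python
import re

# Pattern from the reply, with the accented ranges written as à-ö and ø-ÿ
# (an assumption about the intended characters).
ACCENTED = (
    r"\b(?:[A-Z][a-z'à-öø-ÿ' -]*)"
    r"(?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*"
    r"(?:'[A-Z][a-z'à-öø-ÿ' -]*)?\b"
)

names = ["Stéphane-Henri", "Jean-Pierre O'Brien", "O'brien", "O'Brien", "john doe"]

# Map each name to True (valid) or False (invalid) under whole-string matching.
results = {name: re.fullmatch(ACCENTED, name) is not None for name in names}

for name, ok in results.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

The test names use a straight apostrophe ('), which is the character the pattern lists. A typographic apostrophe (’) is a different character and is not covered by the character class.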

 

Regards,

Srija Piratla


Thanks, Srija.

The regex above works for all the use cases using a single condition.

I would like to know if this regex (\b(?:[A-Z][a-z'à-öø-ÿ' -]*)(?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*(?:'[A-Z][a-z'à-öø-ÿ' -]*)?\b) can be broken down for easier comprehension, for instance into one expression that only matches

  • a word in title case (one uppercase letter followed by lowercase letters)
  • another condition for special characters like the apostrophe, hyphen, and space
  • and another rule for accented characters

so that a single rule has multiple conditions and still gives the same result as above for all the test names.

 

I was able to split the regex, but the split expressions did not validate the names when I tested the rules. I may have made a mistake when splitting, but below is what I have. Please clarify whether it is equivalent to the single regex, or what I am missing:

  • (?:[A-Z][a-z'à-öø-ÿ' -]*)
  • (?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*
  • (?:'[A-Z][a-z'à-öø-ÿ' -]*)?

Result of the test: all use cases should have been valid, but they are not.
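A short script can show why the three fragments behave differently on their own. This is a sketch in Python's re module, assuming the accented ranges are à-ö and ø-ÿ and that each condition is matched against the whole value; how the rule engine actually evaluates conditions is not confirmed here, and the names FIRST_WORD, MORE_WORDS, APOSTROPHE_TAIL, and whole_match are hypothetical.

```python
import re

# The three fragments from the post (ranges assumed to be à-ö and ø-ÿ).
FIRST_WORD = r"(?:[A-Z][a-z'à-öø-ÿ' -]*)"
MORE_WORDS = r"(?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*"
APOSTROPHE_TAIL = r"(?:'[A-Z][a-z'à-öø-ÿ' -]*)?"


def whole_match(pattern: str, name: str) -> bool:
    """Return True when the pattern matches the entire name."""
    return re.fullmatch(pattern, name) is not None


# Concatenating the fragments in order rebuilds the single expression
# (without the surrounding \b anchors).
combined = FIRST_WORD + MORE_WORDS + APOSTROPHE_TAIL

for name in ["Ola", "Ola-Ola", "O'Brien"]:
    print(
        name,
        whole_match(FIRST_WORD, name),
        whole_match(MORE_WORDS, name),
        whole_match(APOSTROPHE_TAIL, name),
        whole_match(combined, name),
    )
```

The fragments are consecutive pieces of one match, each consuming a different part of the name, so they are not independent conditions. Checked separately against the full value, the second fragment expects the value to start with a hyphen or space and the third expects it to start with an apostrophe, so neither can match a name on its own.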

 


In addition to the above, do you have references on rule creation that I can learn from? Please share any resources on how to create rules, from basic to complex ones.


  • Ataccamer
  • 49 replies
  • Answer
  • June 5, 2024

Hi @olayinkadaramola 

  • Mandatory case
  • A word in title case (one uppercase letter followed by lowercase letters): \b(?:[A-Z][a-z]*)(?: [A-Z][a-z]*)*\b
  • Another condition for special characters like the apostrophe, hyphen, and space: \b[A-Z][a-z]*(?:[-' ][A-Z][a-z]*)*\b
  • Another rule for accented characters: \b(?:[A-Z][a-z'à-öø-ÿ' -]*)(?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*(?:'[A-Z][a-z'à-öø-ÿ' -]*)?\b

Above is the breakdown of expressions for each condition you asked about. I see that you are trying to put all the split conditions into one rule. This won’t work because there are multiple conditions. For example, a name like Ola-Ola belongs to the 3rd test rule, but the check fails at the 2nd test rule and throws an error (title case: uppercase followed by lowercase). It therefore never reaches the 3rd test rule, which results in the error you showed above. As a workaround, you can create 3 separate rules for the 3 conditions and assign all of them to the attribute you want to validate.
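To make the Ola-Ola example concrete, the three expressions can be run side by side. This is only a sketch in Python's re module, assuming whole-value matching and that the accented ranges are à-ö and ø-ÿ; the rule labels and the passing_rules helper are hypothetical and do not reflect how the product stores or evaluates rules.

```python
import re

# The three expressions from the breakdown, keyed by illustrative labels.
RULES = {
    "title_case": r"\b(?:[A-Z][a-z]*)(?: [A-Z][a-z]*)*\b",
    "special_chars": r"\b[A-Z][a-z]*(?:[-' ][A-Z][a-z]*)*\b",
    "accented": (
        r"\b(?:[A-Z][a-z'à-öø-ÿ' -]*)(?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*"
        r"(?:'[A-Z][a-z'à-öø-ÿ' -]*)?\b"
    ),
}


def passing_rules(name: str) -> list[str]:
    """Return the labels of the rules whose pattern matches the whole value."""
    return [label for label, pattern in RULES.items() if re.fullmatch(pattern, name)]


for name in ["John Doe", "Ola-Ola", "O'Brien", "Stéphane-Henri"]:
    print(name, passing_rules(name))
```

Ola-Ola fails the title-case expression because that expression only allows a space between words, so a single rule that stops at the first failing condition would report it as invalid before the later conditions are checked.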

Here is the documentation for creating DQ rules - https://docs.ataccama.com/one/latest/data-quality/create-dq-rule.html

Hope this helps.

Regards,

Srija Piratla


Cansu
  • Community Manager
  • 625 replies
  • June 6, 2024

Hi @olayinkadaramola, I’m closing this thread for now. If you have any follow-up questions, please feel free to share them here or create a new post 🙋‍♀️


@srija piratla ,

Can this rule be made to reject names like Ola ola? I want the first letter of every word to be uppercase. I have tried \b[A-Z][a-z'à-öø-ÿ']*(?: [A-Z][a-z'à-öø-ÿ']*)*\b, but it is not passing validation.



Hi @olayinkadaramola ,

I see you want a word in title case (one uppercase letter followed by lowercase letters): \b[A-Z][a-z'à-öø-ÿ']*(?: [A-Z][a-z'à-öø-ÿ']*)*\b

I used this expression; please check that Ola ola gives invalid, Ola Ola gives valid, and Ola OLA gives invalid.
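The three cases can be reproduced outside the tool. A minimal sketch in Python's re module, assuming whole-value matching and that the accented ranges are à-ö and ø-ÿ; the name TITLE_WORDS is illustrative.

```python
import re

# Title-case expression from the reply (ranges assumed to be à-ö and ø-ÿ).
TITLE_WORDS = r"\b[A-Z][a-z'à-öø-ÿ']*(?: [A-Z][a-z'à-öø-ÿ']*)*\b"

# Map each test name to True (valid) or False (invalid).
checks = {
    name: re.fullmatch(TITLE_WORDS, name) is not None
    for name in ["Ola ola", "Ola Ola", "Ola OLA", "Ola-Ola"]
}
print(checks)
```

Under these assumptions Ola-Ola is also rejected, because this expression only allows a space between words and does not list the hyphen anywhere.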

Let me know where it is failing for you and hope this helps to solve your issue :)

Regards,

Srija Piratla


When I use the regex \b[A-Z][a-z'à-öø-ÿ']*(?: [A-Z][a-z'à-öø-ÿ']*)*\b and test with Ola-Ola, it shows as invalid when it should be valid. O’Brien also shows as invalid, as do the others I marked in the picture above.

 

If I use the regex \b(?:[A-Z][a-z'à-öø-ÿ' -]*)(?:[- ][A-Z][a-z'à-öø-ÿ' -]*)*(?:'[A-Z][a-z'à-öø-ÿ' -]*)?\b, all the exceptions above are validated as valid, which is okay. The problem is that it also validates names like Ola ola as valid, which is wrong.

 



Hi @olayinkadaramola ,

 

The one I gave only satisfies the first condition (a leading uppercase letter). For the examples in your screenshot, please use the expressions below.

For the first condition, which should cover special characters like the apostrophe, hyphen, and space: \b[A-Z][a-z]*(?:[-' ][A-Z][a-z]*)*\b

For the second regex you gave, try this: \b(?:[A-Z][a-z'à-öø-ÿ'-]*)(?:[\s-][A-Z][a-z'à-öø-ÿ'-]*)*(?:'[A-Z][a-z'à-öø-ÿ'-]*)?\b
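The difference from the earlier expression is that the space is no longer inside the character classes. A quick check of that change, sketched in Python's re module under the assumptions that the rule matches the whole value and that the accented ranges are à-ö and ø-ÿ (REVISED is an illustrative name):

```python
import re

# Revised expression from the reply: no space inside the character classes,
# and \s or - as the separator between words.
REVISED = (
    r"\b(?:[A-Z][a-z'à-öø-ÿ'-]*)"
    r"(?:[\s-][A-Z][a-z'à-öø-ÿ'-]*)*"
    r"(?:'[A-Z][a-z'à-öø-ÿ'-]*)?\b"
)

names = [
    "Ola ola",
    "Ola Ola",
    "Ola-Ola",
    "Ola-ola",
    "O'Brien",
    "O'brien",
    "Stéphane-Henri",
    "Jean-Pierre O'Brien",
]

# Map each name to True (valid) or False (invalid).
verdicts = {name: re.fullmatch(REVISED, name) is not None for name in names}

for name, ok in verdicts.items():
    print(f"{name!r}: {'valid' if ok else 'invalid'}")
```

With the space removed from the character class, a lowercase word can no longer follow a space, so Ola ola is rejected. The hyphen and apostrophe are still inside the class, so Ola-ola and O'brien continue to pass along with Ola-Ola and O'Brien.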

 

 

Hope this helps !

Regards,

Srija Piratla

