Skip to main content

Hi all-- Please find below an introductory primer for the Ataccama Expressions language, for those familiar with Excel formulae (which are deliberately very similar in structure and use to the way we’ve designed the expression language)…

 

Introduction

Overview

One of Ataccama’s core values when developing Ataccama ONE is to create a user experience which simplifies as far as possible the technical complexities of data manipulation and data quality, and to put as many capabilities into the hands of business and non-technical users as possible.

With this philosophy in mind we have created a powerful expression language, which moves away from the complexity of development languages such as scripting/python/SQL. Despite containing powerful and extensible functionality for complex data challenges, the design of the expressions have been made as simple as possible, and can be easily learned by anyone with a familiarity in MS-Excel formulas and features.

This expression language is used throughout Ataccama ONE, in both the web and desktop applications. Expressions can be applied to any data regardless of the type of data source (SQL database, files, data streams, ETL, clustered environments, etc), removing the need for users to be familiar with the syntax of the various languages supported by these technologies. An expression / rule, once written, can be re-used and applied anywhere without the need to rewrite into differing programming languages.

This document provides an introduction to the expression language, its syntax and its use, alongside useful functions and a full expression reference guide within the appendix.

Audience

This document is aimed at Ataccama beginners with an expectation of some knowledge of the workings of Excel formulae and Conditional Formatting.

Document Structure

  • What are Ataccama Expressions?
  • Introduction to the syntax and style of the expressions, with side by side comparisons to Excel.
  • A list of useful and common expressions and functions.
  • Notes on advanced features which use regular expressions
  • The full expressions documentation from Ataccama’s documentation.

Introduction to Expressions and Formulae

What is an expression

Put simply an expression is a way of instructing an application to perform an action (or a series of actions) on one or more values, and to return a result. Every expression must therefore have; (i) some input values*, (ii) the list of actions to perform (and the sequence in which to perform them), (iii) a result.

( Purist note; In some cases (eg. the Excel function TODAY(), which returns today’s date, input values aren’t required.)*

Results of expressions

The result of an expression will be a specific type of data; text, whole number, decimal, etc. To take some simple examples…

Numeric expressions

The expression…

3 + 4

….would result in a number, 7. (The action is addition and the input values are 3 and 4.)

String expressions

The expression…

“ATA” & “CCAMA”

…In Excel, the action ‘&’ is an action used to concatenate text values together. Therefore the input values ‘ATA’ and ‘CCAMA’ (both text) are concatenated and the result is the text value ‘ATACCAMA’.

Conditional Expressions

Consider the following…

4 > 3

…The ‘>’ indicates a conditional action ‘is greater than?’. In natural language terms we’d read it as a question; “Is 4 greater than 3?”, the answer being ‘yes, of course it is!’.

Conditional expressions in both Excel and Ataccama result in a data type called boolean, a true/false value.

As we can see in Excel:

This formula equates to TRUE

Conditional expressions are one of the most important types of expression in Ataccama, especially when writing rules. Consider ‘Is this account number eight digits?’, ‘Does this policy type exist in my central list of policy types?’, ‘Is this value unique?’. When working with Ataccama rules all expressions you’ll use will result in a true-false (yes or no) result, known as a boolean.

Functions, Sequencing and Brackets

Function Basics

The above examples of simple arithmetic and string manipulation are only sufficient for quite basic operations. The value in the processing comes from advanced actions/functions, small pre-coded routines built into the software. Both Excel and Ataccama contain a huge set of built in functions across all data types:

The format of an expression in both Ataccama and Excel is the same;

ExpressionName ( < list of input parameters > )

For example, the expression Left ( “ATACCAMA”, 3 ) will return the text value ‘ATA’ (the leftmost three characters) in both Excel and Ataccama.

Nesting Functions and Sequencing

In both Excel and Ataccama it is possible to embed functions within functions, and this is achieved by using brackets.

This expression…

Upper (Left (”AtaccamaONE”, 😎 & “ Data Quality”)

…would return ‘ATACCAMA DATA QUALITY’, in upper case. When functions are nested with brackets they are evaluated from the innermost bracket (the function ‘left’ in this case), then any operations (in this case the concatenation ‘&’, and so on, finally with the outer function ‘Upper’ (a function which puts any text into full upper case) last.

Excel and Ataccama’s syntax do however have some subtle differences between them and we cover key differences later in the document.

Useful Ataccama Functions

Text/string functions

Expression Description
left(srcStr, count) Returns a new string that is a substring of srcStr containing count of chars taken from the start of the srcStr.
right(srcStr, count) Returns a new string that is a substring of srcStr containing count of chars taken from the end of the srcStr.
substr(srcStr, beginIndex),  
substr(srcStr, beginIndex, strLen) Returns a new string that is a substring of the string srcStr.
transliterate(srcStr, charsFrom, charsTo) Transforms characters of the string srcStr.
containsWord(srcStr, srcWord) Searches for the occurrence of the word srcWord in the string srcStr.
indexOf(srcStr, subStr), indexOf(srcStr,  
subStr, fromIndex) Returns the index within the string srcStr of the first occurrence of the specified substring subStr.
lastIndexOf(strStr, subStr) The sister function to lastIndexOf(), this super useful function will return the position of the final instance of subStr. This is often used to obtain last “/” in a file path so you can get rid of everything in front and just get the file name.
lower(srcStr) Transforms all characters of the string srcStr to lower case.
upper(srcStr) Transforms all characters of the string srcStr to upper case.
capitalize(srcStr) Transforms the first character of each word to upper case and all following characters to lower case.
replicate(srcStr, n) Returns n copies of the string srcStr concatenated together without any separator.
replace(srcStr, what, withWhat) Replaces occurrences of the string what with the string withWhat in the string srcStr.

Regular Expression Functions ( See ‘Understanding Regular Expressions’ below)*

Regular Expressions are a highly tuned, efficient way of evaluating text based formats and widely used throughout Ataccama. Although they can be daunting at first, once understood the use of regular expressions can vastly improve the speed of creating rules.

Regular Expressions are a highly tuned, efficient way of evaluating text based formats and widely used throughout Ataccama. Although they can be daunting at first, once understood the use of regular expressions can vastly improve the speed of creating rules.

Expression Description
find(srcRegex, srcStr  
e, caseInsensitive]) Verifies whether the string srcStr or its parts match the regular expression srcRegex.
matches(srcRegex, srcStr <,  
caseInsensitive]) Verifies whether the string srcStr matches exactly the pattern of the regular expression srcRegex.
substituteAll(srcPattern, srcReplacement,  
srcStr o, caseInsensitiveFlag]) Replaces all occurrences of srcPattern in string srcStr with srcReplacement.
substituteMany(srcPattern,  
srcReplacement, srcStr, srcVolume S, caseInsensitiveFlag]) Replaces all occurrences of srcPattern in the string srcStr with srcReplacement, the maximum number of replacements is defined by the parameter srcVolume.

Conversion Functions

Expression Description
toDate(expr, dateFormate, dateLocale]) Returns the date specified in expression expr converted to date type DAY.
toString(expr, strFormate, strLocale]) Converts the expression expr to a STRING value.

Datetime Functions

Expression Description
datePart(srcDate, fieldName) This function returns the value of the field fieldName (YEAR, MONTH or DAY) of srcDate.
dateDiff(startDate, endDate, fieldName) Returns the difference between endDate and startDate expressed in fieldName units.
today() Returns the current date in type DAY.
dateAdd(srcDate, srcValue, fieldName) Adds the specified srcValue of the type specified by fieldName (YEAR, MONTH or DAY) to the srcDate.

Conditional Functions

Expresion Description
iif(ifExpr, trueExpr, elseExpr) Returns trueExpr if ifExpr is TRUE.
case(expr, exprValueh, expr, exprValue]...f, defaultExpr]) Returns the value of expression exprValue immediately following the first expression expr whose value is TRUE.
decode(decodeExpr, expr, exprValues, expr  
, exprValue]...i, defaultExpr]) Returns the value of expression exprValue immediately following the first expression expr whose value is equal to decodeExpr.
nvl(exprl, expr]...) Returns the value of the first expression expr whose value is not NULL.

Excel vs. Ataccama - Key Differences

Although Ataccama’s expression language follows the syntax and structure of Excel, there are a few subtle differences between the two. (This is due to the nature of the underlying programming language of Ataccama and the manner in which its expressions are executed; Some operators such as ‘=’ or ‘&’, or functions such as ‘if’ are already reserved for other purposes within Ataccama and so alternatives had to be chosen.)

Below we present a list of key differences between the two;

Excel Ataccama Equivalent Notes
& + The & symbol in Excel will concatenate two text values together. Within Ataccama, the + symbol is used for addition when working with numeric values, or concatenation when working with text.

In Excel, 4+3 would return 7 In Ataccama, 4+3 also returns 7

In Excel, “Ata” & “ccama” returns “Ataccama”, a text. The Ataccama equivalent is “Ata” + “ccama” | | if | iif | The function IF, in Excel has the following syntax:

IF (conditional-expression, value-if-true, value-if-false)

Within Ataccama, this works in exactly the same way but the function is named iif, with two is.

IF ( A > B, “A is greater than B”, “A is not greater than B” ) becomes IIF ( A > B, “A is greater than B”, “A is not greater than B” ) | | mid | substr | In order to extract a sub-set of a piece of text, within Excel we would use the function ‘mid’. For example mid (”Ataccama”, 4, 2) would return ‘cc’. Within Ataccama the function substr (substring) is used to achieve the same result. It is important to note that within Ataccama text the first character is numbered 0 (vs. 1 in Excel), so an equivalent in Ataccama is: substr (”Ataccama”, 3, 2). |

Advanced Feature - Understanding Regular Expressions

Introduction

Regular expressions are an industry standard way of describing the pattern or structure of data, their use dating back to the early 1950s. Regex, used within Ataccama functions, are widely used across all industries and organisations to describe standard pattern or structure of a specific type of data. When creating logic in Ataccama it’s often easy to find a regex pattern which describes the data you are trying to validate.

There are numerous online sites, government portals, or open source regular expressions available to use within your Ataccama expressions!

Worked Example

Example

UK National Insurance numbers (UK’s social security number) are defined here:

https://www.regexlib.com/REDetails.aspx?regexp_id=527

A national insurance number is formed of two (specific) characters, six digits, and an optional final character. The regex for this is:

^"A-CEGHJ-PR-TW-Z]{1}iA-CEGHJ-NPR-TW-Z]{1}?0-9]{6}iA-DFM]{0,1}$

Within Ataccama we are able to use the function ‘matches’, to check this:

matches ( myTextValue, “^tA-CEGHJ-PR-TW-Z]{1}hA-CEGHJ-NPR-TW-Z]{1}H0-9]{6}-A-DFM]{0,1}$” )

Unpicking this example

At first glance, for those unfamiliar with regular expressions, a regular expression can be rather daunting. However, once a few simple rules are understood, they become fairly straight forward to read.

The structure of a regex pattern is describing the format of data for text values to match against. (Hence the function ‘matches’ being used above. Ataccama will attempt to match the input value against the regex string.)

In the above example:

  • The symbol ^ indicates the beginning of the text.
  • The symbol $ indicates the end of the text.
  • Everything within ^ and $ will be part of the matching process.
  • A set of brackets indicates an optional list of characters, and the ‘-’ found within indicates a range. ^ A-E] would therefore mean “The first character of this piece of text must be one of A, B, C, D, E, only”.
  • Therefore rA-CEGHJ-PR-TW-Z] will match one character out of ABCEGHJKLMNOPRTWXYZ.
  • The numbers in curly brackets {} indicate the number of times a character can be repeated.
  • Therefore ,0-9]{6} can be read as ‘at this point in the text, we must find one of the digits 0-9, repeated six times.

Regular Expressions Basics

The below summary has been inspired from this comprehensive guide - one of many brilliantly written articles available online covering this subject; https://medium.com/factory-mind/regex-tutorial-a-simple-cheatsheet-by-examples-649dc1c3f285

Character Matching and Repeated Characters

Logic Description Example
^The Matches any text beginning ‘The’ The house - matches correctly
the house - would not. (Note that regex is ‘case sensitive’, and the upper and lower case must match exactly).    
end$ Matches any text with the word end, at the end. In the end - matches correctly
In the end I wrote something else - does not match    
* Matches any number of characters matches a string that has ab followed by zero or more c
+ Indicates one or more of the preceeding character matches a string that has ab followed by one or more c
? Indicates zero or more of the proceeding character matches a string that has ab followed by zero or one c
{x} Indicates the text must match exactly ‘x’ number of repeated characters. matches a string that has ab followed by 2 c
{x,} The text must match x or more instances of the preceding character matches a string that has ab followed by 2 or more c
{x,y} The text must match x-y instances matches a string that has ab followed by 2 up to 5 c
a(bc) Brackets are a method to group together specific characters when using these operators. In this case the text bc must match what follows. matches a string that has a followed by zero or more copies of the sequence bc
a(bc){2,5} As in this example, a must be followed by two to five instances of the text bc. matches a string that has a followed by 2 up to 5 copies of the sequence bc

Two was to specify ‘this OR that’.

Logic Description Example
a(b c) The
a bc] As mentioned above, any series of characters within square brackets will match one of the characters within.  
The - character can be used to specify a range. /0-9] for example, would match any digit.  
asbc] achieves the same ab or ac match as in the first example using a pipe .  

Matching Characters

Logic Description Example
\d This will match any digit 0-9 \d* would match any number (without decimals)
\w This will match any ‘word’ character; that is any upper or lower case letter, or a _ underscore. \w* would match any word such as Ataccama_ONE. Ataccama ONE (containing a space) would not match.
\s This will match any whitespace / spacebar character.  
\D \W \S The capitalised versions of these tags will match the opposite of their lower case equivalent. For example \D matches any non-digit.  

Further Reading

There are numerous in-depth resources for understanding regular expressions available online, and the topic is larger than this primer will allow. The site regex101.com provides comprehensive detailed documentation and a tool for testing regular expressions.

Summary

This concludes the introductions to Ataccama’s expression language, and by this point you should have enough information to create rules and logic within both the desktop and web applications using this powerful feature. Please do let us know of any questions you may have, to be included in future versions.

Thanks for reading!

 

 

Hello,

I am doing POC, could you please help me out to get below scenario.

Scenario:

I need nth date (previous run data ) and current run date sale value and its threshold


I am also doing POC need help in writing complex rules.


@matthewtayloruk Do you have a list of functions in ataccama?


@matthewtayloruk Do you have a list of functions in ataccama?

Hi Karthika -- The full expression manual can be found at the documentation portal (available to Ataccama customers, you’d need to register for a login here):

https://support.ataccama.com/home/docs/aip/latest/user-guides/one-web-application-user-guide/one-basics/one-expressions

Enjoy!


Thank you! @matthewtayloruk 


@Adela Sotonova could you please give me the access on my user id as group user ?

Thank you,

Mukesh


@matthewtayloruk@Adela Sotonova 

How to get the access of below link ? it says my user id is not a group user id. Please help, how to get access or how to create group user id. Thanks in advance, Mukesh

 

https://support.ataccama.com/home/docs/aip/latest/user-guides/one-web-application-user-guide/one-basics/one-expressions

Screen shot

 


Hi @mukesh ,

can you confirm that you are logging in with your company (or client) e-mail? This link is available only to our customers.

If you still have trouble with this, please, let me know and we can raise a ticket to our Support team.


Hi @Adela Sotonova 

No, I am logging with my personal mail id only.  Okey I will register with my client id and try to login.

 

Thanks,

Mukesh


@Adela Sotonova back with client id

 


Reply