EXTRACT and CALL implementations

Address (address)

Extracts the longest possible address entity from utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘hu’, ‘en’]

Description: Extracts the longest possible address entity from utterance. Returns the address as a dictionary with keys: street, city and zipcode. If a key has no value, it won’t be returned as part of the output dictionary. If no value is extracted, it returns None.

Raises:

  • ValueError if there is an invalid parameter of language, parameter has to be ‘cs’ or ‘sk’

  • KeyError if missing required parameter language

YAML usage:

CALL:
    - user_address: address
    # user address is for example {"street": "Masarykova 2", "city": "Brno", "zipcode": "60200"}
    - user_address_wo_street_number: [address, {'require_street_number': False}]
    # user address is for example {"street": "Masarykova", "city": "Brno", "zipcode": "60200"}

Output:

{'type': 'dict', 'desc': 'Returns Dict[str, str]: {`street`: value, `city`: value, `zipcode`: value} contained in utterance.'}

Address (advanced_address)

Extracts the longest possible address entity from utterance.

Currently supported languages: [‘cs’, ‘sk’]

Description: Extracts an address entity consisting of Street, Street Number, Zipcode, City. Extracts exactly one address; if several are present, it returns the first one found. Multi-address extraction is not supported yet. Can return the anonymized input utterance instead of the found address if the relevant argument is set.

Args:

require_street_number: {'type': 'bool', 'desc': 'switch to make street number a mandatory element of a valid street entity'}

street_cities: {'type': 'bool', 'desc': 'switch to allow overlap between streets and cities; cities without streets can be considered valid street names for themselves'}

blacklist: {'type': 'list', 'desc': 'list of address entities to not consider a real or relevant address object'}

valid_address_pattern: {'type': 'list', 'desc': 'list of combinations of elements (street, city, zipcode) that a valid address object must contain'}

street_number_separated: {'type': 'bool', 'desc': 'output street number separate from street name'}

anonymize_utterance: {'type': 'bool', 'desc': 'switch to anonymize input text by removing detected entities'}

Output:

{'type': 'dict', 'desc': 'Returns Dict[str, str]: {`street`: value, `city`: value, `zipcode`: value} contained in utterance.'}

Advanced number (advanced_number)

Return number from utterance according to the criteria.

Currently supported languages: Any

Description: Return number from utterance if it matches criteria and return it as string to support leading zeros. If no value is found, return None.

Rules:

  • if user says hello 1 2 3 10 20 1, function extracts 12310201

  • if user says a group of numbers delimited by letters, e.g. num 1 2 3 and 4, it tries both 123 and 4

  • it tries to join 20/30/40/50/60/70/80/90/100 with following single digit X into 2X, 3X, e.g. 10 2 20 2 100 2 to 10 2 22 100 2 and also 10 2 20 2 102

  • if there are more groups of numbers or more possibilities how to join a group of numbers, function returns just the first extracted interpretation

YAML usage:

EXTRACT:
    - first_number: [advanced_number, {"min_digits": 8, "max_digits": 8, "pattern": ".*1.*"}]
    # "number 1 2 3 13, and 1, 2 1" => extracts 12313121 (exactly 8 digits containing number 1)
    - second_number: [advanced_number, {"min_digits": 5, "max_digits": 5}]
    # "my number is 1 2 3 13, and 1, 2 1" => extracts 12313 (exactly 5 digits)
    - third_number: [advanced_number, {"min_digits": 3}]
    # "number is 20 1 20 3 5" => extracts 212035 (minimum 3 digits, numbers 20 and 1 are joined to 21)

Args:

min_digits: {'type': 'int', 'desc': 'number must contain at least `min_digits` digits, minimal is always 1'}

max_digits: {'type': 'int', 'desc': 'number can contain at most `max_digits` digits, default maximum is 255'}

pattern: {'type': 'str', 'desc': 'number must match regex pattern if given'}

return_type: {'type': 'str', 'desc': '`int` or `str`'}

Output:

{'type': 'str', 'desc': 'If information about integer numbers is present, join numbers and return them as a string.'}
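The min_digits, max_digits and pattern criteria can be pictured as a filter over candidate digit strings. The sketch below is only an illustration under assumptions: the helper name first_matching_number and the candidate list are hypothetical, the pattern is assumed to be applied to the whole number with re.fullmatch, and the way the real extractor builds candidates from the utterance is not reproduced.

```python
import re
from typing import Iterable, Optional


def first_matching_number(
    candidates: Iterable[str],
    min_digits: int = 1,
    max_digits: int = 255,
    pattern: Optional[str] = None,
) -> Optional[str]:
    """Return the first candidate that satisfies all criteria, else None."""
    for candidate in candidates:
        if not candidate.isdigit():
            continue  # only pure digit strings are considered numbers
        if not (min_digits <= len(candidate) <= max_digits):
            continue  # wrong length
        if pattern is not None and re.fullmatch(pattern, candidate) is None:
            continue  # assumption: the regex must match the whole number
        return candidate  # kept as a string so leading zeros survive
    return None


# Hypothetical candidate interpretations of "number 1 2 3 13, and 1, 2 1".
candidates = ["12313", "1", "21", "12313121"]
print(first_matching_number(candidates, min_digits=8, max_digits=8, pattern=".*1.*"))
print(first_matching_number(candidates, min_digits=5, max_digits=5))
```

Returning a string rather than an integer mirrors the documented reason for the str output: leading zeros are preserved.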

anonymize (anonymize)

Anonymizer.

Currently supported languages: [‘cs’, ‘sk’, ‘hu’]

Description: Tries to remove identifiers from utterance. So far it supports full_name, phone and address.

YAML usage:

CALL:
    - anon: [anonymize, {'custom_input': "Ja som Michal Jurco Ok? Kozi 12a, Nová Ves, 602 00 a 735666456", 'anonymizers': ['full_name', 'phone', 'address']}]

Args:

anonymizers: {'type': 'list', 'desc': 'List of entity names to anonymize from input utterance, one of `full_name|phone|address`.'}

Output:

{'type': 'dict', 'desc': 'Returns str: anonymized and tagged utterance/custom_input.'}

bad words (bad_words)

Return sanitized input with removed explicit words.

Currently supported languages: [‘cs’]

Description: Takes in a string and tries to replace swear words with * or with a functional replacement.

YAML usage:

EXTRACT:
    - cleaned_utterance: bad_words
    # "jsi debil a je to na hovno" => returns "jsi * a je to k ničemu"

Output:

{'type': 'str', 'desc': 'Sanitized input with removed explicit words.'}

birth_number (birth_number)

Return list of birth numbers.

Currently supported languages: [‘cs’, ‘pl’, ‘sk’, ‘ro’]

Description: Return list of valid birth/national id numbers for Slovakia, Czech republic, Poland and Romania if any contained within utterance.

YAML usage:

EXTRACT:
    - user_birth_number: birth_number
    # "number 72223145778" => extracts 72223145778 (checksum validated as described on the Polish government site)

Output:

{'type': 'list', 'desc': 'If any valid birth numbers present, return as list of strings.'}

birth_number_decomposition (birth_number_decomposition)

Return dictionary of decomposed birth number into year, month and day.

Currently supported languages: [‘cs’, ‘sk’]

Description: Return dictionary of decomposed birthdate part of valid Czech or Slovak birth number into year, month and day.

YAML usage:

EXTRACT:
    - birth_number_decomposed: birth_number_decomposition
    # "736028/5163" => {"year": "1973", "month": "10", "day": "28", "valid": "True"}
    # "7162060000" => {"year": "1971", "month": "12", "day": "06", "valid": "False"}
    # "number 997/267" => returns dictionary {"year": "", "month": "", "day": "", "valid": "False"} if not valid birth num

Output:

{'type': 'dict', 'desc': 'If a valid birth number present, return as a dictionary.'}
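The decomposition shown in the examples can be approximated with the commonly described Czech and Slovak birth number rules. The sketch below is not the extension's implementation: the function name decompose_birth_number is hypothetical, and it assumes that women have 50 added to the month, that an extra 20 may be added to the month, that ten-digit numbers must be divisible by 11, and that nine-digit numbers only exist for years before 1954.

```python
import re
from datetime import date

EMPTY = {"year": "", "month": "", "day": "", "valid": "False"}


def decompose_birth_number(text: str) -> dict:
    """Split a birth number into year, month and day (all returned as strings)."""
    match = re.fullmatch(r"(\d{2})(\d{2})(\d{2})/?(\d{3,4})", text.strip())
    if match is None:
        return dict(EMPTY)
    yy, mm, dd, suffix = match.groups()

    month = int(mm)
    if month > 50:  # assumption: women have 50 added to the month
        month -= 50
    if month > 20:  # assumption: an overflow series adds another 20
        month -= 20

    if len(suffix) == 4:
        # Ten-digit numbers: two-digit years from 54 up belong to the 1900s.
        year = (1900 if int(yy) >= 54 else 2000) + int(yy)
        valid = int(yy + mm + dd + suffix) % 11 == 0
    else:
        # Nine-digit numbers carry no check digit and predate 1954.
        year = 1900 + int(yy)
        valid = int(yy) < 54

    try:
        date(year, month, int(dd))  # reject impossible calendar dates
    except ValueError:
        return dict(EMPTY)

    return {"year": str(year), "month": f"{month:02d}", "day": dd, "valid": str(valid)}


print(decompose_birth_number("736028/5163"))
print(decompose_birth_number("7162060000"))
print(decompose_birth_number("997/267"))
```

Under these assumptions the second example still yields a date even though the checksum fails, which matches the documented behaviour of returning the date parts with valid set to False.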

BirthDate (birthdate)

Try to guess date provided by user. Supports single precise events for now.

Currently supported languages: [‘cs’, ‘de’, ‘pl’, ‘sk’, ‘ro’]

Description: Try to guess full date from user utterance.

Return datetime in ISO format (e.g. 2020-06-13).

YAML usage:

CALL:
    - user_birth_date: birthdate

Output:

{'type': 'str', 'desc': 'ISO format string (e.g. 2020-06-13) or None'}

censor (censor)

Extract vulgarisms from the last utterance.

Currently supported languages: [‘sk’, ‘cs’]

Description: Return input text with detected vulgar expressions censored out, else as is if none found.

YAML usage:

CALL:
    - censored_text: [censor, {'input': "I am a vulgar text with a vulgar word"}]
    # "" => "I am a ***gar text with a ***gar word"
    - censored_text: censor
    # "I am a vulgar text with a vulgar word" => "I am a ***gar text with a ***gar word"

Args:

language: {'type': 'str', 'desc': 'language, `sk` is supported'}

Output:

{'type': 'str', 'desc': 'Return text with detected vulgar expressions censored out.'}

check_variant_one_to_six (check_variant_one_to_six)

Detect if numeral 1 to 6 is contained within last utterance.

Currently supported languages: [‘cs’, ‘en’, ‘sk’]

Description: Return True if a numeral or digit from 1 to 6 is present in the utterance (e.g. ‘je to to prvni’, ‘2’), else None.

YAML usage:

CALL:
    - choice_made: check_variant_one_to_six  # set to True if choice 1-6 made in last utterance
    # "Chci jednicku" => True
    # "vyberam si 5" => True
    # "nic z toho" => None

Output:

{'type': 'bool', 'desc': 'Return True if numeral 1 to 6 is detected within last utterance.'}

city

city_extract_replace (city_extract_replace)

Extract name of a city from the last utterance in nominative.

Currently supported languages: [‘cs’, ‘sk’, ‘hu’, ‘en’]

Description: Return city name in Nominative if contained within the last utterance (e.g. ‘Praha’, ‘Žilina’) else None.

YAML usage:

CALL:
    - city_name: city_extract_replace  # returns name of the city from the last utterance
    # "Sidli na adrese v Bratislave" => "Bratislava"
    # "išiel Žilinou cez centrum" => Žilina
    # "Je to u nas v meste" => None

Args:

language: {'type': 'str', 'desc': 'language, one of `sk|cs` is supported'}

get_indices: {'type': 'bool', 'desc': 'if set to True, extractor will return a tuple pair of values: found city (str or None) and coordinates (Tuple[int, int]). If set to False, returns only city (str or None).'}

Output:

{'type': 'str', 'desc': 'Return name of a city in Nominative (first case) if found within last utterance, else nothing (None). Capitalized cities have preference, then longer ones. Currently supports only the limited set of cities for which we have declined forms.'}

city_street_to_postcode (city_street_to_postcode)

Extract the postcodes of the cities and streets mentioned in the last utterance.

Currently supported languages: [‘cs’, ‘sk’]

Description: Extract all cities and streets from the utterance and return their postcodes.

Args:

context_city: {'type': 'str', 'desc': 'Optional parameter, limits possible zipcodes to those within the context city.'}

Output:

{'type': 'List[str]', 'desc': 'Returns the list of all found zipcodes from the last utterance. Accepts city and street as a context for extraction of zipcodes from the utterance.'}

city_to_postcode (city_to_postcode)

Extract the zipcodes from the last utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘hu’]

Description: Tries to map a city name to zipcodes from custom data. City name should be mentioned in the input utterance.

Returns the zipcodes in format of a list (e.g. [‘10000’, ‘25226’]).

YAML usage:

CALL:
    - zipcodes: [city_to_postcode, {'context_city': 'Long Angeles'}]  # returns the zipcodes from the utterance
    # "mesto sa vola Brnníčko" => "['78391', '78975']"

Args:

context_city: {'type': 'str', 'desc': 'Optional parameter, limits possible zipcodes to those within the context city.'}

Output:

{'type': 'List[str]', 'desc': 'Returns the list of all found zipcodes from the last utterance. Accepts city as a context for extraction of zipcodes from the utterance.'}

company (company)

Extract company names from the last utterance.

Currently supported languages: [‘sk’]

Description: Return company name if contained within the last utterance (e.g. [‘MalibuSK’, ‘EuroEye’]) else None if not found.

YAML usage:

CALL:
    - company: company  # returns list of the companies from the last utterance
    # "Nazov firmy je Braver fit s.r.o." => ["Braver fit"]
    # "je to BAUMeister" => ["BAUMeister"]
    # "je to taka mala firma v nasom meste" => None

Output:

{'type': 'list', 'desc': 'List of company names from the last utterance if any present, else None.'}

cosine_similarity (cosine_similarity)

Vectorize and compare two strings.

Currently supported languages: Any

Description: Transform input strings into vectors and calculate cosine similarity. The extension does one-to-one or one-to-many cosine similarity calculation. First letters of words have bigger weight for multi-word strings. Works better with equally long strings. If multiple comparison strings are provided, results are sorted by score, highest first.

YAML usage:

CALL:
    - similarity_score: [cosine_similarity, {'string_1': 'Slowak Telekomee', 'string_2': 'Slovak Telekom'}]
    # 0.56
    - similarity_scores: [cosine_similarity, {'string_1': 'Telekomee', 'string_2': ['Telekom', 'Google corp.']}]
    # [['Telekom', 0.798], ['Google corp.', 0.012]]

Args:

string_1: {'type': 'str', 'desc': 'first string to compare'}

string_2: {'type': 'str', 'desc': 'second string to compare'}

Output:

{'type': 'str', 'desc': 'Returns calculated cosine similarity of input string vectors.'}
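To make the vectorize-then-compare idea concrete, here is a minimal sketch that uses character bigram counts as the vector. This is an assumption made for illustration only: the extension's real vectorization (including the extra weight on first letters) is not documented here, so the scores below will not match the 0.56 or 0.798 values in the usage example. The function names are hypothetical.

```python
import math
from collections import Counter
from typing import List


def char_bigrams(text: str) -> Counter:
    """Count character bigrams of the lowercased, space-padded text."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def cosine_similarity(string_1: str, string_2: str) -> float:
    """Cosine of the angle between the two bigram count vectors."""
    v1, v2 = char_bigrams(string_1), char_bigrams(string_2)
    dot = sum(v1[gram] * v2[gram] for gram in v1.keys() & v2.keys())
    norm = math.sqrt(sum(c * c for c in v1.values())) * math.sqrt(
        sum(c * c for c in v2.values())
    )
    return dot / norm if norm else 0.0


def rank(string_1: str, candidates: List[str]) -> List[list]:
    """One-to-many comparison, sorted by score with the highest first."""
    scored = [[candidate, cosine_similarity(string_1, candidate)] for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


print(rank("Telekomee", ["Google corp.", "Telekom"]))
```

The one-to-many variant returns [candidate, score] pairs in the same shape as the documented output.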

counter (counter)

Extract count of visits for a specific node.

Currently supported languages: Any

Description: Return the visit count for the node where this extractor is called (e.g. node CNT_NODE contains counter; on the first visit we get an integer with a value of 1, and every further visit adds one to the returned value).

YAML usage:

CALL:
    - counter_value: counter  # returns count of how many times we visited specific node where it is used.

Args:

storage: {'type': 'str', 'desc': 'Name of optional variable where existing counters for all nodes are stored. Default value is `fallback_counters`.'}

Output:

{'type': 'int', 'desc': 'Return count of visit for a specific node.'}

current_conversation_id (current_conversation_id)

Extract conversation id.

Currently supported languages: Any

Description: Extract conversation id from goodbot.

Return conversation id (e.g. ‘text_12345’).

YAML usage:

CALL:
    - cid: current_conversation_id  # set to string containing conversation id if present

Output:

{'type': 'str', 'desc': 'Extract conversation id if contained within goodbot else nothing set.'}

current_date (current_date)

Return datetime.date object for checking current dates in conditions.

Currently supported languages: Any

Description: Return datetime.date object with current day date (e.g. datetime.date(2020, 9, 29)). Default timezone=’Europe/Prague’.

YAML usage:

CONDITIONS:
    - 'current_date.month == 12': DECEMBER_NODE  # if current month is twelfth month go to december_node
    - 'current_date.month == 11': NOVEMBER_NODE  # if current month is eleventh month go to NOVEMBER_NODE

Output:

{'type': 'datetime.date', 'desc': 'Return datetime object, however it is not JSON serializable, so it is for calculation purpose only, not to be saved into variable unless converted into string during extraction in yaml. Has attributes `year`, `month` and `day`. It is also usable in MARKDOWN.'}

current_named_entities (current_named_entities)

Extract current conversation named entities.

Currently supported languages: Any

Description: Return dictionary containing named entities from history object of the conversation (e.g. {‘cid’: ‘text_123’, ‘customer_id’: ‘123456’}).

YAML usage:

CALL:
    - entities: current_named_entities  # saves dictionary of named entities from the conversation

Output:

{'type': 'Dict', 'desc': 'Extract all named entities and return them as a dictionary object. Entities with the same name as the name of the target variable are ignored.'}

current_state (current_state)

Extract current state code.

Currently supported languages: Any

Description: Checks conversation history and returns current state code.

YAML usage:

EXTRACT:
    - current_node: current_state
    # "INTENT_02_01_DIRECT_ME_TO_A_PERSON"

Output:

{'type': 'str', 'desc': 'Returns the name of the current state in the conversation flow.'}

current_time (current_time)

Return datetime.now() object for checking current times in conditions.

Currently supported languages: Any

Description: Return datetime.now() object with current time and date (e.g. datetime.date(2020, 9, 29)). Default timezone=’Europe/Prague’.

YAML usage:

CONDITIONS:
    - 'current_time.hour > 17': NOT_OFFICE_HOURS_NODE  # if current hours is more than 17 go to not office hours node
    - '8 < current_time.hour < 18': OFFICE_HOURS_NODE  # if current hours is between 8 and 18 go to office hours node

Output:

{'type': 'datetime.now()', 'desc': 'Return datetime object, however it is not JSON serializable, so it is for calculation purpose ONLY, not to be saved into variable unless converted into string during extraction in yaml. Has attributes `year`, `month`, `day`, `hour`, `minute`, `second`. It is also usable in MARKDOWN.'}
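The two conditions in the YAML usage above can be traced in plain Python to see which hours reach which node. This is only a sketch: the function name route_by_hour is hypothetical, and it assumes conditions are evaluated top to bottom with the first match winning.

```python
from datetime import datetime
from typing import Optional


def route_by_hour(current_time: datetime) -> Optional[str]:
    """Mirror the two CONDITIONS entries, evaluated in order."""
    if current_time.hour > 17:
        return "NOT_OFFICE_HOURS_NODE"
    if 8 < current_time.hour < 18:
        return "OFFICE_HOURS_NODE"
    return None  # neither condition matched


for hour in (8, 9, 17, 18):
    print(hour, route_by_hour(datetime(2020, 9, 29, hour, 0)))
```

Note that hours 0 to 8 match neither condition as written, so a flow that should treat early morning as outside office hours would need an additional condition.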

current_utc_time_millis (current_utc_time_millis)

Return the current UTC time as an ISO-format string with millisecond precision.

Currently supported languages: Any

Description: Extract datetime.datetime.utcnow() in iso format as a string with current time in milliseconds (e.g. ‘2020-09-29T12:51:02.435Z’).

YAML usage:

CALL:
    - current_time_in_millis: current_utc_time_millis  # set to string akin to '2020-09-29T13:16:22.978Z'

Output:

{'type': 'datetime.now()', 'desc': 'Return string with current time in milliseconds in iso format. It is also usable in MARKDOWN.'}
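A string in the documented shape can be produced with the standard library. The sketch below is one way to do it and is not necessarily how the extension formats the value; the function name utc_millis_iso is hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def utc_millis_iso(moment: Optional[datetime] = None) -> str:
    """Format a moment (default: now) as UTC ISO 8601 with milliseconds and a Z suffix."""
    moment = moment or datetime.now(timezone.utc)
    text = moment.astimezone(timezone.utc).isoformat(timespec="milliseconds")
    return text.replace("+00:00", "Z")


print(utc_millis_iso(datetime(2020, 9, 29, 12, 51, 2, 435000, tzinfo=timezone.utc)))
```

Unlike the datetime objects returned by current_date and current_time, a string like this is JSON serializable and can be stored in a variable.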

date (date)

Try to guess date intended by user. Supports single precise events for now.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘ro’, ‘en’, ‘nl’, ‘fr’, ‘hu’]

Description: Try to guess date from user utterance. Return date in ISO format (e.g. 2020-06-13).

If multiple pieces of date information are present, they can get mixed. Currently, intervals, fuzzy dates and other advanced features are not supported.

You can use simple relative terms like “tomorrow”, “yesterday”, etc.

Currently, only Czech, Slovak, German, Polish, English and Romanian languages are supported.

YAML usage:

CALL:
    - reservation_date: date  # current date is 2022-08-17
    # "zítra" => "2022-08-18"
    # "pozítří" => "2022-08-19"
    # "včera" => "2022-08-16"

Args:

current_date: {'type': 'date', 'desc': 'All relative information is computed from this date; if not set, the date at the moment of user interaction is used.'}

multiple_instances: {'type': 'bool', 'desc': 'Ignored for now.'}

intervals: {'type': 'bool', 'desc': 'Ignored for now.'}

Output:

{'type': 'str', 'desc': 'ISO format string (e.g. 2020-06-13) or None'}
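How relative terms depend on current_date can be shown with a small lookup. This sketch covers only the three Czech words from the usage example; the table RELATIVE_DAYS and the function resolve_relative_date are hypothetical and say nothing about how the extractor parses real utterances.

```python
from datetime import date, timedelta
from typing import Optional

# Day offsets for the relative terms used in the YAML example.
RELATIVE_DAYS = {"zítra": 1, "pozítří": 2, "včera": -1}


def resolve_relative_date(utterance: str, current_date: date) -> Optional[str]:
    """Return the ISO date for a known relative term, else None."""
    offset = RELATIVE_DAYS.get(utterance.strip().lower())
    if offset is None:
        return None
    return (current_date + timedelta(days=offset)).isoformat()


today = date(2022, 8, 17)
for word in ("zítra", "pozítří", "včera"):
    print(word, resolve_relative_date(word, today))
```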

dateandtime (dateandtime)

Try to guess date and time intended by user. Supports single precise events for now.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’]

Description: Try to guess date and time from user utterance.

Return datetime in ISO format (e.g. 2020-06-13T18:00:00+02:00).

If multiple pieces of date/time information are present, they can get mixed. Currently, intervals, fuzzy dates and other advanced features are not supported.

You can use simple relative terms like “tomorrow”, “yesterday”, “in six hours”, “at half past five”, etc.

Currently, only Czech, Slovak and Polish languages are supported.

YAML usage:

CALL:
    - meeting_datetime: dateandtime  # current datetime is 2020-01-30 10:10
    # "zítra" => "2020-01-31T10:10:00+02:00"
    # "pozítří" => "2020-02-01T10:10:00+02:00"
    # "včera v pět" => "2020-01-30T17:10:00+02:00"

Args:

current_datetime: {'type': 'datetime', 'desc': 'All relative information are computed from this datetime, if not set, use datetime at the moment of user interaction.'}

future: {'type': 'bool', 'desc': 'Ignored for now.'}

past: {'type': 'bool', 'desc': 'Ignored for now.'}

ignore_time: {'type': 'bool', 'desc': 'Ignored for now.'}

ignore_date: {'type': 'bool', 'desc': 'Ignored for now.'}

multiple_instances: {'type': 'bool', 'desc': 'Ignored for now.'}

intervals: {'type': 'bool', 'desc': 'Ignored for now.'}

Output:

{'type': 'str', 'desc': 'ISO format string (e.g. 2020-06-13T18:00:00+02:00) or None'}

declined_streets_to_cities (declined_streets_to_cities)

Extract the street name and street number from the last utterance.

Currently supported languages: [‘sk’, ‘cs’]

Description: Extract the city name based on street utterance.

Args:

city: {'type': 'str', 'desc': 'Optional parameter, limits possible streets to streets located in the city.'}

Output:

{'type': 'str', 'desc': 'Returns the street name and street number from the last utterance. Accepts city as a context for extraction of street from the utterance.'}

decode_byte64 (decode_byte64)

Returns dictionary of generated filename, and its respective bytes.

Currently supported languages: Any

Description: Returns dictionary of generated filename, and its respective bytes. Expects list of strings, which are b64 encoded bytes. Correct input format of strings in list is output of current image widget implementation:

  • data:{data_type}/{file_suffix};base64,{base64encoded_bytes}

Use it only as a callable that generates input for other functions and extensions. Bytes are not JSON serializable and will fail to save via SET or CALL.

YAML usage:

CALL:
    - call_send_email:
        - send_email
        - smtp_host: smtp.gmail.com
          subject: 'TEST-VA'
          smtp_password: borndigitaltesting
          smtp_port: 587
          address_from: testovaci.ucet.borndigital@gmail.com
          address_to: testovaci.ucet.borndigital@gmail.com
          html: "I am a content of an email.<br>I am another line of an email."
          attachments: '{decode_byte64(list_of_b64_encoded_strings)}'

Output:

{'type': 'dict', 'desc': 'dict with file names as keys and byte strings of image data as values.'}
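The input format data:{data_type}/{file_suffix};base64,{base64encoded_bytes} can be parsed with a regular expression and the standard base64 module. The sketch below is an illustration only: the function name decode_data_uris and the file naming scheme (file_0.png, file_1.jpeg, ...) are hypothetical, since the way the extension generates file names is not documented here.

```python
import base64
import re
from typing import Dict, List

# Matches data:{data_type}/{file_suffix};base64,{base64encoded_bytes}
DATA_URI = re.compile(
    r"data:(?P<data_type>[\w.+-]+)/(?P<file_suffix>[\w.+-]+);base64,(?P<payload>.+)",
    re.DOTALL,
)


def decode_data_uris(items: List[str]) -> Dict[str, bytes]:
    """Map a generated file name to the decoded bytes of each input string."""
    files: Dict[str, bytes] = {}
    for index, item in enumerate(items):
        match = DATA_URI.fullmatch(item)
        if match is None:
            raise ValueError(f"item {index} is not a base64 data URI")
        filename = f"file_{index}.{match['file_suffix']}"  # hypothetical naming
        files[filename] = base64.b64decode(match["payload"])
    return files


print(decode_data_uris(["data:image/png;base64,aGVsbG8="]))
```

Because the values are raw bytes, the result has to be passed straight into another callable, as in the attachments argument of the usage example.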

detect_month (detect_month)

If month mentioned by word, return the numeric order of it as a string.

Currently supported languages: [‘cs’, ‘sk’, ‘de’]

Description: Transform name of month mentioned into a number.

Return month as numeric string (e.g. 1 for january).

YAML usage:

CALL:
    - month_number: detect_month

Output:

{'type': 'str', 'desc': 'numeric string (e.g. `8` for august, as it is the eighth month of the year) or None'}

digit_to_word (digit_to_word)

Transcribe digits into readable numerals divided into groups up to 99.

Currently supported languages: [‘cs’, ‘sk’]

Description: Return digit transcribed into words if digit is valid pure digit string or integer.

YAML usage:

CALL:
    - readable_id: [digit_to_word, {'digit': id}]

Args:

language: {'type': 'str', 'desc': 'language, one of `sk|cs` is supported'}

digit: {'type': 'str|int', 'desc': 'digit to be transcribed'}

Output:

{'type': 'str', 'desc': 'Return joined string with transcribed digits up to 99.'}

distance_city_to_city (distance_city_to_city)

Calculate distance between two cities.

Currently supported languages: [‘cs’, ‘sk’]

Description: Return distance in kilometres if correct cities provided else None.

YAML usage:

CALL:
    - distance: [distance_city_to_city, {'origin': 'Nitra', 'dest': 'Bratislava'}]
    # 74.801987

Args:

language: {'type': 'str', 'desc': 'language, for now only `sk|cs` is supported'}

Output:

{'type': 'float', 'desc': 'Distance in kilometers as float.'}
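One plausible way to obtain a distance like this is the great-circle (haversine) formula applied to city coordinates. Whether the extension uses this formula is an assumption, and the coordinates below are approximate values supplied for illustration, so the result is close to but not identical to the documented 74.801987.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


# Approximate city-centre coordinates (assumed, not taken from the extension's data).
NITRA = (48.3069, 18.0864)
BRATISLAVA = (48.1486, 17.1077)

print(round(haversine_km(*NITRA, *BRATISLAVA), 1))
```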

encode_byte64 (encode_byte64)

Returns b64 string.

Currently supported languages: Any

Description: Returns base64 string from input bytes. Expects bytes of a file.

YAML usage:

SET:
    - b64string_of_bytes: 'encode_byte64(get_conversation_recording(conversation_id=cid))'

Output:

{'type': 'dict', 'desc': 'b64 encoded bytes as string.'}

environ (environ)

Extract whitelisted environment variable.

Currently supported languages: Any

Description: Extract whitelisted system environment variable from goodbot instance or get all whitelisted.

Return value of system environment variable (e.g. environ(“GOODBOT_PORT”) -> “8001”).

YAML usage:

SET:
    - choose_one_env_entity: 'environ("GOODBOT_PORT")'
    - get_another_one_env_entity: 'environ()["GOODBOT_NAME"]'

Output:

{'type': 'str', 'desc': 'Extract value of the whitelisted system environment variable from goodbot instance.'}

Fetch Url (fetch_url)

Call external link using given method and data and return the response.

Currently supported languages: Any

Description: Call REST request to given url using method and returns JSON converted to Python objects or text if JSON response is not available. We support only the standard HTTP methods listed in ARGS.

YAML usage:

CALL: !!omap
    - RESULT: [fetch_url, {'url': "www.api_waiting_for_request.com/get_data", json_body: {id: "{id}"}}]
    - ASYNC_RESULT: [fetch_url, {'url': "www.api_waiting_for_request.com/get_data", json_body: {id: "{id}"}, non_blocking: True}]
    # use non_blocking, if you do not need to wait for response, result will be saved in history at the beginning of next interaction after results received
    - "facts": [fetch_url, {'url': "https://catfacts/facts", 'method': 'GET', 'delay': 5}]
    # use delay, if you want to block the conversation and delay sending request by X seconds
    # if non_blocking is set to True, delay is ignored

CALL: !!omap
    - cid: current_conversation_id
    - history: [fetch_url, {'url': "http://goodbot:8121/conversations/{cid}/messages", 'method': 'GET', verify: False}]

Args:

url: {'type': 'str', 'desc': 'url must be accessible from GoodBot server, include protocol'}

method: {'type': 'str', 'desc': 'one of OPTIONS/HEAD/GET/PUT/POST/PATCH/DELETE'}

json_body: {'type': 'Dict', 'desc': 'data to be JSON encoded'}

data: {'type': 'Dict', 'desc': 'data to be url encoded'}

timeout: {'type': 'float', 'desc': 'give up after x seconds'}

headers: {'type': 'Dict', 'desc': 'headers of request'}

cookies: {'type': 'Dict', 'desc': 'cookies of request'}

auth_method: {'type': 'str', 'desc': 'one of HTTPBasicAuth/HTTPDigestAuth/HTTPProxyAuth'}

username: {'type': 'str', 'desc': 'username for authenticated links'}

password: {'type': 'str', 'desc': 'password for authenticated links'}

cert: {'type': 'Union[str, List]', 'desc': 'path to certificate on GoodBot server'}

proxies: {'type': 'Dict', 'desc': 'proxies of request'}

verify: {'type': 'bool', 'desc': 'verify SSL certificate'}

non_blocking: {'type': 'bool', 'desc': 'do not wait for response'}

log: {'type': 'bool', 'desc': 'log request and response'}

delay: {'type': 'int', 'desc': 'seconds to wait before sending request'}

Output:

{'type': 'Any', 'desc': 'JSON decoded object or text.'}
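The relationship between url, method and json_body can be sketched with the standard library by building a request object without sending it. This is not the extension's implementation: the function name build_request is hypothetical, the default method GET is an assumption, and only three of the documented arguments are modelled.

```python
import json
from typing import Optional
from urllib.request import Request

# The HTTP methods listed in the Args section.
ALLOWED_METHODS = {"OPTIONS", "HEAD", "GET", "PUT", "POST", "PATCH", "DELETE"}


def build_request(
    url: str,
    method: str = "GET",  # assumption: the real default is not documented here
    json_body: Optional[dict] = None,
    headers: Optional[dict] = None,
) -> Request:
    """Prepare (but do not send) a request with an optional JSON encoded body."""
    method = method.upper()
    if method not in ALLOWED_METHODS:
        raise ValueError(f"unsupported method: {method}")
    all_headers = dict(headers or {})
    data = None
    if json_body is not None:
        data = json.dumps(json_body).encode("utf-8")
        all_headers.setdefault("Content-Type", "application/json")
    return Request(url, data=data, headers=all_headers, method=method)


request = build_request(
    "http://goodbot:8121/conversations/text_12345/messages", method="GET"
)
print(request.get_method(), request.full_url)
```

The URL has to include the protocol, as the url argument description requires; urllib rejects a bare host name such as www.example.com with a ValueError.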

first_email (first_email)

Extract first email address from the last utterance.

Currently supported languages: Any

Description: Extract email address if contained within the last utterance.

Return email address extracted from utterance (e.g. ‘meno@gmail.com’, ‘slayer@doom.au’).

YAML usage:

CALL:
    - email: first_email  # returns the first email address from the last utterance
    # "poslite mi to na guy123@hotmail.com" => "guy123@hotmail.com"

Output:

{'type': 'str', 'desc': 'Returns first email address from the last utterance.'}
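Picking the first email address out of an utterance can be approximated with a regular expression. The pattern below is a deliberately simple assumption and not the extractor's actual rule; it handles common addresses but does not cover everything the email standards allow.

```python
import re
from typing import Optional

# Simplified pattern: local part, "@", then at least two dot-separated domain labels.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+")


def first_email(utterance: str) -> Optional[str]:
    """Return the first email-like substring, or None if there is none."""
    match = EMAIL.search(utterance)
    return match.group(0) if match else None


print(first_email("poslite mi to na guy123@hotmail.com"))
```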

full_name (full_name)

Extract full name from the last utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘en’, ‘ro’, ‘hu’]

Description: Extracts the longest person entity (Dict[str, str]: {name: value, surname: value}) from raw uncorrected utterance, if that fails and argument retry_on_fail was passed as True we retry on corrected utterance.

Returns None if no entity detected as per input parameters.

YAML usage:

CALL:
    - name: full_name  # returns the full name from the last utterance
    # "rezervujte ma na meno Adam Nový" => {"name": "Adam", "surname": "Nový"}
    - name: [full_name, {"complete": False}]  # returns the full name from the last utterance
    # "je to tam ako Adam alebo Erik Nový" => {"name": ["Adam", "Erik"], "surname": ["Nový"]}

Output:

{'type': 'str', 'desc': 'Returns longest full name from the last utterance.'}

full_name_and_gender (full_name_and_gender)

Extract full name and its gender from the last utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘en’, ‘hu’]

Description: Extract full name extracted from utterance and gender if the name is in dictionary with (fe)males’ names. (e.g. {“name”: “Andrej”, “surname”: “Novák”, “gender”: “pán”}, {“name”: “Jana”, “surname”: “Kolárová”, “gender”: “pani”}).

YAML usage:

CALL:
    - name: full_name_and_gender
    # "rezervujte ma na meno Adam Nový" => {"name": "Adam", "surname": "Nový", "gender": "M"}

Output:

{'type': 'str', 'desc': 'Returns longest full name and its gender from the last utterance.'}

full_name_extract_replace (full_name_extract_replace)

Extract full name in Nominative from the last utterance.

Currently supported languages: [‘pl’]

Description: Extract full name if contained within the last utterance and return it in Nominative (e.g. {“name”: “Robert”, “surname”: “Lewandowski”}, {“name”: “Jerzy”, “surname”: “Dudek”}). Supported languages: ‘pl’

YAML usage:

CALL:
    - name: full_name_extract_replace  # returns the full name in Nominative from the last utterance
    # "chcesz rozmawiać z Robertem Lewandowskim" => {"name": "Robert", "surname": "Lewandowski"}
    - name: [full_name, {"complete": False}]  # returns all names in Nominative from the last utterance
    # "z Robertem albo Jerzym Lewandowskim?" => {"name": ["Robert", "Jerzy"], "surname": ["Lewandowski"]}

Output:

{'type': 'str', 'desc': 'Returns longest full name from the last utterance.'}

full_name_from_input_data (full_name_from_input_data)

Extract full name and optionally its gender from utterance using User Data Table.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘en’, ‘nl’, ‘fr’, ‘ro’]

Description: Extract the longest full name extracted from utterance using Input Data Table. (e.g. {“name”: “Andrej”, “surname”: “Novák”, “gender”: “pán”}, {“name”: “Jana”, “surname”: “Kolárová”, “gender”: “pani”}).

YAML usage:

CALL:
    - output:
        - full_name_from_input_data
        - input_data_name: Names_in_DM
          lookup_column_name: names_column
          lookup_column_surname: surnames_column
          lookup_column_gender: gender_column
    # "rezervujte ma na meno Adam Nový" => if name & surname is in data table, the output is:
    # {"name": "Adam", "surname": "Nový", "gender": "M"}

Output:

{'type': 'str', 'desc': 'Returns longest full name and its gender from the last utterance.'}

get_closest_work_day (get_closest_work_day)

Return iso-format date of the closest workday (Holidays extension marks it).

Currently supported languages: Any

Description: Return iso-format date equal to the closest work day (Holidays extension marks it as such).

YAML usage:

CALL:
    - 'closest_work_day': get_closest_work_day
    # if current date 2023-12-22 -->> '2023-12-22'
    - 'closest_work_day': get_closest_work_day
    # if current date 2023-12-23 (saturday before Christmas) -->> '2023-12-27'
    - 'fifth_work_day_from_now': [get_closest_work_day, {'custom_input': '2023-12-22', 'offset': 5}]
    # if current date 2023-12-22 -->> '2024-01-03'

Output:

{'type': 'str', 'desc': 'Return iso-format date, closest workday, or closest workday + offset specified in offset argument. '}
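The three examples above are consistent with a simple rule: move forward to the first day that is neither a weekend nor a holiday, then step forward offset further work days. The sketch below implements that rule under assumptions: the function names are hypothetical, the holiday set is a hand-written stand-in for whatever the Holidays extension marks, and the interpretation of offset is inferred from the examples.

```python
from datetime import date, timedelta
from typing import Set

# Stand-in for the Holidays extension: public holidays around the example dates.
HOLIDAYS = {date(2023, 12, 24), date(2023, 12, 25), date(2023, 12, 26), date(2024, 1, 1)}


def is_work_day(day: date, holidays: Set[date]) -> bool:
    """Monday to Friday and not a listed holiday."""
    return day.weekday() < 5 and day not in holidays


def closest_work_day(start: date, offset: int = 0, holidays: Set[date] = HOLIDAYS) -> str:
    """Return the ISO date of the closest work day on or after start, plus offset work days."""
    current = start
    while not is_work_day(current, holidays):
        current += timedelta(days=1)
    for _ in range(offset):
        current += timedelta(days=1)
        while not is_work_day(current, holidays):
            current += timedelta(days=1)
    return current.isoformat()


print(closest_work_day(date(2023, 12, 22)))
print(closest_work_day(date(2023, 12, 23)))
print(closest_work_day(date(2023, 12, 22), offset=5))
```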

get_conversation_as_html (get_conversation_as_html)

Returns structured summary of conversation as html string.

Currently supported languages: Any

YAML usage:

CALL:
    - conversation_history: get_conversation_as_html
    # [<time datetime="2022-01-24T12:17:29.304750+01:00">22-01-24, 12:17:29</time>] <b>BOT</b>: Dobrý den. Jak vám mohu pomoci?<br>
    # [<time datetime="2022-01-24T12:17:33.965884+01:00">22-01-24, 12:17:33</time>] <b>USER</b>: neco vymyslime<br>

Output:

{'type': 'Optional[str]', 'desc': 'Interactions as html formatted string separated by <br> tag.'}
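The line format in the example can be reproduced from a timestamp, a speaker label and the message text. The sketch below assumes a hypothetical input shape (a list of (datetime, speaker, text) tuples) and escapes the text with html.escape, which is a design choice of this sketch and not confirmed behaviour of the extension.

```python
from datetime import datetime, timedelta, timezone
from html import escape
from typing import List, Tuple


def conversation_as_html(interactions: List[Tuple[datetime, str, str]]) -> str:
    """Render each interaction as one line terminated by a <br> tag."""
    lines = []
    for when, speaker, text in interactions:
        stamp = when.strftime("%y-%m-%d, %H:%M:%S")
        lines.append(
            f'[<time datetime="{when.isoformat()}">{stamp}</time>] '
            f"<b>{escape(speaker)}</b>: {escape(text)}<br>"
        )
    return "\n".join(lines)


cet = timezone(timedelta(hours=1))
print(
    conversation_as_html(
        [(datetime(2022, 1, 24, 12, 17, 29, 304750, tzinfo=cet), "BOT", "Dobrý den. Jak vám mohu pomoci?")]
    )
)
```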

get_conversation_recording (get_conversation_recording)

Returns recording data of current conversation in WAV format.

Currently supported languages: Any

Description: Returns recording data of current conversation in WAV format.

YAML usage:

content_bytes_encoded: 'encode_byte64(get_conversation_recording(conversation_id=cid_variable, sip_host=sip_host_variable))'

Output:

{'type': 'Optional[bytes]', 'desc': 'Recording data as bytes in WAV format.'}

get_edit_distance (get_edit_distance)

Calculate linguistically informed edit distance of two strings.

Currently supported languages: Any

Description: Take two strings and calculate the edit distance from the first to the second. The extension does one-to-one or one-to-many distance calculation. Our implementation is based on the Levenshtein distance algorithm; however, we use a custom heuristic of linguistic similarity as well as alternative distance metrics.

It is possible to change edit distance mode to CER, which is better at handling multicharacter substitutions.

Works better with similarly long strings. If multiple comparison strings are provided, results are sorted by score, highest first.

YAML usage:

CALL:
    - edit_distance: [get_edit_distance, {'string_1': 'love', 'string_2': 'lowe'}]
    # 0.75
    - edit_distances: [get_edit_distance, {'string_1': 'love', 'string_2': ['loe', 'Love', 'lowe']}]
    # [['loe', 2.0], ['lowe', 0.75], ['Love', 0.25]]

Args:

string_1: {'type': 'str', 'desc': 'first string to compare'}

string_2: {'type': 'str', 'desc': 'second string to compare'}

language: {'type': 'str', 'desc': 'determines on which common substitutions and typos are considered'}

edit_distance_mode: {'type': 'str', 'desc': 'either `edit_distance` or `cer`, `cer` supports multi-character substitution detection'}

Output:

{'type': 'str', 'desc': 'Returns calculated edit distance of input strings.'}
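For orientation, here is a minimal sketch of the plain Levenshtein distance that the description names as the starting point. It is not the extension's implementation: the linguistic weighting and the `cer` mode are not reproduced, so it will not return the fractional scores shown in the usage example (every single-character edit costs 1 here). The function names `levenshtein` and `rank_candidates` are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic Levenshtein distance with unit cost for insert, delete, substitute."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of `a`
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep when equal)
            ))
        previous = current
    return previous[-1]


def rank_candidates(source: str, candidates: list) -> list:
    """One-to-many comparison, sorted by score with the highest first."""
    scored = [[candidate, levenshtein(source, candidate)] for candidate in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A weighted variant would replace the constant substitution cost with a lookup of common typos per language, which is presumably where the `language` argument comes in.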

get_out_calls (get_out_calls)

Return list of current out calls made by input phone number.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘hr’, ‘en’, ‘ro’, ‘ru’, ‘pt’]

Description: Return list of current out-calls made by input phone number.

YAML usage:

EXTRACT:
    - first_number: [get_out_calls, {"phone_number": +420774072329}]
    # [
    #     {
    #         "status": "declined",
    #         "trials": 1,
    #         "_id": "5fd365228c0bdf00131f1888",
    #         "phoneNumber": "+420725939689",
    #         "metaData": {
    #             "greeting": "Dobrý den, u telefonu Jakub.",
    #             "offerid": 312376,
    #         },
    #         "calledAt": "2020-12-11T12:25:06.341Z",
    #         "finishedAt": "2020-12-11T12:25:17.895Z",
    #         "callListId": "5fd364e88c0bdf00131f1884"
    #     },
    # ]

Args:

phone_number: {'type': 'str', 'desc': 'number must be a valid phone number (can be successfully extracted by our phone number extractor)'}

Output:

{'type': 'List', 'desc': 'List of dictionaries representing out calls.'}

get_string_hash (get_string_hash)

Return hash for string input.

Currently supported languages: Any

Description: Return hash string for string input

YAML usage:

CALL:
    - 'hash': [get_string_hash, {'word': 'file.ntm.45733', 'key': 'my_super_secret_key'}]
    #  'w6HMDAwwQS/N0be2e0pLAnASpU3O2kn0nKo4agB7cnuGFWssQ9YXYmhHe5bRYecv3mM5OB7ZUR49NKEyMQtghA=='

Output:

{'type': 'str', 'desc': 'Return hash for string input.'}
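The hashing algorithm is not stated in this reference. As a rough illustration of a keyed hash that takes a `word` and a `key` and yields a Base64 string, the sketch below uses HMAC-SHA512 from the Python standard library. Both the choice of HMAC-SHA512 and the function name `keyed_hash` are assumptions, so the result should not be expected to match the documented example value.

```python
import base64
import hashlib
import hmac


def keyed_hash(word: str, key: str) -> str:
    """Return a Base64-encoded HMAC-SHA512 digest of `word` under `key` (assumed scheme)."""
    digest = hmac.new(key.encode("utf-8"), word.encode("utf-8"), hashlib.sha512).digest()
    return base64.b64encode(digest).decode("ascii")
```

The same `word` and `key` always give the same string, which is what makes such a value usable as a stable, non-reversible identifier.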

get_utterance_score (get_utterance_score)

Returns potential candidates for intents with their score.

Currently supported languages: Any

Description: Recover winning intent and score of last recognized utterance from conversation history.

Return pair of winning INTENTS (in language model there can be multiple identical intents) and their winning SCORE (e.g. [[‘I_CHANGE_PARAMS_FIND_DETAIL_DAY_LIMIT’, ‘I_CHANGE_PARAMS_DAY_LIMIT’], 0.9999368190765381]).

YAML usage:

CALL:
    - winning_pair: get_utterance_score  # can be used if we want to know the score of winning intent
    - 'all_candidates': [get_utterance_score, {"return_all": True}] # can be used if we want all last candidates

Output:

{'type': 'Optional[tuple[list[str], float]]', 'desc': 'winning candidate as pair of [INTENT, SCORE]'}

GPT Prompt (gpt_prompt)

Call GPT-engine with a prompt and user’s text and return GPT’s reply.

Currently supported languages: Any

Description: Makes an HTTP request to .env-configured GPT_ENGINE_URL and returns GPT’s response.

YAML usage:

CALL: !!omap
    - cid: current_conversation_id
    - history: [gpt_prompt, {system_message: '', first_message: ''}]

Args:

system_message: {'type': 'str', 'desc': 'System message to prime GPT about how to respond.'}

first_message: {'type': 'str', 'desc': "Message sent to the GPT before the user's text. Similar usage to system_message"}

temperature: {'type': 'float', 'desc': "Value between 0 and 2 which determines GPT's 'creativity' (simply put)"}

use_user_utterance: {'type': 'bool', 'desc': 'Whether to use last utterance of user as another GPT message'}

Output:

{'type': 'Dict', 'desc': "JSON decoded object with GPT's response."}

GPT Stream (gpt_stream)

Call Open AI API with a prompt and user’s text and return GPT’s reply. Context from indexer can also be provided and the reply can be streamed in sentences to the provided URL.

Currently supported languages: Any

Description: Makes a request to Open AI API and streams GPT’s response to the provided url.

YAML usage:

CALL:
    - cid: current_conversation_id
    - history: [gpt_stream, {system_message: '', first_message: ''}]

Args:

system_message: {'type': 'str', 'desc': 'System message to prime GPT about how to respond.'}

first_message: {'type': 'str', 'desc': "Message sent to the GPT before the user's text. Similar usage to system_message"}

temperature: {'type': 'float', 'desc': "Value between 0 and 2 which determines GPT's 'creativity' (simply put)"}

use_user_utterance: {'type': 'bool', 'desc': 'Whether to use last utterance of user as another GPT message'}

index_name: {'type': 'str', 'desc': 'Where (if at all) to search for related text'}

node_name: {'type': 'str', 'desc': 'Name of the current ai node'}

use_gpt_functions: {'type': 'bool', 'desc': 'If we have to use Gpt functions to retrieve answer'}

knowledge_base_description: {'type': 'str', 'desc': 'Description of the knowledge base Gpt function'}

user_assistant_messages: {'type': 'list', 'desc': 'List of user/assistant messages'}

function_name: {'type': 'str', 'desc': 'Name of the Gpt function'}

index_query: {'type': 'str', 'desc': "This string will be searched for in documents under the provided index_name. If 'None', user's utterance is used"}

index_timeout: {'type': 'float', 'desc': 'Indexer endpoint timeout value. In [s].'}

ai_model: {'type': 'str', 'desc': 'Which (GPT) model should be used'}

ai_timeout: {'type': 'float', 'desc': 'If the GPT endpoint does not respond within this time, an error is raised. In [s].'}

ai_intent_timeout: {'type': 'float', 'desc': 'If the GPT endpoint for intent recognition does not respond within this time, an error is raised. In [s].'}

stream_sentences: {'type': 'bool', 'desc': 'Whether individual sentences should be streamed as soon as they are ready'}

stream_first_response_messages: {'type': 'list', 'desc': 'Set of messages, that get played if response delay is exceeded'}

stream_first_response_delay: {'type': 'float', 'desc': 'Delay to trigger playing "silence killer" message. In [s] or [ms] if > 100.'}

stream_first_chunk_timeout: {'type': 'float', 'desc': 'Max time receiving client should wait for next message. In  [s] or [ms] if > 100.'}

previous_messages_count: {'type': 'str', 'desc': 'How many previous interactions should be sent to GPT for context'}

include_in_history: {'type': 'str', 'desc': 'If the interactions of this extractor call should be included in history/context'}

index_top_results_count: {'type': 'int', 'desc': 'Allow customization for number of snippets returned from index'}

amount_adjacent_snippets: {'type': 'int', 'desc': 'integer from 0 - 5 that asks for X chunks within the same document that are adjacent to the specific one'}

Output:

{'type': 'Dict', 'desc': "JSON decoded object with GPT's response."}

holidays (holidays)

Check if the current day is a holiday.

Currently supported languages: [‘sk’, ‘cs’, ‘pl’, ‘de’]

Description: Return boolean value True if the date is a holiday/weekend, else False for a working day.

YAML usage:

CALL:
    - is_holiday: holidays
    # '2022-12-25' => True
    - is_holiday: [holidays, {'datestring': '2022-01-01'}]
    # 'pracujete dnes?' => True
    - is_holiday: [holidays, {'countries': ['at']}]
    # 'Is today an Austrian national holiday?' => True
    - is_holiday: [holidays, {'countries': ['de'], custom_holidays: ['2023-11-01']}]
    # 'Is today a Westfalen holiday?' => True

Output:

{'type': 'bool', 'desc': 'True if the current date is not a working day, else False'}
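The weekend and `custom_holidays` parts of this check can be sketched in a few lines. This is only an illustration under stated assumptions: it knows nothing about national holiday calendars (which the extractor evidently consults through `countries`), and the function name `is_non_working_day` is hypothetical.

```python
from datetime import date


def is_non_working_day(datestring: str, custom_holidays=()) -> bool:
    """True for Saturdays, Sundays and any ISO date listed in `custom_holidays`."""
    day = date.fromisoformat(datestring)
    is_weekend = day.weekday() >= 5  # Monday is 0, so 5 and 6 are Sat and Sun
    return is_weekend or datestring in set(custom_holidays)
```

For example, `2023-11-01` falls on a Wednesday, so it only counts as a non-working day when it is passed in `custom_holidays`.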

ico (ico)

Extract company identification number from the last utterance.

Currently supported languages: [‘cs’, ‘pl’, ‘sk’]

Description: Extract a Czech, Slovak or Polish company identification number if contained within the last utterance.

Return company identification number extracted from utterance (e.g. ‘15547400’, ‘47114983’).

YAML usage:

CALL:
    - ico_cislo: ico  # returns the company identification number from the last utterance
    # "moje ico je 47114983, mate ho v systeme?" => "47114983"

Output:

{'type': 'str', 'desc': 'Returns CS/SVK/POL company identification number from the last utterance. Works only with digits (e.g. 1 2 3 4 5 6 7 9 or 12345679). ICO must be valid (special check is made using checksum verification).'}
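The checksum rule itself is not spelled out here. As an illustration, the sketch below implements the commonly used modulo-11 check for eight-digit Czech IČO numbers (weights 8 down to 2 over the first seven digits). Whether the extractor uses exactly this rule, and how Polish numbers are validated, is an assumption left open; the name `is_valid_ico` is hypothetical.

```python
def is_valid_ico(ico: str) -> bool:
    """Validate an 8-digit identification number with the modulo-11 checksum."""
    digits = "".join(ch for ch in ico if ch.isdigit())  # tolerate "1 2 3 ..." input
    if len(digits) != 8:
        return False
    # Weights 8, 7, 6, 5, 4, 3, 2 applied to the first seven digits.
    weighted = sum(int(d) * w for d, w in zip(digits[:7], range(8, 1, -1)))
    expected_check_digit = (11 - weighted % 11) % 10
    return expected_check_digit == int(digits[7])
```

All three numbers quoted in this section (`15547400`, `47114983`, `12345679`) pass this check.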

is_open_input_data (is_open_input_data)

Checks if the current time is within the time range of the set working hours.

Currently supported languages: Any

Description: Extension to enable checking working hours in client data. User needs to know table name and project name.

YAML usage:

CALL:
    - working_hours_output: [is_open_input_data, {"table_name": "Opening_hours", "lookup_row_name": "Apple", "lookup_column_name": "Project", "open_hours_column": "Open hours"}]
    # 'Ste teraz otvoreny?' [current time: Mo 15:15, opening hours in customer data: Tu-Fr 23:00-23:30] => "[False, []]"
    # 'Ste teraz otvoreny?' [current time: Mo 15:15, opening hours in customer data: Mo-Th 13:00-23:30] => "[True, ['13:00-23:30']]"
    # 'Ste teraz otvoreny?' [current time: Mo 15:15, opening hours in customer data: Mo-Fr 17:00-22:00] => "[False, ['17:00-22:00']]"

Output:

{'type': 'list', 'desc': '[True, [working hours intervals]] if working hours, [False, [working hours intervals]] if not working hours but there are working hours during that day, [False, []] if not working hours in given day or data is not available.'}
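The three output shapes can be illustrated with a small sketch that takes the intervals already selected for the current weekday. Looking up the table row and resolving day ranges such as `Mo-Th` is left out, and the name `check_open` is hypothetical; this is a sketch under those assumptions, not the extension's code.

```python
from datetime import time


def check_open(now: time, intervals: list) -> list:
    """Return [is_open, intervals] for today's 'HH:MM-HH:MM' interval strings."""
    for interval in intervals:
        start_text, end_text = interval.split("-")
        start = time.fromisoformat(start_text)
        end = time.fromisoformat(end_text)
        if start <= now <= end:
            return [True, intervals]
    # Either closed right now, or no opening hours today (empty list).
    return [False, intervals]
```

With `now = time(15, 15)` this gives `[True, ['13:00-23:30']]` for an afternoon interval and `[False, []]` when the day has no opening hours.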

keywords (keywords)

Returns keyword from a preset list or stem dictionary if mentioned in utterance.

Currently supported languages: Any

Description: Extract a keyword as a string from utterance.

YAML usage:

CALL:
    - found_conditions: [keywords, {"keywords": {"tehot": "těhotenství"}}]  # returns list of found words
    # "jsem tehotna" => "těhotenství"
    - found_people: [keywords, {"keywords": ["erik", "pepik", "adam"]}]
    # "Ja som erik a ty si ten adam, ci?" => "erik"

Args:

keywords: {'type': 'Dict|List', 'desc': 'input containing stems or words to be detected'}

ignore_spaces: {'type': 'bool', 'desc': 'switch to strip spaces during normalization, applied to keywords only'}

Output:

{'type': 'str', 'desc': 'Returns keyword from the utterance as a string. Only first found instance is returned. If none found returns empty list.'}
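The difference between the dictionary form (stem mapped to the value to return) and the list form (word returned as is) can be sketched as follows. The matching here is a plain lowercase substring search, which is an assumption; the real normalization is not described. The name `find_keyword` is hypothetical, and returning `None` on a miss is a simplification.

```python
def find_keyword(utterance: str, keywords):
    """Return the first keyword hit; dict keys are stems, dict values are the output."""
    text = utterance.lower()
    if isinstance(keywords, dict):
        pairs = list(keywords.items())
    else:
        pairs = [(word, word) for word in keywords]
    for needle, value in pairs:
        if needle.lower() in text:
            return value  # only the first found instance is returned
    return None
```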

No digits (no_digits)

If there is no number in utterance, returns True.

Currently supported languages: Any

Description: Returns True if there is no number (as digits or as text) in utterance (and preprocessed utterance). Otherwise, returns None.

Language dependency is hidden in preprocessed_utterance_tokens (some text numbers can be transformed into digital form, e.g. “one” => “1”).

YAML usage:

EXTRACT:
    - _: no_digits
    # "ahoj číslo je 1 2 3 13, lala 1, 2" => None
    # "ahoj číslo je jedna" => None
    # "number is" => True

Output:

{'type': 'bool', 'desc': 'returns True if no number, otherwise returns None'}

official_street_and_number (official_street_and_number)

Extract the street name and street number from the last utterance.

Currently supported languages: [‘cs’, ‘sk’]

Description: Extract the street name and street number from utterance.

Return the street name and street number extracted from utterance (e.g. ‘Sokolova 113/3b’, ‘Husova 5’).

YAML usage:

CALL:
    - street: official_street_and_number  # returns the street name and street number from the last utterance
    # "Moje adresa je Hradní náměstí 15" => "Hradní náměstí 15"

Args:

city: {'type': 'str', 'desc': 'Optional parameter, limits possible streets to streets located in the city.'}

require_number: {'type': 'bool', 'desc': 'Optional parameter, switch on/off whether street number is required.'}

Output:

{'type': 'str', 'desc': 'Returns the street name and street number from the last utterance. Accepts city as a context for extraction of street from the utterance.'}

parcel_number (parcel_number)

Extract all parcel numbers from the last utterance.

Currently supported languages: [‘cs’, ‘sk’]

Description: Extract parcel number from utterance.

Return all parcel numbers from utterance (e.g. ‘1518/3’, ‘702/49’).

YAML usage:

CALL:
    - parcel: parcel_number  # returns all parcel numbers from the last utterance
    # "no to číslo parcely 2161/20" => "2161/20"

Output:

{'type': 'str', 'desc': 'Returns all parcel numbers from the last utterance.'}

phone (phone)

Extract phone number from the last utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘hr’, ‘en’, ‘ro’, ‘ru’, ‘pt’]

Description: Extract phone number from utterance (e.g. ‘+421904134666’, ‘+420954260200’).

YAML usage:

CALL:
    - phone_number: phone  # returns the phone number from the last utterance
    # "Na postu sa vola na 954 260 200" => "+420954260200"

Args:

language: {'type': 'str', 'desc': 'language, one of `sk|cs|pl|de|en|pt|ru|hr|ro` is supported'}

region: {'type': 'str', 'desc': 'not mandatory, only needed if it is not obvious based on language'}

Output:

{'type': 'str', 'desc': 'Returns the phone number from the last utterance. Uses country code based on provided language.'}

postcode_to_city (postcode_to_city)

Extract the cities from the last utterance postcodes.

Currently supported languages: [‘sk’, ‘cs’]

Description: Extract all cities from the utterance and return their postcodes.

Args:

context_city: {'type': 'str', 'desc': 'Optional parameter, limits possible zipcodes to those within the context city.'}

Output:

{'type': 'List[str]', 'desc': 'Returns the list of all found zipcodes from the last utterance. Accepts city as a context for extraction of zipcodes from the utterance.'}

preprocessed_utterance (preprocessed_utterance)

Extract last user utterance from utterance tokens.

Currently supported languages: Any

Description: Extract last utterance uttered.

Return last utterance uttered (e.g. ‘ano’, ‘ne’, ‘veta od uzivatele’).

YAML usage:

CALL:
    - last_utterance: preprocessed_utterance  # extracts last utterance
    # "Vstup od uzivatele." => "vstup od uzivatele"

Output:

{'type': 'str', 'desc': 'Return utterance of user, even if it is empty. Returns preprocessed tokenized version of utterance. Tokenized version is lowercased and was corrected by corrector.'}

previous_conversations (previous_conversations)

Returns data of previous conversations of the user.

Currently supported languages: Any

Description: Look for previous conversations containing specific user.

By default, returns the count of previous conversations (an integer).

If verbose argument set to True, returns list of previous conversations:

[
    {
        'timestamp': datetime.datetime(2022, 6, 27, 12, 55, 8, 148765),
        'conversationId': '83366cfd1fe34613a81e027b021d43c2',
        'customer_id': 123,
        'variables': {'customer_id': 123}
    },
    {
        'timestamp': datetime.datetime(2022, 6, 27, 12, 55, 8, 15165),
        'conversationId': 'bdf41c4ba1f84fd4b1903b3620c38688',
        'customer_id': 123,
        'variables': {'customer_id': 123}
    }
]

Empty list or zero if none found.

YAML usage:

CALL:
    - past_conversations1: [previous_conversations, {'filter_key': 'customer_id', 'filter_value': '123'}]
    - past_conversations2: [previous_conversations, {'days': 2, 'hours': 12}] #  past_conversations2 == [] > none found
    - past_conversations_plain_example: previous_conversations

Args:

weeks: {'type': 'int', 'desc': 'Optional parameter, limits how far in the past to look for previous conversations.'}

filter_key: {'type': 'str', 'desc': 'Parameter, if not provided defaults to `PhoneNumber`.'}

filter_value: {'type': 'str', 'desc': 'Parameter, if set is used as previous conversation filter for values of `filter_key` parameter.'}

days: {'type': 'int', 'desc': 'Optional parameter, limits how far in the past to look for previous conversations.'}

hours: {'type': 'int', 'desc': 'Optional parameter, limits how far in the past to look for previous conversations.'}

minutes: {'type': 'int', 'desc': 'Optional parameter, limits how far in the past to look for previous conversations.'}

seconds: {'type': 'int', 'desc': 'Optional parameter, limits how far in the past to look for previous conversations.'}

verbose: {'type': 'bool', 'desc': 'Optional parameter, switch to return list of dictionaries with previous conversation data.'}

variables: {'type': 'map', 'desc': 'Optional parameter, array of variable names to be returned.'}

Output:

{'type': 'list[str] | int', 'desc': 'Returns integer count of previous conversations, or list of dictionaries containing data of the previous conversations if set to verbose.'}

pronounce_job (pronounce_job)

Return pronunciation of a word passed in ‘job_name’ argument.

Currently supported languages: [‘cs’]

Description: Return phonetic pronunciation of a word passed as an argument ‘job_name’ (e.g. ‘majkrosoft’, ‘dejtabejs’).

YAML usage:

CALL:
    - pronunciation: [pronounce_job, {'job_name': 'Office'}]  # set to czech pronunciation transcript 'ofis'

Args:

job_name: {'type': 'str', 'desc': 'Word whose phonetic transcription will be returned.'}

Output:

{'type': 'str', 'desc': 'Extract pronunciation of word which was passed as an argument. '}

pronounce_number (pronounce_number)

Return slower pronunciation of a number using commas.

Currently supported languages: Any

Description: Transform digits to digits separated by commas for slower reading by text2speech system.

YAML usage:

CALL:
    - pronunciation: [pronounce_number, {'number': '1234'}]
    # `1, 2, 3, 4`
1:
    SPEECH: "Number is {pronounce_number(number='01234')}."
    # `Number is 1, 2, 3, 4.`

Args:

number: {'type': 'Union[int, str]', 'desc': 'Input number.'}

Output:

{'type': 'str', 'desc': 'Number converted to comma separated digits.'}
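The transformation amounts to joining the digits with a comma and a space. A minimal sketch, assuming non-digit characters are simply dropped (the function name `slow_digits` is hypothetical):

```python
def slow_digits(number) -> str:
    """Turn 1234 or '1234' into '1, 2, 3, 4' for slower text-to-speech reading."""
    return ", ".join(ch for ch in str(number) if ch.isdigit())
```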

quick_string_matching (quick_string_matching)

Match string to a list of predefined entities using fuzzy matching.

Currently supported languages: Any

Description: Efficient fuzzy matching tool based on tf-idf ngram vectors.

Args:

entity_type: {'type': 'str', 'desc': 'Entity to fuzzy match.'}

to_remove: {'type': 'str', 'desc': 'Switch to remove prebuilt matching objects if they were built.'}

language: {'type': 'str', 'desc': 'Language of entity to fuzzy match. Exclusively cs|sk.'}

Output:

{'type': 'str', 'desc': 'Returns best three matches.'}

random_int (random_int)

Generate random integer number.

Currently supported languages: Any

Description: Generates a random integer N such that start <= N <= stop and returns it (e.g. 15, 42).

YAML usage:

CALL:
    - randomness: random_int  # returns random integer number (from the Latin integer meaning "whole")

Args:

start: {'type': 'int', 'desc': 'Start of the interval for the random integer. Default value is `0`.'}

stop: {'type': 'int', 'desc': 'End of the interval for the random integer. Required argument.'}

step: {'type': 'int', 'desc': 'Only numbers with this step from start will be generated. Default value is `1`.'}

inclusive: {'type': 'bool', 'desc': 'Whether `stop` number of the interval will be a possible output.'}

Output:

{'type': 'str', 'desc': 'Returns random whole number. Takes start, step, inclusive and stop parameters.'}
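How `start`, `stop`, `step` and `inclusive` interact can be sketched with `random.randrange` from the Python standard library. Treating `inclusive` as `True` by default is an assumption based on the `start <= N <= stop` wording in the description; the function below is an illustration, not the extension's code.

```python
import random


def random_int(stop: int, start: int = 0, step: int = 1, inclusive: bool = True) -> int:
    """Pick a random value from start, start + step, ... up to stop."""
    upper = stop + 1 if inclusive else stop  # randrange excludes its upper bound
    return random.randrange(start, upper, step)
```

With `start=0`, `stop=10`, `step=5` the only possible results are 0, 5 and 10, and 10 drops out when `inclusive` is `False`.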

Regex matching (regex_match)

Define a regular expression, flags and whether to succeed on presence/absence of regex.

Currently supported languages: Any

Description: Apply regular expression defined in field regex_str on user utterance and:

  • return True if regular expression matches user utterance and inverse is False

  • return True if regular expression doesn’t match user utterance and inverse is True

  • otherwise return None

Regular expression flags can be defined as list of values in flags field, e.g. ["IGNORECASE"] or ["IGNORECASE", "NEWLINES"] or []

YAML usage:

EXTRACT:
    - _entry_check: [regex_match, {regex_str: " (?:z|v|ze|ve) [A-ZČŠŘŽ]", inverse: True}]
    # "volám do Irska" => True, # "volám z Irska" => None

CALL:
    - _entry_check: [regex_match, {regex_str: "roaming", flags: ['IGNORECASE']}]
    # "I want ROAMING" => True, "I want rOaMiNg" => True, "I want rooming" => None

Args:

regex_str: {'type': 'str', 'desc': 'Regular expressions as string, do not forget to escape backslashes, e.g. \\\\w.'}

flags: {'type': 'List[str]', 'desc': 'List of regex flags, e.g. [IGNORECASE, NEWLINES] or None.'}

inverse: {'type': 'bool', 'desc': 'True: fail if regex is found, otherwise True; False: fail if regex not found, otherwise True.'}

Output:

{'type': 'bool', 'desc': 'Return True (for matching with inverse=False or for non-matching with inverse=True) or None.'}
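The True/None logic with `inverse` can be sketched with Python's `re` module. This assumes that flag names map directly onto attributes of `re` (for example `IGNORECASE`); a name that `re` does not define, such as `NEWLINES`, would need its own mapping and raises `AttributeError` in this sketch.

```python
import re


def regex_match(utterance: str, regex_str: str, flags=None, inverse: bool = False):
    """Return True on a match (or on a non-match when inverse=True), else None."""
    combined = 0
    for name in flags or []:
        combined |= getattr(re, name)  # assumption: flag names are `re` attribute names
    found = re.search(regex_str, utterance, combined) is not None
    return True if found != inverse else None
```

Returning `None` rather than `False` matches the documented behavior, where a failed check leaves the entity unset.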

registration_plate (registration_plate)

Extract vehicle registration number from the last utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘at’, ‘ch’, ‘ro’, ‘hu’, ‘en’]

Description: Extract cs|sk|pl|de|at|ch|ro|hu standard vehicle registration number if contained within last utterance uttered.

Return vehicle registration number extracted from utterance (e.g. ‘4A2-3000’, ‘BB-221BS’).

YAML usage:

CALL:
    - spz: registration_plate
    # "moje auto je 4A2 3000 mate ho v systeme?" => "4A2-3000"
    - spz: [registration_plate, {"remove_punctuation": True}]
    # "moje auto je 4.A .2 3?000 mate ho v systeme?" => "4A2-3000"

Output:

{'type': 'str', 'desc': 'Returns `cs`|`sk`|`pl`|`de`|`at`|`ch`|`ro`|`hu` vehicle registration number. Works only with standard VRN (e.g. 4A2 3000 not COOL-CAR-666). '}

remove_diacritics (remove_diacritics)

Remove diacritic marks from input.

Currently supported languages: Any

Description: Remove diacritic marks from input and transform accented characters into characters without diacritics.

YAML usage:

NODE_01:
    CALL:
        - clean_text_call: [remove_diacritics, {'text': 'Kováč'}]
        # `Kovac`
    SET:
        - clean_text_set: 'remove_diacritics(text="Kováč")'
        # `Kovac`
        - name_of_var: '"Ján.Kováč"'
    1:
        SPEECH: "Jeho prihlasovacie meno je {remove_diacritics(text='Ján.Kováč')}"
        # SPEECH: `Jeho prihlasovacie meno je Jan.Kovac`
    2:
        SPEECH: "Jeho prihlasovacie meno je {remove_diacritics(text=name_of_var)}"
        # SPEECH: `Jeho prihlasovacie meno je Jan.Kovac`

Args:

text: {'type': 'str', 'desc': 'Input text to be normalized'}

Output:

{'type': 'str', 'desc': 'Text converted to one without diacritics.'}
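A common way to get this result in Python is Unicode decomposition followed by dropping the combining marks. The sketch below shows that technique; whether the extractor works the same way is an assumption, and the name `strip_diacritics` is hypothetical.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))
```

Characters without a decomposition (for example `ł` or `ø`) pass through unchanged with this approach, so they would need an explicit replacement table.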

reset_gpt_history (reset_gpt_history)

Resets Gpt history

Currently supported languages: Any

YAML usage:

CALL:
    - reset_history: reset_gpt_history

Output:

{'type': 'bool', 'desc': 'If operation was successful or not'}

rivers (rivers)

Extract river names from the last utterance as list of strings.

Currently supported languages: [‘cs’, ‘sk’]

Description: Extract river names from utterance if present.

YAML usage:

CALL:
    - rivers: rivers  # returns list of rivers contained within the utterance
    # "Vylila se tu bečva" => ["Bečva"]

Output:

{'type': 'list', 'desc': 'List of river names from the last utterance if any present, else None.'}

Sanitize text (sanitize_text)

If there is html or javascript code in text, remove it and return safe text.

Currently supported languages: Any

YAML usage:

EXTRACT:
    - safe_text: [sanitize_text, {'custom_input': html_string}]
    # "<span>some text</span>" => "some text"

Output:

{'type': 'str', 'desc': 'Returns text with code removed from body.'}

selected_intent (selected_intent)

Returns dictionary containing information about last picked ACTIONS intent.

Currently supported languages: Any

Description: Returns parsed source|intent|info dictionary based on last successful ACTIONS interaction.

YAML usage:

CALL:
    - last_intent: selected_intent
    #  {'Source': 'START', 'SelectedIntent': 'INTENT_TO_LOOP', 'Confidence': '1.000'}

Output:

{'type': 'Optional[dict]', 'desc': "If successful {'Source': 'str', 'SelectedIntent': 'str', 'Confidence': 'float'}"}

send_email (send_email)

Sends email to given email addresses over SMTP protocol.

Currently supported languages: Any

Description: Sends email to given email addresses over SMTP protocol.

YAML usage:

CALL:
    - call_send_email:
        - send_email
        - smtp_host: smtp.gmail.com
          subject: 'TEST-VA'
          smtp_password: borndigitaltesting
          smtp_port: 587
          address_from: testovaci.ucet.borndigital@gmail.com
          address_to: testovaci.ucet.borndigital@gmail.com
          html: "I am a content of an email.<br>I am another line of an email."
          attachments: {'invoice_scan.pdf': <bytes>}

Args:

smtp_host: {'type': 'str', 'desc': 'SMTP host.'}

smtp_port: {'type': 'str', 'desc': 'Port if different from default 25.'}

smtp_password: {'type': 'str', 'desc': 'Password if any needed.'}

subject: {'type': 'str', 'desc': 'Email subject.'}

address_from: {'type': 'str', 'desc': 'Address from which the email will be sent.'}

address_to: {'type': 'str', 'desc': 'Email address or a list of email addresses.'}

html: {'type': 'str', 'desc': 'HTML content. Body of the email.'}

text: {'type': 'str', 'desc': 'Text to be sent as an email.'}

attachments: {'type': 'Mapping[str, bytes]', 'desc': 'Attachments of the email. Key is a file name and value is content of the attachment.'}

Output:

{'type': 'bool', 'desc': 'Returns True if successfully sent, otherwise False.'}

Simple number (simple_number)

Merge all number information from user utterance into one number.

Currently supported languages: Any

Description: Extract one number from user utterance and return it as string to support leading zeros. If no value is found, return None:

  • user says “Hello 3 3 1 13, 23” and smart function extracts “3311323”

  • user says “Hello” and smart function extracts None

  • user says “I want 5 bananas” and smart function extracts “5”

  • user says “I want 5 apples and 3 bananas” and smart function extracts “53”

YAML usage:

CALL:
    - variable: simple_number
    # Hello 3 3 1 13, 23 => "3311323"
    # Hello => None

Output:

{'type': 'str', 'desc': 'If information about integer numbers is present, join numbers and return them as one string.'}

Simple number list (simple_number_list)

Extract all numbers from user utterance as a list.

Currently supported languages: Any

Description: Extract all numbers from user utterance and return them as a list containing strings to support leading zeros. If numbers are separated only by space, merge them. If no value is found, return None:

  • user says “Hello 3 3 1 13, 23” and smart function extracts [“33113”, “23”]

  • user says “Hello 3, 3, 1, 13, 23” and smart function extracts [“3”, “3”, “1”, “13”, “23”]

  • user says “Hello” and smart function extracts None

  • user says “I want 5 bananas” and smart function extracts [“5”]

  • user says “I want 5 apples and 3 bananas” and smart function extracts [“5”, “3”]

YAML usage:

CALL:
    - variable: simple_number_list  # Hello 3 3 1 13, 23 => ["33113", "23"]
    # Hello => None

Output:

{'type': 'List[str]', 'desc': 'Return None for no numbers, list cannot be empty'}
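The merging rules of `simple_number` and `simple_number_list` can be reproduced on digit-only input with two short regular expressions. This sketch assumes the utterance has already been preprocessed so that numbers appear as digits; word numbers are not handled. The function names are hypothetical.

```python
import re


def extract_number(utterance: str):
    """Join every digit in the utterance into one string, or return None."""
    digits = re.findall(r"\d", utterance)
    return "".join(digits) or None


def extract_number_list(utterance: str):
    """Merge digit groups separated only by single spaces; other separators split."""
    groups = re.findall(r"\d+(?: \d+)*", utterance)
    numbers = [group.replace(" ", "") for group in groups]
    return numbers or None
```

Both functions return strings rather than integers so that leading zeros survive.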

soap_extractor (soap_extractor)

Return response dict, response and request string.

Currently supported languages: Any

Description: Return result (Dict), message_sent (str), message_received (str).

YAML usage: It is not meant to be used in YAML, but can be called in backend calls and extensions on request to send and process SOAP requests.

Output:

{'type': 'Tuple', 'desc': '(Return result (Dict), request - message_sent (str), response - message_received (str))'}

spell_out (spell_out)

Return slower pronunciation of a string using comma with white space as delimiter.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘fr’, ‘de’, ‘nl’]

Description: Transform string to characters separated by commas for slower reading by text2speech system. Output allows spelling of alphanumeric characters only, everything else will be left out [ -./*].

YAML usage:

CALL:
    - pronunciation: [spell_out, {'string': 'C1234O'}]
    # `C, 1, 2, 3, 4, O`
1:
    SPEECH: "Number is {spell_out(string='C1234O')}."
    #`Number is C, 1, 2, 3, 4, O.`

Args:

string: {'type': 'Union[int, str]', 'desc': 'Input string to be spelled out.'}

Output:

{'type': 'str', 'desc': 'Number converted to comma with white space separated digits.'}
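A minimal sketch of this behavior, assuming "alphanumeric" corresponds to Python's `str.isalnum` (the function name `spell_out_text` is hypothetical, and language-specific letter names are not handled):

```python
def spell_out_text(string) -> str:
    """Keep alphanumeric characters only and join them with ', '."""
    return ", ".join(ch for ch in str(string) if ch.isalnum())
```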

street (street)

Extract the street name and street number from the last utterance as string.

Currently supported languages: [‘cs’, ‘sk’]

Description: Extract the street name and street number as a single string from utterance.

Return the longest possible street entity extracted from utterance (e.g. ‘Sokolova 113/3b’, ‘Husova 5’).

YAML usage:

CALL:
    - street: street  # returns the street name and street number from the last utterance as a single string
    # "Moje adresa je Hradní náměstí 15" => "Hradní náměstí 15"

Output:

{'type': 'str', 'desc': 'Returns the street name and street number from the last utterance as a string. Based on a regex pattern, does not consider city or context. A street number is required for a successful extraction.'}

street_trie (street_trie)

Extract the street name and street number from the last utterance as string.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘hu’, ‘en’]

Description: Extract the street name and street number as a single string from utterance.

Returns the longest possible street entity extracted from utterance (e.g. ‘Sokolova 113/3b’, ‘Husova 5’). Returns the street name and street number from the last utterance as a single string

YAML usage:

CALL:
    - street: street_trie
    # "Moje adresa je Hradní náměstí 15" => "Hradní náměstí 15"

Output:

{'type': 'str', 'desc': 'Returns the street name and street number from the last utterance as a string. Based on a list of existing streets in `cs`|`sk`|`pl`|`hu`, if none found, it tries regex, does not consider city or context. A street number is required for a successful extraction.'}

time (time)

Try to guess time intended by user. Supports single precise events for now.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘de’, ‘en’, ‘ro’, ‘nl’, ‘fr’, ‘hu’]

Description: Try to guess time from user utterance.

Return only the time in format ‘Hours:Minutes’ if present, or None if no time is present.

Currently, intervals and other advanced features are not supported.

You can use simple relative terms like “in six hours”, “at half past five”, etc.

YAML usage:

CALL:
    - phonecall_time: time  # current time is 10:10
    # "at half past five" => "17:30"
    # "yesterday at five" => "17:00"
    # "at five" => "17:00"
    # "at five in the morning" => "05:00"

Args:

current_time: {'type': 'datetime', 'desc': 'All relative information are computed from this time, if not set, use time at the moment of user interaction.'}

multiple_instances: {'type': 'bool', 'desc': 'Ignored for now.'}

intervals: {'type': 'bool', 'desc': 'Ignored for now.'}

Output:

{'type': 'str', 'desc': 'part of ISO format string (e.g. 18:00) or None'}

to_date (to_date)

Return datetime.date() object for checking current dates in conditions, or saving them as numbers.

Currently supported languages: Any

Description: Return datetime.date() object with current date (e.g. datetime.date(2020, 9, 29)), based on input arguments.

YAML usage:

EXTRACT:
    - 'current_month': to_date("time"=current_time()).month  # set current_month entity to int of current month

Output:

{'type': 'datetime.date()', 'desc': "Returns a datetime.date object. It is not JSON serializable, so it is for calculation purposes ONLY and should not be saved into a variable unless converted, e.g. by reading the object's attributes: to_date('time'=current_time()).month returns the current month as an integer. The attributes are `year`, `month`, `day`. It is also usable in MARKDOWN."}

to_readable_date (to_readable_date)

Return readable date if possible.

Currently supported languages: [‘cs’, ‘sk’]

Description: Return readable word form of date input. Input needs to contain substring in format dd.mm.yyyy.

YAML usage:

CALL:
    - 'readable_date': [to_readable_date, {"date": "01.08.2000", "language": "cs"}]  # prvního srpna dva tisíce
    - 'readable_date': [to_readable_date, {"date": "narodil sa 29.02.1975", "language": "sk"}]  # dvadsiateho deviateho februára tisíc deväťsto sedemdesiat päť

Args:

language: {'type': 'str', 'desc': 'language, so far only `cs|sk` is supported'}

Output:

{'type': 'str', 'desc': 'Returns the date with digits transformed into numeral words. If the date format check fails, a custom FormatNotSupportedError is raised. Strings that contain a substring in the said format are also accepted. In case of multiple matches, the first one from the left is taken.'}
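The "first match from the left" rule can be sketched with a regular expression. This is an assumption about how the substring lookup might work, not the extractor's code: `find_first_date` and the pattern are hypothetical, and `FormatNotSupportedError` is a local stand-in for the custom error named above. The conversion to Czech or Slovak words is not attempted.

```python
import re
from typing import Tuple


class FormatNotSupportedError(ValueError):
    """Local stand-in for the custom error named in the documentation."""


# dd.mm.yyyy that is not glued to further digits on either side.
DATE_PATTERN = re.compile(r"(?<!\d)(\d{2})\.(\d{2})\.(\d{4})(?!\d)")


def find_first_date(text: str) -> Tuple[int, int, int]:
    """Return (day, month, year) of the leftmost dd.mm.yyyy substring."""
    match = DATE_PATTERN.search(text)
    if match is None:
        raise FormatNotSupportedError(f"no dd.mm.yyyy date in {text!r}")
    day, month, year = (int(part) for part in match.groups())
    return day, month, year


print(find_first_date("narodil sa 29.02.1975"))    # (29, 2, 1975)
print(find_first_date("01.08.2000 a 02.09.2001"))  # (1, 8, 2000)
```

The sketch deliberately performs no calendar validation, because the documented example accepts 29.02.1975.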

translate (translate)

Translates the input text to the specified language using Azure Translate REST API.

Currently supported languages: [‘cs’, ‘fr’, ‘hr’, ‘hu’, ‘nl’, ‘ro’, ‘ru’, ‘sk’, ‘en’, ‘pl’, ‘de’, ‘pt’]

Description: Extractor to translate the input text using Azure Translate REST API.

Args:

language: {'type': 'str', 'desc': 'The target language for translation.'}

from: {'type': 'str', 'desc': 'The source language of the text.'}

input: {'type': 'str', 'desc': 'The text to be translated.'}

Output:

{'type': 'str', 'desc': 'Returns the translated text.'}
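How the three arguments could map onto an Azure Translator request is sketched below. The endpoint, query parameters and body shape follow the public Azure Translator v3 REST API; how the extractor itself builds its request, and how it authenticates, is not documented here, so `build_translate_request` and `parse_translation` are hypothetical helpers. No network call is made.

```python
import json
from typing import Optional, Tuple
from urllib.parse import urlencode

AZURE_TRANSLATE_URL = "https://api.cognitive.microsofttranslator.com/translate"


def build_translate_request(
    text: str, language: str, source: Optional[str] = None
) -> Tuple[str, str]:
    """Build the URL and JSON body for a v3 translate call (nothing is sent)."""
    params = {"api-version": "3.0", "to": language}
    if source:
        params["from"] = source
    url = f"{AZURE_TRANSLATE_URL}?{urlencode(params)}"
    body = json.dumps([{"Text": text}])
    return url, body


def parse_translation(response_body: str) -> str:
    """Pick the first translation out of a v3 response body."""
    return json.loads(response_body)[0]["translations"][0]["text"]


url, body = build_translate_request("Dobry den", language="en", source="cs")
print(url)
print(body)
```

A real call would also need a subscription key header, which is omitted here.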

update_json (update_json)

Update input json variable with new dictionary input

Currently supported languages: Any

Description: Updates a variable that holds a JSON structure by merging it with the new dictionary values, and returns the result.

YAML usage:

# in history there exists variable named some_json_variable with value {key2: value2}
CALL:
    - updated_dictionary_variation: [update_json, {'json_variable_name': 'some_json_variable', 'values': [{"data": "35"}, {"data_2": {"data_3": 40}}]}]
    # for some_json_variable equal to {"data": "30"} returns {'data': '35', 'data_2': {'data_3': 40}}

Args:

json_variable_name: {'type': 'str', 'desc': 'Name of variable that exists in conversations history object.'}

values: {'type': 'list', 'desc': 'List of dictionaries to be merged with json variable.'}

Output:

{'type': 'str', 'desc': 'Updated json variable.'}
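The documented example ({"data": "30"} merged with two dictionaries) is consistent with applying each dictionary in order. A minimal sketch, assuming a shallow top-level merge; whether the real extractor merges nested dictionaries recursively is not stated, and `merge_values` is a hypothetical name.

```python
from typing import Any, Dict, List


def merge_values(current: Dict[str, Any], values: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply each dictionary in `values` on top of `current`, left to right."""
    merged = dict(current)  # copy, so the stored variable is not mutated
    for value in values:
        merged.update(value)  # assumption: shallow merge, later keys win
    return merged


result = merge_values(
    {"data": "30"},
    [{"data": "35"}, {"data_2": {"data_3": 40}}],
)
print(result)  # {'data': '35', 'data_2': {'data_3': 40}}
```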

user_data_management (user_data_management)

SELECT data from the client's database.

Currently supported languages: Any

Description: Extension that enables checking custom client data. The user needs to know the table structure and column types. If the table name is a variable, it needs to be defined in the SET section of the start node of the flow. The schema is the organization id.

YAML usage:

CALL:
    - output: [user_data_management, {"table_name": "org_table", "input_column": {"age": "30", "shoe_size": "11"}, "output_columns": ["name", "surname", "age"]}]
    - output2: [user_data_management, {"table_name": "org_table", "ignore_case": true, "input_column": {"name": "{name_variable}", "surname": "{surname_variable}"}, "output_columns": ["name", "surname", "age"]}]
    # use placeholder signifier {variable_that_contains_valid_value}, in case of input stored in variable
    - async_output: [user_data_management, {"table_name": "org_table", "ignore_case": true, "input_column": {"name": "timo", "surname": "JURCO"}, "output_columns": ["name", "surname", "age"], "non_blocking": True}]
    # use non_blocking, if you do not need to wait for response, result will be saved in history at the beginning of next interaction after results received
    - multi_table_output:
        - user_data_management
        - table_name: ['people', 'people2']
          input_column: [[{"number": '{input_number_1}', "name": '{input_name_1}'}, {"number": '{input_number_2}', "name": '{input_name_2}'}], {"number": '{input_number_2}', "name": '{input_name_2}'}]
          output_columns: [['number', 'name', 'surname', 'email'], ['number', 'name', 'surname']]

Output:

{'type': 'list', 'desc': 'List of deserialized rows according to input_column parameter, [] if none found'}
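The single-table lookup can be pictured as a parameterized SELECT. The sketch below runs against an in-memory SQLite table and is an assumption throughout: the real backend, its schema handling and the exact shape of the returned rows are not documented here, and `select_rows` is a hypothetical helper.

```python
import sqlite3
from typing import Any, Dict, List


def quote(identifier: str) -> str:
    """Quote a table or column name; identifiers cannot be bound as parameters."""
    return '"' + identifier.replace('"', '""') + '"'


def select_rows(
    conn: sqlite3.Connection,
    table_name: str,
    input_column: Dict[str, str],
    output_columns: List[str],
    ignore_case: bool = False,
) -> List[Dict[str, Any]]:
    """Return rows matching every input_column pair, or [] if none match."""
    template = "LOWER({col}) = LOWER(?)" if ignore_case else "{col} = ?"
    conditions = [template.format(col=quote(name)) for name in input_column]
    where = " AND ".join(conditions) if conditions else "1 = 1"
    columns = ", ".join(quote(name) for name in output_columns)
    sql = f"SELECT {columns} FROM {quote(table_name)} WHERE {where}"
    cursor = conn.execute(sql, list(input_column.values()))
    return [dict(zip(output_columns, row)) for row in cursor.fetchall()]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE org_table (name TEXT, surname TEXT, age TEXT, shoe_size TEXT)")
conn.execute("INSERT INTO org_table VALUES ('Timo', 'Jurco', '30', '11')")

rows = select_rows(
    conn, "org_table",
    {"name": "timo", "surname": "JURCO"},
    ["name", "surname", "age"],
    ignore_case=True,
)
print(rows)  # [{'name': 'Timo', 'surname': 'Jurco', 'age': '30'}]
```

Values are bound as parameters, while identifiers are quoted, since column and table names come from the flow definition.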

utterance (utterance)

Extract last user utterance from conversation.

Currently supported languages: Any

Description: Extract last utterance uttered.

Return last utterance uttered (e.g. ‘ano’, ‘ne’, ‘veta od uzivatele’).

YAML usage:

CALL:
    - last_utterance: utterance  # extracts last utterance
    # "vstup od uzivatele" => "vstup od uzivatele"

Output:

{'type': 'str', 'desc': 'Returns the utterance of the user, even if it is empty.'}

utterance_language (utterance_language)

Return string marking language of last utterance.

Currently supported languages: Any

Description: Return guess of language for utterance.

YAML usage:

CALL:
    - guessed_utt_language: utterance_language # set to string marking utterance (e.g. 'ru', 'cs', 'en')

Args:

utterance: {'type': 'str', 'desc': 'Utterance whose language will be guessed.'}

require_language: {'type': 'str', 'desc': 'Candidate expected language.'}

fail_on_match: {'type': 'bool', 'desc': 'If set to True and the expected language and detected language match, the extractor fails.'}

possible_languages: {'type': 'Tuple', 'desc': 'Tuple of suspected candidate languages.'}

Output:

{'type': 'str', 'desc': 'Return string marking language of last utterance.'}

winning_intent (winning_intent)

Returns whether the target state is the intent with the greatest score and that score is above the threshold.

Currently supported languages: Any

Description: Send request to the intent resolver to check winning intent against target conversation state.

Returns True or None based on the winning INTENTS: True if the target state is among the winning intents (a language model can contain multiple identical intents) and their winning SCORE is greater than the given threshold, otherwise None. For example, it returns True for target I_CHANGE_PARAMS_FIND_DETAIL_DAY_LIMIT, threshold 0.8 and resolver output [['I_CHANGE_PARAMS_FIND_DETAIL_DAY_LIMIT', 'I_CHANGE_PARAMS_DAY_LIMIT'], 0.9999368190765381].

YAML usage:

CALL:
    - is_winning: [winning_intent, {"threshold": 0.85, "target": TARGET_NODE}]
    # can be used if we want to know the score of winning intent or ban entry for non target states

Args:

threshold: {'type': 'float|int', 'desc': 'threshold for score of winning intent'}

Output:

{'type': 'Optional[bool]', 'desc': 'True if TARGET_STATE == WINNING_INTENT and score > given threshold'}
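The decision rule described above can be written as a few lines of Python. This is a sketch of the documented condition only; `is_winning` is a hypothetical function, and the request to the intent resolver is replaced by hard-coded values taken from the example.

```python
from typing import List, Optional


def is_winning(
    target: str, winning_intents: List[str], score: float, threshold: float
) -> Optional[bool]:
    """Return True if target is among the winners and score > threshold, else None."""
    if target in winning_intents and score > threshold:
        return True
    return None


intents = ["I_CHANGE_PARAMS_FIND_DETAIL_DAY_LIMIT", "I_CHANGE_PARAMS_DAY_LIMIT"]
score = 0.9999368190765381

print(is_winning("I_CHANGE_PARAMS_FIND_DETAIL_DAY_LIMIT", intents, score, 0.8))  # True
print(is_winning("I_OTHER_STATE", intents, score, 0.8))                          # None
```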

zipcode (zipcode)

Extract zipcode from the last utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’]

Description: Extract zipcode number from utterance.

Return zipcode number from utterance (e.g. ‘81106’, ‘60200’).

YAML usage:

CALL:
    - psc: zipcode  # returns the zip code number from the last utterance
    # "Pre stare mesto psc je nieco ako 811 06" => "81106"

Output:

{'type': 'str', 'desc': 'Returns the zip code number from the last utterance.'}
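The format-dependent matching can be approximated with a regular expression. A minimal sketch, assuming the Czech and Slovak "NNN NN" layout with an optional space; the pattern and `extract_zipcode` are hypothetical, and other layouts (for example Polish codes) are not covered.

```python
import re
from typing import Optional

# Three digits, an optional single whitespace character, two digits,
# not adjacent to any other digit.
ZIPCODE_PATTERN = re.compile(r"(?<!\d)(\d{3})\s?(\d{2})(?!\d)")


def extract_zipcode(utterance: str) -> Optional[str]:
    """Return the first 'NNN NN' or 'NNNNN' zipcode without the space, else None."""
    match = ZIPCODE_PATTERN.search(utterance)
    if match is None:
        return None
    return match.group(1) + match.group(2)


print(extract_zipcode("Pre stare mesto psc je nieco ako 811 06"))  # 81106
print(extract_zipcode("Brno 60200"))                               # 60200
print(extract_zipcode("psc je 8 1 1 0 6"))                         # None
```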

zipcode (zipcode_greedy)

Extract zipcode from the last utterance.

Currently supported languages: [‘cs’, ‘sk’, ‘pl’, ‘hu’, ‘en’]

Description: Extract zipcode number from utterance, not dependent on the format.

Return zipcode number from utterance (e.g. ‘81106’, ‘60200’).

YAML usage:

CALL:
    - psc: zipcode_greedy  # returns the zip code number from the last utterance
    # "Pre stare mesto psc je nieco ako 8 1 1 0 6" => "81106"

Output:

{'type': 'str', 'desc': 'Returns the zip code number from the last utterance. This is the greedy version, meaning less strict on the format compared to the basic zipcode extractor in /extensions.'}
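"Less strict on the format" can be illustrated by allowing arbitrary whitespace between the digits. This is an assumed approximation, not the extractor's actual rule: the pattern and `extract_zipcode_greedy` are hypothetical and only handle five-digit codes.

```python
import re
from typing import Optional

# Five digits with any amount of whitespace between them,
# not directly adjacent to another digit.
GREEDY_ZIPCODE_PATTERN = re.compile(r"(?<!\d)\d(?:\s*\d){4}(?!\d)")


def extract_zipcode_greedy(utterance: str) -> Optional[str]:
    """Return the first five-digit run, ignoring spaces between digits, else None."""
    match = GREEDY_ZIPCODE_PATTERN.search(utterance)
    if match is None:
        return None
    return re.sub(r"\s+", "", match.group(0))


print(extract_zipcode_greedy("Pre stare mesto psc je nieco ako 8 1 1 0 6"))  # 81106
print(extract_zipcode_greedy("psc 811 06"))                                  # 81106
print(extract_zipcode_greedy("ziadne cislo"))                                # None
```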

widget_announcement (widget_announcement)

Returns data for the announcement widget

Currently supported languages: Any

Description: Fetches, evaluates and returns data for the announcement widget. See https://dev.azure.com/borndigitalai/Born%20Digital/_workitems/edit/1789

Args:

widget_id: {'type': 'str', 'desc': 'ID of the announcement to return.'}

Output:

{'type': 'dict[str, str]', 'desc': 'Widget specific data.'}
