You are viewing documentation for Immuta version 2023.2.

For the latest version, view our documentation for Immuta SaaS or the latest self-hosted version.

Sensitive Data Discovery (SDD) API Reference Guide

Note

In previous documentation, identifiers were referred to as classifiers. The term is being updated to identifier to be more accurate and to avoid confusion with the Immuta data classification and frameworks feature. The API paths and attributes in this guide still use classifier.

Workflow

  1. Create a custom identifier.
  2. Create a template containing one or more identifiers.
  3. Search for identifiers or templates.
  4. Apply templates to one or more data sources.
  5. Run SDD on one or more data sources; tags are applied to columns where identifiers were detected.
  6. Update identifiers or templates.
  7. Delete identifiers or templates.
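
The workflow steps above correspond to the endpoints documented in the rest of this guide. The following Python sketch gathers that mapping in one place. The `SDD_WORKFLOW` table and the `endpoints_for` helper are illustrative names, not part of the Immuta API, and step 7 (delete) is left out of the sketch.

```python
# Workflow steps mapped to the (method, path) pairs listed in this guide.
# The step labels are illustrative; the methods and paths come from the
# endpoint tables in each section.
SDD_WORKFLOW = {
    "create identifier": [("POST", "sdd/classifier")],
    "create template": [("POST", "sdd/template")],
    "search": [("GET", "sdd/classifier"), ("GET", "sdd/template")],
    "apply template": [("PUT", "sdd/template/apply")],
    "run": [("POST", "sdd/run")],
    "update": [
        ("PUT", "sdd/classifier/{classifierName}"),
        ("PUT", "sdd/template/{templateName}"),
    ],
}


def endpoints_for(step, base_url):
    """Return 'METHOD URL' strings for one workflow step."""
    base = base_url.rstrip("/")
    return [f"{method} {base}/{path}" for method, path in SDD_WORKFLOW[step]]


print(endpoints_for("run", "https://your-immuta-url.immuta.com"))
```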

Create a custom identifier

Endpoint

Method Path Purpose
POST sdd/classifier Create an identifier.

Query Parameters

None.

Payload Parameters

Attribute Description Required
name string Unique, request-friendly identifier name. Yes
displayName string Unique, human-readable identifier name. Yes
description string The identifier description. Yes
type string The type of identifier: regex, dictionary, columnNameRegex, or builtIn. Yes
config object May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. *See descriptions below. Yes
minConfidence* number When the detection confidence is at least this percentage, tags are applied. Yes
tags* array[string] The name of the tags to apply to the data source. Yes
regex* string A case-insensitive regular expression to match against column values. No
columnNameRegex* string A case-insensitive regular expression to match against column names. No
values* array[string] The list of words to include in the dictionary. No
caseSensitive* boolean Indicates whether or not values are case sensitive. Defaults to false. No

Response Parameters

Attribute Description
createdBy array Includes details about the user who created the identifier, such as their profile id, name, and email.
name string Unique, request-friendly identifier name.
displayName string Unique, human-readable identifier name.
description string The identifier description.
type string The type of identifier: regex, dictionary, columnNameRegex, or builtIn.
config object May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. *See descriptions below.
minConfidence number When the detection confidence is at least this percentage, tags are applied.
tags* array[string] The name of the tags to apply to the data source.
columnNameRegex* string A case-insensitive regular expression to optionally match against column names.
regex* string A case-insensitive regular expression to match against column values.
values* array[string] The list of words included in the dictionary.
caseSensitive* boolean Indicates whether or not values are case sensitive. Defaults to false.
createdAt date When the identifier was created.
updatedAt date When the identifier was last updated.

Request example

The following request creates a custom identifier from the payload saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier
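
For callers who prefer a script to curl, the same request can be sketched in Python with the standard library. The helper name `build_create_identifier_request` is hypothetical, and the function only builds the request object; sending it is left to the caller.

```python
import json
import urllib.request


def build_create_identifier_request(base_url, token, payload):
    """Build (but do not send) the POST request for sdd/classifier."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/sdd/classifier",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


# To send the request, pass the returned object to urllib.request.urlopen().
```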

Payload examples

Regex identifier payload

{
  "name": "MY_REGEX_IDENTIFIER",
  "displayName": "My Regex Identifier",
  "description": "An identifier using regex",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5,
    "tags": ["Discovered.regex-example"]
  }
}

Dictionary identifier payload

{
  "name": "MY_DICTIONARY_IDENTIFIER",
  "displayName": "My Dictionary Identifier",
  "description": "An identifier using dictionary",
  "type": "dictionary",
  "config": {
    "values": ["Bob", "Eve"],
    "caseSensitive": true,
    "minConfidence": 0.6,
    "tags": ["Discovered.dictionary-example", "Discovered.dictionary-identifier-example"]
  }
}

Column name regex identifier payload

{
  "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER",
  "displayName": "My Column Name Regex Identifier",
  "description": "An identifier using column name regex",
  "type": "columnNameRegex",
  "config": {
    "columnNameRegex": "ssn|social ?security",
    "tags": ["Discovered.column-name-regex"]
  }
}
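
The three payloads above differ mainly in which config key carries the matching rule. The following Python sketch builds a payload for any of the three custom types. The helper `build_identifier_payload` and the `RULE_KEY_BY_TYPE` table are hypothetical names, and the helper covers only the types shown in the examples above, so it rejects builtIn.

```python
# Config key that carries the matching rule for each custom identifier type,
# as shown in the payload examples.
RULE_KEY_BY_TYPE = {
    "regex": "regex",
    "dictionary": "values",
    "columnNameRegex": "columnNameRegex",
}


def build_identifier_payload(name, display_name, description, id_type, rule,
                             tags, min_confidence=None, case_sensitive=None):
    """Return a dict shaped like the payload examples for sdd/classifier."""
    if id_type not in RULE_KEY_BY_TYPE:
        raise ValueError(f"unsupported identifier type for this sketch: {id_type}")
    config = {RULE_KEY_BY_TYPE[id_type]: rule, "tags": list(tags)}
    # Optional settings are only included when the caller supplies them.
    if min_confidence is not None:
        config["minConfidence"] = min_confidence
    if case_sensitive is not None:
        config["caseSensitive"] = case_sensitive
    return {
        "name": name,
        "displayName": display_name,
        "description": description,
        "type": id_type,
        "config": config,
    }
```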

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_IDENTIFIER",
  "displayName": "My Regex Identifier",
  "description": "An identifier using regex",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-14T18:48:56.289Z"
}

Create a template

Endpoint

Method Path Purpose
POST sdd/template Create a template.

Query Parameters

None.

Payload Parameters

Attribute Description Required
name string Unique, request-friendly template name. Yes
displayName string Unique, human-readable template name. Yes
description string The template description. Yes
classifiers array Includes each identifier's name and overrides for minConfidence and tags. Yes
sampleSize integer Override for how many records to sample from the data source. No

Response Parameters

Attribute Description
id integer The unique ID of the template.
createdBy array Includes details about the user who created the template, such as their profile id, name, and email.
name string Unique, request-friendly template name.
displayName string Unique, human-readable template name.
description string The template description.
classifiers array Includes details about the identifiers within the template, such as the name and overrides.
sampleSize integer Optional override of how many records to sample from the data source.
createdAt date When the template was created.
updatedAt date When the template was last updated.

Request example

The following request creates a template that contains 2 identifiers, saved in example-payload.json.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template

Payload example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER"
    },
    {
      "name": "MY_REGEX_IDENTIFIER"
    }
  ],
  "sampleSize": 100
}
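
A template payload like the one above can also be assembled programmatically. The Python sketch below uses a hypothetical helper, `build_template_payload`. Passing overrides as an `overrides` object on each classifiers entry is an assumption based on the shape of the response example; this guide does not show a request that sets overrides.

```python
def build_template_payload(name, display_name, description, identifiers,
                           sample_size=None):
    """Return a dict shaped like the payload example for sdd/template.

    `identifiers` is a list whose items are either an identifier name or a
    (name, overrides) pair. The overrides shape is an assumption.
    """
    classifiers = []
    for entry in identifiers:
        if isinstance(entry, str):
            classifiers.append({"name": entry})
        else:
            ident_name, overrides = entry
            classifiers.append({"name": ident_name, "overrides": dict(overrides)})
    payload = {
        "name": name,
        "displayName": display_name,
        "description": description,
        "classifiers": classifiers,
    }
    if sample_size is not None:
        payload["sampleSize"] = sample_size
    return payload
```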

Response example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-14T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_IDENTIFIER",
      "overrides": {}
    }
  ]
}

Search for identifiers or templates

Method Path Purpose
GET sdd/classifier List or search identifiers.
GET sdd/template List or search templates.
GET sdd/classifier/{classifierName} View a specific identifier by name.
GET sdd/template/{templateName} View a specific template by name.
GET sdd/template/global View the current global SDD template.

List or search for identifiers

Endpoint

Method Path Purpose
GET sdd/classifier List or search identifiers.

Query Parameters

Attribute Description Required
sortField string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. No
sortOrder string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. No
offSet integer Use in combination with limit to fetch pages. No
limit integer Limits the number of results displayed per page. No
type array[string] Searches for identifiers based on identifier type: regex, dictionary, builtIn, or columnNameRegex. No
searchText string A partial, case-insensitive search on name. No
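
The query parameters above are appended to the sdd/classifier URL as a query string. The Python sketch below builds such a URL with the standard library. The helper name `build_identifier_search_url` is hypothetical, and it does not validate parameter names, so pass them exactly as they appear in the table.

```python
from urllib.parse import urlencode


def build_identifier_search_url(base_url, **params):
    """Return the sdd/classifier URL with the given query parameters."""
    # Drop parameters the caller left unset so they are not sent as "None".
    query = {key: value for key, value in params.items() if value is not None}
    url = f"{base_url.rstrip('/')}/sdd/classifier"
    return f"{url}?{urlencode(query)}" if query else url


print(build_identifier_search_url(
    "https://your-immuta-url.immuta.com",
    sortField="name", sortOrder="asc", limit=5,
))
```

When a URL like this is passed to curl from a shell, wrap it in quotes so that the shell does not interpret the & characters.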

Response Parameters

Attribute Description
count integer The number of identifiers found matching the search criteria.
createdBy array Includes details about the user who created the identifier, such as their profile id, name, and email.
name string Unique, request-friendly identifier name.
displayName string Unique, human-readable identifier name.
description string The identifier description.
type string The type of identifier: regex, dictionary, columnNameRegex, or builtIn.
config object May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. *See descriptions below.
minConfidence number When the detection confidence is at least this percentage, tags are applied.
tags* array[string] The name of the tags to apply to the data source.
columnNameRegex* string A case-insensitive regular expression to optionally match against column names.
regex* string A case-insensitive regular expression to match against column values.
values* array[string] The list of words included in the dictionary.
caseSensitive* boolean Indicates whether or not values are case sensitive. Defaults to false.
createdAt date When the identifier was created.
updatedAt date When the identifier was last updated.

Request example

The following request lists the first 5 identifiers, sorted by name in ascending order.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    "https://your-immuta-url.immuta.com/sdd/classifier?sortField=name&sortOrder=asc&limit=5"

Response example

{
  "count": 67,
  "hits": [
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AGE",
      "displayName": "Age",
      "description": "Detects numeric strings between 10 and 199, provided the column header contains text such as `age`, `year`, `years`, `yr`, or `yrs`.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Indirect",
          "Discovered.PHI",
          "Discovered.Entity.Age"
        ],
        "conditionalTags": {}
      },
      "id": 3,
      "createdAt": "2021-10-28T07:34:58.761Z",
      "updatedAt": "2021-10-28T07:34:58.761Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "ARGENTINA_DNI_NUMBER",
      "displayName": "Argentina DNI Number",
      "description": "Detects strings consistent with Argentina National Identity (DNI) Number.  Requires an eight digit number with optional periods between the second and third and fifth and sixth digit.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Argentina",
          "Discovered.PHI",
          "Discovered.Entity.DNI Number"
        ],
        "conditionalTags": {}
      },
      "id": 4,
      "createdAt": "2021-10-28T07:34:58.769Z",
      "updatedAt": "2021-10-28T07:34:58.769Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_MEDICARE_NUMBER",
      "displayName": "Australia Medicare Number",
      "description": "Detects numeric strings consistent with Australian Medicare Number.  Requires a ten or eleven digit number.  The starting digit must be between 2 and 6, inclusive.  Optional spaces can be placed between the fourth and fifth and ninth and tenth digit.  Optional 11th separated by a `/` can be present.  A checksum is required.",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Medicare Number"
        ],
        "conditionalTags": {}
      },
      "id": 5,
      "createdAt": "2021-10-28T07:34:58.779Z",
      "updatedAt": "2021-10-28T07:34:58.779Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_PASSPORT",
      "displayName": "Australia Passport",
      "description": "Detects strings consistent with Australian Passport number.  A 8 or 9 character string is required, with a starting upper case character (N, E, D, F, A, C, U, X) or a two character starting character (P followed by A, B, C, D, E, F, U, W, X, or Z) followed by seven digits",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Passport"
        ],
        "conditionalTags": {}
      },
      "id": 26,
      "createdAt": "2021-10-28T07:34:59.010Z",
      "updatedAt": "2021-10-28T07:34:59.010Z"
    },
    {
      "createdBy": {
        "id": 21,
        "name": "Immuta System Account",
        "email": "immuta_system@immuta.com"
      },
      "name": "AUSTRALIA_TAX_FILE_NUMBER",
      "displayName": "Australia Tax File Number",
      "description": "Detects strings consistent with Australia Tax File Number.  Requires a nine digit number with optional spaces between the third and fourth and sixth and seventh digits.  A checksum is also required",
      "type": "builtIn",
      "config": {
        "minConfidence": 0.7,
        "tags": [
          "Discovered.PII",
          "Discovered.Identifier Direct",
          "Discovered.Country.Australia",
          "Discovered.PHI",
          "Discovered.Entity.Tax File Number"
        ],
        "conditionalTags": {}
      },
      "id": 6,
      "createdAt": "2021-10-28T07:34:58.789Z",
      "updatedAt": "2021-10-28T07:34:58.789Z"
    }
  ]
}
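
In the response above, count reports all identifiers that match the search (67), while hits holds only the requested page (5). The Python sketch below computes the offsets needed to fetch every page and pulls the names out of one page. Both helper names are hypothetical, and the sketch assumes the offset parameter counts records rather than pages.

```python
def page_offsets(count, limit):
    """Return the starting offset of each page needed to cover `count` results."""
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    return list(range(0, count, limit))


def identifier_names(response):
    """Extract the identifier names from one page of search results."""
    return [hit["name"] for hit in response.get("hits", [])]


# With count=67 and limit=5, the last page starts at offset 65.
offsets = page_offsets(67, 5)
print(len(offsets), offsets[-1])
```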

List or search for templates

Endpoint

Method Path Purpose
GET sdd/template List or search templates.

Query Parameters

Attribute Description Required
sortField string The field by which to sort the search results: id, name, displayName, type, createdAt, or updatedAt. No
sortOrder string Denotes whether to sort the results in ascending (asc) or descending (desc) order. Default is asc. No
offSet integer Use in combination with limit to fetch pages. No
limit integer Limits the number of results displayed per page. No
classifiers array[string] Filters template results to those containing the specified identifiers. No
searchText string A partial, case-insensitive search on the template name. No

Response Parameters

Attribute Description
count integer The number of templates found matching the search criteria.
id integer The unique ID of the template.
createdBy array Includes details about the user who created the template, such as their profile id, name, and email.
name string Unique, request-friendly template name.
displayName string Unique, human-readable template name.
description string The template description.
classifiers array Includes details about the identifiers within the template, such as the name and overrides.
sampleSize integer Optional override of how many records to sample from the data source.
createdAt date When the template was created.
updatedAt date When the template was last updated.

Request example

The following request lists all custom templates.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template

Response example

{
  "count": 1,
  "hits": [
    {
      "name": "MY_FIRST_TEMPLATE",
      "displayName": "My First Template",
      "description": "This is the first template I've created.",
      "sampleSize": 100,
      "createdBy": {
        "id": 1,
        "name": "John",
        "email": "john@example.com"
      },
      "id": 1,
      "createdAt": "2021-10-14T19:12:22.092Z",
      "updatedAt": "2021-10-14T19:12:22.092Z",
      "classifiers": [
        {
          "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER",
          "overrides": {}
        },
        {
          "name": "MY_REGEX_IDENTIFIER",
          "overrides": {}
        }
      ]
    }
  ]
}

View an identifier by name

Endpoint

Method Path Purpose
GET sdd/classifier/{classifierName} Get an identifier by name.

Path Parameters

Attribute Description Required
classifierName string The name of the identifier. Yes

Response Parameters

Attribute Description
id integer The unique ID of the identifier.
createdBy array Includes details about the user who created the identifier, such as their profile id, name, and email.
name string Unique, request-friendly identifier name.
displayName string Unique, human-readable identifier name.
description string The identifier description.
type string The type of identifier: regex, dictionary, columnNameRegex, or builtIn.
config object May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. *See descriptions below.
minConfidence number When the detection confidence is at least this percentage, tags are applied.
tags* array[string] The name of the tags to apply to the data source.
columnNameRegex* string A case-insensitive regular expression to optionally match against column names.
regex* string A case-insensitive regular expression to match against column values.
values* array[string] The list of words included in the dictionary.
caseSensitive* boolean Indicates whether or not values are case sensitive. Defaults to false.
createdAt date When the identifier was created.
updatedAt date When the identifier was last updated.

Request example

This request gets the identifier named MY_REGEX_IDENTIFIER.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_IDENTIFIER
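
Because the identifier name is part of the URL path, a script that builds this URL from user input may want to percent-encode the name first. The Python sketch below shows one defensive way to do that. The helper name `identifier_url` is hypothetical, and this guide does not state which characters identifier names may contain.

```python
from urllib.parse import quote


def identifier_url(base_url, classifier_name):
    """Return the sdd/classifier/{classifierName} URL for one identifier."""
    # safe="" also encodes "/" so the name always stays a single path segment.
    return f"{base_url.rstrip('/')}/sdd/classifier/{quote(classifier_name, safe='')}"


print(identifier_url("https://your-immuta-url.immuta.com", "MY_REGEX_IDENTIFIER"))
```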

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "MY_REGEX_IDENTIFIER",
  "displayName": "My Regex Identifier",
  "description": "An identifier using regex",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-18T16:48:18.819Z",
  "updatedAt": "2021-10-18T16:48:18.819Z"
}

View a template by name

Endpoint

Method Path Purpose
GET sdd/template/{templateName} Get a template by name.

Path Parameters

Attribute Description Required
templateName string The name of the template. Yes

Response Parameters

Attribute Description
id integer The unique ID of the template.
createdBy array Includes details about the user who created the template, such as their profile id, name, and email.
name string Unique, request-friendly template name.
displayName string Unique, human-readable template name.
description string The template description.
classifiers array Includes details about the identifiers within the template, such as the name and overrides.
sampleSize integer Optional override of how many records to sample from the data source.
createdAt date When the template was created.
updatedAt date When the template was last updated.

Request example

This request gets the template named MY_FIRST_TEMPLATE.

curl \
    --request GET \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_TEMPLATE

Response example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@immuta.com"
  },
  "id": 1,
  "createdAt": "2021-10-18T16:54:24.920Z",
  "updatedAt": "2021-10-18T16:54:24.920Z",
  "classifiers": [
    {
      "name": "MY_DICTIONARY_IDENTIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_IDENTIFIER",
      "overrides": {}
    }
  ]
}

View the current global template

Endpoint

Method Path Purpose
GET sdd/template/global View the current global SDD template.

Query Parameters

None.

Response Parameters

Attribute Description
id integer The unique ID of the template.
name string Unique, request-friendly template name.
displayName string Unique, human-readable template name.
description string The template description.
classifiers array Includes details about the identifiers within the template, such as the name and overrides.
sampleSize integer Optional override of how many records to sample from the data source.
createdBy array Includes details about the user who created the template, such as their profile id, name, and email.
createdAt date When the template was created.
updatedAt date When the template was last updated.

Request example

This request gets the current global SDD template information.

curl -X 'GET' \
  'https://demo.immuta.com/sdd/template/global' \
  -H 'accept: application/json' \
  -H 'Authorization: Bearer 9ba76f3c64c345ad817fa467d7110556'

Response example

{
  "name": "MY_FIRST_TEMPLATE",
  "displayName": "My First Template",
  "description": "This is the first template I've created.",
  "sampleSize": 100,
  "createdBy": {
    "id": 2,
    "name": "Jane Doe",
    "email": "jane.doe@immuta.com"
  },
  "id": 1,
  "createdAt": "2022-08-10T20:35:43.252Z",
  "updatedAt": "2022-08-10T20:35:43.252Z",
  "classifiers": [
    {
      "name": "AGE",
      "overrides": {}
    },
    {
      "name": "ETHNIC_GROUP",
      "overrides": {}
    }
  ]
}

Apply templates to data sources

Endpoint

Method Path Purpose
PUT sdd/template/apply Apply a template to a set of data sources.

Query Parameters

None.

Payload Parameters

Attribute Description Required
template string The name of the template to apply to the data sources; pass null to clear the current template. Yes
sources array[string] The names of the data sources to apply the template to. Yes

Response Parameters

Attribute Description
success boolean When true, the request was successful.

Request example

This request applies the MY_FIRST_TEMPLATE template to the Public Case data source.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/apply

Payload example

{
  "template": "MY_FIRST_TEMPLATE",
  "sources": [
    "Public Case"
  ]
}
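
The same payload can be generated from a script, including the case where template is null to clear the current template. The Python sketch below relies on json.dumps writing None as null. The helper name `build_apply_payload` is hypothetical.

```python
import json


def build_apply_payload(template, sources):
    """Return the JSON body for sdd/template/apply.

    Pass template=None to produce "template": null, which clears the
    template currently applied to the listed data sources.
    """
    if isinstance(sources, str):
        sources = [sources]  # accept a single data source name for convenience
    return json.dumps({"template": template, "sources": list(sources)})


print(build_apply_payload("MY_FIRST_TEMPLATE", "Public Case"))
print(build_apply_payload(None, ["Public Case"]))
```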

Response example

{
  "success": true
}

Run SDD on data sources

Endpoint

Method Path Purpose
POST sdd/run Run SDD on specified data sources.

Query Parameters

None.

Payload Parameters

Attribute Description Required
sources array[string] The names of the data sources on which to run SDD. Yes (unless all is true)
all boolean If true, SDD will run on all Immuta data sources. No
wait integer The number of seconds to wait for the SDD jobs to finish. The value -1 will wait until the jobs complete. Default is -1. No
dryRun boolean When true, SDD will not update the tags on the data source(s) and will just return what tags would have been applied or removed. Default is false. No
template string If passed, Immuta will run SDD with this template instead of the applied template on the data source(s). Passing template when dryRun is false will cause an error. No
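
Some of these parameters constrain each other, so a script can catch a bad combination before sending the request. The Python sketch below applies the rule stated above for template and dryRun. The helper name `build_run_payload` is hypothetical, and requiring either sources or all is an assumption drawn from the payload examples in this section.

```python
def build_run_payload(sources=None, run_all=False, wait=None, dry_run=False,
                      template=None):
    """Return a dict for the sdd/run request body."""
    # Documented rule: passing template when dryRun is false causes an error.
    if template is not None and not dry_run:
        raise ValueError("template can only be passed when dryRun is true")
    # Assumption: a request names its data sources or sets all to true.
    if not sources and not run_all:
        raise ValueError("pass sources or set run_all=True")
    payload = {}
    if sources:
        payload["sources"] = list(sources)
    if run_all:
        payload["all"] = True
    if wait is not None:
        payload["wait"] = wait
    if dry_run:
        payload["dryRun"] = True
    if template is not None:
        payload["template"] = template
    return payload
```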

Response Parameters

Attribute Description
id string A job universally unique identifier.
state string The job state: created, retry, active, completed, expired, cancelled, or failed.
output array[string] Information about the tags applied on the data source, including diff (added and removed tags) and the current state of allTags on all columns in the data sources.

Request example: Run SDD on a single data source

This request runs SDD on the data source Insurance Data.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "sources": [
    "Insurance Data"
  ]
}

Response example

{
  "Insurance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  }
}
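
A response like the one above is keyed by data source name, with the tag changes nested under output.diff. The Python sketch below reduces such a response to a per-source summary. The helper name `summarize_run` is hypothetical, and it assumes the response has the shape shown in this example.

```python
def summarize_run(response):
    """Return the job state and tag-change counts for each data source."""
    summary = {}
    for source, job in response.items():
        diff = job.get("output", {}).get("diff", {})
        summary[source] = {
            "state": job.get("state"),
            # Count tags across all columns, not the number of columns.
            "added": sum(len(tags) for tags in diff.get("addedTags", {}).values()),
            "removed": sum(len(tags) for tags in diff.get("removedTags", {}).values()),
        }
    return summary
```

Applied to the example response, this yields a completed state with 2 added tags and 1 removed tag for Insurance Data.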

Request example: Run SDD on all data sources

This request runs SDD on all your data sources.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/run

Payload example

{
  "all": true
}

Response example

{
  "Insurance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-6793988ccf95",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  },
  "Finance Data": {
    "id": "d2edc1d0-328c-11ec-9d5a-695e896d59s",
    "state": "completed",
    "output": {
      "diff": {
        "addedTags": {
          "ssn": [
            "Discovered.PII"
          ],
          "email": [
            "Discovered.PII"
          ]
        },
        "removedTags": {
          "ssn": [
            "Discovered.Country.US"
          ]
        }
      },
      "sddTagResult": {
        "ssn": [
          "Discovered.Entity.Social Security Number",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ],
        "email": [
          "Discovered.Entity.Electronic Mail Address",
          "Discovered.Identifier Direct",
          "Discovered.PHI",
          "Discovered.PII"
        ]
      }
    }
  }
}

Update identifiers or templates

Method Path Purpose
PUT sdd/classifier/{classifierName} Update an identifier. Partial updates are not supported.
POST sdd/template/{templateName}/clone Clone a template.
PUT sdd/template/{templateName} Update a template.

Update an identifier

Endpoint

Method Path Purpose
PUT sdd/classifier/{classifierName} Update an identifier. Partial updates are not supported.

Path Parameters

Attribute Description Required
classifierName string The name of the identifier to update. Yes

Payload Parameters

Attribute Description Required
name string Unique, request-friendly identifier name. Yes
displayName string Unique, human-readable identifier name. Yes
description string The identifier description. Yes
type string The type of identifier: regex, dictionary, columnNameRegex, or builtIn. Yes
config object May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. *See descriptions below. Yes
minConfidence* number When the detection confidence is at least this percentage, tags are applied. Yes
tags* array[string] The name of the tags to apply to the data source. Yes
regex* string A case-insensitive regular expression to match against column values. No
columnNameRegex* string A case-insensitive regular expression to match against column names. No
values* array[string] The list of words to include in the dictionary. No
caseSensitive* boolean Indicates whether or not values are case sensitive. Defaults to false. No
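
Because partial updates are not supported, one way to change a few fields is to start from the identifier as returned by GET sdd/classifier/{classifierName}, apply the changes, and send the complete result. The Python sketch below illustrates this. The helper name `build_update_payload` is hypothetical, and dropping id, createdBy, createdAt, and updatedAt is an assumption based on their absence from the payload table above.

```python
import copy

# Response-only fields that do not appear in the payload parameters table.
READ_ONLY_FIELDS = ("id", "createdBy", "createdAt", "updatedAt")


def build_update_payload(current, **changes):
    """Return a full update payload from an existing identifier plus changes."""
    payload = {
        key: copy.deepcopy(value)  # copy so the caller's dict is not mutated
        for key, value in current.items()
        if key not in READ_ONLY_FIELDS
    }
    # Top-level fields are replaced as a whole; to change one config setting,
    # pass a complete config object.
    payload.update(changes)
    return payload
```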

Response Parameters

Attribute Description
createdBy array Includes details about the user who created the identifier, such as their profile id, name, and email.
name string Unique, request-friendly identifier name.
displayName string Unique, human-readable identifier name.
description string The identifier description.
type string The type of identifier: regex, dictionary, columnNameRegex, or builtIn.
config object May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. *See descriptions below.
minConfidence number When the detection confidence is at least this percentage, tags are applied.
tags* array[string] The name of the tags to apply to the data source.
columnNameRegex* string A case-insensitive regular expression to optionally match against column names.
regex* string A case-insensitive regular expression to match against column values.
values* array[string] The list of words included in the dictionary.
caseSensitive* boolean Indicates whether or not values are case sensitive. Defaults to false.
createdAt date When the identifier was created.
updatedAt date When the identifier was last updated.

Request example

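The following request updates the name, display name, and description of the MY_REGEX_IDENTIFIER identifier. Because partial updates are not supported, the payload also includes the unchanged type and config.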

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/classifier/MY_REGEX_IDENTIFIER

Payload example

{
  "name": "REGULAR_EXPRESSIONS",
  "displayName": "Regular Expressions",
  "description": "This identifier uses regular expressions",
  "type": "regex",
  "config": {
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5,
    "tags": ["Discovered.regex-example"]
  }
}

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "REGULAR_EXPRESSIONS",
  "displayName": "Regular Expressions",
  "description": "This identifier uses regular expressions",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-14T18:48:56.289Z",
  "updatedAt": "2021-10-19T12:48:56.289Z"
}

Clone a template

Endpoint

Method Path Purpose
POST sdd/template/{templateName}/clone Clone a template.

Path Parameters

Attribute Description Required
templateName string The name of the template to clone. Yes

Payload Parameters

Attribute Description Required
name string Unique, request-friendly template name for the cloned template. Yes
displayName string Unique, human-readable template name for the cloned template. Yes
description string The cloned template description. No

Response Parameters

Attribute Description
id integer The unique ID of the template.
createdBy object Includes details about the user who created the template, such as their profile id, name, and email.
name string Unique, request-friendly template name.
displayName string Unique, human-readable template name.
description string The template description.
classifiers array Includes details about the identifiers within the template, such as the name and overrides.
sampleSize integer Optional override of how many records to sample from the data source.
createdAt date When the template was created.
updatedAt date When the template was last updated.

Request example

This request clones the MY_FIRST_TEMPLATE template.

curl \
    --request POST \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_TEMPLATE/clone

Payload example

{
  "name": "CLONE_OF_FIRST_TEMPLATE",
  "displayName": "Clone of My First Template",
  "description": "This is a clone of my first template."
}

Response example

{
  "name": "CLONE_OF_FIRST_TEMPLATE",
  "displayName": "Clone of My First Template",
  "description": "This is a clone of my first template.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 4,
  "createdAt": "2021-10-19T16:21:17.660Z",
  "updatedAt": "2021-10-19T16:21:17.660Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER",
      "overrides": {}
    },
    {
      "name": "MY_REGEX_IDENTIFIER",
      "overrides": {}
    }
  ]
}
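
As a rough Python equivalent of the clone request, the sketch below builds the payload and the POST request without sending it. The helper names build_clone_payload and build_clone_request are hypothetical. Because description is listed as not required, the sketch omits it from the payload when no value is given.

```python
import json
import urllib.parse
import urllib.request


# Hypothetical helpers for illustration; only the path, method, headers,
# and payload attributes come from this section.
def build_clone_payload(name, display_name, description=None):
    """Return the clone payload; description is optional and omitted if None."""
    payload = {"name": name, "displayName": display_name}
    if description is not None:
        payload["description"] = description
    return payload


def build_clone_request(base_url, token, template_name, payload):
    """Build, but do not send, a POST to sdd/template/{templateName}/clone."""
    url = "{}/sdd/template/{}/clone".format(
        base_url.rstrip("/"), urllib.parse.quote(template_name, safe="")
    )
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + token,
        },
        method="POST",
    )


clone_payload = build_clone_payload(
    "CLONE_OF_FIRST_TEMPLATE",
    "Clone of My First Template",
    "This is a clone of my first template.",
)
clone_request = build_clone_request(
    "https://your-immuta-url.immuta.com",
    "dea464c07bd07300095caa8",
    "MY_FIRST_TEMPLATE",
    clone_payload,
)
print(clone_request.get_method(), clone_request.full_url)
```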

Update a template

Endpoint

Method Path Purpose
PUT sdd/template/{templateName} Update a template.

Query Parameters

Attribute Description Required
templateName string The name of the template to update. Yes

Payload Parameters

Attribute Description Required
name string Unique, request-friendly template name. Yes
displayName string Unique, human-readable template name. Yes
description string The template description. Yes
classifiers array Includes each identifier's name and overrides for minConfidence and tags. Yes
sampleSize integer Override for how many records to sample from the data source. No

Response Parameters

Attribute Description
id integer The unique ID of the template.
createdBy object Includes details about the user who created the template, such as their profile id, name, and email.
name string Unique, request-friendly template name.
displayName string Unique, human-readable template name.
description string The template description.
classifiers array Includes details about the identifiers within the template, such as the name and overrides.
sampleSize integer Optional override of how many records to sample from the data source.
createdAt date When the template was created.
updatedAt date When the template was last updated.

Request example

The following request updates the name, description, and identifiers of the MY_FIRST_TEMPLATE template.

curl \
    --request PUT \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    --data @example-payload.json \
    https://your-immuta-url.immuta.com/sdd/template/MY_FIRST_TEMPLATE

Payload example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This template uses the column regex and regex identifiers.",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER"
    },
    {
      "name": "REGULAR_EXPRESSION"
    }
  ],
  "sampleSize": 100
}

Response example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This template uses the column regex and regex identifiers.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-14T19:12:22.092Z",
  "updatedAt": "2021-10-20T19:12:22.092Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER",
      "overrides": {}
    },
    {
      "name": "REGULAR_EXPRESSION",
      "overrides": {}
    }
  ]
}
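
The payload example above sets no overrides. The Python sketch below shows one way the classifiers array might be assembled when overrides are wanted. It rests on an assumption: that each entry accepts an overrides object holding minConfidence and tags, which is inferred from the parameter table and from the "overrides": {} objects in the response, not confirmed by a request example. The helper names and the 0.8 value are hypothetical.

```python
import json


# Hypothetical helpers. ASSUMPTION: overrides are sent as an "overrides"
# object per identifier, mirroring the shape returned in the response.
def identifier_entry(name, min_confidence=None, tags=None):
    """Return one element of the template's classifiers array."""
    entry = {"name": name}
    overrides = {}
    if min_confidence is not None:
        overrides["minConfidence"] = min_confidence
    if tags is not None:
        overrides["tags"] = list(tags)
    # Leave the key out entirely when nothing is overridden,
    # as in the payload example in this section.
    if overrides:
        entry["overrides"] = overrides
    return entry


def build_update_template_payload(name, display_name, description,
                                  classifiers, sample_size=None):
    """Return the PUT payload; sampleSize is optional and omitted if None."""
    payload = {
        "name": name,
        "displayName": display_name,
        "description": description,
        "classifiers": classifiers,
    }
    if sample_size is not None:
        payload["sampleSize"] = sample_size
    return payload


template_payload = build_update_template_payload(
    "HEALTH_DATA",
    "Health Data",
    "This template uses the column regex and regex identifiers.",
    [
        identifier_entry("MY_COLUMN_NAME_REGEX_IDENTIFIER"),
        identifier_entry(
            "REGULAR_EXPRESSION",
            min_confidence=0.8,  # illustrative value only
            tags=["Discovered.regex-example"],
        ),
    ],
    sample_size=100,
)
print(json.dumps(template_payload, indent=2))
```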

Delete identifiers or templates

Method Path Purpose
DELETE sdd/classifier/{classifierName} Delete an identifier.
DELETE sdd/template/{templateName} Delete a template.
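
Both DELETE endpoints in the table above take the resource name in the path and no payload, so a single helper can cover them. The following Python sketch is a minimal illustration under that reading: the names DELETE_PATHS and build_delete_request are hypothetical, and the request is built but not sent.

```python
import urllib.parse
import urllib.request

# Path patterns taken from the table above; the dictionary keys are
# hypothetical labels chosen for this sketch.
DELETE_PATHS = {
    "identifier": "sdd/classifier/{}",
    "template": "sdd/template/{}",
}


def build_delete_request(base_url, token, kind, name):
    """Build, but do not send, a DELETE request for an identifier or template."""
    if kind not in DELETE_PATHS:
        raise ValueError("kind must be 'identifier' or 'template'")
    # Percent-encode the name so it is safe to place in the URL path.
    path = DELETE_PATHS[kind].format(urllib.parse.quote(name, safe=""))
    return urllib.request.Request(
        base_url.rstrip("/") + "/" + path,
        headers={"Authorization": "Bearer " + token},
        method="DELETE",
    )


base_url = "https://your-immuta-url.immuta.com"
token = "dea464c07bd07300095caa8"
delete_identifier = build_delete_request(base_url, token, "identifier", "REGULAR_EXPRESSION")
delete_template = build_delete_request(base_url, token, "template", "HEALTH_DATA")
print(delete_identifier.get_method(), delete_identifier.full_url)
print(delete_template.get_method(), delete_template.full_url)
```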

Delete an identifier

Endpoint

Method Path Purpose
DELETE sdd/classifier/{classifierName} Delete an identifier.

Query Parameters

Attribute Description Required
classifierName string The name of the identifier to delete. Yes

Response Parameters

Attribute Description
createdBy object Includes details about the user who created the identifier, such as their profile id, name, and email.
name string Unique, request-friendly identifier name.
displayName string Unique, human-readable identifier name.
description string The identifier description.
type string The type of identifier: regex, dictionary, columnNameRegex, or builtIn.
config object May include config.minConfidence, config.values, config.caseSensitive, config.regex, config.columnNameRegex, and config.tags. *See descriptions below.
minConfidence* number When the detection confidence is at least this percentage, tags are applied.
tags* array[string] The names of the tags to apply to the data source.
columnNameRegex* string A case-insensitive regular expression to optionally match against column names.
regex* string A case-insensitive regular expression to match against column values.
values* array[string] The list of words included in the dictionary.
caseSensitive* boolean Indicates whether or not values are case sensitive. Defaults to false.
createdAt date When the identifier was created.
updatedAt date When the identifier was last updated.

Request example

The following request deletes the REGULAR_EXPRESSION identifier.

curl \
    --request DELETE \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/classifier/REGULAR_EXPRESSION

Response example

{
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "name": "REGULAR_EXPRESSION",
  "displayName": "Regular Expression",
  "description": "This identifier uses regular expression",
  "type": "regex",
  "config": {
    "tags": [
      "Discovered.regex-example"
    ],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-19T15:54:28.695Z",
  "updatedAt": "2021-10-19T16:00:02.329Z"
}
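
Because the delete call returns the identifier that was removed, a caller can log or verify what was deleted. The Python sketch below parses a body shaped like the response example above. The function name summarize_deleted_identifier is hypothetical, and treating config.tags as optional is an assumption made so the sketch tolerates identifiers without tags.

```python
import json


# Hypothetical helper; it reads only fields shown in the response example.
def summarize_deleted_identifier(response_text):
    """Return the id, name, type, and tags of the deleted identifier."""
    body = json.loads(response_text)
    config = body.get("config", {})
    return {
        "id": body["id"],
        "name": body["name"],
        "type": body["type"],
        # ASSUMPTION: tags may be absent, so default to an empty list.
        "tags": config.get("tags", []),
    }


# Copied from the response example in this section.
SAMPLE_RESPONSE = """
{
  "createdBy": {"id": 1, "name": "John", "email": "john@example.com"},
  "name": "REGULAR_EXPRESSION",
  "displayName": "Regular Expression",
  "description": "This identifier uses regular expression",
  "type": "regex",
  "config": {
    "tags": ["Discovered.regex-example"],
    "regex": "^[A-Z][a-z]+",
    "minConfidence": 0.5
  },
  "id": 67,
  "createdAt": "2021-10-19T15:54:28.695Z",
  "updatedAt": "2021-10-19T16:00:02.329Z"
}
"""

summary = summarize_deleted_identifier(SAMPLE_RESPONSE)
print(summary)
```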

Delete a template

Endpoint

Method Path Purpose
DELETE sdd/template/{templateName} Delete a template.

Query Parameters

Attribute Description Required
templateName string The name of the template to delete. Yes

Response Parameters

Attribute Description
id integer The unique ID of the template.
createdBy object Includes details about the user who created the template, such as their profile id, name, and email.
name string Unique, request-friendly template name.
displayName string Unique, human-readable template name.
description string The template description.
classifiers array Includes details about the identifiers within the template, such as the name and overrides.
sampleSize integer Optional override of how many records to sample from the data source.
createdAt date When the template was created.
updatedAt date When the template was last updated.

Request example

The following request deletes the HEALTH_DATA template.

curl \
    --request DELETE \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer dea464c07bd07300095caa8" \
    https://your-immuta-url.immuta.com/sdd/template/HEALTH_DATA

Response example

{
  "name": "HEALTH_DATA",
  "displayName": "Health Data",
  "description": "This is a template for health data.",
  "sampleSize": 100,
  "createdBy": {
    "id": 1,
    "name": "John",
    "email": "john@example.com"
  },
  "id": 1,
  "createdAt": "2021-10-19T16:07:39.356Z",
  "updatedAt": "2021-10-19T16:07:39.356Z",
  "classifiers": [
    {
      "name": "MY_COLUMN_NAME_REGEX_IDENTIFIER",
      "overrides": {}
    }
  ]
}