Configuration Best Practices
Use an IAM and SCIM
Most users will not log in to Immuta directly; they will log in to the system in which you want to enforce data governance. However, Immuta needs to know who those users are in order to enforce access controls. To provide that information:
- Use an IAM (such as Okta) as an SSO or SAML solution.
- Use SCIM to push each user's account (including groups and attributes) to Immuta. SCIM keeps user accounts in Immuta in sync automatically. Users will still need to log in to Immuta to use the web interface.
To learn more, see Immuta’s documentation on IAM and SCIM integration.
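As a rough illustration of what a SCIM push carries, the sketch below builds a minimal SCIM 2.0 User resource using the core schema from RFC 7643. It is an assumption-laden example, not Immuta's API: the helper name `build_scim_user` and the sample user are hypothetical, and the endpoint, authentication, and any Immuta-specific attribute mapping are left out because the source does not describe them.

```python
import json

# Core SCIM 2.0 User schema URN, as defined in RFC 7643.
SCIM_USER_SCHEMA = "urn:ietf:params:scim:schemas:core:2.0:User"


def build_scim_user(user_name, given_name, family_name, email, active=True):
    """Build a minimal SCIM 2.0 User resource as a plain dict.

    Hypothetical helper for illustration; an IAM such as Okta normally
    generates and sends this payload for you.
    """
    return {
        "schemas": [SCIM_USER_SCHEMA],
        "userName": user_name,
        "name": {"givenName": given_name, "familyName": family_name},
        "emails": [{"value": email, "primary": True}],
        "active": active,
    }


# Sample (made-up) user; the IAM would POST a body like this to the
# SCIM endpoint of the receiving system.
payload = build_scim_user("jdoe", "Jane", "Doe", "jdoe@example.com")
print(json.dumps(payload, indent=2))
```

In practice the IAM performs this push on its own schedule, so a payload like this is mainly useful for understanding or debugging what arrives on the receiving side.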
Use an external IAM for authentication and Immuta's internal IAM to manage attributes
You probably already have groups configured in your IAM, but attributes can make your policies more dynamic. Groups and roles in your existing IAM likely grant access to data, but they typically don’t describe that data or the access.
Use tags and attributes in Immuta that describe your data and your users, and then write a single policy that grants access to data automatically. For example, you could create tags that describe the domain and the security classification of each table. You could then write a policy stating that users whose attributes match the tags on a table or column automatically get access. A single policy like this one simplifies your overall policy creation and execution.
- Analyze your policies to determine which user metadata is critical to how those policies should act, and then decorate your users with those descriptive attributes.
- Reuse the approval workflows your organization already has for group assignments to approve attribute assignments.
- Manage attributes through Immuta or your IAM.
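The attribute-to-tag matching described above can be sketched in a few lines. This is a conceptual model only, not how Immuta evaluates policies internally: the `Table` and `User` classes, the `can_access` function, and the sample tag keys (`domain`, `classification`) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Table:
    name: str
    # Tags describing the data, e.g. {"domain": "finance"}.
    tags: dict


@dataclass
class User:
    name: str
    # Attribute key -> set of values the user holds.
    attributes: dict


def can_access(user, table):
    """Single policy: grant access when every tag on the table is
    matched by one of the user's attribute values for the same key.

    Untagged tables are denied, so data that has not been described
    yet is never exposed by default (an assumption of this sketch).
    """
    if not table.tags:
        return False
    return all(
        value in user.attributes.get(key, set())
        for key, value in table.tags.items()
    )


analyst = User(
    "analyst",
    {"domain": {"finance"}, "classification": {"internal", "confidential"}},
)
ledger = Table("ledger", {"domain": "finance", "classification": "confidential"})
payroll = Table("payroll", {"domain": "hr", "classification": "confidential"})

print(can_access(analyst, ledger))   # True: both tags are matched
print(can_access(analyst, payroll))  # False: no "hr" domain attribute
```

The point of the sketch is that one rule covers every table: adding a table or a user means adding tags or attributes, not writing another policy.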
Use Schema Monitoring to assess changes to data sources
Schema Monitoring detects new tables that are added to your schema, which helps ensure that every table is governed by Immuta. Immuta runs a daily job to pick up and add any new tables.
- Consider using Schema Monitoring later in your onboarding process, not during your initial setup and configuration when tables are not in a stable state.
- Consider using Immuta’s API either to run the Schema Monitoring job when your ETL process adds new tables or to add the new tables yourself.
- Activate the New Column Added templated Global Policy to protect potentially sensitive data. This policy NULLs new columns until a Data Owner reviews them, which prevents data from leaking through columns that were added without review.
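To make the behavior of a "NULL until reviewed" rule concrete, here is a minimal sketch of the idea: compare the current columns against the set that has already been reviewed, and return NULL (`None` in Python) for anything new. The function names and sample columns are hypothetical, and this models the concept only; it is not Immuta's implementation of the New Column Added policy.

```python
def find_new_columns(reviewed, current):
    """Return columns present in `current` that have not been reviewed,
    preserving the order in which they appear."""
    return [col for col in current if col not in reviewed]


def mask_row(row, unreviewed):
    """Return a copy of `row` with every unreviewed column set to None."""
    return {
        col: (None if col in unreviewed else value)
        for col, value in row.items()
    }


# Columns a Data Owner has already reviewed (sample data).
reviewed_columns = {"id", "amount"}
# Columns found on the table after an upstream change added "ssn".
current_columns = ["id", "amount", "ssn"]

new_columns = find_new_columns(reviewed_columns, current_columns)
masked = mask_row(
    {"id": 1, "amount": 9.5, "ssn": "123-45-6789"},
    set(new_columns),
)
print(new_columns)  # ['ssn']
print(masked)       # {'id': 1, 'amount': 9.5, 'ssn': None}
```

Once the Data Owner reviews the column, adding it to the reviewed set stops the masking, which mirrors the review step described above.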
Develop a plan for capturing metadata
You may already have an external data catalog tool like Collibra or Alation. Immuta can easily integrate with those catalogs and use your existing metadata to scale policy creation quickly.
If you do not already have a sensitive data tagging solution, allow Immuta to discover and tag sensitive data with our Sensitive Data Discovery capabilities. Should you have data types that Immuta doesn’t discover out of the box, you can customize Sensitive Data Discovery to detect them.
Start small, and then automate
Add your first data sources and policies through the Immuta UI. Don’t automate a process until you have executed it manually and successfully at least once.
- When going to production with a large number of data sources, write a script that uses Immuta’s API or CLI to automate data source onboarding.
- The CLI is the best way to manage a single file for onboarding data sources and applying tags. Storing that file in a version-controlled repository lets you track changes and automate your pipeline.
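The single-file approach above might look like the following sketch: one manifest kept in version control, validated before anything is registered, and then handed entry by entry to whatever performs the registration. Everything here is an assumption for illustration: the manifest layout, the key names, and the `register` callable are hypothetical and do not reflect the actual file format or commands of Immuta's API or CLI.

```python
import json

# Hypothetical manifest; in practice this would live in a repository file.
MANIFEST = """
{
  "data_sources": [
    {"schema": "sales", "table": "orders", "tags": ["domain.sales"]},
    {"schema": "hr", "table": "employees",
     "tags": ["domain.hr", "classification.confidential"]}
  ]
}
"""

REQUIRED_KEYS = {"schema", "table", "tags"}


def load_manifest(text):
    """Parse the manifest and fail fast on incomplete entries."""
    entries = json.loads(text)["data_sources"]
    for index, entry in enumerate(entries):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"entry {index} is missing keys: {sorted(missing)}")
    return entries


def onboard(entries, register):
    """Call `register` for each entry and return the names processed.

    `register` is a placeholder for the real API or CLI call.
    """
    processed = []
    for entry in entries:
        register(entry)
        processed.append(f"{entry['schema']}.{entry['table']}")
    return processed


def dry_run_register(entry):
    """Stand-in that only reports what would be registered."""
    print(f"would register {entry['schema']}.{entry['table']} with tags {entry['tags']}")


registered = onboard(load_manifest(MANIFEST), dry_run_register)
```

Starting with a dry-run callable fits the "start small" advice: the same script can be reviewed and tested before the placeholder is swapped for a real registration call.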
Register data sources with a system account
Always register data sources with a system account, not a user account, so that your data sources continue to function when a user changes roles or leaves the organization. If the original owner of a data source is a user account, the data source can be locked out when that user leaves the company or gets a new role, and an account lock can leave data sources unhealthy.
Add a secondary data owner to your data sources for easier management. Data owners can be added to a data source after it has been registered with a system account.
Consider your ETL when establishing governance
Since ETL processes are typically completed through system accounts, Immuta should not be in your ETL process. Immuta governs user accounts, not system accounts that have full access to data.
- Remove Immuta from your ETL process. For example, if you're using Databricks, your ETL process should use non-Immuta clusters.
- Use Immuta’s native write capabilities to share derived data within a team setting, not as proxies for an ETL strategy.
- Consult your Immuta representative on the best practice for your technology stack.