You are viewing documentation for Immuta version 2023.4.

For the latest version, view our documentation for Immuta SaaS or the latest self-hosted version.

Getting Started with Immuta Databricks Spark Integration with Unity Catalog Support

Databricks Unity Catalog is a shared metastore at the Databricks account level that streamlines management of multiple Databricks workspaces for users.

Immuta’s Databricks Spark integration with Unity Catalog support uses a custom Databricks plugin to enforce Immuta policies on a Databricks cluster with Unity Catalog enabled. This integration provides a pathway for you to add your tables to the Unity Catalog metastore so that you can use the metastore from any workspace while protecting your data with Immuta policies.

Prerequisites

  • Databricks Runtime 11.3.
  • Unity Catalog enabled on your Databricks cluster.
  • Unity Catalog metastore created and attached to a Databricks workspace.
  • The metastore owner you use to manage permissions has been granted access to all catalogs, schemas, and tables that Immuta will protect. Grant data protected by Immuta only to privileged users in Unity Catalog, so that the only way to view that data is through an Immuta-enabled cluster.
  • You have generated a personal access token for the metastore owner that Immuta can use to read data in Unity Catalog.
  • You do not plan to use clusters without Unity Catalog enabled with Immuta data sources. Once the integration is enabled, all access to data source tables must go through Databricks clusters that have Unity Catalog enabled and run Databricks Runtime 11.3.
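
As a rough illustration of the runtime prerequisite above, the following Python sketch checks whether a cluster's runtime version string corresponds to Databricks Runtime 11.3. It assumes the version string has the form Databricks reports in a cluster's spark_version field (for example, "11.3.x-scala2.12"). The function names are hypothetical and are not part of Immuta or Databricks.

```python
import re

# Runtime required by the prerequisites above: (major, minor).
REQUIRED_RUNTIME = (11, 3)


def parse_runtime(spark_version):
    """Return (major, minor) parsed from a runtime string, or None.

    Assumes a string such as "11.3.x-scala2.12"; anything that does
    not start with "<digits>.<digits>" is treated as unparseable.
    """
    match = re.match(r"^(\d+)\.(\d+)", spark_version)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported_runtime(spark_version):
    """True only when the cluster runs the required 11.3 runtime."""
    return parse_runtime(spark_version) == REQUIRED_RUNTIME
```

A check like this could run before registering data sources, so that a cluster on a different runtime is flagged early.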

Configure the Integration

  1. Create a catalog and grant access to the metastore admin.
  2. Enable Databricks Unity Catalog support in Immuta and configure token synchronization.
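
Step 1 can be sketched as a pair of Databricks SQL statements. The Python helper below only builds the statement strings; it does not connect to Databricks. The catalog name, the principal, and the specific privileges granted (USE CATALOG, USE SCHEMA, SELECT) are assumptions for illustration, so confirm the privileges your deployment requires before running anything.

```python
def quote_identifier(name):
    """Wrap a name in backticks, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"


def catalog_setup_statements(catalog, principal):
    """Build SQL to create a catalog and grant a principal access.

    Both arguments are hypothetical example values supplied by the
    caller. The privilege list is an assumption, not a requirement
    stated by the integration.
    """
    cat = quote_identifier(catalog)
    who = quote_identifier(principal)
    return [
        f"CREATE CATALOG IF NOT EXISTS {cat}",
        f"GRANT USE CATALOG, USE SCHEMA, SELECT ON CATALOG {cat} TO {who}",
    ]


if __name__ == "__main__":
    # Print the statements so they can be reviewed before being run
    # in a Databricks SQL editor or notebook.
    for statement in catalog_setup_statements("immuta_demo", "admin@example.com"):
        print(statement + ";")
```

Generating the statements as plain strings keeps the sketch independent of any particular Databricks client library.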

Configure Your Cluster

Configure your cluster to register data in Immuta.

Register Your Data

Register Unity Catalog tables as Immuta data sources.

Protect Your Data

Build policies in Immuta to restrict access to data.