Challenge
A large global payment provider wanted to enable data collaboration and discovery across multiple internal teams using the Databricks platform.
Rearc partnered with them to understand that vision and develop a data governance strategy that lets internal teams manage their own data while meeting organizational security goals.
Solution
Rearc used Unity Catalog and Databricks workspaces to build a hub-and-spoke model, the DataHub.
Each internal team is a spoke, provisioned with a standardized set of user groups that provide workspace admin, end-user, and read-only roles.
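As an illustration of the standardized groups, the sketch below derives one group name per role for a given spoke. The `datahub-<spoke>-<role>` naming scheme, the role suffixes, and the `Spoke` class are assumptions made for this example; the engagement's actual naming convention and provisioning tooling are not described here.

```python
from dataclasses import dataclass

# Hypothetical role suffixes, mirroring the three roles named above
# (workspace admin, end user, read-only).
ROLES = ("admin", "user", "readonly")


@dataclass(frozen=True)
class Spoke:
    """An internal team onboarded to the hub-and-spoke model."""

    name: str

    def group_names(self) -> dict[str, str]:
        # Assumed convention: datahub-<spoke>-<role>. Every spoke gets
        # the same set of roles, which keeps onboarding repeatable.
        return {role: f"datahub-{self.name}-{role}" for role in ROLES}


if __name__ == "__main__":
    for role, group in Spoke("payments").group_names().items():
        print(f"{role}: {group}")
```

Deriving group names from the spoke name, rather than choosing them per team, is one way to keep every spoke's role set identical and make creating a new spoke a mechanical step.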
Spoke teams manage their own resources within catalogs in a self-service manner. To facilitate cross-team collaboration and data sharing, each spoke can add data sets to a catalog that other spokes can access. Access may be granted to an entire team or to specific individuals.
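A minimal sketch of what granting read access to a shared data set might look like, assuming access is expressed as Unity Catalog SQL `GRANT` statements. The helper name `share_grants` and the catalog, schema, table, and principal names are hypothetical; the principal can be a group (an entire team) or a single user.

```python
def quote(identifier: str) -> str:
    """Backtick-quote an identifier, escaping embedded backticks."""
    return "`" + identifier.replace("`", "``") + "`"


def share_grants(catalog: str, schema: str, table: str, principal: str) -> list[str]:
    """Build the statements a principal needs to read one shared table.

    Reading a table in Unity Catalog requires USE CATALOG and USE SCHEMA
    on its parents in addition to SELECT on the table itself.
    """
    cat, sch, tbl, who = (quote(x) for x in (catalog, schema, table, principal))
    return [
        f"GRANT USE CATALOG ON CATALOG {cat} TO {who}",
        f"GRANT USE SCHEMA ON SCHEMA {cat}.{sch} TO {who}",
        f"GRANT SELECT ON TABLE {cat}.{sch}.{tbl} TO {who}",
    ]


if __name__ == "__main__":
    # Hypothetical names: a whole team (group), then one individual (user).
    for principal in ("datahub-risk-readonly", "analyst@example.com"):
        for statement in share_grants("shared", "payments", "settlements", principal):
            print(statement)
```

The function only builds statement strings; in a Databricks notebook they could be executed with `spark.sql(statement)`, which keeps the example runnable outside a workspace.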
To promote discovery, spokes can browse metadata about shared data and request access to the listed data sets.
Outcome
The client gained a consistent governance model for internal teams and a process for quickly creating new spokes within the DataHub model.
Teams were able to collaborate and produce data sets for broad consumption across the organization within the Databricks platform.
Moreover, the foundational governance and guardrails from DataHub supported the development of additional strategic projects within the data platform.