Challenge
A top global payment provider sought to enable collaboration on their data with potential customers and partners. The use cases focused on two primary goals:
- Improving the experience of customers purchasing their data products by allowing them to evaluate the data and prove out its worth.
- Collaborating with partners to provide analytics over combined data sets while preserving privacy and security.
Rearc was engaged to implement Databricks Clean Rooms to meet the needs of these use cases and prove out the client's data collaboration strategy.
Our Approach
The client was onboarded as an early adopter of Clean Rooms through our Databricks partners. Rearc worked with client stakeholders to identify immediate scenarios to implement on Clean Rooms for the two use cases. Our engineers implemented the scenarios using Rearc as a placeholder collaborator so they could iterate rapidly and gather feedback.
Solution
Our team implemented Databricks Workflows to manage the lifecycle of Clean Room asset configuration, including sharing data and notebooks within the Clean Room. For each use case, we created notebooks containing the analytic instructions executed in the Clean Room.
One notebook joined data from two parties to produce aggregate reports and insights not obtainable from either data set alone. Another notebook, shown below, served as a pre-sales data product validator, allowing a potential customer to verify that the data being purchased was not already contained in their own data sets.
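The pre-sales validation idea can be illustrated with a minimal sketch. It is not the client's notebook: the function names, the salted-hash matching, and the report fields are all assumptions made for illustration. It compares record keys from a data product against a customer's own keys and reports only aggregate overlap counts, so neither side sees the other's raw values.

```python
import hashlib


def hash_key(value: str, salt: str) -> str:
    """Normalize a key and hash it so raw values are never compared directly."""
    normalized = value.strip().lower()
    return hashlib.sha256((salt + normalized).encode("utf-8")).hexdigest()


def overlap_report(product_keys, customer_keys, salt="demo-salt"):
    """Return aggregate overlap counts between two key collections.

    Hypothetical helper: the salt and the report fields are assumptions,
    not part of any Databricks API.
    """
    product = {hash_key(k, salt) for k in product_keys}
    customer = {hash_key(k, salt) for k in customer_keys}
    shared = product & customer
    return {
        "product_records": len(product),
        "already_owned": len(shared),
        "new_to_customer": len(product - customer),
        "overlap_ratio": len(shared) / len(product) if product else 0.0,
    }


if __name__ == "__main__":
    report = overlap_report(
        ["a@x.com", "b@x.com", "c@x.com", "d@x.com"],
        ["B@x.com ", "z@x.com"],
    )
    print(report)
```

A low overlap ratio suggests that most of the product would be new to the customer. In a real Clean Room the same comparison would run over shared tables rather than in-memory lists.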
Data collaboration requires careful attention to the security and privacy of the data involved. Rearc developed a privacy check framework to validate data being shared into a Clean Room, as well as the outputs produced by Clean Room execution. The framework is flexible, allowing the client to plug in both open-source privacy tools and proprietary technology. It supports the data collaboration use cases by protecting the data privacy of all partners.
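One way to picture a pluggable privacy check framework is as a list of independent check functions run over a data set, each returning human-readable findings. The sketch below is an assumption about the general shape only. The check names (`min_group_size`, `forbidden_columns`) and the specific rules are hypothetical and are not taken from the client's framework.

```python
from collections import Counter
from typing import Callable, Dict, List

Row = Dict[str, object]
Check = Callable[[List[Row]], List[str]]


def min_group_size(column: str, k: int) -> Check:
    """Build a check that flags groups with fewer than k rows (hypothetical rule)."""
    def check(rows: List[Row]) -> List[str]:
        counts = Counter(row[column] for row in rows)
        return [
            f"group {group!r} in {column!r} has {n} rows, fewer than {k}"
            for group, n in sorted(counts.items(), key=lambda item: str(item[0]))
            if n < k
        ]
    return check


def forbidden_columns(names: List[str]) -> Check:
    """Build a check that flags columns that must never be shared (hypothetical rule)."""
    def check(rows: List[Row]) -> List[str]:
        present = sorted({col for row in rows for col in row} & set(names))
        return [f"column {col!r} must not be shared" for col in present]
    return check


def run_checks(rows: List[Row], checks: List[Check]) -> List[str]:
    """Run every check and collect findings; an empty list means all checks passed."""
    findings: List[str] = []
    for check in checks:
        findings.extend(check(rows))
    return findings


if __name__ == "__main__":
    rows = [
        {"region": "east", "amount": 10},
        {"region": "east", "amount": 20},
        {"region": "west", "amount": 30},
    ]
    for finding in run_checks(rows, [min_group_size("region", 2), forbidden_columns(["email"])]):
        print(finding)
```

Because each check is just a callable, a wrapper around an open-source privacy library or a proprietary tool could be added to the list without changing `run_checks`. The same runner can be applied to inputs before sharing and to outputs after execution.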
To address custom auditing needs, we implemented a logging framework that makes output from the Clean Room available for review. It records critical information from the notebook executing within the Clean Room compute environment, including privacy metrics and other details about the technologies used within the Clean Room.
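As an illustration of what such an audit record might contain, the sketch below builds a structured JSON log line with a timestamp, a run identifier, and a dictionary of privacy metrics. The field names and helper functions are assumptions. How records actually leave the Clean Room is not described here and is not modeled.

```python
import json
import logging
from datetime import datetime, timezone


def build_audit_record(run_id: str, notebook: str, metrics: dict) -> dict:
    """Assemble one audit record; all field names are hypothetical."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "run_id": run_id,
        "notebook": notebook,
        "metrics": metrics,
    }


def emit_audit_record(logger: logging.Logger, record: dict) -> str:
    """Serialize the record as one JSON line, log it, and return the line."""
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO, format="%(message)s")
    audit_logger = logging.getLogger("clean_room_audit")
    record = build_audit_record(
        run_id="run-001",
        notebook="pre_sales_validator",
        metrics={"min_group_size": 12, "checks_failed": 0},
    )
    emit_audit_record(audit_logger, record)
```

Emitting one JSON object per line keeps the records easy to parse and filter in whatever audit tooling consumes them.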
Outcome
The client quickly enabled new data collaboration capabilities through Databricks Clean Rooms and demonstrated their value through the scenarios implemented with Rearc. These efforts allowed the client to move forward confidently with bringing additional data collaboration use cases to market.