Given the critical role data plays in helping companies achieve growth and competitive advantage, it’s no wonder that companies have been migrating their analytical workloads to the cloud. The cloud not only relieves companies of the capital cost required to operate their own data centers, but also offers the flexibility to dynamically adjust compute resources based on demand. Public cloud vendors also offer a wide variety of cutting-edge analytics services that address a number of use cases.
This is essentially the recipe that enterprises have been following to transform themselves digitally. However, when companies migrate their storage and compute workloads to the cloud, they quickly realize that their processes for authorizing access to data require transformation as well, because the manual processes they relied on earlier cannot keep pace with business in the cloud.
This lack of automation impacts companies in two key areas:
- Places a heavy operational burden on the IT organization: Traditionally, this over-tasked team has been responsible for managing access requests from business users. The task becomes even more complex as the number of cloud services grows, because each service has its own mechanism for defining, administering, and enforcing access control policies. In the absence of a solution that unifies and automates the authorization process, IT gets inundated with access requests that it lacks the context and criteria to act on.
- Negatively impacts decision-making capabilities: This is ironic, because one of the major reasons businesses move to the cloud is to increase agility through improved decision making, which in turn depends on rapid access to data. A number of utilities can help a company migrate its data to cloud services; making that data accessible to data consumers such as data scientists, business analysts, and line-of-business personnel is a different story. Until strong data access governance policies are in place for each of the cloud storage and compute services, a company cannot onboard users to those services. This frustrates data scientists and analysts, who are forced to spend a significant portion of their time looking for and collecting data sets, each of which requires permission from the data owner. The longer it takes to get access to data for exploratory analysis and hypothesis testing, the more decisions are delayed and the more the productivity of data scientists and analysts suffers.
Achieving Governed Access Is Harder Than It Looks
To be fair to the platform teams that are building the cloud infrastructure in companies, providing users with access to data at the right granularity is no trivial matter. In the absence of automated data access governance solutions that can administer user access at the database, bucket, table, row, or column level, the best that a data platform team can hope for is to control access exclusively by business role. Yet the organizational structure of most modern enterprises is too complex for coarse access control to be sufficient. For example, you might want to give access to a table in a database only to users with a manager designation rather than to the entire finance organization, or you might want to give access to data to the majority of the users in a department but exclude specific users until they complete their probationary period.
These are just a few of the many scenarios that leave data platform teams without a scalable way to manage access to each data set for custom groups or individual users. This may be why some teams have gone to the extreme of providing each user with their own instance of a data science platform, an approach that is neither practical nor scalable, not to mention extremely expensive in the long run.
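To make the manager-only and probationary-period scenarios concrete, here is a minimal sketch of attribute-based access checks in Python. The `User` and `Policy` classes, the resource names, and the attribute names are all hypothetical illustrations, not the API of any particular governance product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class User:
    name: str
    department: str
    designation: str
    on_probation: bool = False


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy: grants a department access to one resource,
    optionally narrowed by designation or probation status."""
    resource: str
    department: str
    required_designation: Optional[str] = None
    exclude_probation: bool = False

    def allows(self, user: User) -> bool:
        if user.department != self.department:
            return False
        if self.required_designation and user.designation != self.required_designation:
            return False
        if self.exclude_probation and user.on_probation:
            return False
        return True


def is_allowed(user: User, resource: str, policies) -> bool:
    # Access is granted if any policy for this resource allows the user.
    return any(p.resource == resource and p.allows(user) for p in policies)


# Illustrative policies (assumed names): salaries are manager-only,
# invoices are open to finance except users still on probation.
POLICIES = [
    Policy("finance.salaries", "finance", required_designation="manager"),
    Policy("finance.invoices", "finance", exclude_probation=True),
]

maria = User("maria", "finance", "manager")
sam = User("sam", "finance", "analyst")
lee = User("lee", "finance", "analyst", on_probation=True)
```

A role-only model would have to treat `maria`, `sam`, and `lee` identically because all three sit in finance; evaluating attributes is what lets the two policies tell them apart.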
The winners and losers in the data economy will be determined by their ability to share data securely, both internally and externally with business partners. According to Gartner, by 2023, organizations that promote data sharing will outperform their peers on most business value metrics. However, CIOs and CDOs struggle to implement data sharing in their organizations because inflexible data governance frameworks and policies often act as barriers to sharing data with business partners. Customers repeatedly tell us that they consider implementing an access governance platform because their data platform teams did not realize how difficult it is to share the data that a partner needs.
Another reason enterprises seek a unified data access governance solution is to gain comprehensive visibility into their data ecosystem through centralized reporting. Traditionally, companies have struggled to trace user activity and have been forced to depend on workarounds such as comparing timestamps to determine which user accessed which cloud instance. Companies need airtight regulatory reporting and auditing for compliance or forensics purposes, yet IT has difficulty accessing, aggregating, querying, and summarizing logs from disparate cloud data platforms to get a complete picture of access activity.
This presents a major risk factor for regulatory compliance, as companies should be able to prove to auditors and regulators, at any time, which user accessed what data, when, and under which policy access was granted or denied. Automating data access through a single user interface brings together users’ activity from different cloud services in a common format and is a critical component in enabling both internal and external audits.
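As a rough illustration of what "a common format" can mean in practice, the Python sketch below maps log records from two services into one audit record that answers who, what, when, and under which policy. The two input record shapes and their field names are invented for this example and do not reflect the log schema of any real cloud service.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AccessEvent:
    """Common audit record: who accessed what, when, and under which policy."""
    user: str
    resource: str
    timestamp: datetime
    allowed: bool
    policy_id: str
    service: str


def from_warehouse(rec: dict) -> AccessEvent:
    # Assumed shape: epoch seconds and an ALLOW/DENY decision string.
    return AccessEvent(
        user=rec["principal"],
        resource=rec["table"],
        timestamp=datetime.fromtimestamp(rec["epoch_s"], tz=timezone.utc),
        allowed=rec["decision"] == "ALLOW",
        policy_id=rec["policy"],
        service="warehouse",
    )


def from_object_store(rec: dict) -> AccessEvent:
    # Assumed shape: ISO 8601 time with offset and an HTTP-style status code.
    return AccessEvent(
        user=rec["user_name"],
        resource=rec["bucket"] + "/" + rec["key"],
        timestamp=datetime.fromisoformat(rec["time"]),
        allowed=rec["status"] == 200,
        policy_id=rec["matched_policy"],
        service="object_store",
    )


def accesses_by_user(events, user: str):
    # One chronological trail per user, regardless of the source service.
    return sorted((e for e in events if e.user == user), key=lambda e: e.timestamp)


EVENTS = [
    from_warehouse({"principal": "alice", "table": "sales.orders",
                    "epoch_s": 1700000000, "decision": "ALLOW", "policy": "p-12"}),
    from_object_store({"user_name": "alice", "bucket": "raw-data",
                       "key": "2023/orders.csv", "time": "2023-11-14T22:00:00+00:00",
                       "status": 403, "matched_policy": "p-40"}),
    from_warehouse({"principal": "bob", "table": "sales.orders",
                    "epoch_s": 1700000500, "decision": "DENY", "policy": "p-12"}),
]
```

Once every service is normalized this way, a question such as "what did alice touch, and which policy decided it?" becomes a single call to `accesses_by_user(EVENTS, "alice")` rather than a timestamp-matching exercise across separate logs.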
Why Modern Data Management Requires Unified Data Access Governance
Enterprises need unified data access governance to provide consistent data access control, governed data sharing, and strong compliance capability across a modern data management landscape that is spread over multiple cloud services. Because fine-grained access control lets each data resource be protected based on user roles, attributes, groups, or classifications, these platforms enable data with different access requirements to coexist in the same storage or analytics platform. These solutions empower data administrators to create role-, attribute-, and tag-based policies that control data access at the file, row, and column level, as well as to implement dynamic data masking and filtering.
This allows data scientists and analysts to query data rapidly while still protecting against unauthorized access. It also enables data consumers to quickly acquire the datasets they have permission to use, eliminating the time-consuming process of manually requesting access from each data owner. Giving data consumers access to authorized data, no matter where it resides, lets them deliver high-quality insights and analytics faster, and lets the entire organization capitalize on the advantages of today’s modern data landscape.
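The idea of dynamic masking and row filtering can be sketched in a few lines of Python. This is only an illustration under assumed names: `mask_value`, `apply_policy`, the sample rows, and the region-based filter are hypothetical, and a real platform would enforce the same logic inside the query engine rather than in application code.

```python
def mask_value(value: str) -> str:
    # Hide everything except the last four characters.
    return "*" * max(len(value) - 4, 0) + value[-4:]


def apply_policy(rows, user_attrs, masked_columns, row_filter):
    """Return only the rows the user may see, with sensitive columns masked."""
    visible = [r for r in rows if row_filter(r, user_attrs)]
    return [
        {col: (mask_value(str(val)) if col in masked_columns else val)
         for col, val in r.items()}
        for r in visible
    ]


# Hypothetical data set that mixes regions and a sensitive column.
ROWS = [
    {"customer": "Ana", "region": "EU", "ssn": "123-45-6789"},
    {"customer": "Raj", "region": "US", "ssn": "987-65-4321"},
]


def same_region(row, user_attrs):
    # Row filter: users see only rows from their own region.
    return row["region"] == user_attrs["region"]


analyst_view = apply_policy(ROWS, {"region": "EU"}, {"ssn"}, same_region)
```

The same stored rows serve users with different entitlements: the EU analyst receives one row with the `ssn` column masked, while the underlying data is never copied or split per audience.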
About the Author: Balaji Ganesan is CEO and co-founder of both Privacera, the cloud data governance and security leader, and XA Secure, which was acquired by Hortonworks. He is an Apache Ranger committer and member of its project management committee (PMC).