Becoming a Cloud Data Platform Driven Organization - Part 1

Updated: 2022-06-02


Problem

The Cloud Lakehouse paradigm is gaining traction in the Data and Advanced Analytics domains as a platform that enables development teams to deliver high-value propositions and business outcomes. Ingesting and cleansing structured, semi-structured, and unstructured data from various cloud and on-premises systems leads to the aggregation and collocation of this data within a Data Lake storage system. Because storage is decoupled from compute, vast volumes of data, including big data, can be ingested into these platforms at low storage and maintenance cost, with access that can be securely managed. With data being the lifeblood of decisions and insights for business stakeholders, the Cloud Data Lakehouse Platform supports a variety of ingestion patterns, including batch and near-real-time Extract-Load-Transform (ELT) jobs along with real-time streaming capabilities. As organizations embark on this journey of applying innovation and digitally disrupting their traditional infrastructure and business models to become more data driven, they continue to express an interest in learning more about the benefits of a Cloud Platform.

Solution

There are many benefits to adopting a Cloud Lakehouse footprint on a major cloud provider platform. According to Gartner, the market is dominated by five vendors that typically account for nearly 80% of worldwide cloud market share year over year. These vendors are Amazon (47.8%), Microsoft (15.5%), Alibaba (7.7%), Google (4.0%), and IBM (1.8%). Gartner also predicts that by 2025, $1.8 trillion of enterprise IT spending will be in the public cloud. This potential gives organizations an opportunity to explore the motivations for moving to the cloud and the benefits of becoming a cloud data driven organization. The figure below, from Flexera's 2022 State of the Cloud Report, shows the year-over-year cloud provider adoption rates for all 753 organizations that were interviewed in both 2021 and 2022. In this article, we will begin to explore these cloud adoption motivations by first understanding the challenges a cloud data platform can solve and then looking at how to create a cloud adoption plan.

With careful planning and adoption, a Cloud Data Platform can solve numerous challenges and limitations related to on-premises environments. As organizations embark on their journey into the cloud, they will move through various phases in their analytics maturity as they provide market competitive advantage and business stakeholder value. This modern Cloud Platform will enable advanced predictive and prescriptive analytics by providing developers and engineers with advanced platform capabilities. In this section, we will explore the challenges a Cloud Data Platform can solve and how to go about adopting a cloud platform and managing the cost of cloud ownership on the journey to becoming a Cloud Data Platform organization.

Challenges a Cloud Data Platform Can Solve

The modern Cloud Data Platform is intended to solve several challenges that organizations experience with their current on-premises ecosystems. As customers begin to consider cloud solutions to these challenges, it may be strategically justifiable to leverage one cloud provider (Single Cloud), several cloud solution providers (Multi-Cloud), or a combination of on-premises and cloud platforms (Hybrid Cloud). Both the strategic and tactical solution options are plentiful.

For organizations that are just beginning their foundational Cloud journey, identifying the current on-premises pain points and challenges is a sound first step. These can then be strategically mapped to a Single, Multi, or Hybrid Cloud solution roadmap. As an example, let's look at the table below, which shows an on-premises challenge in one column and the corresponding cloud solution capability in the next column. As this exercise continues, additional columns can be added to capture the specific cloud provider and technology, which helps determine the optimal strategic direction among single, multi, and hybrid cloud.

Organizations frequently outsource their advanced analytics needs to consultancies. Hiring an expensive team of Data Engineers and Data Scientists to build these ML models is costly, and it carries the risk of providing external vendors with potentially sensitive data, which must first be encrypted and securely shared by an internal team of Data Engineers. Together, these factors often lead organizations to evaluate the benefits of building these advanced analytics capabilities in-house. This in-house Cloud Platform would bring with it AI, ML, and real-time analytics services and infrastructure. Additionally, a Cloud Platform brings robust options for sharing data via marketplaces, tools, and exchanges. Its decoupled Apache Spark compute and low-cost storage support various volumes, velocities, and varieties of data. With its out-of-the-box cloud connectors and real-time services, quicker time to insights can be realized to enhance business value and significantly improve staff productivity.

Many on-premises solutions require expensive infrastructure and licenses, while many cloud solutions offer deep pre-purchase discounts and pay-as-you-go options, which are even available at the 'query level'. Cloud platforms often support auto-scaling compute, which further supports cost management and controls. Robust security controls, compliance services, and auto-managed maintenance are available for several cloud services to help with securely managing and governing your cloud platform. Finally, for organizations that are invested in DevOps as part of their continuous integration and delivery (CI/CD) methodology, Infrastructure as Code (IaC) can be used heavily within a Cloud Platform to both manage infrastructure and incrementally promote changes from one environment to another. The list of challenges and solutions presented in this section is by no means exhaustive. Organizations may have a growing list of additional platform questions, which could make their way into a product backlog once the foundational Cloud Data Platform is built out as a Minimum Viable Product (MVP) whose capabilities can be further expanded.
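To make the pre-purchase versus pay-as-you-go trade-off mentioned above more concrete, here is a minimal Python sketch of the arithmetic. Every name and number in it is hypothetical (the PricingPlan class, the rates, and the rule that usage beyond the commitment is billed at the pay-as-you-go rate are all assumptions for illustration), and real provider pricing terms will differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PricingPlan:
    """Hypothetical pricing inputs; none of these figures come from a real provider."""
    pay_as_you_go_rate: float  # cost per compute hour, billed only for hours used
    prepurchase_rate: float    # discounted cost per committed compute hour
    committed_hours: float     # hours paid for up front, whether used or not


def cost_pay_as_you_go(plan: PricingPlan, hours_used: float) -> float:
    """Cost when every hour is billed on demand."""
    return plan.pay_as_you_go_rate * hours_used


def cost_prepurchase(plan: PricingPlan, hours_used: float) -> float:
    """Cost with a commitment; overage is ASSUMED to be billed at the on-demand rate."""
    overage = max(0.0, hours_used - plan.committed_hours)
    committed_cost = plan.prepurchase_rate * plan.committed_hours
    return committed_cost + plan.pay_as_you_go_rate * overage


def break_even_hours(plan: PricingPlan) -> float:
    """Usage above which the commitment is cheaper (assumes prepurchase_rate <= pay_as_you_go_rate)."""
    return plan.prepurchase_rate * plan.committed_hours / plan.pay_as_you_go_rate


# Made-up example: 2.00 per hour on demand, 1.50 per hour for a 400-hour commitment.
example_plan = PricingPlan(pay_as_you_go_rate=2.0, prepurchase_rate=1.5, committed_hours=400.0)
print(break_even_hours(example_plan))
print(cost_pay_as_you_go(example_plan, 200.0), cost_prepurchase(example_plan, 200.0))
```

With these made-up rates, the break-even point is 300 compute hours: a workload that runs 200 hours is cheaper on pay-as-you-go, while one that runs 500 hours is cheaper with the commitment. The same comparison can be repeated per workload when deciding which ones justify a pre-purchase.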

On-Premises Challenge | Cloud Solution
Missed business opportunities from lack of AI and ML Tools | AI, ML, and Real-time Analytics Services & Platforms
Limitations for sharing data efficiently | Out of Box Cloud Data Sharing tools and marketplaces.
Managing huge volumes, velocities, and varieties of data requiring more storage and computing power | Supports low cost, secure, centralized storage in the Lake and offers unlimited big data computing power with Apache Spark
Lack of cloud source connectors and real-time solutions | 100+ pre-built cloud and real-time streaming connectors enable quicker time to insights.
Limited Pre-Pay and Pay as you Go options | Pre-purchase discounts, pay-per-query models; segregation of storage from compute.
Limited Advanced Security Controls | Workspace level security for compute and storage; out of box compliance advisory services
Elasticity of Services & Serverless | Scale compute and storage based on needs; auto scaling and auto-managed maintenance
Lack of Infrastructure as Code | Re-usable code driven templates in a variety of languages including Terraform, ARM, Bicep.
Lack of automated IT infrastructure & software administration services | With Cloud Infrastructure, Software, and Platform as service options, coupled with automated management and maintenance options, staff productivity increases significantly.
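The mapping exercise described earlier, where provider and technology columns are added to the challenge table, could be tracked with a small data structure like the one sketched below in Python. The ChallengeMapping class, the classify_strategy function, and the placeholder values "Provider A" and "Provider B" are hypothetical names introduced for illustration. The classification rule is a simplifying assumption, not a formal definition from any cloud provider.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChallengeMapping:
    """One row of the challenge table plus the two extra columns."""
    challenge: str
    cloud_solution: str
    provider: Optional[str] = None    # extra column: stays None until a provider is chosen
    technology: Optional[str] = None  # extra column: the specific service or tool


def classify_strategy(mappings: list[ChallengeMapping], keep_on_premises: bool) -> str:
    """Suggest a direction from the providers chosen so far (simplified, assumed rule)."""
    providers = {m.provider for m in mappings if m.provider}
    if not providers:
        return "Undecided"
    if keep_on_premises:
        return "Hybrid Cloud"
    return "Single Cloud" if len(providers) == 1 else "Multi-Cloud"


# Placeholder rows; the provider and technology values are not real recommendations.
rows = [
    ChallengeMapping(
        challenge="Limitations for sharing data efficiently",
        cloud_solution="Out of Box Cloud Data Sharing tools and marketplaces",
        provider="Provider A",
        technology="data sharing service",
    ),
    ChallengeMapping(
        challenge="Lack of Infrastructure as Code",
        cloud_solution="Re-usable code driven templates",
        provider="Provider B",
        technology="Terraform",
    ),
]

print(classify_strategy(rows, keep_on_premises=False))
```

Because the two placeholder rows name different providers and nothing stays on-premises, this example leans toward Multi-Cloud. In practice the decision would also weigh cost, skills, and governance, which this sketch deliberately leaves out.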

Cloud Adoption Plan

As organizations agree on their cloud strategy, they will need a cloud adoption plan to guide them on their path to achieving maturity with a Cloud Data Platform. Many of the major Cloud service providers supply a comprehensive plan for Cloud adoption that converts aspirational strategic goals into actionable plans, which guide technical efforts in alignment with the business outcomes.

Most Cloud providers recommend defining your organization's strategy as the first step in the Cloud Adoption Framework. Understand what business outcomes are being sought, what the current limitations are, and how a cloud platform can solve these problems to provide business value. Once these and other motivations are clearly understood, then begin planning for an MVP project which will carefully design and deliver the expected business outcomes. Many of the major Cloud Adoption Frameworks (CAFs) provide detailed plan generators, trackers, templates, checklists and more as you evaluate your strategy and plan for your cloud adoption journey.

Once these strategic and tactical cloud adoption plans are clearly outlined, the CAF provides readiness checklists for environment preparation, including Infrastructure as Code (IaC) templates, naming convention tools, and much more. Many CAFs, including Microsoft Azure's robust framework, provide templates, blueprints, assessments, and checklists for the governance, migration, innovation, management, organization, and security of the Cloud Platform that will be built.
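As a rough illustration of what a naming convention tool does, the Python sketch below builds resource names from a fixed set of components. The build_resource_name function and the type-workload-environment-region-instance pattern are assumptions chosen for this example and are not taken from any specific CAF. Each organization should follow the convention its own framework and provider limits prescribe.

```python
import re


def build_resource_name(resource_type: str, workload: str, environment: str,
                        region: str, instance: int) -> str:
    """Join name components with hyphens using an ASSUMED five-part pattern."""
    parts = [resource_type, workload, environment, region, f"{instance:03d}"]
    cleaned = []
    for part in parts:
        # Lowercase and drop anything that is not a letter or digit.
        token = re.sub(r"[^a-z0-9]", "", part.lower())
        if not token:
            raise ValueError(f"Name component {part!r} is empty after cleaning")
        cleaned.append(token)
    return "-".join(cleaned)


# Hypothetical inputs: a resource group for a lakehouse workload in a dev environment.
print(build_resource_name("rg", "Lakehouse", "dev", "East US", 1))
```

Generating names from one function, instead of typing them by hand, keeps names consistent across environments and makes them easy to reuse inside IaC templates.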


About the author
Ron L'Esteve is a seasoned Data Architect who holds an MBA and MSF. Ron has over 15 years of consulting experience with Microsoft Business Intelligence, data engineering, emerging cloud and big data technologies.
