Rubrik Alta
Today, Rubrik officially announced version 4.2 of its Cloud Data Management platform, code named Alta. As a new member of the Rangers team focused on public cloud architecture, I can honestly say this is one of the most exciting releases I have seen to date. Alta is going to accelerate the journey to the cloud for Rubrik customers, while continuing to provide enterprise class reliability and unmatched ease of use.
Perspective from the field
As a Cloud Solutions Architect I talk to a lot of customers about public cloud platform capabilities, consumption models, migration patterns, etc. Historically, I've dealt predominantly with infrastructure professionals. Lately, I've seen more and more conversations with developers and application owners at the table as well, which is great! While it's completely subjective on my part, below are some prevalent themes I've noticed. This is by no means an exhaustive list, but it certainly sets the tone for our operating environment today.
Everybody is moving towards public cloud in some way shape or form
It doesn't matter if you are a two person startup all in on public cloud or a legacy IT shop moving towards something like archive or test/dev in the cloud. The industry seems to be recognizing that these platforms offer a new financial consumption model, as well as unparalleled scalability and agility. Bottom line, regardless of who you are or what you do, public cloud should be a tool in your kitbag.
Infra pros and devs want to minimize complexity, but they want the power to create and customize
Developers and infrastructure pros are sick of complexity hindering innovation. They want products and services that are quick and easy to implement and that drive significant value out of the box. At the same time, they want extensibility via open and easy to use APIs.
Keep it native! But... don't lock me in.
There are a few conflicting objectives in the statement above that I think really drive a great cloud product/services strategy. Let's unpack it quickly:
- If you put something into or interact with the public cloud, do so in a way native to the platform whenever possible. Don't make me consume cloud solely through your product or service.
- At the same time, make your product platform agnostic so that I can change cloud providers or even go back on premises if I want.
- Bring something meaningful to the platform beyond its native capabilities whenever possible.
- Do all this without making the solution so expensive or complicated that it's prohibitive.
These simple design principles are really the core underpinnings of a great cloud integration, in my opinion. It's really hard to strike an appropriate balance between the asks, especially at scale and over time.
Automation, infrastructure as code, ITSM, logging, and other ecosystem integrations are expected
This is especially true when talking to developers or infrastructure pros who have a true DevOps focus. The reaction has gone from "oh! you have [Chef Cookbooks|Ansible Modules|vRA Plugins|Splunk Integration|PowerShell Modules|ServiceNow Plugins|...] - neat!" to "You don't have that integration? No thanks."
Enter Rubrik Alta
Rubrik Alta (4.2) allows us to offer several value propositions to our customers that, in my opinion, align well with the market trends identified above.
It's now easier to build your secure private cloud.
Building upon our logical multitenancy releases and security tools, we’ve delivered features like vCloud Director integration, Envoy (support for advanced network topologies), and enhanced notifications that make it easier to share resources, offer backup-as-a-service, and run your data center like a public cloud provider – whether you’re a managed service provider or a cloud-aware enterprise.
It's now easier to protect public cloud applications
As large enterprises start to build cloud-native applications, Rubrik is ready to offer cloud-native backup of AWS EC2 instances. This gives enterprises the choice to protect AWS applications with cloud-native tools while still leveraging the same SLA-driven methodology customers have grown to know and love.
Extending deeper into the enterprise data center
Public cloud integrations are in high demand, but that doesn't mean that traditional tier 1 workloads are going away anytime soon. We are committed to providing the same powerful UX and simple interface for traditional enterprise workloads. We now protect AIX and Oracle Solaris.
Everything under a unified, storage-agnostic interface
All of this is built into the core product, with simplicity, automation, and a scale-out architecture in mind. There is no separate cloud-only software, there is no parallel media server architecture, and there are no heavyweight agents. Whether you’re building on AWS or building on Solaris, you get the same Rubrik experience.
AWS Native Protection
Native functionality and current gaps
Without going into exhaustive detail, let's do a quick overview of what EBS volumes are and what sort of protection AWS offers natively today.
A virtual machine in AWS is known as an EC2 (Elastic Compute Cloud) instance. These instances are backed by either an Instance Store root volume, or an EBS (Elastic Block Store) root volume and can have additional EBS volumes attached to mount points as needed.
EBS volumes are what I see most commonly utilized in the field because, amongst other benefits, they can persist after an EC2 instance fails or is terminated. Instance store volumes are ephemeral in the sense that instance failure or termination will cause them to be deleted. EBS volumes are hardware redundant within an availability zone, which offers some protection against hardware failure, but no protection against malicious or accidental data modifications or deletion. To learn more about EBS, go to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/AmazonEBS.html.
To protect against the latter, many folks use periodic, point in time EBS snapshots. After the first full snapshot, all subsequent snapshots are incremental only. All snapshots are stored in a hidden S3 bucket that is available for use in that Region. To learn more about EBS snapshots, go to https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/EBSSnapshots.html.
Sounds great, right? Not so fast. While users can initiate and delete snapshots using the AWS console or the EC2 APIs, lifecycle management is not a part of the native service:
- EBS snapshots cannot be scheduled from the AWS console. Typical workarounds include cron jobs that kick off a script to create a snapshot, or Lambda functions that create snapshots based on a CloudWatch metric.
- There is no native policy engine for automatically expiring and deleting snapshots based on a schedule or a retention policy. Typical workarounds include cron jobs that kick off a script to delete snapshots, or Lambda functions that delete snapshots based on a CloudWatch metric.
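To give a feel for what those do-it-yourself workarounds look like, here is a minimal sketch of the kind of script a cron job or Lambda function might run. It is not anything Rubrik ships. The function names and the `retention_days` parameter are made up for illustration, and `ec2` is assumed to be a boto3 EC2 client (for example `boto3.client("ec2")`) that the caller passes in.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_expire(snapshots, retention_days, now=None):
    """Return the SnapshotIds whose StartTime is older than the retention window."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    return [s["SnapshotId"] for s in snapshots if s["StartTime"] < cutoff]


def run_lifecycle(ec2, volume_id, retention_days, now=None):
    """Take one new snapshot of volume_id, then delete the expired ones.

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    created = ec2.create_snapshot(VolumeId=volume_id, Description="scheduled backup")

    # Note: describe_snapshots is paginated in practice; a real script
    # would follow NextToken or use a paginator.
    existing = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )["Snapshots"]

    expired = snapshots_to_expire(existing, retention_days, now)
    for snapshot_id in expired:
        ec2.delete_snapshot(SnapshotId=snapshot_id)
    return created["SnapshotId"], expired
```

Even this toy version leaves out scheduling, error handling, multi-account and multi-Region support, and reporting, which is exactly the glue that users end up maintaining themselves.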
Use Cases for Rubrik native EC2 protection
- Automate Manual Processes - Leverage the Rubrik SLA Domain model to automate snapshot lifecycle management instead of having users maintain their own scripts and tools.
- Rapid Recovery from Failures - Eliminate error-prone and time-consuming manual procedures by using the Rubrik dashboard to recover an instance in a few simple clicks.
- Replicate EC2 Instances in other Regions - With a few clicks in Rubrik, use a snapshot to instantiate a replica of an instance in the same AZ or export a snapshot to any Region.
- Simplify Single File Browsing, Search and Download - Eliminate time-consuming manual procedures by using the Rubrik dashboard to browse, search or download a file from any snapshot in a few simple clicks.
- Consolidate Data Management - Use one solution to manage data protection across on-premises and cloud environments.
Rubrik Integration with EC2 Instance Protection
Rubrik CDM 4.2 integrates its internal snapshot service with the relevant Amazon EC2 APIs. Specifically, Rubrik leverages the create-image API to snapshot EBS volumes and to create Amazon Machine Images (AMI) from those snapshots. An AMI is a saved template of an EC2 instance that can be created from a snapshot. Users can launch multiple instances using the same AMI. An AMI includes the following:
- Template for the root volume for the instance
- Launch permissions that control which AWS accounts can use the AMI to launch instances
- A block device mapping that specifies the volumes to attach to the instance when it's launched
- The EC2 billingProduct code, which verifies the subscription/licensing status of the guest operating system
Rubrik does not need to ingest any data, allowing management to be done from either a Cloud Cluster instance or from an on-premises Rubrik instance. Discovery of EC2 instances is conducted from a Rubrik cluster using the EC2 API, which populates the new “Cloud Workloads” section. Users can specify which AWS accounts and Regions to manage.
Discovered EC2 instances can be associated with an SLA Domain where the retention period and backup window can be specified. Note that replication and archival policies are not applicable and therefore not available for configuration.
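For the curious, the discovery step can be approximated with the public EC2 API. The sketch below is my own illustration, not Rubrik's code. `discover_instances` is a hypothetical helper, and `ec2_clients` is assumed to be a dict mapping Region names to boto3 EC2 clients (one per Region you want to manage).

```python
def discover_instances(ec2_clients):
    """Return {region: [instance_id, ...]} for every client passed in.

    `ec2_clients` is assumed to map Region names to boto3-style EC2 clients.
    """
    inventory = {}
    for region, ec2 in ec2_clients.items():
        instance_ids = []
        # describe_instances returns results in pages, grouped by reservation.
        paginator = ec2.get_paginator("describe_instances")
        for page in paginator.paginate():
            for reservation in page["Reservations"]:
                for instance in reservation["Instances"]:
                    instance_ids.append(instance["InstanceId"])
        inventory[region] = instance_ids
    return inventory
```

With real credentials, the input could be built with something like `{r: boto3.client("ec2", region_name=r) for r in ["us-west-2", "us-east-1"]}`.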
The Nitty Gritty
Backups
To provide a user experience that is uniform with the experience that customers have when they back up their on-premises virtual infrastructure, Rubrik CDM 4.2 uses Amazon’s create-image API to initiate EC2 instance protection using EBS snapshots.
Using the create-image API with the appropriate switches enables the following:
- Rubrik always snapshots the root EBS volume of an instance and any other attached EBS volumes that have not been explicitly excluded
- A new AMI is created with every snapshot
- Rubrik uses the “&NoReboot=true” option so an instance is not shut down before a snapshot is taken
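The three behaviors above map fairly directly onto the parameters of the public create-image call. Here is a rough sketch of an equivalent request, assuming `ec2` is a boto3 EC2 client. `backup_instance` and `excluded_devices` are names I made up for illustration, and this is not a view into how Rubrik actually builds its requests.

```python
def backup_instance(ec2, instance_id, name, excluded_devices=()):
    """Create an AMI (and the EBS snapshots behind it) without a reboot.

    `ec2` is assumed to behave like a boto3 EC2 client.
    `excluded_devices` lists device names (e.g. "/dev/sdf") to leave out.
    """
    # An empty NoDevice string omits that device from the image's
    # block device mapping, so its volume is not snapshotted.
    mappings = [{"DeviceName": dev, "NoDevice": ""} for dev in excluded_devices]

    kwargs = {
        "InstanceId": instance_id,
        "Name": name,
        "NoReboot": True,  # do not shut the instance down first
    }
    if mappings:
        kwargs["BlockDeviceMappings"] = mappings

    return ec2.create_image(**kwargs)["ImageId"]
```

One trade-off worth knowing: AWS documents that with the no-reboot option it cannot guarantee the file system integrity of the created image, so the result is best thought of as crash-consistent.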
Instance Restores
Rubrik always restores the full EC2 instance, including the root volume and any other attached volumes that were included in the snapshot. Rubrik automates the following steps that are required for restoring an EC2 instance:
- Create new EBS volumes from a specified snapshot
- Stop the associated instance
- Detach the old volumes
- Attach the newly created volumes
- Restart the associated instance
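Done by hand, those five steps look roughly like the sketch below for a single volume. It assumes `ec2` is a boto3 EC2 client and that the caller already knows the snapshot, the old volume, the device name, and the Availability Zone. `restore_volume_in_place` is a hypothetical name, and Rubrik's own orchestration (multiple volumes, error handling, cleanup) is certainly more involved.

```python
def restore_volume_in_place(ec2, instance_id, snapshot_id, old_volume_id,
                            device, availability_zone):
    """Swap one attached volume for a new volume built from a snapshot.

    `ec2` is assumed to behave like a boto3 EC2 client.
    """
    # 1. Create a new EBS volume from the snapshot (same AZ as the instance).
    new_volume_id = ec2.create_volume(
        SnapshotId=snapshot_id, AvailabilityZone=availability_zone
    )["VolumeId"]
    ec2.get_waiter("volume_available").wait(VolumeIds=[new_volume_id])

    # 2. Stop the instance and wait until it is fully stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    # 3. Detach the old volume and wait for it to become available.
    ec2.detach_volume(VolumeId=old_volume_id, InstanceId=instance_id)
    ec2.get_waiter("volume_available").wait(VolumeIds=[old_volume_id])

    # 4. Attach the new volume at the same device name.
    ec2.attach_volume(VolumeId=new_volume_id, InstanceId=instance_id, Device=device)

    # 5. Start the instance again.
    ec2.start_instances(InstanceIds=[instance_id])
    return new_volume_id
```

The old volume is left in place in this sketch; whether to keep or delete it afterwards is a policy decision.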
Launch New Instance from AMI
Users can launch a new instance from an AMI in any Availability Zone within the same Region. For example, an AMI created from an instance in us-west-1a can be used to launch a new instance in us-west-1c. Users can also leverage Rubrik to export a snapshot/AMI to a different Region and launch a new instance in that Region. For example, Rubrik can copy an AMI created in us-west-2 to us-east-1 and launch a new instance from the copied AMI in any Availability Zone in us-east-1.
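The cross-Region case boils down to a copy followed by a launch. Here is a minimal sketch using the public EC2 API, assuming `dest_ec2` is a boto3 EC2 client created for the destination Region. The function name and its parameters are hypothetical and only there to illustrate the flow.

```python
def launch_in_other_region(dest_ec2, source_image_id, source_region,
                           name, instance_type, availability_zone):
    """Copy an AMI into the destination Region and launch one instance from it.

    `dest_ec2` is assumed to behave like a boto3 EC2 client bound to the
    destination Region (copy_image is always called on the destination side).
    """
    copied_id = dest_ec2.copy_image(
        Name=name,
        SourceImageId=source_image_id,
        SourceRegion=source_region,
    )["ImageId"]

    # The copy is asynchronous, so wait until the new AMI is usable.
    dest_ec2.get_waiter("image_available").wait(ImageIds=[copied_id])

    response = dest_ec2.run_instances(
        ImageId=copied_id,
        InstanceType=instance_type,
        MinCount=1,
        MaxCount=1,
        Placement={"AvailabilityZone": availability_zone},
    )
    return copied_id, response["Instances"][0]["InstanceId"]
```

A call might look like `launch_in_other_region(boto3.client("ec2", region_name="us-east-1"), "ami-...", "us-west-2", "restored-app", "t2.micro", "us-east-1a")`, where the AMI ID and instance type are placeholders.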
File Level Recovery
Natively, AWS users can only restore a full EBS volume and not individual files in a volume. To recover a specific file, users would have to manually restore the full volume, attach it to an instance, find the file, and copy it to the original volume. Rubrik enhances the user experience for file-level recovery by enabling users to browse or to search for any protected version of a file. Users can then download that file for recovery purposes, without having to manually perform a full volume restore.
Under the hood, Rubrik automatically launches a Bolt instance after a new snapshot is taken. The Bolt instance is a single node Rubrik compute instance that indexes the files in the snapshot. This is done by having the Bolt instance mount the new volume that is created from the snapshot. The Bolt instance then indexes all the files in the volume and records the metadata in the Rubrik cluster. The Bolt instance and attached volume are deleted after the indexing is complete.
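Conceptually, indexing a mounted volume is a file system walk that records metadata about each file so it can be searched later. The sketch below is purely illustrative and says nothing about how Bolt is really implemented. `index_mounted_volume`, `search_index`, and the fields recorded are my own assumptions.

```python
import os


def index_mounted_volume(mount_point):
    """Walk a mounted volume and collect basic metadata for every file."""
    entries = []
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            info = os.lstat(full_path)  # lstat: do not follow symlinks
            entries.append({
                "path": os.path.relpath(full_path, mount_point),
                "size": info.st_size,
                "mtime": info.st_mtime,
            })
    return entries


def search_index(entries, term):
    """Return paths whose file name contains `term` (case-insensitive)."""
    term = term.lower()
    return [e["path"] for e in entries
            if term in os.path.basename(e["path"]).lower()]
```

Because only metadata is stored, the index stays small relative to the snapshot, and a search never needs the volume to be mounted again.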
Final Thoughts
Alta continues to demonstrate tangible progress towards Rubrik's commitment to delivering consistent, simple, hybrid cloud data management at scale, all in a manner commensurate with the market's demand to leverage native platform functionality while avoiding platform and vendor lock-in. There's still tons of room for us to continue innovating and revolutionizing the market, but 4.2 is a damn strong step in the right direction.
Thanks for reading, and have a great day. As always, I welcome your feedback @vDingus.