What are some things the Tintri file system needs to intrinsically understand? Virtual Machines and the files that comprise them.
Rob Girard explains:
They taught the file system that a Virtual Machine (VM) is described by a configuration file (a .vmx file in the case of VMware’s ESX/ESXi/vSphere). This file gives more information about the VM, such as how many virtual disks it has and where they live (VMNAME.vmdk and VMNAME-flat.vmdk). These vDisks may have snapshots (VMNAME-0000000001.vmdk and VMNAME-0000000001-delta.vmdk) as well as vSwap files, which are used when host memory pressure forces contents to spill over to storage.
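To make the idea of "files that comprise a VM" concrete, here is a minimal sketch that groups a flat directory listing into per-VM objects using only the naming patterns mentioned above. It is an illustration, not Tintri's implementation: the function names (`classify`, `group_by_vm`), the role labels, and the assumption that snapshot suffixes are at least six digits long are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical patterns built from the file names quoted above.
# Order matters: the most specific patterns are tried first.
PATTERNS = [
    (re.compile(r"^(?P<vm>.+)-\d{6,}-delta\.vmdk$"), "snapshot_data"),
    (re.compile(r"^(?P<vm>.+)-\d{6,}\.vmdk$"), "snapshot_descriptor"),
    (re.compile(r"^(?P<vm>.+)-flat\.vmdk$"), "disk_data"),
    (re.compile(r"^(?P<vm>.+)\.vmdk$"), "disk_descriptor"),
    (re.compile(r"^(?P<vm>.+)\.vmx$"), "config"),
]


def classify(filename):
    """Return (vm_name, role) for a recognised file, or None otherwise."""
    for pattern, role in PATTERNS:
        match = pattern.match(filename)
        if match:
            return match.group("vm"), role
    return None


def group_by_vm(filenames):
    """Group file names into {vm: {role: [files]}} plus a list of unknowns."""
    vms = defaultdict(lambda: defaultdict(list))
    unknown = []
    for name in filenames:
        result = classify(name)
        if result is None:
            unknown.append(name)
        else:
            vm, role = result
            vms[vm][role].append(name)
    return {vm: dict(roles) for vm, roles in vms.items()}, unknown


if __name__ == "__main__":
    listing = [
        "web.vmx", "web.vmdk", "web-flat.vmdk",
        "web-0000000001.vmdk", "web-0000000001-delta.vmdk",
        "db.vmx", "notes.txt",
    ]
    grouped, unknown = group_by_vm(listing)
    print(grouped)
    print(unknown)
```

A real system would read the .vmx file to learn which disks belong to the VM rather than rely on names alone; name matching is used here only to keep the sketch short.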
They also taught the file system about certain “objects”. Later, they taught it about more objects to support additional hypervisor integrations, such as the equivalent files for Hyper-V, Red Hat Enterprise Virtualization, XenServer, and OpenStack.
This was further enhanced by teaching the VMstore yet another new trick: storing, serving, protecting, and cloning objects beyond VMs, including SQL Server database files such as .mdf, .ldf, and .ndf (and a few more that we caught along the way in our QA processes).
Because this was done in an extensible way, the same process can be repeated programmatically for future object types, with minimal work relative to the initial effort.
The result is a file system that understands what a VM is, instead of serving a set of seemingly random read and write calls at various offsets whose contents and relevance are known only to the owner of the file system (the vSphere hosts, in the case of VMFS) and not to the underlying storage system.
What else? All the higher-level objects need their components (files) to be captured at any point in time, and these need to be completely consistent with one another, preserving write order. These, of course, are what we refer to as “snapshots”, specifically per-VM snapshots. These snapshots can be used to roll back to a point in time for data recovery. They can be replicated for off-site protection, and the replication itself is another intelligent design that not only de-duplicates on the wire to save precious WAN bandwidth but also replicates at per-VM granularity. This yields more useful, 100% complete replica VMs, rather than a big LUN that might be only 90% complete by the time it’s needed (which equals 0% from a functional perspective).
Also: you can create new objects through intelligent cloning, which means you can instantly create more VMs (or databases) without any performance overhead. Not only can VMstore create them, but it knows what to do with them, automatically adding them to the inventory of specified hosts! These are all space-efficient, since they don’t contain the churn of any data other than the exact data set you want: the data that belongs to specific VMs.
They also taught the file system about the concept of hosts and clusters, and how to poll them to automatically discover which of the hosted files belong to particular VMs or, in the case of cloning, how to speak to these hypervisors in their native language (APIs) to add the newly created VMs to inventory.
Then there is what the file system knows about the ground rules of performance. All VMs (and databases) need to play nicely with one another. This is where things got really interesting. They introduced machine learning (ML) to discover the performance characteristics of VMs, and then assumed that each VM would continue to behave this way for the near future. So, performance capacity was reserved to prevent a VM from being impacted by new workloads, or by an existing workload whose characteristics have strayed from the demonstrated norm.
Under the covers, the file system is constantly tweaking and tuning how to prioritise I/O for each VM based upon a lot of rules: some are constants, others are variable based on how other VMs are behaving. These rules for recognising that I/O for each VM is independent from all other I/O are the foundation of Tintri’s patented auto-QoS engine.
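The reservation idea described above can be sketched in a few lines. This is not Tintri's auto-QoS engine, whose rules are not given here. It swaps in a plain exponential moving average for the learned "norm" and a simple two-step allocator. The names `update_norm` and `allocate`, the `alpha` smoothing factor, and the use of IOPS as the unit are all assumptions made for illustration. The sketch also assumes the sum of reserved norms does not exceed total capacity.

```python
def update_norm(norm, observed, alpha=0.2):
    """Exponential moving average standing in for a VM's learned 'norm'."""
    if norm is None:
        return observed
    return (1 - alpha) * norm + alpha * observed


def allocate(capacity, norms, demands):
    """Grant each VM up to its reserved norm first, then share what is left.

    capacity: total performance available (hypothetical unit: IOPS)
    norms:    {vm: reserved level learned from past behaviour}
    demands:  {vm: what the VM is asking for right now}
    Assumes sum(norms.values()) <= capacity.
    """
    # Step 1: honour reservations. A VM with no history has no reservation.
    grants = {vm: min(demands[vm], norms.get(vm, 0.0)) for vm in demands}
    leftover = capacity - sum(grants.values())

    # Step 2: split leftover capacity among VMs asking for more than their
    # norm, in proportion to how much extra each one wants.
    excess = {vm: demands[vm] - grants[vm] for vm in demands}
    total_excess = sum(excess.values())
    if leftover > 0 and total_excess > 0:
        share = min(1.0, leftover / total_excess)
        for vm in grants:
            grants[vm] += excess[vm] * share
    return grants


if __name__ == "__main__":
    norms = {"a": 300.0, "b": 200.0}
    demands = {"a": 300.0, "b": 200.0, "new": 2000.0}
    print(allocate(1000.0, norms, demands))
```

In this example the new, noisy workload can only consume capacity that the established VMs are not using, so VMs "a" and "b" keep receiving their demonstrated norm.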
Rob Girard – Putting the “Intelligent” in “Intelligent Infrastructure” Tintri Blog 1st April 2020