Qubes: security by virtualization

May 5, 2010

This article was contributed by Koen Vervloesem

The Polish security researcher Joanna Rutkowska specializes in low-level security, including hardware-based attacks, kernel exploits, rootkits, and virtualization malware. Among other things, she has discovered vulnerabilities in the Windows Vista kernel, the Xen hypervisor, and Intel's Trusted Execution Technology (TXT). In 2007 Joanna founded Invisible Things Lab, and her team has since changed strategy: they decided to use the knowledge they have gained in breaking systems to create a new operating system that improves security for users.

Last month, Invisible Things Lab presented the first result of this: it launched an alpha version of a new secure open source operating system, Qubes. The project aims at building a secure operating system for desktop users. The main idea is that different applications are isolated from each other, but without any big impediments to usability. To implement this idea, Qubes uses the isolation capabilities of the Xen hypervisor, together with modern hardware technologies such as Intel VT-d (Virtualization Technology for Directed I/O) and TXT.

Virtualization is the cornerstone of the Qubes security architecture because it allows creating containers that are much better isolated than the standard processes in typical operating systems. If the user's web browser gets compromised in a typical operating system, it's difficult to prevent other processes or the user's data from being compromised as well. If the compromised process is a core system component such as a WiFi driver or network stack, the security of the whole system is at stake.

Of course this architecture means that the choice of the hypervisor is critical for the security of the whole system. The Qubes developers have chosen Xen for a clear reason: the hypervisor itself is very simple, and it doesn't provide services like a network stack or filesystems that could be an attack vector. A security audit of the Xen hypervisor is therefore much easier to perform than for other solutions like KVM. A more thorough explanation of why the Xen hypervisor architecture better suits the needs of Qubes can be found in the Qubes OS Architecture [PDF] document.

Isolating domains

Users can divide their tasks and resources into several virtual machines, called AppVMs (the "cubes"). Which AppVMs they choose depends on the user's work environment, but there are some typical examples. A "bank" VM could be set up exclusively for access to the user's bank web site, only allowing HTTPS access to the web site and nothing else. Work and personal stuff can be isolated in their own virtual machines. And a "random" VM could be used for watching YouTube movies and playing games.

Qubes provides some virtual machines for system-wide services by default, called SystemVMs. For example, all networking code (network stack and drivers) is sandboxed in an unprivileged "network" VM. The unprivileged code gets safe direct access to specific PCI devices (the network cards) using VT-d technology. The privileged Dom0 (the "host" operating system of Xen which runs the management stack) doesn't contain any networking code. As only the network VM is granted direct access to the networking hardware, each AppVM uses a virtual network interface created by the Xen network frontend. The other side of this virtual interface, in the network VM, is connected to the physical interface via the Linux packet filter, which also blocks any direct inter-VM traffic. This setup prevents the scenario where a lesser-privileged VM can compromise more-privileged VMs by exploiting a bug in privileged driver code.
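The forwarding rule described above can be sketched as a small decision function. This is only an illustrative model, not the packet filter configuration that Qubes ships: the interface names (`eth0` for the physical card, a `vif` prefix for the backend side of each AppVM's virtual interface) and the `Packet` type are assumptions made for the example.

```python
from dataclasses import dataclass

# Hypothetical name for the physical interface owned by the network VM.
UPLINK = "eth0"


@dataclass(frozen=True)
class Packet:
    src_iface: str  # interface the packet arrived on
    dst_iface: str  # interface the packet would be forwarded to


def is_vif(iface: str) -> bool:
    """Assumption: backend interfaces for AppVMs are named vif<something>."""
    return iface.startswith("vif")


def forward_allowed(pkt: Packet) -> bool:
    """Allow AppVM <-> uplink traffic only; never AppVM <-> AppVM."""
    outbound = is_vif(pkt.src_iface) and pkt.dst_iface == UPLINK
    inbound = pkt.src_iface == UPLINK and is_vif(pkt.dst_iface)
    return outbound or inbound
```

With this rule, a packet from one AppVM's interface toward the uplink is forwarded, while a packet from one AppVM's interface to another's is dropped, which is the property the article attributes to the packet filter in the network VM.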

Another possible attack vector is Dom0, which is almost as privileged as the hypervisor: although it cannot modify the hypervisor's memory, it has access to the memory of all the other virtual machines. So if a certain AppVM can attack Dom0, it can also modify other AppVMs. However, by placing the network code in an unprivileged domain, the likelihood of such an attack is minimal. The only really security-sensitive code in Dom0 that is accessible by the AppVMs is the XenStore daemon (which contains information about where various storage devices are located) and the GUI. If a malicious program can mimic the starting and operation of AppVMs, it can trick users into thinking they are running their applications securely, much like a phishing scam on a web site.

Secure storage

If all user applications are hosted in AppVMs, it could require a lot of memory and storage: each virtual machine requires an operating system (e.g. a Linux distribution) and one or more applications. However, Qubes makes a special effort to save disk space. Instead of replicating the full OS image for each VM, all AppVMs based on the same distribution share the same read-only root filesystem (/boot, /bin, /etc, /lib, /usr, and so on). The AppVM distribution in Qubes is a lightweight Linux distribution (with a roughly 400 MB footprint) without a desktop environment (as the user's desktop environment is run in the Dom0 operating system), and it only uses a minimal X server.

Because read-only access is not enough, Qubes uses the device mapper to create a copy-on-write device on top of this. This device is discarded when the AppVM shuts down, so (possibly malicious) changes to the root filesystem will not be preserved: even if a virtual machine is compromised, it will boot the next time with a clean state.
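The copy-on-write idea can be illustrated with a toy block device. This is a conceptual sketch in Python, not how the device mapper is implemented or configured; the class name `CowDevice` and the block contents are made up for the example.

```python
from types import MappingProxyType
from typing import Dict, Mapping


class CowDevice:
    """Toy copy-on-write view over a shared, read-only base image."""

    def __init__(self, base: Mapping[int, bytes]) -> None:
        self._base = base                      # shared by all AppVMs, never written
        self._overlay: Dict[int, bytes] = {}   # this VM's private changes

    def read(self, block: int) -> bytes:
        # Prefer the VM's own modified copy, fall back to the shared image.
        if block in self._overlay:
            return self._overlay[block]
        return self._base[block]

    def write(self, block: int, data: bytes) -> None:
        # Writes never touch the base image; they land in the overlay.
        self._overlay[block] = data

    def discard(self) -> None:
        # Models what happens when the AppVM shuts down.
        self._overlay.clear()


# A read-only mapping stands in for the shared root filesystem image.
root_image = MappingProxyType({0: b"kernel", 1: b"/bin/sh"})
work_vm = CowDevice(root_image)
work_vm.write(1, b"trojaned shell")   # a (possibly malicious) modification
work_vm.discard()                     # shutdown: the modification is gone
```

After `discard()`, reading block 1 returns the original contents again, which mirrors the claim that a compromised AppVM boots the next time with a clean root filesystem.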

For VM-specific data, a separate writable block device is used, containing directories such as /home, /usr/local, and /var. Executable files on this disk, such as browser plugins in the user's home directory or manually installed programs in /usr/local/bin, are a risk because this device is not discarded after use. However, a security audit becomes much easier because exploitable files are limited to this device.

The VM-specific devices (both the copy-on-write image and the private data image) are encrypted with an AppVM-specific key, known only to the AppVM and Dom0. This encryption is done by LUKS (Linux Unified Key Setup). The read-only device used for the root filesystems is signed, and each AppVM verifies this signature when using the device. To prevent an attacker who has compromised the storage domain from providing a modified kernel or initrd, the kernel and initrd files are explicitly specified in Dom0 to ensure that the initrd verifies the signature of the root filesystem before mounting it.
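The "verify before mounting" step can be sketched as follows. Note the substitution: the article describes a signed device, whereas this example compares a SHA-256 digest against a trusted value, as a simplified stand-in for real public-key signature verification. The function names and the idea that the trusted value comes from Dom0 are assumptions for illustration only.

```python
import hashlib
import hmac


def image_digest(image: bytes) -> str:
    """SHA-256 digest of the root image, as a hex string."""
    return hashlib.sha256(image).hexdigest()


def mount_root(image: bytes, trusted_digest: str) -> str:
    """Refuse to mount a root image that does not match the trusted digest.

    Assumption: trusted_digest is obtained from a source the storage
    domain cannot tamper with (in the article's design, Dom0).
    """
    # compare_digest avoids leaking where the two strings first differ.
    if not hmac.compare_digest(image_digest(image), trusted_digest):
        raise ValueError("root image failed verification; refusing to mount")
    return "mounted read-only"
```

The point of the design is the ordering: the check happens in code that the attacker cannot replace (the kernel and initrd named by Dom0), so a tampered image served by a compromised storage domain is rejected before anything in it runs.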

Centralized updates of all AppVMs are possible because they share the same root filesystem: the only thing that's needed is a special UpdateVM virtual machine with read-write access to the root filesystem and the signing key to re-sign the device. This obviously makes UpdateVM a weak spot, so it should be secured with great care.

Marrying isolation with usability

This all sounds nice in theory, but if the system is too cumbersome, users will not use it, which leaves their systems insecure. Fortunately, Qubes integrates the AppVMs seamlessly on the desktop: the various applications are just shown on the same desktop, although they are hosted in different virtual machines. Copying and pasting text between virtual machines also works, but Qubes has taken care that AppVMs have no direct access to the clipboard: the user has to initiate the copy/paste operation. Of course this could still lead to some data leaks, but it is up to the user to enforce a policy on inter-VM data flows.

Transferring files between virtual machines is a bit more cumbersome. The user has to open the Dolphin file manager in one VM, open the context menu for the file, choose "Send to VM", enter the name of the destination VM and then authorize the file transfer in the destination VM. The files are never automatically copied into the destination's filesystem, but made available in a virtual "pen drive" that is mounted in the destination. The last step is copying the files from the virtual pen drive to the right location in the VM's filesystem. As cumbersome as this procedure is, this prevents an AppVM from forcing another AppVM to automatically accept some files, which could lead to attacks.
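The transfer procedure above can be modeled as a two-stage handoff. This is a sketch of the workflow only, not the Qubes implementation; the `AppVM` class, its fields, and the `authorize` callback (standing in for the user's confirmation in the destination VM) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class AppVM:
    name: str
    files: Dict[str, bytes] = field(default_factory=dict)      # the VM's own filesystem
    pen_drive: Dict[str, bytes] = field(default_factory=dict)  # virtual "pen drive"


def send_to_vm(src: AppVM, dst: AppVM, filename: str,
               authorize: Callable[[str, str], bool]) -> bool:
    """Offer a file to dst; it only appears on dst's pen drive if authorized."""
    if not authorize(src.name, filename):
        return False
    # The file is exposed on the pen drive, never written into dst.files.
    dst.pen_drive[filename] = src.files[filename]
    return True


def copy_from_pen_drive(vm: AppVM, filename: str) -> None:
    """Final, explicit user step: copy from the pen drive into the filesystem."""
    vm.files[filename] = vm.pen_drive[filename]
```

The two separate functions reflect the article's point: a sending AppVM can at most cause a file to be offered, and both the authorization and the final copy are actions taken on the receiving side.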

The Qubes project is currently in alpha, and is not suitable for production use, although Joanna is using Qubes now as her main operating system. A stable version is expected to appear towards the end of this year. In the meantime, intrepid users can follow the installation guide, which covers the installation of Qubes on top of a Fedora 12 system with KDE.

After installing a template image that will be used for all the AppVMs, as well as the image for the network service VM, the user creates AppVMs with the qvm-create command. Icons for the AppVMs are then created in the KDE start menu of Dom0. When the user starts an application from an AppVM for the first time, Qubes automatically starts the AppVM before starting the application, which introduces a delay, but this delay disappears when the user starts a second application in the same AppVM. Obviously, Qubes needs a lot of RAM: 4 GB is recommended.

Each application gets a label, which is the name of the virtual machine, such as "work" or "shopping". Moreover, the window manager shows a colored frame around the application's window to show which AppVM it is part of. Applications are not allowed to maximize to full screen, which prevents a malicious application from spoofing the decorations of other AppVMs.

Most of the documentation about the Qubes project can be found in the wiki. The architecture document linked above has a thorough explanation of the inner workings of Qubes (including an analysis of potential attack vectors), and there's also some practical information in a presentation by Joanna [PDF]. The source code is available in a Git repository and the project welcomes contributions.

The future

Qubes is still under development, and a lot of additions are planned. For example, there will be an unprivileged storage domain — similar to the network domain — that holds all storage drivers and filesystem code, and will get safe direct access to the disk controller. So even if a low-level storage driver or protocol stack gets compromised, it won't result in a full system compromise.

Another feature that is planned is support for Intel's Trusted Execution Technology. This will prevent modification of the system's boot code. So if the storage domain is compromised and a backdoor or rootkit is installed in the boot code, the Qubes system will become unbootable to protect itself.

Currently, the Qubes prototype is using Linux as the operating system running in the AppVMs, but there is nothing that would prevent support for other guest operating systems, such as Windows, as long as they support running as a Xen DomU. Of course, Qubes would then have to be adapted, for example to support the shared root filesystem, but this should be possible. According to the FAQ, support for Windows-based AppVMs might become a commercial extension. In the same way, the general architecture could be used with any hypervisor, as long as it supports the features that the Qubes architecture requires, such as unprivileged driver domains. The developers are also thinking about a slimmed-down version of Xen for more security.

It's interesting to see that one of the best security breakers in the world has now become a builder. The architecture of Qubes is well-thought-out and based on years of system-level security research. The concept of virtualization to isolate potentially unsafe processes is certainly not new (look at FreeBSD jails, OpenSolaris zones, or Linux containers), but it's refreshing to see it implemented in a (relatively) user-friendly way. When the project reaches version 1 later this year, security-conscious Linux users should definitely give it a try.


Qubes: security by virtualization

Posted May 6, 2010 3:23 UTC (Thu) by spotter (guest, #12199) [Link] (8 responses)

seems very similar to work I published as a tech report almost 2 years ago, which has since made the conference submission rounds and will be published in USENIX ATC this summer

compare to https://mice.cs.columbia.edu/getTechreport.php?techreport...

Qubes: security by virtualization

Posted May 6, 2010 13:08 UTC (Thu) by sorpigal (guest, #36106) [Link] (1 responses)

This is interesting. I like that people are looking into things like this. As just an average user I would be interested in seeing, in the not too distant future, a system where isolation brings me network transparency for free (or cheap). Ideally I'd like to be able to suspend an application or set of applications (i.e., a workspace) on one system, transfer it to another host, and restore it with its state remaining the same. This is possible today at the VM level but requires a complete, heavy OS instance for each workspace/application. The idea is to achieve app-bundle-like ability to transfer programs on a network without requiring one directory per app and to also permit me to care even less what computer I'm actually sitting at.

Qubes: security by virtualization

Posted May 6, 2010 13:22 UTC (Thu) by spotter (guest, #12199) [Link]

we can already do that, see my other research, work I've done closely w/ Oren Laadan (who was the lead on the checkpoint/restart portion and is now trying to get it into the kernel itself)

http://www.ncl.cs.columbia.edu/research/migrate/

particularly http://www.ncl.cs.columbia.edu/publications/compsac2006_f... and http://www.ncl.cs.columbia.edu/publications/sosp2007_deja...

Qubes: security by virtualization

Posted May 6, 2010 14:43 UTC (Thu) by PaXTeam (guest, #24616) [Link] (5 responses)

how does your system deal with kernel exploits?

Qubes: security by virtualization

Posted May 6, 2010 15:46 UTC (Thu) by spotter (guest, #12199) [Link] (4 responses)

we note that kernel exploits are a way to exploit the system (kernel becomes part of the TCB). advantage of containers is they are very lightweight. if your threat model has to deal with kernel exploits our stuff can be used with any hardware type VMs, but there is significantly higher overhead.

In a KVM or xen type case, the kernel really is still part of the TCB, just that with a containers model, exploiting kernel flaws is more straightforward.

Qubes: security by virtualization

Posted May 6, 2010 21:53 UTC (Thu) by PaXTeam (guest, #24616) [Link] (3 responses)

> we note that kernel exploits are a way to exploit the system (kernel becomes part of the TCB).

care to quote me that part from your paper? i was specifically looking for anything kernel bug/exploit related and found nothing, ditto for discussing what constitutes the TCB. whenever you mention exploit it's always in the context of application (userland) exploits, never the kernel.

> if your threat model has to deal with kernel exploits[...]

yours does, that's what i was trying to imply. there's nothing to prevent a userland exploit from going after a kernel bug next. in other words, your system wouldn't survive for long in the real world, quite the contrary to your claims ;).

Qubes: security by virtualization

Posted May 6, 2010 23:28 UTC (Thu) by spotter (guest, #12199) [Link] (1 responses)

I don't disagree, if I were to attack my system, I'd go after kernel bugs. but that's an attack against many systems, including much more complicated systems like SELinux. in my system there are ways to mitigate that problem (leverage a combination of OS containers and VMs, perhaps use VMs for persistent containers, and have a set of VMs to store a larger set of ephemeral containers), but won't perfectly solve it and will also increase overhead (and as we note at least in the final draft, ease of use and good security are always in tension)

It's actually not in that old tech report, nor in the final version being submitted to USENIX due to space constraints, but was in intermediate versions and has always been in the talks I've given on it, where I basically stated up front that we were concerned about exploits like the run of PDF exploits, but if you are concerned about the kernel being exploited as well that would need a different container approach, since containers don't provide isolated kernels.

so I'll agree with

Qubes: security by virtualization

Posted May 6, 2010 23:29 UTC (Thu) by spotter (guest, #12199) [Link]

"so I'll agree with" you on that point

Qubes: security by virtualization

Posted May 7, 2010 4:49 UTC (Fri) by spotter (guest, #12199) [Link]

oh and to answer your question, don't see this in the tech report version (and the last sentence is probably going to be excised from version being published in USENIX due to space), but in the version that was in my dissertation, I included this text

"While VMs provide superior isolation, they suffer higher overhead due to running independent operating systems. This impacts performance and makes them less suited for ephemeral usage on account of their long startup times. However, Apiary can leverage them if one does not want to trust a single operating system kernel."

Nice

Posted May 6, 2010 7:50 UTC (Thu) by smurf (subscriber, #17840) [Link] (2 responses)

now if only all of this didn't require quite so much memory

Nice

Posted May 6, 2010 9:47 UTC (Thu) by TRS-80 (guest, #1804) [Link] (1 responses)

Sharing pages between VMs would help with that.

Nice

Posted May 8, 2010 9:05 UTC (Sat) by nix (subscriber, #2304) [Link]

Of course that makes the TCB more complex and opens a potential avenue for VMs to interfere with each other if it is buggy.

Swings and roundabouts.

Qubes: security by virtualization

Posted May 6, 2010 13:02 UTC (Thu) by pcampe (guest, #28223) [Link] (4 responses)

The article fails to explain why (if) Qubes is better than KVM+SELinux, i.e. SVirt (http://selinuxproject.org/page/SVirt). Does anyone have a clearer picture?

Qubes: security by virtualization

Posted May 6, 2010 15:04 UTC (Thu) by davecb (subscriber, #1574) [Link] (1 responses)

It's an independent reinvention of MAC, implemented by virtualization. Which is amusing, as the Solaris "zones" virtualization is derived from Trusted Solaris MAC (;-))

I expect two things
- additional similar reinventions both v->m and m->v
- a later realization that they're the same problem

and just perhaps
- a push from Linus to make MAC and KVM converge (;-))

--dave

Qubes: security by virtualization

Posted May 6, 2010 15:24 UTC (Thu) by davecb (subscriber, #1574) [Link]

Whoop! My error, you're already *doing* the combination.

--dave

Qubes: security by virtualization

Posted May 7, 2010 1:49 UTC (Fri) by jamesmrh (guest, #31622) [Link] (1 responses)

sVirt can't protect against a kernel bug in the host -- if a guest breaks out and exploits a host kernel bug, then it's game over.

We are looking at ways to help mitigate this.

Qubes: security by virtualization

Posted May 7, 2010 7:50 UTC (Fri) by pcampe (guest, #28223) [Link]

Partially correct, because a MAC could protect against such attack if the MAC function in the kernel is working properly and the policy has no black holes (of course, you could have some kernel bugs that prevent MAC from enforcing the defined security policy when complex interactions between host and guests happen).

Otherwise, you'd better have a hypervisor with a minimal footprint, which at least reduces the attack surface; but Qubes is using Xen, so it could expose the same target with the same (known or latent) vulnerabilities.

User interface for security

Posted May 6, 2010 15:12 UTC (Thu) by davecb (subscriber, #1574) [Link] (2 responses)

The Trusted Systems world (the real one, not the fakers doing
DVD players) deals with the copy-paste problem by making it
user-visible, and management-allowable.

In X, the windows have an optional decoration by security classification.
When you try to copy or paste from a high-security compartment to a low-security one, you get a pop-up saying this is a bad thing.
If needed, there is a "downgrader" that can copy to low-security on a case-by-case basis. Compilers can have an automatic downgrader for their object output, for example.

For safety, you could also have a popup when you try to copy from a low-safety compartment to a secure one, with a downgrader to convert, for example, macro-virused word processor docs to macro-free filtered odf (;-))

--dave

User interface for security

Posted May 8, 2010 0:57 UTC (Sat) by quotemstr (subscriber, #45331) [Link] (1 responses)

Compilers can have an automatic downgrader for their object output, for example.

Why would a compiler be in the high-security compartment to begin with?

User interface for security

Posted May 8, 2010 14:38 UTC (Sat) by davecb (subscriber, #1574) [Link]

Its input data could well be in a compartment, but the compiler itself would probably be at system-low.

--dave

Qubes: security by virtualization

Posted May 6, 2010 20:06 UTC (Thu) by jengelh (subscriber, #33263) [Link]

That reminds me of MINIX, or microkernels in general. Drivers run in a "VM of sorts"... userspace process with possibly reduced privileges.

Qubes: security by virtualization

Posted May 8, 2010 9:09 UTC (Sat) by nix (subscriber, #2304) [Link]

The only thing that disturbs me here is the requirement for VT-d support for networking. IIRC, VT-d requires a modicum of BIOS and chipset support, and I have never yet seen a single desktop on which this support works. Sometimes turning it on does nothing; sometimes it locks up the system as soon as you try to use it, possibly because it's buggy as hell and DMAs stuff into the wrong places. I've seen it working on server systems, but isn't this supposed to be a desktop OS?

(maybe it works on very recent systems; the most recent I've tried it on is Nov-2009 Asus motherboards, which fail. Maybe Intel motherboards are more reliable in this area.)

older Qubes

Posted May 10, 2010 22:17 UTC (Mon) by roelofs (guest, #2599) [Link]

...of the Cobalt persuasion: http://www.flickr.com/photos/audrix/1715270038/

That was a lovely little box.

Greg