Saturday, May 28, 2011

(Un)Trusting the Cloud


Everybody loves The Cloud these days, and it is not hard to understand why. When every person owns several computers (devices), the cloud is really hard to beat when it comes to syncing all your digital life back and forth between those devices, and also sharing it with your family members, friends, and colleagues at work. From task lists, through calendars and health & fitness data, to work-related documents. And I'm not even mentioning all the unencrypted email that is out there.

One doesn't need to be especially smart or security conscious to realize how much of a threat to security and privacy this might be. How much easier would it be to attack somebody's laptop if I knew precisely in which hotel and when he or she was planning to stay? How much more expensive would my health and life insurance be if the insurer could get a look at my health and fitness progress? Etc.

But we're willing to sacrifice our privacy and security in exchange for the ease of syncing and sharing of our data. We decide to trust The Cloud. What specifically does that mean?

First, it means we trust the particular cloud-based service vendor, such as the provider of our training-monitoring app and service. We trust that this vendor is: 1) non-malicious and ethical, and so is not going to sell our private data to some other entity, e.g. an insurance company, and 2) that the software written by this vendor is reasonably secure, so it would not be easy for an attacker to break into their cloud service and download all the users' data (and then sell it to health insurance companies).

Next, we trust the cloud infrastructure provider, such as Amazon EC2. We trust that the cloud provider is 1) non-malicious and ethical, and so won't read the memory of the virtual machine on which the previously mentioned cloud service is running (and won't make it available to local government officials, e.g. in China), and 2) that they have secured their infrastructure properly (e.g. that it wouldn't be easy for one customer to “escape” from a VM and read the memory of the VMs belonging to other customers).

Finally, we trust that all the infrastructure sitting between us and the service provider, such as the networking protocols, is safe to use (e.g. we trust that none of the engineers working at any of the ISPs we use will sniff/spoof our communication, say by using some fake or quasi-fake SSL certs).

So, that's a hell of a lot of trusting! And the stakes are high. Do we really need to make such a sacrifice? Do we really need to hand over all our private data to all those organizations? Of course we don't!

First, notice that in the majority of cases, the cloud is basically used only as on-line storage. No processing, just dumb storage. Indeed, what kind of server-side processing does your task list or calendar require? Or your freestyle swimming results? Or your conference slides? None.

And we have known for a very long time how to safely keep secrets on untrusted storage, haven't we? This is achieved via encryption (and digital signatures for integrity/authenticity). So, the idea is very simple: let's encrypt all the data before we send them to the cloud. The point here is that the encryption must be done by the app that is running on our client device. Not in the cloud, of course.
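Here is a minimal sketch of what such a client could look like, in Python, using the cryptography package's Fernet construction (authenticated encryption, so it covers both secrecy and integrity). The upload()/download() calls are hypothetical placeholders for whatever storage API a given cloud service exposes:

```python
# Minimal sketch of client-side encryption before upload.
# Requires: pip install cryptography
from cryptography.fernet import Fernet

# The key is generated and kept on the client device only; the cloud
# never sees it. Fernet gives us authenticated encryption, so it also
# detects tampering with the stored blob.
key = Fernet.generate_key()
f = Fernet(key)

record = b"2011-06-01 10:00  Meeting with Bob, Hotel XYZ"
ciphertext = f.encrypt(record)            # done on the client, not in the cloud
# upload("calendar/event-42", ciphertext)   # hypothetical storage API call

# Later, on any device that holds the key:
# ciphertext = download("calendar/event-42")
plaintext = f.decrypt(ciphertext)
assert plaintext == record
```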

Ok, so let's say I have my calendar records encrypted in the cloud; how do I share them with my other devices and other people, such as my partner and colleagues at work? Very simple – you encrypt each record with a random symmetric key, and then, for every other device or person to whom you want to grant access to your calendar, you make the symmetric key available to them by encrypting it with their public key (if you're paranoid, you can even verify key fingerprints using some out-of-band communication channel, such as the phone, to ensure the cloud/service provider didn't mount a MITM attack on you). What if you want to share only some events (or only some details, e.g. only your availability info) with some group of people? Very simple – just encrypt those records you want to share with limited access with some other symmetric key, and publish this key only to those people/devices to whom you want to grant such limited access.
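A sketch of this key-wrapping scheme, again in Python with the cryptography package. The two keypairs are generated inline only to keep the example self-contained – in reality each person/device holds its own keypair, and the fingerprint verification mentioned above happens out of band:

```python
# Each record is encrypted with a random symmetric key; that key is
# then "wrapped" with each recipient's public key. The cloud stores
# the ciphertext plus one wrapped key per recipient.
from cryptography.fernet import Fernet
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Illustrative keypairs; real ones live on the recipients' devices.
alice_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)
bob_priv = rsa.generate_private_key(public_exponent=65537, key_size=2048)

record_key = Fernet.generate_key()            # random per-record symmetric key
ciphertext = Fernet(record_key).encrypt(b"Dentist, Tuesday 9am")

# Grant access by wrapping the symmetric key for each recipient:
wrapped = {
    "alice": alice_priv.public_key().encrypt(record_key, oaep),
    "bob": bob_priv.public_key().encrypt(record_key, oaep),
}

# Bob, on his device, unwraps his copy of the key and decrypts:
bobs_key = bob_priv.decrypt(wrapped["bob"], oaep)
print(Fernet(bobs_key).decrypt(ciphertext))   # b'Dentist, Tuesday 9am'
```

Limited access works the same way: encrypt the reduced view of a record (e.g. just the availability info) under a second symmetric key, and wrap that key only for the wider group.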

Implementing the above would require writing new end-user apps, or plugins for existing apps (such as Outlook), so that they do the encryption/decryption/signing/verification before sending the data out to the cloud. But what stops a malicious vendor from offering apps that would leak out our secrets, e.g. the keys? Well, nothing, actually. But this time the vendor would need to explicitly build some kind of backdoor into the app. The same could be done by any other vendor, in any other, non-cloud-based app. After all, how do we know that MS Word, which is not cloud-based yet, is not sending out fragments of our texts to Agent Smith? Note how different this is from the situation where the vendor already owns all our data, unencrypted, brought legitimately to their servers, and all they need to do is read them from their own disks. No need to plant and distribute any backdoors!

In practice, few vendors would risk their reputation by building a backdoor into an app that is then made available to customers, because every backdoor in such client-exposed code will sooner or later be found. (You would really not believe what great lengths all those young people armed with a disassembler and a debugger would go to, to win an economy-class ticket to the middle of a desert in the hottest summer season, just to be able to deliver a presentation on how evil/stupid a company X is ;).

One problem remains, however, with accessing our encrypted cloud via a Web browser. In contrast to apps, web-delivered content is much less identifiable. An app can have a digital signature – everybody knows it's App v1.1, published by X. As explained above, it would be rather stupid for X to plant a backdoor into such an app. But web-delivered JavaScript is much more ephemeral, and it's very possible for X to e.g. deliver different versions of the scripts to different customers. Digital signatures on client-side scripts, paired with the ability to whitelist allowed scripts, would likely solve this problem.
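No browser offers such a mechanism, but conceptually it is straightforward. A sketch, in Python with the cryptography package's Ed25519 signatures, where the whitelist and the script-execution hook are purely hypothetical stand-ins for what a browser would have to implement:

```python
# Conceptual sketch of signed, whitelisted client-side scripts: only
# run a script if it verifies against a publisher key the user trusts.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

# Publisher X signs its script once, at release time:
publisher_key = Ed25519PrivateKey.generate()
script = b"console.log('app v1.1');"
signature = publisher_key.sign(script)

# The user's (hypothetical) whitelist maps origins to trusted keys:
whitelist = {"https://calendar.example.com": publisher_key.public_key()}

def maybe_run(origin: str, script: bytes, signature: bytes) -> None:
    """Execute the script only if it verifies against a whitelisted key."""
    try:
        whitelist[origin].verify(signature, script)
    except (KeyError, InvalidSignature):
        raise RuntimeError("refusing to run unsigned/modified script")
    print("running:", script.decode())    # stand-in for real execution

maybe_run("https://calendar.example.com", script, signature)
```

With something like this in place, X could no longer quietly serve a backdoored script to one specific customer – the modified script would simply fail verification.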

So, why do we still not have client-side-encrypted cloud services? The question is rhetorical, of course. Most vendors actually love the idea of having unlimited access to their customers' data. Do you think Google would be happy to give up the opportunity to data-mine all your data? This might affect their ad business, health research, or just their Secret Plan To 0wn The World. Over our dead body, I can almost hear them yelling! After all, they have just come up with Chrome OS to bring even more data into their data-mining machine...

To sum it up, there is no technical reason why we must entrust all those people with our most private data. Sooner or later somebody will start selling client-side-encrypted cloud services, and I would be the first person to sign up for them. Hopefully it will happen sooner rather than later (or is it already too late?).

This post also, hopefully, shows one more aspect – that we can, relatively easily, move most of the IT infrastructure out of the “TCB” (Trusted Computing Base, used as a metaphor here). In other words, we can design our systems and services so that we don't need to trust a whole lot of things, including servers and the networking infrastructure (we still depend on them for reliability, but not for security). But there always remains one element that we must trust: our client devices. If they are compromised, the attacker can steal everything.

Strangely, most people still don't get it, or get it backwards. The mere fact that “information is not stored on the iPad but kept safe on the corporate network” doesn't change anything! Really. If the attacker owns your iPad, then she can also do anything that the legitimate user could do from this iPad. So, if you can get to the company's secret trade data from your iPad's Receiver, so can the malware/attacker.

Saturday, May 21, 2011

The App-oriented UI Model and its Security Implications


Most desktop OSes today, such as Windows or Mac, expose and encourage a file-oriented UI model. You pick a file in the file manager, click it, and the file manager automagically determines the best app to handle the file, starts the app, and passes the file to it.

Back in the MS-DOS days we used a different model: an app-oriented model – you started an app first, e.g. WordPerfect or Lotus 1-2-3, and then you opened a file from within the app (Norton Commander and similar programs somewhat changed that later).

Interestingly, this very same app-oriented model is now becoming popular again thanks to systems such as iOS and Android. There is no such thing as a global File Explorer or Finder on an iPad – only the apps. One must first pick an app, and then it's the application's responsibility to expose an option for opening one of your “files”, if the app supports that (e.g. the calendar or task-list apps would always open your default calendar or task list without asking for anything).

I actually like this app-oriented model a lot! It's much less confusing to the user. Just think about all those attacks in the past where an attacker could prepare a file with some innocent-looking extension which in fact was an MZ executable. Or about how often people are not even aware of which app they are using! One might argue that the user should not be distracted by such “unimportant” things as which app he or she uses for her work, but I disagree. Apparently Apple, and millions of iPhone and iPad users, disagree too.

But the main reason why I like this app-oriented model is that it fits perfectly into the Security by Isolation philosophy.

Just think about it: if it's possible to get users to consciously select an app – and we now know it is possible, thanks to the millions of app-oriented devices sold – then it should not be much more difficult to get them to also consciously select the domain, or area, such as “work” or “personal”, in which they wish to work. Just imagine that instead of one “Mail” app, you had two apps (and two icons): “Mail Work” and “Mail Personal”.

There are some technicalities here – such as how to isolate the apps from each other. Do we need to build another layer of isolation in the form of VMs to isolate “Mail Work” from “Mail Personal”, or should the (new) OSes and the (new) APIs be designed in such a way that they are thin and secure, and allow for very good isolation between processes without using virtualization?

In Qubes we must use this additional layer of abstraction (virtualization), because we want to use Linux apps (and in the future also Windows apps), and they require the huge POSIX/X APIs (and the Win32 API) to work correctly. And those APIs are not easily isolate-able, so we use VMs as “API providers”. The same goes for isolating networking drivers and stacks – we need the Linux kernel API to get those drivers and stacks running, which is why we use a Linux-based “NetVM” for isolating networking. For this reason we expect users to explicitly define domains, such as “work”, “personal”, etc. This is because we cannot afford to run every single app in a separate AppVM (more precisely, we cannot afford to create a working copy of this huge POSIX/X API for each app).

But we could very well imagine a well-constructed API for apps that would just be easily isolate-able (I'm not saying iOS or Android has such an API), and then there would be no need to define domains explicitly. Still, we would need the ability to define more than one instance of each app – such as the previously mentioned “Mail Work” and “Mail Personal”.


The app-oriented model seems to be the future. And so does the Security by Isolation philosophy!

Friday, May 13, 2011

Following the White Rabbit: Software Attacks Against Intel VT-d

Today we publish a new paper, which is the result of our several-month-long in-depth evaluation of Intel VT-d technology. To quote the abstract:
We discuss three software attacks that might allow for escaping from a VT-d-protected driver domain in a virtualization system. We then focus on one of those attacks, and demonstrate a practical and reliable code execution exploit against a Xen system. Finally, we discuss how new hardware from Intel offers a potential for protection against our attacks in the form of Interrupt Remapping (for client systems available only on the very latest Sandy Bridge processors). But we also discuss how this protection could be circumvented on a Xen system under certain circumstances...

I think the attack is likely the most complex and surprising out of all the things we have presented so far. Parts of it are even funny (if you share our weird sense of humor), such as the use of ICMP ping to generate MSIs. The paper also covers the vendors' response. You can download the paper here.