IL-SUNG LEE: Good afternoon, everyone. So let’s get started here. My name is Il-Sung Lee. Along with me, I have Greg King from PayPal and my colleague Joseph Valente. And we’re here to talk about bringing you more control, new services for data security and transparency. So first of all, does anyone have any idea what we’re going to talk about today? Anyone? OK, Tim doesn’t count because he’s our engineering director, and he’s cheating. All right, yes?
AUDIENCE: Zero trust?
IL-SUNG LEE: Zero trust? No, not quite, OK? It’s very close. So before I tell you what it is, last year, I actually went on a stage similar to this and talked about how– like, I announced Cloud HSM. And it was a relief. It was a huge relief because I got to finally tell my wife what I worked on. Because she had no idea. I wouldn’t tell her. And when we finally released, and I asked her like, this is what I do, she’s like, I still don’t understand, right? And so this time, I did the same thing. I said, I could tell you what I’m working on now. And I explained to her, and she’s like, I still don’t understand. So, you know, I gave up. But we don’t talk about the current encryption in GCP. We’re going to talk about the external key management, OK?
The stuff that, if you went to the keynote this morning, you heard Thomas and Suzanne talk about. And then the key access justifications, all right? So as I go along, if you have any questions, please just try to shout it out as much as possible. I’ll try to listen. If not, then come see me after the session. But I want to just quickly kind of rehash what we do in terms of current encryption in GCP, OK? And this will kind of give a good framing about the things that we’re going to talk about next. So I’ve given this slide many, many times. And it basically describes what our current offerings are for encryption in the Google Cloud for data at rest.
Now, first of all, the scale shows that on the left-hand side, it’s basically where Google has control of all the keys. And so this is where we’re talking about things like our default encryption. And so as we’ve actually mentioned before, we were the first cloud to do the– all your data is encrypted at rest by default. And there’s no way for you to turn it off. We do it at scale. We do it in such a way that it’s non-impactful. It’s transparent. But it’s done for all of your data that you bring up to our cloud. And then on the left-hand side– or the right-hand side, I’m sorry– you have– Google has no control over keys. So in between there, you have a lot of different options, right? So if you want more control over keys than default encryption, then we provide Cloud KMS.
You can use Cloud KMS to encrypt your data directly. Or you can integrate it with some of our storage services, right? And then you can actually control the keys in the sense that you can disable them. You can rotate them. You can do all sorts of things with them. But then customers start saying that, well, I need something to help with my compliance because I need something that has a FIPS certification, for example. So we provided Cloud HSM. And that provides you with the ability to actually have a cloud-hosted and managed HSM service that gives you the ability to put your keys in a FIPS 140-2 Level 3 service. And then we have customer-supplied encryption keys for both GCS, or Cloud Storage, and for Compute Engine, which requires you to bring the key with you every time you make a call that requires access to that data, OK? So it’s something that you have to bring every time. It’s only available for those two services. And then finally, we get to the point where some customers have said, I want to bring my own devices. I want to make sure that you don’t have access to the devices. I have control. And so I’m going to use my HSMs in some [INAUDIBLE] data centers. And we have a number of customers doing this as well.
Now, the problem there remains of how do I retain possession and control over my keys while I still want to be able to leverage the power and the processing capabilities of the cloud, right? And so in that case, you can actually think about, like, wouldn’t it be great if I had an option on the very, very right side of this scale? And this is where External Key Manager comes in. That’s currently in alpha, but will soon be in beta. And essentially, this is about using your key outside of Google to actually encrypt your data at rest. Now, let me talk a little bit more about the External Key Manager.
And so before we do that, let’s just go back a little bit and talk about a little refresher on how we do data-at-rest encryption with the Google Cloud. So when you bring your data up into our cloud, we take that data, and we break it up into shards, right? We break it up into manageable chunks. And each one of those little chunks gets its own individual AES key that’s used to encrypt it. Now, that key stays with the data itself, OK? And we do this at scale. And that key, since it stays with the data, has to be wrapped using a higher-level key. So we’re doing envelope encryption, if you understand the terminology. And that key exists either in our own internal KMS system. Or if you’re using it yourself, it can be in our Cloud KMS or Cloud HSM system, OK? So you have control, but it’s hosted within our cloud. Now, in the case of external key management, that key actually resides outside of Google. It’s in some system outside of the Google premises. And so if you’re hosting your data in BigQuery or in GCE Persistent Disk, or you’re just basically doing a direct symmetric encryption call to our Cloud KMS, you can actually leverage this key that’s outside of our cloud through an HTTPS channel to this external entity. Now, we have a number of partners that will be ready for us– ready to integrate with us right away when we actually go and release to beta. And so our partners at Equinix, Fortanix, Ionic, Thales, and Unbound all are delivering solutions that integrate right away with our EKM solution.
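[EDITOR'S NOTE: The envelope encryption described above can be sketched in a few lines of Python. This is a minimal illustration, not Google's implementation. The function names, the shard size, and the record layout are all assumptions made for the example, and `toy_cipher` is an insecure stand-in for AES so that the sketch needs only the standard library.]

```python
import hashlib
import secrets
from typing import Dict, List

SHARD_SIZE = 16  # bytes per shard; tiny on purpose (hypothetical value)


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream.

    ASSUMPTION: this stands in for AES so the sketch runs without
    third-party packages. Applying it twice with the same key restores
    the input. It is NOT secure and must not be used for real data.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def encrypt_at_rest(plaintext: bytes, kek: bytes) -> List[Dict[str, bytes]]:
    """Split data into shards; each shard gets its own data key, wrapped by the KEK."""
    stored = []
    for start in range(0, len(plaintext), SHARD_SIZE):
        shard = plaintext[start:start + SHARD_SIZE]
        dek = secrets.token_bytes(32)  # per-shard data encryption key
        stored.append({
            "ciphertext": toy_cipher(dek, shard),  # shard encrypted with its own key
            "wrapped_dek": toy_cipher(kek, dek),   # data key stays with the data, wrapped
        })
    return stored


def decrypt_at_rest(stored: List[Dict[str, bytes]], kek: bytes) -> bytes:
    """Unwrap each shard's data key with the KEK, then decrypt the shard."""
    out = b""
    for record in stored:
        dek = toy_cipher(kek, record["wrapped_dek"])
        out += toy_cipher(dek, record["ciphertext"])
    return out


if __name__ == "__main__":
    kek = secrets.token_bytes(32)  # the higher-level key (internal, Cloud KMS, or external)
    records = encrypt_at_rest(b"rows of a table, stored at rest", kek)
    print(len(records), "shards stored")
    print(decrypt_at_rest(records, kek))
```

[The only thing that changes between default encryption, Cloud KMS, and an external key manager in this picture is where `kek` lives and who can use it; the per-shard data keys are handled the same way in each case.]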
Now, what are the main benefits of doing this? Well, number one, you maintain your key provenance, right? And so what I mean by that is that you get to actually understand and reason about, I know exactly where the key originated from. I know exactly where the key is located because, you know, it’s under my control. I know how many copies there are, who has access. And I can probably also have my own independent audit logs because it’s depending on myself. Another reason may be that you have requirements for key access control. And this is sort of a big one, right? And so here, we’re talking about the fact that your keys are always outside of Google. That external encryption key is never brought inside Google. This is not bring your own key, OK? That key stays outside our perimeter. And so that key is necessary for us to decrypt your data at rest. Without access to that key, we can’t actually take that data at rest and make it data in use.
So as a result, if you put your data in BigQuery, for example, and that key is unavailable to us, then there’s nothing we can do to actually decrypt it, OK? It stays encrypted. And the third reason that we often hear from customers is that, I want to centrally manage my keys, OK? I have one key management system that I’m using for my on-premise. I’m using it for my– I want to use it for my keys in the cloud. And I want to use that to actually enforce certain policies and have a single point where I can actually allow and disallow access to those keys, or delete the keys, or whatever I need to do. Now, this also facilitates things like multicloud, hybrid cloud, things like that. But it gives you a really, really interesting way of actually enforcing policy around the keys. And this will become important in a bit. Now, configuring this service has become very, very easy. So this is a snapshot of what the UI will look like once we ship the beta.
And so if you’ve used Cloud KMS before, you know that the key usually actually comes and says what type of key. Do you want HSM? Do you want software? But now it’s a little bit different because you’re selecting where the key is generated. So I announced earlier today, but for example, Cloud HSM now has the ability to actually import keys from external sources, OK? And that’s GA. So you can import the key. And now you can actually have a third option, which is have an externally managed key. Now, as you can see in the UI, the way that works is that there’s a little bit of a handshake that happens between us and the external system. The external system will have to authorize our service account to access the key. And that’s why we provide you with the service account at the very bottom. It doesn’t work very well here, sorry. But at the very bottom there, we actually give you the service account so that you know what it is for the Cloud KMS.
And then on the external side, once you create the key, they will give you a key URI that you would enter into the URI box right at the very bottom. And so once you do that, you create the key. We’ll validate that the key exists and that it’s reachable. And then it’ll basically show up in Cloud KMS as a key like every other key, except that the properties will say that it’s external, OK? So it’s a very, very straightforward workflow to actually get this accomplished. It’s only available for symmetric keys. We will likely think about expanding in the future. It’ll be available in all regions. And what it really means for a region to have EKM support is that that information necessary to go and communicate out to the external key will initiate from the region that you create.
So if I create this KMS key in, say, the Frankfurt region, then whenever you try to reach out to that external key, it’ll be done through Frankfurt, OK? Or if you do it through one of the multi-regions, then it can happen through any one of the data centers that are part of that multi-region. Now, the other thing is that, again, the URI’s provided externally. And so there’s documentation from our partners around how you would actually accomplish this on their side. Any questions before I go on? All right, fantastic. So, considerations– you know, I want to give you some caveats as well to make sure you really understand what the service is providing. So again, it’s available for BigQuery, GCE, and symmetric key encryption. You know, this feature was designed for those services. You know, there may be future discussions around other things. But that is what we actually designed this for.
Now, obviously, when you have a key system that’s outside of our control, outside the cloud, and across the internet, then you’re introducing additional points of failure, right? It’s just inherent in the system. And so there will be potentially reduced reliability as a result of that. You know, it’s not completely contained within our Google Cloud. So we can’t provide complete assurance that the key will always be reachable, the network will always be reachable, et cetera. And so if those kind of things are very big concerns for you, then you should maybe consider using our Cloud HSM or Cloud KMS, where we manage everything within our data centers, OK? And those give you very, very much higher reliability. The errors resulting from the encryption or decryption request to the external keys are going to manifest as user errors. Those are errors that we generally consider that we can’t control because they’re outside of our domain. And so when you see them, they’ll show up as user errors, which also means that they don’t count towards error calculations, right? And we’ll make sure that any errors that manifest from using these external keys will have lots and lots of information in there so that you can quickly ascertain why this failed, OK? We want to make sure that you can understand whether this is a problem in our system or the external system. And the errors will populate within the error message itself and inside the data access logs. And so if you enable the data access logs, all that information will be recorded in there for you to actually see as well.
One other consideration is that, as you saw in the UI in the previous screen, there is essentially one URI field. So it means that you will have one endpoint, or we will have one endpoint that we target. So if things like high availability and things like that are very much considerations for you, then you should consider on the external side whether the solutions have things like failure domains, global load balancers, things like that, OK? So that’s a consideration that you have to have. And finally, most importantly, it’s coming soon to beta. I’ve been forbidden to actually tell you exactly when. But I’ll just say that it’s coming very soon, OK? Now, with that said, I’m going to stop talking a little bit, or talk less. And I’m going to invite Greg King to come on stage to actually show us how this thing works.
[APPLAUSE] GREG KING: Thank you. All right.
IL-SUNG LEE: So Greg, so before we actually go to the demo, my understanding is that PayPal is looking into using our external key manager–
GREG KING: Yes, we are.
IL-SUNG LEE: –to help solve some of your difficult problems. So can you explain why this is important to you?
GREG KING: Absolutely, thank you for the question. Really, the main focal point or concentration that we can come up with for a company like PayPal is going to be an increased security solution. I know some of that was mentioned through Il-Sung’s slides. But when you think about the ability for the management of a key externally, it really brings in an added feature where you have an independent security control from the cloud services. And that’s very important when you start thinking things like defense in depth or other things. And for PayPal in particular, we traditionally have been an on-prem type of service. So we have very comfortable feelings about managing our own data, managing our own keys and encryption systems. And with that comes certain policies that get put in place, like you have to manage your own keys with PayPal-generated HSMs, et cetera, et cetera. When you move to the cloud, you lose that control. You rely on shared services within the cloud for your data and your keys. So the external HSM brings in our ability to be able to control that externally. As Il-Sung said earlier, you can turn the keys off, turn them on– they’re totally under your control. The other advantage, as I said, was with PayPal’s restrictions is it actually enables new business use cases to go to the cloud environment. There’s varying degrees and levels of data control that we have depending on what type of data it is. And some data may not have been enabled for the cloud-type services. But with the external key managers, we can now potentially bring those into the cloud environments. That’s very big for PayPal, and it enables use cases that weren’t there before.
IL-SUNG LEE: OK, and can you describe a little bit about how your system is set up?
GREG KING: Yeah, sure. So there’s really two options for doing EKM. One of them is going to be very complicated, which means you might use your own external HSM services. You can either use on-prem, or you might deploy your own HSMs if you’ve got the budget, the resources, and constraints to be able to do that. The other option is to use one of the HSM services that have become affiliated with Google. And they’ve already done the interconnectivity that’s required to get CMEK to work. One of those, I will demo, which is Fortanix SDKMS. So for us, the other thing that we’re looking for is global availability. So we regionally deploy services for KMS, the external KMS locally to wherever regions within GCP that we’re going to use so we can keep latency down. And obviously, we want to have high constraints for availability. So that’s the main deployments.
IL-SUNG LEE: OK, great. So, Greg, can you tell us what we’re going to be seeing here?
GREG KING: Sure, let me get logged in. So I’ve got two slides up, or two web pages. And one of them is going to be the GCP or the Cloud Platform. And what I’ve done is implemented a BigQuery table, a very simple table just for demonstration purposes. And then I have the Fortanix SDKMS GUI, which is where I would control the external keys which I’ve generated. Il-Sung went through the methods and the interconnects for how you would generate those keys, so I won’t repeat that here. But what I do want to show you is when you have the BigQuery set up, and you’re trying to access the data, if you enable and disable those keys, obviously the data will go away. So you’re going to be able to enforce the restrictions on that data external to the cloud service itself, which is really the key point.
So within BigQuery, like I said, I just created a very small sample set. And I’m going to show you one thing real quick before the data is the details. So since this was pre-set-up, I wanted to point out at the bottom, which hopefully that’s visible, you’ll see it is a customer managed key. From this particular point of view, the way it’s developed– you wouldn’t necessarily be able to tell it’s an EKM key. But when you go and implement it the way Il-Sung discussed, it’ll be clear that it is.
I’ve got it labeled as an EKM 002. For me, when I bring up Fortanix, you’ll see the equivalence to the key that’s in there. OK, so to preview the data, you can just see it’s a very simple database structure– nothing elaborate, just enough that you can show the data. Obviously, I’ve brought the data up so that that would relatively make it apparent that the key to unlock the data is available and is usable from the service.
So if I go over to Fortanix, their SDKMS interfaces– and just to give you a quick representation, the key that I brought up before– you’ll see that I’ve got a key developed here with a similar name. They’re not necessarily the same. But for relational purposes, I do have it in there so I know the relationship to the key that’s in CMEK in GCP. So within the service, the application is set up. It is the service account through the external KMS in GCP.
It’s connectivity that Fortanix has implemented into their service to enable this to work. And in this particular service, I’ve got an enable and a disable feature on the side. So I can just hit the switch there. And from this perspective, the key has now been disabled. To make that apparent, you come back over to the key table. And if I try and do a preview, it’s going to give me an error saying the key itself or the data itself can’t be reconstructed because the key is not available.
So if I circle back out, go back to SDKMS with my application and re-enable that feature, come back to BigQuery– and when you preview the table now, the data is back available. So obviously you know that the access to the key is dependent on the external service under the control of PayPal or the customer, whichever customer is going to be using it. And that’s the key feature to the service.
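[EDITOR'S NOTE: The behavior shown in the demo (disable the external key and the table preview fails; re-enable it and the data comes back) can be modeled with a small Python sketch. Everything here is hypothetical: `ExternalKeyManager`, `CloudTable`, and `KeyUnavailableError` are names invented for illustration and are not the Fortanix SDKMS or BigQuery APIs, and `xor_bytes` is an insecure stand-in for a real cipher.]

```python
import secrets


class KeyUnavailableError(Exception):
    """Raised when the external key manager refuses to use a key."""


def xor_bytes(key: bytes, data: bytes) -> bytes:
    # ASSUMPTION: toy stand-in for a real cipher; symmetric and NOT secure.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class ExternalKeyManager:
    """Hypothetical key manager that lives outside the cloud provider."""

    def __init__(self) -> None:
        self._key = secrets.token_bytes(32)  # the external key never leaves this object
        self.enabled = True                  # the enable/disable switch from the demo

    def _require_enabled(self) -> None:
        if not self.enabled:
            raise KeyUnavailableError("external key is disabled")

    def wrap(self, dek: bytes) -> bytes:
        self._require_enabled()
        return xor_bytes(self._key, dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        self._require_enabled()
        return xor_bytes(self._key, wrapped_dek)


class CloudTable:
    """Hypothetical cloud-side table: holds ciphertext and a wrapped data key only."""

    def __init__(self, rows: bytes, ekm: ExternalKeyManager) -> None:
        dek = secrets.token_bytes(32)
        self._ciphertext = xor_bytes(dek, rows)
        self._wrapped_dek = ekm.wrap(dek)  # the plaintext data key is not kept
        self._ekm = ekm

    def preview(self) -> bytes:
        # Every read has to go back to the external key manager to unwrap the data key.
        dek = self._ekm.unwrap(self._wrapped_dek)
        return xor_bytes(dek, self._ciphertext)


if __name__ == "__main__":
    ekm = ExternalKeyManager()
    table = CloudTable(b"id=1,amount=42", ekm)
    print(table.preview())
    ekm.enabled = False
    try:
        table.preview()
    except KeyUnavailableError as err:
        print("preview failed:", err)
    ekm.enabled = True
    print(table.preview())
```

[The design point the sketch tries to capture is that the cloud side stores nothing that can decrypt the table on its own, so the switch on the external side is what decides whether a preview succeeds.]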
IL-SUNG LEE: That’s great. Well, thank you very much, Greg. That’s really, really a great demonstration. I think what we’d like to do now is to invite Joseph Valente onto the stage. And he is going to talk a little bit about the access justifications.
JOSEPH VALENTE: Thanks, Il-Sung. Thanks, everyone, for coming. I’m crunched in the last thing between your end-of-day snacks, so I’ll keep it short. I know that Il-Sung’s just given a very detailed presentation where he’s walked through all of the control that we get from having our keys in an external key manager. And so at this point, you’re probably wondering, what do I actually get from getting key access justifications as well? I already have this tremendous level of control having my keys outside of Google. What benefit does this give me?
And what I’m going to hope to address in this part of the presentation is precisely what that benefit is and how that works. And I’m going to do that by walking through an example of how encryption works within a service that runs on Google Cloud Platform.
So I want you to imagine that, just like in the demonstration you just saw, you are running a query on BigQuery. But this time, instead of using an externally-managed or customer-managed encryption key, you’re just going to use Google’s default encryption at rest to encrypt your data. So your BigQuery table data is going to be encrypted. And the key is going to live in Google’s internal key management system. And when you run this query, it’s going to fire off a request to that internal key management system saying, hey, please decrypt this table so I can do whatever it is the good customer is asking me to do.
When that decryption event happens, some plain-text data is going to be loaded up by the BigQuery binary. And it’s going to be able to perform your query for you.
So, for example, if your query was, give me all of the sales in Q1 2019, it’s going to go ahead, decrypt all of the rows of data that have the transactions for that quarter, sum them all together, and then return you the result. Now, as you all would know, and as Il-Sung has covered in his presentation, we also have the ability for you to replace those keys with keys that you manage. And those keys are going to live in the customer’s own cloud key management service instance. And what you can do there is have management over the key so that when you delete it, Google can no longer access the data.
Already, I’m sure you’re probably thinking this is getting to a pretty high level of control. I have the keys within the key management service. I can delete them when I no longer want Google to have access. And that access goes away. But there’s a few things that remain. One of those is the BigQuery engineers, or the engineers who are administering or running or developing whatever service it is that you’re using on the cloud. The other thing, of course, is the cloud key management service engineers. So there’s a degree of trust that you need to place in these people. Google already offers industry-leading controls in both these spaces. We have Access Approval and Access Transparency, which very, very heavily lock down what these administrators are able to do. In many cases, actually in BigQuery itself, when you’re using some of these controls, they won’t even have access to the underlying table data. However, there’s still an element of control that’s there. And you will need to obviously read the documentation for these products to understand what their limitations are.
So you want to remove these people from the trust equation as much as you can. Now, when you use the external key manager service, Google’s cloud key management service engineers can no longer access the keys that live inside the external key manager. So that’s a huge leap over what we had previously.
But as you can see, on the left-hand side of this diagram, all of that infrastructure is still Google-managed. And you still have to trust what’s going on there. You have to trust that we’re correctly administering the controls. You have to trust that our administrators have their access locked down. And you also need to trust that the software itself is not doing anything malicious with your data. And so in order to do that, we want to give you visibility into whether or not the requests that are coming to decrypt your data are originating from you, the customer, from the software itself, or from the engineers who are the administrators and developers of that software. And the way in which we’re doing this is by providing you a justification each time we request a decryption from the external key manager.
So let’s walk through some details about what exactly is going to be inside these justifications. So as I mentioned on the previous slide, there are three broad ways in which your data may be decrypted. The first is that you yourself are running some query or some access to your data via the service.
And an example of this is what we saw before. I’m using BigQuery. I’m trying to run a query. And that’s going to need to access my decrypted data in order to work. The second broad category is the decryption events that occur as a result of system operations. Now, from time to time, Google’s systems are going to do certain things.
For example, BigQuery every now and then performs optimizations on your data so that it is much quicker to run a BigQuery query than if the optimization hadn’t taken place. That’s obviously what you’re paying for when you get this high speed and performance from using a GCP service. But you want to have visibility into when that’s happening and understand the boundaries of that activity so that you can be comfortable putting your data onto the cloud. And so we’re going to now, for the first time, give you visibility into that with a justification showing where a system operation is taking place. The third and final set is the broad set in green at the bottom of Google administrative activity. You’ll notice, for those of you who are familiar with our Access Transparency product, that these are the exact same justifications that you get with Access Transparency and Access Approval.
And so you may be wondering at this point, isn’t this just the same set of controls? And the answer is yes, it is. We’re building on the same set of controls. We’re bringing those justifications through all the way to the key manager so that you can actually use encryption to enforce your access policies.
And finally, through the functionality provided by our partners and integrators, you can use blocking policy at your external key manager to block these requests. Now, obviously, if you set a policy that says block all customer-initiated access, you’re going to lose access to your own data. So please be aware if you’re planning to do that. That’s probably not a good idea. But if you were to go and, for example, set a policy that said, block all Google administrative access to my data, that would be a valid policy.
You could continue using your BigQuery service or whatever service it is you happen to be running. And while it may be difficult for you to get support, you would still be able to use the service and have that access enforced at the key manager. Now, at this point, you’re probably saying, that’s really nice that I get a justification and that you’ve got all these controls in place. But what confidence do I actually have that all of these things are going to work in practice? And that’s where our integrity commitment comes into play. So Google Cloud is going to give you a commitment to protect the integrity of the controls and the justifications. And you can find out more about this when you’re on-boarded for the product. So, Key Access Justifications– it’s coming very soon to alpha. I can’t say when, but it’s coming soon. So watch this space.
It’ll be available for Google Compute Engine and Persistent Disk, as well as BigQuery. And it covers the transition from data at rest to data in use. Like I said, it’s going to give you that justification when your data is decrypted. And it’s going to appear at your external key manager. And you can block it if you don’t want it to happen.
Using External Key Manager and Key Access Justifications together is going to make you the ultimate arbiter of access to your data. And that’s really, really important. Why is that really important? Because no other cloud provider is going to give you this level of control. This is something you can only get from GCP. Some little fine print and details– you can’t turn this on yourself right now. Like I said, it’s not actually available just yet. But even when it is available, you’ll be considered based on a variety of criteria. So we’re just looking for customers who are going to help us make the service successful. Please register your interest.
There’s a link in our blog post if you are interested in using this product. Secondly, do you need to use Google’s External Key Manager Service to get Key Access Justifications? The answer is yes. This is only available if you use the External Key Manager. And then finally, what data is in scope for Key Access Justifications? It’s, generally speaking, the same things that are in scope for both customer-managed encryption key access, as well as External Key Manager product. So it’s the highly sensitive data, things like the underlying table data inside BigQuery.
So let’s do a quick demonstration, and let’s walk through a short video of how this might work when I’m using this in a service on GCP. So I want you to imagine that you’re running a query now on BigQuery similar to what we saw before. And I’m going to click the Run button. And my query went ahead and ran as intended. And now I’m going to jump to my External Key Manager software. And I can clearly see here that a customer-initiated access has taken place and that my External Key Manager access policy allowed that access to happen. I’m going to go ahead and refresh my EKM page. And I can now see that there is a new request that’s come through that was authorized, but I didn’t actually do anything.
When I scroll down, I can see that that was actually a Google-initiated system operation and that my access policy allowed that access to happen. So that’s all good. Now let’s imagine that something goes terribly wrong, and you need to contact Google support. You’ve completely lost access to your storage. Things are coming up completely corrupted. You’re seriously worried. You smash all the red buttons you can smash. And Google goes and calls the engineer who is the most highly privileged person for the service who is on call. And that person pulls up a tool. Just FYI, this is not the actual tool. We can’t show you our internal tools, so this is an animation. And they have to put in a justification. And that justification will show you the ticket number of the request that you open up– excuse me, sorry– ticket number of the request that you open up. And it will be validated automatically by our systems to ensure that that ticket was actually something you as a customer opened and that you want this request to be happening.
Once that’s validated, the access will go ahead and take place. But there’s a catch. In this case, you’ve set your External Key Manager access policy to deny all administrative access. And so when you see this customer-initiated support request come in, it will register as a deny, and the access will be denied to the Google administrator. So no matter how high this person’s privileges are, they won’t be able to do it. But in this case, because you really need support, you probably want to change that. So here, we are seeing us updating our EKM access policy to remove that reason from the blocklist. And we’re going to see the administrator going and trying this again and trying to get that access to go through. So we’re going to send this off, and voila. So it’s now allowed. The access has gone through. And we can see the customer-initiated support justification.
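[EDITOR'S NOTE: The policy flow in this walkthrough (allow customer and system access, deny administrative access, then remove the reason from the blocklist so support can proceed) can be sketched as a small Python model. This is an illustration under stated assumptions: the `Justification` labels are taken from the wording used in the talk rather than from any product's reason codes, and `AccessPolicy` is a hypothetical class, not a partner's key manager API.]

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set, Tuple


class Justification(Enum):
    # Labels follow the talk's wording; they are illustrative, not official codes.
    CUSTOMER_INITIATED_ACCESS = "customer-initiated access"
    GOOGLE_INITIATED_SYSTEM_OPERATION = "Google-initiated system operation"
    CUSTOMER_INITIATED_SUPPORT = "customer-initiated support"


@dataclass
class AccessPolicy:
    """Hypothetical key-manager policy: deny any request whose reason is blocklisted."""

    blocklist: Set[Justification] = field(default_factory=set)
    log: List[Tuple[Justification, str]] = field(default_factory=list)

    def decide(self, justification: Justification) -> bool:
        """Return True to allow the decryption request, and record the decision."""
        allowed = justification not in self.blocklist
        self.log.append((justification, "allow" if allowed else "deny"))
        return allowed


if __name__ == "__main__":
    policy = AccessPolicy(blocklist={Justification.CUSTOMER_INITIATED_SUPPORT})

    # The customer's own query and a background system operation go through.
    print(policy.decide(Justification.CUSTOMER_INITIATED_ACCESS))
    print(policy.decide(Justification.GOOGLE_INITIATED_SYSTEM_OPERATION))

    # The support engineer's request is denied while the reason is blocklisted.
    print(policy.decide(Justification.CUSTOMER_INITIATED_SUPPORT))

    # The customer removes the reason from the blocklist and the retry succeeds.
    policy.blocklist.discard(Justification.CUSTOMER_INITIATED_SUPPORT)
    print(policy.decide(Justification.CUSTOMER_INITIATED_SUPPORT))

    for reason, outcome in policy.log:
        print(outcome, "-", reason.value)
```

[Blocklisting `CUSTOMER_INITIATED_ACCESS` in this model would deny the customer's own queries, which matches the warning given earlier about locking yourself out of your own data.]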
So that’s a quick demonstration of what you’re going to get when we ship Key Access Justifications for GCP. But overall, from today’s presentation, I want to just reinforce a few takeaways. So the first thing is, please look out for the release announcements for External Key Manager beta and Key Access Justifications alpha. Both of those are coming soon. Be on the lookout for a blog post. And let us know if you would like to be on-boarded once the products are released. The second thing is, keep your data in our cloud, not your keys. So you can now finally get this unprecedented level of control when you use External Key Manager to secure your data at rest in GCP.
And then finally, if it’s available to you, you can use Key Access Justifications to make yourself the ultimate arbiter over access to your data and get an incredible amount of granularity over that access by using key access policies to govern why your data can be accessed. So that’s all I have for today. I believe we have some time remaining, and we also have some roving microphones. So if there are any questions, I’d like to invite Il-Sung back on stage, and we can take those from you.
IL-SUNG LEE: Are there any questions that anyone has?
AUDIENCE: The External Key Manager used a service account to get access to the on-premise solution. Does that mean the person who needs access to the data also needs access to the service account? Or is that kind of just grouped in together?
IL-SUNG LEE: Yeah, so what’s happening is that the external key manager is authorizing KMS to access that key, right? And so, never mind Key Access Justifications, you’re basically authorizing KMS to access it for whatever need it states. So if it needs to do something for offline compaction or to fulfill customer requests, then that’s what it is. And then Key Access Justifications gives you that extra level of clarity around when you might want to actually say no to the decryption. But yes, you’re actually authorizing the entire service.
AUDIENCE: So do you need to authorize the customer or the analyst who wants to see the data?
IL-SUNG LEE: I’m sorry, could you speak up a little?
AUDIENCE: Do you need to authorize the analyst who wants to see the data to be able to use the service account as well as being able to see the data?
IL-SUNG LEE: No, no. So you don’t have to actually authorize access to the data itself because– like, for example, in our examples, it was BigQuery. BigQuery’s already authorized KMS to access the data. Or rather, KMS has authorized BigQuery to use a key to do the encryption/decryption. And then the EKM is authorizing KMS to actually access outside keys. So it’s a chain of permissions that happens.
AUDIENCE: Thank you.
IL-SUNG LEE: You’re welcome. That was a great question.
AUDIENCE: Yes, hello. My question is about [INAUDIBLE] trust because you ask the external partner to decrypt the data key, if I understand, with the master key. But at the beginning of the encryption, you encrypt it with your data keys. And so, in fact, how can we trust, as customer, that the data keys are still not available on your side?
IL-SUNG LEE: Now, that’s a really, really good point. I mean, you still have to maintain a level of trust in our system because we maintain all the data encryption keys, right? And we have to, because at the scale that we run at and the fact that all this is transparent, you pretty much have to do a system of envelope-type encryption. So this kind of goes back to what Joseph was stating around the commitment that Google makes that we are not doing anything inappropriate. And the way that the system works is the way that we’re professing that it works, right? And so you’re absolutely right.
I can say out loud on the stage here that we, at KMS, or in the Key Management System– we do not actually cache any keys in there. The storage services may do something a little bit differently, like, for example, BigQuery– by default, they basically cache result sets. If you run the same queries over and over, it caches result sets. You can turn off that caching. Or you can encrypt the result sets, in which case, every time you run the query, it has to go outside and get access to the external key. GCE is a little bit different because they actually only need the key to decrypt the persistent disk when it starts up the VM. And once a VM is running, it keeps that instance key in memory so that it can do fast reads and writes to the persistent disk. If that VM is ever brought down and you want to restart it, then it has to go outside to the external key again. So different storage systems may do different things with the keys. But on the KMS side, or the EKM side, we don’t actually cache any of these keys. Thank you. There’s one behind.
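The envelope-type encryption described in this answer can be sketched as follows. This is a toy model under stated assumptions, not how GCP implements it: `ExternalKeyManager` and `StorageService` are hypothetical stand-ins, and the XOR keystream cipher is for illustration only and is not secure. It shows a storage service that keeps only a wrapped data encryption key and, with no key caching, has to call the external manager on every read.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy XOR cipher for illustration only; NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    # zip stops at len(data), so any extra keystream bytes are ignored.
    return bytes(a ^ b for a, b in zip(data, stream))


class ExternalKeyManager:
    """Hypothetical key manager that lives outside the cloud provider."""

    def __init__(self) -> None:
        self._kek = secrets.token_bytes(32)  # key encryption key, never exported
        self.unwrap_calls = 0

    def wrap(self, dek: bytes) -> bytes:
        return _keystream_xor(self._kek, dek)

    def unwrap(self, wrapped_dek: bytes) -> bytes:
        self.unwrap_calls += 1
        return _keystream_xor(self._kek, wrapped_dek)


class StorageService:
    """Hypothetical storage service that stores only wrapped data keys."""

    def __init__(self, ekm: ExternalKeyManager) -> None:
        self._ekm = ekm
        self._records = {}

    def write(self, name: str, plaintext: bytes) -> None:
        dek = secrets.token_bytes(32)  # data encryption key, generated locally
        ciphertext = _keystream_xor(dek, plaintext)
        self._records[name] = (self._ekm.wrap(dek), ciphertext)

    def read(self, name: str) -> bytes:
        wrapped_dek, ciphertext = self._records[name]
        # No caching: every read needs the external manager to unwrap the DEK.
        dek = self._ekm.unwrap(wrapped_dek)
        return _keystream_xor(dek, ciphertext)


ekm = ExternalKeyManager()
store = StorageService(ekm)
store.write("row-1", b"sensitive data")
first = store.read("row-1")
second = store.read("row-1")
print(first, ekm.unwrap_calls)
```

A service that kept the unwrapped key in memory, as described for a running VM, would call `unwrap` once at startup instead of once per read; that difference is the trade-off between speed and per-access external control discussed above.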
AUDIENCE: Hello. If I don’t trust Google to be the custodian of my key, why would I trust another cloud vendor? And if I don’t trust Google, why would I trust you to be honest about your justification?
IL-SUNG LEE: That’s a great question. So there is value for a lot of people in having split responsibility, right? So requiring two independent parties to collaborate to break the system is much different than having trust in just one single system. That’s number one. Number two is that I listed five partners up there. Some of them do provide you with cloud services where they’re basically providing a cloud key management service, right? And so you’ll have to have trust in their system, obviously. Some of the other partners are actually shipping more of the on-premise type of solutions where you’re buying things like infrastructure from them that you can run in your own premises. And as we go further and further along with this whole external key management story, we’re going to make it easier and easier for that scenario to happen. But it is possible where you can actually use some of the solutions to buy the infrastructure, have it in your own physically protected boundary, and then have that interface with our EKM.
AUDIENCE: Yeah, and sorry, just to follow up, but the second part– if I don’t trust you, why should I trust you to be honest about your justification always?
JOSEPH VALENTE: Yes, so to that part, I think the key difference here is that we’re giving you the commitment. So we’re actually just straight out saying, we’re not going to circumvent the controls. We’re going to make sure that what’s in the justifications are true. And that’s a commitment that we’re stating here on stage. And you can take it away. And obviously, if you still don’t trust it, and you really, really don’t trust Google, then there’s nothing we can do beyond that point. But I think this is a very big step forward in that you really have that commitment from us.
IL-SUNG LEE: Yeah, one of the things that I hope is apparent is that you can’t ever eliminate trust in us. But we’d like to make it so that the amount of trust that you have to place in us is kind of at a more comfortable level. And we’re trying to make it such that there’s more and more comfort as we go along with this. But yes, you’re still ultimately putting your plain-text data up into our cloud. But we’re trying to give you additional controls to give you control over that data and when it can be accessed and such. Yeah, there’s a question right there.
AUDIENCE: Hey, so if we have concurrent requests to big data, like key requests, would each one go and request a key, or would they try to fetch the key once and use it for all the requests?
IL-SUNG LEE: It’s somewhat complicated. Most people, what they would do is that if you had different data sets in BigQuery, that if you want the maximum separation of duties, you would use a different key, right? So one data set can use one key. Another data set can use the other key. I mean, they can all be hosted in the same external solution. But in those cases, they’re independent, right? So that key would only live for the lifetime of that one query. And then this key would only live for the lifetime of that one query. If they’re both using the same key, then it really, really depends on how the queries overlap, right? So if they’re not overlapping at all, then obviously, once one query finishes, the other key goes away. In the other case, the key would go away as soon as the key is no longer needed in either case, right? But again, there’s no real caching of the key in that example. It’s just basically in the process as long as it’s needed. And afterwards, it’s actually purged.
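The "key lives only for the lifetime of the query" behavior described here can be sketched with a context manager. This is an illustration under assumptions, not BigQuery's implementation: `fetch_from_external_manager` and `key_for_query` are hypothetical names, and the sketch assumes each query makes its own request to the external key manager, which the talk does not confirm.

```python
from contextlib import contextmanager

fetch_log = []  # records every request made to the (hypothetical) external manager


def fetch_from_external_manager(key_id: str) -> bytes:
    """Hypothetical stand-in for a call out to the external key manager."""
    fetch_log.append(key_id)
    return f"material-for-{key_id}".encode()


@contextmanager
def key_for_query(key_id: str):
    """Hold key material in process only while the query is running."""
    holder = {"key": fetch_from_external_manager(key_id)}
    try:
        yield holder
    finally:
        # Purge the key as soon as the query no longer needs it.
        holder["key"] = None


# Two data sets, each protected by its own key, queried independently.
with key_for_query("dataset-a-key") as a:
    during_a = a["key"] is not None
after_a = a["key"] is None

with key_for_query("dataset-b-key") as b:
    during_b = b["key"] is not None
after_b = b["key"] is None

print(fetch_log)
```

Nothing is cached between the two blocks: each query triggers its own fetch, and the key material is dropped when the block exits.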
AUDIENCE: So if it was the same data set, and we have two requests or two queries that are running at exactly the same time, but they’re not really overlapping in terms of the data they’re accessing, but it’s still like the same table, let’s say, each one will get its own key, even if it’s the same key. I’m talking about the requests from Google to the key manager, the external key manager.
IL-SUNG LEE: Yeah, so that is something that I believe should be a separate request. But I have to check, because now we’re getting into more details around how the underlying BigQuery system works. And so I would feel more comfortable if– come and talk to me afterwards. You know, we’ll exchange information, and I’ll verify for you for sure.
AUDIENCE: Thank you.
IL-SUNG LEE: Yeah. Any other questions that we can answer, folks? Oh, there’s one more.
AUDIENCE: Hello. You told us a little about encrypting BigQuery data. However, several months ago, when we were testing some of our workflows, our setup, which covers breaches or situations where the data is not encrypted, discovered that, for example, hash tables were not encrypted. What does it look like currently?
IL-SUNG LEE: Did you hear that?
JOSEPH VALENTE: Sorry, can you repeat just the last bit from when you were just talking about where you were using BigQuery? Sorry.
AUDIENCE: I’m asking what encrypting hash tables currently looks like– hashes which are created by BigQuery.
IL-SUNG LEE: Oh, hash tables. I’m not certain about that, to be quite honest. I mean, fundamentally what we’re trying to explain is that whenever BigQuery needs to read data off of disk, it needs to come to us, right? And so I’m not sure how the hash tables are manifested within the memory itself. If it’s already in memory, then it wouldn’t need to come to us. But the key thing that has to happen is that whenever data transforms from data at rest to data in use, then it requires usage of the external key.
AUDIENCE: The problem is that this type of data can last for 24 hours.
IL-SUNG LEE: The cache– you’re talking about the cached result sets? Oh, I see. Yeah, so in that case, they don’t need to reach out to us because BigQuery on their side has already cached result sets for that particular query, and it has nothing to do with us at that point. They no longer need to move that data from at rest to in use. And so they would be able to actually go and query that without actually hitting the external key manager.
AUDIENCE: OK. Thank you.
IL-SUNG LEE: And that’s why I was saying earlier that you can potentially do a couple of things. You could turn off the cached results. Or if you encrypt your result set, then it requires a call to the external key manager every single time you actually run the query. So those are two things you can do. Good question. Anything else we can help with? Going once, twice, all right. Thank you very much, everyone.
JOSEPH VALENTE: Thanks everyone.