The other day, I was chatting with one of my old buddies who works in the information security sector about containers. The conversation quickly turned into a discussion about container isolation (any wonder, when two security-minded guys are talking? :^)). As we discussed various approaches for isolating containers, and the pros and cons of each approach, we agreed that there seems to be a lot of confusion over isolation in general. Hence, this post, in which I will attempt to explain the typical approaches taken for isolating applications, application modules or components. I will also offer a proposal for a new approach to container isolation that is based on the actual behavior of an application.
What is Isolation?
Isolation: “the process or fact of isolating or being isolated.”
The definition clearly indicates that isolation is the result of external enforcement. Just as in the real world, isolation can be enforced on a particular aspect or capability of an entity. As an example, ‘political isolation’ is an enforcement applied to a person, restricting him from using his political capabilities to operate as a politician. In the application security realm, isolation has come to be defined at a much more granular level: it is applied at the level of individual features rather than at the level of a capability (which can be thought of as a feature set). Specifically, rather than “isolating” an application from performing any network operations, the application is “isolated” from, say, listening on ports or establishing outgoing connections.
The concept of isolation is derived from the justifiable fear that, with unrestricted capabilities, a misbehaving application could cause problems. The misbehavior could be the result of a bug in the application or of the application being controlled by an external entity. The resulting problems could be that: a) unauthorized users gain access to data or services that they shouldn’t have access to (compromise); b) the application becomes unavailable to its rightful users (Denial of Service); or c) the application is used to reach other applications with the intent of compromising them and/or denying their service. With isolation, an application is restricted to a predefined set of capabilities, so that if the application were to misbehave, its misbehavior would be confined to those predefined capabilities.
An application, be it an app running on a mobile endpoint or a backend server application, could be thought of as operating along three key capability dimensions: Network, I/O and Application Tiers. The first two dimensions are self-explanatory. The third one refers to the capabilities captured in each tier of an application and the interactions among those tiers.
Of course, the third dimension overlaps with the first two, as the communication among the different tiers could occur over the network or through I/O. Of the three dimensions, network is the one that can be most easily observed, whether at an intra- or inter-application level, because explicit proof of interaction (packets) is exchanged over the network fabric (wired or wireless). Hence, over the past two decades, network isolation has become synonymous with application isolation. Over the years, network isolation has also moved a level down, so that it is applied to the modules of an application (which could represent the various tiers of an N-tiered application) as well.
Additionally, as storage got detached from compute, for redundancy and reliability, in the form of Network File System (NFS) servers or Network Attached Storage (NAS), I/O traffic showed up on the network. Hence, I/O-related isolation could also be defined in terms of network isolation.
Defining application isolation along the third dimension of Application Tiers, and doing so automatically, remains the most challenging aspect of forcing applications to behave normally. But such an isolation enforcement represents the best approach, because it depends on understanding the normal functioning of an application in its various tiers and restricting each tier to only those capabilities that it requires to perform its tasks at any given point in time. In other words, such an isolation enforcement requires understanding the various states in which an application operates, the events that it deals with in each of those states, and the actions it takes in response to those events. Basically, it means creating a State Transition Diagram for the application, and then using that diagram to define and enforce isolation policies for each of those states.
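To make the idea of per-state policies a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the three states, the event names and the capability names are hypothetical, and a real application would have a far richer state transition diagram.

```python
from enum import Enum, auto


class State(Enum):
    STARTUP = auto()
    SERVING = auto()
    SHUTDOWN = auto()


# Hypothetical capability names; a real system would map these to concrete controls.
POLICY = {
    State.STARTUP: {"read_config", "bind_port"},
    State.SERVING: {"accept_connection", "read_data", "write_data"},
    State.SHUTDOWN: {"write_data"},
}

# The state transition diagram: (current state, event) -> next state.
TRANSITIONS = {
    (State.STARTUP, "ready"): State.SERVING,
    (State.SERVING, "stop"): State.SHUTDOWN,
}


class IsolatedApp:
    def __init__(self):
        self.state = State.STARTUP

    def handle_event(self, event):
        """Move to the next state, rejecting events the diagram does not define."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(f"event {event!r} is not valid in state {self.state.name}")
        self.state = TRANSITIONS[key]

    def is_allowed(self, capability):
        """A capability is permitted only if the current state's policy lists it."""
        return capability in POLICY[self.state]


app = IsolatedApp()
print(app.is_allowed("bind_port"))  # True: binding a port is expected during startup
app.handle_event("ready")
print(app.is_allowed("bind_port"))  # False: no longer needed once serving
```

The point of the sketch is that the same capability (binding a port) is allowed in one state and denied in another, which a single static policy cannot express.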
In terms of defining an application’s behavior, the dimension of Application Tiers logically sits on top of the dimensions of Network and I/O.
Static versus Dynamic Isolation
The current network-based isolation solutions are all examples of static isolation. Before someone jumps up and points me to the existence of “dynamic network isolation” solutions, let me clarify what I mean by the lack of dynamism in such solutions. None of these solutions offers the capability to change the network isolation policies based on whether or not an application requires network capabilities in each of its application states. All they offer is the capability to define and push network policies from a central panel, with those policies enforced as soon as they (or changes to them) are received by the network appliances.
To define it more explicitly: based on the context I have provided thus far, a dynamic isolation enforcement is one whose policy is created automatically, by learning the states of an application and its behavior in each of those states, and which adjusts itself automatically depending on the state in which the application is operating. Perhaps the ideal term for such an isolation is ‘Elastic Isolation’.
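As a rough illustration of what “created automatically, by learning” could mean, here is a small Python sketch. It assumes that some mechanism already records (state, action) pairs while the application behaves normally; the trace, the state names and the action names below are all hypothetical.

```python
from collections import defaultdict


def learn_policy(observations):
    """Build a per-state allowlist from (state, action) pairs seen during normal operation."""
    policy = defaultdict(set)
    for state, action in observations:
        policy[state].add(action)
    return dict(policy)


def check(policy, state, action):
    """Return True only if the action was observed in this state during learning."""
    return action in policy.get(state, set())


# Hypothetical trace recorded while the application behaved normally.
trace = [
    ("startup", "read_config"),
    ("startup", "bind_port"),
    ("serving", "accept_connection"),
    ("serving", "read_data"),
]

policy = learn_policy(trace)
print(check(policy, "serving", "read_data"))  # True: seen while serving
print(check(policy, "serving", "bind_port"))  # False: only ever seen during startup
```

This is only the simplest possible learner (an allowlist of what was observed); it says nothing about how to detect states or how much observation is enough, which are the hard parts.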
External versus Internal Isolation
Before wrapping up our discussion on isolation, let’s examine the isolation techniques from a different perspective. In the above sections I explained the isolation enforcement from the perspective of capabilities of the application that is being isolated. Now, let’s look at the enforcement points. There are three locations from where isolation policies could be enforced on an application: 1) external to the application; 2) at the periphery of the application; and 3) from within the application.
Network isolation enforcements, which could also encapsulate I/O enforcements (as explained above), are typically applied on the application from outside. It’s also important to clarify here that such isolation approaches sometimes operate not at the application level but at the level of the virtual machine (VM) in which the application is running. Additionally, if an application is split across multiple VMs, network isolation policies could be applied to the application components by applying them to the individual VMs. The same goes for applications or application components/processes running in containers. Often, this is referred to as container isolation, but is it? Not really, because it isolates a container only along the network dimension, while the container’s other dimensions, as defined above, are still exposed to or available for attacks.
The periphery-level enforcements are typically applied through an approach known as application sandboxing. This type of isolation operates on the processes (as in, OS processes) that an application comprises and entails restricting those processes to a predefined set of OS system calls. The primary goal behind such a sandboxing implementation is to limit the kernel surface area (in terms of the system calls) that is exposed to the application, thereby limiting the scope of the problems that the application can cause if it misbehaves. SELinux- and AppArmor-based application sandboxing are examples of such enforcements, although strictly speaking they confine what a process may access through mandatory access control policies rather than by filtering system calls directly. SECCOMP-BPF, which does filter system calls, is an interesting example of an intersection between network isolation and application sandboxing, since it expresses its filters as BPF (Berkeley Packet Filter) programs, a mechanism originally built for filtering network packets.
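The allowlist idea behind system call sandboxing can be sketched in a few lines of Python. To be clear, this is a simulation of the decision logic only; it does not install a real filter in the kernel, and the particular allowlist and the helper names `filter_syscall` and `audit` are my own assumptions for illustration.

```python
# Hypothetical allowlist for a process that only reads input and writes output.
ALLOWED_SYSCALLS = frozenset({"read", "write", "close", "exit_group"})


def filter_syscall(name, allowed=ALLOWED_SYSCALLS):
    """Mimic a sandbox decision: allow listed calls, deny everything else."""
    return "ALLOW" if name in allowed else "DENY"


def audit(trace, allowed=ALLOWED_SYSCALLS):
    """Return the calls in a trace that the sandbox would have denied."""
    return [name for name in trace if name not in allowed]


print(filter_syscall("read"))                         # ALLOW
print(audit(["read", "socket", "write", "ptrace"]))   # ['socket', 'ptrace']
```

The smaller the allowlist, the smaller the kernel surface area that a misbehaving process can reach.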
As I mentioned at the beginning of this post, I believe what we need is a new and more effective approach to container security: enforcement from within the application itself. A more appropriate term to describe such an enforcement would be ‘Externally Controlled Restraint’. The term “restraint” makes sense here since the enforcement action comes from a part of the application itself (compare this to the inner conscience of humans), whereas the enforcement policy comes from an external entity. Couple this enforcement measure with enforcement along the three dimensions of an application, as explained above, and you have a sophisticated, holistic enforcement approach that can be applied at the level of an individual container. It is an enforcement that automatically understands and adjusts itself for applications that are split into various modules, components and microservices, each of which could run in a separate container. This is true container isolation, and it represents the most effective approach for implementing secure containers.
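Here is one way the split between an internal enforcement point and an external policy could look, as a minimal Python sketch. The `Restraint` class, the capability names and the JSON policy format are all hypothetical; the JSON string simply stands in for a policy delivered by some external controller.

```python
import functools
import json


class RestraintError(PermissionError):
    """Raised when the in-application enforcer blocks an operation."""


class Restraint:
    """In-application enforcement point; the policy itself comes from outside."""

    def __init__(self):
        self._allowed = set()  # deny everything until a policy is loaded

    def load_policy(self, policy_json):
        # The policy document is assumed to arrive from an external controller.
        self._allowed = set(json.loads(policy_json)["allowed"])

    def guard(self, capability):
        """Decorator that checks the current policy before running the function."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                if capability not in self._allowed:
                    raise RestraintError(f"{capability} is not permitted")
                return func(*args, **kwargs)
            return wrapper
        return decorator


restraint = Restraint()


@restraint.guard("read_data")
def read_record(key):
    return f"record:{key}"


@restraint.guard("open_connection")
def call_peer(host):
    return f"connected:{host}"


restraint.load_policy('{"allowed": ["read_data"]}')
print(read_record("42"))  # record:42
try:
    call_peer("db.internal")
except RestraintError as err:
    print(err)  # open_connection is not permitted
```

Because the check reads the policy at call time, an updated policy pushed from outside takes effect on the very next call, without touching the application code.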
I would love to hear your thoughts on this approach. I have deliberately skipped one critical piece that is needed to make the approach airtight from a security perspective; I will add it as an update to this post in a week or so. :^)