Why Managing Configuration Drift is Critical in the Cloud


Identifying and Managing Drift is Vital to Your Organization’s Success

What Is Configuration Drift?
Organizations have been working to reduce misconfigurations in their environments for as long as developers have been writing software. Unfortunately, configuration drift in the cloud is inevitable. Even companies with the best of intentions, which have worked through cloud adoption scenarios and based their infrastructure on the well-architected frameworks, frequently find that their environments have diverged from the baseline.

Configuration drift occurs when the actual state of the infrastructure differs from its last defined configuration. This can happen for a variety of reasons, and the consequences range from a simple annoyance to drastic failures in production. Drift can result from adding or removing resources or from changing existing resource definitions, and it can be introduced manually or by automation tools. In and of itself, drift isn’t necessarily bad. But given how severe the consequences can be and how quickly the cloud changes, it is vital to be aware of drift and to have a process in place for managing unintended, costly misconfigurations.
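To make the definition concrete, here is a minimal sketch of drift detection as a comparison between two snapshots. It assumes each snapshot is a plain dictionary keyed by resource ID; the function name `detect_drift`, the resource IDs, and the settings are hypothetical and chosen only for illustration, not taken from any particular tool.

```python
from typing import Any


def detect_drift(baseline: dict[str, dict[str, Any]],
                 current: dict[str, dict[str, Any]]) -> dict[str, Any]:
    """Compare two snapshots keyed by resource ID and report drift."""
    # Resources that exist now but were not in the baseline, and vice versa.
    added = sorted(current.keys() - baseline.keys())
    removed = sorted(baseline.keys() - current.keys())

    changed = {}
    for resource_id in sorted(baseline.keys() & current.keys()):
        old, new = baseline[resource_id], current[resource_id]
        # Record every setting whose value differs. A setting missing on
        # one side is treated as None for the comparison.
        diffs = {
            key: {"baseline": old.get(key), "current": new.get(key)}
            for key in sorted(old.keys() | new.keys())
            if old.get(key) != new.get(key)
        }
        if diffs:
            changed[resource_id] = diffs

    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical snapshots: one VM was resized and one resource was added.
baseline = {
    "vm-web-01": {"vmSize": "Standard_B2s", "region": "eastus"},
    "nsg-web": {"allow_rdp_from_internet": False},
}
current = {
    "vm-web-01": {"vmSize": "Standard_D4s_v3", "region": "eastus"},
    "nsg-web": {"allow_rdp_from_internet": False},
    "storage-logs": {"public_access": False},
}

drift = detect_drift(baseline, current)
```

The result separates the three kinds of drift described above (added resources, removed resources, and changed definitions), which makes it easier to decide which changes need attention.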

Production Stability via Baseline Enforcement
A cloud baseline is a high-fidelity ‘snapshot’ of the environment in its known, approved state. Establishing one is an important first step in cloud security and in tracking configuration drift. Maintaining an accurate baseline, and understanding the direct impact drift has on the security and stability of production environments, cannot be overlooked. Yet the very advantages that drew so many to the cloud (scalability, flexibility, rapid implementation, and ease of spinning up resources) make an accurate baseline hard to maintain. Comparing baseline configurations in spreadsheets in the hope of catching drift is a thing of the past; that approach cannot deliver the accuracy and speed needed to stay secure in the cloud. The continuous snapshot gathered by OpsCompass is a modern solution to this ongoing problem: it provides visibility into the specific settings of each resource and notifies you of misconfigurations in near real time.

Additionally, the CIS and NIST CSF frameworks are a great place to start for a policy framework built on industry best practices. However, even these standards have limitations when it comes to monitoring configuration drift. Resources can be misconfigured in several ways that do not violate the benchmarks provided for each of the major public clouds, and company-specific guidelines also need to be tracked to ensure staff are adhering to internal policies.
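As an illustration of company-specific guidelines that sit outside the public benchmarks, the sketch below evaluates a resource against a small set of internal rules. Everything in it is an assumption made for the example: the policy names, the required `owner` tag, the approved regions, and the shape of the resource dictionary are hypothetical and do not come from CIS, NIST CSF, or OpsCompass.

```python
from typing import Any, Callable

Resource = dict[str, Any]
Policy = Callable[[Resource], bool]

# Hypothetical internal policies. Each check returns True when the
# resource complies with the rule.
INTERNAL_POLICIES: dict[str, Policy] = {
    "has-owner-tag": lambda r: "owner" in r.get("tags", {}),
    "approved-region": lambda r: r.get("region") in {"eastus", "westus2"},
}


def find_violations(resource: Resource,
                    policies: dict[str, Policy]) -> list[str]:
    """Return the names of the policies the resource fails, in policy order."""
    return [name for name, complies in policies.items()
            if not complies(resource)]


# A resource that could pass a public benchmark yet break internal rules.
resource = {
    "id": "vm-web-01",
    "region": "westeurope",
    "tags": {"env": "prod"},
}

violations = find_violations(resource, INTERNAL_POLICIES)
```

Keeping internal rules as named checks alongside the benchmark results is one way to track both in the same audit; the right structure will depend on the tooling already in use.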

Improve Visibility with Frequent Cloud Audits
Under each public cloud’s shared responsibility model (AWS, Azure, GCP), the cloud provider is responsible for the security of the cloud, while the enterprise is responsible for security in the cloud (for example, of services, data, and access to apps). Gaining complete visibility and an accurate understanding of cloud baselines can be a challenge, especially for a multi-cloud enterprise, but with the right tools and processes, organizations can stay on top of misconfiguration errors.

A great first step is to conduct regular and consistent internal cloud audits. This can involve locating risks, accounting for updated policies, understanding and planning for vulnerabilities, evaluating (and re-evaluating) controls, and creating a risk assessment plan to address all of these factors. A clearly defined process ensures that changes in the cloud line up with a company’s compliance standards, its policies, and the needs of its users. Identifying common configuration drift early in the pipeline allows for a deeper investigation into how better guardrails can be put in place to ensure secure deployments. As a cloud management software solution, OpsCompass offers the flexibility of any-time audits that remain accurate with every change that takes place in the cloud, highlighting and categorizing configuration drift as it happens in real time.

How to Identify Configuration Drift
One of the main reasons configuration drift can be so destructive is that, like a small leak behind a wall in a house, it can go unnoticed if no one is consistently looking for it, slowly undermining the foundation of your infrastructure. Then, when the wall finally cracks, it takes time to identify the root cause of the configuration drift that started it all, and time is a precious resource in an emergency.

Below we’ll take a look at a few common examples of configuration drift. Most of them occur for a valid reason, but if they go unnoticed over time, they can lead to bigger issues. Accompanying the examples are screenshots of the data found by OpsCompass within moments of the drift taking place in the public cloud, essentially showing the user where to look to ensure that a leak is not forming.

1.) In this case, there is configuration drift in a network security group because a user opened RDP to the internet. Firewall changes are not always a reason to sound the alarm, as there are many legitimate reasons for a change like this, but it is worth being made aware of so you can confirm that proper precautions are in place.

Config-Drift-Critical_1

2.) In this example, there is configuration drift on a virtual machine because a user scaled up its SKU (the `vmSize` setting). A unique feature of OpsCompass’ configuration drift detection is the ability to anticipate the cost impact of a change, even before usage charges occur. The user can compare the current state with the previous state and acknowledge the change.

Config-Drift-Critical_2

3.) In this case, there is configuration drift within a network security group where a user removed the more secure HTTPS rule, leaving only HTTP. Notice that OpsCompass surfaces that this deployment was originally part of a cloud adoption framework (CAF) blueprint and was later changed by Tom. Tracking changes that take place outside of normal deployment methods is another vital component of improving the security posture of a cloud environment.

Config-Drift-Critical_3

4.) This example shows configuration drift that occurred because an AWS user had been inactive for 90 days, so their status was changed from “Active” to “Inactive”. The drift was the result of an AWS system-initiated change rather than of an individual making a change. Interestingly, though, it was the user’s lack of action that caused this drift to result in a potential vulnerability.

Config-Drift-Critical_4

5.) In this case, there is configuration drift because a refresh token expired on a connection to Office 365 for workflow integration. Again, no user made a change. The drift was due to inactivity: because no one updated the token, the connection is flagged as broken.

Config-Drift-Critical_5
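The two network security group examples above (RDP opened to the internet, and HTTPS removed so that only HTTP remains) can also be expressed as simple checks over firewall rules. The sketch below is a simplified model written for illustration, not how OpsCompass or any cloud provider represents rules: the `Rule` class, the helper names, and the sample rules are all hypothetical, and each rule is assumed to cover a single port.

```python
from dataclasses import dataclass

# Source values treated here as "the whole internet". This set is an
# assumption for the sketch; real rule formats vary by provider.
INTERNET_SOURCES = {"*", "0.0.0.0/0", "Internet"}

RDP_PORT, HTTP_PORT, HTTPS_PORT = 3389, 80, 443


@dataclass(frozen=True)
class Rule:
    """Simplified inbound firewall rule covering a single port."""
    name: str
    port: int
    source: str
    access: str = "Allow"


def rdp_open_to_internet(rules: list[Rule]) -> list[str]:
    """Names of rules that allow RDP from any internet address."""
    return [r.name for r in rules
            if r.access == "Allow"
            and r.port == RDP_PORT
            and r.source in INTERNET_SOURCES]


def https_removed(baseline: list[Rule], current: list[Rule]) -> bool:
    """True if HTTPS was allowed in the baseline but only HTTP remains now."""
    def allowed_ports(rules: list[Rule]) -> set[int]:
        return {r.port for r in rules if r.access == "Allow"}

    before, after = allowed_ports(baseline), allowed_ports(current)
    return HTTPS_PORT in before and HTTPS_PORT not in after and HTTP_PORT in after


# Hypothetical baseline and current rule sets for one security group.
baseline_rules = [
    Rule("allow-https", 443, "Internet"),
    Rule("allow-http", 80, "Internet"),
]
current_rules = [
    Rule("allow-http", 80, "Internet"),
    Rule("allow-rdp", 3389, "*"),
]

exposed_rdp = rdp_open_to_internet(current_rules)
lost_https = https_removed(baseline_rules, current_rules)
```

Checks like these only flag a change for review. As noted in the first example, an open port may be legitimate, so the output is a prompt to confirm that precautions are in place rather than proof of a problem.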


Misconfiguration: A Prefix or a Pre-Fix
As adoption of the cloud grows, so does the opportunity for misconfiguration and exploitation. Mismanagement of these risks and vulnerabilities can not only cripple a company’s IT infrastructure, but also significantly impact a company’s reputation and the trust of its customers.

That’s where Cloud Security Posture Management (CSPM) comes in. CSPM ensures that IT Operations and Cloud Engineering teams have visibility into the ever-evolving security and compliance posture of their public cloud resources and SaaS environments, and provides the tools to better protect a company’s data, employees, and customers.

Whether it is securing your company’s cloud infrastructure, or reducing the risks associated with SaaS applications, OpsCompass provides CSPM that helps protect against misconfigurations and other vulnerabilities your company may face.

Are you concerned about configuration drift? OpsCompass can help.
