Best Practices for Infrastructure as a Service (IaaS) in Microsoft Azure
According to Gartner, by 2020 worldwide public cloud services will account for $383 billion, roughly a fifth of which (19%) will be Infrastructure as a Service. In a June 2017 report, Gartner named Microsoft a leader in its Magic Quadrant for Cloud Infrastructure as a Service, Worldwide, along with Amazon Web Services. A successful cloud implementation requires a secure foundation and a sound architectural design. Increasingly complex environments, loss of visibility and control, and unexpected billing are a few of the issues that make users re-evaluate their cloud strategy. In another study, Gartner predicts that through 2020 public cloud IaaS workloads will suffer at least 60% fewer security incidents than those in traditional data centers. While metered service (pay per use) is the promise of the cloud, few organizations actually take advantage of it. At a Gartner conference in 2016 I had the opportunity to watch a demonstration by one large cloud customer in which an automated bot powers down servers that are turned on but not in use, saving the company thousands of dollars per month.
Drawing on my experience with three big cloud providers (AWS, Azure and Rackspace), I will share best practices for building a secure architectural design in Microsoft Azure. Enterprise boundaries with perimeter firewalls and tightly guarded digital assets in the data center are disappearing, making way for apps and data that live in the cloud. It is therefore critical that a scalable infrastructure design be put in place from the ground up.
Microsoft Azure uses Network Security Groups (NSGs) to control incoming and outgoing traffic to and from Azure hosts and subnets. An NSG is a simple stateful packet filter; it does NOT perform full packet inspection the way a session-layer firewall does. An NSG has no protocol validation, intrusion detection or intrusion prevention capability. A group of firewall rules can be configured for the NSG and applied to a subnet or to a host network interface card (NIC). This group of rules allows or denies traffic for a given source or destination. NSGs use a 5-tuple to evaluate traffic:
· Source and destination IP address
· Source and destination port
· Protocol: TCP or UDP
This means that traffic can be controlled between a single VM and a group of VMs, between one VM and another single VM (good practice for a DMZ), or between entire subnets. Blocking traffic from one VM to another within the same VLAN is difficult or impossible to achieve with a traditional firewall, so Azure NSGs have an advantage here. An NSG also comes with built-in default rules that you should be aware of: deny all inbound traffic from the internet, allow all traffic within the virtual network, allow inbound traffic from the Azure load balancer, and allow all internet-bound traffic.
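To make the 5-tuple matching concrete, here is a minimal Python sketch of how a prioritized rule list can be evaluated: rules are checked in ascending priority order and the first match decides. It is a simplified model, not Azure's implementation. The `Rule` class, the `evaluate` function and every address are hypothetical, service tags are replaced by plain CIDR prefixes, and statefulness and the separate subnet-level and NIC-level NSGs are ignored.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network

ANY_PORT = range(0, 65536)


@dataclass(frozen=True)
class Rule:
    """One simplified NSG-style rule; all field names are illustrative."""
    priority: int            # lower numbers are evaluated first
    action: str              # "Allow" or "Deny"
    protocol: str            # "TCP", "UDP" or "*" for any
    src: str                 # source CIDR prefix
    dst: str                 # destination CIDR prefix
    dst_ports: range         # destination port range
    src_ports: range = ANY_PORT


def evaluate(rules, protocol, src_ip, src_port, dst_ip, dst_port):
    """Return the action of the first rule that matches the 5-tuple."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if (rule.protocol in ("*", protocol)
                and ip_address(src_ip) in ip_network(rule.src)
                and ip_address(dst_ip) in ip_network(rule.dst)
                and src_port in rule.src_ports
                and dst_port in rule.dst_ports):
            return rule.action
    return "Deny"  # assumption of this sketch: no match means deny


# Hypothetical inbound rules for a DMZ subnet 10.10.3.0/24 inside 10.10.0.0/20.
rules = [
    # Block host-to-host traffic inside the DMZ.
    Rule(100, "Deny", "*", "10.10.3.0/24", "10.10.3.0/24", ANY_PORT),
    # Let HTTPS reach the DMZ from anywhere.
    Rule(200, "Allow", "TCP", "0.0.0.0/0", "10.10.3.0/24", range(443, 444)),
    # Stand-in for the built-in "allow within the virtual network" rule.
    Rule(65000, "Allow", "*", "10.10.0.0/20", "10.10.0.0/20", ANY_PORT),
    # Stand-in for the built-in "deny all inbound" rule.
    Rule(65500, "Deny", "*", "0.0.0.0/0", "0.0.0.0/0", ANY_PORT),
]

print(evaluate(rules, "TCP", "10.10.3.4", 50000, "10.10.3.5", 22))
print(evaluate(rules, "TCP", "203.0.113.7", 50000, "10.10.3.5", 443))
```

Because the first match wins, the order of priorities matters: if the HTTPS allow rule had a lower priority number than the DMZ deny rule, DMZ hosts could reach each other on port 443.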
Azure allows an IPsec tunnel to your data center through a VPN gateway. From a security perspective this could be the weakest link, so weigh the risk of which digital assets inside the data center would be exposed if a host in Azure were compromised.
Network Segmentation: Segment the virtual network as you would in the data center and apply the firewall rules closest to the destination. So, if the entire address space going to Azure is a /20, create multiple /24 subnets as follows: gateway, VIP, production server subnet, production DMZ subnet, QA server subnet, QA DMZ subnet, Dev/sandbox server subnet, Dev/sandbox DMZ subnet, etc. Treat these subnets as you would segments behind any other firewall in the data center. That means, if you have a rule saying production servers can communicate with any other production servers, allow the same privileges to the Azure production subnet. Also, block host-to-host communication in the DMZ. It is important to build a base ruleset so that universal services like DNS, NTP, SNMP, monitoring etc. are allowed by default when a new VM host is configured in Azure. Once the rule base is set up, you can review the effective rules by selecting the NSG -> Effective Security Rules.
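The /20-to-/24 split described above can be sketched with Python's standard `ipaddress` module. The address space `10.10.0.0/20`, the role names and the `plan_subnets` helper are assumptions made for illustration; substitute the range actually assigned to your Azure virtual network.

```python
from ipaddress import ip_network

# Hypothetical address space; replace with the /20 actually assigned to Azure.
VNET = ip_network("10.10.0.0/20")

# Subnet roles taken from the text, in the order they are carved out.
ROLES = [
    "gateway", "vip",
    "prod-server", "prod-dmz",
    "qa-server", "qa-dmz",
    "dev-server", "dev-dmz",
]


def plan_subnets(vnet, roles, new_prefix=24):
    """Map each role to the next unused /24 block inside the virtual network."""
    blocks = list(vnet.subnets(new_prefix=new_prefix))
    if len(roles) > len(blocks):
        raise ValueError("not enough address space for the requested roles")
    return dict(zip(roles, blocks))


plan = plan_subnets(VNET, ROLES)
for role, block in plan.items():
    print(f"{role:12} {block}")
```

A /20 contains sixteen /24 blocks, so the eight roles above leave eight blocks free for later growth.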
Audit: Most log analysis tools (Splunk and Sumo Logic included) offer add-ons for Microsoft cloud services. These add-ons can pull activity logs, service status, operational messages, Azure audit data, Azure resource data, and Azure Storage Table and Blob data from a variety of Microsoft cloud services using the Office 365 Management APIs, the Azure Service Management APIs and the Azure Storage API.
Monitor for abnormalities: Once the base security features are set up, data is being audited, and baselines have been created, it is important to monitor for abnormalities. Do this from both a functionality (e.g., speed) and a security (e.g., privileged access) point of view.
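One simple way to compare new observations against a baseline is to flag values that sit several standard deviations away from the baseline mean. The sketch below is only an illustration of that idea: the `find_abnormal` function, the threshold of 3 and the sample counts are all assumptions, and a real deployment would more likely rely on the alerting built into the log analysis tool.

```python
from statistics import mean, stdev


def find_abnormal(baseline, observed, threshold=3.0):
    """Return observed values more than `threshold` standard deviations
    away from the mean of the baseline samples."""
    mu = mean(baseline)
    sigma = stdev(baseline)
    if sigma == 0:
        # A perfectly flat baseline: any different value is abnormal.
        return [x for x in observed if x != mu]
    return [x for x in observed if abs(x - mu) / sigma > threshold]


# Hypothetical daily counts of privileged sign-ins during the baseline period.
baseline_logins = [4, 5, 3, 6, 5, 4, 5]
todays_counts = [5, 4, 19]

print(find_abnormal(baseline_logins, todays_counts))  # prints [19]
```

The same function works for performance figures such as response times, which covers the functionality side of monitoring as well as the security side.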