The possibilities for cost optimisation can be roughly divided into the following six measures, which I will describe in this article:
The most important point, however, is that the cloud engineer knows the concepts and billing models of the cloud in detail. To really save costs in the cloud, its advantages must actually be exploited.
This requires a change in mindset. Many of the classic data centre approaches do not lead to success in the cloud. Cloud properties such as elasticity (see box), automatic scaling or compute-on-demand must be designed for and used. It is imperative to take advantage of the fact that in cloud computing the computers are only rented, not purchased: there are no high upfront investment costs, and operating costs accrue only for the rental period.
The most effective and also most obvious way to save costs is to delete resources that are no longer used. In doing so, we return the compute and storage capacity and no longer use anything. Although this is obvious, the point is worth mentioning here. In too many customer projects, we experience environments in which resources exist that have not been needed for years and cause unnecessary costs. To prevent this, consistent governance and regular checks of the Azure environment are mandatory.
The second way to optimise costs is to switch resources off. Whether a resource can be switched off at all depends on its type. While PaaS services often cannot be switched off, this is possible with IaaS workloads, i.e. virtual machines. Especially when virtual machines are used by employees during working days, it makes sense to switch them off at night and over the weekend and to switch them on only when needed. The cost savings are massive: costs drop to about a third, not even counting holidays. The same applies to test environments that are only used sporadically during larger test runs.
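A rough back-of-the-envelope calculation shows where the "about a third" comes from. The schedule below (12 hours on weekdays) is an assumption for illustration; it presumes per-hour billing, which is how Azure VMs are charged.

```python
# Back-of-the-envelope: savings from running a VM only during working hours.
# The 07:00-19:00 weekday schedule is an assumed example; real hours vary.

HOURS_PER_WEEK = 24 * 7        # 168 hours of always-on billing per week
WORK_HOURS_PER_WEEK = 5 * 12   # e.g. switched on 07:00-19:00, Monday-Friday

def weekly_cost(rate_per_hour: float, hours: float) -> float:
    """Cost for one week at a given per-hour rate."""
    return rate_per_hour * hours

always_on = weekly_cost(1.0, HOURS_PER_WEEK)          # normalised rate of 1.0/hour
office_hours = weekly_cost(1.0, WORK_HOURS_PER_WEEK)

print(f"always on:      {always_on:.0f}")
print(f"office hours:   {office_hours:.0f}")
print(f"remaining cost: {office_hours / always_on:.0%}")
```

With this schedule the VM bills for 60 of 168 hours, i.e. roughly 36 percent of the always-on cost, which matches the "about a third" figure even before holidays are deducted.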
One property of the cloud is that it is "elastic". But what does that mean exactly? Elastic means that the capacity of cloud solutions can be matched to demand. Thanks to the cloud infrastructure, a lot of capacity can be drawn at exactly the moment when a lot of capacity is in demand. This can be exploited daily (we reduce capacity at night), weekly (plenty of capacity is provided during the day on weekdays) or seasonally (we increase capacity for Black Friday). The beauty of elasticity is that it is directly reflected in the costs. In contrast to the classic data centre, where hardware procurement cycles have to be planned over several years, costs in the cloud can be optimised from minute to minute. The extreme example of elasticity is compute-on-demand, where batch jobs are executed, e.g. for daily, monthly or annual financial statements. The compute resources can be ramped up exclusively for the execution of these jobs and ramped down again once the computation is complete. If we take the example of the "Beast" VM with 416 cores and use it for only one hour every working day, the costs drop from $95,000 per month to around $2,550 per month.
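The "Beast" figures above can be reconstructed from per-hour billing. The calculation below uses the article's $95,000 monthly figure and Azure's 730-hour billing month; the 20 working days per month is my assumption.

```python
# Compute-on-demand sketch for the 416-core "Beast" VM from the text.
# Monthly always-on cost is taken from the article; working days are assumed.

MONTHLY_COST_ALWAYS_ON = 95_000   # USD per month, from the article
HOURS_PER_MONTH = 730             # Azure's billing convention for one month
WORKING_DAYS_PER_MONTH = 20       # assumption: ~20 working days per month

hourly_rate = MONTHLY_COST_ALWAYS_ON / HOURS_PER_MONTH
on_demand_cost = hourly_rate * 1 * WORKING_DAYS_PER_MONTH  # one hour per working day

print(f"hourly rate:       ${hourly_rate:,.0f}")
print(f"on-demand monthly: ${on_demand_cost:,.0f}")
```

This yields an hourly rate of roughly $130 and an on-demand monthly cost of about $2,600, consistent with the ~$2,550 quoted in the text (the small difference comes from the exact count of working days).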
If there are resources in use that cannot be switched off, the alternative is to scale them up and down. With Platform-as-a-Service offerings, scaling is built into the platform, while Infrastructure-as-a-Service solutions must be designed for it.
A distinction is made between manual and automatic scaling. Manual scaling means that scaling is triggered either by a user or by an external process. With automatic scaling, the resource is configured with rules so that it scales by itself when certain conditions occur.
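Such rules can be sketched in a few lines. The following is a minimal, hypothetical model of the kind of threshold rules Azure autoscale uses (scale out above one metric threshold, scale in below another, within instance bounds); the thresholds and bounds are illustrative, not a real Azure API.

```python
# Minimal sketch of rule-based automatic scaling: scale out when average CPU
# exceeds one threshold, scale in below another, bounded by min/max instances.
# All numbers are hypothetical examples.
from dataclasses import dataclass

@dataclass
class AutoscaleRule:
    scale_out_above: float  # average CPU % that triggers scale-out
    scale_in_below: float   # average CPU % that triggers scale-in
    min_instances: int
    max_instances: int

def decide(rule: AutoscaleRule, current: int, avg_cpu: float) -> int:
    """Return the new instance count for the observed average CPU."""
    if avg_cpu > rule.scale_out_above and current < rule.max_instances:
        return current + 1
    if avg_cpu < rule.scale_in_below and current > rule.min_instances:
        return current - 1
    return current

rule = AutoscaleRule(scale_out_above=70, scale_in_below=25,
                     min_instances=2, max_instances=10)
print(decide(rule, current=3, avg_cpu=85))  # high load: scales out to 4
print(decide(rule, current=3, avg_cpu=10))  # low load: scales in to 2
print(decide(rule, current=2, avg_cpu=10))  # already at the minimum: stays at 2
```

The cost effect follows directly: every instance the rule removes during quiet hours stops billing immediately.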
Serverless computing must of course also be mentioned here. Where possible, workloads can be rebuilt on serverless offerings to increase cost efficiency, since billing is then based purely on actual executions.
Cloud computing offers the possibility of renting virtual machines instead of having to purchase them, which brings enormous flexibility.
However, many customers run solutions in the cloud that do not need this flexibility. This is where the principle of "reservations" comes into play. With reservations, compute or storage capacity can be reserved in a cloud data centre. Using the Azure Cloud as an example, it works in such a way that a customer commits to Microsoft to reserve an agreed service for a period of one or three years. This commitment, which makes planning easier for the cloud provider, is rewarded with a discount for the customer. According to official information, the discounts are in the range of 30 percent for 1-year and 60 percent for 3-year reservations. Concluding a reservation is subject to certain conditions, such as a penalty payment that becomes due if a reservation has to be returned early.
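The effect of those discount levels on a monthly bill is easy to work out. The pay-as-you-go rate below is a hypothetical example; the 30 and 60 percent discounts are the ranges quoted above, not exact prices for any specific SKU.

```python
# Reservation savings sketch using the discount ranges quoted in the text
# (~30% for 1-year, ~60% for 3-year commitments). The base rate is hypothetical.

PAYG_MONTHLY = 1_000.0  # assumed pay-as-you-go cost of a VM per month

def reserved_monthly(payg: float, discount: float) -> float:
    """Monthly cost after applying a reservation discount."""
    return payg * (1 - discount)

one_year = reserved_monthly(PAYG_MONTHLY, 0.30)
three_year = reserved_monthly(PAYG_MONTHLY, 0.60)

print(f"pay-as-you-go: {PAYG_MONTHLY:.0f}")
print(f"1-year:        {one_year:.0f}")
print(f"3-year:        {three_year:.0f}")
```

Note the flip side: a reservation is paid whether the VM runs or not, so at a 30 percent discount it only beats pay-as-you-go if the resource actually runs more than about 70 percent of the time. Reservations therefore suit steady workloads, while elasticity and shutdown suit variable ones.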
The last measure to consider for cost optimisation is the so-called Hybrid Benefit. With the Hybrid Benefit, software licences can be brought from the on-premises data centre to the cloud.
In the Azure Cloud, there is a Hybrid Benefit for the operating system (Windows Server) and for the database (SQL Server). As these licences are included in the per-minute price for cloud machines, the cost is reduced if an existing licence is used.
The Hybrid Benefit is also an incentive scheme for migration to the cloud: for every machine licensed in the on-premises data centre, a virtual machine in the cloud can be obtained at a reduced price.
In summary: when it comes to cost management in the cloud, it is extremely important to understand the cost drivers of the cloud and to know the savings opportunities. In practice, we see every day that classic data centre concepts are transferred one-to-one to the cloud. As a result, many of the advantages of cloud computing go unexploited.
Manuel Meyer's article on the topic "What causes costs to increase in the cloud?" can be found here.