An organization for which I am a contractor had an Azure bill of more than $100,000 per year. After re-provisioning, we reduced the annual bill to less than $13,000. Does that sound good to you? Here are some tips.
1. Avoid the Azure-provided database servers, especially Cosmos DB.
The charges for SQL (and NoSQL) services on Azure are fantastically high. In our case, Cosmos DB charged by the number of collections we used, not by the amount of storage. We had one project for which the Cosmos DB bill was $22,000 per year! It had 11 collections in it, each with fewer than 5 documents. I almost fainted.
What to do instead? Create a Linux virtual machine and run all of your file- and SQL-oriented services from that machine.
It is quite easy to install and run MongoDB, PostgreSQL, and MySQL. The Linux administrator skills required are, well, very meager. If your administrator has installed Linux, say, on 5 different workstations during the last few years, and fiddled with configurations and user requests for packages, I expect your administrator can handle this.
The only interesting part of this exercise came up in the administration of PostgreSQL. For whatever reason, the default settings are not optimized for speed on data set queries. After a couple of hours of panicked Googling and test adjustments in the config file, our PostgreSQL server instance was running faster (yes, faster) than the Azure-provided SQL server.
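As a rough illustration of that kind of adjustment (not the exact settings we used), here is a minimal Python sketch that sizes two common postgresql.conf memory settings from the VM's RAM. It assumes the usual rules of thumb: shared_buffers at about 25% of RAM, which the PostgreSQL documentation suggests as a starting point, and effective_cache_size at about half of RAM, which is only a conservative guess. The function names and the 8 GB example are hypothetical.

```python
def suggest_postgres_settings(ram_mb: int) -> dict:
    """Return rough starting values for postgresql.conf, sized from total RAM in MB."""
    if ram_mb <= 0:
        raise ValueError("ram_mb must be positive")
    return {
        # Assumption: about a quarter of RAM, a common starting point
        # on a machine dedicated to the database.
        "shared_buffers": f"{ram_mb // 4}MB",
        # Assumption: half of RAM. This is only a hint to the query planner;
        # it does not allocate any memory.
        "effective_cache_size": f"{ram_mb // 2}MB",
    }


def render_conf(settings: dict) -> str:
    """Format the settings as lines that can be pasted into postgresql.conf."""
    return "\n".join(f"{name} = {value}" for name, value in settings.items())


if __name__ == "__main__":
    # Hypothetical VM with 8 GB of RAM.
    print(render_conf(suggest_postgres_settings(8192)))
```

Treat the output as a starting point and measure your own queries before and after each change.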
Another benefit of doing this is that your companion devices, which you keep in the same region and in the same virtual network, can interact with your file/SQL server quickly. We set up an SSH tunnel between the systems, which really speeds things up and makes them more secure.
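For readers who have not set up a tunnel before, here is a small Python sketch that builds a standard OpenSSH port-forwarding command. It is an illustration, not our exact setup: the user name, the private IP address, and the choice of PostgreSQL's default port 5432 are all assumptions.

```python
import shlex


def build_tunnel_command(user: str, host: str,
                         local_port: int = 5432,
                         remote_port: int = 5432) -> list:
    """Build an ssh command that forwards a local port to a port on the DB host.

    -N tells ssh not to run a remote command (tunnel only);
    -L sets up local port forwarding as local_port:target_host:target_port.
    """
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{host}"]


if __name__ == "__main__":
    # Hypothetical account and private address of the database VM.
    cmd = build_tunnel_command("appuser", "10.0.0.4")
    print(shlex.join(cmd))
    # To actually open the tunnel, pass cmd to subprocess.Popen(cmd).
```

With the tunnel open, the app connects to localhost on the local port as if the database were running on the same machine, and the database port does not need to be exposed publicly.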
2. Avoid using the Azure App Service framework.
The app server prices are scaled to the demands on your app. An app that has any serious activity can cost you hundreds of dollars per month.
It is a better idea to buy a VM and run your own app server.
Most of the apps I needed to administer were Node.js or Python Flask apps. As developers know, these apps are typically developed with a small test web server that starts with a warning along the lines of "this is not a production quality web server. Use a production WSGI server."
I had never done this before January 2020, and it was not too easy. I spent about a week googling around for alternative methods before I eventually settled on a "middleware" server called Phusion Passenger. Passenger is offered as free software, but there is a commercial version that has a few extra features. One feature that attracted me was that Passenger is multilingual. It can host Python, Node.js, Ruby/Rails, and others. And the instructions are truly excellent.
In my opinion, the best web server at the current time is Nginx (say "engine X"). Nginx acts as a "reverse proxy server" that receives web visitors and assigns their requests to the various Passenger configurations.
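To make the "WSGI" part less mysterious, here is a minimal sketch of what a WSGI server such as Passenger expects from a Python app: a callable named application. Passenger conventionally looks for it in a file called passenger_wsgi.py. The greeting text and the call_once helper are hypothetical and exist only so the example can be tried without any server.

```python
# passenger_wsgi.py: the conventional entry point Passenger loads for Python apps.
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A bare WSGI app: receive the request environ, return the response body."""
    body = b"Hello from a WSGI app\n"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


# For a Flask project, this file would usually just re-export the Flask object,
# e.g. `from myapp import app as application` (myapp is a hypothetical module).


def call_once(app, path="/"):
    """Hypothetical helper: call a WSGI app directly, without a web server."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)  # fills in the other required WSGI keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    print(call_once(application))
```

The production server's job is to call that same function for every incoming request, with process management and security that the small development server lacks.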
3. Change billing from pay-as-you-go to reservations.
This is a little tricky, and Microsoft does not make it very easy to figure out. If you are willing to make a commitment of 3 years to use a resource of a given size (a VM with a given number of CPUs and amount of memory), the price per month will drop. The price drop will be at least 30%, and maybe more.
There's a penalty for early withdrawal, as they say, but it is not severe. If you stay with the reservation for one calendar year and then cancel the service, you break even.
Keep in mind that the reservation is not for a particular machine; it is for a particular class of machine. So if you create a reservation for one machine project, and that project is killed off, you can remove that VM and start a new one of the same type, and it inherits the lower reservation price.
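To get a feel for the numbers, here is a small Python sketch of the arithmetic. The 30% discount is the minimum mentioned above; the $200 pay-as-you-go monthly price is a hypothetical figure, not an actual Azure rate, and the function names are mine.

```python
def reserved_monthly_price(payg_monthly: float, discount: float = 0.30) -> float:
    """Monthly price after the reservation discount (0.30 means 30% off)."""
    if not 0 <= discount < 1:
        raise ValueError("discount must be in the range [0, 1)")
    return payg_monthly * (1 - discount)


def savings_over_term(payg_monthly: float, discount: float = 0.30,
                      months: int = 36) -> float:
    """Total saved over the commitment, compared with pay-as-you-go."""
    monthly_saving = payg_monthly - reserved_monthly_price(payg_monthly, discount)
    return monthly_saving * months


if __name__ == "__main__":
    payg = 200.0  # hypothetical pay-as-you-go price per month, in dollars
    print(f"Reserved price per month: ${reserved_monthly_price(payg):.2f}")
    print(f"Saved over 3 years:       ${savings_over_term(payg):.2f}")
```

Plug in the real pay-as-you-go price of your VM size and the discount shown for your reservation to see what the commitment is worth.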
4. Buy a rack server for GPU calculations.
The Azure price for machines capable of GPU calculations is extraordinarily high. We found a much better option. There's a local company called Stallard Technology (STI) that sells Dell rack servers. Some are brand new, some are factory reconditioned.
We bought a rack server that had 2 Nvidia GPU devices in it for about $4,300. It runs fine. I set up Ubuntu with TensorFlow on that system and it generates results more quickly than the Azure GPU system did. After a few months, we will have saved enough on the Azure GPU bill to pay for the server.
The conclusion: Azure VMs are handy devices that you can afford if you are willing to run your own services. A lot of money can be saved.
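The payback arithmetic is simple enough to sketch in Python. The $4,300 purchase price is the figure above; the $1,000 monthly Azure GPU cost is a hypothetical number for illustration, and the calculation ignores electricity, rack space, and administration time.

```python
import math


def payback_months(purchase_price: float, monthly_cloud_cost: float) -> int:
    """Whole months of avoided cloud charges needed to cover the purchase price."""
    if monthly_cloud_cost <= 0:
        raise ValueError("monthly_cloud_cost must be positive")
    return math.ceil(purchase_price / monthly_cloud_cost)


if __name__ == "__main__":
    server_price = 4300.0      # from the purchase described above
    azure_gpu_monthly = 1000.0  # hypothetical monthly GPU VM bill
    months = payback_months(server_price, azure_gpu_monthly)
    print(f"Server pays for itself after {months} months")
```

Substitute your own monthly GPU bill to see how quickly a purchase like this would pay off.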