Why Microsoft is Betting on FPGAs for Machine Learning at the Edge
Over the past several years, Microsoft has used FPGAs in its Azure and Bing infrastructure to increase throughput for workloads such as software-defined networking and search indexing. Even when Microsoft first deployed FPGAs for these tasks, the long-term plan was to eventually offer hardware acceleration to customers in one form or another.
Image Classification and Recognition Models
Many image classification and recognition models have been built with deep neural networks, including DenseNet-121, ResNet-50, SSD-VGG, ResNet-152, and VGG-16. On Azure Machine Learning services, these models can now run with FPGA (field-programmable gate array) hardware acceleration. They can also be packaged and run on devices such as Azure Data Box Edge.
FPGA at the Edge
Running these kinds of trained models in the cloud isn’t always practical. For instance, image recognition software on a production line can spot ingredients that might spoil food, but waiting for those images to upload to the cloud would slow production to a crawl. In addition, some environments simply lack the connectivity these techniques require in the first place.
Moving Toward the Future
FPGA support started with ResNet, which remains the most popular option. Microsoft has since added other models, along with support for transfer learning: taking an already-trained model and retraining it to work on a different data set, quickly and with high accuracy. The tools used to port models are improving to the point where they might be offered to customers in the future, but the emphasis remains on usability and simplicity.
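The transfer-learning idea described above can be sketched in a few lines: freeze a pretrained feature extractor and train only a small new classifier head on the new data set. This is a minimal illustrative sketch, not Microsoft's actual tooling; the "backbone" here is just a fixed random projection standing in for a trained network such as ResNet with its final layer removed, and all names and data are made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pretrained, frozen backbone (e.g. ResNet minus its final
# layer). In real transfer learning these weights come from the trained
# model and are kept fixed; here they are just a random projection.
W_frozen = rng.normal(size=(4, 8))

def frozen_features(x):
    """Map raw inputs to the backbone's feature space (weights stay frozen)."""
    return np.tanh(x @ W_frozen)

# Toy "new" data set for the retargeted task: two classes.
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

# Transfer learning: keep the backbone fixed, train only a new linear head.
F = frozen_features(X)
w = np.zeros(F.shape[1])
b = 0.0

for _ in range(500):  # plain logistic-regression training of the new head
    p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
    w -= 0.5 * (F.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

p = 1.0 / (1.0 + np.exp(-(F @ w + b)))
accuracy = np.mean((p > 0.5) == (y == 1))
print(f"head-only training accuracy: {accuracy:.2f}")
```

Because only the small head is trained, retraining is fast even though the backbone encodes most of the model's capacity, which is why transfer learning can adapt a model to a new data set quickly.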
Changes Throughout 2019
This year, Microsoft made its FPGA chips publicly available for machine learning model training and inference. In addition, the Open Neural Network Exchange (ONNX) Runtime will start to support Intel’s nGraph and Nvidia’s TensorRT for inference on Intel and Nvidia hardware. This follows Microsoft joining the MLflow project and open-sourcing ONNX Runtime.
Affordable and Precise Models
The ability to train models in the cloud and then use them at the edge makes development easier. However, it can also reduce accuracy: fitting cloud-trained models onto local hardware typically means dropping numerical precision, for example by quantizing 32-bit floating-point weights down to 8-bit integers. Microsoft has made strides in limiting this loss.
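The precision loss mentioned above can be made concrete with a small sketch. Assuming a simple symmetric int8 quantization scheme (one common way to shrink models for edge hardware; the weights here are synthetic and purely illustrative), each float32 weight is rounded to one of 255 levels, so the edge device computes with a slightly perturbed copy of the model:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical float32 weights from a cloud-trained model.
weights = rng.normal(scale=0.1, size=1000).astype(np.float32)

# Symmetric 8-bit quantization: map [-max|w|, +max|w|] onto [-127, 127].
scale = np.abs(weights).max() / 127.0
quantized = np.round(weights / scale).astype(np.int8)

# Dequantize to see what the edge device effectively computes with.
restored = quantized.astype(np.float32) * scale

# Every weight moves by at most half a quantization step (scale / 2).
max_error = np.abs(weights - restored).max()
print(f"worst-case weight error after int8 round trip: {max_error:.6f}")
```

The rounding error per weight is bounded by half the quantization step, but those small perturbations accumulate through a deep network, which is why naive quantization can visibly hurt accuracy and why careful calibration matters.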
The Azure FPGA design lets you run a model across numerous FPGAs without significantly increasing latency. This is possible because the FPGAs are connected to each other directly rather than communicating through a CPU. These machines are efficient enough that there is no need to deploy a large number of them at once.
This means that workloads like the production-line image processing described above could become far more practical than ever before. As Microsoft continues down this path, the entire machine learning landscape may find itself changing.