Deep Learning Frameworks: Performances Analysis

LERAT, Jean-Sébastien; Mahmoudi, Ahmed Sidi; Mahmoudi, Saïd

View/Open

DeepLearn.pdf (512.1 KB)

Date

2021

Authors

LERAT, Jean-Sébastien

Mahmoudi, Ahmed Sidi

Mahmoudi, Saïd

Abstract

The fourth industrial revolution uses modern technologies that produce a continuous flow of data. This volume of data cannot be analyzed with traditional technologies to detect and diagnose problems without human intervention. Deep learning consists of a set of methods based on neural networks that can process and extract information from such an amount of data. Deep learning frameworks provide a high-level programming interface that enables fast design and implementation of deep learning tasks. New models and applications built on them keep improving in performance. Nevertheless, a framework that runs on a single computer cannot by itself cope with the huge flow of data. A cluster of computers is known to deliver a model quickly, or to enable the design of a complex neural network spread across computers. Edge artificial intelligence and cloud computing are other technologies in which deep learning tasks can be distributed among the available computing nodes. The advantage of cloud computing over other technologies is its elasticity: the ability to scale its infrastructure according to resource requirements. To design a framework that scales compute nodes depending on the deep learning task, we review and analyze the state-of-the-art frameworks. In this work, we collect data on how frameworks use the CPU, the RAM, and the GPU, with and without multi-threading, on convolutional neural networks that predict a label on a small and a large dataset. Moreover, we discuss the process of data collection management when using GPU frameworks. We consider five frameworks, namely MxNet, Paddle, pyTorch, Singa, and Tensorflow 2. All of them have a native implementation with a Python binding, and they support both the CUDA and the OpenCL libraries. We noted that Singa computes results quickly but does not take available resources into account, which results in a crash. MxNet and Paddle also cannot handle some running configurations and are not able to adapt their behavior to accomplish the task. The other frameworks can achieve the goal, with a difference in response time in favor of pyTorch. Moreover, we show that pyTorch uses 100% of the CPU with a steady number of threads, in contrast to Tensorflow. Tensorflow requires less RAM than pyTorch with the stochastic gradient descent method. When it comes to the mini-batch learning process, pyTorch has RAM needs quite similar to those of Tensorflow. We also infer the GPU behavior from the time spent in the CUDA functions. These analyses show that pyTorch makes better use of the available resources, which is why it outperforms Tensorflow.
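
The abstract does not say which tooling the authors used to collect CPU and RAM figures. As a rough illustration only, the sketch below shows one way such measurements could be taken around a training step, assuming a Unix-like system and using only the Python standard library. The names `profile` and `toy_workload` are hypothetical and do not come from the paper; a real experiment would call a framework's training function in place of the toy workload.

```python
import resource  # Unix-only standard-library module
import time


def profile(workload, *args):
    """Run workload(*args) and return coarse CPU and memory figures.

    Hypothetical helper for illustration; not the authors' tooling.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()  # CPU time summed over all threads
    result = workload(*args)
    cpu_s = time.process_time() - cpu_start
    wall_s = time.perf_counter() - wall_start
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return {
        "result": result,
        "wall_s": wall_s,
        "cpu_s": cpu_s,
        # Near 1.0: one saturated core. Above 1.0: several threads
        # were consuming CPU in parallel during the call.
        "cpu_util": cpu_s / wall_s if wall_s > 0 else 0.0,
        # Peak resident set size of the process so far
        # (kilobytes on Linux, bytes on macOS).
        "max_rss": usage.ru_maxrss,
    }


def toy_workload(n):
    """Stand-in for a training step: sum of squares below n."""
    return sum(i * i for i in range(n))


if __name__ == "__main__":
    stats = profile(toy_workload, 1_000_000)
    print({k: v for k, v in stats.items() if k != "result"})
```

Comparing `cpu_s` with `wall_s` gives a crude view of how fully a framework keeps the CPU busy, while `max_rss` approximates peak RAM use. GPU activity is not visible to these calls and would need a dedicated profiler.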
