Introduction

YOLO (You Only Look Once) is an object detection algorithm that uses deep convolutional neural networks to detect and classify objects in real time. The algorithm was first introduced in the 2016 paper You Only Look Once: Unified, Real-Time Object Detection by Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi.

Since its introduction, YOLO has become one of the most popular algorithms for object detection and classification tasks, thanks to its high accuracy and speed. It has achieved state-of-the-art performance on a variety of object detection benchmarks.

[Image: YOLO architecture]

In the first week of May 2023, the YOLO-NAS model was introduced to the machine learning world, claiming a better accuracy-latency trade-off than previous models such as YOLOv7 and YOLOv8.

[Image: YOLO-NAS vs. other models]

The YOLO-NAS model is pre-trained on datasets like COCO and Objects365, which makes it suitable for real-world applications. It is available through Deci's SuperGradients, a PyTorch-based library that contains nearly 40 pre-trained models for different computer vision tasks, such as classification, detection, and segmentation.

Let's get to work, then, and install the SuperGradients library to start using YOLO-NAS!

# Installing supergradients lib
!pip install super-gradients==3.1.0

Importing and Loading YOLO-NAS

# Importing models from SuperGradients' training module
from super_gradients.training import models

The next step is to instantiate the model. YOLO-NAS is available in several sizes; for this notebook, we're going to use yolo_nas_l, with pretrained_weights = 'coco'.

You can find more information on the different models on the SuperGradients GitHub page.

# Initializing model
yolo_nas = models.get("yolo_nas_l", pretrained_weights = "coco")
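
If you need faster inference and can trade off some accuracy, the smaller variants can be loaded the same way. The snippet below is a quick sketch using the small and medium checkpoints that SuperGradients exposes.

# Smaller YOLO-NAS variants, loaded the same way
yolo_nas_s = models.get("yolo_nas_s", pretrained_weights="coco")  # smallest and fastest
yolo_nas_m = models.get("yolo_nas_m", pretrained_weights="coco")  # medium trade-off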

Model Architecture

In the code cell below, we use torchinfo's summary to print the YOLO-NAS architecture, which is useful for getting an in-depth understanding of how the model operates.

# Yolo NAS architecture
!pip install torchinfo
from torchinfo import summary

summary(model = yolo_nas,
       input_size = (16,3,640,640),
       col_names = ['input_size',
                   'output_size',
                   'num_params',
                   'trainable'],
       col_width = 20,
       row_settings = ['var_names'])
=================================================================================================================================================
Layer (type (var_name))                                           Input Shape          Output Shape         Param #              Trainable
=================================================================================================================================================
YoloNAS_L (YoloNAS_L)                                             [16, 3, 640, 640]    [16, 8400, 4]        --                   True
├─NStageBackbone (backbone)                                       [16, 3, 640, 640]    [16, 96, 160, 160]   --                   True
│    └─YoloNASStem (stem)                                         [16, 3, 640, 640]    [16, 48, 320, 320]   --                   True
│    │    └─QARepVGGBlock (conv)                                  [16, 3, 640, 640]    [16, 48, 320, 320]   3,024                True
│    └─YoloNASStage (stage1)                                      [16, 48, 320, 320]   [16, 96, 160, 160]   --                   True
│    │    └─QARepVGGBlock (downsample)                            [16, 48, 320, 320]   [16, 96, 160, 160]   88,128               True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 96, 160, 160]   [16, 96, 160, 160]   758,594              True
│    └─YoloNASStage (stage2)                                      [16, 96, 160, 160]   [16, 192, 80, 80]    --                   True
│    │    └─QARepVGGBlock (downsample)                            [16, 96, 160, 160]   [16, 192, 80, 80]    351,360              True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 192, 80, 80]    [16, 192, 80, 80]    2,045,315            True
│    └─YoloNASStage (stage3)                                      [16, 192, 80, 80]    [16, 384, 40, 40]    --                   True
│    │    └─QARepVGGBlock (downsample)                            [16, 192, 80, 80]    [16, 384, 40, 40]    1,403,136            True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 384, 40, 40]    [16, 384, 40, 40]    13,353,733           True
│    └─YoloNASStage (stage4)                                      [16, 384, 40, 40]    [16, 768, 20, 20]    --                   True
│    │    └─QARepVGGBlock (downsample)                            [16, 384, 40, 40]    [16, 768, 20, 20]    5,607,936            True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 768, 20, 20]    [16, 768, 20, 20]    22,298,114           True
│    └─SPP (context_module)                                       [16, 768, 20, 20]    [16, 768, 20, 20]    --                   True
│    │    └─Conv (cv1)                                            [16, 768, 20, 20]    [16, 384, 20, 20]    295,680              True
│    │    └─ModuleList (m)                                        --                   --                   --                   --
│    │    └─Conv (cv2)                                            [16, 1536, 20, 20]   [16, 768, 20, 20]    1,181,184            True
├─YoloNASPANNeckWithC2 (neck)                                     [16, 96, 160, 160]   [16, 96, 80, 80]     --                   True
│    └─YoloNASUpStage (neck1)                                     [16, 768, 20, 20]    [16, 192, 20, 20]    --                   True
│    │    └─Conv (reduce_skip1)                                   [16, 384, 40, 40]    [16, 192, 40, 40]    74,112               True
│    │    └─Conv (reduce_skip2)                                   [16, 192, 80, 80]    [16, 192, 80, 80]    37,248               True
│    │    └─Conv (downsample)                                     [16, 192, 80, 80]    [16, 192, 40, 40]    332,160              True
│    │    └─Conv (conv)                                           [16, 768, 20, 20]    [16, 192, 20, 20]    147,840              True
│    │    └─ConvTranspose2d (upsample)                            [16, 192, 20, 20]    [16, 192, 40, 40]    147,648              True
│    │    └─Conv (reduce_after_concat)                            [16, 576, 40, 40]    [16, 192, 40, 40]    110,976              True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 192, 40, 40]    [16, 192, 40, 40]    2,595,716            True
│    └─YoloNASUpStage (neck2)                                     [16, 192, 40, 40]    [16, 96, 40, 40]     --                   True
│    │    └─Conv (reduce_skip1)                                   [16, 192, 80, 80]    [16, 96, 80, 80]     18,624               True
│    │    └─Conv (reduce_skip2)                                   [16, 96, 160, 160]   [16, 96, 160, 160]   9,408                True
│    │    └─Conv (downsample)                                     [16, 96, 160, 160]   [16, 96, 80, 80]     83,136               True
│    │    └─Conv (conv)                                           [16, 192, 40, 40]    [16, 96, 40, 40]     18,624               True
│    │    └─ConvTranspose2d (upsample)                            [16, 96, 40, 40]     [16, 96, 80, 80]     36,960               True
│    │    └─Conv (reduce_after_concat)                            [16, 288, 80, 80]    [16, 96, 80, 80]     27,840               True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 96, 80, 80]     [16, 96, 80, 80]     2,546,372            True
│    └─YoloNASDownStage (neck3)                                   [16, 96, 80, 80]     [16, 192, 40, 40]    --                   True
│    │    └─Conv (conv)                                           [16, 96, 80, 80]     [16, 96, 40, 40]     83,136               True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 192, 40, 40]    [16, 192, 40, 40]    1,280,900            True
│    └─YoloNASDownStage (neck4)                                   [16, 192, 40, 40]    [16, 384, 20, 20]    --                   True
│    │    └─Conv (conv)                                           [16, 192, 40, 40]    [16, 192, 20, 20]    332,160              True
│    │    └─YoloNASCSPLayer (blocks)                              [16, 384, 20, 20]    [16, 384, 20, 20]    5,117,700            True
├─NDFLHeads (heads)                                               [16, 96, 80, 80]     [16, 8400, 4]        --                   True
│    └─YoloNASDFLHead (head1)                                     [16, 96, 80, 80]     [16, 68, 80, 80]     --                   True
│    │    └─ConvBNReLU (stem)                                     [16, 96, 80, 80]     [16, 128, 80, 80]    12,544               True
│    │    └─Sequential (cls_convs)                                [16, 128, 80, 80]    [16, 128, 80, 80]    147,712              True
│    │    └─Conv2d (cls_pred)                                     [16, 128, 80, 80]    [16, 80, 80, 80]     10,320               True
│    │    └─Sequential (reg_convs)                                [16, 128, 80, 80]    [16, 128, 80, 80]    147,712              True
│    │    └─Conv2d (reg_pred)                                     [16, 128, 80, 80]    [16, 68, 80, 80]     8,772                True
│    └─YoloNASDFLHead (head2)                                     [16, 192, 40, 40]    [16, 68, 40, 40]     --                   True
│    │    └─ConvBNReLU (stem)                                     [16, 192, 40, 40]    [16, 256, 40, 40]    49,664               True
│    │    └─Sequential (cls_convs)                                [16, 256, 40, 40]    [16, 256, 40, 40]    590,336              True
│    │    └─Conv2d (cls_pred)                                     [16, 256, 40, 40]    [16, 80, 40, 40]     20,560               True
│    │    └─Sequential (reg_convs)                                [16, 256, 40, 40]    [16, 256, 40, 40]    590,336              True
│    │    └─Conv2d (reg_pred)                                     [16, 256, 40, 40]    [16, 68, 40, 40]     17,476               True
│    └─YoloNASDFLHead (head3)                                     [16, 384, 20, 20]    [16, 68, 20, 20]     --                   True
│    │    └─ConvBNReLU (stem)                                     [16, 384, 20, 20]    [16, 512, 20, 20]    197,632              True
│    │    └─Sequential (cls_convs)                                [16, 512, 20, 20]    [16, 512, 20, 20]    2,360,320            True
│    │    └─Conv2d (cls_pred)                                     [16, 512, 20, 20]    [16, 80, 20, 20]     41,040               True
│    │    └─Sequential (reg_convs)                                [16, 512, 20, 20]    [16, 512, 20, 20]    2,360,320            True
│    │    └─Conv2d (reg_pred)                                     [16, 512, 20, 20]    [16, 68, 20, 20]     34,884               True
=================================================================================================================================================
Total params: 66,976,392
Trainable params: 66,976,392
Non-trainable params: 0
Total mult-adds (T): 1.04
=================================================================================================================================================
Input size (MB): 78.64
Forward/backward pass size (MB): 27238.60
Params size (MB): 178.12
Estimated Total Size (MB): 27495.37
=================================================================================================================================================

Object Detection on Images

We can now test the model's ability to detect objects in different images.

In the code below, we assign a URL pointing to an image to a variable called image. Then, we use the predict and show methods to display the image with the model's predictions drawn on it.

image = "https://i.pinimg.com/736x/b4/29/48/b42948ef9202399f13d6e6b3b8330b20.jpg"
yolo_nas.predict(image).show()
[Image: YOLO-NAS object detection on an image]

In the image above, we can see the detections made for each object along with the model's confidence score for each prediction. For instance, the model is 97% confident that the white object on the floor is a cup. However, there are many objects in this image, and the model mistakes the Nintendo 64 game console for a car.

We can improve our results by using the conf argument, which serves as a confidence threshold for detections. For instance, setting conf = 0.50 makes the model display only detections with a confidence score above 50%. Let's try it out.

image = "https://i.pinimg.com/736x/b4/29/48/b42948ef9202399f13d6e6b3b8330b20.jpg"
yolo_nas.predict(image, conf = 0.50).show()
[Image: YOLO-NAS object detection on an image with conf = 0.50]

Now the model only displays objects detected with at least a 50% confidence score, which are the cup, the TV, and the remote.
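
Beyond show(), the prediction object can also be inspected programmatically. The sketch below assumes the SuperGradients 3.1.x prediction interface, where each image prediction exposes class names, labels, confidences, and bounding boxes; attribute names may differ in other versions.

# Inspecting the raw predictions (sketch based on the SuperGradients 3.1.x interface)
predictions = yolo_nas.predict(image, conf=0.50)
for image_prediction in predictions:
    class_names = image_prediction.class_names            # list of COCO class names
    labels = image_prediction.prediction.labels           # class index per detection
    confidences = image_prediction.prediction.confidence  # confidence score per detection
    boxes = image_prediction.prediction.bboxes_xyxy       # [x1, y1, x2, y2] per detection
    for label, score in zip(labels, confidences):
        print(f"{class_names[int(label)]}: {score:.2f}")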

We can test more images.

[Image: YOLO-NAS object detection on an image]
[Image: YOLO-NAS object detection on another image]

Object Detection on Videos

We can also use the YOLO-NAS model to perform real-time object detection on videos!

In the code below, I use the YouTubeVideo class from IPython's display module to select and display the YouTube video we'll run detections on.

from IPython.display import YouTubeVideo # Importing YouTubeVideo from IPython's display module
video_id = "VtK2ZMlcCQU" # Selecting video ID
video = YouTubeVideo(video_id) # Loading video
display(video) # Displaying video

Now that we have selected a video, we are going to use the youtube-dl library to download it from YouTube in .mp4 format.

Once that is done, we store the path to the downloaded file in the input_video_path variable, which will serve as input for the model's detections.

# Downloading video
video_url = f'https://www.youtube.com/watch?v={video_id}'
!pip install -U "git+https://github.com/ytdl-org/youtube-dl.git"
!python -m youtube_dl -f 'bestvideo[ext=mp4]+bestaudio[ext=m4a]/mp4' "$video_url"

print('Video downloaded')

# Selecting input and output paths (youtube-dl names the file after the video title and ID)
input_video_path = f"/kaggle/working/Golf Rehab 'Short Game' Commercial-{video_id}.mp4"
output_video_path = "detections.mp4"

Now we import PyTorch and check whether a GPU is available, so the model can run on it.

import torch

# Use the GPU if one is available, otherwise fall back to the CPU
device = "cuda" if torch.cuda.is_available() else "cpu"

We then use the to() method to move the YOLO-NAS model to the GPU, use the predict() method to perform predictions on the video stored in input_video_path, and use the save() method to save the video with the detections drawn on it to the path specified by output_video_path.

yolo_nas.to(device).predict(input_video_path).save(output_video_path) # Running predictions on video
Video downloaded
Predicting Video: 100%|██████████| 900/900 [33:15<00:00,  2.22s/it]

After all that is done, we use IPython again to display a .gif version of the video with the detections, so it is visible in this Kaggle notebook.

from IPython.display import Image

# Displaying the detections as an animated GIF inside the notebook
with open('/kaggle/input/detection-gif/detections.gif', 'rb') as f:
    display(Image(data=f.read(), format='gif'))

You can see the results below:

Conclusion

We performed an initial object detection task on both images and video using the newly released YOLO-NAS model.

It's important, however, to highlight that you can fine-tune this model on a custom dataset, which can improve its performance on specific objects. For more information on how to fine-tune YOLO-NAS, take a look at the Intro to SuperGradients + YOLONAS Starter Notebook, available on Google Colab.
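
As a rough idea of what that looks like, the sketch below uses SuperGradients' Trainer. The experiment name, NUM_CLASSES, the dataloaders, and the training_params dict are placeholders you would replace with your own dataset and settings; the starter notebook above covers the full configuration.

from super_gradients.training import Trainer, models

# A minimal fine-tuning sketch; dataloaders and hyperparameters are placeholders
trainer = Trainer(experiment_name="yolo_nas_custom", ckpt_root_dir="./checkpoints")

# Re-create the model with the number of classes in your custom dataset
model = models.get("yolo_nas_l", num_classes=NUM_CLASSES, pretrained_weights="coco")

trainer.train(
    model=model,
    training_params=training_params,   # dict with loss, optimizer, epochs, metrics, etc.
    train_loader=train_dataloader,     # your custom training dataloader
    valid_loader=valid_dataloader,     # your custom validation dataloader
)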

Thank you so much for reading,

Luís Fernando Torres

LinkedIn

Kaggle

Reference

Kaggle Notebook — 👨‍💻Object Detection: YOLO-NAS Model 🔍
