CUDA Accelerated: NVIDIA Launches New CUDA Libraries to Expand Accelerated Computing and Deliver Order-of-Magnitude Speedups to Scientific and Industrial Applications

Editor’s note: This is the first in our new CUDA Accelerated blog series, which each month highlights the latest libraries, NIM microservices and tools that help developers and businesses use GPUs to accelerate their workloads.

News summary: New accelerated computing libraries deliver order-of-magnitude speedups and reduce energy consumption and cost in data processing, generative AI, recommender systems, AI data curation, 6G research, AI-physics and more. They include:

  • LLM applications: NeMo Curator, for creating custom datasets, adds image curation, and Nemotron-4 340B generates high-quality synthetic data.
  • Data processing: cuVS for vector search builds indexes in minutes instead of days, and a new Polars GPU engine is in open beta.
  • Physical AI: For physics simulation, Warp accelerates computations with a new Tile API. For wireless network simulation, Aerial adds more map formats for ray tracing and simulation. And for link-level wireless simulation, Sionna adds a new toolchain for real-time inference.

Companies around the world are increasingly turning to NVIDIA accelerated computing to speed up applications that they first ran on CPUs only. This has enabled them to achieve extreme speedups and benefit from significant energy savings.

In Houston, CPFD makes computational fluid dynamics simulation software for industrial applications, such as its Barracuda Virtual Reactor software, which helps design next-generation recycling facilities. Plastic recycling facilities run CPFD software in cloud instances powered by NVIDIA accelerated computing. With a CUDA GPU-accelerated virtual machine, they can scale efficiently and run simulations 400x faster and 140x more energy efficiently than on a CPU-based workstation.

Plastic bottles on a conveyor belt being loaded into a plastic recycling facility. AI-generated image.

A popular video conferencing application captions several hundred thousand virtual meetings per hour. When using CPUs to generate live captions, the app could query a transformer-powered speech recognition AI model three times a second. After migrating to GPUs in the cloud, the application’s throughput increased to 200 queries per second, a 66x speedup and a 25x energy-efficiency improvement.
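The 66x figure follows from the two throughput numbers quoted above. A quick check, assuming the quoted rates of 3 and 200 queries per second are exact (the variable names are illustrative only):

```python
# Throughput figures quoted for the video conferencing captioning example.
cpu_queries_per_second = 3
gpu_queries_per_second = 200

# Speedup is the ratio of GPU throughput to CPU throughput.
speedup = gpu_queries_per_second / cpu_queries_per_second

print(f"Speedup: {speedup:.1f}x")  # prints "Speedup: 66.7x"
```

The ratio is roughly 66.7, which the article reports as 66x.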

In homes around the world, an e-commerce site connects hundreds of millions of shoppers a day to the products they need using an advanced recommendation system powered by a deep learning model, running on its NVIDIA accelerated cloud computing system. After switching from CPUs to GPUs in the cloud, it achieved significantly lower latency, with a 33x speedup and a 12x energy-efficiency improvement.

With data growing exponentially, accelerated computing in the cloud is set to enable even more innovative use cases.

NVIDIA Accelerated Computing on CUDA GPUs Is Sustainable Computing

NVIDIA estimates that if all AI, HPC and data analytics tasks currently running on CPU servers were accelerated with CUDA GPUs, data centers could save 40 terawatt-hours per year. That’s the equivalent of powering 5 million homes in the US annually.
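As a sanity check on the comparison above, the two quoted figures imply an average annual consumption per home. The sketch below only divides one quoted number by the other; the per-home result is derived here and is not a figure from NVIDIA.

```python
# Figures from NVIDIA's estimate quoted above.
savings_twh_per_year = 40
homes = 5_000_000

KWH_PER_TWH = 1_000_000_000  # 1 TWh = 10^9 kWh

# Implied average annual consumption per home.
kwh_per_home = savings_twh_per_year * KWH_PER_TWH / homes

print(f"{kwh_per_home:,.0f} kWh per home per year")  # prints "8,000 kWh per home per year"
```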

Accelerated computing uses the parallel processing capabilities of CUDA GPUs to complete jobs orders of magnitude faster than CPUs, improving productivity while significantly reducing cost and energy consumption.

Although adding GPUs to a CPU-only server increases peak power, GPU acceleration finishes tasks quickly and then enters a low-power state. The total energy consumed with GPU-accelerated computing is significantly lower than with general-purpose CPUs, while still providing superior performance.
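The trade-off between higher power and shorter run time can be made concrete with the case-study figures quoted earlier. The sketch below assumes that "Nx more energy efficient" means N times less energy for the same job. Since energy is power multiplied by time, the implied ratio of average power is then the speedup divided by the efficiency gain. The function name and labels are illustrative, and the power ratios are derived here rather than reported by NVIDIA.

```python
def implied_power_ratio(speedup: float, efficiency_gain: float) -> float:
    """Return the GPU-to-CPU average power ratio implied by the two gains.

    Energy = power x time, so
    P_gpu / P_cpu = (E_gpu / E_cpu) * (t_cpu / t_gpu) = speedup / efficiency_gain.
    """
    return speedup / efficiency_gain


# (speedup, energy-efficiency gain) pairs quoted earlier in this article.
case_studies = {
    "CPFD simulation": (400, 140),
    "Video conferencing captions": (66, 25),
    "E-commerce site": (33, 12),
}

for name, (speedup, efficiency_gain) in case_studies.items():
    ratio = implied_power_ratio(speedup, efficiency_gain)
    print(f"{name}: about {ratio:.2f}x the power for 1/{speedup} of the time")
```

Under this assumption, each case implies less than 3x the average power draw while run time shrinks by 33x or more, which is why total energy falls.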

GPUs achieve up to 20x greater energy efficiency than traditional CPU-only servers across on-premises, cloud and hybrid workloads because they deliver more performance per watt, completing more tasks in less time.

In the past decade, NVIDIA AI computing has become roughly 100,000x more energy efficient at processing large language models. To put that in perspective, if the efficiency of cars had improved as much as NVIDIA has improved the efficiency of AI on its accelerated computing platform, they would get 500,000 miles per gallon. That’s enough to drive to the moon and back on less than a gallon of gasoline.
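The moon comparison can be checked with simple arithmetic. The sketch below assumes the commonly cited average Earth-Moon distance of about 238,855 miles, a figure that does not appear in the article, and uses the 500,000 miles per gallon from the analogy.

```python
# Average Earth-Moon distance in miles (it varies over the lunar orbit).
EARTH_MOON_MILES = 238_855

# Fuel economy from the analogy above.
miles_per_gallon = 500_000

round_trip_miles = 2 * EARTH_MOON_MILES
gallons_needed = round_trip_miles / miles_per_gallon

# prints "Round trip: 477,710 miles, 0.96 gallons"
print(f"Round trip: {round_trip_miles:,} miles, {gallons_needed:.2f} gallons")
```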

In addition to these dramatic gains in efficiency on AI workloads, GPU computing can achieve large speedups over CPUs. Customers of the NVIDIA accelerated computing platform running workloads on cloud service providers have seen speedups of 10-180x across a range of real-world tasks, from data processing to computer vision, as the chart below shows.

Speedups of 10-180x achieved in real-world cloud customer deployments with the NVIDIA accelerated computing platform, across data processing, scientific computing, speech AI, recommender systems, search, computer vision and other workloads.

As workloads continue to demand exponentially more computing power, CPUs have struggled to keep up, creating a growing performance gap and driving “compute inflation.” The chart below shows the multiyear trend of how data growth has far outpaced the growth in compute performance per watt of CPUs.

Compute inflation: the widening gap between rapid data growth and the lagging compute performance per watt of CPUs.

The energy savings from GPU acceleration free up what would otherwise have been wasted cost and energy.

With its massive energy-efficiency savings, accelerated computing is sustainable computing.

The Right Tools for Every Job

GPUs cannot accelerate software written for general-purpose CPUs. Specialized algorithm software libraries are needed to accelerate specific workloads. Just as a mechanic keeps a toolbox with everything from a screwdriver to a wrench for different tasks, NVIDIA provides a diverse set of libraries to perform low-level functions such as parsing and executing calculations on data.

Each NVIDIA CUDA library is optimized to harness hardware features specific to NVIDIA GPUs. Combined, they encompass the power of the NVIDIA platform.

New updates continue to be added to the CUDA platform roadmap, spanning a wide range of use cases:

LLM Applications

NeMo Curator gives developers the flexibility to quickly create custom datasets for large language model (LLM) use cases. Recently, we announced capabilities that go beyond text to add multimodal support, including image curation.

SDG (synthetic data generation) augments existing datasets with high-quality, synthetically generated data to customize and fine-tune models and LLM applications. We announced Nemotron-4 340B, a new suite of models built specifically for SDG that enables businesses and developers to use model outputs and build custom models.

Data Processing Software

cuVS is an open-source library for GPU-accelerated vector search and clustering that delivers exceptional speed and efficiency for LLMs and semantic search. The latest cuVS allows large indexes to be built in minutes instead of hours or even days, and searches them at scale.

Polars is an open-source library that uses query optimization and other techniques to process hundreds of millions of rows of data efficiently on a single machine. A new Polars GPU engine powered by NVIDIA’s cuDF library will be available in open beta. It delivers up to 10x higher performance compared to CPU, bringing the energy savings of accelerated computing to data practitioners and their applications.
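As a rough sketch of what opting in to the GPU engine can look like, the snippet below builds a lazy Polars query and requests GPU execution with `collect(engine="gpu")`, falling back to the default CPU engine if that fails. It assumes a Polars version with GPU support and the cuDF-based GPU package installed. The function name, column names and sample data are hypothetical and do not come from the article.

```python
def total_by_category(rows: dict, use_gpu: bool = True):
    """Sum `amount` per `category`, preferring the Polars GPU engine.

    `rows` is a hypothetical mapping of column name to list of values.
    """
    # Imported inside the function so this sketch can be loaded where Polars is absent.
    import polars as pl

    # Lazy query: nothing executes until collect() is called.
    query = (
        pl.LazyFrame(rows)
        .group_by("category")
        .agg(pl.col("amount").sum().alias("total"))
        .sort("category")
    )

    if use_gpu:
        try:
            # Ask Polars to run the optimized query plan on the GPU engine.
            return query.collect(engine="gpu")
        except Exception:
            # GPU engine unavailable (package missing or no GPU): use the CPU engine.
            pass
    return query.collect()
```

Calling `total_by_category({"category": ["a", "b", "a"], "amount": [1, 2, 3]})` returns a DataFrame with one row per category. The query itself is the same for both engines; only the `engine` argument to `collect` changes.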

Physical AI

Warp, for high-performance GPU simulation and graphics, helps accelerate spatial computing by making it easier to write differentiable programs for physics simulation, perception, robotics and geometry processing. The next release will support a new Tile API that allows developers to use Tensor Cores inside GPUs for matrix and Fourier computations.

Aerial is a suite of accelerated computing platforms that includes Aerial CUDA-Accelerated RAN and Aerial Omniverse Digital Twin for designing, simulating and operating wireless networks for commercial applications and industry research. The next release will include a new expansion of Aerial with more map formats for ray tracing and higher-accuracy simulations.

Sionna is a GPU-accelerated open-source library for link-level simulations of wireless and optical communication systems. With GPUs, Sionna achieves orders-of-magnitude faster simulation, which enables interactive exploration of these systems and paves the way for next-generation physical layer research. The next release will include the entire toolchain needed to design, train and evaluate neural network-based receivers, including support for real-time inference of such neural receivers using NVIDIA TensorRT.

NVIDIA offers more than 400 libraries. Some, such as CV-CUDA, excel at pre- and post-processing for computer vision tasks common in user-generated video, recommender systems, mapping and video conferencing. Others, like cuDF, accelerate the data frames and tables central to SQL databases and pandas in data science.

CAD – Computer Aided Design, CAE – Computer Aided Engineering, EDA – Electronic Design Automation

Many of these libraries are versatile (for example, cuBLAS for linear algebra acceleration) and can be used across multiple workloads, while others are highly specialized for a specific use case, such as cuLitho for silicon computational lithography.

For developers who don’t want to build their own pipelines with NVIDIA CUDA-X libraries, NVIDIA NIM provides a streamlined path to production deployment by packaging multiple libraries and AI models into optimized containers. The containerized microservices deliver improved throughput out of the box.

Augmenting the performance of these libraries is a growing number of hardware-based acceleration features that deliver speedups with the highest energy efficiency. The NVIDIA Blackwell platform, for example, includes a decompression engine that unpacks compressed data files up to 18x faster than CPUs. This greatly accelerates data processing applications, such as SQL, Apache Spark and pandas, that need to frequently access compressed files in storage and decompress them for runtime computation.

The integration of NVIDIA’s specialized CUDA GPU-accelerated libraries into cloud computing platforms delivers remarkable speed and energy efficiency across a wide range of workloads. This combination drives significant cost savings for businesses and plays a crucial role in advancing sustainable computing, helping the billions of users who rely on cloud-based workloads benefit from a more sustainable and cost-effective digital ecosystem.

Learn more about NVIDIA’s sustainable computing efforts and try the Energy Efficiency Calculator to estimate potential energy and emissions savings.

See notice regarding software product information.
