Nvidia partners with Google Cloud to launch AI-focused hardware instances
In partnership with Google, Nvidia today launched a new cloud hardware offering, the L4 platform, optimized to run video-focused applications.
The L4 platform is available in private preview on Google Cloud through Google’s G2 virtual machines. Nvidia says that it is designed to accelerate “AI-powered” video performance. Serving as a general-purpose GPU, L4 delivers video decoding as well as transcoding and video streaming capabilities.
Beyond providing access to the L4 platform through Google Cloud, Google is integrating L4 into Vertex AI, its managed machine learning service for enterprise customers.
For those who prefer not to sign up with Google Cloud, L4 will be available later this year from Nvidia’s network of hardware partners, including Asus, Cisco, Dell, Hewlett Packard Enterprise and Lenovo.
L4 sits alongside the other AI-focused hardware solutions Nvidia announced today, including L40, H100 NVL and Grace Hopper for Recommendation Models. L40 is optimized for graphics and AI-enabled 2D, video and 3D image generation, while H100 NVL supports deploying large language models such as ChatGPT. (As the name implies, Grace Hopper for Recommendation Models is recommendation model-focused.)
L40 is available this week through Nvidia’s aforementioned hardware partners. Nvidia expects Grace Hopper and the H100 NVL, meanwhile, will ship in the second half of the year.
In related news, today marks the launch of Nvidia’s DGX Cloud platform, which gives companies access to infrastructure and software to train models for generative and other forms of AI. Announced earlier this year, DGX Cloud lets enterprises rent clusters of Nvidia hardware on a monthly basis — starting at an eye-watering $36,999 per instance per month.
Each instance of DGX Cloud features eight Nvidia H100 or A100 80GB Tensor Core GPUs for a total of 640GB of GPU memory per node, paired with storage. With DGX Cloud subscriptions, customers also get access to AI Enterprise, Nvidia’s software layer containing AI frameworks, pretrained models and “accelerated” data science libraries.
Nvidia says that it’s partnering with “leading” cloud service providers to host DGX Cloud infrastructure, starting with Oracle Cloud Infrastructure. Microsoft Azure is expected to begin hosting DGX Cloud next fiscal quarter, and the service will soon expand to Google Cloud.
Nvidia’s aggressive push into AI compute comes as the company moves away from unprofitable investments in other areas, like gaming and professional virtualization. Nvidia’s last earnings report showed that its data center business, which includes chips for AI, continued to grow (to $3.62 billion), suggesting that Nvidia could continue to benefit from the generative AI boom.