Tag

Microservices

All articles tagged with #microservices

technology · 1 year ago

"NVIDIA and SAP Collaborate to Revolutionize Gen AI Model Deployment with NIM Microservices"

NVIDIA has launched enterprise-grade generative AI microservices that let businesses create and deploy custom applications on their own platforms while retaining ownership of their intellectual property. These microservices, built on the NVIDIA CUDA platform, include NIM microservices for optimized inference on popular AI models and CUDA-X microservices for data processing, retrieval-augmented generation, and more. They have been adopted by leading application platform providers and can be accessed through NVIDIA AI Enterprise 5.0, offering a standardized path to run custom AI models optimized for NVIDIA's CUDA installed base.

technology · 1 year ago

"NVIDIA and SAP Introduce NIM for Accelerated Generative AI in Healthcare and Enterprise Applications"

NVIDIA has launched NVIDIA NIM, a software platform that aims to simplify deploying custom and pre-trained AI models into production. It combines each model with an optimized inference engine and packages the result into a container that runs as a microservice. NIM supports a range of models and is being integrated into platforms such as Amazon SageMaker, Google Kubernetes Engine, and Microsoft Azure AI. The company plans to add more capabilities over time and has already garnered interest from companies like Box, Cloudera, and Dropbox.

technology · 2 years ago

"Amazon's Prime Video team cuts costs by 90% by switching from microservices to a monolith on EC2/ECS"

An Amazon Prime Video team's case study suggests that moving from a microservices architecture to a monolith can reduce infrastructure costs by over 90%. The team initially built a solution from distributed components orchestrated by AWS Step Functions, but that orchestration turned out to be a bottleneck. The team then packed all the components into a single process, eliminating the need for S3 as intermediate storage between them. The solution now runs on EC2 and ECS, with a lightweight orchestration layer to distribute customer requests. The write-up is a refreshingly honest look at how to reduce cost with a simplified architecture, and an example of willingness to change track.
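
The single-process design described in this summary can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the Prime Video team's code: `Frame`, `split_frames`, `detect_black_frame`, `analyze_stream`, and `route_request` are hypothetical names, the all-zero check stands in for real video analysis, and hashing a stream ID is only one assumed way a lightweight orchestration layer might distribute requests.

```python
import zlib
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Hypothetical stand-in for a decoded video frame."""
    index: int
    pixels: bytes


def split_frames(stream: bytes, frame_size: int) -> List[Frame]:
    """Cut a raw byte stream into fixed-size frames (illustrative only)."""
    return [
        Frame(index=i // frame_size, pixels=stream[i:i + frame_size])
        for i in range(0, len(stream), frame_size)
    ]


def detect_black_frame(frame: Frame) -> bool:
    """Toy detector: flag non-empty frames whose bytes are all zero."""
    return len(frame.pixels) > 0 and all(b == 0 for b in frame.pixels)


def analyze_stream(stream: bytes, frame_size: int = 4) -> List[int]:
    """Run every stage in one process.

    Frames are handed from the splitter to the detector as in-memory
    objects, so no intermediate object store sits between the stages.
    """
    frames = split_frames(stream, frame_size)
    return [f.index for f in frames if detect_black_frame(f)]


def route_request(stream_id: str, workers: int) -> int:
    """Assumed orchestration step: map a stream ID to a worker index.

    CRC32 is used because it is stable across processes, unlike
    Python's built-in hash() for strings.
    """
    if workers < 1:
        raise ValueError("workers must be at least 1")
    return zlib.crc32(stream_id.encode("utf-8")) % workers


if __name__ == "__main__":
    sample = bytes(4) + b"\x01\x02\x03\x04" + bytes(4)
    print(analyze_stream(sample))
    print(route_request("stream-42", workers=3))
```

In the distributed version, each function above would be a separately deployed component and the frames would travel between them through shared storage. Here the only coordination left is deciding which worker process handles a given stream.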