Senior System Software Engineer, NCCL - Partner Enablement job opportunity at NVIDIA.



Date Posted: 19 days ago
NVIDIA Senior System Software Engineer, NCCL - Partner Enablement
Experience: 5+ years
Pattern: Remote


Degree: OND
Location: Switzerland, Remote, Switzerland

NVIDIA is leading the way in groundbreaking developments in Artificial Intelligence, High Performance Computing and Visualization. The GPU, our invention, serves as the visual cortex of modern computers and is at the heart of our products and services. Our work opens up new universes to explore, enables amazing creativity and discovery, and powers what were once science fiction inventions, from artificial intelligence to autonomous cars.

Come work for the team that brought you NCCL, NVSHMEM & GPUDirect. Our GPU communication libraries are crucial for scaling Deep Learning and HPC applications! We are looking for a motivated Partner Enablement Engineer to guide our key partners and customers with NCCL. Most DL/HPC applications run on large clusters with high-speed networking (InfiniBand, RoCE, Ethernet). This is an outstanding opportunity to get an end-to-end understanding of the AI networking stack. Are you ready to contribute to the development of innovative technologies and help realize NVIDIA's vision?

What you will be doing:
- Engage with our partners and customers to root cause functional and performance issues reported with NCCL
- Conduct performance characterization and analysis of NCCL and DL applications on groundbreaking GPU clusters
- Develop tools and automation to isolate issues on new systems and platforms, including cloud platforms (Azure, AWS, GCP, etc.)
- Guide our customers and support teams on HPC knowledge and standard methodologies for running applications on multi-node clusters
- Document and conduct trainings/webinars for NCCL
- Engage with internal teams in different time zones on networking, GPUs, storage, infrastructure and support

What we need to see:
- B.S./M.S. degree in CS/CE or equivalent experience, with 5+ years of relevant experience
- Experience with parallel programming and at least one communication runtime (MPI, NCCL, UCX, NVSHMEM)
- Excellent C/C++ programming skills, including debugging, profiling, code optimization, performance analysis, and test design
- Experience working with the engineering or academic research community supporting HPC or AI
- Practical experience with high performance networking: InfiniBand/RoCE/Ethernet networks, RDMA, topologies, congestion control
- Expertise in Linux fundamentals and a scripting language, preferably Python
- Familiarity with containers, cloud provisioning and scheduling tools (Docker, Docker Swarm, Kubernetes, SLURM, Ansible)
- Adaptability and passion to learn new areas and tools
- Flexibility to work and communicate effectively across different teams and time zones

Ways to stand out from the crowd:
- Experience conducting performance benchmarking and developing infrastructure on HPC clusters
- Prior system administration experience, especially for large clusters
- Experience debugging network configuration issues in large-scale deployments
- Familiarity with CUDA programming and/or GPUs
- Good understanding of Machine Learning concepts and experience with Deep Learning frameworks such as PyTorch and TensorFlow
- Deep understanding of technology and passion for what you do

NVIDIA is at the forefront of breakthroughs in Artificial Intelligence, High-Performance Computing, and Visualization. Our teams are composed of driven, innovative professionals dedicated to pushing the boundaries of technology. We offer highly competitive salaries, an extensive benefits package, and a work environment that promotes diversity, inclusion, and flexibility. As an equal opportunity employer, we are committed to fostering a supportive and empowering workplace for all.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. For Poland: The base salary range is 221,250 PLN - 383,500 PLN for Level 3, and 292,500 PLN - 507,000 PLN for Level 4.



