Sr. Staff Software Engineer - HPC Network Engineering job opportunity at LinkedIn.



Date: 2026-03-18T20:39:00.192Z
LinkedIn Sr. Staff Software Engineer - HPC Network Engineering
Experience: General
Pattern: Full-time


Degree: General
Location: Mountain View, California, United States of America

Job Description

At LinkedIn, our approach to flexible work is centered on trust and optimized for culture, connection, clarity, and the evolving needs of our business. The work location of this role is hybrid, meaning it will be performed both from home and from a LinkedIn office on select days, as determined by the business needs of the team. This role will be based in Mountain View, CA.

We are seeking an HPC Network Engineer to design, deploy, and operate high-performance, low-latency Ethernet fabrics for large-scale GPU clusters. The role focuses on RoCE v2–based GPU interconnect networks supporting AI/ML training, inference, and HPC workloads. You will work closely with systems, GPU, platform, and software teams to build scalable, lossless Ethernet networks optimized for RDMA traffic.

As a Senior Staff Software Engineer, you will define long-term technical direction, lead cross-org initiatives, mentor senior engineers, and drive solutions for complex distributed systems challenges at massive scale. This role requires deep expertise in backend systems, data processing, and large-scale system design, with strong understanding of networking concepts.

Responsibilities:

- Network architecture and design for large-scale LLM training and inference workloads.
- Design RoCE v2–based GPU interconnection fabrics for multi-rack and multi-pod GPU clusters.
- Define lossless Ethernet architectures (Clos / fat-tree / leaf-spine) optimized for RDMA.
- Select and validate 400G / 800G Ethernet switching platforms and NICs (ConnectX, BlueField, etc.).
- Deep expertise in host-level and Kubernetes pod networking architectures, including enablement of high-performance features such as RDMA and GPU Direct.
- Experience in host network performance tuning for large-scale collective communications, balancing latency, throughput, and congestion control.
- Analyze system performance and diagnose complex cross-layer issues.
