About Our Client
We’re a deep-tech startup building next-generation networking and communication infrastructure for AI/ML systems. Our work focuses on eliminating bottlenecks in distributed compute — improving how GPUs, clusters, and networks communicate at scale.
This isn’t application-layer AI. We operate at the transport, systems, and hardware-adjacent layers, solving real performance problems across data centers, edge environments, and specialized networks.
The Role
We’re hiring a Senior Software Engineer to work on high-performance networking and GPU communication systems.
You’ll be designing and optimizing the core infrastructure that powers large-scale AI workloads — reducing latency, improving throughput, and unlocking better utilization across distributed systems.
What You’ll Do
- Design and implement high-throughput, low-latency transport systems
  - TCP / UDP / QUIC
  - Congestion control, pacing, and flow control
- Optimize distributed system performance across:
  - CPU, memory, and network
  - GPU-to-GPU and node-to-node communication
- Work on real-world bottlenecks in AI/ML infrastructure
- Build and ship production systems in:
  - C / C++
  - Rust (a plus)
- Collaborate with a small, highly technical team to deliver measurable performance gains
What You Bring
Core Requirements
- Strong proficiency in C/C++
- Deep understanding of networking fundamentals
  - Transport protocols
  - Congestion and flow control
- Experience working on systems-level performance problems
- Strong debugging skills and ability to reason from first principles
- Exposure to GPU systems or acceleration
  - CUDA
  - NCCL (NVIDIA Collective Communications Library)
  - GPU data movement or optimization
What We’re Looking For
- Engineers who enjoy working close to the metal
- Strong problem solvers who can break down complex systems
- People who value substance over buzzwords
- Individuals who can contribute quickly and independently
This role is best suited for engineers with a systems and networking background, rather than those focused solely on high-level ML frameworks.
Preferred Experience
- Rust experience (or interest in learning)
- RDMA / InfiniBand / NVLink familiarity
- Experience with erasure coding / FEC
- Mobile or edge networking experience
- High-performance or low-latency systems
- Distributed systems or data plane optimization
- Hardware-adjacent environments (e.g., FPGA, network processors)
How We Work
- Small, senior, highly technical team
- Remote-friendly (U.S. time zones preferred)
- Focus on real output and performance impact, not process overhead
Why Join
- Work on critical infrastructure for AI/ML systems
- Solve problems at the intersection of networking and compute
- Join a team that values technical depth and execution
- Direct impact on performance at scale
Bonus: GPU-Focused Role
We’re also actively interested in engineers with deep GPU communication experience (CUDA, NCCL, NVIDIA stack). If that’s your background, we’d especially like to connect.