Based in Boston, Massachusetts, our team of professionals is dedicated to providing exceptional service and support to our clients. We have the expertise and experience to solve even the most complex technology challenges.

Senior Software Engineer (AI/ML Networking – Systems & GPU Performance) – (LATAM Only)

About Our Client

We’re a deep-tech startup building next-generation networking and communication infrastructure for AI/ML systems. Our work focuses on eliminating bottlenecks in distributed compute — improving how GPUs, clusters, and networks communicate at scale.

This isn’t application-layer AI. We operate at the transport, systems, and hardware-adjacent layers, solving real performance problems across data centers, edge environments, and specialized networks.


The Role

We’re hiring a Senior Software Engineer to work on high-performance networking and GPU communication systems.

You’ll be designing and optimizing the core infrastructure that powers large-scale AI workloads — reducing latency, improving throughput, and unlocking better utilization across distributed systems.


What You’ll Do

  • Design and implement high-throughput, low-latency transport systems
    • TCP / UDP / QUIC
    • Congestion control, pacing, and flow control
  • Optimize distributed system performance across:
    • CPU, memory, and network
    • GPU-to-GPU and node-to-node communication
  • Work on real-world bottlenecks in AI/ML infrastructure
  • Build and ship production systems in:
    • C / C++
    • (Rust is a plus)
  • Collaborate with a small, highly technical team to deliver measurable performance gains

What You Bring

Core Requirements

  • Strong proficiency in C/C++
  • Deep understanding of networking fundamentals
    • Transport protocols
    • Congestion and flow control
  • Experience working on systems-level performance problems
  • Strong debugging skills and ability to reason from first principles
  • Exposure to GPU systems or acceleration
    • CUDA
    • NCCL (NVIDIA Collective Communications Library)
    • GPU data movement or optimization

What We’re Looking For

  • Engineers who enjoy working close to the metal
  • Strong problem solvers who can break down complex systems
  • People who value substance over buzzwords
  • Individuals who can contribute quickly and independently

This role is best suited for engineers with a systems and networking background, rather than those focused solely on high-level ML frameworks.


Preferred Experience

  • Rust experience (or interest in learning)
  • RDMA / InfiniBand / NVLink familiarity
  • Experience with erasure coding / FEC
  • Mobile or edge networking experience
  • High-performance or low-latency systems
  • Distributed systems or data plane optimization
  • Hardware-adjacent environments (e.g., FPGA, network processors)

How We Work

  • Small, senior, highly technical team
  • Remote-friendly (U.S. time zones preferred)
  • Focus on real output and performance impact, not process overhead

Why Join

  • Work on critical infrastructure for AI/ML systems
  • Solve problems at the intersection of networking and compute
  • Join a team that values technical depth and execution
  • Direct impact on performance at scale

Bonus: GPU-Focused Role

We’re also actively interested in engineers with deep GPU communication experience (CUDA, NCCL, NVIDIA stack). If that’s your background, we’d especially like to connect.

Job Category: Data Engineering
Job Type: Contract
Job Location: USA
