Sr. Software Engineer - K8s - GPU Orchestration - REMOTE Job at Living Talent, San Jose, CA

  • Living Talent
  • San Jose, CA

Job Description

GPU Orchestration
  • Startup
  • Company size: 30
  • Remote within North America
  • Compensation: Base Salary 250k + Equity

Key Responsibilities

  • Lead the design, architecture, and development of K8s-based cloud infrastructure.
  • Use K8s controllers, operators, and custom resources (CRs) to implement scalable, high-availability solutions.
  • Integrate Karpenter and/or other advanced tools for infrastructure optimization.
  • Architect MLOps Middleware integration (dynamic workload migration, resource disaggregation).
  • Build monitoring, logging & alerting systems.
  • Drive infrastructure cost optimization through FinOps best practices in K8s deployments.
  • Promote K8s best practices & mentor software engineers.
  • Collaborate across teams to drive K8s adoption in multi-cloud and hybrid environments.
  • Contribute to open-source projects in the Kubernetes community.

Qualifications

Kubernetes Expertise

  • Designing, deploying, and managing K8s clusters (AKS, EKS, GKE, OpenStack, etc.).
  • Hands-on experience with K8s core components (Karpenter, cluster autoscaler, CNI, CSI, CRI, CRD, operators).
  • 5+ years in Kubernetes infrastructure.
  • Contributing to open-source Kubernetes projects.
  • 10+ years of software engineering experience.
  • Proficiency in one or more of Go, Python, Bash, etc.
  • Excellent communication skills for both technical and non-technical stakeholders.
  • Bachelor’s or Master’s degree in Computer Science or related field (preferred).

Preferred Experience

  • GPU scheduling, container orchestration, HPC (high-performance computing) workloads.
  • Familiarity with multi-cloud and hybrid cloud deployments.
  • Experience with MLOps platforms (Kubeflow, TFX, etc.).
  • Experience with or knowledge of FinOps practices and cloud cost management.

