ExpertGrid

Systems & Infrastructure Specialist

Pay rate: $40 - $70/hr · Worldwide · Remote · Contract / freelance

Job Summary

In this role, you'll apply your expertise to help train next-generation AI systems. Your work will shape how models learn, reason, and perform through high-quality, real-world input. No prior experience in AI is required — your domain knowledge is what matters.

Key Responsibilities

  • Navigate, troubleshoot, and recover dynamic infrastructure and long-running processes in real time using command-line tools.
  • Manage highly containerized environments, including orchestrating Dockerized sandboxes and CI/CD workflows.
  • Build, maintain, and optimize systems for AI model training and high-throughput compute environments.
  • Respond swiftly to system errors, replanning and recovering mid-operation.
  • Collaborate with engineering and AI teams to ensure seamless integration, reliability, and performance.
  • Document system architectures, incident responses, and recovery protocols clearly and precisely.
  • Contribute expertise to evolving project needs, adapting to new technologies and scaling strategies as required.

Required Skills and Qualifications

  • Demonstrated expert proficiency in terminal environments for system builds, server administration, and infrastructure management.
  • Advanced problem-solving skills for multi-step troubleshooting, filesystem navigation, and process management within containerized settings.
  • Hands-on experience with Python, Bash, JavaScript/TypeScript, Go, Rust, and/or C/C++.
  • Deep familiarity with build systems, package managers, databases, web servers, ML frameworks, version control, and cryptography tools.
  • Proven ability to recover dynamic infrastructure and optimize long-running processes under pressure.
  • Strong written and verbal communication skills, with a passion for precise technical documentation.
  • Systems multilingualism: versatility across operating systems, languages, and emerging DevOps tools.

Preferred Qualifications

  • Prior experience in high-compute environments for AI/ML workloads.
  • Background in Site Reliability Engineering or DevOps roles focused on mission-critical infrastructure.
  • Familiarity with advanced container orchestration and distributed system design.