Job Description
Member of Technical Staff (Research Engineering)
Full-time
Remote
The Role
We are seeking a Research Engineer to operate at the frontier of Reinforcement Learning (RL), developing novel environments, training pipelines, and evaluation systems that advance the capabilities of modern AI models. This role sits at the intersection of research and production, translating experimental ideas into scalable, high-performance systems.
What You’ll Work On
Architect self-contained RL environments that capture complex, real-world tasks, including reward functions, verifiers, and evaluation logic.
Design and scale episode pipelines and multi-component training processes (MCPs) to support reproducible experimentation.
Build automated data generation systems, leveraging synthetic data to accelerate training cycles without compromising quality.
Develop and integrate AI-driven evaluation and quality assurance systems for automated grading, validation, and feedback loops.
Fine-tune and optimize open-source RL models using internally generated datasets and custom training strategies.
Establish benchmarking frameworks to measure model capability, robustness, and data quality across tasks.
Contribute to the release and analysis of evaluations on internal and external benchmark platforms (e.g., the hiring company benchmarks).
What We’re Looking For
Deep experience in Reinforcement Learning, including environment design and training dynamics.
Strong track record of building and scaling RL systems, pipelines, or experimentation frameworks.
Proficient in automation and data generation, including synthetic data pipelines.
Familiar with automated evaluation systems, model validation, and quality assurance workflows.
Experienced in fine-tuning and evaluating open-source ML models.
Clear, concise communicator with strong technical writing skills.
Comfortable operating in fast-paced, research-driven, and highly collaborative environments.
Preferred
Experience publishing benchmarks, evaluations, or research artifacts.
Familiarity with evaluation ecosystems (e.g., the hiring company benchmarks or similar frameworks).
Background in scalable infrastructure for large-scale RL experimentation.
Compensation & Benefits Notice
The national pay range for this full-time position is a base salary of $140,000–$180,000 USD. All employees are eligible for equity compensation, and employees may also receive performance-based bonuses, dependent on role and subject to company policies. The hiring company provides a comprehensive benefits package, including up to 100% reimbursement for health insurance premiums, paid time off, a 401(k) plan with a company match, and additional benefits designed to support a high-performing, remote-first workforce.
The hiring company is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex (including pregnancy, sexual orientation, or gender identity), national origin, age, disability, genetic information, veteran status, or any other characteristic protected by applicable local laws, regulations and ordinances. If you need assistance and/or a reasonable accommodation during the application process, reach out to support@the hiring company.ai.
Our hiring process utilizes artificial intelligence tools to assist in candidate screening and assessment. Our AI tools are designed to complement, not replace, human decision-making.
Disclaimer
The information contained in this job posting, including but not limited to ro