MLE Bench – Data Analyst

Typical $25–60/hr · Worldwide · Remote · Coding · Contract / freelance

Pay rate · Typical $25–60/hr

This is the typical hourly range for this type of role; the exact rate is confirmed by the hiring company.

Overview

Analyze ML datasets to evaluate and improve system performance.

About the hiring company

Based in San Francisco, California, the hiring company is the world’s leading research accelerator for frontier AI labs and a trusted partner for global enterprises deploying advanced AI systems. The hiring company supports customers in two ways. First, it accelerates frontier research with high-quality data, advanced training pipelines, and top AI researchers who specialize in coding, reasoning, STEM, multilinguality, multimodality, and agents. Second, it applies that expertise to help enterprises transform AI from proof of concept into proprietary intelligence, with systems that perform reliably, deliver measurable impact, and drive lasting results on the P&L.

Role Overview

We are looking for experienced Data Analysts (MLE Bench) to contribute to benchmark-driven evaluation projects focused on real-world machine learning systems. This role involves hands-on analytical work with production-like datasets, metrics, and ML outputs to help evaluate, diagnose, and improve the performance of advanced AI systems.
The ideal candidate is comfortable working at the intersection of data analysis and machine learning, with strong analytical rigor and the ability to work with real datasets and ML evaluation workflows.

What does day-to-day life look like?

Analyze structured and unstructured datasets generated from ML training, inference, and evaluation pipelines.
Define, compute, and validate metrics used to evaluate model performance and behavior.
Investigate data distributions, model outputs, failure modes, and edge cases relevant to benchmark tasks.
Write and run Python and SQL code to analyze data, create reports, and support evaluation workflows.
Validate data quality, consistency, and correctness across datasets and experiments.
Create clear, well-documented analytical artifacts and reproducible analysis workflows.
Collaborate with ML engineers and researchers to design challenging, real-world evaluation scenarios for MLE Bench.

Requirements

At least 3 years of experience as a Data Analyst or Analytics-focused Engineer.
Strong proficiency in Python for data analysis.
Solid experience with SQL and relational datasets.
Experience analyzing ML outputs and evaluation metrics.
Strong understanding of statistics and analytical reasoning.
Ability to work with large, complex datasets and draw reliable insights.
Experience writing clean, readable, and well-documented analytical code.
Excellent spoken and written English communication skills.

Perks of Freelancing With the hiring company

Work in a fully remote environment.
Opportunity to work on cutting-edge AI projects with leading LLM companies.

Offer Details

Commitments Required

At least 4 hours per day and a minimum of 20 hours per week, with 4 hours of overlap with PST.

Engagement Type

Contractor assignment (no medical/paid leave)

Duration of Contract

3 months (adjustable based on engagement)

Location

India, Pakistan, Nigeria, Kenya, Egypt, Ghana, Bangladesh, Turkey, Brazil, Mexico

Evaluation Process

Technical Interview with a live coding challenge (60 mins)

