The Role

This role exists for one reason: to build AI systems that get better on their own.

We're looking for an AI Tooling Specialist with a deep focus on recursive self-learning — the design and implementation of systems that evaluate their own outputs, identify failure modes, and update their own behaviour without requiring constant human intervention. You'll own this capability end-to-end at Team Wheel, from architecture through to production.

If you've spent time thinking seriously about how to make AI systems that genuinely improve over time — not just ones that are prompted better — this is the role for you.

We believe in paying fairly and being upfront about it. For this role, we're offering a base salary of £80,000–£110,000 depending on experience, with additional performance-related pay. We know the people we're looking for have options, and we want them invested in what we're building together, so we are open to shaping a creative package for the right individual, possibly including shadow equity.

What You'll Do

Design and Build Self-Improving Systems

Architect recursive learning loops in which AI outputs are automatically scored, critiqued and fed back into subsequent generations
Build self-critique and self-refinement pipelines where models evaluate and iteratively improve their own responses before delivery
Implement Constitutional AI-style or critic-model approaches to drive automated quality improvement without a human in the loop on every iteration
Design memory and state mechanisms that allow systems to accumulate and apply learnings across sessions and tasks
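
To give candidates a feel for the kind of loop described above, here is a minimal Python sketch of a self-critique and refinement cycle. It is illustrative only and not Team Wheel's actual implementation: the names refine, Attempt, toy_generate and toy_critique are hypothetical, and the generator and critic are toy stand-ins where a real system would call a model.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    """One pass through the loop: the draft, its score, and the critic's feedback."""
    text: str
    score: float
    critique: str


def refine(
    task: str,
    generate: Callable[[str, str], str],                # (task, feedback) -> draft
    critique: Callable[[str, str], Tuple[float, str]],  # (task, draft) -> (score, feedback)
    threshold: float = 0.8,
    max_rounds: int = 3,
) -> List[Attempt]:
    """Generate, score, and regenerate until the score clears the threshold
    or the round budget runs out. Returns the full history for inspection."""
    history: List[Attempt] = []
    feedback = ""
    for _ in range(max_rounds):
        draft = generate(task, feedback)
        score, feedback = critique(task, draft)
        history.append(Attempt(draft, score, feedback))
        if score >= threshold:
            break
    return history


# Hypothetical stand-ins for model calls, used only to make the sketch runnable.
def toy_generate(task: str, feedback: str) -> str:
    return task if not feedback else f"{task} (revised: {feedback})"


def toy_critique(task: str, draft: str) -> Tuple[float, str]:
    if "revised" in draft:
        return 0.9, ""
    return 0.4, "add detail"


history = refine("summarise policy", toy_generate, toy_critique)
```

Keeping the whole history, rather than only the final draft, is one way to leave a trail that later evaluation and memory mechanisms can draw on.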

Build the Evaluation Layer

Develop robust, automated evaluation frameworks (evals) that measure output quality across multiple dimensions
Create feedback signal pipelines — combining automated scoring, preference signals and sparse human feedback — to drive continuous improvement
Build regression detection so the system flags when it is getting worse, not just when it is getting better
Instrument all systems for observability, so that you can tell at any point whether the loop is working
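
As a rough illustration of the regression detection mentioned above, the following Python sketch compares a candidate version's eval scores against a baseline, dimension by dimension. The function name detect_regressions, the tolerance value, and the score dictionaries are all assumptions made for the example, not an existing Team Wheel interface.

```python
from statistics import mean
from typing import Dict, List


def detect_regressions(
    baseline: Dict[str, List[float]],
    candidate: Dict[str, List[float]],
    tolerance: float = 0.02,
) -> Dict[str, float]:
    """Return each dimension whose mean score dropped by more than `tolerance`,
    mapped to the size of the drop (a negative number)."""
    regressions: Dict[str, float] = {}
    for dimension, base_scores in baseline.items():
        cand_scores = candidate.get(dimension)
        if not base_scores or not cand_scores:
            # Dimensions without scores on both sides are skipped here;
            # a real system might flag them instead.
            continue
        delta = mean(cand_scores) - mean(base_scores)
        if delta < -tolerance:
            regressions[dimension] = delta
    return regressions


# Illustrative scores only; these are not real results.
baseline_scores = {"accuracy": [0.8, 0.9], "tone": [0.7, 0.7]}
candidate_scores = {"accuracy": [0.7, 0.8], "tone": [0.7, 0.75]}
flagged = detect_regressions(baseline_scores, candidate_scores)
```

Here only "accuracy" is flagged, because "tone" improved slightly. A production version would likely add a significance test rather than a fixed tolerance.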

Automate the Optimisation Process

Implement automated prompt and chain optimisation (DSPy, TextGrad, or similar) so that the system tunes itself rather than relying on manual prompt engineering
Build processes that generate, test and promote improved instructions or configurations autonomously
Work towards systems that can propose and validate their own architectural changes within defined constraints
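
The "generate, test and promote" step above can be pictured as a promotion gate. The Python sketch below is a plain illustration and does not use the DSPy or TextGrad APIs; promote_best, the min_gain margin, and the keyword-based toy_evaluate scorer are hypothetical names chosen for the example.

```python
from typing import Callable, Iterable


def promote_best(
    incumbent: str,
    candidates: Iterable[str],
    evaluate: Callable[[str], float],
    min_gain: float = 0.01,
) -> str:
    """Return the highest-scoring candidate that beats the incumbent by at
    least `min_gain`; otherwise keep the incumbent."""
    baseline = evaluate(incumbent)
    best, best_score = incumbent, baseline
    for candidate in candidates:
        score = evaluate(candidate)
        if score >= baseline + min_gain and score > best_score:
            best, best_score = candidate, score
    return best


# Toy scorer (an assumption for the sketch): fraction of required keywords present.
REQUIRED = ("cite", "concise")


def toy_evaluate(prompt: str) -> float:
    return sum(keyword in prompt for keyword in REQUIRED) / len(REQUIRED)


promoted = promote_best(
    "Answer the question.",
    ["Answer the question and cite sources.", "Be concise and cite sources."],
    toy_evaluate,
)
```

In practice the evaluate callable would run a held-out eval set, so a candidate is promoted only when it improves on data it was not tuned against.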

Integrate and Deploy

Build integrations with the platforms Team Wheel works with (Workday and other HR SaaS)
Own reliability and performance in production — these are live systems, not experiments
Document the architecture, the learning mechanisms, and the failure modes clearly

What We're Looking For

Essential

Strong Python and hands-on experience building with LLMs at a production level
Demonstrable experience designing recursive or iterative self-improvement loops in AI systems
Deep familiarity with evaluation design — automated scoring, preference modelling, critic models
Experience with prompt/chain optimisation frameworks (DSPy, TextGrad, or equivalent)
Solid understanding of agent architectures and multi-step reasoning pipelines
Ability to explain clearly why a self-learning loop failed, not just that it did

Desirable

Background in, or strong understanding of, RLHF, DPO, or related preference optimisation techniques
Experience with Constitutional AI, self-critique, or debate-style architectures
Exposure to enterprise HR platforms or complex API integrations
Published work, open-source contributions, or personal projects in self-improving AI systems

Why Join Team Wheel

Own a genuinely novel capability inside a growing consultancy
Production systems from day one — not research, not prototypes
Work alongside domain experts who understand the HR tech context deeply
A culture that stays close to the frontier and invests in doing this properly
Competitive salary and flexible working

Team Wheel is an equal opportunities employer. We welcome applications from candidates of all backgrounds.

AI RSL Tooling Specialist

Build AI systems that improve themselves. You'll own recursive self-learning at Team Wheel: designing feedback loops, eval pipelines and self-optimising agents that get better without human intervention.