AI RSL Tooling Specialist
Build AI systems that improve themselves. You'll own recursive self-learning at Team Wheel: designing feedback loops, eval pipelines and self-optimising agents that get better without human intervention.
The Role
This role exists for one reason: to build AI systems that get better on their own.
We're looking for an AI Tooling Specialist with a deep focus on recursive self-learning — the design and implementation of systems that evaluate their own outputs, identify failure modes, and update their own behaviour without requiring constant human intervention. You'll own this capability end-to-end at Team Wheel, from architecture through to production.
If you've spent time thinking seriously about how to make AI systems that genuinely improve over time — not just ones that are prompted better — this is the role for you.
We believe in paying fairly and being upfront about it. For this role, we're offering a base salary of £80,000–£110,000 depending on experience, with additional performance-related pay. We know the people we're looking for have options, and we want them invested in what we're building together, so we are open to putting together a creative package for the right individual, possibly including shadow equity.
What You'll Do
Design and Build Self-Improving Systems
Architect recursive learning loops in which AI outputs are automatically scored, critiqued and fed back into subsequent generations
Build self-critique and self-refinement pipelines where models evaluate and iteratively improve their own responses before delivery
Implement Constitutional AI-style or critic-model approaches to drive automated quality improvement without needing a human in the loop on every iteration
Design memory and state mechanisms that allow systems to accumulate and apply learnings across sessions and tasks
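To make the loop described above concrete, here is a minimal sketch of a generate, score, critique and regenerate cycle. It is an illustration only, not Team Wheel's implementation: the names `refine`, `toy_generate` and `toy_critique`, the 0.8 threshold and the three-round cap are all assumptions, and the toy functions stand in for real model calls.

```python
from typing import Callable, List, Tuple


def refine(
    task: str,
    generate: Callable[[str, List[str]], str],
    critique: Callable[[str, str], Tuple[float, str]],
    threshold: float = 0.8,   # assumed quality bar
    max_rounds: int = 3,      # assumed cap so the loop always terminates
) -> Tuple[str, float, List[str]]:
    """Generate an output, score it, and feed the critique into the next attempt."""
    feedback: List[str] = []
    best_output, best_score = "", float("-inf")
    for _ in range(max_rounds):
        output = generate(task, feedback)
        score, note = critique(task, output)
        # Keep the best attempt seen so far, in case later rounds get worse.
        if score > best_score:
            best_output, best_score = output, score
        if score >= threshold:
            break
        feedback.append(note)
    return best_output, best_score, feedback


# Hypothetical stand-ins for a generator model and a critic model.
def toy_generate(task: str, feedback: List[str]) -> str:
    return f"{task} (draft {len(feedback) + 1})"


def toy_critique(task: str, output: str) -> Tuple[float, str]:
    draft_number = int(output.rsplit(" ", 1)[1].rstrip(")"))
    score = min(1.0, 0.4 * draft_number)
    return score, f"draft {draft_number} scored {score:.1f}; add more detail"


result, score, notes = refine(
    "Summarise the onboarding policy", toy_generate, toy_critique
)
print(result, score, notes)
```

Tracking the best attempt rather than the last one is a deliberate choice in this sketch: a refinement round can make an output worse, and the loop should not deliver a regression just because it came later.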
Build the Evaluation Layer
Develop robust, automated evaluation frameworks (evals) that measure output quality across multiple dimensions
Create feedback signal pipelines — combining automated scoring, preference signals and sparse human feedback — to drive continuous improvement
Build regression detection so the system flags when it is getting worse, not just when it is getting better
Instrument all systems for observability, so that you can tell at any point whether the loop is working
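As a rough illustration of the regression detection mentioned above, the sketch below compares a candidate's eval scores with a baseline and flags a drop beyond a tolerance. The function name `detect_regression`, the 0.02 tolerance and the sample scores are assumptions made for the example; a production check would more likely use a statistical test and per-dimension scores.

```python
from statistics import mean
from typing import Sequence


def detect_regression(
    baseline: Sequence[float],
    candidate: Sequence[float],
    tolerance: float = 0.02,  # assumed acceptable drop in mean score
) -> bool:
    """Return True when the candidate's mean score is worse than baseline by more than tolerance."""
    if not baseline or not candidate:
        raise ValueError("both score lists must be non-empty")
    return mean(candidate) < mean(baseline) - tolerance


# Made-up eval scores for illustration only.
baseline_scores = [0.82, 0.79, 0.85, 0.80]
improved_scores = [0.84, 0.83, 0.86, 0.81]
degraded_scores = [0.70, 0.74, 0.69, 0.72]

print(detect_regression(baseline_scores, improved_scores))
print(detect_regression(baseline_scores, degraded_scores))
```

A check like this would typically gate promotion of a new prompt or configuration, so that a change is only rolled out when it does not score worse than what is already live.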
Automate the Optimisation Process
Implement automated prompt and chain optimisation (DSPy, TextGrad, or similar) so that the system tunes itself rather than relying on manual prompt engineering
Build processes that generate, test and promote improved instructions or configurations autonomously
Work towards systems that can propose and validate their own architectural changes within defined constraints
Integrate and Deploy
Build integrations with the platforms Team Wheel works with (Workday and other HR SaaS)
Own reliability and performance in production — these are live systems, not experiments
Document the architecture, the learning mechanisms, and the failure modes clearly
What We're Looking For
Essential
Strong Python and hands-on experience building with LLMs at a production level
Demonstrable experience designing recursive or iterative self-improvement loops in AI systems
Deep familiarity with evaluation design — automated scoring, preference modelling, critic models
Experience with prompt/chain optimisation frameworks (DSPy, TextGrad, or equivalent)
Solid understanding of agent architectures and multi-step reasoning pipelines
You can articulate clearly why a self-learning loop failed — not just that it did
Desirable
Background in or strong understanding of RLHF, DPO, or related preference optimisation techniques
Experience with Constitutional AI, self-critique, or debate-style architectures
Exposure to enterprise HR platforms or complex API integrations
Published work, open source contributions, or personal projects in self-improving AI systems
Why Join Team Wheel
Own a genuinely novel capability inside a growing consultancy
Production systems from day one — not research, not prototypes
Work alongside domain experts who understand the HR tech context deeply
A culture that stays close to the frontier and invests in doing this properly
Competitive salary and flexible working
Team Wheel is an equal opportunities employer. We welcome applications from candidates of all backgrounds.
Locations: London
Remote status: Fully Remote
About Team Wheel
Team Wheel is the driving force behind smarter HR in hospitality, helping hotels, restaurants and leisure brands thrive by transforming how they attract, manage, optimise, retain and pay people. As trusted advisors and hands-on partners, we guide organisations through every step of their HR tech journey, from strategy and system selection to implementation and support, working with leading platforms like Workday, Dayforce, HiBob and more. Passionate about people and powered by technology, we exist to make hospitality work better for everyone.