Reinforcement Learning Engine

Train AI agents through reward-based optimization

1. Select Language Model

2. Upload Training Data

or

Drag and drop your JSONL file here or click to browse

Upload training prompts in JSONL format

4. Configure Training

Enable for complex environments with multiple interactions

0.0010.010.1

5. Execute Training

3. Define Reward Function

The reward function evaluates model responses in Python. Must define reward_fn that returns a scalar score.
REWARD FUNCTION
PYTHON

INPUT FORMAT

completion: List of message dicts with model's response
**kwargs: Additional fields from JSONL (e.g., expected_result)

OUTPUT

Must define def reward_fn(completion, **kwargs)
Return a scalar (higher is better)

EXAMPLE REWARD FUNCTIONS