Complete API reference for training custom models, managing training jobs, and running inference with MotteRL.
Start a new model training job with custom reward functions.
{ "model": "qwen-2.5-7b", "learningRate": 0.01, "iterations": 100, "batchSize": 8, "multiStep": false, "rewardFunction": "def reward_fn(completion, **kwargs):\n response = completion[0].get('content', '')\n expected = kwargs.get('expected_result', '')\n return 1.0 if expected.lower() in response.lower() else 0.0", "trainingData": [ { "prompt": "What is artificial intelligence?", "expected_result": "AI is the simulation of human intelligence in machines" }, { "prompt": "Define machine learning", "expected_result": "ML is a subset of AI that learns from data" } ] }
{ "job_id": "train_abc123", "status": "queued", "model": "qwen-2.5-7b", "estimated_duration": "2-4 hours", "created_at": "2024-01-15T10:30:00Z" }
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Base model to train |
| learningRate | number | No | Learning rate (0.001-0.1) |
| iterations | number | No | Training iterations (50-500) |
| batchSize | number | No | Examples per training batch |
| multiStep | boolean | No | Enable multi-step training |
| rewardFunction | string | Yes | Python reward function, passed as a string |
| trainingData | array | Yes | Training examples; fields other than prompt are passed to the reward function |
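The numeric bounds in the table can be checked client-side before a job is submitted, avoiding a round trip that ends in a validation error. A minimal sketch (the function name is ours; the bounds come from the table above):

```python
def validate_training_config(config):
    """Check required fields and numeric bounds from the parameter table."""
    errors = []
    lr = config.get("learningRate")
    if lr is not None and not (0.001 <= lr <= 0.1):
        errors.append("learningRate must be between 0.001 and 0.1")
    iters = config.get("iterations")
    if iters is not None and not (50 <= iters <= 500):
        errors.append("iterations must be between 50 and 500")
    for field in ("model", "rewardFunction", "trainingData"):
        if not config.get(field):
            errors.append(f"{field} is required")
    return errors  # empty list means the config passed every check
```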
Get the current status and progress of a training job.
{ "job_id": "train_abc123", "status": "training", "progress": 0.75, "current_iteration": 75, "total_iterations": 100, "metrics": { "average_reward": 0.82, "loss": 0.15, "learning_rate": 0.01 }, "estimated_completion": "2024-01-15T14:30:00Z", "logs_url": "/api/training/logs/train_abc123" }
Retrieve detailed training logs and metrics.
{ "job_id": "train_abc123", "logs": [ { "timestamp": "2024-01-15T10:30:00Z", "level": "INFO", "message": "Training started with 100 examples" }, { "timestamp": "2024-01-15T10:35:00Z", "level": "INFO", "message": "Iteration 10/100 - Loss: 0.45, Reward: 0.62" } ], "metrics_history": [ {"iteration": 1, "loss": 0.8, "reward": 0.3}, {"iteration": 10, "loss": 0.45, "reward": 0.62}, {"iteration": 20, "loss": 0.32, "reward": 0.75} ] }
Cancel a running training job.
{ "job_id": "train_abc123", "status": "cancelled", "message": "Training job cancelled successfully" }
List all available models and your trained models.
{ "base_models": [ { "id": "qwen-2.5-7b", "name": "Qwen 2.5 7B", "parameters": "7B", "description": "General purpose multilingual model", "training_time": "2-4 hours" }, { "id": "phi-3.5-mini", "name": "Phi 3.5 Mini", "parameters": "3.8B", "description": "Fast inference optimized model", "training_time": "1-2 hours" } ], "custom_models": [ { "id": "model_abc123", "name": "Customer Support Agent", "base_model": "qwen-2.5-7b", "created_at": "2024-01-15T10:30:00Z", "status": "ready" } ] }
Run inference with a trained model.
{ "prompt": "How can I help you with your order?", "max_tokens": 150, "temperature": 0.7, "context": { "customer_id": "cust_123", "order_id": "order_456" } }
{ "model_id": "model_abc123", "response": "I'd be happy to help you with your order! Let me look up the details for order #456. I can see that your order is currently being processed and should ship within 1-2 business days. Is there anything specific you'd like to know about your order?", "tokens_used": 45, "processing_time": 1.2, "confidence": 0.89 }
| Model | Parameters | Best For | Training Time | Cost/Hour |
|---|---|---|---|---|
| Qwen 2.5 7B | 7 billion | General purpose, multilingual | 2-4 hours | $2.50 |
| Phi 3.5 Mini | 3.8 billion | Fast inference, mobile | 1-2 hours | $1.50 |
| Llama 3.1 8B | 8 billion | Code generation, reasoning | 3-5 hours | $3.00 |
| Mistral 7B | 7 billion | Instruction following | 2-3 hours | $2.00 |
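The Cost/Hour column lets you bound a training budget before starting a job: multiply the rate by the upper end of the training-time range. A small sketch (rates from the table above; note that only the Qwen and Phi model IDs appear in the models listing, so the last two keys are assumed for illustration):

```python
HOURLY_RATE_USD = {
    "qwen-2.5-7b": 2.50,
    "phi-3.5-mini": 1.50,
    "llama-3.1-8b": 3.00,  # assumed ID, not confirmed by the models endpoint
    "mistral-7b": 2.00,    # assumed ID, not confirmed by the models endpoint
}

def worst_case_cost(model_id, max_hours):
    """Upper bound on training cost: hourly rate times the longest expected run."""
    return HOURLY_RATE_USD[model_id] * max_hours
```

For example, a Qwen 2.5 7B run at the top of its 2-4 hour range costs at most $10.00.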
- Training job failed due to invalid parameters or data issues.
- Not enough credits to start training. Please add credits to your account.
- The specified model ID does not exist or is not accessible.
- The provided reward function contains syntax errors or invalid operations.