
Migration to llm_config

Migrating from Simple Ponderation to Advanced llm_config


ARAL v1.2.0 — Guide for upgrading persona configurations


ARAL v1.2.0 introduces advanced multi-LLM routing with the new llm_config structure, which provides significantly more capabilities than the simple llm_ponderation approach.

  • Backward Compatible: Simple llm_ponderation still works
  • Enhanced Capabilities: llm_config adds routing, aggregation, optimization
  • PCL v2.1.0 Aligned: Full compatibility with Persona Configuration Language

{
  "config": {
    "llm_ponderation": {
      "enabled": true,
      "weights": {
        "gpt-5.2": 0.8,
        "claude-sonnet-4.5": 0.2
      },
      "merge_strategy": "weighted_blend",
      "execution": "parallel",
      "normalize": true
    }
  }
}

Limitations:

  • ❌ Only supports weighted blending
  • ❌ No task-based routing
  • ❌ No cost or latency optimization
  • ❌ No response aggregation options
  • ❌ No per-provider configuration
  • ❌ No model-specific prompts

Minimum Change - Convert to providers array:

{
  "llm_config": {
    "providers": [
      {
        "provider": "openai",
        "model": "gpt-5.2",
        "weight": 0.8,
        "priority": 1
      },
      {
        "provider": "anthropic",
        "model": "claude-sonnet-4.5",
        "weight": 0.2,
        "priority": 2
      }
    ],
    "routing_strategy": "ponderation"
  }
}
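The mapping from the old weights map to the new providers array is mechanical. Below is a minimal conversion sketch; the `toProviders` helper is hypothetical (not part of the ARAL SDK), and inferring the provider from the model-name prefix is an assumption:

```typescript
// Hypothetical helper: converts a legacy llm_ponderation.weights map
// into an llm_config.providers array. Not part of the ARAL SDK.

interface Provider {
  provider: string;
  model: string;
  weight: number;
  priority: number;
}

function toProviders(weights: Record<string, number>): Provider[] {
  // Assumption: provider can be guessed from the model-name prefix.
  const providerFor = (model: string): string =>
    model.startsWith("gpt") ? "openai"
    : model.startsWith("claude") ? "anthropic"
    : model.startsWith("gemini") ? "google"
    : "unknown";

  return Object.entries(weights)
    // Heaviest weight first, so it receives priority 1.
    .sort(([, a], [, b]) => b - a)
    .map(([model, weight], i) => ({
      provider: providerFor(model),
      model,
      weight,
      priority: i + 1,
    }));
}
```

Sorting by descending weight so the heaviest model becomes priority 1 mirrors the example above.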

Add Routing and Optimization:

{
  "llm_config": {
    "providers": [
      {
        "provider": "openai",
        "model": "gpt-5.2",
        "weight": 0.8,
        "priority": 1,
        "fallback": false,
        "config_override": {
          "temperature": 0.7,
          "max_tokens": 2000
        }
      },
      {
        "provider": "anthropic",
        "model": "claude-sonnet-4.5",
        "weight": 0.2,
        "priority": 2,
        "fallback": false,
        "config_override": {
          "temperature": 0.8,
          "max_tokens": 2000
        }
      },
      {
        "provider": "openai",
        "model": "gpt-3.5-turbo",
        "priority": 3,
        "fallback": true
      }
    ],
    "routing_strategy": "priority_based",
    "cost_optimization": {
      "enabled": true,
      "max_cost_per_request": 0.1,
      "prefer_cheaper": false
    },
    "latency_optimization": {
      "enabled": true,
      "max_latency_ms": 5000
    },
    "fallback_chain": ["gpt-5.2", "claude-sonnet-4.5", "gpt-3.5-turbo"]
  }
}
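To illustrate what priority_based routing with fallback providers implies, here is a sketch of one plausible selection rule, assuming failed providers are skipped in priority order and fallback entries are tried only after all primaries; the actual ARAL runtime may implement this differently:

```typescript
// Sketch of a priority_based picker with fallbacks (assumed behavior,
// not the ARAL runtime's actual implementation).

interface ProviderEntry {
  model: string;
  priority: number;
  fallback?: boolean;
}

function pickModel(providers: ProviderEntry[], failed: Set<string>): string | null {
  const ordered = [...providers].sort((a, b) => a.priority - b.priority);
  // Primaries first, in priority order, skipping models that already failed.
  for (const p of ordered.filter((p) => !p.fallback)) {
    if (!failed.has(p.model)) return p.model;
  }
  // Then fallback entries.
  for (const p of ordered.filter((p) => p.fallback)) {
    if (!failed.has(p.model)) return p.model;
  }
  return null; // every provider in the chain has failed
}
```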

Task-Based Model Selection:

{
  "llm_config": {
    "providers": [
      {
        "provider": "anthropic",
        "model": "claude-opus-4",
        "weight": 0.5,
        "priority": 1
      },
      {
        "provider": "openai",
        "model": "gpt-4",
        "weight": 0.3,
        "priority": 2
      },
      {
        "provider": "anthropic",
        "model": "claude-sonnet-4",
        "weight": 0.2,
        "priority": 3
      }
    ],
    "routing_strategy": "specialized",
    "aggregation": {
      "method": "best_of_n",
      "selection_criteria": ["creativity", "coherence", "engagement"],
      "min_responses": 2,
      "max_responses": 3
    },
    "cost_optimization": {
      "enabled": true,
      "max_cost_per_request": 0.5,
      "budget_limit": 100.0
    }
  },
  "prompts": {
    "system": "You are a creative specialist...",
    "llm_specific": {
      "claude-opus-4": {
        "system": "You excel at deep creative exploration...",
        "prefix": "Creative challenge:\n\n"
      },
      "gpt-4": {
        "system": "You excel at structured creativity...",
        "prefix": "Structured task:\n\n"
      },
      "claude-sonnet-4": {
        "system": "You provide balanced refinement...",
        "prefix": "Refinement task:\n\n"
      }
    }
  },
  "extensions": {
    "routing_rules": {
      "brainstorming": "claude-opus-4",
      "long_form_content": "gpt-4",
      "refinement": "claude-sonnet-4",
      "quick_ideas": "gpt-3.5-turbo"
    }
  }
}
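Conceptually, routing_rules is a task-type-to-model lookup with a default. A minimal sketch (the `routeTask` helper is hypothetical; the real specialized strategy may weigh more signals than the task type alone):

```typescript
// Hypothetical sketch: resolve a task type via extensions.routing_rules,
// falling back to a default model when no rule matches.

type RoutingRules = Record<string, string>;

function routeTask(taskType: string, rules: RoutingRules, defaultModel: string): string {
  return rules[taskType] ?? defaultModel;
}
```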

Migration checklist:

  • Move config.llm_ponderation.weights to llm_config.providers array
  • Add provider and model fields for each entry
  • Convert weight values to provider objects
  • Set routing_strategy (start with priority_based or ponderation)
  • Add priority values to providers
  • Configure fallback providers
  • Add cost_optimization if budget is a concern
  • Add latency_optimization for real-time needs
  • Configure aggregation for quality improvement
  • Add config_override for per-provider settings
  • Add llm_specific prompts for specialized instructions
  • Configure routing_rules for task-based routing
  • Set up circuit_breaker for reliability
  • Configure aggregation.selection_criteria

Choose based on your use case:

Use Case                    | Recommended Strategy | Aggregation Method
General purpose, simple     | priority_based       | first
Budget-constrained          | cost_optimized       | first
Quality-critical            | quality_first        | best_of_n
Real-time interactive       | latency_first        | first
Creative with varied tasks  | specialized          | best_of_n
High-stakes decisions       | consensus            | consensus
Load distribution           | round_robin          | first
Ensemble quality            | ponderation          | weighted_blend

Example 1: General Assistant (Cost-Conscious)

{
  "llm_config": {
    "providers": [
      { "provider": "openai", "model": "gpt-3.5-turbo", "priority": 1 },
      {
        "provider": "anthropic",
        "model": "claude-sonnet-4",
        "priority": 2,
        "fallback": true
      }
    ],
    "routing_strategy": "cost_optimized",
    "cost_optimization": {
      "enabled": true,
      "max_cost_per_request": 0.05,
      "prefer_cheaper": true
    }
  }
}

Example 2: Research Analyst (Quality-First)

{
  "llm_config": {
    "providers": [
      { "provider": "anthropic", "model": "claude-opus-4", "priority": 1 },
      { "provider": "openai", "model": "gpt-4", "priority": 2 },
      { "provider": "google", "model": "gemini-ultra", "priority": 3 }
    ],
    "routing_strategy": "quality_first",
    "aggregation": {
      "method": "best_of_n",
      "selection_criteria": ["accuracy", "detail", "clarity"],
      "min_responses": 2
    }
  }
}

Example 3: Creative Specialist (Specialized)


See creative-specialist.json for the full example.


The old config.llm_ponderation structure still works:

{
  "config": {
    "llm_ponderation": {
      "enabled": true,
      "weights": { "gpt-4": 0.7, "claude-opus": 0.3 }
    }
  }
}

You can include both for compatibility:

{
  "llm_config": {
    "providers": [
      /* new format */
    ]
  },
  "config": {
    "llm_ponderation": {
      "enabled": true,
      "weights": {
        /* old format as fallback */
      }
    }
  }
}

Priority: If both exist, llm_config takes precedence over llm_ponderation.
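That precedence rule can be sketched as follows (`activeFormat` is a hypothetical helper for illustration, not an SDK API):

```typescript
// Sketch of the documented precedence: when both formats are present,
// llm_config wins and llm_ponderation is only a legacy fallback.

interface Persona {
  llm_config?: { providers: unknown[] };
  config?: { llm_ponderation?: { enabled: boolean; weights: Record<string, number> } };
}

function activeFormat(persona: Persona): "llm_config" | "llm_ponderation" | "none" {
  if (persona.llm_config?.providers?.length) return "llm_config";
  if (persona.config?.llm_ponderation?.enabled) return "llm_ponderation";
  return "none";
}
```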


Validate your migrated persona against the schema:
npx ajv validate -s schemas/persona.schema.json -d your-persona.json
Then exercise the migrated persona with the SDK and inspect usage statistics:

import { ARALAgent } from "@aral-standard/sdk";

const agent = new ARALAgent({
  persona: "./your-persona.json",
});
await agent.initialize();

// Test different task types
const brainstormResult = await agent.process({
  input: "Generate creative ideas",
  task_type: "brainstorming",
});

const refinementResult = await agent.process({
  input: "Polish this content: ...",
  task_type: "refinement",
});

const stats = agent.getStats();
console.log({
  totalCost: stats.totalCost,
  avgLatency: stats.averageLatency,
  providerUsage: stats.providerUsage,
});


Key benefits of llm_config:

  1. Flexibility: 9 routing strategies vs 1
  2. Optimization: Cost and latency controls
  3. Quality: Advanced aggregation methods
  4. Specialization: Task-based routing and model-specific prompts
  5. Reliability: Fallback chains and circuit breakers
  6. Standards: PCL v2.1.0 compatibility

When to migrate:

  • Now: if you need task-based routing, cost optimization, or quality aggregation
  • Later: if simple ponderation meets your needs today
  • Never: simple ponderation will remain supported indefinitely

Ready to upgrade? Start with the creative-specialist.json example!