# Adaptible

Stateful LLM serving instances that self-reflect and learn from their mistakes during idle time.
Self-improvement is itself a learnable trait. Different model instances undergoing online learning arrive at different end states—some become stronger self-learners than others. Adaptible explores this phenomenon by creating multiple instances that self-update, releasing them into production, and pruning the weaker learners while propagating successful ones.
The core hypothesis: by diversifying training bets autonomously and greedily sampling from winners, we create an evolutionary bottleneck that selects for models adept at self-improvement. The goal isn’t just a model that learns—it’s discovering which models learn to learn.
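A minimal sketch of that selection loop; `spawn` and `score` are hypothetical stand-ins, not part of the Adaptible API:

```python
import random

def select_self_learners(spawn, score, population=8, generations=5):
    """Illustrative only: spawn self-updating instances, score them after
    a period of online learning, keep the top half, and reseed the rest
    from the winners. `spawn(parent=...)` and `score(instance)` are
    caller-supplied callables."""
    instances = [spawn(parent=None) for _ in range(population)]
    for _ in range(generations):
        instances.sort(key=score, reverse=True)
        winners = instances[: population // 2]            # prune weaker learners
        children = [spawn(parent=random.choice(winners))  # propagate strong ones
                    for _ in range(population - len(winners))]
        instances = winners + children
    return max(instances, key=score)
```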
Adaptible wraps an LLM in a server that records its interactions, reflects on its mistakes during idle time, and fine-tunes its own weights on the resulting corrections. This enables online learning: models that improve through use without full retraining.
## Installation

```bash
pip install adaptible
```

Or from source:

```bash
git clone https://github.com/your-org/adaptible.git
cd adaptible
pip install -e .
```
## Running the Server

```bash
python -m adaptible.local
```

This starts a FastAPI server at http://127.0.0.1:8000. The web UI is available at `/static/`.
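To confirm the server is up, you can hit the `/status` health-check endpoint (listed in the API table below); a minimal probe using only the standard library:

```python
import urllib.request

# Simple liveness probe against the local Adaptible server.
with urllib.request.urlopen("http://127.0.0.1:8000/status") as resp:
    print(resp.status, resp.read().decode())
```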
To manage the server from Python instead:

```python
import asyncio

import adaptible


async def main():
    server = adaptible.MutableHostedLLM(host="127.0.0.1", port=8000)
    await server.up()
    # Server runs until you stop it
    await asyncio.sleep(3600)
    await server.down()


asyncio.run(main())
```
To use the model directly, without a server:

```python
import adaptible

model = adaptible.StatefulLLM()

# Generate a response
response = model.generate_response("What is the capital of France?")
print(response)
```
## API

| Endpoint | Method | Description |
|---|---|---|
| `/interact` | POST | Send a prompt, get a response |
| `/stream_interact` | POST | Stream the response |
| `/trigger_review` | POST | Start the self-correction cycle |
| `/sync` | GET | Wait for background training to complete |
| `/history` | GET | Get all interactions |
| `/status` | GET | Health check |
```bash
curl -X POST http://127.0.0.1:8000/interact \
  -H "Content-Type: application/json" \
  -d '{"prompt": "Hello, how are you?"}'

# Trigger self-correction on recent interactions
curl -X POST http://127.0.0.1:8000/trigger_review

# Wait for training to complete
curl http://127.0.0.1:8000/sync
```
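The same interaction from Python, using only the standard library (the response is printed raw, since this README doesn't pin down its schema):

```python
import json
import urllib.request

# POST a prompt to /interact, mirroring the curl example above.
req = urllib.request.Request(
    "http://127.0.0.1:8000/interact",
    data=json.dumps({"prompt": "Hello, how are you?"}).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode())  # raw JSON; exact fields depend on the server
```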
## Evaluation

Adaptible includes an evaluation harness for measuring self-correction effectiveness.
```bash
# Run evaluation with default settings
python -m adaptible.eval

# Run with options
python -m adaptible.eval \
  --subset 20 \
  --shuffle \
  --category geography \
  --iterations 25
```
Or drive it from Python:

```python
import adaptible.eval as eval

# Load the built-in dataset (100+ trivia questions)
dataset = eval.generate_default_dataset()

# Configure the experiment
config = eval.EvaluationConfig(
    name="my_experiment",
    training_iterations=25,
    train_ratio=0.8,
    shuffle=True,
)

# Run evaluation
harness = eval.EvaluationHarness()
result = harness.run(dataset, config)

# Generate HTML report
eval.generate_html_report(result, "/tmp/report.html")
```
The evaluation records a baseline response for each question, runs the self-correction training loop on the training split, then re-asks the questions to measure improvement and forgetting.
All experiments are persisted to a SQLite database (`outputs/adaptible.db`) for structured analysis. The schema:
```
examples
├── canonical_id, question, ground_truth_answer
├── key_terms, category, difficulty
├── source_type (static_trivia | web_scrape)
├── valid_at (NULL for timeless facts, DATE for time-sensitive)
└── created_at

experiments
├── name, experiment_type (eval | autonomous)
├── config_json, model_checkpoint
└── started_at, completed_at

responses
├── example_id, experiment_id
├── response_text, response_raw
├── phase (baseline | post_training)
├── token_count, max_tokens, truncated
└── created_at

training_events
├── example_id, experiment_id
├── training_iterations, training_time_seconds
└── created_at
```
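Because it's a plain SQLite file, you can also query it directly; a sketch using the standard library, assuming the table and column names shown above:

```python
import sqlite3

# Count responses per phase for one experiment, using the `responses`
# columns from the schema above.
con = sqlite3.connect("outputs/adaptible.db")
for phase, n in con.execute(
    "SELECT phase, COUNT(*) FROM responses WHERE experiment_id = ? GROUP BY phase",
    (1,),
):
    print(phase, n)
con.close()
```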
## Exploring Results

```bash
# Interactive Python REPL with database helpers
python scripts/explore_db.py

# Create demo data first
python scripts/explore_db.py --demo

# Print summary for a specific experiment
python scripts/explore_db.py --summary 1
```
The `notebooks/explore_experiments.ipynb` notebook provides SQL-based exploration; from Python, the `Database` class exposes higher-level helpers:
```python
from adaptible import Database

db = Database()

# Get experiment metrics
metrics = db.compute_metrics(experiment_id=1)
print(f"Improvement rate: {metrics['improvement_rate']:.1%}")

# Find regressions (items that got worse after training)
for r in db.get_regressions(experiment_id=1):
    print(f"{r['example'].canonical_id}: {r['baseline_text']} → {r['post_text']}")

# Export LLM-friendly summary
print(db.export_experiment_summary(experiment_id=1))
```
## Autonomous Learning

The autonomous module enables continuous self-improvement by scraping claims from the web, testing them, and training on corrections.
```bash
# Run the autonomous learning loop
python -m adaptible.autonomous
```
The autonomous node persists state to `outputs/autonomous/state.json` and logs to `outputs/autonomous/logs/`. All training events are also recorded in the experiment database.
Or from Python:

```python
from adaptible.autonomous import AutonomousNode

node = AutonomousNode(
    max_tokens=2048,
    training_iterations=25,
    db_path="outputs/adaptible.db",
)
node.run()
```
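Conceptually, each cycle reduces to something like the following sketch; `Claim`, `answer`, and `train_on` are illustrative stand-ins, not the actual `AutonomousNode` internals:

```python
from dataclasses import dataclass
from typing import Callable, Iterable

@dataclass
class Claim:
    question: str
    ground_truth: str

def autonomous_cycle(
    claims: Iterable[Claim],
    answer: Callable[[str], str],
    train_on: Callable[[Claim], None],
) -> int:
    """One scrape -> test -> train cycle. `answer` and `train_on` stand in
    for the model's generate and fine-tune steps."""
    corrections = 0
    for claim in claims:
        if answer(claim.question).strip() != claim.ground_truth:
            train_on(claim)  # fine-tune on the correction
            corrections += 1
    return corrections
```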
## Meta-Learning Experiments

These experiments test the core hypothesis directly: different instances undergoing online learning arrive at different end states, and some become stronger self-learners than others.
```python
import adaptible.eval as eval

# Configure experiment
config = eval.MetaLearningConfig(
    name="meta_experiment",
    seeds=[42, 123, 456, 789, 1011],  # Run N instances
    checkpoint_interval=10,  # Checkpoint every 10 training events
    training_iterations=25,
    train_ratio=0.8,
)

# Load dataset
dataset = eval.generate_default_dataset()

# Run experiment
experiment = eval.MetaLearningExperiment()
result = experiment.run(dataset, config)

# Analyze results
print(f"Best seed: {result.best_seed}")
print(f"Score variance: {result.score_variance}")

# Save for later analysis
result.save("outputs/meta_experiment.json")
```
The meta-learning score measures how learning efficiency changes over time:

```
meta_learning_score = (late_improvement_rate - early_improvement_rate)
                    + (early_forgetting_rate - late_forgetting_rate)
```
Its component metrics:

- `improvement_rate`: fraction of wrong→right transitions
- `forgetting_rate`: fraction of right→wrong transitions
- `net_learning`: improved - regressed
- `post_accuracy`: current accuracy on trained items
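For concreteness, here is how those metrics reduce to code; a sketch, not the harness's exact implementation (rates here are taken over all items):

```python
def transition_rates(baseline: list[bool], post: list[bool]) -> dict:
    """Compute the metrics above from per-item correctness flags
    recorded before and after training."""
    n = len(baseline)
    improved = sum((not b) and p for b, p in zip(baseline, post))
    regressed = sum(b and (not p) for b, p in zip(baseline, post))
    return {
        "improvement_rate": improved / n,   # wrong -> right
        "forgetting_rate": regressed / n,   # right -> wrong
        "net_learning": improved - regressed,
        "post_accuracy": sum(post) / n,
    }

def meta_learning_score(early: dict, late: dict) -> float:
    """The formula above, given rates for an early and a late window."""
    return (late["improvement_rate"] - early["improvement_rate"]) + (
        early["forgetting_rate"] - late["forgetting_rate"]
    )
```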
## Results

Evaluation runs on a 1.5B-parameter model show net positive self-improvement: the improvement rate exceeds the forgetting rate, indicating the self-correction loop produces net learning. The model identifies some of its own errors, generates corrections, and updates its weights to reflect those corrections.
This is early-stage work. The gains are modest but demonstrate that end-to-end self-improvement is achievable with small models and LoRA fine-tuning.
## Configuration

`StatefulLLM` accepts these parameters:

| Parameter | Default | Description |
|---|---|---|
| `model_name` | `mlx-community/DeepSeek-R1-Distill-Qwen-1.5B` | HuggingFace model path |
| `learning_rate` | `5e-5` | Training learning rate |
| `max_tokens` | `2048` | Max tokens per response |
| `epochs` | `5` | Training epochs per revision |
| `num_lora_layers` | `24` | Number of LoRA layers |
| `lora_parameters` | `{"rank": 32, "dropout": 0.0, "scale": 10.0}` | LoRA config |
| `loop_detection_sequence_length` | `8` | Token sequence length for loop check |
| `loop_detection_max_repetitions` | `3` | Repetitions before stopping generation |
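For example, a lighter configuration might lower the LoRA rank and response budget. Parameter names match the table above; these particular values are illustrative, not recommendations:

```python
import adaptible

# Construct a model with custom settings.
model = adaptible.StatefulLLM(
    model_name="mlx-community/DeepSeek-R1-Distill-Qwen-1.5B",
    learning_rate=5e-5,
    max_tokens=1024,
    epochs=3,
    num_lora_layers=16,
    lora_parameters={"rank": 16, "dropout": 0.0, "scale": 10.0},
)
```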
## Project Structure

```
.
├── adaptible/
│   ├── __init__.py                  # Public API
│   ├── local.py                     # Local server runner
│   └── _src/                        # Internal implementation
│       ├── _api.py                  # FastAPI routes
│       ├── _classes.py              # Data models
│       ├── _llm.py                  # StatefulLLM class with loop detection
│       ├── _server.py               # Server entry point
│       ├── db.py                    # SQLite database layer
│       ├── autonomous/              # Autonomous learning module
│       │   ├── __init__.py
│       │   ├── __main__.py          # CLI entry point
│       │   └── node.py              # AutonomousNode class
│       ├── eval/                    # Evaluation framework
│       │   ├── __init__.py          # EvaluationHarness, TriviaDataset, etc.
│       │   ├── __main__.py          # CLI entry point
│       │   ├── dataset.py           # Trivia dataset
│       │   ├── harness.py           # Evaluation harness
│       │   └── meta.py              # Meta-learning experiments
│       ├── revise/                  # Self-correction logic
│       │   └── revise.py            # Revision prompts and training examples
│       └── tests/                   # Unit tests
├── notebooks/
│   └── explore_experiments.ipynb    # SQL-based experiment analysis
├── scripts/
│   └── explore_db.py                # Interactive database explorer
├── outputs/
│   ├── adaptible.db                 # Experiment database
│   └── autonomous/                  # Autonomous node state and logs
├── examples/
└── pyproject.toml
```
## Testing

```bash
python -m unittest discover -s adaptible -p '*_test.py' -v
```
Contact the maintainers for more information.