Configuration¶
This section covers all configuration options available in AgenticPay.
Environment Configuration¶
Basic Parameters¶
env = make(
"Task1_basic_price_negotiation-v0",
# Required: Agent instances
buyer_agent=buyer,
seller_agent=seller,
# Negotiation limits
max_rounds=20, # Maximum negotiation rounds
# Price settings
initial_seller_price=150.0, # Starting seller price
buyer_max_price=120.0, # Buyer's maximum (confidential)
seller_min_price=80.0, # Seller's minimum (confidential)
price_tolerance=0.0, # Threshold for price agreement
# Context information
environment_info={
"temperature": "warm",
"season": "summer",
"weather": "sunny",
"location": "downtown mall",
},
# Reward configuration
reward_weights={
"buyer_savings": 1.0,
"seller_profit": 1.0,
"time_cost": 0.1,
},
)
Reward Weights¶
The reward system balances different objectives:
Weight |
Description |
|---|---|
|
Encourages buyer to save money ( |
|
Encourages seller profit ( |
|
Penalizes long negotiations (fewer rounds = better) |
Example configurations:
# Buyer-focused (maximize savings)
reward_weights = {
"buyer_savings": 2.0,
"seller_profit": 0.5,
"time_cost": 0.1,
}
# Balanced
reward_weights = {
"buyer_savings": 1.0,
"seller_profit": 1.0,
"time_cost": 0.1,
}
# Quick resolution focused
reward_weights = {
"buyer_savings": 1.0,
"seller_profit": 1.0,
"time_cost": 0.5,
}
Agent Configuration¶
Buyer Agent¶
buyer = BuyerAgent(
model=model,
buyer_max_price=120.0, # Maximum acceptable price
)
Seller Agent¶
seller = SellerAgent(
model=model,
seller_min_price=80.0, # Minimum acceptable price
)
Model Configuration¶
SGLang¶
from agenticpay.models.sglang_lm import SGLangLM
model = SGLangLM(
model_path="agenticpay/models/download_models/Qwen3-8B-Instruct",
)
vLLM¶
from agenticpay.models.vllm_lm import VLLMLM
model = VLLMLM(
model_path="path/to/model",
trust_remote_code=True,
gpu_memory_utilization=0.9, # GPU memory usage (0.0-1.0)
tensor_parallel_size=4, # Number of GPUs for parallelism
)
OpenAI¶
from agenticpay.models.openai_lm import OpenAILLM
model = OpenAILLM(
model_name="gpt-4", # Model name
api_key="sk-...", # API key (or use env var)
temperature=0.7, # Response randomness
max_tokens=1024, # Maximum response length
)
Reset Configuration¶
When resetting an environment, you can configure:
Product Information¶
observation, info = env.reset(
user_requirement="I need a high-quality winter jacket",
product_info={
"name": "Premium Winter Jacket",
"brand": "Mountain Gear",
"price": 180.0,
"features": ["Waterproof", "Insulated", "Windproof", "Breathable"],
"condition": "New",
"material": "Gore-Tex",
"warranty": "2 years",
},
)
User Profile¶
user_profile = """
- Prefers business/professional style
- Budget-conscious but values quality
- Likes to compare prices before purchasing
- May mention competitor prices during negotiation
"""
observation, info = env.reset(
user_requirement="I need a laptop for work",
product_info={...},
user_profile=user_profile,
)
Multi-Product Configuration¶
observation, info = env.reset(
products=[
{
"name": "Laptop",
"price": 1200.0,
"features": ["16GB RAM", "512GB SSD"],
},
{
"name": "Mouse",
"price": 50.0,
"features": ["Wireless", "Ergonomic"],
},
{
"name": "Keyboard",
"price": 100.0,
"features": ["Mechanical", "RGB"],
},
]
)
Environment Variables¶
AgenticPay supports these environment variables:
# OpenAI API Key
export OPENAI_API_KEY="sk-..."
# HuggingFace Token (for gated models)
export HF_TOKEN="hf_..."
# CUDA Device Selection
export CUDA_VISIBLE_DEVICES="0,1,2,3"
Logging Configuration¶
Configure logging for debugging:
import logging
# Set logging level
logging.basicConfig(level=logging.INFO)
# Enable detailed negotiation logging
logging.getLogger("agenticpay").setLevel(logging.DEBUG)
Best Practices¶
Price Ranges: Ensure
buyer_max_price > seller_min_pricefor achievable dealsRound Limits: Set
max_roundsbased on complexity (5-10 for simple, 20-30 for complex)Memory Management: For long sessions, consider clearing conversation history periodically
GPU Memory: Monitor GPU usage with vLLM’s
gpu_memory_utilizationparameterTemperature: Lower temperature (0.3-0.5) for more consistent negotiations, higher (0.7-0.9) for varied responses