Core Concepts

This section explains the fundamental concepts and architecture of AgenticPay.

Framework Overview

AgenticPay is built around three main components:

  1. Environments: Define the negotiation scenario and rules

  2. Agents: LLM-powered buyer and seller entities

  3. Models: LLM backends for agent reasoning

┌─────────────────────────────────────────────────────────┐
│                    Environment                          │
│  ┌─────────────┐              ┌─────────────┐          │
│  │   Buyer     │◄────────────►│   Seller    │          │
│  │   Agent     │  Negotiation │   Agent     │          │
│  └──────┬──────┘              └──────┬──────┘          │
│         │                            │                  │
│         └────────────┬───────────────┘                  │
│                      │                                  │
│              ┌───────▼───────┐                          │
│              │  LLM Model    │                          │
│              └───────────────┘                          │
└─────────────────────────────────────────────────────────┘

Environment Types

AgenticPay provides environment types organized by complexity, from a single buyer and seller up to combinations of multiple buyers, sellers, and products:

Single Buyer + Product + Seller

Basic scenarios with one buyer, one product, and one seller.

  • Task1: Basic Price Negotiation

  • Task2: Close Price Negotiation (narrow price ranges)

  • Task3: Close to Market Price Negotiation

Multi-Product Environments

One buyer and one seller negotiating over multiple products.

  • Task1: General multi-product negotiation

  • Task2: Two product negotiation

  • Task3: Five product negotiation

  • Task4: Select three from five products

Multi-Seller Environments

One buyer negotiating with multiple sellers.

  • Parallel: Simultaneous negotiations with multiple sellers

  • Sequential: One-by-one negotiations with sellers

Multi-Buyer Environments

Multiple buyers competing for products.

  • Parallel: Simultaneous buyer negotiations

  • Sequential: One-by-one buyer negotiations

Complex Multi-Agent Environments

Combinations of multiple buyers, sellers, and products:

  • Multi-Buyer + Multi-Seller

  • Multi-Products + Multi-Seller

  • Multi-Buyer + Multi-Products

  • Multi-Buyer + Multi-Products + Multi-Seller

Agent Architecture

Base Agent

All agents inherit from BaseAgent, which provides:

  • Model integration for LLM inference

  • Response generation interface

  • State management

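class BaseAgent:
    """Common base class for buyer and seller agents."""

    def __init__(self, model):
        # LLM backend used to generate responses
        self.model = model

    def respond(self, conversation_history, current_state):
        """Generate the agent's next message using the LLM."""
        # Outline only: subclasses supply the prompt construction and model call.
        raise NotImplementedError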
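
To make the response interface more concrete, the following is a minimal sketch of a subclass, not AgenticPay's actual implementation. CannedModel, its generate method, and SimpleBuyer are hypothetical names introduced for illustration, and the message format (dicts with "role" and "content" keys) is an assumption.

```python
class BaseAgent:
    """Mirrors the BaseAgent outline shown above."""

    def __init__(self, model):
        self.model = model

    def respond(self, conversation_history, current_state):
        raise NotImplementedError


class CannedModel:
    """Hypothetical stand-in for an LLM backend; always returns a fixed reply."""

    def __init__(self, reply):
        self.reply = reply

    def generate(self, prompt):
        # A real backend would send the prompt to an LLM here.
        return self.reply


class SimpleBuyer(BaseAgent):
    """Hypothetical agent that flattens the history into a text prompt."""

    def respond(self, conversation_history, current_state):
        lines = [f"{m['role']}: {m['content']}" for m in conversation_history]
        lines.append(f"round: {current_state['current_round']}")
        prompt = "\n".join(lines)
        return self.model.generate(prompt)


buyer = SimpleBuyer(CannedModel("Would you take $100?"))
history = [{"role": "seller", "content": "It's priced at $150"}]
reply = buyer.respond(history, {"current_round": 1})
print(reply)
```

Keeping the model behind a single method call means the same agent logic can be exercised with a stub during tests and with a real LLM backend in experiments.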

Buyer Agent

The BuyerAgent represents a customer trying to purchase products.

Key attributes:

  • buyer_max_price: Maximum price the buyer is willing to pay (confidential)

  • model: LLM model for generating responses

Behavior:

  • Negotiates to get the lowest possible price

  • Uses user requirements and preferences

  • May walk away if the price exceeds buyer_max_price

Seller Agent

The SellerAgent represents a merchant selling products.

Key attributes:

  • seller_min_price: Minimum acceptable price (confidential)

  • model: LLM model for generating responses

Behavior:

  • Negotiates to maximize sale price

  • Uses product information and market conditions

  • May refuse offers below seller_min_price
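
The two confidential limits together determine whether a deal is possible at all: a price is acceptable to both sides only if it lies between seller_min_price and buyer_max_price. A minimal sketch of that check follows; deal_zone and is_acceptable are hypothetical helper names, not part of AgenticPay.

```python
def deal_zone(buyer_max_price, seller_min_price):
    """Return the (low, high) range of mutually acceptable prices, or None."""
    if seller_min_price > buyer_max_price:
        # The seller's floor is above the buyer's ceiling: no overlap.
        return None
    return (seller_min_price, buyer_max_price)


def is_acceptable(price, buyer_max_price, seller_min_price):
    """True if both the buyer and the seller could accept this price."""
    zone = deal_zone(buyer_max_price, seller_min_price)
    return zone is not None and zone[0] <= price <= zone[1]


print(deal_zone(buyer_max_price=120.0, seller_min_price=80.0))
print(is_acceptable(100.0, buyer_max_price=120.0, seller_min_price=80.0))
print(deal_zone(buyer_max_price=70.0, seller_min_price=80.0))
```

Because each limit is confidential, neither agent can compute this range directly; it is useful mainly for analyzing outcomes after a negotiation ends.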

Conversation Memory

The ConversationMemory class manages dialogue history:

from agenticpay.memory import ConversationMemory

memory = ConversationMemory()

# Add messages
memory.add_message(role="buyer", content="I'd like to buy this jacket")
memory.add_message(role="seller", content="It's priced at $150")

# Retrieve history
full_history = memory.get_history()
recent = memory.get_recent(n=5)

Features:

  • Message storage with metadata (role, round, timestamp)

  • Full or recent history retrieval

  • Role-based filtering
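
As an illustration of the features listed above, here is a small self-contained sketch of a memory with the same kind of behavior. It is not the ConversationMemory implementation: SketchMemory, Message, the round argument, and get_by_role are assumptions made for this example.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Message:
    role: str
    content: str
    round: int
    # Timestamp is filled in when the message is created.
    timestamp: float = field(default_factory=time.time)


class SketchMemory:
    """Hypothetical dialogue store with full, recent, and per-role views."""

    def __init__(self):
        self._messages = []

    def add_message(self, role, content, round=0):
        self._messages.append(Message(role, content, round))

    def get_history(self):
        # Return a copy so callers cannot mutate the stored list.
        return list(self._messages)

    def get_recent(self, n=5):
        return self._messages[-n:] if n > 0 else []

    def get_by_role(self, role):
        return [m for m in self._messages if m.role == role]


memory = SketchMemory()
memory.add_message(role="buyer", content="I'd like to buy this jacket", round=1)
memory.add_message(role="seller", content="It's priced at $150", round=1)
memory.add_message(role="buyer", content="Would you take $100?", round=2)

for message in memory.get_by_role("buyer"):
    print(message.round, message.content)
```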

State Management

The negotiation state tracks:

observation = {
    "conversation_history": [...],    # List of messages
    "current_round": 3,               # Current negotiation round
    "seller_price": 130.0,            # Current seller asking price
    "buyer_offer": 100.0,             # Latest buyer offer
    "product_info": {...},            # Product details
    "environment_info": {...},        # Environmental context
}
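
One way an agent or analysis script might read such a state is to compare the two price fields. The sketch below uses the field names from the observation above; price_gap and offers_crossed are hypothetical helpers, and the elided history and info fields are replaced with empty placeholders.

```python
def price_gap(observation):
    """Distance between the seller's asking price and the buyer's latest offer."""
    return observation["seller_price"] - observation["buyer_offer"]


def offers_crossed(observation):
    """True once the buyer's offer meets or exceeds the asking price."""
    return price_gap(observation) <= 0


observation = {
    "conversation_history": [],   # placeholder; real histories hold messages
    "current_round": 3,
    "seller_price": 130.0,
    "buyer_offer": 100.0,
    "product_info": {},           # placeholder
    "environment_info": {},       # placeholder
}

print(price_gap(observation))
print(offers_crossed(observation))
```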

Reward System

AgenticPay uses a configurable reward system:

reward_weights = {
    "buyer_savings": 1.0,    # Weight for buyer savings
    "seller_profit": 1.0,    # Weight for seller profit
    "time_cost": 0.1,        # Penalty for negotiation rounds
}

Reward Components:

  • Buyer Savings: buyer_max_price - final_price

  • Seller Profit: final_price - seller_min_price

  • Time Cost: Penalty based on rounds used
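
The components above can be combined into a single number in several ways. The sketch below assumes a weighted sum in which the time cost grows linearly with the number of rounds; that combination rule and the compute_reward name are assumptions for illustration, not the documented AgenticPay formula.

```python
def compute_reward(final_price, rounds_used, buyer_max_price,
                   seller_min_price, weights):
    """Weighted sum of buyer savings and seller profit, minus a time penalty."""
    buyer_savings = buyer_max_price - final_price
    seller_profit = final_price - seller_min_price
    # Assumption: the penalty is the time_cost weight times the rounds used.
    time_penalty = weights["time_cost"] * rounds_used
    return (weights["buyer_savings"] * buyer_savings
            + weights["seller_profit"] * seller_profit
            - time_penalty)


reward_weights = {"buyer_savings": 1.0, "seller_profit": 1.0, "time_cost": 0.1}

# Example values chosen for illustration only.
reward = compute_reward(
    final_price=100.0,
    rounds_used=3,
    buyer_max_price=120.0,
    seller_min_price=80.0,
    weights=reward_weights,
)
print(round(reward, 2))
```

With equal weights on savings and profit, the sum of the first two terms is buyer_max_price - seller_min_price regardless of the final price, so under this assumed rule only the time penalty distinguishes outcomes; unequal weights make the reward sensitive to where the price lands.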

Registration System

AgenticPay uses a Gymnasium-like registration system:

from agenticpay import make, register
from agenticpay.envs import pprint_registry

# List all registered environments
pprint_registry()

# Create environment by ID
env = make("Task1_basic_price_negotiation-v0", ...)

# Register custom environment
register(
    id="MyCustomEnv-v0",
    entry_point="my_module:MyEnvClass",
    max_episode_steps=100,
)
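
For readers unfamiliar with this pattern, the following is a simplified sketch of how a registry with "module:ClassName" entry points can work. It is a stand-alone illustration, not AgenticPay's or Gymnasium's code: the register and make functions here are local to the example, and passing the stored keyword arguments to the constructor is an assumption. A standard-library class stands in for an environment class.

```python
import importlib
from fractions import Fraction

_REGISTRY = {}


def register(id, entry_point, **kwargs):
    """Store an entry point string and default constructor arguments by ID."""
    if id in _REGISTRY:
        raise ValueError(f"{id!r} is already registered")
    _REGISTRY[id] = {"entry_point": entry_point, "kwargs": kwargs}


def make(id, **overrides):
    """Import the registered class lazily and instantiate it."""
    spec = _REGISTRY[id]
    module_name, _, attr = spec["entry_point"].partition(":")
    cls = getattr(importlib.import_module(module_name), attr)
    # Call-time arguments take precedence over registered defaults.
    return cls(**{**spec["kwargs"], **overrides})


# fractions.Fraction is used only as a convenient importable class.
register("Half-v0", entry_point="fractions:Fraction", numerator=1, denominator=2)

print(make("Half-v0"))
print(make("Half-v0", denominator=4))
```

Resolving the entry point only inside make keeps registration cheap, since a module is imported the first time one of its environments is actually created.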