Environments

This section provides detailed documentation for all available negotiation environments.

Environment API

All environments follow a Gymnasium-like API:

# Create environment
env = make("Environment-ID-v0", **kwargs)

# Reset and get initial observation
observation, info = env.reset(**reset_kwargs)

# Run negotiation step
observation, reward, terminated, truncated, info = env.step(
    buyer_action=buyer_action,
    seller_action=seller_action
)

# Display current state
env.render()

# Cleanup
env.close()
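The calls above are normally wrapped in a loop that runs until the episode ends. The following is a minimal sketch of such a loop, not the library's own code. StubNegotiationEnv is a hypothetical stand-in that only mimics the reset/step/close signatures shown above, and run_episode and the two policy callables are illustrative names.

```python
from __future__ import annotations

from typing import Any, Callable


class StubNegotiationEnv:
    """Hypothetical stand-in that mimics the reset/step/render/close API."""

    def __init__(self, max_rounds: int = 3) -> None:
        self.max_rounds = max_rounds
        self.round = 0

    def reset(self, **reset_kwargs: Any) -> tuple[dict, dict]:
        self.round = 0
        return {"round": self.round}, {}

    def step(self, buyer_action: Any, seller_action: Any) -> tuple[dict, float, bool, bool, dict]:
        self.round += 1
        # Toy rule: matching actions count as a deal (terminated);
        # running out of rounds without a deal counts as truncated.
        terminated = buyer_action == seller_action
        truncated = not terminated and self.round >= self.max_rounds
        reward = 1.0 if terminated else 0.0
        return {"round": self.round}, reward, terminated, truncated, {}

    def render(self) -> None:
        print(f"round {self.round}")

    def close(self) -> None:
        pass


def run_episode(
    env: Any,
    buyer_policy: Callable[[dict], Any],
    seller_policy: Callable[[dict], Any],
) -> tuple[float, int]:
    """Drive one episode until it is terminated or truncated."""
    observation, info = env.reset()
    total_reward, steps = 0.0, 0
    done = False
    while not done:
        observation, reward, terminated, truncated, info = env.step(
            buyer_action=buyer_policy(observation),
            seller_action=seller_policy(observation),
        )
        total_reward += reward
        steps += 1
        done = terminated or truncated
    env.close()
    return total_reward, steps
```

With a real environment, the two policies would typically delegate to the buyer and seller agents rather than return fixed values.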

Common Parameters

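All environments share these configuration parameters (multi-agent variants take lists, such as seller_agents or buyer_agents, in place of the single-agent arguments):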

Parameter             Type         Description
--------------------  -----------  ----------------------------------------------
buyer_agent           BuyerAgent   The buyer agent instance
seller_agent          SellerAgent  The seller agent instance
max_rounds            int          Maximum negotiation rounds
initial_seller_price  float        Starting price from seller
buyer_max_price       float        Maximum acceptable price for buyer
seller_min_price      float        Minimum acceptable price for seller
price_tolerance       float        Price difference threshold for agreement
environment_info      dict         Contextual information (weather, season, etc.)
reward_weights        dict         Weights for reward components
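To make the price parameters concrete, here is a small sketch of how they could interact. It is one plausible reading of the descriptions above, not the library's actual logic. The helper names prices_overlap and offers_agree are hypothetical.

```python
def prices_overlap(buyer_max_price: float, seller_min_price: float) -> bool:
    """A deal is only possible if the buyer's ceiling reaches the seller's floor."""
    return buyer_max_price >= seller_min_price


def offers_agree(buyer_offer: float, seller_offer: float, price_tolerance: float) -> bool:
    """Assumed rule: two offers agree when they differ by at most the tolerance."""
    return abs(seller_offer - buyer_offer) <= price_tolerance
```

For example, with buyer_max_price=120.0 and seller_min_price=80.0 the limits overlap, so any price between 80.0 and 120.0 is acceptable to both sides.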

Single Buyer + Product + Seller

Basic negotiation scenarios with one buyer, one product, and one seller.

Task1: Basic Price Negotiation

Environment ID: Task1_basic_price_negotiation-v0

Standard price negotiation between buyer and seller.

env = make(
    "Task1_basic_price_negotiation-v0",
    buyer_agent=buyer,
    seller_agent=seller,
    max_rounds=20,
    initial_seller_price=150.0,
    buyer_max_price=120.0,
    seller_min_price=80.0,
)

Task2: Close Price Negotiation

Environment ID: Task2_close_price_negotiation-v0

Tests edge cases with a narrow bargaining range, where buyer_max_price and seller_min_price are close together.
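A narrow-range scenario can be set up by placing the two limits a small gap apart around a reference price. The sketch below builds such keyword arguments from the common parameters. The helper close_price_kwargs, its center_price and gap arguments, and the 1.25 markup on the opening price are illustrative assumptions, not values taken from the library.

```python
def close_price_kwargs(center_price: float, gap: float, markup: float = 1.25) -> dict:
    """Build kwargs whose buyer/seller limits sit `gap` apart around `center_price`."""
    if gap < 0:
        raise ValueError("gap must be non-negative")
    return {
        "buyer_max_price": center_price + gap / 2,
        "seller_min_price": center_price - gap / 2,
        # Hypothetical opening ask: a fixed markup over the reference price.
        "initial_seller_price": center_price * markup,
    }


kwargs = close_price_kwargs(center_price=100.0, gap=2.0)
```

The resulting dictionary could then be unpacked into make("Task2_close_price_negotiation-v0", ...) together with the agent arguments.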

Task3: Close to Market Price Negotiation

Environment ID: Task3_close_to_market_price_negotiation-v0

Tests scenarios where the initial price is near the market (fair) price.

Multi-Product Environments

Environments for negotiating multiple products.

Task1: Multi-Product Negotiation

Environment ID: Task1_multi_product_negotiation-v0

General multi-product negotiation.

env = make(
    "Task1_multi_product_negotiation-v0",
    buyer_agent=buyer,
    seller_agent=seller,
    max_rounds_per_product=20,
)

observation, info = env.reset(
    products=[
        {"name": "Laptop", "price": 1000.0},
        {"name": "Mouse", "price": 50.0},
        {"name": "Keyboard", "price": 80.0},
    ]
)

Task2-4: Specific Product Counts

  • Task2: Two product negotiation

  • Task3: Five product negotiation

  • Task4: Select three from five products
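Choosing three products out of five leaves ten possible bundles. The sketch below enumerates them and picks the cheapest by list price, assuming the task lets the buyer choose any subset of three. The Monitor and Headset entries and their prices are made up for illustration; the other three products come from the reset example above.

```python
from itertools import combinations

# List prices; Monitor and Headset are hypothetical additions.
prices = {
    "Laptop": 1000.0,
    "Mouse": 50.0,
    "Keyboard": 80.0,
    "Monitor": 300.0,
    "Headset": 120.0,
}

# Every way to choose 3 of the 5 products.
bundles = list(combinations(prices, 3))


def bundle_price(bundle: tuple) -> float:
    """Sum the list prices of the products in one bundle."""
    return sum(prices[name] for name in bundle)


cheapest = min(bundles, key=bundle_price)
```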

Multi-Seller Environments

Environments with multiple sellers competing for a single buyer.

Parallel Multi-Seller

Environment IDs:

  • Task1_parallel_two_seller_negotiation-v0

  • Task2_parallel_three_seller_negotiation-v0

The buyer negotiates with multiple sellers simultaneously.

from agenticpay.agents.seller_agent import SellerAgent

seller1 = SellerAgent(model=model, seller_min_price=80.0)
seller2 = SellerAgent(model=model, seller_min_price=85.0)

env = make(
    "Task1_parallel_two_seller_negotiation-v0",
    buyer_agent=buyer,
    seller_agents=[seller1, seller2],
    max_rounds=20,
)
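When several sellers quote at the same time, the buyer needs a rule for choosing among them. A minimal sketch of one such rule follows: take the lowest quote that does not exceed buyer_max_price. The function best_offer and the seller labels are hypothetical, and the environments may resolve competing offers differently.

```python
from __future__ import annotations

from typing import Optional


def best_offer(offers: dict[str, float], buyer_max_price: float) -> Optional[tuple[str, float]]:
    """Return (seller, price) for the cheapest affordable quote, or None."""
    affordable = {seller: price for seller, price in offers.items() if price <= buyer_max_price}
    if not affordable:
        return None
    seller = min(affordable, key=affordable.get)
    return seller, affordable[seller]
```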

Sequential Multi-Seller

Environment IDs:

  • Task3_sequential_two_seller_negotiation-v0

  • Task4_sequential_three_seller_negotiation-v0

The buyer negotiates with sellers one at a time, using previous negotiations as context.
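One way to picture "previous negotiations as context" is a buyer that carries its best price so far into the next conversation. The sketch below assumes each seller contributes a single final quote and that a later quote only wins if it beats the current best. The function negotiate_sequentially is hypothetical and far simpler than a real multi-round negotiation.

```python
from __future__ import annotations

from typing import Optional


def negotiate_sequentially(
    quotes: list[tuple[str, float]],
    buyer_max_price: float,
) -> tuple[Optional[tuple[str, float]], list[tuple[str, float, float]]]:
    """Visit sellers in order, carrying the best price so far as context."""
    best: Optional[tuple[str, float]] = None
    history = []
    for seller, quote in quotes:
        # The price to beat: the buyer's ceiling at first, then the best deal so far.
        limit = buyer_max_price if best is None else best[1]
        history.append((seller, quote, limit))
        if quote <= buyer_max_price and (best is None or quote < best[1]):
            best = (seller, quote)
    return best, history
```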

Multi-Buyer Environments

Environments with multiple buyers competing for products.

Parallel Multi-Buyer

Environment IDs:

  • Task1_parallel_two_buyer_negotiation-v0

  • Task2_parallel_three_buyer_negotiation-v0

Multiple buyers negotiate with the seller simultaneously.

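# Import path assumed by analogy with the SellerAgent import shown earlier
from agenticpay.agents.buyer_agent import BuyerAgent

buyer1 = BuyerAgent(model=model, buyer_max_price=120.0)
buyer2 = BuyerAgent(model=model, buyer_max_price=115.0)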

env = make(
    "Task1_parallel_two_buyer_negotiation-v0",
    buyer_agents=[buyer1, buyer2],
    seller_agent=seller,
    max_rounds=20,
)

Sequential Multi-Buyer

Environment IDs:

  • Task3_sequential_two_buyer_negotiation-v0

  • Task4_sequential_three_buyer_negotiation-v0

Buyers negotiate with the seller one at a time.

Complex Multi-Agent Environments

Multi-Buyer + Multi-Seller

Environment IDs:

  • Task1_parallel_two_buyer_two_seller_negotiation-v0

  • Task2_parallel_three_buyer_three_seller_negotiation-v0

  • Task3_sequential_two_buyer_two_seller_negotiation-v0

  • Task4_sequential_three_buyer_three_seller_negotiation-v0

Multi-Products + Multi-Seller

Environment IDs:

  • Task1_parallel_two_seller_per_one_product_negotiation-v0

  • Task2_parallel_three_seller_per_one_product_negotiation-v0

  • Task3_sequential_two_seller_per_one_product_negotiation-v0

  • Task4_sequential_three_seller_per_one_product_negotiation-v0

Multi-Buyer + Multi-Products

Environment IDs:

  • Task1_parallel_two_buyer_two_product_negotiation-v0

  • Task2_parallel_three_buyer_two_product_negotiation-v0

  • Task3_sequential_two_buyer_two_product_negotiation-v0

  • Task4_sequential_three_buyer_two_product_negotiation-v0

Multi-Buyer + Multi-Products + Multi-Seller

The most complex scenarios combine multiple buyers, sellers, and products.

Environment IDs:

  • Task1_parallel_two_buyer_two_seller_two_product_negotiation-v0

  • Task2_parallel_three_buyer_three_seller_two_product_negotiation-v0

  • Task3_sequential_two_buyer_two_seller_two_product_negotiation-v0

  • Task4_sequential_three_buyer_three_seller_three_product_negotiation-v0
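The multi-agent IDs listed in this section appear to follow a common pattern: a task number, a mode (parallel or sequential), one or more count-and-role pairs, and a version suffix. The sketch below parses that pattern. It is inferred purely from the IDs shown here, so treat it as an assumption about the naming scheme. parse_env_id and EnvSpec are hypothetical names, and single-agent IDs such as Task1_basic_price_negotiation-v0 are deliberately not covered.

```python
import re
from dataclasses import dataclass

_COUNTS = {"one": 1, "two": 2, "three": 3, "five": 5}
_ID_PATTERN = re.compile(
    r"^Task(?P<task>\d+)_(?P<mode>parallel|sequential)_(?P<body>.+)_negotiation-v(?P<version>\d+)$"
)
_ROLE_PATTERN = re.compile(r"(?:^|_)(one|two|three|five)_(buyer|seller|product)")


@dataclass
class EnvSpec:
    task: int
    mode: str
    counts: dict
    version: int


def parse_env_id(env_id: str) -> EnvSpec:
    """Split a multi-agent environment ID into task, mode, role counts, and version."""
    match = _ID_PATTERN.match(env_id)
    if match is None:
        raise ValueError(f"unrecognized environment ID: {env_id!r}")
    # Each pair looks like "three_buyer" or "two_product".
    counts = {role: _COUNTS[word] for word, role in _ROLE_PATTERN.findall(match["body"])}
    return EnvSpec(
        task=int(match["task"]),
        mode=match["mode"],
        counts=counts,
        version=int(match["version"]),
    )
```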

Creating Custom Environments

You can create custom environments by inheriting from BaseEnv:

from agenticpay.core import BaseEnv
from agenticpay.envs import register

class MyCustomEnv(BaseEnv):
    def __init__(self, buyer_agent, seller_agent, **kwargs):
        super().__init__()
        self.buyer_agent = buyer_agent
        self.seller_agent = seller_agent
        # Custom initialization

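    def reset(self, **kwargs):
        # Initialize negotiation state
        observation = {}  # placeholder: populate with the initial negotiation state
        info = {}  # placeholder: auxiliary diagnostics
        return observation, info

    def step(self, buyer_action, seller_action):
        # Process actions and update state
        observation = {}  # placeholder: populate with the updated state
        reward = 0.0
        terminated = False  # set True when the negotiation reaches an outcome
        truncated = False  # set True when a round or step limit cuts the episode short
        info = {}  # placeholder: auxiliary diagnostics
        return observation, reward, terminated, truncated, info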

    def render(self):
        # Display current state
        pass

    def close(self):
        # Cleanup resources
        pass

# Register the environment
register(
    id="MyCustomEnv-v0",
    entry_point="my_module:MyCustomEnv",
    max_episode_steps=100,
)