# Playground Site Notes

This repo includes a self-contained “Playground” experience: interactive RL environments with simple baselines, training controls, and visualizations intended for learning and demos.

## How It Runs

- **Routes**: Playground pages live under `app/playground/*` (Next.js App Router).
  - Landing: `app/playground/page.tsx`
  - Warehouse: `app/playground/warehouse/page.tsx`
  - Stocks: `app/playground/finance/stocks/page.tsx`
- **Static export constraint**: `next.config.mjs` sets `output: "export"`, so every page is exported to static HTML and there is no server runtime.
  - As a result, **no dynamic server API routes** are available to fetch or proxy live market data.
  - Any external data fetch must happen client-side, unless you switch to a server deployment.

## “Gym-like” Standards

Environments follow a Gym-style lifecycle:

- `reset(seed?)` initializes episode state and returns an observation.
- `step(action)` advances one timestep and returns `{ obs/state, reward, done, truncated, info }`.
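
The lifecycle above can be sketched as TypeScript types. This is a minimal illustration, not the repo's actual API: `StepResult`, `Env`, and the toy `CountdownEnv` are hypothetical names, and treating `done` as "terminal state reached" and `truncated` as "step limit hit" is an assumption borrowed from the usual Gym convention.

```typescript
// Hypothetical types illustrating the reset/step contract (not the repo's actual API).
interface StepResult<Obs> {
  obs: Obs;
  reward: number;
  done: boolean; // assumed meaning: a terminal state was reached
  truncated: boolean; // assumed meaning: the step limit cut the episode short
  info: Record<string, unknown>;
}

interface Env<Obs, Action> {
  reset(seed?: number): Obs;
  step(action: Action): StepResult<Obs>;
}

type CountdownAction = "dec" | "stay";

// Toy environment: a counter that starts at `start` and terminates at 0.
class CountdownEnv implements Env<number, CountdownAction> {
  private value = 0;
  private steps = 0;
  private readonly start: number;
  private readonly maxSteps: number;

  constructor(start = 3, maxSteps = 10) {
    this.start = start;
    this.maxSteps = maxSteps;
  }

  reset(_seed?: number): number {
    // This toy env is deterministic, so the seed is accepted but unused.
    this.value = this.start;
    this.steps = 0;
    return this.value;
  }

  step(action: CountdownAction): StepResult<number> {
    if (action === "dec") this.value -= 1;
    this.steps += 1;
    const done = this.value <= 0;
    const truncated = !done && this.steps >= this.maxSteps;
    return {
      obs: this.value,
      reward: done ? 1 : 0,
      done,
      truncated,
      info: { steps: this.steps },
    };
  }
}
```

A rollout loop then calls `reset()` once and `step()` until either `done` or `truncated` is true.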

Core utilities:

- Deterministic PRNG: `lib/playground/prng.ts`
- Common UI patterns: tooltips, plots, animated rollouts (500ms per step), and “Run episode”/“Train” controls.
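
To show why a deterministic PRNG matters for `reset(seed?)`, here is a minimal sketch of a seeded generator. It uses Mulberry32, a small well-known 32-bit algorithm; this is an assumption for illustration, and `lib/playground/prng.ts` may use a different algorithm and different function names.

```typescript
// Mulberry32: a small seeded generator returning floats in [0, 1).
// Assumption: illustrative only; the repo's prng.ts may differ.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0; // force the seed into unsigned 32-bit range
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Hypothetical helper: uniform integer in [0, n).
function randInt(rand: () => number, n: number): number {
  return Math.floor(rand() * n);
}
```

Two generators created with the same seed produce the same sequence, which is what makes a seeded episode reproducible across reloads.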

## What “RL Algorithms” Are Included

Each environment supports:

- **Random policy** baseline (sanity check).
- **Tabular Q-learning** (ε-greedy exploration) with user controls:
  - `episodes`, `steps/episode`, `γ` (discount), `ε` (exploration), `α` (learning rate)
  - Plots for training returns and rollout traces.

Training runs on the client in chunks scheduled with `requestAnimationFrame`, so the browser can repaint between chunks and the UI stays responsive.
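
The pieces above can be sketched as follows: an ε-greedy action choice, the standard tabular Q-learning update `Q(s,a) += α (r + γ max Q(s',·) − Q(s,a))`, and a chunked training loop. All names here (`QTable`, `chooseAction`, `qUpdate`, `trainChunked`) are hypothetical and not taken from the repo. The scheduler is passed in as a parameter so the sketch also runs outside a browser.

```typescript
// Hypothetical tabular Q-learning helpers; the repo's files may be organized differently.
type QTable = Map<string, number[]>;

// Return the row of action values for a state, creating a zero row on first visit.
function qRow(q: QTable, state: string, nActions: number): number[] {
  let row = q.get(state);
  if (!row) {
    row = new Array<number>(nActions).fill(0);
    q.set(state, row);
  }
  return row;
}

// ε-greedy: explore with probability epsilon, otherwise take the first best action.
function chooseAction(
  q: QTable, state: string, nActions: number, epsilon: number, rand: () => number,
): number {
  if (rand() < epsilon) return Math.floor(rand() * nActions);
  const row = qRow(q, state, nActions);
  return row.indexOf(Math.max(...row));
}

// One Q-learning update; terminal transitions do not bootstrap from the next state.
function qUpdate(
  q: QTable, s: string, a: number, reward: number, sNext: string,
  terminal: boolean, nActions: number, alpha: number, gamma: number,
): void {
  const row = qRow(q, s, nActions);
  const bootstrap = terminal ? 0 : Math.max(...qRow(q, sNext, nActions));
  const old = row[a] ?? 0;
  row[a] = old + alpha * (reward + gamma * bootstrap - old);
}

// Run `total` episodes in chunks, handing control back to `schedule` between chunks.
function trainChunked(
  total: number, chunkSize: number, runEpisode: (index: number) => void,
  schedule: (cb: () => void) => void, onDone: () => void,
): void {
  let next = 0;
  const tick = (): void => {
    const end = Math.min(next + chunkSize, total);
    for (; next < end; next++) runEpisode(next);
    if (next < total) schedule(tick);
    else onDone();
  };
  tick();
}
```

In the browser, `schedule` could be `(cb) => requestAnimationFrame(() => cb())`, so each chunk runs in its own frame and the page can repaint in between.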

## Environments

### 1) Warehouse Robot (Robotics)

- Page: `/playground/warehouse`
- Env core: `lib/playground/warehouseEnv.ts`
- Q-learning: `lib/playground/warehouseQLearning.ts`
- UI: `components/playground/WarehousePlayground.tsx`

Key features:

- Discrete actions: Up/Down/Left/Right/Pickup/Drop
- Interactive editor: start position, dropoff, shelves + item counts, carry capacity, step limit
- Visualizations:
  - Cumulative reward plot
  - **MDP “graph” view** from the latest rollout (empirical state visitation + action probabilities):
    - `components/playground/MdpGraph.tsx`
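
As a rough illustration of what "empirical state visitation + action probabilities" means, the sketch below counts visits per state in a rollout and normalizes the action counts within each state. The `RolloutStep` shape and `summarizeRollout` name are hypothetical; `MdpGraph.tsx` may compute and store this differently.

```typescript
// Hypothetical rollout record: one (state, action) pair per timestep.
interface RolloutStep {
  state: string;
  action: string;
}

interface StateStats {
  visits: number;
  actionProbs: Record<string, number>; // empirical P(action | state)
}

function summarizeRollout(steps: RolloutStep[]): Map<string, StateStats> {
  // First pass: count how often each action was taken in each state.
  const counts = new Map<string, Map<string, number>>();
  for (const { state, action } of steps) {
    const byAction = counts.get(state) ?? new Map<string, number>();
    byAction.set(action, (byAction.get(action) ?? 0) + 1);
    counts.set(state, byAction);
  }

  // Second pass: turn counts into visit totals and per-state frequencies.
  const out = new Map<string, StateStats>();
  counts.forEach((byAction, state) => {
    let visits = 0;
    byAction.forEach((n) => { visits += n; });
    const actionProbs: Record<string, number> = {};
    byAction.forEach((n, action) => { actionProbs[action] = n / visits; });
    out.set(state, { visits, actionProbs });
  });
  return out;
}
```

Each entry then maps naturally onto a graph node (sized by `visits`) with outgoing edges weighted by `actionProbs`.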

### 2) Stocks Trading (Finance)

- Page: `/playground/finance/stocks`
- Env core: `lib/playground/stockTradingEnv.ts`
- Q-learning: `lib/playground/stockTradingQLearning.ts`
- UI: `components/playground/StockTradingPlayground.tsx`

Core mechanics:

- Actions: **Buy (all-in)**, **Sell (all-out)**, **Hold**
- Episode: one pass through a historical price series, at daily or weekly resolution
- Reward: the per-step change in portfolio value, divided by the base investment
- Visualizations:
  - Combined **Price + Portfolio** chart with action markers:
    - `components/playground/PriceActionChart.tsx`
  - Cumulative reward plot
  - “Agent Automation” table (action timeline)
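
The reward rule above can be sketched as below. The assumptions are mine, not confirmed by the source: fractional shares, no fees, the trade executes at the current price, and the portfolio is then valued at the next price. `Portfolio`, `applyAction`, and `stepReward` are hypothetical names.

```typescript
type TradeAction = "buy" | "sell" | "hold";

interface Portfolio {
  cash: number;
  shares: number;
}

function portfolioValue(p: Portfolio, price: number): number {
  return p.cash + p.shares * price;
}

// All-in buy / all-out sell; assumes fractional shares and zero transaction costs.
function applyAction(p: Portfolio, action: TradeAction, price: number): Portfolio {
  if (action === "buy" && p.cash > 0) {
    return { cash: 0, shares: p.shares + p.cash / price };
  }
  if (action === "sell" && p.shares > 0) {
    return { cash: p.cash + p.shares * price, shares: 0 };
  }
  return p; // hold, or nothing to trade
}

// Reward = (value after the price moves - value before) / base investment.
function stepReward(
  p: Portfolio, action: TradeAction, price: number, nextPrice: number, base: number,
): { next: Portfolio; reward: number } {
  const before = portfolioValue(p, price);
  const next = applyAction(p, action, price);
  const after = portfolioValue(next, nextPrice);
  return { next, reward: (after - before) / base };
}
```

For example, buying with 1000 in cash at a price of 100 and then seeing the price move to 110 yields a reward of 0.1 for a base investment of 1000. Holding cash through the same move yields 0.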

#### Market Data Providers + CORS

Because the site is statically exported, market data is fetched in the browser and can be blocked by CORS.

The UI supports:

- **Data provider**:
  - `Auto (Yahoo → Alpha)`
  - `Yahoo`
  - `Alpha Vantage`
- **Proxy mode** for browser CORS fallbacks:
  - `Auto` (tries direct, then public proxies)
  - `Direct`
  - `AllOrigins`
  - `corsproxy.io`

Implementation:

- CORS fallback fetcher: `lib/playground/corsFetch.ts`
- Alpha Vantage parsing/resampling: `lib/playground/alphaVantage.ts`
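
A minimal sketch of the `Auto` proxy behavior (try direct, then fall back to public proxies) is shown below. It is not the code in `corsFetch.ts`: the function names are hypothetical, and the proxy URL formats are assumptions about how these public services are addressed, so verify them against each service's current documentation. The fetch function is injected so the sketch can run without a browser.

```typescript
type ProxyMode = "auto" | "direct" | "allorigins" | "corsproxy";
type Route = Exclude<ProxyMode, "auto">;

// Minimal response shape, compatible with the browser's fetch Response.
type Fetcher = (url: string) => Promise<{ ok: boolean; text(): Promise<string> }>;

// Assumption: these URL formats may change; check each proxy's docs before relying on them.
function proxiedUrl(target: string, via: Route): string {
  const encoded = encodeURIComponent(target);
  if (via === "allorigins") return `https://api.allorigins.win/raw?url=${encoded}`;
  if (via === "corsproxy") return `https://corsproxy.io/?url=${encoded}`;
  return target; // direct
}

// "auto" tries the direct URL first, then each public proxy in turn.
function attemptOrder(mode: ProxyMode): Route[] {
  return mode === "auto" ? ["direct", "allorigins", "corsproxy"] : [mode];
}

async function fetchWithFallback(
  target: string, mode: ProxyMode, fetcher: Fetcher,
): Promise<string> {
  let lastError: unknown = new Error("no attempts made");
  for (const via of attemptOrder(mode)) {
    try {
      const res = await fetcher(proxiedUrl(target, via));
      if (res.ok) return await res.text();
      lastError = new Error(`Request via ${via} returned a non-OK status`);
    } catch (err) {
      // A CORS block surfaces as a rejected fetch, so move on to the next route.
      lastError = err;
    }
  }
  throw lastError;
}
```

In the browser, `fetcher` could simply be `(url) => fetch(url)`.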

Alpha Vantage API key:

- The page accepts an API key input and stores it in `localStorage` (key: `optrl.alphaVantageKey`).
- If blank, it uses the `demo` key (rate-limited; often only works reliably for `IBM`).
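
The key handling described above might look like the following sketch. Only the storage key `optrl.alphaVantageKey` and the `demo` fallback come from this document; the function names and the `KeyValueStore` interface are hypothetical. The store is passed in so the logic can be exercised without a browser.

```typescript
// Subset of the Web Storage API that window.localStorage satisfies.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const ALPHA_KEY_STORAGE = "optrl.alphaVantageKey";

// Persist whatever the user typed, trimmed of surrounding whitespace.
function saveAlphaVantageKey(store: KeyValueStore, key: string): void {
  store.setItem(ALPHA_KEY_STORAGE, key.trim());
}

// Use the saved key if present; otherwise fall back to the rate-limited "demo" key.
function resolveAlphaVantageKey(store: KeyValueStore): string {
  const saved = (store.getItem(ALPHA_KEY_STORAGE) ?? "").trim();
  return saved.length > 0 ? saved : "demo";
}
```

In the page component, `window.localStorage` can be passed as the store.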

## Deploy / Subdomain Notes

If you want this to be a separate site like `playground.optrl.com`:

- Deploy the same build but route the subdomain to the exported playground pages.
- Optionally add host-based rewrites (or a dedicated deployment) so `playground.optrl.com/` maps to `/playground`.

## Adding a New Environment

Pattern to follow:

1. Add environment core under `lib/playground/<env>.ts` (Gym-like `reset/step`).
2. Add algorithm baseline under `lib/playground/<env>QLearning.ts` (or another method).
3. Add UI under `components/playground/<Env>Playground.tsx`.
4. Add a page route under `app/playground/.../page.tsx`.
5. Add a card to the landing page `app/playground/page.tsx`.

