AgentScope MVP Technical Blueprint — OTel-Native Observability Stack, SDK Design, and Deployment Topology
Date: 2026-03-20
Context: AgentScope lacks a technical architecture blueprint. This report synthesizes market intelligence (MOKA-301), competitive gaps, and emerging standards into a concrete MVP architecture with quantified market data, competitive pricing analysis, and risk-weighted go/no-go recommendation.
Executive Summary
- Go/no-go: Conditional GO — The agent observability market is real ($7.81B multi-agent platform market in 2025, 47.7% CAGR to $54.9B by 2030 [1]), but the window is narrowing fast. Langfuse’s acquisition by ClickHouse (Jan 2026, $15B valuation [2]) validates the OTel+ClickHouse thesis but also creates a formidable incumbent. AgentScope must ship within 10 weeks and differentiate sharply on multi-agent cost attribution — a gap no competitor has closed.
- AgentScope should be an OTel-native, ClickHouse-backed observability platform purpose-built for multi-agent systems — not a generic LLM tracing tool
- The MVP must differentiate on three axes: (1) multi-agent cost attribution, (2) orchestration-aware tracing (agent handoffs, tool chains, delegation graphs), and (3) self-hosted-first deployment
- SDK strategy: TypeScript-first (aligned with Moklabs stack), Python SDK as fast-follow. Both emit standard OTel spans with AgentScope semantic extensions aligned to the emerging OTel GenAI Agent Semantic Conventions [3]
- Storage: ClickHouse for traces/metrics/logs — 10-20x compression ratio, sub-second queries over billions of spans, columnar design ideal for high-cardinality agent metadata [4]
- The architecture separates into four layers: SDK -> Collector -> Storage -> UI, each independently deployable and replaceable
- ICP: Teams running 3+ agents in production, spending $5K-50K/mo on LLM APIs, needing cost visibility per agent/task/user
- Pricing target: Free self-hosted (AGPL), cloud at $99-499/mo — undercutting LangSmith Plus ($39/seat/mo, so cost scales with team size) and Langfuse Pro ($199/mo) on total cost while delivering superior multi-agent features
1. Should Moklabs Build This? (Go/No-Go)
The Case FOR Building
| Signal | Data Point | Source |
|---|---|---|
| Market size | AI agents market: $7.63B (2025) -> $10.91B (2026) -> $182.97B (2033), CAGR 49.6% | Grand View Research [5] |
| Multi-agent growth | Gartner saw 1,445% surge in multi-agent system inquiries Q1 2024 to Q2 2025 | Gartner [6] |
| Observability spending | 96% of IT leaders expect observability spending to hold steady or grow; 62% plan increases | LogicMonitor 2026 [7] |
| Platform switching | 67% of orgs likely to switch observability platforms within 1-2 years | Grafana Survey 2025 [8] |
| OTel momentum | 95% adoption for new cloud-native instrumentation in 2026; 10B+ daily spans processed | CNCF, byteiota [9][10] |
| Observability ROI | Organizations report 2.6x average ROI from observability spending | Grafana Survey 2025 [8] |
| Agent observability gap | 89% of organizations have implemented observability for agents, but quality issues remain the #1 production barrier (32%) | getmaxim.ai [11] |
The Case AGAINST Building
| Risk | Data Point | Source |
|---|---|---|
| Langfuse+ClickHouse dominance | 20K GitHub stars, 26M+ SDK installs/mo, 2,000+ paying customers, 19 of Fortune 50 | ClickHouse blog [2] |
| Crowded market | 100+ observability tools in use across surveyed orgs; avg org uses 8 tools | Grafana Survey 2025 [8] |
| AGPL adoption friction | Google bans AGPL entirely; many enterprises have blanket AGPL restrictions | Google OSS Policy [12] |
| Braintrust momentum | $80M Series B at $800M valuation (Feb 2026) for AI observability | Axios [13] |
| Tool fatigue | 84% of orgs are consolidating observability tools, not adding new ones | Grafana Survey 2025 [8] |
Verdict: Conditional GO
The market is massive and growing at a 49.6% CAGR, but the competitive moat must come from multi-agent-native features that generic LLM observability tools don’t provide. Langfuse’s 2026 roadmap explicitly targets multi-agent support [14], giving AgentScope roughly 6-9 months of differentiation window. The condition: ship the MVP in 10 weeks or don’t build at all.
2. What Specifically Would We Build? (Concrete MVP)
Architecture Overview
+----------------------------------+
| AgentScope UI |
| (React + TanStack + Tremor) |
| Trace explorer, cost dashboard, |
| agent graph, eval results |
+---------------+------------------+
| SQL / HTTP API
+---------------v------------------+
| ClickHouse Cluster |
| traces / metrics / logs / evals |
| Materialized views for KPIs |
+---------------^------------------+
| OTLP/gRPC + HTTP
+---------------+------------------+
| OTel Collector (custom) |
| Schema enforcement, sampling, |
| cost enrichment, rate limiting |
+---------------^------------------+
| OTLP export
+--------------------------+-------------------------+
| | |
+---------+--------+ +------------+---------+ +-------------+---------+
| TypeScript SDK | | Python SDK | | Generic OTel SDK |
| @agentscope/ts | | agentscope-py | | (any OTel client) |
| | | | | |
| Auto-instrument | | Decorators for | | Manual spans with |
| MCP, A2A, tool | | LangChain, CrewAI, | | GenAI semantic |
| calls, handoffs | | OpenAI SDK | | conventions |
+------------------+ +----------------------+ +-----------------------+
Design Principles
- OTel-native, not OTel-compatible — spans follow GenAI Semantic Conventions including the emerging agent span conventions (experimental, but actively developed [3]). No proprietary wire format.
- Zero-config for common frameworks — auto-instrumentation for MCP servers, OpenAI SDK, Anthropic SDK, LangChain, CrewAI
- Multi-agent first — trace model captures agent identity, delegation chains, and inter-agent communication as first-class concepts (the feature gap no competitor has fully closed)
- Self-hosted default — a single docker compose up deploys the full stack. Cloud offering later.
- Cost as a dimension — every span carries token counts and estimated cost. Materialized views aggregate by agent, task, user, and time window.
3. Trace Model — Agent-Aware Semantic Conventions
AgentScope extends OTel GenAI Semantic Conventions with agent-specific attributes, aligned with the upstream GenAI Agent Spans specification where gen_ai.operation.name uses invoke_agent and create_agent [3]:
Span Types
| Span Kind | Name Pattern | Key Attributes |
|---|---|---|
| Agent Run | invoke_agent.{agent_name} | gen_ai.agent.id, gen_ai.agent.name, agent.framework, gen_ai.request.model, agent.role |
| LLM Call | gen_ai.chat | gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.usage.cost_usd |
| Tool Call | tool.invoke.{tool_name} | tool.name, tool.source (mcp/native/a2a), tool.mcp_server, tool.success |
| Agent Handoff | agent.handoff | agent.handoff.from, agent.handoff.to, agent.handoff.reason |
| Retrieval | retrieval.query | retrieval.source, retrieval.docs_count, retrieval.latency_ms |
| Guardrail | guardrail.check | guardrail.name, guardrail.result (pass/fail/warn), guardrail.latency_ms |
| Eval | eval.run | eval.name, eval.score, eval.criteria, eval.judge_model |
Agent Identity Propagation
// Every span in a multi-agent trace carries:
interface AgentContext {
'gen_ai.agent.id': string; // unique agent instance (aligned with OTel spec)
'gen_ai.agent.name': string; // human-readable name
'agent.session_id': string; // conversation/task session
'agent.parent_agent_id'?: string; // delegation chain
'agent.framework': string; // 'langchain' | 'crewai' | 'openai' | 'custom'
'gen_ai.request.model': string; // 'claude-sonnet-4-6' | 'gpt-4o' etc.
'agent.cost.cumulative_usd': number; // running cost for this agent in session
}
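Delegation-chain propagation follows directly from these fields: when one agent hands work to another, the child inherits the session and records its parent. A minimal TypeScript sketch, assuming simplified field names and a hypothetical childContext helper (not the shipped SDK API):

```typescript
// Illustrative: derive a child agent's context during a handoff, preserving
// the session and extending the delegation chain. Field names are simplified
// camelCase versions of the AgentContext attributes above; the helper itself
// is hypothetical.
interface AgentCtx {
  agentId: string;        // gen_ai.agent.id
  agentName: string;      // gen_ai.agent.name
  sessionId: string;      // agent.session_id
  parentAgentId?: string; // agent.parent_agent_id
}

function childContext(parent: AgentCtx, childId: string, childName: string): AgentCtx {
  return {
    agentId: childId,
    agentName: childName,
    sessionId: parent.sessionId,   // same conversation/task session
    parentAgentId: parent.agentId, // records who delegated the work
  };
}
```

Because every span carries these attributes, the UI can reconstruct the full delegation chain for a session by walking parentAgentId links.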
Cost Attribution Model
Session Cost
|-- Agent A Cost ($0.42)
| |-- LLM calls: $0.38 (3 calls, claude-sonnet-4-6)
| |-- Tool calls: $0.04 (2 MCP calls with upstream API costs)
| +-- Handoff to Agent B
|-- Agent B Cost ($0.15)
| |-- LLM calls: $0.12 (1 call, gpt-4o-mini)
| +-- Retrieval: $0.03 (vector search)
+-- Total: $0.57
Cost is computed at span level using a configurable price table:
const priceTable: PriceTable = {
'claude-sonnet-4-6': { input: 3.0, output: 15.0, cached: 0.30 }, // per 1M tokens
'gpt-4o': { input: 2.5, output: 10.0, cached: 1.25 },
'gpt-4o-mini': { input: 0.15, output: 0.60, cached: 0.075 },
};
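Putting the price table and the cost tree together, per-span pricing and the per-agent rollup can be sketched in a few lines. This is a minimal illustration, assuming the PriceTable shape above; spanCostUsd, costByAgent, and the SpanUsage input type are illustrative names, not the shipped API:

```typescript
// Illustrative sketch of span-level cost computation and per-agent rollup.
// PriceTable matches the shape shown above; SpanUsage is a hypothetical
// projection of an LLM-call span.
type ModelPrice = { input: number; output: number; cached: number }; // USD per 1M tokens
type PriceTable = Record<string, ModelPrice>;

interface SpanUsage {
  agentName: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

function spanCostUsd(span: SpanUsage, prices: PriceTable): number {
  const p = prices[span.model];
  if (!p) return 0; // unknown model: no price entry, no cost attributed
  return (span.inputTokens * p.input + span.outputTokens * p.output) / 1_000_000;
}

// Roll spans up to per-agent totals, mirroring the session cost tree above.
function costByAgent(spans: SpanUsage[], prices: PriceTable): Map<string, number> {
  const totals = new Map<string, number>();
  for (const s of spans) {
    totals.set(s.agentName, (totals.get(s.agentName) ?? 0) + spanCostUsd(s, prices));
  }
  return totals;
}
```

In production this aggregation runs in ClickHouse materialized views rather than application code; the sketch only shows the attribution logic.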
4. SDK Design
TypeScript SDK (Primary)
import { AgentScope } from '@agentscope/ts';
// Initialize -- auto-detects framework
const scope = new AgentScope({
endpoint: 'http://localhost:4318', // OTel collector
projectId: 'my-project',
environment: 'production',
pricing: 'default', // or custom PriceTable
});
// Option 1: Wrap an existing agent framework
const wrappedAgent = scope.wrap(myLangChainAgent, {
name: 'research-agent',
model: 'claude-sonnet-4-6',
});
// Option 2: Manual instrumentation
const span = scope.startAgentRun('planner-agent', {
model: 'gpt-4o',
sessionId: 'session-123',
});
const toolSpan = span.startToolCall('web-search', { source: 'mcp' });
// ... execute tool ...
toolSpan.end({ success: true, resultSize: 3 });
span.end();
// Option 3: Auto-instrumentation (zero-code)
scope.autoInstrument({
openai: true, // patches OpenAI SDK
anthropic: true, // patches Anthropic SDK
mcp: true, // patches MCP client calls
fetch: true, // patches global fetch for API cost tracking
});
Auto-Instrumentation Strategy
| Target | Method | What’s Captured |
|---|---|---|
| OpenAI SDK | Monkey-patch chat.completions.create | Model, tokens, latency, messages, cost |
| Anthropic SDK | Monkey-patch messages.create | Model, tokens, latency, tool_use blocks, cost |
| MCP Client | Wrap callTool() and listTools() | Server name, tool name, args, result, latency |
| A2A Client | Wrap sendTask() | Target agent, task description, result |
| fetch/axios | Optional patch | External API calls with URL, status, latency |
SDK risk note: Monkey-patching is fragile. Both Helicone (proxy-based [15]) and Braintrust (wrapper-based [16]) have explored alternative integration patterns. AgentScope should support both monkey-patch (zero-config DX) and explicit wrapper (production reliability) modes.
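The explicit wrapper mode can be sketched as follows. This is a simplified illustration, assuming a hypothetical withSpan helper and SpanRecorder callback; the published @agentscope/ts surface may differ:

```typescript
// Hypothetical explicit-wrapper mode: no monkey-patching, the caller opts in
// per call site. Names (withSpan, SpanRecorder) are illustrative only.
interface SpanRecord {
  name: string;
  startMs: number;
  durationMs: number;
  error?: string;
}

type SpanRecorder = (span: SpanRecord) => void;

// Wrap any async function; the span is recorded whether it succeeds or throws.
function withSpan<T extends unknown[], R>(
  name: string,
  record: SpanRecorder,
  fn: (...args: T) => Promise<R>,
): (...args: T) => Promise<R> {
  return async (...args: T): Promise<R> => {
    const startMs = Date.now();
    let error: string | undefined;
    try {
      return await fn(...args);
    } catch (err) {
      error = String(err);
      throw err; // re-throw so caller behavior is unchanged
    } finally {
      record({ name, startMs, durationMs: Date.now() - startMs, error });
    }
  };
}
```

The appeal of this mode is that nothing global is patched: teams that distrust monkey-patching in production wrap only the call sites they care about.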
Python SDK
from agentscope import AgentScope, trace_agent
scope = AgentScope(
endpoint="http://localhost:4318",
project_id="my-project",
)
# Decorator-based instrumentation
@trace_agent(name="research-agent", model="claude-sonnet-4-6")
async def research(query: str):
# All LLM calls inside are auto-traced
result = await client.messages.create(...)
return result
# Framework integrations
from agentscope.integrations import langchain_callback, crewai_callback
# LangChain: add as callback handler
chain.invoke(input, config={"callbacks": [langchain_callback(scope)]})
# CrewAI: add as crew callback
crew = Crew(agents=[...], callbacks=[crewai_callback(scope)])
SDK Package Structure
@agentscope/ts # Core SDK
|-- @agentscope/openai # OpenAI auto-instrumentation
|-- @agentscope/anthropic # Anthropic auto-instrumentation
|-- @agentscope/mcp # MCP client instrumentation
|-- @agentscope/langchain # LangChain.js integration
+-- @agentscope/vercel-ai # Vercel AI SDK integration
agentscope-py # Python core
|-- agentscope[openai] # OpenAI instrumentation
|-- agentscope[anthropic] # Anthropic instrumentation
|-- agentscope[langchain] # LangChain callback
+-- agentscope[crewai] # CrewAI callback
5. Storage Layer — ClickHouse Schema
Why ClickHouse (Quantified)
| Criteria | ClickHouse | PostgreSQL | Elasticsearch |
|---|---|---|---|
| Write throughput | 1M+ rows/sec | ~50K rows/sec | ~100K docs/sec |
| Query latency (1B rows) | Sub-second | Minutes | Seconds |
| Compression ratio | 10-20x (up to 50x reported [4]) | 3-5x | 5-8x |
| Storage cost/TB (cloud) | ~$25-35/TB compressed [17] | ~$100/TB | ~$200/TB |
| Storage cost/TB (self-hosted) | ~$5/mo (NVMe) | ~$20/mo | ~$50/mo |
| OTel native support | Yes (ClickStack) | No | Yes (via APM) |
| Columnar analytics | Native | No | Partial |
| Self-hosted complexity | Medium | Low | High |
Validation: SigNoz, a ClickHouse+OTel observability platform, has proven this stack in production as a Datadog alternative [18]. Langfuse migrated to ClickHouse pre-acquisition and reported significant query performance improvements for trace analytics [19].
Counter-point: ClickHouse’s operational complexity is non-trivial for self-hosted users. TimescaleDB outperforms ClickHouse 56x on small batch writes (14,200 vs 250 ops/s at batch size 100 [20]) — relevant for low-volume early adopters. Mitigation: provide ClickHouse Cloud as managed option; optimize batch sizes in collector.
Core Tables
-- Traces table (main fact table)
CREATE TABLE agentscope.traces (
trace_id FixedString(32),
span_id FixedString(16),
parent_span_id FixedString(16),
span_name LowCardinality(String),
span_kind Enum8('agent_run'=1, 'llm_call'=2, 'tool_call'=3,
'handoff'=4, 'retrieval'=5, 'guardrail'=6, 'eval'=7),
start_time DateTime64(9), -- nanosecond precision
end_time DateTime64(9),
duration_ns UInt64,
status_code Enum8('ok'=0, 'error'=1),
-- Project / environment
project_id LowCardinality(String),
environment LowCardinality(String),
-- Agent identity
agent_id String,
agent_name LowCardinality(String),
agent_framework LowCardinality(String),
agent_model LowCardinality(String),
agent_session_id String,
agent_parent_id String DEFAULT '',
-- LLM specifics
input_tokens UInt32 DEFAULT 0,
output_tokens UInt32 DEFAULT 0,
cached_tokens UInt32 DEFAULT 0,
cost_usd Float64 DEFAULT 0,
-- Tool specifics
tool_name LowCardinality(String) DEFAULT '',
tool_source LowCardinality(String) DEFAULT '',
tool_success Nullable(UInt8),
-- Flexible attributes (for everything else)
attributes Map(String, String),
-- Partitioning
_date Date DEFAULT toDate(start_time)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(_date)
ORDER BY (project_id, _date, trace_id, span_id)
TTL _date + INTERVAL 90 DAY;
-- Materialized view: cost by agent per hour
CREATE MATERIALIZED VIEW agentscope.cost_by_agent_hourly
ENGINE = SummingMergeTree()
ORDER BY (project_id, agent_name, agent_model, hour)
AS SELECT
project_id,
agent_name,
agent_model,
toStartOfHour(start_time) AS hour,
sum(cost_usd) AS total_cost,
sum(input_tokens) AS total_input_tokens,
sum(output_tokens) AS total_output_tokens,
count() AS span_count,
countIf(status_code = 'error') AS error_count
FROM agentscope.traces
WHERE span_kind = 'llm_call'
GROUP BY project_id, agent_name, agent_model, hour;
-- Materialized view: session-level aggregates
CREATE MATERIALIZED VIEW agentscope.session_summary
ENGINE = AggregatingMergeTree()
ORDER BY (project_id, agent_session_id)
AS SELECT
project_id,
agent_session_id,
min(start_time) AS session_start,
max(end_time) AS session_end,
sum(cost_usd) AS total_cost,
sum(input_tokens + output_tokens) AS total_tokens,
countIf(span_kind = 'llm_call') AS llm_calls,
countIf(span_kind = 'tool_call') AS tool_calls,
countIf(span_kind = 'handoff') AS handoffs,
countIf(status_code = 'error') AS errors,
uniqExact(agent_name) AS unique_agents
FROM agentscope.traces
GROUP BY project_id, agent_session_id;
Data Retention Strategy
| Tier | Duration | Resolution | Storage |
|---|---|---|---|
| Hot | 7 days | Full spans with all attributes | NVMe SSD |
| Warm | 30 days | Full spans, compressed attributes | SSD |
| Cold | 90 days | Aggregated materialized views only | S3/R2 (via ClickHouse tiered storage) |
| Archive | 1 year | Session summaries + cost rollups | S3/R2 (Parquet export) |
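The tiering rules above reduce to a simple age lookup. A purely illustrative sketch (tier names mirror the table; the actual transitions are enforced by ClickHouse TTL clauses, not application code):

```typescript
// Map span age (in days) to the retention tier from the table above.
// In production these boundaries are ClickHouse TTL rules; this function
// only documents the policy.
type Tier = 'hot' | 'warm' | 'cold' | 'archive' | 'expired';

function retentionTier(ageDays: number): Tier {
  if (ageDays <= 7) return 'hot';       // full spans, all attributes, NVMe SSD
  if (ageDays <= 30) return 'warm';     // full spans, compressed attributes
  if (ageDays <= 90) return 'cold';     // aggregated materialized views only
  if (ageDays <= 365) return 'archive'; // session summaries + cost rollups
  return 'expired';                     // past the 1-year archive window
}
```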
6. Collector Layer — OTel Collector Configuration
Custom OTel Collector with AgentScope-specific processors:
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
# Enrich spans with cost data based on model + tokens
agentscope_cost_enrichment:
price_table_url: "http://agentscope-api:8080/prices"
fallback_prices:
default_input_per_1m: 1.0
default_output_per_1m: 3.0
# Enforce schema -- reject malformed spans, add defaults
agentscope_schema_validator:
required_attributes:
- project_id
defaults:
environment: "development"
# Tail-based sampling -- keep all errors, sample success
tail_sampling:
decision_wait: 10s
policies:
- name: errors-always
type: status_code
status_code: { status_codes: [ERROR] }
- name: high-cost
type: numeric_attribute
numeric_attribute:
key: cost_usd
min_value: 0.10
- name: sample-success
type: probabilistic
probabilistic: { sampling_percentage: 25 }
batch:
send_batch_size: 10000
timeout: 5s
exporters:
clickhouse:
endpoint: tcp://clickhouse:9000
database: agentscope
traces_table_name: traces
ttl: 90d
create_schema: true
service:
pipelines:
traces:
receivers: [otlp]
processors: [agentscope_schema_validator, agentscope_cost_enrichment, tail_sampling, batch]
exporters: [clickhouse]
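The three sampling policies in the config compose into a single per-trace decision. A simplified TypeScript model of that decision (not the actual tail_sampling processor implementation, which evaluates policies over buffered spans after decision_wait expires):

```typescript
// Simplified model of the tail-sampling decision above: keep all errors,
// keep expensive traces, probabilistically sample the rest at 25%.
interface TraceSummary {
  hasError: boolean;  // any span in the trace ended with status ERROR
  maxCostUsd: number; // highest cost_usd attribute seen in the trace
}

function shouldKeep(trace: TraceSummary, random: () => number = Math.random): boolean {
  if (trace.hasError) return true;           // errors-always policy
  if (trace.maxCostUsd >= 0.10) return true; // high-cost policy ($0.10 threshold)
  return random() < 0.25;                    // sample-success: 25% of the rest
}
```

Because the decision is tail-based, a trace that starts cheap and healthy but ends in an expensive error is still kept in full.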
Cost Enrichment Pipeline
The collector intercepts every span with gen_ai.* attributes and:
- Reads gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens
- Looks up per-token prices from the configurable price table
- Computes cost_usd = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
- Attaches cost_usd as a span attribute before writing to ClickHouse
This means SDKs don’t need to know prices — cost is computed server-side and stays current. This mirrors how Helicone handles cost calculation for 300+ models via their open-source cost repository [15].
7. UI Layer — Dashboard Architecture
Tech Stack
| Component | Technology | Rationale |
|---|---|---|
| Framework | React 19 + Vite | Moklabs standard, fast dev |
| Routing | TanStack Router | Type-safe, file-based |
| Data fetching | TanStack Query | Caching, optimistic updates |
| Charts | Tremor + Recharts | Pre-built analytics components |
| Table | TanStack Table | Virtualized, sortable, filterable |
| Graph visualization | React Flow | Agent delegation graphs (interactive DAG) |
| Styling | Tailwind CSS + Radix | Consistent with Moklabs design system |
Core Views (MVP)
- Trace Explorer — Waterfall view of spans within a trace. Filter by agent, model, time range, cost range, status. Click to expand span details with full attributes, messages, tool inputs/outputs.
- Agent Dashboard — Per-agent cards showing: total cost (24h/7d/30d), request count, error rate, avg latency, top tools used. Drill-down to agent’s recent traces.
- Cost Analytics — Time-series chart of spend by agent, model, project. Table with cost breakdown. Budget alerts configuration. Comparison view (this week vs last week). This is the killer feature — no competitor offers per-agent, per-session cost attribution with budget alerting.
- Session Replay — Timeline of a complete multi-agent session. Shows agent handoffs as a directed graph. Click any node to see the agent’s trace waterfall. This is AgentScope’s unique differentiator — LangSmith only supports LangGraph, Langfuse has generic spans, Braintrust focuses on evals [16][21][22].
- Eval Dashboard (v1.1) — Evaluation scores over time. A/B comparison of prompt versions. Regression detection alerts.
API Layer
GET /api/traces?project={id}&from={ts}&to={ts}&agent={name}&min_cost={usd}
GET /api/traces/{traceId}/spans
GET /api/agents?project={id}
GET /api/agents/{name}/stats?window=24h
GET /api/costs/by-agent?project={id}&window=7d
GET /api/costs/by-model?project={id}&window=7d
GET /api/sessions/{sessionId}
GET /api/sessions/{sessionId}/graph -- returns agent delegation DAG
POST /api/evals -- submit eval results
GET /api/evals?project={id}&name={evalName}&window=30d
API server is a lightweight Node.js (Hono or Fastify) service that translates REST requests to ClickHouse SQL queries.
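The delegation-DAG endpoint is the least obvious of these: it reduces a session's handoff spans to nodes and deduplicated edges. A minimal sketch of that reduction, assuming an illustrative HandoffSpan projection of the traces table (agent.handoff.from / agent.handoff.to attributes):

```typescript
// Build the agent delegation graph for a session from its handoff spans.
// HandoffSpan is an illustrative projection of traces-table rows; agent
// names are assumed not to contain the '->' separator used as a map key.
interface HandoffSpan {
  from: string; // agent.handoff.from
  to: string;   // agent.handoff.to
}

interface AgentGraph {
  nodes: string[]; // unique agent names seen in the session
  edges: { from: string; to: string; count: number }[]; // deduped handoffs
}

function buildAgentGraph(handoffs: HandoffSpan[]): AgentGraph {
  const nodes = new Set<string>();
  const edgeCounts = new Map<string, number>();
  for (const h of handoffs) {
    nodes.add(h.from);
    nodes.add(h.to);
    const key = `${h.from}->${h.to}`;
    edgeCounts.set(key, (edgeCounts.get(key) ?? 0) + 1);
  }
  const edges = [...edgeCounts.entries()].map(([key, count]) => {
    const [from, to] = key.split('->');
    return { from, to, count };
  });
  return { nodes: [...nodes], edges };
}
```

The UI feeds this structure directly into React Flow, with edge counts rendered as handoff frequency.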
8. Deployment Topology
Self-Hosted (MVP Default)
# docker-compose.yml
services:
clickhouse:
image: clickhouse/clickhouse-server:24.12
volumes:
- clickhouse_data:/var/lib/clickhouse
- ./init-schema.sql:/docker-entrypoint-initdb.d/init.sql
ports:
- "8123:8123" # HTTP
- "9000:9000" # Native
collector:
image: ghcr.io/moklabs/agentscope-collector:latest
depends_on: [clickhouse]
ports:
- "4317:4317" # gRPC
- "4318:4318" # HTTP
environment:
- CLICKHOUSE_DSN=tcp://clickhouse:9000/agentscope
api:
image: ghcr.io/moklabs/agentscope-api:latest
depends_on: [clickhouse]
ports:
- "8080:8080"
environment:
- CLICKHOUSE_DSN=http://clickhouse:8123/agentscope
ui:
image: ghcr.io/moklabs/agentscope-ui:latest
depends_on: [api]
ports:
- "3000:3000"
environment:
- API_URL=http://api:8080
Total resource requirements (MVP):
- ClickHouse: 2 CPU, 4GB RAM, 50GB SSD (handles ~10M spans/day)
- Collector + API + UI: 1 CPU, 1GB RAM combined
- Minimum: single VPS with 4GB RAM can run the full stack
Cloud Offering (Post-MVP)
- Managed ClickHouse Cloud backend (~$25-35/TB compressed storage [17])
- Multi-tenant with project-level isolation
- Collector runs as edge workers (Cloudflare) for low-latency ingest
- UI served from CDN
9. Who Buys It and For How Much? (ICP + Willingness to Pay)
Ideal Customer Profile
| Attribute | Description |
|---|---|
| Company stage | Series A-C startups and mid-market teams building AI-powered products |
| Team size | 3-20 engineers working on AI/agent features |
| Agent maturity | Running 3+ agents in production, at least one multi-agent workflow |
| LLM spend | $5K-50K/mo on API costs (pain point: “where does the money go?”) |
| Current tooling | Using basic logging or outgrowing Langfuse free tier / LangSmith developer plan |
| Tech stack | TypeScript or Python, using OpenAI/Anthropic APIs or LangChain/CrewAI |
Competitive Pricing Landscape
| Platform | Free Tier | Entry Paid | Mid-Tier | Enterprise | Self-Hosted |
|---|---|---|---|---|---|
| Langfuse | 50K units/mo | $29/mo (Core) | $199/mo (Pro) | $2,499/mo | Free (MIT), $500/mo enterprise features [23] |
| LangSmith | 5K traces/mo | $39/seat/mo (Plus) | ~$195/mo (5 seats) | Custom | BYOC option [24] |
| Braintrust | Free tier | Usage-based, no seat limits | Usage-based | Custom | No [16] |
| Arize Phoenix | Unlimited (OSS) | $50/mo (managed cloud) | $500/mo | $50K-100K/yr (AX) | Free (ELv2) [25] |
| Helicone | 10K requests/mo | Usage-based | Usage-based | Custom | Yes (OSS) [15] |
| Datadog LLM Obs | None | ~$120/day auto-activation on LLM spans | Per-span billing | Bundled with APM | No [26] |
| AgentScope | Unlimited (self-hosted) | $99/mo (cloud) | $299/mo | $499/mo | Free (AGPL) |
AgentScope Pricing Rationale
- Free self-hosted: Drives adoption, builds community, creates migration funnel to cloud
- $99/mo cloud entry: Undercuts Langfuse Pro ($199/mo) while offering superior multi-agent features
- No per-seat pricing: Follows Braintrust model — team-friendly, removes adoption friction
- Usage-based overage: $5 per 100K additional spans (vs Langfuse $8 per 100K units [23])
Market Sizing (Bottom-Up)
- ~50,000 companies actively building with AI agents (estimate based on 26M+ Langfuse SDK installs/mo [2])
- 5% addressable in year 1 with multi-agent use cases = 2,500 potential customers
- 2% conversion to paid cloud at avg $200/mo = 50 customers x $200 x 12 = $120K ARR Year 1
- Growth target: 500 cloud customers by Year 2 = $1.2M ARR
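The bottom-up arithmetic above can be checked mechanically (the inputs are the report's own estimates, not independent data):

```typescript
// Reproduce the bottom-up sizing arithmetic from the bullets above.
const companies = 50_000;                              // companies building with agents (estimate)
const addressable = Math.round(companies * 0.05);      // 2,500 with multi-agent use cases
const paidCustomers = Math.round(addressable * 0.02);  // 50 paid cloud conversions
const year1Arr = paidCustomers * 200 * 12;             // $120,000 ARR at $200/mo average
const year2Arr = 500 * 200 * 12;                       // $1,200,000 ARR at 500 customers
```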
10. Competitive Positioning — What’s the Unfair Advantage?
Feature Comparison Matrix
| Feature | Langfuse | LangSmith | Braintrust | Arize Phoenix | Helicone | AgentScope |
|---|---|---|---|---|---|---|
| Multi-agent traces | Generic spans | LangGraph only | Generic spans | Generic spans | Sessions | Native agent identity, handoffs, delegation graphs |
| Cost attribution | Per-trace | Per-trace | Per-trace | Basic | Per-request | Per-agent, per-session, per-user with budget alerts |
| Agent delegation graph | No | No | No | No | No | Visual DAG of agent-to-agent handoffs |
| OTel native | Yes (post-acq) | No | No | Yes | No (proxy) | Yes (born OTel, aligned with GenAI agent conventions) |
| Self-hosted | Yes (MIT) | BYOC only | No | Yes (ELv2) | Yes (OSS) | Yes (AGPL) |
| Framework agnostic | Yes | Mostly LangChain | Yes | Yes | Yes | Yes |
| ClickHouse backend | Yes (post-acq) | No | Brainstore (custom) | No | No | Yes (from day one) |
| Eval integration | Yes | Yes | Best-in-class | Yes | Basic | Planned (v1.1) |
Why Moklabs, Why Now
- Multi-agent is unsolved: Every competitor traces individual LLM calls. None visualize multi-agent delegation as a first-class DAG with per-agent cost attribution. This is AgentScope’s category.
- OTel GenAI conventions are crystallizing: The agent span conventions moved from proposal to experimental in 2025 [3]. Building on this standard now means AgentScope becomes the reference implementation.
- ClickHouse acquisition created noise: Langfuse’s acquisition introduces uncertainty for self-hosted users. Some will look for alternatives — AgentScope should be there.
- Moklabs stack advantage: OctantOS and Paperclip are multi-agent orchestrators. AgentScope is the natural observability layer for our own stack, giving us dogfooding advantage and a built-in distribution channel.
Unique Value Props for Marketing
- “See your agents think” — Session replay with agent delegation graphs. No other tool visualizes multi-agent collaboration.
- “Know what your agents cost” — Real-time cost attribution per agent, per task, per user. Budget alerts before you get a surprise bill.
- “OTel-native, vendor-free” — Standard OpenTelemetry. Switch backends without changing code. No proprietary lock-in.
- “One command to deploy” — docker compose up and start tracing in 5 minutes. No cloud account needed.
11. What Kills This Idea? (Top 5 Risks + Counter-Arguments)
Risk 1: Langfuse + ClickHouse Closes the Multi-Agent Gap (CRITICAL)
Threat: Langfuse’s 2026 roadmap explicitly targets “production monitoring and analytics for real agent systems” [14]. With ClickHouse’s $400M in fresh capital and $15B valuation, they can hire 10x our team and ship multi-agent features within 6 months.
Why AgentScope might still lose: Langfuse has 20K GitHub stars, 26M+ SDK installs/mo, 2,000+ paying customers, and 19 of the Fortune 50 [2]. Network effects in developer tooling are brutal — teams default to what their peers use.
Mitigation: Ship before they do. Focus on the multi-agent DAG visualization and per-agent cost attribution — features that require deep architectural decisions Langfuse can’t easily bolt on. Also: AGPL protects against ClickHouse bundling AgentScope’s innovations into their proprietary cloud.
Risk 2: AGPL License Limits Enterprise Adoption (HIGH)
Threat: Google explicitly bans AGPL [12]. Many Fortune 500 companies have blanket AGPL restrictions. AGPL-licensed projects have lower GitHub adoption than MIT/Apache alternatives.
Why this matters: Our ICP includes Series A-C startups (typically AGPL-tolerant), but enterprise expansion requires addressing this. Langfuse (MIT) and Arize Phoenix (ELv2) are more permissive.
Mitigation: Offer dual licensing — AGPL for community, commercial license for enterprises who can’t use AGPL. This is the proven model (MongoDB SSPL + commercial, Grafana AGPL + commercial). Alternatively, consider Apache 2.0 for SDKs (which touch customer code) and AGPL only for the server components.
Risk 3: Datadog / New Relic Add Native Agent Observability (HIGH)
Threat: Datadog already has LLM Observability with agent monitoring capabilities [26]. Their $120/day auto-activation on LLM spans shows aggressive pricing intent. With 26,800+ customers and $2.6B+ ARR, they can subsidize AI observability as a loss leader bundled with APM.
Counter-argument: Enterprise teams already paying $100K+/yr to Datadog will just enable the LLM Observability add-on rather than adopt a new vendor. Agent-native startups, however, don’t want Datadog’s complexity or pricing model.
Mitigation: Don’t compete with Datadog on enterprise APM. Target the “AI-native startup” segment that finds Datadog too expensive and too complex. AgentScope’s self-hosted model and transparent pricing are the antidote to Datadog’s surprise bills.
Risk 4: OTel GenAI Conventions Break Backward Compatibility (MEDIUM)
Threat: The GenAI semantic conventions are still experimental [3]. Breaking changes could require SDK rewrites. The OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental flag exists precisely because the spec isn’t stable yet.
Mitigation: Pin to a stable subset of attributes. Implement an adapter layer in the collector that normalizes old convention formats to new ones. Contribute upstream to influence the direction — becoming the reference implementation gives us a seat at the table.
Risk 5: The “Yet Another Tool” Problem (MEDIUM)
Threat: Organizations use an average of 8 observability tools; 84% are actively consolidating [8]. Developers have tool fatigue. Adding a new agent-specific observability tool goes against the consolidation trend.
Counter-argument: Agent observability is a new category — it’s not replacing an existing tool, it’s filling a gap. 89% of orgs have already implemented some form of agent observability [11], which means budget exists. The question is whether they’ll use a purpose-built tool or stretch an existing one.
Mitigation: Position AgentScope as a complement to existing APM (Datadog, New Relic, Grafana), not a replacement. OTel-native design means data can flow to both AgentScope and existing backends simultaneously. Offer an OTLP-out mode where AgentScope enriches spans and forwards to existing collectors.
12. MVP Scope & Milestones
Phase 1: Core Tracing (4 weeks)
- ClickHouse schema + init scripts
- Custom OTel Collector with cost enrichment processor
- TypeScript SDK with manual instrumentation API
- OpenAI + Anthropic auto-instrumentation
- Basic trace explorer UI (waterfall view)
- Docker Compose deployment
- README + quickstart guide
Phase 2: Agent Intelligence (3 weeks)
- Agent identity propagation and session grouping
- Agent delegation graph visualization (React Flow DAG)
- Cost analytics dashboard with time-series charts
- MCP client auto-instrumentation
- Python SDK (core + OpenAI + Anthropic)
- Tail-based sampling in collector
Phase 3: Ecosystem (3 weeks)
- LangChain.js + LangChain Python integration
- CrewAI integration
- Vercel AI SDK integration
- Eval framework (submit + dashboard)
- Budget alerts (webhook notifications)
- Public documentation site
Phase 4: Cloud + Growth (ongoing)
- Multi-tenant cloud offering
- GitHub App for PR-level cost reports
- Slack/Discord bot for budget alerts
- Paperclip/OctantOS native integration
- Community plugin system
13. Technology Decisions Summary
| Decision | Choice | Rationale |
|---|---|---|
| Wire protocol | OTLP (gRPC + HTTP) | Industry standard (95% adoption [9]), zero vendor lock-in |
| Storage | ClickHouse | Best price/performance for observability analytics; Langfuse acquisition validated this; SigNoz proved the OTel+ClickHouse stack [18] |
| SDK language priority | TypeScript first | Moklabs stack is TS-native; JS agent ecosystem growing fastest |
| UI framework | React + Vite | Moklabs standard; richest component ecosystem |
| Collector | Custom OTel Collector distribution | Need AgentScope-specific processors (cost enrichment, schema validation) |
| License | AGPL-3.0 (server) + Apache 2.0 (SDKs) | AGPL protects open-source while enabling cloud monetization; Apache SDKs avoid enterprise friction [12] |
| API framework | Hono on Node.js | Lightweight, fast, edge-compatible for future cloud deployment |
| Graph visualization | React Flow | Better for interactive agent graphs; React-native |
Sources
1. Multi-Agent System Platform Market — Mordor Intelligence
2. ClickHouse Acquires Langfuse — ClickHouse Blog (Jan 2026)
3. OTel Semantic Conventions for GenAI Agent Spans
4. ClickHouse Observability Cost Optimization Playbook
5. AI Agents Market Size and Share Report 2033 — Grand View Research
6. Multiagent Systems in Enterprise AI — Gartner
7. 2026 Observability & AI Outlook — LogicMonitor
8. Observability Survey 2025 — Grafana Labs
9. OpenTelemetry 95% Adoption — byteiota
10. From Chaos to Clarity: OTel Across Clouds — CNCF (Nov 2025)
11. Top 5 AI Agent Observability Platforms in 2026 — getmaxim.ai
12. AGPL Policy — Google Open Source
13. Braintrust Raises $80M at $800M Valuation — Axios (Feb 2026)
14. Langfuse Joins ClickHouse — Langfuse Blog
15. Helicone — Open Source LLM Observability
16. Braintrust — AI Observability Platform
17. ClickHouse Cloud Pricing
18. SigNoz — Open Source Observability with ClickHouse
19. How Langfuse Scales LLM Observability with ClickHouse Cloud
20. TimescaleDB vs ClickHouse vs MongoDB Benchmark — DEV Community
21. LangSmith Observability Platform
22. Langfuse AI Agent Observability
23. Langfuse Pricing
24. LangSmith Plans and Pricing
25. Arize Phoenix — GitHub
26. Datadog LLM Observability
27. Observability Market Size 2031 — Mordor Intelligence
28. Can OpenTelemetry Save Observability in 2026? — The New Stack
29. AGPL License is a Non-Starter for Most Companies — Open Core Ventures
30. OpenTelemetry GenAI Semantic Conventions
31. AI Observability Tools Buyer’s Guide 2026 — Braintrust
32. Best Open Source Observability Solutions 2026 — ClickHouse