AgentScope MVP Technical Blueprint — OTel-Native Observability Stack, SDK Design, and Deployment Topology
Date: 2026-03-20
Context: AgentScope lacks a technical architecture blueprint. This report synthesizes market intelligence (MOKA-301), competitive gaps, and emerging standards into a concrete MVP architecture with quantified market data, competitive pricing analysis, and risk-weighted go/no-go recommendation.
Executive Summary
- Go/no-go: Conditional GO — The agent observability market is real ($7.81B multi-agent platform market in 2025, 47.7% CAGR to $54.9B by 2030 [1]), but the window is narrowing fast. Langfuse’s acquisition by ClickHouse (Jan 2026, $15B valuation [2]) validates the OTel+ClickHouse thesis but also creates a formidable incumbent. AgentScope must ship within 10 weeks and differentiate sharply on multi-agent cost attribution — a gap no competitor has closed.
- AgentScope should be an OTel-native, ClickHouse-backed observability platform purpose-built for multi-agent systems — not a generic LLM tracing tool
- The MVP must differentiate on three axes: (1) multi-agent cost attribution, (2) orchestration-aware tracing (agent handoffs, tool chains, delegation graphs), and (3) self-hosted-first deployment
- SDK strategy: TypeScript-first (aligned with Moklabs stack), Python SDK as fast-follow. Both emit standard OTel spans with AgentScope semantic extensions aligned to the emerging OTel GenAI Agent Semantic Conventions [3]
- Storage: ClickHouse for traces/metrics/logs — 10-20x compression ratio, sub-second queries over billions of spans, columnar design ideal for high-cardinality agent metadata [4]
- The architecture separates into four layers: SDK -> Collector -> Storage -> UI, each independently deployable and replaceable
- ICP: Teams running 3+ agents in production, spending $5K-50K/mo on LLM APIs, needing cost visibility per agent/task/user
- Pricing target: Free self-hosted (AGPL), cloud at $99-499/mo — undercutting LangSmith Plus ($39/seat/mo, so cost scales with team size) and Langfuse Pro ($199/mo) on total cost while delivering superior multi-agent features
1. Should Moklabs Build This? (Go/No-Go)
The Case FOR Building
| Signal | Data Point | Source |
|---|---|---|
| Market size | AI agents market: $7.63B (2025) -> $10.91B (2026) -> $182.97B (2033), CAGR 49.6% | Grand View Research [5] |
| Multi-agent growth | Gartner saw 1,445% surge in multi-agent system inquiries Q1 2024 to Q2 2025 | Gartner [6] |
| Observability spending | 96% of IT leaders expect observability spending to hold steady or grow; 62% plan increases | LogicMonitor 2026 [7] |
| Platform switching | 67% of orgs likely to switch observability platforms within 1-2 years | Grafana Survey 2025 [8] |
| OTel momentum | 95% adoption for new cloud-native instrumentation in 2026; 10B+ daily spans processed | CNCF, byteiota [9][10] |
| Observability ROI | Organizations report 2.6x average ROI from observability spending | Grafana Survey 2025 [8] |
| Agent observability gap | 89% of organizations have implemented observability for agents, but quality issues remain the #1 production barrier (32%) | getmaxim.ai [11] |
The Case AGAINST Building
| Risk | Data Point | Source |
|---|---|---|
| Langfuse+ClickHouse dominance | 20K GitHub stars, 26M+ SDK installs/mo, 2,000+ paying customers, 19 of Fortune 50 | ClickHouse blog [2] |
| Crowded market | 100+ observability tools in use across surveyed orgs; avg org uses 8 tools | Grafana Survey 2025 [8] |
| AGPL adoption friction | Google bans AGPL entirely; many enterprises have blanket AGPL restrictions | Google OSS Policy [12] |
| Braintrust momentum | $80M Series B at $800M valuation (Feb 2026) for AI observability | Axios [13] |
| Tool fatigue | 84% of orgs are consolidating observability tools, not adding new ones | Grafana Survey 2025 [8] |
Verdict: Conditional GO
The market is massive and growing at a 49.6% CAGR, but the competitive moat must come from multi-agent-native features that generic LLM observability tools don’t provide. Langfuse’s 2026 roadmap explicitly targets multi-agent support [14], giving AgentScope roughly 6-9 months of differentiation window. The condition: ship the MVP in 10 weeks or don’t build at all.
2. What Specifically Would We Build? (Concrete MVP)
Architecture Overview
+----------------------------------+
| AgentScope UI |
| (React + TanStack + Tremor) |
| Trace explorer, cost dashboard, |
| agent graph, eval results |
+---------------+------------------+
| SQL / HTTP API
+---------------v------------------+
| ClickHouse Cluster |
| traces / metrics / logs / evals |
| Materialized views for KPIs |
+---------------^------------------+
| OTLP/gRPC + HTTP
+---------------+------------------+
| OTel Collector (custom) |
| Schema enforcement, sampling, |
| cost enrichment, rate limiting |
+---------------^------------------+
| OTLP export
+--------------------------+-------------------------+
| | |
+---------+--------+ +------------+---------+ +-------------+---------+
| TypeScript SDK | | Python SDK | | Generic OTel SDK |
| @agentscope/ts | | agentscope-py | | (any OTel client) |
| | | | | |
| Auto-instrument | | Decorators for | | Manual spans with |
| MCP, A2A, tool | | LangChain, CrewAI, | | GenAI semantic |
| calls, handoffs | | OpenAI SDK | | conventions |
+------------------+ +----------------------+ +-----------------------+
Design Principles
- OTel-native, not OTel-compatible — spans follow GenAI Semantic Conventions including the emerging agent span conventions (experimental, but actively developed [3]). No proprietary wire format.
- Zero-config for common frameworks — auto-instrumentation for MCP servers, OpenAI SDK, Anthropic SDK, LangChain, CrewAI
- Multi-agent first — trace model captures agent identity, delegation chains, and inter-agent communication as first-class concepts (the feature gap no competitor has fully closed)
- Self-hosted default — a single docker compose up deploys the full stack. Cloud offering later.
- Cost as a dimension — every span carries token counts and estimated cost. Materialized views aggregate by agent, task, user, and time window.
3. Trace Model — Agent-Aware Semantic Conventions
AgentScope extends OTel GenAI Semantic Conventions with agent-specific attributes, aligned with the upstream GenAI Agent Spans specification where gen_ai.operation.name uses invoke_agent and create_agent [3]:
Span Types
| Span Kind | Name Pattern | Key Attributes |
|---|---|---|
| Agent Run | invoke_agent.{agent_name} | gen_ai.agent.id, gen_ai.agent.name, agent.framework, gen_ai.request.model, agent.role |
| LLM Call | gen_ai.chat | gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens, gen_ai.usage.cost_usd |
| Tool Call | tool.invoke.{tool_name} | tool.name, tool.source (mcp/native/a2a), tool.mcp_server, tool.success |
| Agent Handoff | agent.handoff | agent.handoff.from, agent.handoff.to, agent.handoff.reason |
| Retrieval | retrieval.query | retrieval.source, retrieval.docs_count, retrieval.latency_ms |
| Guardrail | guardrail.check | guardrail.name, guardrail.result (pass/fail/warn), guardrail.latency_ms |
| Eval | eval.run | eval.name, eval.score, eval.criteria, eval.judge_model |
Agent Identity Propagation
// Every span in a multi-agent trace carries:
interface AgentContext {
'gen_ai.agent.id': string; // unique agent instance (aligned with OTel spec)
'gen_ai.agent.name': string; // human-readable name
'agent.session_id': string; // conversation/task session
'agent.parent_agent_id'?: string; // delegation chain
'agent.framework': string; // 'langchain' | 'crewai' | 'openai' | 'custom'
'gen_ai.request.model': string; // 'claude-sonnet-4-6' | 'gpt-4o' etc.
'agent.cost.cumulative_usd': number; // running cost for this agent in session
}
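Delegation-chain propagation follows directly from these fields: when one agent hands work to another, the child inherits the session and records its parent. A minimal TypeScript sketch, assuming simplified field names and a hypothetical childContext helper (not the shipped SDK API):

```typescript
// Illustrative: derive a child agent's context during a handoff, preserving
// the session and extending the delegation chain. Field names are simplified
// camelCase versions of the AgentContext attributes above; the helper itself
// is hypothetical.
interface AgentCtx {
  agentId: string;        // gen_ai.agent.id
  agentName: string;      // gen_ai.agent.name
  sessionId: string;      // agent.session_id
  parentAgentId?: string; // agent.parent_agent_id
}

function childContext(parent: AgentCtx, childId: string, childName: string): AgentCtx {
  return {
    agentId: childId,
    agentName: childName,
    sessionId: parent.sessionId,   // same conversation/task session
    parentAgentId: parent.agentId, // records who delegated the work
  };
}
```

Because every span carries these attributes, the UI can reconstruct the full delegation chain for a session by walking parentAgentId links.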
Cost Attribution Model
Session Cost
|-- Agent A Cost ($0.42)
| |-- LLM calls: $0.38 (3 calls, claude-sonnet-4-6)
| |-- Tool calls: $0.04 (2 MCP calls with upstream API costs)
| +-- Handoff to Agent B
|-- Agent B Cost ($0.15)
| |-- LLM calls: $0.12 (1 call, gpt-4o-mini)
| +-- Retrieval: $0.03 (vector search)
+-- Total: $0.57
Cost is computed at span level using a configurable price table:
const priceTable: PriceTable = {
'claude-sonnet-4-6': { input: 3.0, output: 15.0, cached: 0.30 }, // per 1M tokens
'gpt-4o': { input: 2.5, output: 10.0, cached: 1.25 },
'gpt-4o-mini': { input: 0.15, output: 0.60, cached: 0.075 },
};
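Putting the price table and the cost tree together, per-span pricing and the per-agent rollup can be sketched in a few lines. This is a minimal illustration, assuming the PriceTable shape above; spanCostUsd, costByAgent, and the SpanUsage input type are illustrative names, not the shipped API:

```typescript
// Illustrative sketch of span-level cost computation and per-agent rollup.
// PriceTable matches the shape shown above; SpanUsage is a hypothetical
// projection of an LLM-call span.
type ModelPrice = { input: number; output: number; cached: number }; // USD per 1M tokens
type PriceTable = Record<string, ModelPrice>;

interface SpanUsage {
  agentName: string;
  model: string;
  inputTokens: number;
  outputTokens: number;
}

function spanCostUsd(span: SpanUsage, prices: PriceTable): number {
  const p = prices[span.model];
  if (!p) return 0; // unknown model: no price entry, no cost attributed
  return (span.inputTokens * p.input + span.outputTokens * p.output) / 1_000_000;
}

// Roll spans up to per-agent totals, mirroring the session cost tree above.
function costByAgent(spans: SpanUsage[], prices: PriceTable): Map<string, number> {
  const totals = new Map<string, number>();
  for (const s of spans) {
    totals.set(s.agentName, (totals.get(s.agentName) ?? 0) + spanCostUsd(s, prices));
  }
  return totals;
}
```

In production this aggregation runs in ClickHouse materialized views rather than application code; the sketch only shows the attribution logic.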
4. SDK Design
TypeScript SDK (Primary)
import { AgentScope } from '@agentscope/ts';
// Initialize -- auto-detects framework
const scope = new AgentScope({
endpoint: 'http://localhost:4318', // OTel collector
projectId: 'my-project',
environment: 'production',
pricing: 'default', // or custom PriceTable
});
// Option 1: Wrap an existing agent framework
const wrappedAgent = scope.wrap(myLangChainAgent, {
name: 'research-agent',
model: 'claude-sonnet-4-6',
});
// Option 2: Manual instrumentation
const span = scope.startAgentRun('planner-agent', {
model: 'gpt-4o',
sessionId: 'session-123',
});
const toolSpan = span.startToolCall('web-search', { source: 'mcp' });
// ... execute tool ...
toolSpan.end({ success: true, resultSize: 3 });
span.end();
// Option 3: Auto-instrumentation (zero-code)
scope.autoInstrument({
openai: true, // patches OpenAI SDK
anthropic: true, // patches Anthropic SDK
mcp: true, // patches MCP client calls
fetch: true, // patches global fetch for API cost tracking
});
Auto-Instrumentation Strategy
| Target | Method | What’s Captured |
|---|---|---|
| OpenAI SDK | Monkey-patch chat.completions.create | Model, tokens, latency, messages, cost |
| Anthropic SDK | Monkey-patch messages.create | Model, tokens, latency, tool_use blocks, cost |
| MCP Client | Wrap callTool() and listTools() | Server name, tool name, args, result, latency |
| A2A Client | Wrap sendTask() | Target agent, task description, result |
| fetch/axios | Optional patch | External API calls with URL, status, latency |
SDK risk note: Monkey-patching is fragile. Both Helicone (proxy-based [15]) and Braintrust (wrapper-based [16]) have explored alternative integration patterns. AgentScope should support both monkey-patch (zero-config DX) and explicit wrapper (production reliability) modes.
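The explicit wrapper mode can be sketched as follows. This is a simplified illustration, assuming a hypothetical withSpan helper and SpanRecorder callback; the published @agentscope/ts surface may differ:

```typescript
// Hypothetical explicit-wrapper mode: no monkey-patching, the caller opts in
// per call site. Names (withSpan, SpanRecorder) are illustrative only.
interface SpanRecord {
  name: string;
  startMs: number;
  durationMs: number;
  error?: string;
}

type SpanRecorder = (span: SpanRecord) => void;

// Wrap any async function; the span is recorded whether it succeeds or throws.
function withSpan<T extends unknown[], R>(
  name: string,
  record: SpanRecorder,
  fn: (...args: T) => Promise<R>,
): (...args: T) => Promise<R> {
  return async (...args: T): Promise<R> => {
    const startMs = Date.now();
    let error: string | undefined;
    try {
      return await fn(...args);
    } catch (err) {
      error = String(err);
      throw err; // re-throw so caller behavior is unchanged
    } finally {
      record({ name, startMs, durationMs: Date.now() - startMs, error });
    }
  };
}
```

The appeal of this mode is that nothing global is patched: teams that distrust monkey-patching in production wrap only the call sites they care about.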
Python SDK
from agentscope import AgentScope, trace_agent
scope = AgentScope(
endpoint="http://localhost:4318",
project_id="my-project",
)
# Decorator-based instrumentation
@trace_agent(name="research-agent", model="claude-sonnet-4-6")
async def research(query: str):
# All LLM calls inside are auto-traced
result = await client.messages.create(...)
return result
# Framework integrations
from agentscope.integrations import langchain_callback, crewai_callback
# LangChain: add as callback handler
chain.invoke(input, config={"callbacks": [langchain_callback(scope)]})
# CrewAI: add as crew callback
crew = Crew(agents=[...], callbacks=[crewai_callback(scope)])
SDK Package Structure
@agentscope/ts # Core SDK
|-- @agentscope/openai # OpenAI auto-instrumentation
|-- @agentscope/anthropic # Anthropic auto-instrumentation
|-- @agentscope/mcp # MCP client instrumentation
|-- @agentscope/langchain # LangChain.js integration
+-- @agentscope/vercel-ai # Vercel AI SDK integration
agentscope-py # Python core
|-- agentscope[openai] # OpenAI instrumentation
|-- agentscope[anthropic] # Anthropic instrumentation
|-- agentscope[langchain] # LangChain callback
+-- agentscope[crewai] # CrewAI callback
5. Storage Layer — ClickHouse Schema
Why ClickHouse (Quantified)
| Criteria | ClickHouse | PostgreSQL | Elasticsearch |
|---|---|---|---|
| Write throughput | 1M+ rows/sec | ~50K rows/sec | ~100K docs/sec |
| Query latency (1B rows) | Sub-second | Minutes | Seconds |
| Compression ratio | 10-20x (up to 50x reported [4]) | 3-5x | 5-8x |
| Storage cost/TB (cloud) | ~$25-35/TB compressed [17] | ~$100/TB | ~$200/TB |
| Storage cost/TB (self-hosted) | ~$5/mo (NVMe) | ~$20/mo | ~$50/mo |
| OTel native support | Yes (ClickStack) | No | Yes (via APM) |
| Columnar analytics | Native | No | Partial |
| Self-hosted complexity | Medium | Low | High |
Validation: SigNoz, a ClickHouse+OTel observability platform, has proven this stack in production as a Datadog alternative [18]. Langfuse migrated to ClickHouse pre-acquisition and reported significant query performance improvements for trace analytics [19].
Counter-point: ClickHouse’s operational complexity is non-trivial for self-hosted users. TimescaleDB outperforms ClickHouse 56x on small batch writes (14,200 vs 250 ops/s at batch size 100 [20]) — relevant for low-volume early adopters. Mitigation: provide ClickHouse Cloud as managed option; optimize batch sizes in collector.
Core Tables
-- Traces table (main fact table)
CREATE TABLE agentscope.traces (
trace_id FixedString(32),
span_id FixedString(16),
parent_span_id FixedString(16),
span_name LowCardinality(String),
span_kind Enum8('agent_run'=1, 'llm_call'=2, 'tool_call'=3,
'handoff'=4, 'retrieval'=5, 'guardrail'=6, 'eval'=7),
start_time DateTime64(9), -- nanosecond precision
end_time DateTime64(9),
duration_ns UInt64,
status_code Enum8('ok'=0, 'error'=1),
-- Project / environment
project_id LowCardinality(String),
environment LowCardinality(String),
-- Agent identity
agent_id String,
agent_name LowCardinality(String),
agent_framework LowCardinality(String),
agent_model LowCardinality(String),
agent_session_id String,
agent_parent_id String DEFAULT '',
-- LLM specifics
input_tokens UInt32 DEFAULT 0,
output_tokens UInt32 DEFAULT 0,
cached_tokens UInt32 DEFAULT 0,
cost_usd Float64 DEFAULT 0,
-- Tool specifics
tool_name LowCardinality(String) DEFAULT '',
tool_source LowCardinality(String) DEFAULT '',
tool_success Nullable(UInt8),
-- Flexible attributes (for everything else)
attributes Map(String, String),
-- Partitioning
_date Date DEFAULT toDate(start_time)
) ENGINE = MergeTree()
PARTITION BY toYYYYMM(_date)
ORDER BY (project_id, _date, trace_id, span_id)
TTL _date + INTERVAL 90 DAY;
-- Materialized view: cost by agent per hour
CREATE MATERIALIZED VIEW agentscope.cost_by_agent_hourly
ENGINE = SummingMergeTree()
ORDER BY (project_id, agent_name, agent_model, hour)
AS SELECT
project_id,
agent_name,
agent_model,
toStartOfHour(start_time) AS hour,
sum(cost_usd) AS total_cost,
sum(input_tokens) AS total_input_tokens,
sum(output_tokens) AS total_output_tokens,
count() AS span_count,
countIf(status_code = 'error') AS error_count
FROM agentscope.traces
WHERE span_kind = 'llm_call'
GROUP BY project_id, agent_name, agent_model, hour;
-- Materialized view: session-level aggregates
CREATE MATERIALIZED VIEW agentscope.session_summary
ENGINE = AggregatingMergeTree()
ORDER BY (project_id, agent_session_id)
AS SELECT
project_id,
agent_session_id,
min(start_time) AS session_start,
max(end_time) AS session_end,
sum(cost_usd) AS total_cost,
sum(input_tokens + output_tokens) AS total_tokens,
countIf(span_kind = 'llm_call') AS llm_calls,
countIf(span_kind = 'tool_call') AS tool_calls,
countIf(span_kind = 'handoff') AS handoffs,
countIf(status_code = 'error') AS errors,
uniqExact(agent_name) AS unique_agents
FROM agentscope.traces
GROUP BY project_id, agent_session_id;
Data Retention Strategy
| Tier | Duration | Resolution | Storage |
|---|---|---|---|
| Hot | 7 days | Full spans with all attributes | NVMe SSD |
| Warm | 30 days | Full spans, compressed attributes | SSD |
| Cold | 90 days | Aggregated materialized views only | S3/R2 (via ClickHouse tiered storage) |
| Archive | 1 year | Session summaries + cost rollups | S3/R2 (Parquet export) |
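The tiering rules above reduce to a simple age lookup. A purely illustrative sketch (tier names mirror the table; the actual transitions are enforced by ClickHouse TTL clauses, not application code):

```typescript
// Map span age (in days) to the retention tier from the table above.
// In production these boundaries are ClickHouse TTL rules; this function
// only documents the policy.
type Tier = 'hot' | 'warm' | 'cold' | 'archive' | 'expired';

function retentionTier(ageDays: number): Tier {
  if (ageDays <= 7) return 'hot';       // full spans, all attributes, NVMe SSD
  if (ageDays <= 30) return 'warm';     // full spans, compressed attributes
  if (ageDays <= 90) return 'cold';     // aggregated materialized views only
  if (ageDays <= 365) return 'archive'; // session summaries + cost rollups
  return 'expired';                     // past the 1-year archive window
}
```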
6. Collector Layer — OTel Collector Configuration
Custom OTel Collector with AgentScope-specific processors:
# otel-collector-config.yaml
receivers:
otlp:
protocols:
grpc:
endpoint: 0.0.0.0:4317
http:
endpoint: 0.0.0.0:4318
processors:
# Enrich spans with cost data based on model + tokens
agentscope_cost_enrichment:
price_table_url: "http://agentscope-api:8080/prices"
fallback_prices:
default_input_per_1m: 1.0
default_output_per_1m: 3.0
# Enforce schema -- reject malformed spans, add defaults
agentscope_schema_validator:
required_attributes:
- project_id
defaults:
environment: "development"
# Tail-based sampling -- keep all errors, sample success
tail_sampling:
decision_wait: 10s
policies:
- name: errors-always
type: status_code
status_code: { status_codes: [ERROR] }
- name: high-cost
type: numeric_attribute
numeric_attribute:
key: cost_usd
min_value: 0.10
- name: sample-success
type: probabilistic
probabilistic: { sampling_percentage: 25 }
batch:
send_batch_size: 10000
timeout: 5s
exporters:
clickhouse:
endpoint: tcp://clickhouse:9000
database: agentscope
traces_table_name: traces
ttl: 90d
create_schema: true
service:
pipelines:
traces:
receivers: [otlp]
processors: [agentscope_schema_validator, agentscope_cost_enrichment, tail_sampling, batch]
exporters: [clickhouse]
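The three sampling policies in the config compose into a single per-trace decision. A simplified TypeScript model of that decision (not the actual tail_sampling processor implementation, which evaluates policies over buffered spans after decision_wait expires):

```typescript
// Simplified model of the tail-sampling decision above: keep all errors,
// keep expensive traces, probabilistically sample the rest at 25%.
interface TraceSummary {
  hasError: boolean;  // any span in the trace ended with status ERROR
  maxCostUsd: number; // highest cost_usd attribute seen in the trace
}

function shouldKeep(trace: TraceSummary, random: () => number = Math.random): boolean {
  if (trace.hasError) return true;           // errors-always policy
  if (trace.maxCostUsd >= 0.10) return true; // high-cost policy ($0.10 threshold)
  return random() < 0.25;                    // sample-success: 25% of the rest
}
```

Because the decision is tail-based, a trace that starts cheap and healthy but ends in an expensive error is still kept in full.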
Cost Enrichment Pipeline
The collector intercepts every span with gen_ai.* attributes and:
- Reads gen_ai.request.model, gen_ai.usage.input_tokens, and gen_ai.usage.output_tokens
- Looks up per-token prices from the configurable price table
- Computes cost_usd = (input_tokens * input_price + output_tokens * output_price) / 1_000_000
- Attaches cost_usd as a span attribute before writing to ClickHouse
This means SDKs don’t need to know prices — cost is computed server-side and stays current. This mirrors how Helicone handles cost calculation for 300+ models via their open-source cost repository [15].
7. UI Layer — Dashboard Architecture
Tech Stack
| Component | Technology | Rationale |
|---|---|---|
| Framework | React 19 + Vite | Moklabs standard, fast dev |
| Routing | TanStack Router | Type-safe, file-based |
| Data fetching | TanStack Query | Caching, optimistic updates |
| Charts | Tremor + Recharts | Pre-built analytics components |
| Table | TanStack Table | Virtualized, sortable, filterable |
| Graph visualization | React Flow | Agent delegation graphs (interactive DAG) |
| Styling | Tailwind CSS + Radix | Consistent with Moklabs design system |
Core Views (MVP)
- Trace Explorer — Waterfall view of spans within a trace. Filter by agent, model, time range, cost range, status. Click to expand span details with full attributes, messages, tool inputs/outputs.
- Agent Dashboard — Per-agent cards showing: total cost (24h/7d/30d), request count, error rate, avg latency, top tools used. Drill-down to agent’s recent traces.
- Cost Analytics — Time-series chart of spend by agent, model, project. Table with cost breakdown. Budget alerts configuration. Comparison view (this week vs last week). This is the killer feature — no competitor offers per-agent, per-session cost attribution with budget alerting.
- Session Replay — Timeline of a complete multi-agent session. Shows agent handoffs as a directed graph. Click any node to see the agent’s trace waterfall. This is AgentScope’s unique differentiator — LangSmith only supports LangGraph, Langfuse has generic spans, Braintrust focuses on evals [16][21][22].
- Eval Dashboard (v1.1) — Evaluation scores over time. A/B comparison of prompt versions. Regression detection alerts.
API Layer
GET /api/traces?project={id}&from={ts}&to={ts}&agent={name}&min_cost={usd}
GET /api/traces/{traceId}/spans
GET /api/agents?project={id}
GET /api/agents/{name}/stats?window=24h
GET /api/costs/by-agent?project={id}&window=7d
GET /api/costs/by-model?project={id}&window=7d
GET /api/sessions/{sessionId}
GET /api/sessions/{sessionId}/graph -- returns agent delegation DAG
POST /api/evals -- submit eval results
GET /api/evals?project={id}&name={evalName}&window=30d
API server is a lightweight Node.js (Hono or Fastify) service that translates REST requests to ClickHouse SQL queries.
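The delegation-DAG endpoint is the least obvious of these: it reduces a session's handoff spans to nodes and deduplicated edges. A minimal sketch of that reduction, assuming an illustrative HandoffSpan projection of the traces table (agent.handoff.from / agent.handoff.to attributes):

```typescript
// Build the agent delegation graph for a session from its handoff spans.
// HandoffSpan is an illustrative projection of traces-table rows; agent
// names are assumed not to contain the '->' separator used as a map key.
interface HandoffSpan {
  from: string; // agent.handoff.from
  to: string;   // agent.handoff.to
}

interface AgentGraph {
  nodes: string[]; // unique agent names seen in the session
  edges: { from: string; to: string; count: number }[]; // deduped handoffs
}

function buildAgentGraph(handoffs: HandoffSpan[]): AgentGraph {
  const nodes = new Set<string>();
  const edgeCounts = new Map<string, number>();
  for (const h of handoffs) {
    nodes.add(h.from);
    nodes.add(h.to);
    const key = `${h.from}->${h.to}`;
    edgeCounts.set(key, (edgeCounts.get(key) ?? 0) + 1);
  }
  const edges = [...edgeCounts.entries()].map(([key, count]) => {
    const [from, to] = key.split('->');
    return { from, to, count };
  });
  return { nodes: [...nodes], edges };
}
```

The UI feeds this structure directly into React Flow, with edge counts rendered as handoff frequency.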
8. Deployment Topology
Self-Hosted (MVP Default)
# docker-compose.yml
services:
clickhouse:
image: clickhouse/clickhouse-server:24.12
volumes:
- clickhouse_data:/var/lib/clickhouse
- ./init-schema.sql:/docker-entrypoint-initdb.d/init.sql
ports:
- "8123:8123" # HTTP
- "9000:9000" # Native
collector:
image: ghcr.io/moklabs/agentscope-collector:latest
depends_on: [clickhouse]
ports:
- "4317:4317" # gRPC
- "4318:4318" # HTTP
environment:
- CLICKHOUSE_DSN=tcp://clickhouse:9000/agentscope
api:
image: ghcr.io/moklabs/agentscope-api:latest
depends_on: [clickhouse]
ports:
- "8080:8080"
environment:
- CLICKHOUSE_DSN=http://clickhouse:8123/agentscope
ui:
image: ghcr.io/moklabs/agentscope-ui:latest
depends_on: [api]
ports:
- "3000:3000"
environment:
- API_URL=http://api:8080
Total resource requirements (MVP):
- ClickHouse: 2 CPU, 4GB RAM, 50GB SSD (handles ~10M spans/day)
- Collector + API + UI: 1 CPU, 1GB RAM combined
- Minimum: single VPS with 4GB RAM can run the full stack
Cloud Offering (Post-MVP)
- Managed ClickHouse Cloud backend (~$25-35/TB compressed storage [17])
- Multi-tenant with project-level isolation
- Collector runs as edge workers (Cloudflare) for low-latency ingest
- UI served from CDN
9. Who Buys It and For How Much? (ICP + Willingness to Pay)
Ideal Customer Profile
| Attribute | Description |
|---|---|
| Company stage | Series A-C startups and mid-market teams building AI-powered products |
| Team size | 3-20 engineers working on AI/agent features |
| Agent maturity | Running 3+ agents in production, at least one multi-agent workflow |
| LLM spend | $5K-50K/mo on API costs (pain point: “where does the money go?”) |
| Current tooling | Using basic logging or outgrowing Langfuse free tier / LangSmith developer plan |
| Tech stack | TypeScript or Python, using OpenAI/Anthropic APIs or LangChain/CrewAI |
Competitive Pricing Landscape
| Platform | Free Tier | Entry Paid | Mid-Tier | Enterprise | Self-Hosted |
|---|---|---|---|---|---|
| Langfuse | 50K units/mo | $29/mo (Core) | $199/mo (Pro) | $2,499/mo | Free (MIT), $500/mo enterprise features [23] |
| LangSmith | 5K traces/mo | $39/seat/mo (Plus) | ~$195/mo (5 seats) | Custom | BYOC option [24] |
| Braintrust | Free tier | Usage-based, no seat limits | Usage-based | Custom | No [16] |
| Arize Phoenix | Unlimited (OSS) | $50/mo (managed cloud) | $500/mo | $50K-100K/yr (AX) | Free (ELv2) [25] |
| Helicone | 10K requests/mo | Usage-based | Usage-based | Custom | Yes (OSS) [15] |
| Datadog LLM Obs | None | ~$120/day auto-activation on LLM spans | Per-span billing | Bundled with APM | No [26] |
| AgentScope | Unlimited (self-hosted) | $99/mo (cloud) | $299/mo | $499/mo | Free (AGPL) |
AgentScope Pricing Rationale
- Free self-hosted: Drives adoption, builds community, creates migration funnel to cloud
- $99/mo cloud entry: Undercuts Langfuse Pro ($199/mo) while offering superior multi-agent features
- No per-seat pricing: Follows Braintrust model — team-friendly, removes adoption friction
- Usage-based overage: $5 per 100K additional spans (vs Langfuse $8 per 100K units [23])
Market Sizing (Bottom-Up)
- ~50,000 companies actively building with AI agents (estimate based on 26M+ Langfuse SDK installs/mo [2])
- 5% addressable in year 1 with multi-agent use cases = 2,500 potential customers
- 2% conversion to paid cloud at avg $200/mo = 50 customers x $200 x 12 = $120K ARR Year 1
- Growth target: 500 cloud customers by Year 2 = $1.2M ARR
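The bottom-up arithmetic above can be checked mechanically (the inputs are the report's own estimates, not independent data):

```typescript
// Reproduce the bottom-up sizing arithmetic from the bullets above.
const companies = 50_000;                              // companies building with agents (estimate)
const addressable = Math.round(companies * 0.05);      // 2,500 with multi-agent use cases
const paidCustomers = Math.round(addressable * 0.02);  // 50 paid cloud conversions
const year1Arr = paidCustomers * 200 * 12;             // $120,000 ARR at $200/mo average
const year2Arr = 500 * 200 * 12;                       // $1,200,000 ARR at 500 customers
```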
10. Competitive Positioning — What’s the Unfair Advantage?
Feature Comparison Matrix
| Feature | Langfuse | LangSmith | Braintrust | Arize Phoenix | Helicone | AgentScope |
|---|---|---|---|---|---|---|
| Multi-agent traces | Generic spans | LangGraph only | Generic spans | Generic spans | Sessions | Native agent identity, handoffs, delegation graphs |
| Cost attribution | Per-trace | Per-trace | Per-trace | Basic | Per-request | Per-agent, per-session, per-user with budget alerts |
| Agent delegation graph | No | No | No | No | No | Visual DAG of agent-to-agent handoffs |
| OTel native | Yes (post-acq) | No | No | Yes | No (proxy) | Yes (born OTel, aligned with GenAI agent conventions) |
| Self-hosted | Yes (MIT) | BYOC only | No | Yes (ELv2) | Yes (OSS) | Yes (AGPL) |
| Framework agnostic | Yes | Mostly LangChain | Yes | Yes | Yes | Yes |
| ClickHouse backend | Yes (post-acq) | No | Brainstore (custom) | No | No | Yes (from day one) |
| Eval integration | Yes | Yes | Best-in-class | Yes | Basic | Planned (v1.1) |
Why Moklabs, Why Now
- Multi-agent is unsolved: Every competitor traces individual LLM calls. None visualize multi-agent delegation as a first-class DAG with per-agent cost attribution. This is AgentScope’s category.
- OTel GenAI conventions are crystallizing: The agent span conventions moved from proposal to experimental in 2025 [3]. Building on this standard now means AgentScope becomes the reference implementation.
- ClickHouse acquisition created noise: Langfuse’s acquisition introduces uncertainty for self-hosted users. Some will look for alternatives — AgentScope should be there.
- Moklabs stack advantage: OctantOS and Paperclip are multi-agent orchestrators. AgentScope is the natural observability layer for our own stack, giving us dogfooding advantage and a built-in distribution channel.
Unique Value Props for Marketing
- “See your agents think” — Session replay with agent delegation graphs. No other tool visualizes multi-agent collaboration.
- “Know what your agents cost” — Real-time cost attribution per agent, per task, per user. Budget alerts before you get a surprise bill.
- “OTel-native, vendor-free” — Standard OpenTelemetry. Switch backends without changing code. No proprietary lock-in.
- “One command to deploy” — docker compose up and start tracing in 5 minutes. No cloud account needed.
11. What Kills This Idea? (Top 5 Risks + Counter-Arguments)
Risk 1: Langfuse + ClickHouse Closes the Multi-Agent Gap (CRITICAL)
Threat: Langfuse’s 2026 roadmap explicitly targets “production monitoring and analytics for real agent systems” [14]. With ClickHouse’s $400M in fresh capital and $15B valuation, they can hire 10x our team and ship multi-agent features within 6 months.
Why AgentScope might still lose: Langfuse has 20K GitHub stars, 26M+ SDK installs/mo, 2,000+ paying customers, and 19 of the Fortune 50 [2]. Network effects in developer tooling are brutal — teams default to what their peers use.
Mitigation: Ship before they do. Focus on the multi-agent DAG visualization and per-agent cost attribution — features that require deep architectural decisions Langfuse can’t easily bolt on. Also: AGPL protects against ClickHouse bundling AgentScope’s innovations into their proprietary cloud.
Risk 2: AGPL License Limits Enterprise Adoption (HIGH)
Threat: Google explicitly bans AGPL [12]. Many Fortune 500 companies have blanket AGPL restrictions. AGPL-licensed projects have lower GitHub adoption than MIT/Apache alternatives.
Why this matters: Our ICP includes Series A-C startups (typically AGPL-tolerant), but enterprise expansion requires addressing this. Langfuse (MIT) and Arize Phoenix (ELv2) are more permissive.
Mitigation: Offer dual licensing — AGPL for community, commercial license for enterprises who can’t use AGPL. This is the proven model (MongoDB SSPL + commercial, Grafana AGPL + commercial). Alternatively, consider Apache 2.0 for SDKs (which touch customer code) and AGPL only for the server components.
Risk 3: Datadog / New Relic Add Native Agent Observability (HIGH)
Threat: Datadog already has LLM Observability with agent monitoring capabilities [26]. Their $120/day auto-activation on LLM spans shows aggressive pricing intent. With 26,800+ customers and $2.6B+ ARR, they can subsidize AI observability as a loss leader bundled with APM.
Counter-argument: Enterprise teams already paying $100K+/yr to Datadog will just enable the LLM Observability add-on rather than adopt a new vendor. Agent-native startups, however, don’t want Datadog’s complexity or pricing model.
Mitigation: Don’t compete with Datadog on enterprise APM. Target the “AI-native startup” segment that finds Datadog too expensive and too complex. AgentScope’s self-hosted model and transparent pricing are the antidote to Datadog’s surprise bills.
Risk 4: OTel GenAI Conventions Break Backward Compatibility (MEDIUM)
Threat: The GenAI semantic conventions are still experimental [3]. Breaking changes could require SDK rewrites. The OTEL_SEMCONV_STABILITY_OPT_IN=gen_ai_latest_experimental flag exists precisely because the spec isn’t stable yet.
Mitigation: Pin to a stable subset of attributes. Implement an adapter layer in the collector that normalizes old convention formats to new ones. Contribute upstream to influence the direction — becoming the reference implementation gives us a seat at the table.
Risk 5: The “Yet Another Tool” Problem (MEDIUM)
Threat: Organizations use an average of 8 observability tools; 84% are actively consolidating [8]. Developers have tool fatigue. Adding a new agent-specific observability tool goes against the consolidation trend.
Counter-argument: Agent observability is a new category — it’s not replacing an existing tool, it’s filling a gap. 89% of orgs have already implemented some form of agent observability [11], which means budget exists. The question is whether they’ll use a purpose-built tool or stretch an existing one.
Mitigation: Position AgentScope as a complement to existing APM (Datadog, New Relic, Grafana), not a replacement. OTel-native design means data can flow to both AgentScope and existing backends simultaneously. Offer an OTLP-out mode where AgentScope enriches spans and forwards to existing collectors.
12. MVP Scope & Milestones
Phase 1: Core Tracing (4 weeks)
- ClickHouse schema + init scripts
- Custom OTel Collector with cost enrichment processor
- TypeScript SDK with manual instrumentation API
- OpenAI + Anthropic auto-instrumentation
- Basic trace explorer UI (waterfall view)
- Docker Compose deployment
- README + quickstart guide
Phase 2: Agent Intelligence (3 weeks)
- Agent identity propagation and session grouping
- Agent delegation graph visualization (React Flow DAG)
- Cost analytics dashboard with time-series charts
- MCP client auto-instrumentation
- Python SDK (core + OpenAI + Anthropic)
- Tail-based sampling in collector
Phase 3: Ecosystem (3 weeks)
- LangChain.js + LangChain Python integration
- CrewAI integration
- Vercel AI SDK integration
- Eval framework (submit + dashboard)
- Budget alerts (webhook notifications)
- Public documentation site
Phase 4: Cloud + Growth (ongoing)
- Multi-tenant cloud offering
- GitHub App for PR-level cost reports
- Slack/Discord bot for budget alerts
- Paperclip/OctantOS native integration
- Community plugin system
13. Technology Decisions Summary
| Decision | Choice | Rationale |
|---|---|---|
| Wire protocol | OTLP (gRPC + HTTP) | Industry standard (95% adoption [9]), zero vendor lock-in |
| Storage | ClickHouse | Best price/performance for observability analytics; Langfuse acquisition validated this; SigNoz proved the OTel+ClickHouse stack [18] |
| SDK language priority | TypeScript first | Moklabs stack is TS-native; JS agent ecosystem growing fastest |
| UI framework | React + Vite | Moklabs standard; richest component ecosystem |
| Collector | Custom OTel Collector distribution | Need AgentScope-specific processors (cost enrichment, schema validation) |
| License | AGPL-3.0 (server) + Apache 2.0 (SDKs) | AGPL protects open-source while enabling cloud monetization; Apache SDKs avoid enterprise friction [12] |
| API framework | Hono on Node.js | Lightweight, fast, edge-compatible for future cloud deployment |
| Graph visualization | React Flow | Better for interactive agent graphs; React-native |
Sources
1. Multi-Agent System Platform Market — Mordor Intelligence
2. ClickHouse Acquires Langfuse — ClickHouse Blog (Jan 2026)
3. OTel Semantic Conventions for GenAI Agent Spans
4. ClickHouse Observability Cost Optimization Playbook
5. AI Agents Market Size and Share Report 2033 — Grand View Research
6. Multiagent Systems in Enterprise AI — Gartner
7. 2026 Observability & AI Outlook — LogicMonitor
8. Observability Survey 2025 — Grafana Labs
9. OpenTelemetry 95% Adoption — byteiota
10. From Chaos to Clarity: OTel Across Clouds — CNCF (Nov 2025)
11. Top 5 AI Agent Observability Platforms in 2026 — getmaxim.ai
12. AGPL Policy — Google Open Source
13. Braintrust Raises $80M at $800M Valuation — Axios (Feb 2026)
14. Langfuse Joins ClickHouse — Langfuse Blog
15. Helicone — Open Source LLM Observability
16. Braintrust — AI Observability Platform
17. ClickHouse Cloud Pricing
18. SigNoz — Open Source Observability with ClickHouse
19. How Langfuse Scales LLM Observability with ClickHouse Cloud
20. TimescaleDB vs ClickHouse vs MongoDB Benchmark — DEV Community
21. LangSmith Observability Platform
22. Langfuse AI Agent Observability
23. Langfuse Pricing
24. LangSmith Plans and Pricing
25. Arize Phoenix — GitHub
26. Datadog LLM Observability
27. Observability Market Size 2031 — Mordor Intelligence
28. Can OpenTelemetry Save Observability in 2026? — The New Stack
29. AGPL License is a Non-Starter for Most Companies — Open Core Ventures
30. OpenTelemetry GenAI Semantic Conventions
31. AI Observability Tools Buyer’s Guide 2026 — Braintrust
32. Best Open Source Observability Solutions 2026 — ClickHouse