Test summary

Prompt
25 Tools Run v32
Dataset
Multi-mcp-history
Status
15 Completed
Test run duration
4 minutes 20 seconds
Evaluation cost
-
Started at
Jul 3, 2025, 5:45:58 PM
Author
Madhu Shantan

Summary by evaluator

Tool Call Accuracy
Result  Mean score  Pass rate
Fail    0.73        73.33%
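The reported pass rate can be related back to the 15 completed entries. The sketch below is only a sanity check, and it assumes every completed entry was scored by the evaluator, which the report does not state. The function name `passes_matching_rate` is hypothetical.

```python
def passes_matching_rate(total: int, reported_rate_pct: float) -> list[int]:
    """Return every pass count whose percentage rounds to the reported rate."""
    return [
        k for k in range(total + 1)
        # Accept any count whose exact rate is within rounding distance (2 decimals).
        if abs(100 * k / total - reported_rate_pct) < 0.005
    ]

# 15 completed entries and a 73.33% pass rate, both taken from the report.
candidates = passes_matching_rate(15, 73.33)
print(candidates)  # [11]
```

Under that assumption, 11 of the 15 entries passed the Tool Call Accuracy evaluator.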

Tokens

Type               Value
Total tokens       873316
Input tokens       851416
Completion tokens  21900

Cost

Type                   Value
Total cost             $14.41374
Input token cost       $12.77124
Completion token cost  $1.6425
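The token and cost figures can be cross-checked against each other. The following sketch confirms that the parts add up to the totals and derives the per-million-token rates the costs imply. It assumes flat per-token pricing for input and completion tokens, which the report does not confirm, and all variable names are illustrative.

```python
# Figures copied from the Tokens and Cost tables in this report.
input_tokens = 851_416
completion_tokens = 21_900
total_tokens = 873_316

input_cost = 12.77124
completion_cost = 1.6425
total_cost = 14.41374

# The parts should sum to the reported totals.
assert input_tokens + completion_tokens == total_tokens
assert abs(input_cost + completion_cost - total_cost) < 1e-9

# Implied rates in dollars per million tokens (assumes flat pricing).
input_rate_per_million = input_cost / input_tokens * 1_000_000
completion_rate_per_million = completion_cost / completion_tokens * 1_000_000
print(round(input_rate_per_million, 2), round(completion_rate_per_million, 2))  # 15.0 75.0
```

Input tokens account for roughly 97% of the token count and about 89% of the cost, so prompt size is the main cost driver in this run.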

Latency (ms)

Type          Value
min           12593.00 ms
max           101764.50 ms
p50           23135.00 ms
p90           93695.00 ms
p95           101823.00 ms
p99           101823.00 ms
mean          36261.33 ms
stdDeviation  27116.4625
total         15
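A quick consistency check on the latency figures is sketched below. It verifies that the statistics are in non-decreasing order from min through the percentiles to max, and compares the summed per-entry latency with the reported run duration. It assumes that `total` is the number of latency samples and that the 4 minutes 20 seconds duration is wall-clock time; neither is stated in the report.

```python
# Latency statistics in milliseconds, copied from the table in this report.
latency_ms = {
    "min": 12593.00,
    "p50": 23135.00,
    "p90": 93695.00,
    "p95": 101823.00,
    "p99": 101823.00,
    "max": 101764.50,
}

# Each statistic should be no larger than the one that follows it.
order = ["min", "p50", "p90", "p95", "p99", "max"]
violations = [
    (a, b) for a, b in zip(order, order[1:]) if latency_ms[a] > latency_ms[b]
]
print(violations)  # [('p99', 'max')]

# Summed latency (mean * sample count) versus the reported run duration.
mean_ms, samples = 36261.33, 15
run_duration_ms = (4 * 60 + 20) * 1000
overlap_factor = mean_ms * samples / run_duration_ms
print(round(overlap_factor, 2))  # 2.09
```

The reported p95 and p99 (101823.00 ms) exceed the reported max (101764.50 ms). This may come from how the dashboard interpolates or rounds percentiles, but the report gives no explanation, so the values are left as exported. The summed latency is about twice the run duration, which would be consistent with entries running concurrently.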