Test summary

Prompt
25 Tools Run v37
Dataset
Multi-mcp-history
Status
15 Completed
Test run duration
5 minutes 41 seconds
Evaluation cost
-
Started at
Jul 4, 2025, 8:54:42 AM
Author
Madhu Shantan

Summary by evaluator

Tool Call Accuracy
Result    Mean score    Pass rate
Fail      0.67          66.67%

Tokens

Type                 Value
Total tokens         1362500
Input tokens         1340581
Completion tokens    21919

Cost

Type                     Value
Total cost               $4.350528
Input token cost         $4.021743
Completion token cost    $0.328785
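
The token and cost figures above can be cross-checked with a little arithmetic. The sketch below is not part of the evaluation tool. It only uses the numbers reported in this summary, and the helper name rate_per_million is a hypothetical chosen for illustration. It confirms that the input and completion rows add up to the totals and derives the effective price per million tokens implied by the report.

```python
# Figures copied from the Tokens and Cost tables in this report.
input_tokens = 1_340_581
completion_tokens = 21_919
total_tokens = 1_362_500

input_cost = 4.021743        # USD
completion_cost = 0.328785   # USD
total_cost = 4.350528        # USD


def rate_per_million(cost_usd, tokens):
    """Effective price in USD per one million tokens (hypothetical helper)."""
    return cost_usd / tokens * 1_000_000


# The component rows should sum to the reported totals.
tokens_consistent = input_tokens + completion_tokens == total_tokens
cost_consistent = abs(input_cost + completion_cost - total_cost) < 1e-6

# Implied per-million-token rates, rounded to cents.
input_rate = round(rate_per_million(input_cost, input_tokens), 2)
completion_rate = round(rate_per_million(completion_cost, completion_tokens), 2)

print(tokens_consistent, cost_consistent)  # True True
print(input_rate, completion_rate)         # 3.0 15.0
```

Under these figures the run was billed at an effective $3.00 per million input tokens and $15.00 per million completion tokens, and input tokens account for most of the total cost.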

Latency (ms)

Type            Value
min             6361.00 ms
max             119857.32 ms
p50             16943.00 ms
p90             71039.00 ms
p95             119871.00 ms
p99             119871.00 ms
mean            34265.47 ms
stdDeviation    30331.9477
total           15