Test summary

Prompt: 25 Tools Run v31
Dataset: Multi-mcp-history
Status: 15 Completed
Test run duration: 3 minutes 21 seconds
Evaluation cost: -
Started at: Jul 3, 2025, 4:52:03 PM
Author: Madhu Shantan

Summary by evaluator

Tool Call Accuracy
Result    Mean score    Pass rate
Fail      0.73          73.33%
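
The pass rate can be turned back into a count of passing runs. The sketch below assumes the rate is computed over the 15 completed runs listed under Status; the report does not state this, and the variable names are illustrative only.

```python
# Figures copied from the summary above.
completed_runs = 15
pass_rate_percent = 73.33

# Assumption: the pass rate is taken over the 15 completed runs.
passed_runs = round(completed_runs * pass_rate_percent / 100)
failed_runs = completed_runs - passed_runs

# Round-trip check: the recovered count should reproduce the reported rate.
recomputed_percent = round(100 * passed_runs / completed_runs, 2)

print(passed_runs, failed_runs, recomputed_percent)  # prints: 11 4 73.33
```

Under that assumption, 11 of the 15 runs passed the Tool Call Accuracy evaluator and 4 did not.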

Tokens

Type                 Value
Total tokens         843415
Input tokens         791120
Completion tokens    8390

Cost

Type                     Value
Total cost               $1.009875
Input token cost         $0.9889
Completion token cost    $0.020975
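
The token and cost figures can be cross-checked with a few lines of arithmetic. This is a minimal sketch using only the numbers reported above; the per-million-token rates it derives are inferred from those numbers, not taken from any published price list, and the variable names are illustrative.

```python
from decimal import Decimal

# Figures copied from the Tokens and Cost tables above.
total_tokens = 843415
input_tokens = 791120
completion_tokens = 8390
input_cost = Decimal("0.9889")
completion_cost = Decimal("0.020975")
reported_total_cost = Decimal("1.009875")

# The two cost components should add up to the reported total.
computed_total_cost = input_cost + completion_cost

# Implied price per million tokens (an inference from the report).
input_rate_per_million = input_cost / input_tokens * 1_000_000
completion_rate_per_million = completion_cost / completion_tokens * 1_000_000

# Tokens in the reported total that are neither input nor completion tokens.
# The report does not say what these are.
unattributed_tokens = total_tokens - (input_tokens + completion_tokens)

print(computed_total_cost == reported_total_cost)
print(input_rate_per_million, completion_rate_per_million)
print(unattributed_tokens)
```

Decimal is used instead of float so that the dollar amounts compare exactly.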

Latency (ms)

Type            Value
min             9684.00 ms
max             179561.45 ms
p50             28351.00 ms
p90             82495.00 ms
p95             179583.00 ms
p99             179583.00 ms
mean            41671.47 ms
stdDeviation    42818.0179
total           15