Test summary
Prompt
25 Tools Run v33
Dataset
Multi mcp - gpt 4.1
Status
15 Completed
Test run duration
40 seconds
Evaluation cost
-
Started at
Jul 3, 2025, 9:32:18 PM
Author
Madhu Shantan
Summary by evaluator
Tool Call Accuracy
Result
Pass
Mean score
0.8
Pass rate
80%
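The pass rate can be turned back into a count of passing entries. This is a minimal sketch that assumes the 80% pass rate was computed over the 15 completed entries listed under Status; the report does not state the denominator, and the variable names are illustrative only.

```python
from fractions import Fraction

# Figures copied from the report.
COMPLETED_ENTRIES = 15            # "15 Completed" under Status
PASS_RATE = Fraction(80, 100)     # "Pass rate 80%"

# Assumption: the pass rate is taken over all completed entries.
passed = PASS_RATE * COMPLETED_ENTRIES
failed = COMPLETED_ENTRIES - passed

# A whole-number result is a quick sanity check that the assumed
# denominator is at least plausible.
is_whole_number = passed.denominator == 1

print(f"passed={passed}, failed={failed}, whole={is_whole_number}")
```

Under that assumption the figures work out to 12 passing and 3 failing entries.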
Tokens
Type
Value
Total tokens
496210
Input tokens
489770
Completion tokens
6440
Cost
Type
Value
Total cost
$ 1.03106
Input token cost
$ 0.97954
Completion token cost
$ 0.05152
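The token and cost figures above can be cross-checked with a few lines of arithmetic. The sketch below re-adds the totals and derives the per-million-token rates implied by the reported costs. Those rates are not stated in the report; they are computed from its numbers, and the helper name `implied_rate_per_million` is hypothetical.

```python
from decimal import Decimal

# Figures copied from the Tokens and Cost tables above.
INPUT_TOKENS = 489_770
COMPLETION_TOKENS = 6_440
INPUT_COST = Decimal("0.97954")        # USD
COMPLETION_COST = Decimal("0.05152")   # USD


def implied_rate_per_million(cost: Decimal, tokens: int) -> Decimal:
    """Return the USD price per one million tokens implied by a cost/token pair."""
    return cost / tokens * 1_000_000


# Totals recomputed from their components.
total_tokens = INPUT_TOKENS + COMPLETION_TOKENS
total_cost = INPUT_COST + COMPLETION_COST

# Rates derived from the report, not taken from any price list.
input_rate = implied_rate_per_million(INPUT_COST, INPUT_TOKENS)
completion_rate = implied_rate_per_million(COMPLETION_COST, COMPLETION_TOKENS)

print(f"total tokens: {total_tokens}")
print(f"total cost:   $ {total_cost}")
print(f"implied input rate:      $ {input_rate:.2f} per 1M tokens")
print(f"implied completion rate: $ {completion_rate:.2f} per 1M tokens")
```

The recomputed totals match the reported 496210 tokens and $ 1.03106. Input tokens account for roughly 95% of the cost, so reducing prompt size would likely lower the cost of this run more than reducing completions.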
Latency (ms)
Type
Value
min
1930.00 ms
max
20300.08 ms
p50
6359.00 ms
p90
14303.00 ms
p95
20303.00 ms
p99
20303.00 ms
mean
6931.27 ms
stdDeviation
4749.423 ms
total (number of entries)
15
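Latency statistics from the same sample should be non-decreasing from min through the percentiles to max. The sketch below checks that ordering for the values in the Latency table. It is a consistency check on the reported numbers only; the function name `ordering_violations` is illustrative and not part of any evaluation tool.

```python
# Values copied from the Latency (ms) table above.
reported_ms = {
    "min": 1930.00,
    "p50": 6359.00,
    "p90": 14303.00,
    "p95": 20303.00,
    "p99": 20303.00,
    "max": 20300.08,
}

# Expected ordering: each statistic should be <= the next one.
EXPECTED_ORDER = ["min", "p50", "p90", "p95", "p99", "max"]


def ordering_violations(stats, order=EXPECTED_ORDER):
    """Return adjacent (lower, upper) pairs where lower > upper."""
    return [
        (lower, upper)
        for lower, upper in zip(order, order[1:])
        if stats[lower] > stats[upper]
    ]


violations = ordering_violations(reported_ms)
for lower, upper in violations:
    print(f"{lower} ({reported_ms[lower]} ms) exceeds {upper} ({reported_ms[upper]} ms)")
```

With the reported values the check flags one pair: p99 (20303.00 ms) is larger than max (20300.08 ms). The report gives no way to tell which figure is correct, so the discrepancy is left as reported.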