Test summary

Prompt: 25 Tools Run v33
Dataset: Multi mcp - gpt 4.1
Status: 15 Completed
Test run duration: 40 seconds
Evaluation cost: -
Started at: Jul 3, 2025, 9:32:18 PM
Author: Madhu Shantan

Summary by evaluator

Tool Call Accuracy
Result    Mean score    Pass rate
Pass      0.8           80%

Tokens

Type                 Value
Total tokens         496210
Input tokens         489770
Completion tokens    6440

Cost

Type                     Value
Total cost               $1.03106
Input token cost         $0.97954
Completion token cost    $0.05152
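
The token and cost figures above can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the numbers are copied from this report, while the helper name `rate_per_million` is hypothetical and not part of any tool. It confirms that the input and completion rows add up to the totals and derives the per-token prices that the figures imply.

```python
from decimal import Decimal

# Figures copied from the Tokens and Cost tables in this report.
input_tokens = 489_770
completion_tokens = 6_440
total_tokens = 496_210

input_cost = Decimal("0.97954")
completion_cost = Decimal("0.05152")
total_cost = Decimal("1.03106")


def rate_per_million(cost: Decimal, tokens: int) -> Decimal:
    """Implied price in dollars per one million tokens (hypothetical helper)."""
    return cost / tokens * 1_000_000


# The component rows should sum to the reported totals.
tokens_consistent = input_tokens + completion_tokens == total_tokens
costs_consistent = input_cost + completion_cost == total_cost

# Prices implied by the report, not taken from any price list.
input_rate = rate_per_million(input_cost, input_tokens)
completion_rate = rate_per_million(completion_cost, completion_tokens)

print("tokens add up:", tokens_consistent)
print("costs add up:", costs_consistent)
print("implied input rate ($/1M tokens):", input_rate)
print("implied completion rate ($/1M tokens):", completion_rate)
```

Both checks hold, and the implied rates work out to $2 per million input tokens and $8 per million completion tokens. Input tokens account for almost all of the spend in this run.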

Latency (ms)

Type            Value
min             1930.00 ms
max             20300.08 ms
p50             6359.00 ms
p90             14303.00 ms
p95             20303.00 ms
p99             20303.00 ms
mean            6931.27 ms
stdDeviation    4749.423
total           15
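
With only 15 samples, the upper percentiles land on the last one or two observations. The sketch below assumes the nearest-rank percentile definition; the report does not say which method the evaluation tool uses, so treat this as one plausible reading rather than a description of the tool. The function name `nearest_rank` is hypothetical.

```python
def nearest_rank(p: int, n: int) -> int:
    """1-based position of the p-th percentile among n sorted samples.

    Nearest-rank method (an assumption, not the tool's documented behavior):
    rank = ceil(p * n / 100), computed with integer arithmetic.
    """
    return max(1, (p * n + 99) // 100)


n = 15  # sample count reported in the "total" row
ranks = {p: nearest_rank(p, n) for p in (50, 90, 95, 99)}

for p, rank in ranks.items():
    print(f"p{p}: sample {rank} of {n}")
```

Under this assumption p50 is the 8th sorted sample, p90 the 14th, and p95 and p99 are both the 15th, which would explain why the p95 and p99 rows carry the same value.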