Unexplainable Costs
Hello,
I am creating a chatbot for an online Albanian store. I am using Trieve to handle the product search and Vapi as the 'orchestration' layer.
The system prompt for the assistant is 889 tokens. However, even the shortest output costs almost 3 cents (0.028) with the GPT-4o model. I am using Postman to see how everything would work in a third-party interface, and that is how I noticed the cost.
I am running 4o because it is the most lightweight model that handles the dialect the project needs to be in.
Tokens needed to run the prompt: 894
Tokens in the user input: 13
Tokens the assistant used for output: 119
Total tokens: 8935
Completion tokens: 117
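To make the mismatch concrete, here is a minimal sketch of the arithmetic on the figures above. It only subtracts the parts I can see from the reported total; it does not show where the remaining tokens come from. The function names and the per-million-token prices are hypothetical placeholders, not real rates, so substitute the current values from the provider's pricing page.

```python
# Token figures exactly as reported in this post.
PROMPT_TOKENS = 894       # tokens needed to run the prompt
USER_INPUT_TOKENS = 13    # tokens in the user input
OUTPUT_TOKENS = 119       # tokens the assistant used for output
TOTAL_TOKENS = 8935       # total tokens reported
COMPLETION_TOKENS = 117   # completion tokens reported


def unaccounted_tokens(total, *known_parts):
    """Return how many tokens of the total are not explained by the known parts."""
    return total - sum(known_parts)


def estimate_cost(prompt_tokens, completion_tokens,
                  input_price_per_million, output_price_per_million):
    """Estimate cost from token counts and per-million-token prices.

    The prices are parameters on purpose: the values passed below are
    placeholders (assumptions), not the model's actual rates.
    """
    input_cost = prompt_tokens / 1_000_000 * input_price_per_million
    output_cost = completion_tokens / 1_000_000 * output_price_per_million
    return input_cost + output_cost


if __name__ == "__main__":
    gap = unaccounted_tokens(
        TOTAL_TOKENS, PROMPT_TOKENS, USER_INPUT_TOKENS, OUTPUT_TOKENS
    )
    print(f"Tokens not explained by the listed parts: {gap}")  # prints 7909

    # Assumption: everything in the total that is not a completion token
    # is billed as input.
    billed_input = TOTAL_TOKENS - COMPLETION_TOKENS
    # Placeholder prices, for illustration only.
    cost = estimate_cost(billed_input, COMPLETION_TOKENS, 2.0, 8.0)
    print(f"Estimated cost with placeholder prices: {cost:.4f}")
```

Roughly 7,900 of the 8,935 tokens are not covered by the prompt, the input, or the output, which is the part I cannot explain.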
Can someone tell me whether this has anything to do with the fact that I'm using Trieve to handle data retrieval?
(Here is the chat ID I used to track the pricing and output via Postman: b660bb65-f286-4af3-9ad7-75e90120748d)