My main problems lately have been the context window size of GPT-3.5 Turbo and the price of GPT-4o: the app I've built uses 300k+ tokens per conversation. Throw in voice costs and it gets a bit expensive.
Plus, GPT-3.5 isn't the smartest model, and I got better results with Claude Haiku. The problem with Haiku is the low API rate limits and some odd stability issues.
I didn't have much choice, though, since GPT-4 was too expensive for my use case given how many tokens I go through.
I've been trying out GPT-4o mini and was fairly impressed: its reasoning is much better than GPT-3.5's, and the 128k context window is enough for my use case (...and the low price helps too).
Have you used mini yet? What are your thoughts and experiences so far?