Found this on GosuCoder’s Discord server about Kimi K2. There’s a lot of hype, but little real-world data.
This is what he had to say:
Kimi K2 Thinking is a backend coding model; it’s not a great frontend model. Its strengths lie in architecture planning and backend implementation. It’s bad at XML-based tool calling, which makes it a poor fit for Cline/RooCode. Its native tool-calling capabilities are quite stable, with an 87% hit rate while maintaining 97% structural integrity. Speed varies wildly by provider and can be a horror, from ~20 tps to ~115 tps. Overall, it’s one of the strongest open-source models we’ve seen. Despite some benchmarks, however, it’s closer to GPT-5 (med) than to GPT-5 (high), and at least in coding it falls well behind GPT-5-Codex.
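For anyone unfamiliar with the distinction being drawn: clients like Cline/RooCode describe tools in the prompt and then parse XML-style tool calls out of the model’s raw text, whereas native tool calling returns a structured object from the API itself. A minimal sketch of the two shapes (tag and field names here are illustrative, not any client’s exact schema):

```python
import json
import xml.etree.ElementTree as ET

# XML-style tool call: the model emits tags inside free text and the
# client has to parse them out. If the model mangles the XML even
# slightly, the call fails — this is where K2 reportedly struggles.
xml_call = "<read_file><path>src/main.py</path></read_file>"
root = ET.fromstring(xml_call)
print(root.tag, root.find("path").text)  # tool name and argument

# Native tool calling: the API returns a structured JSON object, so the
# client never parses tool syntax from prose. This is the path where K2's
# hit rate and structural integrity are reportedly strong.
native_call = json.dumps({
    "name": "read_file",
    "arguments": {"path": "src/main.py"},
})
call = json.loads(native_call)
print(call["name"], call["arguments"]["path"])
```

Both snippets extract the same call; the difference is how much fragile text parsing stands between the model and the tool.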

