model with thinking?

#5 opened by pypry

Models without thinking can only handle simple tasks.

I have never seen a Qwen Coder Thinking model. What you need is an agentic tool like sequential thinking rather than built-in reasoning.

Models without thinking can only handle simple tasks.

Thinking is just a way for a model to explore options via random paths. You already get that in an agentic tool, because it tries things, compiles, and if something fails it can try something else, AND it gets feedback from the compiler, your tests, etc. More than that: the model has been trained on the thinking of real developers who fixed real bugs. Thinking would still add something, but it mostly wastes context and adds latency before the real response/action, provided the model is trained well and running in a good framework.
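
To make that concrete, here is a rough sketch of the loop I mean (all function names are made up, not any particular tool's API):

```python
def propose_patch(history):
    """Placeholder for a model call; a real agent would condition on the
    earlier attempts and feedback stored in `history`."""
    return f"// candidate patch #{len(history) + 1}"

def apply_and_build(patch):
    """Placeholder: a real tool would apply the patch and run the build;
    here it always reports a simulated failure with compiler feedback."""
    return False, "error: expected ';' before '}' token"

history = []
for attempt in range(3):
    patch = propose_patch(history)
    ok, feedback = apply_and_build(patch)
    history.append({"patch": patch, "feedback": feedback})
    if ok:
        break

# The compiler/test feedback accumulated in `history` plays much the same
# exploratory role as an explicit chain of thought would.
print(len(history), "attempts recorded")
```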

I understand your point: in a well-trained model running in an efficient framework, chain-of-thought (thinking) may mostly add latency and consume resources. However, I'd like to argue against this from the following perspectives:

The essence of thinking is not random exploration
Human developers' thinking is not a purely random search over paths but a directed convergence process based on logical reasoning, abstract induction, and causal judgment. If properly designed, a model's chain-of-thought can simulate this structured reasoning rather than rely on mere trial and error. For example, when tackling a complex bug, a developer first hypothesizes the root cause and then systematically verifies it, rather than blindly trying every compilation path.
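
As a rough illustration of that hypothesize-then-verify pattern (the hypotheses and check functions below are invented for the example):

```python
def check_off_by_one():
    return False  # placeholder: e.g. run a focused unit test

def check_stale_cache():
    return True   # placeholder: e.g. clear the cache and re-run

hypotheses = [
    ("off-by-one in pagination loop", check_off_by_one),
    ("stale cache returning old results", check_stale_cache),
]

# Directed convergence: work through hypotheses in order of suspicion,
# moving to the next one only if a check rules the current one out.
for description, verify in hypotheses:
    print("Testing hypothesis:", description)
    if verify():
        print("Root cause confirmed:", description)
        break
```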

Chain-of-thought reduces trial-and-error costs
Even though agent tools can iterate through compilation feedback, in real-world scenarios (such as production environment deployments or hardware testing), each trial may come with high costs. An internal thinking process allows the model to conduct virtual reasoning before "taking action," thereby eliminating invalid paths in advance, which can actually lower overall costs.
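
A minimal sketch of the idea, with made-up candidate fixes and plausibility scores standing in for the model's internal judgment:

```python
candidates = {
    "add retry around flaky network call": 0.2,
    "fix race condition in connection pool": 0.7,
    "bump client timeout": 0.1,
}

real_trials_run = 0

def expensive_trial(fix):
    """Placeholder for a costly real-world check (staging deploy, hardware test)."""
    global real_trials_run
    real_trials_run += 1
    return fix == "fix race condition in connection pool"

# Internal reasoning pass: rank candidates without touching the environment,
# then spend the expensive trials only on the most plausible ones.
best_first = sorted(candidates, key=candidates.get, reverse=True)

for fix in best_first:
    if expensive_trial(fix):
        print("Confirmed fix:", fix, "after", real_trials_run, "real trial(s)")
        break
```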

Training data cannot cover all long-tail problems
While models have indeed learned from developers' historical experiences, novel problems often require combining known knowledge, making cross-domain analogies, or creatively deconstructing issues. Chain-of-thought can explicitly construct reasoning steps, helping models break free from the limitations of data distribution rather than merely repeating historical patterns.

The value of interpretability and controllability
The thinking process provides opportunities for human supervision and intervention. If a model directly outputs actions, errors may be harder to trace. Explicit reasoning steps allow developers to make corrections at critical junctures, which is essential for high-reliability scenarios (such as medical or financial code).
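
As a toy illustration (the step names and the review policy are hypothetical):

```python
reasoning_steps = [
    "Hypothesis: the null check in parse_config() is missing",
    "Plan: add a guard plus a regression test",
    "Action: apply patch to parser.py",
]

def approved(step):
    """Placeholder for human review. In a real tool this could be an
    interactive prompt or a policy check; here every Action is held back."""
    return not step.startswith("Action:")

for step in reasoning_steps:
    if not approved(step):
        print("Held for review:", step)
        break
    print("OK:", step)
```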

Trade-off between latency and effectiveness
Although chain-of-thought increases single-response latency, it may significantly reduce the number of action iterations in complex tasks. For instance, generating the correct solution through multi-step reasoning in one go can save more time than multiple fast but erroneous attempts.
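
A back-of-the-envelope comparison, using purely assumed timings:

```python
FAST_ATTEMPT = 5         # seconds per non-thinking response (assumed)
REASONED_ATTEMPT = 20    # seconds including chain-of-thought (assumed)
RETRIES_WITHOUT_COT = 6  # assumed attempts before a correct fix lands

print("fast retries total:", FAST_ATTEMPT * RETRIES_WITHOUT_COT, "s")  # 30 s
print("single reasoned attempt:", REASONED_ATTEMPT, "s")               # 20 s
# Whether reasoning wins depends entirely on how many retries it avoids.
```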

Yes, I was being sloppy by using the term "random". Obviously it's within constraints, like "what might work here; let's try this patch and recompile, and see what the compiler says." Even non-thinking LLMs basically work by searching the space of human knowledge constrained by human language's grammar rules, I think.

I should clarify: I do want a reasoning version of this too (and I'm sure one is on the way), but I don't think it will be better for agentic development tools that maintain an automated chat history of coding attempts, because that amounts to the same thing, only with more targeted compiler/tool interactions.
