At a high level, Large Language Models (LLMs) function as statistical models that predict the most likely next word(s) in a sequence based on learned token probabilities. While they excel at parsing tokens and creating extensive neural networks connecting them, assigning meaningful interpretations to these tokens remains a significant challenge. Sure, we can improve it by train with larger size high quality data including synthetic one to make the more better.

Indeed, a winning commercial strategy for LLM vendors involves continually developing larger and super-sized models, such as Meta scaling the Llama model up to 405 billion parameters or Mistral's introduce Mixtral-Larger-2. Empirical evidence demonstrates that larger models often outperform their predecessors, as evident in table comparisons:

None
Figure 2: source: https://aimlapi.com/blog/mistral-large-2-beats-llama-3-1-405b

As illustrated, larger Language Models (LLMs) typically exhibit superior performance due to their enhanced ability to discern intricate patterns within data and generate more accurate responses, including textual reasoning.

Another example is Meta's Llama 3–70B outperforms Llama 3–8B across numerous benchmarks, as depicted in the chart provided by Meta:

None
Figure 3: source meta-llama 3 evaluation scores

Too bad, these scaling up model introduces significant computational and resource challenges in training or inference task.

Problem:

What strategies can be employed to enhance the text reasoning capabilities of smaller-sized Language Model (LLM) architectures, which may not have the same computational resources or parameter counts as larger models? How can we guide the LLM to "think" ?

Experiment:

In this article would like to share my experimental attempts to guide LLM thinking structure via system of thinking mental models.

Solution: Two-Step Approach:

  • Finetuning: Commence by fine-tuning the Llama 3.1, a smaller-sized transformer model with 8 billion parameters, on an enhanced reasoning dataset to bolster its cognitive capabilities.
  • Revelation of Internal Processes: Subsequently, leverage a system of thinking model guidance techniques (Think, Plan, Reasoning and Reflection) to unveil the model's internal thought processes and the rationales underlying its processing mechanisms.

What is Intuitive Thinker?

It's finetuned Llama 3.1 8B model with argument-ed synthetic reasoning dataset with model uploaded in Huggingface and Ollama site.

It's has API package employs function calling to dynamically apply guided structured system-of-thinking models, enabling more effective problem-solving for user queries by utilising either a dynamic or a static thinking model. (see here github ).

Individual system of thinking mental model package individual for easy to use locally in Ollama.

Checkout this demo notebook for a quick peak of the usage and result.

Available System Thinking Models Learn, Think, Plan, Reasoning and Reflection > 1. Chain-of-Thoughts > 2. Thinking Fast and Slow > 3. Critical Thinking > 4. Iceberg Mental Model > 5. Second Order Thinking

First let's take a look a sample question and response and use it to explain how LLM internal thinking works.

The LLM Model I will use here, it's my finetuned LLama 3.1 8B model running in Ollama.

ollama run mychen76/llama3.1-intuitive-thinker:q5`

The specific Mental Model applied is thinking fast and slow`

User Question:

"Patel family who is planning a road trip from Toronto to Vancouver, covering a total distance of 4,400 km. They want to divide the drive into equal parts over 5 days, with each day's drive being 15 km longer than the previous day. If they drive x km on the first day, what is the value of x that will allow them to complete the trip in 5 days?"

expected answer: 850 KM

Model Response in structure format include steps took to determine the answer. The specific mental model use here is called "Thinking Fast and Slow" — which use two modes of thought: "System 1" is fast, instinctive and emotional; "System 2" is slower, more deliberative, and more logical.


======Full Response=========
<thinking>
<thinking-fast>
Initial analysis:
- The Patel family needs to cover a total distance of 4,400 km.
- They want to divide the drive into equal parts over 5 days.
- Each day's drive should be 15 km longer than the previous day.
- We need to find the value of x that represents the distance driven on the first day.

Approach:
1. Calculate the total additional distance covered over 5 days.
2. Divide this by 5, as each day is 15 km longer than the previous one.
3. Add this increase to the distance driven on the first day (x) to find the value of x.

<thinking-slow>
Carefully interpreting the problem:
- Understanding the total distance and number of days are constant, so we can calculate based on given variables.
- Recognizing that each day's drive is 15 km longer due to a linear progression in distance.
- Breaking down the thought process into step-by-step calculations.

Suppressing biases:
- Ensuring the assumption about equal daily progressions doesn't skew reasoning.
- Double-checking for any potential oversights or simplifications.

Overriding initial impulse:
- Allowing for the calculation of x without assuming it's a simple arithmetic progression due to the given 
constraints.
</thinking-fast>
<thinking-slow>
Detailed step-by-step calculation:

1. Calculate the total additional distance covered over 5 days:
   Since each day is 15 km longer than the previous, this forms an arithmetic sequence where the difference between
consecutive terms (d) = 15 km.
   
2. The sum of the distances covered over these 5 days can be represented as:
   S = x + (x+15) + (x+30) + (x+45) + (x+60)
   
3. Since the total distance is 4,400 km and this sum equals 5x + 150, we can set up an equation:
   5x + 150 = 4400
   
4. Solving for x to find the initial day's distance:

   5x = 4400 - 150
   5x = 4250
   x = 850
   
Therefore, the value of x that will allow the Patel family to complete the trip in 5 days is 850 km.
</thinking-fast>

<reflection>
Reviewing reasoning and potential errors:
- The assumption about equal daily progressions and linear increase seems valid given the constraints.
- Ensuring the arithmetic progression calculation accounts for all additional distances.

Checking calculations:
1. Confirm the sequence of daily distances: x, (x+15), (x+30), (x+45), (x+60)
2. Verify that the total distance is correctly represented as 5x + 150
3. Double-check the arithmetic in the equation: 5x = 4400 - 150

Conclusion seems sound, but always worth a careful check.
</reflection>

<output>
The value of x that allows the Patel family to complete their road trip from Toronto to Vancouver in 5 days is 850 
km on the first day. This calculation accounts for the linear increase in daily distance and ensures the total trip
length covers the required 4,400 km over the specified period.
</thinking>

As you see the model response consist of multiple sections and steps.

<thinking>
-- how the thinking start...

<thinking-fast>
guided intruction on fast thinking steps
</thinking-fast>

<thinking-slow>
guided intruction on slow thinking steps
</thinking-slow>

<reflection>
reviewing own reasoning, logics and checking for errors and assumptions

<output>
this is the final answer 

</thinking>

The final answer is <output> section.

<output>
The value of x that allows the Patel family to complete their road trip from Toronto to Vancouver in 5 days is 850 
km on the first day. This calculation accounts for the linear increase in daily distance and ensures the total trip
length covers the required 4,400 km over the specified period.

The Core Concept

The core concept involves instructing the Language Model (LLM) to adhere to one of two "thinking models": a structured Chain-of-Thought approach or thoughtful reflection. Throughout this process, the model should document each intermediate step leading to the final answer.

Mental Models

A mental model is a conceptual framework constructed and utilised within our minds to comprehend complex phenomena. For instance, the supply and demand model provides a simplified representation of economic principles, aiding our understanding of market dynamics.

let's take a look few more mental models from system of thinking.

Similarly, the stock-and-flow mental model, a concept within the system of thinking framework, aids in comprehending the dynamics of complex systems.

In this model, "stock" refers to quantifiable elements at any given point in time, while "flows" represent events or rates of change that lead to either an increase or decrease in the stock's value..

Iceber model is another systems thinking model that help us understand issues at surface level and what are other hidden beneath the surface. in another word an iceberg model help us to illustrate how the surface-level events we react to are underpinned by less visible patterns, structures, and beliefs.

Non-Linear thinking which assume simple cause-and-effect relationship between system components. then Second order thinking involves considering the consequences of our actions and potential ripple effects they may have…etc. in short, apply best mental model for the scenario will result the best outcome.

<hr/>

Alright, now is time to try out the model and API

Usage: Ollama CLI

This is the easy and quickest way to try and test the model

## 1. pull the specialized model
ollama pull mychen76/llama3.1-intuitive-thinker:chain-of-thoughts.q5

## 2. run it
ollama run mychen76/llama3.1-intuitive-thinker:chain-of-thoughts.q5

## 3. start ask question
>> Patel family who is planning a road trip from Toronto to Vancouver, covering a total distance of 4,400 km. They want to divide the drive into equal parts over 5 days, with each day's drive being 15 km longer than the previous day. If they drive x km on the first day, what is the value of x that will allow them to complete the trip in 5 days?

## 4. check result
>>> <thinking>

To solve this problem, we need to determine the value of x, which represents the distance driven on the first day.

Step 1: Understand the constraints of the problem
The total distance to be covered is 4,400 km, and the drive should be divided into equal parts over 5 days. Each day's drive 
should be 15 km longer than the previous day.

..........
..........

Usage : Ollama API

## clone repo
git clone https://github.com/minyang-chen/intuitive_thinker.git

## install package 

cd intuitive_thinker
pip install dist/intuitive_thinker-0.1.0-py3-none-any.whl

## or activate the environment with poetry
cd intuitive_thinker
poetry shell

Update Environment Config

Locate .venv file at project folder and update environment specific setting to your own setup.

OLLAMA_URL="http://192.168.0.47:11434"
OLLAMA_KEY="ollama"
OLLAMA_MODEL_BASE="mychen76/llama3.1-intuitive-thinker:q5"

## prebuild specialized  mental model
OLLAMA_MODEL_CHAIN_OF_THOUGHTS="mychen76/llama3.1-intuitive-thinker:chain-of-thoughts.q5"
OLLAMA_MODEL_SECOND_ORDER_THINKING="mychen76/llama3.1-intuitive-thinker:second-order-thinking.q5"
OLLAMA_MODEL_ICEBERG_MENTAL_MODEL="mychen76/llama3.1-intuitive-thinker:iceberg-mental-model.q5"
OLLAMA_MODEL_THINKING_FAST_N_SLOW="mychen76/llama3.1-intuitive-thinker:thinking-fast-n-slow.q5"
OLLAMA_MODEL_CRITICAL_THINKING="mychen76/llama3.1-intuitive-thinker:critical-thinking.q5"

## Using OPENAI API
OPENAI_URL="http://192.168.0.26:11434/v1/"
OPENAI_KEY="ollama"
OPENAI_MODEL="mychen76/llama3.1-intuitive-thinker:q5"

## Huggingface
HF_MODEL_DEFAULT="mychen76/Llama-3.1_Intuitive-Thinker"
HF_MODEL_GGUF_Q5="https://huggingface.co/mychen76/Llama-3.1_Intuitive-Thinker_8B_2309_GGUF/resolve/main/Llama-3.1_Intuitive-Thinker_8B_2309.Q5_K_M.gguf"
HF_MODEL_GGUF_Q4="https://huggingface.co/mychen76/Llama-3.1_Intuitive-Thinker_8B_2309_GGUF/resolve/main/Llama-3.1_Intuitive-Thinker_8B_2309.Q4_K_M.gguf"

Using OllamaIntuitive Thinker API

from intuitive_thinker.ollama_thinker import OllamaIntuitiveThinker
from intuitive_thinker.mental_model import MentalModel
import os

### Environment Config

ollama_url = os.getenv("OLLAMA_URL")
ollama_key = os.getenv("OLLAMA_KEY")
ollama_model = os.getenv("OLLAMA_MODEL_BASE")

print(10*"-----")
print(ollama_url, ollama_key, ollama_model)
print(10*"-----")
##
## Use: Icerberg Mental Model
##
query = """
apply iceberg_thinking thinking for query:
Patel family who is planning a road trip from Toronto to Vancouver, covering a total distance of 4,400 km. They want to divide the drive into equal parts over 5 days, with each day's drive being 15 km longer than the previous day. If they drive x km on the first day, what is the value of x that will allow them to complete the trip in 5 days?]
"""

thinker = OllamaIntuitiveThinker(llm_url=ollama_url)
response = await thinker.llmchat(question=query,
                           llm_model=ollama_model,
                           mental_model=None)
print(10*"===")
print(response)

final_answer = thinker.parse_output(response)
print("========Final Answer=========")
print(final_answer)

Result

...calling tool:

[
    {
        'function': {
            'name': 'apply_thinking_model',
            'arguments': {
                'model': 'iceberg_thinking',
                'query': "Patel family who is planning a road trip from Toronto to Vancouver, covering a total 
distance of 4,400 km. They want to divide the drive into equal parts over 5 days, with each day's drive being 15 km
longer than the previous day. If they drive x km on the first day, what is the value of x that will allow them to 
complete the trip in 5 days?"
            }
        }
    }
]

...use thinking_model(iceberg_thinking,Patel family who is planning a road trip from Toronto to Vancouver, covering a total distance of 4,400 km. They want to divide the drive into equal parts over 5 days, with each day's drive being 15 km longer than the previous day. If they drive x km on the first day, what is the value of x that will allow them to complete the trip in 5 days?)

***after tool call response***
==============================
<thinking>
To solve this problem, we need to find the value of x that will allow the Patel family to complete their road trip in 5 days. 

1. First, let's identify the total distance they need to cover (4,400 km).
2. Next, we know each day's drive is 15 km longer than the previous day.
3. We can use algebraic expressions to represent this situation:
   Let x be the distance driven on the first day.
   Then, the distances for the next four days are: x + 15, x + 30, x + 45, and x + 60.
4. The sum of these distances should equal the total distance (4,400 km).
5. We can write this as an equation:

x + (x + 15) + (x + 30) + (x + 45) + (x + 60) = 4400

6. Combine like terms:
   5x + 150 = 4400
7. Subtract 150 from both sides:
   5x = 4250
8. Divide by 5 to solve for x:
   x = 850

Therefore, the Patel family should drive 850 km on the first day.

<reflection>
Reviewing our reasoning:
- We correctly identified the total distance and the increase in distance each day.
- Our algebraic expression accurately represented the situation.
- The equation we formed was logical and correct.
- Solving for x gave us a plausible answer (within the context of a road trip).
</reflection>

<output>
The value of x that will allow the Patel family to complete their road trip in 5 days is 850 km.
</output>
========Final Answer=========

The value of x that will allow the Patel family to complete their road trip in 5 days is 850 km.

Usage: LLama via OpenAI API

### Llama-Thinker

from intuitive_thinker.llama_thinker import LlamaIntuitiveThinker
from intuitive_thinker.mental_model import MentalModel
import os
from dotenv import load_dotenv
load_dotenv()

### Environment Config
openai_url = os.getenv("OPENAI_URL")
openai_api_key = os.getenv("OPENAI_KEY")
openai_model = os.getenv("OPENAI_MODEL")

print(10*"-----")
print(openai_url, openai_api_key, openai_model)
print(10*"-----")

##
## Use: general Chain-of-Thoughts
##
query = "count number of letter r in word strawberry?"

openthinker = LlamaIntuitiveThinker(llm_url=openai_url, llm_api_key=openai_api_key)
response = openthinker.llmchat(question=query, llm_model=openai_model, 
                               use_streaming=False,
                               mental_model=MentalModel.CHAIN_OF_THOUGHTS,
                              options={"temperature":1.0,"max_tokens":1024})
## {"temperature":1.5,"max_tokens":512}
print("======Full Response=========")
print(response)
print("======Final Answer=========")
answer= openthinker.parse_output(response)
print(answer)

Model Response

...use thinking_model(chain_of_thoughts,count number of letter r in word strawberry?)

======Full Response=========
<thinking>

To calculate the number of letter R's within word "strawberry".

1. Step through how we'll determine quantity.
The plan includes breaking down and reading term.
Looking by examining spelling and identify
any duplicates in process thus to know our answer in conclusion

There are seven items in what I've outlined:
      -step-by-anagram-of-and-name,
      -exam break and read word as individual letters out so spelling be soundly understood.
       Breaks then look for duplicate r.
      Count R.
      Identify number of
   Duplicates if one found and report on finding

2. Consider step to check if each might have errors or is the way the problem might better still approach.
   So step need reflection.

What about reading spelling forward instead going left word? Which letter comes from each spot.
       Checking in case there might an answer easier via spelling read that left 
       start again beginning forward with this
Reflection is closed as thought.
Let myself ask the question now is word breaking leftward spelling correct to count amount spelled by 'r?' Or did my other
   idea get there by a new insight? The better practice will get at the problem if and so one checks idea against 
   other possibilities in thinking part first to see which is stronger idea.

On reviewing earlier work done seems it is step which have errors. If going left begin each time we count word 's-r-b-g-' as 'str'.
Now then count: r

This plan may solve question, and here below now answer by method of above. By making the correction at a single step.
Str a W  ... R    e    Ry
</thinking>
<ouput>

Step by step count is done, result of word ' strawberries' counts number of duplicates (letter  that was 'r')
is...  three "R" exists in "strawberry". Thus  your final number of letter ' R ' spelled.
</output>"
======Final Answer=========
<thinking>

To calculate the number of letter R's within word "strawberry".

1. Step through how we'll determine quantity.
The plan includes breaking down and reading term.
Looking by examining spelling and identify
any duplicates in process thus to know our answer in conclusion

There are seven items in what I've outlined:
      -step-by-anagram-of-and-name,
      -exam break and read word as individual letters out so spelling be soundly understood.
       Breaks then look for duplicate r.
      Count R.
      Identify number of
   Duplicates if one found and report on finding
2. Consider step to check if each might have errors or is the way the problem might better still approach.
   So step need reflection.
What about reading spelling forward instead going left word? Which letter comes from each spot.
       Checking in case there might an answer easier via spelling read that left 
       start again beginning forward with this

Reflection is closed as thought.

Let myself ask the question now is word breaking leftward spelling correct to count amount spelled by 'r?' Or did my other
   idea get there by a new insight? The better practice will get at the problem if and so one checks idea against 
   other possibilities in thinking part first to see which is stronger idea.
On reviewing earlier work done seems it is step which have errors. If going left begin each time we count word 's-r-b-g-' as 'str'.
Now then count: r

This plan may solve question, and here below now answer by method of above. By making the correction at a single step.
Str a W  ... R    e    Ry

<ouput>

Step by step count is done, result of word ' strawberries' counts number of duplicates (letter  that was 'r')
is...  three "R" exists in "strawberry". Thus  your final number of letter ' R ' spelled.
"

Final Answer: `three "R" exists in "strawberry".`

In Summary

Experimental results show that incorporating mental models from the system of thinking framework can significantly improve reasoning abilities in smaller Language Models (LLMs).

This structured approach offers greater transparency into the LLM's thought process, outlining each step taken to arrive at a final answer. As a result, it becomes easier to identify logical reasoning and understand both correct and incorrect conclusions.

Try posing a question that requires reasoning and observe how the LLM responds. Check the result if its approach resembles your own thought processes.

Thanks again for reading :)

have a nice day!

Available Resources

Huggingface Model: https://huggingface.co/mychen76/Llama-3.1_Intuitive-Thinker

Ollama model: https://ollama.com/mychen76/llama3.1-intuitive-thinker

Gihub: https://github.com/minyang-chen/intuitive_thinker

Demo Notebook:

https://github.com/minyang-chen/intuitive_thinker/blob/main/intuitive_thinker_demo.ipynb