I'll show you how to create any chatbot locally, for free, in just a few minutes. We'll build a chatbot that automatically answers questions about any topic, as well as a multimodal chatbot that answers questions about images. I'll guide you through using the Ollama Python library and LangChain to create both types of chatbots step by step. Plus, I'll provide the full code so you don't have to build it from scratch; you can simply copy and paste it into your project.
So, here's the plan: we're going to create two different chatbots. The first is a multimodal chatbot built on the LLaVA model, which can automatically answer questions about images. The second is a question-answering chatbot that responds to any question you ask. This will be useful for any type of business.
Before we start! 🦸🏻‍♂️
If you like this topic and you want to support me:
- Follow me on Medium and subscribe to get my latest articles 🫶
- Follow me on YouTube: Gao Dalie (高達烈)
Ollama Python library
![[Ollama libraries 🦙] Run Any Chatbot Free Locally on Your Computer](https://cdn-images-1.medium.com/max/1000/1*n_9afFHKdWUsuRMX3XYDGQ.png)
The Ollama Python library helps you integrate Ollama services into your Python projects. Ollama is a platform for running large language models locally, and this library allows you to easily use Ollama's various features.
If you havenāt downloaded Ollama yet, please refer to my previous article where I explain how to download it.
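Once Ollama is installed and running, a minimal sketch of calling a local model looks like this (it assumes you have already pulled the model with `ollama pull llama2`):

```python
import ollama

# One-shot chat call against a locally pulled model.
reply = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Say hello in one sentence.'}],
)
print(reply['message']['content'])
```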
Features of the Ollama Python library:
The Ollama Python library supports a variety of functions and can perform tasks such as streaming responses, multimodal processing, text completion, and custom model creation.
1- Streaming response:
When you want to receive a continuous response from the server, you can use the `stream=True` option.
```python
import ollama

stream = ollama.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
    stream=True,
)

# Print each chunk as it arrives
for chunk in stream:
    print(chunk['message']['content'], end='', flush=True)
```
2- Custom Client:
Users can create a custom client to connect to an Ollama server according to their needs, for example when the server runs on another machine or a non-default port.
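A minimal sketch using the library's `Client` class (the host value is an example; adjust it to wherever your Ollama server listens):

```python
from ollama import Client

# Point the client at a specific Ollama server (example host/port).
client = Client(host='http://localhost:11434')
response = client.chat(
    model='llama2',
    messages=[{'role': 'user', 'content': 'Why is the sky blue?'}],
)
print(response['message']['content'])
```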
3- Asynchronous support:
The library can also be used in an asynchronous environment through the `AsyncClient` class. This class allows you to perform other tasks while waiting for communication with the Ollama server.
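A minimal sketch of the asynchronous client (same `llama2` model assumption as above):

```python
import asyncio
from ollama import AsyncClient

async def main():
    # Await the chat call without blocking the event loop.
    message = {'role': 'user', 'content': 'Why is the sky blue?'}
    response = await AsyncClient().chat(model='llama2', messages=[message])
    print(response['message']['content'])

asyncio.run(main())
```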
Using these features, the Ollama Python library supports efficient natural language processing tasks. You can find more information in the Ollama library's GitHub repository provided below.
Install Ollama and dependencies
We create a virtual environment with `python3.11 -m venv ollama`, activate it with `source ollama/bin/activate`, and then install the library with `pip install ollama`.
Installing Necessary Libraries
Before we start, let's install the required libraries. Create a file named `requirements.txt` and write the dependencies below.
```
langchain
streamlit
ollama
```
After that, open the terminal in VS Code and run the command below.
```bash
pip install -r requirements.txt
```
Import Libraries
We import Streamlit, time, and ollama, and then import `StreamlitChatMessageHistory` from LangChain.
```python
import streamlit as st
import time
import ollama
from langchain.memory.chat_message_histories import StreamlitChatMessageHistory
```
We set up the page title and icon for the Streamlit web application.
```python
st.set_page_config(page_title="Ollama Chatbot", page_icon="💬")
```
Let's configure the sidebar. The `st.sidebar` context is used to add widgets and content to the sidebar of the Streamlit app.

We set the sidebar title to "Ollama Chatbot" and add a divider to separate the different sections. Then we create a select box labelled "Choose a model", where users can select one of three options: "Mistral", "Solar", or "Code Llama". The selected option is stored in the variable `selected_model`.

Here's what happens in each branch of the if-elif-else statement:
- If the user selects "Mistral", it sets `llm_model` to `"mistral"` and displays information about the Mistral 7B model.
- If the user selects "Solar", it sets `llm_model` to `"solar"` and displays information about the SOLAR 10.7B model released by Upstage.
- If the user selects "Code Llama", it sets `llm_model` to `"codellama"` and displays information about the Code Llama model for generating and discussing code.
```python
with st.sidebar:
    st.title('💬 Ollama Chatbot')
    st.divider()

    # Select the model
    selected_model = st.selectbox('Choose a model', ['Mistral', 'Solar', 'Code Llama'], key='selected_model')
    if selected_model == "Mistral":
        llm_model = "mistral"
        st.caption("""
        The Mistral 7B model released by Mistral AI.
        Mistral 7B is an Apache-licensed 7.3B-parameter model.
        It is available in both instruct (instruction-following) and text-completion variants.
        """)
    elif selected_model == "Solar":
        llm_model = "solar"  # Ollama model names are lowercase
        st.caption("""
        SOLAR 10.7B is released by Upstage.
        Solar is a 10.7B-parameter open LLM outperforming other LLMs of up to 30B parameters,
        including Mistral 7B. 🤯 Solar achieves an MMLU score of 65.48,
        which is only 4 points lower than Meta's Llama 2 while being 7x smaller.
        """)
    else:
        llm_model = "codellama"
        st.caption("""
        Code Llama is a model for generating and discussing code, built on top of Llama 2.
        It's designed to make workflows faster and more efficient for developers, and to make it easier for people to learn how to code.
        It can generate both code and natural language about code.
        Code Llama supports many of the most popular programming languages used today, including Python, C++, Java, PHP, TypeScript (JavaScript), C#, Bash and more.
        """)
    st.divider()
```
Define a text area widget labelled "System Prompt" where users can input text. Its initial value is "You are a helpful assistant who answers questions in short sentences.", and users can modify this text as needed.
```python
system_prompt = st.text_area(
    label="System Prompt",
    value="You are a helpful assistant who answers questions in short sentences."
)
```
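Note that `system_prompt` is not actually used by the `response()` helper shown later. A minimal sketch of wiring it in, assuming you pass it through Ollama's `system` parameter, could look like this:

```python
# Sketch (assumption): steer the model with the sidebar's system prompt.
response = ollama.generate(
    model=llm_model,
    prompt=prompt,
    system=system_prompt,  # applied on top of the model's default template
)
```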
Next we create a toggle labelled "Activate GPU", allowing users to switch GPU usage on or off. When the toggle is on (`gpu_on` is True), the variable `activate_gpu` is set to 1; when it is off, `activate_gpu` is set to 0. This flag lets the application control GPU usage.
```python
gpu_on = st.toggle('Activate GPU')

# Activate the GPU
if gpu_on:
    activate_gpu = 1
else:
    activate_gpu = 0
```
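The snippet above only sets the flag; it is never passed to the model in the code that follows. One plausible way to wire it in, assuming you map the toggle to Ollama's `num_gpu` option (which counts the number of model layers offloaded to the GPU, so a real app might use a larger value than 1), is:

```python
# Sketch (assumption): forward the toggle to Ollama's runtime options.
# num_gpu is the number of layers offloaded to the GPU (0 = CPU only).
response = ollama.generate(
    model=llm_model,
    prompt=prompt,
    options={'num_gpu': activate_gpu},
)
```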
We use `StreamlitChatMessageHistory` to store messages in the Streamlit session state under the specified `key`; the default key is `"langchain_messages"`. Note that `StreamlitChatMessageHistory` only works when run in a Streamlit app.

If the chat message history is currently empty (no messages have been exchanged), we add an initial AI message: "How may I assist you today?"
```python
# Set up memory
msgs = StreamlitChatMessageHistory(key="langchain_messages")
if len(msgs.messages) == 0:
    msgs.add_ai_message("How may I assist you today?")
```
Next we create a `response()` function that generates a reply from the selected `llm_model`. It uses `ollama.generate()` to produce a response for the given prompt.
```python
def response():
    # Generate a completion for the user's prompt with the selected model
    response = ollama.generate(model=llm_model, prompt=prompt)
    return response['response']
```
Then we display the messages stored in the history object. If a message is from a human, it is shown with a technologist avatar (🧑‍💻); otherwise it is shown with a speech bubble avatar (💬). Finally, we write the content of each message.
```python
for msg in msgs.messages:
    if msg.type == "human":
        with st.chat_message(msg.type, avatar="🧑‍💻"):
            st.write(msg.content)
    else:
        with st.chat_message(msg.type, avatar="💬"):
            st.write(msg.content)
```
Next we define a function to clear the chat history, linked to a "Clear Chat History" button in the sidebar. When the button is clicked, it triggers `clear_chat_history()`, which removes all messages stored in the `msgs` object. The button is styled to fill the width of its container in the sidebar.
```python
def clear_chat_history():
    msgs.clear()

st.sidebar.button('Clear Chat History',
                  on_click=clear_chat_history,
                  use_container_width=True)
```
Now we create an input where users can type messages (`st.chat_input`). When the user enters a message and hits Enter, the walrus operator captures it in `prompt`, and we display it with the human avatar.
```python
if prompt := st.chat_input():
    with st.chat_message("human", avatar="🧑‍💻"):
        st.write(prompt)
```
Next we display the AI-generated response within the chat interface, using the speech bubble avatar (💬) for the assistant's message. While the response is being generated, a spinner shows the text "Thinking…". The response is then revealed character by character, with a vertical cursor (▌) appended to simulate typing.
with st.chat_message("assistant", avatar="š¬"):
with st.spinner("Thinking..."):
response = response()
placeholder = st.empty()
full_response = ''
for item in response:
full_response += item
placeholder.markdown(full_response + "ā")
time.sleep(0.05)
placeholder.markdown(full_response)
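Two caveats about this loop: it only replays a string Ollama has already finished generating, and the new exchange is never written back into `msgs`, so it will vanish from the history on the next rerun. A minimal alternative sketch, assuming you want real token streaming via `stream=True` and persistence via the history object:

```python
    # Sketch (assumptions noted above); replaces the typing loop.
    with st.chat_message("assistant", avatar="💬"):
        placeholder = st.empty()
        full_response = ''
        # Stream tokens from Ollama as they are generated
        for chunk in ollama.generate(model=llm_model, prompt=prompt, stream=True):
            full_response += chunk['response']
            placeholder.markdown(full_response + "▌")
        placeholder.markdown(full_response)

    # Persist the turn so it survives Streamlit reruns
    msgs.add_user_message(prompt)
    msgs.add_ai_message(full_response)
```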
Multimodal Chatbot
Now for the second app. We import Streamlit, ollama, `Image` from PIL for image manipulation, and `BytesIO` for handling image data in memory.
```python
import streamlit as st
import ollama
from PIL import Image
from io import BytesIO
```
Define a list named `available_models` containing the model names to offer. Each item represents a specific model, such as "dolphin-mistral", "openhermes", and "llava".
```python
available_models = [
    'dolphin-mistral:7b-v2.6-fp16',
    'openhermes:7b-mistral-v2.5-q6_K',
    'llava',
]
```
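These names must match models you have already pulled. As an optional sanity check (a sketch, assuming the dict-style return value of `ollama.list()` used by this version of the library), you could warn about models that are not available locally:

```python
# Sketch (assumption): warn if a listed model hasn't been pulled yet.
local = {m['name'] for m in ollama.list()['models']}
missing = [m for m in available_models
           if m not in local and f'{m}:latest' not in local]
if missing:
    st.sidebar.warning(f"Not pulled yet: {', '.join(missing)}")
```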
Create a sidebar select box titled "Select a model:", where users can choose from the models in the `available_models` list, and a text area labelled "Ask a question:", whose input is stored in the variable `user_input`.
```python
model_choice = st.sidebar.selectbox('Select a model:', available_models)
user_input = st.text_area('Ask a question:', '', height=150)
```
Next we use HTML and CSS within a Markdown block to define a style called "dashed-box". The style creates a dashed border in orange (#FFA500), rounded corners (`border-radius`), and padding around the content, plus margins above and below the box. It can be applied to elements in the Streamlit app to draw visually distinctive dashed borders around them. The `unsafe_allow_html=True` parameter tells Streamlit to render the raw HTML instead of escaping it.
st.markdown("""
<style>
.dashed-box {
border: 2px dashed #FFA500;
border-radius: 10px;
padding: 10px;
margin: 10px 0;
}
</style>
""", unsafe_allow_html=True)
We create a file uploader that asks the user to upload an image file, displaying the text "Upload an image". Users can only upload files with the extensions "jpg", "png", or "jpeg" because of the specified file types (`type=['jpg', 'png', 'jpeg']`).
```python
uploaded_file = st.file_uploader("Upload an image", type=['jpg', 'png', 'jpeg'])
```
We check whether an image file has been uploaded (`uploaded_file`). If so, we open the image with the Python Imaging Library (PIL) and assign it to the variable `img`. If the image is in RGBA mode (i.e. it has an alpha channel for transparency), we convert it to RGB so it can later be saved as a JPEG.
```python
if uploaded_file:
    img = Image.open(uploaded_file)

    # Convert RGBA image to RGB (JPEG has no alpha channel)
    if img.mode == 'RGBA':
        img = img.convert('RGB')

    st.image(img, caption='Uploaded image', use_column_width=True)
```
Then we check whether the "Submit" button has been clicked. If it has, we initialize a list named `messages` containing a dictionary with the user's input.

If an image was uploaded, it is converted into bytes and attached to the user's message: the `images` key of the first message dictionary (`messages[0]`) is set to a list containing the image bytes (`[img_bytes]`). This links the image data to the user's input message.
```python
if st.button('Submit'):
    messages = [{'role': 'user', 'content': user_input}]

    if uploaded_file:
        with BytesIO() as buffer:
            img.save(buffer, 'jpeg')
            img_bytes = buffer.getvalue()
        messages[0]['images'] = [img_bytes]
```
We then use the Ollama Python library to talk to the selected model: we pass the prepared `messages` list and the chosen model (`model_choice`) to the `chat` function. The `stream=True` parameter indicates that the response should be streamed, meaning it arrives in chunks rather than all at once.
```python
    # Still inside the Submit branch
    response = ollama.chat(
        model=model_choice,
        messages=messages,
        stream=True
    )

    final_response = ''
    for chunk in response:
        if 'content' in chunk.get('message', {}):
            final_response += chunk['message']['content']
```
Then we wrap the final response in the HTML `div` element styled earlier, creating a visually distinct box with a dashed border and padding around the text.
```python
    st.markdown(f'<div class="dashed-box">{final_response}</div>', unsafe_allow_html=True)
```
Conclusion:
The Ollama Python library is the go-to option if you need an easy-to-use tool for running LLMs locally with efficiency and precision. It represents a significant advancement in the open-source AI community and offers robust solutions for a wide range of user requirements.
Reference: