如何结合 Ollama 和 DeepSeek-R1 创建一个本地聊天机器人

国外 AI 技术达人 Mervin Praison (Mervin 个人技术网站）分享了一个使用 Ollama 和 DeepSeek-R1 创建本地 AI 聊天机器人的方法。通过这一方案，无需联网即可与 DeepSeek-R1 机器人进行对话，让它为你撰写各类文章，同时确保隐私信息的安全性。此外，这个示例还为大家提供了一个学习如何使用 Python 开发大语言模型应用的实践机会。

完整源代码

安装聊天机器人用到的 Python 库文件

pip install -U ollama chainlit streamlit gradio

聊天机器人主程序

import ollama

# Create streaming completion
completion = ollama.chat(
    model="deepseek-r1:latest",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why sky is blue?"}
    ],
)

# Access message content directly from response
response = completion['message']['content']

print(response)

Streaming（Python 3.5 加入的标准库，是对序列操作的一种抽象和延迟计算的方式）

import ollama

# Create streaming completion
completion = ollama.chat(
    model="deepseek-r1:latest",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Why sky is blue?"}
    ],
    stream=True  # Enable streaming
)

# Print the response as it comes in
for chunk in completion:
    if 'message' in chunk and 'content' in chunk['message']:
        content = chunk['message']['content']
        print(content, end='', flush=True)

Gradio（一个用于创建机器学习模型交互式界面的Python 库）

import ollama
import gradio as gr

def chat_with_ollama(message, history):
    # Initialize empty string for streaming response
    response = ""
    
    # Convert history to messages format
    messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]
    
    # Add history messages
    for h in history:
        messages.append({"role": "user", "content": h[0]})
        if h[1]:  # Only add assistant message if it exists
            messages.append({"role": "assistant", "content": h[1]})
    
    # Add current message
    messages.append({"role": "user", "content": message})
    
    completion = ollama.chat(
        model="deepseek-r1:latest",
        messages=messages,
        stream=True  # Enable streaming
    )
    
    # Stream the response
    for chunk in completion:
        if 'message' in chunk and 'content' in chunk['message']:
            content = chunk['message']['content']
            # Handle  and  tags
            content = content.replace("", "Thinking...").replace("", "\n\n Answer:")
            response += content
            yield response

# Create Gradio interface with Chatbot
with gr.Blocks() as demo:
    chatbot = gr.Chatbot()
    msg = gr.Textbox(placeholder="Enter your message here...")
    clear = gr.Button("Clear")

    def user(user_message, history):
        return "", history + [[user_message, None]]

    def bot(history):
        history[-1][1] = ""
        for chunk in chat_with_ollama(history[-1][0], history[:-1]):
            history[-1][1] = chunk
            yield history

    msg.submit(user, [msg, chatbot], [msg, chatbot], queue=False).then(
        bot, chatbot, chatbot
    )
    clear.click(lambda: None, None, chatbot, queue=False)

if __name__ == "__main__":
    demo.launch()
Streamlit（一个专门针对机器学习和数据科学团队的应用开发框架）
import streamlit as st
import ollama

# Set page title
st.title("Chat with Ollama")

# Initialize chat history in session state if it doesn't exist
if "messages" not in st.session_state:
    st.session_state.messages = [
        {"role": "system", "content": "You are a helpful assistant."}
    ]

# Display chat input
user_input = st.chat_input("Your message:")

# Display chat history and handle new inputs
for message in st.session_state.messages:
    if message["role"] != "system":
        with st.chat_message(message["role"]):
            st.write(message["content"])

if user_input:
    # Display user message
    with st.chat_message("user"):
        st.write(user_input)
    
    # Add user message to history
    st.session_state.messages.append({"role": "user", "content": user_input})
    
    # Get streaming response
    with st.chat_message("assistant"):
        message_placeholder = st.empty()
        full_response = ""
        
        completion = ollama.chat(
            model="deepseek-r1:latest",
            messages=st.session_state.messages,
            stream=True
        )
        
        # Process the streaming response
        for chunk in completion:
            if 'message' in chunk and 'content' in chunk['message']:
                content = chunk['message']['content']
                full_response += content
                message_placeholder.write(full_response + "▌")
        
        message_placeholder.write(full_response)
    
    # Add assistant response to history
    st.session_state.messages.append({"role": "assistant", "content": full_response})

Chainlit（一个开源的异步 Python 框架）
import chainlit as cl
import ollama
import json

@cl.on_message
async def main(message: cl.Message):
    # Create a message dictionary instead of using Message objects directly
    messages = [{'role': 'user', 'content': str(message.content)}]
    
    # Create a message first
    msg = cl.Message(content="")
    await msg.send()

    # Create a stream with ollama
    stream = ollama.chat(
        model='deepseek-r1:latest',  # Use a model you have installed
        messages=messages,
        stream=True,
    )

    # Stream the response token by token
    for chunk in stream:
        if token := chunk['message']['content']:
            await msg.stream_token(token)
    
    # Update the message one final time
    await msg.update()

@cl.on_chat_start
async def start():
    await cl.Message(content="Hello! How can I help you today?").send()

完整源代码

安装聊天机器人用到的 Python 库文件

聊天机器人主程序

Streaming（Python 3.5 加入的标准库，是对序列操作的一种抽象和延迟计算的方式）

Gradio（一个用于创建机器学习模型交互式界面的Python 库）

Streamlit（一个专门针对机器学习和数据科学团队的应用开发框架）

Chainlit（一个开源的异步 Python 框架）

相关文章

氛围编程的下半场：你的 AI 编程 Agent 已经不需要你的电脑了

Mistral 把编程 Agent 扔进云里

写规格说明，不写代码：Logic 如何用 500 字 spec 把模型指令遵循率拉高 6 个点

评论区