<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>AI on Mini Fish</title>
    <link>https://blog.minifish.org/tags/ai/</link>
    <description>Recent content in AI on Mini Fish</description>
    <image>
      <title>Mini Fish</title>
      <url>https://blog.minifish.org/android-chrome-512x512.png</url>
      <link>https://blog.minifish.org/android-chrome-512x512.png</link>
    </image>
    <generator>Hugo -- 0.154.5</generator>
    <language>en-US</language>
    <copyright>Mini Fish 2014-present. Licensed under CC-BY-NC</copyright>
    <lastBuildDate>Mon, 19 Jan 2026 10:00:00 +0800</lastBuildDate>
    <atom:link href="https://blog.minifish.org/tags/ai/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>From AI Conversations to Published Blog: The MCP-Powered Publishing Revolution</title>
      <link>https://blog.minifish.org/posts/ai-mcp-blog-publishing-workflow/</link>
      <pubDate>Mon, 19 Jan 2026 10:00:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/ai-mcp-blog-publishing-workflow/</guid>
      <description>&lt;h2 id=&#34;the-problem-lost-context-lost-thoughts&#34;&gt;The Problem: Lost Context, Lost Thoughts&lt;/h2&gt;
&lt;p&gt;We&amp;rsquo;ve all been there. You&amp;rsquo;re deep in a technical discussion with an AI assistant—analyzing code, exploring architecture, or debugging a complex issue. The conversation is rich with insights, and you think: &amp;ldquo;This would make a great blog post.&amp;rdquo;&lt;/p&gt;
&lt;p&gt;But then reality hits: you need to switch to your blog repository, format the content, commit it, push it, and wait for the build. By the time you&amp;rsquo;re back, the original context is gone, and the momentum is lost.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="the-problem-lost-context-lost-thoughts">The Problem: Lost Context, Lost Thoughts</h2>
<p>We&rsquo;ve all been there. You&rsquo;re deep in a technical discussion with an AI assistant—analyzing code, exploring architecture, or debugging a complex issue. The conversation is rich with insights, and you think: &ldquo;This would make a great blog post.&rdquo;</p>
<p>But then reality hits: you need to switch to your blog repository, format the content, commit it, push it, and wait for the build. By the time you&rsquo;re back, the original context is gone, and the momentum is lost.</p>
<p><strong>What if you could publish directly from where you are?</strong></p>
<h2 id="building-on-existing-automation">Building on Existing Automation</h2>
<p>In <a href="/posts/github-action">my previous post about automatically publishing a blog using GitHub Actions</a>, I set up a workflow where pushing to the blog repository triggers an automatic build and deployment to GitHub Pages. This solved the build and deployment automation, but there was still one manual step remaining: creating the post file itself.</p>
<p>The workflow I described there handles:</p>
<ol>
<li>Checking out the blog repository</li>
<li>Building the Hugo site with <code>make</code></li>
<li>Deploying to <code>jackysp.github.io</code></li>
</ol>
<p>But you still needed to be in the blog repository to create the post. That&rsquo;s where MCP changes everything.</p>
<h2 id="enter-mcp-the-missing-link">Enter MCP: The Missing Link</h2>
<p>The <a href="https://modelcontextprotocol.io/">Model Context Protocol (MCP)</a> is revolutionizing how AI agents interact with external systems. Instead of a passive tool that only answers questions, MCP turns the AI into an autonomous agent with direct access to your tools and workflows.</p>
<p>In my setup, I&rsquo;ve connected MCP-enabled agents (like Cursor) directly to my blog repository via GitHub MCP. This means:</p>
<ul>
<li><strong>No context switching</strong>: Stay in your current working directory, whether it&rsquo;s a random project folder or a deep codebase exploration</li>
<li><strong>Preserve conversation flow</strong>: The AI maintains the full context of your discussion</li>
<li><strong>Direct publishing</strong>: Create and publish posts without leaving your IDE</li>
</ul>
<h2 id="the-architecture-seamless-integration">The Architecture: Seamless Integration</h2>
<p>Here&rsquo;s how the complete workflow operates:</p>
<pre tabindex="0"><code>┌─────────────────────────────────────────────────────────┐
│  AI Agent (Cursor/Claude) with MCP enabled              │
│  - Context: Any code repository or discussion            │
│  - Tool: GitHub MCP Server                               │
└──────────────────┬──────────────────────────────────────┘
                   │
                   │ Creates post via GitHub MCP
                   │
                   ▼
┌─────────────────────────────────────────────────────────┐
│  Blog Repository (jackysp/blog)                         │
│  - content/posts/[new-post].md                          │
│  - Commit: &#34;Publish: [title]&#34;                           │
└──────────────────┬──────────────────────────────────────┘
                   │
                   │ Push to master branch
                   │
                   ▼
┌─────────────────────────────────────────────────────────┐
│  GitHub Actions (from previous post)                     │
│  - Build: Hugo static site generation                   │
│  - Deploy: Push to jackysp.github.io                    │
└──────────────────┬──────────────────────────────────────┘
                   │
                   │ Published
                   │
                   ▼
┌─────────────────────────────────────────────────────────┐
│  Live Site (jackysp.github.io)                          │
│  - Post is live and accessible                           │
└─────────────────────────────────────────────────────────┘
</code></pre><p>The GitHub Actions part remains exactly as described in the previous post—no changes needed there. The MCP layer adds the ability to trigger it from anywhere.</p>
<h2 id="the-workflow-in-action">The Workflow in Action</h2>
<h3 id="1-ai-powered-content-creation">1. AI-Powered Content Creation</h3>
<p>When you&rsquo;re discussing a technical topic with an AI agent, you can simply ask:</p>
<blockquote>
<p>&ldquo;Turn this discussion into a blog post and publish it.&rdquo;</p>
</blockquote>
<p>The AI agent, with access to GitHub via MCP, can:</p>
<ul>
<li>Extract key insights from your conversation</li>
<li>Format content according to Hugo front matter requirements</li>
<li>Create properly structured markdown files</li>
<li>Handle images and assets</li>
<li>Commit and push to the repository</li>
</ul>
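<p>As a concrete illustration, this is the shape of the file the agent ends up committing under <code>content/posts/</code> (the front matter fields follow Hugo conventions; the values shown here are invented for the example):</p>
<pre tabindex="0"><code>---
title: "From AI Conversations to Published Blog"
date: 2026-01-19T10:00:00+08:00
tags: ["ai", "mcp", "automation"]
slug: "ai-mcp-blog-publishing-workflow"
summary: "Publishing blog posts directly from AI conversations via GitHub MCP."
draft: false
---

Post body in Markdown goes here.
</code></pre>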
<h3 id="2-automated-build--deploy">2. Automated Build &amp; Deploy</h3>
<p>The moment a post is pushed to the <code>master</code> branch, the same GitHub Actions workflow from the previous post kicks in:</p>
<div class="highlight"><pre tabindex="0" style="color:#f8f8f2;background-color:#272822;-moz-tab-size:4;-o-tab-size:4;tab-size:4;"><code class="language-yaml" data-lang="yaml"><span style="display:flex;"><span><span style="color:#f92672">on</span>:
</span></span><span style="display:flex;"><span>  <span style="color:#f92672">push</span>:
</span></span><span style="display:flex;"><span>    <span style="color:#f92672">branches</span>: [ <span style="color:#ae81ff">master ]</span>
</span></span></code></pre></div><p>The workflow (as detailed in <a href="/posts/github-action">the previous post</a>):</p>
<ol>
<li>Checks out the blog repository with submodules</li>
<li>Builds the Hugo site using <code>make</code></li>
<li>Deploys the built artifacts to <code>jackysp.github.io</code></li>
</ol>
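<p>Condensed to its essentials, that workflow file looks roughly like this (a simplified sketch of the setup from the previous post; treat the exact action versions and step names as placeholders):</p>
<pre tabindex="0"><code>name: Build and Deploy Blog
on:
  push:
    branches: [ master ]
jobs:
  build-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          submodules: true   # theme lives in a submodule
      - run: make            # generates the Hugo site
      # a final step pushes the built artifacts to jackysp.github.io
</code></pre>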
<p>All without manual intervention.</p>
<h3 id="3-governance-through-contracts">3. Governance Through Contracts</h3>
<p>To ensure quality and prevent accidents, I&rsquo;ve implemented an <strong>AI Publishing Contract</strong> (<code>PUBLISHING.md</code>) that defines:</p>
<ul>
<li><strong>Allowed paths</strong>: Only <code>content/**</code> and <code>static/**</code> can be modified</li>
<li><strong>Post format</strong>: Required front matter fields (title, date, tags, slug, summary)</li>
<li><strong>Image handling</strong>: Standardized location and reference format</li>
<li><strong>Commit conventions</strong>: Single commit per post with descriptive messages</li>
</ul>
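<p>A contract like this doesn&rsquo;t need to be elaborate; a short Markdown file the agent reads before acting is enough. A minimal sketch (the specific rules here are illustrative, not the full contract):</p>
<pre tabindex="0"><code># AI Publishing Contract

- Only modify files under `content/**` and `static/**`.
- Every post needs front matter with: title, date, tags, slug, summary.
- Images go in `content/posts/images/` and are referenced relatively.
- One commit per post, message format: "Publish: [title]".
</code></pre>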
<p>This contract ensures that AI agents can publish content while respecting the repository structure and quality standards.</p>
<h2 id="why-this-matters-the-developer-experience-revolution">Why This Matters: The Developer Experience Revolution</h2>
<h3 id="zero-context-switching">Zero Context Switching</h3>
<p>Traditional workflow:</p>
<ol>
<li>Copy conversation → Switch to blog repo → Format → Commit → Push → Wait</li>
<li><strong>Context lost</strong>, momentum broken</li>
</ol>
<p>New workflow:</p>
<ol>
<li>Ask AI to publish → Done</li>
<li><strong>Context preserved</strong>, workflow continuous</li>
</ol>
<h3 id="capturing-technical-insights">Capturing Technical Insights</h3>
<p>The best technical insights often emerge during active problem-solving. With this workflow, you can:</p>
<ul>
<li>Document discoveries in real-time</li>
<li>Turn debugging sessions into tutorials</li>
<li>Transform architecture discussions into deep-dives</li>
<li>Share codebase explorations as learning resources</li>
</ul>
<h3 id="scaling-knowledge-sharing">Scaling Knowledge Sharing</h3>
<p>Previously, the friction of publishing meant many valuable insights were never written down. Now, the barrier to publishing is minimal, making it easier to:</p>
<ul>
<li>Share learnings with your team</li>
<li>Build a personal knowledge base</li>
<li>Contribute to the developer community</li>
<li>Document your problem-solving journey</li>
</ul>
<h2 id="technical-implementation-details">Technical Implementation Details</h2>
<h3 id="mcp-server-configuration">MCP Server Configuration</h3>
<p>The GitHub MCP server provides the AI agent with:</p>
<ul>
<li>Repository read/write access</li>
<li>File creation and modification</li>
<li>Commit and push capabilities</li>
<li>Branch management</li>
</ul>
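<p>How this gets wired up depends on the client; in Cursor-style clients it is typically a small JSON file that declares the server and a token. A sketch (the file location, server command, and variable names vary by client and server version, so treat these as assumptions):</p>
<pre tabindex="0"><code>{
  "mcpServers": {
    "github": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-github"],
      "env": {
        "GITHUB_PERSONAL_ACCESS_TOKEN": "&lt;your token&gt;"
      }
    }
  }
}
</code></pre>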
<h3 id="github-actions-workflow">GitHub Actions Workflow</h3>
<p>The CI/CD pipeline (as described in <a href="/posts/github-action">the previous post</a>) handles:</p>
<ul>
<li>Go environment setup (for Hugo builds)</li>
<li>Repository checkout with submodules</li>
<li>Site generation via <code>make</code></li>
<li>Deployment to GitHub Pages repository</li>
</ul>
<p>No changes needed to the existing workflow—it just gets triggered from a new entry point.</p>
<h3 id="hugo-site-configuration">Hugo Site Configuration</h3>
<p>Posts follow Hugo&rsquo;s standard structure:</p>
<ul>
<li><strong>Location</strong>: <code>content/posts/</code></li>
<li><strong>Format</strong>: YAML front matter + Markdown content</li>
<li><strong>Images</strong>: Stored in <code>content/posts/images/</code></li>
<li><strong>Draft control</strong>: <code>draft: true/false</code> for preview/publish</li>
</ul>
<h2 id="the-future-ai-augmented-documentation">The Future: AI-Augmented Documentation</h2>
<p>This workflow represents a shift toward <strong>AI-augmented documentation</strong>. Instead of treating AI as a writing assistant, we&rsquo;re treating it as a publishing agent that can:</p>
<ul>
<li>Understand context from code discussions</li>
<li>Extract technical insights automatically</li>
<li>Format and structure content appropriately</li>
<li>Publish without breaking workflow</li>
</ul>
<p>As MCP and similar protocols mature, we&rsquo;ll see more sophisticated capabilities:</p>
<ul>
<li>Automatic code analysis and explanation</li>
<li>Multi-post series generation from extended discussions</li>
<li>Cross-referencing with existing content</li>
<li>SEO and metadata optimization</li>
</ul>
<h2 id="getting-started">Getting Started</h2>
<p>If you want to set up a similar workflow:</p>
<ol>
<li><strong>Set up automated publishing</strong> (see <a href="/posts/github-action">my previous post</a>)</li>
<li><strong>Enable MCP in your AI agent</strong> (Cursor, Claude Desktop, etc.)</li>
<li><strong>Configure GitHub MCP server</strong> with repository access</li>
<li><strong>Define publishing contracts</strong> for governance</li>
<li><strong>Start publishing</strong> from your conversations</li>
</ol>
<p>The technical details are straightforward, but the impact on productivity and knowledge capture is profound.</p>
<h2 id="conclusion">Conclusion</h2>
<p>The intersection of AI agents, MCP protocols, and automated CI/CD creates a new paradigm for technical publishing. By building on the existing GitHub Actions automation and adding MCP as the entry point, we eliminate context switching and reduce friction.</p>
<p>This isn&rsquo;t just about automating blog posts—it&rsquo;s about <strong>preserving the flow state of technical discovery</strong> and making knowledge sharing as natural as having a conversation.</p>
<p>The future of technical documentation is here, and it&rsquo;s conversational.</p>
<hr>
<p><em>This post was created and published using the exact workflow described above—from a discussion about workflow automation to a live blog post, all without leaving the conversation context.</em></p>
]]></content:encoded>
    </item>
    <item>
      <title>Harnessing AI to Create High-Quality Podcasts Quickly and for Free</title>
      <link>https://blog.minifish.org/posts/harnessing-ai-to-create-high-quality-podcasts-quickly-and-for-free/</link>
      <pubDate>Wed, 11 Dec 2024 17:11:00 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/harnessing-ai-to-create-high-quality-podcasts-quickly-and-for-free/</guid>
      <description>&lt;h2 id=&#34;introduction&#34;&gt;Introduction&lt;/h2&gt;
&lt;p&gt;As a long-time podcaster, I&amp;rsquo;ve always enjoyed sharing my thoughts and ideas through audio. While the world of video content—and the role of a YouTuber—has its allure, the complexities of video editing have kept me anchored in the realm of podcasting. My journey has involved leveraging platforms like &lt;a href=&#34;https://creators.spotify.com/&#34;&gt;&lt;strong&gt;Spotify Creator&lt;/strong&gt;&lt;/a&gt; (formerly &lt;strong&gt;Anchor&lt;/strong&gt;) for hosting and distributing my recordings. This platform offers a wide array of features for free, including audio recording, editing capabilities, and automatic promotion to Spotify.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<h2 id="introduction">Introduction</h2>
<p>As a long-time podcaster, I&rsquo;ve always enjoyed sharing my thoughts and ideas through audio. While the world of video content—and the role of a YouTuber—has its allure, the complexities of video editing have kept me anchored in the realm of podcasting. My journey has involved leveraging platforms like <a href="https://creators.spotify.com/"><strong>Spotify Creator</strong></a> (formerly <strong>Anchor</strong>) for hosting and distributing my recordings. This platform offers a wide array of features for free, including audio recording, editing capabilities, and automatic promotion to Spotify.</p>
<p>However, I sought a more comprehensive solution, one that would allow me to listen to my own podcast while driving, using the Podcast app on my CarPlay device. To achieve this, I ventured into publishing on <strong>Apple Podcasts</strong> (<a href="https://podcastsconnect.apple.com/">podcastsconnect.apple.com</a>), which also offers free hosting. With a self-designed cover and episodes uploaded, I was set—or so I thought.</p>
<h2 id="the-challenges-of-traditional-podcasting">The Challenges of Traditional Podcasting</h2>
<p>Despite having the technical setup, I faced significant challenges:</p>
<ul>
<li><strong>Consistency</strong>: Maintaining a regular publishing schedule proved difficult.</li>
<li><strong>Voice Quality</strong>: My voice quality was inconsistent, affecting listener engagement.</li>
<li><strong>Content Preparation</strong>: Crafting well-structured episodes without improvisation was challenging.</li>
<li><strong>Enhancements</strong>: Incorporating background music and other audio elements to enrich the listening experience required additional effort.</li>
</ul>
<p>These hurdles led to my podcast being suspended for approximately two years. I found myself in need of a solution that could simplify the process and revitalize my passion for podcasting.</p>
<h2 id="discovering-notebooklm-an-ai-powered-podcasting-tool">Discovering NotebookLM: An AI-Powered Podcasting Tool</h2>
<p>Recently, I stumbled upon <strong>NotebookLM</strong> (<a href="https://notebooklm.google.com/">notebooklm.google.com</a>), an innovative application developed by Google. NotebookLM harnesses the power of artificial intelligence to generate podcast content. Users can provide a topic and related documents, and the AI takes over, creating engaging podcast episodes.</p>
<h3 id="my-experience-with-notebooklm">My Experience with NotebookLM</h3>
<p>Intrigued, I decided to give NotebookLM a try. The results were nothing short of astounding:</p>
<ul>
<li><strong>Effortless Production</strong>: The AI effortlessly generated a half-hour episode featuring two speakers discussing the topics in English.</li>
<li><strong>Enhanced Content</strong>: It went beyond the provided information, utilizing search engines to gather additional relevant data from the internet.</li>
<li><strong>Quality Output</strong>: The quality of the generated content was exceptionally high, surpassing what I could produce on my own.</li>
<li><strong>Incorporated Music</strong>: Appropriate background music was added, enhancing the overall listening experience.</li>
<li><strong>Cost-Free</strong>: All these features were available entirely for free.</li>
</ul>
<h2 id="a-case-study-deep-dive-into-tidb">A Case Study: Deep Dive into TiDB</h2>
<p>To put NotebookLM to the test, I created an episode about <strong>TiDB</strong>, a product developed by my current employer. The process was seamless, and the final product was impressive. You can listen to the episode here: <a href="https://podcasts.apple.com/us/podcast/deep-dive-into-tidb/id1609444337?i=1000679181770">Deep Dive into TiDB</a>.</p>
<h2 id="conclusion">Conclusion</h2>
<p>The integration of AI into podcast creation through tools like NotebookLM has the potential to revolutionize the way we produce content. It removes many of the barriers that podcasters face, such as time constraints, technical challenges, and the need for consistent quality.</p>
<p>For anyone looking to start or rejuvenate their podcast without the traditional hassles, I highly recommend giving NotebookLM a try. It&rsquo;s remarkable to see how AI can not only match but enhance human capabilities in creative endeavors.</p>
]]></content:encoded>
    </item>
    <item>
      <title>Exploring Local LLMs with Ollama: My Journey and Practices</title>
      <link>https://blog.minifish.org/posts/exploring-local-llms-with-ollama-my-journey-and-practices/</link>
      <pubDate>Wed, 27 Nov 2024 18:26:14 +0800</pubDate>
      <guid>https://blog.minifish.org/posts/exploring-local-llms-with-ollama-my-journey-and-practices/</guid>
      <description>&lt;p&gt;Local Large Language Models (LLMs) have been gaining traction as developers and enthusiasts seek more control over their AI tools without relying solely on cloud-based solutions. In this blog post, I&amp;rsquo;ll share my experiences with &lt;strong&gt;Ollama&lt;/strong&gt;, a remarkable tool for running local LLMs, along with other tools like &lt;strong&gt;llamaindex&lt;/strong&gt; and &lt;strong&gt;Candle&lt;/strong&gt;. I&amp;rsquo;ll also discuss various user interfaces (UI) that enhance the local LLM experience.&lt;/p&gt;
&lt;hr&gt;
&lt;h2 id=&#34;table-of-contents&#34;&gt;Table of Contents&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#introduction-to-ollama&#34;&gt;Introduction to Ollama&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#a-popular-choice&#34;&gt;A Popular Choice&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#ease-of-use&#34;&gt;Ease of Use&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#built-with-golang&#34;&gt;Built with Golang&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#my-practices-with-ollama&#34;&gt;My Practices with Ollama&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#preferred-models&#34;&gt;Preferred Models&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#llama-31&#34;&gt;Llama 3.1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#mistral&#34;&gt;Mistral&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#phi-3&#34;&gt;Phi-3&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#qwen-2&#34;&gt;Qwen-2&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#hardware-constraints&#34;&gt;Hardware Constraints&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#exploring-uis-for-ollama&#34;&gt;Exploring UIs for Ollama&lt;/a&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;#openwebui&#34;&gt;OpenWebUI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#page-assist&#34;&gt;Page Assist&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#enchanted&#34;&gt;Enchanted&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#anythingllm&#34;&gt;AnythingLLM&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#dify&#34;&gt;Dify&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#diving-into-llamaindex&#34;&gt;Diving into llamaindex&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#experimenting-with-candle&#34;&gt;Experimenting with Candle&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;#conclusion&#34;&gt;Conclusion&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;hr&gt;
&lt;h2 id=&#34;introduction-to-ollama&#34;&gt;Introduction to Ollama&lt;/h2&gt;
&lt;h3 id=&#34;a-popular-choice&#34;&gt;A Popular Choice&lt;/h3&gt;
&lt;p&gt;&lt;a href=&#34;https://github.com/jmorganca/ollama&#34;&gt;Ollama&lt;/a&gt; has rapidly become a favorite among developers interested in local LLMs. Within a year, it has garnered significant attention on GitHub, reflecting its growing user base and community support.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>Local Large Language Models (LLMs) have been gaining traction as developers and enthusiasts seek more control over their AI tools without relying solely on cloud-based solutions. In this blog post, I&rsquo;ll share my experiences with <strong>Ollama</strong>, a remarkable tool for running local LLMs, along with other tools like <strong>llamaindex</strong> and <strong>Candle</strong>. I&rsquo;ll also discuss various user interfaces (UI) that enhance the local LLM experience.</p>
<hr>
<h2 id="table-of-contents">Table of Contents</h2>
<ul>
<li><a href="#introduction-to-ollama">Introduction to Ollama</a>
<ul>
<li><a href="#a-popular-choice">A Popular Choice</a></li>
<li><a href="#ease-of-use">Ease of Use</a></li>
<li><a href="#built-with-golang">Built with Golang</a></li>
</ul>
</li>
<li><a href="#my-practices-with-ollama">My Practices with Ollama</a>
<ul>
<li><a href="#preferred-models">Preferred Models</a>
<ul>
<li><a href="#llama-31">Llama 3.1</a></li>
<li><a href="#mistral">Mistral</a></li>
<li><a href="#phi-3">Phi-3</a></li>
<li><a href="#qwen-2">Qwen-2</a></li>
</ul>
</li>
<li><a href="#hardware-constraints">Hardware Constraints</a></li>
</ul>
</li>
<li><a href="#exploring-uis-for-ollama">Exploring UIs for Ollama</a>
<ul>
<li><a href="#openwebui">OpenWebUI</a></li>
<li><a href="#page-assist">Page Assist</a></li>
<li><a href="#enchanted">Enchanted</a></li>
<li><a href="#anythingllm">AnythingLLM</a></li>
<li><a href="#dify">Dify</a></li>
</ul>
</li>
<li><a href="#diving-into-llamaindex">Diving into llamaindex</a></li>
<li><a href="#experimenting-with-candle">Experimenting with Candle</a></li>
<li><a href="#conclusion">Conclusion</a></li>
</ul>
<hr>
<h2 id="introduction-to-ollama">Introduction to Ollama</h2>
<h3 id="a-popular-choice">A Popular Choice</h3>
<p><a href="https://github.com/jmorganca/ollama">Ollama</a> has rapidly become a favorite among developers interested in local LLMs. Within a year, it has garnered significant attention on GitHub, reflecting its growing user base and community support.</p>
<h3 id="ease-of-use">Ease of Use</h3>
<p>One of Ollama&rsquo;s standout features is its simplicity. It&rsquo;s as easy to use as Docker, making it accessible even to those who may not be deeply familiar with machine learning frameworks. The straightforward command-line interface allows users to download and run models with minimal setup.</p>
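<p>To give a sense of that simplicity, a typical session is just two commands (model tags follow the naming in Ollama&rsquo;s model library):</p>
<pre tabindex="0"><code>ollama pull llama3.1    # download the model, Docker-style
ollama run llama3.1     # start an interactive chat in the terminal
</code></pre>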
<h3 id="built-with-golang">Built with Golang</h3>
<p>Ollama is written in <strong>Golang</strong>, ensuring performance and efficiency. Golang&rsquo;s concurrency features contribute to Ollama&rsquo;s ability to handle tasks effectively, which is crucial when working with resource-intensive LLMs.</p>
<h2 id="my-practices-with-ollama">My Practices with Ollama</h2>
<h3 id="preferred-models">Preferred Models</h3>
<h4 id="llama-31">Llama 3.1</h4>
<p>I&rsquo;ve found that <strong>Llama 3.1</strong> works exceptionally well with Ollama. It&rsquo;s my go-to choice due to its performance and compatibility.</p>
<h4 id="mistral">Mistral</h4>
<p>While <strong>Mistral</strong> also performs well, it hasn&rsquo;t gained as much popularity as Llama. Nevertheless, it&rsquo;s a solid option worth exploring.</p>
<h4 id="phi-3">Phi-3</h4>
<p>Developed by Microsoft, <strong>Phi-3</strong> is both fast and efficient. The 3.8B-parameter mini model strikes a balance between size and performance, making it one of the best small-sized LLMs available.</p>
<h4 id="qwen-2">Qwen-2</h4>
<p>Despite its impressive benchmarks, <strong>Qwen-2</strong> didn&rsquo;t meet my expectations in practice. It might work well in certain contexts, but it didn&rsquo;t suit my specific needs.</p>
<h3 id="hardware-constraints">Hardware Constraints</h3>
<p>Running large models on hardware with limited resources can be challenging. On my 16GB MacBook, models around <strong>7B to 8B parameters</strong> are the upper limit. Attempting to run larger models results in performance issues.</p>
<h2 id="exploring-uis-for-ollama">Exploring UIs for Ollama</h2>
<p>Enhancing the user experience with UIs can make interacting with local LLMs more intuitive. Here&rsquo;s a look at some UIs I&rsquo;ve tried:</p>
<h3 id="openwebui">OpenWebUI</h3>
<p><a href="https://github.com/OpenWebUI/OpenWebUI">OpenWebUI</a> offers a smooth, user-friendly web interface for Ollama. It requires Docker to run efficiently, which might be a barrier for some users.</p>
<ul>
<li><strong>Features</strong>:
<ul>
<li>Basic Retrieval-Augmented Generation (RAG) capabilities.</li>
<li>Connection to OpenAI APIs.</li>
</ul>
</li>
</ul>
<h3 id="page-assist">Page Assist</h3>
<p><a href="https://chrome.google.com/webstore/detail/page-assist/">Page Assist</a> is a Chrome extension that I&rsquo;ve chosen for its simplicity and convenience.</p>
<ul>
<li><strong>Advantages</strong>:
<ul>
<li>No requirement for Docker.</li>
<li>Accesses the current browser page as input, enabling context-aware interactions.</li>
</ul>
</li>
</ul>
<h3 id="enchanted">Enchanted</h3>
<p><a href="https://apps.apple.com/app/enchanted-ai-assistant/id">Enchanted</a> is unique as it provides an iOS UI for local LLMs with support for Ollama.</p>
<ul>
<li><strong>Usage</strong>:
<ul>
<li>By using <strong>Tailscale</strong>, I can connect it to Ollama running on my MacBook.</li>
<li>Serves as an alternative to Apple’s native intelligence features.</li>
</ul>
</li>
</ul>
<h3 id="anythingllm">AnythingLLM</h3>
<p><a href="https://github.com/Mintplex-Labs/anything-llm">AnythingLLM</a> offers enhanced RAG capabilities. However, in my experience, it hasn&rsquo;t performed consistently well enough for regular use.</p>
<h3 id="dify">Dify</h3>
<p><a href="https://github.com/langgenius/dify">Dify</a> is a powerful and feature-rich option.</p>
<ul>
<li><strong>Pros</strong>:
<ul>
<li>Easy to set up with an extensive feature set.</li>
</ul>
</li>
<li><strong>Cons</strong>:
<ul>
<li>Resource-intensive, requiring Docker and running multiple containers like Redis and PostgreSQL.</li>
</ul>
</li>
</ul>
<h2 id="diving-into-llamaindex">Diving into llamaindex</h2>
<p><a href="https://github.com/jerryjliu/llama_index">llamaindex</a> is geared towards developers who are comfortable writing code. While it offers robust functionalities, it does have a learning curve.</p>
<ul>
<li><strong>Observations</strong>:
<ul>
<li>Documentation is somewhat limited, often necessitating diving into the source code.</li>
<li>The <code>llamaindex-cli</code> tool aims to simplify getting started but isn&rsquo;t entirely stable.
<ul>
<li>Works seamlessly with OpenAI.</li>
<li>Requires code modifications to function with Ollama.</li>
</ul>
</li>
</ul>
</li>
</ul>
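<p>For reference, the kind of code modification involved is pointing llamaindex&rsquo;s global settings at Ollama instead of OpenAI. A sketch against a recent <code>llama-index</code> release (the package layout and class names have changed between versions, so treat this as indicative rather than exact):</p>
<pre tabindex="0"><code>from llama_index.core import Settings
from llama_index.llms.ollama import Ollama

# Route all LLM calls to a local Ollama server instead of OpenAI.
# Assumes `ollama serve` is running and the model has been pulled.
Settings.llm = Ollama(model="llama3.1", request_timeout=120.0)
</code></pre>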
<h2 id="experimenting-with-candle">Experimenting with Candle</h2>
<p><strong>Candle</strong> is an intriguing project written in <strong>Rust</strong>.</p>
<ul>
<li>
<p><strong>Features</strong>:</p>
<ul>
<li>Uses <a href="https://huggingface.co/">Hugging Face</a> to download models.</li>
<li>Simple to run but exhibits slower performance compared to Ollama.</li>
</ul>
</li>
<li>
<p><strong>Additional Tools</strong>:</p>
<ul>
<li><strong>Cake</strong>: a distributed inference solution built on Candle that opens up possibilities for scaling and extending use cases.</li>
</ul>
</li>
</ul>
<h2 id="conclusion">Conclusion</h2>
<p>Exploring local LLMs has been an exciting journey filled with learning and experimentation. Tools like Ollama, llamaindex, and Candle offer various pathways to harnessing the power of LLMs on personal hardware. While there are challenges, especially with hardware limitations and setup complexities, the control and privacy afforded by local models make the effort worthwhile.</p>
<hr>
<p><em>Feel free to share your experiences or ask questions in the comments below!</em></p>
]]></content:encoded>
    </item>
  </channel>
</rss>
