Ollama Integration

COSMIC integrates with Ollama for local LLM verification without API costs.

Setup

1. Install Ollama

Download from ollama.com/download, or install from the command line:

# macOS (Homebrew)
brew install ollama

# Linux
curl -fsSL https://ollama.com/install.sh | sh

# Windows
# Download the installer from ollama.com/download

2. Pull a Model

# Recommended: Fast and efficient
ollama pull gemma3

# Alternatives
ollama pull qwen2.5-coder:7b
ollama pull llama3.2

3. Use with COSMIC

# Auto-detect best model
cosmic chunk document.txt --strategy full --ollama auto

# Use specific model
cosmic chunk document.txt --strategy full --ollama gemma3:latest

Model Recommendations

Model               Size      Speed    Quality   Best For
----------------------------------------------------------------
gemma3              3.3 GB    Fast     Good      Default choice
qwen2.5-coder:7b    4.7 GB    Medium   Good      Technical docs
llama3.2            Various   Medium   Good      General use
deepseek-coder-v2   8.9 GB    Slow     Better    Code-heavy docs
qwen3:30b           18 GB     Slow     Best      Quality priority

Auto-Selection Logic

When using --ollama auto, COSMIC selects models in this order:

  1. gemma3 / gemma2 (smallest, fastest)
  2. qwen2.5-coder variants
  3. llama3.2 / llama3.1
  4. mistral
  5. Larger models as fallback
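
The ordering above can be sketched as a simple prefix-priority lookup. This is an illustrative sketch only, not COSMIC's actual implementation: the `PREFERRED_PREFIXES` list and the `pick_model` function are hypothetical names, and the tie-breaking rule (smallest model first within a family and in the fallback) is an assumption.

```python
# Hypothetical preference order mirroring the list above.
PREFERRED_PREFIXES = [
    "gemma3", "gemma2",
    "qwen2.5-coder",
    "llama3.2", "llama3.1",
    "mistral",
]


def pick_model(installed):
    """Pick a model name from a list of (name, size_gb) tuples.

    Returns None when no models are installed.
    """
    for prefix in PREFERRED_PREFIXES:
        matches = [m for m in installed if m[0].startswith(prefix)]
        if matches:
            # Assumption: within a family, prefer the smallest variant.
            return min(matches, key=lambda m: m[1])[0]
    if installed:
        # Fallback: any other model, smallest first (assumption).
        return min(installed, key=lambda m: m[1])[0]
    return None


if __name__ == "__main__":
    models = [("llama3.2:latest", 2.0), ("gemma3:latest", 3.3)]
    print(pick_model(models))
```

Note that priority wins over size here: `gemma3` is chosen even when a smaller `llama3.2` is installed, which matches the order listed above.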

CLI Commands

Check Status

cosmic ollama status

Output:

Ollama Status:
  Installed: Yes
  Running: Yes
  Models available: 6
  Recommended model: gemma3:latest

List Models

cosmic ollama list

Output:

Available Ollama models:
NAME                    SIZE
--------------------------------------------------
gemma3:latest           3.3 GB
qwen2.5-coder:7b        4.7 GB
llama3.2:latest         2.0 GB

Start Server

cosmic ollama start

Python API

Basic Usage

from cosmic import COSMICChunker, COSMICConfig, Document
from cosmic.models.ollama import OllamaManager

# Create Ollama manager
ollama = OllamaManager()

if ollama.is_available():
    # List models
    models = ollama.list_models()
    for model in models:
        print(f"{model.name}: {model.size_gb:.1f} GB")

    # Auto-select best model
    model_name = ollama.auto_select_model()

    # Configure COSMIC
    config = COSMICConfig()
    config.llm.enabled = True
    config.llm.base_url = ollama.api_base_url
    config.llm.model_name = model_name

    # Process document (`doc` is a Document instance prepared beforehand)
    chunker = COSMICChunker(config)
    chunks = chunker.chunk_document(doc, strategy="full")

Context Manager

from cosmic import COSMICChunker, COSMICConfig
from cosmic.models.ollama import OllamaManager

# Automatic server lifecycle management
with OllamaManager() as ollama:
    config = COSMICConfig()
    config.llm.enabled = True
    config.llm.base_url = ollama.api_base_url
    config.llm.model_name = ollama.auto_select_model()

    # `doc` is a Document instance prepared beforehand
    chunker = COSMICChunker(config)
    chunks = chunker.chunk_document(doc, strategy="full")
# On exit, the server is stopped automatically if COSMIC started it

Environment Variables

# Ollama server URL
OLLAMA_HOST=http://localhost:11434

# Default model (or "auto")
COSMIC_OLLAMA_MODEL=auto

# Use Ollama as default provider
COSMIC_LLM_PROVIDER=ollama
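
To inspect what these variables resolve to from a script, something like the following could be used. It is a minimal sketch: `load_ollama_settings` is a hypothetical helper, not part of COSMIC, and the fallback values are assumptions taken from the examples above rather than documented defaults.

```python
import os


def load_ollama_settings(env=None):
    """Read the Ollama-related variables from a mapping (default: os.environ)."""
    env = os.environ if env is None else env
    return {
        # Fallback values are assumptions based on the examples above.
        "host": env.get("OLLAMA_HOST", "http://localhost:11434"),
        "model": env.get("COSMIC_OLLAMA_MODEL", "auto"),
        "provider": env.get("COSMIC_LLM_PROVIDER"),
    }


if __name__ == "__main__":
    for key, value in load_ollama_settings().items():
        print(f"{key}: {value}")
```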

Server Management

Automatic Management

When using --ollama, COSMIC:

  1. Checks if Ollama is installed
  2. Checks for available models
  3. Starts server if not running
  4. Uses the model for verification
  5. Stops server if COSMIC started it
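
The "only stop what you started" behaviour in steps 1, 3, and 5 can be illustrated with a small context manager. This is a simplified sketch, not COSMIC's code: `managed_server` and its `is_running` callback are hypothetical, the model check (step 2) is omitted, and a real implementation would also wait until the server is ready to accept requests.

```python
import contextlib
import shutil
import subprocess


@contextlib.contextmanager
def managed_server(is_running, binary="ollama"):
    """Yield True if this context started the server, False otherwise.

    `is_running` is a caller-supplied function returning a bool.
    """
    # Step 1: check that the binary is installed.
    if shutil.which(binary) is None:
        raise RuntimeError(f"{binary} is not installed")

    # Step 3: start the server only if nothing is answering yet.
    proc = None
    if not is_running():
        proc = subprocess.Popen(
            [binary, "serve"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    try:
        # Step 4: the caller runs verification inside the with-block.
        yield proc is not None
    finally:
        # Step 5: stop the server only if this context started it.
        if proc is not None:
            proc.terminate()
            proc.wait(timeout=10)
```

Tracking the process handle is what lets a server that was already running before the `with` block survive after it.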

Manual Management

# Start server manually
ollama serve

# Stop server
pkill ollama

# Check if running
curl http://localhost:11434/api/tags
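
The same check can be done from Python using only the standard library. This sketch assumes the `/api/tags` endpoint returns a JSON object with a `models` array whose entries carry a `name` field; the function names are illustrative and not part of COSMIC.

```python
import json
import urllib.request


def parse_model_names(payload):
    """Extract model names from an /api/tags JSON response body."""
    data = json.loads(payload)
    return [m["name"] for m in data.get("models", [])]


def list_running_models(base_url="http://localhost:11434", timeout=2.0):
    """Return installed model names, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return parse_model_names(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # OSError covers refused connections and timeouts (URLError is a
        # subclass); ValueError covers a malformed JSON body.
        return None


if __name__ == "__main__":
    names = list_running_models()
    print("Server not reachable" if names is None else names)
```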

Troubleshooting

Ollama Not Found

Error: Ollama is not installed
Install from: https://ollama.com/download

Solution: Install Ollama from the official website.

No Models Available

Error: No Ollama models available
Pull a model with: ollama pull gemma3

Solution:

ollama pull gemma3

Server Won't Start

Error: Failed to start Ollama server

Solutions:

  1. Check if port 11434 is in use
  2. Try starting manually: ollama serve
  3. Check Ollama logs
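
For the first check, a quick way to see whether something is already listening on port 11434 is a plain TCP connect attempt. This is a generic sketch using the standard `socket` module; `port_in_use` is a hypothetical helper, and it reports only that the port is taken, not which process holds it.

```python
import socket


def port_in_use(port=11434, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    if port_in_use():
        print("Port 11434 is in use (possibly by a running Ollama server)")
    else:
        print("Port 11434 is free")
```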

Model Too Large

If a model is too large for your system:

# Use a smaller model
ollama pull gemma3  # 3.3 GB

# Or a variant with fewer parameters
ollama pull llama3.2:1b  # 1B-parameter variant

Slow Performance

  • Use smaller models (gemma3, phi3)
  • Ensure the GPU is being used (ollama ps shows whether a loaded model runs on GPU or CPU)
  • Consider using --no-llm for faster processing

Without Ollama

If you don't want to use Ollama:

# Skip LLM verification entirely
cosmic chunk document.txt --strategy full --no-llm

# Or use semantic-only strategy
cosmic chunk document.txt --strategy semantic

LLM verification is optional; COSMIC works well without it for most documents.