Ollama Integration

COSMIC integrates with Ollama to run LLM verification locally, with no API costs.

Setup

1. Install Ollama

macOS:

brew install ollama

Linux:

curl -fsSL https://ollama.com/install.sh | sh

Windows:

Download the installer from ollama.com

2. Pull a Model

# Recommended: Fast and efficient
ollama pull gemma3

# Alternatives
ollama pull qwen2.5-coder:7b
ollama pull llama3.2

3. Use with COSMIC

# Auto-detect best model
cosmic chunk document.txt --strategy full --ollama auto

# Use specific model
cosmic chunk document.txt --strategy full --ollama gemma3:latest

Model Recommendations

Model	Size	Speed	Quality	Best For
`gemma3`	3.3 GB	Fast	Good	Default choice
`qwen2.5-coder:7b`	4.7 GB	Medium	Good	Technical docs
`llama3.2`	Various	Medium	Good	General use
`deepseek-coder-v2`	8.9 GB	Slow	Better	Code-heavy docs
`qwen3:30b`	18 GB	Slow	Best	Quality priority

Auto-Selection Logic

When using --ollama auto, COSMIC selects models in this order:

1. gemma3 / gemma2 (smallest, fastest)
2. qwen2.5-coder variants
3. llama3.2 / llama3.1
4. mistral
5. Larger models as fallback
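
The ordering above can be sketched as a simple priority scan. This is an illustration only, not COSMIC's actual implementation: the function name `pick_model`, the prefix-matching rule, and the "first remaining model" fallback are all assumptions.

```python
from typing import Optional, Sequence

# Priority prefixes transcribed from the list above.
# Matching by name prefix is an assumption made for this sketch.
PRIORITY_PREFIXES = (
    "gemma3", "gemma2", "qwen2.5-coder", "llama3.2", "llama3.1", "mistral",
)


def pick_model(installed: Sequence[str],
               priority: Sequence[str] = PRIORITY_PREFIXES) -> Optional[str]:
    """Return the installed model that matches the highest-priority prefix."""
    for prefix in priority:
        for name in installed:
            if name.startswith(prefix):
                return name
    # Fallback: any other installed model, or None if nothing is installed.
    return installed[0] if installed else None
```

For example, `pick_model(["qwen3:30b", "llama3.2:latest", "gemma3:latest"])` picks `gemma3:latest`, because `gemma3` comes first in the priority list regardless of the order in which the models are installed.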

CLI Commands

Check Status

cosmic ollama status

Output:

Ollama Status:
  Installed: Yes
  Running: Yes
  Models available: 6
  Recommended model: gemma3:latest

List Models

cosmic ollama list

Output:

Available Ollama models:
NAME                    SIZE
--------------------------------------------------
gemma3:latest           3.3 GB
qwen2.5-coder:7b        4.7 GB
llama3.2:latest         2.0 GB

Start Server

cosmic ollama start

Python API

Basic Usage

from cosmic import COSMICChunker, COSMICConfig, Document
from cosmic.models.ollama import OllamaManager

# Create Ollama manager
ollama = OllamaManager()

if ollama.is_available():
    # List models
    models = ollama.list_models()
    for model in models:
        print(f"{model.name}: {model.size_gb:.1f} GB")

    # Auto-select best model
    model_name = ollama.auto_select_model()

    # Configure COSMIC
    config = COSMICConfig()
    config.llm.enabled = True
    config.llm.base_url = ollama.api_base_url
    config.llm.model_name = model_name

    # Process document (`doc` is a Document instance created beforehand)
    chunker = COSMICChunker(config)
    chunks = chunker.chunk_document(doc, strategy="full")

Context Manager

from cosmic import COSMICChunker, COSMICConfig
from cosmic.models.ollama import OllamaManager

# Automatic server lifecycle management
with OllamaManager() as ollama:
    config = COSMICConfig()
    config.llm.enabled = True
    config.llm.base_url = ollama.api_base_url
    config.llm.model_name = ollama.auto_select_model()

    # `doc` is a Document instance created beforehand
    chunker = COSMICChunker(config)
    chunks = chunker.chunk_document(doc, strategy="full")
# On exit, the server is stopped only if COSMIC started it

Environment Variables

# Ollama server URL
OLLAMA_HOST=http://localhost:11434

# Default model (or "auto")
COSMIC_OLLAMA_MODEL=auto

# Use Ollama as default provider
COSMIC_LLM_PROVIDER=ollama
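
As a rough illustration of how these variables might be consumed, the sketch below reads them from the environment. The helper name `resolve_ollama_settings` and the fallback values are assumptions; the page does not state COSMIC's real defaults, so the fallbacks simply mirror the example values above.

```python
import os
from typing import Mapping


def resolve_ollama_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Read the three variables above, with assumed fallbacks."""
    host = env.get("OLLAMA_HOST", "http://localhost:11434")
    # OLLAMA_HOST is sometimes set as host:port without a scheme;
    # add one so the value can be used as a base URL.
    if "://" not in host:
        host = "http://" + host
    return {
        "base_url": host.rstrip("/"),
        "model": env.get("COSMIC_OLLAMA_MODEL", "auto"),
        # None means "no provider preference was set".
        "provider": env.get("COSMIC_LLM_PROVIDER"),
    }
```

Passing a plain dictionary instead of `os.environ` makes the function easy to try out without touching the real environment.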

Server Management

Automatic Management

When using --ollama, COSMIC:

1. Checks if Ollama is installed
2. Checks for available models
3. Starts server if not running
4. Uses the model for verification
5. Stops server if COSMIC started it
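
The key point in these steps is ownership: the server is stopped only when this run started it. A minimal sketch of that pattern follows; `managed_server` and its three callables are hypothetical names for illustration and are not part of COSMIC's API.

```python
from contextlib import contextmanager
from typing import Callable, Iterator


@contextmanager
def managed_server(is_running: Callable[[], bool],
                   start: Callable[[], None],
                   stop: Callable[[], None]) -> Iterator[bool]:
    """Start the server only if needed; stop it only if this block started it."""
    started_here = False
    if not is_running():
        start()
        started_here = True
    try:
        # The yielded flag tells the caller whether this block owns the server.
        yield started_here
    finally:
        if started_here:
            stop()
```

Because the cleanup sits in a `finally` clause, a server started by the block is stopped even if verification raises an exception, while a server that was already running is left untouched.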

Manual Management

# Start server manually
ollama serve

# Stop server
pkill ollama

# Check if running
curl http://localhost:11434/api/tags
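
The same check can be made from Python using only the standard library. The sketch below assumes the server is at the default address shown above and that the `/api/tags` response contains a `models` list whose entries carry `name` and `size` (in bytes); the helper names are illustrative.

```python
import json
import urllib.error
import urllib.request


def parse_tags(payload: dict) -> list:
    """Turn an /api/tags response into (name, size in GB) pairs."""
    return [(m["name"], m.get("size", 0) / 1e9)
            for m in payload.get("models", [])]


def fetch_models(base_url: str = "http://localhost:11434",
                 timeout: float = 2.0):
    """Return the installed models, or None if the server is unreachable."""
    try:
        url = f"{base_url}/api/tags"
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return parse_tags(json.load(resp))
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, or a body that is not valid JSON.
        return None
```

A `None` result means the server did not answer, whereas an empty list means it is running but no models have been pulled yet.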

Troubleshooting

Ollama Not Found

Error: Ollama is not installed
Install from: https://ollama.com/download

Solution: Install Ollama from the official website.

No Models Available

Error: No Ollama models available
Pull a model with: ollama pull gemma3

Solution:

ollama pull gemma3

Server Won't Start

Error: Failed to start Ollama server

Solutions:

- Check if port 11434 is in use
- Try starting manually: ollama serve
- Check Ollama logs
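
One way to check the port without extra tools is a short TCP probe. This sketch only reports whether something is accepting connections on the port; it cannot tell whether that process is Ollama. The function name is illustrative.

```python
import socket


def port_in_use(port: int = 11434, host: str = "127.0.0.1",
                timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

If this returns `True` while `cosmic ollama status` reports the server as not running, another process is probably holding the port.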

Model Too Large

If a model is too large for your system:

# Use a smaller model
ollama pull gemma3  # 3.3 GB

# Or a variant with fewer parameters
ollama pull llama3.2:1b  # 1B-parameter variant
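
To choose among the listed models for a given size budget, a small lookup is enough. The sizes below are copied from the Model Recommendations table; treating download size as a proxy for what fits on a system is a simplifying assumption, and `largest_model_within` is a hypothetical helper.

```python
from typing import Optional

# Download sizes in GB, copied from the Model Recommendations table.
MODEL_SIZES_GB = {
    "gemma3": 3.3,
    "qwen2.5-coder:7b": 4.7,
    "deepseek-coder-v2": 8.9,
    "qwen3:30b": 18.0,
}


def largest_model_within(budget_gb: float,
                         sizes: dict = MODEL_SIZES_GB) -> Optional[str]:
    """Return the largest listed model that fits the budget, or None."""
    fitting = {name: size for name, size in sizes.items() if size <= budget_gb}
    return max(fitting, key=fitting.get) if fitting else None
```

With a budget of 8 GB this selects `qwen2.5-coder:7b`; with a budget below 3.3 GB nothing in the table fits and the function returns `None`.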

Slow Performance

- Use smaller models (gemma3, phi3)
- Ensure the GPU is being used (ollama ps reports whether a loaded model runs on CPU or GPU)
- Consider using --no-llm for faster processing

Without Ollama

If you don't want to use Ollama:

# Skip LLM verification entirely
cosmic chunk document.txt --strategy full --no-llm

# Or use semantic-only strategy
cosmic chunk document.txt --strategy semantic

LLM verification is optional; COSMIC works well without it for most documents.