
PromptForge

A CLI tool for versioning, testing, and sharing AI prompts across different LLM providers. Treat prompts as code with git integration, A/B testing, templating, and a shared prompt registry.

Features

  • Version Control: Track prompt changes with Git, create branches for A/B testing variations
  • Multi-Provider Support: Unified API for OpenAI, Anthropic, and Ollama
  • Prompt Templating: Jinja2-based variable substitution with type validation
  • A/B Testing Framework: Compare prompt variations with statistical analysis
  • Output Validation: Validate LLM responses against JSON schemas or regex patterns
  • Prompt Registry: Share prompts via local and remote registries

Installation

pip install promptforge

Or from source:

git clone https://github.com/yourusername/promptforge.git
cd promptforge
pip install -e .

Quick Start

  1. Initialize a new PromptForge project:
pf init
  2. Create your first prompt:
pf prompt create "Summarizer" -c "Summarize the following text: {{text}}"
  3. Run the prompt:
pf run Summarizer -v text="Your long text here..."

Configuration

Create a configs/promptforge.yaml file:

providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    model: gpt-4
    temperature: 0.7
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
    model: claude-3-sonnet-20240229
  ollama:
    base_url: http://localhost:11434
    model: llama2

defaults:
  provider: openai
  output_format: text
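The `${OPENAI_API_KEY}`-style references are presumably expanded from the environment when the config is loaded. A minimal sketch of how that expansion could work (function names here are hypothetical, not PromptForge's actual API):

```python
import os
import re

_ENV_PATTERN = re.compile(r"\$\{(\w+)\}")

def expand_env(value):
    """Recursively replace ${VAR} placeholders with environment values."""
    if isinstance(value, str):
        return _ENV_PATTERN.sub(lambda m: os.environ.get(m.group(1), ""), value)
    if isinstance(value, dict):
        return {k: expand_env(v) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v) for v in value]
    return value

def load_config(path="configs/promptforge.yaml"):
    import yaml  # PyYAML; imported lazily so expand_env stays dependency-free
    with open(path) as f:
        return expand_env(yaml.safe_load(f))
```

Keeping secrets out of the YAML file this way means the config can be committed alongside the prompts themselves.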

Creating Prompts

Prompts are text files with a YAML front-matter header followed by the template body:

---
name: Code Explainer
description: Explain code snippets
version: "1.0.0"
provider: openai
tags: [coding, education]
variables:
  - name: language
    type: choice
    required: true
    choices: [python, javascript, rust, go]
  - name: code
    type: string
    required: true
validation:
  - type: regex
    pattern: "(def|function|fn|func)"
---
Explain this {{language}} code:

{{code}}

Focus on:
- What the code does
- Key functions/classes used
- Any potential improvements
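The `variables` block drives validation before the template is rendered. A rough sketch of how the `required` and `choice` checks might behave (the helper name is an assumption, not PromptForge's real internals):

```python
def validate_variables(spec, values):
    """Check user-supplied values against a prompt's `variables` spec."""
    errors = []
    for var in spec:
        name = var["name"]
        if name not in values:
            if var.get("required"):
                errors.append(f"missing required variable: {name}")
            continue
        if var.get("type") == "choice" and values[name] not in var.get("choices", []):
            errors.append(f"{name!r} must be one of {var['choices']}")
    return errors
```

With the Code Explainer spec above, passing `language=java` would fail the `choice` check, and omitting `code` would fail the `required` check.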

Running Prompts

# Run with variables
pf run "Code Explainer" -v language=python -v code="def hello(): print('world')"

# Use a different provider
pf run "Code Explainer" -p anthropic -v language=rust -v code="..."

# Output as JSON
pf run "Code Explainer" -o json -v ...

Version Control

# Create a version commit
pf version create "Added validation rules"

# View history
pf version history

# Create a branch for A/B testing
pf version branch test-variation-a

# List all branches
pf version list

A/B Testing

Compare prompt variations:

# Test a single prompt
pf test "Code Explainer" --iterations 5

# Compare multiple prompts
pf test "Prompt A" "Prompt B" --iterations 3

# Run in parallel
pf test "Prompt A" "Prompt B" --parallel
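Under the hood, `pf test` presumably collects a per-iteration score for each variant and compares the aggregates. A stdlib-only sketch of one plausible comparison (function name and output keys are hypothetical):

```python
import statistics

def compare_variants(scores_a, scores_b):
    """Summarize two variants' per-iteration scores and pick a winner."""
    mean_a, mean_b = statistics.mean(scores_a), statistics.mean(scores_b)
    return {
        "mean_a": mean_a,
        "mean_b": mean_b,
        "stdev_a": statistics.stdev(scores_a),  # needs >= 2 iterations
        "stdev_b": statistics.stdev(scores_b),
        "delta": mean_a - mean_b,
        "winner": "A" if mean_a >= mean_b else "B",
    }
```

A real run would also want a significance test rather than a raw mean comparison, which is presumably what the "statistical analysis" feature provides.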

Output Validation

Add validation rules to your prompts:

validation:
  - type: regex
    pattern: "^\\d+\\. .+"
    message: "Response must be a numbered list"
  
  - type: json
    schema:
      type: object
      properties:
        summary:
          type: string
          minLength: 10
        keywords:
          type: array
          items:
            type: string
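A regex rule like the one above takes only a few lines to evaluate; a sketch of the idea (the real `Validator` class may differ):

```python
import re

def check_regex_rule(rule, response):
    """Return an error message if `response` fails a regex validation rule, else None."""
    if re.search(rule["pattern"], response, re.MULTILINE):
        return None
    return rule.get("message", f"response does not match {rule['pattern']!r}")
```

JSON-schema rules would follow the same shape: parse the response, validate it against `rule["schema"]`, and return `None` or a message.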

Prompt Registry

Local Registry

# List local prompts
pf registry list

# Add prompt to registry
pf registry add "Code Explainer" --author "Your Name"

# Search registry
pf registry search "python"

Remote Registry

# Pull a prompt from remote
pf registry pull <entry_id>

# Publish your prompt
pf registry publish "Code Explainer"

API Reference

Core Classes

  • Prompt: Main prompt model with YAML serialization
  • TemplateEngine: Jinja2-based template rendering
  • GitManager: Git integration for version control
  • ProviderBase: Abstract interface for LLM providers

Providers

  • OpenAIProvider: OpenAI GPT models
  • AnthropicProvider: Anthropic Claude models
  • OllamaProvider: Local Ollama models
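The provider layer above can be pictured as one abstract base class that each backend implements. A hedged sketch (method and field names are assumptions, not the actual PromptForge API):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Completion:
    text: str
    model: str
    tokens_used: int = 0

class ProviderBase(ABC):
    """Abstract interface each provider (OpenAI, Anthropic, Ollama) implements."""

    @abstractmethod
    def complete(self, prompt: str, **params) -> Completion:
        """Send a rendered prompt and return the model's completion."""

class EchoProvider(ProviderBase):
    """Toy provider for offline testing: echoes the prompt back."""

    def complete(self, prompt: str, **params) -> Completion:
        return Completion(text=prompt, model="echo")
```

Keeping the interface this narrow is what lets `pf run -p anthropic` swap backends without touching the prompt itself.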

Testing

  • ABTest: A/B test runner
  • Validator: Response validation framework
  • MetricsCollector: Metrics aggregation

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: pytest tests/ -v
  5. Submit a pull request

License

MIT License - see LICENSE file for details.
