# PromptForge

A CLI tool for versioning, testing, and sharing AI prompts across LLM providers. PromptForge treats prompts as code, with Git integration, A/B testing, templating, and a shared prompt registry.

## Features

- **Version Control**: Track prompt changes with Git and create branches for A/B test variations
- **Multi-Provider Support**: Unified API for OpenAI, Anthropic, and Ollama
- **Prompt Templating**: Jinja2-based variable substitution with type validation
- **A/B Testing Framework**: Compare prompt variations with statistical analysis
- **Output Validation**: Validate LLM responses against JSON schemas or regex patterns
- **Prompt Registry**: Share prompts via local and remote registries

## Installation

```bash
pip install promptforge
```

Or from source:

```bash
git clone https://github.com/yourusername/promptforge.git
cd promptforge
pip install -e .
```

## Quick Start

1. Initialize a new PromptForge project:

```bash
pf init
```

2. Create your first prompt:

```bash
pf prompt create "Summarizer" -c "Summarize the following text: {{text}}"
```

3. Run the prompt:

```bash
pf run Summarizer -v text="Your long text here..."
```

## Configuration

Create a `configs/promptforge.yaml` file:

```yaml
providers:
  openai:
    api_key: ${OPENAI_API_KEY}
    model: gpt-4
    temperature: 0.7
  anthropic:
    api_key: ${ANTHROPIC_API_KEY}
    model: claude-3-sonnet-20240229
  ollama:
    base_url: http://localhost:11434
    model: llama2

defaults:
  provider: openai
  output_format: text
```
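The README does not say how the `${OPENAI_API_KEY}`-style placeholders in the config are resolved. The sketch below shows one plausible approach, assuming they are expanded from environment variables after the YAML is loaded. The `expand_env` helper and its `env` parameter are hypothetical and not part of PromptForge.

```python
import os
import re

# Matches ${NAME} placeholders such as ${OPENAI_API_KEY}.
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value, env=None):
    """Recursively replace ${NAME} placeholders in strings, dicts, and lists.

    Raises KeyError when a referenced variable is not set, so a missing
    API key fails loudly instead of being sent as a literal "${...}".
    """
    env = os.environ if env is None else env
    if isinstance(value, str):
        return _PLACEHOLDER.sub(lambda m: env[m.group(1)], value)
    if isinstance(value, dict):
        return {k: expand_env(v, env) for k, v in value.items()}
    if isinstance(value, list):
        return [expand_env(v, env) for v in value]
    return value  # numbers, booleans, and None pass through unchanged


# A dict shaped like the parsed YAML above (hypothetical: loading is not shown).
config = {
    "providers": {
        "openai": {"api_key": "${OPENAI_API_KEY}", "model": "gpt-4", "temperature": 0.7},
    },
    "defaults": {"provider": "openai", "output_format": "text"},
}

resolved = expand_env(config, env={"OPENAI_API_KEY": "sk-example"})
print(resolved["providers"]["openai"]["api_key"])
```

Keeping secrets in environment variables rather than in the YAML file means the config can be committed to Git alongside the prompts.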

## Creating Prompts

Each prompt is a file with YAML front matter (metadata, variable declarations, and validation rules) followed by the template body:

```yaml
---
name: Code Explainer
description: Explain code snippets
version: "1.0.0"
provider: openai
tags: [coding, education]
variables:
  - name: language
    type: choice
    required: true
    choices: [python, javascript, rust, go]
  - name: code
    type: string
    required: true
validation:
  - type: regex
    pattern: "(def|function|fn|func)"
---
Explain this {{language}} code:

{{code}}

Focus on:
- What the code does
- Key functions/classes used
- Any potential improvements
```
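To make the file layout concrete, here is a minimal sketch of how such a file could be split, checked, and rendered. It uses only the standard library: the regex-based `render` is a stand-in for Jinja2, and the variable specs are written as Python dicts rather than parsed from YAML. All function names are hypothetical and do not describe PromptForge's internals.

```python
import re


def split_front_matter(text):
    """Split a prompt file into (front_matter, body) at the '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("prompt file must start with '---'")
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return "\n".join(lines[1:i]), "\n".join(lines[i + 1:])
    raise ValueError("front matter is missing its closing '---'")


def check_variables(specs, values):
    """Return a list of error strings; an empty list means the values pass."""
    errors = []
    for spec in specs:
        name = spec["name"]
        if name not in values:
            if spec.get("required"):
                errors.append(f"missing required variable: {name}")
            continue
        value = values[name]
        if spec["type"] == "choice" and value not in spec["choices"]:
            errors.append(f"{name} must be one of {spec['choices']}, got {value!r}")
        elif spec["type"] == "string" and not isinstance(value, str):
            errors.append(f"{name} must be a string")
    return errors


def render(body, values):
    """Stand-in for Jinja2: replace each {{ name }} with values[name]."""
    return re.sub(r"\{\{\s*(\w+)\s*\}\}", lambda m: str(values[m.group(1)]), body)


prompt_file = """---
name: Code Explainer
---
Explain this {{language}} code:

{{code}}
"""

# Specs mirror the `variables` section of the front matter above.
specs = [
    {"name": "language", "type": "choice", "required": True,
     "choices": ["python", "javascript", "rust", "go"]},
    {"name": "code", "type": "string", "required": True},
]
values = {"language": "python", "code": "def hello(): print('world')"}

front, body = split_front_matter(prompt_file)
print(check_variables(specs, values))
print(render(body, values))
```

Validating variables before rendering catches a bad `choice` value before any provider call is made.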

## Running Prompts

```bash
# Run with variables
pf run "Code Explainer" -v language=python -v code="def hello(): print('world')"

# Use a different provider
pf run "Code Explainer" -p anthropic -v language=rust -v code="..."

# Output as JSON
pf run "Code Explainer" -o json -v ...
```
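The repeated `-v key=value` flags map naturally onto a dictionary of template variables. The following sketch shows one way such flags could be parsed with `argparse`. The parser, its `dest` names, and `parse_variables` are illustrative assumptions, not PromptForge's actual CLI code.

```python
import argparse


def parse_variables(pairs):
    """Turn ['language=python', 'code=x = 1'] into a dict.

    Splits on the first '=' only, so values may themselves contain '='.
    """
    variables = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not sep or not key:
            raise ValueError(f"expected KEY=VALUE, got {pair!r}")
        variables[key] = value
    return variables


# Hypothetical parser covering only the flags shown in the examples above.
parser = argparse.ArgumentParser(prog="pf-run-sketch")
parser.add_argument("name")
parser.add_argument("-v", dest="var", action="append", default=[], metavar="KEY=VALUE")
parser.add_argument("-p", dest="provider", default="openai")
parser.add_argument("-o", dest="output", default="text")

args = parser.parse_args(
    ["Code Explainer", "-v", "language=python", "-v", "code=x = 1", "-o", "json"]
)
variables = parse_variables(args.var)
print(args.name, args.provider, args.output, variables)
```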

## Version Control

```bash
# Create a version commit
pf version create "Added validation rules"

# View history
pf version history

# Create a branch for A/B testing
pf version branch test-variation-a

# List all branches
pf version list
```

## A/B Testing

Compare prompt variations:

```bash
# Test a single prompt
pf test "Code Explainer" --iterations 5

# Compare multiple prompts
pf test "Prompt A" "Prompt B" --iterations 3

# Run in parallel
pf test "Prompt A" "Prompt B" --parallel
```
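This README does not specify which statistics `pf test` reports. As an illustration of the kind of comparison involved, the sketch below computes Welch's t statistic for two sets of per-iteration scores. The scores are made up for the example, and `welch_t` is a hypothetical helper, not a PromptForge function.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and approximate degrees of freedom.

    Each sample needs at least two values, and the samples must not
    both have zero variance (that would divide by zero).
    """
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Made-up quality scores (0 to 1), one per iteration, for two prompt variants.
scores_a = [0.72, 0.80, 0.76, 0.78, 0.74]
scores_b = [0.81, 0.85, 0.79, 0.88, 0.84]

t, df = welch_t(scores_a, scores_b)
print(f"mean A={mean(scores_a):.3f}  mean B={mean(scores_b):.3f}  t={t:.2f}  df={df:.1f}")
```

With only 3 to 5 iterations per prompt, such a test has little power, so small differences between variants should be read with caution.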

## Output Validation

Add validation rules to your prompts:

```yaml
validation:
  - type: regex
    pattern: "^\\d+\\. .+"
    message: "Response must be a numbered list"

  - type: json
    schema:
      type: object
      properties:
        summary:
          type: string
          minLength: 10
        keywords:
          type: array
          items:
            type: string
```
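The two rule types above could be checked roughly as follows. This is a standard-library sketch under stated assumptions: the regex rule is taken to pass if any line of the response matches, and the JSON check is hand-written for this one schema rather than using a general JSON Schema validator. Both function names are hypothetical.

```python
import json
import re


def check_regex(response, pattern, message):
    """Return None on success, or the rule's message when nothing matches."""
    # re.MULTILINE lets '^' anchor at the start of any line of the response.
    if re.search(pattern, response, flags=re.MULTILINE):
        return None
    return message


def check_summary_json(response):
    """Hand-rolled check mirroring the schema above; returns a list of errors.

    The schema declares no `required` keys, so absent properties pass.
    """
    try:
        data = json.loads(response)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    errors = []
    if "summary" in data:
        summary = data["summary"]
        if not isinstance(summary, str) or len(summary) < 10:
            errors.append("summary must be a string of at least 10 characters")
    if "keywords" in data:
        keywords = data["keywords"]
        if not (isinstance(keywords, list) and all(isinstance(k, str) for k in keywords)):
            errors.append("keywords must be an array of strings")
    return errors


numbered = "1. First point\n2. Second point"
print(check_regex(numbered, r"^\d+\. .+", "Response must be a numbered list"))
print(check_summary_json('{"summary": "too short", "keywords": ["a", 1]}'))
```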

## Prompt Registry

### Local Registry

```bash
# List local prompts
pf registry list

# Add prompt to registry
pf registry add "Code Explainer" --author "Your Name"

# Search registry
pf registry search "python"
```

### Remote Registry

```bash
# Pull a prompt from remote
pf registry pull <entry_id>

# Publish your prompt
pf registry publish "Code Explainer"
```

## API Reference

### Core Classes

- `Prompt`: Main prompt model with YAML serialization
- `TemplateEngine`: Jinja2-based template rendering
- `GitManager`: Git integration for version control
- `ProviderBase`: Abstract interface for LLM providers

### Providers

- `OpenAIProvider`: OpenAI GPT models
- `AnthropicProvider`: Anthropic Claude models
- `OllamaProvider`: Local Ollama models
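
The method signatures of `ProviderBase` are not documented here. As a rough sketch of how an abstract provider interface of this kind is commonly structured, the example below uses hypothetical names throughout (`ProviderSketch`, `complete`, `EchoProvider`, `get_provider`) and should not be read as PromptForge's API.

```python
from abc import ABC, abstractmethod


class ProviderSketch(ABC):
    """Hypothetical stand-in for an abstract LLM provider interface."""

    def __init__(self, model):
        self.model = model

    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Send the rendered prompt and return the model's text."""


class EchoProvider(ProviderSketch):
    """Offline provider that echoes the prompt, handy for tests."""

    def complete(self, prompt):
        return f"[{self.model}] {prompt}"


# Name-to-class lookup, so a `-p <name>` style flag could select a provider.
PROVIDERS = {"echo": EchoProvider}


def get_provider(name, **kwargs):
    """Instantiate a registered provider or raise ValueError for unknown names."""
    if name not in PROVIDERS:
        raise ValueError(f"unknown provider: {name!r}")
    return PROVIDERS[name](**kwargs)


provider = get_provider("echo", model="test-model")
print(provider.complete("Summarize the following text: ..."))
```

A shared base class lets the rest of the tool call every provider the same way, which is what makes switching providers a one-flag change.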

### Testing

- `ABTest`: A/B test runner
- `Validator`: Response validation framework
- `MetricsCollector`: Metrics aggregation

## Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests: `pytest tests/ -v`
5. Submit a pull request

## License

MIT License. See the LICENSE file for details.