
Local API Docs Search

A CLI tool that indexes local API documentation (OpenAPI specs, code comments, README files) and enables natural language semantic search using local embedding models. Developers can query their project's API docs offline without external API calls.

Features

  • Semantic Search: Natural language queries using local embedding models
  • Offline Processing: All processing happens locally - no external API calls
  • Multiple Formats: Support for OpenAPI/Swagger specs, Markdown READMEs, and code comments
  • Interactive Mode: Rich CLI interface with history and navigation
  • Configurable: Easy configuration via YAML or environment variables

Installation

# Clone the repository
git clone <repository-url>
cd local-api-docs-search

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -e .

# Or install with dev dependencies
pip install -e ".[dev]"

Quick Start

  1. Configure your environment:

    cp .env.example .env
    # Edit .env to set your documentation path
    
  2. Index your documentation:

    # Index all supported formats
    api-docs index ./docs
    
    # Index specific format
    api-docs index ./docs --type openapi
    api-docs index ./docs --type readme
    api-docs index ./docs --type code
    
    # Recursive indexing
    api-docs index ./docs --recursive
    
  3. Search your documentation:

    # Simple search
    api-docs search "how to authenticate users"
    
    # With limit
    api-docs search "authentication" --limit 5
    
    # Filter by type
    api-docs search "endpoints" --type openapi
    
  4. Interactive mode:

    api-docs interactive
    

Configuration

Environment Variables

Variable                      Default            Description
API_DOCS_INDEX_PATH           ./docs             Directory containing docs to index
API_DOCS_MODEL_NAME           all-MiniLM-L6-v2   Embedding model to use
API_DOCS_EMBEDDING_DEVICE     cpu                Device for embeddings (cpu/cuda/auto)
API_DOCS_CHROMA_PERSIST_DIR   .api-docs/chroma   ChromaDB storage directory
API_DOCS_DEFAULT_LIMIT        10                 Default result limit
API_DOCS_VERBOSE              false              Enable verbose logging
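
As a rough illustration of how these variables could map onto settings, here is a minimal sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical names chosen for this example and are not taken from the project's `config.py`; the variable names and defaults come from the table above.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    """Hypothetical settings container; defaults copied from the table above."""
    index_path: str = "./docs"
    model_name: str = "all-MiniLM-L6-v2"
    embedding_device: str = "cpu"
    chroma_persist_dir: str = ".api-docs/chroma"
    default_limit: int = 10
    verbose: bool = False


def load_settings(env=None) -> Settings:
    """Build Settings from API_DOCS_* variables, falling back to the defaults."""
    env = os.environ if env is None else env
    d = Settings()
    return Settings(
        index_path=env.get("API_DOCS_INDEX_PATH", d.index_path),
        model_name=env.get("API_DOCS_MODEL_NAME", d.model_name),
        embedding_device=env.get("API_DOCS_EMBEDDING_DEVICE", d.embedding_device),
        chroma_persist_dir=env.get("API_DOCS_CHROMA_PERSIST_DIR", d.chroma_persist_dir),
        # Environment values are strings, so numbers and booleans are converted.
        default_limit=int(env.get("API_DOCS_DEFAULT_LIMIT", d.default_limit)),
        verbose=str(env.get("API_DOCS_VERBOSE", d.verbose)).lower() in ("1", "true", "yes"),
    )
```

Passing a plain dict instead of `os.environ` (for example `load_settings({"API_DOCS_DEFAULT_LIMIT": "5"})`) makes the behaviour easy to check without touching the real environment. Which strings count as "true" is an assumption of this sketch.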

YAML Config

You can also use a config.yaml file:

index_path: ./docs
model_name: all-MiniLM-L6-v2
embedding_device: cpu
chroma_persist_dir: .api-docs/chroma
default_limit: 10
verbose: false

Commands

index

Index documentation files for search.

api-docs index <path> [OPTIONS]

Options:
  --type TEXT     Type of docs: openapi, readme, code, or all (default: all)
  --recursive     Recursively search directories
  --batch-size    Documents per batch (default: 32)
  --help          Show this message and exit
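
To illustrate what `--batch-size` controls, the sketch below groups documents into fixed-size batches before they would be embedded. The `batched` helper is a hypothetical name used for illustration only; the actual indexer may batch differently.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # The final batch may be smaller than batch_size.
        yield batch
```

Smaller batches keep fewer documents in memory at once, at the cost of more calls to the embedding model.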

search

Search indexed documentation.

api-docs search <query> [OPTIONS]

Options:
  --limit INTEGER    Maximum results (default: 10)
  --type TEXT        Filter by source type
  --json             Output as JSON
  --help             Show this message and exit
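
The following sketch shows, in simplified form, how `--limit` and `--type` could shape a semantic search result: candidates are filtered by source type, scored by cosine similarity against the query vector, and truncated to the limit. It uses plain Python lists and hand-written vectors rather than a real embedding model or ChromaDB, and the `id`, `type`, and `vector` keys are assumptions made for this example.

```python
import math
from typing import Dict, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(
    query_vec: Sequence[float],
    docs: List[Dict],
    limit: int = 10,
    source_type: Optional[str] = None,
) -> List[Tuple[float, str]]:
    """Return up to `limit` (score, id) pairs, best match first."""
    candidates = [d for d in docs if source_type is None or d["type"] == source_type]
    scored = [(cosine(query_vec, d["vector"]), d["id"]) for d in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

In the real tool the vectors would come from the configured embedding model, and the nearest-neighbour lookup would presumably be delegated to the vector store rather than computed in a Python loop.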

list

List indexed documents.

api-docs list [OPTIONS]

Options:
  --type TEXT    Filter by source type
  --json         Output as JSON
  --help         Show this message and exit

config

Manage configuration.

api-docs config [OPTIONS]

Options:
  --show    Show current configuration
  --reset   Reset to defaults
  --help    Show this message and exit

interactive

Enter interactive search mode with history and navigation.

api-docs interactive

Supported Formats

OpenAPI/Swagger

Parses OpenAPI 3.0+ and Swagger 2.0 specifications in YAML or JSON format. Extracts:

  • Endpoint paths and methods
  • Operation IDs and summaries
  • Parameters and request bodies
  • Response schemas
  • Tags and descriptions
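
As a minimal sketch of this kind of extraction, the function below walks the `paths` object of an already-parsed spec (a dict loaded from JSON or YAML) and collects one record per operation. The function name and the keys of the returned records are assumptions for illustration; the project's `openapi.py` may structure things differently.

```python
from typing import Any, Dict, List

# HTTP methods that may appear as keys of a path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def extract_operations(spec: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Collect path, method, operation ID, summary and tags for each operation."""
    operations = []
    for path, item in spec.get("paths", {}).items():
        if not isinstance(item, dict):
            continue  # tolerate malformed path items instead of failing
        for method, op in item.items():
            if method.lower() not in HTTP_METHODS or not isinstance(op, dict):
                continue  # skip path-level keys such as "parameters"
            operations.append({
                "path": path,
                "method": method.upper(),
                "operation_id": op.get("operationId"),
                "summary": op.get("summary", ""),
                "tags": op.get("tags", []),
            })
    return operations
```

Each record could then be turned into a short text snippet (for example method, path, and summary joined together) before being embedded.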

README Files

Parses Markdown files including:

  • Section headers
  • Paragraph text
  • Code blocks
  • Lists and tables
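
One plausible way to turn a Markdown file into searchable chunks is to split it at section headers. The sketch below does this with a regular expression for ATX headings (`#` through `######`) and ignores heading-like lines inside fenced code blocks. It is an illustrative assumption, not the project's actual parser, and it does not handle Setext (underlined) headings.

```python
import re
from typing import List, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*#*\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def split_sections(markdown: str) -> List[Tuple[str, str]]:
    """Return (title, body) pairs; text before the first heading gets title ''."""
    sections: List[Tuple[str, str]] = []
    title, body = "", []
    in_fence = False

    def flush() -> None:
        # Keep a section if it has a title or any non-blank body text.
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2), []
        else:
            body.append(line)
    flush()
    return sections
```

Keeping code blocks inside the section body means a query can match on example commands as well as on prose.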

Code Comments

Supports Python, JavaScript, and TypeScript files:

  • Google-style docstrings
  • JSDoc comments
  • Function and class documentation
  • Inline comments (configurable)
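
For Python sources, docstrings can be collected with the standard `ast` module without importing or executing the file. The sketch below is a minimal illustration under that assumption; `extract_docstrings` is a hypothetical name, and JSDoc extraction for JavaScript and TypeScript would need a different approach.

```python
import ast
from typing import Dict


def extract_docstrings(source: str) -> Dict[str, str]:
    """Map module, function and class names to their cleaned docstrings."""
    tree = ast.parse(source)
    docs: Dict[str, str] = {}
    module_doc = ast.get_docstring(tree)
    if module_doc:
        docs["<module>"] = module_doc
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)
            if doc:
                # Names are not qualified, so same-named methods overwrite each other.
                docs[node.name] = doc
    return docs
```

`ast.get_docstring` strips the common indentation, so Google-style sections such as `Args:` keep their relative layout in the extracted text.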

Interactive Mode

Interactive mode provides:

  • Real-time search as you type
  • Arrow key navigation through history
  • Result pagination
  • Syntax highlighting
  • Source attribution

Keyboard shortcuts:

  • ↑/↓: Navigate history
  • Enter: Submit search
  • Tab: Complete suggestion
  • Ctrl+C: Exit
  • F1: Toggle help

Architecture

local-api-docs-search/
├── src/
│   ├── main.py           # CLI entry point
│   ├── cli/
│   │   ├── commands.py   # CLI command definitions
│   │   └── interactive.py # Interactive mode
│   ├── indexer/
│   │   ├── base.py       # Base indexer interface
│   │   ├── openapi.py    # OpenAPI spec indexer
│   │   ├── readme.py     # README file indexer
│   │   └── code.py       # Code comment indexer
│   ├── search/
│   │   ├── embeddings.py # Embedding model management
│   │   ├── vectorstore.py # ChromaDB operations
│   │   └── searcher.py   # Search logic
│   ├── models/
│   │   └── document.py   # Document models
│   └── utils/
│       ├── config.py     # Configuration management
│       └── formatters.py # Output formatting
├── tests/
├── pyproject.toml
└── README.md

Development

# Run tests
pytest tests/ -v --cov=src

# Run linting
black src/ tests/
ruff check src/ tests/

# Type checking
mypy src/

Error Handling

Error                           Solution
No embedding model found        Model downloads automatically on first use. Use --force-download to re-download
Invalid OpenAPI spec            Validation errors are logged with line numbers; valid parts are still indexed
ChromaDB collection not found   Collection is created automatically; prompt to index if empty
No documents indexed            Run api-docs index first with appropriate path
MemoryError with large docs     Use batch processing; consider quantized embeddings

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests and linting
  5. Submit a pull request

License

MIT License - see LICENSE file for details.
