Local API Docs Search
A CLI tool that indexes local API documentation (OpenAPI specs, code comments, README files) and enables natural language semantic search using local embedding models. Developers can query their project's API docs offline without external API calls.
Features
- Semantic Search: Natural language queries using local embedding models
- Offline Processing: Indexing and search run locally with no external API calls (the embedding model is downloaded once, on first use)
- Multiple Formats: Support for OpenAPI/Swagger specs, Markdown READMEs, and code comments
- Interactive Mode: Rich CLI interface with history and navigation
- Configurable: Easy configuration via YAML or environment variables
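Semantic search of this kind generally comes down to comparing a query vector against stored document vectors. The sketch below illustrates that ranking step with hand-written toy vectors in place of real model output; `cosine_similarity`, `rank`, and the sample data are illustrative names, not the tool's actual code.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def rank(query_vec, doc_vecs, limit=10):
    """Return (doc_id, score) pairs, most similar first."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy 3-dimensional vectors (assumption: stand-ins for real embeddings;
# all-MiniLM-L6-v2 produces 384 dimensions).
docs = {
    "auth.md": [0.9, 0.1, 0.0],
    "billing.md": [0.0, 0.2, 0.9],
}
results = rank([1.0, 0.0, 0.0], docs, limit=1)
print(results[0][0])
```

In practice a vector store such as ChromaDB performs this nearest-neighbour lookup, so the loop above is only a way to reason about what a result score means.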
Installation
# Clone the repository
git clone <repository-url>
cd local-api-docs-search
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e .
# Or install with dev dependencies
pip install -e ".[dev]"
Quick Start
- Configure your environment:
cp .env.example .env
# Edit .env to set your documentation path
- Index your documentation:
# Index all supported formats
api-docs index ./docs
# Index specific format
api-docs index ./docs --type openapi
api-docs index ./docs --type readme
api-docs index ./docs --type code
# Recursive indexing
api-docs index ./docs --recursive
- Search your documentation:
# Simple search
api-docs search "how to authenticate users"
# With limit
api-docs search "authentication" --limit 5
# Filter by type
api-docs search "endpoints" --type openapi
- Interactive mode:
api-docs interactive
Configuration
Environment Variables
| Variable | Default | Description |
|---|---|---|
| API_DOCS_INDEX_PATH | ./docs | Directory containing docs to index |
| API_DOCS_MODEL_NAME | all-MiniLM-L6-v2 | Embedding model to use |
| API_DOCS_EMBEDDING_DEVICE | cpu | Device for embeddings (cpu/cuda/auto) |
| API_DOCS_CHROMA_PERSIST_DIR | .api-docs/chroma | ChromaDB storage directory |
| API_DOCS_DEFAULT_LIMIT | 10 | Default result limit |
| API_DOCS_VERBOSE | false | Enable verbose logging |
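To make the defaults concrete, here is a minimal sketch of how these variables could be read with fallbacks. The `Settings` dataclass and `load_settings` function are hypothetical names, and the set of strings accepted as "true" is an assumption rather than documented behavior.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    # Defaults mirror the table above.
    index_path: str = "./docs"
    model_name: str = "all-MiniLM-L6-v2"
    embedding_device: str = "cpu"
    chroma_persist_dir: str = ".api-docs/chroma"
    default_limit: int = 10
    verbose: bool = False


# Assumption: which strings count as "true" is not specified by the tool.
TRUTHY = {"1", "true", "yes", "on"}


def load_settings(environ=None):
    """Build Settings from environment variables, falling back to defaults."""
    env = os.environ if environ is None else environ
    d = Settings()
    return Settings(
        index_path=env.get("API_DOCS_INDEX_PATH", d.index_path),
        model_name=env.get("API_DOCS_MODEL_NAME", d.model_name),
        embedding_device=env.get("API_DOCS_EMBEDDING_DEVICE", d.embedding_device),
        chroma_persist_dir=env.get("API_DOCS_CHROMA_PERSIST_DIR", d.chroma_persist_dir),
        default_limit=int(env.get("API_DOCS_DEFAULT_LIMIT", d.default_limit)),
        verbose=str(env.get("API_DOCS_VERBOSE", "false")).strip().lower() in TRUTHY,
    )


# Passing a dict instead of os.environ keeps the example reproducible.
settings = load_settings({"API_DOCS_DEFAULT_LIMIT": "5", "API_DOCS_VERBOSE": "true"})
print(settings.default_limit, settings.verbose, settings.index_path)
```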
YAML Config
You can also use a config.yaml file:
index_path: ./docs
model_name: all-MiniLM-L6-v2
embedding_device: cpu
chroma_persist_dir: .api-docs/chroma
default_limit: 10
verbose: false
Commands
index
Index documentation files for search.
api-docs index <path> [OPTIONS]
Options:
--type TEXT Type of docs: openapi, readme, code, or all (default: all)
--recursive Recursively search directories
--batch-size Documents per batch (default: 32)
--help Show this message and exit
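The --batch-size option implies that documents are embedded in fixed-size groups rather than all at once. A minimal sketch of that grouping follows; the `batched` helper is a hypothetical name, not the tool's internal function.

```python
from itertools import islice


def batched(items, batch_size=32):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 70 placeholder documents split with the documented default of 32.
sizes = [len(batch) for batch in batched(range(70), batch_size=32)]
print(sizes)
```

Smaller batches lower peak memory use at the cost of more embedding calls, which is the usual trade-off when indexing large documentation sets.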
search
Search indexed documentation.
api-docs search <query> [OPTIONS]
Options:
--limit INTEGER Maximum results (default: 10)
--type TEXT Filter by source type
--json Output as JSON
--help Show this message and exit
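Because search accepts --json, it can be driven from a script. The sketch below assembles the command line from the options listed above and parses the output; `build_search_command` and `run_search` are hypothetical helpers, `api-docs` is assumed to be on PATH, and the JSON schema is not documented here, so the parsed value is returned unchanged.

```python
import json
import subprocess


def build_search_command(query, limit=None, source_type=None, as_json=True):
    """Assemble the argv list for `api-docs search`."""
    cmd = ["api-docs", "search", query]
    if limit is not None:
        cmd += ["--limit", str(limit)]
    if source_type is not None:
        cmd += ["--type", source_type]
    if as_json:
        cmd.append("--json")
    return cmd


def run_search(query, **options):
    """Run the CLI and parse stdout as JSON (schema not assumed)."""
    completed = subprocess.run(
        build_search_command(query, **options),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return json.loads(completed.stdout)


command = build_search_command("authentication", limit=5, source_type="openapi")
print(" ".join(command))
```

Passing the arguments as a list rather than a shell string avoids quoting problems when a query contains spaces or quotes.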
list
List indexed documents.
api-docs list [OPTIONS]
Options:
--type TEXT Filter by source type
--json Output as JSON
--help Show this message and exit
config
Manage configuration.
api-docs config [OPTIONS]
Options:
--show Show current configuration
--reset Reset to defaults
--help Show this message and exit
interactive
Enter interactive search mode with history and navigation.
api-docs interactive
Supported Formats
OpenAPI/Swagger
Parses OpenAPI 3.0+ and Swagger 2.0 specifications in YAML or JSON format. Extracts:
- Endpoint paths and methods
- Operation IDs and summaries
- Parameters and request bodies
- Response schemas
- Tags and descriptions
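As a rough illustration of the extraction listed above, the following sketch walks the `paths` object of an already-parsed spec and collects one record per operation. The `iter_operations` function and the record layout are assumptions for illustration; only the field names (`paths`, `operationId`, `summary`, `tags`) come from the OpenAPI specification.

```python
# HTTP method keys allowed on an OpenAPI 3.0 path item.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def iter_operations(spec):
    """Yield one dict per operation in a parsed OpenAPI/Swagger document."""
    for path, path_item in spec.get("paths", {}).items():
        for method, operation in path_item.items():
            if method.lower() not in HTTP_METHODS:
                continue  # skip path-level keys such as "parameters"
            yield {
                "path": path,
                "method": method.upper(),
                "operation_id": operation.get("operationId"),
                "summary": operation.get("summary", ""),
                "tags": operation.get("tags", []),
            }


# Hand-written sample spec, already loaded into a dict.
spec = {
    "openapi": "3.0.0",
    "paths": {
        "/users": {
            "parameters": [],
            "get": {"operationId": "listUsers", "summary": "List users", "tags": ["users"]},
            "post": {"operationId": "createUser", "summary": "Create a user"},
        }
    },
}
operations = list(iter_operations(spec))
print([(op["method"], op["path"]) for op in operations])
```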
README Files
Parses Markdown files including:
- Section headers
- Paragraph text
- Code blocks
- Lists and tables
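One plausible way to turn a Markdown file into searchable chunks is to split it at section headers. The sketch below does that while ignoring `#` lines inside fenced code blocks; `split_sections` is a hypothetical helper, and it handles only ATX-style (`#`) headings.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.+?)\s*$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


def split_sections(markdown_text):
    """Split Markdown into sections keyed by their ATX headings."""
    sections = []
    current = {"title": None, "level": 0, "body": []}
    in_fence = False
    for line in markdown_text.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            if current["title"] is not None or current["body"]:
                sections.append(current)
            current = {"title": match.group(2), "level": len(match.group(1)), "body": []}
        else:
            current["body"].append(line)
    if current["title"] is not None or current["body"]:
        sections.append(current)
    return [{**s, "body": "\n".join(s["body"]).strip()} for s in sections]


text = "\n".join(
    ["# Title", "intro", "## Usage", FENCE, "# not a heading", FENCE]
)
sections = split_sections(text)
print([(s["level"], s["title"]) for s in sections])
```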
Code Comments
Supports Python, JavaScript, and TypeScript files:
- Google-style docstrings
- JSDoc comments
- Function and class documentation
- Inline comments (configurable)
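For the Python case, docstrings can be collected with the standard library's `ast` module. This is a minimal sketch under that assumption; `extract_docstrings` is a hypothetical name, and JavaScript/TypeScript JSDoc comments would need a separate parser that is not shown.

```python
import ast


def extract_docstrings(source):
    """Return (name, docstring) pairs for a module and its functions/classes."""
    tree = ast.parse(source)
    found = []
    module_doc = ast.get_docstring(tree)
    if module_doc:
        found.append(("<module>", module_doc))
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)  # also strips common indentation
            if doc:
                found.append((node.name, doc))
    return found


sample = '''
def login(user, password):
    """Authenticate a user.

    Args:
        user: Account name.
        password: Plain-text password.
    """
    return True


def helper():
    return None
'''
docstrings = extract_docstrings(sample)
print([name for name, _ in docstrings])
```

Parsing with `ast` reads the source without importing or executing it, which matters when indexing code that has side effects at import time.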
Interactive Mode
Interactive mode provides:
- Real-time search as you type
- Arrow key navigation through history
- Result pagination
- Syntax highlighting
- Source attribution
Keyboard shortcuts:
- ↑/↓: Navigate history
- Enter: Submit search
- Tab: Complete suggestion
- Ctrl+C: Exit
- F1: Toggle help
Architecture
local-api-docs-search/
├── src/
│ ├── main.py # CLI entry point
│ ├── cli/
│ │ ├── commands.py # CLI command definitions
│ │ └── interactive.py # Interactive mode
│ ├── indexer/
│ │ ├── base.py # Base indexer interface
│ │ ├── openapi.py # OpenAPI spec indexer
│ │ ├── readme.py # README file indexer
│ │ └── code.py # Code comment indexer
│ ├── search/
│ │ ├── embeddings.py # Embedding model management
│ │ ├── vectorstore.py # ChromaDB operations
│ │ └── searcher.py # Search logic
│ ├── models/
│ │ └── document.py # Document models
│ └── utils/
│ ├── config.py # Configuration management
│ └── formatters.py # Output formatting
├── tests/
├── pyproject.toml
└── README.md
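The layout suggests that each format-specific indexer implements a shared interface from indexer/base.py. The sketch below shows one way such an interface could look; the class names, attributes, and the `Document` fields are all assumptions for illustration and are not taken from the project's source.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class Document:
    # Hypothetical document model; the real fields may differ.
    content: str
    source: str
    source_type: str
    metadata: dict = field(default_factory=dict)


class BaseIndexer(ABC):
    """Hypothetical common interface for format-specific indexers."""

    source_type = "unknown"
    extensions = ()

    def supports(self, path):
        """True if the file extension belongs to this indexer."""
        return Path(path).suffix.lower() in self.extensions

    @abstractmethod
    def parse(self, path, text):
        """Return a list of Document objects extracted from text."""


class ReadmeIndexer(BaseIndexer):
    source_type = "readme"
    extensions = (".md",)

    def parse(self, path, text):
        # Simplest possible behavior: one document per file.
        return [Document(content=text, source=str(path), source_type=self.source_type)]


indexer = ReadmeIndexer()
documents = indexer.parse("README.md", "# Hello")
print(indexer.supports("README.md"), documents[0].source_type)
```

Keeping parsing behind one interface means the search layer only ever sees `Document` objects, regardless of which format produced them.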
Development
# Run tests
pytest tests/ -v --cov=src
# Run linting
black src/ tests/
ruff check src/ tests/
# Type checking
mypy src/
Error Handling
| Error | Solution |
|---|---|
| No embedding model found | Model downloads automatically on first use. Use --force-download to re-download |
| Invalid OpenAPI spec | Validation errors are logged with line numbers; valid parts are still indexed |
| ChromaDB collection not found | Collection is created automatically; prompt to index if empty |
| No documents indexed | Run api-docs index <path> first, pointing at the directory that holds your documentation |
| MemoryError with large docs | Index in smaller batches by lowering --batch-size (default: 32); consider quantized embeddings |
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests and linting
- Submit a pull request
License
MIT License - see LICENSE file for details.