# Local API Docs Search
A CLI tool that indexes local API documentation (OpenAPI specs, code comments, README files) and enables natural language semantic search using local embedding models. Developers can query their project's API docs offline without external API calls.
## Features
- **Semantic Search**: Natural language queries using local embedding models
- **Offline Processing**: All processing happens locally - no external API calls
- **Multiple Formats**: Support for OpenAPI/Swagger specs, Markdown READMEs, and code comments
- **Interactive Mode**: Rich CLI interface with history and navigation
- **Configurable**: Easy configuration via YAML or environment variables
## Installation
```bash
# Clone the repository
git clone <repository-url>
cd local-api-docs-search
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -e .
# Or install with dev dependencies
pip install -e ".[dev]"
```
## Quick Start
1. **Configure your environment**:
```bash
cp .env.example .env
# Edit .env to set your documentation path
```
2. **Index your documentation**:
```bash
# Index all supported formats
api-docs index ./docs
# Index specific format
api-docs index ./docs --type openapi
api-docs index ./docs --type readme
api-docs index ./docs --type code
# Recursive indexing
api-docs index ./docs --recursive
```
3. **Search your documentation**:
```bash
# Simple search
api-docs search "how to authenticate users"
# With limit
api-docs search "authentication" --limit 5
# Filter by type
api-docs search "endpoints" --type openapi
```
4. **Interactive mode**:
```bash
api-docs interactive
```
## Configuration
### Environment Variables
| Variable | Default | Description |
|----------|---------|-------------|
| `API_DOCS_INDEX_PATH` | `./docs` | Directory containing docs to index |
| `API_DOCS_MODEL_NAME` | `all-MiniLM-L6-v2` | Embedding model to use |
| `API_DOCS_EMBEDDING_DEVICE` | `cpu` | Device for embeddings (cpu/cuda/auto) |
| `API_DOCS_CHROMA_PERSIST_DIR` | `.api-docs/chroma` | ChromaDB storage directory |
| `API_DOCS_DEFAULT_LIMIT` | `10` | Default result limit |
| `API_DOCS_VERBOSE` | `false` | Enable verbose logging |
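As an illustration of how the variables above might map onto settings, here is a minimal sketch using only the standard library. The `Settings` class and `load_settings` function are hypothetical names, not the tool's actual code, and the set of values treated as "true" for `API_DOCS_VERBOSE` is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical settings container; defaults mirror the table above."""
    index_path: str = "./docs"
    model_name: str = "all-MiniLM-L6-v2"
    embedding_device: str = "cpu"
    chroma_persist_dir: str = ".api-docs/chroma"
    default_limit: int = 10
    verbose: bool = False


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read API_DOCS_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    d = Settings()
    return Settings(
        index_path=env.get("API_DOCS_INDEX_PATH", d.index_path),
        model_name=env.get("API_DOCS_MODEL_NAME", d.model_name),
        embedding_device=env.get("API_DOCS_EMBEDDING_DEVICE", d.embedding_device),
        chroma_persist_dir=env.get("API_DOCS_CHROMA_PERSIST_DIR", d.chroma_persist_dir),
        default_limit=int(env.get("API_DOCS_DEFAULT_LIMIT", d.default_limit)),
        # Assumption: "1", "true" and "yes" (any case) enable verbose logging.
        verbose=env.get("API_DOCS_VERBOSE", "false").strip().lower() in {"1", "true", "yes"},
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to test, for example `load_settings({"API_DOCS_DEFAULT_LIMIT": "5"})`.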
### YAML Config
You can also use a `config.yaml` file:
```yaml
index_path: ./docs
model_name: all-MiniLM-L6-v2
embedding_device: cpu
chroma_persist_dir: .api-docs/chroma
default_limit: 10
verbose: false
```
## Commands
### index
Index documentation files for search.
```bash
api-docs index <path> [OPTIONS]
Options:
--type TEXT Type of docs: openapi, readme, code, or all (default: all)
--recursive Recursively search directories
--batch-size INTEGER Documents per batch (default: 32)
--help Show this message and exit
```
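The `--batch-size` option suggests that documents are embedded in fixed-size groups. A minimal sketch of that grouping follows; the `batched` helper is a hypothetical name, and how the tool actually batches is not shown in this README.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 32) -> Iterator[List[T]]:
    """Yield consecutive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


chunks = [f"doc-{i}" for i in range(70)]
print([len(b) for b in batched(chunks)])  # [32, 32, 6]
```

Smaller batches lower peak memory at the cost of more embedding calls.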
### search
Search indexed documentation.
```bash
api-docs search <query> [OPTIONS]
Options:
--limit INTEGER Maximum results (default: 10)
--type TEXT Filter by source type
--json Output as JSON
--help Show this message and exit
```
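Conceptually, semantic search ranks indexed chunks by how close their embedding vectors are to the query's vector. The sketch below shows that ranking with cosine similarity over hand-written toy vectors. The vectors, document IDs, and the `rank` function are illustrative assumptions; in the real tool the embedding model produces the vectors and ChromaDB performs the lookup, possibly with a different distance metric.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec: Sequence[float],
         indexed: Dict[str, Sequence[float]],
         limit: int = 10) -> List[Tuple[str, float]]:
    """Return up to `limit` (doc_id, score) pairs, best match first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]


# Toy 3-dimensional "embeddings"; real models emit hundreds of dimensions.
indexed = {
    "auth.md#login": [0.9, 0.1, 0.0],
    "orders.yaml#GET /orders": [0.1, 0.9, 0.2],
    "utils.py#slugify": [0.0, 0.2, 0.9],
}
for doc_id, score in rank([1.0, 0.2, 0.0], indexed, limit=2):
    print(f"{score:.3f}  {doc_id}")
```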
### list
List indexed documents.
```bash
api-docs list [OPTIONS]
Options:
--type TEXT Filter by source type
--json Output as JSON
--help Show this message and exit
```
### config
Manage configuration.
```bash
api-docs config [OPTIONS]
Options:
--show Show current configuration
--reset Reset to defaults
--help Show this message and exit
```
### interactive
Enter interactive search mode with history and navigation.
```bash
api-docs interactive
```
## Supported Formats
### OpenAPI/Swagger
Parses OpenAPI 3.0+ and Swagger 2.0 specifications in YAML or JSON format. Extracts:
- Endpoint paths and methods
- Operation IDs and summaries
- Parameters and request bodies
- Response schemas
- Tags and descriptions
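To make the extraction concrete, here is a minimal sketch that walks the `paths` object of an already-parsed spec (a plain dict, as produced by a JSON or YAML loader) and yields one record per operation. The `iter_operations` function and the record keys are hypothetical; the tool's own indexer may extract more fields or structure them differently.

```python
from typing import Any, Dict, Iterator

# Operation keys allowed on a path item in OpenAPI 3.0 (Swagger 2.0 lacks "trace").
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def iter_operations(spec: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield path, method, operationId, summary and tags for each operation."""
    for path, path_item in (spec.get("paths") or {}).items():
        if not isinstance(path_item, dict):
            continue
        for method, operation in path_item.items():
            # Path items also carry non-operation keys such as "parameters".
            if method.lower() not in HTTP_METHODS or not isinstance(operation, dict):
                continue
            yield {
                "path": path,
                "method": method.upper(),
                "operation_id": operation.get("operationId"),
                "summary": operation.get("summary", ""),
                "tags": operation.get("tags", []),
            }


spec = {
    "openapi": "3.0.3",
    "paths": {
        "/users": {
            "parameters": [],
            "get": {"operationId": "listUsers", "summary": "List users", "tags": ["users"]},
            "post": {"operationId": "createUser"},
        }
    },
}
for op in iter_operations(spec):
    print(op["method"], op["path"], op["operation_id"])
```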
### README Files
Parses Markdown files including:
- Section headers
- Paragraph text
- Code blocks
- Lists and tables
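One plausible way to turn a Markdown file into searchable chunks is to split it at section headers. The sketch below does this for ATX-style (`#`) headers while ignoring `#` lines inside fenced code blocks. `split_sections` is a hypothetical helper and only an approximation of Markdown parsing, not the tool's actual indexer.

```python
import re
from typing import List, Tuple

HEADING = re.compile(r"^(#{1,6})\s+(.*?)\s*$")
FENCE = "`" * 3  # marker that opens or closes a fenced code block


def split_sections(markdown: str) -> List[Tuple[str, str]]:
    """Return (heading, body) pairs; text before the first heading gets ''."""
    sections: List[Tuple[str, str]] = []
    title, body, in_fence = "", [], False

    def flush() -> None:
        if title or any(line.strip() for line in body):
            sections.append((title, "\n".join(body).strip()))

    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            flush()
            title, body = match.group(2), []
        else:
            body.append(line)
    flush()
    return sections


doc = "\n".join([
    "# Title", "Intro text.",
    "## Usage", FENCE + "bash", "# not a heading", FENCE, "More.",
])
for heading, text in split_sections(doc):
    print(repr(heading), len(text))
```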
### Code Comments
Supports Python, JavaScript, and TypeScript files:
- Google-style docstrings
- JSDoc comments
- Function and class documentation
- Inline comments (configurable)
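For the Python case, docstrings can be collected with the standard-library `ast` module, without importing or executing the file. The following is a minimal sketch under that assumption; `extract_docstrings` is a hypothetical name, and JSDoc extraction for JavaScript and TypeScript would need a separate approach.

```python
import ast
from typing import Dict


def extract_docstrings(source: str) -> Dict[str, str]:
    """Map function, class and module names to their cleaned docstrings.

    Note: keys are bare names, so same-named methods in different classes
    overwrite each other in this simplified sketch.
    """
    tree = ast.parse(source)
    found: Dict[str, str] = {}
    module_doc = ast.get_docstring(tree)
    if module_doc:
        found["<module>"] = module_doc
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            doc = ast.get_docstring(node)
            if doc:
                found[node.name] = doc
    return found


SAMPLE = '''
def login(user, password):
    """Authenticate a user.

    Args:
        user: Account name.
        password: Plain-text password.

    Returns:
        A session token.
    """
    return "token"
'''
print(extract_docstrings(SAMPLE)["login"].splitlines()[0])
```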
## Interactive Mode
Interactive mode provides:
- Real-time search as you type
- Arrow key navigation through history
- Result pagination
- Syntax highlighting
- Source attribution
Keyboard shortcuts:
- `↑` / `↓`: Navigate history
- `Enter`: Submit search
- `Tab`: Complete suggestion
- `Ctrl+C`: Exit
- `F1`: Toggle help
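History navigation of this kind can be modeled as a list of past queries plus a cursor. The sketch below is a hypothetical `History` class, not the tool's implementation; it assumes that consecutive duplicate queries are stored once and that moving past the newest entry returns an empty line.

```python
from typing import List


class History:
    """Query history with an up/down cursor (illustrative only)."""

    def __init__(self) -> None:
        self._entries: List[str] = []
        self._cursor = 0  # equals len(_entries) when on a fresh, empty line

    def add(self, query: str) -> None:
        """Store a submitted query and reset the cursor to the fresh line."""
        query = query.strip()
        if query and (not self._entries or self._entries[-1] != query):
            self._entries.append(query)
        self._cursor = len(self._entries)

    def previous(self) -> str:
        """Move towards older entries (up arrow); stops at the oldest."""
        if self._cursor > 0:
            self._cursor -= 1
        return self._entries[self._cursor] if self._entries else ""

    def next(self) -> str:
        """Move towards newer entries (down arrow); '' past the newest."""
        if self._cursor < len(self._entries):
            self._cursor += 1
        if self._cursor < len(self._entries):
            return self._entries[self._cursor]
        return ""


history = History()
history.add("authentication")
history.add("endpoints")
print(history.previous())  # endpoints
print(history.previous())  # authentication
```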
## Architecture
```
local-api-docs-search/
├── src/
│ ├── main.py # CLI entry point
│ ├── cli/
│ │ ├── commands.py # CLI command definitions
│ │ └── interactive.py # Interactive mode
│ ├── indexer/
│ │ ├── base.py # Base indexer interface
│ │ ├── openapi.py # OpenAPI spec indexer
│ │ ├── readme.py # README file indexer
│ │ └── code.py # Code comment indexer
│ ├── search/
│ │ ├── embeddings.py # Embedding model management
│ │ ├── vectorstore.py # ChromaDB operations
│ │ └── searcher.py # Search logic
│ ├── models/
│ │ └── document.py # Document models
│ └── utils/
│ ├── config.py # Configuration management
│ └── formatters.py # Output formatting
├── tests/
├── pyproject.toml
└── README.md
```
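The layout implies that each format-specific indexer builds on a shared interface in `indexer/base.py`. A rough sketch of what such an interface could look like is given below. The class names, fields, and method signatures are all assumptions made for illustration; the actual definitions live in the repository.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List, Tuple


@dataclass
class Document:
    """Hypothetical unit of indexed content."""
    content: str
    source_path: str
    source_type: str
    metadata: Dict[str, str] = field(default_factory=dict)


class BaseIndexer(ABC):
    """Hypothetical base class: subclasses declare what they can parse."""
    source_type: str = "unknown"
    extensions: Tuple[str, ...] = ()

    def can_handle(self, path: Path) -> bool:
        """True if the file extension belongs to this indexer."""
        return path.suffix.lower() in self.extensions

    @abstractmethod
    def parse(self, path: Path, text: str) -> List[Document]:
        """Turn the text of one file into documents ready for embedding."""


class ReadmeIndexer(BaseIndexer):
    """Simplest possible subclass: one document per Markdown file."""
    source_type = "readme"
    extensions = (".md", ".markdown")

    def parse(self, path: Path, text: str) -> List[Document]:
        return [Document(content=text.strip(),
                         source_path=str(path),
                         source_type=self.source_type)]


indexer = ReadmeIndexer()
print(indexer.can_handle(Path("README.md")), indexer.can_handle(Path("spec.yaml")))
```

Keeping `parse` independent of file I/O means each indexer can be tested with in-memory strings.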
## Development
```bash
# Run tests
pytest tests/ -v --cov=src
# Run linting
black src/ tests/
ruff check src/ tests/
# Type checking
mypy src/
```
## Error Handling
| Error | Solution |
|-------|----------|
| No embedding model found | Model downloads automatically on first use. Use `--force-download` to re-download |
| Invalid OpenAPI spec | Validation errors are logged with line numbers; valid parts are still indexed |
| ChromaDB collection not found | The collection is created automatically; if it is empty, you are prompted to index first |
| No documents indexed | Run `api-docs index <path>` first, pointing at your documentation directory |
| MemoryError with large docs | Index in smaller batches (see `--batch-size`); consider quantized embeddings |
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run tests and linting
5. Submit a pull request
## License
MIT License - see LICENSE file for details.