diff --git a/README.md b/README.md
index 8a62d4a..38bf47c 100644
--- a/README.md
+++ b/README.md
@@ -1,3 +1,166 @@
-# codechunk-cli
+# CodeChunk CLI
-A CLI tool that analyzes codebases and generates optimized context bundles for local LLMs like Ollama and LM Studio
\ No newline at end of file
+A CLI tool that intelligently analyzes your codebase and generates optimized context bundles for local LLMs. It chunks code files by function/module, removes boilerplate, summarizes complex logic, and creates a single context-ready file that fits within local LLM context limits.
+
+![CI](https://7000pct.gitea.bloupla.net/7000pctAUTO/codechunk-cli/actions/workflows/ci.yml/badge.svg)
+![Version](https://img.shields.io/pypi/v/codechunk-cli)
+![Python](https://img.shields.io/pypi/pyversions/codechunk-cli)
+![License](https://img.shields.io/pypi/l/codechunk-cli)
+
+## Features
+
+- **Intelligent Code Chunking** - Chunk code by function/module boundaries using AST parsing
+- **Boilerplate Removal** - Strip unnecessary code while preserving logic
+- **Dependency Analysis** - Track imports and include only the dependencies a chunk actually needs
+- **Context Estimation** - Estimate token counts and warn when approaching limits
+- **Multiple Output Formats** - Support for Ollama, LM Studio, and generic LLM input formats
+- **Configurable Prioritization** - Weighting rules to prioritize important code
+
+## Installation
+
+```bash
+pip install codechunk-cli
+```
+
+Or from source:
+
+```bash
+git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/codechunk-cli.git
+cd codechunk-cli
+pip install -e .
+```
+
+## Quick Start
+
+```bash
+# Generate context from current directory
+codechunk generate --output context.md
+
+# Analyze codebase structure
+codechunk analyze
+
+# Generate with specific format
+codechunk generate --format ollama --max-tokens 4096
+
+# Use custom config file
+codechunk generate --config my-config.yaml
+```
+
+## Usage
+
+### Generate Context
+
+Create an optimized context file from your codebase:
+
+```bash
+codechunk generate [OPTIONS]
+```
+
+Options:
+- `--output, -o FILE` - Output file path (default: stdout)
+- `--format FORMAT` - Output format: ollama, lmstudio, markdown (default: markdown)
+- `--max-tokens N` - Maximum token limit (default: 8192)
+- `--include PATTERNS` - Include file patterns (default: `*.py`)
+- `--exclude PATTERNS` - Exclude file patterns
+- `--config FILE` - Configuration file path
+- `--verbose` - Verbose output
+
+### Analyze Codebase
+
+Analyze project structure and dependencies:
+
+```bash
+codechunk analyze [PATH]
+```
+
+## Configuration
+
+Create a `.codechunk.yaml` file for reusable settings:
+
+```yaml
+output: context.md
+format: markdown
+max_tokens: 8192
+include:
+  - "*.py"
+  - "*.js"
+  - "*.ts"
+exclude:
+  - "**/test_*.py"
+  - "**/node_modules/**"
+prioritization:
+  - pattern: "**/main.py"
+    weight: 2.0
+  - pattern: "**/core/**"
+    weight: 1.5
+```
+
+## Output Formats
+
+### Markdown
+````
+# File: src/main.py
+
+```python
+def main():
+    ...
+```
+````
+
+### Ollama
+```json
+{
+  "context": [
+    {"file": "src/main.py", "content": "..."}
+  ]
+}
+```
+
+### LM Studio
+```json
+{
+  "messages": [
+    {"role": "system", "content": "..."}
+  ]
+}
+```
+
+## Supported Languages
+
+- Python
+- JavaScript
+- TypeScript
+- Go
+- Rust
+
+## Architecture
+
+```
+codechunk/
+├── cli.py              # CLI interface
+├── config.py           # Configuration handling
+├── core/
+│   ├── chunking.py     # Code chunking logic
+│   ├── parser.py       # AST-based code parsing
+│   ├── summarizer.py   # Code summarization
+│   ├── formatter.py    # Output formatters
+│   └── dependency.py   # Dependency analysis
+└── utils/
+    ├── logger.py       # Logging utilities
+    └── file_utils.py   # File operations
+```
+
+## Contributing
+
+Contributions are welcome! Please read our contributing guidelines before submitting PRs.
+
+1. Fork the repository
+2. Create a feature branch
+3. Make your changes
+4. Run tests: `pytest tests/ -v`
+5. Run linting: `ruff check .`
+6. Submit a pull request
+
+## License
+
+MIT License - see LICENSE file for details.
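Reviewer note: the AST-based, function/module-boundary chunking this README describes can be illustrated in a few lines of Python. The sketch below uses only the standard-library `ast` module; the function name `chunk_by_definition` and the overall shape are hypothetical and are not CodeChunk's actual internals.

```python
# Illustrative sketch of function/module-boundary chunking via the stdlib
# ast module (hypothetical; not CodeChunk's internal API).
import ast


def chunk_by_definition(source: str) -> list[tuple[str, str]]:
    """Split Python source into (name, source_segment) chunks,
    one per top-level function, async function, or class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            # get_source_segment recovers the exact source text for a node
            chunks.append((node.name, ast.get_source_segment(source, node)))
    return chunks


if __name__ == "__main__":
    code = "def add(a, b):\n    return a + b\n\nclass Greeter:\n    pass\n"
    for name, segment in chunk_by_definition(code):
        print(name)  # add, Greeter
```

A real implementation would additionally handle non-Python languages (the README lists JavaScript, TypeScript, Go, and Rust) with language-specific parsers, but the per-definition chunk boundary shown here is the core idea.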