# CodeChunk CLI

![CI](https://7000pct.gitea.bloupla.net/7000pctAUTO/codechunk-cli/actions/workflows/ci.yml/badge.svg) ![Version](https://img.shields.io/pypi/v/codechunk-cli) ![Python](https://img.shields.io/pypi/pyversions/codechunk-cli) ![License](https://img.shields.io/pypi/l/codechunk-cli)

A CLI tool that intelligently analyzes your codebase and generates optimized context bundles for local LLMs. It chunks code files by function/module boundaries, removes boilerplate, summarizes complex logic, and produces a single context-ready file that fits within local LLM context limits.

## Features

- **Intelligent Code Chunking** - Chunk code by function/module boundaries using AST parsing
- **Boilerplate Removal** - Strip unnecessary code while preserving logic
- **Dependency Analysis** - Track imports and include only the necessary dependencies
- **Context Estimation** - Estimate token counts and warn when approaching limits
- **Multiple Output Formats** - Support for Ollama, LM Studio, and generic LLM inputs
- **Configurable Prioritization** - Rules to prioritize the most important code

## Installation

```bash
pip install codechunk-cli
```

Or from source:

```bash
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/codechunk-cli.git
cd codechunk-cli
pip install -e .
```
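To give a feel for the AST-based chunking mentioned in Features, here is a minimal sketch of the general technique using Python's standard-library `ast` module. This is an illustration only, not the actual `core/chunking.py` implementation; the function name `chunk_functions` and the chunk dictionary shape are made up for this example.

```python
import ast

def chunk_functions(source: str) -> list[dict]:
    """Split Python source into per-definition chunks via the stdlib AST.

    Illustrative sketch only -- the real chunker may differ.
    """
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "start": node.lineno,        # first line of the definition
                "end": node.end_lineno,      # last line (Python 3.8+)
                "code": ast.get_source_segment(source, node),
            })
    return chunks

example = "def greet(name):\n    return f'hello {name}'\n"
print(chunk_functions(example)[0]["name"])  # greet
```

Chunking at definition boundaries like this keeps each chunk syntactically complete, which is what makes the resulting context bundles safe to truncate or reorder.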
## Quick Start

```bash
# Generate context from the current directory
codechunk generate --output context.md

# Analyze the codebase structure
codechunk analyze

# Generate with a specific format and token budget
codechunk generate --format ollama --max-tokens 4096

# Use a custom config file
codechunk generate --config my-config.yaml
```

## Usage

### Generate Context

Create an optimized context file from your codebase:

```bash
codechunk generate [OPTIONS]
```

Options:

- `--output, -o FILE` - Output file path (default: stdout)
- `--format FORMAT` - Output format: `ollama`, `lmstudio`, `markdown` (default: `markdown`)
- `--max-tokens N` - Maximum token limit (default: 8192)
- `--include PATTERNS` - File patterns to include (default: `*.py`)
- `--exclude PATTERNS` - File patterns to exclude
- `--config FILE` - Configuration file path
- `--verbose` - Verbose output

### Analyze Codebase

Analyze project structure and dependencies:

```bash
codechunk analyze [PATH]
```

## Configuration

Create a `.codechunk.yaml` file for reusable settings:

```yaml
output: context.md
format: markdown
max_tokens: 8192
include:
  - "*.py"
  - "*.js"
  - "*.ts"
exclude:
  - "**/test_*.py"
  - "**/node_modules/**"
prioritization:
  - pattern: "**/main.py"
    weight: 2.0
  - pattern: "**/core/**"
    weight: 1.5
```

## Output Formats

### Markdown

````
# File: src/main.py

```python
def main():
    ...
```
````
### Ollama

```json
{
  "context": [
    {"file": "src/main.py", "content": "..."}
  ]
}
```

### LM Studio

```json
{
  "messages": [
    {"role": "system", "content": "..."}
  ]
}
```

## Supported Languages

- Python
- JavaScript
- TypeScript
- Go
- Rust

## Architecture

```
codechunk/
├── cli.py            # CLI interface
├── config.py         # Configuration handling
├── core/
│   ├── chunking.py   # Code chunking logic
│   ├── parser.py     # AST-based code parsing
│   ├── summarizer.py # Code summarization
│   ├── formatter.py  # Output formatters
│   └── dependency.py # Dependency analysis
└── utils/
    ├── logger.py     # Logging utilities
    └── file_utils.py # File operations
```

## Contributing

Contributions are welcome! Please read the contributing guidelines before submitting a PR.

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run the tests: `pytest tests/ -v`
5. Run the linter: `ruff check .`
6. Submit a pull request

## License

MIT License - see the LICENSE file for details.
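As a closing note on the context-estimation feature: a rough token count can be approximated with the common ~4-characters-per-token rule of thumb. The sketch below illustrates that heuristic only; the function names are hypothetical, and the actual estimator may use a real model tokenizer instead.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate via the ~4-chars-per-token heuristic (sketch only)."""
    return max(1, round(len(text) / chars_per_token))

def within_limit(text: str, max_tokens: int = 8192, warn_ratio: float = 0.9) -> tuple[bool, bool]:
    """Return (fits, near_limit) flags for a candidate context bundle."""
    n = estimate_tokens(text)
    return n <= max_tokens, n >= warn_ratio * max_tokens

print(estimate_tokens("a" * 8000))  # 2000
```

A heuristic like this is fast enough to run on every candidate chunk, which is why token estimates can be reported before any bundle is written.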