
CodeChunk CLI

A CLI tool that intelligently analyzes your codebase and generates optimized context bundles for local LLMs. It chunks code files by function/module, removes boilerplate, summarizes complex logic, and creates a single context-ready file that fits within local LLM context limits.


Features

  • Intelligent Code Chunking - Chunk code by function/module boundaries using AST parsing
  • Boilerplate Removal - Strip unnecessary code while preserving logic
  • Dependency Analysis - Track imports and include only necessary dependencies
  • Context Estimation - Estimate token count and warn when approaching limits
  • Multiple Output Formats - Support for Ollama, LM Studio, and generic LLM inputs
  • Configurable Prioritization - Rules to prioritize important code
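
The function-boundary chunking described above can be sketched with Python's stdlib ast module. This is an illustrative simplification, not codechunk's actual implementation:

```python
import ast

def chunk_by_function(source: str) -> list[dict]:
    """Split Python source into per-function chunks using the stdlib ast module."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            chunks.append({
                "name": node.name,
                # Start/end line numbers of the function definition
                "lines": (node.lineno, node.end_lineno),
                # The exact source text covered by this node
                "code": ast.get_source_segment(source, node),
            })
    return chunks

sample = "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b\n"
print([c["name"] for c in chunk_by_function(sample)])  # ['add', 'sub']
```

A production chunker would also handle classes, module-level code, and non-Python languages, but the core idea is the same: chunk boundaries come from the syntax tree, not from line counts.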

Installation

pip install codechunk-cli

Or from source:

git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/codechunk-cli.git
cd codechunk-cli
pip install -e .

Quick Start

# Generate context from current directory
codechunk generate --output context.md

# Analyze codebase structure
codechunk analyze

# Generate with specific format
codechunk generate --format ollama --max-tokens 4096

# Use custom config file
codechunk generate --config my-config.yaml

Usage

Generate Context

Create an optimized context file from your codebase:

codechunk generate [OPTIONS]

Options:

  • --output, -o FILE - Output file path (default: stdout)
  • --format FORMAT - Output format: ollama, lmstudio, markdown (default: markdown)
  • --max-tokens N - Maximum token limit (default: 8192)
  • --include PATTERNS - Include file patterns (default: *.py)
  • --exclude PATTERNS - Exclude file patterns
  • --config FILE - Configuration file path
  • --verbose - Verbose output
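
The --max-tokens limit depends on a token estimate. A common rough heuristic (an illustrative assumption, not necessarily what codechunk uses) is about four characters per token for English text and code:

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token. A real implementation
    # would use the target model's tokenizer for an exact count.
    return max(1, len(text) // 4)

source = "def main():\n    print('hello')\n"
print(estimate_tokens(source))  # 7
```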

Analyze Codebase

Analyze project structure and dependencies:

codechunk analyze [PATH]

Configuration

Create a .codechunk.yaml file for reusable settings:

output: context.md
format: markdown
max_tokens: 8192
include:
  - "*.py"
  - "*.js"
  - "*.ts"
exclude:
  - "**/test_*.py"
  - "**/node_modules/**"
prioritization:
  - pattern: "**/main.py"
    weight: 2.0
  - pattern: "**/core/**"
    weight: 1.5
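
Prioritization rules like these amount to per-file weights: files matching a higher-weight pattern are included first when the token budget is tight. A minimal sketch using stdlib fnmatch (which approximates, but does not fully implement, ** glob semantics; the actual matching in codechunk may differ):

```python
from fnmatch import fnmatch

# Rules mirroring the .codechunk.yaml example above.
rules = [("**/main.py", 2.0), ("**/core/**", 1.5)]

def weight(path: str) -> float:
    # The highest matching rule wins; unmatched files default to 1.0.
    w = 1.0
    for pattern, value in rules:
        if fnmatch(path, pattern):
            w = max(w, value)
    return w

files = ["src/utils.py", "src/core/engine.py", "src/main.py"]
print(sorted(files, key=weight, reverse=True))
# ['src/main.py', 'src/core/engine.py', 'src/utils.py']
```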

Output Formats

Markdown

# File: src/main.py

```python
def main():
    ...
```

Ollama

{
  "context": [
    {"file": "src/main.py", "content": "..."}
  ]
}
LM Studio

{
  "messages": [
    {"role": "system", "content": "..."}
  ]
}
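
A formatter that emits the Ollama-style JSON shown above can be sketched in a few lines (format_ollama is a hypothetical name for illustration, not codechunk's actual API):

```python
import json

def format_ollama(chunks: list[tuple[str, str]]) -> str:
    # Serialize (file, content) pairs into the Ollama context layout.
    return json.dumps(
        {"context": [{"file": f, "content": c} for f, c in chunks]},
        indent=2,
    )

print(format_ollama([("src/main.py", "def main(): ...")]))
```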

Supported Languages

  • Python
  • JavaScript
  • TypeScript
  • Go
  • Rust

Architecture

codechunk/
├── cli.py           # CLI interface
├── config.py        # Configuration handling
├── core/
│   ├── chunking.py  # Code chunking logic
│   ├── parser.py    # AST-based code parsing
│   ├── summarizer.py # Code summarization
│   ├── formatter.py # Output formatters
│   └── dependency.py # Dependency analysis
└── utils/
    ├── logger.py    # Logging utilities
    └── file_utils.py # File operations
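 

For Python sources, the dependency-analysis step (core/dependency.py) can be approximated with the stdlib ast module. A minimal sketch, assuming dependency tracking means collecting top-level imported module names:

```python
import ast

def find_imports(source: str) -> set[str]:
    # Collect the top-level module names a Python file imports.
    mods = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            mods.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # node.module is None for bare relative imports ("from . import x")
            mods.add(node.module.split(".")[0])
    return mods

src = "import os\nfrom collections import deque\nimport numpy.linalg\n"
print(sorted(find_imports(src)))  # ['collections', 'numpy', 'os']
```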

Contributing

Contributions are welcome! Please read our contributing guidelines before submitting PRs.

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Run tests: pytest tests/ -v
  5. Run linting: ruff check .
  6. Submit a pull request

License

MIT License - see LICENSE file for details.
