# CodeChunk CLI

![CI](https://7000pct.gitea.bloupla.net/7000pctAUTO/codechunk-cli/actions/workflows/ci.yml/badge.svg) ![Version](https://img.shields.io/pypi/v/codechunk-cli) ![Python](https://img.shields.io/pypi/pyversions/codechunk-cli) ![License](https://img.shields.io/pypi/l/codechunk-cli)

A CLI tool that intelligently analyzes your codebase and generates optimized context bundles for local LLMs. It chunks code files by function/module boundaries, removes boilerplate, summarizes complex logic, and produces a single context-ready file that fits within local LLM context limits.

## Features

- **Intelligent Code Chunking** - Chunk code by function/module boundaries using AST parsing
- **Boilerplate Removal** - Strip unnecessary code while preserving logic
- **Dependency Analysis** - Track imports and include only the necessary dependencies
- **Context Estimation** - Estimate token counts and warn when approaching limits
- **Multiple Output Formats** - Support for Ollama, LM Studio, and generic LLM inputs
- **Configurable Prioritization** - Rules to prioritize the most important code

## Installation

```bash
pip install codechunk-cli
```

Or from source:

```bash
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/codechunk-cli.git
cd codechunk-cli
pip install -e .
```
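To give a feel for the AST-based chunking mentioned in Features, here is a minimal sketch of the general technique using Python's standard-library `ast` module. This is an illustration only, not the actual `core/chunking.py` implementation; the function name `chunk_functions` and the chunk dictionary shape are made up for this example.

```python
import ast

def chunk_functions(source: str) -> list[dict]:
    """Split Python source into per-definition chunks via the stdlib AST.

    Illustrative sketch only -- the real chunker may differ.
    """
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "start": node.lineno,        # first line of the definition
                "end": node.end_lineno,      # last line (Python 3.8+)
                "code": ast.get_source_segment(source, node),
            })
    return chunks

example = "def greet(name):\n    return f'hello {name}'\n"
print(chunk_functions(example)[0]["name"])  # greet
```

Chunking at definition boundaries like this keeps each chunk syntactically complete, which is what makes the resulting context bundles safe to truncate or reorder.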
## Quick Start

```bash
# Generate context from the current directory
codechunk generate --output context.md

# Analyze the codebase structure
codechunk analyze

# Generate with a specific format and token budget
codechunk generate --format ollama --max-tokens 4096

# Use a custom config file
codechunk generate --config my-config.yaml
```

## Usage

### Generate Context

Create an optimized context file from your codebase:

```bash
codechunk generate [OPTIONS]
```

Options:

- `--output, -o FILE` - Output file path (default: stdout)
- `--format FORMAT` - Output format: `ollama`, `lmstudio`, `markdown` (default: `markdown`)
- `--max-tokens N` - Maximum token limit (default: 8192)
- `--include PATTERNS` - File patterns to include (default: `*.py`)
- `--exclude PATTERNS` - File patterns to exclude
- `--config FILE` - Configuration file path
- `--verbose` - Verbose output

### Analyze Codebase

Analyze project structure and dependencies:

```bash
codechunk analyze [PATH]
```

## Configuration

Create a `.codechunk.yaml` file for reusable settings:

```yaml
output: context.md
format: markdown
max_tokens: 8192
include:
  - "*.py"
  - "*.js"
  - "*.ts"
exclude:
  - "**/test_*.py"
  - "**/node_modules/**"
prioritization:
  - pattern: "**/main.py"
    weight: 2.0
  - pattern: "**/core/**"
    weight: 1.5
```

## Output Formats

### Markdown

````
# File: src/main.py

```python
def main():
    ...
```
````
### Ollama

```json
{
  "context": [
    {"file": "src/main.py", "content": "..."}
  ]
}
```

### LM Studio

```json
{
  "messages": [
    {"role": "system", "content": "..."}
  ]
}
```

## Supported Languages

- Python
- JavaScript
- TypeScript
- Go
- Rust

## Architecture

```
codechunk/
├── cli.py            # CLI interface
├── config.py         # Configuration handling
├── core/
│   ├── chunking.py   # Code chunking logic
│   ├── parser.py     # AST-based code parsing
│   ├── summarizer.py # Code summarization
│   ├── formatter.py  # Output formatters
│   └── dependency.py # Dependency analysis
└── utils/
    ├── logger.py     # Logging utilities
    └── file_utils.py # File operations
```

## Contributing

Contributions are welcome! Please read the contributing guidelines before submitting a PR.

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Run the tests: `pytest tests/ -v`
5. Run the linter: `ruff check .`
6. Submit a pull request

## License

MIT License - see the LICENSE file for details.
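As a closing note on the context-estimation feature: a rough token count can be approximated with the common ~4-characters-per-token rule of thumb. The sketch below illustrates that heuristic only; the function names are hypothetical, and the actual estimator may use a real model tokenizer instead.

```python
def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate via the ~4-chars-per-token heuristic (sketch only)."""
    return max(1, round(len(text) / chars_per_token))

def within_limit(text: str, max_tokens: int = 8192, warn_ratio: float = 0.9) -> tuple[bool, bool]:
    """Return (fits, near_limit) flags for a candidate context bundle."""
    n = estimate_tokens(text)
    return n <= max_tokens, n >= warn_ratio * max_tokens

print(estimate_tokens("a" * 8000))  # 2000
```

A heuristic like this is fast enough to run on every candidate chunk, which is why token estimates can be reported before any bundle is written.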