Initial upload: Code Privacy Shield v0.1.0
2026-02-02 20:50:52 +00:00

# Code Privacy Shield
A CLI tool that analyzes code before it is sent to AI coding assistants and automatically redacts sensitive data such as API keys, PII, database connection strings, and environment variables. It preserves code structure while protecting secrets, using regex patterns and heuristics to identify and mask sensitive information.
## Features
- **API Key Detection**: Automatically detect and redact API keys, tokens, and secrets for common services (OpenAI, GitHub, AWS, Stripe, etc.)
- **PII Detection**: Mask personally identifiable information including emails, phone numbers, SSNs, and credit cards
- **Environment Variable Redaction**: Detect and redact environment variable patterns (`os.environ`, `os.getenv`, `.env` files)
- **Database Connection String Redaction**: Identify and mask database connection strings for PostgreSQL, MySQL, MongoDB, Redis, and more
- **Custom Redaction Rules**: Support user-defined redaction rules via configuration files
- **Preview Mode**: Show what will be redacted without modifying files
- **Code Structure Preservation**: Maintain code syntax and line numbers using position-based replacement
- **Multiple File Support**: Process multiple files and directories with glob patterns
- **Integration Hooks**: Pre-commit hooks and wrapper scripts for AI coding tools
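As a rough illustration of the position-based replacement mentioned above, the following sketch collects match offsets and then substitutes from the end of the text backwards, so that earlier offsets stay valid and no lines are added or removed. The two patterns and the `redact` function are hypothetical stand-ins, not the tool's actual rules or API.

```python
import re

# Illustrative patterns only; the tool's real pattern set is not shown in this README.
PATTERNS = {
    "api_keys": re.compile(r"sk-[A-Za-z0-9]{8,}"),
    "pii": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
}


def redact(text: str, mask: str = "████████") -> str:
    """Replace every match with `mask`, working from the last match to the first."""
    spans = []
    for pattern in PATTERNS.values():
        spans.extend(m.span() for m in pattern.finditer(text))
    # Replacing from the end keeps the offsets of earlier matches valid.
    # Overlapping matches are not handled in this sketch.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + mask + text[end:]
    return text


source = "api_key = 'sk-abc123def456'\nowner = 'user@example.com'\n"
redacted = redact(source)
print(redacted)
```

Because only the matched spans change, the surrounding quotes, assignments, and line breaks are left untouched.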
## Installation
### From Source
```bash
# Clone the repository
git clone https://github.com/yourusername/code-privacy-shield.git
cd code-privacy-shield
# Install in development mode
pip install -e .
# Or install with dev dependencies
pip install -e ".[dev]"
```
### From PyPI (coming soon)
```bash
pip install code-privacy-shield
```
## Usage
### Basic Redaction
```bash
# Redact a single file
cps redact myfile.py
# Redact multiple files
cps redact file1.py file2.py file3.py
# Redact a directory recursively
cps redact my_project/
```
### Preview Mode
```bash
# Preview what will be redacted without modifying files
cps redact --preview myfile.py
# Or use the preview command directly
cps preview myfile.py
```
### In-Place Editing
```bash
# Edit files in place (use with caution!)
cps redact --in-place myfile.py
```
### Stdin/Stdout
```bash
# Pipe code through stdin
echo "api_key = 'sk-abc123'" | cps redact
# Redirect output to a file
cps redact myfile.py > redacted.py
```
### Check for Sensitive Data
```bash
# Check if a file contains sensitive data (exits 0 if clean, 1 if secrets found)
cps check myfile.py
```
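The exit codes documented above make `cps check` easy to call from scripts. A minimal sketch, assuming `cps` is on `PATH`; the `has_secrets` helper and its `run` parameter are hypothetical names used here for illustration, and exit codes other than 0 and 1 are treated as errors because this README does not document them.

```python
import subprocess


def has_secrets(path: str, run=subprocess.run) -> bool:
    """Return True when `cps check` reports sensitive data in `path`."""
    result = run(["cps", "check", path], capture_output=True, text=True)
    if result.returncode == 0:
        return False  # clean
    if result.returncode == 1:
        return True  # secrets found
    # Any other exit code is undocumented here; surface it rather than guess.
    raise RuntimeError(f"cps check failed with exit code {result.returncode}")
```

The `run` parameter exists only so the helper can be exercised without the CLI installed, for example in unit tests.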
### Configuration
```bash
# Initialize an example config file
cps init-config ~/.config/cps/config.toml
# Show configuration locations
cps config-locations
# View current configuration
cps config show
# Use a custom config file
cps --config /path/to/config.toml redact myfile.py
```
## Configuration File
Code Privacy Shield looks for configuration files in the following order:
1. `.cps.toml` (project directory)
2. `~/.config/cps/config.toml` (user config)
3. Command line `--config` option
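The list above gives the lookup order but does not say which source wins when the same key appears in more than one file. The sketch below shows one way layered settings can be combined: it takes already-parsed configuration dictionaries ordered from lowest to highest priority and merges nested tables key by key. The function names and the sample values are hypothetical, and the actual precedence used by the tool may differ.

```python
def deep_merge(base: dict, override: dict) -> dict:
    """Return a new dict in which values from `override` win over `base`."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)  # merge nested tables
        else:
            merged[key] = value
    return merged


def resolve_config(layers: list) -> dict:
    """Merge config layers given from lowest to highest priority."""
    config: dict = {}
    for layer in layers:
        config = deep_merge(config, layer)
    return config


# Hypothetical layers; which file plays which role is an assumption.
low = {"general": {"quiet_mode": False, "recursive": True}, "output": {"format": "text"}}
high = {"output": {"format": "json"}}
effective = resolve_config([low, high])
print(effective)
```

Each layer could be produced by parsing a TOML file, for instance with `tomllib.load` from the standard library in Python 3.11 and later.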
### Example Configuration
```toml
[general]
preview_mode = false
quiet_mode = false
preserve_structure = true
recursive = true
[redaction]
default_replacement = "████████"
preserve_length = false
[redaction.categories]
api_keys = true
pii = true
database = true
env_var = true
ip = true
authorization = true
# Custom patterns
[[custom_patterns]]
name = "Internal API Key"
# Literal string ('''...''') so the regex backslashes need no TOML escaping
pattern = '''(?i)(internal[_-]?api[_-]?key['"]?\s*[:=]\s*['"]?)([a-zA-Z0-9_-]{16,})'''
category = "internal"
# Exclude patterns
exclude_patterns = [
"*.pyc",
"__pycache__",
".git",
".svn",
"node_modules",
".env",
"dist",
"build",
]
[output]
format = "text"
show_line_numbers = false
color_output = true
```
## Output Formats
### Text (default)
```bash
cps redact --preview myfile.py
# [api_keys] OpenAI API Key: sk-abc123def456 -> ████████f456
# [pii] Email Address: user@example.com -> ████████.com
```
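The email line in the preview output suggests that a fixed-width mask is followed by the last four characters of the original value. A minimal sketch of that masking rule, assuming this reading is correct; `mask_value` and its parameters are hypothetical names, not part of the tool.

```python
def mask_value(value: str, mask: str = "█" * 8, keep: int = 4) -> str:
    """Replace all but the last `keep` characters with a fixed-width mask."""
    if len(value) <= keep:
        # Too short to reveal a suffix safely; hide the whole value.
        return mask
    return mask + value[-keep:]


print(mask_value("user@example.com"))  # ████████.com
print(mask_value("4111111111111111"))  # ████████1111
```

Keeping a short suffix helps a reader tell two redacted values apart, at the cost of revealing a few characters of the secret.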
### JSON
```bash
cps redact --preview --format json myfile.py
# {
#   "matches": [
#     {
#       "category": "api_keys",
#       "name": "OpenAI API Key",
#       "original": "sk-abc123def456",
#       "replacement": "████████f456"
#     }
#   ],
#   "total_matches": 1,
#   "categories": ["api_keys"]
# }
```
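The JSON format is convenient for scripting. The sketch below parses a report with the field names shown above and counts matches per category. The sample report is hand-written for illustration, and `summarize` is a hypothetical helper rather than part of the tool.

```python
import json
from collections import Counter

# Hand-written sample that follows the field names shown above.
sample_report = """
{
  "matches": [
    {
      "category": "pii",
      "name": "Email Address",
      "original": "user@example.com",
      "replacement": "████████.com"
    }
  ],
  "total_matches": 1,
  "categories": ["pii"]
}
"""


def summarize(report_text: str) -> dict:
    """Return the total match count and a per-category breakdown."""
    report = json.loads(report_text)
    counts = Counter(match["category"] for match in report["matches"])
    return {"total": report["total_matches"], "by_category": dict(counts)}


summary = summarize(sample_report)
print(summary)
```

Note that the report includes the `original` values, so it should be handled as carefully as the unredacted source.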
## Integration with AI Coding Tools
### Pre-commit Hook
Add the provided pre-commit hook to your repository:
```bash
cp examples/pre-commit-hook.sh .git/hooks/pre-commit
chmod +x .git/hooks/pre-commit
```
### Using with Claude Code
```bash
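# Create a wrapper script (writing to /usr/local/bin may require sudo)
cat > /usr/local/bin/claude-safe << 'EOF'
#!/bin/bash
# Redact secrets from the input file, then pipe the result to Claude Code
file="$1"; shift
cps redact "$file" | claude "$@"
EOF
chmod +x /usr/local/bin/claude-safe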
```
### Using with Continue.dev
Configure in your `~/.continue/config.py`:
```python
# Add CPS as a preprocessing step
import subprocess


def preprocess_code(code: str) -> str:
    result = subprocess.run(
        ["cps", "redact", "--stdin"],
        input=code,
        capture_output=True,
        text=True
    )
    return result.stdout
```
## Supported Patterns
### API Keys & Tokens
- OpenAI API Keys (`sk-...`)
- GitHub Tokens (`ghp_...`, `gho_...`)
- AWS Access Keys
- Stripe Keys
- Slack Tokens
- SendGrid Keys
- Twilio Keys
- And more...
### Personally Identifiable Information
- Email addresses
- Phone numbers
- Social Security Numbers
- Credit card numbers
- Full names
- Addresses
- Passwords
- Usernames
### Database Connections
- PostgreSQL (`postgresql://...`)
- MySQL (`mysql://...`)
- MongoDB (`mongodb://...`)
- Redis (`redis://...`)
- SQLite
- SQL Server
- And more...
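For URL-style connection strings such as the ones listed above, one possible heuristic is to mask only the password between `user:` and `@` while leaving the scheme, host, and database name readable. This is a sketch under that assumption; the regex and the `redact_connection_password` helper are hypothetical, and the tool may instead mask the whole string.

```python
import re

# scheme://user:password@host...  Only the password group is replaced.
CONNECTION_URL = re.compile(
    r"(?P<prefix>\b[a-z][a-z0-9+.-]*://[^\s:/@]+:)(?P<password>[^\s@]+)(?P<suffix>@)",
    re.IGNORECASE,
)


def redact_connection_password(text: str, mask: str = "████████") -> str:
    """Mask the password part of URL-style connection strings in `text`."""
    return CONNECTION_URL.sub(
        lambda m: m.group("prefix") + mask + m.group("suffix"), text
    )


print(redact_connection_password("postgresql://app:s3cret@db.internal:5432/app"))
print(redact_connection_password("redis://localhost:6379/0"))  # no credentials, unchanged
```

Passwords that themselves contain an unescaped `@` are not handled by this simple pattern.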
### Environment Variables
- `os.environ` access
- `os.getenv` calls
- Shell export statements
- `.env` file contents
### Authorization Headers
- Bearer tokens
- Basic auth
- API key headers
- Custom authorization headers
## Development
### Running Tests
```bash
# Run all tests
pytest tests/ -v
# Run with coverage
pytest tests/ --cov=src
# Run specific test file
pytest tests/test_patterns.py -v
```
### Adding Custom Patterns
To add custom patterns, create a configuration file:
```toml
[[custom_patterns]]
name = "My Custom Secret"
pattern = '''(?i)(mysecret['"]?\s*[:=]\s*['"]?)([a-zA-Z0-9]{16,})'''
category = "custom"
```
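Before adding a pattern to the configuration, it can help to try it with Python's `re` module. The sketch below assumes the two-group form used by the Internal API Key example in the configuration section, where the first group is the key name and assignment and the second group is the secret value. How the tool itself treats the groups is an assumption here, and `redact_custom` is a hypothetical helper.

```python
import re

# Group 1: key name and assignment. Group 2: the secret value.
pattern = re.compile(r"""(?i)(mysecret['"]?\s*[:=]\s*['"]?)([a-zA-Z0-9]{16,})""")


def redact_custom(text: str, mask: str = "████████") -> str:
    """Keep group 1 and replace group 2 with `mask`."""
    return pattern.sub(lambda m: m.group(1) + mask, text)


line = 'MYSECRET = "abcdef0123456789abcd"'
print(redact_custom(line))  # MYSECRET = "████████"
```

Values shorter than 16 characters do not match this pattern and are left unchanged.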
## License
MIT License - see LICENSE file for details.
## Contributing
1. Fork the repository
2. Create a feature branch
3. Add tests for your changes
4. Ensure all tests pass
5. Submit a pull request
## Security
If you discover a security vulnerability, please contact the maintainers directly rather than opening a public issue. Do not disclose security issues publicly.