DataForge CLI
A CLI tool for converting between data formats (JSON, YAML, TOML) and validating data against JSON Schema, with detailed error reporting.
Features
- Format Conversion: Convert between JSON, YAML, and TOML formats seamlessly
- Schema Validation: Validate data against JSON Schema with detailed error messages
- Type Checking: Infer and validate data types with schema inference
- Batch Processing: Process multiple files at once with pattern matching
- Quiet Mode: Minimal output for scripting and automation
Installation
From PyPI (Recommended)
pip install dataforge-cli
From Source
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/dataforge-cli.git
cd dataforge-cli
pip install .
Development Installation
pip install -e ".[dev]"
Quick Start
Convert a file from JSON to YAML
dataforge convert config.json config.yaml --to yaml
Validate a file against a JSON Schema
dataforge validate config.json --schema schema.json
Infer type schema from data
dataforge typecheck config.json --infer
Batch convert all JSON files to YAML
dataforge batch-convert --to yaml --pattern "*.json"
Validate all config files against a schema
dataforge batch-validate --schema schema.json --pattern "*.{json,yaml,yml,toml}"
Commands
convert
Convert a file from one format to another.
dataforge convert INPUT_FILE OUTPUT_FILE --to FORMAT [--from FORMAT] [--indent N] [--quiet]
Options:
- INPUT_FILE: Input file path (or - for stdin)
- OUTPUT_FILE: Output file path (or - for stdout)
- --to, -t: Output format (required: json, yaml, toml)
- --from, -f: Input format (auto-detected from extension if not specified)
- --indent, -i: Indentation spaces (default: 2, use 0 for compact)
- --quiet, -q: Minimal output
Examples:
# Convert JSON to YAML
dataforge convert config.json config.yaml --to yaml
# Convert YAML to TOML with compact output
dataforge convert config.yaml config.toml --to toml --indent 0
# Convert from stdin to stdout
cat config.json | dataforge convert - config.yaml --to yaml
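Since both positional arguments accept -, conversions can also be chained through a pipeline. A sketch, assuming - works for stdin and stdout in the same invocation (pass --from explicitly in later stages, since there is no file extension to auto-detect):
# Chain conversions: JSON -> YAML -> TOML
cat config.json | dataforge convert - - --to yaml | dataforge convert - - --from yaml --to toml > config.toml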
batch-convert
Convert multiple files from one format to another.
dataforge batch-convert [--from FORMAT] --to FORMAT [--output-dir DIR] [--pattern GLOB] [--recursive] [--quiet]
Options:
- --to, -t: Output format (required)
- --from, -f: Input format
- --output-dir, -o: Output directory (default: .)
- --pattern, -p: File pattern for batch processing (default: *.{json,yaml,yml,toml})
- --recursive, -r: Search recursively
- --quiet, -q: Minimal output
Examples:
# Convert all JSON files to YAML
dataforge batch-convert --to yaml --pattern "*.json"
# Convert all files in configs/ directory
cd configs/ && dataforge batch-convert --to yaml --recursive
# Convert and save to output directory
dataforge batch-convert --to toml --output-dir converted/
validate
Validate a file against a JSON Schema.
dataforge validate INPUT_FILE [--schema SCHEMA_FILE] [--strict] [--quiet]
Options:
- INPUT_FILE: Input file path (or - for stdin)
- --schema, -s: Path to JSON Schema file
- --strict: Strict validation mode
- --quiet, -q: Minimal output
Examples:
# Validate with schema
dataforge validate config.json --schema schema.json
# Validate without schema (check format only)
dataforge validate config.json
# Validate from stdin
cat config.json | dataforge validate - --schema schema.json
batch-validate
Validate multiple files against a JSON Schema.
dataforge batch-validate --schema SCHEMA_FILE [INPUT_FILES...] [--pattern GLOB] [--recursive] [--quiet]
Options:
- --schema, -s: Path to JSON Schema file (required)
- --pattern, -p: File pattern for batch processing (default: *.{json,yaml,yml,toml})
- --recursive, -r: Search recursively
- --quiet, -q: Minimal output
Examples:
# Validate all config files
dataforge batch-validate --schema schema.json --pattern "*.{json,yaml,yml,toml}"
# Validate specific files
dataforge batch-validate --schema schema.json config1.json config2.yaml config3.toml
# Recursively validate all files
dataforge batch-validate --schema schema.json --recursive
typecheck
Check types in a data file.
dataforge typecheck INPUT_FILE [--infer] [--quiet]
Options:
- INPUT_FILE: Input file path (or - for stdin)
- --infer: Infer schema from data and print it
- --quiet, -q: Minimal output
Examples:
# Check types in file
dataforge typecheck config.json
# Infer and print schema
dataforge typecheck config.json --infer
# Infer from stdin
cat config.json | dataforge typecheck - --infer
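A useful workflow is to bootstrap a schema from a known-good file and then validate the rest against it. A sketch, assuming --infer writes the inferred schema as JSON to stdout (reference.json is a placeholder for your own known-good config):
# Infer a schema from a reference config, then validate siblings against it
dataforge typecheck reference.json --infer > inferred-schema.json
dataforge batch-validate --schema inferred-schema.json --pattern "*.json"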
Global Options
- --help, -h: Show help message
- --version: Show version
Exit Codes
DataForge uses exit codes to indicate success or failure:
- 0: Success
- 1: Validation failed or error occurred
This makes it suitable for use in CI/CD pipelines and scripts.
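For example, a deploy script can branch directly on the command's exit status:
# Abort a deploy when the config fails validation
if dataforge validate config.json --schema schema.json --quiet; then
    echo "config OK, deploying"
else
    echo "config invalid, aborting" >&2
    exit 1
fi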
Configuration
Environment Variables
No environment variables are required. All configuration is done via command-line options.
Project Configuration
DataForge can be configured via pyproject.toml:
[tool.dataforge]
# Custom configuration options (future)
JSON Schema Validation
DataForge supports JSON Schema validation with the following features:
- Draft-07 and Draft 2019-09 schemas
- Detailed error messages with path information
- Support for all JSON Schema keywords
Example schema:
{
"$schema": "http://json-schema.org/draft-07/schema#",
"type": "object",
"properties": {
"name": {
"type": "string"
},
"version": {
"type": "string",
"pattern": "^\\d+\\.\\d+\\.\\d+$"
}
},
"required": ["name", "version"]
}
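To see the path-annotated errors in action, save the schema above as schema.json and validate a file that breaks the version pattern (the exact error text may differ):
# "1.0" does not match ^\d+\.\d+\.\d+$, so this exits with code 1
echo '{"name": "demo", "version": "1.0"}' > bad.json
dataforge validate bad.json --schema schema.json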
Use Cases
Configuration File Validation
Validate all configuration files before deployment:
dataforge batch-validate --schema config-schema.json --pattern "*.{json,yaml,yml,toml}" --recursive
CI/CD Pipeline Integration
# Validate configs in CI
dataforge batch-validate --schema schema.json || exit 1
Data Format Conversion
Convert legacy configuration files:
for file in legacy/*.json; do
dataforge convert "$file" "${file%.json}.yaml" --to yaml
done
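A variant that validates each file right after converting it, relying on the exit codes above (config-schema.json stands in for your own schema):
# Convert each legacy config, then validate the result
for file in legacy/*.json; do
    out="${file%.json}.yaml"
    dataforge convert "$file" "$out" --to yaml &&
        dataforge validate "$out" --schema config-schema.json
done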
API Response Validation
# Validate API response schema
curl -s https://api.example.com/data | dataforge validate - --schema api-schema.json
Development
Setting Up Development Environment
# Clone the repository
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/dataforge-cli.git
cd dataforge-cli
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode
pip install -e ".[dev]"
# Run tests
pytest -v
# Run tests with coverage
pytest --cov=dataforge
Running Tests
# Run all tests
pytest -v
# Run specific test file
pytest tests/test_dataforge_parsers.py -v
# Run with coverage
pytest --cov=dataforge --cov-report=html
Code Style
# Check code style (if configured)
ruff check .
Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Roadmap
- Support for XML and CSV formats
- Plugin system for custom validators
- Configuration file support
- Improved error reporting with colored output
- Integration with popular frameworks
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Click for the CLI framework
- PyYAML for YAML support
- jsonschema for JSON Schema validation