From a7825b2c16ffc2a757cc4453fd5053476e85b3f1 Mon Sep 17 00:00:00 2001
From: 7000pctAUTO
Date: Tue, 3 Feb 2026 04:17:05 +0000
Subject: [PATCH] Initial upload: DataForge CLI with full documentation and tests

# DataForge CLI

A powerful CLI tool for converting and validating data formats (JSON, YAML, TOML) with schema validation using JSON Schema.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)

## Features

- **Format Conversion**: Convert between JSON, YAML, and TOML formats seamlessly
- **Schema Validation**: Validate data against JSON Schema with detailed error messages
- **Type Checking**: Check value types and infer a schema from data
- **Batch Processing**: Process multiple files at once with pattern matching
- **Quiet Mode**: Minimal output for scripting and automation

## Installation

### From PyPI (Recommended)

```bash
pip install dataforge-cli
```

### From Source

```bash
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/dataforge-cli.git
cd dataforge-cli
pip install -e .
```

### Development Installation

```bash
pip install -e ".[dev]"
```

## Quick Start

### Convert a file from JSON to YAML

```bash
dataforge convert config.json config.yaml --to yaml
```

### Validate a file against a JSON Schema

```bash
dataforge validate config.json --schema schema.json
```

### Infer type schema from data

```bash
dataforge typecheck config.json --infer
```

### Batch convert all JSON files to YAML

```bash
dataforge batch-convert --to yaml --pattern "*.json"
```

### Validate all config files against a schema

```bash
dataforge batch-validate --schema schema.json --pattern "*.{json,yaml,yml,toml}"
```

## Commands

### convert

Convert a file from one format to another.

```bash
dataforge convert INPUT_FILE OUTPUT_FILE --to FORMAT [--from FORMAT] [--indent N] [--quiet]
```

**Options:**

- `INPUT_FILE`: Input file path (or `-` for stdin)
- `OUTPUT_FILE`: Output file path (or `-` for stdout)
- `--to, -t`: Output format (required; one of `json`, `yaml`, `toml`)
- `--from, -f`: Input format (auto-detected from extension if not specified)
- `--indent, -i`: Indentation spaces (default: 2; use 0 for compact output)
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Convert JSON to YAML
dataforge convert config.json config.yaml --to yaml

# Convert YAML to TOML with compact output
dataforge convert config.yaml config.toml --to toml --indent 0

# Convert from stdin to stdout (stdin has no extension, so give --from)
cat config.json | dataforge convert - - --from json --to yaml
```

### batch-convert

Convert multiple files from one format to another.

```bash
dataforge batch-convert [--from FORMAT] --to FORMAT [--output-dir DIR] [--pattern GLOB] [--recursive] [--quiet]
```

**Options:**

- `--to, -t`: Output format (required)
- `--from, -f`: Input format
- `--output-dir, -o`: Output directory (default: `.`)
- `--pattern, -p`: File pattern for batch processing (default: `*.{json,yaml,yml,toml}`)
- `--recursive, -r`: Search recursively
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Convert all JSON files to YAML
dataforge batch-convert --to yaml --pattern "*.json"

# Convert all files in the configs/ directory
dataforge batch-convert --to yaml --pattern "configs/*.{json,yaml,yml,toml}" --recursive

# Convert and save to an output directory
dataforge batch-convert --to toml --output-dir converted/
```

### validate

Validate a file against a JSON Schema.

```bash
dataforge validate INPUT_FILE [--schema SCHEMA_FILE] [--strict] [--quiet]
```

**Options:**

- `INPUT_FILE`: Input file path (or `-` for stdin)
- `--schema, -s`: Path to a JSON Schema file
- `--strict`: Strict validation mode
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Validate with a schema
dataforge validate config.json --schema schema.json

# Validate without a schema (format check only)
dataforge validate config.json

# Validate from stdin
cat config.json | dataforge validate - --schema schema.json
```

### batch-validate

Validate multiple files against a JSON Schema.

```bash
dataforge batch-validate --schema SCHEMA_FILE [INPUT_FILES...] [--pattern GLOB] [--recursive] [--quiet]
```

**Options:**

- `--schema, -s`: Path to a JSON Schema file (required)
- `--pattern, -p`: File pattern for batch processing (default: `*.{json,yaml,yml,toml}`)
- `--recursive, -r`: Search recursively
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Validate all config files
dataforge batch-validate --schema schema.json --pattern "*.{json,yaml,yml,toml}"

# Validate specific files
dataforge batch-validate --schema schema.json config1.json config2.yaml config3.toml

# Recursively validate all files
dataforge batch-validate --schema schema.json --recursive
```

### typecheck

Check types in a data file.

```bash
dataforge typecheck INPUT_FILE [--infer] [--quiet]
```

**Options:**

- `INPUT_FILE`: Input file path (or `-` for stdin)
- `--infer`: Infer a schema from the data and print it
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Check types in a file
dataforge typecheck config.json

# Infer and print a schema
dataforge typecheck config.json --infer

# Infer from stdin
cat config.json | dataforge typecheck - --infer
```

### Global Options

- `--help, -h`: Show help message
- `--version`: Show version

## Exit Codes

DataForge uses exit codes to indicate success or failure:

- `0`: Success
- `1`: Validation failed or an error occurred

This makes it suitable for use in CI/CD pipelines and scripts.

## Configuration

### Environment Variables

No environment variables are required. All configuration is done via command-line options.
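The schema printed by `typecheck --infer` is derived purely from the value types found in the data. As a standalone illustration of that idea (this is not DataForge's actual implementation; the `infer_schema` helper is invented for this example), a minimal inferrer needs nothing beyond the Python standard library:

```python
import json

# Map Python types to JSON Schema type names.
_TYPES = {str: "string", bool: "boolean", int: "integer",
          float: "number", type(None): "null"}

def infer_schema(value):
    """Infer a minimal JSON-Schema-like description of `value`."""
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in value.items()},
            "required": sorted(value),
        }
    if isinstance(value, list):
        # Describe items by the schema of the first element, if any.
        schema = {"type": "array"}
        if value:
            schema["items"] = infer_schema(value[0])
        return schema
    # Exact type match (`type(...) is ...`) keeps bool from matching int.
    for py_type, name in _TYPES.items():
        if type(value) is py_type:
            return {"type": name}
    raise TypeError(f"unsupported type: {type(value).__name__}")

data = json.loads('{"name": "demo", "version": "1.0.0", "ports": [80, 443]}')
print(json.dumps(infer_schema(data), indent=2))
```

A production inferrer would also merge the schemas of heterogeneous list items and detect string formats; the sketch keeps only the core recursion.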

### Project Configuration

DataForge can be configured via `pyproject.toml`:

```toml
[tool.dataforge]
# Custom configuration options (planned for a future release)
```

## JSON Schema Validation

DataForge supports JSON Schema validation with the following features:

- Draft-07 and Draft 2019-09 schemas
- Detailed error messages with path information
- Support for the standard JSON Schema validation keywords

Example schema:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "version": {
      "type": "string",
      "pattern": "^\\d+\\.\\d+\\.\\d+$"
    }
  },
  "required": ["name", "version"]
}
```

## Use Cases

### Configuration File Validation

Validate all configuration files before deployment:

```bash
dataforge batch-validate --schema config-schema.json --pattern "*.{json,yaml,yml,toml}" --recursive
```

### CI/CD Pipeline Integration

```bash
# Validate configs in CI
dataforge batch-validate --schema schema.json || exit 1
```

### Data Format Conversion

Convert legacy configuration files:

```bash
for file in legacy/*.json; do
  dataforge convert "$file" "${file%.json}.yaml" --to yaml
done
```

### API Response Validation

```bash
# Validate an API response against a schema
curl -s https://api.example.com/data | dataforge validate - --schema api-schema.json
```

## Development

### Setting Up a Development Environment

```bash
# Clone the repository
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/dataforge-cli.git
cd dataforge-cli

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

# Run the tests
pytest -v

# Run the tests with coverage
pytest --cov=dataforge
```

### Running Tests

```bash
# Run all tests
pytest -v

# Run a specific test file
pytest tests/test_dataforge_parsers.py -v

# Run with coverage
pytest --cov=dataforge --cov-report=html
```

### Code Style

```bash
# Check code style (if configured)
ruff check .
```

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## Roadmap

- [ ] Support for XML and CSV formats
- [ ] Plugin system for custom validators
- [ ] Configuration file support
- [ ] Improved error reporting with colored output
- [ ] Integration with popular frameworks

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [Click](https://click.palletsprojects.com/) for the CLI framework
- [PyYAML](https://pyyaml.org/) for YAML support
- [jsonschema](https://python-jsonschema.readthedocs.io/) for JSON Schema validation
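
For a sense of what validation against the example schema shown earlier actually checks, here is a hand-rolled sketch covering only the `type`, `required`, `properties`, and `pattern` keywords. It is an illustration, not DataForge's code (real projects should use the jsonschema library acknowledged above, and the `check` function name is invented here):

```python
import re

def check(value, schema, path="$"):
    """Return a list of error strings for `value` against a tiny
    subset of JSON Schema: type, required, properties, pattern."""
    errors = []
    expected = schema.get("type")
    type_map = {"object": dict, "array": list, "string": str,
                "number": (int, float), "integer": int, "boolean": bool}
    if expected and not isinstance(value, type_map[expected]):
        errors.append(f"{path}: expected {expected}")
        return errors  # no point descending into a mistyped value
    if expected == "object":
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}: missing required property '{key}'")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(check(value[key], sub, f"{path}.{key}"))
    elif expected == "string" and "pattern" in schema:
        # JSON Schema patterns are unanchored, hence re.search.
        if not re.search(schema["pattern"], value):
            errors.append(f"{path}: does not match {schema['pattern']!r}")
    return errors

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "version": {"type": "string", "pattern": r"^\d+\.\d+\.\d+$"},
    },
    "required": ["name", "version"],
}
print(check({"name": "demo", "version": "1.0.0"}, schema))  # []
print(check({"name": "demo", "version": "1.0"}, schema))    # one error for the bad version
```

Error paths like `$.version` mirror the "detailed error messages with path information" that the JSON Schema Validation section describes.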