# DataForge CLI

A powerful CLI tool for converting and validating data formats (JSON, YAML, TOML) with schema validation using JSON Schema.

[MIT License](https://opensource.org/licenses/MIT)
[Python](https://www.python.org/downloads/)

## Features

- **Format Conversion**: Convert between JSON, YAML, and TOML formats seamlessly
- **Schema Validation**: Validate data against JSON Schema with detailed error messages
- **Type Checking**: Infer and validate data types with schema inference
- **Batch Processing**: Process multiple files at once with pattern matching
- **Quiet Mode**: Minimal output for scripting and automation

## Installation

### From PyPI (Recommended)

```bash
pip install dataforge-cli
```

### From Source

```bash
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/dataforge-cli.git
cd dataforge-cli
pip install -e .
```

### Development Installation

```bash
pip install -e ".[dev]"
```

## Quick Start

### Convert a file from JSON to YAML

```bash
dataforge convert config.json config.yaml --to yaml
```

### Validate a file against a JSON Schema

```bash
dataforge validate config.json --schema schema.json
```

### Infer type schema from data

```bash
dataforge typecheck config.json --infer
```

### Batch convert all JSON files to YAML

```bash
dataforge batch-convert --to yaml --pattern "*.json"
```

### Validate all config files against a schema

```bash
dataforge batch-validate --schema schema.json --pattern "*.{json,yaml,yml,toml}"
```

## Commands

### convert

Convert a file from one format to another.

```bash
dataforge convert INPUT_FILE OUTPUT_FILE --to FORMAT [--from FORMAT] [--indent N] [--quiet]
```

**Options:**

- `INPUT_FILE`: Input file path (or `-` for stdin)
- `OUTPUT_FILE`: Output file path (or `-` for stdout)
- `--to, -t`: Output format (required: `json`, `yaml`, `toml`)
- `--from, -f`: Input format (auto-detected from the file extension if not specified)
- `--indent, -i`: Indentation spaces (default: 2, use 0 for compact)
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Convert JSON to YAML
dataforge convert config.json config.yaml --to yaml

# Convert YAML to TOML with compact output
dataforge convert config.yaml config.toml --to toml --indent 0

# Convert from stdin to stdout (stdin has no extension, so pass --from)
cat config.json | dataforge convert - - --from json --to yaml
```
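
Auto-detection from the extension, as `--from` describes it, can be pictured with a short sketch. The helper below is a hypothetical illustration of extension-based format detection, not DataForge's actual code:

```python
from pathlib import Path

# Hypothetical sketch of extension-based format detection (illustration
# only, not DataForge's implementation): map a file suffix to a format.
_EXTENSIONS = {
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".toml": "toml",
}

def detect_format(path: str) -> str:
    """Return the format implied by the file extension, or raise."""
    suffix = Path(path).suffix.lower()
    try:
        return _EXTENSIONS[suffix]
    except KeyError:
        raise ValueError(f"cannot detect format of {path!r}; pass --from")

print(detect_format("config.yml"))  # yaml
```

This is also why the stdin example above passes `--from` explicitly: `-` carries no extension to detect from.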

### batch-convert

Convert multiple files from one format to another.

```bash
dataforge batch-convert [--from FORMAT] --to FORMAT [--output-dir DIR] [--pattern GLOB] [--recursive] [--quiet]
```

**Options:**

- `--to, -t`: Output format (required)
- `--from, -f`: Input format
- `--output-dir, -o`: Output directory (default: `.`)
- `--pattern, -p`: File pattern for batch processing (default: `*.{json,yaml,yml,toml}`)
- `--recursive, -r`: Search recursively
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Convert all JSON files to YAML
dataforge batch-convert --to yaml --pattern "*.json"

# Recursively convert all files under the configs/ directory
dataforge batch-convert --to yaml --pattern "configs/*.{json,yaml,yml,toml}" --recursive

# Convert and save to output directory
dataforge batch-convert --to toml --output-dir converted/
```
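
The `--pattern` and `--recursive` behavior can be sketched with the standard library alone. This is a hypothetical illustration of brace-pattern expansion and file collection, not DataForge's actual implementation:

```python
import re
from pathlib import Path

# Hypothetical sketch (not DataForge's code): expand a brace pattern like
# "*.{json,yaml}" into plain globs, then collect matching files, optionally
# recursing into subdirectories.
def expand_braces(pattern: str) -> list[str]:
    m = re.search(r"\{([^}]*)\}", pattern)
    if not m:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    return [g for alt in m.group(1).split(",")
            for g in expand_braces(head + alt + tail)]

def collect_files(root: str, pattern: str, recursive: bool = False) -> list[Path]:
    base = Path(root)
    glob = base.rglob if recursive else base.glob
    return sorted({p for g in expand_braces(pattern)
                   for p in glob(g) if p.is_file()})

print(expand_braces("*.{json,yaml,yml,toml}"))
# ['*.json', '*.yaml', '*.yml', '*.toml']
```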

### validate

Validate a file against a JSON Schema.

```bash
dataforge validate INPUT_FILE [--schema SCHEMA_FILE] [--strict] [--quiet]
```

**Options:**

- `INPUT_FILE`: Input file path (or `-` for stdin)
- `--schema, -s`: Path to JSON Schema file
- `--strict`: Strict validation mode
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Validate with schema
dataforge validate config.json --schema schema.json

# Validate without schema (check format only)
dataforge validate config.json

# Validate from stdin
cat config.json | dataforge validate - --schema schema.json
```

### batch-validate

Validate multiple files against a JSON Schema.

```bash
dataforge batch-validate --schema SCHEMA_FILE [INPUT_FILES...] [--pattern GLOB] [--recursive] [--quiet]
```

**Options:**

- `--schema, -s`: Path to JSON Schema file (required)
- `--pattern, -p`: File pattern for batch processing (default: `*.{json,yaml,yml,toml}`)
- `--recursive, -r`: Search recursively
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Validate all config files
dataforge batch-validate --schema schema.json --pattern "*.{json,yaml,yml,toml}"

# Validate specific files
dataforge batch-validate --schema schema.json config1.json config2.yaml config3.toml

# Recursively validate all files
dataforge batch-validate --schema schema.json --recursive
```

### typecheck

Check types in a data file.

```bash
dataforge typecheck INPUT_FILE [--infer] [--quiet]
```

**Options:**

- `INPUT_FILE`: Input file path (or `-` for stdin)
- `--infer`: Infer schema from data and print it
- `--quiet, -q`: Minimal output

**Examples:**

```bash
# Check types in file
dataforge typecheck config.json

# Infer and print schema
dataforge typecheck config.json --infer

# Infer from stdin
cat config.json | dataforge typecheck - --infer
```
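
As a rough illustration of what `--infer` does conceptually, here is a minimal, hypothetical type-inference sketch (not DataForge's actual code) that maps values from a parsed file to JSON Schema types:

```python
# Hypothetical sketch of schema inference (illustration only, not
# DataForge's implementation): walk a parsed document and emit a
# minimal JSON Schema describing its shape.
def infer_schema(value):
    if isinstance(value, bool):       # check bool before int: bool is an int subclass
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "integer"}
    if isinstance(value, float):
        return {"type": "number"}
    if isinstance(value, str):
        return {"type": "string"}
    if value is None:
        return {"type": "null"}
    if isinstance(value, list):
        schema = {"type": "array"}
        if value:
            schema["items"] = infer_schema(value[0])  # assumes homogeneous arrays
        return schema
    if isinstance(value, dict):
        return {
            "type": "object",
            "properties": {k: infer_schema(v) for k, v in value.items()},
            "required": sorted(value),
        }
    raise TypeError(f"unsupported type: {type(value).__name__}")

print(infer_schema({"name": "demo", "port": 8080}))
```

A real inference pass would also merge heterogeneous array items and decide whether keys are truly required; this sketch simply treats every observed key as required.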

### Global Options

- `--help, -h`: Show help message
- `--version`: Show version

## Exit Codes

DataForge uses exit codes to indicate success or failure:

- `0`: Success
- `1`: Validation failed or an error occurred

This makes it suitable for use in CI/CD pipelines and scripts.

## Configuration

### Environment Variables

No environment variables are required. All configuration is done via command-line options.

### Project Configuration

DataForge can be configured via `pyproject.toml`:

```toml
[tool.dataforge]
# Custom configuration options (future)
```

## JSON Schema Validation

DataForge supports JSON Schema validation with the following features:

- Draft-07 and Draft 2019-09 schemas
- Detailed error messages with path information
- Support for all JSON Schema keywords

Example schema:

```json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "name": {
      "type": "string"
    },
    "version": {
      "type": "string",
      "pattern": "^\\d+\\.\\d+\\.\\d+$"
    }
  },
  "required": ["name", "version"]
}
```
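
For illustration, the constraints this example schema expresses can be checked by hand. The sketch below is a hedged stand-in for the checks only, not DataForge's validator (which builds on the `jsonschema` package):

```python
import re

# Hedged illustration of what the example schema enforces: "name" and
# "version" are required, and "version" must match ^\d+\.\d+\.\d+$.
# Not DataForge's implementation.
def check_config(data: dict) -> list[str]:
    errors = []
    for key in ("name", "version"):
        if key not in data:
            errors.append(f"'{key}' is a required property")
    if "version" in data and not re.fullmatch(r"\d+\.\d+\.\d+", str(data["version"])):
        errors.append(f"{data['version']!r} does not match '^\\d+\\.\\d+\\.\\d+$'")
    return errors

print(check_config({"name": "demo", "version": "1.2.3"}))  # []
print(check_config({"name": "demo"}))
```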

## Use Cases

### Configuration File Validation

Validate all configuration files before deployment:

```bash
dataforge batch-validate --schema config-schema.json --pattern "*.{json,yaml,yml,toml}" --recursive
```

### CI/CD Pipeline Integration

```bash
# Validate configs in CI
dataforge batch-validate --schema schema.json || exit 1
```

### Data Format Conversion

Convert legacy configuration files:

```bash
for file in legacy/*.json; do
    dataforge convert "$file" "${file%.json}.yaml" --to yaml
done
```

### API Response Validation

```bash
# Validate API response schema
curl -s https://api.example.com/data | dataforge validate - --schema api-schema.json
```

## Development

### Setting Up Development Environment

```bash
# Clone the repository
git clone https://7000pct.gitea.bloupla.net/7000pctAUTO/dataforge-cli.git
cd dataforge-cli

# Create virtual environment
python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install in development mode
pip install -e ".[dev]"

# Run tests
pytest -v

# Run tests with coverage
pytest --cov=dataforge
```

### Running Tests

```bash
# Run all tests
pytest -v

# Run specific test file
pytest tests/test_dataforge_parsers.py -v

# Run with coverage
pytest --cov=dataforge --cov-report=html
```

### Code Style

```bash
# Check code style (if configured)
ruff check .
```

## Contributing

1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request

## Roadmap

- [ ] Support for XML and CSV formats
- [ ] Plugin system for custom validators
- [ ] Configuration file support
- [ ] Improved error reporting with colored output
- [ ] Integration with popular frameworks

## License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## Acknowledgments

- [Click](https://click.palletsprojects.com/) for the CLI framework
- [PyYAML](https://pyyaml.org/) for YAML support
- [jsonschema](https://python-jsonschema.readthedocs.io/) for JSON Schema validation