hckr data

Data-related commands

hckr data [OPTIONS] COMMAND [ARGS]...

faker

This command generates fake data in different file formats.

Example Usage:

  • Suppose we have the following schema, in the format {"column_name": "faker_provider"}

  • All available Faker providers are listed in the Faker documentation

schema.json
{
    "user_name": "name",
    "user_address": "address",
    "user_email": "email"
}
  • Generate csv data using the schema

$ hckr data faker -s schema.json -o file.csv
  • Similarly, we can generate avro (or any other valid format) data

$ hckr data faker -s schema.json -o file.avro
  • We can also specify the data format explicitly if the file extension does not make it clear

$ hckr data faker -s schema.json -o output -f csv
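
The mapping from schema to generated rows can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not hckr's implementation: STAND_IN_PROVIDERS, generate_rows and rows_to_csv are hypothetical names, and the stand-in functions only imitate what the Faker providers "name", "address" and "email" would return.

```python
import csv
import io
import json
import random

# Hypothetical stand-ins for Faker providers. The real tool resolves names
# such as "name" or "email" through the Faker library instead.
STAND_IN_PROVIDERS = {
    "name": lambda rng: rng.choice(["Ada Lovelace", "Alan Turing", "Grace Hopper"]),
    "address": lambda rng: f"{rng.randint(1, 999)} Example Street",
    "email": lambda rng: f"user{rng.randint(1, 9999)}@example.com",
}


def generate_rows(schema, count, seed=0):
    """Return `count` dicts, each with one generated value per schema column."""
    rng = random.Random(seed)
    return [
        {column: STAND_IN_PROVIDERS[provider](rng) for column, provider in schema.items()}
        for _ in range(count)
    ]


def rows_to_csv(rows, columns):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Same content as schema.json above: {"column_name": "faker_provider"}
schema = json.loads(
    '{"user_name": "name", "user_address": "address", "user_email": "email"}'
)
rows = generate_rows(schema, count=3)
csv_text = rows_to_csv(rows, list(schema))
print(csv_text)
```

Each key of the schema becomes a column, and each value selects the generator used to fill that column.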

Command Reference:

hckr data faker [OPTIONS]

Options

-c, --count <count>

Number of data entries to generate.

-s, --schema <schema>

Required Path to the JSON schema file.

-o, --output <output>

Required Output file path

-f, --format <format>

Output file format. Options: ['txt', 'csv', 'avro', 'json', 'excel', 'parquet'] [default: inferred from file extension]
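
As a rough sketch of how the default format could be inferred from the file extension, with an explicit format taking priority: the function resolve_format and the EXTENSION_ALIASES table are hypothetical and written for illustration only. The tool's real inference rules, in particular which extensions map to 'excel', are not documented here and may differ.

```python
from pathlib import Path

# Formats listed for the --format option above.
SUPPORTED_FORMATS = ("txt", "csv", "avro", "json", "excel", "parquet")

# Assumed extension aliases; the tool's actual mapping may differ.
EXTENSION_ALIASES = {"xlsx": "excel", "xls": "excel"}


def resolve_format(output_path, explicit_format=None):
    """Pick the explicit format if given, else infer it from the extension."""
    if explicit_format is not None:
        candidate = explicit_format.lower()
    else:
        suffix = Path(output_path).suffix.lstrip(".").lower()
        candidate = EXTENSION_ALIASES.get(suffix, suffix)
    if candidate not in SUPPORTED_FORMATS:
        raise ValueError(
            f"Cannot determine format for {output_path!r}; pass one of {SUPPORTED_FORMATS}"
        )
    return candidate


print(resolve_format("file.avro"))      # inferred from the extension
print(resolve_format("output", "csv"))  # explicit format, as with -f csv
```

A path such as output, which has no extension, raises ValueError in this sketch unless a format is passed explicitly. That mirrors the case where -f is needed.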

peek

This command lets us peek at the top COUNT rows of a file.

Example Usage:

  • We can look into an input file by providing the -i or --input option

  • It shows the top 10 rows by default

$ hckr data peek -i input.avro
  • We can also set the number of rows to show using the -c or --count option

$ hckr data peek -i input.avro -c 20
  • We can also specify the data format explicitly using the -f or --format option if the file extension does not make it clear

$ hckr data peek -i input.data -f csv
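
For CSV input, peeking at the top rows amounts to reading the header and then only the first COUNT records. The sketch below uses only the Python standard library. peek_csv is a hypothetical helper, not part of hckr, and it assumes the header line is not counted among the COUNT rows.

```python
import csv
import io
from itertools import islice


def peek_csv(stream, count=10):
    """Return the header and at most `count` data rows from a CSV stream."""
    reader = csv.reader(stream)
    header = next(reader, None)          # None if the stream is empty
    rows = list(islice(reader, count))   # stops reading after `count` rows
    return header, rows


# In-memory sample with 25 data rows, standing in for an input file.
sample_text = "id,name\n" + "".join(f"{i},user{i}\n" for i in range(1, 26))

header, rows = peek_csv(io.StringIO(sample_text))
more_header, more_rows = peek_csv(io.StringIO(sample_text), count=20)
print(header, len(rows), len(more_rows))
```

Because islice stops after COUNT rows, the rest of the file is never read, which keeps peeking cheap on large inputs.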

Command Reference:

hckr data peek [OPTIONS]

Options

-i, --input <input>

Required Input file to peek into

-c, --count <count>

Number of rows to show [default: 10]

-f, --format <format>

Input file format; if not provided, it is inferred from the file extension