Version: 0.30 (Latest)

Release Notes 0.30

New Features

Timepartition Parallel Method

Added a new parallel method, Timepartition, for time-based data partitioning. It uses a date or datetime column to split the data into time-based chunks, so that each temporal range can be exported in parallel.

Key features:

Ideal for time-series data and large temporal datasets
Supports multiple time granularities: year, month, week, and day
Each parallel thread exports data for a specific time period
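
To make the idea concrete, the sketch below groups a few dates into time buckets, where each bucket stands for the unit of work one thread would export. It only illustrates the concept: how FastBCP computes and schedules partitions internally is not described in these notes, `partition_key` and `group_by_partition` are hypothetical helpers, and ISO week numbering is an assumption.

```python
from collections import defaultdict
from datetime import date


def partition_key(d: date, granularity: str) -> tuple:
    """Return the time bucket a date falls into (hypothetical helper)."""
    if granularity == "year":
        return (d.year,)
    if granularity == "month":
        return (d.year, d.month)
    if granularity == "week":
        # Assumption: ISO week numbering; FastBCP's week definition may differ.
        iso = d.isocalendar()
        return (iso[0], iso[1])
    if granularity == "day":
        return (d.year, d.month, d.day)
    raise ValueError(f"unsupported granularity: {granularity}")


def group_by_partition(dates, granularity):
    """Group dates into buckets; each bucket is one independent unit of work."""
    buckets = defaultdict(list)
    for d in dates:
        buckets[partition_key(d, granularity)].append(d)
    return dict(buckets)


order_dates = [date(1995, 1, 3), date(1995, 1, 28), date(1995, 2, 14), date(1996, 1, 5)]
buckets = group_by_partition(order_dates, "month")
for key in sorted(buckets):
    print(key, len(buckets[key]))
```

With month granularity the four sample dates fall into three buckets, so at most three threads would have work to do regardless of the configured degree of parallelism.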

Distribute Key Column Format:

When using Timepartition, the --distributekeycolumn parameter must be given in one of the following formats:

(datecolumn, year, month) - Partition by year and month
(datecolumn, year, month, day) - Partition by year, month, and day
(datecolumn, year) - Partition by year only
(datecolumn, year, week) - Partition by year and week
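
The four formats above can be generated from a column name and a target granularity. The following is a minimal sketch, not part of FastBCP: `distribute_key` is a hypothetical helper, and it writes the value without spaces, matching the form used in the command-line example in these notes.

```python
# Granularity tokens for each of the four documented formats.
_FORMATS = {
    "year": ("year",),
    "month": ("year", "month"),
    "day": ("year", "month", "day"),
    "week": ("year", "week"),
}


def distribute_key(column: str, granularity: str) -> str:
    """Build a --distributekeycolumn value for Timepartition (hypothetical helper)."""
    if granularity not in _FORMATS:
        raise ValueError(f"granularity must be one of {sorted(_FORMATS)}")
    return "(" + ",".join((column,) + _FORMATS[granularity]) + ")"


print(distribute_key("o_orderdate", "month"))
```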

Example:

.\FastBCP.exe `
 --connectiontype "mssql" `
 --server "localhost" `
 --database "tpch_copy" `
 --user "FastUser" `
 --password "FastPassword" `
 --sourceschema "tpch_10" `
 --sourcetable "orders_date_sorted" `
 --directory "D:\temp\TestPartition" `
 --fileoutput "orders.csv" `
 --method "Timepartition" `
 --distributekeycolumn "(o_orderdate,year,month)" `
 --paralleldegree 16 `
 --merge "False"

See Parallel Parameters for complete details.

Parquet Row Group Size Configuration

This feature is available starting from FastBCP version 0.30.1.

Added support for configuring Parquet row group size through the FASTBCP_RGSIZE environment variable. This allows fine-tuning of Parquet file structure for optimal query performance and compression.

Key features:

Control row group size for Parquet exports
Default value: 1,000,000 rows
Configurable via environment variable
Impacts parallelization and compression efficiency
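
As a rough guide to what a given setting means for file layout, the sketch below estimates how many row groups one Parquet file would contain. It assumes every row group is filled to the configured size except possibly the last, which may not hold exactly for FastBCP's writer; `row_group_count` is a hypothetical helper.

```python
DEFAULT_RGSIZE = 1_000_000  # default value stated in these notes


def row_group_count(total_rows: int, rgsize: int = DEFAULT_RGSIZE) -> int:
    """Estimate the number of row groups in one Parquet file.

    Assumes each row group holds rgsize rows except possibly the last one.
    """
    if total_rows < 0 or rgsize <= 0:
        raise ValueError("total_rows must be >= 0 and rgsize must be > 0")
    # Ceiling division using integers only.
    return -(-total_rows // rgsize)


print(row_group_count(15_000_000))           # default row group size
print(row_group_count(15_000_000, 500_000))  # FASTBCP_RGSIZE=500000
```

Under this assumption, halving the row group size doubles the number of row groups in the file. When a parallel export is not merged, the estimate applies to each output file separately.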

Usage:

Windows (PowerShell):

# Set the row group size to 500,000 rows
$env:FASTBCP_RGSIZE = "500000"

# Run FastBCP
.\FastBCP.exe --connectiontype mssql --server "localhost" ...

Linux:

# Set the row group size to 500,000 rows
export FASTBCP_RGSIZE=500000

# Run FastBCP
./FastBCP --connectiontype mssql --server "localhost" ...

See Parquet Formatting for complete details.

Config File Parameter

Added the --config parameter to support YAML configuration files. This provides a more structured and readable way to manage export configurations compared to JSON settings files.

Key benefits:

Human-friendly YAML format with comment support
Structured sections: connection, source, output, performance, logging
Version control friendly
Reduces a complex command line to a single parameter

YAML Configuration Sections:

connection: type, server, database, trusted
source: schema, table
output: file, directory, delimiter, decimal_separator, date_format, encoding
performance: method, degree, distribute_key_column, merge
logging: run_id
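
Because the file is organized into a fixed set of top-level sections, a misspelled section name is easy to catch before running an export. The sketch below is an illustration only: `unknown_sections` is a hypothetical helper, the dictionary stands in for a config already parsed by a YAML library, and FastBCP's own validation may behave differently.

```python
# Top-level sections listed in these release notes.
KNOWN_SECTIONS = {"connection", "source", "output", "performance", "logging"}


def unknown_sections(config: dict) -> list:
    """Return top-level keys that are not documented sections (hypothetical helper)."""
    return sorted(set(config) - KNOWN_SECTIONS)


# Stand-in for a parsed YAML file; "perfomance" is a deliberate typo.
config = {
    "connection": {"type": "mssql", "server": "localhost"},
    "source": {"schema": "dbo", "table": "orders"},
    "perfomance": {"method": "RangeId"},
}

print(unknown_sections(config))
```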

Example:

# FastBCP – MSSQL to CSV using RangeId parallel method
connection:
  type: mssql
  server: localhost
  database: tpch10_collation_bin2
  trusted: true

source:
  schema: dbo
  table: orders

output:
  file: "mssql_orders_{startdate}.csv"
  directory: 'D:\temp\{database}\{schema}\{table}\full\'
  delimiter: "|"
  decimal_separator: "."
  date_format: "yyyy-MM-dd HH:mm:ss"
  encoding: UTF-8

performance:
  method: RangeId
  degree: -2
  distribute_key_column: o_orderkey
  merge: false

logging:
  run_id: mssql_to_csv_parallel-2_rangeid

Usage:

.\FastBCP.exe --config samples\sample_mssql_to_csv.yaml

See Advanced Parameters for complete details.
