Skip to main content
Version: 0.32 (Latest)

Advanced Scenarios

Advanced use cases, partition functions, and troubleshooting examples.

SQL Server Partition Functions

Use SQL Server partition functions for efficient parallel exports.

Using Partition Function with DataDriven

Leverage existing partition functions to distribute work:

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "tpch10_collation_bin2" `
--trusted `
--sourceschema "dbo" `
--sourcetable "orders_part" `
--directory "D:\temp" `
--fileoutput "orders_partitioned.csv" `
--decimalseparator "." `
--delimiter "|" `
--dateformat "yyyy-MM-dd HH:mm:ss" `
--encoding "UTF-8" `
--parallelmethod "DataDriven" `
--distributekeycolumn "`$partition.PF_DATE(o_orderdate)" `
--paralleldegree 10 `
--merge "False"
Partition Function Syntax

Use `$partition.FunctionName(ColumnName) to reference a partition function. The backtick escapes the dollar sign in PowerShell.

Date-based Partitioning

Partition by date ranges for time-series data using database partitions:

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "TimeSeriesDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "SensorReadings" `
--directory "C:\temp" `
--fileoutput "sensor_readings.parquet" `
--parallelmethod "DataDriven" `
--distributekeycolumn "`$partition.PF_Monthly(reading_date)" `
--paralleldegree 12 `
--merge "False"

Partition by date ranges for time-series data using Timepartition FastBCP parallelmethod :

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "TimeSeriesDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "SensorReadings" `
--directory "s3://mycorpexportbucket/raw/TimeSeriesDB/SensorReadings/" `
--fileoutput "sensor_readings.parquet" `
--cloudprofile "mycorps3profile" `
--parallelmethod "Timepartition" `
--distributekeycolumn "(reading_date,year,month,day)" `
--paralleldegree 12 `
--merge "False"

This technic will produce one file per day inside a directory structure (local or cloud) as a Hive partitionning directory : /year=2026/month=11/day=11/sensor_readings.parquet

You can after a full export use a query to filter your data and overwrite files that was already generated on the overlaping period.

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "TimeSeriesDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "SensorReadings_Part" `
--query "SELECT * FROM dbo.SensorReadings_Part WHERE reading_date >= CAST(GETDATE() - 10 AS DATE)" `
--directory "s3://mycorpexportbucket/raw/TimeSeriesDB/SensorReadings/" `
--fileoutput "sensor_readings.parquet" `
--cloudprofile "mycorps3profile" `
--parallelmethod "Timepartition" `
--distributekeycolumn "(reading_date,year,month,day)" `
--paralleldegree 10 `
--merge "False"

Complex Queries

Multi-table Joins

Export data with a query with joins:

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "SalesDB" `
--trusted `
--query "SELECT o.order_id, o.order_date, c.customer_name, od.product_id, od.quantity, od.unit_price FROM orders o INNER JOIN customers c ON o.customer_id = c.customer_id INNER JOIN order_details od ON o.order_id = od.order_id WHERE o.order_date >= '2024-01-01'" `
--directory "C:\temp" `
--fileoutput "sales_detail.csv" `
--decimalseparator "." `
--delimiter "|" `
--dateformat "yyyy-MM-dd HH:mm:ss" `
--encoding "UTF-8" `
--parallelmethod "Ntile" `
--distributekeycolumn "o.order_id" `
--paralleldegree 8 `
--merge "False"

Aggregated Data

Export pre-aggregated data:

.\FastBCP.exe `
--connectiontype "pgsql" `
--server "localhost" `
--port "5432" `
--database "analytics" `
--user "postgres" `
--password "postgres" `
--query "SELECT date_trunc('day', order_date) as order_day, customer_region, COUNT(*) as order_count, SUM(total_amount) as total_sales FROM orders GROUP BY date_trunc('day', order_date), customer_region" `
--directory "C:\temp" `
--fileoutput "daily_sales_by_region.csv" `
--decimalseparator "." `
--delimiter "|" `
--dateformat "yyyy-MM-dd" `
--encoding "UTF-8" `
--parallelmethod "None"

Performance Tuning

Optimal Parallel Degree

Use negative values to auto-calculate based on CPU cores:

# Use all cores minus 2
.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "LargeDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "BigTable" `
--directory "C:\temp" `
--fileoutput "bigtable.parquet" `
--parallelmethod "RangeId" `
--distributekeycolumn "id" `
--paralleldegree -2 `
--merge "False"

Parquet RowGroup Size

To reduce memory pressure and, in many cases, improve export performance, you can configure the Parquet row group size by setting the FASTBCP_RGSIZE environment variable.

A smaller FASTBCP_RGSIZE value reduces memory usage during export. However, this comes with a trade-off: smaller row groups may reduce Parquet compression efficiency and can sometimes increase the final file size.

Recommended values range from 1000 to 1_000_000.

In practice, values between 32,000 and 64,000 often provide a good balance between memory usage, performance, and compression efficiency.


$Env:FASTBCP_RGSIZE=32000

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "LargeDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "BigTable" `
--directory "C:\temp" `
--fileoutput "bigtable.parquet" `
--parallelmethod "RangeId" `
--distributekeycolumn "id" `
--paralleldegree -2 `
--merge "False"

Merge vs No-Merge

Combine parallel outputs into single file:

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "DataDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "Data" `
--directory "C:\temp" `
--fileoutput "data_merged.parquet" `
--parallelmethod "RangeId" `
--distributekeycolumn "id" `
--paralleldegree 8 `
--merge "True"

Result: Single file data_merged.parquet, slower and only OnPremise.

Enable Debug Logging

Get detailed execution information:

.\FastBCP.exe `
--connectiontype "mssql" `
--server "localhost" `
--database "TestDB" `
--trusted `
--sourceschema "dbo" `
--sourcetable "TestTable" `
--directory "C:\temp" `
--fileoutput "debug_export.csv" `
--parallelmethod "Random" `
--distributekeycolumn "id" `
--paralleldegree 4 `
--merge "False" `
--loglevel "Debug" `
--runid "debug_test_run"
Debug Mode

Use --loglevel "Debug" to see detailed execution steps, connection strings (without passwords), and query generation details.

tip

Build your command with the Wizard

Copyright © 2026 Architecture & Performance.