Add an Iceberg DataSource
This page explains the configuration required when adding an Iceberg DataSource.
BladePipe supports three types of Iceberg Catalogs and two storage backends, with the following combinations:
- AWS Glue + AWS S3
- Nessie + MinIO / AWS S3
- REST + MinIO / AWS S3
The configuration options in BladePipe for Iceberg are based on these supported combinations.
Configuration Explanation
Click DataSource > Add DataSource. Select Iceberg under Self Maintenance.
General Configuration
Address: Fill in the Catalog service endpoint. Example endpoints for the three supported Catalogs are as follows (replace the content within <> with actual values).
- AWS Glue: glue.<aws_glue_region_code>.amazonaws.com
- Nessie: <nessie_server_ip>:19120/api/v1
- REST: <rest_server_ip>:<rest_server_port>
Version: Select the exact Iceberg version.
Description: Add a description to easily identify the DataSource.
Physical Region: Select the region closest to where the Catalog is deployed, or keep the default value. This value is used for identification.
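The Address patterns listed above can be sketched as a small helper. This is only an illustration: the function `example_address` and its keyword arguments are hypothetical names, not part of BladePipe, and port 19120 is taken from the Nessie example above, so it may differ in your deployment.

```python
def example_address(catalog_type: str, **parts: str) -> str:
    """Build an Address value following the example endpoint patterns.

    Hypothetical helper for illustration; BladePipe only needs the
    resulting string typed into the Address field.
    """
    kind = catalog_type.upper()
    if kind == "GLUE":
        # glue.<aws_glue_region_code>.amazonaws.com
        return f"glue.{parts['region']}.amazonaws.com"
    if kind == "NESSIE":
        # <nessie_server_ip>:19120/api/v1 (port taken from the example above)
        return f"{parts['host']}:19120/api/v1"
    if kind == "REST":
        # <rest_server_ip>:<rest_server_port>
        return f"{parts['host']}:{parts['port']}"
    raise ValueError(f"unsupported catalog type: {catalog_type}")


print(example_address("GLUE", region="us-west-2"))
print(example_address("NESSIE", host="10.0.0.5"))
print(example_address("REST", host="10.0.0.6", port="8181"))
```

The host, port, and region values in the calls are placeholders; substitute the values of your own Catalog service.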
Parameter Configuration
httpsEnabled: If the Catalog is AWS Glue, this parameter must be set to true. For the other two types (Nessie and REST), set this value according to whether SSL is enabled on the deployed Catalog service.
catalogName: Specify the name of the Catalog.
catalogType: Define the Catalog type (GLUE/NESSIE/REST).
catalogWarehouse: Fill in the root path of the Iceberg file storage. For example, if set to s3://bladepipe-iceberg, both metadata and data files will be created under the /bladepipe-iceberg directory in the target storage.
catalogProps: The configuration varies depending on the combination of Catalog and storage type. Examples are provided below (replace values inside <> with actual ones):
- AWS Glue + AWS S3
{
"io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
"s3.endpoint": "https://s3.<aws_s3_region_code>.amazonaws.com",
"s3.access-key-id": "<aws_s3_iam_user_access_key>",
"s3.secret-access-key": "<aws_s3_iam_user_secret_key>",
"s3.path-style-access": "true",
"client.region": "<aws_s3_region_code>",
"client.credentials-provider.glue.access-key-id": "<aws_glue_iam_user_access_key>",
"client.credentials-provider.glue.secret-access-key": "<aws_glue_iam_user_secret_key>",
"client.credentials-provider": "com.amazonaws.glue.catalog.credentials.GlueAwsCredentialsProvider"
}
- Nessie + AWS S3
{
"io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
"s3.endpoint": "https://s3.<aws_s3_region_code>.amazonaws.com",
"s3.access-key-id": "<aws_s3_iam_user_access_key>",
"s3.secret-access-key": "<aws_s3_iam_user_secret_key>",
"s3.path-style-access": "true",
"client.region": "<aws_s3_region_code>"
}
- Nessie + MinIO
{
"io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
"s3.endpoint": "http://<minio_server>:<minio_port>",
"s3.access-key-id": "<minio_user>",
"s3.secret-access-key": "<minio_password>",
"s3.path-style-access": "true",
"client.region": "us-east-1"
}
- REST + AWS S3
{
"io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
"s3.endpoint": "https://s3.<aws_s3_region_code>.amazonaws.com",
"s3.access-key-id": "<aws_s3_iam_user_access_key>",
"s3.secret-access-key": "<aws_s3_iam_user_secret_key>",
"s3.path-style-access": "true",
"client.region": "<aws_s3_region_code>"
}
- REST + MinIO
{
"io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
"s3.endpoint": "http://<minio_server>:<minio_port>",
"s3.access-key-id": "<minio_user>",
"s3.secret-access-key": "<minio_password>",
"s3.path-style-access": "true",
"client.region": "us-east-1"
}
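In the Nessie and REST examples above, the catalogProps values differ only in the storage endpoint, the credentials, and the region. A minimal sketch of generating that JSON follows. The function names (`s3_catalog_props`, `minio_props`, `aws_s3_props`) are hypothetical and not part of BladePipe, and the AWS Glue example is not covered because it adds extra `client.credentials-provider` keys.

```python
import json


def s3_catalog_props(endpoint: str, access_key: str, secret_key: str, region: str) -> dict:
    """Return the keys shared by the Nessie/REST examples (hypothetical helper)."""
    return {
        "io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
        "s3.endpoint": endpoint,
        "s3.access-key-id": access_key,
        "s3.secret-access-key": secret_key,
        "s3.path-style-access": "true",
        "client.region": region,
    }


def minio_props(host: str, port: int, user: str, password: str) -> dict:
    # The MinIO examples use an http:// endpoint and the fixed region "us-east-1".
    return s3_catalog_props(f"http://{host}:{port}", user, password, "us-east-1")


def aws_s3_props(region_code: str, access_key: str, secret_key: str) -> dict:
    # The AWS S3 examples use the regional endpoint and the same region code.
    return s3_catalog_props(
        f"https://s3.{region_code}.amazonaws.com", access_key, secret_key, region_code
    )


# Placeholder values only; replace them with your own before use.
props = minio_props("minio.example.internal", 9000, "<minio_user>", "<minio_password>")
print(json.dumps(props, indent=2))
```

The printed JSON can be pasted into the catalogProps field. Generating it this way mainly guards against malformed JSON, such as a missing comma or quote.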