AWS Redshift
Connect Foundry to Amazon Redshift to sync data from Redshift databases into Foundry datasets and to export Foundry datasets to Redshift tables.
Supported capabilities
| Capability | Status |
|---|---|
| Exploration | 🟢 Generally available |
| Batch syncs | 🟢 Generally available |
| Incremental | 🟢 Generally available |
| Table exports | 🟢 Generally available |
Data model
The connector transfers relational data from Redshift tables into Foundry datasets. Schemas and data types are preserved during transfer. You can also export Foundry datasets to Redshift tables.
Performance and limitations
Performance depends on the size of your Redshift cluster and network conditions. Foundry uses Redshift's efficient data transfer mechanisms to optimize performance.
:::callout{theme="warning"} Network connectivity between your Foundry instance and AWS Redshift cluster is required. This may require VPC peering, AWS PrivateLink, or public Internet access with proper security configurations. :::
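The connectivity requirement above can be sanity-checked with a plain TCP probe before you configure the source. The following is a minimal sketch, not part of the connector: `can_reach` is a hypothetical helper, the hostname in the usage comment is a placeholder, and the check is only meaningful when run from a host that shares the same network path as your Foundry instance.

```python
import socket


def can_reach(host: str, port: int = 5439, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    This only proves that the port accepts connections; it does not
    authenticate or confirm that the listener is actually Redshift.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# Example usage (placeholder hostname, replace with your cluster endpoint):
# print(can_reach("my-cluster.example.internal", 5439))
```

A `False` result usually points to a routing, security group, or DNS issue rather than a credentials problem, since no login is attempted.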
Setup
- Open the Data Connection application and select + New Source in the upper right corner of the screen.
- Select AWS Redshift from the available connector types.
- Follow the additional configuration prompts to continue the setup of your connector using the information in the sections below.
Learn more about setting up a connector in Foundry.
Configuration options
The following configuration options are available for the AWS Redshift connector:
| Option | Required? | Description |
|---|---|---|
| Endpoint | Yes | The endpoint to use to access Redshift (for example, redshift.us-east-1.amazonaws.com) |
| Port | Yes | The port to connect to Redshift (default: 5439) |
| Database | Yes | The name of the Redshift database |
| Username | Yes | Username for authentication |
| Password | Yes | Password for authentication |
| JDBC properties | No | Add property names and values to configure connection behavior. Learn more about JDBC properties below. |
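To show how the endpoint, port, and database options relate to one another, the sketch below combines them into the standard Redshift JDBC URL form `jdbc:redshift://<endpoint>:<port>/<database>`. The `RedshiftSourceConfig` class and its field names are hypothetical and exist only for illustration; Foundry assembles the connection for you from the values entered in the source configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RedshiftSourceConfig:
    """Hypothetical container mirroring the required options in the table."""

    endpoint: str
    database: str
    username: str
    password: str
    port: int = 5439  # default Redshift port, as listed in the table

    def missing_options(self) -> list:
        """Return the names of required string options that were left empty."""
        required = {
            "endpoint": self.endpoint,
            "database": self.database,
            "username": self.username,
            "password": self.password,
        }
        return [name for name, value in required.items() if not value]

    def jdbc_url(self) -> str:
        """Build the base JDBC URL; credentials are deliberately left out of it."""
        return f"jdbc:redshift://{self.endpoint}:{self.port}/{self.database}"


# Placeholder values for illustration only.
config = RedshiftSourceConfig(
    endpoint="redshift.us-east-1.amazonaws.com",
    database="dev",
    username="example_user",
    password="example_password",
)
print(config.jdbc_url())
```

Keeping the username and password out of the URL mirrors the table above, where they are separate options rather than part of the endpoint.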
JDBC properties
You can optionally add properties ↗ to your JDBC connection to configure behavior. Refer to the AWS documentation for additional available JDBC properties to add to your connection configuration.
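As an illustration of what name and value pairs can look like, the sketch below normalizes a small set of properties into the string form that JDBC properties take. The property names shown (`ssl`, `sslmode`, `loginTimeout`, `socketTimeout`) are options of the Amazon Redshift JDBC driver, but whether a given option is appropriate for your connection is an assumption to verify against the AWS documentation. `to_jdbc_properties` is a hypothetical helper, not a Foundry API.

```python
def to_jdbc_properties(options: dict) -> dict:
    """Convert Python values into the string values JDBC properties use."""
    properties = {}
    for name, value in options.items():
        if isinstance(value, bool):
            # Check bool before anything else: JDBC expects "true"/"false",
            # not Python's "True"/"False".
            properties[name] = "true" if value else "false"
        else:
            properties[name] = str(value)
    return properties


# Illustrative values only; confirm each option against the driver docs.
example_properties = to_jdbc_properties(
    {
        "ssl": True,              # require an encrypted connection
        "sslmode": "verify-full", # also verify the server certificate and host
        "loginTimeout": 30,       # seconds to wait while connecting
        "socketTimeout": 300,     # seconds to wait on socket reads
    }
)
print(example_properties)
```

Each resulting key and value corresponds to one property name and value entered in the JDBC properties section of the source configuration.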
Cloud identity configuration
Cloud identity authentication allows Foundry to access resources in your AWS account. Cloud identities are configured and managed at the enrollment level in Control Panel. Learn how to configure cloud identities.
When using cloud identity authentication, the Role ARN will be displayed in the credentials section. After selecting the Cloud identity credential option, you must also configure the following:
- Configure an Identity and Access Management (IAM) role in the target AWS account.
- Grant the IAM role access to the Redshift cluster to which you wish to connect. You can generally do this with an IAM policy attached to the role.
- In the Redshift source configuration details, add the IAM role under the Security Token Service (STS) role ↗ configuration. The cloud identity IAM role in Foundry will attempt to assume the AWS Account IAM role ↗ when accessing Redshift.
- Configure a corresponding trust policy ↗ to allow the cloud identity IAM role to assume the target AWS account IAM role.
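The trust policy in the last step can be sketched as follows. This is a minimal example that assumes the standard IAM trust policy structure with the `sts:AssumeRole` action; the ARN is a placeholder for the Role ARN displayed in the credentials section, and `build_trust_policy` is a hypothetical helper. Your organization may require additional conditions, such as an external ID.

```python
import json


def build_trust_policy(cloud_identity_role_arn: str) -> dict:
    """Return a trust policy letting the given role assume the target role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # The principal is the Foundry cloud identity role.
                "Principal": {"AWS": cloud_identity_role_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# Placeholder ARN: substitute the Role ARN shown in the credentials section.
policy = build_trust_policy(
    "arn:aws:iam::111122223333:role/example-foundry-cloud-identity"
)
print(json.dumps(policy, indent=2))
```

The printed JSON is what you would attach as the trust relationship of the IAM role in the target AWS account, so that the cloud identity role is allowed to assume it.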
Table export configuration options
When exporting data to Redshift tables, records are inserted in batches for better performance. The default batch size is 1,000 records, but this can be configured up to 100,000 records per batch depending on your performance needs and data characteristics. For more information about table export configuration options, review our documentation.
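To make the effect of the batch size concrete, the sketch below splits a stream of records into batches using the documented default (1,000) and upper limit (100,000). It illustrates batching in general and is not Foundry's export implementation; the `batched` function and constant names are hypothetical.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

DEFAULT_BATCH_SIZE = 1_000   # documented default
MAX_BATCH_SIZE = 100_000     # documented upper limit


def batched(records: Iterable[T], batch_size: int = DEFAULT_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield consecutive lists of at most `batch_size` records."""
    if not 1 <= batch_size <= MAX_BATCH_SIZE:
        raise ValueError(f"batch_size must be between 1 and {MAX_BATCH_SIZE}")
    iterator = iter(records)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# 2,500 records with the default batch size produce three insert batches.
sizes = [len(batch) for batch in batched(range(2_500))]
print(sizes)
```

Larger batches mean fewer round trips per export but more data held in memory per insert, which is why the right value depends on row width and cluster capacity.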