
Advanced configuration

This section describes how advanced configuration options can be used in Java transforms.

Maximum build duration

You may want to limit how long a job can run, either to keep data fresh or to control costs. For example, if a job calls an external service that becomes unresponsive, the job may never complete, so a limit on its run duration ensures that it is eventually cancelled. In Code Repositories, you can limit job duration by adding the MaxAllowedDuration annotation alongside the Compute annotation, as shown below:

package myproject.datasets;

import com.palantir.transforms.lang.java.api.*;
import java.time.Duration;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

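public final class FilterTransform {

    // ISO 8601 duration: cancel the job if it runs longer than 2 hours.
    @MaxAllowedDuration(value = "PT2H")
    @Compute
    public void myComputeFunction(
            @Input("/examples/students_hair_eye_color") FoundryInput myInput,
            @Output("/examples/students_hair_eye_color_filtered") FoundryOutput myOutput) {
        // Read the input dataset as a Spark DataFrame.
        Dataset<Row> inputDf = myInput.asDataFrame().read();
        // Keep only the rows with brown eyes and write them to the output dataset.
        myOutput.getDataFrameWriter(inputDf.filter("eye = 'Brown'")).write();
    }
}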

:::callout{theme="neutral"} Although MaxAllowedDuration accepts any duration in ISO 8601 format, the job is polled only every 5 minutes, so the limit takes effect at the next poll after it has elapsed. A value of PT3M cancels the job at 5 minutes, a value of PT7M cancels it at 10 minutes, and so on. :::
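
The rounding described in the note can be sketched with plain `java.time` arithmetic. This is only an illustration and not part of the transforms API: the class `EffectiveCancellation` and its method are hypothetical names, the 5-minute interval is taken from the note above, and the sketch assumes that a limit falling exactly on a polling boundary is enforced at that boundary, which the note does not state.

```java
import java.time.Duration;

public final class EffectiveCancellation {

    // Polling interval as described in the note above (assumed fixed at 5 minutes).
    public static final Duration POLL_INTERVAL = Duration.ofMinutes(5);

    /**
     * Rounds an ISO 8601 limit up to the next polling boundary.
     * Assumption: a limit exactly on a boundary is enforced at that boundary,
     * and the earliest possible cancellation is the first poll.
     */
    public static Duration effectiveCancellation(String isoLimit) {
        Duration limit = Duration.parse(isoLimit);
        long pollSeconds = POLL_INTERVAL.getSeconds();
        // Ceiling division: number of polls needed before the limit has elapsed.
        long polls = (limit.getSeconds() + pollSeconds - 1) / pollSeconds;
        return POLL_INTERVAL.multipliedBy(Math.max(polls, 1));
    }

    public static void main(String[] args) {
        for (String limit : new String[] {"PT3M", "PT7M", "PT2H"}) {
            System.out.println(limit + " -> " + effectiveCancellation(limit));
        }
    }
}
```

Under these assumptions, choosing a limit that is already a multiple of 5 minutes (such as PT2H in the example above) keeps the configured value and the effective cancellation time the same.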

