foundryts.functions.udf

foundryts.functions.udf(func, columns=None, types=None)

Returns a function that will call a user-defined function on the dataframe result of queries.

User-defined functions (UDFs) are a special time series feature that allows running custom Python code on the result of queries that return dataframes. The UDF is applied to the final dataframe produced by all queries.

  • Parameters:
  • func (Callable[[pandas.DataFrame], Any]) – User-defined function to apply.
  • columns (List[str], optional) – List of column names for the resulting dataframe when func returns a pandas.DataFrame (default is the original column names in the input dataframe).
  • types (List[Any], optional) – List of column types for the resulting dataframe when func returns a pandas.DataFrame (default is the original column types in the input dataframe).
  • Returns: The result of applying the UDF to the input pandas.DataFrame.
  • Return type: Any
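
Because func is an ordinary callable that takes a pandas.DataFrame, it can be exercised on a hand-built dataframe before it is passed to udf. The sketch below is pandas-only and does not call foundryts. The with_delta helper and the delta column are hypothetical, and the columns and types lists rest on the assumption that they should describe the dataframe that func returns, following the pattern of the example on this page.

```python
import pandas as pd


def with_delta(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical UDF: add the change in value since the previous point."""
    out = df.copy()  # work on a copy so the caller's dataframe is untouched
    out["delta"] = out["value"].diff().fillna(0.0)
    return out


# Hand-built stand-in for a query result (not produced by foundryts).
frame = pd.DataFrame(
    {
        "timestamp": pd.to_datetime([0, 100, 140, 200], unit="ns"),
        "value": [0.0, 100.0, 140.0, 200.0],
    }
)

result = with_delta(frame)

# Assumption: these lists are what one would pass as columns= and types=
# so that they match the shape of the dataframe the UDF returns.
columns = list(result.columns)
types = [int, float, float]
print(columns)
print(result["delta"].tolist())
```

Checking the function locally first makes it easier to confirm that the names and number of output columns line up with the lists supplied alongside it.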

:::callout{theme="success" title="See Also"} dsl() :::

Examples

>>> import pandas
>>> import foundryts.functions as F
>>> series = F.points((0, 0.0), (100, 100.0), (140, 140.0), (200, 200.0), name="series")
>>> series.to_pandas()
                      timestamp  value
0 1970-01-01 00:00:00.000000000    0.0
1 1970-01-01 00:00:00.000000100  100.0
2 1970-01-01 00:00:00.000000140  140.0
3 1970-01-01 00:00:00.000000200  200.0
>>> def double(df: pandas.DataFrame) -> pandas.DataFrame:
...     df["value"] *= 2
...     return df
>>> doubled_series = F.udf(double, ["timestamp", "value"], [int, float])(series)
>>> doubled_series.to_pandas()
                      timestamp  value
0 1970-01-01 00:00:00.000000000    0.0
1 1970-01-01 00:00:00.000000100  200.0
2 1970-01-01 00:00:00.000000140  280.0
3 1970-01-01 00:00:00.000000200  400.0
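
The signature types func as returning Any, so a UDF is not necessarily limited to returning a dataframe. As a minimal pandas-only sketch, the hypothetical value_range function below reduces the same four points to a single number. It runs without foundryts, and how a non-dataframe result is surfaced by udf is not shown here.

```python
import pandas as pd


def value_range(df: pd.DataFrame) -> float:
    """Hypothetical UDF: spread between the largest and smallest value."""
    return float(df["value"].max() - df["value"].min())


# Hand-built stand-in for the four points used in the example above.
frame = pd.DataFrame(
    {
        "timestamp": [0, 100, 140, 200],
        "value": [0.0, 100.0, 140.0, 200.0],
    }
)

spread = value_range(frame)
print(spread)
```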
