foundryts.nodes.FunctionNode¶
class foundryts.nodes.FunctionNode(children)¶
Lazy query container that transforms one or more time series into a new time series, which is the output of the supplied transformation function.
Each FunctionNode can be transformed into another FunctionNode or computed to a final
SummarizerNode.
You can also resolve a lazy FunctionNode to a dataframe with FunctionNode.to_pandas() or
FunctionNode.to_dataframe(), which yields the transformed time series as a dataframe.
Examples¶
>>> import foundryts.functions as F
>>> series = F.points(
... (100, 0.0),
... (200, float("inf")),
... (300, 3.14159),
... (2147483647, 1.0),
... name="series"
... )
>>> series.to_pandas()
timestamp value
0 1970-01-01 00:00:00.000000100 0.00000
1 1970-01-01 00:00:00.000000200 inf
2 1970-01-01 00:00:00.000000300 3.14159
3 1970-01-01 00:00:02.147483647 1.00000
>>> scaled = series.scale(1.5)  # scaled is a FunctionNode that is not evaluated yet
>>> # scaled can be chained to another FunctionNode operation, resulting in another unevaluated FunctionNode
>>> time_shifted = scaled.time_shift(1000)
>>> # converting time_shifted to a pandas dataframe evaluates the lazy query, applying the scale and
>>> # time_shift operations
>>> time_shifted.to_pandas()
timestamp value
0 1970-01-01 00:00:00.000001100 0.000000
1 1970-01-01 00:00:00.000001200 inf
2 1970-01-01 00:00:00.000001300 4.712385
3 1970-01-01 00:00:02.147484647 1.500000
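The lazy behavior described above can be illustrated without foundryts at all. The following is a minimal sketch in plain Python: LazyNode, load_points, and evaluate are hypothetical names invented for this illustration and are not part of the foundryts API, and the real library may defer work in a different way. The sketch only shows the general idea that chaining records operations and that data is read once, when the final node is evaluated.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Point = Tuple[int, float]  # (timestamp, value)


@dataclass(frozen=True)
class LazyNode:
    """Hypothetical stand-in for a lazy node; NOT the foundryts implementation."""

    source: Callable[[], List[Point]]

    def scale(self, factor: float) -> "LazyNode":
        # Return a new node that wraps this one; nothing is computed yet.
        return LazyNode(lambda: [(t, v * factor) for t, v in self.source()])

    def time_shift(self, duration: int) -> "LazyNode":
        # Again, only the operation is recorded.
        return LazyNode(lambda: [(t + duration, v) for t, v in self.source()])

    def evaluate(self) -> List[Point]:
        # Evaluation runs the whole chain of deferred operations.
        return self.source()


calls: List[str] = []


def load_points() -> List[Point]:
    calls.append("load")  # record when the data is actually read
    return [(100, 0.0), (200, float("inf")), (300, 3.14159), (2147483647, 1.0)]


series = LazyNode(load_points)
time_shifted = series.scale(1.5).time_shift(1000)
assert calls == []  # chaining alone did not read any data

result = time_shifted.evaluate()
assert calls == ["load"]  # the data was read exactly once, on evaluation
```

In this sketch, evaluate() plays the role that to_pandas() and to_dataframe() play for a FunctionNode: it is the only step that triggers computation.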
columns()¶
Returns a tuple of strings representing the column names of the pandas.DataFrame
that would be produced by evaluating this node to a pandas dataframe.
:::callout{theme="warning" title="Note"}
Keys of nested objects will be flattened into a tuple, with nested keys joined by a dot (.).
:::
- Returns: Tuple containing the names of the columns in the dataframe that this node evaluates to.
- Return type: Tuple[str]
Examples¶
>>> series_node = foundryts.functions.points((100, 0.0), (200, 1.0))
>>> series_node.columns()
("timestamp", "value")
>>> stats_node = series_node.statistics(start=0, end=100, window_size=None)
>>> stats_node.columns()
("count", "smallest_point.timestamp", "start_timestamp", "latest_point.timestamp", "mean",
"earliest_point.timestamp", "largest_point.timestamp", "end_timestamp")
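The flattening rule from the note above (nested keys joined with a dot) can be sketched in plain Python. The helper flatten_keys and the sample summary dictionary are hypothetical and exist only for this illustration; they do not reflect how foundryts builds its column names internally.

```python
from typing import Any, Dict, Tuple


def flatten_keys(obj: Dict[str, Any], prefix: str = "") -> Tuple[str, ...]:
    """Return the leaf keys of a nested dict, joining nested keys with '.'."""
    names = []
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined prefix along.
            names.extend(flatten_keys(value, path))
        else:
            names.append(path)
    return tuple(names)


# Hypothetical nested result, shaped only for illustration.
summary = {
    "count": 2,
    "earliest_point": {"timestamp": 100, "value": 0.0},
    "mean": 0.5,
}
flat = flatten_keys(summary)
```

Here flat contains "earliest_point.timestamp" and "earliest_point.value" alongside the top-level keys, which mirrors the dotted names in the stats_node.columns() output.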
cumulative_aggregate(*args, **kwargs)¶
See foundryts.functions.cumulative_aggregate()
derivative()¶
See foundryts.functions.derivative()
distribution(start=None, end=None, bins=None, start_value=None, end_value=None)¶
See foundryts.functions.distribution()
dsl(program, return_type, labels=None, before='nearest', internal='linear', after='nearest')¶
first_point()¶
See foundryts.functions.first_point()
integral(method='LINEAR')¶
See foundryts.functions.integral()
interpolate(before=None, internal=None, after=None, frequency=None, rename_columns_by=None, static_column_name=None)¶
See foundryts.functions.interpolate()
last_point()¶
See foundryts.functions.last_point()
mean(children)¶
See foundryts.functions.mean()
periodic_aggregate(*args, **kwargs)¶
See foundryts.functions.periodic_aggregate()
rolling_aggregate(*args, **kwargs)¶
See foundryts.functions.rolling_aggregate()
scale(factor)¶
See foundryts.functions.scale()
scatter(start_timestamp, end_timestamp, first_interpolation, second_interpolation, regression_fit)¶
See foundryts.functions.scatter()
property series_ids¶
All series identifiers used by this node and its child nodes.
skip_nonfinite()¶
See foundryts.functions.skip_nonfinite()
statistics(start=None, end=None, window=None, **kwargs)¶
See foundryts.functions.statistics()
sum(children)¶
time_extent()¶
See foundryts.functions.time_extent()
time_range(start=None, end=None)¶
See foundryts.functions.time_range()
time_shift(duration)¶
See foundryts.functions.time_shift()
to_dataframe(fts=None)¶
Evaluates this node to a pyspark.sql.DataFrame.
PySpark DataFrames enable distributed data processing and parallelized transformations. They are useful when
the result has a large number of rows, for example when loading all the points in a raw series or the
result of a FunctionNode, or when evaluating the results of multiple SummarizerNode or
FunctionNode objects together.
- Parameters: fts (foundryts.FoundryTS, optional) – FoundryTS session used to execute the query (a new session will be created if not provided).
- Returns: Output of the node evaluated to a PySpark dataframe.
- Return type: pyspark.sql.DataFrame
Examples¶
>>> import foundryts.functions as F
>>> series_node = F.points(
... (100, 0.0), (200, float("inf")), (300, 3.14159), (2147483647, 1.0), name="series"
... )
>>> series_node.to_dataframe().show()
+-------------------------------+----------+
| timestamp                     | value    |
+-------------------------------+----------+
| 1970-01-01 00:00:00.000000100 | 0.0      |
| 1970-01-01 00:00:00.000000200 | Infinity |
| 1970-01-01 00:00:00.000000300 | 3.14159  |
| 1970-01-01 00:00:02.147483647 | 1.0      |
+-------------------------------+----------+
to_pandas(fts=None)¶
Evaluates this node to a pandas.DataFrame.
This is useful for loading raw or transformed time series data into a pandas.DataFrame
and performing transformations using operations provided by pandas.DataFrame.
- Parameters: fts (foundryts.FoundryTS, optional) – FoundryTS session used to execute the query (a new session will be created if not provided).
- Returns: Output of the node evaluated to a pandas dataframe.
- Return type: pandas.DataFrame
Examples¶
>>> import foundryts.functions as F
>>> series = F.points(
... (100, 0.0), (200, float("inf")), (300, 3.14159), (2147483647, 1.0), name="series"
... )
>>> series.to_pandas()
timestamp value
0 1970-01-01 00:00:00.000000100 0.00000
1 1970-01-01 00:00:00.000000200 inf
2 1970-01-01 00:00:00.000000300 3.14159
3 1970-01-01 00:00:02.147483647 1.00000
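The outputs on this page suggest that the integer timestamps passed to F.points are read as nanoseconds since the Unix epoch: 100 is rendered as 1970-01-01 00:00:00.000000100. The sketch below reproduces that rendering with the standard library only. The helper format_ns is a hypothetical name used for illustration, and the nanosecond interpretation is inferred from the example output rather than from a stated guarantee.

```python
from datetime import datetime, timezone


def format_ns(timestamp_ns: int) -> str:
    """Render nanoseconds since the Unix epoch (UTC) with nanosecond precision."""
    # datetime only stores microseconds, so the sub-second part is formatted by hand.
    seconds, nanos = divmod(timestamp_ns, 1_000_000_000)
    base = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return f"{base:%Y-%m-%d %H:%M:%S}.{nanos:09d}"


rendered = [format_ns(t) for t in (100, 200, 300, 2147483647)]
```

Under this assumption, the last point (2147483647) falls a little over two seconds after the epoch, which matches the final row shown above.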
types()¶
Returns a tuple of types for the columns of the pandas.DataFrame
that would be produced by evaluating this node to a pandas dataframe.
- Returns: Tuple containing the types of the columns in the dataframe that this node evaluates to.
- Return type: Tuple[Type]
Examples¶
>>> node = foundryts.functions.points()
>>> node.types()
(<class 'int'>, <class 'float'>)
>>> stats_node = node.statistics(start=0, end=100, window_size=None)
>>> stats_node.types()
(<class 'int'>, <class 'pandas._libs.tslibs.timestamps.Timestamp'>, <class 'float'>,
<class 'pandas._libs.tslibs.timestamps.Timestamp'>, <class 'pandas._libs.tslibs.timestamps.Timestamp'>,
<class 'float'>, <class 'pandas._libs.tslibs.timestamps.Timestamp'>, <class 'float'>, <class 'float'>,
<class 'pandas._libs.tslibs.timestamps.Timestamp'>, <class 'float'>,
<class 'pandas._libs.tslibs.timestamps.Timestamp'>)
udf(func, columns=None, types=None)¶
unit_conversion(from_unit, to_unit)¶
See foundryts.functions.unit_conversion()
value_shift(delta)¶
See foundryts.functions.value_shift()
where(true=None, false=None)¶
See foundryts.functions.where()