fix: Fix PythonGenerateNode and PythonExecuteNode reading from SQL_RESULT_LIST_MEMORY (only the latest SQL's execution result was retrieved) #336
lijialun-leo wants to merge 4 commits into spring-ai-alibaba:main
Conversation
}

// Extract sample SQL results from executionResults
private List<List<Map<String, String>>> convertExecutionResultsToList(HashMap<String, String> executionResults) {
Can I extract this into the PlanProcessUtil class?

Either works; whatever you think is reasonable is fine.
public static List<List<Map<String, String>>> convertExecutionResultsToList(
        HashMap<String, String> executionResults, Integer sampleDataNumber) {
    List<List<Map<String, String>>> convertedResults = new ArrayList<>();
The core reason is that the SQL_EXECUTE_NODE_OUTPUT result is complex, and that object is already used in several places.

My suggestion is to define an entity class, for example:

record SqlResultRow(Map<String, ?> data) {}
record SqlResult(List<SqlResultRow> rows) {}

so this method can return List<SqlResult>. Alternatively, since the Maps in one query result all share the same keys, you could define:

record SqlResult(Map<String, List<?>> data) {}

You could write a separate utility class to convert the SQL_EXECUTE_NODE_OUTPUT result into such values. If it is inconvenient to make this change now, mark this code with a TODO and open an issue explaining it.
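The column-oriented variant of this suggestion could be sketched as follows. This is a minimal illustration only; the class and method names (`SqlResultSketch`, `fromRows`) are assumptions, not the project's actual API, and it operates on already-parsed rows rather than the raw JSON strings:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: model one SQL query's result as a column-oriented
// record instead of List<Map<String, String>>.
public class SqlResultSketch {

    // One query result: every row shares the same keys, so store columns.
    public record SqlResult(Map<String, List<String>> data) {}

    // Convert row-oriented data (as parsed from SQL_EXECUTE_NODE_OUTPUT)
    // into the column-oriented record.
    public static SqlResult fromRows(List<Map<String, String>> rows) {
        Map<String, List<String>> columns = new LinkedHashMap<>();
        for (Map<String, String> row : rows) {
            for (Map.Entry<String, String> e : row.entrySet()) {
                columns.computeIfAbsent(e.getKey(), k -> new ArrayList<>())
                    .add(e.getValue());
            }
        }
        return new SqlResult(columns);
    }
}
```

A record keeps the shape explicit at call sites, which is the reviewer's point: callers would see `List<SqlResult>` instead of the opaque `List<List<Map<String, String>>>`.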
        ? StateUtil.getListValue(state, SQL_RESULT_LIST_MEMORY) : new ArrayList<>();
HashMap<String, String> executionResults = StateUtil.getObjectValue(state, SQL_EXECUTE_NODE_OUTPUT,
        HashMap.class, new HashMap<>());
List<List<Map<String, String>>> sqlResults = PlanProcessUtil.convertExecutionResultsToList(executionResults,
Pull request overview
This PR fixes an issue where PythonGenerateNode and PythonExecuteNode were only retrieving the most recent SQL execution result instead of all step results from a multi-step SQL plan.
Changes:
- Modified both Python nodes to retrieve all SQL execution results from SQL_EXECUTE_NODE_OUTPUT (a HashMap) instead of SQL_RESULT_LIST_MEMORY (single result)
- Added a new utility method convertExecutionResultsToList in PlanProcessUtil to convert the execution results HashMap into a list-of-lists structure
- Updated the Python code generation prompt template to handle multi-step SQL results as a List[List[Dict]] input format
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| python-generator.txt | Updated prompt to specify input format as List[List[Dict]] for multi-step results and provided example code templates for processing multiple steps |
| PythonGenerateNode.java | Changed to retrieve all SQL results from SQL_EXECUTE_NODE_OUTPUT and convert to limited sample data using new utility method |
| PythonExecuteNode.java | Changed to retrieve all SQL results from SQL_EXECUTE_NODE_OUTPUT and convert to full list using new utility method |
| PlanProcessUtil.java | Added convertExecutionResultsToList method to parse JSON results from HashMap into structured list format |
result = \{
    "summary": \{\},
    "details": []
\}

# Example logic: compute statistics for each numeric column (adjust as needed)
for column in df.columns:
    if pd.api.types.is_numeric_dtype(df[column]):
        result["summary"][column] = \{
            "mean": df[column].mean(),
            "min": df[column].min(),
            "max": df[column].max()
        \}

# Example logic: group the raw data and count (adjust as needed)
grouped = df.groupby(list(df.columns[:2])).size().reset_index(name="count")
result["details"] = grouped.to_dict(orient="records")

# Process each step's data
for step_index, step_data in enumerate(input_data):
    if not step_data:
        continue

    step_df = pd.DataFrame(step_data)
    # Automatic type inference: convert numeric-looking strings to actual
    # numeric types (the backend returns all values as strings)
    for col in step_df.columns:
        step_df[col] = pd.to_numeric(step_df[col], errors='ignore')

    step_result = \{
        "step_number": step_index + 1,
        "statistics": \{\}
    \}

    # Compute statistics for numeric columns
    for col in step_df.columns:
        if pd.api.types.is_numeric_dtype(step_df[col]):
            step_result["statistics"][col] = \{
                "mean": step_df[col].mean(),
                "min": step_df[col].min(),
                "max": step_df[col].max(),
                "std": step_df[col].std()
            \}

    # Compute frequency distributions for categorical columns
    for col in step_df.columns:
        if not pd.api.types.is_numeric_dtype(step_df[col]):
            frequency = step_df[col].value_counts().to_dict()
            step_result["statistics"][col] = \{
                "frequency": frequency
            \}

    result["step_analyses"].append(step_result)
In the example code template, the variable "step_analyses" is referenced but it was never initialized in the result dictionary. The result dict only has "summary" and "details" fields, but the code attempts to append to "result['step_analyses']" which will cause a KeyError at runtime.
for (int i = 1; i <= executionResults.size(); i++) {
    String stepKey = "step_" + i;
    String jsonResult = executionResults.get(stepKey);
The loop assumes that step keys are numbered consecutively starting from 1, using executionResults.size() as the upper bound. However, if there are gaps in the step numbering (e.g., step_1, step_3, step_5), this will miss some steps and might attempt to access non-existent keys. The loop should iterate over the actual keys in the HashMap rather than assuming consecutive numbering.
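A sketch of the key-driven iteration this comment asks for might look like the following. Class and method names are illustrative, not the project's API; it assumes only the `step_N` key convention visible in the diff:

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: iterate the keys actually present in the map
// instead of assuming consecutive step_1..step_N numbering.
public class StepKeyIteration {

    private static final Pattern STEP_KEY = Pattern.compile("step_(\\d+)");

    // Returns step results ordered by their numeric suffix, tolerating
    // gaps in the numbering (e.g. step_1, step_3, step_10).
    public static List<String> orderedStepResults(HashMap<String, String> executionResults) {
        List<Integer> stepNumbers = new ArrayList<>();
        for (String key : executionResults.keySet()) {
            Matcher m = STEP_KEY.matcher(key);
            if (m.matches()) {
                stepNumbers.add(Integer.parseInt(m.group(1)));
            }
        }
        // Sort numerically so step_10 comes after step_2, not after step_1
        stepNumbers.sort(Comparator.naturalOrder());
        List<String> ordered = new ArrayList<>();
        for (int n : stepNumbers) {
            ordered.add(executionResults.get("step_" + n));
        }
        return ordered;
    }
}
```

Sorting the parsed integers rather than the raw strings also avoids the lexicographic trap where "step_10" would sort before "step_2".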
    else {
        convertedResults.add(stepData);
    }
}
When a step's data is successfully parsed but stepData is null, it is silently skipped without being added to the result list. This creates inconsistency in the output where step indices in the returned list may not match the actual step numbers. Consider adding an empty list for steps with null data to maintain positional consistency, or document this behavior clearly.
Suggested change:

        }
    }
    else {
        // Maintain positional consistency: add an empty list when step data is null
        convertedResults.add(new ArrayList<>());
    }
converter = new BeanOutputConverter<>(new ParameterizedTypeReference<>() {
});

objectMapper = new ObjectMapper().setSerializationInclusion(JsonInclude.Include.NON_NULL);
}
// Process results in step order
for (int i = 1; i <= executionResults.size(); i++) {
    String stepKey = "step_" + i;
The logic here for retrieving previous results could be moved into the PlanProcessUtil utility class, so that the key construction is unified in one place.
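A minimal sketch of centralizing the key convention, assuming hypothetical helper names (`stepKey`, `getStepResult`) rather than the project's actual API:

```java
import java.util.Map;

// Hypothetical sketch: keep the "step_" + i spelling in one place so call
// sites never build the key inline.
public class PlanProcessUtilSketch {

    private static final String STEP_KEY_PREFIX = "step_";

    // Single place that knows how step keys are spelled.
    public static String stepKey(int stepNumber) {
        return STEP_KEY_PREFIX + stepNumber;
    }

    // Retrieval goes through the same helper.
    public static String getStepResult(Map<String, String> executionResults, int stepNumber) {
        return executionResults.get(stepKey(stepNumber));
    }
}
```

With this in place, a later change to the key format (say, zero-padded step numbers) touches one method instead of every node that reads SQL_EXECUTE_NODE_OUTPUT.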
Describe what this PR does / why we need it
In a multi-step SQL plan, the downstream PythonGenerateNode and PythonExecuteNode nodes were fed SQL_RESULT_LIST_MEMORY; they should instead take the SQL results and sample data of all steps from SQL_EXECUTE_NODE_OUTPUT.
Does this pull request fix one issue?
Describe how you did it
Modified PythonGenerateNode and PythonExecuteNode to take all steps' SQL results and sample data from SQL_EXECUTE_NODE_OUTPUT.
Also updated the prompt so the generated code reads the multi-step results for analysis.
Describe how to verify it
Special notes for reviews