From d04987ac33f694d83c80a74e034fcaa91c2302b2 Mon Sep 17 00:00:00 2001 From: Ayesha Firdaus Date: Fri, 5 Jun 2026 03:42:42 +0530 Subject: [PATCH] =?UTF-8?q?=E2=80=98DOC-2652-gsql-files-output-4.2?= =?UTF-8?q?=E2=80=99?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- modules/querying/pages/data-types.adoc | 30 ++----------------- ...declaration-and-assignment-statements.adoc | 30 ++++++++++++++++++- .../output-statements-and-file-objects.adoc | 2 +- 3 files changed, 32 insertions(+), 30 deletions(-) diff --git a/modules/querying/pages/data-types.adoc b/modules/querying/pages/data-types.adoc index ce381aa5..a5764893 100644 --- a/modules/querying/pages/data-types.adoc +++ b/modules/querying/pages/data-types.adoc @@ -591,39 +591,13 @@ A `FILE` object is a sequential data storage object, associated with a text file When referring to a `FILE` object, we always capitalize the word `FILE` to distinguish it from ordinary files. ==== +See xref:declaration-and-assignment-statements.adoc#_file_objects[FILE Object Declaration] to declare a `FILE` object. + === Local disk file When a `FILE` object is declared, associated with a particular text file, any existing content in the text file will be erased. During the execution of the query, content written to the `FILE` will be appended to the `FILE`. When the query where the `FILE` was declared finishes running, the `FILE` contents are saved to the text file. -=== S3 object -==== Define s3 file object -The path should start with `s3://`, followed by the bucket name and the S3 path, e.g., `s3://bucket-name/queryoutput/output.csv`. During the execution of the query, content will be uploaded to the S3 bucket. Note that the S3 object cannot be modified or appended, if an S3 object with the same path already exists, it will be overwritten. - -==== Set S3 connection credentials -The S3 credentials can be set as GSQL session parameters, so they persist for a user for a full session. 
-[source,gsql] ----- -set s3_aws_access_key_id = ; -set s3_aws_secret_access_key = ; ----- - -These session parameters should be set within the GSQL Editor to enable read/write access to the specified S3 bucket for query results. Replace `` and `` with your actual AWS credentials. - -==== Output -Since S3 is a shared storage system, multiple nodes in a cluster can upload to the same S3 bucket. To handle potential conflicts and ensure unique output files, the S3 path can include a suffix based on the instance name, such as `\_GPE_{PartitionId}_{ReplicaId}`. For distributed queries, additional suffixes will be used to differentiate between the manager and worker roles on the same GPE. Specifically, suffixes like `_coordinator` and `_worker` will be added, where `_coordinator` refers to the worker manager and `_worker` refers to the worker node. - -==== Error code -For S3 bucket connection errors, refer to error code `GSQL-5301`. - -[NOTE] -==== -A `FILE` object can be passed as a parameter to another query. -When a query receives a `FILE` object as a parameter, for a file on the local machine, it can append data to that `FILE`, as can every other query which receives this FILE object as a parameter. -However, an S3 bucket `FILE` object cannot be appended to. -When you write to an S3 path, any existing object will be overwritten. -==== - == Query parameter types A query can have one or more input parameters having any of the following types: diff --git a/modules/querying/pages/declaration-and-assignment-statements.adoc b/modules/querying/pages/declaration-and-assignment-statements.adoc index 5731d270..4914c08b 100644 --- a/modules/querying/pages/declaration-and-assignment-statements.adoc +++ b/modules/querying/pages/declaration-and-assignment-statements.adoc @@ -396,7 +396,7 @@ Therefore, `S` must be declared as an ANY-type vertex set variable. 
[#_file_objects] === `FILE` objects -A `FILE` object is a sequential text storage object, associated with a text file on the local machine. +A `FILE` object is a sequential text storage object, associated with a text file on the local machine or with an S3 bucket. .EBNF for FILE object declaration [source,ebnf] @@ -419,6 +419,34 @@ See xref:querying:output-statements-and-file-objects.adoc#_file_println_statemen include::appendix:example$work_net/declaration_file_object_query.gsql[] ---- +=== S3 object +==== Define S3 file object +The path should start with `s3://`, followed by the bucket name and the S3 path, e.g., `s3://bucket-name/queryoutput/output.csv`. During the execution of the query, content will be uploaded to the S3 bucket. Note that an S3 object cannot be modified or appended to; if an S3 object with the same path already exists, it will be overwritten. + +==== Set S3 connection credentials +The S3 credentials can be set as GSQL session parameters, so they persist for a user for a full session. +[source,gsql] +---- +set s3_aws_access_key_id = ; +set s3_aws_secret_access_key = ; +---- + +These session parameters should be set within the GSQL Editor to enable read/write access to the specified S3 bucket for query results. Replace `` and `` with your actual AWS credentials. + +==== Output +Since S3 is a shared storage system, multiple nodes in a cluster can upload to the same S3 bucket. To handle potential conflicts and ensure unique output files, the S3 path can include a suffix based on the instance name, such as `\_GPE_{PartitionId}_{ReplicaId}`. For distributed queries, additional suffixes will be used to differentiate between the manager and worker roles on the same GPE. Specifically, suffixes like `_coordinator` and `_worker` will be added, where `_coordinator` refers to the worker manager and `_worker` refers to the worker node. + +==== Error code +For S3 bucket connection errors, refer to error code `GSQL-5301`. 
+ + [NOTE] +==== +A `FILE` object can be passed as a parameter to another query. +When a query receives a `FILE` object as a parameter, for a file on the local machine, it can append data to that `FILE`, as can every other query that receives this `FILE` object as a parameter. +However, an S3 bucket `FILE` object cannot be appended to. +When you write to an S3 path, any existing object will be overwritten. +==== + == Assignment and Accumulate Statements Assignment statements are used to set or update the value of a variable after it has been declared. diff --git a/modules/querying/pages/output-statements-and-file-objects.adoc b/modules/querying/pages/output-statements-and-file-objects.adoc index 6937c623..3609d41b 100644 --- a/modules/querying/pages/output-statements-and-file-objects.adoc +++ b/modules/querying/pages/output-statements-and-file-objects.adoc @@ -438,7 +438,7 @@ GSQL > RUN QUERY print_example_v2("person1") === Printing CSV to a FILE Object Instead of printing output in JSON format, output can be written to a `FILE` object in comma-separated values (CSV) format by appending the keyword `TO_CSV` followed by the `FILE` object name to the `PRINT` statement: -The FILE object can be a local file or a S3 bucket storage object, allowing flexibility in how and where the output is stored. +The `FILE` object can be a local file or an S3 bucket storage object, allowing flexibility in how and where the output is stored. See xref:declaration-and-assignment-statements.adoc#_file_objects[FILE Object Declaration] to declare a `FILE` object. [source,gsql] ----