Add scripts for remote_inference #4

Open
jortiz16 wants to merge 2 commits into GoogleCloudPlatform:master from jortiz16:remote_inference_scripts
jortiz16 (Collaborator) commented Apr 2, 2024

The README.md includes instructions to run the scripts. There are four scripts in total. Two are for object tables and the other two are for structured tables.

Comment thread sql_scripts/remote_inference/README.md Outdated
There are two scripts, depending on whether the input data is in an object table or a native BigQuery table.

## Object table script
The object table script creates a target table to store successful annotations. To do this, it calls the inference in a loop. In the first iteration, a small LIMIT is set on the inference call to quickly create a table with the desired schema. The number of rows to process for each inference call can be modified through the batch_size parameter.

It's not just for annotations, but for the result of the ML operation.

nit: Wrap batch_size in backticks ("`") to make it clear it's a code parameter.
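
The batching loop described in the README excerpt above can be illustrated with a minimal Python sketch. This is not the PR's BigQuery script: `run_in_batches`, `fake_infer`, `bootstrap_limit`, `max_iterations`, the `uri` and `status` field names, and the bucket path are all hypothetical, and the sketch assumes that an empty status marks a successful row and that rows missing from the target are retried on later iterations.

```python
from typing import Callable, Dict, List

Row = Dict[str, str]


def run_in_batches(
    source: List[Row],
    infer: Callable[[List[Row]], List[Row]],
    batch_size: int,
    bootstrap_limit: int = 1,   # small LIMIT used only on the first iteration
    max_iterations: int = 100,  # safety cap so permanently failing rows cannot loop forever
) -> List[Row]:
    """Accumulate successful inference results into a target list, batch by batch."""
    target: List[Row] = []
    done = set()  # keys of rows already stored in the target
    for iteration in range(max_iterations):
        # The first pass processes only a few rows, mirroring the quick
        # schema-creating call; later passes process batch_size rows.
        limit = bootstrap_limit if iteration == 0 else batch_size
        pending = [r for r in source if r["uri"] not in done][:limit]
        if not pending:
            break
        results = infer(pending)
        # Assumption: an empty status string means the row succeeded.
        ok = [r for r in results if r["status"] == ""]
        target.extend(ok)
        done.update(r["uri"] for r in ok)
    return target


def fake_infer(rows: List[Row]) -> List[Row]:
    """Stand-in for the remote inference call; every row succeeds."""
    return [{"uri": r["uri"], "result": r["uri"].upper(), "status": ""} for r in rows]


source = [{"uri": f"gs://example-bucket/img{i}.png"} for i in range(5)]
target = run_in_batches(source, fake_infer, batch_size=2)
```

With `batch_size=2`, the five source rows are processed as batches of 1, 2, and 2, after which no rows remain pending and the loop stops.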

This script applies to the following models:
- ML.ANNOTATE_IMAGE
- ML.PROCESS_DOCUMENT
- ML.TRANSCRIBE

This also applies to ML.GENERATE_TEXT with the vision model.

Comment thread sql_scripts/remote_inference/README.md Outdated
ml_query
DEFAULT
FORMAT(
"SELECT %s, text AS content FROM `%s`", ARRAY_TO_STRING(key_columns, ','), source_table);

For the default, probably simpler to just have it as:

DECLARE ml_query DEFAULT "SELECT *, /* ML operation dependent field */ FROM `" || source_table || "`";
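
To compare the two defaults, the following Python sketch reproduces the strings each one would build. It only mirrors the string construction and is not BigQuery code; the function names and the sample `key_columns` and `source_table` values are hypothetical.

```python
from typing import List


def format_default(key_columns: List[str], source_table: str) -> str:
    """String built by the FORMAT(...) default in the diff under review."""
    return "SELECT %s, text AS content FROM `%s`" % (",".join(key_columns), source_table)


def concat_default(source_table: str) -> str:
    """String built by the concatenation suggested in the review comment."""
    return "SELECT *, /* ML operation dependent field */ FROM `" + source_table + "`"


# Hypothetical sample inputs.
key_columns = ["id", "lang"]
source_table = "my_project.my_dataset.reviews"

original_query = format_default(key_columns, source_table)
suggested_query = concat_default(source_table)
print(original_query)
print(suggested_query)
```

The suggested form no longer depends on `key_columns` and leaves a commented placeholder where the field specific to the ML operation would go.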
