Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add default timeout for Client.get_job() #1935

Merged
merged 24 commits into from
May 31, 2024
Merged

Conversation

Linchin
Copy link
Contributor

@Linchin Linchin commented May 29, 2024

When calling QueryJob.result(), QueryJob.reload() is called in the background with an empty timeout value. This leads QueryJob to waiting indefinitely if QueryJob.reload() loses connection.

This PR:

  • Reroutes QueryJob.reload() to call Client.get_job() instead of calling API directly
  • Adds a 128s default timeout for Client.get_job()

Fixes #1922 🦕

@Linchin Linchin requested review from a team as code owners May 29, 2024 01:49
@Linchin Linchin requested a review from leahecole May 29, 2024 01:49
@product-auto-label product-auto-label bot added size: l Pull request size is large. api: bigquery Issues related to the googleapis/python-bigquery API. labels May 29, 2024
@Linchin Linchin requested review from tswast and removed request for leahecole May 29, 2024 01:50
Copy link
Contributor

@tswast tswast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some tests where *Job.reload(timeout=None), *Job.done(timeout=None), and *Job.result(timeout=None) disable the timeouts for jobs.get.

google/cloud/bigquery/client.py Outdated Show resolved Hide resolved
google/cloud/bigquery/client.py Outdated Show resolved Hide resolved
google/cloud/bigquery/job/base.py Outdated Show resolved Hide resolved
extra_params["location"] = self.location
span_attributes = {"path": self.path}
kwargs: Dict[str, Any] = {}
if type(timeout) is object:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's add some comments explaining why we need to omit the timeout if we get _DEFAULT_VALUE. (Because we want to use the default API-level timeouts but wait indefinitely for the job to finish)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added explanation why kwargs is used here, but I feel like Queryjob.result() is a better place to explain the default timeout behavior, so I added it in docstring there instead.

@@ -913,7 +922,7 @@ def _set_future_result(self):
def done(
self,
retry: "retries.Retry" = DEFAULT_RETRY,
timeout: Optional[float] = None,
timeout: Optional[Union[float, object]] = PollingFuture._DEFAULT_VALUE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done isn't waiting for the job to complete, so I wonder if we actually need PollingFuture._DEFAULT_VALUE here?

It seems more analogous to Client.get_job so maybe we can use the same DEFAULT_GET_JOB_TIMEOUT, here? 🤔 We need to double-check that we aren't calling any other API requests like getQueryResults, if so.

Copy link
Contributor Author

@Linchin Linchin May 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done() calls reload(), which in turn calls Client.get_job(). So I added the sentinel values along the path result() -> done() -> reload(). Indeed I realize we don't have to add sentinel for done() or reload(), because it will be passed from result() anyway. As to getQueryResults, I don't think it's called from done() onward (it's used in result()). I think either way (using PollingFuture._DEFAULT_VALUE or DEFAULT_GET_JOB_TIMEOUT) it's effectively the same, so I think either way it's fine. Are there any other factors I'm not aware of?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to keep our type annotations as narrow as we can. My worry with PollingFuture._DEFAULT_VALUE is it opens it up to object, which technically is anything (string, bytes, anything) in Python.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, I ended up only using sentinel for result(), and used DEFAULT_GET_JOB_TIMEOUT for subsequent calls.

Comment on lines 1518 to 1522
if type(timeout) is object:
timeout = None
get_job_timeout = PollingFuture._DEFAULT_VALUE
else:
get_job_timeout = timeout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not do the kwargs trick here too?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it's necessary - the purpose here is slightly different from when calling client.get_job(). I'm only preserving the sentinel value for the job.done() call, and use None as default for any other calls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done, following later comments

tests/unit/test__job_helpers.py Outdated Show resolved Hide resolved
@@ -801,7 +801,7 @@ def reload(
self,
client=None,
retry: "retries.Retry" = DEFAULT_RETRY,
timeout: Optional[float] = None,
timeout: Optional[Union[float, object]] = POLLING_DEFAULT_VALUE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd much rather this defaulted to the same timeout as get_job() and anywhere that calls reload(), we make sure we do the check for POLLING_DEFAULT_VALUE before calling reload() to make sure we only pass in float or None or omit the value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -913,7 +924,7 @@ def _set_future_result(self):
def done(
self,
retry: "retries.Retry" = DEFAULT_RETRY,
timeout: Optional[float] = None,
timeout: Optional[Union[float, object]] = POLLING_DEFAULT_VALUE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here. I don't think this needs to handle POLLING_DEFAULT_VALUE since it's not waiting for the job to finish, just making at most 1 API call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

google/cloud/bigquery/job/query.py Outdated Show resolved Hide resolved
google/cloud/bigquery/job/query.py Outdated Show resolved Hide resolved
# timeout to QueryJob.done(), and use None for the other timeouts.
if type(timeout) is object:
timeout = None
get_job_timeout = POLLING_DEFAULT_VALUE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not set to DEFAULT_GET_JOB_TIMEOUT or do the kwargs trick to omit it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Contributor

@tswast tswast left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! Just a couple of typos to fix please before merging

google/cloud/bigquery/job/base.py Outdated Show resolved Hide resolved
google/cloud/bigquery/job/base.py Outdated Show resolved Hide resolved
@Linchin Linchin merged commit 9fbad76 into googleapis:main May 31, 2024
18 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: bigquery Issues related to the googleapis/python-bigquery API. size: l Pull request size is large.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

QueryJob.result() not identifying and returning after submitted query completes
2 participants