BigQuery Java Connection API can hang / deadlock if connection is closed without reading all results #2543

Open
jonathanswenson opened this issue Feb 28, 2023 · 1 comment
Labels
api: bigquery (Issues related to the googleapis/java-bigquery API); priority: p2 (Moderately-important priority. Fix may not be included in next release); type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns)

jonathanswenson commented Feb 28, 2023

Environment details

  1. OS type and version: Linux Ubuntu
  2. Java version: 17 (Adoptium)
  3. version(s): 2.22.0

Steps to reproduce

  1. Run a query that returns enough rows to fill the async buffer.
  2. Wait for the query to complete.
  3. Read some of the data via the JDBC API (but not all of it).
  4. Ensure that the buffer is full (a sleep is enough).
  5. Close the connection.

Code example

ConnectionSettings connectionSettings =
    ConnectionSettings.newBuilder()
        .setUseReadAPI(false)
        .setRequestTimeout(10L)
        .setMaxResults(100L)
        .setUseQueryCache(true)
        .build();
Connection connection = bigquery.createConnection(connectionSettings);
String selectQuery = "<query that returns more than buffer size rows>";
ListenableFuture<ExecuteSelectResponse> executeSelectFuture = connection.executeSelectAsync(selectQuery, ...);

ExecuteSelectResponse response = executeSelectFuture.get();
ResultSet resultSet = response.getResultSet();

// read some, but not all of the data
resultSet.next();
// sleep for a bit to ensure that the buffer is full
Thread.sleep(10000);

// hangs forever.
connection.close();

Stack trace (from jstack thread dump)

"thread name" #35 prio=5 os_prio=0 cpu=917.01ms elapsed=4251.27s tid=0x0000558529454f40 nid=0x49 waiting on condition  [0x00007f9eefffd000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x000000074b91f798> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:341)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:506)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3463)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3434)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1623)
	at java.util.concurrent.LinkedBlockingDeque.putLast([email protected]/LinkedBlockingDeque.java:389)
	at java.util.concurrent.LinkedBlockingDeque.put([email protected]/LinkedBlockingDeque.java:642)
	at com.google.cloud.bigquery.ConnectionImpl.flagEndOfStream(ConnectionImpl.java:767)
	at com.google.cloud.bigquery.ConnectionImpl.close(ConnectionImpl.java:144)
	- locked <0x0000000743dab650> (a com.google.cloud.bigquery.ConnectionImpl)
	at ... my code calling close

The thread calling close() is blocked here: https://github.com/googleapis/java-bigquery/blob/main/google-cloud-bigquery/src/main/java/com/google/cloud/bigquery/ConnectionImpl.java#L767-L768

The async worker thread that populates the queue is also blocked, because the queue is full.

"pool-67-thread-3" #239 prio=5 os_prio=0 cpu=186.77ms elapsed=8312.48s tid=0x00007f9ef4010560 nid=0x12c waiting on condition  [0x00007f9ee81fe000]
   java.lang.Thread.State: WAITING (parking)
	at jdk.internal.misc.Unsafe.park([email protected]/Native Method)
	- parking to wait for  <0x000000074b91f798> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
	at java.util.concurrent.locks.LockSupport.park([email protected]/LockSupport.java:341)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionNode.block([email protected]/AbstractQueuedSynchronizer.java:506)
	at java.util.concurrent.ForkJoinPool.unmanagedBlock([email protected]/ForkJoinPool.java:3463)
	at java.util.concurrent.ForkJoinPool.managedBlock([email protected]/ForkJoinPool.java:3434)
	at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await([email protected]/AbstractQueuedSynchronizer.java:1623)
	at java.util.concurrent.LinkedBlockingDeque.putLast([email protected]/LinkedBlockingDeque.java:389)
	at java.util.concurrent.LinkedBlockingDeque.put([email protected]/LinkedBlockingDeque.java:642)
	at com.google.cloud.bigquery.ConnectionImpl.lambda$populateBufferAsync$4(ConnectionImpl.java:733)
	at com.google.cloud.bigquery.ConnectionImpl$$Lambda$1385/0x0000000801740228.run(Unknown Source)
	at java.util.concurrent.ThreadPoolExecutor.runWorker([email protected]/ThreadPoolExecutor.java:1136)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run([email protected]/ThreadPoolExecutor.java:635)
	at java.lang.Thread.run([email protected]/Thread.java:833)
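
Both traces above show threads parked in LinkedBlockingDeque.put on the same condition object (0x000000074b91f798), which is consistent with a full bounded buffer that nobody is draining. The pattern can be reproduced in isolation with a minimal sketch. This is not the ConnectionImpl code: the class name, the END_OF_STREAM marker, and both helper methods are hypothetical, and the clear-then-offer variant is only one possible way to avoid blocking, not the library's fix.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingDeque;

public class BoundedBufferCloseSketch {
    // Hypothetical end-of-stream marker; the real library uses its own type.
    public static final String END_OF_STREAM = "<end-of-stream>";

    /**
     * Mimics a close() that publishes the marker with a blocking put().
     * Returns true if the put was still blocked after waitMillis.
     */
    public static boolean closeWithPutHangs(BlockingQueue<String> buffer, long waitMillis)
            throws InterruptedException {
        Thread closer = new Thread(() -> {
            try {
                // Blocks indefinitely while the buffer is full and no reader drains it.
                buffer.put(END_OF_STREAM);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        closer.setDaemon(true);
        closer.start();
        closer.join(waitMillis);
        boolean stillBlocked = closer.isAlive();
        // Interrupt so the demo itself does not hang; put() is interruptible.
        closer.interrupt();
        closer.join();
        return stillBlocked;
    }

    /**
     * One possible non-blocking alternative (an assumption, not the library's fix):
     * discard unread rows, then publish the marker with offer(), which never blocks.
     */
    public static boolean closeWithClearAndOffer(BlockingQueue<String> buffer) {
        buffer.clear();
        return buffer.offer(END_OF_STREAM);
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> buffer = new LinkedBlockingDeque<>(2);
        buffer.put("row-1");
        buffer.put("row-2"); // the buffer is now full and nothing reads from it

        System.out.println("put-based close still blocked: " + closeWithPutHangs(buffer, 200));
        System.out.println("clear-then-offer close succeeded: " + closeWithClearAndOffer(buffer));
    }
}
```

In a real connection there is also a producer thread blocked on the same queue, so clearing the buffer alone would let that producer refill it. A complete fix would presumably need to stop or interrupt the worker as well.
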
product-auto-label (bot) added the "api: bigquery" label on Feb 28, 2023.
jonathanswenson changed the title from "BigQuery Java Connection API can hang if connection is closed without reading all results" to "BigQuery Java Connection API can hang / deadlock if connection is closed without reading all results" on Feb 28, 2023.
Neenu1995 added the "priority: p2" and "type: bug" labels on Mar 7, 2023.

aishikbh commented Jul 12, 2024

Hey @jonathanswenson, were you able to get around this issue?

I think we are running into something very similar. In our case we read partial results from the resultSet and use them for some processing while we hold on to the resultSet.

After the processing is done, we go back to the resultSet to process more data. This gets stuck after a couple of iterations.

We checked that if we read everything in one go, i.e. in a single iteration, we do not have any issues. The issue arises only when we read partial data from the resultSet over several iterations. One thing to note is that processing the data in each iteration takes a considerable amount of time, so we end up holding the resultSet for a total of ~8 hours with multiple iterations vs. ~1.5 hours with a single iteration.
