Release Notes - Beam - Version 2.6.0 - HTML format

Sub-task

  • [BEAM-2915] - Java SDK support for portable user state
  • [BEAM-3673] - FlinkRunner: Harness manager for connecting operators to SDK Harnesses
  • [BEAM-3706] - Update CombinePayload to improved model for Portability
  • [BEAM-3708] - Implement the portable lifted Combiner transforms in Java SDK
  • [BEAM-3743] - Support for SDF splitting protocol in ULR
  • [BEAM-3833] - Java SDK harness should detect SDF ProcessFn and proactively checkpoint it
  • [BEAM-3883] - Python SDK stages artifacts when talking to job server
  • [BEAM-3949] - IOIT's setup() and teardown() db connection attempts sometimes fail, resulting in test flakiness
  • [BEAM-4283] - Export nexmark execution times to BigQuery
  • [BEAM-4290] - ArtifactStagingService that stages to a distributed filesystem
  • [BEAM-4291] - ArtifactRetrievalService that retrieves artifacts from a distributed filesystem
  • [BEAM-4399] - Change CassandraIOIT to write-then-read Performance Tests
  • [BEAM-4451] - SchemaRegistry should support a ServiceLoader interface
  • [BEAM-4453] - Provide automatic schema registration for POJOs
  • [BEAM-4477] - Support EXISTS operator
  • [BEAM-4537] - CASE expression output type mismatch
  • [BEAM-4547] - Implement sum0 aggregation function
  • [BEAM-4568] - Add Apache headers to website sources
  • [BEAM-4602] - Implement Date Comparison in BeamSqlCompareExpression
  • [BEAM-4613] - Improve performance of SchemaCoder
  • [BEAM-4654] - Update pipeline translation for timers inside Java SDK
  • [BEAM-4659] - Add well known timer coder for Java SDK
  • [BEAM-4661] - Define well known timer URN
  • [BEAM-4716] - Remove findbugs declarations in build.gradle files since it is now globally a compileOnly and testCompileOnly dependency
  • [BEAM-4728] - Remove Logging Context from places in Python SDK
  • [BEAM-4730] - Replace try/except imports related to Py2/3 compatibility with from past.builtins imports

Bug

  • [BEAM-2732] - State tracking in Python is inefficient and has duplicated code
  • [BEAM-3042] - Add tracking of bytes read / time spent when reading side inputs
  • [BEAM-3314] - RedisIO: RedisConnectionConfiguration withEndpoint does not set host correctly.
  • [BEAM-3876] - Unbounded sources cannot have any checkpoint when they end (no data case)
  • [BEAM-4016] - @SplitRestriction should execute after @Setup on SplittableDoFn
  • [BEAM-4086] - KafkaIOTest is flaky
  • [BEAM-4281] - GrpcDataServiceTest.testMessageReceivedBySingleClientWhenThereAreMultipleClients is flaky
  • [BEAM-4414] - Create more specific namespace for each IOIT in FileBasedIOIT
  • [BEAM-4447] - Python SDK assert_that keyword argument order change
  • [BEAM-4473] - Flaky org.apache.beam.runners.direct.portable.ReferenceRunnerTest.pipelineExecution
  • [BEAM-4474] - Ensure unbounded Go pipelines are not run in batch on Dataflow
  • [BEAM-4481] - Remove duplicate dependency declarations from runners/direct-java
  • [BEAM-4484] - Shading model-pipeline / model-fn-execution / model-job-management produces corrupted classes
  • [BEAM-4526] - FlinkRunner failing errorprone due to mutable enum
  • [BEAM-4533] - Beam SQL should support unquoted types
  • [BEAM-4540] - Hamcrest and JUnit are being leaked into the main path (should be test only)
  • [BEAM-4555] - Pre-commit file include triggering breaks phrase triggering
  • [BEAM-4570] - PubsubJsonClient listTopics and listSubscriptions give wrong results (ignore pagination)
  • [BEAM-4592] - Make Dataflow understand kind:varint as a well-known coder kind, since it already treats beam:coder:varint:v1 as a well-known coder kind
  • [BEAM-4614] - Allow gradle build to take extra list of repositories through an init file
  • [BEAM-4616] - Customize welcome and help messages for Beam SQL shell
  • [BEAM-4622] - Many Beam SQL expressions never have their validation called
  • [BEAM-4632] - KafkaIO seems to fail on streaming mode over spark runner
  • [BEAM-4635] - Dataflow runner deletes the binary specified in flag --worker_binary
  • [BEAM-4644] - ExecutableStageDoFnOperator.java uses JUL instead of SLF4J
  • [BEAM-4649] - Failures in beam_PostCommit_Py_ValCont due to exception in read_log_control_messages
  • [BEAM-4666] - Go SDK fails to stage artifacts on Flink
  • [BEAM-4700] - JDBC driver cannot support TIMESTAMP data type
  • [BEAM-4706] - BigQueryTornadoesIT cannot be run using integrationTest and performanceTest tasks
  • [BEAM-4718] - ./gradlew build should run nightly before ./gradlew publish
  • [BEAM-4724] - Maven Go build broken
  • [BEAM-4733] - Python portable runner to pass pipeline options to job service
  • [BEAM-4744] - Jars are overwritten during release with -Ppublishing
  • [BEAM-4745] - SDF tests broken by innocent change due to Dataflow worker dependencies
  • [BEAM-4759] - SpannerIO MutationGroupEncoder wrong string encoding length
  • [BEAM-4773] - Flink job fails if image pull fails.
  • [BEAM-4787] - Ignore generated vendored files for python container
  • [BEAM-4799] - Beam SQL JDBC broken
  • [BEAM-4800] - PutArtifactResponse not sent
  • [BEAM-4801] - Introduce Beam dependency ownership into the codebase
  • [BEAM-4802] - Update "Dependency" section of the Contribution Guide
  • [BEAM-4817] - Release build task :beam-sdks-java-extensions-join-library:compileJava failed
  • [BEAM-4836] - IOIT tests fail on Jenkins because of numpy version
  • [BEAM-4839] - EOF Exception writing non-English characters to Spanner
  • [BEAM-5630] - Supplement BigQuery Read IT test cases and ignorelist them in post-commit
  • [BEAM-5700] - remove extra copyright from bigquery_io_read_pipeline

New Feature

  • [BEAM-2588] - Portable Flink Runner Job API
  • [BEAM-3326] - Execute a Stage via the portability framework in the ReferenceRunner
  • [BEAM-3648] - Support Splittable DoFn in Flink Batch Runner
  • [BEAM-4020] - Add HBaseIO.readAll() based on SDF
  • [BEAM-4145] - Java SDK Harness populates control request headers with worker id
  • [BEAM-4147] - Abstractions for artifact delivery via arbitrary storage backends
  • [BEAM-4194] - [SQL] Support LIMIT on Unbounded Data
  • [BEAM-4205] - Java: WordCount runs against manually started Flink at master
  • [BEAM-4206] - Python: WordCount runs against manually started Flink at master
  • [BEAM-4216] - Flink: Staged artifacts are delivered to the SDK container
  • [BEAM-4258] - Integrate Docker Environment Management in the ReferenceRunner
  • [BEAM-4302] - Fix to dependency hell
  • [BEAM-4333] - Add integration tests for mobile game examples
  • [BEAM-4385] - Support LIKE operator
  • [BEAM-4394] - Consider enabling spotless java format throughout codebase
  • [BEAM-4575] - Beam SQL should cleanly transform graph from Calcite
  • [BEAM-4626] - Support text table format with a single column of the lines of the files
  • [BEAM-4651] - Bundled Beam SQL shell build
  • [BEAM-4652] - PubsubIO: create subscription on different project than the topic
  • [BEAM-4689] - Dataflow cannot deserialize SplittableParDo DoFns
  • [BEAM-4701] - Run SQL DSL-level tests through JDBC driver
  • [BEAM-4714] - Some DATETIME PLUS operators end up as ordinary PLUS and crash in accept()
  • [BEAM-4792] - Add support for bounded SDF to all runners

Improvement

  • [BEAM-2899] - Universal Local Runner
  • [BEAM-3418] - Python Fnapi - Support Multiple SDK workers on a single VM
  • [BEAM-3634] - [SQL] Refactor BeamRelNodes into PTransforms
  • [BEAM-3905] - Update Flink Runner to Flink 1.5.0
  • [BEAM-3907] - Clarify how watermark is estimated for watchForNewFiles() transforms
  • [BEAM-4137] - Split IOTestPipelineOptions to multiple, test-specific files
  • [BEAM-4267] - Implement a reusable library that can run an ExecutableStage with a given Environment
  • [BEAM-4311] - Enforce ErrorProne analysis in Flink runner project
  • [BEAM-4313] - Enforce ErrorProne analysis in Dataflow runner project
  • [BEAM-4318] - Enforce ErrorProne analysis in Spark runner project
  • [BEAM-4326] - Enforce ErrorProne analysis in the fn-execution project
  • [BEAM-4327] - Enforce ErrorProne analysis in the java harness project
  • [BEAM-4480] - Fix deprecated method invocation for AvroCoder
  • [BEAM-4512] - Move DataflowRunner off of Maven build files
  • [BEAM-4551] - Update Spark runner to Spark version 2.3.1
  • [BEAM-4556] - Enforce ErrorProne analysis in sdks-java-core project
  • [BEAM-4562] - [SQL] Fix INSERT VALUES in JdbcDriver
  • [BEAM-4590] - Beam SQL JDBC driver should set User-agent PipelineOption
  • [BEAM-4642] - Allow setting PipelineOptions for JDBC connections
  • [BEAM-4646] - Improve SQL's BigQuery write integration test
  • [BEAM-4675] - Reduce the size of pretty string of BQ load jobs

Test

  • [BEAM-3214] - Add an integration test for HBaseIO Read/Write transforms

Wish

  • [BEAM-3831] - Update Google Cloud Core version (which depends on org.json)
  • [BEAM-4430] - Improve Performance Testing Documentation

Task

  • [BEAM-4475] - Go precommit should include "go test ./..."
