Release Notes - Beam - Version 2.31.0

Sub-task

  • [BEAM-7320] - TextIOWriteTest.testWriteViaSink flaky
  • [BEAM-8644] - Beam Dependency Update Request: org.freemarker:freemarker
  • [BEAM-12009] - Implement Calc splitting.

Bug

  • [BEAM-7717] - PubsubIO watermark tracking hovers near start of epoch
  • [BEAM-7819] - PubsubMessage message parsing is lacking non-attribute fields
  • [BEAM-11738] - UDF jar tests should fail if system properties are unset.
  • [BEAM-11754] - KafkaIO.Write EOS documentation outdated
  • [BEAM-11851] - ConfluentSchemaRegistryProvider fails when authentication is required
  • [BEAM-12120] - Python IO MongoDB: integer and string `_id` keys are not supported
  • [BEAM-12121] - Python IO MongoDB: integer and string `_id` keys are not supported.
  • [BEAM-12122] - Python IO MongoDB: integer and string `_id` keys are not supported.
  • [BEAM-12138] - DataFrame API: groupby(level=) only works for level=0
  • [BEAM-12165] - ParquetIO sink should allow to pass an Avro data model
  • [BEAM-12223] - Javadoc for JdbcIO is cut off
  • [BEAM-12246] - ib.collect doesn't preserve the index from DeferredDataFrame instances
  • [BEAM-12253] - Read.UnboundedSourceAsSDFRestrictionTracker doesn't use cache for readers in getProgress
  • [BEAM-12257] - Can't infer accumulator coder for LazyAggregateCombineFn.
  • [BEAM-12258] - SQL postcommit timing out
  • [BEAM-12276] - Timer.withOutputTimestamp(Instant).offset(Duration).setRelative() might fail unexpectedly
  • [BEAM-12312] - UnsupportedOperationException in LazyAggregateCombineFn.mergeAccumulators
  • [BEAM-12316] - LGPL in bundled dependencies
  • [BEAM-12378] - GroupIntoBatches should support byte-size batches
  • [BEAM-12394] - SQL postcommit failing (SqlCreateFunctionTest)
  • [BEAM-12417] - Go Flink and Spark postcommits permared
  • [BEAM-12427] - gitignore does not ignore generated AutoValue classes in sdks>java>io module
  • [BEAM-12475] - When bundle processors are re-used, do not respond to splits for previous bundles.
  • [BEAM-12502] - ib.collect fails to materialize named DeferredDataFrame instances
  • [BEAM-12507] - Remove website from release
  • [BEAM-12508] - Compiled gradle-wrapper.jar part of release
  • [BEAM-12521] - Java SDK BigQuery IO: RuntimeException: ManagedChannel allocation site

New Feature

  • [BEAM-621] - Add MapValues and MapKeys functions
  • [BEAM-10925] - ZetaSQL: Support Java UDF
  • [BEAM-12106] - Support ARRAY type in BeamJavaUdfCalcRule.
  • [BEAM-12339] - Support CREATE FUNCTION statement in Calcite dialect.
  • [BEAM-12395] - Support google cloud profiler in python sdk

Improvement

  • [BEAM-11271] - Update BigQuery source to perform datasets.get instead of tables.get to determine location
  • [BEAM-11777] - Support correct kwargs in aggregation methods on DataFrame, Series
  • [BEAM-11839] - Attach a "reason" message to Singleton partitioning
  • [BEAM-12018] - Implement melt for DataFrame
  • [BEAM-12028] - DataFrame API error messages should be more helpful
  • [BEAM-12029] - WontImplementErrors should reference offending operation or argument and link to documentation
  • [BEAM-12089] - Improve documentation of artifacts-dir.
  • [BEAM-12093] - Overhaul ElasticsearchIO#Write
  • [BEAM-12119] - Python IO MongoDB: integer and string `_id` keys are not supported
  • [BEAM-12241] - Update bytebuddy to version 1.11.0
  • [BEAM-12252] - Upgrade Kotlin version in Kotlin example to 1.4.x
  • [BEAM-12277] - Add flink 1.13 build target
  • [BEAM-12280] - Upgrade Flink runner to Flink version 1.12.3
  • [BEAM-12281] - Drop support for Flink 1.10
  • [BEAM-12325] - Eliminate beam_fn_api flag from Dataflow usage
  • [BEAM-12329] - Avoid AWS S3 Filesystem warnings about not all bytes have been read
  • [BEAM-12341] - Use portable job submission for Go jobs to dataflow.
  • [BEAM-12342] - Upgrade Spark 2 to version 2.4.8
  • [BEAM-12343] - Test that changes on windowing after GBK behave correctly
  • [BEAM-12384] - Read.Bounded typeDescriptor is not set properly
  • [BEAM-12411] - Update Tensorflow to version 2.5.0
  • [BEAM-12415] - Update Spark 3 version to 3.1.2
  • [BEAM-12423] - Upgrade pyarrow to support version 4.0.0 too
  • [BEAM-12424] - Update Flink 1.12 to version 1.12.4
  • [BEAM-12722] - Add ElasticsearchIO External Versioning

Test

  • [BEAM-12421] - Migrate elasticsearchio tests to test containers

Task

  • [BEAM-11978] - Observe LIFTABLE_WITH_SUM aggregations in DataFrame.aggregate
  • [BEAM-11990] - Support DATE type in BeamJavaUdfCalcRule.
  • [BEAM-12302] - Support TIMESTAMP type in Java UDF.
  • [BEAM-12332] - Support NUMERIC type in Java UDF.
