Release Notes - Beam - Version 2.7.0

Sub-task

  • [BEAM-2930] - Flink support for portable side input
  • [BEAM-3141] - Make coders & streams work in Python 3
  • [BEAM-3370] - Add ability to stage directories with compiled classes to Flink
  • [BEAM-3513] - Use portable CombinePayload in Java DataflowRunner
  • [BEAM-3654] - Port FilterExamplesTest off DoFnTester
  • [BEAM-3906] - Get Python Wheel Validation Automated
  • [BEAM-4094] - Remove ScopedMetricsContainer from Python SDK
  • [BEAM-4276] - Implement the portable lifted Combiner transforms in Go SDK
  • [BEAM-4452] - Create a lazy row on top of a generic Getter interface
  • [BEAM-4653] - Java SDK harness should support user timers
  • [BEAM-4658] - Update pipeline representation in runner support libraries to handle timers
  • [BEAM-4727] - Reduce metrics overhead
  • [BEAM-4794] - Move Nexmark and SQL to use the new Schema framework
  • [BEAM-4888] - Beam Dependency Update Request: org.apache.calcite:calcite-core 1.17.0
  • [BEAM-4889] - Beam Dependency Update Request: org.apache.calcite:calcite-linq4j 1.17.0
  • [BEAM-4907] - Beam Dependency Update Request: org.apache.derby:derby 10.14.2.0
  • [BEAM-4908] - Beam Dependency Update Request: org.apache.derby:derbyclient 10.14.2.0
  • [BEAM-4909] - Beam Dependency Update Request: org.apache.derby:derbynet 10.14.2.0
  • [BEAM-4922] - Beam Dependency Update Request: org.freemarker:freemarker 2.3.28
  • [BEAM-4964] - Beam Dependency Update Request: org.apache.httpcomponents:httpasyncclient 4.1.4
  • [BEAM-4965] - Beam Dependency Update Request: org.apache.httpcomponents:httpclient 4.5.6
  • [BEAM-4966] - Beam Dependency Update Request: org.apache.httpcomponents:httpcore 4.4.10
  • [BEAM-4967] - Beam Dependency Update Request: org.apache.httpcomponents:httpcore-nio 4.4.10
  • [BEAM-5012] - Beam Dependency Update Request: org.springframework:spring-expression 5.0.7.RELEASE
  • [BEAM-5019] - Beam Dependency Update Request: org.tukaani:xz 1.8
  • [BEAM-5027] - Schemas do not work on Dataflow runner or FnApi Runner
  • [BEAM-5030] - Consolidate defer overhead per bundle
  • [BEAM-5068] - Get Java mobile-gaming auto validations run on any local environments.
  • [BEAM-5084] - Beam Dependency Update Request: com.alibaba:fastjson 1.2.49

Bug

  • [BEAM-2277] - IllegalArgumentException when using Hadoop file system for WordCount example.
  • [BEAM-3095] - .withCompression() hinted at in docs, but not usable
  • [BEAM-3359] - Unable to change "flinkMaster" from "[auto]" in TestFlinkRunner
  • [BEAM-3744] - Support full PubsubMessages
  • [BEAM-3917] - Pipeline roots are not computed correctly for CoGBK
  • [BEAM-4262] - Google suggests upgrading bigtable-client to version 1.3.0
  • [BEAM-4285] - Flink batch state request handler
  • [BEAM-4359] - String encoding for a spanner mutation assumes that string length equals bytes length
  • [BEAM-4545] - Release guide should contain how to build python wheels
  • [BEAM-4699] - BeamFileSystemArtifactServicesTest.putArtifactsSingleSmallFileTest flake
  • [BEAM-4723] - Enhance Datetime*Expression Datetime Type
  • [BEAM-4748] - Flaky post-commit test org.apache.beam.runners.fnexecution.artifact.BeamFileSystemArtifactServicesTest.putArtifactsMultipleFilesConcurrentlyTest
  • [BEAM-4769] - Types$WildcardTypeImpl cannot be cast to java.lang.reflect.TypeVariable
  • [BEAM-4772] - TextIO.read transform does not respect .withEmptyMatchTreatment
  • [BEAM-4795] - Support Spanner library 0.53
  • [BEAM-4798] - IndexOutOfBoundsException when Flink parallelism > 1
  • [BEAM-4804] - Need dedicated jenkins workers for perf tests
  • [BEAM-4809] - Java preCommit and postCommit should build javadoc to check it builds ok
  • [BEAM-4810] - Flaky test BeamFileSystemArtifactServicesTest
  • [BEAM-4842] - Update Flink Runner to Flink 1.5.2
  • [BEAM-4843] - Incorrect docs on FileSystems.delete
  • [BEAM-4846] - updateOfflineRepositoryRoot broken
  • [BEAM-4860] - com/google/thirdparty/publicsuffix is not shaded along with guava
  • [BEAM-4887] - Beam Dependency Update Request: org.apache.calcite
  • [BEAM-5026] - Portable flink wordcount fails sometimes due to non-existent source path in FileBasedSink._check_state_for_finalize_write
  • [BEAM-5028] - getPerDestinationOutputFilenames() is getting processed before write is finished
  • [BEAM-5037] - HashFunction is not initialized in SyntheticOptions
  • [BEAM-5041] - Java Fn SDK Harness skips unprocessed pCollections
  • [BEAM-5060] - Issues with aws KPL while writing to kinesis using beam
  • [BEAM-5063] - Watermark does not progress for low traffic streams
  • [BEAM-5066] - beam_PostCommit_Go_GradleBuild: "failed to get digest sha256 ... no such file or directory"
  • [BEAM-5071] - Using the restful API in Beam dependency check system, get rid of bigquery
  • [BEAM-5098] - Combine.Globally::asSingletonView clears side inputs
  • [BEAM-5109] - Build nightly snapshot for Python SDK
  • [BEAM-5143] - Stop showing dependencies which are not able to be upgraded in the weekly report
  • [BEAM-5145] - Make PTransform names stable in Join/CoGroupByKey
  • [BEAM-5155] - Custom sdk_location parameter not working with fn_api
  • [BEAM-5180] - Broken FileResultCoder via parseSchema change
  • [BEAM-5184] - Multimap side inputs with duplicate keys and values are being lost
  • [BEAM-5186] - Support for NUMERIC data type in BQ
  • [BEAM-5193] - KuduIO testWrite not correctly verifying behaviour
  • [BEAM-5211] - Flink Streaming ExecutableStage operator chain blocks grpc receiver threads
  • [BEAM-5246] - Beam metrics exported as flink metrics are not correct
  • [BEAM-5255] - Fix over-aggressive division futurization.
  • [BEAM-5293] - Unable to run IOIT on s3 filesystem
  • [BEAM-5351] - 2.7.0 RC1 jars missing META-INF/maven/groupId/artifactId/pom.xml
  • [BEAM-5375] - KafkaIO reader should handle runtime exceptions from kafka client
  • [BEAM-5385] - Flink jobserver does not honor --flink-master-url
  • [BEAM-5831] - Can not use Values.Create() multiple times in pipeline

New Feature

  • [BEAM-2661] - Add KuduIO
  • [BEAM-4067] - Java: FlinkPortableTestRunner: runs portably via self-started local Flink
  • [BEAM-4130] - Portable Flink runner JobService entry point in a Docker container
  • [BEAM-4687] - Automatically file JIRA for dependency updates
  • [BEAM-4774] - Integrate Nexmark SQL with Perfkit
  • [BEAM-4807] - Upgrade calcite to 1.17.0
  • [BEAM-4823] - Add Amazon Simple Notification Service (SNS) Sink
  • [BEAM-4828] - Add Amazon SqsIO
  • [BEAM-5187] - Create a ProcessJobBundleFactory for non-dockerized SDK harness
  • [BEAM-5239] - Allow configuring latencyTrackingInterval

Improvement

  • [BEAM-2848] - Validate nexmark with the Google Dataflow runner
  • [BEAM-2886] - Portable pipeline submission proxy
  • [BEAM-3026] - Improve retrying in ElasticSearch client
  • [BEAM-3098] - Upgrade Java grpc version
  • [BEAM-3321] - Update gax-grpc dependency to latest
  • [BEAM-3412] - Update BigTable client version to 1.0
  • [BEAM-4031] - Add missing dataflow customization options for Go SDK
  • [BEAM-4257] - Add error reason and table destination to BigQueryIO streaming failed inserts
  • [BEAM-4417] - BigqueryIO Numeric datatype Support
  • [BEAM-4432] - Performance tests need a way to generate Synthetic data
  • [BEAM-4571] - RedisIO support for write using SET operation
  • [BEAM-4636] - Make beam.Run() (and/or friends) thread-safe.
  • [BEAM-4813] - Make Go Dataflow translation use protos directly
  • [BEAM-4814] - Support for S3FileSystem to work behind a proxy server
  • [BEAM-4835] - Add more flexible options for data loading to BigQueryIO.Write
  • [BEAM-4849] - Support running Beam Samza jobs in Yarn
  • [BEAM-5023] - BeamFnDataGrpcClient should pass the worker_id when connecting to the RunnerHarness
  • [BEAM-5035] - beam_PostCommit_Java_GradleBuild/1105 :beam-examples-java:compileTestJava FAILED
  • [BEAM-5147] - Expose document metadata in ElasticsearchIO read
  • [BEAM-5168] - Flink jobserver logging should be redirected to slf4j
  • [BEAM-5169] - Add options for master URL and log level to Flink jobserver runShadow task
  • [BEAM-5196] - Add MD5 consistency check on S3 uploads (writes)
  • [BEAM-5208] - Clearer Python SDK error message for streaming bigquery reads
  • [BEAM-5256] - Update Bigtable dependency
  • [BEAM-5264] - Reference DirectRunner implementation of Python user state and timers API
  • [BEAM-5370] - Set a cause exception in ElasticsearchIO#getBackendVersion

Test

  • [BEAM-4761] - Add postCommit scripts and perfkit dashboards for nexmark on Dataflow runner

Task

  • [BEAM-4118] - Remove FnAPI check for overriding Combine PTransforms
  • [BEAM-4791] - Integration test for portable Flink runner basic batch/streaming execution
