-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add OpenTelemetry tracing to the Publisher and Subscriber #2086
base: main
Are you sure you want to change the base?
Conversation
No region tags are edited in this PR.This comment is generated by snippet-bot.
|
* feat: Add OpenTelemetry tracing to the SubscriberClient * feat: Add link to publisher create span in the subscribe process span * feat: Add Ack/Nack/ModAck RPC spans to the subscribe
* feat: Add OpenTelemetry tracing to the SubscriberClient * feat: Add link to publisher create span in the subscribe process span * feat: Add Ack/Nack/ModAck RPC spans to the subscribe * fix: Fix test errors caused by otel changes
<dependency> | ||
<groupId>io.opentelemetry</groupId> | ||
<artifactId>opentelemetry-semconv</artifactId> | ||
<version>1.26.0-alpha</version> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already specified in share-dependencies
<scope>test</scope> | ||
</dependency> | ||
<dependency> | ||
<groupId>org.assertj</groupId> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Curious that are there any cases Truth
and junit
can not do?
int ackDeadline, | ||
boolean isReceiptModack, | ||
boolean enableOpenTelemetryTracing) { | ||
if (enableOpenTelemetryTracing && tracer != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like we are passing enableOpenTelemetryTracing
and tracer
all the time, it seems we can at least extract them as fields and make OpenTelemetryUtil
as a context aware wrapper of Tracer
. In addition, we can have create an interface PubsubTracer
that have all the injection points, and we can create an OpenTelemetryPubsubTracer
only if enableOpenTelemetryTracing
is true. Similar concepts can be applied to PubsubMessageWrapper
as well.
We have this architecture in gax. There is a ApiTracer that defines all the injection points, and there is an OpenCensusTracer that is created conditionally, so we don't have to always check if instrumentation is enabled or not. This also has a clearer separation of concern that in case we need to move away from OpenTelemetry, we don't have to write all the code, only the implementation of the interface.
TopicName topicName = TopicName.parse(topic); | ||
Attributes attributes = | ||
createCommonSpanAttributesBuilder( | ||
topicName.getTopic(), topicName.getProject(), "Publisher.publishCall", "publish") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code.function
can just be the method name: https://opentelemetry.io/docs/specs/semconv/attributes-registry/code/
We should probably include the class name in code.namespace
but it wasn't included in the design and I think it's not strictly necessary.
* Creates, starts, and returns a publish RPC span for the given message batch. Bi-directional | ||
* links with the publisher parent span are created for sampled messages in the batch. | ||
*/ | ||
public static final Span startPublishRpcSpan( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method would currently be exposed to users right? Is it possible to expose this method to other parts of the library, but not to users?
SpanData flowControlSpanData = allSpans.get(0); | ||
SpanData batchingSpanData = allSpans.get(1); | ||
SpanData publishRpcSpanData = allSpans.get(2); | ||
SpanData publisherSpanData = allSpans.get(3); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How are these orders determined? It doesn't seem like it's chronological. Is it possible these change based on some race condition?
Attributes.builder() | ||
.put(SemanticAttributes.MESSAGING_OPERATION, rpcOperation) | ||
.build(); | ||
rpcSpan.addLink(message.getSubscriberSpan().getSpanContext(), linkAttributes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment as above: call addLink
as part of span creation when possible.
if (message.getPublisherSpan().getSpanContext().isSampled()) { | ||
Attributes linkAttributes = | ||
Attributes.builder().put(SemanticAttributes.MESSAGING_OPERATION, "publish").build(); | ||
publishRpcSpan.addLink(message.getPublisherSpan().getSpanContext(), linkAttributes); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you do addLink earlier rather than here? It might make the code a bit more complicated, but it's preferred to do so as part of the span builder: https://javadoc.io/doc/io.opentelemetry/opentelemetry-api/latest/io/opentelemetry/api/trace/Span.html#addLink(io.opentelemetry.api.trace.SpanContext)
This is due to sampling decisions not respecting addLink called after span creation.
.setOrderingKey(ORDERING_KEY) | ||
.build(); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have any way of testing the performance degradation of this change when EnableOpenTelemetry=false? We want to make sure that users not interested in tracing are not impacted. I suspect the changes are fairly small since we aren't creating any spans, but it would be nice to see the impact of the extra data structures / wrappers.
Thank you for opening a Pull Request! Before submitting your PR, there are a few things you can do to make sure it goes smoothly:
If you write sample code, please follow the samples format.