-
-
Notifications
You must be signed in to change notification settings - Fork 6.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overhaul solve_dependencies()
for performance
#5554
base: master
Are you sure you want to change the base?
Conversation
📝 Docs preview for commit da025ba at: https://635c04265b1bfe0e67529590--fastapi.netlify.app |
📝 Docs preview for commit 82a7ed5 at: https://635c06dc1f721607753f1568--fastapi.netlify.app |
📝 Docs preview for commit 76338e2 at: https://635c0fbb1f24421dadf274eb--fastapi.netlify.app |
76338e2
to
6435409
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #5554 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 540 541 +1
Lines 13969 14064 +95
=========================================
+ Hits 13969 14064 +95 ☔ View full report in Codecov by Sentry. |
📝 Docs preview for commit 80e12c2 at: https://635c131d2c15d81992024860--fastapi.netlify.app |
80e12c2
to
040cd2a
Compare
solve_dependencies()
solve_dependencies()
for performance
📝 Docs preview for commit 040cd2a at: https://635c146a802d901f8f3e73c7--fastapi.netlify.app |
📝 Docs preview for commit 4506386 at: https://635c1552802d90211c3e7302--fastapi.netlify.app |
4506386
to
53fcbdb
Compare
📝 Docs preview for commit 53fcbdb at: https://635fb8f3c22353274c583c28--fastapi.netlify.app |
📝 Docs preview for commit 395c3d4 at: https://635fe01e1f72164ed13f158a--fastapi.netlify.app |
📝 Docs preview for commit 22585d7 at: https://635ff257043bc468f7e37821--fastapi.netlify.app |
📝 Docs preview for commit 8ad762f at: https://635ff94f43976500ba2067d3--fastapi.netlify.app |
8ad762f
to
0e6d410
Compare
📝 Docs preview for commit 0e6d410 at: https://63603f299481443aa9c033c7--fastapi.netlify.app |
0e6d410
to
1a2d498
Compare
📝 Docs preview for commit 1a2d498 at: https://636044c4e0908c0077cd5f54--fastapi.netlify.app |
1a2d498
to
059a07f
Compare
📝 Docs preview for commit 059a07f at: https://636046dc56950d004e41a6d5--fastapi.netlify.app |
This avoids unnecessary recursion which is subsequently discarded.
The documentation does not list `Response` as as valid dependency for WebSocket handlers.
…rameter to endpoints.
059a07f
to
7c30b54
Compare
📝 Docs preview for commit 7c30b54 at: https://63651ce7c401f802a191f702--fastapi.netlify.app |
@@ -454,77 +455,76 @@ async def solve_generator( | |||
return await stack.enter_async_context(cm) | |||
|
|||
|
|||
@dataclasses.dataclass | |||
class DependencySolverContext: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are you using dataclases.dataclass
here instead of Pydantic?
I just want to know if there's a reason in particular
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dataclass is basically a quick way to declare a class with initializers. I'm not looking for pydantic's validation. This is supposed to be low overhead.
Could have used a plain class instead and written my own __init__()
.
Could also have made the solve_dependencies() an instance method, but I don't see much of that kind of programming style in FastAPI so I left it functional.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, very good PR. I really liked it.
📝 Docs preview for commit 63dfd34 at: https://6388b5ef29572b004e385306--fastapi.netlify.app |
📝 Docs preview for commit 1800be7 at: https://639ce634abc4ab0873a92fe0--fastapi.netlify.app |
📝 Docs preview for commit 1b332f2 at: https://639eeec6e97dd36863989e88--fastapi.netlify.app |
fastapi/dependencies/utils.py
Outdated
context.values = values | ||
context.errors = errors | ||
for dependency_getter in dependant.dependency_getters: | ||
if inspect.iscoroutinefunction(dependency_getter): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have two list of getters in order to avoid the iscoroutine check?
Also if it's done in that way, one could easily use gather instead of waiting each coroutine individually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, we can do that, but see my other comment about automatically gathering. We'd have to to some performance tests to make sure it doesn't slow down simple getters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I'm looking at the code, and these are the "built-in" dependencies (Headers, params, etc)
There is only one type of async
built-in dependency, and that tis the get_body_params()
.
This is used for Body()
type parameters. And the only kind of Body()
parameter where anything async happens is the File()
one. FastAPI already schedules FILE sequence reads internally... and there is only one getter created, for all the Body type params. but I'll split it into a separate method and get rid of the iscoroutinefunction test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are so few comments in the code, it takes a lot of effort to reverse-engineer this :)
|
||
solved_result = await solve_dependencies( | ||
request=request, | ||
sub_values, sub_errors = await solve_dependencies( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we try to run all these subtasks in parallel? Same for the code below
if is_gen_callable(call) or is_async_gen_callable(call):
stack = context.request.scope.get("fastapi_astack")
assert isinstance(stack, AsyncExitStack)
solved = await solve_generator(
call=call, stack=stack, sub_values=sub_values
)
elif is_coroutine_callable(call):
solved = await call(**sub_values)
else:
solved = await run_in_threadpool(call, **sub_values)
if sub_dependant.name is not None:
values[sub_dependant.name] = solved
See comment #5472 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's tricky. The thing is often dependencies don't do any IO at all, but they are marked async
to be not marked sync. All sync dependencies get run from a threadpool, because they potentially block. This is a huge performance waste for a dependency which does something relatively simple without any IO.
Similarly, calling asyncio.gather
on a bunch of async
dependencies which don't do any IO is potentially quite wasteful. It means that each will have a Task
created and that we have to go and have a whole loop through the event loop while these Tasks are scheduled and run to completion. This will cause some unnecessary latency.
So, I'm not super convinced that it should be automatically done. I think we should leave that up to the application to decide if multiple dependencies can be wrapped in a single common dependency and then explicitly call asyncio.gather
At least, we'd need to do some performance testing before deciding to go down that route. And I'd like for it to be a separate PR, because that would be a significant change in logic.
fastapi/dependencies/utils.py
Outdated
for field in dependant.path_params: | ||
|
||
def get_path_param_getter(field: ModelField) -> DependencyGetter: | ||
def get_param(context: DependencySolverContext) -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are all these getters defined inline? As they don't seem to be using any local state, could you move them out and declare them at the top level?
Note that this is specially necessary for getters defined in a loop as in each iteration it will create a new function, generating more memory garbage.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a style decision. My original implementation attempted to not use the outer function at all (e.g. get_path_param_getter()
, but crate the closure directly (e.g. get_param()
, but that doesn't work in a loop neccessitating the outer helper function. I kept it at the same place as the closure to keep all the logic in the same place.
You need not worry about "memory garbage". The outer function is assigned to a local variable and gets destroyed when the "get_dependent_dependency_getters()" exits. Each iteration just redefines the function. But I can remove it out of the loop if you want. It will just reduce the locality of the logic a bit. Python is nice like that, local functions definitions are just fine.
This also happens during the setup of the endpoints so it is not critical performance-wise.
But I'm happy to move them out to the top level if you think that style is better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've moved the definitions out of the loop scope, though.
📝 Docs preview for commit caa8a57 at: https://63a1a4c5e073010797cce777--fastapi.netlify.app |
caa8a57
to
cec8eef
Compare
📝 Docs preview for commit cec8eef at: https://63a1adbbe073010ef9cce6fb--fastapi.netlify.app |
📝 Docs preview for commit 7e9e36f at: https://63a1b69312177a198aee6326--fastapi.netlify.app |
📝 Docs preview for commit 4336a43 at: https://63bfdc4d50d4fd112dacd587--fastapi.netlify.app |
Is there any interest in reducing call overhead for FastAPI endpoint dependencies? |
Could we get this PR merged, I'd love to get this into my project. |
If you are interested in faster dependency solving, I can suggest my project which can be used with fastapi together (or without it, if you need) https://github.com/reagento/dishka/tree/develop |
This pr provides an overhaul of solve_dependencies(). It is aimed at fixing performance and increasing readability.
if
statements for every Dependency evaluationDependencySolverContext
dataclass avoiding a lot of boilerplate code in the main solve_dependencies() functionA new test is added to test deep
Depends
hierarchies and caching.A specialized unittests is also added which uses the
timeit
module to measure performance ofDepends
processing. We run an AsyncClient to reduce overhead, for complete requests. Also direct evaluation of the endpoint dependencies, both forDepends
and alsoRequest
. (The test does nothing by default usingif False:
. Could also add a pytest switch to enable it.)On a local PC, I measure, with the
pytest_cahce_perf
unittest enabled andpytest --quiet -k test_deep_cache_perf
unittest:Performance of deep Depends hierarchies is doubled, mostly due to the fixing of the cache mechanic. But simple dependencies, like the Request, are also faster, since the whole depdendency evaluation loop is simpler, and we use dispatch functions to fill the parameters rather than switch statements.
Running the entire testsuite, with:
pytest --quiet tests
reports 28.4s before and 28.1 after the change. The whole testsuite however typically creates an app and runs the endpoint once. Also, there is tremendous overhead in using
TestClient
rather thanAsyncClient
for each endpoint call. The optimizations are expected to trade app init time for endpoint execution time.