
Propose the addition of custom executors to Swift concurrency #1257

Closed
wants to merge 5 commits into from

Conversation

rjmccall (Contributor)

No description provided.

rjmccall marked this pull request as draft on January 29, 2021, 22:54
In proposals/0000-custom-executors.md:

> calls to the same actor without any intervening suspensions,
> running the function explicitly on that actor's executor may
> allow Swift to avoid a lot of switching overhead (or may even
> be necessary to perform those calls "atomically").
Johannes

Thanks @rjmccall! If you want another example: just as you describe below, many systems have fixed-width thread pools whose threads don't block as long as there is outstanding work to be done; if there is no outstanding work, these threads usually do block (as you describe below). So without custom executors, implementing one of these systems (typically used for I/O) would not be possible without prohibitive thread switching, because we obviously can't block, say, a thread of the default executor in kqueue, epoll_wait, io_uring, or mach_msg.

In other words: any system that implements its own I/O and also wants to run async code will need custom executors for that.
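To make that concrete, here's a minimal sketch of such an executor, assuming the `SerialExecutor`/`UnownedJob` API shape this proposal describes (spelled with the names that eventually shipped in Swift). `MyEventLoop` is a hypothetical single-threaded I/O loop, not a real library type:

```
// Hypothetical single-threaded I/O event loop: it blocks in
// kqueue/epoll_wait/io_uring when idle and runs queued work
// between I/O events on its one thread.
final class MyEventLoop {
  func execute(_ work: @escaping () -> Void) {
    // Enqueue `work` and wake the loop's thread (details elided).
  }
}

// A custom serial executor that runs jobs on the event-loop thread
// instead of blocking a thread of the default concurrent executor.
final class EventLoopExecutor: SerialExecutor {
  private let loop: MyEventLoop
  init(loop: MyEventLoop) { self.loop = loop }

  func enqueue(_ job: UnownedJob) {
    let executor = asUnownedSerialExecutor()
    loop.execute { job.runSynchronously(on: executor) }
  }

  func asUnownedSerialExecutor() -> UnownedSerialExecutor {
    UnownedSerialExecutor(ordinary: self)
  }
}
```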

rjmccall (Contributor, Author)

To be clear, and I do remember that the proposal discusses this: while you certainly could use custom executors for that, I'm not sure it's a good idea, because having two fixed-width thread pools in a process basically means having one poorly-managed thread pool with way too many threads. Abstractly, I think we really want to encourage people who are writing scales-to-the-number-of-cores thread pools to either use the default concurrent executor or replace it globally. But if you think having multiple pools is important, I'd love to understand how you hope to use the second pool, and particularly how you think it should affect thread allocation in the main pool.

Johannes

The default mode in SwiftNIO is exactly what you describe: one scales-to-the-number-of-cores thread pool that is shared by everything. However, we leave this up to the user because not every application is homogeneous.

For example, let's say your server is supposed to, at peak times, accept as many connections as possible. The way you'd achieve that is to use one thread pool with number-of-cores threads, each of which binds to the same socket address (with SO_REUSEPORT). You could also give those threads elevated priority in the OS and possibly even pin each of them to an individual hardware CPU core. This works well but is still quite latency-sensitive, because the OS's "listen queue" (the queue of incoming connections that have yet to be accepted) is bounded. So if at peak time you can't call accept fast enough, the OS will drop connections for you (once the listen queue is exhausted). So far, this works really well with just one replaced default pool. But imagine that, in the same program, we also run other code which isn't supposed to interfere with the accept-as-fast-as-possible work. Imagine, for example, that some JSON needs to be decoded, or something like that. It could be a pretty big problem if the JSON decoding caused us to return to the scheduler less often, because that may cause us to drop connections. Priorities can't solve this issue because they are not preemptive.

As a really simple fix, users can configure a second (fixed-width) thread pool on which they run the other work. They are basically ring-fencing the two workloads that run in the same address space from each other by giving each its own pool.
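For illustration, with SwiftNIO that ring-fencing is literally two separate fixed-width pools (a sketch; the pool sizes are illustrative):

```
import NIO

// The latency-critical accept work gets the full-width pool...
let acceptGroup = MultiThreadedEventLoopGroup(numberOfThreads: System.coreCount)

// ...while the other work (e.g. JSON decoding) is fenced off on a
// small pool of its own and can never steal the accept threads.
let bulkWorkGroup = MultiThreadedEventLoopGroup(numberOfThreads: 2)
```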

Other examples include:

  • if you were to run two separate systems that perform I/O in one process: say, on Linux, one system that multiplexes I/O using io_uring and another one that multiplexes using epoll. They can't share the same thread pool because their whole eventing and I/O mechanism is totally different; they could, for example, use two thread pools of size number_of_cores/2 each
  • in unit testing, we usually start up and tear down the thread pool per test (suite) so the tests are completely independent, work better when run in parallel, and can exercise interesting scheduling scenarios with, say, 1 thread, or 3 threads, or some other "interesting" number (see the sketch after this list)
  • say you have a service and an "admin interface". You may choose to run the admin interface on a separate thread pool to be sure it never disrupts the main service from achieving its goals, even if the admin interface does processing that's a little too heavy (say, parsing JSON or protobuf, or compressing/decompressing, ...) and might harm the latency of the main service
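For the unit-testing point above, the per-test setup/teardown typically looks roughly like this (a sketch using SwiftNIO and XCTest):

```
import NIO
import XCTest

final class SchedulingTests: XCTestCase {
  private var group: MultiThreadedEventLoopGroup!

  override func setUp() {
    super.setUp()
    // A tiny, test-local pool: deterministic, and independent of
    // whatever other suites running in parallel are doing.
    group = MultiThreadedEventLoopGroup(numberOfThreads: 1)
  }

  override func tearDown() {
    XCTAssertNoThrow(try group.syncShutdownGracefully())
    group = nil
    super.tearDown()
  }
}
```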

ktoso (Contributor)

What Johannes explains applies to actors in general, in my experience.

How this usually shows up is that actors perform "known to be slow/blocking" work, like that huge-JSON example, "synchronously". Note that Codable is not going to be async, so an actor performing decode/encode could be such an offender. It is a popular strategy (and, I guess, the main reason we even exposed executors in Akka in the first place) to say "I know this one will do such nasty stuff" and get these kinds of actors off the shared global pool, on which actors are expected to be snappy and return threads pretty quickly.

So the "other (than the global) pool" is really there to "put the nasty blocking stuff on that other pool" so it does not interfere with the snappiness of other actors which are well behaved. Blocking I/O is a good example too.


I have a good example from previous work that illustrates how this shows up.

Here we use the system-wide dispatcher, which runs all actors and also futures, and do a nasty sleep on it (imagine it doing a very slow SomeValue(from: decoder), etc.):

```
// BAD! (due to the blocking in Future):
implicit val defaultDispatcher = system.dispatcher

val routes: Route = post {
  complete {
    Future { // uses defaultDispatcher <<<<<<<<<<<<<
      Thread.sleep(5000)                    // will block on the default dispatcher,
      System.currentTimeMillis().toString   // starving the routing infra
    }
  }
}
```

This stalls the entire thread pool that all actors (and futures) share in Akka, and all actors get completely starved; we can't even write 500 or timeout replies, because the blocking work is taking up all the threads and we can't stop it.

[Screenshot of profiler thread states: turquoise = SLEEPING, orange = WAITING, green = RUNNABLE]

and the server grinds to a halt.

Compare that with making a dedicated pool that we'll call the "blocking executor" (there's a mini DSL to declare those) and executing all that "slow/blocking" work on it:

```
// GOOD (due to the blocking in Future):
implicit val blockingDispatcher = system.dispatchers.lookup("my-blocking-dispatcher")

val routes: Route = post {
  complete {
    Future { // uses the good "blocking dispatcher" that we configured,
             // instead of the default dispatcher – the blocking is isolated.
      Thread.sleep(5000)
      System.currentTimeMillis().toString
    }
  }
}
```

This now behaves great: the server's threads remain active and can serve any actors, while the "bad work" continues elsewhere.

Of course, with async/await the sleep would suspend, so that specific call wouldn't matter.

I'd love to say "never block", but realistically we still have blocking I/O and other potentially very slow operations like encoding/decoding. Cases like "a huge JSON comes in, and it takes ages to deserialize it" are bound to happen, because that's what serialization inevitably ends up being, and Codable itself isn't async either. Making sure we deserialize "heavy" payloads on a dedicated pool can be good for server systems, so it generally makes sense to throw such "big slow operations" at a separate pool, even if it is small and dedicated only to "bad stuff".

Hope that's another useful real-world example!
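In Swift terms, the same "pin the nasty blocking actor to its own pool" pattern might look roughly like this under this proposal's actor-customization hook (a sketch: it uses the executor names that eventually shipped, a dispatch queue as a stand-in "blocking pool", and a hypothetical `SomeValue` payload):

```
import Dispatch
import Foundation

// Stand-in "blocking pool": a serial executor backed by its own queue,
// kept separate from the default pool so slow work can't starve it.
final class BlockingPoolExecutor: SerialExecutor {
  private let queue = DispatchQueue(label: "blocking-pool")

  func enqueue(_ job: UnownedJob) {
    let executor = asUnownedSerialExecutor()
    queue.async { job.runSynchronously(on: executor) }
  }

  func asUnownedSerialExecutor() -> UnownedSerialExecutor {
    UnownedSerialExecutor(ordinary: self)
  }
}

let blockingPool = BlockingPoolExecutor()

struct SomeValue: Codable {} // hypothetical heavy payload

// "I know this one will do nasty stuff": this actor runs on the
// dedicated executor instead of the shared global pool.
actor HeavyDecoder {
  nonisolated var unownedExecutor: UnownedSerialExecutor {
    blockingPool.asUnownedSerialExecutor()
  }

  func decode(_ data: Data) throws -> SomeValue {
    // Slow, synchronous Codable work, isolated from snappy actors.
    try JSONDecoder().decode(SomeValue.self, from: data)
  }
}
```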

In proposals/0000-custom-executors.md:

> An `actor` may derive its executor implementation in one
> of the following ways. We may add more ways in the future.
glbrntt

These seem a bit brittle to me: if you had a typo in serialExecutor, for example, presumably the actor would use the default serial executor and you would be none the wiser?

rjmccall (Contributor, Author)

I see your point, but I'm not sure what to do about it. It's not reasonable to require actors to explicitly opt in to using the default serial executor, which means that there's always going to be some series of errors of omission that can lead to using it accidentally.

If we had a way to annotate that a function is supposed to be used for a protocol — a long-standing request — that would work for the serialExecutor case specifically.

ktoso (Contributor)

I'm not too worried here, to be honest. It's the same issue as with any other customization hook like that...

If you REALLY wanted to make sure not to make that mistake, I think there are a number of ways to address it in plain user code.

a) @glbrntt I think you could require the executor type via a protocol:


```
protocol Executor {}
class EventLoop: Executor {}

protocol Actor {
  associatedtype ActorExecutor: Executor
  var ex: ActorExecutor { get }
}

protocol NeedsEL: Actor where ActorExecutor == EventLoop {}

class Bad: NeedsEL {
  // error: type 'Bad' does not conform to protocol 'Actor'
  // note: protocol requires property 'ex' with type 'EventLoop';
  //       do you want to add a stub?
}

class Good: NeedsEL {
  let ex: EventLoop
  init(executor eventLoop: EventLoop) {
    self.ex = eventLoop
  }
}
```

This should be possible, right? Synthesis would have to be aware of it, or it would fail, since the default serial executor would not fulfil this requirement.

glbrntt

> It's not reasonable to require actors to explicitly opt in to using the default serial executor, which means that there's always going to be some series of errors of omission that can lead to using it accidentally.

I agree.

I'm also not worried about this, but it seemed worth pointing out as a potential pitfall. Calling it out in documentation would probably be sufficient.

ktoso (Contributor) left a comment:

This is looking good 👍 Some comments inline but overall looking great, seems this'll enable all we need 🤔

DougGregor (Member)

@rjmccall can you pitch this over on the forums so the discussion can move there?

rjmccall (Contributor, Author) commented Feb 2, 2021

> @rjmccall can you pitch this over on the forums so the discussion can move there?

Yeah, I'm just tweaking it now; I'll post it tonight.

ktoso (Contributor) left a comment:

Looks great, thanks John :)

DougGregor (Member)

This isn't ready for review yet. Please open up a new PR after it's been re-pitched and is ready for review.

DougGregor closed this Jul 27, 2021