Notes from running out of threads when running the start-instance command in parallel (13021: simultaneous start-instance commands fail).

What is unique to this command is that it runs another command (start-local-instance), which in turn calls back to the DAS to synchronize the file system. That callback uses another thread from the Grizzly thread pool. If several start-instance commands have started and each has invoked start-local-instance, all of them call back to the DAS at the same time. We can then reach a deadlock: every thread in the pool is held by a start-instance command waiting on its callback, no threads are left to serve the sync calls, and none of the commands can complete. Eventually the commands start to time out.

We have increased the size of the admin thread pool, but a user can always try to start a large enough number of instances in parallel to exhaust the threads in the pool. The following proposal alleviates the problem and allows commands like start-instance to run in parallel successfully, regardless of how many instances the user tries to start at once. We specifically do not want an unbounded thread pool, as that could be a security risk. Instead, we want to release the Grizzly thread so that it does not sit waiting on a long-running command, while the response to the client is still held until the command completes. This is done by using a custom thread pool in the AdminAdapter code.
Here is some pseudo-code based on what Alexey sent:

    static class AdminAdapter extends GrizzlyAdapter {
        @Override
        public void service(final GrizzlyRequest grizzlyRequest,
                            final GrizzlyResponse grizzlyResponse) {
            // get the command, check if it is annotated with @UseThreadPool
            if (annotated) {
                grizzlyResponse.suspend(); // suspend the response here
                threadpool = get thread pool named in annotation
                threadpool.execute(new Runnable() { // run the task in a separate thread
                    @Override
                    public void run() {
                        try {
                            // run the command the same way it is normally run,
                            // but in a different thread
                            doCommand(....);
                            // write the response (same code that is at the end
                            // of AdminAdapter.service())
                        } catch (IOException e) {
                        } catch (InterruptedException e) {
                        } finally {
                            grizzlyResponse.resume(); // finish the HTTP request processing
                        }
                    }
                });
                // return from service(); this releases the Grizzly thread for
                // another request, but does not finish the response
                return;
            }
        }
    }
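
The effect of the hand-off can be reproduced without Grizzly. The following is a minimal sketch using only java.util.concurrent, not the actual AdminAdapter code: the class name HandOffDemo, the pool sizes, and the 200 ms timeout are all assumptions made for illustration. A small fixed pool stands in for the Grizzly admin pool, each simulated start-instance command "calls back" by submitting a sync task to that same pool, and the handOff flag chooses between running the command on the request thread and handing it to a separate bounded pool.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class HandOffDemo {
    // Assumed timeout for the simulated sync callback.
    static final long SYNC_TIMEOUT_MS = 200;

    /**
     * Simulates `commands` parallel start-instance requests and returns how
     * many of them completed their sync callback before timing out.
     */
    public static int run(int commands, int requestThreads, boolean handOff)
            throws Exception {
        // Stands in for the Grizzly admin thread pool.
        ExecutorService requestPool = Executors.newFixedThreadPool(requestThreads);
        // Hypothetical custom pool; bounded, with a queue for excess commands.
        ExecutorService commandPool = Executors.newFixedThreadPool(4);
        // Holds the commands back until every request has been submitted,
        // so that the run does not depend on submission timing.
        CountDownLatch allSubmitted = new CountDownLatch(1);
        List<CompletableFuture<Boolean>> results = new ArrayList<>();
        try {
            for (int i = 0; i < commands; i++) {
                CompletableFuture<Boolean> result = new CompletableFuture<>();
                results.add(result);
                Runnable command = () -> {
                    try {
                        allSubmitted.await();
                        // The callback to the DAS needs a free request thread.
                        requestPool.submit(() -> { })
                                   .get(SYNC_TIMEOUT_MS, TimeUnit.MILLISECONDS);
                        result.complete(true);
                    } catch (Exception e) {
                        result.complete(false); // timed out or interrupted
                    }
                };
                // With hand-off, the request thread only schedules the command
                // and returns at once; without it, the request thread runs the
                // command itself and stays blocked on the callback.
                Runnable request = handOff
                        ? () -> commandPool.execute(command)
                        : command;
                requestPool.execute(request);
            }
            allSubmitted.countDown();
            int completed = 0;
            for (CompletableFuture<Boolean> r : results) {
                if (r.get()) {
                    completed++;
                }
            }
            return completed;
        } finally {
            requestPool.shutdownNow();
            commandPool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("blocking: " + run(8, 2, false) + " of 8 completed");
        System.out.println("hand-off: " + run(8, 2, true) + " of 8 completed");
    }
}
```

With hand-off enabled, the request threads are never held by a waiting command, so all eight callbacks should be served even though the request pool has only two threads and the command pool only four. Without it, the first two commands occupy both request threads and their callbacks queue behind the remaining requests, so they time out. The sketch leaves out the response suspend/resume step from the pseudo-code above, since that part depends on the Grizzly API.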