Re: [apache-plusplus] Process model ideas for C++ Apache.
The "fiber" stuff I'm talking about you can find in the doc at
http://www.arctic.org/~dgaudet/apache/2.0/process-model
or in the docs subdirectory of the apache-2.0 cvs repository (which I
think has a few more updates in it).
But in a nutshell, a "fiber" is a user-level thread, a thread implemented
by user-level code. A "thread" is a kernel-level thread, and a "process"
is a process. There are a few models of multiprocessing that are
"interesting". One is to have multiple fibers inside a process (I call
this SSM: single process, single thread, multiple fibers). This is the
model that many unix thread libraries use -- they wrap system calls like
read()/write() with non-blocking reads and writes and possibly stick
things in select when the operation would block. Then there's SMS where
the kernel provides all the threads, and syscalls don't need any wrapping.
Solaris, NT, IRIX, and OSF/1 provide us with pthreads implementations that
are actually SMM -- a number of user-level threads are multiplexed on a
number of kernel-level threads. SMM and MMM are the models that I find
the most interesting... mostly based on some of the papers mentioned at
that URL, and on discussions with folks. NSPR happens to provide an SMM
implementation on some systems (it only needs an SMS pthreads
implementation and it can provide an SMM implementation on top of that).
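
A minimal sketch of the SSM-style wrapping described above: a read that looks blocking to its caller but is built from a non-blocking read plus select(). It is written in Python for brevity and is not taken from any particular thread library. The name wrapped_read and the on_would_block hook are hypothetical; a real fiber library would switch to another fiber at that point and run one central select() for all of them, rather than calling select() per read as this does.

```python
import os
import select

def wrapped_read(fd, nbytes, on_would_block=None):
    """Blocking-looking read built on a non-blocking read and select()."""
    os.set_blocking(fd, False)
    while True:
        try:
            return os.read(fd, nbytes)
        except BlockingIOError:
            # The plain read() would have blocked here.
            if on_would_block is not None:
                # Hypothetical hook: where a fiber scheduler would run
                # some other fiber instead of waiting.
                on_would_block(fd)
            # Wait until the descriptor is readable, then retry the read.
            select.select([fd], [], [])
```

Called on an empty pipe, wrapped_read() only returns once another writer has put data into it; the caller never sees the would-block condition.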
One major fault with my paper above is that I don't think the programmer
should know about more than "processes" and "threads" -- whether a
thread is user-level or kernel-level should be completely transparent. So
beyond the discussion of process models themselves I tend to just refer to
processes and threads... and skip fibers. Confusing, yeah.
On Tue, 26 May 1998, Bret wrote:
> I must admit, I haven't had a chance to study your Apache-nspr stuff yet,
> Dean, but I'm curious as to your reasoning for ditching the pool of
> threads concept. This might not be appropriate for this mailing list, but
> I would like to hear why you're going for what seems like, on a surface
> glance anyway, a less elegant system. Was it only out of concern for
> simplicity, or was there some other reason?
The "best" method used to dispatch incoming connections to threads depends
on how the underlying thread system is implemented. If it's SMS then a
pool of threads sitting in accept() could be the right answer, because the
kernel really knows how many cpus there are, and is able to maximize the
parallelism. But if the model is SSM then multiple threads in accept()
don't work as well as a single thread in accept() -- for example, the
blocking accept() could be implemented behind the scenes with a
non-blocking accept() and select() multiplexing. SMS thread creation time
is probably higher than SSM thread creation time...
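
The two dispatch styles compared above can be sketched roughly as follows. This is an illustration in Python, not Apache or NSPR code; accept_pool, single_acceptor, and the handle callback are hypothetical names. Python threads are kernel threads, so the second function only shows the shape of single-acceptor dispatch, not its cost on a user-level thread library.

```python
import queue
import socket
import threading

def accept_pool(listener, nthreads, handle):
    """Every worker blocks in accept(); the kernel picks one per connection."""
    def worker():
        while True:
            conn, _addr = listener.accept()
            with conn:
                handle(conn)
    for _ in range(nthreads):
        threading.Thread(target=worker, daemon=True).start()

def single_acceptor(listener, nthreads, handle):
    """One thread sits in accept() and hands connections to the workers."""
    pending = queue.Queue()
    def acceptor():
        while True:
            conn, _addr = listener.accept()
            pending.put(conn)
    def worker():
        while True:
            with pending.get() as conn:
                handle(conn)
    threading.Thread(target=acceptor, daemon=True).start()
    for _ in range(nthreads):
        threading.Thread(target=worker, daemon=True).start()
```

Both functions take an already listening socket, so a caller can swap one for the other without touching the handler. That is the property that lets the choice be hidden behind a library.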
Apache isn't the only application that needs this accept --> thread
operation... in fact it's something pretty easy to abstract. I'd like to
eventually abstract it and place it in NSPR itself. NSPR can easily
manage a pool of threads and dispatch according to the best model
appropriate for whatever implementation it's interested in. When a thread
"exits" NSPR doesn't have to reclaim the storage immediately, it can just
hold onto it, and use it for another connection... or for any other
PR_CreateThread call. The application shouldn't have to maintain a pool
of prespawned threads; that can easily be hidden.
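
A rough sketch of hiding thread reuse behind a create-thread call, as described above. This is not NSPR's implementation or API; ThreadCache, create_thread, and the spawned counter are hypothetical names used only for illustration. When a job finishes, its worker is parked instead of being torn down, and the next create_thread() call reuses it.

```python
import queue
import threading

class ThreadCache:
    def __init__(self):
        self._idle = []                 # job queues of parked workers
        self._lock = threading.Lock()
        self.spawned = 0                # how many real threads were created

    def create_thread(self, fn, *args):
        """Run fn(*args) on a parked worker if there is one, else spawn one."""
        done = threading.Event()
        with self._lock:
            jobs = self._idle.pop() if self._idle else None
            if jobs is None:
                jobs = queue.Queue()
                self.spawned += 1
                threading.Thread(target=self._run, args=(jobs,),
                                 daemon=True).start()
        jobs.put((fn, args, done))
        return done                     # set once fn has returned

    def _run(self, jobs):
        while True:
            fn, args, done = jobs.get()
            try:
                fn(*args)
            except Exception:
                pass                    # sketch only: a real library would report this
            # The "thread exit": park the worker for reuse, then signal.
            with self._lock:
                self._idle.append(jobs)
            done.set()
```

The caller only ever asks for a new thread; whether it gets a fresh one or a recycled one is invisible, so no application-level pool of prespawned threads is needed.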
So yeah, I plan to do this for simplicity, and I do have a plan for how to
make it go faster later. What I'm trying to do is not dig myself into a
hole -- so when I try to take away an "optimization" that we've done in
the past I try to make sure that it'll be easy to re-implement in another
way if it turns out we really do need it. I've been guilty of doing
optimizations for the sake of doing them, without benchmark information,
in the past... I'm trying to stop that... it just makes things far too
complex.
> avoids the "dispatch" overhead you expressed concern about before. I
> missed the lock to avoid starvation in the Apache code (have to go look for
> that...)
Look at http://www.apache.org/docs/misc/perf-tuning.html and search for
"accept"; I discuss the starvation problem there. There's another form of
starvation which 1.2 is susceptible to, but 1.3 isn't, which depends on
the ordering of the Listen statements. If you've got a busy address first
(or last, I forget) it can starve the rest.
Dean