Re: [apache-plusplus] Process model ideas for C++ Apache.



>I sent a similar message regarding ideas for a 2.0 process
>model to the new-httpd list, where it was totally ignored.
>Here's your chance to ignore it also.

	Well, I cannot speak for Ben or Bruce, but the only reason I've ignored
your posting is because I've had too many things come up which couldn't be
ignored.  Now that those are taken care of, I'll do my best not to ignore
you any further... :)

>My project is a hybrid of C++ classes and procedural C. All
>sorts of C++ conventions were violated (such as member functions
>which return the address of private data) to step around
>certain C++ inefficiencies. I used the book
>"Inside the C++ Object Model", by Stanley B. Lippman, 1996,
>Addison-Wesley, ISBN 0-201-83454-5, to help me identify C++
>performance weak spots.

	Again, I can't speak for others, but this last paragraph sets off a few
warning bells in my head.  Without seeing your code it's impossible to say
anything for certain, but it certainly sounds ominous to state that you
violated standard C++ conventions! :)

>These are the major themes in the process model of my project:
>
>1.  threads are self-dispatching through lightweight
>    synchronization such as mutexes, condition-variables,
>    or semaphores (collectively called a poll); no central
>    dispatcher, no main scheduling loop, no select().

	I'm by no means an expert on all things threaded, but from the little I've
been able to pick up, it seems like this would require a relatively large
number of mutexes.  Did you find this was actually the case?
	Also, merely to satisfy my own curiosity, could you speak more
specifically about how you handled scheduling without a dispatcher thread?
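	To make the question concrete, here is a minimal sketch of how I picture
a self-dispatching pool.  It is an assumption on my part, not your code: the
Poll class and the names assert_poll(), wait(), close() and run_poll_demo()
are hypothetical, and it leans on the standard C++ <thread>, <mutex> and
<condition_variable> headers rather than any particular thread package.  In
this form it needs one mutex per poll, not one per thread.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical "poll": a counting wake-up signal built from a mutex and a
// condition variable. Names are illustrative, not from the original project.
class Poll {
public:
    // Record one unit of pending work and wake one waiting thread.
    void assert_poll() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ++pending_;
        }
        ready_.notify_one();
    }

    // Block until work is pending or the poll is closed.
    // Returns true if a unit of work was claimed, false on shutdown.
    bool wait() {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return pending_ > 0 || closed_; });
        if (pending_ == 0) return false;  // closed and fully drained
        --pending_;
        return true;
    }

    // Wake every waiter so threads can exit once the work is drained.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    int pending_ = 0;
    bool closed_ = false;
};

// Start `workers` threads that dispatch themselves off one Poll, assert the
// poll `jobs` times, and return how many units of work were handled.
int run_poll_demo(int workers, int jobs) {
    assert(workers > 0);
    Poll poll;
    std::mutex count_mutex;
    int handled = 0;

    std::vector<std::thread> pool;
    for (int i = 0; i < workers; ++i) {
        pool.emplace_back([&] {
            while (poll.wait()) {  // each thread schedules itself
                std::lock_guard<std::mutex> lock(count_mutex);
                ++handled;
            }
        });
    }

    for (int i = 0; i < jobs; ++i) poll.assert_poll();
    poll.close();
    for (auto& t : pool) t.join();
    return handled;
}
```

	Calling run_poll_demo(4, 100) starts four workers and asserts the poll 100
times; no thread ever decides which worker runs next, which is how I read
"no central dispatcher".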

>2.  the number of threads is minimized by specializing and
>    abstracting threads by "function" (accept()or, read()er,
>    write()r, processor, etc.), so that no pending "function"
>    is waiting for another "function" to complete. For example,
>    a thread that accept()s -> read()s -> processes -> write()s,
>    cannot accept() another connection while it is read()ing.
>    Any read()er or write()r should be able to service any
>    connection provided by any accept()or. Any processor should
>    be able to service any message (request->response) from any
>    connection. More connections can be handled with fewer
>    threads since "functions" are asynchronous to each other.
>    There is a pool of read()ers, write()rs, processors, etc.
>    available to handle any message, waiting on its poll to go
>    to work. accept()ors are, of course, blocking on accept().
>    For network I/O, this scheme implies a bucket(s) of
>    connection records in common or shared memory over which
>    accept()ors, read()ers, and write()rs can operate.

	To me, who also has a somewhat limited knowledge of Apache, this sounds
like an enhancement to the Apache pool/resource model.  Ben, your thoughts?
	I'm a bit perplexed by your wording; first you appear to refer to a thread
that performs functions successively, in the case you cited accept()
followed by read() and so on.  Next, you refer to multiple pools of
specialized threads, each of which performs a single function.  This raises
two questions in my mind...

	1. Do threads change functionality over time?  In other words, does a
single thread perform multiple functions, or do you have pools of multiple
threads, each thread in a given pool performing the function assigned to
that pool?

	2. If the thread changes functionality over time, what controls that
change?  If it doesn't, and merely passes handling for the next function
off to a thread from within the available pool of threads for that
function, how is the passing handled?  What kind of overhead are you
looking at?

	Oh, I see... the polls handle the scheduling between the various pools of
threads.  Can't quite identify it, but I keep hearing "race condition" in
the back of my head somewhere... guess I'll have to study this a bit more
closely later...
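	For what it's worth, this is roughly how I imagine the hand-off between
two pools working, as a sketch under my own assumptions.  Bucket, put(),
take(), close() and run_handoff_demo() are hypothetical names, and doubling
an integer stands in for the real read() step.  The wait re-checks its
condition while holding the lock, which is the usual guard against the
missed-wakeup kind of race I had in mind.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bucket: a shared queue whose "poll" is its condition variable.
template <typename T>
class Bucket {
public:
    // Add a record and "assert the poll" of the pool that consumes it.
    void put(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push_back(std::move(item));
        }
        ready_.notify_one();
    }

    // Store the next record in `out`; return false once closed and empty.
    bool take(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        // The predicate is evaluated under the lock, so a put() that
        // happened before this call is never missed.
        ready_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    // Let waiting threads drain the bucket and then exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> items_;
    bool closed_ = false;
};

// Readers take connection ids and produce messages; processors consume the
// messages. Returns the sum of all processed messages.
long run_handoff_demo(int readers, int processors, int connections) {
    Bucket<int> connection_bucket;
    Bucket<long> message_bucket;
    std::mutex sum_mutex;
    long sum = 0;

    std::vector<std::thread> reader_pool;
    std::vector<std::thread> processor_pool;
    for (int i = 0; i < readers; ++i) {
        reader_pool.emplace_back([&] {
            int id = 0;
            while (connection_bucket.take(id))
                message_bucket.put(2L * id);  // stand-in for read()
        });
    }
    for (int i = 0; i < processors; ++i) {
        processor_pool.emplace_back([&] {
            long msg = 0;
            while (message_bucket.take(msg)) {
                std::lock_guard<std::mutex> lock(sum_mutex);
                sum += msg;
            }
        });
    }

    for (int id = 1; id <= connections; ++id) connection_bucket.put(id);
    connection_bucket.close();
    for (auto& t : reader_pool) t.join();     // all messages are now queued
    message_bucket.close();
    for (auto& t : processor_pool) t.join();
    return sum;
}
```

	In this sketch no thread changes function; each pool blocks on its own
bucket, and the only per-hand-off cost is one lock and one notify.  Whether
that matches your overhead is exactly what I'd like to hear.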

>3.  connections are abstracted from network I/O. A connection
>    is nothing more than a record in the connection bucket.
>    Network protocols (TCP, UnixDomain, UDP, TLI, etc.) are
>    instantiated objects. Bucket records are self-contained
>    objects which know their state and handle their own concurrency.
>
>4.  complete abstraction of messages (requests & responses)
>    from connections (network I/O) and the processing of the
>    messages. This scheme implies a bucket(s) of message records
>    in common or shared memory over which read()ers, and
>    write()rs, processors, and other thread-types can operate.
>    Bucket records are self-contained objects which know their
>    state and handle their own concurrency.

	Have you looked at the ACE classes, available from Washington U., with
respect to these items?  They have a fairly robust implementation of some
of the things you describe in these points... might be worth your while to
check it out.
	Personally, I view this type of abstraction as being quite useful.  Apache
has little or no such abstraction; system calls are masked by one or two
functions, placed somewhat randomly.  To be expected given the nature of
the beast, but a more consistent and thoughtful layer of abstraction might
be well worth our while, whether it comes from ACE or elsewhere...

>That's it. Of course, the devil is in the details.

	Well said... :)

>So, the macro processing model is:
>
>main thread:
>
>    1.  parse config file and command-line arguments.
>    2.  allocate buckets.
>    3.  startup initializations (thread attributes, etc.)
>    4.  start ThreadManager (who has a bucket of threads.)
>    5.  request ThreadManager to start Processors, Readers,
>        Writers, Acceptors, and whatever other threads are needed.
>    6.  hang-out waiting for exit event, clean-up and die.
>
>ThreadManager:
>
>    1.  start threads as requested. Instantiate network
>        protocols for Acceptors.
>    2.  monitor connection- and message-record wait times.
>        start new threads as wait times increase (heavier
>        loads).
>    3.  reap dysfunctional and underused threads.
>
>Acceptors:
>
>    1.  grab a connection-record and accept().
>    2.  throw the connection-record (connection handle and
>        protocol object) into the connection bucket. Assert
>        the Reader's poll.
>    3.  loop to 1.
>
>Readers:
>
>    1.  grab a message-record from the message bucket; wait
>        on poll.
>    2.  when asserted, grab any ready connection-record
>        from the connection bucket and read() the message
>        (request) into the message-record.
>    3.  throw the message, with connection-record id, into
>        the message bucket; assert the Processor's poll.
>    4.  loop to 1.
>
>Processors:
>
>    1.  wait on poll.
>    2.  when asserted, grab any ready message-record
>        from the message bucket.
>    3.  do a table-lookup for request/module type;
>        escort message through module functions.
>    4.  throw the message into the message bucket;
>        assert the Writer's poll.
>    5.  loop to 1.
>
>Writers:
>
>    1.  wait on poll.
>    2.  when asserted, grab any ready message-record
>        from the message bucket.
>    3.  grab the connection-record from the connection
>        bucket; write() the message.
>    4.  clear the message- and connection-records.
>    5.  loop to 1.

	A few idle comments... note that I haven't studied this in detail; these
are just the first things that sprang to mind.
	On a surface glance it appears quite intriguing.  The only thing that
strikes me right away is your use of polls in place of select()... it seems
to me, based on the limited info I have in front of me, that you're
mimicking the functionality of select() by creating a multithreaded acceptor
(i.e. a pool of acceptors, in your lingo).  When discussing the accept()
phase only, how does this approach offer you any better performance than
just using select()?
	I assume you have some method of controlling the possibility of deadlock
if two readers attempt to access the same connection-record simultaneously?
	It also appears to me that you simply remove any socket info when you
remove the message and connection records in your writer process.  When
talking about a Web server, you'd have to retain a list of old sockets like
Apache does so that, in the case of a restart, you won't run into "socket
in use" problems.
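	On the question of two readers going after the same connection-record,
one way I could picture a record "handling its own concurrency" is an atomic
state field, sketched below.  This is purely my guess at a mechanism:
ConnectionRecord, State, try_claim() and run_claim_demo() are hypothetical
names, and the sketch ignores everything about real sockets.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical record states; a real record would likely have more.
enum class State { Free, Ready, Reading };

struct ConnectionRecord {
    std::atomic<State> state{State::Free};
    int handle = -1;  // placeholder for a socket handle

    // Exactly one caller wins the Ready -> Reading transition; everyone
    // else sees false and moves on to another record.
    bool try_claim() {
        State expected = State::Ready;
        return state.compare_exchange_strong(expected, State::Reading);
    }
};

// Let `readers` threads race over `records` ready records and return the
// total number of successful claims.
int run_claim_demo(int readers, int records) {
    assert(readers >= 0 && records >= 0);
    std::vector<ConnectionRecord> bucket(records);
    for (auto& r : bucket) r.state.store(State::Ready);

    std::atomic<int> claims{0};
    std::vector<std::thread> pool;
    for (int i = 0; i < readers; ++i) {
        pool.emplace_back([&] {
            for (auto& r : bucket)
                if (r.try_claim()) ++claims;
        });
    }
    for (auto& t : pool) t.join();
    return claims.load();
}
```

	Since no reader ever holds a lock while waiting for another, this
particular scheme cannot deadlock; each record is claimed once no matter how
many readers race for it.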

>Since the buckets and threads contain concurrency
>mechanisms, care must be taken not to serialize thread
>execution by overzealous locking and so lose the inherent
>parallelism possible in the design.

	Guess this is what I was getting at earlier... I could see this very
easily becoming a source of debugging nightmares ("Why am I losing so much
performance?").

>The performance characteristics of this approach will
>vary depending on the thread package type: kernel, user,
>or hybrid, and, in my theory, should scale well on
>multiple processor machines. I haven't done any
>measurements yet to verify/negate that theory.

	Now where did I find those benchmarks about threaded processes running on
multiprocessor machines... :)

>One of the keys to good performance will be matching the
>number of threads to the load. Fewer threads mean
>less memory and less contention for shared resources.
>There are good techniques for minimizing contention.

	What types of methods are you using in your product?

>I don't have any performance comparisons on this
>approach, since I don't have anything to compare it
>with that is roughly equivalent in functionality (to
>my project) and which uses a different approach. Also my
>project is much more complex, with component
>registrations, routing, load-balancing, message-queuing,
>message prioritizing, hot plug/unplug of components,
>workflow, protocol-hopping, blah, blah, blah ...
>These kinds of features don't belong in an HTTP server
>and, poorly done, can really drag performance.

	Again, sounds a lot like ACE to me.  I'll give you the URL if you're
interested...

	Sorry about the delay in replying... I'll try to be a bit more prompt in
the future.
	Later...

								- Bret -