Re: [apache-plusplus] Process model ideas for C++ Apache.
Sorry I haven't kept up with the conversation. I caught a
cold and didn't look at mail for 2 days ...
Dean Gaudet wrote:
> >1. threads are self-dispatching through lightweight
> > synchronization such as mutexes, condition-variables,
> > or semaphores (collectively call a poll); no central
> > dispatcher, no main scheduling loop, no select().
>
> Synchronization sucks. Only the kernel knows when it is
> necessary and when it isn't (i.e. the kernel knows how
> many CPUs are in the box).
I should have been clearer here. Connection- and message-
records in their respective pools are protected by a mutex.
Threads are informed that a record needs processing by a
condition-variable. I don't understand the relevance of
the statement "Only the kernel knows ...".
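A minimal sketch of the mutex-plus-condition-variable scheme described
above, written with the C++ standard library rather than raw pthreads.
The names RecordPool, MessageRecord and process_all are hypothetical and
do not come from the original code; the close() step is an assumption
added so the worker threads can exit.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical message record; the single field is illustrative only.
struct MessageRecord {
    int id;
};

// A pool whose records are guarded by one mutex. Waiting threads sleep on
// a condition variable and use no CPU until a record needs processing.
class RecordPool {
public:
    void put(MessageRecord rec) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            ready_.push_back(rec);
        }
        cond_.notify_one();  // wake one sleeping worker
    }

    // Blocks until a record is available or the pool is closed.
    // Returns false only when the pool is closed and drained.
    bool take(MessageRecord& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return closed_ || !ready_.empty(); });
        if (ready_.empty()) return false;
        out = ready_.front();
        ready_.pop_front();
        return true;
    }

    // Assumed shutdown step: wakes every worker so it can exit.
    void close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        cond_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::deque<MessageRecord> ready_;
    bool closed_ = false;
};

// Starts `workers` self-dispatching threads, feeds them `count` records,
// and returns how many records were processed in total.
int process_all(int workers, int count) {
    assert(workers > 0 && count >= 0);
    RecordPool pool;
    std::mutex total_mutex;
    int total = 0;
    std::vector<std::thread> threads;
    for (int i = 0; i < workers; ++i) {
        threads.emplace_back([&] {
            MessageRecord rec;
            while (pool.take(rec)) {
                std::lock_guard<std::mutex> lock(total_mutex);
                ++total;
            }
        });
    }
    for (int i = 0; i < count; ++i) pool.put(MessageRecord{i});
    pool.close();
    for (auto& t : threads) t.join();
    return total;
}
```

There is no dispatcher loop here: each worker blocks inside take() and
is woken only when put() signals that a record is ready.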
> select() (well, rather poll()) will have to be there
> behind the scenes. You can't get rid of it and still
> implement the multiple fiber models
Your process-model paper defines "fiber" as a user-level
thread, as opposed to a kernel-level "thread" - are you
saying that user-level threads such as Chris Provenzano's
or Frank Mueller's pthreads packages use select() internally?
Does it matter if it's hidden behind the pthreads interface?
> >2. the number of threads are minimized by specializing and
> > abstracting threads by "function" ( accept()or, read()er,
> > write()r, processer, etc.),
>
> Ack! I'm already imagining the state-machine hell that
> Squid is ... That's awful to understand, ... This is why
> I keep emphasizing fibers ...
In the model I described, messages move from state-to-state
under explicit control of each thread (I would make it table-
driven next time around). There are only a few states
(acceptor->reader, reader->processor, processor->writer)
which I had no trouble following. I don't understand where
"fibers" enter in here, if they're abstracted behind the
pthreads interface.
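A table-driven version of the few transitions listed above might look
like the following sketch. The Stage enum, the kNextStage table and the
terminal Done state are assumptions for illustration; the original
implementation is not shown in this thread.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// The four thread functions named in the text; Done is an assumed
// terminal state for a message whose response has been written.
enum class Stage { Acceptor, Reader, Processor, Writer, Done };

// Transition table indexed by the current stage.
constexpr std::array<Stage, 5> kNextStage = {
    Stage::Reader,     // Acceptor  -> Reader
    Stage::Processor,  // Reader    -> Processor
    Stage::Writer,     // Processor -> Writer
    Stage::Done,       // Writer    -> Done (assumed)
    Stage::Done,       // Done stays Done
};

struct Message {
    Stage stage = Stage::Acceptor;
};

// Called by the thread that currently owns the message once its own
// step is finished; the table decides which thread type gets it next.
inline void advance(Message& msg) {
    const auto index = static_cast<std::size_t>(msg.stage);
    assert(index < kNextStage.size());
    msg.stage = kNextStage[index];
}

// Counts the hand-offs needed to take a message to Done.
inline int hops_to_done(Message msg) {
    int hops = 0;
    while (msg.stage != Stage::Done) {
        advance(msg);
        ++hops;
    }
    return hops;
}
```

Adding a new thread type then means adding one enum value and one table
row, with no change to advance().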
> >4. complete abstraction of messages (requests & responses)
> > from connections (network I/O) and the processing of the
> > messages. This scheme implies a bucket(s) of message records
> > in common or shared memory over which read()ers, and
> > write()rs, processors, and other thread-types can operate.
> > Bucket records are self-contained objects which know their
> > state and handle their own concurrency.
>
> There's that state word...
>
> If I understand you, you want to make the state explicit
> and managed by the programmer. By contrast, I really think
> state should be implicit and managed behind the scenes
> by the run-time.
Are we talking about the same "state" here? I'm referring to
the state of a message (where it is in the sequence of
processing steps that still need to be done). You seem to be
referring to CPU scheduling?
> So let me give an example why I think this is so damn
> difficult to use. ... You've got this Reader thread, and
> suppose it manages to read 1 byte. It doesn't know what
> that byte is, it just hands it off to the Processor
> pool. The processor pool immediately says "wtf? I need
> more than one byte!" and goes right back to the Reader pool.
> They go back and forth like this ... Every step along the
> way you've got mutexes being flipped around ...
Aha! We arrive at a fundamental misunderstanding and perhaps
a fundamental difference between the two types of projects (an
HTTP server and a MOM, message-oriented middleware). In
the MOM, the read()er knows the length of the incoming
message from a header preceding the message data. This has
significant advantages and at least one inefficiency (two
read()s are required: the header, then the data). The
advantages are that it's easy to allocate memory to hold the
data, and the read()er doesn't signal the processor until
it's done. Therefore, none of the mutex flipping you describe
happens: the condition-variable is signalled once for each
message. Another advantage is that the read()er can multiplex
messages: read() in several messages on the same connection.
Even if the length isn't known in advance (which complicates
memory allocation), as long as the read()er knows when it's
done, there is no need to "flip mutexes" more than one time.
I guess an HTTP server only knows it has the entire message
when the client closes the connection. Does Apache begin
processing a message before it's all there? Are there
byte-by-byte I/O optimizations?
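The length-prefixed reading described above can be sketched as follows.
The wire format (a 4-byte big-endian length followed by the data) and
the names FrameReader, encode_frame and signals_for_bytewise_feed are
assumptions for illustration; the post does not specify the header
layout. The point of the sketch is that even when bytes arrive one at a
time, the processor would be signalled once per complete message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

constexpr std::size_t kHeaderSize = 4;  // assumed header: 4-byte length

// Builds one frame: big-endian length, then the message data.
std::vector<std::uint8_t> encode_frame(const std::string& body) {
    const std::uint32_t n = static_cast<std::uint32_t>(body.size());
    std::vector<std::uint8_t> frame = {
        static_cast<std::uint8_t>(n >> 24), static_cast<std::uint8_t>(n >> 16),
        static_cast<std::uint8_t>(n >> 8), static_cast<std::uint8_t>(n)};
    frame.insert(frame.end(), body.begin(), body.end());
    return frame;
}

class FrameReader {
public:
    // Accepts whatever bytes one read() returned. Complete messages are
    // appended to `out`; the return value is how many were completed,
    // i.e. how many times the processor would be signalled.
    std::size_t feed(const std::uint8_t* data, std::size_t len,
                     std::vector<std::string>& out) {
        assert(data != nullptr || len == 0);
        buffer_.insert(buffer_.end(), data, data + len);
        std::size_t completed = 0;
        while (buffer_.size() >= kHeaderSize) {
            const std::size_t body = (std::size_t(buffer_[0]) << 24) |
                                     (std::size_t(buffer_[1]) << 16) |
                                     (std::size_t(buffer_[2]) << 8) |
                                     std::size_t(buffer_[3]);
            if (buffer_.size() < kHeaderSize + body) break;  // not all there yet
            const auto first = buffer_.begin() + static_cast<std::ptrdiff_t>(kHeaderSize);
            const auto last = first + static_cast<std::ptrdiff_t>(body);
            out.emplace_back(first, last);
            buffer_.erase(buffer_.begin(), last);
            ++completed;
        }
        return completed;
    }

private:
    std::vector<std::uint8_t> buffer_;  // bytes of the message in progress
};

// Worst case from the discussion: the bytes arrive one at a time.
std::size_t signals_for_bytewise_feed(const std::vector<std::uint8_t>& bytes,
                                      std::vector<std::string>& out) {
    FrameReader reader;
    std::size_t signals = 0;
    for (std::uint8_t b : bytes) signals += reader.feed(&b, 1, out);
    return signals;
}
```

The same loop also covers the multiplexing case: several frames that
arrive back to back on one connection are split into separate messages.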
> >I'd have to study the code and design docs again to identify my
> >various sins in regards to C++ conventions, but I believe the
> >main one's are these:
> >
> >3. class methods which return the address of private data,
> > especially aggregated structures for direct manipulation
> > outside the class methods.
> Doing that, of course, completely negates one of the advantages of C++
> (implementation hiding). Wouldn't inlined get/set methods get you the
> same performance benefit while still gaining you the ability to modify
> the implementation later?
My list of C++ sins inadvertently omitted a couple of important
items:
Besides aggregating data members into structures, each structure
representing a message-record (for example), the structures
are aggregated into arrays by the constructor (array size depends
on the capacity of the platform). So in effect
the message class is a complete message-queuing system. Methods
are provided to traverse/insert/delete/etc. in the queue. The
records in the queue are static, and the operations on them
can be very efficient since they are stored in arrays. I first tried
a more "object" approach - a container class, and was appalled
by the amount of memory allocation and memory copying for the
dynamically-sized queue, and by the traversal cost and memory
copying for the statically-sized queue.
Since the operations on the data are more complex than simple
get/set, the inline approach wasn't used. The ugliness of
returning addresses of private data members is two-fold: 1. the
implementation is exposed, but this is no different than using
structures in C, and 2. the referenced object may not be there
when referenced. Since the queue is static, this wasn't a
problem. This was a carefully considered compromise I made
between the C and C++ conventions at the time.
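A rough sketch of the compromise described above: a class that sizes
its record array once in the constructor and hands out addresses of its
private records. MessageQueue, MessageRecord and their members are
hypothetical names, not the original code, and the linear scan for a
free slot is a simplification.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical record layout; the fields are illustrative only.
struct MessageRecord {
    bool in_use = false;
    int length = 0;
};

class MessageQueue {
public:
    // All records are allocated once, here. The capacity is fixed
    // afterwards, so no record is ever moved or copied.
    explicit MessageQueue(std::size_t capacity) : records_(capacity) {
        assert(capacity > 0);
    }

    // Claims a free slot and returns its address, or nullptr when the
    // queue is full. The pointer stays valid for the queue's lifetime
    // because the array is never resized.
    MessageRecord* insert(int length) {
        for (MessageRecord& rec : records_) {
            if (!rec.in_use) {
                rec.in_use = true;
                rec.length = length;
                return &rec;
            }
        }
        return nullptr;
    }

    // Marks the slot free again; no memory is released.
    void remove(MessageRecord* rec) {
        assert(rec >= records_.data() &&
               rec < records_.data() + records_.size());
        rec->in_use = false;
    }

    // Number of slots currently in use.
    std::size_t size() const {
        std::size_t n = 0;
        for (const MessageRecord& rec : records_) {
            if (rec.in_use) ++n;
        }
        return n;
    }

private:
    std::vector<MessageRecord> records_;  // sized once, never resized
};
```

Callers manipulate the returned record directly, which exposes the
layout exactly as a C structure would; what keeps the pointer safe is
only that the storage is static for the life of the queue.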
In any case, my particular implementation has no bearing on
the process model I described - I could have easily done it
100% OO (actually, the first attempt was OO, and it was a real
dog!)
> > ... In my theory, no thread should ever be using CPU
> >unless they have real work to do.
>
> The implication I'm reading from this is that with an
> approach such as what I've got going with apache-nspr
> right now you believe that CPU is wasted by idle threads...
Nope, no such implication intended. I was merely trying to
define what I meant by "poll", which I had used ambiguously
before.
> But the truth is, I stopped caring about this level of
> performance detail. ... ... and present as "pthreads".
Well, we agree on this.
--
Mike Anderson
mka@redes.int.com.mx
+52 473 23730 voice/fax
Guanajuato, GTO, Mexico
"If it looks like a bug, waddles like a bug,
and quacks like a bug, its a quack!"