











What are you talking about? RemoveOutputFilter .shtml DOESN'T REMOVE A FILTER!!!
It eliminates the association between .shtml and whatever filters had otherwise been configured with AddOutputFilter somefilters .shtml. It works _exactly_ like every other AddSomething/RemoveSomething directive for MIME association.
Notice that your DeleteFilter does exactly that: it would delete a filter added by ANY METHOD :) So that's pretty unambiguous. DeleteOutputFilter Includes would take out that filter whether it was added by AddOutputFilter, AddOutputFilterByType, SetOutputFilter, or InsertOutputFilter. Pretty powerful, very useful.
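
To make the distinction concrete, here is a toy model in C, a sketch only. Everything in it (ext_assoc_t, filter_list_t, remove_association, delete_filter) is a hypothetical name invented for illustration and is not httpd code. It assumes extension associations and the per-request filter list are stored separately, so that dropping an association leaves filters added by other means untouched, while deleting by name removes the filter no matter who added it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_FILTERS 8

/* One extension-to-filter association, as AddOutputFilter would record it. */
typedef struct {
    const char *ext;
    const char *filter;   /* NULL once the association has been removed */
} ext_assoc_t;

/* The filters in effect for one request, whatever directive added them. */
typedef struct {
    const char *names[MAX_FILTERS];
    size_t count;
} filter_list_t;

/* RemoveOutputFilter-style: forget what the extension maps to. */
void remove_association(ext_assoc_t *table, size_t n, const char *ext)
{
    size_t i;

    assert(table != NULL && ext != NULL);
    for (i = 0; i < n; i++) {
        if (strcmp(table[i].ext, ext) == 0) {
            table[i].filter = NULL;
        }
    }
}

/* DeleteOutputFilter-style (hypothetical): drop the named filter from the
 * list regardless of how it got there. Returns how many were removed. */
size_t delete_filter(filter_list_t *list, const char *name)
{
    size_t i, kept = 0, removed;

    assert(list != NULL && name != NULL);
    for (i = 0; i < list->count; i++) {
        if (strcmp(list->names[i], name) != 0) {
            list->names[kept++] = list->names[i];
        }
    }
    removed = list->count - kept;
    list->count = kept;
    return removed;
}
```

Calling remove_association(table, n, ".shtml") changes only the table; a filter already placed in a filter_list_t by some other route stays there until delete_filter names it.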



In mod_perl 2.0 we register only four filter names (input and output, for both request and connection), then install the actual Perl callbacks under one of
these four names, storing each filter's callback information in f->ctx. If we later want to do something with an inserted filter, we have no API to find it,
since it is identified not by name but by the data inside f->ctx. Therefore we need a new API to traverse the filter chain and find what we want using a custom callback.

Here is the API and implementation that I came up with (it needs the docco and the standard DECLARE stuff, which I ask to disregard for now and concentrate
on the API/implementation itself; once it's polished I'll post a complete patch).

The function that I'm requesting to add is somewhat similar to apr_table_do.

typedef int (ap_filter_chain_traverse_fh_t)(void *data, ap_filter_t *f);
int ap_filter_chain_traverse(ap_filter_chain_traverse_fh_t *traverse,
                             void *data, ap_filter_t **chain);

int ap_filter_chain_traverse(ap_filter_chain_traverse_fh_t *traverse,
                             void *data, ap_filter_t **chain)
{
    int rv = 0;
    ap_filter_t *curr = *chain;

    /* stop at the first filter for which the callback returns non-zero */
    while (curr) {
        if ((rv = (*traverse)(data, curr)) != 0) {
            return rv;
        }
        curr = curr->next;
    }
    return rv;
}

I'm not sure about the chain argument. util_filter.c looks in both the proto_ chain and the normal chain. Could that be a problem on the caller's side?
Wouldn't it be enough to pass one of r->output_filters, c->output_filters, r->input_filters, or c->input_filters if all I care about is the custom filters?
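
For readers without the httpd headers at hand, the callback contract can be tried out on a mock chain. The sketch below is self-contained and makes no claim about the real types: mock_filter_t, chain_traverse, label_search_t, and match_ctx_label are hypothetical stand-ins, and the ctx field simply holds a string. It shows only the shape of the proposal: the traversal stops at the first non-zero return, and the callback matches on data in ctx instead of the filter name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for ap_filter_t: a registered name shared by many
 * filters, plus per-instance data in ctx. */
typedef struct mock_filter_t {
    const char *name;             /* registered filter name */
    void *ctx;                    /* per-instance data (a string here) */
    struct mock_filter_t *next;
} mock_filter_t;

typedef int (chain_traverse_fn)(void *data, mock_filter_t *f);

/* Call fn on each filter in order; stop at the first non-zero return
 * and hand that value back to the caller. */
int chain_traverse(chain_traverse_fn *fn, void *data, mock_filter_t **chain)
{
    int rv = 0;
    mock_filter_t *curr;

    assert(fn != NULL && chain != NULL);
    for (curr = *chain; curr != NULL; curr = curr->next) {
        if ((rv = fn(data, curr)) != 0) {
            break;
        }
    }
    return rv;
}

/* Search state: what to look for, and where the match ends up. */
typedef struct {
    const char *wanted_label;
    mock_filter_t *found;
} label_search_t;

/* Callback: match on the string stored in ctx, not on the filter name. */
int match_ctx_label(void *data, mock_filter_t *f)
{
    label_search_t *search = data;

    if (f->ctx != NULL
        && strcmp((const char *)f->ctx, search->wanted_label) == 0) {
        search->found = f;
        return 1; /* non-zero ends the traversal */
    }
    return 0;
}
```

A caller fills a label_search_t, passes it as data, and reads found afterwards; a return of 0 from chain_traverse means no filter matched.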

Here is an example of usage by mod_perl 2.0. This is an implementation of a function that will remove a filter by its perl handler name, e.g.:


There is a wrapper that translates this Perl-side call to C's modperl_filter_remove_by_handler_name().

typedef struct {
    char* filter_name;
    char* handler_name;
    ap_filter_t* f;
} filter_chain_traverse_t;

static int find_filter_by_handler_name(void *data, ap_filter_t *f)
{
    apr_pool_t* p = f->r ? f->r->pool : f->c->pool;
    filter_chain_traverse_t* traverse = (filter_chain_traverse_t*)data;
    char *normalized_name;

    /* 'name' in frec is always lowercased */
    normalized_name = apr_pstrdup(p, traverse->filter_name);
    ap_str_tolower(normalized_name);

    /* skip non-mod_perl filters */
    if (strNE(f->frec->name, normalized_name)) {
        return 0;
    } else {
        modperl_filter_ctx_t *ctx = f->ctx;
        if (strEQ(ctx->handler->name, traverse->handler_name)) {
            traverse->f = f;
            return 1; /* found what we wanted */
        }
    }

    return 0;
}
/* modperl_filter_remove_by_handler_name(aTHX_ r, c,
 *                                       "MyFilter::output_lc")
 */
void modperl_filter_remove_by_handler_name(pTHX_ request_rec *r,
                                           conn_rec *c,
                                           modperl_filter_mode_e mode,
                                           char* handler_name)
{
    int rv = 0;
    apr_pool_t *pool = r ? r->pool : c->pool;
    ap_filter_t *f;
    /* allocate the whole struct, not just a pointer's worth */
    filter_chain_traverse_t *traverse =
        apr_pcalloc(pool, sizeof(*traverse));

    /* XXX: generalize for conn/req in/out */
    traverse->filter_name = MP_FILTER_REQUEST_OUTPUT_NAME;
    traverse->handler_name = handler_name;

    rv = ap_filter_chain_traverse(find_filter_by_handler_name, traverse,
                                  &r->output_filters); /* XXX: generalize */
    if (rv) {
        f = traverse->f; /* XXX: validate */
        MP_TRACE_f(MP_FUNC, "found filter handler %s\n", handler_name);
    }
    else {
        Perl_croak(aTHX_ "unable to find filter handler '%s'\n", handler_name);
    }

    MP_TRACE_f(MP_FUNC, "removing filter %s\n", handler_name);
    if (mode == MP_INPUT_FILTER_MODE) {
        ap_remove_input_filter(f);
    }
    else {
        ap_remove_output_filter(f);
    }
}
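
On the earlier question about the chain argument: one reason to take a pointer to the chain head shows up as soon as the filter being removed is the first one. The following self-contained sketch uses a mock chain to illustrate find-and-unlink in one pass. mock_filter_t and remove_by_handler_name are hypothetical names, and handler_name stands in for ctx->handler->name; none of this is the mod_perl or httpd implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a filter whose identity lives outside its name. */
typedef struct mock_filter_t {
    const char *name;           /* shared registered name */
    const char *handler_name;   /* stands in for ctx->handler->name */
    struct mock_filter_t *next;
} mock_filter_t;

/* Unlink the first filter whose registered name and handler name both match.
 * Returns the unlinked filter, or NULL when nothing matched. */
mock_filter_t *remove_by_handler_name(mock_filter_t **chain,
                                      const char *filter_name,
                                      const char *handler_name)
{
    mock_filter_t **link;

    assert(chain != NULL && filter_name != NULL && handler_name != NULL);
    for (link = chain; *link != NULL; link = &(*link)->next) {
        mock_filter_t *f = *link;

        if (strcmp(f->name, filter_name) != 0) {
            continue; /* skip filters registered under another name */
        }
        if (strcmp(f->handler_name, handler_name) == 0) {
            *link = f->next; /* also updates the head when f is first */
            f->next = NULL;
            return f;
        }
    }
    return NULL;
}
```

Because link walks over the next pointers themselves, removing the head needs no special case: the caller's chain variable is rewritten in place.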


Exactly. filter_init has a very specific purpose. Removing it would leave certain classes of filters (like the now-removed PHP httpd-2.x filter) unable to
work correctly. -- justin


As long as you can remove all filters in a subdirectory by specifying an empty 'SetOutputFilter' (or maybe even 'ClearOutputFilter'), the command
should be renamed to SetOutputFilterChain, as you are setting/removing the entire chain.
The other issue I have with filters is 'enabling' them. I looked at the INCLUDES and CASE filters, and they both have a method of turning them on (either by
allowing includes, in mod_include's case, or by CaseFilter On).
We need to come to agreement on whether we need to do this. Setting the filter in the chain should enable it. Period.



This isn't possible. It was my first design, and it just doesn't work. First of all, figuring out which state you are in is hard at the
http_filter level, because you really don't know what the headers look like. In the future, I expect http_filter to be the thing that actually
generates headers_in, and then all the other filters will be installed on top of that.
The problem with having four different filters is: where does the brigade go when we are switching filters? Since the brigade has to sit in
somebody's ctx pointer, it needs to be associated with a filter. Because the core_filter doesn't know how much data to pass up at a time, the
brigade will almost always end up sitting in http_filter's ctx pointer. If you try to remove that filter, the brigade disappears, and you
can't continue the request.
As far as http_filter switching on the state, that requires http_filter really knowing the protocol, because otherwise it can't determine what
state it is in. There is no communication between the request_rec and the http_filter. Since the request_rec is the thing that knows the state, the
http_filter doesn't. The only solution, is to add an http_body filter, which is also a bit buggy, because that assumes that http_filter will
never be called during body processing, which is just incorrect. As soon as we need data from the socket, we will call back down the stack, and
http_filter will need to know if we are in body or header state.
This is just wrong. I'm sorry. The filters were always designed to be one way and one way only. It is not possible for our filters to push data
back down. The thing is that, currently, we have a hack because I want to make progress. Getline is bogus. http_filter should be the entity
creating the request_rec and filling it out. That solves this whole issue, because then http_filter could really easily determine
which state it is in, and it can act accordingly. Plus, it makes adding different protocols as easy as replacing http_filter.
Those issues are gone now. The http_filter solved them. Please read what I wrote above, and then let me know if it still doesn't make sense as to
what some of the issues are here. The input filtering is non-intuitive, because we are operating on data on the way back up the stack. The
problem with that is that it makes it much harder to add and remove filters while filtering data.
We can add/remove filters based on URI, but only at certain times. The problem with your patch from last week is that it makes input filtering
look like output filtering, even though they are dramatically different to the end user.


How I Think Filters Should Work

Let me be more precise. I'm not saying that we shouldn't use brigades. What I'm saying is we shouldn't be dealing with specific types
of data at this level. Right now, by requiring a filter to request "bytes" or "lines", we are seriously constraining the performance of
the filters. A filter should only inspect the types of the buckets it retrieves and then move on. The bytes should only come into play once
we have actually retrieved a bucket of a certain type that we are able to process.

Furthermore, we should be using a dynamic type system, and liberally creating new bucket types as we invent new implementations. Filters need
not know which filters are upstream or downstream from them, but they should have been strategically placed to consume certain buckets from
upstream filters and to produce certain buckets required by downstream filters.

[Warning: long-winded brainstorm follows:]

I want a typical filter chain to look like this:

input_source ---> protocol filters --> sub-protocol filters --> handlers

an input socket would produce this:


an http header parser filter would produce these:

DATA (extra data read past headers)

an http request parser would only work at the request level, performing
dechunking, dealing with content-length, and dealing with pipelined
requests. It would produce these:

... and so on

a multipart input handler would then pass all types except BODY_DATA,
which it could use to produce:


or a magic mime filter could simply buffer enough BODY_DATA buckets until
it knew the type, prepending a MIME_TYPE to the front and sending
the whole thing downstream.


The basic pattern for any input filter (which is pull-based at the moment in Apache) would be the following:

1. retrieve next "abstract data unit"
2. inspect "abstract data unit", can we operate on it?
3. if yes, operate_on(unit) and pass the result to the next filter.
4. if no, pass the current unit to the next filter.
5. go to #1

In this model, the operate_on() behavior has been separated from the mechanics of passing data around. I believe this would improve filter
performance as well as simplify the implementation details that module authors must understand. I also think this would dramatically
improve the extensibility of the Apache Filters system.
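
The five-step pattern above can be sketched in C as follows. This is only an illustration under stated assumptions: the unit types (UNIT_HEADER, UNIT_BODY_DATA, UNIT_EOS), the source_t pull interface, and the upper-casing operate_on() are all invented here and do not correspond to any existing bucket type or httpd API. The filter inspects only the type of each unit and touches bytes only for the type it handles; everything else passes through untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical unit types; the proposal leaves the real set open-ended. */
typedef enum { UNIT_HEADER, UNIT_BODY_DATA, UNIT_EOS } unit_type_t;

typedef struct {
    unit_type_t type;
    char text[32];
} unit_t;

/* A pull source: fills *out with the next unit, returns 0 when exhausted. */
typedef struct source_t {
    int (*next)(struct source_t *self, unit_t *out);
    void *state;
} source_t;

/* Array-backed source used as the upstream end of the chain. */
typedef struct {
    const unit_t *units;
    size_t count;
    size_t pos;
} array_state_t;

int array_next(source_t *self, unit_t *out)
{
    array_state_t *st = self->state;

    if (st->pos >= st->count) {
        return 0;
    }
    *out = st->units[st->pos++];
    return 1;
}

/* Filter state: the upstream source plus the one type this filter handles. */
typedef struct {
    source_t *upstream;
    unit_type_t handles;
} upcase_state_t;

/* operate_on(): the only part that touches bytes. */
static void operate_on(unit_t *u)
{
    size_t i;

    for (i = 0; u->text[i] != '\0'; i++) {
        u->text[i] = (char)toupper((unsigned char)u->text[i]);
    }
}

/* Steps 1-4 of the pattern; the caller's read loop supplies step 5. */
int upcase_next(source_t *self, unit_t *out)
{
    upcase_state_t *st = self->state;

    assert(st->upstream != NULL);
    if (!st->upstream->next(st->upstream, out)) {   /* 1. retrieve */
        return 0;
    }
    if (out->type == st->handles) {                 /* 2. inspect */
        operate_on(out);                            /* 3. operate */
    }
    return 1;                                       /* 3./4. pass it on */
}
```

Stacking filters is then a matter of pointing one filter's upstream at another source_t; the mechanics of moving units stay identical for every filter, and only operate_on() differs.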

[Sorry for the long brain dump. Some of these ideas have been floating around in my head for a long time. When they become clear enough I will
write up a more formal and concise proposal on how I think the future filter system should work (possibly for 2.1 or beyond). I think the
apr-serf project is a perfect place to play with some of these ideas. I would appreciate any constructive comments on the above.]


That's fine, as long as you ensure that the retrieval can be bounded. When the HTTP processor realizes that it can only read 100 more bytes from the
next-filter, then you're outside of "abstract data unit" and into "concrete 100 bytes."

Due to the presence of the Upgrade: header, an HTTP processing filter must always be per-request, and must never read past the end of its request. That
enforces a number of limitations on your design.

[Unless you go for "pushback", which I believe is a poor design.]

What would be neat is to have a connection-level filter that does HTTP processing, but can be signalled to morph itself into a simple buffer. For
example, let's say that filter pulls 10k from next-filter ("pull" here, remember). It parses up the data into some headers and a 500 byte body. It
has 9k leftover, which it holds to the side.

Now, the request processor sees an "Upgrade" and switches protocols to something else entirely. The connection filter gets demoted to a simple
buffer, returning the 9k without processing. When the buffer is empty, it removes itself from the filter stack.
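
A rough sketch of that demotion, as a small state machine in C. All names here (conn_filter_t, conn_filter_stash, conn_filter_on_upgrade, conn_filter_read) are hypothetical, the fixed 64-byte buffer is an arbitrary choice for the example, and HTTP parsing itself is left out; the sketch assumes the over-read bytes have already been set aside. It shows the three stages: parsing, draining leftover bytes unprocessed, and dropping out of the stack once empty.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum { MODE_HTTP, MODE_BUFFER, MODE_REMOVED } conn_mode_t;

typedef struct {
    conn_mode_t mode;
    char leftover[64];   /* bytes pulled from upstream but not yet consumed */
    size_t len;
    size_t pos;
} conn_filter_t;

/* Pretend the HTTP parser over-read: stash what follows the request. */
void conn_filter_stash(conn_filter_t *f, const char *extra, size_t n)
{
    assert(n <= sizeof(f->leftover));
    memcpy(f->leftover, extra, n);
    f->len = n;
    f->pos = 0;
}

/* Called on a protocol switch: stop parsing, keep the bytes. With nothing
 * left over, the filter can leave the stack immediately. */
void conn_filter_on_upgrade(conn_filter_t *f)
{
    if (f->mode == MODE_HTTP) {
        f->mode = f->pos < f->len ? MODE_BUFFER : MODE_REMOVED;
    }
}

/* In buffer mode, hand back raw leftover bytes; once drained, mark the
 * filter removed so the caller reads from the next filter directly.
 * Returns the number of bytes copied into out. */
size_t conn_filter_read(conn_filter_t *f, char *out, size_t max)
{
    size_t n;

    if (f->mode != MODE_BUFFER) {
        return 0; /* HTTP parsing is out of scope for this sketch */
    }
    n = f->len - f->pos;
    if (n > max) {
        n = max;
    }
    memcpy(out, f->leftover + f->pos, n);
    f->pos += n;
    if (f->pos == f->len) {
        f->mode = MODE_REMOVED;
    }
    return n;
}
```

In a real stack MODE_REMOVED would correspond to the filter unlinking itself; here it is just a flag the caller can check.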

The implication here is that filters need to register with particular hooks in the server. In particular, with a hook to state that a protocol change
has occurred on <this> connection (also implying an input and an output filter stack). The protocol-related filters in the stack can then take
appropriate action (in the above example, to disable HTTP processing and just be a buffer). Other subsystems may have also registered with the hook
and will *install* new protocol handler filters.

You could even use this protocol-change hook to set up the initial HTTP processing filters. Go from "null" protocol to "http", and that installs
your chunking, http processing, etc. It could even be the mechanism which tells the MPM to call ap_run_request (??) (the app-level thing which starts
sucking input from the filter stack and processing it).
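
One way such a protocol-change hook might look, sketched in plain C. This is not httpd's hook machinery: mock_conn_t, hook_protocol_change, run_protocol_change, and install_http_filters are hypothetical names, the filter names "http_in" and "chunk_out" are placeholders, and the fixed-size global table is a simplification. The sketch covers the "null" to "http" bootstrap case described above, where a subscriber reacts to the switch by installing filters.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_PROTOCOL_HOOKS 8
#define MAX_INSTALLED 4

/* Hypothetical connection record: tracks the active protocol and the
 * names of filters that hooks have installed. */
typedef struct {
    const char *protocol;
    const char *installed[MAX_INSTALLED];
    size_t installed_count;
} mock_conn_t;

typedef void (protocol_change_fn)(mock_conn_t *c, const char *from,
                                  const char *to);

static protocol_change_fn *hooks[MAX_PROTOCOL_HOOKS];
static size_t hook_count;

/* Register a callback; returns 0 on success, -1 when the table is full. */
int hook_protocol_change(protocol_change_fn *fn)
{
    assert(fn != NULL);
    if (hook_count == MAX_PROTOCOL_HOOKS) {
        return -1;
    }
    hooks[hook_count++] = fn;
    return 0;
}

/* Switch protocols and notify every registered subsystem in order. */
void run_protocol_change(mock_conn_t *c, const char *to)
{
    const char *from = c->protocol;
    size_t i;

    c->protocol = to;
    for (i = 0; i < hook_count; i++) {
        hooks[i](c, from, to);
    }
}

/* Example subscriber: installs HTTP filters when leaving "null". */
void install_http_filters(mock_conn_t *c, const char *from, const char *to)
{
    if (strcmp(from, "null") == 0 && strcmp(to, "http") == 0) {
        assert(c->installed_count + 2 <= MAX_INSTALLED);
        c->installed[c->installed_count++] = "http_in";
        c->installed[c->installed_count++] = "chunk_out";
    }
}
```

A filter that wants to demote itself on an upgrade would register its own callback the same way and react when from is "http".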
