I have been really upset by my recent failure with the 3D DOF effect (too slow if an MSAA FBO is used), and especially by the 7-day-long mess of rolling the FBO setup back to what it was before my 3D exercises started.
Now that the setup is restored, I've decided to go slowly but steadily. First, I'm implementing simple 1-pass 2D post-processors that need only one color FBO (if used one at a time) and are compatible with the current MSAA settings -- so that you have something to play with too. (FBOs are new to me, and I'm developing as I learn them.)
The next step will be figuring out how to chain them one after another in an arbitrary order to get compound post-processing without a dramatic drop in the frame rate. GLSL shaders are compiled by the driver at run time, and the language has a C-style preprocessor, so parts of the shader code can be switched on and off with #ifdef directives. So, I think we can use this feature to compile a dedicated compound PP shader for each model at its load time, based on exactly which PP effects the model needs.
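Here is a rough sketch of how the load-time part could look on the host side. It is only an illustration under my own assumptions: the effect flags (FX_*), the macro names (PP_*), the helper pp_build_preamble() and the tiny GLSL 1.20 shader body are all made up for the example and are not taken from the actual engine code. The idea is simply to put a block of #define lines in front of a fixed shader body, so that the GLSL preprocessor keeps only the effects the model asked for.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical effect flags: a model would carry a bit mask of the PP effects it needs. */
enum { FX_GRAYSCALE = 1 << 0, FX_INVERT = 1 << 1, FX_VIGNETTE = 1 << 2 };

/* Maps each flag to the macro name that the shader body tests with #ifdef. */
static const struct { unsigned flag; const char *macro; } pp_table[] = {
    { FX_GRAYSCALE, "PP_GRAYSCALE" },
    { FX_INVERT,    "PP_INVERT"    },
    { FX_VIGNETTE,  "PP_VIGNETTE"  },
};

/* Illustrative fragment shader body (GLSL 1.20 style). It has no #version line
   of its own, because #version must come first and is emitted in the preamble. */
const char *const pp_fragment_body =
    "uniform sampler2D scene;\n"
    "void main() {\n"
    "    vec2 uv = gl_TexCoord[0].st;\n"
    "    vec4 c = texture2D(scene, uv);\n"
    "#ifdef PP_GRAYSCALE\n"
    "    c.rgb = vec3(dot(c.rgb, vec3(0.299, 0.587, 0.114)));\n"
    "#endif\n"
    "#ifdef PP_INVERT\n"
    "    c.rgb = vec3(1.0) - c.rgb;\n"
    "#endif\n"
    "#ifdef PP_VIGNETTE\n"
    "    c.rgb *= 1.0 - 0.5 * length(uv - vec2(0.5));\n"
    "#endif\n"
    "    gl_FragColor = c;\n"
    "}\n";

/* Writes "#version 120" plus one "#define NAME" line per requested effect into out.
   Returns the number of characters written, or -1 if the buffer is too small. */
int pp_build_preamble(unsigned mask, char *out, size_t cap)
{
    size_t i, len;
    int n = snprintf(out, cap, "#version 120\n");
    if (n < 0 || (size_t)n >= cap) return -1;
    len = (size_t)n;
    for (i = 0; i < sizeof pp_table / sizeof pp_table[0]; ++i) {
        if (!(mask & pp_table[i].flag)) continue;
        n = snprintf(out + len, cap - len, "#define %s\n", pp_table[i].macro);
        if (n < 0 || (size_t)n >= cap - len) return -1;
        len += (size_t)n;
    }
    return (int)len;
}

/* Usage at model load time (needs a live GL context, so shown as a comment only):
 *
 *     char preamble[256];
 *     pp_build_preamble(FX_GRAYSCALE | FX_VIGNETTE, preamble, sizeof preamble);
 *     const char *src[2] = { preamble, pp_fragment_body };
 *     glShaderSource(shader, 2, src, NULL);   // strings are concatenated in order
 *     glCompileShader(shader);
 */
```

Since glShaderSource() accepts an array of strings, the preamble and the body never have to be glued together by hand. The compiled program could also be cached per flag mask, so that models asking for the same effect combination share one shader.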
This will allow us to avoid multiple conditional branches in the shader(s). Note that there are ca. 2M screen pixels to process in each of our render frames, and the entire shader is run not just once per frame but once for every pixel, i.e. about 2M times per frame. So if there are even as few as 5 if-clauses in the shader, the GPU will have to evaluate over 10M conditional branches in each frame, with all the associated branching overhead and stalls. And this is really heavy even on the most modern hardware! If real time recompilation proves feasible and effective, then a similar recompilation optimization should be introduced in our main rendering uber-shader, too.
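To put a number on the estimate above, here is a tiny helper; the function name is made up for the example, and the 1920x1080 resolution is my assumption for what "ca. 2M pixels" stands for. With that resolution and 5 if-clauses it gives 10,368,000 branch evaluations per frame.

```c
#include <stdint.h>

/* Number of conditional branches evaluated per frame when a fragment shader
   with branches_per_pixel if-clauses runs once for every screen pixel. */
uint64_t branch_evals_per_frame(uint32_t width, uint32_t height,
                                uint32_t branches_per_pixel)
{
    return (uint64_t)width * height * branches_per_pixel;
}

/* Example (assumed full HD screen): branch_evals_per_frame(1920, 1080, 5)
   is 1920 * 1080 * 5 = 10368000, i.e. a little over 10M per frame. */
```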
Dynamic soft shadowing and bloom/glow effects require juggling several FBOs at once in several passes. That's why I'll start working on them only after I'm through with the simple 1-FBO 1-pass 2D post-processor chain.