My To-Do List

Please note that the order here is arbitrary and does not denote relative priority.
Transparent Color (untouched).
Space and time coordinates. At first glance a track-bar seems easier, but in practice it is more awkward. You usually need a calculator to work out exactly where you want the logo, and the result is a number, so you then have to nudge the track-bar around until it lands on that position instead of simply typing it in.
Alpha blending (untouched, possibly using 255 as default value).
Fade. OK, this is the most perplexing one: you should simply remove the "out end" field. You already have the starting frame and the duration, so you should only need to type the number of fade-in and fade-out frames. If the duration is infinite and the fade-out is greater than 0, you could begin the fade-out just before the end of the video, according to the number of frames given, or not fade at all (if someone sets a fade-out on an infinite duration, it stands to reason he doesn't know what he's doing, and he could always set a duration for it to kick in)." [Yaron Gur]
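A minimal sketch of how the suggested fade scheme could work, with no "out end" field. Everything here is an assumption made for illustration and not the filter's actual code: the function name `logo_alpha`, its parameters, the use of `None` to stand for an infinite duration, and the linear ramp. It takes the first of the two options suggested above for an infinite duration, fading out just before the end of the video.

```python
def logo_alpha(frame, start, duration, fade_in, fade_out, total_frames, max_alpha=255):
    """Return the logo opacity (0..max_alpha) for one frame number."""
    # duration=None is this sketch's stand-in for an "infinite" duration:
    # the logo then stays up until the last frame of the video.
    if duration is None:
        end = total_frames
    else:
        end = min(start + duration, total_frames)

    # Outside the [start, end) range the logo is not shown at all.
    if frame < start or frame >= end:
        return 0

    alpha = max_alpha
    elapsed = frame - start        # frames since the logo appeared
    remaining = end - 1 - frame    # frames left after this one

    # Linear ramp up over the first fade_in frames.
    if fade_in > 0 and elapsed < fade_in:
        alpha = min(alpha, max_alpha * (elapsed + 1) // fade_in)
    # Linear ramp down over the last fade_out frames.
    if fade_out > 0 and remaining < fade_out:
        alpha = min(alpha, max_alpha * (remaining + 1) // fade_out)
    return alpha
```

With a start frame of 10, a duration of 20 and five frames each of fade-in and fade-out, the logo ramps up over frames 10 to 14, is fully opaque through frame 25, and ramps down over frames 25 to 29, so the user never has to compute an "out end" frame by hand.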
Also: "What I'm wondering is: would it be possible to convert the Decimate filter into a frame-serving application that reads a Telecide-output AVI file frame served by VirtualDub, decimates it, and in turn frame serves the result to another instance of VirtualDub (or any other program, for that matter)? That would allow IVTC to be done in one step." [Yaron Gur]
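To illustrate the idea of decimation as a streaming stage that sits between a producer and a consumer, here is a rough sketch using a Python generator. It is not the Decimate filter's code and does not touch VirtualDub's frame-serving interface. The function name `decimate`, the representation of a frame as a list of pixel values, and the rule of dropping the frame most similar to its predecessor in each cycle of five are all assumptions made for this example.

```python
def decimate(frames, cycle=5):
    """Yield frames from an iterable, dropping one near-duplicate per cycle."""
    prev = None
    group = []  # (frame, difference to the preceding frame) pairs
    for frame in frames:
        if prev is None:
            diff = float("inf")  # the very first frame has no predecessor
        else:
            diff = sum(abs(a - b) for a, b in zip(frame, prev))
        group.append((frame, diff))
        prev = frame
        if len(group) == cycle:
            # Drop the frame that differs least from the one before it.
            drop = min(range(cycle), key=lambda i: group[i][1])
            for i, (kept, _) in enumerate(group):
                if i != drop:
                    yield kept
            group = []
    # An incomplete trailing cycle is passed through unchanged.
    for kept, _ in group:
        yield kept
```

Because the stage only ever holds one cycle of frames, it can consume the output of an upstream stage and feed a downstream one without writing an intermediate file, which is the one-step behaviour asked for above.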
Now, there are some tricks to pulling this off, the least of which is that the accumulators have to be 64-bit and have to be distinct from the source pixels. I suspect that the filter is already thrashing the L1 cache by accessing 5+ rows simultaneously, so I'm not sure whether it would be slower or faster than the current implementation, given that it requires 2.5x the memory bandwidth. There is also the option of doing two or four pixels in parallel in the horizontal direction, although that is a low-level optimization rather than an algorithmic one, and it only works well for large kernels.
Also, other speedup tricks could be tried, such as using a fixed window array." [Avery Lee]
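The note above does not say which algorithm is meant. One common technique that fits the description (accumulators kept separate from the source pixels and wider than a pixel) is a running-sum box average, sketched below as an illustration only. The function name `vertical_box_average`, the list-of-rows image layout, and the shrinking window at the top and bottom edges are assumptions of this sketch, not details from the quote.

```python
def vertical_box_average(image, radius):
    """Average each pixel with its vertical neighbours within `radius` rows."""
    height, width = len(image), len(image[0])
    # One accumulator per column, stored apart from the source pixels.
    # Python ints cannot overflow; in C these would need to be wider
    # than the pixel type.
    acc = [0] * width
    rows_in_window = 0

    # Prime the window for row 0: rows 0..radius, clipped to the image.
    for y in range(min(radius + 1, height)):
        for x in range(width):
            acc[x] += image[y][x]
        rows_in_window += 1

    out = []
    for y in range(height):
        out.append([a // rows_in_window for a in acc])
        # Slide the window down one row: add the entering row,
        # subtract the leaving row.
        entering = y + radius + 1
        leaving = y - radius
        if entering < height:
            for x in range(width):
                acc[x] += image[entering][x]
            rows_in_window += 1
        if leaving >= 0:
            for x in range(width):
                acc[x] -= image[leaving][x]
            rows_in_window -= 1
    return out
```

Each output row costs one row added and one row subtracted regardless of the radius, at the price of reading and writing the accumulator array as well as the pixels.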