SCIMP has a one-way key derivation, similar to my Hash Chain Prefix (HCP) mode. Since the messages sent are expected to be short, I see no performance problem in re-keying the AES cipher over and over. You can check the “Forward Secrecy” section of the paper.
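To make the idea of a one-way key derivation concrete, here is a minimal sketch of a generic hash-chain ratchet. It is not SCIMP's actual derivation (see the paper for that) and not the HCP definition; SHA-256, the `b"chain"` label and the function names are my own assumptions for illustration.

```python
import hashlib
from typing import List


def next_key(key: bytes) -> bytes:
    """Advance the chain one step.

    SHA-256 and the b"chain" label are assumed choices for this sketch,
    not taken from SCIMP or HCP.
    """
    return hashlib.sha256(b"chain" + key).digest()


def message_keys(initial_key: bytes, count: int) -> List[bytes]:
    """Return one 32-byte key per message, each derived from the previous one."""
    keys = []
    key = initial_key
    for _ in range(count):
        keys.append(key)
        # In a real implementation the old key would be erased at this point,
        # so a later compromise cannot reveal keys of earlier messages.
        key = next_key(key)
    return keys
```

Because the hash cannot be inverted, someone who captures the current key can compute later keys but not earlier ones, which is the property that gives forward secrecy as long as old keys are erased.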
ZRTP also has some kind of forward secrecy, but only after the audio stream is closed. The key derivation function is explained here. So for ZRTP I would suggest changing the AES-CTR mode to HCCK, with counter re-hashing every 5K blocks and consecutive counters in between.
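As a rough illustration of that counter schedule, the sketch below re-hashes a secret state every 5000 blocks and uses consecutive counters in between. It is only one reading of the description above, not the HCCK specification: SHA-256, the 128-bit truncation, the seed and all names are assumptions, and the AES step itself is left out.

```python
import hashlib

REHASH_INTERVAL = 5000        # blocks between re-hashes ("5K" in the text)
MASK_128 = (1 << 128) - 1     # AES block size, so counters wrap at 128 bits


def counter_stream(seed: bytes, num_blocks: int):
    """Yield one 128-bit counter value per cipher block.

    Every REHASH_INTERVAL blocks the secret state is replaced by its hash
    (SHA-256 is an assumed choice) and a fresh base counter is taken from it;
    within an interval the counters are simply consecutive.
    """
    state = seed
    base = 0
    for block in range(num_blocks):
        offset = block % REHASH_INTERVAL
        if offset == 0:
            # One-way step: the previous state should be erased after this.
            state = hashlib.sha256(state).digest()
            base = int.from_bytes(state[:16], "big")
        yield (base + offset) & MASK_128
```

Each yielded value would be fed to AES as the counter block. The expensive hash runs only once per 5000 blocks, so the cost per block stays close to plain AES-CTR.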
For long conversations or for streaming real-time audio/video surveillance, it is much better to provide forward secrecy at the block level, or every minute using HCCK.