There is really little benefit for voice in using block ACKs to reduce the number of ACKs.
You will introduce an additional 20 ms of delay for every voice packet stored in the frame.
You can achieve nearly the same thing by using larger voice samples, say 30 or 40 ms, if most of your calls are internal.
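To put rough numbers on that trade-off, here is a minimal sketch. It assumes one packet per voice frame, one ACK per transmission, and that held packets wait until the last one in the burst has been generated. The function names are made up for illustration and do not come from any standard.

```python
def packets_per_second(frame_ms):
    """Packets generated per second, per direction, at one packet per frame."""
    return 1000 / frame_ms


def hold_delay_ms(frame_ms, packets_held):
    """Extra wait for the oldest packet when packets are held and sent together.

    Assumption: the first packet waits for the remaining ones to be generated.
    """
    return (packets_held - 1) * frame_ms


# Option A: 20 ms frames, two packets held and acknowledged together.
bursts_a = packets_per_second(20) / 2   # 25 acknowledged bursts per second
extra_a = hold_delay_ms(20, 2)          # 20 ms extra wait for the first packet

# Option B: 40 ms frames, one packet per transmission, no holding.
bursts_b = packets_per_second(40)       # 25 transmissions per second
extra_b = 40 - 20                       # 20 ms more packetization delay than 20 ms frames

print(bursts_a, extra_a, bursts_b, extra_b)
```

Under these assumptions both options halve the number of acknowledgements and cost about the same extra delay, which is why the larger sample size gets you nearly the same result.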
I wonder whether VoWiFi handsets will support block ACKs at all. I do not know what the standards require.
Has anyone seen the Wi-Fi VoWiFi for Enterprise certification spec yet?
I can't answer your last question directly. I worked with telephony systems that, although digital, still used hard-line (non-radio) connections. I dealt with many types of buffering, jitter, and lag conditions, but VoIP has even more that I have no direct programming experience with.
But I can tell you that, in addition to the digitized speech, it was (and is) possible, for example, to put many seconds of silence in a buffer that is only long enough for a couple hundred milliseconds of speech. So buffers that cover the same length of time are not all the same number of bytes long.
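As a rough illustration of why time length and byte length come apart, here is a small sketch. The record layout and the constants are assumptions chosen for the example (16 kbit/s coded speech, a 3-byte silence record), not the format my system actually used.

```python
SPEECH_BYTES_PER_MS = 2    # assumed: 16 kbit/s coded speech = 2 bytes per millisecond
SILENCE_RECORD_BYTES = 3   # assumed: 1 tag byte + 2-byte duration in milliseconds


def encoded_size(segments):
    """Bytes needed to store a list of (kind, duration_ms) segments."""
    total = 0
    for kind, duration_ms in segments:
        if kind == "speech":
            total += duration_ms * SPEECH_BYTES_PER_MS
        elif kind == "silence":
            # Silence is stored as a marker plus its duration, whatever its length.
            total += SILENCE_RECORD_BYTES
        else:
            raise ValueError(f"unknown segment kind: {kind}")
    return total


def playback_ms(segments):
    """Total playback time represented by the segments."""
    return sum(duration_ms for _, duration_ms in segments)


talk = [("speech", 200)]
pause = [("silence", 5000)]

print(playback_ms(talk), encoded_size(talk))    # 200 ms of speech takes 400 bytes
print(playback_ms(pause), encoded_size(pause))  # 5000 ms of silence takes 3 bytes
```

With this layout, five seconds of silence fits easily in a buffer sized for 200 ms of speech.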
My system used CVSD (Continuously Variable Slope Delta modulation), and on top of that, other things besides voice could be encoded, "silence" being one of them.
Another technique that could be used is to simply throw out the last so many milliseconds of a sentence; the human brain will fill in the missing piece without noticeable impairment.
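For anyone who has not met CVSD, here is a simplified sketch of the idea: one bit per sample, with a step size that grows when the last few bits agree (the coder is falling behind the signal) and shrinks otherwise. The multiplicative step rule and every parameter value below are assumptions made for illustration; real CVSD coders use their own tuned constants and a leaky integrator, and this is not the code from my system.

```python
def _adapt(step, history, run, grow, decay, min_step, max_step):
    """Grow the step after `run` identical bits in a row, otherwise let it decay."""
    recent = history[-run:]
    if len(recent) == run and len(set(recent)) == 1:
        return min(step * grow, max_step)
    return max(step * decay, min_step)


def cvsd_encode(samples, run=3, grow=1.5, decay=0.9, min_step=0.01, max_step=0.5):
    """Encode samples (floats, roughly -1..1) as a list of 0/1 bits."""
    bits, estimate, step = [], 0.0, min_step
    for sample in samples:
        bit = 1 if sample >= estimate else 0
        bits.append(bit)
        step = _adapt(step, bits, run, grow, decay, min_step, max_step)
        estimate += step if bit else -step
    return bits


def cvsd_decode(bits, run=3, grow=1.5, decay=0.9, min_step=0.01, max_step=0.5):
    """Rebuild the signal by replaying the same step adaptation as the encoder."""
    out, history, estimate, step = [], [], 0.0, min_step
    for bit in bits:
        history.append(bit)
        step = _adapt(step, history, run, grow, decay, min_step, max_step)
        estimate += step if bit else -step
        out.append(estimate)
    return out


print(cvsd_encode([0.0] * 4))  # [1, 0, 1, 0]: silence produces an alternating idle pattern
```

The alternating pattern on silent input is one reason silence is cheap to detect and to replace with a short marker instead of storing the raw bits.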
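In code, that trimming amounts to very little. This sketch assumes the talk spurt is already isolated as a list of samples; the function name, the 8 kHz sample rate, and the 50 ms trim length are arbitrary values picked for the example.

```python
def trim_tail(samples, sample_rate_hz, trim_ms):
    """Return the samples with the last `trim_ms` milliseconds dropped."""
    drop = sample_rate_hz * trim_ms // 1000
    keep = max(len(samples) - drop, 0)
    return list(samples[:keep])


one_second = list(range(8000))            # stand-in for 1 s of audio at 8 kHz
shorter = trim_tail(one_second, 8000, 50)
print(len(shorter))                       # 7600 samples remain after dropping 50 ms
```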
I hope that helps some.