Hi Vitaliy,
The 802.3ad standard requires that frames are never mis-ordered, so ALL frames that belong to a given flow are transmitted on a single link: every flow is distributed, and then pinned to the selected link, according to a hash function based on various L2, L3 or even L4 fields of the frame.
The distribution algorithm varies from vendor to vendor. You can generally count on at least a SA/DA XOR, but it may also be based on SrcIP/DstIP, L4 ports, etc. (f.e. on IOS >15 you can use WLB too).
Don't expect a LAG port group to forward EVERY frame in a round-robin fashion. A single conversation is limited to the bandwidth of one link. Yes, you can achieve higher link usage by increasing the number of conversations, but throughput does not scale linearly with the number of links.
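To make the pinning behavior concrete, here is a minimal Python sketch of a SA/DA XOR distribution. It is only an illustration: the function names and MAC addresses are made up, and real switches and NIC teaming drivers use their own vendor-specific hash, often on different bits or fields.

```python
def mac_to_int(mac: str) -> int:
    """Convert a MAC address such as '00:1b:21:aa:00:01' to an integer."""
    return int(mac.replace(":", "").replace("-", ""), 16)


def select_link(src_mac: str, dst_mac: str, n_links: int) -> int:
    """Pick an egress link index from the XOR of source and destination MAC.

    Simplified assumption: XOR the full addresses and take the result
    modulo the number of links. Actual vendor algorithms differ.
    """
    return (mac_to_int(src_mac) ^ mac_to_int(dst_mac)) % n_links


def distribute(flows, n_links):
    """Count how many (src, dst) flows land on each link."""
    counts = [0] * n_links
    for src, dst in flows:
        counts[select_link(src, dst, n_links)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical addresses, for illustration only.
    initiator = "00:1b:21:aa:00:01"
    target = "00:1b:21:bb:00:02"

    # One initiator talking to one target always hashes to the same link,
    # so that conversation never gets more than one link of bandwidth.
    print("single flow uses link", select_link(initiator, target, 4))

    # Many different sources talking to the same target can spread out.
    hosts = [f"00:1b:21:aa:00:{i:02x}" for i in range(1, 9)]
    print("flows per link:", distribute([(h, target) for h in hosts], 4))
```

The point of the sketch is that the link is a pure function of the addresses: the same pair always gives the same index, and how evenly many pairs spread out depends entirely on the addresses involved.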
For iSCSI, a native MPIO solution with multiple portals/targets/initiators is a better choice.
Another thing, this one specific to Windows 2008 or above:
a) if you have Broadcom hardware, UPDATE to the latest drivers/firmware!
b) disable every IP/TCP offload (TOE) feature (RSS, Chimney, etc.), since it's REALLY broken (combined with old Broadcom hw/fw/drivers it is deadly!)
c) configure the IP/TCP stack for more conservative behavior (no autotuning, congestion algorithm, etc.)
d) disable IPv6 from the registry if you don't use it
Regards,
Antonio