Comware

  • 1.  FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 23, 2017 07:18 PM

    Hi,

    I have two FlexFabric 5700 switches in an IRF fabric. I configured three Bridge-Aggregation groups spanning both chassis, with link-aggregation mode dynamic. The IRF fabric works fine, but I have a throughput problem towards my Linux hypervisor.

    I use an OVS bond to connect 2 x 10-Gigabit interfaces across the two switches.

    display link-aggregation verbose Bridge-Aggregation 1

    shows two Selected ports on all LAGs.

    My two 10-Gigabit links are up and running, but ovs-appctl bond/show bond0 shows:

    ---- bond0 ----
    bond_mode: balance-slb
    bond may use recirculation: no, Recirc-ID : -1
    bond-hash-basis: 0
    updelay: 0 ms
    downdelay: 0 ms
    next rebalance: 1986 ms
    lacp_status: negotiated
    active slave mac: a0:36:9f:f2:c4:b8(eth4)

    slave eth4: enabled
    active slave
    may_enable: true
    hash 203: 45396 kB load

    slave eth5: enabled
    may_enable: true
    hash 87: 50 kB load

    ...it shows eth4 as the active slave, but why?

    Any ideas?

    Thanks.



  • 2.  RE: FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 24, 2017 02:47 AM

    On the IRF fabric side, have your three LAGs been configured in LACP (dynamic) mode?

    I ask because [*] SLB bonding on Open vSwitch, unlike LACP, requires no cooperation from the switch and no knowledge on the switch side of how the Open vSwitch NICs are configured.

    On your IRF fabric, what is the sanitized output of display link-aggregation verbose bridge-aggregation n for each of your three LAGs (n = 1, 2, 3, or whatever numbers you used)?

    Regarding the term slave: AFAIK OvS uses it for any interface that is a member of a bond (so you will not see a matching master term), and active slave probably describes the interface that, among all those bound to the same bond, is currently the most loaded.

    On your OvS, what are the sanitized outputs of ovs-appctl lacp/show bond0 and ovs-vsctl list port bond0?

    [*] Open vSwitch reports "balance-slb" as bond_mode even when LACP is in use, which confused me quite a bit (lacp_status is indeed "negotiated").
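
    To compare bond_mode, lacp_status and the per-slave load side by side, the bond/show text can be summarized with a few lines of Python. This is only a sketch: the sample is trimmed from the output pasted in the first post, and parse_bond_show is a hypothetical helper name, not part of Open vSwitch.

```python
import re

# Trimmed copy of the "ovs-appctl bond/show bond0" output from the first post.
SAMPLE = """\
---- bond0 ----
bond_mode: balance-slb
lacp_status: negotiated
active slave mac: a0:36:9f:f2:c4:b8(eth4)

slave eth4: enabled
active slave
may_enable: true
hash 203: 45396 kB load

slave eth5: enabled
may_enable: true
hash 87: 50 kB load
"""


def parse_bond_show(text):
    """Return (settings, load in kB per slave) from bond/show-style text."""
    settings, loads, current = {}, {}, None
    for raw in text.splitlines():
        line = raw.strip()
        # "slave eth4: enabled" opens a new per-slave section.
        m = re.match(r"slave (\S+): (\w+)", line)
        if m:
            current = m.group(1)
            loads[current] = 0
            continue
        # "hash 203: 45396 kB load" is added to the current slave's total.
        m = re.match(r"hash \d+: (\d+) kB load", line)
        if m and current is not None:
            loads[current] += int(m.group(1))
            continue
        # bond_mode and lacp_status are reported as separate fields.
        m = re.match(r"(bond_mode|lacp_status): (\S+)", line)
        if m:
            settings[m.group(1)] = m.group(2)
    return settings, loads


settings, loads = parse_bond_show(SAMPLE)
print(settings)
print(loads)
```

    On the sample above this makes the imbalance easy to spot: nearly all of the hashed load sits on eth4.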



  • 3.  RE: FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 24, 2017 06:46 AM

    Hi, thanks for the reply.

    Yes, the LAGs are configured with Aggregation Mode: Dynamic.

    I changed the bond mode from SLB (balance-slb) to TCP (balance-tcp) in Open vSwitch.

    From my first BAGG group:

    [sw-clu]display link-aggregation verbose Bridge-Aggregation 1
    Loadsharing Type: Shar -- Loadsharing, NonS -- Non-Loadsharing
    Port Status: S -- Selected, U -- Unselected,
                 I -- Individual, * -- Management port
    Flags:  A -- LACP_Activity, B -- LACP_Timeout, C -- Aggregation,
            D -- Synchronization, E -- Collecting, F -- Distributing,
            G -- Defaulted, H -- Expired

    Aggregate Interface: Bridge-Aggregation1
    Aggregation Mode: Dynamic
    Loadsharing Type: Shar
    Management VLAN : None
    System ID: 0x8000, 40b9-3c48-1d7f
    Local:
      Port      Status  Priority  Oper-Key  Flag
    --------------------------------------------------------------------------------
      XGE1/0/1  S       32768     1         {ACDEF}
      XGE2/0/1  U       32768     1         {A}
    Remote:
      Actor     Partner  Priority  Oper-Key  SystemID                Flag
    --------------------------------------------------------------------------------
      XGE1/0/1  1        65535     1         0xfffe, a036-9ff2-c0f8  {ABCDEF}
      XGE2/0/1  2        65535     1         0xfffe, a036-9ff2-c0f8  {ABCDEF}

    On my server side, for the connection to BAGG1:

    root@prx1:~# ovs-appctl lacp/show bond0
    ---- bond0 ----
    status: active negotiated
    sys_id: a0:36:9f:f2:c0:f8
    sys_priority: 65534
    aggregation key: 1
    lacp_time: fast

    slave: eth4: current attached
    port_id: 1
    port_priority: 65535
    may_enable: true

    actor sys_id: a0:36:9f:f2:c0:f8
    actor sys_priority: 65534
    actor port_id: 1
    actor port_priority: 65535
    actor key: 1
    actor state: activity timeout aggregation synchronized collecting distributing

    partner sys_id: 40:b9:3c:48:1d:7f
    partner sys_priority: 32768
    partner port_id: 1
    partner port_priority: 32768
    partner key: 1
    partner state: activity aggregation synchronized collecting distributing

    slave: eth5: current attached
    port_id: 2
    port_priority: 65535
    may_enable: false

    actor sys_id: a0:36:9f:f2:c0:f8
    actor sys_priority: 65534
    actor port_id: 2
    actor port_priority: 65535
    actor key: 1
    actor state: activity timeout aggregation synchronized collecting distributing

    partner sys_id: 40:b9:3c:48:1d:7f
    partner sys_priority: 32768
    partner port_id: 206
    partner port_priority: 32768
    partner key: 1
    partner state: activity

    root@prx1:~# ovs-vsctl list port bond0
    _uuid : 9dde6868-bcf7-4895-9d0c-89df578b463b
    bond_active_slave : "a0:36:9f:f2:c0:f8"
    bond_downdelay : 0
    bond_fake_iface : true
    bond_mode : balance-tcp
    bond_updelay : 0
    external_ids : {}
    fake_bridge : false
    interfaces : [5ef5c997-4ff0-48a5-95f4-59fcafdeddf5, b1f2f25e-0b6a-45a6-a252-3e144f5afb3f]
    lacp : active
    mac : []
    name : "bond0"
    other_config : {lacp-time=fast}
    qos : []
    rstp_statistics : {}
    rstp_status : {}
    statistics : {}
    status : {}
    tag : []
    trunks : []
    vlan_mode : []



  • 4.  RE: FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 24, 2017 08:41 AM

    On the IRF fabric side, do you think that:

    XGE2/0/1 U 32768 1 {A}

    is a good sign, compared with:

    XGE1/0/1 S 32768 1 {ACDEF}

    I mean: the port connected to eth5 on the OVS side looks different from the one connected to eth4. I would have expected the ports for both eth4 and eth5 to show ACDEF, as normally happens on a well-formed LACP port trunk.

    Then:

    • eth4 reports its Partner State as "activity aggregation synchronized collecting distributing" and its Actor State as "activity timeout aggregation synchronized collecting distributing"
    • eth5 instead reports its Partner State as "activity" only, while its Actor State is the same as the one reported on eth4

    That looks strange on an LACP trunk...

    Also, may_enable: false on eth5 doesn't look good to me (I could be wrong); I would expect true there.
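
    To make the flag comparison less cryptic, here is a small Python sketch that expands the Flag column using the legend printed at the top of the display link-aggregation verbose output. The "forwarding" check (D, E and F all set) is my own shorthand for a healthy member port, not a Comware term, and the function names are hypothetical.

```python
# Legend copied from the header of "display link-aggregation verbose".
FLAG_NAMES = {
    "A": "LACP_Activity",
    "B": "LACP_Timeout",
    "C": "Aggregation",
    "D": "Synchronization",
    "E": "Collecting",
    "F": "Distributing",
    "G": "Defaulted",
    "H": "Expired",
}

# Assumption for this sketch: a member port that carries traffic is
# synchronized, collecting and distributing.
FORWARDING = {"D", "E", "F"}


def decode_flags(field):
    """Expand a Flag column such as '{ACDEF}' into the legend names."""
    return [FLAG_NAMES[c] for c in field.strip("{}")]


def is_forwarding(field):
    """True when all of D, E and F are present in the Flag column."""
    return FORWARDING <= set(field.strip("{}"))


# Local Flag values from the Bridge-Aggregation1 output in the previous post.
for port, flags in [("XGE1/0/1", "{ACDEF}"), ("XGE2/0/1", "{A}")]:
    state = "forwarding" if is_forwarding(flags) else "not forwarding"
    print(port, decode_flags(flags), state)
```

    With the values posted above, XGE1/0/1 decodes to a full set of states while XGE2/0/1 only advertises LACP_Activity, which matches the U (Unselected) status.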

    Out of curiosity: which OVS version are you using?

    Is the Intel NIC device driver OK?



  • 5.  RE: FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 24, 2017 12:07 PM

    I see the error on eth5.

    I have now changed the OVS bond to a normal Linux bond.

      Port      Status  Priority  Oper-Key  Flag
    --------------------------------------------------------------------------------
      XGE1/0/1  S       32768     1         {ACDEF}
      XGE2/0/1  S       32768     1         {ACDEF}
    Remote:
      Actor     Partner  Priority  Oper-Key  SystemID                Flag
    --------------------------------------------------------------------------------
      XGE1/0/1  1        255       15        0xffff, a036-9ff2-c0f8  {ACDEF}
      XGE2/0/1  2        255       15        0xffff, a036-9ff2-c0f8  {ACDEF}

    With a normal Linux bond in 802.3ad mode, things look better on the switch side.

    ethtool on bond0 shows 20 Gbit/s, but my iperf test reports:

    [SUM] 0.0-10.0 sec 11.0 GBytes 9.41 Gbits/sec

     



  • 6.  RE: FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 24, 2017 12:57 PM

    That looks better than before.

    Note about iperf test: to saturate a 20 Gbps port trunk (2x10 Gbps Full Duplex) you need (a) an iperf server able to sustain that throughput in/out and (b) many iperf clients able to concurrently saturate their individual connections - grouped together - while hitting the iperf server's host...or vice versa.



  • 7.  RE: FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 24, 2017 04:48 PM

    OK, that's good.

    I have three servers with the same hardware, so the iperf server and the iperf client can sustain the same throughput. Every server has a 20 Gbit/s port trunk.

     



  • 8.  RE: FlexFabric 5700 IRF - OpenvSwitch

    Posted Oct 24, 2017 05:07 PM

    It's a matter of understanding (or accepting, if already understood) how LACP and aggregated interfaces really work. Many users believe that aggregating n links with LACP makes a single data flow through the aggregated interface n times faster than the same flow through one of its member links. It doesn't work that way: a single flow is limited to one member link, and the aggregate bandwidth only becomes usable in a many-to-one or, much better, a many-to-many scenario.
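
    A rough way to picture this is the Python sketch below. It assumes the bond picks a member link by hashing the flow's addresses and ports; zlib.crc32 is only a stand-in for whatever hash the switch or the Linux bond actually uses, and the IP addresses, ports and the pick_link name are made up for illustration.

```python
import zlib

LINKS = ["eth4", "eth5"]   # the two bond members from this thread
LINK_GBPS = 10             # nominal speed of each member link


def pick_link(src_ip, dst_ip, src_port, dst_port, links=LINKS):
    """Map one flow to one member link; the same flow always gets the same link."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}".encode()
    return links[zlib.crc32(key) % len(links)]


def links_used(flows):
    """Set of member links touched by a collection of flows."""
    return {pick_link(*flow) for flow in flows}


# One iperf stream versus 64 parallel streams with different source ports.
single = [("10.0.0.1", "10.0.0.2", 50000, 5001)]
many = [("10.0.0.1", "10.0.0.2", 50000 + i, 5001) for i in range(64)]

for name, flows in [("single flow", single), ("64 flows", many)]:
    used = links_used(flows)
    print(name, "->", sorted(used), "ceiling", LINK_GBPS * len(used), "Gbit/s")
```

    A single flow can only ever land on one link, so its ceiling stays at the speed of that link, which fits the roughly 9.4 Gbit/s iperf result reported earlier. Only when several distinct flows hash to different members can the bond approach its combined rate.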