Comware


Device attached to IRF member losing network

  • 1.  Device attached to IRF member losing network

    Posted Sep 11, 2023 04:57 AM
    Edited by spgsitsupport Sep 13, 2023 06:40 AM

    So I got this setup "working"

    5900 JC772A IRF via HPE X140 40G QSFP+ MPO SR4 Transceivers

    https://community.arubanetworks.com/discussion/5900-jc772a-irf-via-hpe-x140-40g-qsfp-mpo-sr4-transceivers

    But devices attached to Member 3 (the one in another building) drop off the network quite regularly and in an odd way: they are there one moment, gone the next, and then back again

    Using JG709A - HPE X140 40G QSFP+ MPO SR4 Transceiver

    C:\Users\admin>ping sw-gardenbuilding
    
    Pinging sw-gardenbuilding [10.10.14.13] with 32 bytes of data:
    Reply from 10.10.14.13: bytes=32 time=5ms TTL=254
    Request timed out.
    Request timed out.
    Request timed out.
    
    Ping statistics for 10.10.14.13:
        Packets: Sent = 4, Received = 1, Lost = 3 (75% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 5ms, Maximum = 5ms, Average = 5ms
    
    C:\Users\admin>ping sw-gardenbuilding
    
    Pinging sw-gardenbuilding [10.10.14.13] with 32 bytes of data:
    Reply from 10.10.14.13: bytes=32 time=1ms TTL=254
    Reply from 10.10.14.13: bytes=32 time<1ms TTL=254
    Reply from 10.10.14.13: bytes=32 time<1ms TTL=254
    Reply from 10.10.14.13: bytes=32 time<1ms TTL=254
    
    Ping statistics for 10.10.14.13:
        Packets: Sent = 4, Received = 4, Lost = 0 (0% loss),
    Approximate round trip times in milli-seconds:
        Minimum = 0ms, Maximum = 1ms, Average = 0ms
    

    Where should I start troubleshooting?

    Thanks

    Seb



    ------------------------------
    spgsitsupport
    ------------------------------



  • 2.  RE: Device attached to IRF member losing network

    Posted Sep 11, 2023 07:35 AM

    Hello

    I would start with the log (display log) and check whether there is any IRF or interface event.

    Then check the stability of the switching part:

    "display arp" and note the MAC address for 10.10.14.13 (do it a couple of times; if the MAC address changes, then it is an IP conflict)

    "display mac-address <collected MAC address>" several times, to check that the entry persists and is always seen on the same interface.

    Post these traces here if needed.
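
    A minimal Python sketch of the repeated MAC-table check described above. It is not from the original posts: it assumes several captures of "display mac-address <mac>" have been pasted into strings, that the row layout matches the Comware output quoted elsewhere in this thread, and the helper names ports_seen and classify are made up for illustration.

```python
import re

# One data row of "display mac-address" output, for example:
# 3863-bb52-348b   14         Learned          XGE3/0/10                Y
ROW = re.compile(
    r"^([0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4})\s+(\d+)\s+(\S+)\s+(\S+)\s+(\S+)$",
    re.IGNORECASE,
)


def ports_seen(snapshot, mac):
    """Return the set of ports on which `mac` appears in one capture."""
    ports = set()
    for line in snapshot.splitlines():
        m = ROW.match(line.strip())
        if m and m.group(1).lower() == mac.lower():
            ports.add(m.group(4))
    return ports


def classify(snapshots, mac):
    """Summarise several captures as 'stable', 'missing' or 'moving'."""
    seen = [ports_seen(s, mac) for s in snapshots]
    if not seen or any(not ports for ports in seen):
        return "missing"  # the entry was absent in at least one capture
    all_ports = set().union(*seen)
    return "stable" if len(all_ports) == 1 else "moving"
```

    With this sketch, a capture that contains the entry followed by one that contains only the header row is reported as "missing", while the same MAC showing up on two different ports is reported as "moving".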



    ------------------------------
    Frederic MEUNIER
    ------------------------------



  • 3.  RE: Device attached to IRF member losing network

    Posted Sep 11, 2023 08:16 AM
    Edited by spgsitsupport Sep 13, 2023 06:35 AM

    Of course the MAC address does not change, as it is static on the remote switch

    If I attach the switch to member 1 or 2 then I never have an issue (the issue only exists when it is on member 3, which is part of the IRF via OM4 fibre)

    Only ports 49 & 51 are used on every unit for IRF purposes

    [HPE5900-SR1]dis irf
    MemberID    Role    Priority  CPU-Mac         Description
     *+1        Master  32        f010-90db-7402  ---
       2        Standby 30        f010-90db-7403  ---
       3        Standby 20        f010-90db-7404  ---
       4        Standby 10        f010-90db-7405  ---
    --------------------------------------------------
     * indicates the device is the master.
     + indicates the device through which the user logs in.
    
     The bridge MAC of the IRF is: e8f7-2451-3464
     Auto upgrade                : yes
     Mac persistent              : 6 min
     Domain ID                   : 0
     IRF mode                    : normal
    
    [HPE5900-SR1]dis irf topology
                                  Topology Info
     -------------------------------------------------------------------------
                   IRF-Port1                IRF-Port2
     MemberID    Link       neighbor      Link       neighbor    Belong To
     1           UP         4             UP         2           f010-90db-7402
     4           UP         3             UP         1           f010-90db-7402
     3           UP         2             UP         4           f010-90db-7402
     2           UP         1             UP         3           f010-90db-7402
    
    [HPE5900-SR1]dis irf link
    Member 1
     IRF Port  Interface                             Status
     1         FortyGigE1/0/49                       UP
               FortyGigE1/0/50                       DOWN
     2         FortyGigE1/0/51                       UP
               FortyGigE1/0/52                       DOWN
    Member 2
     IRF Port  Interface                             Status
     1         FortyGigE2/0/49                       UP
               FortyGigE2/0/50                       DOWN
     2         FortyGigE2/0/52                       DOWN
               Ten-GigabitEthernet2/0/51:1           UP
               Ten-GigabitEthernet2/0/51:2           UP
               Ten-GigabitEthernet2/0/51:3           UP
               Ten-GigabitEthernet2/0/51:4           UP
    Member 3
     IRF Port  Interface                             Status
     1         Ten-GigabitEthernet3/0/49:1           UP
               Ten-GigabitEthernet3/0/49:2           UP
               Ten-GigabitEthernet3/0/49:3           UP
               Ten-GigabitEthernet3/0/49:4           UP
     2         Ten-GigabitEthernet3/0/51:1           UP
               Ten-GigabitEthernet3/0/51:2           UP
               Ten-GigabitEthernet3/0/51:3           UP
               Ten-GigabitEthernet3/0/51:4           UP
    Member 4
     IRF Port  Interface                             Status
     1         FortyGigE4/0/50                       DOWN
               Ten-GigabitEthernet4/0/49:1           UP
               Ten-GigabitEthernet4/0/49:2           UP
               Ten-GigabitEthernet4/0/49:3           UP
               Ten-GigabitEthernet4/0/49:4           UP
     2         FortyGigE4/0/51                       UP
               FortyGigE4/0/52                       DOWN

    No IRF errors in the log

    The switch attached to XGE3/0/10 simply disappears every few moments

    [HPE5900-SR1]display mac-address 3863-bb52-348b
    MAC Address      VLAN ID    State            Port/NickName            Aging
    3863-bb52-348b   14         Learned          XGE3/0/10                Y
    [HPE5900-SR1]display mac-address 3863-bb52-348b
    MAC Address      VLAN ID    State            Port/NickName            Aging
    [HPE5900-SR1]
    



    ------------------------------
    spgsitsupport
    ------------------------------



  • 4.  RE: Device attached to IRF member losing network

    Posted Sep 11, 2023 08:54 AM

    Wait! Some stack links are 40G and some are 4x10G (split from 40G)?

    From my point of view, that is not a supported configuration: all links should be of the same type.
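
    A small Python sketch for spotting this mix automatically from pasted "display irf link" output. It only assumes the text layout shown in the earlier reply; the function names irf_link_types and link_kinds_in_use are hypothetical, and the script says nothing about whether a mix is actually supported.

```python
import re

# A row is "<irf-port> <interface> <status>" or, on continuation lines,
# just "<interface> <status>".
ROW = re.compile(r"(?:(\d)\s+)?([A-Za-z-]+)(\d+/\d+/\d+(?::\d+)?)\s+(UP|DOWN)$")


def irf_link_types(output):
    """Map (member, irf_port) to the sorted interface types that are UP."""
    result = {}
    member = port = None
    for line in output.splitlines():
        line = line.strip()
        m = re.match(r"Member (\d+)$", line)
        if m:
            member, port = int(m.group(1)), None
            continue
        m = ROW.match(line)
        if not m or member is None:
            continue  # header rows and anything unrecognised
        if m.group(1):
            port = int(m.group(1))
        if m.group(4) == "UP":
            result.setdefault((member, port), set()).add(m.group(2))
    return {key: sorted(kinds) for key, kinds in result.items()}


def link_kinds_in_use(output):
    """Return every interface type carrying IRF traffic, sorted by name."""
    kinds = set()
    for per_port in irf_link_types(output).values():
        kinds.update(per_port)
    return sorted(kinds)
```

    If link_kinds_in_use returns more than one entry, the fabric mixes native 40G and split 4x10G IRF links, which is the situation questioned here.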



    ------------------------------
    Frederic MEUNIER
    ------------------------------



  • 5.  RE: Device attached to IRF member losing network

    Posted Sep 11, 2023 09:48 AM

    Isn't the split (40 to 40 via 4x10) supposed to be exactly the same, 40G?

    I could not find any info about it anywhere (what is supported and what is not)

    I have a server attached to the satellite switch (10.10.14.13) and the server never loses a single ping

    It is only the 2920 stack that seems to be affected

    Seb



    ------------------------------
    spgsitsupport
    ------------------------------



  • 6.  RE: Device attached to IRF member losing network

    Posted Sep 12, 2023 02:26 AM

    Not really. It's 4 x 10G, so it is an aggregate link, not a single link. Ideally you would want same-speed links for IRF across the stack.

    I would check the 2920 and the uplink ports between both stacks. Also check jumbo frames, LACP settings (if any), and so on.

    Best, Gorazd



    ------------------------------
    Gorazd Kikelj
    MVP Expert 2023
    ------------------------------



  • 7.  RE: Device attached to IRF member losing network

    Posted Sep 12, 2023 05:06 AM

    Same speed (40G) across 200m+ of OM4 is not possible (I did not find a transceiver that can do it for this unit)

    There is nothing wrong with the 2920 stack (it works, and has worked, fine for years)



    ------------------------------
    spgsitsupport
    ------------------------------



  • 8.  RE: Device attached to IRF member losing network

    Posted Sep 12, 2023 05:42 AM

    The JG709A HPE X140 40G QSFP+ MPO MM 850nm CSR4 300m Transceiver is in the supported transceiver list for the 5900.

    Do you see any strange counter values on port XGE3/0/10 or on the 2920 uplink port?

    Best, Gorazd



    ------------------------------
    Gorazd Kikelj
    MVP Expert 2023
    ------------------------------



  • 9.  RE: Device attached to IRF member losing network

    Posted Sep 12, 2023 09:00 AM

    Well, yes, but that is SM & I have MM OM4 (so it does not work)

    [HPE5900-SR1]dis counters inbound interface Ten-GigabitEthernet 3/0/10
    Interface         Total (pkts)   Broadcast (pkts)   Multicast (pkts)  Err (pkts)
    XGE3/0/10           3805578038           16704674           17484828           0
    
     Overflow: More than 14 digits (7 digits for column "Err").
           --: Not supported.
    [HPE5900-SR1]dis counters outbound interface Ten-GigabitEthernet 3/0/10
    Interface         Total (pkts)   Broadcast (pkts)   Multicast (pkts)  Err (pkts)
    XGE3/0/10           4215373515          173388129          340861288           0
    
     Overflow: More than 14 digits (7 digits for column "Err").
           --: Not supported.
    
    


    ------------------------------
    spgsitsupport
    ------------------------------



  • 10.  RE: Device attached to IRF member losing network

    Posted Sep 12, 2023 11:40 AM

    Hi.

    This QSFP is multimode, rated for 300m to 400m depending on wavelength, and will work just fine with the OM4 cables you have.

    Line looks fine. No errors from 5900 side.

    Best, Gorazd



    ------------------------------
    Gorazd Kikelj
    MVP Expert 2023
    ------------------------------



  • 11.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 06:02 AM

    You say "I have a server attached to the satellite switch (10.10.14.13) and the server never loses a single ping"

    but the ping to the switch is unstable, right? The machines behind the satellite switch are OK, but not the switch itself?

    Well, that probably means it is not a stack issue, unless the hashing algorithm (which distributes the frames across the split 4x10G IRF link) has a problem with that setup (for example, trying to send the frame to lane 2, but the frame is not correctly received?)

    Did you try connecting another switch as the satellite (just to have another MAC address and probably hit a different hashing result)?

    Also, run the following:

    display interface Ten-GigabitEthernet3/0/49:1

    display interface Ten-GigabitEthernet3/0/49:2

    display interface Ten-GigabitEthernet3/0/49:3

    display interface Ten-GigabitEthernet3/0/49:4

    display interface Ten-GigabitEthernet3/0/51:1

    display interface Ten-GigabitEthernet3/0/51:2

    display interface Ten-GigabitEthernet3/0/51:3

    display interface Ten-GigabitEthernet3/0/51:4

    and check for strange values (counters, errors, state)
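
    To make that check less tedious across eight lanes, here is a rough Python sketch that scans pasted "display interface" text and returns only the error counters that are not zero. It assumes the counter wording used by the 5900 output in this thread; the function name nonzero_counters and the WATCHED list are illustrative choices, not part of any HPE tool.

```python
import re

# Counter names as they are worded in Comware "display interface" output.
WATCHED = {
    "input errors", "output errors", "runts", "giants", "throttles",
    "CRC", "frame", "aborts", "deferred", "collisions",
    "late collisions", "lost carrier",
}


def nonzero_counters(text):
    """Return {counter: value} for every watched counter greater than 0."""
    found = {}
    direction = ""
    for line in text.splitlines():
        line = line.strip()
        # "Input:" and "Output:" open the two error sections.
        m = re.match(r"(Input|Output):\s*(.*)$", line)
        if m:
            direction, line = m.group(1).lower(), m.group(2)
        for item in line.split(","):
            m = re.match(r"(\d+)\s+(.+)$", item.strip())
            if not m or m.group(2) not in WATCHED:
                continue  # "-" placeholders and traffic totals are skipped
            value, name = int(m.group(1)), m.group(2)
            if value > 0:
                key = name if name.startswith(direction) else f"{direction} {name}"
                found[key] = value
    return found
```

    An empty result for every lane means the link counters look clean from this side; any key that does come back (for example "input CRC") points at the lane worth examining first.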



    ------------------------------
    Frederic MEUNIER
    ------------------------------



  • 12.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 06:50 AM
    [HPE5900-SR1]display interface Ten-GigabitEthernet3/0/49:1
    Ten-GigabitEthernet3/0/49:1
    Current state: UP
    IP packet frame type: Ethernet II, hardware address: d894-039b-e255
    Description: Ten-GigabitEthernet3/0/49:1 Interface
    Bandwidth: 10000000 kbps
    Loopback is not set
    Media type is optical fiber, port hardware type is 40G_BASE_CSR4_QSFP_PLUS
    10Gbps-speed mode, full-duplex mode
    Link speed type is autonegotiation, link duplex type is autonegotiation
    Maximum frame length: 10000
    MDI type: Automdix
    Last link flapping: 4 weeks 0 days 18 hours 10 minutes
    Last clearing of counters: Never
     Peak input rate: 1056132 bytes/sec, at 2023-09-12 12:16:36
     Peak output rate: 10957844 bytes/sec, at 2023-08-21 09:40:05
     Last 300 second input: 234 packets/sec 57081 bytes/sec 0%
     Last 300 second output: 438 packets/sec 129345 bytes/sec 0%
     Input (total):  426743904 packets, 64861693235 bytes
             271350541 unicasts, 54160594 broadcasts, 101232769 multicasts, 0 pauses
     Input (normal):  426743904 packets, - bytes
             271350541 unicasts, 54160594 broadcasts, 101232769 multicasts, 0 pauses
     Input:  0 input errors, 0 runts, 0 giants, 0 throttles
             0 CRC, 0 frame, - overruns, 0 aborts
             - ignored, - parity errors
     Output (total): 249687568 packets, 83419525137 bytes
             107515630 unicasts, 46926160 broadcasts, 95245778 multicasts, 0 pauses
     Output (normal): 249687568 packets, - bytes
             107515630 unicasts, 46926160 broadcasts, 95245778 multicasts, 0 pauses
     Output: 0 output errors, - underruns, - buffer failures
             0 aborts, 0 deferred, 0 collisions, 0 late collisions
             0 lost carrier, - no carrier
    


    ------------------------------
    spgsitsupport
    ------------------------------



  • 13.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 06:59 AM

    "the hashing algorithm (that will distribute the frames through the split 4x10G IRF link) has a problem working with that (trying to send the frame to lane 2, but frame not correctly received?)"

    That is what I suspect, the 4 lanes are not really aggregated into 1



    ------------------------------
    spgsitsupport
    ------------------------------



  • 14.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 06:03 AM

    "If I attach the switch to member 1 or 2 then I never ever have an issue "

    How do you do this? Using the same 4x10G splitter or a 40G DAC?



    ------------------------------
    Frederic MEUNIER
    ------------------------------



  • 15.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 06:34 AM

    Everything is perfectly stated above

    1,2,4 are commander/members in building A

    3 is member in different building B (connected via OM4 & MPO splitters)

    1 --> 2: 40G DAC

    2 --> 3: MPO splitter x4

    3 --> 4: MPO splitter x4

    4 --> 1: 40G DAC
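
    The ring above can be written down as a tiny Python model to see which member depends on which kind of link. This is only an illustration of the cabling as listed; the names RING, link_types and only_split_links are invented for the sketch.

```python
# IRF ring as described: (member, neighbour) -> cabling between them.
RING = {
    (1, 2): "40G DAC",
    (2, 3): "MPO splitter x4",
    (3, 4): "MPO splitter x4",
    (4, 1): "40G DAC",
}


def link_types(member):
    """Return the set of cabling types on the ring links touching `member`."""
    return {kind for pair, kind in RING.items() if member in pair}


def only_split_links(member):
    """True when every ring link of `member` goes through an MPO splitter."""
    return link_types(member) == {"MPO splitter x4"}


if __name__ == "__main__":
    for member in (1, 2, 3, 4):
        print(member, sorted(link_types(member)), only_split_links(member))
```

    In this model member 3 is the only unit whose two IRF links both run over the 4x10G split; members 2 and 4 each keep one 40G DAC link, and member 1 has two.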



    ------------------------------
    spgsitsupport
    ------------------------------



  • 16.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 06:46 AM

    If the switch that loses connectivity (the Aruba 2920 stack) is connected to the Commander or Member 2 (in Building A) instead, it uses the same 10Gb fibre GBIC (just patched across the patch panel and running back on another set of fibres)



    ------------------------------
    spgsitsupport
    ------------------------------



  • 17.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 07:02 AM

    This would point to a problem in the IRF link between switch 2 and 3 or between 3 and 4. Switch 3 is the only one that has two splitters. It is possible that the hashing algorithm plays some role here, so that packets are not received on the expected interface or arrive out of order.

    Best, Gorazd 



    ------------------------------
    Gorazd Kikelj
    MVP Expert 2023
    ------------------------------



  • 18.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 07:12 AM
    Edited by spgsitsupport Sep 13, 2023 07:16 AM

    Agreed, some hashing sends it the wrong way (still puzzling why the device attached to the satellite switch works fine, but not the 2920 stack itself)
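
    A toy Python illustration of why per-flow hashing could, in principle, hit one address and spare another behind the same switch. This is NOT the algorithm the 5900 uses (that is not documented in this thread); CRC32 over the MAC pair is simply a stand-in, and pick_lane and affected are hypothetical names.

```python
import zlib

LANES = 4  # one 40G port split into 4x10G member links


def pick_lane(src_mac, dst_mac, lanes=LANES):
    """Toy per-flow hash: a given MAC pair always maps to the same lane."""
    key = (src_mac + dst_mac).lower().encode()
    return zlib.crc32(key) % lanes


def affected(flows, bad_lane, lanes=LANES):
    """Return the (src, dst) flows that would be pinned to a faulty lane."""
    return [flow for flow in flows if pick_lane(flow[0], flow[1], lanes) == bad_lane]
```

    Under such a scheme each source/destination pair is tied to one lane, so a single misbehaving lane would touch only the flows that hash onto it. Whether that matches what is seen here (loss that comes and goes for the same address) is a separate question that the real load-sharing settings and lane counters would have to answer.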



  • 19.  RE: Device attached to IRF member losing network

    Posted Sep 13, 2023 07:34 AM

    Yes, but only switch 3 has splitters to both neighbors. Did you test whether switch 4 has the same problem? I would presume it doesn't, as it has a direct 40G connection to switch 1 and can use it.

    display diagnostic-information on sw3 could provide some in-depth information about the switch's inner workings. Analyzing it is usually work for TAC.

    You would look into "irf message info", "irf global info" and "debug port linkstatus", and maybe some other sections.

    TAC should be able to analyze it in more detail than we can on the open forum.

    Best, Gorazd



    ------------------------------
    Gorazd Kikelj
    MVP Expert 2023
    ------------------------------