Device "transfer failed" followed by "transfer OK"


AutoFrank

Question

Hi 

I have a good many devices that, when they process an action (switch on/off),

show a "Transfer failed" message

[Screenshot: failed.PNG]

 

followed a few seconds later by a "Transfer OK"

 

[Screenshot: success.PNG]

 

The screenshots above are on device masters (as I was naming them) but the same is seen on device slaves

 

The action gets completed but I don't recall seeing the failed before

It's across a number of device types

 

Is anybody seeing the same, or does anybody know why this is happening?

 

HC2 running 4.120

 

Thanks

_f

 

10 minutes ago, AutoFrank said:

 

Thanks @pos

my fibaro multi sensor was fw 2.7

I tried to remove it but it didn't remove fully.. so I've started the recovery process as a last ditch effort before I ask Fibaro to dial in and take a look

 

 

 

Please post the result...

 

Curious

Peo

  • Inquirer
  • Just now, pos said:

     

    Please post the result...

     

    Curious

    Peo

     

    @pos

    Will do 

     

    recovery complete, logged in as admin and now started the restore process from last good backup

     

     

     

     

     

  • Inquirer
  • Hi Fibaro Admins 

     

    @A.Socha, @T.Konopka, @M.Baranowski

     

    I have been having an issue with my system and some of the Fibaro devices. I don't think it's a device issue; I think it is more of a system issue.

    I have tried a lot of things as you can see from the thread above and other threads. This evening I tried a full recovery and restore and it didn't fix the issue. I am still getting transfer fails and transfer ok.

    I also have one device that is stuck in a reconfiguration loop and I'm not sure if that is related.

     

    Could you organise Fibaro support to dial into my system on Friday morning, if they are available, to see if they can determine what is wrong from the system logs, etc.?

    The HC system resources seem okay but there is a possibility that the zwave queue is being flooded by some process or device.

    I have done what I can from my side

     

    Thanks

    _f

     

    Support request logged earlier this morning - no case number yet

     


     

     

     

  • Inquirer
  •  

    Hi,

    I think I am making a little progress here

    I did a full recovery and restore last night. Apart from losing the custom icons, which is expected and won't take long to fix, the system is back running, but my issue is not fully resolved. As last time, the switch configuration of my Dimmer 1 modules is now incorrect (toggle vs. momentary), so I'll have to reset these.

     

    I have logged a support case with Fibaro to remote in to my HC2, remove some 'Not configured' devices, and stop a reconfiguration loop that I cannot stop myself.

     

    One of @petergebruers' suggestions was that I have something overloading the zwave queue (and perhaps Fibaro can see this from the logs), and I think he is correct. The system seems sluggish, but not for everything. Anything leveraging my sonos-api is working, my alarm is working as normal, and system resource utilisation is healthy. Anything that leverages an HTTP request that is not to a dodgy zwave device seems okay.

     

    I also found that at some point "Mark if Dead" was either never enabled or had been disabled, and I have now enabled it for all devices. The troublesome devices (dimmers and relays) may always have been an issue, or may just be victims of one recovery process, and I wasn't aware.

    After a recovery it may be worth checking these two settings (physical switch setting for dimmers and "Mark if Dead").

     

    Plan...

    I think I have one or more rogue devices impacting the zwave queue, and my chief suspect is one or more of my swiid cord switches (I have 7 in total).

    This morning I plugged out all of these devices. I may exclude all of them this evening and leave them out altogether.

    I'll see if the zwave part of my setup stabilises. This evening I'll try a mesh reconfiguration on some of the 'dead devices' and failing that I'll try and exclude/include one or two to see if that resolves their stability and "transfer failed / transfer OK" issue I'm observing.

     

    I think I'm making progress and hopefully support can remote in and stop the reconfiguration loop because the HC2 cannot reconfigure more than one at a time and all other reconfiguration requests just queue and never get started/completed

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

     

    17 hours ago, AutoFrank said:

    @petergebruers

     

    I did a bit more checking and noticed that many of my Fibaro devices were not enabled for "Mark if dead", so I went through them all and enabled them. I am now seeing a few more dead devices.

     

    I ran your script and seem to get a different result each time.

    Is this what you'd expect?

     

    Here is the result from three lights in the kitchen (repeated twice)

    (...)

    and this is the result from multiple lights 

    (...)

    Some strange ones here - utility light was dead and then OK. I was watching the device in the WebUI and the message toggled between no communication, transfer failed, transfer OK

     

    TBH, I'm not sure what to make of these results.

     

     

    I think it's really best to let devices die when communication fails. If it's really, really important that a device gets a message, then you can modify my script (or I can do it for you) so it does a few attempts to make it "undead".
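
A bounded retry of the kind described could look roughly like the sketch below. It is written in Python purely for illustration (HC2 scenes are written in Lua), and `wake_device` and `is_dead` are hypothetical callables standing in for whatever the controller provides; they are not Fibaro API names. The attempt count and pause are assumed values.

```python
import time
from typing import Callable


def revive_with_retries(
    device_id: int,
    wake_device: Callable[[int], None],  # hypothetical: asks the controller to wake the device
    is_dead: Callable[[int], bool],      # hypothetical: True while the device is marked dead
    max_attempts: int = 3,               # assumed limit, keeps the retry traffic bounded
    pause_s: float = 5.0,                # assumed pause between attempts
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Try a few times to make a dead device 'undead', then give up."""
    for _ in range(max_attempts):
        if not is_dead(device_id):
            return True
        wake_device(device_id)
        sleep(pause_s)
    # Final check after the last attempt; False means the device stays dead.
    return not is_dead(device_id)
```

Capping the number of attempts matters here: an unbounded loop would itself add to the traffic on an already busy network.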

     

    Yes, the output of the script varies, if you have time-outs or communication problems. Let me go through the different sections of the log

     

    Please login or register to see this code.

    Three devices turned and confirmed within 1 second. All good.

     

    Please login or register to see this code.

    Some 15 seconds later, you try the same three devices and it doesn't look good, ID 176 now takes 4-5 seconds to get OK. You might have seen the message "Zwave transfer failed." briefly on your homepage.

     

    The script continues and gives similar results for 174 (4 seconds) and 90 (7 seconds).

     

    I see no "DEAD" in the log, I think in post #9 you accidentally copy/pasted the same output for the second test.

     

    But anyway, the only difference is that you will get a "dead" device, but the script will make it undead by calling:

     

    Please login or register to see this code.

    With id = the ID of the failed device.

     

    It seems to suggest to me that sometimes everything is fine, and then for at least 30 seconds, your network is very busy.

     

    Some information regarding turning off devices: if you power off a device, it's best to let it 'die'. If it is dead, the HC2 makes no attempt to send information to the device. But if you decide to make it immortal, every command, even one simple 'turnOn', will cause a lot of traffic, because Z-Wave makes several (futile) attempts to get the message across.
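
To make that trade-off concrete, here is a toy model in Python. The number of transmission attempts per command is an assumed parameter for illustration, not a figure from the Z-Wave specification or from this thread.

```python
def frames_sent(commands: int, device_marked_dead: bool, attempts_per_command: int = 3) -> int:
    """Toy estimate of frames put on the air for a powered-off device.

    attempts_per_command is an assumption for illustration only.
    """
    if device_marked_dead:
        # A device marked dead is skipped, so nothing is transmitted.
        return 0
    # A powered-off device kept 'alive' costs every attempt of every command.
    return commands * attempts_per_command


print(frames_sent(10, device_marked_dead=True))   # 0
print(frames_sent(10, device_marked_dead=False))  # 30
```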

     

  • Inquirer
  • 3 minutes ago, petergebruers said:

    I think in post #9 you accidentally copy/pasted the same output for the second test

     

    Oops @petergebruers

     

    Please login or register to see this code.

     

    4 minutes ago, petergebruers said:

    It seems to suggest to me that sometimes everything is fine, and then for at least 30 seconds, your network is very busy.

     

    @petergebruers

    zwave network I assume ?

    5 minutes ago, petergebruers said:

    But if you decide to make it immortal

     

    How could I (accidentally) make it immortal?

    I don't have a scene or VD that tries to wake up dead devices....

    6 minutes ago, petergebruers said:

    .

     

    I have disabled almost all VD's at this stage and the majority of scenes and I'm still getting dead devices

    I think these may be dead all the time as opposed to being in that state from a flooded zwave queue or something else

     

    At this stage I'm out of ideas and hopefully when Fibaro support remote in they will find something

    One question regarding reconfiguring the mesh: am I better off bringing the HC2 close to that location, or does it not matter at all?

     

    thanks again for all the help in trying to resolve

    _f

     

    4 minutes ago, AutoFrank said:

    zwave network I assume ?

     

    How could I (accidentally) make it immortal?

    I don't have a scene or VD that tries to wake up dead devices....

     

    Yes, the Z-Wave network is the most likely explanation. Not memory, not CPU.

     

    But keep in mind, that your HC2 might be the cause. For example, if you send ten thousand 'turnOn' commands to a device, they will be queued and sent. I guesstimate, on a normal network and with a direct connection, this will keep your network busy for about 10 minutes.
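
The estimate above implies a rough per-command cost, which can be worked out as follows. The figures are the ones quoted in the post (a guesstimate, not a measurement), and the function name is only for illustration.

```python
def seconds_per_command(total_commands: int, busy_minutes: float) -> float:
    """Average time the network is kept busy by one queued command."""
    return busy_minutes * 60 / total_commands


per_cmd = seconds_per_command(10_000, 10)
print(f"{per_cmd * 1000:.0f} ms per command")  # 60 ms per command
```

At that rate, even a few hundred queued commands would keep the network occupied for tens of seconds.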

     

    You only have the global "mark as dead" and individual "mark as dead" flags, but that's it. So if you never use "wakeUpAllDevices" then you can't accidentally change status 'dead' into 'alive'.

     

    The second data set looks worse (performance). All devices would be 'dead' after this test. Oh, no, that's not true! Look at the last line... device 1499 is OK. And the device before that takes a very long time to acknowledge, but it's never dead. But before that, device 88 doesn't seem to recover within 45 seconds, a limit I put in the code (because I don't want to be the cause of a loop of pointless traffic).
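
The 45-second limit mentioned above amounts to a poll-with-deadline loop. The original script is not visible here, so the following is only a sketch of that idea in Python (HC2 scripts are Lua). `is_confirmed` is a hypothetical callable meaning "the device has acknowledged the command", the polling interval is an assumption, and the clock and sleep functions are injectable so the logic can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


def wait_for_confirmation(
    is_confirmed: Callable[[], bool],  # hypothetical status check
    timeout_s: float = 45.0,           # limit quoted in the post
    poll_s: float = 1.0,               # assumed polling interval
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Return seconds until confirmation, or None once the limit is reached."""
    start = clock()
    while True:
        elapsed = clock() - start
        if is_confirmed():
            return elapsed
        if elapsed >= timeout_s:
            # Stop polling instead of looping forever on a device that never answers.
            return None
        sleep(poll_s)
```

A result of a few seconds would correspond to the slow "transfer failed, then transfer OK" cases, and `None` to a device that did not recover within the limit.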

     

    I think... I'd try to limit scenes and VDs, and check the reporting of sensors (you already mentioned that yourself, regarding the MS6). Trial and error, unfortunately.
