
  • Topic Author
  • Posted (edited)
    7 hours ago, PSi said:

    The QA itself does not reboot the system, but it seems to slow down the API under some conditions, which leads to unresponsiveness and a reboot.

    It does not happen immediately but takes some time (less than a day) with my example.

    With a less efficient line, it led to the CPU issue mentioned above: one core had high utilization, and then the load switched to the next one. I have already replaced that line.

    Everything in Fibaro hubs is done through API commands, so saying that AOQ slows down the API does not sound realistic.

    For example, I am running the following scene, which sends an API request every 50 ms. (By the way, it runs in parallel with two AOQs: my private AOQ and one for testing.)

     

    -- Poll device 203 every 50 ms and print its "dead" property
    while true do
      fibaro.sleep(50)
      print(api.get("/devices/203").properties.dead)
    end
     
    Here is what the CPU load looks like. (This scene has now been running for over 10 hours, and I don't see any unresponsiveness whatsoever.)

    Please login or register to see this spoiler.

    Edited by cag014
    Posted

    Hi,

    I have two controllers and am using AIO on both to control each other's devices.

    I encountered a problem yesterday that raised some questions. It started when I changed the password on one of the controllers and forgot to update it in AIO on the other one. The scene stopped working because of the wrong authentication, which is understandable, but even after I reverted the password to the original, the scene didn't work. I had to stop it and re-run it manually.

    Now I have the following questions:

    1. What happens if the slave controller is off the network for some time? Will the master keep trying to connect until the slave comes back, or will it try a few times and then give up?

    2. How can I change this behaviour? I want the master to keep trying endlessly until the connection parameters are met (user, password and, of course, connectivity). There can be some delay between attempts. How can this be achieved?

     

    Please help, @cag014.

     

  • Topic Author
  • Posted (edited)
    6 hours ago, Mohamed Refaat said:

    Hi,

    I have two controllers and am using AIO on both to control each other's devices.

    I encountered a problem yesterday that raised some questions. It started when I changed the password on one of the controllers and forgot to update it in AIO on the other one. The scene stopped working because of the wrong authentication, which is understandable, but even after I reverted the password to the original, the scene didn't work. I had to stop it and re-run it manually.

    Now I have the following questions:

    1. What happens if the slave controller is off the network for some time? Will the master keep trying to connect until the slave comes back, or will it try a few times and then give up?

    2. How can I change this behaviour? I want the master to keep trying endlessly until the connection parameters are met (user, password and, of course, connectivity). There can be some delay between attempts. How can this be achieved?

     

    Please help, @cag014.

     

    As I understand it, you are running AOQ on an HC3.

     

    1. If a slave is offline, a warning message is displayed and AOQ continues to probe the connection (every 45 seconds). When the connection is restored, a message is displayed.

    2. If there is no connection at startup, AOQ stops. The cause could be wrong authentication, and there is no sense in trying again and again.

        But as you know, the QA is restarted automatically by the system every few seconds, so yes, there is a mechanism that runs it in a loop on startup.
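
The probing behaviour described in point 1 (keep checking an offline slave at a fixed interval and report when it returns) could be sketched roughly as below. This is only an illustration and not AOQ's actual code: `waitForSlave`, `probe`, `sleep`, `notify` and `maxAttempts` are hypothetical names, and only the 45-second interval comes from the post.

```lua
-- Hypothetical sketch: none of these names come from AOQ itself.
local PROBE_INTERVAL_S = 45  -- probing interval mentioned in the post

-- probe:       function returning true when the slave answers (assumed helper)
-- sleep:       function(seconds) that waits between attempts (assumed helper)
-- notify:      function(message) that shows a message to the user (assumed helper)
-- maxAttempts: optional cap; nil means "keep trying endlessly"
local function waitForSlave(probe, sleep, notify, maxAttempts)
  local attempt = 0
  local warned = false
  while maxAttempts == nil or attempt < maxAttempts do
    attempt = attempt + 1
    if probe() then
      if warned then notify("slave connection restored") end
      return true, attempt
    end
    if not warned then
      notify("slave offline, probing every " .. PROBE_INTERVAL_S .. " s")
      warned = true
    end
    sleep(PROBE_INTERVAL_S)
  end
  return false, attempt
end

-- Usage with stand-in helpers: the fake slave answers on the third probe.
local calls, messages = 0, {}
local ok, attempts = waitForSlave(
  function() calls = calls + 1; return calls >= 3 end,
  function(_) end,  -- no real waiting in this demo
  function(msg) messages[#messages + 1] = msg end
)
print(ok, attempts, #messages)  --> true  3  2
```

Leaving `maxAttempts` as nil corresponds to the "keep trying endlessly, with some delay between attempts" behaviour asked about earlier in the thread; passing a number gives the "try a few times, then give up" variant.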

    Edited by cag014
    Posted (edited)
    33 minutes ago, cag014 said:

    As I understand it, you are running AOQ on an HC3.

     

    1. If a slave is offline, a warning message is displayed and AOQ continues to probe the connection (every 45 seconds). When the connection is restored, a message is displayed.

    2. If there is no connection at startup, AOQ stops. The cause could be wrong authentication, and there is no sense in trying again and again.

        But as you know, the QA is restarted automatically by the system every few seconds, so yes, there is a mechanism that runs it in a loop on startup.

    Thanks for your answer, but that is not what actually happens.

    I have just tested:

    1. I took the slave offline, so AOQ stopped and displayed a message.

    2. I restored the connection, but AOQ stayed stopped until I re-ran it manually.

     

    Is there a parameter or something else that I need to check on my side?

    Edited by Mohamed Refaat
  • Topic Author
  • Posted (edited)
    On 1/20/2023 at 1:42 PM, Mohamed Refaat said:

    Thanks for your answer, but that is not what actually happens.

    I have just tested:

    1. I took the slave offline, so AOQ stopped and displayed a message.

    2. I restored the connection, but AOQ stayed stopped until I re-ran it manually.

     

    Is there a parameter or something else that I need to check on my side?

     

    I made some changes to minimize the waiting time when a slave comes back online, and added a more robust algorithm to catch network issues.

    PUT/POST commands are now displayed as unexecuted, so that the user can see what won't work until the slave is back online.

    Please download the attached version below.

     

    Please login or register to see this attachment.

     

    Please let me know if it works fine for you.

    Edited by cag014
    Posted (edited)
    14 hours ago, cag014 said:

     

    I made some changes to minimize the waiting time when a slave comes back online, and added a more robust algorithm to catch network issues.

    PUT/POST commands are now displayed as unexecuted, so that the user can see what won't work until the slave is back online.

    Thanks for the update, I will test it.

    Just some quick questions:

    1. If I have more than one slave, what happens if one of them goes offline? Will the whole scene crash, or will it continue to serve the online slaves until the disconnected ones come back? Please consider the startup case too in your answer (e.g. after a power outage, power comes back and the master and all slaves boot together; one can take longer than another).

    2. Another question: does this update include the change you made to avoid the "rest api high respond time" problem? I had it today. If not, could you make an update that merges both?

    Edited by Mohamed Refaat
    Posted
    1 hour ago, Mohamed Refaat said:

    Thanks for the update, I will test it.

    Just some quick questions:

    1. If I have more than one slave, what happens if one of them goes offline? Will the whole scene crash, or will it continue to serve the online slaves until the disconnected ones come back? Please consider the startup case too in your answer (e.g. after a power outage, power comes back and the master and all slaves boot together; one can take longer than another).

    2. Another question: does this update include the change you made to avoid the "rest api high respond time" problem? I had it today. If not, could you make an update that merges both?

    I would like to add to the above: assume that, before power returns, one slave is damaged or totally removed from the system for any reason. What impact will that have on the scene after power comes back? How will the master deal with such a scenario? How can the master continue serving the other slaves regardless of whether all slaves exist at startup?

    Sorry for the public brainstorm 😅

  • Topic Author
  • Posted
    1 hour ago, Mohamed Refaat said:

    Thanks for the update, I will test it.

    Just some quick questions:

    1. If I have more than one slave, what happens if one of them goes offline? Will the whole scene crash, or will it continue to serve the online slaves until the disconnected ones come back? Please consider the startup case too in your answer (e.g. after a power outage, power comes back and the master and all slaves boot together; one can take longer than another).

    2. Another question: does this update include the change you made to avoid the "rest api high respond time" problem? I had it today. If not, could you make an update that merges both?

    1. Yes, it will continue to run with the connected slaves until the disconnected slave is back online.

      On startup, if a slave is disconnected, AOQ cannot read its devices and other data, so nothing can work.

    2. The "rest api high respond time" message is just a warning; the QA continues to run as usual.

    5 minutes ago, mjahedobeid said:

    I would like to add to the above: assume that, before power returns, one slave is damaged or totally removed from the system for any reason. What impact will that have on the scene after power comes back? How will the master deal with such a scenario? How can the master continue serving the other slaves regardless of whether all slaves exist at startup?

    Sorry for the public brainstorm 😅

    As I mentioned, the slaves must be online at startup so that AOQ can read their configuration. Without this data nothing can work or be executed.

    Posted
    1 hour ago, Mohamed Refaat said:

    2. Another question: does this update include the change you made to avoid the "rest api high respond time" problem? I had it today. If not, could you make an update that merges both?

     

  • Topic Author
  • Posted
    2 hours ago, mjahedobeid said:
    4 hours ago, Mohamed Refaat said:

    2. Another question: does this update include the change you made to avoid the "rest api high respond time" problem? I had it today. If not, could you make an update that merges both?

     

    As I mentioned, it's just a warning about a high response time; AOQ continues to run.

    You can set a different threshold (a larger one, such as a few seconds) with the user global parameter:

    slaveApiTime = 600 -- max. hub response time (milliseconds); a warning is shown if a longer time is detected

    Or set it in the jSlavel{} configuration for a specific slave hub:
    hc2={user="xxx", passwd="xxx", ip="198.0.0.69", alarmPin=1111, slaveApiTime=3000},
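
To illustrate how the two settings could interact, here is a minimal sketch in which a per-slave `slaveApiTime` overrides the global value. How AOQ actually resolves the threshold is not shown in the thread, so the fallback rule, the `slaves` table, the `hc3` entry and the helper names `thresholdFor` and `isSlowResponse` are all assumptions.

```lua
-- The two slaveApiTime values come from the post; everything else is assumed.
local slaveApiTime = 600  -- global default, milliseconds

local slaves = {
  hc2 = { ip = "198.0.0.69", slaveApiTime = 3000 },  -- per-slave override
  hc3 = { ip = "198.0.0.70" },                       -- hypothetical slave without an override
}

-- Returns the warning threshold (ms) that applies to the named slave.
local function thresholdFor(name)
  local cfg = slaves[name]
  return (cfg and cfg.slaveApiTime) or slaveApiTime
end

-- Returns true when a measured response time should raise the warning.
local function isSlowResponse(name, elapsedMs)
  return elapsedMs > thresholdFor(name)
end

print(thresholdFor("hc2"), thresholdFor("hc3"))                   --> 3000  600
print(isSlowResponse("hc2", 1200), isSlowResponse("hc3", 1200))   --> false  true
```

With these values a 1200 ms response would trigger the warning only for the slave that still uses the 600 ms default.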
     
    Posted
    On 1/22/2023 at 3:52 AM, cag014 said:

     

    I made some changes to minimize the waiting time when a slave comes back online, and added a more robust algorithm to catch network issues.

    PUT/POST commands are now displayed as unexecuted, so that the user can see what won't work until the slave is back online.

     

    I tested the attached version; it crashed after approx. 10 hours with the following error:
    [23.01.2023] [10:23:22] [ERROR] [QUICKAPP38]: QuickApp crashed
    [23.01.2023] [10:23:22] [ERROR] [QUICKAPP38]: main.lua:2756: attempt to index a nil value (field '?')

     

    Any thoughts on why it happened?

  • Topic Author
  • Posted
    1 hour ago, mjahedobeid said:

     

    I tested the attached version; it crashed after approx. 10 hours with the following error:
    [23.01.2023] [10:23:22] [ERROR] [QUICKAPP38]: QuickApp crashed
    [23.01.2023] [10:23:22] [ERROR] [QUICKAPP38]: main.lua:2756: attempt to index a nil value (field '?')

     

    Any thoughts on why it happened?

    Did it happen when a slave was disconnected?

    Posted
    23 minutes ago, cag014 said:

    Did it happen when a slave was disconnected?

    No, not at all

  • Topic Author
  • Posted
    56 minutes ago, Mohamed Refaat said:

    No, not at all

    An error on this line doesn't make sense to me... maybe the line numbers are shifted.

    Could you please post line 2756 of your code?

    Posted
    21 minutes ago, cag014 said:

    An error on this line doesn't make sense to me... maybe the line numbers are shifted.

    Could you please post line 2756 of your code?

    I am so sorry, my bad: I had added some debug lines.
    The line that has the problem is the following:
     

    jDp[tblId]["srcId"]={value=v.sourceId,lTime=os_time()}

    It's line 2751 in your code.
  • Topic Author
  • Posted
    54 minutes ago, mjahedobeid said:

    I am so sorry, my bad: I had added some debug lines.
    The line that has the problem is the following:
     

    jDp[tblId]["srcId"]={value=v.sourceId,lTime=os_time()}

    It's line 2751 in your code.

    May I ask whether you created a new scene while AOQ was running? That could explain the error: a new scene does not exist in the AOQ data table.
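
For readers wondering why a missing table entry produces this particular error, the sketch below reproduces it in plain Lua and shows one possible guard. It is not AOQ's code: the contents of `jDp`, the ids 10 and 99, and the helper names `recordUnguarded` and `recordGuarded` are made up for illustration, and `os_time` is assumed to be an alias for `os.time`.

```lua
local os_time = os.time  -- assumed to match the helper used in the quoted line

-- Data table built at startup: only id 10 was known at that time (made-up id).
local jDp = { [10] = {} }

-- Unguarded update, shaped like the quoted line.
local function recordUnguarded(tblId, v)
  jDp[tblId]["srcId"] = { value = v.sourceId, lTime = os_time() }
end

-- Guarded update: skip ids that have no entry instead of crashing.
local function recordGuarded(tblId, v)
  local entry = jDp[tblId]
  if entry == nil then
    return false  -- unknown id, e.g. a scene created after startup
  end
  entry["srcId"] = { value = v.sourceId, lTime = os_time() }
  return true
end

local event = { sourceId = 5 }
local okKnown = pcall(recordUnguarded, 10, event)     -- entry exists
local okNew, err = pcall(recordUnguarded, 99, event)  -- no entry for id 99
print(okKnown, okNew, err)
print(recordGuarded(10, event), recordGuarded(99, event))  --> true  false
```

The second `pcall` fails because `jDp[99]` is nil, so assigning to its `srcId` field indexes a nil value; the guarded version returns false for the unknown id instead.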

    Posted
    36 minutes ago, cag014 said:

    May I ask whether you created a new scene while AOQ was running? That could explain the error: a new scene does not exist in the AOQ data table.

     

    Yes, that's it then.

    Really appreciate your work.

     

    What else, other than scenes, can cause the code to crash?

    Global variables?

    Adding/removing devices?

    What else?

  • Topic Author
  • Posted
    3 minutes ago, Mohamed Refaat said:

     

    Yes, that's it then.

    Really appreciate your work.

     

    What else, other than scenes, can cause the code to crash?

    Global variables?

    Adding/removing devices?

    What else?

    Working on that... to avoid crashes.

    A newly added scene, device or global variable no longer crashes the code.

  • Topic Author
  • Posted

    To all,

    An interesting situation arises when a new scene/device has been added.

    There are two options:

    1. Display a warning that a new scene/device/variable has been detected, and ignore it?

    2. Re-initialize AOQ with the new items?

     

    Any thoughts? 
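
To make the trade-off between the two options concrete, here is a rough sketch of both policies side by side. Nothing in it is taken from AOQ: `known`, `handleEvent`, `listCurrentIds`, `warn`, the policy names and the ids are hypothetical.

```lua
-- Hypothetical sketch; names and structure are not taken from AOQ.
local known = { [10] = true, [11] = true }  -- ids read at startup (made-up values)

-- policy "ignore": warn about the unknown id and skip the event
-- policy "reinit": rebuild the table from the current id list, then retry
local function handleEvent(id, policy, listCurrentIds, warn)
  if known[id] then return "handled" end
  if policy == "reinit" then
    known = {}
    for _, currentId in ipairs(listCurrentIds()) do known[currentId] = true end
    if known[id] then return "handled after reinit" end
  end
  warn("new item detected, ignoring id " .. id)
  return "ignored"
end

-- Stand-in helpers: id 12 was added after startup.
local warnings = {}
local function warn(msg) warnings[#warnings + 1] = msg end
local function listCurrentIds() return { 10, 11, 12 } end

print(handleEvent(12, "ignore", listCurrentIds, warn))  --> ignored
print(handleEvent(12, "reinit", listCurrentIds, warn))  --> handled after reinit
print(handleEvent(12, "ignore", listCurrentIds, warn))  --> handled
```

In this sketch option 1 costs nothing at run time but leaves the new item unused until a restart, while option 2 picks it up at the price of rebuilding the table whenever an unknown id appears.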

  • Topic Author
  • Posted
    1 hour ago, Mohamed Refaat said:

     

    Yes, that's it then.

    Really appreciate your work.

     

    What else, other than scenes, can cause the code to crash?

    Global variables?

    Adding/removing devices?

    What else?

    By the way, if a device has been removed and it was part of your jM lines, of course AOQ will crash. There is no other option but to remove the device from the jM lines and restart AOQ.
