Question

Posted

Hello guys,

 

What is your opinion on the capacity of HC3 and how to reduce the load ?

 

If the gateway connection worked so that both HCs could run independently but still share device data, that would be much better than leaving all the work to one unit.

 

My current system looks like this

 

Please login or register to see this attachment.

 

And it seems to be pretty heavy on the unit.

 

Please login or register to see this attachment.

 

Does anyone here run a heavily loaded system, or has anyone split one across several units? Or can the Fibaro guys tell us what the load actually is? Is it the Z-Wave chip load? I have already optimized all the apps and all the device reports as much as possible. This isn't my first year fighting with this.

 

 

 

 

 

13 answers to this question

Recommended Posts

  • 0
Posted

The utilisation graph relates to CPU usage. Interesting that the HC3 doesn't balance processes across the cores very well. Good to hear that you've optimised where you can already. I would say, though, that you are at the limits of the box: about 200 devices is the most that is recommended. You have many scenes and QAs; we can't tell from here how intensive they are, but this number is pushing things. You might want to consider splitting the load across an additional (slave) HC3.

  • 0
  • Inquirer
  • Posted

    Hello,

     

    Yes, I also see that the cores don't share the work evenly; one of them is loaded more than the rest. I also saw another HC3 that looks like it's using all of the cores together. So which one is correct?

    Please login or register to see this attachment.

     

     

    At my place some QAs are only virtual switches. Some of them do something, for example compute a median, but that runs once per minute and is really lightweight. Some of them only exist to set a value, e.g. from a mobile phone. Only a few actually compute something, and only one of them polls the electricity meter for data every 5 seconds. As for the scenes, half of them are inactive because they are kept as "old versions".

     

    The interesting thing is that at night, when no devices are reporting motion and the photovoltaics aren't producing, the load is quite low. I will take a screenshot at night.

     

    A slave HC3 would only let me move some Z-Wave devices to it, but it would not reduce the CPU load, because the slave HC3 won't run any scenes or QAs. Right? Or is there a way for one HC3, with its 50 Z-Wave devices, a few QAs and a few scenes, to let another HC3 see its device values and maybe control them?

     

    Can any of the Fibaro developers give us their opinion? Even if only to say whether something is wrong or badly set up. I have one of the first HC3s (S/N 15xx); I don't know whether the hardware differs, for example.

     

     

    • 0
    Posted (edited)

    This small QA will run for 30s taking as much cpu as possible (busy wait)

    Please login or register to see this code.
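The snippet itself is not visible here. A minimal sketch of what such a busy-wait QA might look like, assuming `os.clock` is available in the QuickApp environment; `busyWait` is a hypothetical helper name and not taken from the original post:

```lua
-- Hypothetical helper: spin for `seconds` of CPU time and count the iterations.
local function busyWait(seconds)
  local a = 0
  local stop = os.clock() + seconds
  while os.clock() < stop do
    a = a + 1   -- pure busy work; the loop never yields to other timers
  end
  return a
end

-- On an HC3 this would be called from QuickApp:onInit(); 30 s matches the post.
-- The guard keeps the sketch loadable outside the HC3 as well.
if QuickApp then
  function QuickApp:onInit()
    self:debug("iterations:", busyWait(30))
  end
end
```

Because the loop never returns control to the Lua engine, one such QA keeps one core fully busy for the whole duration.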

     

    If we run 1 of these QAs and monitor the Diagnostics panel we can see that it takes one core and all its cpu. It may switch to another core halfway, but it only runs on 1 core at a time.

    The Lua engine is single-threaded, so this is natural. It also means that a QA's children (if it has any) run on the same core, because they run in the same Lua environment.

    If we run setTimeout to start a new function it is all within the same Lua environment and core. 

    If we do a fibaro.wait(x) it will not use cpu for that time.

     

    If we make copies of this QA and start two, they take 2 cores, 3 takes 3 cores...

     

    However, at 4 QAs it seems like they are only scheduled across 3 cores, maybe reserving a core for the system. It's a bit tricky to judge visually, but 4 QAs at 100% cpu seem to utilize the cores a little worse than 3 QAs, as the average load looks slightly lower. It's hard to tell, though, and the graphics can fool us.

    Anyway, it makes sense for the system to reserve cpu for itself.

     

    Anyway, QAs in practice should not use the cpu as in the example above. Lua is VERY fast. The QA above increments the local variable 'a' 38.5 million times...

    I have some ~120 QAs (~25% children) and a few Scenes, and my Diagnostics panel averages around 10% cpu across the cores (when I don't run QAs like the above).

    That includes many QAs polling for events, polling external system with http etc. Many experimental QAs that abuse the system in one way or another...

     

    So I would say that a system with that load indicates some poorly written QA that eats cpu unnecessarily, or some Scene that does something strange, such as a busy wait, in its action part...

     

    Then of course there could be a bug in the Fibaro system where the housekeeping tasks run amok and eat all cpu...

     

     

    My panel

    Please login or register to see this attachment.

    Edited by jgab
    • 0
  • Inquirer
  • Posted

    Okay, that makes sense. I mostly use setTimeout. The function runs to the end of the scene or QA, calls setTimeout, and after, let's say, 60 seconds the loop starts over, until the scene reaches its exit condition. That should be okay, right? That doesn't consume CPU power, right?

     

    Also, I'm mostly using fibaro.sleep(x). Is that wrong?

     

     

    The second thing I can't figure out is whether the CPU load could come from the Z-Wave devices in the network. Does anyone have an app that counts how many Z-Wave events occur in a given time period? Last time I spoke with a technician from Fibaro, he told me that the Z-Wave chip should be able to handle a lot of events, but I'm not sure about the number.

     

    Please login or register to see this attachment.

     

    These are my most active sensors and actuators over 10 minutes. Previously there was a Qubino smart meter that could produce as many as 4000 frames, but I have disabled it. I have also reduced the number of scene triggers, so that scenes don't run for nothing when their conditions aren't fulfilled. And almost all of my scenes are blocked from being restarted.

    • 0
    Posted

    Could be, because I don't really have a lot of zwave devices these days, just a handful (5-6).

    Mostly zigbee devices via Philips and Ikea (controlled by one QA each with many children) and some HASS imported devices (also one QA with many child devices)

    • 0
  • Inquirer
  • Posted

    Well, yes, as soon as I removed the high-frame devices, the load went down. I'm keeping mostly Z-Wave devices, because if everything else breaks, the system still works without depending on bridges, Wi-Fi or LAN.

     

    But if you put it like that, does it mean that if the HC3 had a minimum of devices on itself and only had to deal with the QAs and Lua scenes, while the Z-Wave traffic was handled by connected gateways, the load should drop?

     

    If I disable a QA, does it stop going through onInit? If I made a script that disables all of the QAs for an hour, for example, and the load did not drop, would that mean the load is in the Lua / Z-Wave communication?

    • 0
    Posted

    My experience is that most QAs don't respect the disable option in the GUI. 

    The coder of the QA has to check, as the first thing in :onInit, whether the QA's enabled property is false and, in that case, return from :onInit.

    Very few QAs seem to do that...
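A minimal sketch of such a check, under the assumption that the device table returned by `api.get("/devices/<id>")` carries an `enabled` field; `shouldStart` is a hypothetical helper name, and the exact place where the flag lives should be verified on your own HC3:

```lua
-- Hypothetical helper: decide whether the QA should start.
-- `device` is assumed to be the device table from the HC3 REST API,
-- where a disabled device carries `enabled = false`.
local function shouldStart(device)
  return not (device and device.enabled == false)
end

-- The guard keeps the sketch loadable outside the HC3 as well.
if QuickApp then
  function QuickApp:onInit()
    local device = api.get("/devices/" .. self.id)  -- assumed to include `enabled`
    if not shouldStart(device) then
      self:debug(self.name, "is disabled, not starting")
      return
    end
    -- ... start timers and loops only from here on
  end
end
```

Returning this early means no timers or loops are created, so a disabled QA stays idle.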

    • 0
  • Inquirer
  • Posted

    Most of the apps are my own creation, so I could write this into the QA so that I have a possibility to disable them all.

     

    From a programming point of view, commands such as setTimeout, setInterval and wait are not processor heavy, right?

    Only some weird loops that run faster than once a second would be a problem. Some of my exit code also runs for 10 seconds, with a wait of 1 second, until t > 10 or a condition is fulfilled. That should not be a problem either.

     

    • 0
    Posted

    Very close topic to me.

     

    I have an estimated 14,000-16,000 lines of code in my QuickApps and a similar, but slightly higher, CPU load than @jgab shows above. It's mainly my own code; the only valuable resource I use that is outside my hands is eventLib.

    I can responsibly say that nearly all of the "higher CPU load" comes from my older programs, written when I had limited knowledge of Lua and Fibaro. Back then I used to throw myself in at the deep end by taking on relatively large and complex tasks.

    I'm not saying that I'm a skilled programmer, but after ~2 years of learning I know much more.

     

    Lua has its quirks, and so does the HC3. What can I advise?

    Many times I failed to cancel an old setTimeout before establishing a new one. Especially when a program is large and complicated, non-cancelled setTimeouts can be a problem and are quite challenging to find, because they can grow exponentially: every such failure generates two more. I recommend checking whether your code suffers from this issue.
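One way to guard against stray timers is to keep the reference returned by setTimeout and cancel it before starting a new one. A minimal sketch, assuming the HC3 globals `setTimeout` and `clearTimeout`; `newTimerSlot` is a hypothetical helper name, and the two functions can be passed in explicitly so the helper also works outside the HC3:

```lua
-- Hypothetical helper: a timer slot that holds at most one pending timeout.
-- `setFn`/`clearFn` default to the HC3 globals setTimeout/clearTimeout.
local function newTimerSlot(setFn, clearFn)
  setFn, clearFn = setFn or setTimeout, clearFn or clearTimeout
  local ref = nil
  local slot = {}

  function slot.start(fn, ms)
    if ref then clearFn(ref) end   -- cancel the old timer before setting a new one
    ref = setFn(function()
      ref = nil                    -- the timer fired, so the slot is free again
      fn()
    end, ms)
  end

  function slot.cancel()
    if ref then
      clearFn(ref)
      ref = nil
    end
  end

  return slot
end
```

Calling `slot.start` twice in a row leaves only the second timer pending, so a loop restarted by accident cannot multiply.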

    Learn how Lua works, especially regarding memory management, references and data access. You will find, e.g., that access to a local variable in a function is roughly 14x faster than access to a global table field, and that a table indexed by a number is NOT faster than one indexed by a string, even a very long one...

    Maybe the most important: I advise learning the event-driven way of programming. That's what the HC really is: an event-driven machine, not mainly a computing machine. @jgab in particular (and other members, thank you all) has published many valuable articles about it here. Also, as a tutor, ChatGPT can be a valuable source of Lua knowledge (but not of HC3 programming itself, IMHO).
    When you switch to this approach, your QuickApp will be active only when necessary (when an event arrives) and nearly idle the rest of the time. For example, it's completely unnecessary on the HC3 to continuously monitor a sensor in order to do something (and use hub.sleep to "unload" the busy loop). It is better to wait (patiently) for the event that signals a change of the sensor's state.

     

    Saying more would exceed a single message, but I hope this is a good signpost.

     

    Please login or register to see this attachment.

     

     

    • Like 1
    • 0
    Posted (edited)
    17 hours ago, Smarti said:

    Most of the apps are my own creation, so I could write this into the QA so that I have a possibility to disable them all.

     

    From a programming point of view, commands such as setTimeout, setInterval and wait are not processor heavy, right?

    Only some weird loops that run faster than once a second would be a problem. Some of my exit code also runs for 10 seconds, with a wait of 1 second, until t > 10 or a condition is fulfilled. That should not be a problem either.

     

    setTimeout, setInterval themselves don't take cpu while they wait to be invoked. They do take a small amount of memory (to remember when to run).

    However, if you do

    Please login or register to see this code.

    It will be a very fast loop eating a lot of cpu. On the other hand, the good news is that between each invocation, other code will get a chance to run.
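The hidden snippet is presumably a loop that reschedules itself with no delay. A minimal sketch of that pattern next to a gentler variant, assuming the HC3 global `setTimeout(fn, ms)`; the function names are hypothetical:

```lua
-- Sketch of a timer loop with no delay: each run immediately schedules the next.
local function tightLoop()
  -- ... do a little work ...
  setTimeout(tightLoop, 0)       -- 0 ms: runs again as soon as the engine is free
end

-- Gentler variant: same structure, but the engine idles between runs.
local function relaxedLoop()
  -- ... do a little work ...
  setTimeout(relaxedLoop, 1000)  -- 1 s pause, so almost no cpu is used
end
```

Both loops hand control back to the engine after every run, which is why other timers still get their turn; only the first one keeps a core busy.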

     

    For QAs you shouldn't use fibaro.sleep as it blocks other timers, and UI interactions (The QA won't react to buttons or sliders while fibaro.sleep is running)

    In general you shouldn't mix setTimeout and sleep in the same code. 

    Please login or register to see this code.

    In a QA it results in
     

    Please login or register to see this code.

    Which makes sense. The setTimeout callback runs as soon as the sleep is done. It was scheduled to run after 1 s, but it could only run after 2 s, because fibaro.sleep blocked everything.
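The effect can be reproduced with a toy model in plain Lua. This is not HC3 code: `setTimeoutSim`, `sleepSim` and `runTimers` are made-up stand-ins that only imitate a single-threaded engine with a virtual clock.

```lua
-- Toy model (not HC3 code): one thread, a virtual clock in ms, and a timer list.
local now, timers, log = 0, {}, {}

local function setTimeoutSim(fn, ms)
  timers[#timers + 1] = { due = now + ms, fn = fn }
end

local function sleepSim(ms)
  now = now + ms   -- blocking sleep: the clock advances but no timer can run
end

local function runTimers()
  table.sort(timers, function(a, b) return a.due < b.due end)
  for _, t in ipairs(timers) do
    if t.due > now then now = t.due end   -- wait only if the timer is not yet due
    t.fn()
  end
  timers = {}
end

setTimeoutSim(function() log[#log + 1] = "timer ran at " .. now .. " ms" end, 1000)
sleepSim(2000)
log[#log + 1] = "sleep done at " .. now .. " ms"
runTimers()   -- the engine only gets control back after the main code finishes

for _, line in ipairs(log) do print(line) end
```

The model prints "sleep done at 2000 ms" followed by "timer ran at 2000 ms": the timer was due at 1000 ms but had to wait for the blocking sleep to finish.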

     

    It is very uncommon to schedule a Lua function with setTimeout that takes 2 cpu seconds when it runs. That's a huge amount of computing. So in the normal case, using setTimeout gives you rather fair scheduling of functions, and even if they can't run exactly on the ms when scheduled, it usually evens out in the long run.

    The event-style programming mentioned above is kind of a model where everything is scheduled by setTimeout: incoming system events, user events, etc.

     

    The exception to the "fairness" is if you have a timed loop. Then you may have to be careful that it doesn't start to drift.
     

    In a simple Scene I can imagine using fibaro.sleep for some timing before sending a response, but in general use setTimeout.

    Edited by jgab
    • 0
    Posted (edited)

    Please login or register to see this code.

    So, this minuteLoop pings on the minute (seconds = :00).

    However, because print("PING") takes a few ms, the loop will start to drift and eventually start to ping on :01 etc...

     

    These kinds of loops should be time-compensated in the call to setTimeout

    Please login or register to see this code.

    So, even if print would take 30s because the system does some housekeeping, the next loop will be scheduled after another 30s in the future.

     

    To be really safe, one needs to consider whether nextMinute has already passed altogether, and what should be done then. Should the call be skipped so that we go for the next minute, or do we schedule it immediately (after 0 ms)?
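A minimal sketch of a time-compensated minute loop that also handles a missed target, assuming the HC3 global `setTimeout(fn, ms)`. `compensatedDelay` is a hypothetical helper; it uses the "skip to the next future slot" policy, and scheduling immediately would be the alternative. Since `os.time` has one-second resolution, the compensation is only accurate to about a second.

```lua
-- Hypothetical helper: delay in ms until `target` (absolute time in seconds).
-- If the target has already passed, skip ahead in steps of `period` seconds.
local function compensatedDelay(target, nowSec, period)
  while target <= nowSec do
    target = target + period
  end
  return (target - nowSec) * 1000, target
end

-- Only runs where setTimeout exists (e.g. on an HC3).
if setTimeout then
  local nextMinute = (math.floor(os.time() / 60) + 1) * 60
  local function minuteLoop()
    print("PING")
    nextMinute = nextMinute + 60   -- absolute target, not "now + 60 s"
    local delay
    delay, nextMinute = compensatedDelay(nextMinute, os.time(), 60)
    setTimeout(minuteLoop, delay)
  end
  local delay
  delay, nextMinute = compensatedDelay(nextMinute, os.time(), 60)
  setTimeout(minuteLoop, delay)
end
```

Because the target is an absolute time, a slow run shortens the next delay instead of pushing every later run back.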
     

    Edited by jgab
    • 0
  • Inquirer
  • Posted (edited)

    Sure,

     

    My normal code is something like this

     

    -- scene starts on trigger
    -- devices
    -- variables

    function loop()
      -- code ... maybe some wait times, some conditions and a check for exit
      setTimeout(loop, 60*1000)  -- pass the function itself; loop() would call it immediately
    end

    -- starting conditions
    if true then
      loop()
    end
    ---

     

     

    And that's it. If the exit condition is fulfilled, setTimeout isn't called again. The loop is normally not retriggered sooner than every 10-15 seconds, and most scenes use a timeout of 30 seconds or more; they don't need the loop to restart on the exact second. That's because all of the scenes, once they start doing something, should also finish it. For example, when a light turns on, the loop remains active until the turn-off condition is fulfilled, at which point the loop no longer calls setTimeout. So when the scene ends, the light is surely off as well.

     

    I think this is a nice way to write a program; it is much better than starting an "off" scene with some new trigger, and as I said, all of the scenes are blocked from restarting. Or am I thinking wrong?

    Edited by Smarti
    • 0
  • Inquirer
  • Posted

    So after a few days of testing, I have rebuilt all my QAs so that they can be disabled. A scene that enables or disables all of them takes 3+ minutes, with heavy load on all 4 cores while it runs.

     

    With the QAs off, the load peaked at 50% on some cores, which could be the Z-Wave traffic, logs and some scenes, so I assume that the 175 running QAs do need some computing time. Do QAs use CPU even if they have nothing in onInit? No loop, nothing. I guess that is the case, because if there is an error in a QA, the system tries to restart it every minute.

     

    But the good thing is that it's not about some badly written QA; it looks like there are simply too many devices and QAs running. Also, the fibaro.getDevicesID call for finding the QA plugins took around 25 seconds just to list all 175 devices in a table. So I would say that the devices API response is quite big, and that could be the reason. It looks like my api/devices has something over 5,500,000 characters.

     
