Posted (edited)

google translator

I have been suffering from delays. I always assumed the Z-Wave engine could not be the cause, because actions from the HC2 to a device were immediate (although shown with a delay in the GUI), while the delays were always in the device-to-HC2 direction. That made me think the problem is somewhere in the Fibaro system.

 

In my experience, delays begin when the database becomes too large.

 

I can erase the device and energy history and compact the database. When I do, I have a perfect system again: quick reboots and no latency. Then, after some time, the database grows again and the problems return.

 

For some reason the history records are never erased, so the database grows and grows until the system cannot handle it. Above 70 MB the delays start; above 90 MB there are many delays; above 100 MB, services take a long time to start after a restart and may never start again.

 

In my case, with 190 devices, the database is 14 MB after clearing the history and compacting.

 

This is why the desperate people who start again from scratch and re-include all their devices find their problems solved: they have an empty database.

Fibaro should include some mechanism that erases the history every so often and compacts the database. And if it already does (I do not know), it should do it more often.

 

My system has had 190 devices for 3 years and I have installed every update. When the delays begin, I delete the history and everything works perfectly again.
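The "erase history, then compact" routine described in this post can be tried safely on a throwaway SQLite file (the thread later identifies the HC2 database as SQLite). The sketch below is only an illustration under assumptions: the table name `history`, its columns and the demo file are made up, and nothing here touches a real HC2. It shows that, in SQLite's default configuration, deleting rows alone usually leaves the file size unchanged, and it is the `VACUUM` step that shrinks the file.

```python
import os
import sqlite3
import tempfile


def make_demo_db(db_path, rows=20000):
    """Create a throwaway database with a hypothetical history table."""
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE history (device_id INTEGER, ts INTEGER, value REAL)")
    conn.executemany(
        "INSERT INTO history VALUES (?, ?, ?)",
        ((i % 190, i, 0.5) for i in range(rows)),
    )
    conn.commit()
    conn.close()


def purge_and_compact(db_path, table="history"):
    """Delete all rows from `table`, then VACUUM.

    Returns (size_before, size_after_delete, size_after_vacuum) in bytes.
    `table` is interpolated into the SQL, so it must be a trusted name.
    """
    size_before = os.path.getsize(db_path)
    conn = sqlite3.connect(db_path)
    try:
        conn.execute(f"DELETE FROM {table}")
        conn.commit()  # VACUUM cannot run inside an open transaction
        size_after_delete = os.path.getsize(db_path)
        conn.execute("VACUUM")
    finally:
        conn.close()
    return size_before, size_after_delete, os.path.getsize(db_path)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "demo.db")
    make_demo_db(path)
    before, after_delete, after_vacuum = purge_and_compact(path)
    print(f"before: {before} B, after delete: {after_delete} B, after vacuum: {after_vacuum} B")
```

Deleting everything is the bluntest option; a real cleanup would more likely delete only rows older than some cut-off, which this sketch does not attempt.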

Edited by rls46
  • Like 1
Posted

Interesting, @rls46,

can you tell me how I can do a cleanup?

 

BR - kro

Posted
On 8/7/2018 at 11:01 AM, chaicka said:

What kind of technological firm practices this kind of technical support in this era? 15 years ago, maybe plentiful. Now in 2018, none that I can recall except some small local firms with only 5-20 staffs.

 

On 8/7/2018 at 12:18 PM, HomeSystem.sk said:

We are small local firm with 2 employees and we would not behave that way to our customers. As a matter of fact, I myself spent hours in customer houses trying to fix issues which are at the end Fibaros bugs.

 

Same here. I used to have 6 employees and now I only work for myself.

I've probably spent 50+ hours on one customer alone, trying to fix Fibaro delays and problems for free since November 2017...

 

So, in my experience, small firms are usually better than big firms, as they provide a better level of service.

 

Fibaro has looked at the box 4-5 times, but without a fix (or any information about the problem)!

Now I'm totally ignored by support.

 

It's now a case of hoping that Fibaro maybe fixes this issue, so my biggest client can have a working system again (like on 4.140), and then moving on to a different system.

It's so good in so many ways, and I will keep using it in my own home, where I can somewhat deal with the issues Fibaro has. But installing it at customers? That trust is now completely gone.

This just costs too much time, money, reputation, stress and so on... it's not worth it any more.

 

Posted
11 minutes ago, speedy said:

Ive probably spent +50h! on 1 customer alone trying to fix Fibaro delays and problems for free since november 2017...

 

Tell me... What is your *gut* feeling. Don't think too long. Just tell me what you think is going on...

Posted (edited)
35 minutes ago, petergebruers said:

 

Tell me... What is your *gut* feeling. Don't think too long. Just tell me what you think is going on...

 

Everything started with the "Cannot query interpeter state" and the system freeze that occurred with that error.

So it was a system update from Fibaro 4.140+ that set everything off.

@HomeSystem.sk have the same experience.

 

Before that everything worked well. Sometimes small things happened that were not showstoppers (the Sonos plugin occasionally stopped working, etc.), and the customer was 100% OK with that.

Never big issues with z-wave traffic, delays, Fibaro alarm, everything worked as intended for years.

 

So, Fibaro updated their firmware = huge issues.

I can't say exactly what the problem is, but it's definitely connected to that.

 

I don't know if that's the answer you were looking for, or whether you wanted more of a personal "what do I think is going on?" answer :-)

 

All these hours went into writing Lua scenes that now check whether devices have actually turned on or off (smoke machine, etc.) or armed/disarmed, because the freeze stops things from happening.

These scenes have to be coded so that they account for the delays, which is a big pain.

Because it's not consistent, the customer has had his house filled with smoke from the alarm system multiple times when the freeze happened, as the devices were never disarmed (now fixed with Lua scenes, but disarming can sometimes take minutes!).

Not to mention everything that came before the cause was found: fault-finding, restarts, adding, excluding and re-including devices, pinging, debugging, checking my code, and so on.

The freeze can also happen when the alarm is triggered (nothing I can do about that), so in the event of an alarm there can be big delays, and the alarm may not go off for minutes even when it should.
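The kind of "send, then verify, and allow for delay" logic described in this post can be sketched generically. This is not the poster's Lua scene and does not use any Fibaro API: `send_command` and `read_state` are hypothetical callables you would supply, and the timeout values are arbitrary placeholders. The sketch is in Python purely for illustration.

```python
import time


def set_and_verify(send_command, read_state, wanted,
                   timeout_s=120.0, poll_s=2.0, resend_every_s=30.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Send `wanted`, then poll until the device reports it or time runs out.

    Returns True once read_state() == wanted, False on timeout.
    The command is re-sent every `resend_every_s` seconds in case it was lost.
    """
    deadline = clock() + timeout_s
    next_send = clock()
    while clock() < deadline:
        if clock() >= next_send:
            send_command(wanted)
            next_send = clock() + resend_every_s
        sleep(poll_s)  # always sleep between polls; never spin in a tight loop
        if read_state() == wanted:
            return True
    return False


if __name__ == "__main__":
    # Simulated device that only reports the new state on the third poll
    # (a stand-in for a lagging system).
    reads = {"n": 0}

    def fake_send(value):
        print("sending", value)

    def fake_read():
        reads["n"] += 1
        return "disarmed" if reads["n"] >= 3 else "armed"

    ok = set_and_verify(fake_send, fake_read, "disarmed",
                        timeout_s=1.0, poll_s=0.01, resend_every_s=0.5)
    print("confirmed" if ok else "NOT confirmed: alert someone")
```

Returning False instead of raising leaves the caller free to decide what a failed confirmation means, for example sending a notification rather than silently assuming the device was disarmed.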

Edited by speedy
Posted
14 minutes ago, speedy said:

So it was a system update from Fibaro 4.140+ that set everything off.

 

Thanks. This is in agreement with many posts on this forum.

 

15 minutes ago, speedy said:

I dont know if thats the answer you where looking for, or if its more of a personal "what do i think what is going on"? answer you want? :-) 

 

If you are a bit like me... If you've spent >50 h on a problem, it is difficult to summarize. I wanted to keep you from overthinking and have you just shout something like... "I guess after this firmware release the system is slower".

 

It seems Fibaro does not want to say what they have found out after checking @Sankotronic's system, nor tell us what to do.

 

So I like to speculate a bit.

 

So far, from personal experience plus monitoring this forum, delays can be caused by:

 

  • Over-estimating the capacity of the network. This is a broad category: too much polling, sending too much data, having too many unsolicited reports, ...
  • Network issues caused by bad mesh. Difficult to diagnose, but sometimes easy to solve (just do a few updates on some well chosen nodes).
  • Odd bugs (like, device toggles between direct route and routed connection causing issues with some command classes)
  • What this topic is about: @Sankotronic's system seems to be "a bug on the HC2 causing delays"
  • User code errors, like forgetting "sleep" in a loop.
  • This is new to me, what @rls46 mentioned.... Maybe there is an issue with the main database, causing it to get slower and slower over time. This would explain why "starting from scratch" seems to help in many cases.
  • "kill scenes bug" mentioned on this forum.

If the next release fixes the bug discussed in this topic... Or the next... and *that* does NOT help... why don't we join forces to find out what is going on?

Posted (edited)
2 hours ago, petergebruers said:

It seems Fibaro does not want to say what they have found out after checking @Sankotronic's system, nor tell us what to do.

Same for me. They have checked the customer's box 4-5 times now, the first time in May, and in the end I was ignored when it could not be fixed.

 

 

Quote

So far, from personal experience + monitoring this forum delays can be caused by:

  • Over-estimating the capacity of the network. This is a broad category: too much polling, sending too much data, having too many unsolicited reports, ...
  • Network issues caused by bad mesh. Difficult to diagnose, but sometimes easy to solve (just do a few updates on some well chosen nodes).
  • Odd bugs (like, device toggles between direct route and routed connection causing issues with some command classes)
  • What this topic is about: @Sankotronic's system seems to be "a bug on the HC2 causing delays"
  • User code errors, like forgetting "sleep" in a loop.
  • This is new to me, what @rls46 mentioned.... Maybe there is an issue with the main database, causing it to get slower and slower over time. This would explain why "starting from scratch" seems to help in many cases.
  • "kill scenes bug" mentioned on this forum.

Yes, it can be all of them, but almost all of them were absent in 4.140, and Fibaro is silent.

How many countless hours of posting, searching, testing and trying have we all put in to fix this, when it all worked in 4.140?

Combined, I think, far more than Fibaro has.

 

Quote

If the next release fixes the bug discussed in this topic... Or the next... and *that* does NOT help... why don't we join forces to find out what is going on?

I think everyone on this forum wants to help solve the issues, everyone but Fibaro.

I, along with others, have sent countless e-mails, videos and debug files, and done everything support has asked us to fix or check in the systems. In the end it's not fixed, and finally we are ignored and left with questions and systems that have problems.

So, yeah, I think we all would love to join forces (and we are, as this forum shows, with all the great users trying to help), but Fibaro ignores us, as you have seen. So then what do we do?

 

Sorry if I come across as negative, but every update since November 2017 has had the same bug without a fix. It has been update after update, with us trying all we can to point out the problems, and now, 10 months later, it's still there.

How much more must we do?

I'm just saying it's not us, it's Fibaro that doesn't care at all, no matter how helpful we are. That is just insane!

How can Fibaro as a company not use this excellent talent and will to help here as an asset?

 

I'm so sick of being ignored, explaining to customers, working for free and, in the end, getting nothing.

How can I run a business installing and selling Fibaro products when they can't be trusted to work properly, or to get fixes delivered promptly?

It's just time to leave. This relationship is not a healthy one, even though there is some stuff I still really like.

 

Edited by speedy
  • Like 1
Posted

OK, so I gave Fibaro access to my customers' Fibaro systems (opened ports). That was 4 days ago (and the case was already open, so it is not a new ticket in the queue). So far no answer from support; it seems they really care that much (a customer opening ports is not an urgent situation, I guess).

 

If they looked at the systems, they might see the behavior, but again, it seems not to be too urgent :D

 

And about the issue with the mesh: no, this is not a mesh problem. I know that because I have a CIT connected to one of those systems, and there you can see the mesh. It is STRONG, and apart from remote controls there are basically no devices without multiple possible mesh routes.

Posted
17 minutes ago, HomeSystem.sk said:

And about the issue with MESH - no this is not a MESH problem.

Let's take that off the list of possible causes. BTW I don't think anyone claimed you have a "mesh" problem... Or did I miss something? Did Fibaro tell you to check this (I bet you checked it already before they asked).

 

 

Fibaro said they "saw" an issue on sanko's HC - and they are preparing a fix. Can't you wait until they release that fix? Or... I'm curious, is something telling you "this problem is different"?

 

Posted (edited)
It can be as simple as this: Fibaro's Z-Wave has trouble learning new routes. If a neighbour is bad, the Home Center should be able to learn new (faster) routes to reach a device.

There are fast and slow communicating devices, and not every controller adjusts the route.

 

 


Edited by akatar
Posted
3 hours ago, petergebruers said:

Fibaro said they "saw" an issue on sanko's HC - and they are preparing a fix. Can't you wait until they release that fix? Or... I'm curious, is something telling you "this problem is different"?

 

 

Oh brother, I wish I could be so optimistic. Let me tell you: at the end of last summer I reported an issue with the MCO thermostat. I got the MCO company to contact Fibaro too. It took me over a month to make Fibaro see the issue and acknowledge it (let's say that is the stage we are at now with this issue). They said: "Yes, we see the issue, let us fix it. It WILL be fixed before the heating season starts..." Long story short: the MCO thermostat issue is still not fixed...

So sadly, Fibaro acknowledging the issue is just one tiny step forward. Eventually, if we shut up, they will forget it, and a year later you will be explaining it again and again... It is sad, but it is true.

So I believe the issue is the same as sanko's, but I do need them to understand the importance of the problem, because I am standing in front of customers promising a fix; I am putting my name on the line here. We are in a referral business, sometimes sadly...

Posted (edited)
On 8/13/2018 at 2:30 PM, kroeatschge said:

interessting @rls46,

can you tell me how I can do a cleanup?

 

I checked with @rls46 - he's on holiday, so I answer in his place.

 

I also contacted Fibaro support and they can easily detect and clean the history table for you... As an end user you have no access. Cleaning the (raw) database yourself voids warranty.

 

If you suspect this is your problem, send me a PM. I am trying out a script to detect whether you have devices that log too much, but it is in "alpha" stage so I cannot publish it here. Or contact [email protected] and ask them to have a look at the "history records" of your db.

 

Cleaning the database does not stop it from growing, you'll have to find out which devices cause it to record (too much) data.
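To illustrate what "finding the devices that log too much" could look like in principle, here is a minimal sketch against a made-up SQLite table. It is not the alpha script mentioned above, and the table name `history` and its columns are assumptions for the example, not the actual HC2 schema.

```python
import sqlite3


def noisiest_devices(conn, table="history", limit=5):
    """Return [(device_id, row_count), ...], busiest device first.

    `table` is interpolated into the SQL, so it must be a trusted name.
    """
    sql = (
        f"SELECT device_id, COUNT(*) AS n FROM {table} "
        "GROUP BY device_id ORDER BY n DESC, device_id ASC LIMIT ?"
    )
    return conn.execute(sql, (limit,)).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE history (device_id INTEGER, ts INTEGER, value REAL)")
    # Hypothetical data: device 12 reports far more often than device 7.
    rows = [(12, t, 1.0) for t in range(500)] + [(7, t, 1.0) for t in range(20)]
    conn.executemany("INSERT INTO history VALUES (?, ?, ?)", rows)
    print(noisiest_devices(conn))  # [(12, 500), (7, 20)]
```

A device at the top of such a list would be the first candidate for a less chatty reporting configuration.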

 

Please don't jump to conclusions. We don't know whether what @rls46 found is the cause of @Sankotronic's problems, but I am trying to find out...

On 8/14/2018 at 6:23 PM, akatar said:

it can be as easy as that fibaro zwave has trouble with learning new routes, if a neighbour is bad then the homecenter should be able to learn new (faster) routes to reach a device.

there are fast and slow communictaing devices and not every controller adjust the route

 

I totally agree with you, but I'd like to point out after reading > 1000 pages of official docs, there is no single document explaining how "routing" actually works.

In fact, I think that even after piecing all the docs together, you still won't get a complete picture. Unless, of course, I am wrong and got lost in all the docs!

 

All nodes (of type "routing slave" but that is pretty much anything) participate in routing, it is not a privilege of the controller.

All nodes store a routing table.

Routing capabilities however depend somewhat on the age of the device.

 

I'd say "routing" is the secret sauce of Z-Wave.

 

For instance, a node can transmit data with or without a certain option:

 

4.3.3.1.4.4 TRANSMIT_OPTION_EXPLORE
The transmit option TRANSMIT_OPTION_EXPLORE MAY be used to enable dynamic route resolution.
Dynamic route resolution allows a node to discover new routes if all known routes are failing.
An explorer frame cannot wake up FLiRS nodes.
An explorer frame uses normal RF power level minus 6dB. This is also the power level used by a node
finding its neighbors.
The API function ZW_SetRoutingMAX MAY be used to specify the maximum number of routing attempts
based on routing table lookups to use before the Z-Wave protocol layer resorts to dynamic route
resolution.
A default value of five routing attempts SHOULD be used.
For backwards compatibility reasons, transmissions to nodes which do not support dynamic route
resolution will ignore the transmit option flag TRANSMIT_OPTION_EXPLORE.

 

In layman's terms... If a node sends data and sets the "explorer frames" option, the device will first try normal routing, then do a kind of "broadcast" aka "beam", which then hopefully gets picked up by the destination. The destination ACKs the reception by reversing the recorded route and applies some heuristic to tell itself "now I know a better route to X", and the ACK tells the sending node it worked.

 

Note that, according to the description, five "normal" routing attempts are made by default, so if all of them fail you'll get delays... Theoretically, the "beaming" should find a new route and the next transmission should be fast.
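To make the "five attempts, then explore" reasoning concrete, here is a toy model in Python. It is not an implementation of the Z-Wave protocol: the per-attempt timeout and the cost of dynamic route resolution are invented placeholder numbers, and `delivery_delay` is a hypothetical helper. Only the default of five routing attempts comes from the quoted text.

```python
def delivery_delay(route_ok, attempt_timeout_s, explore_s, max_routing_attempts=5):
    """Toy model of the delay before a frame gets through.

    route_ok: list of booleans, one per known route, in the order tried.
    Each failed attempt costs `attempt_timeout_s`; if every attempt fails,
    dynamic route resolution (explorer frame) adds `explore_s`.
    Returns (delay_seconds, used_explorer).
    """
    delay = 0.0
    for ok in route_ok[:max_routing_attempts]:
        if ok:
            return delay, False
        delay += attempt_timeout_s
    return delay + explore_s, True


if __name__ == "__main__":
    # Placeholder timings, chosen only to show the shape of the problem.
    print(delivery_delay([True], 0.5, 2.0))        # first route works
    print(delivery_delay([False] * 5, 0.5, 2.0))   # all five fail, then explore
```

Even with made-up numbers, the model shows why one stale routing table can turn an instant action into a delay of several seconds, while the next transmission may be fast again once a working route is known.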

 

You really need a Z-Wave Sniffer to diagnose this kind of stuff - and a small test setup

 

BTW beams are "special bursts of transmission", so while beaming is happening, network bandwidth is reduced. I cannot tell by how much. The docs give the pattern used and the duration, but in my network I rarely see such events (I have 2x FGT001, which are FLiRS devices, so I certainly see beaming, but it does not affect my network).

Edited by petergebruers
  • Like 1
Posted (edited)
Hi @rls46

This is interesting... thanks for sharing. I had a Fibaro installer look at my system a while back for an issue. They didn't find a solution but commented that the db looked very big. At the time I wondered whether this was a result of the many, many failed device inclusions I experienced when my system started to get close to 80 physical devices (my experience is well documented here), but I never got to the root of the issue, and I just stopped adding devices at around 110 to stabilise my solution.

 

i wonder could you check the following if you get time... ( or perhaps you know) 

1) When the HC2 backup completes does it truncate / delete the database transaction log ?

2) Is the database max size set to a fixed value / % of disk size or is it set to infinite? ( same question for the transaction log)

3) If the database is not set to infinite, do the various databases (assuming there is more than one database in the HC2) have a fixed filegrowth/autogrowth size, or is it a %?

 

thanks .....

 

Edited by AutoFrank
Posted

 

ahhhhggrr! How can I clear or compact the [censored] FIBARO SQLite db without accessing the OS filesystem at /opt/fibaro/db?! Or run "delete from new_device_history" or the VACUUM SQL command?

 

BTW, there is a shell script for clearing: /opt/fibaro/scripts/cleanDatabase.sh

but .... 

Posted (edited)
1 hour ago, AutoFrank said:

i wonder could you check the following if you get time... ( or perhaps you know) 

He can't (he is on holiday), but I'll take the lead from now on. He knows I am doing that.

 

1 hour ago, AutoFrank said:

truncate / delete the database transaction log ?

It is not about the transaction log. It is the size of the table that holds power reporting.

 

The biggest databases reported by installers fit on the disk, this is not an issue. But of course you have a valid question.

 

Please:

 

2 hours ago, petergebruers said:

 

If you suspect, this is your problem, send me a PM. I am trying a script to detect if you have devices that log too much, but it is in "alpha" stage so I cannot publish it here. Or contact [email protected] and ask them to have a look at the "history records" of your db.

 

1 hour ago, 10der said:

ahhhhggrr!  how I can clear or compact [censored] FIBARO SQLite db without accessing OS fs /opt/fibaro/db?! or run "delete from new_device_history" or VACUUM  SQL command?

See previous explanation.

Edited by petergebruers
Posted

Thanks @petergebruers

My comments were more me thinking out loud to stimulate more discussion... I think this may have long-term merit for some users.

They were also referring more to the database strategy than to an issue with a specific table... as the original comment was at the database level...

 

it sounds like there is just one database with many tables...

The transaction log is important because how and when the data gets written to the main database can impact the performance of the database. This can be the result of a backup or just normal destaging of the data.

The size is also important because, even if the database doesn't exceed the disk size, how it is managed can impact performance as well. If it is set to infinite or to a fixed size, this can be beneficial or limiting depending on what size of system it was originally spec'd for. If it is not unlimited and is governed by an initial size and a file growth setting/algorithm, this can also cause issues if not properly managed.

Dynamic file growth value: this can cause large file growth events that impact the ability to write to the database, which can manifest itself to the end user in many ways.

Static file growth: if the increment is small, this is better, as many small events impact the db less. If it is large, the events occur less often but could lock the db for multiple seconds...

 

If we could understand the db management strategy, it might help us understand how the db grows and shrinks under normal operation. How specific tables like history etc. are managed is a slightly different matter, but important as well...

 

 

 

Posted

thx @petergebruers,

for picking this up!

Meanwhile I worked out for myself that a bad mesh is the reason, as the lags only appear when the controller has to communicate with 3 newly added devices (two Dimmer 2 and a Single Switch). They are located outside the house.

I stumbled across the thread where you explained apparent routes versus real routes using a garage as the example. That helped a lot to understand that a straight line is not always the shortest route.

The route I thought was being taken is blocked by approx. 1 meter of concrete.

It's not fixed yet, but I'll be able to improve the routes by adding a Fibaro Wall Plug at a good location.

 

Additionally, I upgraded from 4.180 to 4.510 without any issues. So I'm still happy with the HC2.

 

The database just got my attention because I have been programming around databases for several years. And neither the smallest Access db nor the biggest Oracle db I have had my hands on was in good shape if not maintained, cleaned and compacted from time to time.

I pray to the Z-Wave gods that fibaro has this covered, and if not, covers it before too many users are affected.

 

Do you think it would be good practice to have debugging disabled when it is not in use, and additionally to remove everything from the event log that's not really needed?

 

br - kro

Posted
20 minutes ago, AutoFrank said:

my comments was more thinking out loud to stimulate more discussion...I think this may have long term merit for some users.

Right! Absolutely agree!

 

21 minutes ago, AutoFrank said:

it sounds like there is just one database with many tables...

The transaction log is important because how/when  the data gets written to the main database can impact the performance of the database.

I agree, I know what you are talking about. It is an SQLite 3 database (one file). The problem reported by @rls46 is due to the growth of one table, used for power reporting. This is not speculation. Fibaro does not like discussions that expose internals.

 

This can be discussed via PM only.

 

This is not about bad database design.

 

It is about all your devices posting power consumption.

 

10 minutes ago, kroeatschge said:

It's not fixed yet, but I'll be able to improve the routes with adding a fibaro wall plug at a good locati

That is good news and sounds like a plan.

 

11 minutes ago, kroeatschge said:

Do you think it would be good practice to have debuging disabled if not used. And additionaly remove everything from event log, that's not really needed.

Yes. Minimize debugging. Minimize reporting (power, lux, temp, humidity, ...) check device manual + event log...

 

Posted
5 minutes ago, petergebruers said:

 

  This is not about bad database design.

Design is just one part... I assume it is designed well, or at least was at one point in time. Database design needs to evolve in line with the application. On the surface, neither the web UI nor the apps have changed much since I started this journey about 3 years ago... but who knows what changes have been made under the hood. I think most companies are good at the design and at evolving it, as it is more to the fore in the mind of the developer, who needs to change and evolve tables and the overall schema.

 

Database management strategy (sizes, growth rates, indexes, lifecycle management of the data) is a very different story and a part that is often forgotten. It has a tendency to be set once and then never revisited... It can have awful consequences if not kept in check, particularly for a system that has the capability to log large amounts of transactional data (think IoT). Hopefully Fibaro has this part well under control...

 

I'm on holiday at the moment (hence my increased number of posts), but I may drop you a PM about your script when I get home... more information is always better :-)

 
