
Heavy load on some nodes in Data Center even though nobody is using them

Murugan Mittapalli
Contributor
July 22, 2020

Hi, 

This is a brand-new Jira instance in a clustered environment. I see a heavy load on node-2 even though nobody is using that node, while the other nodes (node-0 and node-1) have barely 5 users between them.

 

Thank you.

2020-07-22 21:26:37,882+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] [scheduled] Running cache replication queue stats for: 20 queues...
2020-07-22 21:26:37,886+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-0 snapshot stats: {"timestampMillis":1595453197884,"nodeId":"node-0","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595453137884,"startMillisAgo":60000,"closeCounter":0,"addCounter":2,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":2,"removeCounter":2,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453144276,"lastAddMillisAgo":53608,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453144285,"lastPeekOrBlockMillisAgo":53599,"lastRemoveTimestampMillis":1595453144304,"lastRemoveMillisAgo":53580,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":2,"min":0,"max":3,"sum":6,"avg":3},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":2,"min":0,"max":120026,"sum":240010,"avg":120005},"timeToRemoveMillis":{"count":2,"min":0,"max":2,"sum":4,"avg":2},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":2,"droppedOnSendCounter":0,"timeToSendMillis":{"count":2,"min":0,"max":19,"sum":35,"avg":17},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:26:37,886+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-0 total stats: {"timestampMillis":1595453197884,"nodeId":"node-0","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595445623007,"startMillisAgo":7574877,"closeCounter":0,"addCounter":420,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":420,"removeCounter":420,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453144276,"lastAddMillisAgo":53608,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453144285,"lastPeekOrBlockMillisAgo":53599,"lastRemoveTimestampMillis":1595453144304,"lastRemoveMillisAgo":53580,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":420,"min":0,"max":21,"sum":1590,"avg":3},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":420,"min":0,"max":419993,"sum":30142360,"avg":71767},"timeToRemoveMillis":{"count":420,"min":0,"max":12,"sum":901,"avg":2},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":420,"droppedOnSendCounter":0,"timeToSendMillis":{"count":420,"min":0,"max":583,"sum":8792,"avg":20},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:26:37,887+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-1 snapshot stats: {"timestampMillis":1595453197887,"nodeId":"node-1","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595453137884,"startMillisAgo":60003,"closeCounter":0,"addCounter":2,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":2,"removeCounter":2,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453144272,"lastAddMillisAgo":53615,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453144275,"lastPeekOrBlockMillisAgo":53612,"lastRemoveTimestampMillis":1595453144295,"lastRemoveMillisAgo":53592,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":2,"min":0,"max":5,"sum":8,"avg":4},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":2,"min":0,"max":120018,"sum":240004,"avg":120002},"timeToRemoveMillis":{"count":2,"min":0,"max":3,"sum":5,"avg":2},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":2,"droppedOnSendCounter":0,"timeToSendMillis":{"count":2,"min":0,"max":19,"sum":37,"avg":18},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:26:37,888+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-1 total stats: {"timestampMillis":1595453197887,"nodeId":"node-1","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595445622965,"startMillisAgo":7574922,"closeCounter":0,"addCounter":420,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":420,"removeCounter":420,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453144272,"lastAddMillisAgo":53615,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453144275,"lastPeekOrBlockMillisAgo":53612,"lastRemoveTimestampMillis":1595453144295,"lastRemoveMillisAgo":53592,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":420,"min":0,"max":52,"sum":2069,"avg":4},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":420,"min":0,"max":419992,"sum":30144121,"avg":71771},"timeToRemoveMillis":{"count":420,"min":0,"max":17,"sum":1034,"avg":2},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":420,"droppedOnSendCounter":0,"timeToSendMillis":{"count":420,"min":0,"max":586,"sum":7244,"avg":17},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:26:37,888+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] [scheduled] ... done running cache replication queue stats for: 20 queues.
2020-07-22 21:27:31,408+0000 NodeReindexServiceThread:thread-1 INFO [c.a.j.index.ha.DefaultNodeReindexService] Node replay index operations stats: nodeId=node-2, numberOfOperations=0, timeToReplay=629ms, errors=0, period=5.084 min
2020-07-22 21:27:37,882+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] [scheduled] Running cache replication queue stats for: 20 queues...
2020-07-22 21:27:37,886+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-0 snapshot stats: {"timestampMillis":1595453257883,"nodeId":"node-0","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595453197883,"startMillisAgo":60000,"closeCounter":0,"addCounter":2,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":2,"removeCounter":2,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453204289,"lastAddMillisAgo":53594,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453204290,"lastPeekOrBlockMillisAgo":53593,"lastRemoveTimestampMillis":1595453204304,"lastRemoveMillisAgo":53579,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":2,"min":0,"max":5,"sum":8,"avg":4},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":2,"min":0,"max":119990,"sum":239956,"avg":119978},"timeToRemoveMillis":{"count":2,"min":0,"max":4,"sum":6,"avg":3},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":2,"droppedOnSendCounter":0,"timeToSendMillis":{"count":2,"min":0,"max":12,"sum":22,"avg":11},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:27:37,886+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-0 total stats: {"timestampMillis":1595453257884,"nodeId":"node-0","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595445623007,"startMillisAgo":7634877,"closeCounter":0,"addCounter":422,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":422,"removeCounter":422,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453204289,"lastAddMillisAgo":53595,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453204290,"lastPeekOrBlockMillisAgo":53594,"lastRemoveTimestampMillis":1595453204304,"lastRemoveMillisAgo":53580,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":422,"min":0,"max":21,"sum":1598,"avg":3},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":422,"min":0,"max":419993,"sum":30382316,"avg":71996},"timeToRemoveMillis":{"count":422,"min":0,"max":12,"sum":907,"avg":2},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":422,"droppedOnSendCounter":0,"timeToSendMillis":{"count":422,"min":0,"max":583,"sum":8814,"avg":20},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:27:37,888+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-1 snapshot stats: {"timestampMillis":1595453257886,"nodeId":"node-1","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595453197883,"startMillisAgo":60003,"closeCounter":0,"addCounter":2,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":2,"removeCounter":2,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453204283,"lastAddMillisAgo":53603,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453204285,"lastPeekOrBlockMillisAgo":53601,"lastRemoveTimestampMillis":1595453204301,"lastRemoveMillisAgo":53585,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":2,"min":0,"max":5,"sum":10,"avg":5},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":2,"min":0,"max":119989,"sum":239957,"avg":119978},"timeToRemoveMillis":{"count":2,"min":0,"max":4,"sum":6,"avg":3},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":2,"droppedOnSendCounter":0,"timeToSendMillis":{"count":2,"min":0,"max":13,"sum":23,"avg":11},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:27:37,889+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-1 total stats: {"timestampMillis":1595453257887,"nodeId":"node-1","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595445622965,"startMillisAgo":7634922,"closeCounter":0,"addCounter":422,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":422,"removeCounter":422,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453204283,"lastAddMillisAgo":53604,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453204285,"lastPeekOrBlockMillisAgo":53602,"lastRemoveTimestampMillis":1595453204301,"lastRemoveMillisAgo":53586,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":422,"min":0,"max":52,"sum":2079,"avg":4},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":422,"min":0,"max":419992,"sum":30384078,"avg":72000},"timeToRemoveMillis":{"count":422,"min":0,"max":17,"sum":1040,"avg":2},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":422,"droppedOnSendCounter":0,"timeToSendMillis":{"count":422,"min":0,"max":586,"sum":7267,"avg":17},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:27:37,889+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] [scheduled] ... done running cache replication queue stats for: 20 queues.
2020-07-22 21:28:00,015+0000 Caesium-1-3 DEBUG ServiceRunner [c.a.j.web.filters.ThreadLocalQueryProfiler] RESULT GROUP: OfBizDelegator
2020-07-22 21:28:00,016+0000 Caesium-1-3 DEBUG ServiceRunner [c.a.j.web.filters.ThreadLocalQueryProfiler] 1:3ms findByPrimaryKey [3]
2020-07-22 21:28:00,017+0000 Caesium-1-3 DEBUG ServiceRunner [c.a.j.web.filters.ThreadLocalQueryProfiler] OfBizDelegator: 1 keys (1 unique) took 3ms/4ms : 75.0% 3ms/query avg.
2020-07-22 21:28:00,017+0000 Caesium-1-3 DEBUG ServiceRunner [c.a.j.web.filters.ThreadLocalQueryProfiler]
2020-07-22 21:28:00,017+0000 Caesium-1-3 DEBUG ServiceRunner [c.a.j.web.filters.ThreadLocalQueryProfiler] PROFILED : 1 keys (1 unique) took 3ms/4ms : 75.0% 3ms/query avg.
2020-07-22 21:28:37,882+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] [scheduled] Running cache replication queue stats for: 20 queues...
2020-07-22 21:28:37,886+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-0 snapshot stats: {"timestampMillis":1595453317884,"nodeId":"node-0","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595453257883,"startMillisAgo":60001,"closeCounter":0,"addCounter":2,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":2,"removeCounter":2,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453264281,"lastAddMillisAgo":53603,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453264284,"lastPeekOrBlockMillisAgo":53600,"lastRemoveTimestampMillis":1595453264303,"lastRemoveMillisAgo":53581,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":2,"min":0,"max":3,"sum":6,"avg":3},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":2,"min":0,"max":59982,"sum":119961,"avg":59980},"timeToRemoveMillis":{"count":2,"min":0,"max":4,"sum":6,"avg":3},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":2,"droppedOnSendCounter":0,"timeToSendMillis":{"count":2,"min":0,"max":16,"sum":28,"avg":14},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}
2020-07-22 21:28:37,886+0000 localq-stats-0 INFO [c.a.j.c.distribution.localq.LocalQCacheManager] Cache replication queue stats per node: node-0 total stats: {"timestampMillis":1595453317884,"nodeId":"node-0","queueSize":0,"startQueueSize":0,"startTimestampMillis":1595445623007,"startMillisAgo":7694877,"closeCounter":0,"addCounter":424,"droppedOnAddCounter":0,"criticalAddCounter":0,"criticalPeekCounter":0,"criticalRemoveCounter":0,"peekCounter":0,"peekOrBlockCounter":424,"removeCounter":424,"backupQueueCounter":0,"closeErrorsCounter":0,"addErrorsCounter":0,"peekErrorsCounter":0,"peekOrBlockErrorsCounter":0,"removeErrorsCounter":0,"backupQueueErrorsCounter":0,"lastAddTimestampMillis":1595453264281,"lastAddMillisAgo":53603,"lastPeekTimestampMillis":0,"lastPeekMillisAgo":0,"lastPeekOrBlockTimestampMillis":1595453264284,"lastPeekOrBlockMillisAgo":53600,"lastRemoveTimestampMillis":1595453264303,"lastRemoveMillisAgo":53581,"lastBackupQueueTimestampMillis":0,"lastBackupQueueMillisAgo":0,"timeToAddMillis":{"count":424,"min":0,"max":21,"sum":1604,"avg":3},"timeToPeekMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"timeToPeekOrBlockMillis":{"count":424,"min":0,"max":419993,"sum":30502277,"avg":71939},"timeToRemoveMillis":{"count":424,"min":0,"max":12,"sum":913,"avg":2},"timeToBackupQueueMillis":{"count":0,"min":0,"max":0,"sum":0,"avg":0},"staleCounter":0,"sendCounter":424,"droppedOnSendCounter":0,"timeToSendMillis":{"count":424,"min":0,"max":583,"sum":8842,"avg":20},"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,"sendNotBoundExceptionCounter":0}

Cluster load.PNG

Answer accepted
Nic Brough -Adaptavist-
Rising Star
July 22, 2020

I'm afraid there's no way to tell what might be imposing load without monitoring what requests are landing on it and what it is doing.
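One cheap first step is to rule things out from the logs already posted. The sketch below is only an illustration: it assumes the "Cache replication queue stats per node" line layout shown in the question, and the function names (`parse_stats_line`, `summarize`) and the shortened sample line are made up for the example. It pulls out the queue size, add count, send errors and average send time per node, which at least tells you whether cache replication is backing up.

```python
import json
import re

# Assumed layout, taken from the log excerpt in the question:
#   "... Cache replication queue stats per node: <nodeId> <snapshot|total> stats: {json}"
STATS_LINE = re.compile(
    r"Cache replication queue stats per node: (\S+) (snapshot|total) stats: (\{.*\})\s*$"
)


def parse_stats_line(line):
    """Return (node_id, kind, stats_dict) for a stats line, or None for any other line."""
    match = STATS_LINE.search(line)
    if match is None:
        return None
    node_id, kind, payload = match.groups()
    return node_id, kind, json.loads(payload)


def summarize(lines):
    """Collect a few health indicators from every stats line found in `lines`."""
    rows = []
    for line in lines:
        parsed = parse_stats_line(line)
        if parsed is None:
            continue
        node_id, kind, stats = parsed
        send_errors = (
            stats.get("sendRuntimeExceptionCounter", 0)
            + stats.get("sendCheckedExceptionCounter", 0)
            + stats.get("sendNotBoundExceptionCounter", 0)
        )
        rows.append({
            "node": node_id,
            "kind": kind,
            "queueSize": stats.get("queueSize", 0),
            "addCounter": stats.get("addCounter", 0),
            "sendErrors": send_errors,
            "avgSendMillis": stats.get("timeToSendMillis", {}).get("avg", 0),
        })
    return rows


# Hypothetical sample: the first node-0 snapshot line from the question,
# abbreviated so that only the fields read above are kept.
SAMPLE = (
    '2020-07-22 21:26:37,886+0000 localq-stats-0 INFO '
    '[c.a.j.c.distribution.localq.LocalQCacheManager] '
    'Cache replication queue stats per node: node-0 snapshot stats: '
    '{"nodeId":"node-0","queueSize":0,"addCounter":2,'
    '"timeToSendMillis":{"count":2,"min":0,"max":19,"sum":35,"avg":17},'
    '"sendRuntimeExceptionCounter":0,"sendCheckedExceptionCounter":0,'
    '"sendNotBoundExceptionCounter":0}'
)

if __name__ == "__main__":
    for row in summarize([SAMPLE]):
        print(row)
```

An empty queue with no send errors, as in the excerpt, would suggest replication between nodes is keeping up, so the load is probably coming from somewhere else. It does not say where, which is why you still need to monitor the node itself.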

As an example, one way to generate this sort of pattern is to have a node that is underpowered without your load balancer knowing it.  Imagine a DC setup where 3 nodes are running on 16GB quad-core blades and node 4 is a Raspberry Pi.  Is your load balancer aware that it should not send user traffic to node 4, because it's only there for backup and replicated reporting?

That is an extreme example (but a real life one!).  Could the load on node-2 look high because that node is less powerful than the others?
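To make that comparison fair, it can help to look at load relative to each node's capacity rather than the raw figure. This is a rough sketch, assuming the "load" on your chart is something like the operating system's load average. The helper name `load_per_core` is made up for the example, and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def load_per_core(load_average, cpu_count):
    """Normalise a load average by the number of CPU cores on the node."""
    if cpu_count < 1:
        raise ValueError("cpu_count must be at least 1")
    return load_average / cpu_count


if __name__ == "__main__":
    # Run this on each node and compare the results side by side.
    one_minute, _five_minute, _fifteen_minute = os.getloadavg()
    cores = os.cpu_count() or 1  # cpu_count() can return None
    print(f"1-min load {one_minute:.2f} on {cores} cores "
          f"= {load_per_core(one_minute, cores):.2f} per core")
```

The same raw load of 4.0 works out to 1.0 per core on a quad-core machine but 4.0 per core on a single-core one, so a smaller node can look far busier while doing the same background work.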

Murugan Mittapalli
Contributor
July 24, 2020

Thanks for your response, Nic. Our team is working on it.

 

Thanks once again.
