OFBiz OutOfMemory and stucked JobPoller issue

Giulio Speri - MpStyle Srl

OFBiz OutOfMemory and stucked JobPoller issue

Hello everyone,

I hope you're doing well.
I am writing because I am struggling with a rather strange problem on an
OFBiz installation for one of our customers.
The installation consists of two instances of OFBiz (v13.07.03), served
via an Apache Tomcat web server, along with a load balancer.
The database server is MariaDB.

The first problems appeared about 3 weeks ago, when front1 (OFBiz
instance 1) suddenly stopped serving web requests; front2, by contrast,
was still working correctly.

Obviously we checked the log files, and we saw that async services were
failing; the failure was accompanied by this error line:

*Thread "AsyncAppender-async" java.lang.OutOfMemoryError: GC overhead limit
exceeded*

We analyzed the situation with our system specialists, and they told us
that the application was putting heavy stress on machine resources (cpu
always at or near 100%, RAM usage rapidly increasing), until the jvm ran
out of memory.
This high resource consumption occurred only when the ofbiz1 instance
was started with the JobPoller enabled; if the JobPoller was not
enabled, OFBiz ran with low resource usage.

We then turned to the db, first of all to check its size; the result
was disconcerting: 45 GB, mostly spread across four tables: SERVER_HIT
(about 18 GB), VISIT (about 15 GB), ENTITY_SYNC_REMOVE (about 8 GB),
VISITOR (about 2 GB).
All the other tables were on the order of a few MB each.
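
As a rough illustration of the size check described above, the sketch below ranks tables by size and reports each one's share of the total. The figures are the approximate ones quoted in this message; the helper name `summarize_table_sizes` is hypothetical. On MariaDB the real per-table numbers could be read from `information_schema.TABLES` (`data_length + index_length`), which this sketch does not query.

```python
def summarize_table_sizes(sizes_gb, total_gb):
    """Return (table, size_gb, percent_of_total) rows, largest first."""
    rows = sorted(sizes_gb.items(), key=lambda item: item[1], reverse=True)
    return [(name, size, round(100.0 * size / total_gb, 1)) for name, size in rows]


# Approximate sizes quoted in the message above, in GB.
reported = {"SERVER_HIT": 18, "VISIT": 15, "ENTITY_SYNC_REMOVE": 8, "VISITOR": 2}
total_db_gb = 45

summary = summarize_table_sizes(reported, total_db_gb)
for name, size, pct in summary:
    print(f"{name:<20} {size:>3} GB  {pct:>5}%")

# Share of the whole database taken by the four largest tables together.
top_four_share = round(100.0 * sum(reported.values()) / total_db_gb, 1)
print("share of the four largest tables:", top_four_share, "%")
```

Running the same summary periodically would show whether the statistics tables start growing again after a cleanup.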

The first thing we did was to clear all those tables, considerably
reducing the db size.
After the cleanup, we tried to start ofbiz1 again with the JobPoller
component enabled; this caused a lot of old scheduled/queued jobs to
execute.
Apart from start-up time, the resource usage of the machine stabilized
around normal to low values (cpu 1-10%).
OFBiz seemed to work (web requests were served), but we noticed that the
JobPoller no longer scheduled or ran any jobs.
The number of jobs in the "Pending" state in the JobSandbox entity was
small (about 20); there were no Queued jobs, no Failed jobs, and no jobs
in other states.
In addition, unfortunately, after a few hours the jvm ran out of memory
again.

Our jvm has a maximum heap size of 20 GB (we have 32 GB on the machine),
so it's not that small, I think.
The next step we're going to take is to set up the application locally
against the same production db to see what happens.

Now that I have explained the situation, I would like to ask, in your
opinion/experience:

Could the JobPoller component be the root (and only) cause of the
OutOfMemory of the jvm?

Could this issue be related to
https://issues.apache.org/jira/browse/OFBIZ-5710?

Could dumping and analyzing the heap of the jvm help us understand what
or who is filling the memory, or would this operation be a waste of
time?

Is there something that we did not consider or that we missed during
the whole process of problem analysis?

I really thank you all for your attention and your help; any suggestion or
advice would really be greatly appreciated.

Kind regards,
Giulio

--
Giulio Speri

*Mp Styl**e Srl*
via Antonio Meucci, 37
41019 Limidi di Soliera (MO)
T 059/684916
M 334/3779851

www.mpstyle.it

taher

Re: OFBiz OutOfMemory and stucked JobPoller issue

This smells like a memory leak. The problem is how to pin down the exact
cause. It might be something in the network settings, the job poller, or
something else.

Perhaps the first thing to try, to narrow it down, is to run a profiler
against the JVM to see where the leak is happening. If you can narrow it
down to a class / method then it would be much easier to handle.

Arun Patidar-3

Re: OFBiz OutOfMemory and stucked JobPoller issue

Hello Giulio,

You can split your OFBiz instance into two instances[1], one for processing
synchronous web requests and another for processing only jobs (async
server). This way, you may be able to narrow down your analysis.

[1]:
https://www.hotwaxsystems.com/ofbiz/ofbiz-tutorials/apache-ofbiz-performance/

Kind Regards,

Arun Patidar
Director of Information Systems

*HotWax Commerce*
*Real OmniChannel. Real Results.*
m: +91 9827353082
w: www.hotwax.co

<https://www.linkedin.com/company/hotwaxcommerce/>
<https://www.facebook.com/HotWaxCommerce/>
<https://twitter.com/hotwaxcommerce>

grv

Re: OFBiz OutOfMemory and stucked JobPoller issue

In reply to this post by taher

Hi Giulio

Since JobPoller mostly deals with creating a ThreadPool and your
description says you seem to be stuck on the poller, it looks like there
is a possibility of hung threads resulting in OOM. But, again, this is
based on a cursory glance and might require further investigation based on
thread and heap dump analysis.

Too many hung threads can also result in OOM, and of course a memory leak
can play its part. But you will have a better idea once you have thread and
heap dumps. Do you see any hung-thread stack traces in the log files?

I think it would help to have the JVM properly configured to write a heap
dump as soon as the OOM occurs. With the team of specialists you have, I am
hoping you already have that configured. For the thread dump, you can always
ask the JVM (maybe with jconsole, jstack or visualvm) to produce one when
you see a spike in CPU usage.
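
On HotSpot JVMs the heap dump on OOM is normally enabled with the `-XX:+HeapDumpOnOutOfMemoryError` and `-XX:HeapDumpPath=<path>` start-up flags, and `jstack <pid>` prints a thread dump. As a minimal sketch of a first pass over such a dump, the Python snippet below counts threads per state. The sample text and thread names are hypothetical and only show the expected input shape; a real dump would be read from a file.

```python
import re
from collections import Counter

# Matches the state line that jstack prints under each Java thread header.
STATE_RE = re.compile(r"^\s*java\.lang\.Thread\.State:\s+(\w+)", re.MULTILINE)


def count_thread_states(dump_text):
    """Count threads per java.lang.Thread.State in a jstack-style dump."""
    return Counter(STATE_RE.findall(dump_text))


# Hypothetical excerpt, not taken from the system discussed in this thread.
sample_dump = '''
"worker-1" #21 prio=5 tid=0x01 nid=0x11 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
"worker-2" #22 prio=5 tid=0x02 nid=0x12 waiting for monitor entry
   java.lang.Thread.State: BLOCKED (on object monitor)
"worker-3" #23 prio=5 tid=0x03 nid=0x13 runnable
   java.lang.Thread.State: RUNNABLE
'''

states = count_thread_states(sample_dump)
for state, count in states.most_common():
    print(state, count)
```

Comparing the counts from several dumps taken a few seconds apart can hint at whether the same threads stay BLOCKED or WAITING, which is one sign of hung threads.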

It may also be premature to cast doubt on JobPoller only, for it is very
much possible for other components to play their part. Things can get a bit
tricky when you are dealing with such a large amount of data. So, it is
best to collect thread and heap dumps and then have them analysed with your
choice of tools to determine the root cause.

Thanks and Best regards,
Girish Vasmatkar
HotWax Systems

iwolf

Re: OFBiz OutOfMemory and stucked JobPoller issue

In reply to this post by Giulio Speri - MpStyle Srl

Hello Giulio,

I had a similar issue some time ago. My fix was to enable caching for Groovy scripts etc. in

framework/base/config/cache.properties.

There is a section that starts with: # Development Mode - comment these out to better cache groovy scripts, etc

After checking the heap analysis I found that for every Groovy script call a new instance of that class was created. Caching fixed it completely. Maybe it will work for you too. You will then have to clear the cache in OFBiz if you change something in your Groovy scripts.
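
One quick way to look for this pattern without a full heap dump is a class histogram, which `jmap -histo <pid>` prints for a running HotSpot JVM. The sketch below sums instance and byte counts for classes whose name contains a given substring. The helper name and the sample rows are hypothetical and only illustrate the column layout.

```python
import re

# A histogram row looks like: "   3:   12000   960000  some.class.Name"
ROW_RE = re.compile(r"^\s*\d+:\s+(\d+)\s+(\d+)\s+(\S+)")


def instances_matching(histogram_text, needle):
    """Sum instance and byte counts for classes whose name contains `needle`."""
    instances = total_bytes = 0
    for line in histogram_text.splitlines():
        match = ROW_RE.match(line)
        if match and needle.lower() in match.group(3).lower():
            instances += int(match.group(1))
            total_bytes += int(match.group(2))
    return instances, total_bytes


# Hypothetical histogram excerpt, not measured on any real system.
sample = """
 num     #instances         #bytes  class name
----------------------------------------------
   1:         52000        4160000  [C
   2:         30000         720000  java.lang.String
   3:         12000         960000  groovy.lang.MetaClassImpl
   4:          9000         720000  org.codehaus.groovy.reflection.CachedClass
"""

groovy_instances, groovy_bytes = instances_matching(sample, "groovy")
print("groovy-related instances:", groovy_instances, "bytes:", groovy_bytes)
```

If the Groovy-related counts keep climbing between two histograms taken some minutes apart, that would be consistent with scripts being recompiled on every call.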

Best regards
Ingo

taher

Re: OFBiz OutOfMemory and stucked JobPoller issue

I think it would be much better if we could find a root-cause fix instead of
altering settings. I would be interested in helping to dig out
whatever is causing the memory leak if Giulio is interested in opening a
JIRA for this and investigating further.

Arun Patidar-3

Re: OFBiz OutOfMemory and stucked JobPoller issue

In reply to this post by Giulio Speri - MpStyle Srl

Hello Giulio,

For the issue of large Visit and ServerHit data, you can choose either of
the options below:

1) Use a separate DB for group="org.apache.ofbiz.stats" entities and clean
up this DB regularly
2) If you are not using ServerHit and Visit data for analytical purposes,
then simply disable visits.

Kind Regards,

Arun Patidar
Director of Information Systems

*HotWax Commerce*
*Real OmniChannel. Real Results.*
m: +91 9827353082

iwolf

Re: OFBiz OutOfMemory and stucked JobPoller issue

In reply to this post by taher

The root cause on my side was a Groovy service that was called thousands of times with the caching of Groovy scripts disabled. If you do a heap analysis, check the count of Groovy instances. It was fixed by changing the caching of Groovy scripts. Maybe that saves you some time.

Giulio Speri - MpStyle Srl

Re: OFBiz OutOfMemory and stucked JobPoller issue

Hi everyone,

first of all, I'd like to thank you very much for the quick responses and
the suggestions you gave me.
We are absolutely determined to investigate this issue more deeply and find
the root cause, because we strongly believe in the OFBiz framework and for
us it's becoming a core part of our business.

To answer @Arun:
regarding the large table sizes: at the moment we're using the Visit table
(along with other tables) for Business Intelligence purposes, so what we
did and what we will do, in agreement with the customer, is to keep those
tables smaller, keeping records for the last XX days only and deleting
everything older than that number of days (with a scheduled weekly
service, for example); this should keep the size of those tables
acceptable and at the same time give good data for BI analysis.
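
The retention idea above can be sketched as follows. This is only an illustration under stated assumptions: the retention value of 30 days is a placeholder (the message leaves it as XX), the helper names are hypothetical, and the column names `HIT_START_DATE_TIME` and `FROM_DATE` are assumed to be the date columns of `SERVER_HIT` and `VISIT` and should be checked against the actual schema. The `%s` marker is the bind-parameter style used by common Python MySQL/MariaDB drivers.

```python
from datetime import datetime, timedelta


def retention_cutoff(now, retention_days):
    """Return the timestamp before which rows are considered expired."""
    if retention_days <= 0:
        raise ValueError("retention_days must be positive")
    return now - timedelta(days=retention_days)


def build_purge_statement(table, date_column, batch_size):
    """Build a batched DELETE; the cutoff is bound as a parameter at run time.

    Table and column names must come from a fixed, trusted list,
    never from user input, because they are interpolated into the SQL.
    """
    return (f"DELETE FROM {table} WHERE {date_column} < %s "
            f"LIMIT {int(batch_size)}")


# Placeholder values for illustration only.
now = datetime(2018, 9, 21, 12, 0, 0)
cutoff = retention_cutoff(now, retention_days=30)

# Assumed date columns; purge SERVER_HIT first in case it references VISIT.
targets = [("SERVER_HIT", "HIT_START_DATE_TIME"), ("VISIT", "FROM_DATE")]
statements = [build_purge_statement(t, c, batch_size=10000) for t, c in targets]

print("cutoff:", cutoff.isoformat())
for sql in statements:
    print(sql)
```

Deleting in limited batches, repeated until no rows are affected, keeps each transaction short, which matters on tables of the size mentioned earlier in the thread.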

Regarding the split service configuration, our installation is already
set up in a very similar way: we have front1 (ofbiz1 instance), where the
JobPoller is enabled, and a second front (ofbiz2 instance), where it is
disabled. Web requests, however, are served by both instances.
Indeed, the problems we're having are limited to front1 only (but we did
not test whether enabling the JobPoller on front2 produces the same
behaviour).

@Taher We gladly accept your offer of help, so should I proceed with the
creation of a Jira task?

Thank you very much,

Giulio

Arun Patidar-3

Re: OFBiz OutOfMemory and stucked JobPoller issue

Managing Visit and ServerHit data in a separate DB is very easy, and
deleting old records will be faster and will not affect the main DB.

In my experience, a dedicated server only for processing jobs performs
well.

Apart from this, the root cause should be identified and fixed, as Taher
mentioned.

Kind Regards,

Arun Patidar
Director of Information Systems

*HotWax Commerce*
*Real OmniChannel. Real Results.*
m: +91 9827353082

Pritam Kute

Re: OFBiz OutOfMemory and stucked JobPoller issue

We faced a similar issue in one of our custom projects. Some time after each
restart, the server's memory filled up and the server stopped responding to
web requests. We also had the problem of the job poller not picking up new
jobs for execution. We took the following steps, and the server is now very
stable:
-- Followed the sync-async architecture, as Arun mentioned
-- Disabled server-hit and visit tracking, as it was not used in the system
-- Analysed the code-level issues using thread dumps and fixed the affected
code sections.
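
As a rough illustration of the thread-dump step, the standard `java.lang.management` API can take a dump from inside the JVM and summarise it. This is only a sketch under assumptions: the class name `ThreadDumpSummary` is hypothetical, and in practice the dump is often taken externally (for example with the JDK's `jstack` tool) rather than in-process.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.Map;

/** Hypothetical helper: summarise an in-process thread dump. */
final class ThreadDumpSummary {

    /** Takes a dump of all live threads, without lock details to keep it cheap. */
    static ThreadInfo[] dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        return mx.dumpAllThreads(false, false);
    }

    /** Counts threads per state; many BLOCKED or WAITING threads hint at contention or a stalled pool. */
    static Map<Thread.State, Integer> countByState(ThreadInfo[] infos) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) {
                continue; // defensive: skip entries for threads that no longer exist
            }
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ThreadInfo[] infos = dump();
        System.out.println("Threads by state: " + countByState(infos));

        // findDeadlockedThreads() returns null when no deadlock is detected.
        long[] deadlocked = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        System.out.println("Deadlocked threads: " + (deadlocked == null ? 0 : deadlocked.length));

        // Print the (truncated) stack of every blocked thread for closer inspection.
        for (ThreadInfo info : infos) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                System.out.print(info);
            }
        }
    }
}
```

Comparing a few such summaries taken some minutes apart can show whether the same threads stay blocked at the same code location.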

Thanks and Regards
--
Pritam Kute
