[jira] [Updated] (OFBIZ-10592) OutOfMemory and stuck JobPoller issue

Nicolas Malin (Jira)

     [ https://issues.apache.org/jira/browse/OFBIZ-10592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Giulio Speri updated OFBIZ-10592:
---------------------------------
    Attachment: OFBIZ-10592-trunkv18-OutOfMemory_ShoppingListServices.patch
                OFBIZ-10592-trunkv18-OutOfMemory_order_properties.patch
                OFBIZ-10592_OutOfMemory_order_properties.patch

> OutOfMemory and stuck JobPoller issue
> -------------------------------------
>
>                 Key: OFBIZ-10592
>                 URL: https://issues.apache.org/jira/browse/OFBIZ-10592
>             Project: OFBiz
>          Issue Type: Bug
>          Components: ALL APPLICATIONS
>    Affects Versions: Release Branch 13.07
>            Reporter: Giulio Speri
>            Assignee: Giulio Speri
>            Priority: Critical
>         Attachments: OFBIZ-10592-trunkv18-OutOfMemory_ShoppingListServices.patch, OFBIZ-10592-trunkv18-OutOfMemory_order_properties.patch, OFBIZ-10592_OutOfMemory_order_properties.patch, Screenshot from 2019-04-20 02-32-37.png, ShoppingListServices.patch, ShoppingListServices.patch, ShoppingListServices_patchv2.patch, alloc_tree_600k_12102018.png, jvm_ofbiz1_profi_telem.png, jvm_prof_ofbiz1_telem2.png, ofbiz1_jvm_profil_nojobpoller.png, order_properties.patch, order_properties_patchv2.patch, recorder_object_600k_12102018.png, telemetry_ovrl_600k_12102018.png
>
>
>  
> This installation is composed of two OFBiz instances (v13.07.03), served through an Apache Tomcat web server behind a load balancer.
> The database server is MariaDB.
>  
> The first problems appeared about three weeks ago, when front1 (OFBiz instance 1) suddenly stopped serving web requests; front2, instead, was still working correctly.
>  
> We checked the log files, of course, and saw that async services were failing; each failure was accompanied by this error line:
>  
>     Thread "AsyncAppender-async" java.lang.OutOfMemoryError: GC overhead limit exceeded
>  
> We analyzed the situation with our system specialists, and they told us that the application was putting heavy stress on the machine (CPU constantly at or near 100%, RAM usage rapidly increasing) until the JVM ran out of memory; "GC overhead limit exceeded" means the JVM was spending nearly all of its time in garbage collection while reclaiming almost no heap.
> This high resource consumption occurred only when the ofbiz1 instance was started with the JobPoller enabled; with the JobPoller disabled, OFBiz ran with low resource usage.
>  
> We then focused on the database, first of all to check its size; the result was disconcerting: 45 GB, mainly spread across four tables: SERVER_HIT (about 18 GB), VISIT (about 15 GB), ENTITY_SYNC_REMOVE (about 8 GB), and VISITOR (about 2 GB).
> All the other tables were in the order of a few MB each.
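>
> (For anyone who wants to reproduce the measurement, the per-table sizes can be read from information_schema; 'ofbiz' below is a placeholder for the actual schema name:)
>
>     SELECT table_name,
>            ROUND((data_length + index_length) / 1024 / 1024 / 1024, 1) AS size_gb
>     FROM information_schema.TABLES
>     WHERE table_schema = 'ofbiz'
>     ORDER BY (data_length + index_length) DESC
>     LIMIT 10;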
>  
> The first thing we did was clear all those tables, considerably reducing the database size.
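>
> (For anyone in the same spot, the cleanup amounts to statements along these lines; a sketch, assuming this history can really be thrown away. FOREIGN_KEY_CHECKS is toggled because InnoDB refuses to TRUNCATE a table that other tables reference:)
>
>     SET FOREIGN_KEY_CHECKS = 0;
>     TRUNCATE TABLE SERVER_HIT;
>     TRUNCATE TABLE VISIT;
>     TRUNCATE TABLE VISITOR;
>     TRUNCATE TABLE ENTITY_SYNC_REMOVE;
>     SET FOREIGN_KEY_CHECKS = 1;
>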
> After the cleaning we started ofbiz1 again with the JobPoller enabled; this caused a lot of old scheduled/queued jobs to execute.
> Apart from the start-up phase, the machine's resource usage stabilized at normal-to-low values (CPU 1-10%).
> OFBiz seemed to work (web requests were served), but we noticed that the JobPoller no longer scheduled or ran any jobs.
> The number of jobs in the "Pending" state in the JobSandbox entity was small (about 20); none were Queued, none Failed, none in other states.
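>
> (That breakdown comes from a query like the one below; JOB_SANDBOX is the table behind the JobSandbox entity, and the SERVICE_* values are the stock job statusIds:)
>
>     SELECT STATUS_ID, COUNT(*) AS jobs
>     FROM JOB_SANDBOX
>     GROUP BY STATUS_ID;
>     -- typical statusIds: SERVICE_PENDING, SERVICE_QUEUED, SERVICE_RUNNING,
>     -- SERVICE_FINISHED, SERVICE_FAILED, SERVICE_CRASHED, SERVICE_CANCELLED
>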
> On top of that, unfortunately, after a few hours the JVM ran out of memory again.
>  
> Our JVM has a maximum heap size of 20 GB (the machine has 32 GB of RAM), so it is not that small, I think.
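>
> (The relevant JVM options look roughly like this; the -Xms value and the dump path are illustrative, while the heap-dump flags are standard HotSpot options worth adding so that the next OOM leaves a dump behind:)
>
>     -Xms2g -Xmx20g
>     -XX:+HeapDumpOnOutOfMemoryError
>     -XX:HeapDumpPath=/var/tmp/ofbiz1-oom.hprof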
> Our next step is to set up the application locally against the same production database and see what happens.
>  
> Now that I have explained the situation, I would like to ask, in your opinion/experience:
>  
> Could the JobPoller component be the root (and only) cause of the JVM running out of memory?
>  
> Could this issue be related to OFBIZ-5710?
>  
> Could dumping and analyzing the JVM heap help in understanding what (or who) fills the memory, or would that operation be a waste of time?
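>
> (In case it is worth doing, the dump could be taken with the standard tooling, e.g. jmap against the running instance, and then opened in Eclipse MAT or VisualVM to inspect the dominator tree; the file path and pid below are placeholders:)
>
>     jmap -dump:live,format=b,file=/var/tmp/ofbiz1-heap.hprof <ofbiz1-pid>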
>  
> Is there something we did not consider or missed during the whole analysis process?
>  
>  
> I really thank you all for your attention and your help; any suggestion or advice would be greatly appreciated.
>  
> Kind regards,
> Giulio



--
This message was sent by Atlassian Jira
(v8.3.4#803005)