Tomcat Unresponsive

darrell73s
Hi All,

I've been seeing some strange behavior on both a test system (with little to no volume) and a live system, in a custom OFBiz 12.04.02 application that I created. Note that the custom application never uses raw connections; everything is done through the entity engine and framework.

I'll briefly explain the architecture, which applies to both the TEST and LIVE systems; more details can be provided as needed. I have a PostgreSQL database which is shared between two different instances of OFBiz. One instance (1) has certain webapps running, and another instance (2) has a different set of webapps running. We run Apache 2.4 in front of each instance, and Apache is connected to the OFBiz Tomcat via AJP using mod_jk. I've also ensured that only one instance's job manager is actually processing jobs, to keep table locking to a minimum.

The strange part is that attempts to connect to OFBiz instance 1 frequently find Tomcat unresponsive. The problem is in reaching Tomcat itself: mod_jk reports errors about not being able to connect to the AJP port, and because I also run the HTTP connector, I can try connecting locally on the machine (e.g. http://localhost:8080/webtools, bypassing Apache 2.4 completely), where the browser hangs and I never receive a response. The other OFBiz instance appears to be unaffected. I mention it only in case there is some multi-instance interaction that I'm not aware of.

Generally, the only fix is to restart the application. It's fine for a while, but the same behavior returns before long. I ran jVisualVM, took a thread dump, and noticed threads blocked on a lock in GenericObjectPool.borrowObject(), which is in the call stack that retrieves database connections from the pool. Figuring it might be a database pooling issue, I ran with the DebugManagedDataSource in the DBCPConnectionFactory, but the information it provided didn't indicate any problems.
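For anyone wanting to spot this kind of lock contention without reading a full dump by eye, here is a minimal sketch using the standard java.lang.management API. It lists every thread in the BLOCKED state together with the lock it is waiting for and the thread that holds it. The class name BlockedThreadReport is made up for illustration, and the sketch only inspects the JVM it runs in; it is not part of OFBiz.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

/** Lists threads that are BLOCKED on a monitor, with the thread that owns it. */
final class BlockedThreadReport {

    /** Returns one line per BLOCKED thread: "waiter -> lock held by owner". */
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<String>();
        // Lock name and lock owner are reported even without the optional
        // locked-monitor / locked-synchronizer details, so both flags stay false.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info != null && info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName() + " -> " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    public static void main(String[] args) {
        for (String line : blockedThreads()) {
            System.out.println(line);
        }
    }
}
```

Against a separate JVM that is already hung, the same information comes from a regular thread dump (jstack or jVisualVM), as was done here.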

Interestingly enough, the LIVE system (which users are constantly using) lasts longer without needing restarts than the TEST system, which has little to no volume.

Have any of you seen anything like this? Given that I see the issue even with Apache out of the picture, could there be a misconfiguration in OFBiz?

Thanks!
Darrell

Re: Tomcat Unresponsive

Jacopo Cappellato-4
Hi Darrell,

This is just a shot in the dark, but it may be that connections that sit unused in the pool for some time are terminated by the DB server and are no longer valid.
It may be a configuration issue, as this should not happen. It would be useful if you could share the thread dump.

Jacopo

On Nov 26, 2014, at 9:36 PM, darrell73s <[hidden email]> wrote:

> [...]

Re: Tomcat Unresponsive

Mike Z
In reply to this post by darrell73s
My guess is maybe the JDK. Which Java version are you running, and with what settings
are you launching OFBiz?

------------------------------
*From:* "darrell73s" <[hidden email]>
*To:* "[hidden email]" <[hidden email]>
*Sent:* November 26, 2014 12:45 PM
*Subject:* Tomcat Unresponsive

[...]

Re: Tomcat Unresponsive

darrell73s
Hi folks,

As always, thanks for your input.

Jacopo:

That's interesting. I was wondering the same thing, and it seems to make sense based on the dump. I should also have noted that the network I'm operating on is a bit volatile, so it is quite possible that either the DB server (PostgreSQL in my case) is closing the connections or some other external interference (e.g. a connection drop) is at work. In some of my non-OFBiz apps I set a hard connection age so that the pool reacquires connections often, even if they have not been idle for an extended period. I've studied the javadocs for DBCP 1.4 and the commons-pool GenericObjectPool used in the DBCP connection factory, but have not found a similar option in 1.4 for setting a connection age after which a connection is reacquired. This does seem to be available via the maxConnLifetimeMillis setting in newer versions.

In my 12.04.02 copy of DBCPConnectionFactory the only options being set on the pool (which are pulled from entityengine.xml) are:
- timeBetweenEvictionRunsMillis
- maxActive
- maxIdle
- minIdle
- maxWait

I suppose I could manually set the "testOnBorrow" flag to confirm or rule out this cause (I'm not sure how much of a performance hit it would be if it turns out to alleviate the problem). Based on your knowledge of the framework and DBCP configuration, do you have any other suggestions for settings that could help confirm that a connection "gone bad" is the cause?
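To make the testOnBorrow idea concrete, here is a toy pool in plain Java that validates each idle object before handing it out and discards the ones that fail. It is only a sketch of the technique: ValidatingPool, Factory, giveBack and discardedCount are hypothetical names, and this is not how commons-pool or DBCPConnectionFactory is implemented.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy pool illustrating validate-on-borrow; NOT the commons-pool implementation. */
final class ValidatingPool<T> {

    /** Hypothetical callbacks, loosely modelled on a pool's object factory. */
    interface Factory<T> {
        T create();
        boolean isValid(T candidate);
    }

    private final Deque<T> idle = new ArrayDeque<T>();
    private final Factory<T> factory;
    private int discarded;

    ValidatingPool(Factory<T> factory) {
        this.factory = factory;
    }

    /** Hands out an idle object only if it passes validation; otherwise drops it. */
    synchronized T borrow() {
        T candidate;
        while ((candidate = idle.pollFirst()) != null) {
            if (factory.isValid(candidate)) {
                return candidate;
            }
            discarded++; // stale object: drop it and try the next idle one
        }
        return factory.create(); // nothing usable was idle, so make a new one
    }

    /** Returns an object to the idle list. */
    synchronized void giveBack(T object) {
        idle.addFirst(object);
    }

    /** Number of idle objects dropped because validation failed. */
    synchronized int discardedCount() {
        return discarded;
    }
}
```

Note that validation here runs while the pool lock is held, so a validation call that hangs would stall every other borrower. That matches the shape of the dump below, where exec-37 holds the GenericObjectPool monitor inside activateObject while exec-38 waits for it.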

I've attached the pertinent thread dump content below.

Mike Z:

I'm running JDK 6 update 45 (64-bit). When OFBiz is launched, the only JVM setting applied is the maximum heap size (i.e. pos.memory.max.param=-Xmx4096M).


Partial Threaddump:

"ajp-bio-0.0.0.0-8009-exec-38" daemon prio=10 tid=0x000000005c11f000 nid=0x561a waiting for monitor entry [0x00002b0ab9959000]
   java.lang.Thread.State: BLOCKED (on object monitor)
        at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:781)
        - waiting to lock <0x0000000700f02318> (a org.apache.commons.pool.impl.GenericObjectPool)
        at org.apache.commons.dbcp.managed.ManagedConnection.updateTransactionStatus(ManagedConnection.java:117)
        at org.apache.commons.dbcp.managed.ManagedConnection.<init>(ManagedConnection.java:55)
        at org.apache.commons.dbcp.managed.ManagedDataSource.getConnection(ManagedDataSource.java:77)
        at org.ofbiz.entity.connection.DBCPConnectionFactory.getConnection(DBCPConnectionFactory.java:66)
        at org.ofbiz.entity.jdbc.ConnectionFactory.getManagedConnection(ConnectionFactory.java:135)
        at org.ofbiz.geronimo.GeronimoTransactionFactory.getConnection(GeronimoTransactionFactory.java:83)
        at org.ofbiz.entity.transaction.TransactionFactory.getConnection(TransactionFactory.java:97)
        at org.ofbiz.entity.jdbc.ConnectionFactory.getConnection(ConnectionFactory.java:85)
        at org.ofbiz.entity.jdbc.SQLProcessor.getConnection(SQLProcessor.java:250)
        at org.ofbiz.entity.jdbc.SQLProcessor.prepareStatement(SQLProcessor.java:356)
        at org.ofbiz.entity.jdbc.SQLProcessor.prepareStatement(SQLProcessor.java:340)
        at org.ofbiz.entity.datasource.GenericDAO.select(GenericDAO.java:532)
        at org.ofbiz.entity.datasource.GenericDAO.select(GenericDAO.java:503)
        at org.ofbiz.entity.datasource.GenericHelperDAO.findByPrimaryKey(GenericHelperDAO.java:85)
        at org.ofbiz.entity.GenericDelegator.findOne(GenericDelegator.java:1572)
        at org.ofbiz.entity.GenericDelegator.findOne(GenericDelegator.java:1538)
        at org.ofbiz.webapp.stats.VisitHandler.getVisitor(VisitHandler.java:245)
        - locked <0x00000007b1206a68> (a org.apache.catalina.session.StandardSessionFacade)
        at org.ofbiz.webapp.control.LoginWorker.setWebContextObjects(LoginWorker.java:509)
        at org.ofbiz.webapp.control.LoginWorker.doBasicLogout(LoginWorker.java:639)
        at org.ofbiz.webapp.control.LoginWorker.logout(LoginWorker.java:585)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.ofbiz.webapp.event.JavaEventHandler.invoke(JavaEventHandler.java:93)
        at org.ofbiz.webapp.event.JavaEventHandler.invoke(JavaEventHandler.java:79)
        at org.ofbiz.webapp.control.RequestHandler.runEvent(RequestHandler.java:660)
        at org.ofbiz.webapp.control.RequestHandler.doRequest(RequestHandler.java:406)
        at org.ofbiz.webapp.control.ControlServlet.doGet(ControlServlet.java:224)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.ofbiz.webapp.control.ContextFilter.doFilter(ContextFilter.java:337)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
        at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:200)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
        - locked <0x00000007c1735640> (a org.apache.tomcat.util.net.SocketWrapper)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
        - <0x00000007c17356c0> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

"ajp-bio-0.0.0.0-8009-exec-37" daemon prio=10 tid=0x000000005cda8000 nid=0x55e4 runnable [0x00002b0ab9858000]
   java.lang.Thread.State: RUNNABLE
        at java.net.SocketInputStream.socketRead0(Native Method)
        at java.net.SocketInputStream.read(SocketInputStream.java:129)
        at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:143)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:112)
        at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:71)
        at org.postgresql.core.PGStream.ReceiveChar(PGStream.java:269)
        at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1704)
        at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:255)
        - locked <0x00000007aab17b88> (a org.postgresql.core.v3.QueryExecutorImpl)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.execute(AbstractJdbc2Statement.java:559)
        at org.postgresql.jdbc2.AbstractJdbc2Statement.executeWithFlags(AbstractJdbc2Statement.java:403)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.execSQLQuery(AbstractJdbc2Connection.java:357)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.execSQLQuery(AbstractJdbc2Connection.java:349)
        at org.postgresql.jdbc2.AbstractJdbc2Connection.getTransactionIsolation(AbstractJdbc2Connection.java:883)
        at org.apache.commons.dbcp.DelegatingConnection.getTransactionIsolation(DelegatingConnection.java:353)
        at org.apache.commons.dbcp.PoolableConnectionFactory.activateObject(PoolableConnectionFactory.java:706)
        at org.apache.commons.pool.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:854)
        - locked <0x0000000700f02318> (a org.apache.commons.pool.impl.GenericObjectPool)
        at org.apache.commons.dbcp.managed.ManagedConnection.updateTransactionStatus(ManagedConnection.java:117)
        at org.apache.commons.dbcp.managed.ManagedConnection.<init>(ManagedConnection.java:55)
        at org.apache.commons.dbcp.managed.ManagedDataSource.getConnection(ManagedDataSource.java:77)
        at org.ofbiz.entity.connection.DBCPConnectionFactory.getConnection(DBCPConnectionFactory.java:66)
        at org.ofbiz.entity.jdbc.ConnectionFactory.getManagedConnection(ConnectionFactory.java:135)
        at org.ofbiz.geronimo.GeronimoTransactionFactory.getConnection(GeronimoTransactionFactory.java:83)
        at org.ofbiz.entity.transaction.TransactionFactory.getConnection(TransactionFactory.java:97)
        at org.ofbiz.entity.jdbc.ConnectionFactory.getConnection(ConnectionFactory.java:85)
        at org.ofbiz.entity.jdbc.SQLProcessor.getConnection(SQLProcessor.java:250)
        at org.ofbiz.entity.jdbc.SQLProcessor.prepareStatement(SQLProcessor.java:356)
        at org.ofbiz.entity.jdbc.SQLProcessor.prepareStatement(SQLProcessor.java:340)
        at org.ofbiz.entity.datasource.GenericDAO.select(GenericDAO.java:532)
        at org.ofbiz.entity.datasource.GenericDAO.select(GenericDAO.java:503)
        at org.ofbiz.entity.datasource.GenericHelperDAO.findByPrimaryKey(GenericHelperDAO.java:85)
        at org.ofbiz.entity.GenericDelegator.findOne(GenericDelegator.java:1572)
        at org.ofbiz.webapp.control.ControlEventListener.sessionDestroyed(ControlEventListener.java:80)
        at org.apache.catalina.session.StandardSession.expire(StandardSession.java:806)
        - locked <0x00000007b1206010> (a org.apache.catalina.session.StandardSession)
        at org.apache.catalina.session.StandardSession.expire(StandardSession.java:742)
        at org.apache.catalina.session.StandardSession.invalidate(StandardSession.java:1253)
        at org.apache.catalina.session.StandardSessionFacade.invalidate(StandardSessionFacade.java:190)
        at org.ofbiz.webapp.control.LoginWorker.doBasicLogout(LoginWorker.java:623)
        at org.ofbiz.webapp.control.LoginWorker.logout(LoginWorker.java:585)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.ofbiz.webapp.event.JavaEventHandler.invoke(JavaEventHandler.java:93)
        at org.ofbiz.webapp.event.JavaEventHandler.invoke(JavaEventHandler.java:79)
        at org.ofbiz.webapp.control.RequestHandler.runEvent(RequestHandler.java:660)
        at org.ofbiz.webapp.control.RequestHandler.doRequest(RequestHandler.java:406)
        at org.ofbiz.webapp.control.ControlServlet.doGet(ControlServlet.java:224)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:621)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:722)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:305)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.ofbiz.webapp.control.ContextFilter.doFilter(ContextFilter.java:337)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:243)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:210)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:222)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:123)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:502)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:99)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:118)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:953)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
        at org.apache.coyote.ajp.AjpProcessor.process(AjpProcessor.java:200)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:589)
        at org.apache.tomcat.util.net.JIoEndpoint$SocketProcessor.run(JIoEndpoint.java:310)
        - locked <0x00000007ab063978> (a org.apache.tomcat.util.net.SocketWrapper)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
        at java.lang.Thread.run(Thread.java:662)

   Locked ownable synchronizers:
        - <0x00000007ab0639f8> (a java.util.concurrent.locks.ReentrantLock$NonfairSync)

Thanks!
Darrell

Re: Tomcat Unresponsive

darrell73s
Hi all,

Just as a follow-up on this: I was ultimately able to get to the bottom of the issue, and it turned out to be a product of my environment. There is a stateful firewall that terminates connections that have been idle for a given period of time. By writing a test program that uses the driver directly (taking OFBiz out of the equation), I reproduced the issue and found that once the firewall has killed a connection, a query run on it hangs for about 15 minutes before returning and throwing an exception (a socket read timeout, if I remember correctly). Having the connection pool run its test query doesn't help, because the test query is executed on the same bad connection and hangs in the same way. So when someone visits the application, it retrieves a connection from the pool and attempts to log the Visit, but because the connection is in a bad state the query hangs without throwing an exception, and you see the behavior I originally described.
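The hang described above can be imitated with nothing but the standard java.net classes. The sketch below connects to a local server socket that never sends anything and shows that a read returns only when a socket read timeout (SO_TIMEOUT) is set. It only mimics the "no reply ever arrives" part of the problem, not the firewall itself, and ReadTimeoutDemo and readTimesOut are made-up names. Newer PostgreSQL JDBC drivers document a socketTimeout connection parameter that serves the same purpose; whether the driver version used in this thread supports it is not something the thread confirms.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

/** Shows how SO_TIMEOUT bounds a read on a peer that never answers. */
final class ReadTimeoutDemo {

    /**
     * Connects to a local server that never replies, then reads with the
     * given timeout. Returns true if the read ended with a timeout.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try {
            // Port 0 lets the OS pick a free port; the server never calls accept(),
            // but the OS still completes the TCP handshake from its backlog.
            ServerSocket silentServer =
                    new ServerSocket(0, 1, InetAddress.getByName("127.0.0.1"));
            try {
                Socket client = new Socket(silentServer.getInetAddress(),
                        silentServer.getLocalPort());
                try {
                    client.setSoTimeout(timeoutMillis); // 0 would mean "wait forever"
                    client.getInputStream().read();     // blocks: nothing is ever sent
                    return false;
                } catch (SocketTimeoutException expected) {
                    return true;
                } finally {
                    client.close();
                }
            } finally {
                silentServer.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException("demo setup failed", e);
        }
    }
}
```

Calling ReadTimeoutDemo.readTimesOut(2000) should return true after roughly two seconds; without a read timeout the same read would block indefinitely.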

Because of my environmental constraints I could not get the firewall changed, but I was able to work around it by setting 'timeBetweenEvictionRunsMillis' and 'minEvictableIdleTimeMillis' so that idle connections are closed and re-established in the pool before the firewall's idle timeout can terminate them.
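A rough way to sanity-check such settings is to add the two values together: a connection that is just short of the idle threshold at one eviction run survives until the next run. The sketch below does that arithmetic. The thread does not state the firewall timeout or the values chosen, so the numbers in the usage note are purely hypothetical, as are the names EvictionWindow, worstCaseIdleMillis and evictsBeforeFirewall. It also assumes the evictor examines every idle connection on each run; commons-pool has a numTestsPerEvictionRun setting, and if only some idle connections are examined per run the real worst case is longer.

```java
/** Back-of-the-envelope check of idle-eviction settings against a firewall idle timeout. */
final class EvictionWindow {

    /**
     * Longest time a connection can sit idle before the evictor removes it,
     * assuming every idle connection is examined on every eviction run.
     */
    static long worstCaseIdleMillis(long minEvictableIdleTimeMillis,
                                    long timeBetweenEvictionRunsMillis) {
        // Just under the threshold at one run means eviction one full interval later.
        return minEvictableIdleTimeMillis + timeBetweenEvictionRunsMillis;
    }

    /** True if the worst-case idle time stays below the firewall's idle timeout. */
    static boolean evictsBeforeFirewall(long minEvictableIdleTimeMillis,
                                        long timeBetweenEvictionRunsMillis,
                                        long firewallIdleTimeoutMillis) {
        return worstCaseIdleMillis(minEvictableIdleTimeMillis,
                timeBetweenEvictionRunsMillis) < firewallIdleTimeoutMillis;
    }
}
```

With a hypothetical 30 minute firewall timeout, a 10 minute idle threshold and 5 minute eviction interval give a 15 minute worst case, which is safe; a 25 minute threshold with a 10 minute interval gives 35 minutes, which is not.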

I'm now tackling a different issue that I see every few days on my LIVE system: a query's status in PostgreSQL is "idle in transaction", and other queries then begin piling up. So far the affected queries have been on the Product and ProductCalculatedInfo entities, but the timing appears random: one second the query works fine, the next it is idle in transaction. I've searched all of my hot-deploy code to make sure no transactions are being controlled manually in the custom code (e.g. a begin transaction with no commit/rollback), and it all appears fine. Have any of you seen this problem with OFBiz and PostgreSQL?
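For readers unfamiliar with the state: "idle in transaction" means a session has opened a transaction and is then sitting idle without committing or rolling back. The kind of bug being searched for above is a code path that can leave without ending the transaction. The sketch below shows the usual guard in plain java.sql, chosen only to keep the example self-contained; OFBiz code would normally go through the framework's own transaction handling rather than raw JDBC, and TransactionTemplate, Work and inTransaction are hypothetical names.

```java
import java.sql.Connection;
import java.sql.SQLException;

/** Runs work inside a JDBC transaction that is always ended, by commit or rollback. */
final class TransactionTemplate {

    /** Hypothetical callback type for the unit of work. */
    interface Work {
        void run(Connection connection) throws SQLException;
    }

    static void inTransaction(Connection connection, Work work) throws SQLException {
        boolean previousAutoCommit = connection.getAutoCommit();
        connection.setAutoCommit(false); // statements now run inside one transaction
        boolean committed = false;
        try {
            work.run(connection);
            connection.commit();
            committed = true;
        } finally {
            try {
                if (!committed) {
                    // Any failure path ends the transaction instead of leaving it open.
                    connection.rollback();
                }
            } finally {
                connection.setAutoCommit(previousAutoCommit);
            }
        }
    }
}
```

The point of the nested finally blocks is that no exit from inTransaction, normal or exceptional, leaves the transaction open on the connection.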

Thanks!

Re: Tomcat Unresponsive

Jacopo Cappellato-4

On Dec 20, 2014, at 3:52 AM, darrell73s <[hidden email]> wrote:

> [...]
>
> Because of my environmental constraints, I was not able to get the firewall
> modified, but was able to work around it by manipulating
> 'timeBetweenEvictionRunsMillis' and 'minEvictableIdleTimeMillis' such that
> connections are closed and re-established in the pool at a rate under the
> time at which the firewall will terminate them.

I agree that setting a proper value for time-between-eviction-runs-millis is the right way to go.
With it, you can also enable:
test-while-idle="true"
which should validate idle connections (by running the test query) at the frequency set by time-between-eviction-runs-millis.

I hope it helps, and congratulations on your progress.
Thanks for sharing these details, I am sure it will be valuable information for many users.

Jacopo
