OutOfMemory provoked by Cache overflow
-------------------------------------- Key: OFBIZ-2186 URL: https://issues.apache.org/jira/browse/OFBIZ-2186 Project: OFBiz Issue Type: Bug Components: framework Affects Versions: SVN trunk, Release Branch 4.0 Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo Reporter: Philippe Mouawad Priority: Critical In our production system, we had an OutOfMemoryError. I analyzed the generated Heap Dump and found the following: The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: - maxSize is set to 5000 (as set in the cache.properties) - size is 128930 => PROBLEM IN cache.properties: entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): public CacheLine<V> get(Object key) { if (key == null) { if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); } return getNoCheck(key); } Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): public V get(Object key) { Entry<K,V> e = (Entry<K,V>)getEntry(key); if (e == null) return null; e.recordAccess(this); return e.value; } So the default of synchronization corrupts the state of LRUMap which grows indefinitely I will submit a patch for this on the trunk. Philippe www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philippe Mouawad updated OFBIZ-2186: ------------------------------------ Attachment: CacheLineTable-patch.txt Addition of synchronized to get() > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673279#action_12673279 ] David E. Jones commented on OFBIZ-2186: --------------------------------------- It sounds like it is in your opinion that the problem is the missing synchronization. IMO, that would be really surprising... Before considering a patch we really need to _know_ what is causing the problem and come up with a fix. Adding a blanket synchronization will slow things down a LOT, so we should do whatever is needed to find an alternative. Still, it doesn't matter because unless we know this fixes the problem we should not consider the change. > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673321#action_12673321 ] Philippe Mouawad commented on OFBIZ-2186: ----------------------------------------- Hello M. Jones, If you don't think that my arguments are correct, can you explain why ? I agree with you about the synchronization performance loss but is it acceptable that this absence of synchronization provokes a server crash through OOM ? I will propose another solution soon based on another implementation of LRU but meanwhile we will put this in production and come back with results. Now about my explanation, since a LRU is used, LRU is computed based on get(), look at the code of LinkedHashMap and you will see that get() modifies the state of the Map, if so Synchronization is mandatory. Tell me what kind of info you need to be convinced. To be more convinced run the program I attach with the following options on a JDK 5: -Xmx5m -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=C:/temp/ It will throw an OutOfMemory. Add synchronized to CacheLineUsage#get() and see NO MORE PROBLEM. Inspect the Heap Dump and you get exactly what I told you: size of Map is 31362 while maxSize is 5000 Memory occupation is 6056872 > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Philippe Mouawad updated OFBIZ-2186: ------------------------------------ Attachment: OfbizBug.zip Mimics the bug. LRUMap is Ofbiz class CacheLineUsage uses the LRUMap the way CacheLineTable uses it, no synchro on get TestLRU run some threads Run this with -Xmx set to 5Mo > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12673338#action_12673338 ] David E. Jones commented on OFBIZ-2186: --------------------------------------- Philippe, Please understand that what you're saying now is very different than what you said before. I wasn't questioning your conclusion, I was just responding to your statement that you opinion was that the non-synchronized get was the problem: "IMHO the bug is a missing synchonized in get()". I totally agree that if there is code that changes things it needs to be synchronized, and we should look into that more, and how to isolate it so that it is done only when the LRU list is used (which is not the common case, BTW). Thank you for providing more detail and being more clear in your additional comments. This one was actually pretty easy, so I made some changes in SVN rev 744205 that should resolve the issue, and only have an effect on performance if there is a size limit on the cache. Please comment if this fixes the problem you're running into, and if so we can close the issue... and if not we'll have to look into it more! > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12679609#action_12679609 ] Philippe Mouawad commented on OFBIZ-2186: ----------------------------------------- Hello M. Jones, Thank you for taking into account this bug. We tested the patch you proposed in production, the same problem occured, the size of LRUMap exceed the configured maxSize. We tested the initial patch I proposed and the problem is solved, we did a voluntary heap dump using jmap, and the hprof file generated shows that the size of LRUMap is OK. I hope you can take into account this remark as soon as possible. Philippe www.ubik-ingenierie.com > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680008#action_12680008 ] Philippe Mouawad commented on OFBIZ-2186: ----------------------------------------- Hello M. Jones, A colleague (Benoit) examined in detail your patch, the problem comes from the fact that the synchronized block is on 2 different objects so synchronization fails. > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680011#action_12680011 ] David E. Jones commented on OFBIZ-2186: --------------------------------------- What are the 2 different objects? I appreciate the clue, but no I'm not going to hunt it down on my own if you already know the answer... Also please feel free to submit a patch, but understand I'll only accept it if it follows the pattern used so far where the functionality for the size limit on the cache will not slow down the cache when it is turned off. > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680015#action_12680015 ] Benoit Wiart commented on OFBIZ-2186: ------------------------------------- M. Jones, CacheLineTable and LRUMap do not synchronize on the same lock - CacheLineTable synchronization use the CacheLineTable instance monitor - LRUMap synchronization use the LRUMap instance monitor the LRUMap (memoryTable) can be updated concurrently by 2 differents threads (or more than 2) - one in the CacheLineTable #put or CacheLineTable#remove - another in the CacheLineTable#get this leads to the corrupton of the LRUMap which is not thread-safe the CacheLineTable#get should be synchronized or the LRUMap should be replaced by a thead safe collection Benoit Wiart www.ubik-ingenierie.com > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680015#action_12680015 ] benoitw edited comment on OFBIZ-2186 at 3/8/09 2:16 PM: ------------------------------------------------------------- M. Jones, CacheLineTable and LRUMap do not synchronize on the same lock - CacheLineTable synchronization use the CacheLineTable instance monitor - LRUMap synchronization use the LRUMap instance monitor the LRUMap (memoryTable) can be updated concurrently by 2 differents threads (or more than 2) - one in the CacheLineTable #put or CacheLineTable#remove - another in the CacheLineTable#get this leads to the corruption of the LRUMap which is not thread-safe the CacheLineTable#get should be synchronized or the LRUMap should be replaced by a thead safe collection Benoit Wiart www.ubik-ingenierie.com was (Author: benoitw): M. Jones, CacheLineTable and LRUMap do not synchronize on the same lock - CacheLineTable synchronization use the CacheLineTable instance monitor - LRUMap synchronization use the LRUMap instance monitor the LRUMap (memoryTable) can be updated concurrently by 2 differents threads (or more than 2) - one in the CacheLineTable #put or CacheLineTable#remove - another in the CacheLineTable#get this leads to the corrupton of the LRUMap which is not thread-safe the CacheLineTable#get should be synchronized or the LRUMap should be replaced by a thead safe collection Benoit Wiart www.ubik-ingenierie.com > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680015#action_12680015 ] benoitw edited comment on OFBIZ-2186 at 3/8/09 2:17 PM: ------------------------------------------------------------- M. Jones, CacheLineTable and LRUMap do not synchronize on the same lock - CacheLineTable synchronization use the CacheLineTable instance monitor - LRUMap synchronization use the LRUMap instance monitor the LRUMap (memoryTable) can be updated concurrently by 2 differents threads - one in the CacheLineTable #put or CacheLineTable#remove - another in the CacheLineTable#get this leads to the corruption of the LRUMap which is not thread-safe the CacheLineTable#get should be synchronized or the LRUMap should be replaced by a thead safe collection Benoit Wiart www.ubik-ingenierie.com was (Author: benoitw): M. Jones, CacheLineTable and LRUMap do not synchronize on the same lock - CacheLineTable synchronization use the CacheLineTable instance monitor - LRUMap synchronization use the LRUMap instance monitor the LRUMap (memoryTable) can be updated concurrently by 2 differents threads (or more than 2) - one in the CacheLineTable #put or CacheLineTable#remove - another in the CacheLineTable#get this leads to the corruption of the LRUMap which is not thread-safe the CacheLineTable#get should be synchronized or the LRUMap should be replaced by a thead safe collection Benoit Wiart www.ubik-ingenierie.com > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680088#action_12680088 ] David E. Jones commented on OFBIZ-2186: --------------------------------------- Thank you for the additional information. Could you try the changes in SVN rev 751619? With this the LRUMap is made synchronized using the Collections.synchronizedMap method when it is first created, and that should make it thread safe (except for certain iteration related operations, but we aren't doing those with the memoryTable field as far as I can tell). This should fix the synchronized get problem, and the separately monitored get vs put/remove problem. > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680105#action_12680105 ] Philippe Mouawad commented on OFBIZ-2186: ----------------------------------------- Hello M. Jones, First thank you for your reactivity. Second here is the status about the two previous patches: - The first one (you proposed) made the production crash every day for 5 days, an OOM was generated at least once a day on each server - With the second one (my proposition) production has been up for 10 days (we did a jmap to verify the leak was solved as I said), furthermore we didn't notice slow response time (if you are worried about the synchronized block). The ecommerce site we are using ofbiz for is a big one with 3 Balanced JVM servers. Now about you patch, I am not sure that your patch cannot introduce a Java Deadlock risk since now there are 2 synchronization occuring is the case of put/delete (one on the CacheLineTable through synchronized and one on LRUMap through Collections.synchronizedMap wrapping). As you may guess we cannot afford testing this new patch with the risk of production crashing again, it is an ECommerce site. I am wondering why you don't want to take my patch since it is OK on a big production SITE and does not incur an important performance loss. Thanks for your answer. > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680109#action_12680109 ] Philippe Mouawad commented on OFBIZ-2186: ----------------------------------------- Hello Again, Concerning the Limitation on cache. On a big SITE with a rather important catalog, not limiting the cache leads to a lot of OutOfMemory . We experienced these problems at first that's why we are limiting more and more caches. So you should not think that cache limitation concerns only few caches. Philippe. > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Le Roux reassigned OFBIZ-2186: -------------------------------------- Assignee: Jacques Le Roux > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Assignee: Jacques Le Roux > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12687999#action_12687999 ] Jacques Le Roux commented on OFBIZ-2186: ---------------------------------------- Hi Philippe, When you used this patch have you already applied the patch you submitted in OFBIZ-2124 ? > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Assignee: Jacques Le Roux > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12688000#action_12688000 ] Philippe Mouawad commented on OFBIZ-2186: ----------------------------------------- Hello M. Le Roux, Yes. - OFBIZ-2124 was blocking since the application was totally blocked - This bug was also critical since the application dumpled every day on the 3 servers We are now in production for 20 days with my 2 patches applied and the situation is OK. We have been tracking Memory and response Time, and we don't have memory leaks. Philippe. > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: SVN trunk, Release Branch 4.0 > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Assignee: Jacques Le Roux > Priority: Critical > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
In reply to this post by Nicolas Malin (Jira)
[ https://issues.apache.org/jira/browse/OFBIZ-2186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Le Roux closed OFBIZ-2186. ---------------------------------- Resolution: Fixed Fix Version/s: SVN trunk Release Branch 9.04 Thanks Philippe, Your patch is in trunk at r882072, R9.04 at r882073. I was notsable to merge wih R4.0. If you could provide a patch it would be great... > OutOfMemory provoked by Cache overflow > -------------------------------------- > > Key: OFBIZ-2186 > URL: https://issues.apache.org/jira/browse/OFBIZ-2186 > Project: OFBiz > Issue Type: Bug > Components: framework > Affects Versions: Release Branch 4.0, SVN trunk > Environment: Linux, JDK 1.5_15, Xmx set to 1156Mo > Reporter: Philippe Mouawad > Assignee: Jacques Le Roux > Priority: Critical > Fix For: Release Branch 9.04, SVN trunk > > Attachments: CacheLineTable-patch.txt, OfbizBug.zip > > > In our production system, we had an OutOfMemoryError. > I analyzed the generated Heap Dump and found the following: > The cache entitycache.entity-list.default.ProductCategoryMember retains a heap of 369314128 Bytes. > The LRUMap hold by this object is occupying this space (369314128) and this object has a stange state: > - maxSize is set to 5000 (as set in the cache.properties) > - size is 128930 => PROBLEM > IN cache.properties: > entitycache.entity-list.default.ProductCategoryMember.expireTime=3600000 > entitycache.entity-list.default.ProductCategoryMember.useSoftReference=true > entitycache.entity-list.default.ProductCategoryMember.maxInMemory=5000 > entitycache.entity-list.default.ProductCategoryMember.maxSize=7500 > I analyzed the code of LRUMap and its usage in CacheLineTable and IMHO the bug is a missing synchonized in get(): > public CacheLine<V> get(Object key) { > if (key == null) { > if (Debug.verboseOn()) Debug.logVerbose("In CacheLineTable tried to get with null key, using NullObject" + this.cacheName, module); > } > return getNoCheck(key); > } > Since LRUMap extends LinkedHashMap, if you look at get method, it changes the state of the Map by calling e.recordAccess(this): > public V get(Object key) { > Entry<K,V> e = (Entry<K,V>)getEntry(key); > if (e == null) > return null; > e.recordAccess(this); > return e.value; > } > So the default of synchronization corrupts the state of LRUMap which grows indefinitely > I will submit a patch for this on the trunk. > Philippe > www.ubik-ingenierie.com -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online. |
Free forum by Nabble | Edit this page |