Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
29 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
On 04/20/2012 07:53 AM, [hidden email] wrote:

> Author: jacopoc
> Date: Fri Apr 20 12:53:35 2012
> New Revision: 1328356
>
> URL: http://svn.apache.org/viewvc?rev=1328356&view=rev
> Log:
> The cache of parsed Groovy scripts was not thread safe; this issue, in instances with several concurrent threads running the same script the first time (i.e. not cached) could cause the same script parsed multiple times and then added to the cache (overriding the previous value); this event was causing the clearing of caches in Freemarker; because of a bug in Freemarker [*] this could cause a deadlock.
>   The issue is present on all versions of Freemarker but it is less frequent on latest version because of the refactoring of caches happened after release 2.3.10.
>
> [*] https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
>
> Modified:
>      ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>
> Modified: ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
> URL: http://svn.apache.org/viewvc/ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java?rev=1328356&r1=1328355&r2=1328356&view=diff
> ==============================================================================
> --- ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java (original)
> +++ ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java Fri Apr 20 12:53:35 2012
> @@ -127,10 +127,17 @@ public class GroovyUtil {
>                   } else {
>                       scriptClass = parseClass(scriptUrl.openStream(), location);
>                   }
> -                if (Debug.verboseOn()) {
> -                    Debug.logVerbose("Caching Groovy script at: " + location, module);
> +                synchronized (parsedScripts) {
> +                    Class<?>  scriptClassCached = parsedScripts.get(location);
> +                    if (scriptClassCached == null) {
> +                        if (Debug.verboseOn()) {
> +                            Debug.logVerbose("Caching Groovy script at: " + location, module);
> +                        }
> +                        parsedScripts.put(location, scriptClass);
> +                    } else {
> +                        scriptClass = scriptClassCached;
> +                    }
>                   }
> -                parsedScripts.put(location, scriptClass);
>               }
>               return scriptClass;
>           } catch (Exception e) {
> @@ -177,10 +184,18 @@ public class GroovyUtil {
>               Class<?>  scriptClass = parsedScripts.get(script);
>               if (scriptClass == null) {
>                   scriptClass = loadClass(script);
> -                if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script: " + script, module);
> -                parsedScripts.put(script, scriptClass);
> +                synchronized (parsedScripts) {
> +                    Class<?>  scriptClassCached = parsedScripts.get(script);
> +                    if (scriptClassCached == null) {
> +                        if (Debug.verboseOn()) {
> +                            Debug.logVerbose("Caching Groovy script: " + script, module);
> +                        }
> +                        parsedScripts.put(script, scriptClass);
> +                    } else {
> +                        scriptClass = scriptClassCached;
> +                    }
> +                }
>               }
> -
>               return InvokerHelper.createScript(scriptClass, getBinding(context)).run();
>           } catch (CompilationFailedException e) {
>               String errMsg = "Error loading Groovy script [" + script + "]: " + e.toString();

Er, no, this is the *wrong* way to fix this.  parsedScripts is a
UtilCache instance, not a Map, not even a HashMap.  It is *designed* to
not corrupt without synchronization.

The proper fix is:

Class scriptClass = parsedScripts.get(script);
if (scriptClass == null) {
   scriptClass = genClass();
   parsedScripts.putIfAbsent(script, scriptClass);
   scriptClass = parsedScripts.get(script);
}

What was the real problem you were attempting to fix?  The only issue
that would have happened with the previous code was the case of a static
variable being defined in the parsed script, and during a race condition
while parsing, 2 separate classes for the same script being in the system.

This change is in the release branch.  I'd like to know the real reason
this was changed in ofbiz, and then my fix applied to the branch.

ps: I can't recommend it enough, but in addition to the entity model
data resource books, everyone working on ofbiz should also be reading
java concurrency in practice.

pps: It is my goal to remove all the DCL in ofbiz; I'd love it if others
would help.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4
Hi Adam,

please see inline:

On May 11, 2012, at 4:52 AM, Adam Heath wrote:

> On 04/20/2012 07:53 AM, [hidden email] wrote:
>> Author: jacopoc
>> Date: Fri Apr 20 12:53:35 2012
>> New Revision: 1328356
>>
>> URL: http://svn.apache.org/viewvc?rev=1328356&view=rev
>> Log:
>> The cache of parsed Groovy scripts was not thread safe; this issue, in instances with several concurrent threads running the same script the first time (i.e. not cached) could cause the same script parsed multiple times and then added to the cache (overriding the previous value); this event was causing the clearing of caches in Freemarker; because of a bug in Freemarker [*] this could cause a deadlock.
>>  The issue is present on all versions of Freemarker but it is less frequent on latest version because of the refactoring of caches happened after release 2.3.10.
>>
>> [*] https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
>>
>> Modified:
>>     ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>
>> Modified: ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>> URL: http://svn.apache.org/viewvc/ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java?rev=1328356&r1=1328355&r2=1328356&view=diff
>> ==============================================================================
>> --- ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java (original)
>> +++ ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java Fri Apr 20 12:53:35 2012
>> @@ -127,10 +127,17 @@ public class GroovyUtil {
>>                  } else {
>>                      scriptClass = parseClass(scriptUrl.openStream(), location);
>>                  }
>> -                if (Debug.verboseOn()) {
>> -                    Debug.logVerbose("Caching Groovy script at: " + location, module);
>> +                synchronized (parsedScripts) {
>> +                    Class<?>  scriptClassCached = parsedScripts.get(location);
>> +                    if (scriptClassCached == null) {
>> +                        if (Debug.verboseOn()) {
>> +                            Debug.logVerbose("Caching Groovy script at: " + location, module);
>> +                        }
>> +                        parsedScripts.put(location, scriptClass);
>> +                    } else {
>> +                        scriptClass = scriptClassCached;
>> +                    }
>>                  }
>> -                parsedScripts.put(location, scriptClass);
>>              }
>>              return scriptClass;
>>          } catch (Exception e) {
>> @@ -177,10 +184,18 @@ public class GroovyUtil {
>>              Class<?>  scriptClass = parsedScripts.get(script);
>>              if (scriptClass == null) {
>>                  scriptClass = loadClass(script);
>> -                if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script: " + script, module);
>> -                parsedScripts.put(script, scriptClass);
>> +                synchronized (parsedScripts) {
>> +                    Class<?>  scriptClassCached = parsedScripts.get(script);
>> +                    if (scriptClassCached == null) {
>> +                        if (Debug.verboseOn()) {
>> +                            Debug.logVerbose("Caching Groovy script: " + script, module);
>> +                        }
>> +                        parsedScripts.put(script, scriptClass);
>> +                    } else {
>> +                        scriptClass = scriptClassCached;
>> +                    }
>> +                }
>>              }
>> -
>>              return InvokerHelper.createScript(scriptClass, getBinding(context)).run();
>>          } catch (CompilationFailedException e) {
>>              String errMsg = "Error loading Groovy script [" + script + "]: " + e.toString();
>
> Er, no, this is the *wrong* way to fix this.  parsedScripts is a UtilCache instance, not a Map, not even a HashMap.  It is *designed* to not corrupt without synchronization.
>
> The proper fix is:
>
> Class scriptClass = parsedScripts.get(script);
> if (scriptClass == null) {
>  scriptClass = genClass();
>  parsedScripts.putIfAbsent(script, scriptClass);
>  scriptClass = parsedScripts.get(script);
> }
>

Yeah, you are 100% right that your above is functionally equivalent (it fixes the bug too) but your is more concise and elegant. The only reason I implemented a custom check-then-act pattern (this is not a DCL!) is that I didn't notice that we were using a thread safe class: I know it may sound silly, but I was working on a similar fix for the freemarker code and in freemarker they were using an HashMap for the cache so I have implemented a similar code that I then copied over to OFBiz.

> What was the real problem you were attempting to fix?  The only issue that would have happened with the previous code was the case of a static variable being defined in the parsed script, and during a race condition while parsing, 2 separate classes for the same script being in the system.

The problem was that the same class was parsed by different threads and then added to the cache multiple times (of course each time overriding the same value as the key was the same): this is harmless in OFBiz but Freemarker interpreted this as if the class was changed and this triggered a complete clearing of cache; and, because of a bug in freemarker this could cause a deadlock (in freemarker).

As I mentioned, I have sent a fix to Freemarker for this, but if you are interested to the details you can refer to:

https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
https://sourceforge.net/mailarchive/message.php?msg_id=29148020

>
> This change is in the release branch.  I'd like to know the real reason this was changed in ofbiz, and then my fix applied to the branch.

I can update the trunk and release branches with code that actually takes into account that the parsedScript variable is an instance of a thread safe class... then you can review my changes.

>
> ps: I can't recommend it enough, but in addition to the entity model data resource books, everyone working on ofbiz should also be reading java concurrency in practice.

"Java Concurrency In Action" is a great book, one of the best technical books I have ever read, and it is always in my desk.

>
> pps: It is my goal to remove all the DCL in ofbiz; I'd love it if others would help.

I completely agree with this goal, but, as I mentioned above, the code I wrote above doesn't have anything to do with the double-checked locking anti-pattern. But I am with you completely when we talk about properly implementing concurrency in OFBiz... there is a lot of code that concerns me greatly (see for example ImageManagementServices.checkExistsImage(...)).

Kind regards,

Jacopo

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adrian Crum-3
In reply to this post by Adam Heath-2
On 5/11/2012 3:52 AM, Adam Heath wrote:

> On 04/20/2012 07:53 AM, [hidden email] wrote:
>> Author: jacopoc
>> Date: Fri Apr 20 12:53:35 2012
>> New Revision: 1328356
>>
>> URL: http://svn.apache.org/viewvc?rev=1328356&view=rev
>> Log:
>> The cache of parsed Groovy scripts was not thread safe; this issue,
>> in instances with several concurrent threads running the same script
>> the first time (i.e. not cached) could cause the same script parsed
>> multiple times and then added to the cache (overriding the previous
>> value); this event was causing the clearing of caches in Freemarker;
>> because of a bug in Freemarker [*] this could cause a deadlock.
>>   The issue is present on all versions of Freemarker but it is less
>> frequent on latest version because of the refactoring of caches
>> happened after release 2.3.10.
>>
>> [*]
>> https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
>>
>> Modified:
>>      ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>
>> Modified:
>> ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>> URL:
>> http://svn.apache.org/viewvc/ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java?rev=1328356&r1=1328355&r2=1328356&view=diff
>> ==============================================================================
>>
>> ---
>> ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>> (original)
>> +++
>> ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>> Fri Apr 20 12:53:35 2012
>> @@ -127,10 +127,17 @@ public class GroovyUtil {
>>                   } else {
>>                       scriptClass =
>> parseClass(scriptUrl.openStream(), location);
>>                   }
>> -                if (Debug.verboseOn()) {
>> -                    Debug.logVerbose("Caching Groovy script at: " +
>> location, module);
>> +                synchronized (parsedScripts) {
>> +                    Class<?>  scriptClassCached =
>> parsedScripts.get(location);
>> +                    if (scriptClassCached == null) {
>> +                        if (Debug.verboseOn()) {
>> +                            Debug.logVerbose("Caching Groovy script
>> at: " + location, module);
>> +                        }
>> +                        parsedScripts.put(location, scriptClass);
>> +                    } else {
>> +                        scriptClass = scriptClassCached;
>> +                    }
>>                   }
>> -                parsedScripts.put(location, scriptClass);
>>               }
>>               return scriptClass;
>>           } catch (Exception e) {
>> @@ -177,10 +184,18 @@ public class GroovyUtil {
>>               Class<?>  scriptClass = parsedScripts.get(script);
>>               if (scriptClass == null) {
>>                   scriptClass = loadClass(script);
>> -                if (Debug.verboseOn()) Debug.logVerbose("Caching
>> Groovy script: " + script, module);
>> -                parsedScripts.put(script, scriptClass);
>> +                synchronized (parsedScripts) {
>> +                    Class<?>  scriptClassCached =
>> parsedScripts.get(script);
>> +                    if (scriptClassCached == null) {
>> +                        if (Debug.verboseOn()) {
>> +                            Debug.logVerbose("Caching Groovy script:
>> " + script, module);
>> +                        }
>> +                        parsedScripts.put(script, scriptClass);
>> +                    } else {
>> +                        scriptClass = scriptClassCached;
>> +                    }
>> +                }
>>               }
>> -
>>               return InvokerHelper.createScript(scriptClass,
>> getBinding(context)).run();
>>           } catch (CompilationFailedException e) {
>>               String errMsg = "Error loading Groovy script [" +
>> script + "]: " + e.toString();
>
> Er, no, this is the *wrong* way to fix this.  parsedScripts is a
> UtilCache instance, not a Map, not even a HashMap.  It is *designed*
> to not corrupt without synchronization.
>
> The proper fix is:
>
> Class scriptClass = parsedScripts.get(script);
> if (scriptClass == null) {
>   scriptClass = genClass();
>   parsedScripts.putIfAbsent(script, scriptClass);
>   scriptClass = parsedScripts.get(script);
> }
>
> What was the real problem you were attempting to fix?  The only issue
> that would have happened with the previous code was the case of a
> static variable being defined in the parsed script, and during a race
> condition while parsing, 2 separate classes for the same script being
> in the system.
>
> This change is in the release branch.  I'd like to know the real
> reason this was changed in ofbiz, and then my fix applied to the branch.
>
> ps: I can't recommend it enough, but in addition to the entity model
> data resource books, everyone working on ofbiz should also be reading
> java concurrency in practice.
>
> pps: It is my goal to remove all the DCL in ofbiz; I'd love it if
> others would help.

I remove DCL when I come across it, but I haven't turned it into a task.
My recent work on Mini-language fixed some concurrency issues.

-Adrian

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4
In reply to this post by Adam Heath-2
On May 11, 2012, at 4:52 AM, Adam Heath wrote:

> This change is in the release branch.  I'd like to know the real reason this was changed in ofbiz, and then my fix applied to the branch.

I have backported a new version of my fix to 12.04 (in rev. 1337054) and 11.04 (in rev. 1337097); I didn't backport to 10.04 because putIfAbsent was implemented after the branch was created and it is not available there (but the old version of my fix is there and will be fine).

Jacopo

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
In reply to this post by Jacopo Cappellato-4
On 05/11/2012 12:40 AM, Jacopo Cappellato wrote:

> Hi Adam,
>
> please see inline:
>
> On May 11, 2012, at 4:52 AM, Adam Heath wrote:
>
>> On 04/20/2012 07:53 AM, [hidden email] wrote:
>>> Author: jacopoc
>>> Date: Fri Apr 20 12:53:35 2012
>>> New Revision: 1328356
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1328356&view=rev
>>> Log:
>>> The cache of parsed Groovy scripts was not thread safe; this issue, in instances with several concurrent threads running the same script the first time (i.e. not cached) could cause the same script parsed multiple times and then added to the cache (overriding the previous value); this event was causing the clearing of caches in Freemarker; because of a bug in Freemarker [*] this could cause a deadlock.
>>>   The issue is present on all versions of Freemarker but it is less frequent on latest version because of the refactoring of caches happened after release 2.3.10.
>>>
>>> [*] https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
>>>
>>> Modified:
>>>      ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>>
>>> Modified: ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>> URL: http://svn.apache.org/viewvc/ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java?rev=1328356&r1=1328355&r2=1328356&view=diff
>>> ==============================================================================
>>> --- ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java (original)
>>> +++ ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java Fri Apr 20 12:53:35 2012
>>> @@ -127,10 +127,17 @@ public class GroovyUtil {
>>>                   } else {
>>>                       scriptClass = parseClass(scriptUrl.openStream(), location);
>>>                   }
>>> -                if (Debug.verboseOn()) {
>>> -                    Debug.logVerbose("Caching Groovy script at: " + location, module);
>>> +                synchronized (parsedScripts) {
>>> +                    Class<?>   scriptClassCached = parsedScripts.get(location);
>>> +                    if (scriptClassCached == null) {
>>> +                        if (Debug.verboseOn()) {
>>> +                            Debug.logVerbose("Caching Groovy script at: " + location, module);
>>> +                        }
>>> +                        parsedScripts.put(location, scriptClass);
>>> +                    } else {
>>> +                        scriptClass = scriptClassCached;
>>> +                    }
>>>                   }
>>> -                parsedScripts.put(location, scriptClass);
>>>               }
>>>               return scriptClass;
>>>           } catch (Exception e) {
>>> @@ -177,10 +184,18 @@ public class GroovyUtil {
>>>               Class<?>   scriptClass = parsedScripts.get(script);
>>>               if (scriptClass == null) {
>>>                   scriptClass = loadClass(script);
>>> -                if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script: " + script, module);
>>> -                parsedScripts.put(script, scriptClass);
>>> +                synchronized (parsedScripts) {
>>> +                    Class<?>   scriptClassCached = parsedScripts.get(script);
>>> +                    if (scriptClassCached == null) {
>>> +                        if (Debug.verboseOn()) {
>>> +                            Debug.logVerbose("Caching Groovy script: " + script, module);
>>> +                        }
>>> +                        parsedScripts.put(script, scriptClass);
>>> +                    } else {
>>> +                        scriptClass = scriptClassCached;
>>> +                    }
>>> +                }
>>>               }
>>> -
>>>               return InvokerHelper.createScript(scriptClass, getBinding(context)).run();
>>>           } catch (CompilationFailedException e) {
>>>               String errMsg = "Error loading Groovy script [" + script + "]: " + e.toString();
>>
>> Er, no, this is the *wrong* way to fix this.  parsedScripts is a UtilCache instance, not a Map, not even a HashMap.  It is *designed* to not corrupt without synchronization.
>>
>> The proper fix is:
>>
>> Class scriptClass = parsedScripts.get(script);
>> if (scriptClass == null) {
>>   scriptClass = genClass();
>>   parsedScripts.putIfAbsent(script, scriptClass);
>>   scriptClass = parsedScripts.get(script);
>> }
>>
>
> Yeah, you are 100% right that your above is functionally equivalent (it fixes the bug too) but your is more concise and elegant. The only reason I implemented a custom check-then-act pattern (this is not a DCL!) is that I didn't notice that we were using a thread safe class: I know it may sound silly, but I was working on a similar fix for the freemarker code and in freemarker they were using an HashMap for the cache so I have implemented a similar code that I then copied over to OFBiz.
>
>> What was the real problem you were attempting to fix?  The only issue that would have happened with the previous code was the case of a static variable being defined in the parsed script, and during a race condition while parsing, 2 separate classes for the same script being in the system.
>
> The problem was that the same class was parsed by different threads and then added to the cache multiple times (of course each time overriding the same value as the key was the same): this is harmless in OFBiz but Freemarker interpreted this as if the class was changed and this triggered a complete clearing of cache; and, because of a bug in freemarker this could cause a deadlock (in freemarker).
>
> As I mentioned, I have sent a fix to Freemarker for this, but if you are interested to the details you can refer to:
>
> https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
> https://sourceforge.net/mailarchive/message.php?msg_id=29148020

Haven't yet looked at those urls, but just based on your description,
the only way to fix code that has broken locking is to wrap it in
synchronized.  If you use non-locking techniques, the code being called
might still be run in parallel, which won't help the situation.

Let me look into this further before you this code is changed.

If I find out that there is a fixed freemarker(or, I come up with a
patch), then I assume we'd be for including that fixed freemarker.  What
about backporting that to other branches(just the locking fix for
freemarker, plus whatever locking fix in ofbiz)?

.>> This change is in the release branch.  I'd like to know the real
reason this was changed in ofbiz, and then my fix applied to the branch.

>
> I can update the trunk and release branches with code that actually takes into account that the parsedScript variable is an instance of a thread safe class... then you can review my changes.
>
>>
>> ps: I can't recommend it enough, but in addition to the entity model data resource books, everyone working on ofbiz should also be reading java concurrency in practice.
>
> "Java Concurrency In Action" is a great book, one of the best technical books I have ever read, and it is always in my desk.
>
>>
>> pps: It is my goal to remove all the DCL in ofbiz; I'd love it if others would help.
>
> I completely agree with this goal, but, as I mentioned above, the code I wrote above doesn't have anything to do with the double-checked locking anti-pattern. But I am with you completely when we talk about properly implementing concurrency in OFBiz... there is a lot of code that concerns me greatly (see for example ImageManagementServices.checkExistsImage(...)).

The code above that I originally quoted is doing DCL.

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4
On May 11, 2012, at 3:53 PM, Adam Heath wrote:

> Haven't yet looked at those urls, but just based on your description, the only way to fix code that has broken locking is to wrap it in synchronized.  If you use non-locking techniques, the code being called might still be run in parallel, which won't help the situation.
>
> Let me look into this further before you this code is changed.
>
> If I find out that there is a fixed freemarker(or, I come up with a patch), then I assume we'd be for including that fixed freemarker.  What about backporting that to other branches(just the locking fix for freemarker, plus whatever locking fix in ofbiz)?

I already proposed a fix for the deadlock condition in freemarker, they asked me to send a signed icla (I already sent it to them) and we will have to wait a bit before it is committed.
The fix I did in OFBiz fixes exactly what I described:
* before my fix the same script could be parsed multiple times and added to the cache by several threads running a script not in the cache: at the end the parsed script in the cache was the one processed by the slowest thread (i.e. each thread was overriding the value put by the previous thread)
* after my fix the same script could be parsed multiple times by several threads running a script not in the cache; but only the fastest thread is now adding a value to the cache: at the end the parsed script in the cache is the one processed by the fastest thread

Please read the whole story if you are interested but at this point the problem (yes, the problem was real) is resolved.

Jacopo

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4
In reply to this post by Adam Heath-2

On May 11, 2012, at 3:53 PM, Adam Heath wrote:

>
> The code above that I originally quoted is doing DCL.

You are wrong; the DCL (anti) pattern is:

if (sharedResource == null) {
    synchronized(lock) {
        sharedResource = new Resource();
    }
}

what I did is completely different because scriptClass is a *local* variable; please read carefully.

Jacopo

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
In reply to this post by Adam Heath-2
On 05/11/2012 08:53 AM, Adam Heath wrote:

> On 05/11/2012 12:40 AM, Jacopo Cappellato wrote:
>> Hi Adam,
>>
>> please see inline:
>>
>> On May 11, 2012, at 4:52 AM, Adam Heath wrote:
>>
>>> On 04/20/2012 07:53 AM, [hidden email] wrote:
>>>> Author: jacopoc
>>>> Date: Fri Apr 20 12:53:35 2012
>>>> New Revision: 1328356
>>>>
>>>> URL: http://svn.apache.org/viewvc?rev=1328356&view=rev
>>>> Log:
>>>> The cache of parsed Groovy scripts was not thread safe; this issue,
>>>> in instances with several concurrent threads running the same script
>>>> the first time (i.e. not cached) could cause the same script parsed
>>>> multiple times and then added to the cache (overriding the previous
>>>> value); this event was causing the clearing of caches in Freemarker;
>>>> because of a bug in Freemarker [*] this could cause a deadlock.
>>>> The issue is present on all versions of Freemarker but it is less
>>>> frequent on latest version because of the refactoring of caches
>>>> happened after release 2.3.10.
>>>>
>>>> [*]
>>>> https://sourceforge.net/tracker/?func=detail&aid=3519805&group_id=794&atid=100794
>>>>
>>>>
>>>> Modified:
>>>> ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>>>
>>>> Modified:
>>>> ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>>> URL:
>>>> http://svn.apache.org/viewvc/ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java?rev=1328356&r1=1328355&r2=1328356&view=diff
>>>>
>>>> ==============================================================================
>>>>
>>>> ---
>>>> ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>>> (original)
>>>> +++
>>>> ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java
>>>> Fri Apr 20 12:53:35 2012
>>>> @@ -127,10 +127,17 @@ public class GroovyUtil {
>>>> } else {
>>>> scriptClass = parseClass(scriptUrl.openStream(), location);
>>>> }
>>>> - if (Debug.verboseOn()) {
>>>> - Debug.logVerbose("Caching Groovy script at: " + location, module);
>>>> + synchronized (parsedScripts) {
>>>> + Class<?> scriptClassCached = parsedScripts.get(location);
>>>> + if (scriptClassCached == null) {
>>>> + if (Debug.verboseOn()) {
>>>> + Debug.logVerbose("Caching Groovy script at: " + location, module);
>>>> + }
>>>> + parsedScripts.put(location, scriptClass);
>>>> + } else {
>>>> + scriptClass = scriptClassCached;
>>>> + }
>>>> }
>>>> - parsedScripts.put(location, scriptClass);
>>>> }
>>>> return scriptClass;
>>>> } catch (Exception e) {

Jacopo:  this is not DCL.

>>>> @@ -177,10 +184,18 @@ public class GroovyUtil {
>>>> Class<?> scriptClass = parsedScripts.get(script);
>>>> if (scriptClass == null) {
>>>> scriptClass = loadClass(script);
>>>> - if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script: "
>>>> + script, module);
>>>> - parsedScripts.put(script, scriptClass);
>>>> + synchronized (parsedScripts) {
>>>> + Class<?> scriptClassCached = parsedScripts.get(script);
>>>> + if (scriptClassCached == null) {
>>>> + if (Debug.verboseOn()) {
>>>> + Debug.logVerbose("Caching Groovy script: " + script, module);
>>>> + }
>>>> + parsedScripts.put(script, scriptClass);
>>>> + } else {
>>>> + scriptClass = scriptClassCached;
>>>> + }
>>>> + }
>>>> }
>>>> -

This is DCL.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4

On May 11, 2012, at 4:26 PM, Adam Heath wrote:

>>>>> @@ -177,10 +184,18 @@ public class GroovyUtil {
>>>>> Class<?> scriptClass = parsedScripts.get(script);
>>>>> if (scriptClass == null) {
>>>>> scriptClass = loadClass(script);
>>>>> - if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script: "
>>>>> + script, module);
>>>>> - parsedScripts.put(script, scriptClass);
>>>>> + synchronized (parsedScripts) {
>>>>> + Class<?> scriptClassCached = parsedScripts.get(script);
>>>>> + if (scriptClassCached == null) {
>>>>> + if (Debug.verboseOn()) {
>>>>> + Debug.logVerbose("Caching Groovy script: " + script, module);
>>>>> + }
>>>>> + parsedScripts.put(script, scriptClass);
>>>>> + } else {
>>>>> + scriptClass = scriptClassCached;
>>>>> + }
>>>>> + }
>>>>> }
>>>>> -
>
> This is DCL.

Oh no, you are wrong Adam: it would be DCL if scriptClass was a shared variable, which is not: it is a *local* variable.

Jacopo
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
On 05/11/2012 09:36 AM, Jacopo Cappellato wrote:

>
> On May 11, 2012, at 4:26 PM, Adam Heath wrote:
>
>>>>>> @@ -177,10 +184,18 @@ public class GroovyUtil {
>>>>>> Class<?> scriptClass = parsedScripts.get(script);
>>>>>> if (scriptClass == null) {
>>>>>> scriptClass = loadClass(script);
>>>>>> - if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script: "
>>>>>> + script, module);
>>>>>> - parsedScripts.put(script, scriptClass);
>>>>>> + synchronized (parsedScripts) {
>>>>>> + Class<?> scriptClassCached = parsedScripts.get(script);
>>>>>> + if (scriptClassCached == null) {
>>>>>> + if (Debug.verboseOn()) {
>>>>>> + Debug.logVerbose("Caching Groovy script: " + script, module);
>>>>>> + }
>>>>>> + parsedScripts.put(script, scriptClass);
>>>>>> + } else {
>>>>>> + scriptClass = scriptClassCached;
>>>>>> + }
>>>>>> + }
>>>>>> }
>>>>>> -
>>
>> This is DCL.
>
> Oh no, you are wrong Adam: it would be DCL if scriptClass was a shared variable, which is not: it is a *local* variable.

==
result = parsedScripts.get(key)
if (result == null) {
  syncyronized (parsedScripts) {
    result = parsedScripts.get(key)
    if (result == null) {
      result = genResult()
      parsedScripts.put(key, result)
    }
  }
}
return result
==

DCL is about the protected resource, parsedScripts.  Please try again.

The clue here it to look at what is being protected by the
synchronized keyword.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
On 05/11/2012 09:51 AM, Adam Heath wrote:

> On 05/11/2012 09:36 AM, Jacopo Cappellato wrote:
>>
>> On May 11, 2012, at 4:26 PM, Adam Heath wrote:
>>
>>>>>>> @@ -177,10 +184,18 @@ public class GroovyUtil {
>>>>>>> Class<?> scriptClass = parsedScripts.get(script);
>>>>>>> if (scriptClass == null) {
>>>>>>> scriptClass = loadClass(script);
>>>>>>> - if (Debug.verboseOn()) Debug.logVerbose("Caching Groovy script: "
>>>>>>> + script, module);
>>>>>>> - parsedScripts.put(script, scriptClass);
>>>>>>> + synchronized (parsedScripts) {
>>>>>>> + Class<?> scriptClassCached = parsedScripts.get(script);
>>>>>>> + if (scriptClassCached == null) {
>>>>>>> + if (Debug.verboseOn()) {
>>>>>>> + Debug.logVerbose("Caching Groovy script: " + script, module);
>>>>>>> + }
>>>>>>> + parsedScripts.put(script, scriptClass);
>>>>>>> + } else {
>>>>>>> + scriptClass = scriptClassCached;
>>>>>>> + }
>>>>>>> + }
>>>>>>> }
>>>>>>> -
>>>
>>> This is DCL.
>>
>> Oh no, you are wrong Adam: it would be DCL if scriptClass was a shared variable, which is not: it is a *local* variable.

And of course scriptClass needs to locking, because of *course* it is
local.  But parsedScripts is shared, needs protection, but the pattern
you introduced is the DCL.  DCL is about protecting shared resources.

> ==
> result = parsedScripts.get(key)
> if (result == null) {
>   syncyronized (parsedScripts) {
>     result = parsedScripts.get(key)
>     if (result == null) {
>       result = genResult()
>       parsedScripts.put(key, result)
>     }
>   }
> }
> return result
> ==
>
> DCL is about the protected resource, parsedScripts.  Please try again.
>
> The clue here it to look at what is being protected by the
> synchronized keyword.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4
In reply to this post by Adam Heath-2
On May 11, 2012, at 4:51 PM, Adam Heath wrote:

> ==
> result = parsedScripts.get(key)
> if (result == null) {
>  syncyronized (parsedScripts) {
>    result = parsedScripts.get(key)
>    if (result == null) {
>      result = genResult()
>      parsedScripts.put(key, result)
>    }
>  }
> }
> return result
> ==
>
> DCL is about the protected resource, parsedScripts.  Please try again.
>
> The clue here it to look at what is being protected by the
> synchronized keyword.

Just to be sure I understand: are you saying that the above code is an example of the DCL anti-pattern and it is wrong?

Jacopo
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
On 05/11/2012 10:30 AM, Jacopo Cappellato wrote:

> On May 11, 2012, at 4:51 PM, Adam Heath wrote:
>
>> ==
>> result = parsedScripts.get(key)
>> if (result == null) {
>>  syncyronized (parsedScripts) {
>>    result = parsedScripts.get(key)
>>    if (result == null) {
>>      result = genResult()
>>      parsedScripts.put(key, result)
>>    }
>>  }
>> }
>> return result
>> ==
>>
>> DCL is about the protected resource, parsedScripts.  Please try again.
>>
>> The clue here it to look at what is being protected by the
>> synchronized keyword.
>
> Just to be sure I understand: are you saying that the above code is an example of the DCL anti-pattern and it is wrong?

I'm not saying that.  Others are saying that, and I'm just repeating it.

http://en.wikipedia.org/wiki/Double-checked_locking

==
class Foo {
    private Helper helper = null;
    public Helper getHelper() {
        if (helper == null) {
            synchronized(this) {
                if (helper == null) {
                    helper = new Helper();
                }
            }
        }
        return helper;
    }

    // other functions and members...
}
==

In my example, the protected variable is parsedScripts(instead of
this), the get/put on the map is setting the variable on the class.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4

On May 11, 2012, at 5:38 PM, Adam Heath wrote:

> On 05/11/2012 10:30 AM, Jacopo Cappellato wrote:
>> On May 11, 2012, at 4:51 PM, Adam Heath wrote:
>>
>>> ==
>>> result = parsedScripts.get(key)
>>> if (result == null) {
>>> syncyronized (parsedScripts) {
>>>   result = parsedScripts.get(key)
>>>   if (result == null) {
>>>     result = genResult()
>>>     parsedScripts.put(key, result)
>>>   }
>>> }
>>> }
>>> return result
>>> ==
>>>
>>> DCL is about the protected resource, parsedScripts.  Please try again.
>>>
>>> The clue here it to look at what is being protected by the
>>> synchronized keyword.
>>
>> Just to be sure I understand: are you saying that the above code is an example of the DCL anti-pattern and it is wrong?
>
> I'm not saying that.  Others are saying that, and I'm just repeating it.
>
> http://en.wikipedia.org/wiki/Double-checked_locking
>
> ==
> class Foo {
>    private Helper helper = null;
>    public Helper getHelper() {
>        if (helper == null) {
>            synchronized(this) {
>                if (helper == null) {
>                    helper = new Helper();
>                }
>            }
>        }
>        return helper;
>    }
>
>    // other functions and members...
> }
> ==
>
> In my example, the protected variable is parsedScripts(instead of
> this), the get/put on the map is setting the variable on the class.

Adam, don't you really see the big difference between the code from Wikipedia and your example?
The unprotected line:

> if (helper == null) {


is the real issue there (i.e. the DCL anti pattern) because another thread could acquire a lock on helper and initialize it; the thread executing the line if (helper == null) { would see a stale/incomplete value at that point.
On the other hand your example (that is basically what I did in my original code) is not doing an unprotected if (parsedScripts == null) check; it is instead checking a local variable if (result == null)

Inside the synchronized block of your example I see the equivalent of a putIfAbsent method; so your code can be simplified in the equivalent version:

result = parsedScripts.get(key);
if (result == null) {
 result = genResult();
 parsedScripts.putIfAbsent(key, result);
 result = parsedScripts.get(key);
}
return result;

Do we agree that this code is equivalent (well, it is slightly less efficient but it is equivalent as regards the concurrency issue we are discussing)?
There is nothing wrong in this code and if you compare it with the fix you suggested in your firs email you will see that they are exactly the same.

Jacopo


Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4

On May 11, 2012, at 5:49 PM, Jacopo Cappellato wrote:

> is the real issue there (i.e. the DCL anti pattern) because another thread could acquire a lock on helper and initialize it;

oops; errata corrige:

is the real issue there (i.e. the DCL anti pattern) because another thread could acquire a lock on this and initialize helper;

Jacopo
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
In reply to this post by Jacopo Cappellato-4
On 05/11/2012 10:49 AM, Jacopo Cappellato wrote:

>
> On May 11, 2012, at 5:38 PM, Adam Heath wrote:
>
>> On 05/11/2012 10:30 AM, Jacopo Cappellato wrote:
>>> On May 11, 2012, at 4:51 PM, Adam Heath wrote:
>>>
>>>> ==
>>>> result = parsedScripts.get(key)
>>>> if (result == null) {
>>>> syncyronized (parsedScripts) {
>>>>   result = parsedScripts.get(key)
>>>>   if (result == null) {
>>>>     result = genResult()
>>>>     parsedScripts.put(key, result)
>>>>   }
>>>> }
>>>> }
>>>> return result
>>>> ==
>>>>
>>>> DCL is about the protected resource, parsedScripts.  Please try again.
>>>>
>>>> The clue here it to look at what is being protected by the
>>>> synchronized keyword.
>>>
>>> Just to be sure I understand: are you saying that the above code is an example of the DCL anti-pattern and it is wrong?
>>
>> I'm not saying that.  Others are saying that, and I'm just repeating it.
>>
>> http://en.wikipedia.org/wiki/Double-checked_locking
>>
>> ==
>> class Foo {
>>    private Helper helper = null;
>>    public Helper getHelper() {
>>        if (helper == null) {
>>            synchronized(this) {
>>                if (helper == null) {
>>                    helper = new Helper();
>>                }
>>            }
>>        }
>>        return helper;
>>    }
>>
>>    // other functions and members...
>> }
>> ==
>>
>> In my example, the protected variable is parsedScripts(instead of
>> this), the get/put on the map is setting the variable on the class.
>
> Adam, don't you really see the big difference between the code from Wikipedia and your example?
> The unprotected line:

There is no difference.  The act of map.get() can be doing a *lot* of
separate, individual steps.  Without any kind of
happens-before/happens-after on map.get(), the protected put/get in
the lock could interleave with the unprotected get.

You have to remember that map.put/map.get are *not* single
instructions that can't be threaded.  When you call map.put inside the
lock, just because *that* thread is protected from other threads that
got a null return on the outer map.get(), doesn't mean that *other*
threads who are requested *other* keys won't attempt to all map.get.
In those situations, you can have internal map memory corruption.

Saying again.  2 threads could be running.  One wants "foo", the other
wants "bar".  The first sees no value for "foo", starts generating the
result, then calls map.put("foo", result).  When map.put is running,
it's doing a lot of separate, individual steps.

Now, the second thread comes around, and calls map.get("bar").  The
first thread is still running inside the put.  The second thread is
not in the synchronized block, so it went straight into map.get.  And
now you have 2 threads both running code inside that map that has no
happens-before/happens-after guarantees, so now things blow up.

>> if (helper == null) {
>
>
> is the real issue there (i.e. the DCL anti pattern) because another thread could acquire a lock on helper and initialize it; the thread executing the line if (helper == null) { would see a stale/incomplete value at that point.
> On the other hand your example (that is basically what I did in my original code) is not doing an unprotected if (parsedScripts == null) check; it is instead checking a local variable if (result == null)

You can't get a lock on an unassigned variable.  So th

>
> Inside the synchronized block of your example I see the equivalent of a putIfAbsent method; so your code can be simplified in the equivalent version:

There is no putIfAbsent inside synchronized of any example I have given.

> result = parsedScripts.get(key);
> if (result == null) {
>  result = genResult();
>  parsedScripts.putIfAbsent(key, result);
>  result = parsedScripts.get(key);
> }
> return result;

And, technically, this is wrong with regard to UtilCache, and possibly
even normal ConcurrentMap.

Yet another thread might explicitly *remove* the key between the
putIfAsent and following get; so the above really needs to be done in
a loop.  With UtilCache, an item may be removed *at any time* when
soft-references are used, or when a TTL expires.  Even in such a short
timeframe as the 2 lines above.  Just because those 2 lines are right
next to each other, doesn't mean that the process execution context
will stay running this particular thread.

> Do we agree that this code is equivalent (well, it is slightly less efficient but it is equivalent as regards the concurrency issue we are discussing)?
> There is nothing wrong in this code and if you compare it with the fix you suggested in your firs email you will see that they are exactly the same.

It's not less efficient.  The act of entering the monitored section,
which keeps just *one* thread from ever generating a result, to the
non-locked version, which allows multiple threads to generate results,
is *more* efficient.  Especially when you consider that multiple,
completely separate keys may get requested, that don't overlap at all.
Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4

On May 11, 2012, at 6:34 PM, Adam Heath wrote:

>> Inside the synchronized block of your example I see the equivalent of a putIfAbsent method; so your code can be simplified in the equivalent version:
>
> There is no putIfAbsent inside synchronized of any example I have given.

What I meant is that the whole synchronized block is equivalent to a putIfAbsent.

Jacopo

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4
In reply to this post by Adam Heath-2

On May 11, 2012, at 6:34 PM, Adam Heath wrote:

> There is no difference.  The act of map.get() can be doing a *lot* of
> separate, individual steps.  Without any kind of
> happens-before/happens-after on map.get(), the protected put/get in
> the lock could interleave with the unprotected get.

Ok, if you don't want to admit that there is no DCL in my code I will give up; what you describe next is a clearly totally different topic.
But even if a totally different topic I think it is an interesting one because the UtilCache get/put/putIfAbsent methods are used a lot and if they are not thread safe this would be a problem. A problem for my code, your code and a lot of code in OFBiz.
I didn't look too closely at the implementation of the two methods but it seems that they are actually handling the shared status in a thread safe way (even if they are not atomic), so I do not see a risk for reading objects in a stale/invalid status; but you actually did, if I am not wrong, the implementation of these method and you may know better than I.

By the way, all the issues that you are mentioning are actually all present in the fix that you suggested at the beginning of this thread:

Class scriptClass = parsedScripts.get(script);
if (scriptClass == null) {
 scriptClass = genClass();
 parsedScripts.putIfAbsent(script, scriptClass);
 scriptClass = parsedScripts.get(script);
}

while my initial code was safer (because it was wrapping get and put calls into synchronized blocks). But if the UtilCache method put/get/putIfAbsent are not thread safe, now it makes me nervous to call them within synchronized blocks because in them there are also synchronized blocks (risk for deadlocks). The best way to fix this would be to make the methods thread safe (if they are not already as it seems they are).

> Yet another thread might explicitly *remove* the key between the
> putIfAsent and following get; so the above really needs to be done in
> a loop.  With UtilCache, an item may be removed *at any time* when
> soft-references are used, or when a TTL expires.  Even in such a short
> timeframe as the 2 lines above.  Just because those 2 lines are right
> next to each other, doesn't mean that the process execution context
> will stay running this particular thread.

This is a good point and probably the only real issue of your initial code (and in the latest version of my code, the one without synchronized blocks); I would suggest the following fix:

Class scriptClass = parsedScripts.get(script);
if (scriptClass == null) {
 scriptClass = genClass();
 Class cachedScriptClass = null;
 do {
   parsedScripts.putIfAbsent(script, scriptClass);
   cachedScriptClass = parsedScripts.get(script);
 } while(cachedScriptClass != null);
 scriptClass = cachedScriptClass;
}

But of course here I am assuming that UtilCache put/get/putIfAbsent are thread safe.

What do you think?

Jacopo

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Adam Heath-2
On 05/11/2012 12:30 PM, Jacopo Cappellato wrote:
>
> On May 11, 2012, at 6:34 PM, Adam Heath wrote:
>
>> There is no difference.  The act of map.get() can be doing a *lot* of
>> separate, individual steps.  Without any kind of
>> happens-before/happens-after on map.get(), the protected put/get in
>> the lock could interleave with the unprotected get.
>
> Ok, if you don't want to admit that there is no DCL in my code I will give up; what you describe next is a clearly totally different topic.

I won't admit that, because there IS a DCL in the code you added in
the original commit.

if (!initialized) {
  lock {
    if (!initialized) {
      initialize();
    }
  }
}

That is DCL to.  It doesn't matter if initialized is a simple read of
a variable, a null check, or something more complex(map.get).  They
are all the same *pattern*.

What is at issue, is that the code running before the lock can be
interleaved with the code run inside the lock.  This is because the
code outside the lock has no barrier, no guard, no kind of
happens-relationship at all with the code inside the lock.  All the
lock helps protect is 2 threads trying to run the internal block.  But
that external block can run at any time.

> But even if a totally different topic I think it is an interesting one because the UtilCache get/put/putIfAbsent methods are used a lot and if they are not thread safe this would be a problem. A problem for my code, your code and a lot of code in OFBiz.

The UtilCache objects *are* thread-safe.  You will not get any
internal corruption inside UtilCache.  That is all that can be guaranteed.

What isn't thread safe, is that calling code may end up having
multiple instances for the same key, unless calling code is written
correctly.

The changes I did to remove synchronized from most places in UtilCache
were correct, it reduced contention.  Each method on UtilCache still
has correct happens-before/happens-since, both before and after
removing synchronized.

Any such issues in calling code were always there previously; it's
just that because of synchronized, it slowed down threads enough so
that you wouldn't see bad code as often.  Remove the contention,
things happen quicker, and so do any bugs.

Why do you think java 1.1 introduced HashMap, to replace Hashtable?
Hashtable used to have synchronized on get/put, but in reality, that
did *nothing* to protect calling code that had *separate* calls to
get/put/containsKey/whatever.  It is the *external* code that needed
the lock, not the Hashtable.

Also, an unprotected map.entrySet().iterator() can have corruption if
other threads are mutating the structure of the map.  In those case,
external synchronization needs to happen between the iteration, *and*
the get/put/remove.  If you read the ConcurrentMap(and friends) code,
you'll find references to the iterators failing
fast(ConcurrentModificationException), or not failing at all(you'll
get a read-only snapshot of something that may not reflect the current
values of the Collection).

> I didn't look too closely at the implementation of the two methods but it seems that they are actually handling the shared status in a thread safe way (even if they are not atomic), so I do not see a risk for reading objects in a stale/invalid status; but you actually did, if I am not wrong, the implementation of these method and you may know better than I.

I have seen a map.get/null-check/put DCL as given in your code cause
threads to enter into never-ending loops, as they call .toString(),
which calls .iterator(), which, because of unprotected manipulation,
the hash-bucket links internal to the map become circular.

The first time(it's happened multiple times), I had no clue.  But
then, I added a -XX parameter to allow me to attach to the java
process.  The next time, I dumped the entire memory(that was huge, as
these bugs happen more often during OOM).  I then used IBM's
HeapAnaylzer(nice little swing app to memory dumps, better than jhat)
to see that the hashbucket links were circular.

> By the way, all the issues that you are mentioning are actually all present in the fix that you suggested at the beginning of this thread:
>
> Class scriptClass = parsedScripts.get(script);
> if (scriptClass == null) {
>  scriptClass = genClass();
>  parsedScripts.putIfAbsent(script, scriptClass);
>  scriptClass = parsedScripts.get(script);
> }

parsedScripts is a UtilCache; the methods on that object have
happens-before/happens-after guarantees, due to internally using
ConcurrentMap.  Deep inside ConcurrentMap, you'll find variables
marked volatile, and/or work-stealing/work-completing patterns, that
also again have happens-before/happens-after.

The only issue in the above code, as I have already mentioned, is that
some other background thread may call remove(script) inbetween
putIfAbsent() and the following get().  And the solution for that is
to change the if to a while.

> while my initial code was safer (because it was wrapping get and put calls into synchronized blocks). But if the UtilCache method put/get/putIfAbsent are not thread safe, now it makes me nervous to call them within synchronized blocks because in them there are also synchronized blocks (risk for deadlocks). The best way to fix this would be to make the methods thread safe (if they are not already as it seems they are).

Nope, your code still had the same removal problem.

>> Yet another thread might explicitly *remove* the key between the
>> putIfAsent and following get; so the above really needs to be done in
>> a loop.  With UtilCache, an item may be removed *at any time* when
>> soft-references are used, or when a TTL expires.  Even in such a short
>> timeframe as the 2 lines above.  Just because those 2 lines are right
>> next to each other, doesn't mean that the process execution context
>> will stay running this particular thread.
>
> This is a good point and probably the only real issue of your initial code (and in the latest version of my code, the one without synchronized blocks); I would suggest the following fix:
>
> Class scriptClass = parsedScripts.get(script);
> if (scriptClass == null) {
>  scriptClass = genClass();
>  Class cachedScriptClass = null;
>  do {
>    parsedScripts.putIfAbsent(script, scriptClass);
>    cachedScriptClass = parsedScripts.get(script);
>  } while(cachedScriptClass != null);
>  scriptClass = cachedScriptClass;
> }
>
> But of course here I am assuming that UtilCache put/get/putIfAbsent are thread safe.

They are, just without doing synchronized.

> What do you think?
>
> Jacopo
>

Reply | Threaded
Open this post in threaded view
|

Re: svn commit: r1328356 - /ofbiz/trunk/framework/base/src/org/ofbiz/base/util/GroovyUtil.java

Jacopo Cappellato-4

On May 11, 2012, at 8:00 PM, Adam Heath wrote:

> On 05/11/2012 12:30 PM, Jacopo Cappellato wrote:
>>
>> On May 11, 2012, at 6:34 PM, Adam Heath wrote:
>>
>>> There is no difference.  The act of map.get() can be doing a *lot* of
>>> separate, individual steps.  Without any kind of
>>> happens-before/happens-after on map.get(), the protected put/get in
>>> the lock could interleave with the unprotected get.
>>
>> Ok, if you don't want to admit that there is no DCL in my code I will give up; what you describe next is a clearly totally different topic.
>
> I won't admit that, because there IS a DCL in the code you added in
> the original commit.
>
> if (!initialized) {
>  lock {
>    if (!initialized) {
>      initialize();
>    }
>  }
> }
>

This is not what I implemented; I already explained it in my other emails, please read them carefully.I have to go now but if you can't find the answers from my others email let me know and I will add details during the week end (now I have to go).

> That is DCL to.  It doesn't matter if initialized is a simple read of
> a variable, a null check, or something more complex(map.get).  They
> are all the same *pattern*.
>
> What is at issue, is that the code running before the lock can be
> interleaved with the code run inside the lock.  This is because the
> code outside the lock has no barrier, no guard, no kind of
> happens-relationship at all with the code inside the lock.  All the
> lock helps protect is 2 threads trying to run the internal block.  But
> that external block can run at any time.

There is no need to explain to me why DCL is bad.

>
>> But even if a totally different topic I think it is an interesting one because the UtilCache get/put/putIfAbsent methods are used a lot and if they are not thread safe this would be a problem. A problem for my code, your code and a lot of code in OFBiz.
>
> The UtilCache objects *are* thread-safe.  You will not get any
> internal corruption inside UtilCache.  That is all that can be guaranteed.
>
> What isn't thread safe, is that calling code may end up having
> multiple instances for the same key, unless calling code is written
> correctly.

And so if these methods are thread safe all your comments below don't apply to the code I wrote:

>> Saying again.  2 threads could be running.  One wants "foo", the other
>> wants "bar".  The first sees no value for "foo", starts generating the
>> result, then calls map.put("foo", result).  When map.put is running,
>> it's doing a lot of separate, individual steps.
>>
>> Now, the second thread comes around, and calls map.get("bar").  The
>> first thread is still running inside the put.  The second thread is
>> not in the synchronized block, so it went straight into map.get.  And
>> now you have 2 threads both running code inside that map that has no
>> happens-before/happens-after guarantees, so now things blow up.


>
> The changes I did to remove synchronized from most places in UtilCache
> were correct, it reduced contention.  Each method on UtilCache still
> has correct happens-before/happens-since, both before and after
> removing synchronized.
>

And in fact I greatly appreciate the work you did at that time.

> Any such issues in calling code were always there previously; it's
> just that because of synchronized, it slowed down threads enough so
> that you wouldn't see bad code as often.  Remove the contention,
> things happen quicker, and so do any bugs.
>
> Why do you think java 1.1 introduced HashMap, to replace Hashtable?
> Hashtable used to have synchronized on get/put, but in reality, that
> did *nothing* to protect calling code that had *separate* calls to
> get/put/containsKey/whatever.  It is the *external* code that needed
> the lock, not the Hashtable.
>
> Also, an unprotected map.entrySet().iterator() can have corruption if
> other threads are mutating the structure of the map.  In those case,
> external synchronization needs to happen between the iteration, *and*
> the get/put/remove.  If you read the ConcurrentMap(and friends) code,
> you'll find references to the iterators failing
> fast(ConcurrentModificationException), or not failing at all(you'll
> get a read-only snapshot of something that may not reflect the current
> values of the Collection).

Yeah, I know all of these things: "Java Concurrency In Practice" is actually quite enlightening on these topics.

Regards,

Jacopo

>
>> I didn't look too closely at the implementation of the two methods but it seems that they are actually handling the shared status in a thread safe way (even if they are not atomic), so I do not see a risk for reading objects in a stale/invalid status; but you actually did, if I am not wrong, the implementation of these method and you may know better than I.
>
> I have seen a map.get/null-check/put DCL as given in your code cause
> threads to enter into never-ending loops, as they call .toString(),
> which calls .iterator(), which, because of unprotected manipulation,
> the hash-bucket links internal to the map become circular.
>
> The first time(it's happened multiple times), I had no clue.  But
> then, I added a -XX parameter to allow me to attach to the java
> process.  The next time, I dumped the entire memory(that was huge, as
> these bugs happen more often during OOM).  I then used IBM's
> HeapAnaylzer(nice little swing app to memory dumps, better than jhat)
> to see that the hashbucket links were circular.
>
>> By the way, all the issues that you are mentioning are actually all present in the fix that you suggested at the beginning of this thread:
>>
>> Class scriptClass = parsedScripts.get(script);
>> if (scriptClass == null) {
>> scriptClass = genClass();
>> parsedScripts.putIfAbsent(script, scriptClass);
>> scriptClass = parsedScripts.get(script);
>> }
>
> parsedScripts is a UtilCache; the methods on that object have
> happens-before/happens-after guarantees, due to internally using
> ConcurrentMap.  Deep inside ConcurrentMap, you'll find variables
> marked volatile, and/or work-stealing/work-completing patterns, that
> also again have happens-before/happens-after.
>
> The only issue in the above code, as I have already mentioned, is that
> some other background thread may call remove(script) inbetween
> putIfAbsent() and the following get().  And the solution for that is
> to change the if to a while.
>
>> while my initial code was safer (because it was wrapping get and put calls into synchronized blocks). But if the UtilCache method put/get/putIfAbsent are not thread safe, now it makes me nervous to call them within synchronized blocks because in them there are also synchronized blocks (risk for deadlocks). The best way to fix this would be to make the methods thread safe (if they are not already as it seems they are).
>
> Nope, your code still had the same removal problem.
>
>>> Yet another thread might explicitly *remove* the key between the
>>> putIfAsent and following get; so the above really needs to be done in
>>> a loop.  With UtilCache, an item may be removed *at any time* when
>>> soft-references are used, or when a TTL expires.  Even in such a short
>>> timeframe as the 2 lines above.  Just because those 2 lines are right
>>> next to each other, doesn't mean that the process execution context
>>> will stay running this particular thread.
>>
>> This is a good point and probably the only real issue of your initial code (and in the latest version of my code, the one without synchronized blocks); I would suggest the following fix:
>>
>> Class scriptClass = parsedScripts.get(script);
>> if (scriptClass == null) {
>> scriptClass = genClass();
>> Class cachedScriptClass = null;
>> do {
>>   parsedScripts.putIfAbsent(script, scriptClass);
>>   cachedScriptClass = parsedScripts.get(script);
>> } while(cachedScriptClass != null);
>> scriptClass = cachedScriptClass;
>> }
>>
>> But of course here I am assuming that UtilCache put/get/putIfAbsent are thread safe.
>
> They are, just without doing synchronized.
>
>> What do you think?
>>
>> Jacopo
>>
>

12