Login  Register

Re: EntityListIterator.getResultsSizeAfterPartialList() vs. delegator.getCountByCondition(...)

Posted by Scott Gray-2 on Aug 26, 2009; 5:13am
URL: http://ofbiz.116.s1.nabble.com/EntityListIterator-getResultsSizeAfterPartialList-vs-delegator-getCountByCondition-tp206959p206970.html

Hi David,

I did the testing with Derby and in deployment we're seeing the issue  
on MySql but I haven't tested on it as yet and am assuming it's the  
same issue.  I will test it on postgresql shortly out of curiosity.

Anyway thanks for your comments, I'll have a look at the ELI and see  
what can be done.

Regards
Scott

On 26/08/2009, at 4:34 PM, David E Jones wrote:

>
> Which database are you using? Ie, was it only with Derby? I think  
> this topic came up a short while back for another database... but I  
> don't remember if it was ever resolved.
>
> I faintly recall writing before something along these lines: We may  
> want to make this change within the EntityListIterator class itself  
> and add a method that gets the total count of the results, probably  
> by running the same query again with the condition from the original  
> query.
>
> Whatever the case, this may only help on certain databases (ie some  
> databases will have a better JDBC driver and cursor implementation  
> so that seeking to the end will not result in this sort of  
> inefficient behavior), but it probably worth doing in general  
> anyway, and perhaps have it configured on the datasource element in  
> the entityengine.xml file like various other database-dependent  
> settings.
>
> -David
>
>
> On Aug 25, 2009, at 10:04 PM, Scott Gray wrote:
>
>> Hi all,
>>
>> We've had a few slow query problems lately and I've narrowed it  
>> down to the ResultSet.last() method call in EntityListIterator when  
>> performed on large result sets.  I switched the FindGeneric.groovy  
>> script in webtools to use findCountByCondition instead of  
>> getResultSizeAfterPartialList and the page load time went from  
>> 40-50s down to 2-3s for a result containing 700,000 records.
>>
>> Based on that I assumed there was probably some magic number  
>> depending on your system where it becomes more efficient to do a  
>> separate count query rather than use the ResultSet so I put  
>> together a quick test to find out.  I threw together a script that  
>> repeatedly adds 500 rows to the jobsandbox and then outputs the  
>> average time taken of 3 attempts to get the list size for each  
>> method.  Here's a graph of the results using embedded Derby: http://imgur.com/ieR7m
>>
>> So unless the magic number lies somewhere in the first 500 records  
>> it looks to me like it always more efficient to do a separate count  
>> query.
>>
>> It makes me wonder if we should be taking a different approach to  
>> pagination in the form widget and in general.  Any thoughts?
>>
>> Thanks
>> Scott
>>
>


smime.p7s (3K) Download Attachment