> I would probably skip #2. No need for client code to be concerned
> with implementation details. We could just modify the existing
> delegator code to make the best decision on how to handle pagination.
>
> I agree with the rest of it - it sounds like a good plan.
>
> -Adrian
>
> Scott Gray wrote:
>> I can't do that at the moment because I've been modifying the same
>> script repeatedly to test different situations and it's gotten
>> pretty messy. But as soon as I've found a temporary solution to
>> the problem we're having I'll come back to it, clean it up and post
>> it so the discussion can continue with more people running tests.
>> What I have been able to determine so far is that regardless of the
>> database you are using it is ALWAYS faster to retrieve a resultset
>> limited to the records you are actually going to use. The problem
>> is that for pagination we "need" (I'm not sure how badly) to know
>> the full size of the resultset. It turns out there is a magic
>> number of records where it becomes faster to do a separate count
>> query but only if you are using EntityFindOptions.setMaxRows(int)
>> on your ELI along with it. On my machine it is somewhere between
>> 25,000-50,000 records.
>> Keep in mind also that this is only really a problem for screens
>> where we paginate through resultsets, in most other cases we always
>> use the entire resultset.
>> It was taking too long to test different table sizes so I ended up
>> just filling a table in MySql with 1,000,000 records and simulated
>> paginating through it (each result is the same query run 5 times
>> and the times are in milliseconds):
>> Here's the result for the way we currently do it with an ELI, there
>> is only one result because because it takes too long to test and
>> the result doesn't really vary regardless of the viewIndex being 50
>> or 500,000:
>> 0 viewIndex: min 13610, max 37307, avg 27386.2, total 136931
>> Here's the result using an ELI with maxRows and a separate count
>> query:
>> 50 viewIndex: min 363, max 377, avg 371.8, total 1859
>> 100 viewIndex: min 372, max 462, avg 413.4, total 2067
>> 200 viewIndex: min 385, max 412, avg 394.4, total 1972
>> 400 viewIndex: min 378, max 412, avg 392.2, total 1961
>> 800 viewIndex: min 373, max 1044, avg 510, total 2550
>> 1600 viewIndex: min 390, max 405, avg 397.4, total 1987
>> 3200 viewIndex: min 402, max 433, avg 418.6, total 2093
>> 6400 viewIndex: min 425, max 504, avg 449.8, total 2249
>> 12800 viewIndex: min 459, max 648, avg 536, total 2680
>> 25600 viewIndex: min 570, max 1173, avg 705.8, total 3529
>> 51200 viewIndex: min 756, max 1144, avg 958.8, total 4794
>> 102400 viewIndex: min 1252, max 2810, avg 1576, total 7880
>> 204800 viewIndex: min 2123, max 10120, avg 5253.6, total 26268
>> 409600 viewIndex: min 4212, max 10837, avg 7011, total 35055
>> That's for a 1,000,000 records but the ELI by itself is much faster
>> for me on small resultsets but gets progressively slower as the
>> resultset's size increases.
>> Another issue I encountered is that if I run the first portion of
>> this test twice in two separate browser windows at the same time
>> then an out of memory error occurs and the instance locks up until
>> I restart it. Should we be able to recover from an out of memory
>> error or should just we just concentrate on avoiding them?
>> The only solution I can think of so far is to:
>> 1. Add the ability for OFBiz to learn when a query becomes high
>> volume i.e. it's resultsize begins crossing the configurable magic
>> number threshold
>> 2. Add a new method for pagination to the delegator that can
>> decide whether or not to set maxRows based on #1 for the
>> EntityListIterator that it will return
>> 3. Provide the EntityListIterator with the information required to
>> be able to perform a separate count query (it needs the delegator
>> or dao + the where and having conditions or we could just give it a
>> SQLProcessor ready to go)
>> 4. Change the ELI's getResultSizeAfterPartialList to perform a
>> count query if maxRows was set and the info from #3 was provided
>> 5. For forms that use the generic performFind service for
>> paginated results, switch them over to using the performFindList
>> service and change it's implementation to use the new delegator
>> method from #2
>> Any thoughts?
>> Thanks
>> Scott
>> On 27/08/2009, at 3:22 AM, Adrian Crum wrote:
>>> Scott,
>>>
>>> It would be helpful if you could post your script in a Jira issues
>>> so we can run it against various databases. I would like to try it
>>> on ours.
>>>
>>> -Adrian
>>>
>>> --- On Tue, 8/25/09, Scott Gray <
[hidden email]> wrote:
>>>
>>>> From: Scott Gray <
[hidden email]>
>>>> Subject: EntityListIterator.getResultsSizeAfterPartialList() vs.
>>>> delegator.getCountByCondition(...)
>>>> To:
[hidden email]
>>>> Date: Tuesday, August 25, 2009, 9:04 PM
>>>> Hi all,
>>>>
>>>> We've had a few slow query problems lately and I've
>>>> narrowed it down to the ResultSet.last() method call in
>>>> EntityListIterator when performed on large result
>>>> sets. I switched the FindGeneric.groovy script in
>>>> webtools to use findCountByCondition instead of
>>>> getResultSizeAfterPartialList and the page load time went
>>>> from 40-50s down to 2-3s for a result containing 700,000
>>>> records.
>>>>
>>>> Based on that I assumed there was probably some magic
>>>> number depending on your system where it becomes more
>>>> efficient to do a separate count query rather than use the
>>>> ResultSet so I put together a quick test to find out.
>>>> I threw together a script that repeatedly adds 500 rows to
>>>> the jobsandbox and then outputs the average time taken of 3
>>>> attempts to get the list size for each method. Here's
>>>> a graph of the results using embedded Derby:
http://imgur.com/ieR7m>>>>
>>>> So unless the magic number lies somewhere in the first 500
>>>> records it looks to me like it always more efficient to do a
>>>> separate count query.
>>>>
>>>> It makes me wonder if we should be taking a different
>>>> approach to pagination in the form widget and in
>>>> general. Any thoughts?
>>>>
>>>> Thanks
>>>> Scott
>>>>
>>>>
>>>
>>>
>>>
>>>